Art Generation using Neural Style Transfer
Overview
Using artificial intelligence to turn images into synthetic artworks automatically offers a glimpse of the future of synthetic media. Neural networks can transform a simple image into an artwork that carries the aesthetic signature of a chosen painting, such as its swirling shapes and abstracted forms.
Neural style transfer (NST) is a CNN-based optimization technique that takes two images, a content image and a style reference image (such as an artwork by a famous painter), and blends them together so that the output image looks like the content image, but "painted" in the style of the style reference image.
It models the content of a photograph as the feature responses of a pre-trained CNN, and the style of an artwork as summary statistics of those feature responses. The CNN can therefore extract content information from an arbitrary photograph and style information from a well-known artwork.
Methodology
The main aim of NST is for the content of the "generated" image G to match the content of the "content" image C. For this, I calculated the content cost function over the activations of a chosen hidden layer, defined as:

$$J_{content}(C, G) = \frac{1}{4\, n_H n_W n_C} \sum_{\text{all entries}} \left(a^{(C)} - a^{(G)}\right)^2$$
- Here, $n_H$, $n_W$ and $n_C$ are the height, width and number of channels of the hidden layer, and appear in the normalization term of the cost.
- $a^{(C)}$ and $a^{(G)}$ are the 3D volumes corresponding to the hidden layer's activations for C and G.
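As a minimal sketch of this cost, assuming TensorFlow (the function name and shape handling below are illustrative, not the project's actual code):

```python
import tensorflow as tf

def content_cost(a_C, a_G):
    """Content cost from hidden-layer activations of shape (1, n_H, n_W, n_C)."""
    _, n_H, n_W, n_C = a_G.shape
    # Sum of squared differences, scaled by the 1 / (4 * n_H * n_W * n_C) term.
    return tf.reduce_sum(tf.square(a_C - a_G)) / (4.0 * n_H * n_W * n_C)
```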
Similarly, I calculated the style cost function, defined in terms of Gram matrices as:

$$J_{style}(S, G) = \frac{1}{4\, n_C^2\, (n_H n_W)^2} \sum_{i=1}^{n_C} \sum_{j=1}^{n_C} \left(G^{(S)}_{gram,\,ij} - G^{(G)}_{gram,\,ij}\right)^2$$
- $G^{(S)}_{gram}$: Gram matrix of the "style" image's activations.
- $G^{(G)}_{gram}$: Gram matrix of the "generated" image's activations.
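A matching sketch for the Gram matrices and the style cost of a single layer, under the same TensorFlow assumption:

```python
import tensorflow as tf

def gram_matrix(a):
    """Gram matrix G = A A^T of unrolled activations a, shape (n_C, n_H * n_W)."""
    return tf.matmul(a, tf.transpose(a))

def style_cost(a_S, a_G):
    """Style cost for one layer from activations of shape (1, n_H, n_W, n_C)."""
    _, n_H, n_W, n_C = a_G.shape
    # Unroll each 3D volume to (n_C, n_H * n_W) before taking Gram matrices.
    a_S = tf.reshape(tf.transpose(a_S, perm=[0, 3, 1, 2]), [n_C, n_H * n_W])
    a_G = tf.reshape(tf.transpose(a_G, perm=[0, 3, 1, 2]), [n_C, n_H * n_W])
    GS, GG = gram_matrix(a_S), gram_matrix(a_G)
    return tf.reduce_sum(tf.square(GS - GG)) / (4.0 * n_C**2 * (n_H * n_W)**2)
```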
I optimized the total cost function, a linear combination of the two costs above, with respect to the pixel values of the image, merging the "content" image C and the "style" image S into a "generated" image G. This generated image is the final artistic output. The total cost is

$$J(G) = \alpha\, J_{content}(C, G) + \beta\, J_{style}(S, G)$$

where the activations come from a pre-trained VGG-19 network, which recognizes a variety of low-level features at its shallower layers and high-level features at its deeper layers.
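For context, here is a sketch of how the VGG-19 feature extractor and the total cost might be wired up with tf.keras; the specific layer names chosen below are illustrative assumptions, not necessarily the layers this project used:

```python
import tensorflow as tf

# Pre-trained VGG-19 as a fixed feature extractor (no classifier head).
vgg = tf.keras.applications.VGG19(include_top=False, weights="imagenet")
vgg.trainable = False

def feature_extractor(layer_names):
    """Model that returns the activations of the named VGG-19 layers."""
    outputs = [vgg.get_layer(name).output for name in layer_names]
    return tf.keras.Model(inputs=vgg.input, outputs=outputs)

# Illustrative choices: content from a deeper layer, style from shallower ones.
content_layers = ["block4_conv2"]
style_layers = ["block1_conv1", "block2_conv1", "block3_conv1", "block4_conv1"]
extractor = feature_extractor(content_layers + style_layers)

def total_cost(J_content, J_style, alpha=10, beta=40):
    """Linear combination J = alpha * J_content + beta * J_style."""
    return alpha * J_content + beta * J_style
```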
Technical Details
- Used VGG-19 for feature extraction.
- Used the Adam optimizer to minimize the total cost J.
- Learning rate of 0.03.
- Training takes about 3 minutes on a GPU for 2,500 iterations.
- $\alpha = 10$ and $\beta = 40$; these hyperparameters control the relative weighting between content and style. A training-loop sketch with these settings follows the list.
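Putting the settings above together, a minimal training-loop sketch assuming TensorFlow; `compute_total_cost` and `content_image` are hypothetical placeholders standing in for this project's own cost wiring and preprocessed input:

```python
import tensorflow as tf

# Settings from the list above; compute_total_cost and content_image
# are hypothetical placeholders, not names from the original project.
optimizer = tf.keras.optimizers.Adam(learning_rate=0.03)
generated = tf.Variable(content_image)  # initialize G from the content image C

@tf.function
def train_step(generated):
    with tf.GradientTape() as tape:
        # Total cost J with alpha = 10, beta = 40 applied inside.
        J = compute_total_cost(generated)
    grad = tape.gradient(J, generated)
    # Gradient descent directly on the pixel values of G.
    optimizer.apply_gradients([(grad, generated)])
    generated.assign(tf.clip_by_value(generated, 0.0, 1.0))
    return J

for i in range(2500):
    train_step(generated)
```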