Fast Style Transfer Project Overview
"Fast Style Transfer" is a fascinating project built on TensorFlow that allows users to apply artistic styles from famous paintings to their photos or videos in just a fraction of a second. This project is a fusion of various research methods and techniques in the field of neural networks and artistic styling, providing an efficient way to transform visual content with aesthetically pleasing results.
How It Works
The core of the fast style transfer project is its ability to overlay the styles of famous artworks onto any given image or video. For instance, a picture of the MIT Stata Center can be transformed to mimic the style of "Udnie" by Francis Picabia in just 100 milliseconds using a 2015 Titan X GPU. This is accomplished by combining techniques from three significant works:
- Gatys' A Neural Algorithm of Artistic Style: the foundational approach to artistic style transfer with deep neural networks.
- Johnson's Perceptual Losses for Real-Time Style Transfer and Super-Resolution: trains a feed-forward network against perceptual losses so that stylization runs in real time.
- Ulyanov's Instance Normalization: replaces batch normalization in the transformation network to improve the quality of the stylized output.
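To make the last ingredient concrete, instance normalization normalizes each sample's feature map per channel over its spatial dimensions, rather than over the whole batch. The project's actual TensorFlow implementation differs; the following is only a minimal numpy sketch of the idea:

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    """Instance normalization: normalize each (sample, channel) slice
    over its spatial dimensions (H, W) independently.

    x: array of shape (N, H, W, C)
    """
    mean = x.mean(axis=(1, 2), keepdims=True)  # per-sample, per-channel mean
    var = x.var(axis=(1, 2), keepdims=True)    # per-sample, per-channel variance
    return (x - mean) / np.sqrt(var + eps)

# Example: a batch of 2 feature maps, 4x4 spatial, 3 channels
x = np.random.rand(2, 4, 4, 3)
y = instance_norm(x)
# Each (sample, channel) slice of y now has ~zero mean and ~unit variance
```

Because the statistics are computed per image rather than per batch, the stylization of one image never depends on the other images it happens to be batched with.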
Key Features
- Image Stylization: Quickly transform images by applying styles from various paintings. Each stylized version can be compared directly with the original to see the impact of different artistic styles.
- Video Stylization: The project extends to video by stylizing each frame with the chosen artistic style and stitching the frames back together seamlessly, as illustrated by a complete video styled with "Udnie".
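The video pipeline described above amounts to mapping the trained network over every frame independently. As a hedged sketch, the loop below uses a hypothetical `stylize_frame` stand-in (here just a color inversion, so the example stays self-contained) in place of the real feed-forward style network:

```python
import numpy as np

def stylize_frame(frame):
    # Hypothetical stand-in for a trained feed-forward style network;
    # color inversion keeps the sketch self-contained and runnable.
    return 255 - frame

def stylize_video(frames):
    """Apply the style network to every frame independently,
    then return the re-assembled frame sequence."""
    return [stylize_frame(f) for f in frames]

# Example: a tiny "video" of 3 random 8x8 RGB frames
video = [np.random.randint(0, 256, (8, 8, 3), dtype=np.uint8) for _ in range(3)]
styled = stylize_video(video)
```

In the real project, frame extraction from and re-encoding to a video container is handled externally (via ffmpeg), while the network only ever sees individual frames.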
Implementation Details
The project is executed in TensorFlow and involves several components:
- Training Networks: The style transfer network is trained with the `style.py` script, which requires considerable computational power (4-6 hours on a Maxwell Titan X) and exposes parameters for fine-tuning the results.
- Evaluating Networks: After training, networks are evaluated with `evaluate.py`, letting users assess how effectively the style has been applied to test images.
- Video Transformation: With `transform_video.py`, users can extend the style transfer to videos, transforming the visuals frame by frame.
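Training optimizes perceptual losses rather than per-pixel differences; the style component of that objective compares Gram matrices of feature maps (following Gatys and Johnson). As an illustrative numpy sketch only, not the project's TensorFlow code:

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a feature map of shape (H, W, C):
    channel-by-channel correlations, normalized by map size."""
    h, w, c = features.shape
    f = features.reshape(h * w, c)
    return f.T @ f / (h * w * c)

def style_loss(feat_a, feat_b):
    """Squared Frobenius distance between the two Gram matrices."""
    diff = gram_matrix(feat_a) - gram_matrix(feat_b)
    return float(np.sum(diff ** 2))

# Example: an 8x8 feature map with 16 channels
f = np.random.rand(8, 8, 16)
g = gram_matrix(f)       # shape (16, 16), symmetric
loss = style_loss(f, f)  # identical inputs -> 0.0
```

Because the Gram matrix discards spatial arrangement and keeps only channel correlations, matching it reproduces an image's texture and color statistics without copying its layout, which is what makes it a useful style target.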
Technical Requirements
To run the project, certain requirements must be fulfilled:
- TensorFlow 0.11.0
- Python 2.7.9 along with libraries such as Pillow, scipy, and numpy
- For training, a capable GPU and the appropriate NVIDIA software, such as CUDA
- `ffmpeg` for video stylization tasks
Getting Started
Anaconda serves as the base environment for setting up the project. Instructions are provided for both Windows and Linux, detailing how to create a virtual environment and install the tools and libraries needed to run the fast style transfer functionality.
Support and Licensing
The project, led by Logan Engstrom, is free for academic research use, provided proper attribution is given, and potential sponsors are welcome to support its development. Those interested in commercial applications need to contact the creator for additional permissions.
Acknowledgements
The project benefits from the guidance and resources provided by various contributors including Anish Athalye and incorporates elements from related works such as Justin Johnson's Fast Neural Style.
In exploring the creative domain of visual modification, Fast Style Transfer opens up possibilities for users to effortlessly redefine the aesthetic of their imagery through computational art.