Audio / Image / Video Generation
Papers
Optimizing Neural Networks That Generate Images
- intro: 2014 PhD thesis
 - paper: http://www.cs.toronto.edu/~tijmen/tijmen_thesis.pdf
 - github: https://github.com/mrkulk/Unsupervised-Capsule-Network
 
Learning to Generate Chairs, Tables and Cars with Convolutional Networks
DRAW: A Recurrent Neural Network For Image Generation
- intro: Google DeepMind
 - arxiv: http://arxiv.org/abs/1502.04623
 - github: https://github.com/vivanov879/draw
 - github(Theano): https://github.com/jbornschein/draw
 - github(Lasagne): https://github.com/skaae/lasagne-draw
 - youtube: https://www.youtube.com/watch?v=Zt-7MI9eKEo&hd=1
 - video: http://pan.baidu.com/s/1gd3W6Fh
 
What is DRAW (Deep Recurrent Attentive Writer)?
- blog: http://kvfrans.com/what-is-draw-deep-recurrent-attentive-writer/
 - github(tensorflow): https://github.com/kvfrans/draw
 
Colorizing the DRAW Model
Understanding and Implementing Deepmind’s DRAW Model
Generative Image Modeling Using Spatial LSTMs
Conditional generative adversarial nets for convolutional face generation
- paper: http://www.foldl.me/uploads/2015/conditional-gans-face-generation/paper.pdf
 - blog: http://www.foldl.me/2015/conditional-gans-face-generation/
 - github: https://github.com/hans/adversarial
 
Generating Images from Captions with Attention
- arxiv: http://arxiv.org/abs/1511.02793
 - github: https://github.com/emansim/text2image
 - demo: http://www.cs.toronto.edu/~emansim/cap2im.html
 
Attribute2Image: Conditional Image Generation from Visual Attributes
- intro: University of Michigan & Adobe Research & NEC Labs
 - project page: https://sites.google.com/site/attribute2image/
 - arxiv: http://arxiv.org/abs/1512.00570
 - github(Torch): https://github.com/xcyan/eccv16_attr2img
 
Autoencoding beyond pixels using a learned similarity metric
- arxiv: http://arxiv.org/abs/1512.09300
 - demo: http://algoalgebra.csa.iisc.ernet.in/deepimagine/
 - github: https://github.com/andersbll/autoencoding_beyond_pixels
 - github(Tensorflow): https://github.com/timsainb/Tensorflow-MultiGPU-VAE-GAN
 - video: http://video.weibo.com/show?fid=1034:f00b4e5a34e8c1ebe78ccd00da95f9e0
 - github: https://github.com/stitchfix/fauxtograph
 
Deep Visual Analogy-Making

- paper: https://papers.nips.cc/paper/5845-deep-visual-analogy-making
 - github(Tensorflow): https://github.com/carpedm20/visual-analogy-tensorflow
 - slides: http://slideplayer.com/slide/9147672/
 - mirror: http://pan.baidu.com/s/1pKgrdnt
 
Pixel Recurrent Neural Networks
- intro: Google DeepMind. ICML 2016 best paper. PixelRNN
 - arxiv: http://arxiv.org/abs/1601.06759
 - github: https://github.com/igul222/pixel_rnn
 - github(Tensorflow): https://github.com/carpedm20/pixel-rnn-tensorflow
 - notes(by Hugo Larochelle): https://www.evernote.com/shard/s189/sh/fdf61a28-f4b6-491b-bef1-f3e148185b18/aba21367d1b3730d9334ed91d3250848
 - video(by Hugo Larochelle): https://www.periscope.tv/hugo_larochelle/1ypKdnMkjBnJW
 
Generating images with recurrent adversarial networks
- arxiv: http://arxiv.org/abs/1602.05110
 - github: https://github.com/jiwoongim/GRAN
 
Pixel-Level Domain Transfer
- intro: ECCV 2016
 - github(Torch): https://github.com/fxia22/PixelDTGAN
 - author page(Code and dataset): https://dgyoo.github.io/
 
Generative Adversarial Text to Image Synthesis

- intro: ICML 2016
 - arxiv: http://arxiv.org/abs/1605.05396
 - project page: https://www.mpi-inf.mpg.de/departments/computer-vision-and-multimodal-computing/research/embeddings-for-image-classification/generative-adversarial-text-to-image-synthesis/
 - github: https://github.com/reedscot/icml2016
 - code+dataset: http://datasets.d2.mpi-inf.mpg.de/akata/cub_txt.tar.gz
 
Conditional Image Generation with PixelCNN Decoders
- intro: Google DeepMind. PixelCNN 2.0
 - arxiv: http://arxiv.org/abs/1606.05328
 - github(Theano): https://github.com/kundan2510/pixelCNN
 - github(Torch): https://github.com/dritchie/pixelCNN
 - github(Tensorflow): https://github.com/anantzoid/Conditional-PixelCNN-decoder
 
Inverting face embeddings with convolutional neural networks
- arxiv: http://arxiv.org/abs/1606.04189
 - github: https://github.com/pavelgonchar/face-transfer-tensorflow
 
Unsupervised Cross-Domain Image Generation

- intro: Facebook AI Research. Domain Transfer Network (DTN)
 - arxiv: https://arxiv.org/abs/1611.02200
 - github(TensorFlow): https://github.com/yunjey/dtn-tensorflow
 
PixelCNN++: A PixelCNN Implementation with Discretized Logistic Mixture Likelihood and Other Modifications
- intro: OpenAI
 - arxiv: https://arxiv.org/abs/1701.05517
 - paper: http://openreview.net/pdf?id=BJrFC6ceg
 - github: https://github.com/openai/pixel-cnn
 
Generating Interpretable Images with Controllable Structure
- intro: Google DeepMind
 - paper: http://www.scottreed.info/files/iclr2017.pdf
 
Learning to Generate Images of Outdoor Scenes from Attributes and Semantic Layouts
Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space
- intro: University of Wyoming & Geometric Intelligence & Montreal Institute for Learning Algorithms & University of Freiburg
 - project page: http://www.evolvingai.org/ppgn
 - paper: http://www.evolvingai.org/files/nguyen2016ppgn_v1.pdf
 - github: https://github.com/Evolving-AI-Lab/ppgn
 
Image Generation and Editing with Variational Info Generative Adversarial Networks
DeepFace: Face Generation using Deep Learning
Multi-View Image Generation from a Single-View
- intro: Southwest Jiaotong University & National University of Singapore
 - arxiv: https://arxiv.org/abs/1704.04886
 
Generative Cooperative Net for Image Generation and Data Augmentation
- arxiv: https://arxiv.org/abs/1705.02887
Statistics of Deep Generated Images
- arxiv: https://arxiv.org/abs/1708.02688
Sketch-to-Image Generation Using Deep Contextual Completion
- arxiv: https://arxiv.org/abs/1711.08972
Energy-relaxed Wassertein GANs(EnergyWGAN): Towards More Stable and High Resolution Image Generation
- arxiv: https://arxiv.org/abs/1712.01026
Spatial PixelCNN: Generating Images from Patches
- arxiv: https://arxiv.org/abs/1712.00714
Visual to Sound: Generating Natural Sound for Videos in the Wild
- intro: University of North Carolina at Chapel Hill & Adobe Research
 - project page: http://bvision11.cs.unc.edu/bigpen/yipin/visual2sound_webpage/visual2sound.html
 - arxiv: https://arxiv.org/abs/1712.01393
 
Semi-supervised FusedGAN for Conditional Image Generation
- arxiv: https://arxiv.org/abs/1801.05551
Image Transformer
- intro: Google Brain & UC Berkeley
 - arxiv: https://arxiv.org/abs/1802.05751
 
Unpaired Multi-Domain Image Generation via Regularized Conditional GANs
- arxiv: https://arxiv.org/abs/1805.02456
Transferring GANs: generating images from limited data
- intro: Universitat Autònoma de Barcelona
 - arxiv: https://arxiv.org/abs/1805.01677
 - github: https://github.com/yaxingwang/Transferring-GANs
 
Cross Domain Image Generation through Latent Space Exploration with Adversarial Loss
- arxiv: https://arxiv.org/abs/1805.10130
Face Image Generation
Fader Networks: Manipulating Images by Sliding Attributes
- intro: NIPS 2017. Facebook AI Research & Sorbonne Université
 - arxiv: https://arxiv.org/abs/1706.00409
 - github: https://github.com/facebookresearch/FaderNetworks
 
Person Image Generation
Disentangled Person Image Generation
- intro: CVPR 2018 spotlight. KU-Leuven/PSI & Max Planck Institute for Informatics & ETH Zurich
 - arxiv: https://arxiv.org/abs/1712.02621
 
Pose Guided Person Image Generation
- intro: NIPS 2017
 - arxiv: https://arxiv.org/abs/1705.09368
 - poster: https://homes.esat.kuleuven.be/~liqianma/NIPS17_PG2/NIPS17_PG2_poster.pdf
 
Deformable GANs for Pose-based Human Image Generation
- intro: University of Trento & Inria Grenoble Rhone-Alpes
 - arxiv: https://arxiv.org/abs/1801.00055
 - github: https://github.com/AliaksandrSiarohin/pose-gan
 
Unpaired Pose Guided Human Image Generation
- arxiv: https://arxiv.org/abs/1901.02284
Video Generation
MoCoGAN: Decomposing Motion and Content for Video Generation
- arxiv: https://arxiv.org/abs/1707.04993
 - github: https://github.com/sergeytulyakov/mocogan
 - github(PyTorch): https://github.com/DLHacks/mocogan
 
Attentive Semantic Video Generation using Captions
- arxiv: https://arxiv.org/abs/1708.05980
Hierarchical Video Generation from Orthogonal Information: Optical Flow and Texture
- intro: AAAI 2018. The University of Tokyo
 - project page: http://www.mi.t.u-tokyo.ac.jp/assets/publication/hierarchical_video_generation_sup/
 - arxiv: https://arxiv.org/abs/1711.09618
 
Towards an Understanding of Our World by GANing Videos in the Wild
- intro: ETH Zurich
 - arxiv: https://arxiv.org/abs/1711.11453
 - github: https://github.com/bernhard2202/improved-video-gan
 
Video Generation from Single Semantic Label Map
- intro: CVPR 2019
 - arxiv: https://arxiv.org/abs/1903.04480
 - github: https://github.com/junting/seg2vid
 
Deep Generative Model
Digit Fantasies by a Deep Generative Model
Conditional generative adversarial nets for convolutional face generation
- paper: http://www.foldl.me/uploads/2015/conditional-gans-face-generation/paper.pdf
 - blog: http://www.foldl.me/2015/conditional-gans-face-generation/
 - github: https://github.com/hans/adversarial
 
Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks
- intro: NIPS 2015
 - project page: http://soumith.ch/eyescream/
 - homepage: http://www.cs.nyu.edu/~denton/
 - arxiv: http://arxiv.org/abs/1506.05751
 - code: http://soumith.ch/eyescream/
 - notes: http://colinraffel.com/wiki/deep_generative_image_models_using_a_laplacian_pyramid_of_adversarial_networks
 
Torch convolutional GAN: Generating Faces with Torch
One-Shot Generalization in Deep Generative Models
- intro: Google DeepMind. ICML 2016
 - arxiv: http://arxiv.org/abs/1603.05106
 
Generative Image Modeling using Style and Structure Adversarial Networks
Synthesizing Dynamic Textures and Sounds by Spatial-Temporal Generative ConvNet

- project page: http://www.stat.ucla.edu/~jxie/STGConvNet/STGConvNet.html
 - paper: http://www.stat.ucla.edu/~jxie/STGConvNet/STGConvNet_file/doc/STGConvNet.pdf
 
Synthesizing the preferred inputs for neurons in neural networks via deep generator networks
ArtGAN: Artwork Synthesis with Conditional Categorial GANs
Learning to Generate Chairs with Generative Adversarial Nets
- arxiv: https://arxiv.org/abs/1705.10413
Blogs
Torch convolutional GAN: Generating Faces with Torch
Generating Large Images from Latent Vectors
- blog: http://blog.otoro.net/2016/04/01/generating-large-images-from-latent-vectors/
Generating Faces with Deconvolution Networks

- blog: https://zo7.github.io/blog/2016/09/25/generating-faces.html
 - github: https://github.com/zo7/facegen
 
Attention Models in Image and Caption Generation
Deconvolution and Checkerboard Artifacts
- :star::star::star::star::star:
 - intro: Google Brain & Université de Montréal
 - blog: http://distill.pub/2016/deconv-checkerboard/
 
Projects
TF-VAE-GAN-DRAW
- intro: A collection of generative methods implemented with TensorFlow (Deep Convolutional Generative Adversarial Networks (DCGAN), Variational Autoencoder (VAE) and DRAW: A Recurrent Neural Network For Image Generation).
 - github: https://github.com/ikostrikov/TensorFlow-VAE-GAN-DRAW
 
Generating Large Images from Latent Vectors

- project page: http://blog.otoro.net/2016/04/01/generating-large-images-from-latent-vectors/
 - github: https://github.com/hardmaru/cppn-gan-vae-tensorflow
 
Generating Large Images from Latent Vectors - Part Two
- project page: http://blog.otoro.net/2016/06/02/generating-large-images-from-latent-vectors-part-two/
 - github: https://github.com/hardmaru/resnet-cppn-gan-tensorflow
 
Analyzing 50k fonts using deep neural networks
- blog: https://erikbern.com/2016/01/21/analyzing-50k-fonts-using-deep-neural-networks/
 - github: https://github.com/erikbern/deep-fonts
 
Generate cat images with neural networks
- intro: GAN, spatial transformers, weight initialization and LeakyReLUs.
 - github: https://github.com/aleju/cat-generator
 
Generate human faces with neural networks
A TensorFlow implementation of DeepMind’s WaveNet paper
- intro: This is a TensorFlow implementation of the WaveNet generative neural network architecture for image generation.
 - github: https://github.com/Zeta36/tensorflow-image-wavenet