Deep Learning Resources


Books

  • Goodfellow, I., Bengio, Y., and Courville, A. Deep Learning. MIT Press, 2016. [Link]
  • Nielsen, M. A. Neural Networks and Deep Learning. Determination Press, 2015. [Link]
  • Epelbaum, T. Deep Learning: Technical Introduction. arXiv preprint, 2017. [Link]

Surveys

  • LeCun, Y., Bengio, Y., and Hinton, G. Deep learning. Nature 521, 7553 (2015), 436–444. [Link]
  • Schmidhuber, J. Deep learning in neural networks: An overview. Neural Networks 61 (2015), 85–117. [Link]
  • Deng, L. Three classes of deep learning architectures and their applications: a tutorial survey. APSIPA Transactions on Signal and Information Processing (2012). [Link]
  • Wang, H., Raj, B., and Xing, E. P. On the origin of deep learning. arXiv preprint arXiv:1702.07800 (2017). [Link]

Architecture

Recurrent Neural Network

  • [Original-RNN] Hochreiter, S. Untersuchungen zu dynamischen neuronalen Netzen (Investigations on dynamic neural networks). Diploma thesis, TU Munich, 1991; supervised by J. Schmidhuber (in German). [Link]
  • [Original-LSTM] Hochreiter, S., and Schmidhuber, J. Long short-term memory. Neural computation 9, 8 (1997), 1735–1780. [Link]
  • [Refined-RNN-LSTM] Graves, A. Generating sequences with recurrent neural networks. arXiv preprint arXiv:1308.0850 (2013). [Link]
  • [GRU] Cho, K., et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078 (2014). [Link] (a minimal GRU gating sketch follows this list)
  • [Comparison] Jozefowicz, R., Zaremba, W., and Sutskever, I. An empirical exploration of recurrent network architectures. In Proceedings of the 32nd International Conference on Machine Learning (ICML-15) (2015), pp. 2342–2350. [Link]
  • [Seq2Seq] Sutskever, I., Vinyals, O., and Le, Q. V. Sequence to sequence learning with neural networks. In Advances in neural information processing systems (2014), pp. 3104–3112. [Link]
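
The gating mechanism introduced in the GRU paper above is compact enough to show directly. Below is a minimal sketch of a single GRU step in NumPy, following the Cho et al. (2014) formulation; biases are omitted and all weight names, dimensions, and data are illustrative rather than taken from any cited paper:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step (Cho et al., 2014): update gate z, reset gate r, candidate state h_tilde."""
    z = sigmoid(Wz @ x + Uz @ h_prev)               # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))   # candidate state; reset gate masks the old state
    return z * h_prev + (1.0 - z) * h_tilde         # interpolate between previous and candidate state

# toy example: 3-dim inputs, 4-dim hidden state, short sequence
rng = np.random.default_rng(0)
Wz, Wr, Wh = (rng.standard_normal((4, 3)) for _ in range(3))
Uz, Ur, Uh = (rng.standard_normal((4, 4)) for _ in range(3))
h = np.zeros(4)
for t in range(5):
    h = gru_step(rng.standard_normal(3), h, Wz, Uz, Wr, Ur, Wh, Uh)
print(h)
```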

Convolutional Neural Network

  • [AlexNet] Krizhevsky, A., Sutskever, I., and Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Advances in neural information processing systems (2012), pp. 1097–1105. [Link]
  • [VGGNet] Simonyan, K., and Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014). [Link]
  • [GoogLeNet] Szegedy, C., et al. Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition (2015), pp. 1–9. [Link]
  • [ResNet] He, K., Zhang, X., Ren, S., and Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (2016), pp. 770–778. [Link] (a minimal residual-block sketch follows this list)
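
The core idea of the ResNet paper above is that a block learns a residual F(x) which is added back to its input through an identity shortcut. The sketch below shows that structure in NumPy; dense layers stand in for the paper's 3x3 convolutions, batch normalization is omitted, and all names and sizes are illustrative:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """Residual block (He et al., 2016): output = relu(F(x) + x), so the layers learn a residual."""
    out = relu(W1 @ x)    # first transformation (a 3x3 convolution in the paper; dense here for brevity)
    out = W2 @ out        # second transformation, no activation before the shortcut addition
    return relu(out + x)  # identity shortcut, then nonlinearity

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
W1, W2 = rng.standard_normal((8, 8)), rng.standard_normal((8, 8))
print(residual_block(x, W1, W2))
```

The shortcut connection is what makes very deep stacks of such blocks easier to optimize than plain stacked layers.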

Unsupervised and Deep Generative Models

  • [RBM/DBN] Hinton, G. E., and Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 313, 5786 (2006), 504–507. [Link]
  • [Autoencoder] Le, Q. V. Building high-level features using large scale unsupervised learning. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2013), IEEE, pp. 8595–8598. [Link]
  • [RNN] Graves, A. Generating sequences with recurrent neural networks. arXiv preprint arXiv:1308.0850 (2013). [Link]
  • [Seq2Seq] Sutskever, I., Vinyals, O., and Le, Q. V. Sequence to sequence learning with neural networks. In Advances in neural information processing systems (2014), pp. 3104–3112. [Link]
  • [VAE] Kingma, D. P., and Welling, M. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114 (2013). [Link] (a minimal reparameterization sketch follows this list)
  • [GAN] Goodfellow, I., et al. Generative adversarial nets. In Advances in neural information processing systems (2014), pp. 2672–2680. [Link]
  • [VAE+RNN+Attention] Gregor, K., et al. DRAW: A recurrent neural network for image generation. arXiv preprint arXiv:1502.04623 (2015). [Link]
  • [PixelRNN] Oord, A. v. d., Kalchbrenner, N., and Kavukcuoglu, K. Pixel recurrent neural networks. arXiv preprint arXiv:1601.06759 (2016). [Link]
  • [PixelCNN] van den Oord, A., et al. Conditional image generation with PixelCNN decoders. In Advances in Neural Information Processing Systems (2016), pp. 4790–4798. [Link]
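
Two pieces of the VAE paper cited above are easy to show concretely: the reparameterization trick (sample z as mu + sigma * eps so gradients can flow through the sampling step) and the closed-form KL term for a diagonal Gaussian encoder. A minimal NumPy sketch, with the encoder and decoder networks omitted and shapes chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps with eps ~ N(0, I), keeping sampling differentiable w.r.t. mu and log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian q, summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=-1)

# toy batch of 4 examples with a 2-dimensional latent space
mu = rng.standard_normal((4, 2))
log_var = rng.standard_normal((4, 2))
z = reparameterize(mu, log_var)
print(z.shape, kl_to_standard_normal(mu, log_var))
```

The full VAE objective adds a decoder reconstruction term to this KL term and maximizes the result as a lower bound on the data log-likelihood.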

Application

Natural Language Processing

Image

Time Series

Libraries

  • Theano (Univ Montreal) [Link]
    Lang: Python
    (+) Decent high-level wrappers (Keras, Lasagne)
    (-) No multi-GPU support, bulkier
  • Caffe (UC Berkeley) [Link]
    Lang: C++, interface to Python and Matlab
    (+) Great for CNNs
    (-) Not so great for RNNs
  • TensorFlow (Google) [Link]
    Lang: C++, Python
    (+) Low-level library, excellent documentation and community, visualization tool
    (-) Slower, quite hard to debug, not too many pre-trained models
  • Torch (NYU) [Link]
    Lang: C, Lua, Python (PyTorch)
    (+) Easier to code and debug than TensorFlow, a lot of pre-trained models
    (-) Documentation isn't as polished as TensorFlow's
  • Keras [Link]
    Lang: Python
    (+) Easy to use, high-level library, runs on top of Theano or TensorFlow (a minimal example appears at the end of this section)
    (-) Hard to debug, difficult to create new architectures
  • Lasagne [Link]
    Lang: Python
    (+) High-level library, runs on top of Theano
    (-) Hard to debug, difficult to create new architectures
  • CNTK (Microsoft) [Link]
    Lang: C++
  • Apache MXNet (Amazon) [Link]
    Lang: C++
  • Deeplearning4j (Skymind) [Link]
    Lang: Java
  • Chainer [Link]
    Lang: Python
*The (+) and (-) notes are gathered from my own experience, Quora (1, 2), and Tarry Singh
*An interesting analysis of framework usage in academic papers (March 2017) from Alwyn Matthew is available here
% of papers   framework
-----------   --------------
        9.1   TensorFlow
        7.1   Caffe
        4.6   Theano
        3.3   Torch
        2.5   Keras
        1.7   MatConvNet
        1.2   Lasagne
        0.5   Chainer
        0.3   MXNet
        0.3   CNTK
        0.2   PyTorch
        0.1   Deeplearning4j
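
To make the "easy to use" point about Keras above concrete, here is a minimal sketch of defining, compiling, and fitting a small classifier; the layer sizes and the random placeholder data are chosen purely for illustration, using the standard Sequential/Dense/compile/fit API:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# small feed-forward classifier: 100 input features, 10 classes
model = Sequential()
model.add(Dense(64, activation='relu', input_dim=100))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])

# random placeholder data, just to show the training call
x = np.random.random((256, 100))
y = np.eye(10)[np.random.randint(10, size=256)]   # one-hot labels
model.fit(x, y, epochs=2, batch_size=32)
```

Writing the same model against a lower-level library means defining the variables and training loop explicitly, which is the trade-off the (+)/(-) notes above point at.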