A Reading List for Neural Networks in NLP & MT

Machine Translation


Chinese Segmentation


Sentence and Document Modeling

Phrase Modeling

  • Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeff Dean, “Distributed Representations of Words and Phrases and their Compositionality,” NIPS 2013.
  • Christian Scheible, Hinrich Schütze. “Cutting Recursive Autoencoder Trees.” CoRR abs/1301.2811 (2013).
  • Gershman, S. J., & Tenenbaum, J. B. “Phrase similarity in humans and machines”. Proceedings of the 37th Annual Conference of the Cognitive Science Society.

Sentence Modeling

  • Kalchbrenner, Nal, Edward Grefenstette, and Phil Blunsom. A convolutional neural network for modelling sentences. ACL 2014. [convnet for sentences, dynamic, k-max pooling, stacked]
  • Kim, Yoon. Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882, 2014.
  • Misha Denil, Alban Demiraj, Nal Kalchbrenner, Phil Blunsom, Nando de Freitas. Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network. CoRR 2014. [2D convolutional]
  • Wenpeng Yin and Hinrich Schütze. Convolutional Neural Network for Paraphrase Identification. NAACL 2015. [unsupervised pretraining for CNN]
  • Rie Johnson and Tong Zhang. Effective Use of Word Order for Text Categorization with Convolutional Neural Networks. [convolute better with word order, parallel-CNN, different region]
  • Hermann, Karl Moritz, and Phil Blunsom. Multilingual Models for Compositional Distributed Semantics. ACL 2014.
  • Le, Quoc V., and Tomas Mikolov. Distributed Representations of Sentences and Documents. ICML 2014.
  • Baotian Hu, Zhengdong Lu, Hang Li, and Qingcai Chen. Convolutional Neural Network Architectures for Matching Natural Language Sentences. NIPS 2014. [ARC-I, ARC-II, 2D convolutional, order preserving]

Document Modeling

  • Misha Denil, Alban Demiraj, Nal Kalchbrenner, Phil Blunsom, Nando de Freitas. Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network. CoRR 2014. [2D convolutional]
  • Nitish Srivastava, Ruslan R. Salakhutdinov, Geoffrey E. Hinton. Modeling Documents with a Deep Boltzmann Machine. In Uncertainty in Artificial Intelligence, 2013. [deep RBM]
  • Chaochao Huang, Xipeng Qiu, Xuanjing Huang. Text Classification with Document Embeddings. Springer 2014.
  • Le, Quoc V., and Tomas Mikolov. Distributed Representations of Sentences and Documents. ICML 2014.

Conversation Generation

  • Oriol Vinyals and Quoc Le. 2015. A Neural Conversational Model. arXiv:1506.05869.
  • Lifeng Shang, Zhengdong Lu, and Hang Li. 2015. Neural Responding Machine for Short-Text Conversation. In ACL. pp. 1577–1586.

Parsing

  • Socher, R., Lin, C. C.-Y., & Manning, C. D. 2011. Parsing Natural Scenes and Natural Language with Recursive Neural Networks. ICML, pp. 129–136.
  • Billingsley, R., & Curran, J. 2012. Improvements to Training an RNN parser. In Proceedings of COLING, pp. 279–294.
  • Socher, R., Bauer, J., Manning, C. D., & Ng, A. Y. 2013. Parsing with Compositional Vector Grammars. In Proceedings of ACL, pp. 455–465.
  • P Stenetorp. 2013. Transition-based dependency parsing using recursive neural networks. NIPS Workshop on Deep Learning.
  • Joël Legrand and Ronan Collobert. 2014. Recurrent Greedy Parsing with Neural Networks. ECML/PKDD, 8725(Chapter 9):130–144.
  • Oriol Vinyals, Lukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, and Geoffrey E Hinton. 2015. Grammar as a Foreign Language. In Proceedings of ICLR.


  • Hochreiter, Sepp and Schmidhuber, Juergen. Long Short-Term Memory. Neural Computation, Vol 9 (8), pp. 1735-1780, 1997.
  • Zaremba W, Sutskever I. 2014. Learning to execute. arXiv preprint arXiv:1410.4615.

Visualization

  • M. D. Zeiler and R. Fergus. Visualizing and Understanding Convolutional Networks. Technical report, 2013.
  • K. Simonyan, A. Vedaldi, and A. Zisserman. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. Technical report, 2013.
  • Denil, M., Demiraj, A., Kalchbrenner, N., Blunsom, P., & de Freitas, N. Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network. arXiv preprint arXiv:1406.3830, 2014.

Image Caption

  • Mao, J., Xu, W., Yang, Y., Wang, J., and Yuille, A. L. Explain Images with Multimodal Recurrent Neural Networks. ICLR 2015.
  • Vinyals, O., Toshev, A., Bengio, S., and Erhan, D. Show and Tell: A Neural Image Caption Generator. CVPR 2015.
  • Karpathy, A. and Fei-Fei, L. Deep Visual-Semantic Alignments for Generating Image Descriptions. CVPR 2015.
  • Chen, X. and Zitnick, C. L. Learning a Recurrent Visual Representation for Image Caption Generation. CVPR 2015.
  • Fang, H., Gupta, S., Iandola, F. N., Srivastava, R., Deng, L., Dollar, P., Gao, J., He, X., Mitchell, M., Platt, J. C., Zitnick, C. L., and Zweig, G. From Captions to Visual Concepts and Back. CVPR 2015.
  • Donahue, J., Hendricks, L. A., Guadarrama, S., Rohrbach, M., Venugopalan, S., Saenko, K., and Darrell, T. Long-term Recurrent Convolutional Networks for Visual Recognition and Description. CVPR 2015.
  • Ryan Kiros, Ruslan Salakhutdinov, Richard S. Zemel. Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models. TACL, 2015.

Summarization

  • King, Ben, Rahul Jha, Tyler Johnson, Vaishnavi Sundararajan, and Clayton Scott. Experiments in Automatic Text Summarization Using Deep Neural Networks. Machine Learning (2011).
  • Liu, Yan, Shenghua Zhong, and Wenjie Li. Query-Oriented Multi-Document Summarization via Unsupervised Deep Learning. AAAI. 2012.
  • PadmaPriya, G., and K. Duraiswamy. An Approach For Text Summarization Using Deep Learning Algorithm. Journal of Computer Science 10, no. 1 (2013): 1-9.
  • Denil, Misha, Alban Demiraj, and Nando de Freitas. Extraction of Salient Sentences from Labelled Documents. arXiv preprint arXiv:1412.6815 (2014).
  • Kågebäck, Mikael, et al. Extractive summarization using continuous vector space models. Proceedings of the 2nd Workshop on Continuous Vector Space Models and their Compositionality (CVSC) @ EACL. 2014.
  • Denil, Misha, Alban Demiraj, Nal Kalchbrenner, Phil Blunsom, and Nando de Freitas. Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network. arXiv preprint arXiv:1406.3830 (2014).
  • Cao, Ziqiang, Furu Wei, Li Dong, Sujian Li, and Ming Zhou. Ranking with Recursive Neural Networks and Its Application to Multi-document Summarization. AAAI 2015.
  • Fei Liu, Jeffrey Flanigan, Sam Thomson, Norman Sadeh, and Noah A. Smith. Toward Abstractive Summarization Using Semantic Representations. NAACL 2015.
  • Wenpeng Yin, Yulong Pei. Optimizing Sentence Modeling and Selection for Document Summarization. IJCAI 2015.
  • He, Zhanying, Chun Chen, Jiajun Bu, Can Wang, Lijun Zhang, Deng Cai, and Xiaofei He. Document Summarization Based on Data Reconstruction. In AAAI. 2012.
  • Liu, He, Hongliang Yu, and Zhi-Hong Deng. Multi-Document Summarization Based on Two-Level Sparse Representation Model. In Twenty-Ninth AAAI Conference on Artificial Intelligence. 2015.
  • Jin-ge Yao, Xiaojun Wan, Jianguo Xiao. Compressive Document Summarization via Sparse Optimization. IJCAI 2015.
  • Li, Piji, Lidong Bing, Wai Lam, Hang Li, and Yi Liao. Reader-Aware Multi-Document Summarization via Sparse Coding. IJCAI 2015.

Question Answering

  • Iyyer, Mohit, Jordan Boyd-Graber, Leonardo Claudino, Richard Socher, and Hal Daumé III. A neural network for factoid question answering over paragraphs. In EMNLP, pp. 633-644. 2014.
  • Yu, Lei, Karl Moritz Hermann, Phil Blunsom, and Stephen Pulman. Deep learning for answer sentence selection. arXiv preprint arXiv:1412.1632 (2014).

Stock Prediction

  • Xiao Ding, Yue Zhang, Ting Liu, Junwen Duan. Deep Learning for Event-Driven Stock Prediction. IJCAI 2015.
  • Si, Jianfeng, Arjun Mukherjee, Bing Liu, Sinno Jialin Pan, Qing Li, and Huayi Li. Exploiting Social Relations and Sentiment for Stock Prediction. EMNLP 2014.
  • Ding, Xiao, Yue Zhang, Ting Liu, and Junwen Duan. Using Structured Events to Predict Stock Price Movement: An Empirical Investigation. EMNLP 2014.
  • Bollen, Johan, Huina Mao, and Xiaojun Zeng. Twitter mood predicts the stock market. Journal of Computational Science 2, no. 1 (2011): 1-8.

Surveys, Mini-Tutorials, Technical Reports

  • Yoav Goldberg. “A note on Latent Semantic Analysis.”
  • Yoav Goldberg and Omer Levy. “word2vec explained: deriving Mikolov et al.’s negative-sampling word-embedding method.”

