Foundations and Trends® in Machine Learning > Vol 2 > Issue 1

Learning Deep Architectures for AI

Yoshua Bengio, Dept. IRO, Université de Montréal, Canada, yoshua.bengio@umontreal.ca
 
Suggested Citation
Yoshua Bengio (2009), "Learning Deep Architectures for AI", Foundations and Trends® in Machine Learning: Vol. 2: No. 1, pp 1-127. http://dx.doi.org/10.1561/2200000006

Published: Nov 15, 2009
© 2009 Y. Bengio
 
Subjects
Dimensionality reduction
 
 
In this article:
1 Introduction
2 Theoretical Advantages of Deep Architectures
3 Local vs Non-Local Generalization
4 Neural Networks for Deep Architectures
5 Energy-Based Models and Boltzmann Machines
6 Greedy Layer-Wise Training of Deep Architectures
7 Variants of RBMs and Auto-Encoders
8 Stochastic Variational Bounds for Joint Optimization of DBN Layers
9 Looking Forward
10 Conclusion
Acknowledgments
References

Abstract

Theoretical results suggest that in order to learn the kind of complicated functions that can represent high-level abstractions (e.g., in vision, language, and other AI-level tasks), one may need deep architectures. Deep architectures are composed of multiple levels of non-linear operations, such as in neural nets with many hidden layers or in complicated propositional formulae re-using many sub-formulae. Searching the parameter space of deep architectures is a difficult task, but learning algorithms such as those for Deep Belief Networks have recently been proposed to tackle this problem with notable success, beating the state-of-the-art in certain areas. This monograph discusses the motivations and principles behind learning algorithms for deep architectures, in particular those that exploit unsupervised learning of single-layer models, such as Restricted Boltzmann Machines, as building blocks for constructing deeper models such as Deep Belief Networks.
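To make the building-block idea concrete, here is a minimal NumPy sketch of one-step contrastive divergence (CD-1) training for a binary Restricted Boltzmann Machine, the single-layer model the abstract mentions. This is an illustrative toy, not code from the monograph; the dimensions, learning rate, and iteration count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions and learning rate (arbitrary, for illustration only)
n_visible, n_hidden, lr = 6, 4, 0.1

W = rng.normal(0, 0.01, size=(n_visible, n_hidden))  # connection weights
b = np.zeros(n_visible)   # visible-unit biases
c = np.zeros(n_hidden)    # hidden-unit biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0):
    """One CD-1 update from a binary visible vector v0."""
    global W, b, c
    # Up pass: sample hidden units given the data
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(n_hidden) < ph0).astype(float)
    # Down pass: reconstruct the visible units
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(n_visible) < pv1).astype(float)
    # Second up pass (probabilities suffice for the negative statistics)
    ph1 = sigmoid(v1 @ W + c)
    # Approximate log-likelihood gradient: positive minus negative statistics
    W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
    b += lr * (v0 - v1)
    c += lr * (ph0 - ph1)

v = rng.integers(0, 2, size=n_visible).astype(float)
for _ in range(100):
    cd1_update(v)
```

In a Deep Belief Network, an RBM trained this way provides the hidden representation that serves as "data" for training the next RBM in the stack.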

DOI:10.1561/2200000006
 
Learning Deep Architectures for AI

Can machine learning deliver AI? Theoretical results, inspiration from the brain and cognition, and machine learning experiments all suggest that in order to learn the kind of complicated functions that can represent high-level abstractions (e.g., in vision, language, and other AI-level tasks), one would need deep architectures. Deep architectures are composed of multiple levels of non-linear operations, such as neural nets with many hidden layers, graphical models with many levels of latent variables, or complicated propositional formulae re-using many sub-formulae. Each level of the architecture represents features at a different level of abstraction, defined as a composition of lower-level features. Searching the parameter space of deep architectures is a difficult task, but new algorithms discovered since 2006 have given rise to a new sub-area within the machine learning community. Learning algorithms such as those for Deep Belief Networks, along with related unsupervised learning algorithms, have recently been proposed to train deep architectures, yielding exciting results and beating the state-of-the-art in certain areas. Learning Deep Architectures for AI discusses the motivations for and principles of learning algorithms for deep architectures. By analyzing and comparing recent results obtained with different learning algorithms for deep architectures, it proposes and discusses explanations for their success, highlighting challenges and suggesting avenues for future exploration in this area.
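The greedy layer-wise scheme described above can be sketched as follows: train one layer on the data, freeze it, and feed its representation upward as input to the next layer. The sketch below uses a tied-weight sigmoid auto-encoder per layer as an illustrative stand-in for RBM pre-training; the layer sizes, learning rate, and epoch count are arbitrary assumptions, not values from the monograph.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_autoencoder_layer(X, n_hidden, lr=0.05, epochs=50):
    """Train one tied-weight sigmoid auto-encoder layer by gradient descent
    on squared reconstruction error (a stand-in for RBM training)."""
    n_in = X.shape[1]
    W = rng.normal(0, 0.1, size=(n_in, n_hidden))
    for _ in range(epochs):
        H = sigmoid(X @ W)      # encode
        R = sigmoid(H @ W.T)    # decode with the transposed (tied) weights
        # Gradient of 0.5 * ||R - X||^2 w.r.t. W, via both decoder and encoder paths
        dR = (R - X) * R * (1 - R)
        dH = (dR @ W) * H * (1 - H)
        W -= lr * (X.T @ dH + (H.T @ dR).T)
    return W

# Greedy layer-wise training: each frozen layer's code becomes the next layer's data.
X = rng.random((20, 8))          # toy "dataset": 20 examples, 8 features
weights = []
for n_hidden in (6, 4):          # two stacked layers of decreasing width
    W = train_autoencoder_layer(X, n_hidden)
    weights.append(W)
    X = sigmoid(X @ W)           # representation fed to the next layer
```

Each level thus learns features defined as a composition of the lower-level features, which is the intuition behind the layer-by-layer construction of Deep Belief Networks.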

ISBN:978-1-60198-294-0 E-ISBN:978-1-60198-295-7 DOI:10.1561/9781601982957