By Majid Janzamin, Twitter, USA, majid.janzamin@gmail.com | Rong Ge, Duke University, USA, rongge@cs.duke.edu | Jean Kossaifi, Imperial College London, UK, jean.kossaifi@imperial.ac.uk | Anima Anandkumar, NVIDIA and California Institute of Technology, USA, anima@caltech.edu
Spectral methods have been the mainstay in several domains such as machine learning, applied mathematics and scientific computing. They involve finding a certain kind of spectral decomposition to obtain basis functions that can capture important structures or directions for the problem at hand. The most common spectral method is the principal component analysis (PCA). It utilizes the principal components or the top eigenvectors of the data covariance matrix to carry out dimensionality reduction as one of its applications. This data pre-processing step is often effective in separating signal from noise. PCA and other spectral techniques applied to matrices have several limitations. By limiting to only pairwise moments, they are effectively making a Gaussian approximation on the underlying data. Hence, they fail on data with hidden variables which lead to non-Gaussianity. However, in almost any data set, there are latent effects that cannot be directly observed, e.g., topics in a document corpus, or underlying causes of a disease. By extending the spectral decomposition methods to higher order moments, we demonstrate the ability to learn a wide range of latent variable models efficiently. Higher-order moments can be represented by tensors, and intuitively, they can encode more information than just pairwise moment matrices. More crucially, tensor decomposition can pick up latent effects that are missed by matrix methods. For instance, tensor decomposition can uniquely identify non-orthogonal components. Exploiting these aspects turns out to be fruitful for provable unsupervised learning of a wide range of latent variable models. We also outline the computational techniques to design efficient tensor decomposition methods. They are embarrassingly parallel and thus scalable to large data sets. Whilst there exist many optimized linear algebra software packages, efficient tensor algebra packages are also beginning to be developed. We introduce Tensorly, which has a simple python interface for expressing tensor operations. It has a flexible back-end system supporting NumPy, PyTorch, TensorFlow and MXNet amongst others. This allows it to carry out multi-GPU and CPU operations, and can also be seamlessly integrated with deep-learning functionalities.
The authors of this monograph survey recent progress in using spectral methods including matrix and tensor decomposition techniques to learn many popular latent variable models. With careful implementation, tensor-based methods can run efficiently in practice, and in many cases they are the only algorithms with provable guarantees on running time and sample complexity.
The focus is on a special type of tensor decomposition called CP decomposition, and the authors cover a wide range of algorithms to find the components of such tensor decomposition. They also discuss the usefulness of this decomposition by reviewing several probabilistic models that can be learned using such tensor methods.
The second half of the monograph looks at practical applications. This includes using Tensorly, an efficient tensor algebra software package, which has a simple python interface for expressing tensor operations. It also has a flexible back-end system supporting NumPy, PyTorch, TensorFlow, and MXNet.
Spectral Learning on Matrices and Tensors provides a theoretical and practical introduction to designing and deploying spectral learning on both matrices and tensors. It is of interest for all students, researchers and practitioners working on modern day machine learning problems.