Foundations and Trends® in Machine Learning > Vol 9 > Issue 1

Generalized Low Rank Models

Madeleine Udell, Cornell University, USA, udell@cornell.edu
Corinne Horn, Stanford University, USA, cehorn@stanford.edu
Reza Zadeh, Stanford University, USA, rezab@stanford.edu
Stephen Boyd, Stanford University, USA, boyd@stanford.edu
 
Suggested Citation
Madeleine Udell, Corinne Horn, Reza Zadeh and Stephen Boyd (2016), "Generalized Low Rank Models", Foundations and Trends® in Machine Learning: Vol. 9: No. 1, pp 1-118. http://dx.doi.org/10.1561/2200000055

Published: 23 Jun 2016
© 2016 M. Udell, C. Horn, R. Zadeh and S. Boyd
 
In this article:
1. Introduction
2. PCA and quadratically regularized PCA
3. Generalized regularization
4. Generalized loss functions
5. Loss functions for abstract data types
6. Multi-dimensional loss functions
7. Fitting low rank models
8. Choosing low rank models
9. Implementations
Acknowledgements
Appendices
References

Abstract

Principal components analysis (PCA) is a well-known technique for approximating a tabular data set by a low rank matrix. Here, we extend the idea of PCA to handle arbitrary data sets consisting of numerical, Boolean, categorical, ordinal, and other data types. This framework encompasses many well-known techniques in data analysis, such as nonnegative matrix factorization, matrix completion, sparse and robust PCA, k-means, k-SVD, and maximum margin matrix factorization. The method handles heterogeneous data sets, and leads to coherent schemes for compressing, denoising, and imputing missing entries across all data types simultaneously. It also admits a number of interesting interpretations of the low rank factors, which allow clustering of examples or of features. We propose several parallel algorithms for fitting generalized low rank models, and describe implementations and numerical results.
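As a concrete illustration of the framework the abstract describes, the simplest generalized low rank model is quadratically regularized PCA: approximate a data matrix A by a product XY of two rank-k factors, minimizing the squared error plus quadratic regularization on both factors. A minimal sketch in NumPy is below; it uses alternating minimization, where each subproblem is a ridge regression with a closed-form solution. The function name and parameters are illustrative, not taken from the monograph's software.

```python
import numpy as np

def glrm_quad(A, k, gamma=1.0, iters=50, seed=0):
    """Fit A ~ X @ Y with X (m x k), Y (k x n) by alternating minimization,
    minimizing ||A - XY||_F^2 + gamma * (||X||_F^2 + ||Y||_F^2)."""
    m, n = A.shape
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((m, k))
    Y = rng.standard_normal((k, n))
    I = np.eye(k)
    for _ in range(iters):
        # With Y fixed, each row of X solves a ridge regression; likewise
        # for Y with X fixed. Both have closed-form (normal-equation) solutions.
        X = A @ Y.T @ np.linalg.inv(Y @ Y.T + gamma * I)
        Y = np.linalg.inv(X.T @ X + gamma * I) @ X.T @ A
    return X, Y
```

The generalized models in the monograph follow the same alternating pattern but swap in other losses and regularizers, at which point the subproblems are solved numerically rather than in closed form.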

DOI:10.1561/2200000055
ISBN (print): 978-1-68083-140-5
ISBN (e-book): 978-1-68083-141-2
140 pp.
