Foundations and Trends® in Optimization > Vol 4 > Issue 1-2

Distributionally Robust Learning

By Ruidi Chen, Boston University, USA, rchen15@bu.edu | Ioannis Ch. Paschalidis, Boston University, USA, yannisp@bu.edu

 
Suggested Citation
Ruidi Chen and Ioannis Ch. Paschalidis (2020), "Distributionally Robust Learning", Foundations and Trends® in Optimization: Vol. 4: No. 1-2, pp 1-243. http://dx.doi.org/10.1561/2400000026

Publication Date: 23 Dec 2020
© 2020 Ruidi Chen and Ioannis Ch. Paschalidis
 
Subjects
Robustness,  Statistical learning theory,  Classification and prediction,  Optimization,  Stochastic Optimization
 

In this article:
1. Introduction
2. The Wasserstein Metric
3. Solving the Wasserstein DRO Problem
4. Distributionally Robust Linear Regression
5. Distributionally Robust Grouped Variable Selection
6. Distributionally Robust Multi-Output Learning
7. Optimal Decision Making via Regression Informed K-NN
8. Advanced Topics in Distributionally Robust Learning
9. Discussion and Conclusions
Acknowledgments
References

Abstract

This monograph develops a comprehensive statistical learning framework that is robust to (distributional) perturbations in the data, using Distributionally Robust Optimization (DRO) under the Wasserstein metric. Beginning with fundamental properties of the Wasserstein metric and the DRO formulation, we explore duality to arrive at tractable formulations and develop finite-sample, as well as asymptotic, performance guarantees. We consider a series of learning problems, including (i) distributionally robust linear regression; (ii) distributionally robust regression with group structure in the predictors; (iii) distributionally robust multi-output regression and multiclass classification; (iv) optimal decision making that combines distributionally robust regression with nearest-neighbor estimation; (v) distributionally robust semi-supervised learning; and (vi) distributionally robust reinforcement learning. For each problem, we derive a tractable DRO relaxation, establish a connection between robustness and regularization, and obtain bounds on the prediction and estimation errors of the solution. Beyond theory, we include numerical experiments and case studies using synthetic and real data. The real data experiments are all associated with various health informatics problems, an application area that provided the initial impetus for this work.
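To make the robustness–regularization connection mentioned in the abstract concrete, the sketch below (our own illustration, not the authors' code) fits a distributionally robust linear regression with absolute loss under an ℓ2-induced Wasserstein metric. By the duality result the abstract alludes to, the worst-case expected loss over a Wasserstein ball of radius ε reduces to the empirical loss plus an ε-scaled norm penalty on the extended coefficient vector (β, −1); the exact penalty form is an assumption here, stated in the docstring.

```python
import numpy as np
from scipy.optimize import minimize

def wdro_lad(X, y, eps):
    """Sketch of Wasserstein DRO linear regression with absolute loss.

    Assumed reduction (via duality, for an l2-induced metric): the
    worst-case expected absolute loss over an eps-radius Wasserstein
    ball equals the empirical mean absolute loss plus
    eps * ||(beta, -1)||_2 = eps * sqrt(||beta||^2 + 1),
    since the l2 norm is self-dual.
    """
    def objective(beta):
        # empirical mean absolute loss + Wasserstein-radius regularizer
        return np.mean(np.abs(y - X @ beta)) + eps * np.sqrt(beta @ beta + 1.0)

    res = minimize(objective, np.zeros(X.shape[1]), method="Nelder-Mead",
                   options={"maxiter": 5000, "xatol": 1e-8, "fatol": 1e-8})
    return res.x

# Demo on synthetic data: a larger ambiguity radius eps acts like a
# stronger regularizer and shrinks the fitted coefficients.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
beta_true = np.array([2.0, -1.0])
y = X @ beta_true + 0.1 * rng.normal(size=200)

beta_small_eps = wdro_lad(X, y, eps=0.01)
beta_large_eps = wdro_lad(X, y, eps=2.0)
```

The derivative-free Nelder–Mead solver is used only because the objective is nonsmooth and the example is two-dimensional; the monograph derives tractable convex reformulations for which standard convex solvers apply.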

DOI:10.1561/2400000026
ISBN (paperback): 978-1-68083-772-8, 256 pp., $99.00
ISBN (e-book, PDF): 978-1-68083-773-5, 256 pp., $280.00

Distributionally Robust Learning

Many modern techniques for solving supervised learning problems suffer from a lack of interpretability and analyzability, which makes it difficult to establish rigorous mathematical results about their behavior. This monograph develops a comprehensive statistical learning framework that uses Distributionally Robust Optimization (DRO) under the Wasserstein metric to ensure robustness to perturbations in the data.

The authors introduce the reader to the fundamental properties of the Wasserstein metric and the DRO formulation, before explaining the theory and its applications in detail. They cover a series of learning problems, including (i) distributionally robust linear regression; (ii) distributionally robust regression with group structure in the predictors; (iii) distributionally robust multi-output regression and multiclass classification; (iv) optimal decision making that combines distributionally robust regression with nearest-neighbor estimation; (v) distributionally robust semi-supervised learning; and (vi) distributionally robust reinforcement learning. Throughout the monograph, the authors use applications in medicine and health care to illustrate the theoretical ideas in practice. They include numerical experiments and case studies using synthetic and real data.
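For readers new to the Wasserstein metric underlying the above framework, a minimal numerical illustration (our example, using SciPy's one-dimensional implementation) shows the metric as an optimal-transport cost between two empirical distributions:

```python
from scipy.stats import wasserstein_distance

# Two empirical distributions, each uniform over two points.
# Optimal transport moves each half-unit of mass a distance of 1,
# so the 1-Wasserstein distance is 1.0.
d = wasserstein_distance([0.0, 1.0], [1.0, 2.0])
print(d)  # 1.0
```

Unlike divergences that compare densities pointwise, this transport-based distance remains meaningful between distributions with disjoint supports, which is one reason it is a natural choice for defining the ambiguity sets used in DRO.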

Distributionally Robust Learning provides detailed insight into a technique that has attracted significant recent interest for developing robust supervised learning solutions founded on sound mathematical principles. It will be enlightening for researchers, practitioners, and students working on the optimization of machine learning systems.

 
OPT-026