Articles for MAL

A Tutorial on Meta-Reinforcement Learning

Thu, 03 Apr 2025 00:00:00 +0200

Abstract

While deep reinforcement learning (RL) has fueled multiple high-profile successes in machine learning, it is held back from more widespread adoption by its often poor data efficiency and the limited generality of the policies it produces. A promising approach for alleviating these limitations is to cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL. Meta-RL is most commonly studied in a problem setting where, given a distribution of tasks, the goal is to learn a policy that is capable of adapting to any new task from the task distribution with as little data as possible. In this survey, we describe the meta-RL problem setting in detail as well as its major variations. We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task. Using these clusters, we then survey meta-RL algorithms and applications. We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.

Suggested Citation

Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn and Shimon Whiteson (2025), "A Tutorial on Meta-Reinforcement Learning", Foundations and Trends® in Machine Learning: Vol. 18: No. 2-3, pp 224-384. http://dx.doi.org/10.1561/2200000080

Generalization Bounds: Perspectives from Information Theory and PAC-Bayes

Thu, 23 Jan 2025 00:00:00 +0100

Abstract

A fundamental question in theoretical machine learning is generalization. Over the past decades, the PAC-Bayesian approach has been established as a flexible framework to address the generalization capabilities of machine learning algorithms and design new ones. Recently, it has garnered increased interest due to its potential applicability for a variety of learning algorithms, including deep neural networks. In parallel, an information-theoretic view of generalization has developed, wherein the relation between generalization and various information measures has been established. This framework is intimately connected to the PAC-Bayesian approach, and a number of results have been independently discovered in both strands.

In this monograph, we highlight this strong connection and present a unified treatment of PAC-Bayesian and information-theoretic generalization bounds. We present techniques and results that the two perspectives have in common, and discuss the approaches and interpretations that differ. In particular, we demonstrate how many proofs in the area share a modular structure, through which the underlying ideas can be intuited. We pay special attention to the conditional mutual information (CMI) framework, analytical studies of the information complexity of learning algorithms, and the application of the proposed methods to deep learning. This monograph is intended to provide a comprehensive introduction to information-theoretic generalization bounds and their connection to PAC-Bayes, serving as a foundation from which the most recent developments are accessible. It is aimed broadly towards researchers with an interest in generalization and theoretical machine learning.

Suggested Citation

Fredrik Hellström, Giuseppe Durisi, Benjamin Guedj and Maxim Raginsky (2025), "Generalization Bounds: Perspectives from Information Theory and PAC-Bayes", Foundations and Trends® in Machine Learning: Vol. 18: No. 1, pp 1-223. http://dx.doi.org/10.1561/2200000112

An Introduction to Deep Survival Analysis Models for Predicting Time-to-Event Outcomes

Thu, 12 Dec 2024 00:00:00 +0100

Abstract

Many applications involve reasoning about time durations before a critical event happens—also called time-to-event outcomes. When will a customer cancel a subscription, a coma patient wake up, or a convicted criminal reoffend? Accurate predictions of such time durations could help downstream decision-making tasks. A key challenge is censoring: commonly, when we collect training data, we do not get to observe the time-to-event outcome for every data point. For example, a coma patient has not woken up yet, so we do not know the patient’s time until awakening. However, these data points should not be excluded from analysis as they could have characteristics that explain why they have yet to or might never experience the event.

Causal Fairness Analysis: A Causal Toolkit for Fair Machine Learning

Wed, 31 Jan 2024 00:00:00 +0100

Abstract

Decision-making systems based on AI and machine learning have been used throughout a wide range of real-world scenarios, including healthcare, law enforcement, education, and finance. It is no longer far-fetched to envision a future where autonomous systems will drive entire business decisions and, more broadly, support large-scale decision-making infrastructure to solve society’s most challenging problems. Issues of unfairness and discrimination are pervasive when decisions are being made by humans, and remain (or are potentially amplified) when decisions are made using machines with little transparency, accountability, and fairness. In this monograph, we introduce a framework for causal fairness analysis with the intent of filling in this gap, i.e., understanding, modeling, and possibly solving issues of fairness in decision-making settings.

The main insight of our approach will be to link the quantification of the disparities present in the observed data with the underlying, often unobserved, collection of causal mechanisms that generate the disparity in the first place, a challenge we call the Fundamental Problem of Causal Fairness Analysis (FPCFA). In order to solve the FPCFA, we study the problem of decomposing variations and empirical measures of fairness that attribute such variations to structural mechanisms and different units of the population. Our effort culminates in the Fairness Map, the first systematic attempt to organize and explain the relationship between various criteria found in the literature. Finally, we study which causal assumptions are minimally needed for performing causal fairness analysis and propose the Fairness Cookbook, which allows one to assess the existence of disparate impact and disparate treatment.

Suggested Citation

Drago Plečko and Elias Bareinboim (2024), "Causal Fairness Analysis: A Causal Toolkit for Fair Machine Learning", Foundations and Trends® in Machine Learning: Vol. 17: No. 3, pp 304-589. http://dx.doi.org/10.1561/2200000106

User-friendly Introduction to PAC-Bayes Bounds

Mon, 22 Jan 2024 00:00:00 +0100

Abstract

Aggregated predictors are obtained by making a set of basic predictors vote according to some weights, that is, to some probability distribution. Randomized predictors are obtained by sampling in a set of basic predictors, according to some prescribed probability distribution.
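The two constructions above can be illustrated with a minimal sketch. The three threshold classifiers and their weights are hypothetical, chosen only for illustration, and the function names are not taken from the monograph.

```python
import random

def aggregated_predict(predictors, weights, x):
    """Weighted vote: return the label that collects the largest total weight."""
    totals = {}
    for h, w in zip(predictors, weights):
        label = h(x)
        totals[label] = totals.get(label, 0.0) + w
    return max(totals, key=totals.get)

def randomized_predict(predictors, weights, x, rng=random):
    """Sample one basic predictor according to the weights and return its output."""
    h = rng.choices(predictors, weights=weights, k=1)[0]
    return h(x)

# Hypothetical basic predictors: threshold classifiers on a real-valued input.
predictors = [lambda x: int(x > 0.2), lambda x: int(x > 0.5), lambda x: int(x > 0.8)]
# Hypothetical probability distribution over the basic predictors.
weights = [0.5, 0.3, 0.2]

print(aggregated_predict(predictors, weights, 0.6))   # labels 1, 1, 0 with weights 0.5, 0.3, 0.2
print(randomized_predict(predictors, weights, 0.6))   # output of one randomly drawn predictor
```

The aggregated predictor is deterministic given the weights, while the randomized predictor returns a possibly different answer on each call.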

Mon, 27 Mar 2023 00:00:00 +0200

Abstract

The objective in a traditional reinforcement learning (RL) problem is to find a policy that optimizes the expected value of a performance metric such as the infinite-horizon cumulative discounted or long-run average cost/reward. In practice, optimizing the expected value alone may not be satisfactory, in that it may be desirable to incorporate the notion of risk into the optimization problem formulation, either in the objective or as a constraint. Various risk measures have been proposed in the literature, e.g., exponential utility, variance, percentile performance, chance constraints, value at risk (quantile), conditional value-at-risk, prospect theory and its later enhancement, cumulative prospect theory.
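To make two of the listed risk measures concrete, the sketch below computes an empirical value at risk and conditional value-at-risk from a sample of costs and compares them with the plain expected value. The sample values are made up, and the quantile and tail-average conventions used here are one common empirical choice among several, not necessarily the one adopted in the monograph.

```python
import math

def value_at_risk(costs, alpha):
    """Empirical alpha-quantile: smallest cost c whose empirical CDF is >= alpha."""
    ordered = sorted(costs)
    k = max(math.ceil(alpha * len(ordered)) - 1, 0)
    return ordered[k]

def conditional_value_at_risk(costs, alpha):
    """Average of the costs at or above the empirical value at risk."""
    var = value_at_risk(costs, alpha)
    tail = [c for c in costs if c >= var]
    return sum(tail) / len(tail)

# Hypothetical sample of episode costs with one rare, large loss.
costs = [1.0, 2.0, 3.0, 10.0]

print(sum(costs) / len(costs))                 # expected cost: 4.0
print(value_at_risk(costs, 0.75))              # 3.0
print(conditional_value_at_risk(costs, 0.75))  # 6.5
```

The expected cost hides the rare large loss, whereas the tail average is dominated by it, which is the motivation for placing such a measure in the objective or in a constraint.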

Spectral Methods for Data Science: A Statistical Perspective

Thu, 21 Oct 2021 00:00:00 +0200

Abstract

Spectral methods have emerged as a simple yet surprisingly effective approach for extracting information from massive, noisy and incomplete data. In a nutshell, spectral methods refer to a collection of algorithms built upon the eigenvalues (resp. singular values) and eigenvectors (resp. singular vectors) of some properly designed matrices constructed from data. A diverse array of applications has been found in machine learning, imaging science, financial and econometric modeling, and signal processing, including recommendation systems, community detection, ranking, structured matrix recovery, tensor data estimation, joint shape matching, blind deconvolution, financial investments, risk management, treatment evaluations, and causal inference, among others. Due to their simplicity and effectiveness, spectral methods are not only used as a stand-alone estimator, but also frequently employed to facilitate other more sophisticated algorithms to enhance performance.

While the studies of spectral methods can be traced back to classical matrix perturbation theory and the method of moments, the past decade has witnessed tremendous theoretical advances in demystifying their efficacy through the lens of statistical modeling, with the aid of concentration inequalities and non-asymptotic random matrix theory. This monograph aims to present a systematic, comprehensive, yet accessible introduction to spectral methods from a modern statistical perspective, highlighting their algorithmic implications in diverse large-scale applications. In particular, our exposition gravitates around several central questions that span various applications: how to characterize the sample efficiency of spectral methods in reaching a target level of statistical accuracy, and how to assess their stability in the face of random noise, missing data, and adversarial corruptions? In addition to conventional ℓ2 perturbation analysis, we present a systematic ℓ∞ and ℓ2,∞ perturbation theory for eigenspace and singular subspaces, which has only recently become available owing to a powerful “leave-one-out” analysis framework.

Suggested Citation

Yuxin Chen, Yuejie Chi, Jianqing Fan and Cong Ma (2021), "Spectral Methods for Data Science: A Statistical Perspective", Foundations and Trends® in Machine Learning: Vol. 14: No. 5, pp 566-806. http://dx.doi.org/10.1561/2200000079

Tensor Regression

Mon, 27 Sep 2021 00:00:00 +0200

Abstract

An Introduction to Matrix Concentration Inequalities

Wed, 27 May 2015 00:00:00 +0200

Abstract

Random matrices now play a role in many areas of theoretical, applied, and computational mathematics. Therefore, it is desirable to have tools for studying random matrices that are flexible, easy to use, and powerful. Over the last fifteen years, researchers have developed a remarkable family of results, called matrix concentration inequalities, that achieve all of these goals.

This monograph offers an invitation to the field of matrix concentration inequalities. It begins with some history of random matrix theory; it describes a flexible model for random matrices that is suitable for many problems; and it discusses the most important matrix concentration results. To demonstrate the value of these techniques, the presentation includes examples drawn from statistics, machine learning, optimization, combinatorics, algorithms, scientific computing, and beyond.

Suggested Citation

Joel A. Tropp (2015), "An Introduction to Matrix Concentration Inequalities", Foundations and Trends® in Machine Learning: Vol. 8: No. 1-2, pp 1-230. http://dx.doi.org/10.1561/2200000048

Explicit-Duration Markov Switching Models

Tue, 23 Dec 2014 00:00:00 +0100

Abstract

This work covers several aspects of the optimism in the face of uncertainty principle applied to large scale optimization problems under finite numerical budget. The initial motivation for the research reported here originated from the empirical success of the so-called Monte-Carlo Tree Search method popularized in Computer Go and further extended to many other games as well as optimization and planning problems. Our objective is to contribute to the development of theoretical foundations of the field by characterizing the complexity of the underlying optimization problems and designing efficient algorithms with performance guarantees.

Property Testing: A Learning Theory Perspective

Abstract

Property testing deals with tasks where the goal is to distinguish between the case that an object (e.g., function or graph) has a prespecified property (e.g., the function is linear or the graph is bipartite) and the case that it differs significantly from any such object. The task should be performed by observing only a very small part of the object, in particular by querying the object, and the algorithm is allowed a small failure probability.

One view of property testing is as a relaxation of learning the object (obtaining an approximate representation of the object). Thus property testing algorithms can serve as a preliminary step to learning. That is, they can be applied in order to select, very efficiently, what hypothesis class to use for learning. This survey takes the learning-theory point of view and focuses on results for testing properties of functions that are of interest to the learning theory community. In particular, we cover results for testing algebraic properties of functions such as linearity, testing properties defined by concise representations, such as having a small DNF representation, and more.
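As an illustration of testing by querying only a small part of the object, the sketch below implements the classical Blum-Luby-Rubinfeld linearity test for Boolean functions over GF(2). It is a standard textbook test, shown here as an assumption about the kind of algorithm the survey covers rather than a reproduction of its presentation. Inputs are encoded as integer bitmasks, and the two example functions are hypothetical.

```python
import random

def blr_linearity_test(f, n_bits, trials=50, rng=None):
    """Accept if f(x) XOR f(y) == f(x XOR y) on every sampled pair (x, y).

    A linear function always passes. A function far from every linear
    function is rejected with high probability after enough trials.
    """
    if rng is None:
        rng = random.Random()
    for _ in range(trials):
        x = rng.getrandbits(n_bits)
        y = rng.getrandbits(n_bits)
        if (f(x) ^ f(y)) != f(x ^ y):
            return False  # found a pair that witnesses non-linearity
    return True

def parity_of_masked_bits(x, mask=0b1011):
    """Linear over GF(2): parity of the bits of x selected by a fixed mask."""
    return bin(x & mask).count("1") % 2

def and_of_two_bits(x):
    """Not linear: logical AND of the two lowest bits of x."""
    return (x & 1) & ((x >> 1) & 1)

print(blr_linearity_test(parity_of_masked_bits, n_bits=4))
print(blr_linearity_test(and_of_two_bits, n_bits=4, trials=200))
```

Each trial makes only three queries to the function, so the total number of queries depends on the number of trials and not on the size of the domain.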

Suggested Citation

Dana Ron (2008), "Property Testing: A Learning Theory Perspective", Foundations and Trends® in Machine Learning: Vol. 1: No. 3, pp 307-402. http://dx.doi.org/10.1561/2200000004

Graphical Models, Exponential Families, and Variational Inference

Tue, 18 Nov 2008 00:00:00 +0100

Abstract

The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables, and building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fields, including bioinformatics, communication theory, statistical physics, combinatorial optimization, signal and image processing, information retrieval and statistical machine learning. Many problems that arise in specific instances — including the key problems of computing marginals and modes of probability distributions — are best studied in the general setting. Working with exponential family representations, and exploiting the conjugate duality between the cumulant function and the entropy for exponential families, we develop general variational representations of the problems of computing likelihoods, marginal probabilities and most probable configurations. We describe how a wide variety of algorithms — among them sum-product, cluster variational methods, expectation-propagation, mean field methods, max-product and linear programming relaxation, as well as conic programming relaxations — can all be understood in terms of exact or approximate forms of these variational representations. The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models.

Suggested Citation

Martin J. Wainwright and Michael I. Jordan (2008), "Graphical Models, Exponential Families, and Variational Inference", Foundations and Trends® in Machine Learning: Vol. 1: No. 1–2, pp 1-305. http://dx.doi.org/10.1561/2200000001