Articles for OPThttps://www.nowpublishers.com/feed/OPTArticles for OPThttp://www.nowpublishers.com/article/Details/OPT-027Information Relaxations and Duality in Stochastic Dynamic Programs: A Review and Tutorial<h3>Abstract</h3>In this monograph, we provide an overview of the information relaxation approach for calculating performance bounds in stochastic dynamic programs (DPs). The technique involves (1) relaxing the temporal feasibility (or nonanticipativity) constraints so the decision-maker (DM) has additional information before making decisions, and (2) incorporating a penalty that punishes the DM for violating the temporal feasibility constraints. The goal of this monograph is to provide a self-contained overview of the key theoretical results of the information relaxation approach as well as a review of research that has successfully used these techniques in a broad range of applications. We illustrate the information relaxation approach on applications in inventory management, assortment planning, and portfolio optimization.<h3>Suggested Citation</h3>David B. Brown and James E. Smith (2022), "Information Relaxations and Duality in Stochastic Dynamic Programs: A Review and Tutorial", Foundations and Trends® in Optimization: Vol. 5: No. 3, pp 246-339. http://dx.doi.org/10.1561/2400000027Mon, 21 Mar 2022 00:00:00 +0100http://www.nowpublishers.com/article/Details/OPT-036Acceleration Methods<h3>Abstract</h3>This monograph covers some recent advances in a range of acceleration
techniques frequently used in convex optimization.
We first use quadratic optimization problems to introduce
two key families of methods, namely momentum and nested
optimization schemes. They coincide in the quadratic case
to form the <em>Chebyshev method.</em><p>We discuss momentum methods in detail, starting with
the seminal work of Nesterov [1] and structure convergence
proofs using a few master templates, such as that for <em>optimized
gradient methods,</em> which provide the key benefit
of showing how momentum methods optimize convergence
guarantees. We further cover proximal acceleration, at the
heart of the <em>Catalyst and Accelerated Hybrid Proximal Extragradient</em>
frameworks, using similar algorithmic patterns.</p><p>Common acceleration techniques rely directly on the knowledge
of some of the regularity parameters in the problem at
hand. We conclude by discussing <em>restart</em> schemes, a set of
simple techniques for reaching nearly optimal convergence
rates while adapting to unobserved regularity parameters.</p><h3>Suggested Citation</h3>Alexandre d’Aspremont, Damien Scieur and Adrien Taylor (2021), "Acceleration Methods", Foundations and Trends® in Optimization: Vol. 5: No. 1-2, pp 1-245. http://dx.doi.org/10.1561/2400000036Wed, 15 Dec 2021 00:00:00 +0100http://www.nowpublishers.com/article/Details/OPT-035Algorithms for Verifying Deep Neural Networks<h3>Abstract</h3>Deep neural networks are widely used for nonlinear function
approximation, with applications ranging from computer
vision to control. Although these networks involve the composition
of simple arithmetic operations, it can be very
challenging to verify whether a particular network satisfies
certain input-output properties. This article surveys methods
that have emerged recently for soundly verifying such
properties. These methods borrow insights from reachability
analysis, optimization, and search. We discuss fundamental
differences and connections between existing algorithms. In
addition, we provide pedagogical implementations of existing
methods and compare them on a set of benchmark problems.<h3>Suggested Citation</h3>Changliu Liu, Tomer Arnon, Christopher Lazarus, Christopher Strong, Clark Barrett and Mykel J. Kochenderfer (2021), "Algorithms for Verifying Deep Neural Networks", Foundations and Trends® in Optimization: Vol. 4: No. 3-4, pp 244-404. http://dx.doi.org/10.1561/2400000035Thu, 11 Feb 2021 00:00:00 +0100http://www.nowpublishers.com/article/Details/OPT-026Distributionally Robust Learning<h3>Abstract</h3>This monograph develops a comprehensive statistical learning framework that is robust to (distributional) perturbations in the data using <em>Distributionally Robust Optimization (DRO)</em> under the Wasserstein metric. Beginning with fundamental properties of the Wasserstein metric and the DRO formulation, we explore duality to arrive at tractable formulations and develop finite-sample, as well as asymptotic, performance guarantees. We consider a series of learning problems, including (i) distributionally robust linear regression; (ii) distributionally robust regression with group structure in the predictors; (iii) distributionally robust multi-output regression and multiclass classification, (iv) optimal decision making that combines distributionally robust regression with nearest-neighbor estimation; (v) distributionally robust semi-supervised learning, and (vi) distributionally robust reinforcement learning. A tractable DRO relaxation for each problem is being derived, establishing a connection between robustness and regularization, and obtaining bounds on the prediction and estimation errors of the solution. Beyond theory, we include numerical experiments and case studies using synthetic and real data. The real data experiments are all associated with various health informatics problems, an application area which provided the initial impetus for this work.<h3>Suggested Citation</h3>Ruidi Chen and Ioannis Ch. Paschalidis (2020), "Distributionally Robust Learning", Foundations and Trends® in Optimization: Vol. 4: No. 1-2, pp 1-243. http://dx.doi.org/10.1561/2400000026Wed, 23 Dec 2020 00:00:00 +0100http://www.nowpublishers.com/article/Details/OPT-028Atomic Decomposition via Polar Alignment: The Geometry of Structured Optimization<h3>Abstract</h3>Structured optimization uses a prescribed set of atoms to assemble a solution that fits a model to data. Polarity, which extends the familiar notion of orthogonality from linear sets to general convex sets, plays a special role in a simple and geometric form of convex duality. This duality correspondence yields a general notion of alignment that leads to an intuitive and complete description of how atoms participate in the final decomposition of the solution. The resulting geometric perspective leads to variations of existing algorithms effective for large-scale problems. We illustrate these ideas with many examples, including applications in matrix completion and morphological component analysis for the separation of mixtures of signals.<h3>Suggested Citation</h3>Zhenan Fan, Halyun Jeong, Yifan Sun and Michael P. Friedlander (2020), "Atomic Decomposition via Polar Alignment: The Geometry of Structured Optimization", Foundations and Trends® in Optimization: Vol. 3: No. 4, pp 280-366. http://dx.doi.org/10.1561/2400000028Tue, 24 Nov 2020 00:00:00 +0100http://www.nowpublishers.com/article/Details/OPT-021Optimization Methods for Financial Index Tracking: From Theory to Practice<h3>Abstract</h3>Index tracking is a very popular passive investment strategy.
Since an index cannot be traded directly, index tracking
refers to the process of creating a portfolio that approximates
its performance. A straightforward way to do that is
to purchase all the assets that compose an index in appropriate
quantities. However, to simplify the execution, avoid
small and illiquid positions, and large transaction costs, it is
desired that the tracking portfolio consists of a small number
of assets, i.e., we wish to create a sparse portfolio.
Although index tracking is driven from the financial industry,
it is in fact a pure signal processing problem: a regression of
the financial historical data subject to some portfolio constraints
with some caveats and particularities. Furthermore,
the sparse index tracking problem is similar to many sparsity
formulations in the signal processing area in the sense that
it is a regression problem with some sparsity requirements.
In its original form, sparse index tracking can be formulated
as a combinatorial optimization problem. A commonly used
approach is to use mixed-integer programming (MIP) to
solve small sized problems. Nevertheless, MIP solvers are not
applicable for high-dimensional problems since the running
time can be prohibiting for practical use.
The goal of this monograph is to provide an in-depth overview
of the index tracking problem and analyze all the caveats and
practical issues an investor might have, such as the frequent
rebalancing of weights, the changes in the index composition,
the transaction costs, etc. Furthermore, a unified framework
for a large variety of sparse index tracking formulations is
provided. The derived algorithms are very attractive for
practical use since they provide efficient tracking portfolios
orders of magnitude faster than MIP solvers.
<h3>Suggested Citation</h3>Konstantinos Benidis, Yiyong Feng and Daniel P. Palomar (2018), "Optimization Methods for Financial Index Tracking: From Theory to Practice", Foundations and Trends® in Optimization: Vol. 3: No. 3, pp 171-279. http://dx.doi.org/10.1561/2400000021Thu, 07 Jun 2018 00:00:00 +0200http://www.nowpublishers.com/article/Details/OPT-011The Many Faces of Degeneracy in Conic Optimization<h3>Abstract</h3>Slater’s condition – existence of a “strictly feasible solution” – is a
common assumption in conic optimization. Without strict feasibility,
first-order optimality conditions may be meaningless, the dual problem
may yield little information about the primal, and small changes
in the data may render the problem infeasible. Hence, failure of strict
feasibility can negatively impact off-the-shelf numerical methods, such
as primal-dual interior point methods, in particular. New optimization
modeling techniques and convex relaxations for hard nonconvex problems
have shown that the loss of strict feasibility is a more pronounced
phenomenon than has previously been realized. In this text, we describe
various reasons for the loss of strict feasibility, whether due to
poor modeling choices or (more interestingly) rich underlying structure,
and discuss ways to cope with it and, in many pronounced cases,
how to use it as an advantage. In large part, we emphasize the facial
reduction preprocessing technique due to its mathematical elegance,
geometric transparency, and computational potential.
<h3>Suggested Citation</h3>Dmitriy Drusvyatskiy and Henry Wolkowicz (2017), "The Many Faces of Degeneracy in Conic Optimization", Foundations and Trends® in Optimization: Vol. 3: No. 2, pp 77-170. http://dx.doi.org/10.1561/2400000011Wed, 20 Dec 2017 00:00:00 +0100http://www.nowpublishers.com/article/Details/OPT-023Multi-Period Trading via Convex Optimization<h3>Abstract</h3>We consider a basic model of multi-period trading, which can be used to evaluate the performance of a trading strategy. We describe a framework for single-period optimization, where the trades in each period are found by solving a convex optimization problem that trades oﬀ expected return, risk, transaction cost and holding cost such as the borrowing cost for shorting assets. We then describe a multi-period version of the trading method, where optimization is used to plan a sequence of trades, with only the ﬁrst one executed, using estimates of future quantities that are unknown when the trades are chosen. The single period method traces back to Markowitz; the multi-period methods trace back to model predictive control. Our contribution is to describe the single-period and multi-period methods in one simple framework, giving a clear description of the development and the approximations made. In this paper, we do not address a critical component in a trading algorithm, the predictions or forecasts of future quantities. The methods we describe in this paper can be thought of as good ways to exploit predictions, no matter how they are made. We have also developed a companion open-source software library that implements many of the ideas and methods described in the paper.
<h3>Suggested Citation</h3>Stephen Boyd, Enzo Busseti, Steve Diamond, Ronald N. Kahn, Kwangmoo Koh, Peter Nystrup and Jan Speth (2017), "Multi-Period Trading via Convex Optimization", Foundations and Trends® in Optimization: Vol. 3: No. 1, pp 1-76. http://dx.doi.org/10.1561/2400000023Tue, 08 Aug 2017 00:00:00 +0200http://www.nowpublishers.com/article/Details/OPT-013Introduction to Online Convex Optimization<h3>Abstract</h3>This monograph portrays optimization as a process. In many practical applications the environment is so complex that it is infeasible to lay out a comprehensive theoretical model and use classical algorithmic theory and mathematical optimization. It is necessary as well as beneficial to take a robust approach, by applying an optimization method that learns as one goes along, learning from experience as more aspects of the problem are observed. This view of optimization as a process has become prominent in varied fields and has led to some spectacular success in modeling and systems that are now part of our daily
lives.<h3>Suggested Citation</h3>Elad Hazan (2016), "Introduction to Online Convex Optimization", Foundations and Trends® in Optimization: Vol. 2: No. 3-4, pp 157-325. http://dx.doi.org/10.1561/2400000013Tue, 30 Aug 2016 00:00:00 +0200http://www.nowpublishers.com/article/Details/OPT-009Low-Rank Semidefinite Programming: Theory and Applications<h3>Abstract</h3>Finding low-rank solutions of semidefinite programs is important in many applications. For example, semidefinite programs that arise as relaxations of polynomial optimization problems are exact relaxations when the semidefinite program has a rank-1 solution. Unfortunately, computing a minimum-rank solution of a semidefinite program is an NP-hard problem. In this paper we review the theory of low-rank semidefinite programming, presenting theorems that guarantee the existence of a low-rank solution, heuristics for computing low-rank solutions, and algorithms for finding low-rank approximate solutions. Then we present applications of the theory to trust-region problems and signal
processing.<h3>Suggested Citation</h3>Alex Lemon, Anthony Man-Cho So and Yinyu Ye (2016), "Low-Rank Semidefinite Programming: Theory and Applications", Foundations and Trends® in Optimization: Vol. 2: No. 1-2, pp 1-156. http://dx.doi.org/10.1561/2400000009Thu, 04 Aug 2016 00:00:00 +0200http://www.nowpublishers.com/article/Details/OPT-006Chordal Graphs and Semidefinite Optimization<h3>Abstract</h3>Chordal graphs play a central role in techniques for exploiting sparsity
in large semidefinite optimization problems and in related convex
optimization problems involving sparse positive semidefinite matrices.
Chordal graph properties are also fundamental to several classical
results in combinatorial optimization, linear algebra, statistics,
signal processing, machine learning, and nonlinear optimization. This
survey covers the theory and applications of chordal graphs, with an
emphasis on algorithms developed in the literature on sparse Cholesky
factorization. These algorithms are formulated as recursions on elimination
trees, supernodal elimination trees, or clique trees associated
with the graph. The best known example is the multifrontal Cholesky
factorization algorithm, but similar algorithms can be formulated for
a variety of related problems, including the computation of the partial
inverse of a sparse positive definite matrix, positive semidefinite and
Euclidean distance matrix completion problems, and the evaluation of
gradients and Hessians of logarithmic barriers for cones of sparse positive
semidefinite matrices and their dual cones. The purpose of the
survey is to show how these techniques can be applied in algorithms
for sparse semidefinite optimization, and to point out the connections
with related topics outside semidefinite optimization, such as probabilistic
networks, matrix completion problems, and partial separability
in nonlinear optimization.<h3>Suggested Citation</h3>Lieven Vandenberghe and Martin S. Andersen (2015), "Chordal Graphs and Semidefinite Optimization", Foundations and Trends® in Optimization: Vol. 1: No. 4, pp 241-433. http://dx.doi.org/10.1561/2400000006Wed, 27 May 2015 00:00:00 +0200http://www.nowpublishers.com/article/Details/OPT-003Proximal Algorithms<h3>Abstract</h3>This monograph is about a class of optimization algorithms called <em>proximal algorithms</em>. Much like Newton’s method is a
standard tool for solving unconstrained smooth optimization problems of modest size, proximal algorithms can be viewed as an analogous
tool for nonsmooth, constrained, large-scale, or distributed versions of these problems. They are very generally applicable, but are
especially well-suited to problems of substantial recent interest involving large or high-dimensional datasets. Proximal methods sit at
a higher level of abstraction than classical algorithms like Newton’s method: the base operation is evaluating the <em>proximal operator</em> of a
function, which itself involves solving a small convex optimization problem. These subproblems, which generalize the problem of projecting
a point onto a convex set, often admit closed-form solutions or can be solved very quickly with standard or simple specialized methods. Here,
we discuss the many different interpretations of proximal operators and algorithms, describe their connections to many other topics in
optimization and applied mathematics, survey some popular algorithms, and provide a large number of examples of proximal operators that
commonly arise in practice.<h3>Suggested Citation</h3>Neal Parikh and Stephen Boyd (2014), "Proximal Algorithms", Foundations and Trends® in Optimization: Vol. 1: No. 3, pp 127-239. http://dx.doi.org/10.1561/2400000003Mon, 13 Jan 2014 00:00:00 +0100http://www.nowpublishers.com/article/Details/OPT-002Dynamic Network Energy Management via Proximal Message Passing<h3>Abstract</h3>We consider a network of devices, such as generators, fixed loads, deferrable loads, and storage devices, each with its own
dynamic constraints and objective, connected by AC and DC lines. The problem is to minimize the total network objective subject to
the device and line constraints, over a given time horizon. This is a large optimization problem, with variables for consumption
or generation for each device, power flow for each line, and voltage phase angles at AC buses, in each time period. In this paper
we develop a decentralized method for solving this problem called <em>proximal message passing</em>. The method is iterative: At each step,
each device exchanges simple messages with its neighbors in the network and then solves its own optimization problem, minimizing its
own objective function, augmented by a term determined by the messages it has received. We show that this message passing method
converges to a solution when the device objective and constraints are convex. The method is completely decentralized, and needs no
global coordination other than synchronizing iterations; the problems to be solved by each device can typically be solved extremely
efficiently and in parallel. The method is fast enough that even a serial implementation can solve substantial problems in
reasonable time frames. We report results for several numerical experiments, demonstrating the method’s speed and scaling,
including the solution of a problem instance with over 10 million variables in under 50 minutes for a serial implementation;
with decentralized computing, the solve time would be less than one second.<h3>Suggested Citation</h3>Matt Kraning, Eric Chu, Javad Lavaei and Stephen Boyd (2014), "Dynamic Network Energy Management via Proximal Message Passing", Foundations and Trends® in Optimization: Vol. 1: No. 2, pp 73-126. http://dx.doi.org/10.1561/2400000002Wed, 01 Jan 2014 00:00:00 +0100http://www.nowpublishers.com/article/Details/OPT-001Performance Bounds and Suboptimal Policies for Multi–Period Investment<h3>Abstract</h3>We consider dynamic trading of a portfolio of assets in discrete periods over a finite time horizon, with arbitrary
time-varying distribution of asset returns. The goal is to maximize the total expected revenue from the portfolio,
while respecting constraints on the portfolio such as a required terminal portfolio and leverage and risk limits. The
revenue takes into account the gross cash generated in trades, transaction costs, and costs associated with the
positions, such as fees for holding short positions. Our model has the form of a stochastic control problem with linear
dynamics and convex cost function and constraints. While this problem can be tractably solved in several special cases,
such as when all costs are convex quadratic, or when there are no transaction costs, our focus is on the more general
case, with nonquadratic cost terms and transaction costs.<p>We show how to use linear matrix inequality techniques and semidefinite programming to produce a quadratic bound on
the value function, which in turn gives a bound on the optimal performance. This performance bound can be used to judge
the performance obtained by any suboptimal policy. As a by–product of the performance bound computation, we obtain an
approximate dynamic programming policy that requires the solution of a convex optimization problem, often a quadratic
program, to determine the trades to carry out in each step. While we have no theoretical guarantee that the performance
of our suboptimal policy is always near the performance bound (which would imply that it is nearly optimal) we observe
that in numerical examples the two values are typically close.</p><h3>Suggested Citation</h3>Stephen Boyd, Mark T. Mueller, Brendan O’Donoghue and Yang Wang (2014), "Performance Bounds and Suboptimal Policies for Multi–Period Investment", Foundations and Trends® in Optimization: Vol. 1: No. 1, pp 1-72. http://dx.doi.org/10.1561/2400000001Wed, 01 Jan 2014 00:00:00 +0100