publications | Research webpage of Adrien Taylor

publications policy: I do my best to maintain updated versions on arXiv, with typo corrections and clarifications (both are generally marked in bold and red for easy reference). Therefore, please favor the arXiv versions over the official published ones.

thesis: my thesis (under the supervision of François Glineur and Julien Hendrickx) received the ICTEAM thesis award for 2018 and the IBM-FNRS innovation award for 2018, and was a finalist for the AW Tucker prize for 2018. In addition, we received the 2017 best paper award in Optimization Letters for joint work with Etienne de Klerk and François Glineur (for this paper).

codes: see my GitHub profile for all my codes. The current version of the Performance EStimation TOolbox (PESTO) is available from here (user manual, conference proceeding). The numerical worst-case analyses from PEP can now be performed by writing the algorithms just as you would implement them in Matlab. The new PEPit (performance estimation in Python) is available from here (thanks to the fabulous work of Baptiste Goujaud and Céline Moucer). It is easy to experiment with it using this notebook (see colab).

1 - preprints

preprint
A constructive approach to strengthen algebraic descriptions of function and operator classes

Anne Rubbens , Julien M. Hendrickx , and Adrien B. Taylor

arXiv:2504.14377, 2025

Abs arXiv Bib Code

It is well known that functions (resp. operators) satisfying a property p on a subset Q ⊂ ℝ^d cannot necessarily be extended to a function (resp. operator) satisfying p on the whole of ℝ^d. Given Q ⊆ ℝ^d, this work considers the problem of obtaining necessary and ideally sufficient conditions to be satisfied by a function (resp. operator) on Q, ensuring the existence of an extension of this function (resp. operator) satisfying p on ℝ^d. More precisely, given some property p, we present a refinement procedure to obtain stronger necessary conditions to be imposed on Q. This procedure can be applied iteratively until the stronger conditions are also sufficient. We illustrate the procedure on a few examples, including the strengthening of existing descriptions for the classes of smooth functions satisfying a Łojasiewicz condition, convex blockwise smooth functions, Lipschitz monotone operators, strongly monotone cocoercive operators, and uniformly convex functions. In most cases, these strengthened descriptions can be represented as, or relaxed to, semidefinite constraints, which can be used to formulate tractable optimization problems on functions (resp. operators) within those classes.
@article{rubbens2025onepoint, title = {A constructive approach to strengthen algebraic descriptions of function and operator classes}, author = {Rubbens, Anne and Hendrickx, Julien M. and Taylor, Adrien B.}, journal = {arXiv:2504.14377}, year = {2025}, }
preprint
Open Problem: Two Riddles in Heavy-Ball Dynamics

Baptiste Goujaud , Adrien B. Taylor , and Aymeric Dieuleveut

arXiv:2502.19916, 2025

Abs arXiv Bib

This short paper presents two open problems on the widely used Polyak’s Heavy-Ball algorithm. The first problem concerns the method’s ability to accelerate exactly in dimension one. The second question regards the behavior of the method for parameters for which it seems that neither a Lyapunov function nor a cycle exists. For both problems, we provide a detailed description of the problem and elements of an answer.
@article{goujaud2025open, title = {Open Problem: Two Riddles in Heavy-Ball Dynamics}, author = {Goujaud, Baptiste and Taylor, Adrien B. and Dieuleveut, Aymeric}, journal = {arXiv:2502.19916}, year = {2025}, }
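
For readers unfamiliar with the heavy-ball iteration discussed in this entry, here is a minimal sketch of the update x_{k+1} = x_k - gamma * grad f(x_k) + beta * (x_k - x_{k-1}) on a diagonal quadratic. It uses Polyak's classical tuning for quadratics, not the parameter regimes studied in the paper; the values of `mu`, `L`, the starting point, and all function names are assumptions chosen for illustration.

```python
import math


def heavy_ball(eigs, x0, gamma, beta, n_iters):
    """Run heavy-ball on f(x) = 0.5 * sum(eigs[i] * x[i]**2); the minimizer is 0."""
    x_prev = list(x0)
    x = list(x0)
    for _ in range(n_iters):
        grad = [a * xi for a, xi in zip(eigs, x)]
        # Gradient step plus a momentum term built from the previous iterate.
        x_next = [xi - gamma * gi + beta * (xi - xpi)
                  for xi, gi, xpi in zip(x, grad, x_prev)]
        x_prev, x = x, x_next
    return x


def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))


mu, L = 1.0, 100.0  # hypothetical curvature bounds
eigs = [mu, L]
x0 = [1.0, 1.0]

# Polyak's classical tuning for quadratics with spectrum in [mu, L].
gamma_hb = 4.0 / (math.sqrt(L) + math.sqrt(mu)) ** 2
beta_hb = ((math.sqrt(L) - math.sqrt(mu)) / (math.sqrt(L) + math.sqrt(mu))) ** 2

hb_error = norm(heavy_ball(eigs, x0, gamma_hb, beta_hb, 200))
# Setting beta = 0 recovers plain gradient descent, here with step 2 / (L + mu).
gd_error = norm(heavy_ball(eigs, x0, 2.0 / (L + mu), 0.0, 200))
```

On this quadratic, the momentum variant ends much closer to the minimizer than gradient descent after the same number of iterations. The sketch says nothing about non-quadratic functions, where the behavior can differ.
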
preprint
Optimized projection-free algorithms for online learning: construction and worst-case analysis

Julien Weibel , Pierre Gaillard , Wouter Koolen , and Adrien B. Taylor

arXiv:2506.05855, 2025

Abs arXiv Bib

This work studies and develops projection-free algorithms for online learning with linear optimization oracles (a.k.a. Frank–Wolfe) for handling the constraint set. More precisely, this work (i) provides an improved (optimized) variant of an online Frank–Wolfe algorithm along with its conceptually simple potential-based proof, and (ii) shows how to leverage semidefinite programming to jointly design and analyze online Frank–Wolfe-type algorithms numerically in a variety of settings, including the design of the variant in (i). Based on the semidefinite technique, we conclude with strong numerical evidence suggesting that no pure online Frank–Wolfe algorithm within our model class can have a regret guarantee better than O(T^{3/4}) (T is the time horizon) without additional assumptions, that the current algorithms do not have optimal constants, that the algorithm benefits from similar anytime properties O(t^{3/4}) without requiring knowledge of T in advance, and that multiple linear optimization rounds do not generally help to obtain better regret bounds.
@article{weibel2025optimised, title = {Optimized projection-free algorithms for online learning: construction and worst-case analysis}, author = {Weibel, Julien and Gaillard, Pierre and Koolen, Wouter and Taylor, Adrien B.}, year = {2025}, journal = {arXiv:2506.05855}, }
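
As a rough illustration of what a linear optimization oracle does in a Frank–Wolfe-type method, the sketch below runs the classical offline Frank–Wolfe iteration on the probability simplex. It is not the online algorithm from the paper; the objective, the target point `b`, the step-size 2/(k+2), and all function names are assumptions chosen for illustration.

```python
def lmo_simplex(gradient):
    """Linear optimization oracle over the probability simplex.

    Returns the vertex e_i minimizing <gradient, v>, i.e. the coordinate
    with the smallest gradient entry.
    """
    i = min(range(len(gradient)), key=lambda j: gradient[j])
    vertex = [0.0] * len(gradient)
    vertex[i] = 1.0
    return vertex


def frank_wolfe(b, n_iters=1000):
    """Minimize f(x) = 0.5 * ||x - b||^2 over the simplex without projections."""
    d = len(b)
    x = [1.0] + [0.0] * (d - 1)  # start at a vertex of the simplex
    for k in range(n_iters):
        grad = [xi - bi for xi, bi in zip(x, b)]
        v = lmo_simplex(grad)
        step = 2.0 / (k + 2)  # classical open-loop step-size
        # Convex combination of the iterate and a vertex: stays feasible.
        x = [(1 - step) * xi + step * vi for xi, vi in zip(x, v)]
    return x


def objective(x, b):
    return 0.5 * sum((xi - bi) ** 2 for xi, bi in zip(x, b))


b = [0.2, 0.5, 0.3]  # hypothetical target, chosen inside the simplex
x_fw = frank_wolfe(b)
fw_gap = objective(x_fw, b)  # the optimal value is 0 since b is feasible
```

Each iterate is a convex combination of simplex vertices, so feasibility is maintained using only calls to the oracle, never a projection.
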
preprint
Tight analyses of first-order methods with error feedback

Daniel Berg Thomsen , Adrien B. Taylor , and Aymeric Dieuleveut

arXiv:2506.05271, 2025

Abs arXiv Bib Code

Communication between agents often constitutes a major computational bottleneck in distributed learning. One of the most common mitigation strategies is to compress the information exchanged, thereby reducing communication overhead. To counteract the degradation in convergence associated with compressed communication, error feedback schemes – most notably EF and EF21 – were introduced. In this work, we provide a tight analysis of both of these methods. Specifically, we find the Lyapunov function that yields the best possible convergence rate for each method – with matching lower bounds. This principled approach yields sharp performance guarantees and enables a rigorous, apples-to-apples comparison between EF, EF21, and compressed gradient descent. Our analysis is carried out in a simplified yet representative setting, which allows for clean theoretical insights and fair comparison of the underlying mechanisms.
@article{thomsen2025optimised, title = {Tight analyses of first-order methods with error feedback}, author = {Berg Thomsen, Daniel and Taylor, Adrien B. and Dieuleveut, Aymeric}, year = {2025}, journal = {arXiv:2506.05271}, }
preprint
Geometry-dependent matching pursuit: a transition phase for convergence on linear regression and LASSO

Céline Moucer , Adrien B. Taylor , and Francis Bach

arXiv:2301.01530, 2024

Abs arXiv Bib Code

Greedy first-order methods, such as coordinate descent with Gauss-Southwell rule or matching pursuit, have become popular in optimization due to their natural tendency to propose sparse solutions and their refined convergence guarantees. In this work, we propose a principled approach to generating (regularized) matching pursuit algorithms adapted to the geometry of the problem at hand, as well as their convergence guarantees. Building on these results, we derive approximate convergence guarantees and describe a transition phenomenon in the convergence of (regularized) matching pursuit from underparametrized to overparametrized models.
@article{moucer2024geometry, title = {Geometry-dependent matching pursuit: a transition phase for convergence on linear regression and LASSO}, author = {Moucer, C{\'e}line and Taylor, Adrien B. and Bach, Francis}, year = {2024}, journal = {arXiv:2301.01530}, }
preprint
Constructive approaches to concentration inequalities with independent random variables

Céline Moucer , Adrien B. Taylor , and Francis Bach

arXiv:2408.16480, 2024

Abs arXiv Bib PDF Code

Concentration inequalities, a major tool in probability theory, quantify how much a random variable deviates from a certain quantity. This paper proposes a systematic convex optimization approach to studying and generating concentration inequalities with independent random variables. Specifically, we extend the generalized problem of moments to independent random variables. We first introduce a variational approach that extends classical moment-generating functions, focusing particularly on first-order moment conditions. Second, we develop a polynomial approach, based on a hierarchy of sum-of-squares approximations, to extend these techniques to higher-moment conditions. Building on these advancements, we refine Hoeffding’s, Bennett’s and Bernstein’s inequalities, providing improved worst-case guarantees compared to existing results.
@article{moucer2024geometrz, title = {Constructive approaches to concentration inequalities with independent random variables}, author = {Moucer, C{\'e}line and Taylor, Adrien B. and Bach, Francis}, year = {2024}, journal = {arXiv:2408.16480}, }
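
To give a sense of the kind of statement being refined in this entry, the sketch below compares the classical Hoeffding bound with an exact binomial tail. It only illustrates the textbook inequality P(S - E[S] >= t) <= exp(-2 t^2 / (n (b - a)^2)), not the refined bounds from the paper; the parameters `n`, `p`, `t` and the function names are assumptions chosen for illustration.

```python
import math


def binomial_upper_tail(n, p, k):
    """Exact P(S >= k) for S ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(k, n + 1))


def hoeffding_bound(n, t, a=0.0, b=1.0):
    """Hoeffding's bound on P(S - E[S] >= t) for n independent variables in [a, b]."""
    return math.exp(-2.0 * t ** 2 / (n * (b - a) ** 2))


n, p, t = 100, 0.5, 10  # hypothetical parameters
exact_tail = binomial_upper_tail(n, p, int(n * p + t))
bound = hoeffding_bound(n, t)
```

The gap between `exact_tail` and `bound` is the slack that sharper concentration inequalities aim to reduce.
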
preprint
Provable non-accelerations of the heavy-ball method

Baptiste Goujaud , Adrien B. Taylor , and Aymeric Dieuleveut

arXiv:2307.11291, 2023

Abs arXiv Bib

In this work, we show that the heavy-ball (HB) method provably does not reach an accelerated convergence rate on smooth strongly convex problems. More specifically, we show that for any condition number and any choice of algorithmic parameters, either the worst-case convergence rate of HB on the class of L-smooth and μ-strongly convex quadratic functions is not accelerated (that is, slower than), or there exists an L-smooth μ-strongly convex function and an initialization such that the method does not converge. To the best of our knowledge, this result closes a simple yet open question on one of the most used and iconic first-order optimization techniques. Our approach builds on finding functions for which HB fails to converge and instead cycles over finitely many iterates. We analytically describe all parametrizations of HB that exhibit this cycling behavior on a particular cycle shape, whose choice is supported by a systematic and constructive approach to the study of cycling behaviors of first-order methods. We show the robustness of our results to perturbations of the cycle, and extend them to classes of functions that also satisfy higher-order regularity conditions.
@article{goujaud2023provable, title = {Provable non-accelerations of the heavy-ball method}, author = {Goujaud, Baptiste and Taylor, Adrien B. and Dieuleveut, Aymeric}, journal = {arXiv:2307.11291}, year = {2023}, }
preprint
Optimal first-order methods for convex functions with a quadratic upper bound

Baptiste Goujaud , Adrien B. Taylor , and Aymeric Dieuleveut

arXiv:2205.15033, 2022

Abs arXiv Bib PDF Code

We analyze worst-case convergence guarantees of first-order optimization methods over a function class extending that of smooth and convex functions. This class contains convex functions that admit a simple quadratic upper bound. Its study is motivated by its stability under minor perturbations. We provide a thorough analysis of first-order methods, including worst-case convergence guarantees for several algorithms, and demonstrate that some of them achieve the optimal worst-case guarantee over the class. We support our analysis by numerical validation of worst-case guarantees using performance estimation problems. A few observations can be drawn from this analysis, particularly regarding the optimality (resp. adaptivity) of the heavy-ball method (resp. heavy-ball with line-search). Finally, we show how our analysis can be leveraged to obtain convergence guarantees over more complex classes of functions. Overall, this study brings insights on the choice of function classes over which standard first-order methods have working worst-case guarantees.
@article{goujad2022opt, title = {Optimal first-order methods for convex functions with a quadratic upper bound}, author = {Goujaud, Baptiste and Taylor, Adrien B. and Dieuleveut, Aymeric}, year = {2022}, journal = {arXiv:2205.15033}, }

2 - books

book
Towards principled and systematic approaches to the analysis and design of optimization algorithms

Adrien B. Taylor

PSL Research University, 2024

Habilitation à diriger des recherches

Abs Bib HTML

The goal of this thesis is to show how to derive in a completely automated way exact and global worst-case guarantees for first-order methods in convex optimization. To this end, we formulate a generic optimization problem looking for the worst-case scenarios. The worst-case computation problems, referred to as performance estimation problems (PEPs), are intrinsically infinite-dimensional optimization problems formulated over a given class of objective functions. To render those problems tractable, we develop a (smooth and non-smooth) convex interpolation framework, which provides necessary and sufficient conditions to interpolate our objective functions. With this idea, we transform PEPs into solvable finite-dimensional semidefinite programs, from which one obtains worst-case guarantees and worst-case functions, along with the corresponding explicit proofs. PEPs already proved themselves very useful as a tool for developing convergence analyses of first-order optimization methods. Among others, PEPs allow obtaining exact guarantees for gradient methods, along with their inexact, projected, proximal, conditional, decentralized and accelerated versions.
@article{taylor2024towards, title = {Towards principled and systematic approaches to the analysis and design of optimization algorithms}, author = {Taylor, Adrien B.}, journal = {PSL Research University}, year = {2024}, note = {Habilitation \`a diriger des recherches} }
book
Acceleration Methods

Alexandre d’Aspremont , Damien Scieur , and Adrien B. Taylor

Foundations and Trends in Optimization, 2021

Abs arXiv Bib HTML Code

This monograph covers some recent advances in a range of acceleration techniques frequently used in convex optimization. We first use quadratic optimization problems to introduce two key families of methods, namely momentum and nested optimization schemes. They coincide in the quadratic case to form the Chebyshev method. We discuss momentum methods in detail, starting with the seminal work of Nesterov [1] and structure convergence proofs using a few master templates, such as that for optimized gradient methods, which provide the key benefit of showing how momentum methods optimize convergence guarantees. We further cover proximal acceleration, at the heart of the Catalyst and Accelerated Hybrid Proximal Extragradient frameworks, using similar algorithmic patterns. Common acceleration techniques rely directly on the knowledge of some of the regularity parameters in the problem at hand. We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates while adapting to unobserved regularity parameters.
@article{daspremont2021acceleration, year = {2021}, volume = {5}, journal = {Foundations and Trends in Optimization}, title = {Acceleration Methods}, number = {1-2}, pages = {1-245}, author = {d’Aspremont, Alexandre and Scieur, Damien and Taylor, Adrien B.}, }

3 - journals

journal
Automated tight Lyapunov analysis for first-order methods

Manu Upadhyaya , Sebastian Banert , Adrien B. Taylor , and Pontus Giselsson

Mathematical Programming, 2025

Abs arXiv Bib Code

We present a methodology for establishing the existence of quadratic Lyapunov inequalities for a wide range of first-order methods used to solve convex optimization problems. In particular, we consider i) classes of optimization problems of finite-sum form with (possibly strongly) convex and possibly smooth functional components, ii) first-order methods that can be written as a linear system in state-space form in feedback interconnection with the subdifferentials of the functional components of the objective function, and iii) quadratic Lyapunov inequalities that can be used to draw convergence conclusions. We provide a necessary and sufficient condition for the existence of a quadratic Lyapunov inequality that amounts to solving a small-sized semidefinite program. We showcase our methodology on several first-order methods that fit the framework. Most notably, our methodology allows us to significantly extend the region of parameter choices that allow for duality gap convergence in the Chambolle-Pock method when the linear operator is the identity mapping.
@article{upadhyaya2023automated, title = {Automated tight Lyapunov analysis for first-order methods}, author = {Upadhyaya, Manu and Banert, Sebastian and Taylor, Adrien B. and Giselsson, Pontus}, journal = {Mathematical Programming}, year = {2025}, }
journal
PROXQP: an Efficient and Versatile Quadratic Programming Solver for Real-Time Robotics Applications and Beyond

Antoine Bambade , Fabian Schramm , Sarah El Kazdadi , Stéphane Caron , Adrien B. Taylor , and Justin Carpentier

Transactions on Robotics (to appear), 2025

Abs Bib HTML Code

Convex quadratic programming (QP) has become a core component in the modern engineering toolkit, particularly in robotics, where QP problems are legion, ranging from real-time whole-body controllers to planning and estimation algorithms. Many of those QPs need to be solved at high frequency. Meeting timing requirements requires taking advantage of as many structural properties as possible for the problem at hand. For instance, it is generally crucial to resort to warm-starting to exploit the resemblance of consecutive control iterations. While a large range of off-the-shelf QP solvers is available, only a few are suited to exploit problem structure and warm-starting capacities adequately. In this work, we propose the PROXQP algorithm, a new and efficient QP solver that exploits QP structures by leveraging primal-dual augmented Lagrangian techniques. For convex QPs, PROXQP features a global convergence guarantee to the closest feasible QP, an essential property for safe closed-loop control. We illustrate its practical performance on various standard robotic and control experiments, including a real-world closed-loop model predictive control application. While originally tailored for robotics applications, we show that PROXQP also performs at the level of state of the art on generic QP problems, making PROXQP suitable for use as an off-the-shelf solver for regular applications beyond robotics.
@article{bambade2023proxqp, title = {PROXQP: an Efficient and Versatile Quadratic Programming Solver for Real-Time Robotics Applications and Beyond}, author = {Bambade, Antoine and Schramm, Fabian and El Kazdadi, Sarah and Caron, St{\'e}phane and Taylor, Adrien B. and Carpentier, Justin}, year = {2025}, journal = {Transactions on Robotics (to appear)} }
journal
Quadratic minimization: from conjugate gradient to an adaptive Polyak’s momentum method with Polyak step-sizes

Baptiste Goujaud , Adrien B. Taylor , and Aymeric Dieuleveut

Open Journal of Mathematical Optimization (to appear), 2024

Abs arXiv Bib PDF Code

In this work, we propose an adaptive variation on the classical Heavy-ball method for convex quadratic minimization. The adaptivity crucially relies on so-called “Polyak step-sizes”, which consist in using the knowledge of the optimal value of the optimization problem at hand instead of problem parameters such as a few eigenvalues of the Hessian of the problem. This method happens to also be equivalent to a variation of the classical conjugate gradient method, and thereby inherits many of its attractive features, including its finite-time convergence, instance optimality, and its worst-case convergence rates. The classical gradient method with Polyak step-sizes is known to behave very well in situations in which it can be used, and the question of whether incorporating momentum in this method is possible and can improve the method itself appeared to be open. We provide a definitive answer to this question for minimizing convex quadratic functions, an arguably necessary first step for developing such methods in more general setups.
@article{goujaud2022quadratic, title = {Quadratic minimization: from conjugate gradient to an adaptive Polyak’s momentum method with Polyak step-sizes}, author = {Goujaud, Baptiste and Taylor, Adrien B. and Dieuleveut, Aymeric}, year = {2024}, journal = {Open Journal of Mathematical Optimization (to appear)}, }
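
The classical gradient method with Polyak step-sizes mentioned in this entry can be sketched as follows. This is the plain gradient method with the step gamma_k = (f(x_k) - f*) / ||grad f(x_k)||^2 (one common form of the Polyak step), not the adaptive momentum method proposed in the paper; the eigenvalues, starting point, and function name are assumptions chosen for illustration.

```python
def polyak_gradient_descent(eigs, x0, f_star=0.0, n_iters=500):
    """Gradient descent on f(x) = 0.5 * sum(eigs[i] * x[i]**2) with Polyak step-sizes.

    The step gamma_k = (f(x_k) - f_star) / ||grad f(x_k)||^2 uses the optimal
    value f_star instead of curvature constants.
    """
    x = list(x0)
    distances = [sum(xi * xi for xi in x) ** 0.5]  # distance to the minimizer 0
    for _ in range(n_iters):
        f_val = 0.5 * sum(a * xi * xi for a, xi in zip(eigs, x))
        grad = [a * xi for a, xi in zip(eigs, x)]
        grad_sq = sum(g * g for g in grad)
        if grad_sq == 0.0:  # already at the minimizer
            break
        gamma = (f_val - f_star) / grad_sq
        x = [xi - gamma * g for xi, g in zip(x, grad)]
        distances.append(sum(xi * xi for xi in x) ** 0.5)
    return x, distances


eigs = [1.0, 10.0]  # hypothetical Hessian eigenvalues
x_final, distances = polyak_gradient_descent(eigs, [1.0, 1.0])
f_final = 0.5 * sum(a * xi * xi for a, xi in zip(eigs, x_final))
```

Note that the routine never uses the eigenvalues to choose the step; it only needs function values, gradients, and the optimal value, and the distance to the minimizer does not increase from one iteration to the next.
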
journal
PEPit: computer-assisted worst-case analyses of first-order optimization methods in Python

Baptiste Goujaud , Céline Moucer , François Glineur , Julien M. Hendrickx , Adrien B. Taylor , and Aymeric Dieuleveut

Mathematical Programming Computation, 2024

Abs arXiv Bib Code

PEPit is a Python package aiming at simplifying the access to worst-case analyses of a large family of first-order optimization methods possibly involving gradient, projection, proximal, or linear optimization oracles, along with their approximate or Bregman variants. In short, PEPit is a package enabling computer-assisted worst-case analyses of first-order optimization methods. The key underlying idea is to cast the problem of performing a worst-case analysis, often referred to as a performance estimation problem (PEP), as a semidefinite program (SDP) which can be solved numerically. For doing that, the package users are only required to write first-order methods nearly as they would have implemented them. The package then takes care of the SDP modelling parts, and the worst-case analysis is performed numerically via a standard solver.
@article{goujaud2022pepit, title = {PEPit: computer-assisted worst-case analyses of first-order optimization methods in Python}, author = {Goujaud, Baptiste and Moucer, C{\'e}line and Glineur, Fran{\c{c}}ois and Hendrickx, Julien M. and Taylor, Adrien B. and Dieuleveut, Aymeric}, year = {2024}, journal = {Mathematical Programming Computation} }
journal
Nonlinear conjugate gradient methods: worst-case convergence rates via computer-assisted analyses

Shuvomoy Das Gupta , Robert M. Freund , X. Andy Sun , and Adrien B. Taylor

Mathematical Programming, 2024

Abs arXiv Bib Code

We propose a computer-assisted approach to the analysis of the worst-case convergence of nonlinear conjugate gradient methods (NCGMs). Those methods are known for their generally good empirical performances for large-scale optimization, while having relatively incomplete analyses. Using our computer-assisted approach, we establish novel complexity bounds for the Polak-Ribière-Polyak (PRP) and the Fletcher-Reeves (FR) NCGMs for smooth strongly convex minimization. Conversely, we provide examples showing that those methods might behave worse than the regular steepest descent on the same class of problems.
@article{gupta2023nonlinear, title = {Nonlinear conjugate gradient methods: worst-case convergence rates via computer-assisted analyses}, author = {Das Gupta, Shuvomoy and Freund, Robert M. and Sun, X. Andy and Taylor, Adrien B.}, year = {2024}, journal = {Mathematical Programming} }
journal
Counter-examples in first-order optimization: a constructive approach

Baptiste Goujaud , Aymeric Dieuleveut , and Adrien B. Taylor

IEEE Control Systems Letters, 2023

Abs arXiv Bib HTML PDF Code

While many approaches were developed for obtaining worst-case complexity bounds for first-order optimization methods in the last years, there remain theoretical gaps in cases where no such bound can be found. In such cases, it is often unclear whether no such bound exists (e.g., because the algorithm might fail to systematically converge) or simply if the current techniques do not allow finding them. In this work, we propose an approach to automate the search for cyclic trajectories generated by first-order methods. This provides a constructive approach to show that no appropriate complexity bound exists, thereby complementing the approaches providing sufficient conditions for convergence. Using this tool, we provide ranges of parameters for which some of the famous heavy-ball, Nesterov accelerated gradient, inexact gradient descent, and three-operator splitting algorithms fail to systematically converge, and show that it nicely complements existing tools searching for Lyapunov functions.
@article{goujaud2023counter, title = {Counter-examples in first-order optimization: a constructive approach}, author = {Goujaud, Baptiste and Dieuleveut, Aymeric and Taylor, Adrien B.}, journal = {IEEE Control Systems Letters}, year = {2023}, }
journal
A systematic approach to Lyapunov analyses of continuous-time models in convex optimization

Céline Moucer , Adrien B. Taylor , and Francis Bach

SIAM Journal on Optimization, 2023

Abs arXiv Bib PDF Code

First-order methods are often analyzed via their continuous-time models, where their worst-case convergence properties are usually approached via Lyapunov functions. In this work, we provide a systematic and principled approach to find and verify Lyapunov functions for classes of ordinary and stochastic differential equations. More precisely, we extend the performance estimation framework, originally proposed by Drori and Teboulle [10], to continuous-time models. We retrieve convergence results comparable to those of discrete methods using fewer assumptions and convexity inequalities, and provide new results for stochastic accelerated gradient flows.
@article{moucer2022systematic, title = {A systematic approach to Lyapunov analyses of continuous-time models in convex optimization}, author = {Moucer, Céline and Taylor, Adrien B. and Bach, Francis}, year = {2023}, journal = {SIAM Journal on Optimization}, }
journal
An optimal gradient method for smooth strongly convex minimization

Adrien B. Taylor , and Yoel Drori

Mathematical Programming, 2023

Abs arXiv Bib HTML Code

We present an optimal gradient method for smooth strongly convex optimization. The method is optimal in the sense that its worst-case bound on the distance to an optimal point exactly matches the lower bound on the oracle complexity for the class of problems, meaning that no black-box first-order method can have a better worst-case guarantee without further assumptions on the class of problems at hand. In addition, we provide a constructive recipe for obtaining the algorithmic parameters of the method and illustrate that it can be used for deriving methods for other optimality criteria as well.
@article{drori2023optimal, title = {An optimal gradient method for smooth strongly convex minimization}, author = {Taylor, Adrien B. and Drori, Yoel}, journal = {Mathematical Programming}, volume = {199}, number = {1-2}, pages = {557--594}, year = {2023}, publisher = {Springer}, }
journal
Principled Analyses and Design of First-Order Methods with Inexact Proximal Operators

Mathieu Barré , Adrien B. Taylor , and Francis Bach

Mathematical Programming, 2023

Abs arXiv Bib Code

Proximal operations are among the most common primitives appearing in both practical and theoretical (or high-level) optimization methods. This basic operation typically consists in solving an intermediary (hopefully simpler) optimization problem. In this work, we survey notions of inaccuracies that can be used when solving those intermediary optimization problems. Then, we show that worst-case guarantees for algorithms relying on such inexact proximal operations can be systematically obtained through a generic procedure based on semidefinite programming. This methodology is primarily based on the approach introduced by Drori and Teboulle (2014) and on convex interpolation results, and allows producing non-improvable worst-case analyses. In other words, for a given algorithm, the methodology generates both worst-case certificates (i.e., proofs) and problem instances on which those bounds are achieved. Relying on this methodology, we study numerical worst-case performances of a few basic methods relying on inexact proximal operations including accelerated variants, and design a variant with optimized worst-case behaviour. We further illustrate how to extend the approach to support strongly convex objectives by studying a simple relatively inexact proximal minimization method.
@article{barre2023principled, title = {Principled Analyses and Design of First-Order Methods with Inexact Proximal Operators}, author = {Barré, Mathieu and Taylor, Adrien B. and Bach, Francis}, journal = {Mathematical Programming}, year = {2023}, volume = {201}, number = {1-2}, pages = {185--230}, }
journal
On the oracle complexity of smooth strongly convex minimization

Yoel Drori , and Adrien B. Taylor

Journal of Complexity, 2022

Abs arXiv Bib HTML

We construct a family of functions suitable for establishing lower bounds on the oracle complexity of first-order minimization of smooth strongly-convex functions. Based on this construction, we derive new lower bounds on the complexity of strongly-convex minimization under various inaccuracy criteria. The new bounds match the known upper bounds up to a constant factor, and when the inaccuracy of a solution is measured by its distance to the solution set, the new lower bound exactly matches the upper bound obtained by the recent Information-Theoretic Exact Method by the same authors, thereby establishing the exact oracle complexity for this class of problems.
@article{drori2022oracle, title = {On the oracle complexity of smooth strongly convex minimization}, journal = {Journal of Complexity}, volume = {68}, pages = {101590}, year = {2022}, author = {Drori, Yoel and Taylor, Adrien B.}, }
journal
A note on approximate accelerated forward-backward methods with absolute and relative errors, and possibly strongly convex objectives

Mathieu Barré , Adrien B. Taylor , and Francis Bach

Open Journal of Mathematical Optimization, 2022

Abs arXiv Bib PDF Code

In this short note, we provide a simple version of an accelerated forward-backward method (a.k.a. Nesterov’s accelerated proximal gradient method) possibly relying on approximate proximal operators and able to exploit strong convexity of the objective function. The method supports both relative and absolute errors, and its behavior is illustrated on a set of standard numerical experiments. Using the same developments, we further provide a version of the accelerated proximal hybrid extragradient method of Monteiro and Svaiter (2013) possibly exploiting strong convexity of the objective function.
@article{barre2020note, author = {Barré, Mathieu and Taylor, Adrien B. and Bach, Francis}, title = {A note on approximate accelerated forward-backward methods with absolute and relative errors, and possibly strongly convex objectives}, journal = {Open Journal of Mathematical Optimization}, volume = {3}, year = {2022}, }
journal
Convergence of a Constrained Vector Extrapolation Scheme

Mathieu Barré , Adrien B. Taylor , and Alexandre d’Aspremont

SIAM Journal on Mathematics of Data Science, 2022

Abs arXiv Bib HTML

We prove non-asymptotic linear convergence rates for the constrained Anderson acceleration extrapolation scheme. These guarantees come from new upper bounds on the constrained Chebyshev problem, which consists in minimizing the maximum absolute value of a polynomial on a bounded real interval with ℓ1 constraints on its coefficient vector. Constrained Anderson acceleration has a numerical cost comparable to that of the original scheme.
@article{Barr20, author = {Barré, Mathieu and Taylor, Adrien B. and d’Aspremont, Alexandre}, journal = {SIAM Journal on Mathematics of Data Science}, volume = {4}, number = {3}, pages = {979-1002}, year = {2022}, publisher = {SIAM}, title = {Convergence of a {C}onstrained {V}ector {E}xtrapolation {S}cheme}, }
journal
Optimal complexity and certification of Bregman first-order methods

Radu-Alexandru Dragomir , Adrien B. Taylor , Alexandre d’Aspremont , and Jérôme Bolte

Mathematical Programming, 2022

Abs arXiv Bib HTML Code

We provide a lower bound showing that the O(1/k) convergence rate of the NoLips method (a.k.a. Bregman Gradient) is optimal for the class of functions satisfying the h-smoothness assumption. This assumption, also known as relative smoothness, appeared in the recent developments around the Bregman Gradient method, where acceleration remained an open issue. On the way, we show how to constructively obtain the corresponding worst-case functions by extending the computer-assisted performance estimation framework of Drori and Teboulle (Mathematical Programming, 2014) to Bregman first-order methods, and to handle the classes of differentiable and strictly convex functions.
@article{dragomir2021optimal, title = {Optimal complexity and certification of Bregman first-order methods}, author = {Dragomir, Radu-Alexandru and Taylor, Adrien B. and d’Aspremont, Alexandre and Bolte, J{\'e}r{\^o}me}, journal = {Mathematical Programming}, pages = {1--43}, year = {2022}, publisher = {Springer}, }
journal
Efficient first-order methods for convex minimization: a constructive approach

Yoel Drori , and Adrien B. Taylor

Mathematical Programming, 2020

Abs arXiv Bib HTML Code

We describe a novel constructive technique for devising efficient first-order methods for a wide range of large-scale convex minimization settings, including smooth, non-smooth, and strongly convex minimization. The technique builds upon a certain variant of the conjugate gradient method to construct a family of methods such that a) all methods in the family share the same worst-case guarantee as the base conjugate gradient method, and b) the family includes a fixed-step first-order method. We demonstrate the effectiveness of the approach by deriving optimal methods for the smooth and non-smooth cases, including new methods that forego knowledge of the problem parameters at the cost of a one-dimensional line search per iteration, and a universal method for the union of these classes that requires a three-dimensional search per iteration. In the strongly convex case, we show how numerical tools can be used to perform the construction, and show that the resulting method offers an improved worst-case bound compared to Nesterov’s celebrated fast gradient method.
@article{drori2019efficient, title = {Efficient first-order methods for convex minimization: a constructive approach}, author = {Drori, Yoel and Taylor, Adrien B.}, journal = {Mathematical Programming}, volume = {184}, number = {1}, pages = {183--220}, year = {2020}, publisher = {Springer}, }
journal
Operator splitting performance estimation: Tight contraction factors and optimal parameter selection

Ernest K. Ryu , Adrien B. Taylor , Carolina Bergeling , and Pontus Giselsson

SIAM Journal on Optimization, 2020

Abs arXiv Bib HTML Code

We propose a methodology for studying the performance of common splitting methods through semidefinite programming. We prove tightness of the methodology and demonstrate its value by presenting two applications of it. First, we use the methodology as a tool for computer-assisted proofs to prove tight analytical contraction factors for Douglas–Rachford splitting that are likely too complicated for a human to find bare-handed. Second, we use the methodology as an algorithmic tool to computationally select the optimal splitting method parameters by solving a series of semidefinite programs.
@article{ryu2020operator, title = {Operator splitting performance estimation: Tight contraction factors and optimal parameter selection}, author = {Ryu, Ernest K. and Taylor, Adrien B. and Bergeling, Carolina and Giselsson, Pontus}, journal = {SIAM Journal on Optimization}, volume = {30}, number = {3}, pages = {2251--2271}, year = {2020}, publisher = {SIAM}, }
journal
Worst-case convergence analysis of inexact gradient and Newton methods through semidefinite programming performance estimation

Etienne De Klerk , François Glineur , and Adrien B. Taylor

SIAM Journal on Optimization, 2020

Abs arXiv Bib HTML Code

We provide new tools for worst-case performance analysis of the gradient (or steepest descent) method of Cauchy for smooth strongly convex functions, and Newton’s method for self-concordant functions, including the case of inexact search directions. The analysis uses semidefinite programming performance estimation, as pioneered by Drori and Teboulle [Mathematical Programming, 145(1-2):451–482, 2014], and extends recent performance estimation results for the method of Cauchy by the authors [Optimization Letters, 11(7), 1185-1199, 2017]. To illustrate the applicability of the tools, we demonstrate a novel complexity analysis of short step interior point methods using inexact search directions. As an example in this framework, we sketch how to give a rigorous worst-case complexity analysis of a recent interior point method by Abernethy and Hazan [PMLR, 48:2520–2528, 2016].
@article{de2020worst, title = {Worst-case convergence analysis of inexact gradient and Newton methods through semidefinite programming performance estimation}, author = {De Klerk, Etienne and Glineur, Fran\c{c}ois and Taylor, Adrien B.}, journal = {SIAM Journal on Optimization}, volume = {30}, number = {3}, pages = {2053--2082}, year = {2020}, publisher = {SIAM}, }
journal

Exact worst-case convergence rates of the proximal gradient method for composite convex minimization

Adrien B. Taylor , Julien M. Hendrickx , and François Glineur

Journal of Optimization Theory and Applications, 2018

Abs arXiv PDF Code

We study the worst-case convergence rates of the proximal gradient method for minimizing the sum of a smooth strongly convex function and a non-smooth convex function whose proximal operator is available. We establish the exact worst-case convergence rates of the proximal gradient method in this setting for any step size and for different standard performance measures: objective function accuracy, distance to optimality and residual gradient norm. The proof methodology relies on recent developments in performance estimation of first-order methods based on semidefinite programming. In the case of the proximal gradient method, this methodology allows obtaining exact and non-asymptotic worst-case guarantees that are conceptually very simple, although apparently new. On the way, we discuss how strong convexity can be replaced by weaker assumptions, while preserving the corresponding convergence rates. We also establish that the same fixed step size policy is optimal for all three performance measures. Finally, we extend recent results on the worst-case behavior of gradient descent with exact line search to the proximal case.
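To make the setting of the abstract above concrete, here is a minimal sketch of the proximal gradient method on a toy separable problem. The test problem, the names `soft_threshold`, `proximal_gradient` and `distance`, and the step size 2/(L+μ) are illustrative assumptions and are not taken from the paper. The sketch only checks the standard fact that, with this step size, the distance to the minimizer contracts by at least (L−μ)/(L+μ) per iteration.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * |.| evaluated at the scalar v."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def proximal_gradient(a, b, lam, gamma, x0, n_iter):
    """Minimize 0.5 * sum(a_i * (x_i - b_i)^2) + lam * ||x||_1.

    Returns the list of all iterates, starting with x0.
    """
    x = list(x0)
    iterates = [list(x)]
    for _ in range(n_iter):
        # Gradient step on the smooth, strongly convex part.
        y = [xi - gamma * ai * (xi - bi) for xi, ai, bi in zip(x, a, b)]
        # Proximal step on the non-smooth part (soft-thresholding).
        x = [soft_threshold(yi, gamma * lam) for yi in y]
        iterates.append(list(x))
    return iterates


def distance(x, y):
    """Euclidean distance between two points given as lists."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


# Hypothetical toy instance: curvatures a_i lie in [mu, L].
a = [1.0, 4.0, 10.0]
b = [3.0, -0.1, 0.5]
lam = 1.0
mu, L = min(a), max(a)
gamma = 2.0 / (L + mu)      # assumed fixed step size
rho = (L - mu) / (L + mu)   # contraction factor for this step size

# The problem is separable, so the minimizer is available in closed form.
x_star = [soft_threshold(bi, lam / ai) for ai, bi in zip(a, b)]

iterates = proximal_gradient(a, b, lam, gamma, [0.0, 0.0, 0.0], 100)
dists = [distance(x, x_star) for x in iterates]
```

The separable objective is chosen only so that the minimizer can be written down exactly and the contraction can be checked numerically.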
journal

Exact worst-case performance of first-order methods for composite convex optimization

Adrien B. Taylor , Julien M. Hendrickx , and François Glineur

SIAM Journal on Optimization, 2017

Abs arXiv HTML Code

We provide a framework for computing the exact worst-case performance of any algorithm belonging to a broad class of oracle-based first-order methods for composite convex optimization, including those performing explicit, projected, proximal, conditional and inexact (sub)gradient steps. We simultaneously obtain tight worst-case guarantees and explicit instances of optimization problems on which the algorithm reaches this worst-case. We achieve this by reducing the computation of the worst-case to solving a convex semidefinite program, generalizing previous works on performance estimation by Drori and Teboulle [13] and the authors [43]. We use these developments to obtain a tighter analysis of the proximal point algorithm and of several variants of fast proximal gradient, conditional gradient, subgradient and alternating projection methods. In particular, we present a new analytical worst-case guarantee for the proximal point algorithm that is twice better than previously known, and improve the standard worst-case guarantee for the conditional gradient method by more than a factor of two. We also show how the optimized gradient method proposed by Kim and Fessler in [22] can be extended by incorporating a projection or a proximal operator, which leads to an algorithm that converges in the worst-case twice as fast as the standard accelerated proximal gradient method [2].
journal

On the worst-case complexity of the gradient method with exact line search for smooth strongly convex functions [Best paper award]

Etienne De Klerk , François Glineur , and Adrien B. Taylor

Optimization Letters, 2017

Abs HTML PDF

We consider the gradient (or steepest) descent method with exact line search applied to a strongly convex function with Lipschitz continuous gradient. We establish the exact worst-case rate of convergence of this scheme, and show that this worst-case behavior is exhibited by a certain convex quadratic function. We also give the tight worst-case complexity bound for a noisy variant of the gradient descent method, where exact line search is performed in a search direction that differs from the negative gradient by at most a prescribed relative tolerance. The proofs are computer-assisted, and rely on solving semidefinite programming performance estimation problems as introduced in the paper (Drori and Teboulle, Math Progr 145(1–2):451–482, 2014).
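As an illustration of the kind of quadratic instance mentioned in the abstract above, the following sketch runs gradient descent with exact line search on a two-dimensional quadratic with curvatures μ and L. The starting point (1/μ, 1/L) and the rate ((L−μ)/(L+μ))² are the classical textbook example, assumed here for illustration rather than quoted from the paper; the function names are hypothetical.

```python
def quadratic_value(x, diag):
    """f(x) = 0.5 * sum(d_i * x_i^2), minimized at 0 with f* = 0."""
    return 0.5 * sum(d * xi * xi for d, xi in zip(diag, x))


def exact_line_search_step(x, diag):
    """One gradient step with exact line search on the diagonal quadratic."""
    g = [d * xi for d, xi in zip(diag, x)]
    g_dot_g = sum(gi * gi for gi in g)
    g_A_g = sum(d * gi * gi for d, gi in zip(diag, g))
    step = g_dot_g / g_A_g  # exact minimizer of f along -g for a quadratic
    return [xi - step * gi for xi, gi in zip(x, g)]


mu, L = 1.0, 10.0
diag = [mu, L]
x = [1.0 / mu, 1.0 / L]  # classical hard starting point (assumption)

ratios = []
for _ in range(5):
    x_next = exact_line_search_step(x, diag)
    ratios.append(quadratic_value(x_next, diag) / quadratic_value(x, diag))
    x = x_next

expected_rate = ((L - mu) / (L + mu)) ** 2
```

On this instance every ratio in `ratios` matches `expected_rate` up to rounding, so the objective gap shrinks by the same factor at each iteration.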
journal

Smooth strongly convex interpolation and exact worst-case performance of first-order methods

Adrien B. Taylor , Julien M. Hendrickx , and François Glineur

Mathematical Programming, 2017

Abs arXiv HTML Code

We show that the exact worst-case performance of fixed-step first-order methods for unconstrained optimization of smooth (possibly strongly) convex functions can be obtained by solving convex programs. Finding the worst-case performance of a black-box first-order method is formulated as an optimization problem over a set of smooth (strongly) convex functions and initial conditions. We develop closed-form necessary and sufficient conditions for smooth (strongly) convex interpolation, which provide a finite representation for those functions. This allows us to reformulate the worst-case performance estimation problem as an equivalent finite dimension-independent semidefinite optimization problem, whose exact solution can be recovered up to numerical precision. Optimal solutions to this performance estimation problem provide both worst-case performance bounds and explicit functions matching them, as our smooth (strongly) convex interpolation procedure is constructive. Our work builds on that of Drori and Teboulle [Math. Prog. 145 (1-2), 2014], who introduced and solved relaxations of the performance estimation problem for smooth convex functions. We apply our approach to different fixed-step first-order methods with several performance criteria, including objective function accuracy and gradient norm. We conjecture several numerically supported worst-case bounds on the performance of the fixed-step gradient, fast gradient and optimized gradient methods, both in the smooth convex and the smooth strongly convex cases, and deduce tight estimates of the optimal step size for the gradient method.

4 - conference

conference
The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training

Fabian Schaipp , Alexander Hägele , Adrien B. Taylor , Umut Simsekli , and Francis Bach

2025

Abs arXiv Bib Code

We show that learning-rate schedules for large model training behave surprisingly similarly to a performance bound from non-smooth convex optimization theory. We provide a bound for the constant schedule with linear cooldown; in particular, the practical benefit of cooldown is reflected in the bound due to the absence of logarithmic terms. Further, we show that this surprisingly close match between optimization theory and practice can be exploited for learning-rate tuning: we achieve noticeable improvements for training 124M and 210M Llama-type models by (i) extending the schedule for continued training with optimal learning-rate, and (ii) transferring the optimal learning-rate across schedules.
@inproceedings{schaipp2025surprising, title = {The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training}, author = {Schaipp, Fabian and H{\"a}gele, Alexander and Taylor, Adrien B. and Simsekli, Umut and Bach, Francis}, booktitle = {International Conference on Machine Learning (ICML)}, year = {2025} }
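For readers unfamiliar with the "constant schedule with linear cooldown" mentioned in the abstract of the entry above, a minimal sketch of such a schedule could look as follows. The function name, the `cooldown_fraction` parameter and its default value are hypothetical choices made for illustration; they are not taken from the paper or its code.

```python
def constant_then_linear_cooldown(step, total_steps, base_lr, cooldown_fraction=0.2):
    """Learning rate at `step` for a constant schedule with linear cooldown.

    The rate stays at `base_lr` and then decays linearly to 0 over the
    last `cooldown_fraction` of training (hypothetical parametrization).
    """
    if total_steps < 1 or not 0 <= step <= total_steps:
        raise ValueError("step must lie in [0, total_steps]")
    if not 0.0 < cooldown_fraction <= 1.0:
        raise ValueError("cooldown_fraction must lie in (0, 1]")
    cooldown_start = int(round((1.0 - cooldown_fraction) * total_steps))
    # Keep at least one cooldown step so the division below is safe.
    cooldown_start = min(cooldown_start, total_steps - 1)
    if step < cooldown_start:
        return base_lr
    return base_lr * (total_steps - step) / (total_steps - cooldown_start)


# Example: 100 steps, cooldown over the last 20 steps.
schedule = [constant_then_linear_cooldown(t, 100, 1e-3) for t in range(101)]
```

In this parametrization the rate is still `base_lr` at the first cooldown step and reaches exactly zero at `total_steps`.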

conference

Solving generic parametric linear matrix inequalities

Simone Naldi , Mohab Safey El Din , Adrien B. Taylor , and Weijia Wang

In International Symposium on Symbolic and Algebraic Computation (ISSAC) , 2025

Abs arXiv Bib Code

@inproceedings{naldi2025solving,
  title = {Solving generic parametric linear matrix inequalities},
  author = {Naldi, Simone and Safey El Din, Mohab and Taylor, Adrien B. and Wang, Weijia},
  booktitle = {International Symposium on Symbolic and Algebraic Computation (ISSAC)},
  year = {2025}
}

conference
QPLayer: efficient differentiation of convex quadratic optimization

Antoine Bambade , Fabian Schramm , Adrien B. Taylor , and Justin Carpentier

In International Conference on Learning Representations (ICLR, to appear) , 2024

Abs arXiv Bib PDF Code

Optimization layers within neural network architectures have become increasingly popular for their ability to solve a wide range of machine learning tasks and to model domain-specific knowledge. However, designing optimization layers requires careful consideration as the underlying optimization problems might be infeasible during training. Motivated by applications in learning, control, and robotics, this work focuses on convex quadratic programming (QP) layers. The specific structure of this type of optimization layer can be efficiently exploited for faster computations while still allowing rich modeling capabilities. We leverage primal-dual augmented Lagrangian techniques for computing derivatives of both feasible and infeasible QPs. Not requiring feasibility allows, as a byproduct, for more flexibility in the QP to be learned. The effectiveness of our approach is demonstrated in a few standard learning experiments, obtaining three to ten times faster computations than alternative state-of-the-art methods while being more accurate and numerically robust. Along with these contributions, we provide an open-source C++ software package called QPLayer for efficiently differentiating convex QPs and which can be interfaced with modern learning frameworks.
@inproceedings{bambade2023qplayer, title = {QPLayer: efficient differentiation of convex quadratic optimization}, author = {Bambade, Antoine and Schramm, Fabian and Taylor, Adrien B. and Carpentier, Justin}, year = {2024}, booktitle = {International Conference on Learning Representations (ICLR, to appear)}, }
conference
On Fundamental Proof Structures in First-Order Optimization

Baptiste Goujaud , Aymeric Dieuleveut , and Adrien B. Taylor

In Conference on Decision and Control (CDC) , 2023

Abs arXiv Bib PDF

First-order optimization methods have attracted a lot of attention due to their practical success in many applications, including in machine learning. Obtaining convergence guarantees and worst-case performance certificates for first-order methods has become crucial for understanding ingredients underlying efficient methods and for developing new ones. However, obtaining, verifying, and proving such guarantees is often a tedious task. Therefore, a few approaches were proposed for rendering this task more systematic, and even partially automated. In addition to helping researchers find convergence proofs, these tools provide insights on the general structures of such proofs. We aim at presenting those structures, showing how to build convergence guarantees for first-order optimization methods.
@inproceedings{goujaud2023fundamental, title = {On Fundamental Proof Structures in First-Order Optimization}, author = {Goujaud, Baptiste and Dieuleveut, Aymeric and Taylor, Adrien B.}, booktitle = {Conference on Decision and Control (CDC)}, pages = {3023--3030}, year = {2023}, organization = {IEEE}, }
conference
Convergence of Proximal Point and Extragradient-Based Methods Beyond Monotonicity: the Case of Negative Comonotonicity

Eduard Gorbunov , Adrien B. Taylor , Samuel Horvath , and Gauthier Gidel

In International Conference on Machine Learning (ICML) , 2023

Abs arXiv Bib PDF Code

Algorithms for min-max optimization and variational inequalities are often studied under monotonicity assumptions. Motivated by non-monotone machine learning applications, we follow the line of works [Diakonikolas et al., 2021, Lee and Kim, 2021, Pethick et al., 2022, Böhm, 2022] aiming at going beyond monotonicity by considering the weaker negative comonotonicity assumption. In particular, we provide tight complexity analyses for the Proximal Point, Extragradient, and Optimistic Gradient methods in this setup, closing some questions on their working guarantees beyond monotonicity.
@inproceedings{gorbunov2023convergence, title = {Convergence of Proximal Point and Extragradient-Based Methods Beyond Monotonicity: the Case of Negative Comonotonicity}, author = {Gorbunov, Eduard and Taylor, Adrien B. and Horvath, Samuel and Gidel, Gauthier}, year = {2023}, booktitle = {International Conference on Machine Learning (ICML)}, }
conference
Last-Iterate Convergence of Optimistic Gradient Method for Monotone Variational Inequalities

Eduard Gorbunov , Adrien B. Taylor , and Gauthier Gidel

In Advances in Neural Information Processing Systems (NeurIPS) , 2022

Abs arXiv Bib PDF Code

The Past Extragradient (PEG) [Popov, 1980] method, also known as the Optimistic Gradient method, has recently gained interest in the optimization community with the emergence of variational inequality formulations for machine learning. Recently, in the unconstrained case, Golowich et al. [2020] proved that an O(1/N) last-iterate convergence rate in terms of the squared norm of the operator can be achieved for Lipschitz and monotone operators with a Lipschitz Jacobian. In this work, by introducing a novel analysis through potential functions, we show that (i) this O(1/N) last-iterate convergence can be achieved without any assumption on the Jacobian of the operator, and (ii) it can be extended to the constrained case, which was not derived before even under Lipschitzness of the Jacobian. The proof is significantly different from the one known from Golowich et al. [2020], and its discovery was computer-aided. Those results close the open question of the last iterate convergence of PEG for monotone variational inequalities.
@inproceedings{gorbunov2022last, title = {Last-Iterate Convergence of Optimistic Gradient Method for Monotone Variational Inequalities}, author = {Gorbunov, Eduard and Taylor, Adrien B. and Gidel, Gauthier}, year = {2022}, booktitle = {Advances in Neural Information Processing Systems (NeurIPS)}, }
conference
Fast Stochastic Composite Minimization and an Accelerated Frank-Wolfe Algorithm under Parallelization

Benjamin Dubois-Taine , Francis Bach , Quentin Berthet , and Adrien B. Taylor

In Advances in Neural Information Processing Systems (NeurIPS) , 2022

Abs arXiv Bib PDF Code

We consider the problem of minimizing the sum of two convex functions. One of those functions has Lipschitz-continuous gradients, and can be accessed via stochastic oracles, whereas the other is "simple". We provide a Bregman-type algorithm with accelerated convergence in function values to a ball containing the minimum. The radius of this ball depends on problem-dependent constants, including the variance of the stochastic oracle. We further show that this algorithmic setup naturally leads to a variant of Frank-Wolfe achieving acceleration under parallelization. More precisely, when minimizing a smooth convex function on a bounded domain, we show that one can achieve an ε primal-dual gap (in expectation) in \tilde{O}(1/\sqrt{ε}) iterations, by only accessing gradients of the original function and a linear maximization oracle with O(1/\sqrt{ε}) computing units in parallel. We illustrate this fast convergence on synthetic numerical experiments.
@inproceedings{dubois2022fast, title = {Fast Stochastic Composite Minimization and an Accelerated Frank-Wolfe Algorithm under Parallelization}, author = {Dubois-Taine, Benjamin and Bach, Francis and Berthet, Quentin and Taylor, Adrien B.}, year = {2022}, booktitle = {Advances in Neural Information Processing Systems (NeurIPS)}, }
conference
PROX-QP: Yet another Quadratic Programming Solver for Robotics and beyond

Antoine Bambade , Sarah El Kazdadi , Adrien B. Taylor , and Justin Carpentier

In Robotics: Science and systems (RSS 2022) , 2022

Abs arXiv Bib PDF Code

Quadratic programming (QP) has become a core modelling component in the modern engineering toolkit. This is particularly true for simulation, planning and control in robotics. Yet, modern numerical solvers have not reached the level of efficiency and reliability required in practical applications where speed, robustness, and accuracy are all necessary. In this work, we introduce a few variations of the well-established augmented Lagrangian method, specifically for solving QPs, which include heuristics for improving practical numerical performances. Those variants are embedded within an open-source software which includes an efficient C++ implementation, a modular API, as well as best-performing heuristics for our test-bed. Relying on this framework, we present a benchmark studying the practical performances of modern optimization solvers for convex QPs on generic and complex problems of the literature as well as on common robotic scenarios. This benchmark notably highlights that this approach outperforms modern solvers in terms of efficiency, accuracy and robustness for small to medium-sized problems, while remaining competitive for higher dimensions.
@inproceedings{bambade2022prox, title = {{PROX-QP: Yet another Quadratic Programming Solver for Robotics and beyond}}, author = {Bambade, Antoine and El Kazdadi, Sarah and Taylor, Adrien B. and Carpentier, Justin}, booktitle = {{Robotics: Science and systems (RSS 2022)}}, year = {2022}, }
conference
Super-Acceleration with Cyclical Step-sizes

Baptiste Goujaud , Damien Scieur , Aymeric Dieuleveut , Adrien B. Taylor , and Fabian Pedregosa

In International Conference on Artificial Intelligence and Statistics (AISTATS) , 2022

Abs arXiv Bib Code

We develop a convergence-rate analysis of momentum with cyclical step-sizes. We show that under some assumption on the spectral gap of Hessians in machine learning, cyclical step-sizes are provably faster than constant step-sizes. More precisely, we develop a convergence rate analysis for quadratic objectives that provides optimal parameters and shows that cyclical learning rates can improve upon traditional lower complexity bounds. We further propose a systematic approach to design optimal first-order methods for quadratic minimization with a given spectral structure. Finally, we provide a local convergence rate analysis beyond quadratic minimization for the proposed methods and illustrate our findings through benchmarks on least squares and logistic regression problems.
@inproceedings{goujaud2022super, title = {Super-Acceleration with Cyclical Step-sizes}, author = {Goujaud, Baptiste and Scieur, Damien and Dieuleveut, Aymeric and Taylor, Adrien B. and Pedregosa, Fabian}, booktitle = {International Conference on Artificial Intelligence and Statistics (AISTATS)}, year = {2022}, }
conference
A Continuized View on Nesterov Acceleration for Stochastic Gradient Descent and Randomized Gossip [Outstanding paper award]

Mathieu Even , Raphaël Berthier , Francis Bach , Nicolas Flammarion , Pierre Gaillard , Hadrien Hendrikx , Laurent Massoulié , and Adrien B. Taylor

In Advances in Neural Information Processing Systems (NeurIPS) , 2021

Abs arXiv Bib PDF

We introduce the "continuized" Nesterov acceleration, a close variant of Nesterov acceleration whose variables are indexed by a continuous time parameter. The two variables continuously mix following a linear ordinary differential equation and take gradient steps at random times. This continuized variant benefits from the best of the continuous and the discrete frameworks: as a continuous process, one can use differential calculus to analyze convergence and obtain analytical expressions for the parameters; and a discretization of the continuized process can be computed exactly with convergence rates similar to those of Nesterov's original acceleration. We show that the discretization has the same structure as Nesterov acceleration, but with random parameters. We provide continuized Nesterov acceleration under deterministic as well as stochastic gradients, with either additive or multiplicative noise. Finally, using our continuized framework and expressing the gossip averaging problem as the stochastic minimization of a certain energy function, we provide the first rigorous acceleration of asynchronous gossip algorithms.
@inproceedings{even2021continuized, title = {A Continuized View on Nesterov Acceleration for Stochastic Gradient Descent and Randomized Gossip}, author = {Even, Mathieu and Berthier, Rapha{\"e}l and Bach, Francis and Flammarion, Nicolas and Gaillard, Pierre and Hendrikx, Hadrien and Massoulié, Laurent and Taylor, Adrien B.}, booktitle = {Advances in Neural Information Processing Systems (NeurIPS)}, year = {2021}, }
conference
Complexity guarantees for Polyak steps with momentum

Mathieu Barré , Adrien B. Taylor , and Alexandre d’Aspremont

In Conference on Learning Theory (COLT) , 2020

Abs arXiv Bib PDF Code

In smooth strongly convex optimization, knowledge of the strong convexity parameter is critical for obtaining simple methods with accelerated rates. In this work, we study a class of methods, based on Polyak steps, where this knowledge is substituted by that of the optimal value, 𝑓∗. We first show slightly improved convergence bounds, compared to those previously known, for the classical case of simple gradient descent with Polyak steps. We then derive an accelerated gradient method with Polyak steps and momentum, along with convergence guarantees.
@inproceedings{barre2020complexity, title = {Complexity guarantees for {P}olyak steps with momentum}, author = {Barré, Mathieu and Taylor, Adrien B. and d’Aspremont, Alexandre}, booktitle = {Conference on Learning Theory (COLT)}, pages = {452--478}, year = {2020}, }
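As a rough illustration of the idea behind the entry above (replacing knowledge of problem constants by knowledge of the optimal value), here is a minimal sketch of gradient descent with the classical Polyak step size (f(x) − f*)/‖∇f(x)‖². The scaling of the step, the stopping rule, the toy quadratic and all function names are assumptions made for illustration; the variants analyzed in the paper may differ.

```python
def polyak_gradient_descent(f, grad, x0, f_star, tol=1e-12, max_iter=10000):
    """Gradient descent with the classical Polyak step (assumed scaling).

    Stops once f(x) - f_star <= tol, the gradient vanishes, or max_iter
    iterations have been performed.
    """
    x = list(x0)
    for _ in range(max_iter):
        gap = f(x) - f_star
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if gap <= tol or g_norm_sq == 0.0:
            break
        step = gap / g_norm_sq  # uses f_star instead of mu or L
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical test problem: f(x) = 0.5 * (x_1^2 + 10 * x_2^2), with f* = 0.
diag = [1.0, 10.0]


def quadratic(x):
    return 0.5 * sum(d * xi * xi for d, xi in zip(diag, x))


def quadratic_grad(x):
    return [d * xi for d, xi in zip(diag, x)]


x_final = polyak_gradient_descent(quadratic, quadratic_grad, [1.0, 1.0], 0.0)
final_gap = quadratic(x_final)
```

Note that the step size is computed without using the smoothness or strong convexity constants; only the optimal value `f_star` is needed.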
conference

Stochastic first-order methods: non-asymptotic and computer-aided analyses via potential functions

Adrien B. Taylor , and Francis Bach

In Conference on Learning Theory (COLT) , 2019

Abs arXiv HTML Code

We provide a novel computer-assisted technique for systematically analyzing first-order methods for optimization. In contrast with previous works, the approach is particularly suited for handling sublinear convergence rates and stochastic oracles. The technique relies on semidefinite programming and potential functions. It allows simultaneously obtaining worst-case guarantees on the behavior of those algorithms, and assisting in choosing appropriate parameters for tuning their worst-case performances. The technique also benefits from comfortable tightness guarantees, meaning that unsatisfactory results can be improved only by changing the setting. We use the approach for analyzing deterministic and stochastic first-order methods under different assumptions on the nature of the stochastic noise. Among others, we treat unstructured noise with bounded variance, different noise models arising in over-parametrized expectation minimization problems, and randomized block-coordinate descent schemes.
conference

Lyapunov functions for first-order methods: Tight automated convergence guarantees

Adrien B. Taylor , Bryan Van Scoy , and Laurent Lessard

In International Conference on Machine Learning (ICML) , 2018

Abs arXiv HTML Code

We present a novel way of generating Lyapunov functions for proving linear convergence rates of first-order optimization methods. Our approach provably obtains the fastest linear convergence rate that can be verified by a quadratic Lyapunov function (with given states), and only relies on solving a small-sized semidefinite program. Our approach combines the advantages of performance estimation problems (PEP, due to Drori and Teboulle (2014)) and integral quadratic constraints (IQC, due to Lessard et al. (2016)), and relies on convex interpolation (due to Taylor et al. (2017c;b)).
conference

Performance estimation toolbox (PESTO): automated worst-case analysis of first-order optimization methods

Adrien B. Taylor , Julien M. Hendrickx , and François Glineur

In Conference on Decision and Control (CDC) , 2017

Abs HTML PDF Code

We present a MATLAB toolbox that automatically computes tight worst-case performance guarantees for a broad class of first-order methods for convex optimization. The class of methods includes those performing explicit, projected, proximal, conditional and inexact (sub)gradient steps. The toolbox relies on the performance estimation (PE) framework, which recently emerged through works of Drori and Teboulle and the authors. The PE approach is a very systematic manner of obtaining non-improvable worst-case guarantees for first-order numerical optimization schemes. However, using the PE methodology requires modelling efforts from the user, along with some knowledge of semidefinite programming. The goal of this work is to ease the use of the performance estimation methodology, by providing a toolbox that implicitly does the modelling job. In short, its aim is to (i) let the user write the algorithm in a natural way, as he/she would have implemented it, and (ii) let the computer perform the modelling and worst-case analysis parts automatically.

5 - PhD theses

PhD thesis

Convex Interpolation and Performance Estimation of First-order Methods for Convex Optimization [ICTEAM thesis award; IBM-FNRS innovation award; AW Tucker prize finalist]

Adrien B. Taylor

Université catholique de Louvain, 2017

Abs Code

The goal of this thesis is to show how to derive in a completely automated way exact and global worst-case guarantees for first-order methods in convex optimization. To this end, we formulate a generic optimization problem looking for the worst-case scenarios. The worst-case computation problems, referred to as performance estimation problems (PEPs), are intrinsically infinite-dimensional optimization problems formulated over a given class of objective functions. To render those problems tractable, we develop a (smooth and non-smooth) convex interpolation framework, which provides necessary and sufficient conditions to interpolate our objective functions. With this idea, we transform PEPs into solvable finite-dimensional semidefinite programs, from which one obtains worst-case guarantees and worst-case functions, along with the corresponding explicit proofs. PEPs already proved themselves very useful as a tool for developing convergence analyses of first-order optimization methods. Among others, PEPs allow obtaining exact guarantees for gradient methods, along with their inexact, projected, proximal, conditional, decentralized and accelerated versions.