Search Results

Now showing 1 - 10 of 22
  • Item
    Hyperfast second-order local solvers for efficient statistically preconditioned distributed optimization
    (Amsterdam : Elsevier, 2022) Dvurechensky, Pavel; Kamzolov, Dmitry; Lukashevich, Aleksandr; Lee, Soomin; Ordentlich, Erik; Uribe, César A.; Gasnikov, Alexander
    Statistical preconditioning enables fast methods for distributed large-scale empirical risk minimization problems. In this approach, multiple worker nodes compute gradients in parallel, which are then used by the central node to update the parameter by solving an auxiliary (preconditioned) smaller-scale optimization problem. The recently proposed Statistically Preconditioned Accelerated Gradient (SPAG) method [1] has complexity bounds superior to other such algorithms but requires the exact solution of a computationally intensive auxiliary optimization problem at every iteration. In this paper, we propose an Inexact SPAG (InSPAG) and explicitly characterize the accuracy to which the corresponding auxiliary subproblem needs to be solved to guarantee the same convergence rate as the exact method. We build our results by first developing an inexact adaptive accelerated Bregman proximal gradient method for general optimization problems under relative smoothness and strong convexity assumptions, which may be of independent interest. Moreover, we explore the properties of the auxiliary problem in the InSPAG algorithm assuming Lipschitz third-order derivatives and strong convexity. For this problem class, we develop a linearly convergent Hyperfast second-order method and estimate the total complexity of the InSPAG method with the hyperfast auxiliary problem solver. Finally, we illustrate the proposed method's practical efficiency by performing large-scale numerical experiments on logistic regression models. To the best of our knowledge, these are the first empirical results on implementing high-order methods on large-scale problems, as we work with data where the dimension is of the order of 3 million and the number of samples is 700 million.
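    A minimal sketch of the distributed, statistically preconditioned update described above (illustrative only: the DANE-style subproblem, the regularization parameter sigma, and the L-BFGS-B subproblem solver are our assumptions, not the InSPAG method itself). Workers return gradients of their local losses; the central node forms the auxiliary preconditioned subproblem from its own local data and solves it approximately.

    ```python
    # Sketch of one statistically preconditioned distributed step (illustrative,
    # not the InSPAG method from the paper): workers send gradients, the central
    # node solves a small preconditioned auxiliary problem built from local data.
    import numpy as np
    from scipy.optimize import minimize

    def local_loss(w, X, y):
        """Average logistic loss on (X, y) with labels in {-1, +1}."""
        return np.mean(np.log1p(np.exp(-y * (X @ w))))

    def local_grad(w, X, y):
        z = y * (X @ w)
        return -(X.T @ (y / (1.0 + np.exp(z)))) / X.shape[0]

    def preconditioned_step(w, full_grad, X0, y0, sigma=1e-3):
        """Approximately solve the auxiliary subproblem
        min_v <full_grad - grad f0(w), v> + f0(v) + (sigma/2)||v - w||^2,
        where f0 is the central node's local loss (the statistical preconditioner)."""
        g0 = local_grad(w, X0, y0)
        phi = lambda v: ((full_grad - g0) @ v + local_loss(v, X0, y0)
                         + 0.5 * sigma * np.sum((v - w) ** 2))
        return minimize(phi, w, method="L-BFGS-B", tol=1e-9).x

    # Toy run: three nodes, the first one also acting as the central node.
    rng = np.random.default_rng(0)
    d, n = 20, 300
    w_true = rng.normal(size=d)
    nodes = []
    for _ in range(3):
        X = rng.normal(size=(n, d))
        nodes.append((X, np.sign(X @ w_true + 0.1 * rng.normal(size=n))))
    X0, y0 = nodes[0]
    w = np.zeros(d)
    for _ in range(20):
        full_grad = np.mean([local_grad(w, X, y) for X, y in nodes], axis=0)
        w = preconditioned_step(w, full_grad, X0, y0)
    ```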
  • Item
    Accelerated variance-reduced methods for saddle-point problems
    (Amsterdam : Elsevier, 2022) Borodich, Ekaterina; Tominin, Vladislav; Tominin, Yaroslav; Kovalev, Dmitry; Gasnikov, Alexander; Dvurechensky, Pavel
    We consider composite minimax optimization problems where the goal is to find a saddle point of a large sum of non-bilinear objective functions augmented by simple composite regularizers for the primal and dual variables. For such problems, under the average-smoothness assumption, we propose accelerated stochastic variance-reduced algorithms with complexity bounds that are optimal up to logarithmic factors. In particular, we consider strongly-convex-strongly-concave, convex-strongly-concave, and convex-concave objectives. To the best of our knowledge, these are the first nearly-optimal algorithms for this setting.
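    In symbols, the problem class described above can be written as (our notation, which may differ from the paper's):

    \[
      \min_{x}\max_{y}\;\frac{1}{m}\sum_{i=1}^{m} f_i(x, y) \;+\; g(x) \;-\; h(y),
    \]

    where the \(f_i\) are smooth but not necessarily bilinear in \((x, y)\), and \(g\), \(h\) are simple (proximal-friendly) composite regularizers for the primal and dual variables, respectively.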
  • Item
    Inexact tensor methods and their application to stochastic convex optimization
    (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2021) Agafonov, Artem; Kamzolov, Dmitry; Dvurechensky, Pavel; Gasnikov, Alexander
    We propose a general non-accelerated tensor method under inexact information on higher-order derivatives, analyze its convergence rate, and provide sufficient conditions for this method to have complexity similar to that of the exact tensor method. As a corollary, we propose the first stochastic tensor method for convex optimization and obtain sufficient mini-batch sizes for each derivative.
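    For orientation, the exact tensor (p-th order) step minimizes a regularized Taylor model of the objective; the inexact variant studied here replaces the derivatives by approximations (e.g., mini-batch estimates). A common form of the exact step, in our notation and up to the choice of normalization constant, is

    \[
      x_{k+1} \in \arg\min_{y}\Big\{ f(x_k) + \sum_{i=1}^{p}\frac{1}{i!}\,D^i f(x_k)[y - x_k]^i \;+\; \frac{M}{(p+1)!}\,\|y - x_k\|^{p+1} \Big\},
    \]

    where \(D^i f(x_k)[\cdot]^i\) denotes the i-th directional derivative and \(M\) is chosen relative to the Lipschitz constant of the p-th derivative.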
  • Item
    Alternating minimization methods for strongly convex optimization
    (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2020) Tupitsa, Nazarii; Dvurechensky, Pavel; Gasnikov, Alexander
    We consider alternating minimization procedures for convex optimization problems whose variables are divided into many blocks, each block being amenable to minimization with respect to its variables while the other blocks are frozen. In the case of two blocks, we prove a linear convergence rate for the alternating minimization procedure under the Polyak-Łojasiewicz condition, which can be seen as a relaxation of the strong convexity assumption. Under the strong convexity assumption in the many-block setting, we provide an accelerated alternating minimization procedure with a linear rate depending on the square root of the condition number, as opposed to the condition number itself for the non-accelerated method.
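    A minimal two-block illustration of the procedure (a sketch on a synthetic least-squares objective; the problem, block sizes, and exact block solvers are our assumptions, not the accelerated method from the preprint):

    ```python
    # Two-block alternating minimization sketch: minimize f(x, y) = ||A x + B y - c||^2
    # by exactly minimizing over one block while the other block is frozen.
    import numpy as np

    rng = np.random.default_rng(1)
    m, d1, d2 = 60, 10, 8
    A, B, c = rng.normal(size=(m, d1)), rng.normal(size=(m, d2)), rng.normal(size=m)

    x, y = np.zeros(d1), np.zeros(d2)
    for _ in range(50):
        # Minimize over x with y frozen (a least-squares subproblem).
        x, *_ = np.linalg.lstsq(A, c - B @ y, rcond=None)
        # Minimize over y with x frozen.
        y, *_ = np.linalg.lstsq(B, c - A @ x, rcond=None)

    residual = np.linalg.norm(A @ x + B @ y - c)
    ```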
  • Item
    Adaptive gradient descent for convex and non-convex stochastic optimization
    (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2019) Ogaltsov, Aleksandr; Dvinskikh, Darina; Dvurechensky, Pavel; Gasnikov, Alexander; Spokoiny, Vladimir
    In this paper we propose several adaptive gradient methods for stochastic optimization. Our methods are based on Armijo-type line search and simultaneously adapt to the unknown Lipschitz constant of the gradient and to the variance of the stochastic gradient approximation. We consider an accelerated gradient descent for convex problems and gradient descent for non-convex problems. In the experiments we demonstrate the superiority of our methods over existing adaptive methods such as AdaGrad and Adam.
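    A generic sketch of gradient descent with an Armijo-type backtracking step that adapts to an unknown Lipschitz constant (deterministic and simplified; the stochastic, variance-adaptive rule from the preprint is not reproduced here):

    ```python
    # Gradient descent with Armijo-type backtracking on the smoothness constant.
    import numpy as np

    def gd_backtracking(f, grad, x0, L0=1.0, iters=100):
        x, L = np.asarray(x0, dtype=float), L0
        for _ in range(iters):
            g = grad(x)
            L = max(L0, L / 2.0)          # optimistically decrease the current estimate
            while True:
                x_new = x - g / L
                # Accept the step if the quadratic upper bound (descent lemma) holds.
                if f(x_new) <= f(x) - g @ g / (2.0 * L):
                    break
                L *= 2.0                  # estimate was too small: increase it
            x = x_new
        return x

    # Toy usage on a strongly convex quadratic.
    A = np.diag(np.arange(1.0, 11.0))
    f = lambda x: 0.5 * (x @ A @ x)
    grad = lambda x: A @ x
    x_min = gd_backtracking(f, grad, np.ones(10))
    ```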
  • Item
    Oracle complexity separation in convex optimization
    (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2020) Ivanova, Anastasiya; Gasnikov, Alexander; Dvurechensky, Pavel; Dvinskikh, Darina; Tyurin, Alexander; Vorontsova, Evgeniya; Pasechnyuk, Dmitry
    Regularized empirical risk minimization problems, ubiquitous in machine learning, are often composed of several blocks that can be treated using different types of oracles, e.g., full gradient, stochastic gradient, or coordinate derivative. Optimal oracle complexity is known and achievable separately for the full gradient case, the stochastic gradient case, etc. We propose a generic framework for combining optimal algorithms for different types of oracles so as to achieve separate optimal oracle complexity for each block, i.e., for each block the corresponding oracle is called the optimal number of times to reach a given accuracy. As a particular example, we demonstrate that, for a combination of a full gradient oracle with either a stochastic gradient oracle or a coordinate descent oracle, our approach leads to the optimal number of oracle calls separately for the full gradient part and for the stochastic/coordinate descent part.
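    In symbols, a representative instance of the setting (our notation) is

    \[
      \min_{x}\; F(x) := f(x) + g(x),
    \]

    where \(\nabla f\) is available through an expensive full-gradient oracle and \(g\) is accessed only through a cheap stochastic gradient (or coordinate-derivative) oracle; the framework aims to call the full-gradient oracle roughly as often as an optimal deterministic method would need for \(f\) alone, and the stochastic oracle roughly as often as an optimal stochastic method would need for \(g\) alone.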
  • Item
    On primal and dual approaches for distributed stochastic convex optimization over networks
    (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2020) Dvinskikh, Darina; Gorbunov, Eduard; Gasnikov, Alexander; Dvurechensky, Pavel; Uribe, César A.
    We introduce a primal-dual stochastic gradient oracle method for distributed convex optimization problems over networks. We show that the proposed method is optimal in terms of communication steps. Additionally, we propose a new analysis method for the rate of convergence in terms of the duality gap and the probability of large deviations. This analysis is based on a new technique that allows us to bound the distance between the iteration sequence and the optimal point. By a proper choice of batch size, we can guarantee that this distance equals, up to a constant factor, the distance between the starting point and the solution.
  • Item
    Inexact model: A framework for optimization and variational inequalities
    (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2020) Stonyakin, Fedor; Gasnikov, Alexander; Tyurin, Alexander; Pasechnyuk, Dmitry; Agafonov, Artem; Dvurechensky, Pavel; Dvinskikh, Darina; Piskunova, Victorya
    In this paper we propose a general algorithmic framework for first-order methods in optimization in a broad sense, including minimization problems, saddle-point problems and variational inequalities. This framework allows one to obtain many known methods as special cases, including the accelerated gradient method, composite optimization methods, level-set methods, and proximal methods. The idea of the framework is based on constructing an inexact model of the main problem component, i.e., the objective function in optimization or the operator in variational inequalities. Besides reproducing known results, our framework allows us to construct new methods, which we illustrate by constructing a universal method for variational inequalities with composite structure. This method works for smooth and non-smooth problems with optimal complexity without a priori knowledge of the problem smoothness. We also generalize our framework to strongly convex objectives and strongly monotone variational inequalities.
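    A commonly used form of such an inexact \((\delta, L)\)-model of the objective, in our notation: a function \(\psi_\delta(y, x)\), convex in \(y\) with \(\psi_\delta(x, x) = 0\), such that

    \[
      f(x) + \psi_\delta(y, x) \;\le\; f(y) \;\le\; f(x) + \psi_\delta(y, x) + \frac{L}{2}\|y - x\|^2 + \delta \qquad \text{for all } y,
    \]

    and the basic gradient-type step of the framework minimizes this model instead of the usual linearization:

    \[
      x_{k+1} \in \arg\min_{y}\Big\{ \psi_\delta(y, x_k) + \frac{L}{2}\|y - x_k\|^2 \Big\}.
    \]

    Choosing \(\psi_\delta(y, x) = \langle \nabla f(x), y - x\rangle\) recovers the standard gradient step, while \(\psi_\delta(y, x) = \langle \nabla f(x), y - x\rangle + h(y) - h(x)\) recovers composite (proximal) gradient methods.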
  • Item
    Inexact relative smoothness and strong convexity for optimization and variational inequalities by inexact model
    (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2020) Stonyakin, Fedor; Gasnikov, Alexander; Tyurin, Alexander; Pasechnyuk, Dmitry; Agafonov, Artem; Dvurechensky, Pavel; Dvinskikh, Darina; Artamonov, Sergei; Piskunova, Victorya
    In this paper we propose a general algorithmic framework for first-order methods in optimization in a broad sense, including minimization problems, saddle-point problems and variational inequalities. This framework allows one to obtain many known methods as special cases, including the accelerated gradient method, composite optimization methods, level-set methods, and Bregman proximal methods. The idea of the framework is based on constructing an inexact model of the main problem component, i.e., the objective function in optimization or the operator in variational inequalities. Besides reproducing known results, our framework allows us to construct new methods, which we illustrate by constructing a universal conditional gradient method and a universal method for variational inequalities with composite structure. These methods work for smooth and non-smooth problems with optimal complexity without a priori knowledge of the problem smoothness. As a particular case of our general framework, we introduce relative smoothness for operators and propose an algorithm for variational inequalities with such operators. We also generalize our framework to relatively strongly convex objectives and strongly monotone variational inequalities.
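    In the relative-smoothness variant (again in our notation), the squared Euclidean norm in the model above is replaced by a Bregman divergence \(V(y, x) = d(y) - d(x) - \langle \nabla d(x), y - x\rangle\) of a distance-generating function \(d\):

    \[
      f(x) + \psi_\delta(y, x) \;\le\; f(y) \;\le\; f(x) + \psi_\delta(y, x) + L\,V(y, x) + \delta,
      \qquad
      x_{k+1} \in \arg\min_{y}\big\{ \psi_\delta(y, x_k) + L\,V(y, x_k) \big\}.
    \]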
  • Item
    Optimal decentralized distributed algorithms for stochastic convex optimization
    (Berlin : Weierstraß-Institut für Angewandte Analysis und Stochastik, 2020) Gorbunov, Eduard; Dvinskikh, Darina; Gasnikov, Alexander
    We consider stochastic convex optimization problems with affine constraints and develop several methods, based on either the primal or the dual approach, to solve them. In the primal case we use a special penalization technique to make the initial problem more amenable to optimization methods. We propose algorithms based on the Similar Triangles Method with an Inexact Proximal Step for convex smooth and strongly convex smooth objective functions, and methods based on the Gradient Sliding algorithm for the same problems in the non-smooth case. We prove convergence guarantees in the smooth convex case with a deterministic first-order oracle. We propose and analyze three novel methods to handle stochastic convex optimization problems with affine constraints: SPDSTM, R-RRMA-AC-SA, and SSTM_sc. All methods use a stochastic dual oracle. SPDSTM is the stochastic primal-dual modification of STM; it is applied to the dual problem when the primal functional is strongly convex and Lipschitz continuous on some ball. R-RRMA-AC-SA is an accelerated stochastic method based on restarts of RRMA-AC-SA, and SSTM_sc is the stochastic STM for strongly convex problems. Both methods are applied to the dual problem when the primal functional is strongly convex, smooth, and Lipschitz continuous on some ball, and use a stochastic dual first-order oracle. We develop the convergence analysis of these methods for unbiased and biased oracles, respectively. Finally, we apply all of the aforementioned results and approaches to the decentralized distributed optimization problem and discuss the optimality of the obtained results in terms of communication rounds and the number of oracle calls per node.
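    The penalization step mentioned above can be sketched as follows (our notation; the precise choice of the penalty parameter in the paper is not reproduced). The affinely constrained problem

    \[
      \min_{x}\; f(x) \quad \text{s.t.} \quad Ax = b
    \]

    is replaced by the unconstrained penalized problem

    \[
      \min_{x}\; f(x) + \frac{R}{2}\,\|Ax - b\|_2^2,
    \]

    with \(R\) chosen large enough that an approximate solution of the penalized problem is also an approximate solution of the original one. In the decentralized setting the constraint typically encodes consensus among nodes, e.g. \(\sqrt{W}\,x = 0\) for a matrix \(W\) built from the Laplacian of the communication graph.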