*Result*: Linear convergence of proximal gradient method for linear sparse SVM.
*Further Information*
*Despite the hinge loss being neither strongly convex nor strongly smooth, we establish a linear rate of convergence for sparse linear support vector machines (SVMs), up to the statistical accuracy. The algorithm we use is the proximal gradient method for composite functions, applied along a decreasing sequence of regularization parameters to compute an approximate solution path on a grid. Unlike work on loss functions that are strongly convex and strongly smooth, we do not obtain linear convergence to the exact solution; instead, we demonstrate linear convergence to the population truth up to the statistical error (in particular, we consider numerical convergence and statistical convergence simultaneously). For each regularization parameter in the chosen decreasing sequence, we show that the iterate enters a small neighborhood of the exact solution after O(log s<sup>*</sup>) iterations, where s<sup>*</sup> is the sparsity of the true coefficient vector, and a total of O(log n) stages (i.e., a regularization sequence of length O(log n)) suffices to achieve the near-oracle statistical rate, with n the sample size.
(Copyright © 2025 Elsevier Ltd. All rights reserved.)*
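The scheme described in the abstract, proximal gradient steps repeated along a geometrically decreasing regularization grid with warm starts, can be sketched as follows. This is an illustrative sketch only, not the authors' algorithm: it substitutes a squared hinge loss so the gradient is everywhere defined (the paper analyzes the hinge loss itself), and all function names, step-size choices, and grid parameters are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    # proximal operator of t * ||.||_1 (the composite-function prox step)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_grad_path(X, y, lambdas, iters_per_stage=50):
    """Proximal gradient over a decreasing regularization sequence.

    Illustrative surrogate: squared hinge loss (1/n) * sum max(0, 1 - y_i x_i w)^2
    plus lambda * ||w||_1; each stage warm-starts from the previous estimate.
    """
    n, p = X.shape
    # step = 1/L, with L a crude Lipschitz bound for the squared-hinge gradient
    step = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2 / n)
    w = np.zeros(p)
    for lam in lambdas:              # stages indexed by the regularization grid
        for _ in range(iters_per_stage):
            margin = y * (X @ w)
            # gradient of the smooth part of the composite objective
            grad = -(2.0 / n) * X.T @ (y * np.maximum(1.0 - margin, 0.0))
            w = soft_threshold(w - step * grad, step * lam)
    return w

# toy usage: sparse ground truth, geometrically decreasing lambda grid
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
w_true = np.zeros(50)
w_true[:3] = 1.0
y = np.sign(X @ w_true)
lambdas = 0.5 * (0.7 ** np.arange(8))   # length-O(log n) decreasing sequence
w_hat = prox_grad_path(X, y, lambdas)
```

The warm start is the point of the staged grid: the output of one regularization level is a good initial point for the next, which is what keeps the per-stage iteration count small.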
*Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.*