Sparse Estimation: Closing the Gap Between L0 and L1 Models

Abstract: Sparse statistical estimators are increasingly prevalent due to their interpretability and superior out-of-sample performance. However, sparse estimation problems with an L0 constraint, restricting the support of the estimators, are challenging non-convex optimization problems, typically (though not always) NP-hard. Consequently, academics and practitioners commonly turn to convex L1 proxies, such as the Lasso and its variants, as a remedy. Although L1 models can be solved quickly, they may lead to biased and/or dense estimators and require substantial cross-validation for calibration.

In this talk, we focus on two estimation problems: i) sparse regression and ii) sparse and smooth signal recovery. The first is known to be NP-hard; we show that the second is equivalent to a submodular minimization problem and hence polynomially solvable. For both problems, we derive a sequence of strong convex relaxations based on the ideal (convex-hull) formulations for rank-one/pairwise quadratic terms with indicator variables. The new relaxations can be formulated as conic quadratic or semidefinite optimization problems in an extended space; they are stronger and more general than the state-of-the-art models with the reverse Huber penalty and the minimax concave penalty functions. Furthermore, the proposed rank-one strengthening can be interpreted as a non-separable, non-convex, unbiased sparsity-inducing regularizer, which dynamically adjusts its penalty according to the shape of the estimation error function without inducing bias for the sparse solutions.

Computational experiments with benchmark datasets show that the proposed conic formulations are solved quickly and yield near-optimal estimators for the non-convex L0 problems. Moreover, the resulting estimators outperform L1 approaches from a statistical perspective, achieving high prediction accuracy and good interpretability. This talk is based on joint papers with Andres Gomez and Shaoning Han.
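
To make the L0-versus-L1 gap concrete, below is a minimal sketch — not the talk's exact formulation; the data, penalty weights, and big-M value are illustrative assumptions — contrasting the Lasso, a big-M mixed-integer formulation of an L0-constrained ridge regression, and the classical separable perspective relaxation, a simpler precursor of the rank-one/pairwise convex-hull relaxations discussed in the talk. It assumes cvxpy is installed; the mixed-integer model additionally needs a solver such as GUROBI or MOSEK.

    # Sketch only: synthetic data and all parameter values are illustrative.
    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(0)
    n, p, k = 100, 20, 5                      # samples, features, target support size
    beta_true = np.zeros(p)
    beta_true[:k] = 1.0
    X = rng.standard_normal((n, p))
    y = X @ beta_true + 0.1 * rng.standard_normal(n)

    # (i) Lasso: convex L1 proxy -- fast, but may be biased and/or dense.
    b1 = cp.Variable(p)
    lasso = cp.Problem(cp.Minimize(cp.sum_squares(y - X @ b1) + 1.0 * cp.norm1(b1)))
    lasso.solve()

    # (ii) L0-constrained ridge regression with indicators z_i in {0,1} and a
    # big-M constraint linking b_i to z_i; NP-hard in general.
    M, lam = 10.0, 0.5
    z = cp.Variable(p, boolean=True)
    b0 = cp.Variable(p)
    mip = cp.Problem(
        cp.Minimize(cp.sum_squares(y - X @ b0) + lam * cp.sum_squares(b0)),
        [cp.abs(b0) <= M * z, cp.sum(z) <= k])
    # mip.solve(solver=cp.GUROBI)  # uncomment if a mixed-integer solver is available

    # (iii) Separable perspective relaxation of (ii): relax z to [0,1] and replace
    # each ridge term b_i^2 with its perspective b_i^2 / z_i, a conic quadratic
    # (rotated second-order cone) constraint; solvable by any SOCP solver.
    zc = cp.Variable(p, nonneg=True)
    bp = cp.Variable(p)
    persp = cp.Problem(
        cp.Minimize(cp.sum_squares(y - X @ bp)
                    + lam * sum(cp.quad_over_lin(bp[i], zc[i]) for i in range(p))),
        [zc <= 1, cp.sum(zc) <= k, cp.abs(bp) <= M * zc])
    persp.solve()
    print("Perspective relaxation lower bound:", persp.value)

The rank-one strengthening described in the talk replaces these separable perspective terms with stronger convex-hull descriptions of rank-one quadratics with indicator variables, which is what closes most of the remaining gap to the L0 optimum.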

Bio: Alper Atamturk is the Earl J. Isaac Chair in the Science and Analysis of Decision Making and Professor and Chair of the Department of Industrial Engineering and Operations Research at the University of California, Berkeley. He received his Ph.D. from the Georgia Institute of Technology in 1998 with a major in Operations Research and a minor in Computer Science. His research interests are in optimization, integer programming, and optimization under uncertainty, with applications to machine learning, energy systems, portfolio optimization, and network design. He is the current chair of the INFORMS Optimization Society and the UC Berkeley site director of the NSF AI Institute for Advances in Optimization. He serves as a co-editor of Mathematical Programming and as an associate editor of Discrete Optimization, Mathematical Programming Computation, and the Journal of Risk. Previously, he served on the editorial boards of Operations Research, Management Science, and Networks. He has served on the organizing committees of INFORMS 2021, IPCO 2020, INFORMS 2014, IPCO 2010, MIP 2009, INFORMS 2008, and MIP 2005, among others. He is a Fellow of INFORMS and a Vannevar Bush Fellow of the US Department of Defense.