A Quasi-Newton Approach to Nonsmooth Convex Optimization Problems in Machine Learning

J. Yu, S. Vishwanathan, S. Günter, and N. N. Schraudolph. A Quasi-Newton Approach to Nonsmooth Convex Optimization Problems in Machine Learning. Journal of Machine Learning Research, 11:1145–1200, 2010.
Short version

Download


2.5MB	547.3kB	1.5MB

Abstract

We extend the well-known BFGS quasi-Newton method and its memory-limited variant LBFGS to the optimization of nonsmooth convex objectives. This is done in a rigorous fashion by generalizing three components of BFGS to subdifferentials: the local quadratic model, the identification of a descent direction, and the Wolfe line search conditions. We prove that under some technical conditions, the resulting subBFGS algorithm is globally convergent in objective function value. We apply its memory-limited variant (subLBFGS) to $L_2$-regularized risk minimization with the binary hinge loss. To extend our algorithm to the multiclass and multilabel settings, we develop a new, efficient, exact line search algorithm. We prove its worst-case time complexity bounds, and show that it can also be used to extend a recently developed bundle method to the multiclass and multilabel settings. We also apply the direction-finding component of our algorithm to $L_1$-regularized risk minimization with logistic loss. In all these contexts our methods perform comparable to or better than specialized state-of-the-art solvers on a number of publicly available datasets. An open source implementation of our algorithms is freely available.

BibTeX Entry

@article{YuVisGueSch10,
     author = {Jin Yu and S.~V.~N. Vishwanathan and
               Simon G\"unter and Nicol N. Schraudolph},
      title = {\href{http://nic.schraudolph.org/pubs/YuVisGueSch10.pdf}{
               A Quasi-{N}ewton Approach to Nonsmooth Convex
               Optimization Problems in Machine Learning}},
      pages = {1145-1200},
    journal =  jmlr,
     volume =  11,
       year =  2010,
   b2h_type = {Journal Papers},
  b2h_topic = {>Quasi-Newton Methods},
   b2h_note = {<a href="b2hd-YuVisGueSch08.html">Short version</a>},
   abstract = {
    We extend the well-known BFGS quasi-Newton method and its
    memory-limited variant LBFGS to the optimization of nonsmooth
    convex objectives.  This is done in a rigorous fashion by
    generalizing three components of BFGS to subdifferentials: the
    local quadratic model, the identification of a descent direction,
    and the Wolfe line search conditions. We prove that under some
    technical conditions, the resulting subBFGS algorithm is globally
    convergent in objective function value.  We apply its memory-limited
    variant (subLBFGS) to $L_{2}$-regularized risk minimization
    with the binary hinge loss.  To extend our algorithm to the
    multiclass and multilabel settings, we develop a new, efficient,
    exact line search algorithm. We prove its worst-case time
    complexity bounds, and show that it can also be used to extend
    a recently developed bundle method to the multiclass and
    multilabel settings.  We also apply the direction-finding
    component of our algorithm to $L_{1}$-regularized risk minimization
    with logistic loss.  In all these contexts our methods perform
    comparable to or better than specialized state-of-the-art solvers
    on a number of publicly available datasets. An open source
    implementation of our algorithms is freely available.
}}