A Stochastic Quasi-Newton Method for Online Convex Optimization

N. N. Schraudolph, J. Yu, and S. Günter. A Stochastic Quasi-Newton Method for Online Convex Optimization. In Proc. 11th Intl. Conf. Artificial Intelligence and Statistics (AIstats), pp. 436–443, JMLR Workshop and Conference Proceedings, vol. 2, San Juan, Puerto Rico, 2007.

Download

pdf (398.3 kB)   djvu (123.6 kB)   ps.gz (328.6 kB)

Abstract

We develop stochastic variants of the well-known BFGS quasi-Newton optimization method, in both full and memory-limited (LBFGS) forms, for online optimization of convex functions. The resulting algorithm performs comparably to a well-tuned natural gradient descent but is scalable to very high-dimensional problems. On standard benchmarks in natural language processing, it asymptotically outperforms previous stochastic gradient methods for parameter estimation in conditional random fields. We are working on analyzing the convergence of online (L)BFGS, and extending it to non-convex optimization problems.
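The sketch below illustrates the kind of stochastic memory-limited BFGS update the abstract describes: a standard LBFGS two-loop recursion driven by mini-batch gradients, with the curvature pair measured by re-evaluating the gradient on the same mini-batch and stabilized by a damping term. It is only a minimal illustration under assumed hyperparameter names and a assumed step-size schedule, not the authors' exact (L)BFGS algorithm from the paper.

import numpy as np

def stochastic_lbfgs(grad, theta0, sampler, n_iters=1000,
                     m=10, eta0=0.1, tau=1e4, lam=1.0, eps=1e-10):
    """Illustrative stochastic LBFGS for online convex optimization.

    grad(theta, batch) -> stochastic gradient of the objective on `batch`.
    sampler()          -> next mini-batch of data.
    m                  -> number of curvature pairs (s, y) kept in memory.
    lam                -> damping added to y so that s'y stays positive (assumed).
    """
    theta = theta0.copy()
    S, Y = [], []                        # recent curvature pairs
    for t in range(n_iters):
        batch = sampler()
        g = grad(theta, batch)

        # LBFGS two-loop recursion over the stored pairs
        q = g.copy()
        alphas = []
        for s, y in reversed(list(zip(S, Y))):      # newest to oldest
            a = np.dot(s, q) / np.dot(s, y)
            alphas.append(a)
            q -= a * y
        if S:
            # scale by an averaged curvature estimate from the stored pairs
            gamma = np.mean([np.dot(s, y) / np.dot(y, y) for s, y in zip(S, Y)])
            q *= gamma
        else:
            q *= eps                     # tiny first step before any curvature is known
        for (s, y), a in zip(zip(S, Y), reversed(alphas)):  # oldest to newest
            b = np.dot(y, q) / np.dot(s, y)
            q += (a - b) * s
        p = -q                           # quasi-Newton descent direction

        # step, then measure curvature on the *same* mini-batch
        eta = eta0 * tau / (tau + t)     # decaying step size (assumed schedule)
        s_t = eta * p
        theta = theta + s_t
        y_t = grad(theta, batch) - g + lam * s_t    # damped gradient difference

        S.append(s_t); Y.append(y_t)
        if len(S) > m:                   # keep only the m most recent pairs
            S.pop(0); Y.pop(0)
    return theta

As a usage example, for a least-squares objective one could pass grad(theta, (X, y)) = X.T @ (X @ theta - y) / len(y) together with a sampler that yields random mini-batches. The damping term lam * s_t, the averaged scaling of the initial Hessian approximation, and the small first step are the sort of modifications needed to keep BFGS-style updates stable when the gradients are noisy, sampled estimates.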

BibTeX Entry

@inproceedings{SchYuGue07,
     author = {Nicol N. Schraudolph and Jin Yu and Simon G\"unter},
      title = {\href{http://nic.schraudolph.org/pubs/SchYuGue07.pdf}{
               A Stochastic Quasi-{N}ewton Method
               for Online Convex Optimization}},
      pages = {436--443},
     editor = {Marina Meila and Xiaotong Shen},
  booktitle = {Proc.\ 11$^{th}$ Intl.\ Conf.\ Artificial
               Intelligence and Statistics (AIstats)},
    address = {San Juan, Puerto Rico},
     volume =  2,
     series = {Workshop and Conference Proceedings},
  publisher =  jmlr,
       year =  2007,
   b2h_type = {Top Conferences},
  b2h_topic = {>Quasi-Newton Methods},
   abstract = {
    We develop stochastic variants of the well-known BFGS quasi-Newton
    optimization method, in both full and memory-limited (LBFGS) forms,
    for online optimization of convex functions. The resulting algorithm
    performs comparably to a well-tuned natural gradient descent but is
    scalable to very high-dimensional problems. On standard benchmarks in
    natural language processing, it asymptotically outperforms previous
    stochastic gradient methods for parameter estimation in conditional
    random fields. We are working on analyzing the convergence of online
    (L)BFGS, and extending it to non-convex optimization problems.
}}
