Hybrid deterministic-stochastic methods for data fitting

M. P. Friedlander, M. Schmidt
SIAM Journal on Scientific Computing, 34(3):A1380–A1405, 2012

Abstract

Many structured data-fitting applications require the solution of an optimization problem involving a sum over a potentially large number of measurements. Incremental gradient algorithms oﬀer inexpensive iterations by sampling a subset of the terms in the sum; these methods can make great progress initially, but often slow as they approach a solution. In contrast, full-gradient methods achieve steady convergence at the expense of evaluating the full objective and gradient on each iteration. We explore hybrid methods that exhibit the beneﬁts of both approaches. Rate-of-convergence analysis shows that by controlling the sample size in an incremental-gradient algorithm, it is possible to maintain the steady convergence rates of full-gradient methods. We detail a practical quasi-Newton implementation based on this approach. Numerical experiments illustrate its potential beneﬁts.

Reproducible research

The following code facilitates the reproduction of all figures appearing in the paper.

Downloads

Most scripts use pre-computed data files to save time. When a required data file is missing it is automatically regenerated. To regenerate all experiments, simply delete the intermediate files in the +batching/results+ directory.

Getting started

Download the zip archive

2011FriedlanderSchmidt-experiments.zip (42M)

in a suitable directory and unzip. Next, start Matlab, change to the resulting hybrid directory and type addpath(genpath(pwd)) to set up all required paths.

# Run demo, similar Figure 5.5:
>> example_batching

# Run the experiments in the paper (assumes existing results files removed):
>> expBatching_runAll

# Reproduce all the plots in the paper:
>> expBatching_plotAll

BiBTeX

@article{FriedlanderSchmidt2012,
  author = {M. P. Friedlander and M. Schmidt},
  title = {Hybrid deterministic-stochastic methods for data fitting},
  journal = {SIAM J. Scientific Computing},
  volume = {34},
  number = {3},
  pages = {A1380–A1405},
  year = {2012}
}