Any history the derived minimization function needs to do its updates, typically an approximation to the second derivative (the Hessian matrix).
Tracks the state of the optimizer, including the current point, its value, its gradient, and any history. Also includes information for checking convergence.
the current point being considered
f(x)
f.gradientAt(x)
f(x) + r(x), where r is any regularization added to the objective. For LBFGS, this is f(x).
f'(x) + r'(x), where r is any regularization added to the objective. For LBFGS, this is f'(x).
the current iteration number.
f(x_0) + r(x_0), used for checking convergence
any information needed by the optimizer to do updates.
the sequence of the last minImprovementWindow objective values, used for checking whether the value has stopped improving
the number of times in a row the objective hasn't improved, mostly for SGD
did the line search fail?
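The fields described above can be pictured as a single immutable record. The following is a minimal sketch, not the library's actual definition: the class name `OptimizerState` and every field name are hypothetical, chosen only to mirror the descriptions.

```scala
// Hypothetical sketch: names are illustrative and not taken from the library.
final case class OptimizerState[T, History](
  x: T,                             // the current point being considered
  value: Double,                    // f(x)
  grad: T,                          // gradient of f at x
  adjustedValue: Double,            // f(x) + r(x), r being any regularization
  adjustedGradient: T,              // f'(x) + r'(x)
  iter: Int,                        // the current iteration number
  initialAdjustedValue: Double,     // f(x_0) + r(x_0), for convergence checks
  history: History,                 // optimizer-specific update information
  recentValues: IndexedSeq[Double], // the last minImprovementWindow values
  numImprovementFailures: Int,      // consecutive iterations without improvement
  searchFailed: Boolean             // did the line search fail?
)

object OptimizerStateExample {
  // Initial state for f(x) = x^2 at x_0 = 3 with no regularization, so the
  // adjusted value and gradient equal the raw ones.
  val x0 = 3.0
  val initial: OptimizerState[Double, Unit] = OptimizerState(
    x = x0,
    value = x0 * x0,
    grad = 2 * x0,
    adjustedValue = x0 * x0,
    adjustedGradient = 2 * x0,
    iter = 0,
    initialAdjustedValue = x0 * x0,
    history = (),
    recentValues = Vector(x0 * x0),
    numImprovementFailures = 0,
    searchFailed = false
  )
}
```

Keeping the history as a type parameter lets each derived minimizer store whatever it needs (for LBFGS, its Hessian approximation) without changing the shared fields.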
Given a direction, perform a line search to find a step size along that direction. At the moment, this just executes backtracking, so it does not fulfill the Wolfe conditions.
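To illustrate what a backtracking search does, here is a minimal sketch, assuming a one-dimensional view of the objective along the search direction. The object name, the function `backtrack`, and all parameters and their defaults (`initialStep`, `shrink`, `c1`, `maxIter`) are hypothetical and not the library's API. The sketch enforces only the sufficient-decrease (Armijo) condition, which is why a search of this kind does not guarantee the Wolfe curvature condition.

```scala
object BacktrackingSketch {
  /** Hypothetical backtracking line search.
    *
    * @param phi   the objective restricted to the line: alpha => f(x + alpha * dir)
    * @param slope the directional derivative at alpha = 0 (negative for descent)
    * @return Some(stepSize) for the first step that gives sufficient decrease,
    *         or None if none of the maxIter candidates qualifies.
    */
  def backtrack(
      phi: Double => Double,
      slope: Double,
      initialStep: Double = 1.0,
      shrink: Double = 0.5,
      c1: Double = 1e-4,
      maxIter: Int = 50
  ): Option[Double] = {
    val phi0 = phi(0.0)
    // Candidate steps: initialStep, initialStep * shrink, initialStep * shrink^2, ...
    Iterator
      .iterate(initialStep)(_ * shrink)
      .take(maxIter)
      .find(alpha => phi(alpha) <= phi0 + c1 * alpha * slope)
  }
}
```

For f(x) = x^2 at x = 3 with the steepest-descent direction -6, the restricted function is `alpha => (3 - 6 * alpha)^2` and the slope is -36. A `None` result corresponds to the "did the line search fail?" flag.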
the current state
The objective
The step direction
stepSize
The number of iterations within which the function must improve by at least improvementTol
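One plausible way to use such a window is sketched below. The exact rule the library applies is not stated here, so the criterion is an assumption: the optimizer is treated as stalled when the window is full and the newest value improves on the oldest by less than `improvementTol`. The names `ImprovementWindowSketch`, `push`, and `stalled` are hypothetical.

```scala
object ImprovementWindowSketch {
  /** Appends a new objective value, keeping only the most recent `size` values. */
  def push(window: Vector[Double], value: Double, size: Int): Vector[Double] =
    (window :+ value).takeRight(size)

  /** Assumed rule: stalled when the window is full and the decrease from the
    * oldest to the newest value is below improvementTol. */
  def stalled(window: Vector[Double], size: Int, improvementTol: Double): Boolean =
    window.length >= size && (window.head - window.last) < improvementTol
}

object ImprovementWindowExample {
  import ImprovementWindowSketch._

  val size = 3
  val tol = 0.01

  // Objective values from four successive iterations.
  val afterThree: Vector[Double] =
    Vector(10.0, 5.0, 4.999).foldLeft(Vector.empty[Double])(push(_, _, size))
  val afterFour: Vector[Double] = push(afterThree, 4.998, size)
}
```

With these values, the window after three iterations still contains the large drop from 10.0, so it is not stalled; after the fourth iteration the window holds only the values 5.0, 4.999, and 4.998, and the improvement across it falls below the tolerance.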
Port of LBFGS to Scala.
Special note for LBFGS: If you use it in published work, you must cite one of:
* J. Nocedal. Updating Quasi-Newton Matrices with Limited Storage (1980), Mathematics of Computation 35, pp. 773-782.
* D.C. Liu and J. Nocedal. On the Limited Memory BFGS Method for Large Scale Optimization (1989), Mathematical Programming B, 45, 3, pp. 503-528.