NonlinearMinimizer solves problems of the form: minimize f(x) + g(x)
Projection formula from the paper by Duchi et al., "Efficient Projections onto the l1-Ball for Learning in High Dimensions"
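The sort-based projection described in that paper can be sketched as follows. This is a minimal Python illustration using only the standard library, not the library's own implementation; the function names project_simplex and project_l1_ball are hypothetical. It first projects onto the scaled simplex, then handles the l1-ball by projecting the absolute values and restoring the signs.

```python
def project_simplex(v, s=1.0):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = s}, with s > 0."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        t = (cumulative - s) / j
        # The condition holds for j = 1..rho and fails afterwards,
        # so the last value stored is the threshold for j = rho.
        if uj - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]


def project_l1_ball(v, s=1.0):
    """Euclidean projection of v onto {w : sum(|w_i|) <= s}, with s > 0."""
    if sum(abs(x) for x in v) <= s:
        return list(v)  # already inside the ball
    w = project_simplex([abs(x) for x in v], s)
    # Restore the sign of each original coordinate.
    return [wi if x >= 0 else -wi for wi, x in zip(w, v)]
```

For example, project_l1_ball([3.0, -1.0], 2.0) shrinks both coordinates by the same threshold until the l1 norm equals 2, which zeroes out the smaller coordinate.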
Proximal operators and ADMM based Primal-Dual QP Solver
Reference: http://www.stanford.edu/~boyd/papers/admm/quadprog/quadprog.html
It solves problems with the following structure:
minimize 1/2 x'Hx + f'x + g(x) s.t. Aeq*x = b
g(x) represents the following constraints, which cover ALS-based matrix factorization use cases:
1. x >= 0
2. lb <= x <= ub
3. L1(x)
4. L2(x)
5. Generic regularization on x
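To illustrate how a proximal operator for g(x) plugs into ADMM, here is a minimal Python sketch of the standard scaled-form iteration. It is not the library's solver: it assumes H is diagonal with positive entries (so the x-update is elementwise rather than a linear solve), it ignores the equality constraint Aeq*x = b, and the names admm_diag_qp and box_prox are hypothetical.

```python
def admm_diag_qp(h, f, prox, rho=1.0, iters=200):
    """Minimize 0.5 * sum(h_i * x_i^2) + sum(f_i * x_i) + g(x) with ADMM.

    h    : positive diagonal of H (assumption: H is diagonal)
    f    : linear term
    prox : callable (v, rho) -> proximal operator of g / rho evaluated at v
    """
    n = len(h)
    z = [0.0] * n
    u = [0.0] * n
    for _ in range(iters):
        # x-update: solve (H + rho * I) x = rho * (z - u) - f, elementwise here.
        x = [(rho * (z[i] - u[i]) - f[i]) / (h[i] + rho) for i in range(n)]
        # z-update: proximal step on g.
        z = prox([x[i] + u[i] for i in range(n)], rho)
        # Scaled dual update.
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z


def box_prox(lb, ub):
    """Proximal operator of the indicator of lb <= x <= ub (a clip)."""
    def prox(v, rho):
        return [min(max(vi, lo), hi) for vi, lo, hi in zip(v, lb, ub)]
    return prox


# Constraint 2 from the list above: 0 <= x <= 3 in each coordinate.
solution = admm_diag_qp(h=[2.0, 1.0], f=[-8.0, 3.0],
                        prox=box_prox([0.0, 0.0], [3.0, 3.0]))
```

In this example the unconstrained minimizer is (4, -3), so the bounded solution sits on the box at (3, 0). Swapping box_prox for another proximal operator changes the constraint without touching the loop.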
Constraints supported by the QuadraticMinimizer object
PDCO dense quadratic program generator
Reference
Generates random instances of quadratic programming problems of the form:
0.5x'Px + q'x s.t. Ax = b, lb <= x <= ub
nGram: rank of the quadratic problems to be generated
H is the quadratic representation of the function
NonlinearMinimizer solves problems with the following structure: minimize f(x) + g(x)
g(x) represents the following constraints:
1. x >= 0
2. lb <= x <= ub
3. L1(x)
4. Aeq*x = beq
5. aeq'x = beq
6. 1'x = s, x >= 0 (ProbabilitySimplex, from the references Proximal Algorithms by Boyd et al. and Duchi et al.)
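Several of the constraints listed above have closed-form Euclidean projections. The Python sketch below shows three of them (constraints 1, 2, and 5) using only the standard library. The function names are hypothetical and are not the library's API.

```python
def project_nonnegative(x):
    """Constraint 1: projection onto {y : y >= 0}."""
    return [max(xi, 0.0) for xi in x]


def project_box(x, lb, ub):
    """Constraint 2: projection onto {y : lb <= y <= ub}."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lb, ub)]


def project_hyperplane(x, aeq, beq):
    """Constraint 5: projection onto {y : aeq'y = beq}; aeq must be nonzero."""
    residual = sum(ai * xi for ai, xi in zip(aeq, x)) - beq
    scale = residual / sum(ai * ai for ai in aeq)
    # Move x along aeq just far enough to cancel the residual.
    return [xi - scale * ai for xi, ai in zip(x, aeq)]
```

Each function returns the closest point of the constraint set to x, which is exactly what a proximal operator reduces to when g(x) is the indicator function of that set.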
f(x) can be a smooth convex function defined by a DiffFunction or a proximal operator. For now, the exposed API takes a DiffFunction.
g(x) is defined by a proximal operator
For proximal operators such as L1 (via soft-thresholding) and Huber loss (see the library of proximal algorithms for further details), we provide an ADMM-based proximal algorithm based on the following reference: https://web.stanford.edu/~boyd/papers/admm/logreg-l1/distr_l1_logreg.html
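As a concrete instance, soft-thresholding is the proximal operator of the scaled L1 norm. The Python sketch below uses hypothetical function names and is meant only to illustrate the operator, not to reproduce the library's code.

```python
def soft_threshold(v, kappa):
    """Proximal operator of kappa * |x| at the scalar v (kappa >= 0)."""
    if v > kappa:
        return v - kappa
    if v < -kappa:
        return v + kappa
    return 0.0  # values within [-kappa, kappa] are shrunk to zero


def prox_l1(vec, kappa):
    """Apply soft-thresholding to every coordinate of vec."""
    return [soft_threshold(v, kappa) for v in vec]
```

Inside an ADMM loop with penalty parameter rho and L1 weight lambda, the threshold would typically be kappa = lambda / rho.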
A subset of proximal operators are projection operators. For projection, the NonlinearMinimizer companion object provides a project API, which generates a Spectral Projected Gradient (SPG) or Projected Quasi-Newton (PQN) solver. For projection operators such as positivity, bounds, and the probability simplex, these algorithms converge faster than the ADMM-based proximal algorithm.
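The basic idea behind projection-based solvers can be sketched with plain projected gradient descent. Note that this is deliberately the simplest variant with a fixed step size, not SPG or PQN (which add spectral step lengths and quasi-Newton models), and the names below are hypothetical rather than the library's project API.

```python
def projected_gradient(grad, project, x0, step=0.1, iters=500):
    """Minimize a smooth f over a convex set, given its gradient and a projection."""
    x = project(list(x0))
    for _ in range(iters):
        g = grad(x)
        # Gradient step, then project back onto the feasible set.
        x = project([xi - step * gi for xi, gi in zip(x, g)])
    return x


# Example: f(x) = 0.5 * ||x - c||^2 with c = (2, -1), subject to x >= 0.
c = [2.0, -1.0]


def grad_f(x):
    return [xi - ci for xi, ci in zip(x, c)]


def project_positive(x):
    return [max(xi, 0.0) for xi in x]


x_star = projected_gradient(grad_f, project_positive, [0.0, 0.0])
```

Here the iterates approach (2, 0): the first coordinate reaches its unconstrained optimum, while the second is held at the boundary by the projection.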
TO DO
1. Implement FISTA / Nesterov's accelerated method and compare with ADMM
2. For non-convex functions, experiment with a TRON-like primal solver