Monday, August 25, 2014

Notes on Distributed Optimization

From LIBLINEAR: if we want to use second-order methods for optimization, some kind of approximation is needed, because the Hessian is a huge matrix that cannot be stored in memory. The key point is that some algorithms only need Hv, the Hessian multiplied by a vector. As long as we can compute that matrix-vector product, we do not need the Hessian itself anymore.
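As a sketch of why the product Hv is all we need, here is a conjugate-gradient solver for the Newton system H d = -g that only ever calls a user-supplied hvp(v) function; the Hessian is never formed or stored. This mirrors the truncated-Newton idea behind LIBLINEAR's solver, but the function and its names are illustrative, not LIBLINEAR's actual code.

```python
import jax.numpy as jnp

def newton_direction(hvp, g, max_iter=50, tol=1e-6):
    """Solve H d = -g by conjugate gradient, using only the
    Hessian-vector product hvp(v) = H @ v (H is never materialized)."""
    d = jnp.zeros_like(g)
    r = -g                      # residual of H d = -g at d = 0
    p = r
    rs_old = jnp.dot(r, r)
    for _ in range(max_iter):
        Hp = hvp(p)
        alpha = rs_old / jnp.dot(p, Hp)
        d = d + alpha * p
        r = r - alpha * Hp
        rs_new = jnp.dot(r, r)
        if jnp.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return d
```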

Using automatic differentiation

Start from a routine that calculates the gradient, then apply automatic differentiation to that routine to calculate the Hessian-vector product Hv directly.
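A minimal sketch of this idea in JAX (the library is my choice for illustration; the notes do not name one): composing jax.jvp with jax.grad differentiates the gradient function along the direction v, which yields Hv exactly, without ever building the Hessian. The loss function and random data below are made up for the example.

```python
import jax
import jax.numpy as jnp

def hvp(f, w, v):
    """Hessian-vector product of a scalar function f at w along v,
    computed as the directional derivative of grad(f)."""
    return jax.jvp(jax.grad(f), (w,), (v,))[1]

# Illustrative loss: L2-regularized logistic regression,
# f(w) = 0.5 w'w + C * sum_i log(1 + exp(-y_i x_i'w))
def loss(w, X, y, C=1.0):
    margins = y * (X @ w)
    return 0.5 * jnp.dot(w, w) + C * jnp.sum(jnp.log1p(jnp.exp(-margins)))

X = jax.random.normal(jax.random.PRNGKey(0), (100, 5))
y = jnp.where(jax.random.normal(jax.random.PRNGKey(1), (100,)) > 0, 1.0, -1.0)
w = jnp.zeros(5)
v = jnp.ones(5)

print(hvp(lambda w: loss(w, X, y), w, v))   # H @ v, no n x n matrix anywhere
```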

Using matrix decomposition

Some Hessian matrices can be decomposed as H = I + C X^T D X. In other words, such a Hessian can be written as a low-rank update of the identity matrix.
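For L2-regularized logistic regression (the case LIBLINEAR handles), D is a diagonal matrix with D_ii = s_i (1 - s_i), where s_i is the sigmoid of the margin y_i x_i^T w. Under that assumption, a sketch of Hv that exploits the structure, in the same JAX style as above:

```python
import jax.numpy as jnp

def structured_hvp(w, v, X, y, C=1.0):
    """Compute (I + C * X^T D X) @ v without forming the Hessian.
    D is diagonal with D_ii = s_i * (1 - s_i), s_i = sigmoid(y_i * x_i^T w)."""
    s = 1.0 / (1.0 + jnp.exp(-y * (X @ w)))    # per-example sigmoids
    d = s * (1.0 - s)                          # diagonal of D
    return v + C * (X.T @ (d * (X @ v)))       # two passes over X, no n x n matrix
```

Each product costs one multiplication by X and one by X^T, so the work scales with the number of nonzeros in X rather than with the square of the feature dimension.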

Just some notes.

