Davidon–Fletcher–Powell formula

From HandWiki

The Davidon–Fletcher–Powell formula (or DFP; named after William C. Davidon, Roger Fletcher, and Michael J. D. Powell) finds the solution to the secant equation that is closest to the current estimate and satisfies the curvature condition. It was the first quasi-Newton method to generalize the secant method to a multidimensional problem. This update maintains the symmetry and positive definiteness of the Hessian matrix. Given a function [math]\displaystyle{ f(x) }[/math], its gradient ([math]\displaystyle{ \nabla f }[/math]), and positive-definite Hessian matrix [math]\displaystyle{ B }[/math], the Taylor series is

[math]\displaystyle{ f(x_k+s_k) = f(x_k) + \nabla f(x_k)^T s_k + \frac{1}{2} s^T_k {B} s_k + \dots, }[/math]

and the Taylor series of the gradient itself (secant equation)

[math]\displaystyle{ \nabla f(x_k+s_k) = \nabla f(x_k) + B s_k + \dots }[/math]

is used to update [math]\displaystyle{ B }[/math].

The DFP formula finds a solution that is symmetric, positive-definite and closest to the current approximate value of [math]\displaystyle{ B_k }[/math]:

[math]\displaystyle{ B_{k+1}= (I - \gamma_k y_k s_k^T) B_k (I - \gamma_k s_k y_k^T) + \gamma_k y_k y_k^T, }[/math]


[math]\displaystyle{ y_k = \nabla f(x_k+s_k) - \nabla f(x_k), }[/math]
[math]\displaystyle{ \gamma_k = \frac{1}{y_k^T s_k}, }[/math]

and [math]\displaystyle{ B_k }[/math] is a symmetric and positive-definite matrix.

The corresponding update to the inverse Hessian approximation [math]\displaystyle{ H_k = B_k^{-1} }[/math] is given by

[math]\displaystyle{ H_{k+1} = H_k - \frac{H_k y_k y_k^T H_k}{y_k^T H_k y_k} + \frac{s_k s_k^T}{y_k^{T} s_k}. }[/math]

[math]\displaystyle{ B }[/math] is assumed to be positive-definite, and the vectors [math]\displaystyle{ s_k^T }[/math] and [math]\displaystyle{ y }[/math] must satisfy the curvature condition

[math]\displaystyle{ s_k^T y_k = s_k^T B s_k \gt 0. }[/math]

The DFP formula is quite effective, but it was soon superseded by the Broyden–Fletcher–Goldfarb–Shanno formula, which is its dual (interchanging the roles of y and s).[1]

See also


  1. Avriel, Mordecai (1976). Nonlinear Programming: Analysis and Methods. Prentice-Hall. pp. 352–353. ISBN 0-13-623603-0. 

Further reading