Coordinate-wise descent method
One of the methods for minimizing a function of several variables based only on the values of the function to be minimized. The method is used when the function is not differentiable or if a calculation of the derivatives involves a large amount of computation. Below the use of the coordinate-wise descent method for minimizing a function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c0264601.png" /> on a set
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c0264602.png" /> |
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c0264603.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c0264604.png" /> are given numbers, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c0264605.png" />, is described and the cases where all or some of the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c0264606.png" /> or <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c0264607.png" /> are not excluded. Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c0264608.png" /> be the coordinate vector in which the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c0264609.png" />-th coordinate is equal to 1 and the other coordinates are equal to zero. One specifies an initial approximation <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646010.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646011.png" />. Assume that the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646012.png" />-th approximation <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646013.png" /> is known and that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646014.png" />, for some <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646015.png" />. Take <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646016.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646017.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646018.png" /> is the integer part of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646019.png" />. Then
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646020.png" /> |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646021.png" /> |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646022.png" /> |
i.e. one performs a cyclic selection of the coordinate vectors <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646023.png" />. First one checks if the conditions
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646024.png" /> | (1) |
are fulfilled. If (1) is fulfilled, one sets <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646025.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646026.png" />. If on the other hand (1) is not fulfilled, one checks the condition
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646027.png" /> | (2) |
If (2) is fulfilled, one sets <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646028.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646029.png" />. If conditions (1) and (2) are both not fulfilled, one sets <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646030.png" />,
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646031.png" /> | (3) |
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646032.png" /> is the parameter of the method, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646033.png" />. Condition (3) means that if at least one of the conditions (1) and (2) is fulfilled in a single cycle of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646034.png" /> iterations involving a selection of all coordinate vectors <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646035.png" /> with step <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646036.png" />, then the length of the step <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646037.png" /> is not reduced and is retained at least during the following cycle of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646038.png" /> iterations; if on the other hand neither (1) nor (2) is ever fulfilled in the subsequent <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646039.png" /> iterations, the step <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646040.png" /> is reduced.
Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646041.png" /> be convex and continuously differentiable on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646042.png" />, let the set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646043.png" /> be bounded and let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646044.png" /> be a positive number. Then the methods (1)–(3) converge, i.e.
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646045.png" /> |
and the sequence <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646046.png" /> converges to the set of minima for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646047.png" /> in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646048.png" />. If <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646049.png" /> is not differentiable on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026460/c02646050.png" />, the method need not converge [1], [2].
References
| [1] | F.P. Vasil'ev, "Numerical methods for solving extremum problems" , Moscow (1980) (In Russian) |
| [2] | V.G. Karmanov, "Mathematical programming" , Moscow (1975) (In Russian) |
Comments
References
| [a1] | W.I. Zangwill, "Nonlinear programming: a unified approach" , Prentice-Hall (1969) |
