MCS algorithm

From HandWiki
A cartoon centipede reads books and types on a laptop.
Figure 1: MCS algorithm (without local search) applied to the two-dimensional Rosenbrock function. The global minimum [math]\displaystyle{ f_{min}=0 }[/math] is located at [math]\displaystyle{ (x,y)=(1,1) }[/math]. MCS identifies a position with [math]\displaystyle{ f\approx 0.002 }[/math] within 21 function evaluations. After additional 21 evaluations the optimal value is not improved and the algorithm terminates. Observe dense clustering of samples around potential minima - this effect can be reduced significantly by employing local searches appropriately.
no alt
Figure 2: MCS (without local search) applied to the Himmelblau's function with four local minima where [math]\displaystyle{ f(x)=0 }[/math].

For mathematical optimization, Multilevel Coordinate Search (MCS) is an efficient[1] algorithm for bound constrained global optimization using function values only.[2]

To do so, the n-dimensional search space is represented by a set of non-intersecting hypercubes (boxes). The boxes are then iteratively split along an axis plane according to the value of the function at a representative point of the box (and its neighbours) and the box's size. These two splitting criteria combine to form a global search by splitting large boxes and a local search by splitting areas for which the function value is good.

Additionally, a local search combining a (multi-dimensional) quadratic interpolant of the function and line searches can be used to augment performance of the algorithm (MCS with local search); in this case the plain MCS is used to provide the starting (initial) points. The information provided by local searches (local minima of the objective function) is then fed back to the optimizer and affects the splitting criteria, resulting in reduced sample clustering around local minima, faster convergence and higher precision.

Simplified workflow

The MCS workflow is visualized in Figures 1 and 2. Each step of the algorithm can be split into four stages:

  1. Identify a potential candidate for splitting (magenta, thick).
  2. Identify the optimal splitting direction and the expected optimal position of the splitting point (green).
  3. Evaluate the objective function at the splitting point or recover it from the already computed set; the latter applies if the current splitting point has already been reached when splitting a neighboring box.
  4. Generate new boxes (magenta, thin) based on the values of the objective function at the splitting point.

At each step the green point with the temporary yellow halo is the unique base point of the box; each box has an associated value of the objective, namely its value at the box's base point.

In order to determine if a box will be split two separate splitting criteria are used. The first one, splitting by rank, ensures that large boxes that have not been split too often will be split eventually. If it applies then the splitting point is easily determined at a fixed fraction of the length of the side being split. The second one, splitting by expected gain, employs a local one-dimensional parabolic quadratic model (surrogate) along a single coordinate. In this case the splitting point is defined as the minimum of the surrogate along a line segment and the box is split only if the interpolant value (serving as a proxy for the true value of the objective) is lower than the current best sampled function value.

Convergence

The algorithm is guaranteed to converge to the global minimum in the long run (i.e. when the number of function evaluations and the search depth are arbitrarily large) if the objective function is continuous in the neighbourhood of the global minimizer. This follows from the fact that any box will become arbitrarily small eventually, hence the spacing between samples tends to zero as the number of function evaluations tends to infinity.

Recursive implementation

MCS is designed to be implemented in an efficient recursive manner with the aid of trees. With this approach the amount of memory required is independent of problem dimensionality since the sampling points are not stored explicitly. Instead, just a single coordinate of each sample is saved and the remaining coordinates can be recovered by tracing the history of a box back to the root (initial box). This method was suggested by the authors and used in their original implementation.[2]

References

External links