Lagrangian relaxation

From HandWiki

In the field of mathematical optimization, Lagrangian relaxation is a relaxation method that approximates a difficult constrained optimization problem by a simpler one. A solution to the relaxed problem is an approximate solution to the original problem and provides useful information, such as a bound on the optimal value. The method penalizes violations of inequality constraints using Lagrange multipliers, which impose a cost on violations. These added costs replace the hard inequality constraints in the optimization. In practice, the relaxed problem can often be solved more easily than the original problem.

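The problem of optimizing the Lagrangian function over the dual variables (the Lagrange multipliers) is the Lagrangian dual problem. It is a maximization when the original problem is a minimization, and a minimization when the original problem is a maximization, as in the formulation used in this article.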

Mathematical description

Suppose we are given a linear programming problem, with [math]\displaystyle{ x\in \mathbb{R}^n }[/math], [math]\displaystyle{ c\in \mathbb{R}^n }[/math], [math]\displaystyle{ A\in \mathbb{R}^{m,n} }[/math] and [math]\displaystyle{ b\in \mathbb{R}^m }[/math], of the following form:

max [math]\displaystyle{ c^T x }[/math]
s.t.
[math]\displaystyle{ Ax \le b }[/math]

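If we split the rows of [math]\displaystyle{ A }[/math], and the corresponding entries of [math]\displaystyle{ b }[/math], into two groups such that [math]\displaystyle{ A_1\in \mathbb{R}^{m_1,n} }[/math], [math]\displaystyle{ A_2\in \mathbb{R}^{m_2,n} }[/math], [math]\displaystyle{ b_1\in \mathbb{R}^{m_1} }[/math], [math]\displaystyle{ b_2\in \mathbb{R}^{m_2} }[/math] and [math]\displaystyle{ m_1+m_2=m }[/math], we may write the system as: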

max [math]\displaystyle{ c^T x }[/math]
s.t.
(1) [math]\displaystyle{ A_1 x \le b_1 }[/math]
(2) [math]\displaystyle{ A_2 x \le b_2 }[/math]

We may introduce the constraint (2) into the objective:

max [math]\displaystyle{ c^T x+\lambda^T(b_2-A_2x) }[/math]
s.t.
(1) [math]\displaystyle{ A_1 x \le b_1 }[/math]

If we let [math]\displaystyle{ \lambda=(\lambda_1,\ldots,\lambda_{m_2}) }[/math] be nonnegative weights, the objective is penalized whenever constraint (2) is violated and rewarded whenever the constraint is satisfied with slack. The above system is called the Lagrangian relaxation of our original problem.
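To make the construction concrete, the following Python sketch evaluates the relaxed problem for a fixed multiplier vector. It assumes, purely for illustration, that constraint (1) is a simple box 0 ≤ x ≤ u, so that the inner maximization separates by variable. The function name solve_relaxation, the bound vector upper and the toy data are hypothetical and are not taken from any library or reference.

```python
def solve_relaxation(c, A2, b2, upper, lam):
    """Return (value, x_bar) for max c^T x + lam^T (b2 - A2 x) s.t. 0 <= x <= upper.

    Assumption of this sketch: constraint (1) is the box 0 <= x <= upper,
    which makes the relaxed problem solvable variable by variable.
    """
    n = len(c)
    m2 = len(b2)
    # Coefficient of x_j in the relaxed objective: (c - A2^T lam)_j.
    reduced = [c[j] - sum(lam[i] * A2[i][j] for i in range(m2)) for j in range(n)]
    # A variable goes to its upper bound only if its coefficient is positive.
    x_bar = [upper[j] if reduced[j] > 0 else 0.0 for j in range(n)]
    value = sum(lam[i] * b2[i] for i in range(m2))
    value += sum(reduced[j] * x_bar[j] for j in range(n))
    return value, x_bar


# Toy instance (hypothetical): max 3*x1 + 2*x2
# s.t. (1) 0 <= x1, x2 <= 1   and   (2) x1 + x2 <= 1.
c = [3.0, 2.0]
A2 = [[1.0, 1.0]]
b2 = [1.0]
upper = [1.0, 1.0]

for lam in ([0.0], [2.0], [4.0]):
    value, x_bar = solve_relaxation(c, A2, b2, upper, lam)
    print("lambda =", lam, "relaxed value =", value, "x_bar =", x_bar)
```

For this instance the original optimum is 3, attained at x = (1, 0), so the printed relaxed values can be compared against it for each choice of λ.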

The LR solution as a bound

Of particular use is the property that for any fixed [math]\displaystyle{ \tilde{\lambda} \succeq 0 }[/math], the optimal value of the Lagrangian relaxation is no smaller than the optimal value of the original problem. To see this, let [math]\displaystyle{ \hat{x} }[/math] be an optimal solution to the original problem, and let [math]\displaystyle{ \bar{x} }[/math] be an optimal solution to the Lagrangian relaxation. We can then see that

[math]\displaystyle{ c^T \hat{x} \leq c^T \hat{x} +\tilde{\lambda}^T(b_2-A_2 \hat{x} ) \leq c^T \bar{x} +\tilde{\lambda}^T(b_2-A_2 \bar{x} ) }[/math]

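The first inequality holds because [math]\displaystyle{ \hat{x} }[/math] is feasible in the original problem, so [math]\displaystyle{ b_2-A_2 \hat{x} \geq 0 }[/math] and hence [math]\displaystyle{ \tilde{\lambda}^T(b_2-A_2 \hat{x}) \geq 0 }[/math]. The second inequality holds because [math]\displaystyle{ \hat{x} }[/math] also satisfies constraint (1) and is therefore feasible for the Lagrangian relaxation, of which [math]\displaystyle{ \bar{x} }[/math] is an optimal solution.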
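As a small worked illustration (the instance is made up here and is not taken from a reference), consider maximizing [math]\displaystyle{ 3x_1+2x_2 }[/math] subject to (1) [math]\displaystyle{ 0 \le x_1, x_2 \le 1 }[/math] and (2) [math]\displaystyle{ x_1+x_2 \le 1 }[/math]. The original optimum is [math]\displaystyle{ \hat{x}=(1,0) }[/math] with value 3. Relaxing (2) with a single multiplier [math]\displaystyle{ \lambda \ge 0 }[/math], the optimal value of the relaxation is

[math]\displaystyle{ \max_{0 \le x \le 1} \left[ (3-\lambda)x_1+(2-\lambda)x_2+\lambda \right] = \lambda+\max(0,3-\lambda)+\max(0,2-\lambda) }[/math]

This gives 5 for [math]\displaystyle{ \lambda=0 }[/math], 3 for [math]\displaystyle{ \lambda=2 }[/math] and 4 for [math]\displaystyle{ \lambda=4 }[/math]. Every value is at least 3, and in this example the bound is tight for [math]\displaystyle{ 2 \le \lambda \le 3 }[/math].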

Iterating towards a solution of the original problem

The above inequality tells us that if we minimize, over [math]\displaystyle{ \lambda }[/math], the maximum value we obtain from the relaxed problem, we obtain a tighter upper bound on the objective value of our original problem. Thus we can address the original problem by instead exploring the partially dualized problem

min [math]\displaystyle{ P(\lambda) }[/math] s.t. [math]\displaystyle{ \lambda \geq 0 }[/math]

where we define [math]\displaystyle{ P(\lambda) }[/math] as

max [math]\displaystyle{ c^T x+\lambda^T(b_2-A_2x) }[/math]
s.t.
(1) [math]\displaystyle{ A_1 x \le b_1 }[/math]

A Lagrangian relaxation algorithm thus proceeds to explore the range of feasible [math]\displaystyle{ \lambda }[/math] values while seeking to minimize the result returned by the inner [math]\displaystyle{ P }[/math] problem. Each value returned by [math]\displaystyle{ P }[/math] is a candidate upper bound on the optimal value of the original problem, the smallest of which is kept as the best upper bound. If we additionally employ a heuristic, possibly seeded by the [math]\displaystyle{ \bar{x} }[/math] values returned by [math]\displaystyle{ P }[/math], to find feasible solutions to the original problem, then we can iterate until the best upper bound and the objective value of the best feasible solution agree to within a desired tolerance.
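The iteration described above can be sketched in Python with a projected subgradient method. This is only one possible way of exploring the multipliers, chosen here for illustration. The sketch assumes that constraint (1) is a box 0 ≤ x ≤ u, that the entries of A_2 and b_2 are nonnegative, and that a diminishing step size 1/k is acceptable. All function names, the scaling heuristic and the toy data are hypothetical.

```python
def evaluate_P(c, A2, b2, upper, lam):
    """P(lam) = max c^T x + lam^T (b2 - A2 x) s.t. 0 <= x <= upper (assumed box)."""
    n, m2 = len(c), len(b2)
    reduced = [c[j] - sum(lam[i] * A2[i][j] for i in range(m2)) for j in range(n)]
    x_bar = [upper[j] if reduced[j] > 0 else 0.0 for j in range(n)]
    value = sum(lam[i] * b2[i] for i in range(m2))
    value += sum(reduced[j] * x_bar[j] for j in range(n))
    return value, x_bar


def repair(x, A2, b2):
    """Hypothetical heuristic: shrink x until A2 x <= b2 (valid if A2, b2 >= 0)."""
    scale = 1.0
    for row, rhs in zip(A2, b2):
        lhs = sum(a * xj for a, xj in zip(row, x))
        if lhs > rhs:
            scale = min(scale, rhs / lhs)
    return [scale * xj for xj in x]


def lagrangian_subgradient(c, A2, b2, upper, iterations=200, tol=1e-6):
    lam = [0.0] * len(b2)
    best_upper, best_lower, best_x = float("inf"), float("-inf"), None
    for k in range(1, iterations + 1):
        value, x_bar = evaluate_P(c, A2, b2, upper, lam)
        best_upper = min(best_upper, value)      # every P(lam) is an upper bound
        x_feas = repair(x_bar, A2, b2)           # feasible point seeded by x_bar
        objective = sum(cj * xj for cj, xj in zip(c, x_feas))
        if objective > best_lower:
            best_lower, best_x = objective, x_feas
        if best_upper - best_lower <= tol:
            break
        # b2 - A2 x_bar is a subgradient of the convex function P at lam.
        g = [rhs - sum(a * xj for a, xj in zip(row, x_bar))
             for row, rhs in zip(A2, b2)]
        step = 1.0 / k                           # assumed diminishing step size
        # Move against the subgradient and project back onto lam >= 0.
        lam = [max(0.0, li - step * gi) for li, gi in zip(lam, g)]
    return best_upper, best_lower, best_x, lam


# Toy instance (hypothetical): max 3*x1 + 2*x2
# s.t. (1) 0 <= x1, x2 <= 1   and   (2) x1 + x2 <= 1.
best_upper, best_lower, best_x, lam = lagrangian_subgradient(
    [3.0, 2.0], [[1.0, 1.0]], [1.0], [1.0, 1.0])
print(best_upper, best_lower, best_x, lam)
```

On this toy instance the upper and lower bounds should meet at the optimum value 3 after a few iterations. For problems where the relaxation has an integrality or duality gap, the loop would instead stop at the iteration limit with a positive gap.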

Related methods

The augmented Lagrangian method is quite similar in spirit to the Lagrangian relaxation method, but adds an extra quadratic penalty term and updates the dual parameters [math]\displaystyle{ \lambda }[/math] in a more principled manner. It was introduced at the end of the 1960s, was studied extensively in the 1970s, and has been widely used since.

The penalty method does not use dual variables; it removes the constraints and instead penalizes deviations from them. The method is conceptually simple, but augmented Lagrangian methods are usually preferred in practice because the penalty method suffers from ill-conditioning.
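For comparison, the two related methods can be written side by side for a generic equality-constrained problem: minimize [math]\displaystyle{ f(x) }[/math] subject to [math]\displaystyle{ h(x)=0 }[/math]. This generic form, the penalty parameter [math]\displaystyle{ \rho\gt 0 }[/math] and the iteration index [math]\displaystyle{ k }[/math] are assumptions made here for illustration; they are not the linear program used above. The quadratic penalty method minimizes

[math]\displaystyle{ f(x)+\frac{\rho}{2}\|h(x)\|^2 }[/math]

and has to drive [math]\displaystyle{ \rho }[/math] toward infinity to enforce the constraint, which is the source of the ill-conditioning. The augmented Lagrangian method instead minimizes

[math]\displaystyle{ L_\rho(x,\lambda)=f(x)+\lambda^T h(x)+\frac{\rho}{2}\|h(x)\|^2 }[/math]

over [math]\displaystyle{ x }[/math] for the current multiplier, and then updates the multiplier by

[math]\displaystyle{ \lambda_{k+1}=\lambda_k+\rho\, h(x_{k+1}) }[/math]

so that [math]\displaystyle{ \rho }[/math] can remain moderate.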
