Majorization

From HandWiki
Short description: Preorder on vectors of real numbers


In mathematics, majorization is a preorder on vectors of real numbers. For two such vectors, [math]\displaystyle{ \mathbf{x},\ \mathbf{y} \in \mathbb{R}^n }[/math], we say that [math]\displaystyle{ \mathbf{x} }[/math] weakly majorizes (or dominates) [math]\displaystyle{ \mathbf{y} }[/math] from below, commonly denoted [math]\displaystyle{ \mathbf{x} \succ_w \mathbf{y}, }[/math] when

[math]\displaystyle{ \sum_{i=1}^k x_i^{\downarrow} \geq \sum_{i=1}^k y_i^{\downarrow} }[/math] for all [math]\displaystyle{ k=1,\,\dots,\,n }[/math],

where [math]\displaystyle{ x_i^{\downarrow} }[/math] denotes the [math]\displaystyle{ i }[/math]th largest entry of [math]\displaystyle{ \mathbf{x} }[/math]. If [math]\displaystyle{ \mathbf{x}, \mathbf{y} }[/math] further satisfy [math]\displaystyle{ \sum_{i=1}^n x_i = \sum_{i=1}^n y_i }[/math], we say that [math]\displaystyle{ \mathbf{x} }[/math] majorizes (or dominates) [math]\displaystyle{ \mathbf{y} }[/math], commonly denoted [math]\displaystyle{ \mathbf{x} \succ \mathbf{y} }[/math]. Majorization is a partial order for vectors whose entries are non-decreasing, but only a preorder for general vectors, since majorization is agnostic to the ordering of a vector's entries; for example, the statement [math]\displaystyle{ (1,2)\prec (0,3) }[/math] is simply equivalent to [math]\displaystyle{ (2,1)\prec (3,0) }[/math].
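The partial-sum conditions above are straightforward to check numerically. The following is a minimal sketch in Python with NumPy; the function names are illustrative, not from any standard library, and a small tolerance guards against floating-point round-off.

```python
import numpy as np

def weakly_majorizes(x, y, tol=1e-12):
    """True if x weakly majorizes y from below (x >=_w y): the partial sums
    of the entries of x sorted in decreasing order dominate those of y."""
    xs = np.sort(x)[::-1]  # entries of x in decreasing order
    ys = np.sort(y)[::-1]
    return bool(np.all(np.cumsum(xs) >= np.cumsum(ys) - tol))

def majorizes(x, y, tol=1e-12):
    """True if x majorizes y: weak majorization plus equal total sums."""
    return weakly_majorizes(x, y, tol) and abs(np.sum(x) - np.sum(y)) < tol

# The preorder ignores how entries are arranged within each vector:
print(majorizes([0, 3], [1, 2]))  # True: (0,3) majorizes (1,2)
print(majorizes([3, 0], [2, 1]))  # True: the same statement after permuting
```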

Majorizing also sometimes refers to entrywise ordering, e.g. a real-valued function f majorizes a real-valued function g when [math]\displaystyle{ f(x) \geq g(x) }[/math] for all [math]\displaystyle{ x }[/math] in the domain; the term also appears in other technical senses, such as majorizing measures in probability theory.[1]

Equivalent conditions

Geometric definition

Figure 1. 2D majorization example

For [math]\displaystyle{ \mathbf{x},\ \mathbf{y} \in \mathbb{R}^n, }[/math] we have [math]\displaystyle{ \mathbf{x} \prec \mathbf{y} }[/math] if and only if [math]\displaystyle{ \mathbf{x} }[/math] is in the convex hull of all vectors obtained by permuting the coordinates of [math]\displaystyle{ \mathbf{y} }[/math]. This is equivalent to saying that [math]\displaystyle{ \mathbf{x} = \mathbf{D}\mathbf{y} }[/math] for some doubly stochastic matrix [math]\displaystyle{ \mathbf{D} }[/math].[2]:Thm. 2.1 In particular, [math]\displaystyle{ \mathbf{x} }[/math] can be written as a convex combination of [math]\displaystyle{ n }[/math] permutations of [math]\displaystyle{ \mathbf{y} }[/math].[3]

Figure 1 displays the convex hull in 2D for the vector [math]\displaystyle{ \mathbf{y}=(3,\,1) }[/math]. Notice that the center of the convex hull, which is an interval in this case, is the vector [math]\displaystyle{ \mathbf{x}=(2,\,2) }[/math]. This is the "smallest" vector satisfying [math]\displaystyle{ \mathbf{x} \prec \mathbf{y} }[/math] for this given vector [math]\displaystyle{ \mathbf{y} }[/math]. Figure 2 shows the convex hull in 3D. The center of the convex hull, which is a 2D polygon in this case, is the "smallest" vector [math]\displaystyle{ \mathbf{x} }[/math] satisfying [math]\displaystyle{ \mathbf{x} \prec \mathbf{y} }[/math] for this given vector [math]\displaystyle{ \mathbf{y} }[/math].
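The doubly stochastic characterization can be checked directly for the 2D example above. This sketch assumes the relation [math]\displaystyle{ \mathbf{x} = \mathbf{D}\mathbf{y} }[/math] for [math]\displaystyle{ \mathbf{x} \prec \mathbf{y} }[/math] and uses the simplest averaging matrix as witness:

```python
import numpy as np

# The 2D example: y = (3, 1) and x = (2, 2), with x ≺ y.
y = np.array([3.0, 1.0])
x = np.array([2.0, 2.0])

# A doubly stochastic matrix: nonnegative entries, every row and column sums to 1.
D = np.array([[0.5, 0.5],
              [0.5, 0.5]])
assert np.allclose(D.sum(axis=0), 1.0) and np.allclose(D.sum(axis=1), 1.0)

# Applying D to y averages its entries and recovers x, witnessing x ≺ y.
print(D @ y)  # [2. 2.]
```

Here [math]\displaystyle{ \mathbf{D} }[/math] is also the average of the two permutation matrices of order 2, matching the convex-combination statement.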

Figure 2. 3D Majorization Example

Other definitions

Each of the following statements is true if and only if [math]\displaystyle{ \mathbf{x}\succ \mathbf{y} }[/math].

  • From [math]\displaystyle{ \mathbf{x} }[/math] we can produce [math]\displaystyle{ \mathbf{y} }[/math] by a finite sequence of "Robin Hood operations" where we replace two elements [math]\displaystyle{ x_i }[/math] and [math]\displaystyle{ x_j \lt x_i }[/math] with [math]\displaystyle{ x_i-\varepsilon }[/math] and [math]\displaystyle{ x_j+\varepsilon }[/math], respectively, for some [math]\displaystyle{ \varepsilon \in (0, x_i-x_j) }[/math].[2]:11
  • For every convex function [math]\displaystyle{ h:\mathbb{R}\to \mathbb{R} }[/math], [math]\displaystyle{ \sum_{i=1}^n h(x_i) \geq \sum_{i=1}^n h(y_i) }[/math].[2]:Thm. 2.9
    • In fact, a special case suffices: [math]\displaystyle{ \sum_i{x_i}=\sum_i{y_i} }[/math] and, for every [math]\displaystyle{ t }[/math], [math]\displaystyle{ \sum_{i=1}^n \max(0,x_i-t) \geq\sum_{i=1}^n \max(0,y_i-t) }[/math].[4]
  • For every [math]\displaystyle{ t \in \mathbb{R} }[/math], [math]\displaystyle{ \sum_{j=1}^n |x_j-t| \geq \sum_{j=1}^n |y_j-t| }[/math].[5]:Exercise 12.17
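These equivalent conditions can be verified numerically on a concrete pair. In this sketch the pair [math]\displaystyle{ \mathbf{x} \succ \mathbf{y} }[/math] and the grid of thresholds are chosen purely for illustration:

```python
import numpy as np

x = np.array([3.0, 1.0, 0.0])  # x majorizes y: equal sums, dominating partial sums
y = np.array([2.0, 1.0, 1.0])

# Convex-function test with the particular convex choice h(u) = u**2:
assert (x**2).sum() >= (y**2).sum()

# The "hinge" special case: sum of max(0, entry - t), over a grid of thresholds t.
for t in np.linspace(-2, 4, 61):
    assert np.maximum(0, x - t).sum() >= np.maximum(0, y - t).sum() - 1e-12

# Absolute-deviation test: sum of |entry - t| over the same grid.
for t in np.linspace(-2, 4, 61):
    assert np.abs(x - t).sum() >= np.abs(y - t).sum() - 1e-12

print("all sampled conditions hold for this pair")
```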

Examples

Among non-negative vectors with three components, [math]\displaystyle{ (1, 0, 0) }[/math] and permutations of it majorize all other vectors [math]\displaystyle{ (p_1, p_2, p_3) }[/math] such that [math]\displaystyle{ p_1 + p_2 + p_3 = 1 }[/math]. For example, [math]\displaystyle{ (1, 0, 0) \succ (1/2, 0, 1/2) }[/math]. Similarly, [math]\displaystyle{ (1/3, 1/3, 1/3) }[/math] is majorized by all other such vectors, so [math]\displaystyle{ (1/2, 0, 1/2) \succ (1/3, 1/3, 1/3) }[/math].

This behavior extends to probability vectors of any length [math]\displaystyle{ n }[/math]: the point-mass vector [math]\displaystyle{ (1, 0, \dots, 0) }[/math] (and its permutations) majorizes all other probability vectors, and the uniform distribution [math]\displaystyle{ (1/n, \dots, 1/n) }[/math] is majorized by all probability vectors.
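These two extremes can be spot-checked against a random probability vector. A minimal sketch, assuming an illustrative `majorizes` helper (not a library function):

```python
import numpy as np

def majorizes(x, y, tol=1e-12):
    """True if x majorizes y: dominating partial sums of sorted entries, equal totals."""
    xs, ys = np.sort(x)[::-1], np.sort(y)[::-1]
    return (bool(np.all(np.cumsum(xs) >= np.cumsum(ys) - tol))
            and abs(xs.sum() - ys.sum()) < tol)

rng = np.random.default_rng(0)
n = 5
p = rng.dirichlet(np.ones(n))   # a random probability vector of length n
point = np.eye(n)[0]            # the point mass (1, 0, ..., 0)
uniform = np.full(n, 1.0 / n)   # the uniform distribution

print(majorizes(point, p))      # True: the point mass majorizes every p
print(majorizes(p, uniform))    # True: every p majorizes the uniform vector
```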

Schur convexity

Main page: Schur-convex function

A function [math]\displaystyle{ f:\mathbb{R}^n \to \mathbb{R} }[/math] is said to be Schur convex when [math]\displaystyle{ \mathbf{x} \succ \mathbf{y} }[/math] implies [math]\displaystyle{ f(\mathbf{x}) \geq f(\mathbf{y}) }[/math]. Hence, Schur-convex functions translate the ordering of vectors to a standard ordering in [math]\displaystyle{ \mathbb{R} }[/math]. Similarly, [math]\displaystyle{ f(\mathbf{x}) }[/math] is Schur concave when [math]\displaystyle{ \mathbf{x} \succ \mathbf{y} }[/math] implies [math]\displaystyle{ f(\mathbf{x}) \leq f(\mathbf{y}). }[/math]

An example of a Schur-convex function is the max function, [math]\displaystyle{ \max(\mathbf{x})=x_{1}^{\downarrow} }[/math]. Schur-convex functions are necessarily symmetric: the entries of the argument can be permuted without changing the value of the function. Therefore, linear functions, though convex, are not Schur-convex unless they are symmetric. If a function is symmetric and convex, then it is Schur-convex.
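These claims are easy to probe on a concrete majorizing pair. The functions below are illustrative choices, not canonical examples from any reference:

```python
import numpy as np

# x = (3, 1, 0) majorizes y = (2, 1, 1): equal sums, dominating partial sums.
x = np.array([3.0, 1.0, 0.0])
y = np.array([2.0, 1.0, 1.0])

# max is Schur-convex: its value can only grow going up the majorization order.
assert x.max() >= y.max()

# A symmetric convex function, here the sum of squares, is also Schur-convex.
assert (x**2).sum() >= (y**2).sum()

# A non-symmetric linear function need not respect the order: take f(v) = v[0]
# and the permuted vector x' = (0, 1, 3), which still majorizes y.
x_perm = np.array([0.0, 1.0, 3.0])
print(x_perm[0] >= y[0])  # False: linear but not symmetric, so not Schur-convex
```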

Generalizations

Majorization can be generalized to the Lorenz ordering, a partial order on distribution functions. For example, a wealth distribution is Lorenz-greater than another if its Lorenz curve lies below that of the other. As such, a Lorenz-greater wealth distribution has a higher Gini coefficient and more income disparity.[6]

The majorization preorder can be naturally extended to density matrices in the context of quantum information.[5][7] In particular, [math]\displaystyle{ \rho\succ\rho' }[/math] exactly when [math]\displaystyle{ \mathrm{spec}[\rho]\succ\mathrm{spec}[\rho'] }[/math] (where [math]\displaystyle{ \mathrm{spec} }[/math] denotes the state's spectrum).
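The spectral criterion can be illustrated with two small density matrices. The matrices below are illustrative examples (Hermitian, positive semidefinite, trace one), and the `majorizes` helper is an assumed utility, not a library function:

```python
import numpy as np

def majorizes(a, b, tol=1e-10):
    """True if the real vector a majorizes b."""
    a, b = np.sort(a)[::-1], np.sort(b)[::-1]
    return (bool(np.all(np.cumsum(a) >= np.cumsum(b) - tol))
            and abs(a.sum() - b.sum()) < tol)

# Example density matrices: a pure state and a mixed state, both trace one.
rho_pure = np.diag([1.0, 0.0])
rho_mixed = np.array([[0.6, 0.2],
                      [0.2, 0.4]])

# Compare the spectra: rho ≻ rho' exactly when spec[rho] ≻ spec[rho'].
spec_pure = np.linalg.eigvalsh(rho_pure)
spec_mixed = np.linalg.eigvalsh(rho_mixed)
print(majorizes(spec_pure, spec_mixed))  # True: a pure state majorizes any state
```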

Similarly, one can say a Hermitian operator, [math]\displaystyle{ \mathbf{H} }[/math], majorizes another, [math]\displaystyle{ \mathbf{M} }[/math], if the set of eigenvalues of [math]\displaystyle{ \mathbf{H} }[/math] majorizes that of [math]\displaystyle{ \mathbf{M} }[/math].

Notes

  1. Talagrand, Michel (1996-07-01). "Majorizing measures: the generic chaining". The Annals of Probability 24 (3). doi:10.1214/aop/1065725175. ISSN 0091-1798. https://projecteuclid.org/journals/annals-of-probability/volume-24/issue-3/Majorizing-measures-the-generic-chaining/10.1214/aop/1065725175.full. 
  2. 2.0 2.1 2.2 Barry C. Arnold. "Majorization and the Lorenz Order: A Brief Introduction". Springer-Verlag Lecture Notes in Statistics, vol. 43, 1987.
  3. Xingzhi, Zhan (2003). "The sharp Rado theorem for majorizations". The American Mathematical Monthly 110 (2): 152–153. doi:10.2307/3647776. 
  4. July 3, 2005 post by fleeting_guest on "The Karamata Inequality" thread, AoPS community forums. Archived 11 November 2020.
  5. 5.0 5.1 Nielsen, Michael A.; Chuang, Isaac L. (2010). Quantum Computation and Quantum Information (2nd ed.). Cambridge: Cambridge University Press. ISBN 978-1-107-00217-3. OCLC 844974180. 
  6. Marshall, Albert W. (2011). "14, 15". Inequalities : theory of majorization and its applications. Ingram Olkin, Barry C. Arnold (2nd ed.). New York: Springer Science+Business Media, LLC. ISBN 978-0-387-68276-1. OCLC 694574026. https://www.worldcat.org/oclc/694574026. 
  7. Wehrl, Alfred (1 April 1978). "General properties of entropy". Reviews of Modern Physics 50 (2): 221–260. doi:10.1103/RevModPhys.50.221. Bibcode: 1978RvMP...50..221W. https://link.aps.org/doi/10.1103/RevModPhys.50.221. 

References

  • J. Karamata. "Sur une inegalite relative aux fonctions convexes." Publ. Math. Univ. Belgrade 1, 145–158, 1932.
  • G. H. Hardy, J. E. Littlewood and G. Pólya, Inequalities, 2nd edition, 1952, Cambridge University Press, London.
  • Inequalities: Theory of Majorization and Its Applications Albert W. Marshall, Ingram Olkin, Barry Arnold, Second edition. Springer Series in Statistics. Springer, New York, 2011. ISBN:978-0-387-40087-7
  • A tribute to Marshall and Olkin's book "Inequalities: Theory of Majorization and its Applications"
  • Matrix Analysis (1996) Rajendra Bhatia, Springer, ISBN:978-0-387-94846-1
  • Topics in Matrix Analysis (1994) Roger A. Horn and Charles R. Johnson, Cambridge University Press, ISBN:978-0-521-46713-1
  • Majorization and Matrix Monotone Functions in Wireless Communications (2007) Eduard Jorswieck and Holger Boche, Now Publishers, ISBN:978-1-60198-040-3
  • The Cauchy Schwarz Master Class (2004) J. Michael Steele, Cambridge University Press, ISBN:978-0-521-54677-5
