Brascamp–Lieb inequality

In mathematics, the Brascamp–Lieb inequality is either of two inequalities. The first is a result in geometry concerning integrable functions on n-dimensional Euclidean space [math]\displaystyle{ \mathbb{R}^{n} }[/math]. It generalizes the Loomis–Whitney inequality and Hölder's inequality. The second is a result of probability theory which gives a concentration inequality for log-concave probability distributions. Both are named after Herm Jan Brascamp and Elliott H. Lieb.

The geometric inequality

Fix natural numbers m and n. For 1 ≤ i ≤ m, let [math]\displaystyle{ n_i \in \mathbb{N} }[/math] and let [math]\displaystyle{ c_i \gt 0 }[/math] so that

[math]\displaystyle{ \sum_{i = 1}^m c_i n_i = n. }[/math]

Choose non-negative, integrable functions

[math]\displaystyle{ f_i \in L^1 \left( \mathbb{R}^{n_i} ; [0, + \infty] \right) }[/math]

and surjective linear maps

[math]\displaystyle{ B_i : \mathbb{R}^n \to \mathbb{R}^{n_i}. }[/math]

Then the following inequality holds:

[math]\displaystyle{ \int_{\mathbb{R}^n} \prod_{i = 1}^m f_i \left( B_i x \right)^{c_i} \, \mathrm{d} x \leq D^{- 1/2} \prod_{i = 1}^m \left( \int_{\mathbb{R}^{n_i}} f_i (y) \, \mathrm{d} y \right)^{c_i}, }[/math]

where D is given by

[math]\displaystyle{ D = \inf \left\{ \left. \frac{\det \left( \sum_{i = 1}^m c_i B_i^{*} A_i B_i \right)}{\prod_{i = 1}^m ( \det A_i )^{c_i}} \right| A_i \text{ is a positive-definite } n_i \times n_i \text{ matrix} \right\}. }[/math]

Another way to state this is that the constant D is what one would obtain by restricting attention to the case in which each [math]\displaystyle{ f_{i} }[/math] is a centered Gaussian function, namely [math]\displaystyle{ f_{i}(y) = \exp \{-(y,\, A_{i}\, y)\} }[/math].[1]
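As an illustration of the Gaussian characterization, the following minimal sketch evaluates the ratio inside the infimum defining D for the Loomis–Whitney setting in [math]\displaystyle{ \mathbb{R}^{3} }[/math]. The choice of datum (each B_i drops one coordinate, c_i = 1/2) and the helper names `gaussian_ratio` and `random_spd` are assumptions made for this example only. Random sampling cannot prove the value of an infimum; it only gives numerical evidence that the ratio stays at or above 1 and equals 1 at identity matrices.

```python
import numpy as np

def gaussian_ratio(B, c, A):
    """det(sum_i c_i B_i^T A_i B_i) / prod_i det(A_i)^{c_i}."""
    M = sum(ci * Bi.T @ Ai @ Bi for Bi, ci, Ai in zip(B, c, A))
    denom = np.prod([np.linalg.det(Ai) ** ci for Ai, ci in zip(A, c)])
    return np.linalg.det(M) / denom

def random_spd(k, rng):
    """Random symmetric positive-definite k x k matrix (hypothetical helper)."""
    G = rng.standard_normal((k, k))
    return G @ G.T + 0.1 * np.eye(k)

# Assumed datum: Loomis-Whitney in R^3, B_i deletes the i-th coordinate, c_i = 1/2.
n = 3
B = [np.delete(np.eye(n), i, axis=0) for i in range(n)]
c = [0.5] * n

# Scaling condition sum_i c_i n_i = n from the statement of the inequality.
scaling = sum(ci * Bi.shape[0] for Bi, ci in zip(B, c))

rng = np.random.default_rng(0)
ratio_identity = gaussian_ratio(B, c, [np.eye(2)] * n)
ratios = [
    gaussian_ratio(B, c, [random_spd(2, rng) for _ in range(n)])
    for _ in range(1000)
]
print(scaling, ratio_identity, min(ratios))
```

If the sampled minimum stays at or above 1, this is consistent with D = 1 for this datum, which matches the constant 1 in the Loomis–Whitney inequality.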

Alternative forms

Consider a probability density function [math]\displaystyle{ p(x)=\exp(-\phi(x)) }[/math]. The density [math]\displaystyle{ p(x) }[/math] is said to be log-concave if the function [math]\displaystyle{ \phi(x) }[/math] is convex. Such probability density functions have tails which decay exponentially fast, so most of the probability mass resides in a small region around the mode of [math]\displaystyle{ p(x) }[/math]. The Brascamp–Lieb inequality gives another characterization of this concentration of [math]\displaystyle{ p(x) }[/math] by bounding the variance of any statistic [math]\displaystyle{ S(x) }[/math].

Formally, let [math]\displaystyle{ S(x) }[/math] be any differentiable function. The Brascamp–Lieb inequality reads:

[math]\displaystyle{ \operatorname{var}_p (S(x)) \leq E_p (\nabla^T S(x) [H \phi(x)]^{-1} \nabla S(x)) }[/math]

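where [math]\displaystyle{ H \phi(x) }[/math] is the Hessian of [math]\displaystyle{ \phi }[/math], assumed to be positive definite so that its inverse exists, and [math]\displaystyle{ \nabla }[/math] denotes the gradient.[2]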
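The variance bound can be checked numerically in one dimension. The sketch below assumes the potential [math]\displaystyle{ \phi(x) = x^4/4 + x^2/2 }[/math] (so the second derivative is 3x² + 1 > 0), which is chosen purely for illustration, and approximates expectations by Riemann sums on a grid. The helper names `expect` and `bl_sides` are hypothetical.

```python
import numpy as np

# Assumed example potential: phi(x) = x^4/4 + x^2/2, phi''(x) = 3x^2 + 1 > 0.
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
phi = x**4 / 4 + x**2 / 2
hess = 3 * x**2 + 1

w = np.exp(-phi)
p = w / (w.sum() * dx)  # density normalised on the grid

def expect(g):
    """Riemann-sum approximation of E_p[g(x)]."""
    return float((g * p).sum() * dx)

def bl_sides(S, dS):
    """Return (var_p S, E_p[(S')^2 / phi'']) for a statistic S with derivative dS."""
    variance = expect(S**2) - expect(S) ** 2
    bound = expect(dS**2 / hess)
    return variance, bound

var_x, bound_x = bl_sides(x, np.ones_like(x))        # S(x) = x
var_sin, bound_sin = bl_sides(np.sin(x), np.cos(x))  # S(x) = sin(x)
print(var_x, bound_x)
print(var_sin, bound_sin)
```

In each printed pair the first number (the variance) should not exceed the second (the Brascamp–Lieb bound), up to discretisation error.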

BCCT inequality

The inequality was generalized in 2008[3] to account for both continuous and discrete cases, and for all linear maps, with precise estimates on the constant.

Definition: the Brascamp-Lieb datum (BL datum)

  • [math]\displaystyle{ d, n\geq 1 }[/math].
  • [math]\displaystyle{ d_1, ..., d_n \in \{1, 2, ..., d\} }[/math].
  • [math]\displaystyle{ p_1, ..., p_n \in [0, \infty) }[/math].
  • [math]\displaystyle{ B_i: \R^d \to \R^{d_i} }[/math] are linear surjections, with zero common kernel: [math]\displaystyle{ \bigcap_i \ker(B_i) = \{0\} }[/math].
  • Call [math]\displaystyle{ (B, p) = (B_1, ..., B_n, p_1, ..., p_n) }[/math] a Brascamp-Lieb datum (BL datum).

For any [math]\displaystyle{ f_i \in L^1(\R^{d_i}) }[/math] with [math]\displaystyle{ f_i \geq 0 }[/math], define

[math]\displaystyle{ BL(B, p, f) := \frac{\int_{\R^d} \prod_{j=1}^n\left(f_j \circ B_j\right)^{p_j}}{\prod_{j=1}^n\left(\int_{\R^{d_j}} f_j\right)^{p_j}} }[/math]

Now define the Brascamp-Lieb constant for the BL datum as the supremum over all such [math]\displaystyle{ f = (f_1, ..., f_n) }[/math]:

[math]\displaystyle{ BL(B, p) = \sup_{f }BL(B, p, f) }[/math]

Theorem — (BCCT, 2007)

[math]\displaystyle{ BL(B, p) }[/math] is finite if and only if [math]\displaystyle{ d = \sum_i p_i d_i }[/math] and, for every subspace [math]\displaystyle{ V }[/math] of [math]\displaystyle{ \R^d }[/math],

[math]\displaystyle{ dim(V) \leq\sum_i p_i dim(B_i(V)) }[/math]

[math]\displaystyle{ BL(B, p) }[/math] is reached by Gaussians:

  • If [math]\displaystyle{ BL(B, p) }[/math] is finite, then there exist linear operators [math]\displaystyle{ A_i : \R^{d_i} \to \R^{d_i} }[/math] such that [math]\displaystyle{ f_i = e^{-\langle A_i x, x\rangle} }[/math] achieves the upper bound.
  • If [math]\displaystyle{ BL(B, p) }[/math] is infinite, then there exists a sequence of Gaussians for which

[math]\displaystyle{ \frac{\int_{\R^d} \prod_{j=1}^n\left(f_j \circ B_j\right)^{p_j}}{\prod_{j=1}^n\left(\int_{\R^{d_j}} f_j\right)^{p_j}} \to \infty }[/math]

Discrete case

Setup:

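  • BL datum defined as [math]\displaystyle{ (G, G_1, ..., G_n, \phi_1, ..., \phi_n) }[/math], where [math]\displaystyle{ G }[/math] and [math]\displaystyle{ G_1, ..., G_n }[/math] are finitely generated abelian groups and each [math]\displaystyle{ \phi_j : G \to G_j }[/math] is a group homomorphism.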
  • [math]\displaystyle{ T(G) }[/math] is the torsion subgroup, that is, the subgroup of finite-order elements.

With this setup, we have (Theorem 2.4,[4] Theorem 3.12 [5])

Theorem — If there exist [math]\displaystyle{ s_1, ..., s_n \in [0, 1] }[/math] such that

[math]\displaystyle{ \operatorname{rank}(H) \leq \sum_j s_j \operatorname{rank}(\phi_j(H)) \quad \forall H \leq G }[/math]

then for all [math]\displaystyle{ 0 \leq f_j \in \ell^{1/s_j}(G_j) }[/math],

[math]\displaystyle{ \left\|\prod_j f_j \circ \phi_j\right\|_1 \leq |T(G)| \prod_j \|f_j \|_{1/s_j} }[/math]

and in particular, for every finite subset [math]\displaystyle{ E \subset G }[/math],

[math]\displaystyle{ |E| \leq |T(G)| \prod_j |\phi_j(E)|^{s_j} }[/math]

Note that the constant [math]\displaystyle{ |T(G)| }[/math] is not always tight.
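The set inequality can be tried out on a concrete example. The sketch below assumes the discrete Loomis–Whitney datum: [math]\displaystyle{ G = \mathbb{Z}^3 }[/math], the three coordinate projections onto [math]\displaystyle{ \mathbb{Z}^2 }[/math], and exponents s_j = 1/2. For this datum the torsion subgroup is trivial, so |T(G)| = 1. The datum and the function names are choices made for this illustration, and the rank hypothesis is assumed to hold for it as in the classical Loomis–Whitney setting. To avoid square roots, the check is done in the equivalent squared form |E|² ≤ |φ₁(E)| |φ₂(E)| |φ₃(E)|.

```python
import random

def projections(E):
    """Images of E under the three coordinate projections Z^3 -> Z^2."""
    return (
        {(y, z) for (x, y, z) in E},
        {(x, z) for (x, y, z) in E},
        {(x, y) for (x, y, z) in E},
    )

def check_discrete_bl(E):
    """Check |E| <= prod_j |phi_j(E)|^(1/2) in squared integer form."""
    p1, p2, p3 = projections(E)
    return len(E) ** 2 <= len(p1) * len(p2) * len(p3)

random.seed(0)
results = []
for _ in range(200):
    size = random.randint(1, 40)
    # Random finite subset of {0,...,4}^3 (duplicates collapse in the set).
    E = {tuple(random.randint(0, 4) for _ in range(3)) for _ in range(size)}
    results.append(check_discrete_bl(E))

# A full 3 x 3 x 3 cube gives equality: 27^2 = 9 * 9 * 9.
cube = {(x, y, z) for x in range(3) for y in range(3) for z in range(3)}
cube_projection_sizes = [len(P) for P in projections(cube)]
print(all(results), len(cube), cube_projection_sizes)
```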

BL polytope

Given BL datum [math]\displaystyle{ (B, p) }[/math], the conditions for [math]\displaystyle{ BL(B, p) \lt \infty }[/math] are

  • [math]\displaystyle{ d = \sum_i p_i d_i }[/math], and
  • for every subspace [math]\displaystyle{ V }[/math] of [math]\displaystyle{ \R^d }[/math], [math]\displaystyle{ \dim(V) \leq\sum_i p_i \dim(B_i(V)) }[/math]

Thus, the subset of [math]\displaystyle{ p\in [0, \infty)^n }[/math] that satisfies the above two conditions is a closed convex polytope defined by linear inequalities. This is the BL polytope.

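Note that while there are infinitely many possible choices of subspace [math]\displaystyle{ V }[/math] of [math]\displaystyle{ \R^d }[/math], each of [math]\displaystyle{ \dim(V) }[/math] and [math]\displaystyle{ \dim(B_i(V)) }[/math] is an integer between 0 and [math]\displaystyle{ d }[/math], so there are only finitely many distinct inequalities of the form [math]\displaystyle{ \dim(V) \leq\sum_i p_i \dim(B_i(V)) }[/math]. Hence the subset is a closed convex polytope.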

Similarly we can define the BL polytope for the discrete case.
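For the continuous case, the two conditions can be tested mechanically on any finite list of candidate subspaces. The following sketch is only a partial test: passing it for the supplied subspaces is necessary but does not by itself certify membership in the BL polytope, whereas a single violated inequality does rule a point out. The function name `satisfies_bl_conditions` and the Loomis–Whitney datum used to exercise it are assumptions made for this example.

```python
import numpy as np

def satisfies_bl_conditions(B, p, subspaces, tol=1e-9):
    """Check the scaling condition and the dimension inequality on given subspaces.

    B: list of (d_i x d) matrices; p: list of exponents;
    subspaces: list of (d x k) matrices whose columns span a test subspace V.
    """
    d = B[0].shape[1]
    scaling_ok = abs(sum(pi * Bi.shape[0] for Bi, pi in zip(B, p)) - d) <= tol
    dims_ok = all(
        np.linalg.matrix_rank(V)
        <= sum(pi * np.linalg.matrix_rank(Bi @ V) for Bi, pi in zip(B, p)) + tol
        for V in subspaces
    )
    return bool(scaling_ok and dims_ok)

# Assumed datum: B_i deletes the i-th coordinate of R^3, so d_i = 2.
d = 3
I = np.eye(d)
B = [np.delete(I, i, axis=0) for i in range(d)]

# Test subspaces: coordinate axes, coordinate planes, and the whole space.
axes = [I[:, [k]] for k in range(d)]
planes = [np.delete(I, k, axis=1) for k in range(d)]
tests = axes + planes + [I]

inside = satisfies_bl_conditions(B, [0.5, 0.5, 0.5], tests)
outside = satisfies_bl_conditions(B, [1.0, 0.5, 0.0], tests)
print(inside, outside)
```

The second exponent vector satisfies the scaling condition but fails the dimension inequality on the first coordinate axis, so it lies outside the polytope.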

Relationships to other inequalities

The geometric Brascamp–Lieb inequality

The geometric Brascamp–Lieb inequality, first derived in 1976,[6] is a special case of the general inequality. It was used by Keith Ball, in 1989, to provide upper bounds for volumes of central sections of cubes.[7]

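For i = 1, ..., m, let [math]\displaystyle{ c_i \gt 0 }[/math] and let [math]\displaystyle{ u_i \in S^{n-1} }[/math] be a unit vector; suppose that [math]\displaystyle{ c_i }[/math] and [math]\displaystyle{ u_i }[/math] satisfy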

[math]\displaystyle{ x = \sum_{i = 1}^m c_i (x \cdot u_i) u_i }[/math]

for all x in [math]\displaystyle{ \mathbb{R}^{n} }[/math]. Let [math]\displaystyle{ f_i \in L^1(\mathbb{R}; [0, +\infty]) }[/math] for each i = 1, ..., m. Then

[math]\displaystyle{ \int_{\mathbb{R}^n} \prod_{i = 1}^m f_i (x \cdot u_i)^{c_i} \, \mathrm{d} x \leq \prod_{i = 1}^m \left( \int_{\mathbb{R}} f_i (y) \, \mathrm{d} y \right)^{c_i}. }[/math]

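The geometric Brascamp–Lieb inequality follows from the Brascamp–Lieb inequality as stated above by taking [math]\displaystyle{ n_i = 1 }[/math] and [math]\displaystyle{ B_i(x) = x \cdot u_i }[/math]. Then, for [math]\displaystyle{ z_i \in \mathbb{R} }[/math],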

[math]\displaystyle{ B_i^{*} (z_i) = z_i u_i. }[/math]

It follows that D = 1 in this case.
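A small numerical illustration of this special case: the sketch below assumes three unit vectors at 120° in the plane with weights c_i = 2/3 (an example chosen here, not taken from the sources), checks the decomposition of the identity, and samples the ratio defining D. Since n_i = 1, each A_i is a positive scalar a_i and [math]\displaystyle{ B_i^{*} A_i B_i = a_i u_i u_i^{T} }[/math].

```python
import numpy as np

# Assumed example: three unit vectors at 120 degrees in R^2, weights c_i = 2/3.
angles = np.array([0.0, 2 * np.pi / 3, 4 * np.pi / 3])
U = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # rows are u_i
c = np.full(3, 2.0 / 3.0)

# Decomposition of the identity: sum_i c_i u_i u_i^T should equal I.
decomposition = sum(ci * np.outer(ui, ui) for ci, ui in zip(c, U))

def ratio(a):
    """det(sum_i c_i a_i u_i u_i^T) / prod_i a_i^{c_i} for positive scalars a_i."""
    M = sum(ci * ai * np.outer(ui, ui) for ci, ai, ui in zip(c, a, U))
    return np.linalg.det(M) / np.prod(a ** c)

rng = np.random.default_rng(1)
ratio_at_ones = ratio(np.ones(3))
samples = [ratio(rng.uniform(0.1, 10.0, size=3)) for _ in range(1000)]
print(np.round(decomposition, 12))
print(ratio_at_ones, min(samples))
```

The ratio equals 1 when all a_i are 1 and the sampled values should not fall below 1, which is consistent with D = 1.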

Hölder's inequality

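Take [math]\displaystyle{ n_i = n }[/math] and [math]\displaystyle{ B_i = \mathrm{id} }[/math], the identity map on [math]\displaystyle{ \mathbb{R}^{n} }[/math], let [math]\displaystyle{ c_i = 1 / p_i }[/math] for 1 ≤ i ≤ m, and replace [math]\displaystyle{ f_i }[/math] by [math]\displaystyle{ f_i^{1/c_i} }[/math]. Then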

[math]\displaystyle{ \sum_{i = 1}^m \frac{1}{p_i} = 1 }[/math]

and the log-concavity of the determinant of a positive definite matrix implies that D = 1. This yields Hölder's inequality in [math]\displaystyle{ \mathbb{R}^{n} }[/math]:

[math]\displaystyle{ \int_{\mathbb{R}^n} \prod_{i = 1}^m f_{i} (x) \, \mathrm{d} x \leq \prod_{i = 1}^{m} \| f_i \|_{p_i}. }[/math]

Poincaré inequality

The Brascamp–Lieb inequality extends the Gaussian Poincaré inequality, which concerns only Gaussian probability distributions, to log-concave probability distributions.[8]

Cramér–Rao bound

The Brascamp–Lieb inequality is also related to the Cramér–Rao bound.[8] While Brascamp–Lieb gives an upper bound, the Cramér–Rao bound gives a lower bound on the variance [math]\displaystyle{ \operatorname{var}_p (S(x)) }[/math]. The Cramér–Rao bound states

[math]\displaystyle{ \operatorname{var}_p (S(x)) \geq E_p (\nabla^T S(x) ) [ E_p( H \phi(x) )]^{-1} E_p( \nabla S(x) ), }[/math]

which is very similar to the Brascamp–Lieb inequality in the alternative form shown above.
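The two bounds can be compared side by side in one dimension. The sketch below assumes the potential [math]\displaystyle{ \phi(x) = \cosh(x) }[/math] (whose second derivative cosh(x) is positive) and the statistic S(x) = x, both chosen only for illustration, and approximates expectations by Riemann sums on a grid.

```python
import numpy as np

# Assumed example: phi(x) = cosh(x), so phi''(x) = cosh(x) > 0; statistic S(x) = x.
x = np.linspace(-12.0, 12.0, 240001)
dx = x[1] - x[0]
hess = np.cosh(x)

w = np.exp(-np.cosh(x))
p = w / (w.sum() * dx)  # density normalised on the grid

def expect(g):
    """Riemann-sum approximation of E_p[g(x)]."""
    return float((g * p).sum() * dx)

S = x
dS = np.ones_like(x)

variance = expect(S**2) - expect(S) ** 2
cramer_rao_lower = expect(dS) ** 2 / expect(hess)   # E[S']^2 / E[phi'']
brascamp_lieb_upper = expect(dS**2 / hess)          # E[(S')^2 / phi'']
print(cramer_rao_lower, variance, brascamp_lieb_upper)
```

The three printed numbers should appear in increasing order, with the variance sandwiched between the Cramér–Rao lower bound and the Brascamp–Lieb upper bound.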

References

  1. This inequality is in Lieb, Elliott H. (1990). "Gaussian Kernels have only Gaussian Maximizers". Inventiones Mathematicae 102: 179–208. doi:10.1007/bf01233426. Bibcode: 1990InMat.102..179L.
  2. This theorem was originally derived in Brascamp, Herm J.; Lieb, Elliott H. (1976). "On Extensions of the Brunn–Minkowski and Prékopa–Leindler theorems, including inequalities for log concave functions, and with an application to the diffusion equation". Journal of Functional Analysis 22 (4): 366–389. doi:10.1016/0022-1236(76)90004-5. Extensions of the inequality can be found in Hargé, Gilles (2008). "Reinforcement of an Inequality due to Brascamp and Lieb". Journal of Functional Analysis 254 (2): 267–300. doi:10.1016/j.jfa.2007.07.019. and Carlen, Eric A.; Cordero-Erausquin, Dario; Lieb, Elliott H. (2013). "Asymmetric Covariance Estimates of Brascamp-Lieb Type and Related Inequalities for Log-concave Measures". Annales de l'Institut Henri Poincaré B 49 (1): 1–12. doi:10.1214/11-aihp462. Bibcode: 2013AIHPB..49....1C.
  3. Bennett, Jonathan; Carbery, Anthony; Christ, Michael; Tao, Terence (2008-01-01). "The Brascamp–Lieb Inequalities: Finiteness, Structure and Extremals" (in en). Geometric and Functional Analysis 17 (5): 1343–1415. doi:10.1007/s00039-007-0619-6. ISSN 1420-8970. https://doi.org/10.1007/s00039-007-0619-6. 
  4. Bennett, Jonathan; Carbery, Anthony; Christ, Michael; Tao, Terence (2005-05-31). "Finite bounds for Holder-Brascamp-Lieb multilinear inequalities". arXiv:math/0505691.
  5. Christ, Michael; Demmel, James; Knight, Nicholas; Scanlon, Thomas; Yelick, Katherine (2013-07-31). "Communication lower bounds and optimal algorithms for programs that reference arrays -- Part 1". arXiv:1308.0068 [math.CA].
  6. This was derived first in Brascamp, H. J.; Lieb, E. H. (1976). "Best Constants in Young's Inequality, Its Converse and Its Generalization to More Than Three Functions". Advances in Mathematics 20 (2): 151–172. doi:10.1016/0001-8708(76)90184-5. 
  7. Ball, Keith (1989). "Volumes of Sections of Cubes and Related Problems". Geometric Aspects of Functional Analysis. Lecture Notes in Mathematics. 1376. Berlin: Springer. pp. 251–260. doi:10.1007/BFb0090058. ISBN 978-3-540-51303-2.
  8. Saumard, Adrien; Wellner, Jon A. (2014). "Log-concavity and strong log-concavity: a review". Statistics Surveys 8: 45–114. doi:10.1214/14-SS107. PMID 27134693.