Log sum inequality
The log sum inequality is used for proving theorems in information theory.
Statement
Let [math]\displaystyle{ a_1,\ldots,a_n }[/math] and [math]\displaystyle{ b_1,\ldots,b_n }[/math] be nonnegative numbers. Denote the sum of all [math]\displaystyle{ a_i }[/math]s by [math]\displaystyle{ a }[/math] and the sum of all [math]\displaystyle{ b_i }[/math]s by [math]\displaystyle{ b }[/math]. The log sum inequality states that
- [math]\displaystyle{ \sum_{i=1}^n a_i\log\frac{a_i}{b_i}\geq a\log\frac{a}{b}, }[/math]
with equality if and only if the ratios [math]\displaystyle{ \frac{a_i}{b_i} }[/math] are equal for all [math]\displaystyle{ i }[/math]; in other words, [math]\displaystyle{ a_i = c b_i }[/math] for some constant [math]\displaystyle{ c }[/math] and all [math]\displaystyle{ i }[/math].[1]
(Take [math]\displaystyle{ a_i\log \frac{a_i}{b_i} }[/math] to be [math]\displaystyle{ 0 }[/math] if [math]\displaystyle{ a_i=0 }[/math] and [math]\displaystyle{ \infty }[/math] if [math]\displaystyle{ a_i\gt 0, b_i=0 }[/math]. These are the limiting values obtained as the relevant number tends to [math]\displaystyle{ 0 }[/math].)[1]
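The statement and its limiting conventions can be checked numerically. The following sketch (the helper names `log_sum_lhs` and `log_sum_rhs` are illustrative, not from the cited sources) evaluates both sides of the inequality, including the [math]\displaystyle{ 0 }[/math] and [math]\displaystyle{ \infty }[/math] conventions above:

```python
import math

def log_sum_lhs(a, b):
    """Left-hand side: sum_i a_i * log(a_i / b_i)."""
    total = 0.0
    for ai, bi in zip(a, b):
        if ai == 0:
            continue              # a_i log(a_i/b_i) taken as 0 when a_i = 0
        if bi == 0:
            return math.inf       # taken as +infinity when a_i > 0 and b_i = 0
        total += ai * math.log(ai / bi)
    return total

def log_sum_rhs(a, b):
    """Right-hand side: a * log(a / b) with a = sum(a_i), b = sum(b_i)."""
    return sum(a) * math.log(sum(a) / sum(b))

a = [1.0, 2.0, 3.0]
b = [2.0, 1.0, 4.0]
assert log_sum_lhs(a, b) >= log_sum_rhs(a, b)

# Equality case: a_i = c * b_i for all i
c = 0.7
a_prop = [c * bi for bi in b]
assert math.isclose(log_sum_lhs(a_prop, b), log_sum_rhs(a_prop, b))
```

Any logarithm base gives the same inequality, since changing base rescales both sides by the same positive factor.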
Proof
Notice that after setting [math]\displaystyle{ f(x)=x\log x }[/math] we have
- [math]\displaystyle{ \begin{align} \sum_{i=1}^n a_i\log\frac{a_i}{b_i} & {} = \sum_{i=1}^n b_i f\left(\frac{a_i}{b_i}\right) = b\sum_{i=1}^n \frac{b_i}{b} f\left(\frac{a_i}{b_i}\right) \\ & {} \geq b f\left(\sum_{i=1}^n \frac{b_i}{b}\frac{a_i}{b_i}\right) = b f\left(\frac{1}{b}\sum_{i=1}^n a_i\right) = b f\left(\frac{a}{b}\right) \\ & {} = a\log\frac{a}{b}, \end{align} }[/math]
where the inequality follows from Jensen's inequality since [math]\displaystyle{ \frac{b_i}{b}\geq 0 }[/math], [math]\displaystyle{ \sum_{i=1}^n\frac{b_i}{b}= 1 }[/math], and [math]\displaystyle{ f }[/math] is convex.[1]
Generalizations
The inequality remains valid for [math]\displaystyle{ n=\infty }[/math] provided that [math]\displaystyle{ a\lt \infty }[/math] and [math]\displaystyle{ b\lt \infty }[/math].[citation needed] The proof above holds for any function [math]\displaystyle{ g }[/math] such that [math]\displaystyle{ f(x)=xg(x) }[/math] is convex, such as all continuous non-decreasing functions. Generalizations to non-decreasing functions other than the logarithm are given in Csiszár (2004).
Another generalization is due to Dannan, Neff and Thiel, who showed that if [math]\displaystyle{ a_1, a_2, \ldots, a_n }[/math] and [math]\displaystyle{ b_1, b_2, \ldots, b_n }[/math] are positive real numbers with [math]\displaystyle{ a_1 + a_2 + \cdots + a_n = a }[/math] and [math]\displaystyle{ b_1 + b_2 + \cdots + b_n = b }[/math], and [math]\displaystyle{ k \geq 0 }[/math], then [math]\displaystyle{ \sum_{i=1}^n a_i \log\left( \frac{a_i}{b_i} +k \right) \geq a\log \left(\frac{a}{b}+k\right) }[/math].[2]
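The Dannan–Neff–Thiel generalization can be spot-checked numerically as well; this is a sketch with illustrative sample values and hypothetical helper names, not code from the cited paper:

```python
import math

def gen_lhs(a, b, k):
    """sum_i a_i * log(a_i/b_i + k) for positive a_i, b_i and k >= 0."""
    return sum(ai * math.log(ai / bi + k) for ai, bi in zip(a, b))

def gen_rhs(a, b, k):
    """a * log(a/b + k) with a = sum(a_i), b = sum(b_i)."""
    return sum(a) * math.log(sum(a) / sum(b) + k)

a = [0.5, 1.5, 2.0]
b = [1.0, 0.5, 2.5]
for k in [0.0, 0.5, 3.0]:
    assert gen_lhs(a, b, k) >= gen_rhs(a, b, k)
```

Setting [math]\displaystyle{ k=0 }[/math] recovers the ordinary log sum inequality for positive values.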
Applications
The log sum inequality can be used to prove inequalities in information theory. Gibbs' inequality states that the Kullback-Leibler divergence is non-negative, and equal to zero precisely if its arguments are equal.[3] One proof uses the log sum inequality.
Proof.[1] Let [math]\displaystyle{ P=(p_i)_{i\in\mathbb{N}} }[/math] and [math]\displaystyle{ Q=(q_i)_{i\in\mathbb{N}} }[/math] be probability mass functions. In the log sum inequality, substitute [math]\displaystyle{ n=\infty }[/math], [math]\displaystyle{ a_i=p_i }[/math] and [math]\displaystyle{ b_i=q_i }[/math] to get
- [math]\displaystyle{ \mathbb{D}_{\mathrm{KL}}(P\|Q) \equiv \sum_{i} p_i \log_2 \frac{p_i}{q_i} \geq 1\log\frac{1}{1} = 0, }[/math]
with equality if and only if [math]\displaystyle{ p_i=q_i }[/math] for all [math]\displaystyle{ i }[/math] (as both [math]\displaystyle{ P }[/math] and [math]\displaystyle{ Q }[/math] sum to 1).
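The non-negativity of the Kullback-Leibler divergence is easy to confirm on sample distributions; the sketch below (the function name `kl_divergence` is illustrative) measures the divergence in bits, matching the base-2 logarithm used above:

```python
import math

def kl_divergence(p, q):
    """D_KL(P||Q) in bits, with the 0*log(0/q) = 0 convention."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.25]
q = [0.25, 0.25, 0.5]

assert kl_divergence(p, q) >= 0          # Gibbs' inequality
assert math.isclose(kl_divergence(p, p), 0.0, abs_tol=1e-12)  # equality iff P = Q
```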
The inequality can also be used to prove convexity of the Kullback-Leibler divergence.[4]
Notes
- ↑ 1.0 1.1 1.2 1.3 Cover & Thomas (1991), p. 29.
- ↑ F. M. Dannan, P. Neff, C. Thiel (2016). "On the sum of squared logarithms inequality and related inequalities". Journal of Mathematical Inequalities 10 (1): 1–17. doi:10.7153/jmi-10-01. http://files.ele-math.com/articles/jmi-10-01.pdf. Retrieved 12 January 2023.
- ↑ MacKay (2003), p. 34.
- ↑ Cover & Thomas (1991), p. 30.
References
- Cover, Thomas M.; Thomas, Joy A. (1991). Elements of Information Theory. Hoboken, New Jersey: Wiley. ISBN 978-0-471-24195-9.
- Csiszár, Imre; Shields, Paul C. (2004). "Information Theory and Statistics: A Tutorial". Foundations and Trends in Communications and Information Theory 1 (4): 417–528. doi:10.1561/0100000004. http://www.renyi.hu/~csiszar/Publications/Information_Theory_and_Statistics:_A_Tutorial.pdf. Retrieved 2009-06-14.
- T.S. Han, K. Kobayashi, Mathematics of information and coding. American Mathematical Society, 2001. ISBN 0-8218-0534-7.
- Information Theory course materials, Utah State University. Retrieved on 2009-06-14.
- MacKay, David J.C. (2003). Information Theory, Inference, and Learning Algorithms. Cambridge University Press. ISBN 0-521-64298-1. http://www.inference.phy.cam.ac.uk/mackay/itila/book.html.
Original source: https://en.wikipedia.org/wiki/Log sum inequality.