# Holevo's theorem

Short description: Upper bound on the knowable information of a quantum state

Holevo's theorem is an important limitative theorem in quantum computing, an interdisciplinary field of physics and computer science. It is sometimes called Holevo's bound, since it establishes an upper bound to the amount of information that can be known about a quantum state (accessible information). It was published by Alexander Holevo in 1973.

## Accessible information

As with several concepts in quantum information theory, accessible information is best understood in terms of two-party communication. Consider two parties, Alice and Bob. Alice has a classical random variable X, which takes the values $\displaystyle{ \{1, 2, \dots, n\} }$ with corresponding probabilities $\displaystyle{ \{p_1, p_2, \dots, p_n\} }$. Alice then prepares a quantum state, represented by the density matrix $\displaystyle{ \rho_X }$ chosen from a set $\displaystyle{ \{\rho_1, \rho_2, \dots, \rho_n\} }$, and gives this state to Bob. Bob's goal is to find the value of X. To do so, he performs a measurement on the state $\displaystyle{ \rho_X }$ and obtains a classical outcome, denoted Y. In this context, the accessible information, that is, the amount of information that Bob can obtain about the variable X, is the maximum of the mutual information I(X : Y) between the random variables X and Y, taken over all measurements that Bob can perform.

No general formula for computing the accessible information is currently known. There are, however, several upper bounds, the best known of which is the Holevo bound, stated in the following theorem.
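The mutual information for one fixed measurement can be evaluated directly from the joint distribution $\displaystyle{ p(x,y) = p_x \operatorname{tr}(E_y \rho_x) }$. The following is a minimal sketch using NumPy; the ensemble (the states |0⟩ and |+⟩ with equal probability), the computational-basis measurement, and the helper names `shannon_entropy` and `mutual_information` are assumptions chosen for illustration and are not part of the theorem.

```python
import numpy as np

def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    p = np.asarray(probs, dtype=float).ravel()
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

def mutual_information(priors, states, povm):
    """I(X:Y) in bits for the joint distribution p(x, y) = p_x * tr(E_y rho_x)."""
    joint = np.array([[p * np.trace(E @ rho).real for E in povm]
                      for p, rho in zip(priors, states)])
    h_x = shannon_entropy(joint.sum(axis=1))  # marginal over y gives p(x)
    h_y = shannon_entropy(joint.sum(axis=0))  # marginal over x gives p(y)
    return h_x + h_y - shannon_entropy(joint)

# Illustrative ensemble (an assumption): |0> and |+> with equal probability.
ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)
priors = [0.5, 0.5]
states = [np.outer(ket0, ket0), np.outer(ket_plus, ket_plus)]

# Bob measures in the computational basis (one particular choice of POVM).
povm = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

info = mutual_information(priors, states, povm)
print(f"I(X:Y) = {info:.4f} bits")
```

For this ensemble and measurement the value is about 0.311 bits. Because the accessible information is a maximum over all measurements, a single measurement like this one only gives a lower bound on it.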

## Statement of the theorem

Let $\displaystyle{ \{\rho_1, \rho_2, \dots, \rho_n\} }$ be a set of mixed states and let $\displaystyle{ \rho_X }$ be one of these states, drawn according to the probability distribution $\displaystyle{ P = \{p_1, p_2, \dots, p_n\} }$.

Then, for any measurement described by POVM elements $\displaystyle{ \{E_Y\} }$ and performed on the state Bob receives, the mutual information between X and the measurement outcome Y, and therefore also the accessible information about X, is bounded from above as follows:

$\displaystyle{ I(X:Y) \leq S(\rho) - \sum_i p_i S(\rho_i) }$

where $\displaystyle{ \rho = \sum_i p_i \rho_i }$ and $\displaystyle{ S(\cdot) }$ is the von Neumann entropy.

The quantity on the right-hand side of this inequality is called the Holevo information or Holevo χ quantity:

$\displaystyle{ \chi := S(\rho) - \sum_i p_i S(\rho_i) }$.
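As a numerical illustration of the statement, the sketch below computes χ for a small ensemble and compares it with the mutual information obtained from a family of measurements. It assumes NumPy, entropies in bits, an illustrative ensemble (|0⟩ and |+⟩ with equal probability), and a scan restricted to real projective qubit measurements, which is only a subset of all POVMs; the function names are hypothetical.

```python
import numpy as np

def entropy_bits(probs):
    """Entropy in bits of a list of non-negative weights, skipping zeros."""
    p = np.asarray(probs, dtype=float).ravel()
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

def von_neumann_entropy(rho):
    """S(rho) in bits, computed from the eigenvalues of a density matrix."""
    return entropy_bits(np.linalg.eigvalsh(rho))

def holevo_chi(priors, states):
    """chi = S(sum_i p_i rho_i) - sum_i p_i S(rho_i)."""
    avg = sum(p * rho for p, rho in zip(priors, states))
    return von_neumann_entropy(avg) - sum(
        p * von_neumann_entropy(rho) for p, rho in zip(priors, states))

def mutual_information(priors, states, povm):
    """I(X:Y) in bits for p(x, y) = p_x * tr(E_y rho_x)."""
    joint = np.array([[p * np.trace(E @ rho).real for E in povm]
                      for p, rho in zip(priors, states)])
    return (entropy_bits(joint.sum(axis=1)) + entropy_bits(joint.sum(axis=0))
            - entropy_bits(joint))

# Illustrative ensemble (an assumption): |0> and |+> with equal probability.
ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)
priors = [0.5, 0.5]
states = [np.outer(ket0, ket0), np.outer(ket_plus, ket_plus)]

chi = holevo_chi(priors, states)

# Scan projective measurements {|phi><phi|, I - |phi><phi|} with
# |phi> = (cos t, sin t); this covers only part of the set of all POVMs.
best_info = 0.0
for t in np.linspace(0.0, np.pi, 361):
    phi = np.array([np.cos(t), np.sin(t)])
    e0 = np.outer(phi, phi)
    povm = [e0, np.eye(2) - e0]
    best_info = max(best_info, mutual_information(priors, states, povm))

print(f"chi = {chi:.4f} bits, best I(X:Y) found = {best_info:.4f} bits")
```

In this example χ is about 0.601 bits, and none of the scanned measurements exceeds it, as the theorem requires. The scan is not a proof and does not establish the accessible information in general; it only shows the inequality at work on one ensemble.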

## Proof

Consider the composite system that describes the entire communication process, which involves Alice's classical input $\displaystyle{ X }$, the quantum system $\displaystyle{ Q }$, and Bob's classical output $\displaystyle{ Y }$. The classical input $\displaystyle{ X }$ can be written as a classical register $\displaystyle{ \rho^X := \sum\nolimits_{x=1}^n p_x |x\rangle \langle x| }$ with respect to some orthonormal basis $\displaystyle{ \{|x\rangle\}_{x=1}^n }$. By writing $\displaystyle{ X }$ in this manner, the von Neumann entropy $\displaystyle{ S(X) }$ of the state $\displaystyle{ \rho^X }$ corresponds to the Shannon entropy $\displaystyle{ H(X) }$ of the probability distribution $\displaystyle{ \{p_x\}_{x=1}^n }$:

$\displaystyle{ S(X) = -\operatorname{tr}\left(\rho^X \log \rho^X \right) = -\operatorname{tr}\left(\sum_{x=1}^n p_x \log p_x |x\rangle\langle x|\right) = -\sum_{x=1}^n p_x \log p_x = H(X). }$

The initial state of the system, where Alice prepares the state $\displaystyle{ \rho_x }$ with probability $\displaystyle{ p_x }$, is described by

$\displaystyle{ \rho^{XQ} := \sum_{x=1}^n p_x |x\rangle \langle x|\otimes\rho_x. }$

Afterwards, Alice sends the quantum state to Bob. As Bob only has access to the quantum system $\displaystyle{ Q }$ but not the input $\displaystyle{ X }$, he receives a mixed state of the form $\displaystyle{ \rho := \operatorname{tr}_X\left(\rho^{XQ}\right) = \sum\nolimits_{x=1}^n p_x \rho_x }$. Bob measures this state with respect to the POVM elements $\displaystyle{ \{E_y\}_{y=1}^m }$, and the probabilities $\displaystyle{ \{q_y\}_{y=1}^m }$ of measuring the outcomes $\displaystyle{ y=1,2,\dots,m }$ form the classical output $\displaystyle{ Y }$. This measurement process can be described as a quantum instrument

$\displaystyle{ \mathcal{E}^{Q}(\rho_x) = \sum_{y=1}^m q_{y|x} \rho_{y|x} \otimes |y\rangle \langle y|, }$

where $\displaystyle{ q_{y|x} = \operatorname{tr}\left(E_y\rho_x\right) }$ is the probability of outcome $\displaystyle{ y }$ given the state $\displaystyle{ \rho_x }$, while $\displaystyle{ \rho_{y|x} = W\sqrt{E_y}\rho_x\sqrt{E_y}W^\dagger/q_{y|x} }$ for some unitary $\displaystyle{ W }$ is the normalised post-measurement state. Then, the state of the entire system after the measurement process is

$\displaystyle{ \rho^{XQ'Y} := \left[\mathcal{I}^{X}\otimes\mathcal{E}^{Q}\right]\!\left(\rho^{XQ}\right) = \sum_{x=1}^n\sum_{y=1}^m p_x q_{y|x} |x\rangle \langle x|\otimes\rho_{y|x}\otimes |y\rangle \langle y|. }$

Here $\displaystyle{ \mathcal{I}^X }$ is the identity channel on the system $\displaystyle{ X }$. Since $\displaystyle{ \mathcal{E}^Q }$ is a quantum channel, and the quantum mutual information is monotonic under completely positive trace-preserving maps, $\displaystyle{ S(X:Q'Y) \leq S(X:Q) }$. Additionally, as the partial trace over $\displaystyle{ Q' }$ is also completely positive and trace-preserving, $\displaystyle{ S(X:Y) \leq S(X:Q'Y) }$. These two inequalities give

$\displaystyle{ S(X:Y) \leq S(X:Q). }$

On the left-hand side, the quantities of interest depend only on

$\displaystyle{ \rho^{XY} := \operatorname{tr}_{Q'}\left(\rho^{XQ'Y}\right) = \sum_{x=1}^n\sum_{y=1}^m p_x q_{y|x} |x\rangle \langle x|\otimes |y\rangle \langle y| = \sum_{x=1}^n\sum_{y=1}^m p_{x,y} |x,y\rangle \langle x,y|, }$

with joint probabilities $\displaystyle{ p_{x,y}=p_x q_{y|x} }$. Clearly, $\displaystyle{ \rho^{XY} }$ and $\displaystyle{ \rho^Y := \operatorname{tr}_X(\rho^{XY}) }$, which are in the same form as $\displaystyle{ \rho^X }$, describe classical registers. Hence,

$\displaystyle{ S(X:Y) = S(X)+S(Y)-S(XY) = H(X)+H(Y)-H(XY) = I(X:Y). }$

Meanwhile, $\displaystyle{ S(X:Q) }$ depends on the term

$\displaystyle{ \log \rho^{XQ} = \log\left(\sum_{x=1}^n p_x |x\rangle \langle x|\otimes\rho_x\right) = \sum_{x=1}^n |x\rangle \langle x| \otimes \log\left(p_x\rho_x\right) = \sum_{x=1}^n \log p_x |x\rangle \langle x| \otimes I^Q + \sum_{x=1}^n |x\rangle \langle x| \otimes \log\rho_x, }$

where $\displaystyle{ I^Q }$ is the identity operator on the quantum system $\displaystyle{ Q }$. Then, the right-hand side is

$\displaystyle{ \begin{aligned} S(X:Q) &= S(X)+S(Q)-S(XQ) \\ &= S(X) + S(\rho) + \operatorname{tr}\left(\rho^{XQ}\log\rho^{XQ}\right) \\ &= S(X) + S(\rho) + \operatorname{tr}\left(\sum_{x=1}^n p_x\log p_x |x\rangle \langle x| \otimes \rho_x\right) + \operatorname{tr}\left(\sum_{x=1}^n p_x|x\rangle \langle x| \otimes \rho_x\log\rho_x\right)\\ &= S(X) + S(\rho) + \underbrace{\operatorname{tr}\left(\sum_{x=1}^n p_x\log p_x |x\rangle \langle x|\right)}_{-S(X)} + \operatorname{tr}\left(\sum_{x=1}^n p_x \rho_x\log\rho_x\right)\\ &= S(\rho) + \sum_{x=1}^n p_x \underbrace{\operatorname{tr}\left(\rho_x\log\rho_x\right)}_{-S(\rho_x)} \\ &= S(\rho) - \sum_{x=1}^n p_x S(\rho_x), \end{aligned} }$

which completes the proof.
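The last step of the proof, the identity between the quantum mutual information of the classical-quantum state and the Holevo quantity, can be checked numerically. The sketch below builds $\displaystyle{ \rho^{XQ} }$ with Kronecker products for an illustrative ensemble of two mixed qubit states; the ensemble, the use of NumPy, and the variable names are assumptions made for the example.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) in bits, computed from the eigenvalues of a density matrix."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # treat 0 log 0 as 0
    return float(-np.sum(evals * np.log2(evals)))

# Illustrative ensemble (an assumption): two mixed qubit states.
priors = np.array([0.3, 0.7])
states = [np.diag([0.9, 0.1]),
          np.array([[0.5, 0.3], [0.3, 0.5]])]

n = len(priors)            # dimension of the classical register X
d = states[0].shape[0]     # dimension of the quantum system Q

# rho^{XQ} = sum_x p_x |x><x| (tensor) rho_x
rho_xq = np.zeros((n * d, n * d))
for x, (p, rho) in enumerate(zip(priors, states)):
    proj = np.zeros((n, n))
    proj[x, x] = 1.0
    rho_xq += p * np.kron(proj, rho)

rho_x = np.diag(priors)                                  # tr_Q(rho^{XQ})
rho_q = sum(p * rho for p, rho in zip(priors, states))   # tr_X(rho^{XQ})

# S(X:Q) = S(X) + S(Q) - S(XQ)
s_x_q = (von_neumann_entropy(rho_x) + von_neumann_entropy(rho_q)
         - von_neumann_entropy(rho_xq))

# chi = S(rho) - sum_x p_x S(rho_x)
chi = von_neumann_entropy(rho_q) - sum(
    p * von_neumann_entropy(rho) for p, rho in zip(priors, states))

print(f"S(X:Q) = {s_x_q:.6f}, chi = {chi:.6f}")
```

The two printed values agree up to floating-point error, which matches the chain of equalities above for this particular ensemble.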

In essence, the Holevo bound implies that although n qubits can "carry" a larger amount of (classical) information (thanks to quantum superposition), the amount of classical information that can be retrieved, i.e. accessed, is at most n classical (non-quantum encoded) bits. It has furthermore been established, both theoretically and experimentally, that there are computations in which quantum bits carry more information during the course of the computation than is possible classically. The bound is surprising for two reasons: (1) quantum computing is so often more powerful than classical computing that results showing it to be only as good as, or inferior to, conventional techniques are unusual, and (2) it takes $\displaystyle{ 2^n }$ complex numbers to describe a general state of the n qubits, which nevertheless yield a mere n bits.
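The n-bit limit follows from the bound in one line. Assuming that the logarithm in $\displaystyle{ S(\cdot) }$ is taken to base 2 and that each $\displaystyle{ \rho_i }$ is a state of n qubits, so that $\displaystyle{ \rho }$ acts on a space of dimension $\displaystyle{ 2^n }$, the non-negativity of the von Neumann entropy and its maximum value on that space give

$\displaystyle{ I(X:Y) \leq \chi = S(\rho) - \sum_i p_i S(\rho_i) \leq S(\rho) \leq \log_2 2^n = n. }$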