# CBC-MAC


In cryptography, a cipher block chaining message authentication code (CBC-MAC) is a technique for constructing a message authentication code from a block cipher. The message is encrypted with some block cipher algorithm in CBC mode to create a chain of blocks such that each block depends on the proper encryption of the previous block. This interdependence ensures that a change to any of the plaintext bits will cause the final encrypted block to change in a way that cannot be predicted or counteracted without knowing the key to the block cipher.

To calculate the CBC-MAC of a message m, one encrypts m in CBC mode with a zero initialization vector and keeps only the last ciphertext block. For a message comprising blocks $\displaystyle{ m_1\|m_2\|\cdots\|m_x }$, a secret key k and a block cipher E, the computation is $\displaystyle{ c_0 = 0 }$, $\displaystyle{ c_i = E_k(m_i \oplus c_{i-1}) }$ for $\displaystyle{ i = 1, \dots, x }$, and the tag is $\displaystyle{ c_x }$.
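The computation described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: `E` is a hypothetical toy Feistel permutation built from HMAC-SHA-256 that merely stands in for a real block cipher such as AES, and the names `BLOCK`, `xor`, and `cbc_mac` are assumptions introduced for this example. Messages are assumed to be already padded to a whole number of blocks.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes (assumed for this sketch)

def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def E(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation; a stand-in for a block cipher, NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def cbc_mac(key: bytes, msg: bytes) -> bytes:
    """CBC-MAC with a zero IV: chain every block and keep only the last output."""
    if not msg or len(msg) % BLOCK:
        raise ValueError("message must be a non-empty multiple of the block size")
    state = bytes(BLOCK)  # zero initialization vector
    for i in range(0, len(msg), BLOCK):
        state = E(key, xor(state, msg[i:i + BLOCK]))
    return state

key = b"K" * BLOCK
message = b"A" * BLOCK + b"B" * BLOCK
tag = cbc_mac(key, message)
print(tag.hex())
```

Only the final chaining value is released as the tag; the intermediate values stay internal to the computation.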

## Security with fixed and variable-length messages

If the block cipher used is secure (meaning that it is a pseudorandom permutation), then CBC-MAC is secure for fixed-length messages. By itself, however, it is not secure for variable-length messages, so any single key must only be used for messages of one fixed and known length. The reason is that an attacker who knows the correct message-tag (i.e. CBC-MAC) pairs for two messages $\displaystyle{ (m, t) }$ and $\displaystyle{ (m', t') }$ can generate a third message $\displaystyle{ m'' }$ whose CBC-MAC will also be $\displaystyle{ t' }$. This is done by XORing the first block of $\displaystyle{ m' }$ with t and then concatenating m with this modified $\displaystyle{ m' }$; i.e., by making $\displaystyle{ m'' = m \| [(m_1' \oplus t) \| m_2' \| \dots \| m_x'] }$.

When computing the MAC for $\displaystyle{ m'' }$, the blocks of m are processed in the usual manner and yield the chaining value t. At the next stage this value is XORed with the block $\displaystyle{ m_1' \oplus t }$ before encryption. The two copies of t cancel, so the blocks of the first message m make no contribution to the rest of the computation: $\displaystyle{ E_{K_\text{MAC}}(m_1' \oplus t \oplus t) = E_{K_\text{MAC}}(m_1') }$, and thus the tag for $\displaystyle{ m'' }$ is $\displaystyle{ t' }$.
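The forgery described above can be reproduced with a short Python sketch. As an assumption for illustration, `E` is a hypothetical toy Feistel permutation (built from HMAC-SHA-256) standing in for a real block cipher, and all names are invented for this example. The attacker's step uses only the two known message-tag pairs, never the key.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes (assumed for this sketch)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def E(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation; a stand-in for a block cipher, NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def cbc_mac(key: bytes, msg: bytes) -> bytes:
    """Plain CBC-MAC with a zero IV over a block-aligned message."""
    state = bytes(BLOCK)
    for i in range(0, len(msg), BLOCK):
        state = E(key, xor(state, msg[i:i + BLOCK]))
    return state

# Two legitimately tagged messages (m, t) and (m', t').
key = b"K" * BLOCK
m = b"A" * BLOCK + b"B" * BLOCK
m_prime = b"C" * BLOCK + b"D" * BLOCK
t = cbc_mac(key, m)
t_prime = cbc_mac(key, m_prime)

# Attacker (no key needed): m'' = m || (m'_1 XOR t) || m'_2 || ...
forged = m + xor(m_prime[:BLOCK], t) + m_prime[BLOCK:]

# The verifier accepts the forged message with the old tag t'.
print(cbc_mac(key, forged) == t_prime)
```

The forged message is twice as long as either original, which is why restricting a key to a single fixed message length blocks this particular attack.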

This problem cannot be solved by adding a message-size block to the end. There are three main ways of modifying CBC-MAC so that it is secure for variable-length messages: 1) input-length key separation; 2) length prepending; 3) encrypt last block. Alternatively, a different construction, for example CMAC or HMAC, can be used to protect the integrity of variable-length messages.

### Length prepending

One solution is to include the length of the message in the first block. In fact, CBC-MAC has been proven secure as long as no two messages that are prefixes of each other are ever used, and prepending the length is a special case of this. This approach can be problematic if the message length is not known when processing begins.
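A minimal sketch of length prepending follows, assuming the length is encoded as a single big-endian block placed before the message (the encoding is an illustrative choice, not taken from a standard). `E` is again a hypothetical toy permutation standing in for a real block cipher, and `cbc_mac_len` is a name invented for this example.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes (assumed for this sketch)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def E(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation; a stand-in for a block cipher, NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def cbc_mac(key: bytes, msg: bytes) -> bytes:
    state = bytes(BLOCK)
    for i in range(0, len(msg), BLOCK):
        state = E(key, xor(state, msg[i:i + BLOCK]))
    return state

def cbc_mac_len(key: bytes, msg: bytes) -> bytes:
    """CBC-MAC over (length block || message); the length must be known up front."""
    length_block = len(msg).to_bytes(BLOCK, "big")
    return cbc_mac(key, length_block + msg)

key = b"K" * BLOCK
m = b"A" * BLOCK + b"B" * BLOCK
m_prime = b"C" * BLOCK + b"D" * BLOCK
t = cbc_mac_len(key, m)
t_prime = cbc_mac_len(key, m_prime)

# The concatenation forgery that breaks plain CBC-MAC no longer verifies,
# because the forged message starts with a different length block.
forged = m + xor(m_prime[:BLOCK], t) + m_prime[BLOCK:]
forgery_accepted = cbc_mac_len(key, forged) == t_prime
print(forgery_accepted)
```

Because the length block is processed first, the whole message length has to be available before the first cipher call, which is the practical drawback mentioned above.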

### Encrypt-last-block

Encrypt-last-block CBC-MAC (ECBC-MAC) is defined as $\displaystyle{ \text{CBC-MAC-ELB}(m, (k_1, k_2)) = E(k_2, \text{CBC-MAC}(k_1, m)) }$. Compared to the other discussed methods of extending CBC-MAC to variable-length messages, encrypt-last-block has the advantage of not needing to know the length of the message until the end of the computation.
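The definition above lends itself to a streaming interface, since the second key is applied only once the last block has been seen. The sketch below is illustrative only: `E` is a hypothetical toy permutation standing in for a real block cipher, the class name `EcbcMac` and its methods are invented for this example, and input blocks are assumed to be exactly one block long (padding is out of scope).

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes (assumed for this sketch)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def E(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation; a stand-in for a block cipher, NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

class EcbcMac:
    """Streaming ECBC-MAC: CBC-MAC under k1, then one extra encryption under k2."""

    def __init__(self, k1: bytes, k2: bytes) -> None:
        self.k1, self.k2 = k1, k2
        self.state = bytes(BLOCK)  # zero initialization vector

    def update(self, block: bytes) -> None:
        if len(block) != BLOCK:
            raise ValueError("expected exactly one block")
        self.state = E(self.k1, xor(self.state, block))

    def finalize(self) -> bytes:
        # The total length is never needed; only the final state is re-encrypted.
        return E(self.k2, self.state)

def ecbc_mac(k1: bytes, k2: bytes, msg: bytes) -> bytes:
    """One-shot helper built on the streaming class."""
    mac = EcbcMac(k1, k2)
    for i in range(0, len(msg), BLOCK):
        mac.update(msg[i:i + BLOCK])
    return mac.finalize()

k1, k2 = b"1" * BLOCK, b"2" * BLOCK
stream = EcbcMac(k1, k2)
stream.update(b"A" * BLOCK)   # blocks can arrive one at a time
stream.update(b"B" * BLOCK)
tag = stream.finalize()
print(tag == ecbc_mac(k1, k2, b"A" * BLOCK + b"B" * BLOCK))
```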

## Attack methods

As with many cryptographic schemes, naïve use of ciphers and other protocols can enable attacks that reduce the effectiveness of the cryptographic protection (or even render it useless). The following attacks are possible when CBC-MAC is used incorrectly.

### Using the same key for encryption and authentication

One common mistake is to reuse the same key K for CBC encryption and CBC-MAC. Although reusing a key for different purposes is bad practice in general, in this particular case the mistake leads to a spectacular attack:

Suppose Alice has sent to Bob the cipher text blocks $\displaystyle{ C = C_1 \| C_2 \| \dots \| C_n }$ together with the CBC-MAC tag t of the corresponding plain text. We assume, for the purposes of this example and without loss of generality, that the initialization vector used for the encryption process is a vector of zeroes. Because the same key and the same zero initialization vector are used for encryption and for the MAC, the tag t is equal to the final cipher text block $\displaystyle{ C_n }$. During the transmission process, Eve can tamper with any of the $\displaystyle{ C_1 , \dots , C_{n-1} }$ cipher-text blocks and adjust any of the bits therein as she chooses, provided that the final block, $\displaystyle{ C_n }$, remains the same.

Alice's original message consists of the cipher text blocks $\displaystyle{ C=C_1 \| C_2 \| \cdots \| C_n }$. The tampered message, delivered to Bob in place of Alice's original, is $\displaystyle{ C'=C_1' \| \dots \| C_{n-1}' \| C_n }$.

Bob first decrypts the message received using the shared secret key K to obtain corresponding plain text. Note that all plain text produced will be different from that which Alice originally sent, because Eve has modified all but the last cipher text block. In particular, the final plain text, $\displaystyle{ P_n' }$, differs from the original, $\displaystyle{ P_n }$, which Alice sent; although $\displaystyle{ C_n }$ is the same, $\displaystyle{ C_{n-1}' \not = C_{n-1} }$, so a different plain text $\displaystyle{ P_n' }$ is produced when chaining the previous cipher text block into the exclusive-OR after decryption of $\displaystyle{ C_n }$: $\displaystyle{ P_n' = C_{n-1}' \oplus E_K^{-1}(C_n) }$.

It follows that Bob will now compute the authentication tag using CBC-MAC over all the values of plain text which he decoded. The tag for the new message, $\displaystyle{ t' }$, is given by:

$\displaystyle{ t' = E_K(P_n' \oplus E_K(P_{n-1}' \oplus E_K( \dots \oplus E_K(P_1')))) }$

Notice that this expression is equal to

$\displaystyle{ t' = E_K(P_n' \oplus C_{n-1}') }$

which is exactly $\displaystyle{ C_n }$:

$\displaystyle{ t' = E_K(C_{n-1}' \oplus E_K^{-1}(C_n) \oplus C_{n-1}') = E_K(E_K^{-1}(C_n)) = C_n }$

and it follows that $\displaystyle{ t' = C_n = t }$.

Therefore, Eve was able to modify the cipher text in transit (without necessarily knowing what plain text it corresponds to) such that an entirely different message, $\displaystyle{ P' }$, was produced, but the tag for this message matched the tag of the original, and Bob was unaware that the contents had been modified in transit. By definition, a message authentication code is broken if we can find a different message (a sequence of plain-text blocks $\displaystyle{ P' }$) which produces the same tag as the previous message, P, with $\displaystyle{ P \not = P' }$. It follows that the message authentication protocol, in this usage scenario, has been broken, and Bob has been deceived into believing Alice sent him a message which she did not produce.

If, instead, we use different keys for the encryption and authentication stages, say $\displaystyle{ K_1 }$ and $\displaystyle{ K_2 }$, respectively, this attack is foiled. The decryption of the modified cipher-text blocks $\displaystyle{ C_i' }$ obtains some plain text string $\displaystyle{ P_i' }$. However, due to the MAC's usage of a different key $\displaystyle{ K_2 }$, we cannot "undo" the decryption process in the forward step of the computation of the message authentication code so as to produce the same tag; each modified $\displaystyle{ P_i' }$ will now be encrypted by $\displaystyle{ K_2 }$ in the CBC-MAC process to some value $\displaystyle{ \mathrm{MAC}_i \not = C_i' }$.

This example also shows that a CBC-MAC cannot be used as a collision-resistant one-way function: given a key it is trivial to create a different message which "hashes" to the same tag.
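The same-key attack described in this section can be traced in a short Python sketch. It is an illustration under stated assumptions: `E` and its inverse `D` are a hypothetical toy Feistel permutation standing in for a real block cipher, the IV is the zero block for both encryption and MAC, and every function name is invented for this example.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes (assumed for this sketch)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    return hmac.new(key, bytes([rnd]) + half, hashlib.sha256).digest()[:8]

def E(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation; a stand-in for a block cipher, NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, xor(left, _f(key, rnd, right))
    return left + right

def D(key: bytes, block: bytes) -> bytes:
    """Inverse of E: undo the Feistel rounds in reverse order."""
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _f(key, rnd, left)), left
    return left + right

def cbc_encrypt(key: bytes, pt: bytes) -> bytes:
    prev, out = bytes(BLOCK), []
    for i in range(0, len(pt), BLOCK):
        prev = E(key, xor(prev, pt[i:i + BLOCK]))
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key: bytes, ct: bytes) -> bytes:
    prev, out = bytes(BLOCK), []
    for i in range(0, len(ct), BLOCK):
        block = ct[i:i + BLOCK]
        out.append(xor(prev, D(key, block)))
        prev = block
    return b"".join(out)

def cbc_mac(key: bytes, msg: bytes) -> bytes:
    state = bytes(BLOCK)
    for i in range(0, len(msg), BLOCK):
        state = E(key, xor(state, msg[i:i + BLOCK]))
    return state

# Alice: the SAME key is used for encryption and for the MAC.
key = b"K" * BLOCK
plaintext = b"A" * BLOCK + b"B" * BLOCK + b"C" * BLOCK
ciphertext = cbc_encrypt(key, plaintext)
tag = cbc_mac(key, plaintext)

# Eve: flip every bit of C_1 and leave the final block C_n untouched.
tampered = xor(ciphertext[:BLOCK], b"\xff" * BLOCK) + ciphertext[BLOCK:]

# Bob: decrypt, then recompute the tag over what he decrypted.
received = cbc_decrypt(key, tampered)
bob_tag = cbc_mac(key, received)
print(received != plaintext, bob_tag == tag)
```

Bob's recomputed tag equals the last ciphertext block, which Eve never touched, so the check passes even though the decrypted plain text has changed.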

### Allowing the initialization vector to vary in value

When encrypting data using a block cipher in cipher block chaining (or another) mode, it is common to introduce an initialization vector to the first stage of the encryption process. It is typically required that this vector be chosen randomly (a nonce) and that it is not repeated for any given secret key under which the block cipher operates. This provides semantic security by ensuring that the same plain text is not encrypted to the same cipher text, which would otherwise allow an attacker to infer that a relationship exists.

When computing a message authentication code, such as by CBC-MAC, the use of an initialization vector is a possible attack vector.

In the operation of a block cipher in cipher block chaining mode, the first block of plain text is mixed with the initialization vector using an exclusive OR ($\displaystyle{ P_1 \oplus IV }$). The result of this operation is the input to the block cipher for encryption.

However, when performing encryption and decryption, we are required to send the initialization vector in plain text (typically as the block immediately preceding the first block of cipher text) so that the first block of plain text can be decrypted and recovered successfully. If computing a MAC, we will also need to transmit the initialization vector to the other party in plain text so that they can verify that the tag on the message matches the value they have computed.

If we allow the initialization vector to be selected arbitrarily, it follows that the first block of plain text can potentially be modified (transmitting a different message) while producing the same message tag.

Consider a message $\displaystyle{ M_1 = P_1 \| P_2 \| \dots }$. In particular, when computing the message tag for CBC-MAC, suppose we choose an initialization vector $\displaystyle{ IV_1 }$ such that computation of the MAC begins with $\displaystyle{ E_K(IV_1 \oplus P_1) }$. This produces a (message, tag) pair $\displaystyle{ (M_1, T_1) }$.

Now produce the message $\displaystyle{ M_2 = P_1' \| P_2 \| \dots }$. For each bit modified in $\displaystyle{ P_1' }$, flip the corresponding bit in the initialization vector to produce the initialization vector $\displaystyle{ IV_1' }$. It follows that to compute the MAC for this message, we begin the computation by $\displaystyle{ E_K(P_1' \oplus IV_1') }$. As bits in both the plain text and initialization vector have been flipped in the same places, the modification is cancelled in this first stage, meaning the input to the block cipher is identical to that for $\displaystyle{ M_1 }$. If no further changes are made to the plain text, the same tag will be derived despite a different message being transmitted.

If the freedom to select an initialization vector is removed and all implementations of CBC-MAC fix themselves on a particular initialization vector (often the vector of zeroes, but in theory, it could be anything provided all implementations agree), this attack cannot proceed.

To sum up, if the attacker is able to set the IV that will be used for MAC verification, they can perform arbitrary modification of the first data block without invalidating the MAC.
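The bit-flipping argument of this section can be checked with a few lines of Python. As before, this is only a sketch: `E` is a hypothetical toy permutation standing in for a real block cipher, and `cbc_mac_iv` is a name invented here for a CBC-MAC variant that accepts a caller-supplied IV (the very design being criticized).

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes (assumed for this sketch)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def E(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation; a stand-in for a block cipher, NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def cbc_mac_iv(key: bytes, msg: bytes, iv: bytes) -> bytes:
    """CBC-MAC that (unsafely) lets the sender choose the IV."""
    state = iv
    for i in range(0, len(msg), BLOCK):
        state = E(key, xor(state, msg[i:i + BLOCK]))
    return state

key = b"K" * BLOCK
iv1 = bytes(range(BLOCK))
p1, p2 = b"A" * BLOCK, b"B" * BLOCK
t1 = cbc_mac_iv(key, p1 + p2, iv1)

# Attacker: pick any new first block, then flip the same bits in the IV.
p1_new = b"Z" * BLOCK
delta = xor(p1, p1_new)          # which bits of P_1 were changed
iv1_new = xor(iv1, delta)        # compensate in the transmitted IV

# The verifier, using the attacker-supplied IV, accepts the old tag.
print(cbc_mac_iv(key, p1_new + p2, iv1_new) == t1)
```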

### Using predictable initialization vector

Sometimes the IV is used as a counter to prevent message replay attacks. However, if the attacker can predict what IV will be used for MAC verification, they can replay a previously observed message by modifying the first data block to compensate for the change in the IV that will be used for the verification. For example, if the attacker has observed message $\displaystyle{ M_1 = P_1 \| P_2 \| \dots }$ with $\displaystyle{ IV_1 }$ and knows $\displaystyle{ IV_2 }$, they can produce $\displaystyle{ M_1' = (P_1 \oplus IV_1 \oplus IV_2) \| P_2 \| \dots }$ that will pass MAC verification with $\displaystyle{ IV_2 }$.
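A minimal sketch of this replay follows, assuming the IV is a big-endian message counter that the attacker can predict. `E` is a hypothetical toy permutation standing in for a real block cipher, and the helper names are invented for illustration.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes (assumed for this sketch)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def E(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation; a stand-in for a block cipher, NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hmac.new(key, bytes([rnd]) + right, hashlib.sha256).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def cbc_mac_iv(key: bytes, msg: bytes, iv: bytes) -> bytes:
    state = iv
    for i in range(0, len(msg), BLOCK):
        state = E(key, xor(state, msg[i:i + BLOCK]))
    return state

def counter_iv(n: int) -> bytes:
    """Predictable IV: the message counter encoded as one block."""
    return n.to_bytes(BLOCK, "big")

key = b"K" * BLOCK
p1, p2 = b"A" * BLOCK, b"B" * BLOCK
iv1, iv2 = counter_iv(1), counter_iv(2)

# Observed by the attacker: message M_1 and its tag under IV_1.
t1 = cbc_mac_iv(key, p1 + p2, iv1)

# Replay under the predicted IV_2: M_1' = (P_1 XOR IV_1 XOR IV_2) || P_2.
replayed = xor(xor(p1, iv1), iv2) + p2
print(cbc_mac_iv(key, replayed, iv2) == t1)
```

The replayed message differs from the original only in the bits where the two counter values differ, so the counter fails to prevent replay of nearly identical content.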

The simplest countermeasure is to encrypt the IV before using it (i.e., prepending the IV to the data). Alternatively, a MAC in CFB mode can be used, because in CFB mode the IV is encrypted before it is XORed with the data.

Another solution (in case protection against message replay attacks is not required) is to always use a zero vector IV. Note that the above formula for $\displaystyle{ M_1' }$ becomes $\displaystyle{ M_1' = (P_1 \oplus 0 \oplus 0) \| P_2 \| \dots = P_1 \| P_2 \| \dots = M_1 }$. Since $\displaystyle{ M_1 }$ and $\displaystyle{ M_1' }$ are the same message, by definition they will have the same tag. This is not a forgery but the intended use of CBC-MAC.

## Standards that define the algorithm

FIPS PUB 113 Computer Data Authentication is a (now obsolete) U.S. government standard that specified the CBC-MAC algorithm using DES as the block cipher.

The CBC-MAC algorithm is equivalent to ISO/IEC 9797-1 MAC Algorithm 1.