Earth:MAC Address Anonymization

From HandWiki

MAC address anonymization is the idea of applying a one-way function to a MAC address so that the result can be used in tracking systems, both for reporting and for the general public, while making it nearly impossible to recover the original MAC address from the result. The idea is that this process allows companies such as Google,[1] Apple[2] and iInside,[3] which track users' movements via computer hardware, to protect the identities of the people they are tracking, as well as the identity of the hardware itself.

Examples

A simple example of MAC address anonymization is to apply a hash algorithm. Given an address of 11:22:33:44:55, the MD5 hash algorithm produces 8,093,140,232,281,458,246 (0x70509c29768f0646).

An address that differs by only one character (11:22:33:44:56) produces 1,390,925,306,346,392,705 (0x134d8f3259e0cc81), an entirely different number.
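The hashing step can be sketched in a few lines of Python. This is only an illustration: the article does not state how the address was encoded before hashing or how the 64-bit values above were derived from the 128-bit MD5 digest, so the function name `anonymize_mac`, the lowercase ASCII encoding and the full hexadecimal output are assumptions, and the digests printed will not necessarily match the figures quoted.

```python
import hashlib

def anonymize_mac(mac: str) -> str:
    """Return the MD5 digest (32 hex characters) of a MAC address string."""
    # Assumption: normalise case and whitespace so equivalent spellings
    # of the same address produce the same digest.
    normalized = mac.strip().lower()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()

first = anonymize_mac("11:22:33:44:55")
second = anonymize_mac("11:22:33:44:56")

print(first)
print(second)
print(first != second)  # the one-character change yields a different digest
```

The same input always yields the same digest, which is what makes the output usable as a stable pseudonym in a tracking system.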

Why this does not work in practice

Tracking companies rely on the assumption that address anonymization is akin to encryption. Given a message and an encryption method that is well known to both the encoder and a potential decryptor, modern encryption methods (such as AES or RSA) yield a result that is unbreakable in practice. That strength, however, comes from a secret key with a very large number of possible values; a plain hash of a MAC address involves no key at all.

The problem lies in the fact that there are only 2^48 (281,474,976,710,656) possible MAC addresses. Given the encoding algorithm, an index can easily be created for each possible address.
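A scaled-down sketch of such an index is shown below. To keep it runnable in well under a second, it assumes the first four octets are already known (the prefix `11:22:33:44` is purely hypothetical) and enumerates only the last two, giving 65,536 candidates instead of the full address space. The helper names `md5_hex` and `build_index` are illustrative, and MD5 over the lowercase text form of the address is an assumption about the encoding.

```python
import hashlib
from itertools import product

def md5_hex(mac: str) -> str:
    """Hash the textual form of a MAC address with MD5."""
    return hashlib.md5(mac.encode("ascii")).hexdigest()

def build_index(prefix: str) -> dict:
    """Map digest -> MAC for every address that starts with `prefix`.

    Only the last two octets vary, so the table holds 256 * 256 entries.
    """
    index = {}
    for hi, lo in product(range(256), repeat=2):
        mac = f"{prefix}:{hi:02x}:{lo:02x}"
        index[md5_hex(mac)] = mac
    return index

index = build_index("11:22:33:44")        # hypothetical known prefix
target = md5_hex("11:22:33:44:55:66")     # an "anonymized" address
recovered = index[target]                 # one dictionary lookup reverses it

print(len(index), recovered)
```

Extending the enumeration to all six octets changes nothing in the logic; only the amount of computation and storage grows.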

Several years ago, building such an index would have been difficult because of the compute time involved. With modern, parallel cloud computing, index generation can easily be divided among as many processors as desired.

On a 2.5 GHz processor, a C# program was able to produce the following results:[citation needed]

CPU count    Iterations per CPU    Time (seconds)
1            2.814 × 10^14         3.0489 × 10^8
1,000        2.814 × 10^11         3.0489 × 10^5
1,000,000    2.814 × 10^8          304.89

Thus, a million processors could create the entire index in just over five minutes, and 1,000 processors in less than 85 hours. Once the index is complete, conversion of "anonymized" addresses to their actual addresses is almost instantaneous. Such an index is often referred to as a rainbow table.
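The scaling arithmetic can be checked with a short calculation. The sketch below takes the 304.89-second figure for one million processors from the table as its only measured input and assumes the work divides perfectly evenly, with no coordination overhead; the function name `index_build_seconds` is illustrative.

```python
TOTAL_ADDRESSES = 2 ** 48            # every possible 48-bit MAC address
SECONDS_AT_MILLION_CPUS = 304.89     # figure quoted in the table

def index_build_seconds(cpu_count: int) -> float:
    """Estimated wall-clock time, assuming perfectly linear scaling."""
    single_cpu_seconds = SECONDS_AT_MILLION_CPUS * 1_000_000
    return single_cpu_seconds / cpu_count

for cpus in (1, 1_000, 1_000_000):
    seconds = index_build_seconds(cpus)
    per_cpu = TOTAL_ADDRESSES // cpus   # hashes each processor must compute
    print(f"{cpus:>9,} CPUs: {per_cpu:.3e} hashes each, "
          f"{seconds:.4e} s ({seconds / 3600:.1f} h)")
```

Under these assumptions a single processor would need roughly 3 × 10^8 seconds, which is close to ten years, while every thousandfold increase in processors cuts the time by the same factor.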

Salting

One common way of mitigating such brute-force attacks is to append a random salt to the MAC address before hashing. A salt is a secret string of characters that makes the search space for attackers larger. The effect is that a brute-force attack can no longer simply enumerate the valid range of MAC addresses. Lists of salted and hashed MAC addresses would be impossible to translate, provided that the salt is long enough and is created in a way that makes it difficult to guess.
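A minimal sketch of salting follows, assuming a 16-byte salt drawn from the operating system's random source and simple concatenation of address and salt before an MD5 hash. The names `new_salt` and `anonymize_salted` are illustrative, and the salt length is an assumption rather than a recommendation from the article.

```python
import hashlib
import secrets

def new_salt(num_bytes: int = 16) -> bytes:
    """Generate a random secret salt (assumed length: 16 bytes)."""
    return secrets.token_bytes(num_bytes)

def anonymize_salted(mac: str, salt: bytes) -> str:
    """Hash the MAC address with the secret salt appended."""
    normalized = mac.strip().lower().encode("ascii")
    return hashlib.md5(normalized + salt).hexdigest()

salt = new_salt()   # generated once, then kept secret by the data holder
mac = "11:22:33:44:55:66"

unsalted = hashlib.md5(mac.encode("ascii")).hexdigest()
salted = anonymize_salted(mac, salt)

print(salted != unsalted)                      # a precomputed unsalted index no longer matches
print(salted == anonymize_salted(mac, salt))   # the same salt still gives a stable pseudonym
```

In practice a keyed construction such as HMAC, available in Python's standard `hmac` module, is the more conventional way to combine a secret with a hash; plain concatenation is used here only because it mirrors the description above.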

The entity holding the secret salt can still de-anonymize the data by attacking the hashed MAC addresses with the technique described in the previous section. Attackers could also obtain the secret salt through social engineering, phishing or other means.
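The point can be illustrated with a small sketch: once the salt is known, the search is again limited to the space of MAC addresses. As before, only the last two octets are enumerated so the example runs quickly; the prefix, the salt value and the function names `salted_digest` and `recover_mac` are all hypothetical.

```python
import hashlib
from itertools import product

def salted_digest(mac: str, salt: bytes) -> str:
    """MD5 of the MAC address text with the salt appended."""
    return hashlib.md5(mac.encode("ascii") + salt).hexdigest()

def recover_mac(digest: str, salt: bytes, prefix: str):
    """Try every address under `prefix`; return the match or None."""
    for hi, lo in product(range(256), repeat=2):
        candidate = f"{prefix}:{hi:02x}:{lo:02x}"
        if salted_digest(candidate, salt) == digest:
            return candidate
    return None

salt = b"example-secret-salt"    # hypothetical salt held by the tracker
digest = salted_digest("11:22:33:44:ab:cd", salt)

print(recover_mac(digest, salt, "11:22:33:44"))            # succeeds with the real salt
print(recover_mac(digest, b"wrong-guess", "11:22:33:44"))  # fails without it
```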

Truncating

Where data protection law requires anonymisation, the method used should exclude any possibility of the original MAC address being identified. Some companies truncate IPv4 addresses by removing the final octet, thereby retaining information about the user's ISP or subnet but not directly identifying the individual. The activity could then have originated from any of 254 IP addresses. This may not always be enough to guarantee anonymisation.[4]
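The IPv4 truncation described above can be sketched with Python's standard `ipaddress` module. Zeroing the final octet is treated here as equivalent to keeping the /24 network address; the function name `truncate_ipv4` and the sample addresses, which come from a range reserved for documentation, are illustrative.

```python
import ipaddress

def truncate_ipv4(address: str) -> str:
    """Drop the final octet by keeping only the /24 network address."""
    # strict=False lets a host address be supplied instead of a network address.
    network = ipaddress.ip_network(f"{address}/24", strict=False)
    return str(network.network_address)

print(truncate_ipv4("192.0.2.57"))    # 192.0.2.0
print(truncate_ipv4("192.0.2.200"))   # 192.0.2.0
```

Both sample addresses collapse to the same value, so the truncated record identifies the subnet but not a single host within it.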

References