Several companies regularly "hash" their customers' email addresses before adding them to marketing lists or uploading them to Facebook to deliver targeted advertisements. They believe this keeps the addresses secure in the event of a data breach, or when the addresses are shared with third parties or across the supply chain.
However, security experts say this belief is mistaken: hashing by itself does not keep customers' email addresses secure. According to experts at Princeton University, email hashing is nothing like end-to-end encryption, and hashed addresses can easily be reversed to recover the original email addresses without obtaining any private keys.
Today's cyber-criminals can gain access to millions of clear-text email addresses via data breaches. They can then use the breached data to recover a victim's email address from its hash, by hashing each known address and comparing the results. Those who are willing to spend some money can also easily gain access to millions of unhashed email addresses by purchasing bulk marketing lists from companies.
According to the researchers, bulk mailing lists labelled with privacy-invasive categories such as religious affiliation, medical conditions or addictions are available to purchase on the web. Examples include lists of the underbanked, the financially challenged, gamblers, high blood pressure sufferers in Tallahassee, Florida, Anti-Sharia Christian conservatives, and Muslim prime prospects.
They added that cyber-criminals can also recover between 42 and 70 percent of original email addresses from their hashes using simple heuristics and limited resources. These techniques include using patterns such as firstname.lastname@example.org as well as other patterns that include a combination of first names and surnames.
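The pattern-based guessing described above can be sketched in a few lines of Python. This is a minimal illustration rather than the researchers' actual method: the choice of SHA-256, the lower-casing step, the tiny name and domain lists, and the function names are all assumptions made for the example.

```python
import hashlib
from itertools import product


def hash_email(email: str) -> str:
    """Hash a trimmed, lower-cased address with SHA-256 (assumed scheme)."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


# Hypothetical, deliberately tiny dictionaries. An attacker would use
# much larger lists of common first names, surnames and mail domains.
FIRST_NAMES = ["alice", "bob", "carol"]
SURNAMES = ["smith", "jones"]
DOMAINS = ["example.org", "example.com"]


def candidate_addresses():
    """Yield guesses built from common name patterns."""
    for first, last, domain in product(FIRST_NAMES, SURNAMES, DOMAINS):
        yield f"{first}.{last}@{domain}"     # firstname.lastname
        yield f"{first}{last}@{domain}"      # firstnamelastname
        yield f"{first[0]}{last}@{domain}"   # initial + lastname
        yield f"{last}.{first}@{domain}"     # lastname.firstname


def recover(target_hashes):
    """Return {hash: email} for every target hash that a guess reproduces."""
    targets = set(target_hashes)
    recovered = {}
    for email in candidate_addresses():
        digest = hash_email(email)
        if digest in targets:
            recovered[digest] = email
    return recovered


# Two "leaked" hashes: one follows a guessable pattern, one does not.
leaked = [hash_email("Alice.Smith@example.org"), hash_email("zed@unknown.test")]
result = recover(leaked)
```

Because the hash is deterministic and unkeyed, any guess that matches the original address produces exactly the same digest, so the first hash is recovered while the second, which follows no listed pattern, is not.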
Beyond such techniques, what truly deals a major blow to email hashing is the existence of several firms that offer email address recovery services for as little as 2.84p per email. For example, Datafinder, a company that combines online and offline consumer data, charges 2.84p per email to reverse hashed email addresses and promises a recovery rate of 70 percent.
A consumer identity management company called Infutor also claims that it can match anonymous hashed emails with a database of known hashed information to provide consumer contact information, insights, and demographic information. The firm recently claimed that it had recovered three million email addresses and has also set up a real-time online service to reverse hashed emails for an EU company.
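Matching hashed emails against a database of known addresses amounts to a precomputed lookup table. The sketch below shows the general idea only. It is not a description of any named company's system, and the records, field names and SHA-256 scheme are hypothetical.

```python
import hashlib


def hash_email(email: str) -> str:
    """Hash a trimmed, lower-cased address with SHA-256 (assumed scheme)."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


# Hypothetical consumer records held by the matching party.
KNOWN_RECORDS = [
    {"email": "alice.smith@example.org", "city": "Leeds", "age_band": "35-44"},
    {"email": "bob.jones@example.com", "city": "Bristol", "age_band": "25-34"},
]


def build_index(records):
    """Precompute hash -> record once, so each later lookup is a dict access."""
    return {hash_email(record["email"]): record for record in records}


def match(hashed_emails, index):
    """Return the known record for every incoming hash found in the index."""
    return {h: index[h] for h in hashed_emails if h in index}


index = build_index(KNOWN_RECORDS)

# Incoming "anonymous" hashes: one belongs to a known address, one does not.
incoming = [hash_email("BOB.JONES@example.com "), "0" * 64]
matches = match(incoming, index)
```

Once the index exists, each "anonymous" hash resolves to a full record with a single dictionary lookup, which is why such a service can plausibly run in real time.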
According to the researchers, companies based in the UK and in the rest of Europe may not be able to categorise hashed email addresses as "pseudonymous identifiers" under the upcoming GDPR. This is because GDPR defines pseudonymisation as processing personal data in "a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures."
The additional information mentioned above refers to private keys, which firms can use to decrypt encrypted data. However, hashed emails are not encrypted with separately stored private keys and can be matched with other data sources to reveal a person's identity.
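The difference between an unkeyed hash and a transformation that depends on separately held secret material can be illustrated as follows. This is a sketch under stated assumptions: HMAC-SHA-256 is used here purely as an illustration of a keyed construction, it is not something the researchers or the regulation prescribe, and the key handling shown is hypothetical.

```python
import hashlib
import hmac
import secrets


def plain_hash(email: str) -> str:
    """Unkeyed SHA-256 of the normalised address: anyone can recompute it."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def keyed_hash(email: str, key: bytes) -> str:
    """HMAC-SHA-256 of the normalised address: recomputing it needs the key."""
    message = email.strip().lower().encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


# Hypothetical secret key; in practice it would be stored apart from the data.
secret_key = secrets.token_bytes(32)
email = "alice.smith@example.org"

# An outsider who guesses the address reproduces the plain hash exactly.
plain_matches = plain_hash("Alice.Smith@example.org") == plain_hash(email)

# The same guess made with a different key does not reproduce the keyed digest.
attacker_key = secrets.token_bytes(32)
keyed_matches = hmac.compare_digest(
    keyed_hash(email, attacker_key),
    keyed_hash(email, secret_key),
)
```

Here `plain_matches` is true because nothing secret goes into the plain hash, while `keyed_matches` is false because the guesser lacks the key. The keyed digest is still a stable identifier for whoever holds the key, so it does not by itself prevent records from being linked.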
"Hashed email addresses can be easily reversed and linked to an individual, therefore they do not provide any significant protection for the data subjects. The existence of companies that reverse email hashes shows that calling hashed email addresses “anonymous”, “private”, “irreversible” or “de-identified” is misleading and promotes a false sense of privacy. If reversing email hashes were really impossible as claimed, it would cost more than four cents (three pence).
"Even if hashed email addresses were not reversible, they could still be used to match, buy and sell your data between different parties, platforms or devices. As privacy scholars have already argued, when your online profile can be used to target, affect and manipulate you, keeping your true name or email address private may not bear so much significance," they added.
James Houghton, CTO at ThinkMarble, told SC Magazine UK that since email addresses obtained by companies are used for marketing and for corresponding with the owner of the data, they have to be reversible. There is a business risk associated with such a reversal process, which should be noted and discussed at board level, but ultimately the commercial requirement will win.
"Businesses must follow good cyber-security design practices to obfuscate sensitive and personal information and ensure the location of that data store is protected from unauthorised users, both internally and externally to the business. For example, not using default settings in local database servers and ensuring privacy and security settings are locked down, especially for cloud storage areas that are often accessible from the wider internet," he said.
Etienne Greeff, CTO and co-founder at SecureData, said: "This could well be considered a red herring for a number of reasons. Firstly, using email addresses as part of authentication should not happen at all anyway, as email is extremely easy to spoof, and as demonstrated in the past through typo-squatting it is also fairly easy to intercept emails.
"Whilst email addresses are considered Personally Identifiable Information under GDPR, it is the least of the concerns with regards to GDPR and PII being stored and handled by businesses. What should be keeping businesses awake at night is the 4.8 billion email addresses (non-hashed) and passwords (mostly hashed) living in haveibeenpwned.com. These passwords can be used as a very good starting point to get to proprietary corporate data, as previous research we have done in the past has demonstrated.
"Worrying about somebody potentially decrypting an email address, while there are passwords on public websites and known vulnerabilities on networks seems like a mix up in priorities to say the least," he added.