All companies have a duty of care to their customers and employees while ensuring their business runs profitably.
There have been many high-profile cases over the years of companies receiving substantial penalties for breaching employee or customer safety standards. However, most companies have not been held to as high a standard when it comes to protecting private data, nor have they focused enough attention on it.
Under a thick blanket of privacy legislation in almost all western countries, organisations must take extreme care to protect any personally identifiable information (PII) and personal health information (PHI) they store relating to employees or customers.
For example, the European Commission's proposed General Data Protection Regulation will impose fines of up to two per cent of a company's annual global turnover for failure to protect consumers' private information. This regulation, set to be adopted in 2014, will apply to businesses that operate in the EU or that hold personal information of EU citizens.
Where is private data stored?
Even though business leaders are well aware of the need to protect customers' privacy, the reality is that most companies don't always store this information in safe places.
First of all, it's important to understand that the biggest data leakage threats don't lie in neatly structured company databases, but in unstructured data such as documents, spreadsheets and email. Because unstructured data is much harder to search, it is challenging for organisations to get a clear picture of what this data contains, where it is stored and who has access to it.
Many organisations make two damaging assumptions when it comes to data leakage. The first is believing they only need to worry about privacy if they are hacked. Unfortunately, employees can easily leak information, either maliciously or inadvertently. People often make 'convenience copies' and store sensitive information in file shares or email it to their personal accounts. They may also take it outside the firewall using personal laptops, smartphones, cloud storage services, flash drives or email.
The second assumption is that, because trawling through millions of emails and files for evidence of privacy breaches takes so many resources, it would be just as hard for anyone else to find sensitive information stored in their systems, so they simply don't look. Again, this is a poor assumption because a person who gets hold of your data only needs a small amount of the wrong information to cause you grief. They may also have obtained it by means other than searching, such as a leak, an accidental release in a court case or a disclosure made to comply with a regulatory investigation.
Nuix recently cleansed more than 10,000 items of personally identifiable information, personal health information and credit card numbers from the Enron PST Data Set published by EDRM. This is a worldwide standard set of test data for electronic discovery practitioners and vendors, which was released to the public following the US government investigation into the collapse of energy firm Enron.
Nuix's investigation unearthed 60 items containing credit card numbers, including departmental contact lists that each contained hundreds of individual credit cards; 572 items containing Social Security or other national identity numbers (thousands of individuals' identity numbers in total); 292 items containing individuals' dates of birth; and 532 items containing information of a highly personal nature, such as medical or legal matters.
Analysis also showed a considerable number of these items had been sent outside the company, for example, by employees forwarding details to their personal email addresses.
While companies today are more aware than Enron was about the need to protect private data, there are also more opportunities for this information to be stored inappropriately. Nuix has conducted sweeps for private and credit card data in unstructured information stores for dozens of customers and is yet to encounter a single data set without some inappropriately stored personal, financial or health information.
Locate and remediate privacy risks
Recent technology advances have made it much easier for companies to index large volumes of unstructured data and locate improperly stored sensitive information within it. The methodology Nuix used to identify the personal and financial data in the Enron data can be applied to any corporate data set.
The crucial first step is indexing the most relevant data sources, capturing all text and metadata. This would most likely include email, network file shares, collaboration systems and individual computers.
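As a rough illustration of what indexing text and metadata together can mean, here is a minimal Python sketch of a toy inverted index. It is not how Nuix or any commercial engine works; the names Item and SimpleIndex and the sample metadata fields are hypothetical, chosen only for this example.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field

TOKEN = re.compile(r"[a-z0-9]+")


@dataclass
class Item:
    """One email, file or document (hypothetical structure for illustration)."""
    item_id: str
    text: str
    metadata: dict = field(default_factory=dict)


class SimpleIndex:
    """Toy inverted index: maps each lowercase token to the ids of items containing it."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.items = {}

    def add(self, item):
        self.items[item.item_id] = item
        # Index body text and metadata values together so both are searchable.
        searchable = " ".join([item.text, *map(str, item.metadata.values())])
        for token in TOKEN.findall(searchable.lower()):
            self.postings[token].add(item.item_id)

    def search(self, *terms):
        """Return the ids of items that contain every term (an AND query)."""
        sets = [self.postings.get(term.lower(), set()) for term in terms]
        return set.intersection(*sets) if sets else set()


index = SimpleIndex()
index.add(Item("msg-1", "Please find the card details attached",
               {"source": "email", "sender": "alice@example.com"}))
index.add(Item("doc-7", "Quarterly report draft",
               {"source": "fileshare", "path": "/shared/finance/q3.doc"}))

print(index.search("card", "email"))  # {'msg-1'}
print(index.search("finance"))        # {'doc-7'}
```

Because metadata values are tokenised alongside the body text, a query can combine what an item says with where it lives or who sent it, which is what makes it possible to answer the questions of what the data contains, where it is stored and who has access to it.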
With a complete index of this data, common investigative steps include:
- Using pattern matching to identify and cross-reference sensitive information such as credit card numbers, dates of birth and addresses
- Searching for names, phrases or email address domain names that could indicate personal legal or health discussions, online purchases or other private matters
- Creating network maps and timelines to identify communication patterns and understand messages and documents in the context of external events
- Conducting 'near duplicate' analysis to find similar and related content and put together conversation threads.
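The pattern-matching step in the list above can be sketched in a few lines of Python. This is an illustrative sketch, not the method Nuix used on the Enron data: the regular expressions are simplified assumptions (13 to 16 digit card numbers, US-style Social Security numbers, slash-separated dates), the function names are hypothetical, and the sample values are dummy test data. Candidate card numbers are filtered with the Luhn checksum to cut down on false positives from arbitrary digit strings.

```python
import re

# Candidate card numbers: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")
# US Social Security numbers in the common 3-2-4 layout.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Slash-separated dates with a four-digit year starting 19 or 20.
DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/(?:19|20)\d{2}\b")


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_sensitive(text):
    """Return (kind, matched_text) pairs for likely sensitive values in `text`."""
    hits = []
    for match in CARD_RE.finditer(text):
        if luhn_valid(match.group()):
            hits.append(("credit_card", match.group()))
    hits += [("ssn", m.group()) for m in SSN_RE.finditer(text)]
    hits += [("date", m.group()) for m in DATE_RE.finditer(text)]
    return hits


sample = "Card 4111 1111 1111 1111, SSN 078-05-1120, born 14/02/1975."
for kind, value in find_sensitive(sample):
    print(kind, value)
# credit_card 4111 1111 1111 1111
# ssn 078-05-1120
# date 14/02/1975
```

In practice, each hit would be cross-referenced with the item's metadata, such as its location and recipients, to decide whether the information is stored or shared inappropriately. A date on its own is rarely sensitive; it becomes a concern when it appears alongside a name or identity number.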
Once you understand what is in your data stores and where the biggest threats lie, you can delete the high-risk data or move it somewhere that has appropriate encryption and access controls.
Being proactive about privacy protection
Almost every organisation has personal data stored inappropriately. The increasing burden of privacy and data breach regulations, on top of a duty of care to keep this highly sensitive information safe, makes this an unacceptable risk.
By taking a more proactive approach and using the latest technology to understand what lies within your data sources now, you can keep sensitive information safe, for the sake of your customers, employees and ongoing business success.
Eddie Sheehy is CEO of Nuix