A MongoDB database containing 188 million records of personal data was left unprotected on the web, Comparitech and security researcher Bob Diachenko found. The bulk of the records appear to be from Pipl.com and LexisNexis, people search and legal search websites, said the Comparitech report.
The trove was accessible to anyone with an internet connection, the report said.
"About 800,000 of the records appear to originate from LexisNexis, a legal search engine," wrote Paul Bischoff in a Comparitech blog post. Those records included names, past names, addresses, gender, parental status, a short biography, family members, redacted emails, and info about the person’s neighbors including full names, dates of birth, reputation scores, and addresses.
The database was first indexed by search engines on 17 June. "We traced the database back to a Github repo for a people search API called thedatarepo. We promptly notified the database owner as soon as we could determine to whom it belonged. The owner then shut down access on 3 July," Bischoff said.
Comparitech deduced that this was not a case of a data breach at the source websites, but of data either purchased or harvested through existing tools.
"The Github repo gives examples of how the API could have been used, for example, to look up people by their name or what car they own. It was last updated on 18 June. It lists an email for users to request 'bulk data purchases and/or access to more data/requests'," said Bischoff.
This situation highlights two important areas of concern, said Javvad Malik, security awareness advocate at KnowBe4.
"[The first is that] even legitimate functionality can be abused and that those uses should be explored when designing systems. The second is one which can be referred to as the 'chemistry of data' in that certain data elements on their own may be of low risk but aggregated together, they can be high risk," he said.
In May, Diachenko flagged another MongoDB exposure: a public-facing database that held more than 275 million records containing personal information on citizens in India.
The database was hijacked by hackers known as the ‘Unistellar’ group and all of its content was wiped out, said the researcher.
Data by default
"Data brokers gather personal data from many different sources and create combined data sets. These new data sets carry way more risk and need adequate protection," said Warren Poschman, senior solutions architect at comforte AG.
From user registration to usage tracking, data is generated when operations happen online. Making matters worse is the kind of information people share on social media.
SC Media UK reported last month that there were more than 33,000 victims of identity fraud in 2018 in the UK. About 65 percent of the victims had some sort of social media presence, allowing scam artists to harvest crucial personal data.
The type of data exposed in the MongoDB database can be combined with user data from other breaches and social media to build a complete profile, noted Lisa Baergen, director at NuData Security.
"In the hands of fraudsters and criminal organisations, these valuable identity sets are usually sold to other cyber-criminals and used for myriad criminal activities, both on the internet and in the physical world. Using these real identities, and sometimes fake identities with valid credentials, they’ll take over accounts, apply for loans, and much more. Every hack has a snowball effect that far outlasts the initial breach," she said.
All customer information (names, physical and email addresses, passwords, the content of emails) is valuable to fraudsters, and making this data valueless using passive biometrics is a viable solution, she suggested.
"Passive biometrics technology is making stolen data valueless by verifying users based on their inherent behaviour instead of relying on their data. This makes it impossible for bad actors to access illegitimate accounts, as they can't replicate the customer’s inherent behaviour," she said.
"Analysing customer behaviour with passive biometrics is completely invisible to users. It has the added benefit of providing valid users with a great experience without the extra friction that often comes with other customer identification techniques. When fraudsters try to use stolen customer data or login credentials, they will find the data is useless," she added.