Blackhat EU: Breaking Big Data

Former intelligence officer David Venable gave a crowd at Blackhat EU 2016 a rundown of what big data, and bad data, in the private sector could mean for your privacy.

David Venable spent time as an employee of the National Security Agency

"Privacy as we know it is dead", said David Venable of Masergy Communications, as he began his talk, Breaking Big Data: Evading Analysis of the Metadata of Your Life at BlackHat Europe 2016.

A series of groundbreaking leaks over the last few years have shown the sheer scale of an infrastructure devoted entirely to collecting and exploiting our data, "something Orwell couldn't have dreamed of".

It's not the government he's worried about though. Venable is a former intelligence officer for the National Security Agency, and knows the level of oversight that government programmes go through. No, Venable is worried about an infrastructure with similar capabilities and much less oversight: the private sector.

Big data is not going away anytime soon. In fact, it is routinely pointed out that so-called big data is on the verge of curing cancer.

Certainly, a lot of faith is put in those two little words. Its components are simple: huge datasets and the algorithms to make sense of that data.

Often it's not so much big data as bad data, though. Duplicate entries, incorrect addresses and false information cost a lot. IBM estimated this year that bad data costs the American economy US$3.1 trillion (£2.47 trillion) per year.

As Cathy O'Neil's new book Weapons of Math Destruction outlines, algorithms are not always to be trusted. As it turns out, "these algorithms reflect the biases of their creators", said Venable.

"There's a tendency in western culture that the facts are the facts", added Venable. Big data often seems implicitly trusted. But with not only bad data but also skewed algorithms, what might the outcomes be? Well, if you've ever seen the 1959 Hitchcock thriller North by Northwest, you might have an idea.

Venable offered a scenario: say a terrorist-affiliated group is using the same cell tower as you. It would not be too improbable for you to be mistakenly associated with that group: "you don't even have to be in the same hotel".

"There's no law of opsec information out there", he said, nothing that controls what metadata you are feeding into this big machine. However, there are some things that work: "one of them is encryption".

Uber once put out a blog post called Rides of Glory. Using metadata taken from its users, Uber compiled an analysis of their one-night stands and was able to determine which times and days were best for them.

The data was apparently safely anonymised but, said Venable, "ultimately this could easily be tied to you as an individual by anyone with access to this data".

"Unfortunately endpoint security is so terrifically weak that anyone can find you", he said, and "it just seems to be getting worse".

There is an old information security adage, some variation of: there are those who have been compromised and those who don't know they've been compromised. Venable proposed a new one: "everything's compromised".

After all, "what isn't compromised has permission from you". It's all being recorded and, if you're not the direct customer, you're the product.

According to Venable, even the kinds of information you might find trivial are being recorded: "what aisles you're walking down in the store, what things you linger on - this is being recorded, too".

"Your feelings too" are being recorded on social media, collected and, through sentiment analysis, used to better sell products to you. "This is being used to predict elections, and quite effectively".

The Facebook database might be much better than the FBI's, but "the amount of oversight on private companies versus governments is dramatically less".

One can, of course, completely disengage, which Venable calls the 'mountain man' approach. "That worked for a long time", but a data black hole "starts to become more of a data point to not have any than to have millions".

In short, it is easier to hide in loads of data points than none at all. The question is whether you get to control that collection of data points or not - "what data you put out there to represent your persona", said Venable.

To that end, and in order to retain your private life, "look at your own pattern of life and understand it".
