More than 5.5 million websites may have been at risk from a software bug that exposed sensitive information, such as passwords, and cookies and tokens used to authenticate users.
The leak was discovered by Google security researcher Tavis Ormandy. In a security advisory, he said that because of the leak, he was able to find “private messages from major dating sites, full messages from a well-known chat service, online password manager data, frames from adult video sites, hotel bookings. We're talking full https requests, client IP addresses, full responses, cookies, passwords, keys, data, everything.”
Ormandy found the problem last week while working on a project. He encountered some data that didn't match what he had been expecting. He said that the format of the data was confusing and, after a while, it became clear that he was looking at chunks of uninitialised memory interspersed with valid data.
“A while later, we figured out how to reproduce the problem. It looked like that if an html page hosted behind cloudflare had a specific combination of unbalanced tags, the proxy would intersperse pages of uninitialised memory into the output,” he said.
“My working theory was that this was related to their "ScrapeShield" feature which parses and obfuscates html - but because reverse proxies are shared between customers, it would affect *all* Cloudflare customers,” he said.
Ormandy reported the bug to Cloudflare last Saturday and within hours its security team had responded and implemented a fix by disabling the features that had caused the problem to surface: email obfuscation, server-side excludes, and automatic HTTPS rewrites.
The firm said the issue could have started around five months ago, on September 22, 2016. The issue has been traced back to a coding error introduced by engineers working at Cloudflare. The firm said the flaw was in an HTML parser chain that Cloudflare used to modify web pages as they pass through the service's edge servers.
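The general class of bug described here, a parser that reads past the end of its own buffer into memory belonging to someone else, can be illustrated with a toy model. The sketch below is not Cloudflare's code. The names `PAGE`, `SECRET`, `MEMORY` and `scan` are hypothetical, and it assumes, purely for illustration, that the parser tests for the end of its buffer with an equality check that a two-byte step can jump over.

```python
# Toy model of a buffer over-read. Everything here is hypothetical and
# illustrates the general bug class only, not Cloudflare's parser.

PAGE = b"<p>hi<"                        # page that ends in an unfinished tag
SECRET = b"Cookie: session=abc123"      # another request's data, adjacent in memory
MEMORY = PAGE + SECRET                  # one shared block of process memory


def scan(memory: bytes, start: int, end: int, safe: bool) -> bytes:
    """Copy memory[start:end] to the output, stepping two bytes after '<'.

    After a '<' the scanner assumes a tag-name byte follows and steps over
    it, which can move the cursor from end - 1 straight to end + 1.
    """
    out = bytearray()
    pos = start
    while pos < len(memory):            # keeps the toy from raising IndexError
        # Buggy check: equality never fires once the cursor has jumped past
        # `end`. Safe check: >= still stops the scan.
        if (pos >= end) if safe else (pos == end):
            break
        out.append(memory[pos])
        if memory[pos] == ord("<"):
            if pos + 1 < end:           # copy the tag-name byte if it is in range
                out.append(memory[pos + 1])
            pos += 2
        else:
            pos += 1
    return bytes(out)


if __name__ == "__main__":
    print(scan(MEMORY, 0, len(PAGE), safe=True))
    print(scan(MEMORY, 0, len(PAGE), safe=False))
```

With the safe check the output is just the page. With the equality check the cursor steps over the end marker and the scanner keeps copying whatever follows in memory, here the neighbouring cookie.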
“The engineers working on the new HTML parser had been so worried about bugs affecting our service that they had spent hours verifying that it did not contain security problems,” Cloudflare's CTO, John Graham-Cumming, said in a blog post.
“Unfortunately, it was the ancient piece of software that contained a latent security problem and that problem only showed up as we were in the process of migrating away from it.”
“The bug was serious because the leaked memory could contain private information and because it had been cached by search engines,” he said.
“We are disclosing this problem now as we are satisfied that search engine caches have now been cleared of sensitive information. We have also not discovered any evidence of malicious exploits of the bug or other reports of its existence,” he said.
In a later update, Ormandy said Cloudflare's blog post was a “postmortem, but severely downplays the risk to customers.”
In a tweet, Ormandy said that Cloudflare customers affected by the bug included Uber, 1Password, Fitbit, and OkCupid.
In an email sent to Cloudflare customers today, Cloudflare CEO Matthew Prince said that in a review of third-party caches, the company discovered data that had been exposed from approximately 150 of Cloudflare's customers across its Free, Pro, Business, and Enterprise plans. A list of sites affected has been published on DoMa.
“We have reached out to these customers directly to provide them with a copy of the data that was exposed, help them understand its impact, and help them mitigate that impact,” he said.
“To date, we have yet to find any instance of the bug being exploited, but we recommend if you are concerned that you invalidate and reissue any persistent secrets, such as long lived session identifiers, tokens or keys. Due to the nature of the bug, customer SSL keys were not exposed and do not need to be rotated.”
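The advice to invalidate and reissue persistent secrets can be sketched in a few lines. This is a minimal illustration that assumes a hypothetical in-memory session store. The `Session` type, the `issue` and `revoke_issued_before` helpers, and the choice of cutoff are all assumptions made for the example; a real application would use its own session backend and pick its own cutoff.

```python
from __future__ import annotations

import secrets
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Session:
    user: str
    token: str
    issued_at: datetime


def issue(store: dict[str, Session], user: str, now: datetime) -> Session:
    """Create a fresh random token for `user` and record it in the store."""
    session = Session(user, secrets.token_urlsafe(32), now)
    store[session.token] = session
    return session


def revoke_issued_before(store: dict[str, Session], cutoff: datetime) -> list[str]:
    """Delete every session issued before `cutoff`.

    Returns the sorted list of users who lost a session, so that they can
    be asked to sign in again and receive a new token.
    """
    stale = [token for token, s in store.items() if s.issued_at < cutoff]
    users = sorted({store[token].user for token in stale})
    for token in stale:
        del store[token]
    return users
```

Revoking by issue time rather than by user keeps the sweep simple: any token that existed before the cutoff is treated as potentially exposed, whether or not there is evidence that it leaked.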
Ilia Kolochenko, CEO of High-Tech Bridge, told SC Media UK that it's difficult to assess the practical exploitability and related risks of the vulnerability.
“It seems to be not very critical, but under some circumstances a set of chunks of disclosed memory may be perfectly enough to compromise the remote website and even the server,” he said.
“The risk is also aggravated by the large scale of impacted websites, providing attackers with a great choice of potential victims. Chances that cyber-criminals had found and exploited the issue much earlier than Google exist. Therefore, all websites owners are better to change all their passwords for the web applications and their back ends.”
Kyle Wilhoit, senior security researcher at DomainTools, commented: “The ‘cloudbleed’ issue, as it's affectionately now called, constitutes a big leak. After looking at the DomainTools dataset, it's surprising how many domains use Cloudflare. (A rough count of 4,828,000 domains.) Cloudflare's response was in-depth, accurate, and timely. They did a good job responding, and included lots of good postmortem facts. However, like Tavis Ormandy mentioned on Google's Project Zero team blog: ‘[the postmortem report] ... contains an excellent postmortem, but severely downplays the risk to customers.’
“Based on the infrastructure, the bug could possibly affect all users of Cloudflare, and would likely include full HTTPS request and response data. It's not possible to say there wasn't anyone siphoning and looking at this data, so no one knows if there is a person or group with nefarious intentions actually attempting to do anything with this data. Unfortunately, this isn't the first, nor last time, antiquated software can cause such a leak.
“Most people probably have no idea which of the sites they put their personal info on uses Cloudflare. So other than the big ones that have been discussed so far, it might be hard to determine one's exposure to this.”
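One rough way to check whether a given site currently sits behind Cloudflare is to look at its HTTP response headers. The sketch below assumes that Cloudflare-proxied responses typically carry a `Server: cloudflare` header or a `CF-RAY` header. The function names `looks_like_cloudflare` and `fetch_headers` are hypothetical, and a negative result does not show that a site was not served through Cloudflare during the exposure window.

```python
import urllib.error
import urllib.request
from typing import Dict, Mapping


def looks_like_cloudflare(headers: Mapping[str, str]) -> bool:
    """Heuristic: True if the response headers suggest a Cloudflare proxy."""
    lowered = {name.lower(): value for name, value in headers.items()}
    server = lowered.get("server", "").strip().lower()
    return server == "cloudflare" or "cf-ray" in lowered


def fetch_headers(url: str, timeout: float = 10.0) -> Dict[str, str]:
    """Send a HEAD request and return the response headers as a dict.

    Performs a real network request. Error responses (4xx/5xx) still carry
    headers, so those are returned as well.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return dict(response.headers.items())
    except urllib.error.HTTPError as error:
        return dict(error.headers.items())
```

For example, `looks_like_cloudflare(fetch_headers("https://example.com/"))` would test a single site; the URL here is only a placeholder.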
Craig Young, security researcher at Tripwire, adds: “While Cloudflare and Google worked hard to purge any private data from search engine caches, due to the short timing between when the fix was deployed and when the problem was announced, it is possible (and in fact there are already some unverified claims) that data was missed and could still reside in caches for a few more weeks or even longer. (This begs the question of whether it was prudent for Google to push so hard for ASAP disclosure.)
“I think it is extremely unlikely that efforts to scrub this data from caches around the world would go completely unnoticed by all intelligence agencies and criminal organisations. …it would not surprise me at all if we learn in the future that a well-resourced group did in fact extract and organize data leaked from Cloudflare. It is easier to just assume that every site you use was potentially affected in some way. It is also a good time to consider whether to discontinue use of services not offering multi-factor authentication.”
Michael Buckbee, engineer at Varonis, agrees, saying: “Google, Bing, Baidu and the entire contingent of other web-scrapers that exist on the Internet are also ‘users’ who make ‘requests’ to these sites. The difference is that they keep this data around and show it to whoever requests it. While efforts are furiously underway at these companies to purge their caches of this data, it's out there and really difficult to tell where it all resides.
“HTTP requests are aggressively cached all over the internet. From your regional ISP to corporate proxies to your browser. So some of this data is going to be in the wild somewhere for a long time.
“Nobody gamed out a response plan that started with ‘Let's assume that everything that passed through our website for the last 6 months was publicly accessible’. There are many services that rely upon the use of obscure API keys for their security. Now, suddenly every one of those keys needs to be revoked as access to an API key instantly lets you gain access to the features of the service. Similarly OAuth (which is in general a security win) exposes access and renewal tokens, which once compromised are a huge support nightmare to roll.”