Millions of ‘evil twin’ search engine bots are infiltrating websites daily to carry out DDoS attacks, hacking, spam, content theft and other shady activities, according to US security firm Incapsula.
Over a 30-day period, Incapsula observed 400 million search engine visits to 10,000 sites and found more than four percent of these bots – about one in 25 – were imposters.
Incapsula analysed the purpose of these fake bots and discovered about one-third were explicitly malicious. That equates to the 10,000 sites suffering more than 1.3 million malicious search engine visits a day.
The spoof bots are being used for a range of purposes, the company said. Around two-thirds gather marketing intelligence, but around a quarter are being deployed to carry out Layer 7 DDoS attacks, with 5.3 percent used for scraping, 3.8 percent for spamming and 1.7 percent for hacking.
In a 24 July blog post describing the problem, Incapsula's Igal Zeifman said fake Googlebots are particularly successful at infiltrating sites because they exploit soft security rules put in place by admins who worry that blocking real Google visits would damage their SEO rating and visibility.
“Just consider the benefits that come with fake Google credentials,” Zeifman wrote. “For one, ‘Google ID’ is as close as a bot can get to having a VIP backstage pass for every show in town.
“After all, most website operators know that to block Googlebots is to disappear from Google. Consequently, to preserve their SEO rankings, these website owners will go out of their way to ensure unhindered Googlebot access to their site, at all times.”
Zeifman said site operators who use rate-limiting security solutions for protection – rather than case-by-case traffic inspection – are unable to distinguish real Googlebots from fakes.
As a result, when an attack alarm goes off, they are presented with a harsh ‘all or nothing’ dilemma: to block all Googlebot agents and risk loss of traffic, or to allow all Googlebots in and suffer downtime.
On how to guard sites, Zeifman said: “The good news is that Fake Googlebots can be accurately identified using a combination of security heuristics, including IP and ASN verification – a process which allows you to identify bots based on their point of origin.”
But he adds: “Even these practices rely on excessive processing power and software capabilities not typically available to the regular website owner.”
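As an illustration of what identifying a bot by its point of origin can involve, the sketch below uses reverse DNS with forward confirmation, a widely documented check for crawler traffic. It is not Incapsula's method, which the company did not detail here. The function names, the injectable resolver parameters and the trusted hostname suffixes are assumptions made for this example.

```python
import socket

# Assumption: hostname suffixes accepted as genuine Google crawler hosts.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse_lookup(ip):
    """Return the PTR hostname for an IP address, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def _forward_lookup(hostname):
    """Return the set of addresses a hostname resolves to (empty on failure)."""
    try:
        return {info[4][0] for info in socket.getaddrinfo(hostname, None)}
    except OSError:
        return set()


def is_verified_crawler(ip, reverse=_reverse_lookup, forward=_forward_lookup):
    """Check a client IP with reverse DNS followed by forward confirmation.

    The resolvers are parameters (a hypothetical design choice) so that the
    logic can be exercised without network access.
    """
    hostname = reverse(ip)
    if not hostname:
        return False
    hostname = hostname.rstrip(".").lower()
    # Step 1: the PTR record must sit under a trusted crawler domain.
    if not hostname.endswith(TRUSTED_SUFFIXES):
        return False
    # Step 2: that hostname must resolve back to the same IP; otherwise the
    # PTR record could have been set by whoever controls the address block.
    # Addresses are compared as plain strings, which is adequate for IPv4.
    return ip in forward(hostname)
```

Calling `is_verified_crawler(client_ip)` on a request that claims to be Googlebot performs two live DNS lookups, so a real deployment would probably cache the results per address.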
The main countries of origin of the fakes are the US (25.16 percent), China (15.61 percent), Turkey (14.7 percent), Brazil (13.49 percent), India (8.4 percent) and Thailand (4.07 percent).
Zeifman said that, with the exception of Brazil, these are the ‘usual suspects’ identified in an Incapsula DDoS attack study a few months ago.
He speculates: “Does this have something to do with the World Cup? We can't honestly say. Still, these numbers may have something to do with the myriad internet devices brought into the country by a million tourists, some of whom probably should pay more attention to what they download.”
Commenting on Incapsula's findings, independent security expert Graeme Batsman, security director of EncSec, said that the problem of fake bots is not new, but agreed with the researchers that it is difficult to protect against.
He told SCMagazineUK.com by email: “This problem has been around before 2010. It is a little bit like using Adobe PDFs or Microsoft Word documents to infect people – very few people would suspect this method.
“Everyone wants search engines to scan them, find new content, add this online and in time increase ranking. This is why few people would dream about blocking Googlebots and others (Bing, Yahoo, Ask, etc). Blocking search bots outright may protect you but will have a negative impact of SEO rankings.”
Batsman added: “Mitigating them is not for novices. Options include using cloud or on-premise advanced WAFs (web application firewalls), manually updating known bad bots or comparing bots to genuine IPs listed by Google and others.”
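The last option Batsman mentions, comparing bots against the genuine IPs listed by search engines, could look roughly like the following sketch. The address ranges are placeholders taken from the reserved documentation blocks, not real crawler ranges, and the names `PUBLISHED_RANGES`, `claimed_crawler` and `classify` are hypothetical. A real deployment would load the ranges each search engine actually publishes.

```python
import ipaddress

# Placeholder CIDR ranges (reserved documentation blocks), NOT real crawler
# ranges. Replace with the lists published by each search engine.
PUBLISHED_RANGES = {
    "googlebot": ["192.0.2.0/24"],
    "bingbot": ["198.51.100.0/24"],
}


def claimed_crawler(user_agent, ranges=PUBLISHED_RANGES):
    """Return the crawler name a user agent claims to be, or None."""
    ua = user_agent.lower()
    for name in ranges:
        if name in ua:
            return name
    return None


def classify(user_agent, ip, ranges=PUBLISHED_RANGES):
    """Label a request as 'verified', 'impostor' or 'not-a-crawler'."""
    name = claimed_crawler(user_agent, ranges)
    if name is None:
        return "not-a-crawler"
    try:
        address = ipaddress.ip_address(ip)
    except ValueError:
        # An unparseable address cannot be matched to any published range.
        return "impostor"
    for cidr in ranges[name]:
        if address in ipaddress.ip_network(cidr):
            return "verified"
    # The user agent claims to be a crawler, but the source IP says otherwise.
    return "impostor"
```

The user agent is used only to decide which list to check against; the verdict itself rests on the source address, since a user agent string is trivially forged.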
Paco Hope, principal consultant with Cigital, agreed this is not a new problem, saying “impersonating browsers, mobile devices or web crawlers has long been possible”.
Hope said it is easier for large corporate websites to protect against fake bots than it is for smaller companies, telling SCMagazineUK.com: “Commercial network protection services can cross-reference clients that claim to be a well-known web crawler like Google with other data, and reject fake crawlers. Thus, large enterprises routinely protect themselves by using commercial-edge network services and content distribution networks.
“Small and medium-sized websites are somewhat left to their own devices. And they would be foolish to make decisions about a client's intentions based on the way it identifies itself in its user agent.”