Researchers spot malware in encrypted traffic

Cisco researchers have managed to spot malware in encrypted traffic, marking a new development in the fight against malware.

No need to decrypt data to spot viruses

A group of Cisco researchers have managed to spot malicious traffic in encrypted streams without any need to decrypt the data. The discovery could pave the way for products that can secure networks while maintaining privacy.

According to a paper published on arXiv, malware within encrypted streams gives out enough clues to allow researchers to spot it.

Traffic encrypted using TLS is increasingly used by criminals to circumvent security products.

“The use of TLS by malware poses new challenges to network threat detection because traditional pattern-matching techniques can no longer be applied to its messages,” Blake Anderson, Subharthi Paul, and David McGrew said in their research paper.

With unencrypted data, it is simple enough to find malware, but it is much more difficult with encrypted data. The study looked at 18 malware families composed of thousands of unique malware samples and tens of thousands of malicious TLS flows.

The researchers found that, even in encrypted streams, malware traffic looks measurably different from enterprise traffic.

“While TLS obscures the plaintext, it also introduces a complex set of observable parameters that allow many inferences to be made about both the client and the server,” said the researchers.

The researchers used machine learning techniques to filter encrypted traffic and tag traffic streams with particular malware families. Among the features the researchers looked at were flow metadata, the sequence of packet lengths and times, byte distribution and unencrypted TLS header information. The malware investigated included Bergat, Deshacop, Dridex, Dynamer, Kazy, Parite, Razy, Sality, Skeeyah, Yakes, Zbot and Zusy.
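To make the feature list above more concrete, here is a minimal Python sketch of how one flow might be turned into a numeric vector before being handed to a classifier. It is an illustration only, not the researchers' code: the `Flow` record, its field names, the 150-byte bin size and the choice of a transition matrix to summarise packet lengths are all assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Flow:
    """Hypothetical record of what a passive sensor sees for one TLS flow."""
    packet_lengths: List[int]        # payload size of each packet, in order
    inter_arrival_ms: List[float]    # gap before each packet, in milliseconds
    payload: bytes                   # encrypted bytes observed on the wire
    offered_ciphersuites: List[int]  # read from the unencrypted ClientHello


def byte_distribution(payload: bytes) -> List[float]:
    """Relative frequency of each of the 256 byte values in the payload."""
    counts = Counter(payload)
    total = len(payload) or 1  # avoid dividing by zero on an empty payload
    return [counts.get(value, 0) / total for value in range(256)]


def length_transitions(lengths: List[int], bin_size: int = 150,
                       n_bins: int = 10) -> List[float]:
    """Row-normalised transition matrix over binned packet lengths, flattened.

    Bin size and bin count are assumed values chosen for illustration.
    """
    matrix = [[0.0] * n_bins for _ in range(n_bins)]
    bins = [min(length // bin_size, n_bins - 1) for length in lengths]
    for current, following in zip(bins, bins[1:]):
        matrix[current][following] += 1.0
    for row in matrix:
        row_sum = sum(row)
        if row_sum:
            row[:] = [value / row_sum for value in row]
    return [value for row in matrix for value in row]


def flow_features(flow: Flow) -> List[float]:
    """Concatenate metadata, length, byte and TLS header features."""
    metadata = [
        float(len(flow.packet_lengths)),   # number of packets
        float(sum(flow.packet_lengths)),   # total bytes
        float(sum(flow.inter_arrival_ms)), # approximate duration
    ]
    tls_header = [float(len(flow.offered_ciphersuites))]
    return (metadata
            + length_transitions(flow.packet_lengths)
            + byte_distribution(flow.payload)
            + tls_header)
```

None of these features requires the payload to be decrypted: each one is computed from sizes, timings, raw byte counts or handshake fields that are visible to any observer on the network.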

But the researchers acknowledged that malware could evade this detection by changing the way it uses TLS.

“Importantly, we identify and accommodate the bias introduced by the use of a malware sandbox. The performance of a malware classifier is correlated with a malware family's use of TLS, i.e., malware families that actively evolve their use of cryptography are more difficult to classify,” the researchers said.

The researchers said that with machine learning they managed to get an accuracy of “90.3 per cent for the family attribution problem when restricted to a single, encrypted flow, and an accuracy of 93.2 per cent when we make use of all encrypted flows within a five-minute window.”
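The article does not say how the flows in a five-minute window were combined, so the following Python sketch shows just one plausible approach: group per-flow predictions by host and by fixed five-minute bucket, then take a majority vote. The tuple layout, the function names and the use of fixed rather than sliding windows are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

WINDOW_SECONDS = 5 * 60  # the five-minute window mentioned by the researchers

# (host, flow start time in seconds, family predicted for that single flow)
FlowPrediction = Tuple[str, float, str]


def group_by_window(
    predictions: Iterable[FlowPrediction],
) -> Dict[Tuple[str, int], List[str]]:
    """Bucket per-flow predictions by host and fixed five-minute window."""
    groups: Dict[Tuple[str, int], List[str]] = defaultdict(list)
    for host, start_time, family in predictions:
        window_index = int(start_time // WINDOW_SECONDS)
        groups[(host, window_index)].append(family)
    return dict(groups)


def majority_family(families: List[str]) -> str:
    """Return the most common predicted family in one window."""
    return Counter(families).most_common(1)[0][0]
```

The intuition is that a single flow gives the classifier little to work with, whereas several flows from the same infected host in a short period offer more evidence, which is consistent with the higher accuracy the researchers reported for the windowed case.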

Matt Hampton, chief technology officer at Imerja, told SCMagazineUK.com that for antivirus products to use this kind of technique, the sandbox in which they test executable malware would need to be capable of analysing the SSL traffic, rather than simply identifying the connectivity.

“The research is aimed more at network-based anti-virus looking for compromised systems, such as Check Point's Anti-Bot, identifying the traffic as it comes in. It doesn't definitively identify the infection, but does provide additional indicators for the decision to be made whether the traffic is ‘malicious’,” he said.

He added that he expected the technology to be deployed within the next six months, “hopefully sooner, as it's relatively simple to add to existing security products.”

Craig Parkin, associate partner at Citihub Consulting, told SC that the key feature in the technology is machine learning and AI. He said it was “featuring more and more in security products and vendor tools.”

“We'll almost certainly see it in Cisco's offering as they sponsored the research,” he said.

Parkin warned that it wouldn't be long before cyber-criminals managed to overcome this sort of detection: “The products work by spotting predictable patterns. If cyber criminals make traffic less predictable, it becomes harder or impossible to spot again.”