Are data loss prevention and signature-based anti-virus living on borrowed time?

Should fingerprint-based data leakage protection be declared dead? asks Peter Tyrrell, suggesting it just doesn't scale for the hyper-connected world.

Fingerprint-based data loss prevention (DLP) and signature-based antivirus (AV) have long been security industry mainstays, but with the pace of business outstripping the technology they rely on, are their fates already sealed?

Signature-based antivirus (AV) software was originally developed to detect and remove common viruses from computers, hence the name. The first AV signatures were simply hashes of entire files, or sequences of bytes that represented a particular virus. Since the software's inception, AV providers have relied heavily on signatures to identify and contain viruses, now more commonly grouped under the term malware.

When a malware sample reaches an AV firm, it is analysed by security researchers or by automated analysis systems. Once the sample is confirmed to be malware, a signature is extracted from the file and added to the AV software's signature database. When a file has to be scanned, the AV engine compares its content with every malware signature in that database. If the file matches a signature, the engine knows which malware it is and which procedure to follow to contain and clean the infection.
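To make that workflow concrete, here is a minimal sketch of a signature scan in Python. It is an illustration only, not how any particular AV product works: the sample bytes, the malware names and the `scan` function are all invented for the example, and it assumes the two signature styles mentioned above (a whole-file hash and a characteristic byte sequence).

```python
import hashlib
from typing import Optional

# Hypothetical "captured sample" standing in for a real malware file.
SAMPLE = b"MZ-pretend-malicious-payload-v1"

# Hypothetical signature database: whole-file hashes and byte sequences,
# each mapped to an invented malware name.
HASH_SIGNATURES = {hashlib.sha256(SAMPLE).hexdigest(): "Example.Trojan.A"}
BYTE_SIGNATURES = {b"pretend-malicious-payload": "Example.Trojan.A (variant)"}


def scan(content: bytes) -> Optional[str]:
    """Return the matching malware name, or None if no signature matches."""
    # 1. Whole-file signature: hash the content and look it up.
    digest = hashlib.sha256(content).hexdigest()
    if digest in HASH_SIGNATURES:
        return HASH_SIGNATURES[digest]
    # 2. Byte-sequence signature: search for each known pattern.
    for pattern, name in BYTE_SIGNATURES.items():
        if pattern in content:
            return name
    # No signature on file: the scanner has nothing to say about this content.
    return None


if __name__ == "__main__":
    print(scan(SAMPLE))                        # known sample
    print(scan(b"never seen before payload"))  # unknown content
```

Note that the last branch is the important one: content with no signature on file simply comes back as `None`, whether it is harmless or not.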

Signature-based detection can be very effective but, clearly, it cannot defend against a piece of malware until samples have been obtained, a signature has been generated and the anti-virus product's signature database has been updated. Signature-based detection systems rely on the assumption that, generally speaking, the more infectious a piece of malware is, the faster it reaches researchers. So although the approach is not perfect, it does offer protection from the most widespread threats.

However, this approach is not effective against zero-day attacks, next-generation malware or any malware that has not yet been encountered and analysed. In addition, new and more sophisticated malware is created every day, so signature-based detection requires frequent updates to the signature database. The result: AV has struggled to keep pace with the creativity of today's malware authors, and we hear about new threats in the news before the AV firms have had a chance to create a signature.

The comparison to traditional, network-based data loss prevention (DLP) is fairly obvious. DLP was originally architected to search for “known” file types travelling across the network that may contain sensitive data, such as personally identifiable information (PII) or payment card information (PCI), in order to protect that data from loss or to comply with regulatory standards like HIPAA and PCI-DSS. The most popular way for DLP solutions to identify this data is a technique called fingerprinting.

Fingerprinting takes a snapshot of a document and stores a hash of that file in a database: in essence, a static “signature” of the document. Anything that matches that hash as data interactions occur can then invoke a policy (e.g. block these unencrypted credit card numbers from leaving my network). As it was with AV signatures, so it is with data fingerprints: the fingerprinting approach just doesn't scale in today's hyper-connected and fast-paced business environment.
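As a rough sketch of the fingerprint-then-enforce idea, the Python below registers a document's hash and checks outbound data against it. Everything here is an assumption made for illustration: the `FingerprintPolicy` class, its method names and the sample document are hypothetical, and commercial DLP products may fingerprint in more elaborate ways than a single whole-file hash.

```python
import hashlib
from typing import Dict, Optional, Tuple


def fingerprint(document: bytes) -> str:
    """Static fingerprint of a document: here, simply its SHA-256 digest."""
    return hashlib.sha256(document).hexdigest()


class FingerprintPolicy:
    """Hypothetical fingerprint database with a block-on-match policy."""

    def __init__(self) -> None:
        self._protected: Dict[str, str] = {}  # fingerprint -> label

    def register(self, label: str, document: bytes) -> None:
        # Must be done up front for every document that needs protecting.
        self._protected[fingerprint(document)] = label

    def check_outbound(self, payload: bytes) -> Tuple[str, Optional[str]]:
        # Block only if the payload's hash is already in the database.
        label = self._protected.get(fingerprint(payload))
        return ("block", label) if label is not None else ("allow", None)


if __name__ == "__main__":
    policy = FingerprintPolicy()
    card_file = b"cardholder,number\nA. Example,4111 1111 1111 1111\n"
    policy.register("unencrypted card data", card_file)
    print(policy.check_outbound(card_file))
    print(policy.check_outbound(b"lunch menu for Friday"))
```

The policy can only act on documents that someone has already registered, which is the same known-sample dependency that signature-based AV has.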

The first problem with fingerprinting is the very large amount of work required to scan all document databases and file shares, and that work must be completed before any meaningful data protection occurs. The second problem is that it doesn't work when an employee is off the corporate network, since the fingerprint database is located on the network. But the biggest issue is that every time a file is altered it must be re-fingerprinted, as the original “hash” of the file will no longer match. You can easily envision the number of people and the processing power this would take for a global enterprise generating terabytes' worth of new documents in the normal course of business. No doubt that's also why fingerprinting-based DLP is now commonly referred to as a “Disastrously Long Project” by security practitioners.
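The re-fingerprinting problem is easy to demonstrate. The short Python sketch below assumes the fingerprint is a whole-file SHA-256 hash, which is an illustrative simplification; the document text and variable names are invented.

```python
import hashlib

# Hypothetical sensitive document and a lightly edited copy of it.
original = b"Q3 customer list: Alice, Bob, Carol"
edited = original + b", Dave"

# The fingerprint taken when the document was first scanned.
stored_fingerprint = hashlib.sha256(original).hexdigest()

# The unchanged file still matches the stored fingerprint...
original_matches = hashlib.sha256(original).hexdigest() == stored_fingerprint
# ...but a one-name edit produces a completely different digest.
edited_matches = hashlib.sha256(edited).hexdigest() == stored_fingerprint

if __name__ == "__main__":
    print("original matches:", original_matches)
    print("edited matches:  ", edited_matches)
```

Until the edited copy is scanned and fingerprinted again, a hash-based policy has no record of it, even though it contains everything the original did.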

Not to go all Donald Rumsfeld on you, but it's likely becoming clear by now that DLP and AV solutions are very similar in that you must have a known known for the products to work: a malware signature or a document fingerprint. Neither AV nor DLP can keep pace with the ever-changing malware landscape or the massive volumes of sensitive data generated in today's business world. To borrow from Rumsfeld again, the number of unknown unknowns, whether malware variants or structured and unstructured documents, has grown exponentially in the past ten years, and the old technology solutions simply no longer work.

Symantec, the original creator of signature-based anti-virus software, made headlines about six months ago when its senior vice president for information security, Brian Dye, proclaimed that AV software was "dead." Symantec is also one of the largest players in the fingerprint-based DLP space, and we're wondering when it might announce that's dead too.

Contributed by Peter Tyrrell, COO, Digital Guardian