Threat hunting?  Ditch the SIEM and use the principles of Big Data

Threat hunting is a practice that can be invaluable for cutting incident response times when attacks slip through an organisation's defences. Yet, according to the SANS Institute, security professionals confess an overall lack of competency in threat hunting.

This is partly because many organisations still rely on alerts from a SIEM (among other alerting systems). Most security teams painstakingly build models for indicators of compromise, receive alerts from their SIEM, and “do the best they can” to eliminate the intrusion. The results can be poor: dwell times of over 200 days, incident detection figures that flatline when they should be rising, and longer response times. Ultimately, over-reliance on SIEM is turning security analysts into alert analysts.

If we continue down this path, we will not see threat hunting flourish. If we instead apply the principles of Big Data, namely open-ended search and parsing-free data ingest, we can take the fight to the adversary.

New approach

Threat hunting is scientific in the broad sense of seeking explanations for why things happen.  As with any practice that follows the scientific method, it requires an openness to forming new hypotheses, theoretical “what ifs” from experience, controlling variables and making conclusions based on evidence.

To follow this method, we need tools that accelerate and amplify our human work, rather than technologies that brush aside the method in favour of operating within the technology's paradigm. Too many threat hunting programmes are sputtering because we continue to conform our work to the structure and arbitrariness of SIEM and other alerting systems.

Above all, we must recognise that the human brain is the best analytics engine in the known universe.  The human brain is equipped with reason, with empathy and with open-ended means of solving complex problems.  These uniquely human capabilities allow humans to use counterfactual simulation that no machine can match. 

Technology for world-class threat hunting should operate in a ‘human-first’ paradigm and provide the following capabilities:

Natural language extraction

Using natural language extraction (NLE), humans can tokenise every portion of a dataset without any need for parsing. This is how Google has indexed the vast Internet without prior knowledge of the content it consumes. These are Big Data principles, and the security challenge is a Big Data challenge. Natural language extraction lifts out the elements within the data, wraps them with tokens (markers, attributes), and thus provides an easy way to search all that data. With the data assembled and positioned for interrogation, humans can apply their powers of reason, empathy and open-endedness.
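
To make the idea concrete, here is a minimal Python sketch of parsing-free ingest. It is an illustration only, not how any particular product works: the sample events, the token pattern and the function names are all assumptions made for this example. Every line is split into tokens without knowing its format, and an inverted index maps each token back to the events that contain it.

```python
import re
from collections import defaultdict

# Hypothetical raw events in unrelated formats; no per-source parser is written.
RAW_EVENTS = [
    "2017-03-02 10:15:01 DENY tcp 10.0.0.5:51515 -> 203.0.113.9:443",
    '{"user": "alice", "action": "login", "src": "10.0.0.5"}',
    "sshd[212]: Failed password for root from 203.0.113.9 port 22",
]

# Illustrative token pattern: runs of letters, digits, dots, underscores, hyphens.
TOKEN = re.compile(r"[A-Za-z0-9_.\-]+")


def tokenise(line):
    """Split a raw line into lowercase tokens without knowing its format."""
    stripped = (t.strip(".-").lower() for t in TOKEN.findall(line))
    return [t for t in stripped if t]


def build_index(events):
    """Map each token to the set of event positions that contain it."""
    index = defaultdict(set)
    for pos, line in enumerate(events):
        for token in tokenise(line):
            index[token].add(pos)
    return index


def search(index, events, *terms):
    """Return the events that contain every search term (a simple AND search)."""
    hits = None
    for term in terms:
        found = index.get(term.lower(), set())
        hits = found if hits is None else hits & found
    return [events[i] for i in sorted(hits or ())]


INDEX = build_index(RAW_EVENTS)
```

Calling search(INDEX, RAW_EVENTS, "203.0.113.9") returns both the firewall line and the sshd line, even though no schema was ever defined for either source.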

Open-ended search

Hunters should not have to conform to a query language or closed restrictions – just type what you want to find.  Threat hunting often begins with a known indicator of compromise. Open-ended search finds those IOCs in your network, but goes beyond that with association mapping to reveal the relationships between entities, users, destinations, machines and chatter connected to that IOC. 
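
Association mapping can be approximated, very roughly, by counting what else appears alongside an IOC. The Python sketch below assumes a handful of invented events and a hand-picked stopword list; the names are hypothetical and a real engine would be far more discerning about what counts as an entity.

```python
import re
from collections import Counter

# Hypothetical events; the wording and hosts are invented for illustration.
EVENTS = [
    "alice workstation-7 connected to 203.0.113.9 port 443",
    "bob workstation-2 connected to 198.51.100.4 port 80",
    "alice workstation-7 uploaded 40MB to 203.0.113.9",
    "carol laptop-3 resolved evil.example then reached 203.0.113.9",
]

TOKEN = re.compile(r"[A-Za-z0-9_.\-]+")

# Filler words to ignore; an assumption that only suits this toy data.
STOPWORDS = {"to", "then", "port", "connected", "uploaded", "resolved", "reached"}


def associations(events, ioc):
    """Count the tokens that appear in the same events as the IOC."""
    ioc = ioc.lower()
    related = Counter()
    for line in events:
        tokens = {t.lower() for t in TOKEN.findall(line)}
        if ioc in tokens:
            related.update(tokens - STOPWORDS - {ioc})
    return related


RELATED = associations(EVENTS, "203.0.113.9")
```

Here RELATED ranks alice and workstation-7 highest because each co-occurs with the IOC twice, while bob, who never touched that address, does not appear at all.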

Hunters also tend to create their own self-defined IOCs by leveraging that mass of neural tissue in their skulls. For example: if I see A in this quantity, B in this quantity and C in this quantity, that is an 80 percent probability of an early indication of compromise. Open-ended search gives hunters that kind of flexibility, on demand.
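
A self-defined IOC of the "A, B and C in these quantities" kind might be written down as follows. This is a sketch under stated assumptions: the signal names, thresholds and hosts are invented, and the 0.8 confidence is simply the hunter's own estimate carried along as a label, not something the code computes.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class HunterRule:
    """An analyst-defined indicator built from minimum signal counts."""
    name: str
    minimums: dict      # signal name -> minimum count required
    confidence: float   # the hunter's own estimate, not a computed value


def matches(rule, observed):
    """True when every signal in the rule meets its minimum count."""
    return all(observed.get(signal, 0) >= minimum
               for signal, minimum in rule.minimums.items())


def hunt(rule, events):
    """Group (host, signal) pairs by host and return the hosts matching the rule."""
    per_host = {}
    for host, signal in events:
        per_host.setdefault(host, Counter())[signal] += 1
    return sorted(host for host, counts in per_host.items()
                  if matches(rule, counts))


# Hypothetical rule and observations.
RULE = HunterRule(
    name="possible early compromise",
    minimums={"failed_login": 3, "new_admin_process": 1, "rare_dns_lookup": 2},
    confidence=0.8,
)

EVENTS = (
    [("host-a", "failed_login")] * 3
    + [("host-a", "new_admin_process")]
    + [("host-a", "rare_dns_lookup")] * 2
    + [("host-b", "failed_login")] * 5
    + [("host-b", "rare_dns_lookup")]
)

SUSPECTS = hunt(RULE, EVENTS)
```

Only host-a satisfies all three thresholds. host-b has plenty of failed logins but too few of the other signals, which is the point: the hunter defines the combination rather than alerting on any single signal.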

Metadata enrichment

Searches can be made even more effective by applying metadata (data about the data).  This attribution is added to the data at ingest.  What do I know about that entity coming in for analysis?  Apply those attributes to enrich potential hunting results. 

Importantly, metadata enrichment serves the first two points by giving more to the NLE engine and adding more contextual details for our open-ended search.
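
As a rough illustration of enrichment at ingest, the Python sketch below attaches what is already known about an entity to each incoming event while leaving the raw text untouched. The asset inventory, attribute names and function name are assumptions made for this example.

```python
import re

# Hypothetical asset inventory; the owners and roles are invented.
ASSET_CONTEXT = {
    "10.0.0.5": {"owner": "alice", "role": "finance-workstation", "criticality": "high"},
    "10.0.0.9": {"owner": "build-bot", "role": "ci-server", "criticality": "medium"},
}

TOKEN = re.compile(r"[A-Za-z0-9_.\-]+")


def enrich(raw_line, source, context=ASSET_CONTEXT):
    """Attach known attributes to a raw event at ingest, keeping the raw text intact."""
    tokens = set(TOKEN.findall(raw_line))
    meta = {"source": source}
    for entity, attrs in context.items():
        # Match whole tokens so that 10.0.0.5 does not match 10.0.0.55.
        if entity in tokens:
            meta.setdefault("entities", {})[entity] = attrs
    return {"raw": raw_line, "meta": meta}


EVENT = enrich("DENY tcp 10.0.0.5:51515 -> 203.0.113.9:443", "firewall")
```

The enriched event now carries the owner, role and criticality of 10.0.0.5, so a later search for "high" or "finance-workstation" can surface a firewall line that never contained those words.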

Flexible data storage 

The world is awash with data.  We don't need another place to store it all, let alone pay the exorbitant rent for storage.  We need a better way to analyse it all.  Many organisations are building Big Data Lakes with Hadoop et al., and use analysis that sits atop such a data store.  This is a vital step in the right direction.

We also live in an on-demand world. Threat hunting needs to mirror this: the best data, assembled on demand and cleared when the hunt is complete. There doesn't need to be another repository and another meter running for data warehousing.


Performance at scale

Any solution that provides world-class threat hunting requires performance at scale.  Infrastructure is changing, threats are mutating daily, and it can be disorienting not knowing which way is “up”.  Any solution must scale with these challenges in mind.

Scale does not mean more computing power or greater volumes of data for analysis.  Scale is simply: one-to-many.  Threat hunters need to position themselves at a single console with the ability to pull together any data type, from any source, at any time.  

In summary, humans are the most essential part of any security programme. They need frictionless ways to work with data, be more productive, secure their environments, and apply their own methods to their tools. It's time to think differently.

Let us stop budgeting, buying, and consuming technologies based on Quadrants, Waves, Grids or any other categorisation that forces conformity to vendor solutions.  The threat landscape is evolving.  Solutions for threat hunters must evolve, too.

Contributed by Josh Mayfield, Immediate Insight, FireMon

*Note: The views expressed in this blog are those of the author and do not necessarily reflect the views of SC Media or Haymarket Media.