Security - why it's the burning issue of the HPC future

New approaches are needed to overcome security concerns around big data analysis, suggests Andy Grant, with 'containerising' data and merging data on the fly among the options.


As cost barriers come down, high-performance computing (HPC) is increasingly moving out of the cloistered worlds of research and academia into fiercely competitive commercial environments. But as businesses' usage of HPC grows, so too do their data security concerns.

One of the big issues is the changing focus of HPC. Its traditional role was as a modelling and simulation tool, developing weather forecast models, building car engines or designing drugs, for example. Today, all that has changed. The focus is on the data itself – whether that is from next generation networks, Twitter feeds or CCTV cameras – and this requires a different kind of architecture.

Businesses are generating more and more big data sets, and they are doing so at greatly reduced costs. Unfortunately, the ability to analyse that data has not increased at the same rate, so they end up with warehouses bursting with difficult-to-analyse data – and a big security headache.

Much of the data processed through these kinds of approaches is sensitive. This is true of most financial data – but the issue is arguably most pressing in healthcare where the emergence of genome sequencing for individuals is currently leading to a growing gap between the availability of personal and private data and the ability to manage it effectively.

So what's the solution? Larger laboratories may be eyeing more powerful supercomputers to analyse this data, but the smaller labs, universities, and hospitals, with limited funding and resources, are more inclined to adopt cloud models to manage this kind of work.

But there are serious challenges to overcome here. If the data is continuously changing and the business is constantly having to move it to a remote location to work on it, the cost (in time and resource) is likely to offset any benefits of using a remote service. More importantly, because we are talking about genetic data, there are particular sensitivities around privacy - not only relating to the individual concerned but also their parents and children, with potentially serious impacts on insurance premiums.

Given all these issues, it is unlikely that anybody would be comfortable having all of their healthcare data residing in a standard public cloud environment. And while the nature of the challenge is different in every case and with every vertical, the core data management issue remains similar across a range of sectors.

In finance, for example, you could be looking at how to protect individuals from fraud or identity theft. You might also be looking at ways of enabling hedge fund managers to take snapshots of your data and assess your risk position without rendering that data susceptible to hacking or putting its integrity at risk.

So, how can these concerns best be allayed? What kinds of systems and architectures do organisations need to put in place to protect their data in the cloud? The traditional approach is to use firewalls to protect sensitive data. Unfortunately, firewalls can make it difficult for legitimate users to access key data, external hackers can often gain access via the back door – and, critically, they provide no protection against internal threats.

One approach used in healthcare to protect sensitive data is to deliver a dynamic approach to data management and analysis. Typically, this involves merging data sets on the fly to create a combined database that provides greater insight in response to data analysis queries, enabling healthcare providers to identify key risk groups, for example. Critically, once the analysis is completed, the database and the data it contains are destroyed, eliminating any security threat at the same time.
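As a toy illustration of this 'merge, analyse, destroy' pattern (the table names, fields, and records here are hypothetical, not drawn from any real system), separate data sets can be merged on the fly into a purely in-memory database, queried, and then discarded in their entirety:

```python
import sqlite3

# Hypothetical patient and lab-result records held in separate source systems.
patients = [(1, 1957), (2, 1984), (3, 1949)]                # (patient_id, birth_year)
results = [(1, "marker_x", 1), (2, "marker_x", 0), (3, "marker_x", 1)]  # (patient_id, test, positive)

# Merge the data sets on the fly into an in-memory database that never
# touches disk.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE patients (id INTEGER, birth_year INTEGER)")
db.execute("CREATE TABLE results (patient_id INTEGER, test TEXT, positive INTEGER)")
db.executemany("INSERT INTO patients VALUES (?, ?)", patients)
db.executemany("INSERT INTO results VALUES (?, ?, ?)", results)

# Run the analysis against the combined view, e.g. count positive tests
# among patients born before 1960 (a hypothetical risk group).
(count,) = db.execute(
    "SELECT COUNT(*) FROM patients p JOIN results r ON p.id = r.patient_id "
    "WHERE r.positive = 1 AND p.birth_year < 1960"
).fetchone()
print(count)  # 2

# Closing the connection destroys the merged database and the data it held,
# eliminating the security exposure along with it.
db.close()
```

A production system would of course add access controls and audit logging around the query step; the point of the sketch is only that the combined data set exists for no longer than the analysis itself.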

Another option is the use of virtual machines in cloud environments. Software tools are now emerging that effectively create containers to 'wrap around' specific jobs and 'seal' them, thereby providing a comprehensive security capability even within a multi-tenanted environment.
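The 'wrap and seal' idea can be sketched at its most basic level using only the Python standard library: run the job in a throwaway working directory with a scrubbed environment, then destroy everything it produced. This is a deliberately minimal stand-in, not real container technology – tools such as Docker or Singularity add kernel-level isolation on top of this – and the job string below is purely hypothetical:

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical analysis job: writes a result file into its working directory.
job = "open('output.txt', 'w').write('result')"

with tempfile.TemporaryDirectory() as workdir:
    completed = subprocess.run(
        [sys.executable, "-c", job],
        cwd=workdir,                # the job sees only its own scratch area
        env={"PATH": os.defpath},   # no credentials or secrets are inherited
        capture_output=True,
        text=True,
    )
    produced = os.path.exists(os.path.join(workdir, "output.txt"))

# Leaving the with-block deletes workdir and every file the job wrote,
# 'sealing' the job's data away from other tenants of the same machine.
print(completed.returncode, produced)
```

Real multi-tenant isolation also has to constrain what the job can see over the network and in shared memory, which is exactly where container runtimes and virtual machines come in.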

This is a rapidly evolving technology area. It has not really settled down yet in terms of core standards, but reliable solutions are now beginning to emerge – and their roll-out is likely to continue apace, as in the world of HPC, with its rapidly accelerating data growth rates, having robust security systems in place is increasingly a necessity.

Contributed by Andy Grant, director, HPC and Big Data Practice, Bull UK, An Atos company