Digital forensics is still a relatively new science in the mainstream security space, but the intelligence services have been using such techniques against the perceived enemies of the day since the earliest days of computers.
Today enemies of the state are often terrorists and specifically include Al Qaeda.
A key question when dealing with Al Qaeda, however, is what can be done on the tight budgets of government agencies. According to Gregor Stewart, director of product management with Basis Technology, quite a lot can be achieved through the use of advanced digital forensics.
In his presentation at Counter Terror Expo - entitled `Delivering Mission-Critical Answers for Today's Intelligence Community' - Stewart explained that, when data is extracted from a suspected terrorist's computer, it is a triage situation, with forensics staff duplicating the hard drive and then analysing the resulting metadata.
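As a rough illustration of that triage step, the Python sketch below checks a duplicated image against a hash and lists basic file metadata, newest first. It is a minimal sketch using only the standard library, not the tooling Stewart described; the function names `sha256_of` and `file_metadata` are hypothetical, and it assumes the duplicated drive is available as an ordinary file or mounted directory.

```python
import hashlib
import os
from datetime import datetime, timezone


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so that a large disk image need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_metadata(root):
    """Collect path, size and modification time for every file under root."""
    records = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            records.append({
                "path": path,
                "size": info.st_size,
                "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
            })
    # Most recently modified files first, as a simple triage ordering (an assumption).
    return sorted(records, key=lambda record: record["modified"], reverse=True)
```

Comparing the hash of the original with the hash of the copy confirms that the duplicate is bit-for-bit identical before any analysis starts.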
One problem with handling Al Qaeda data in the West is that the text is usually in Arabic. Stewart says, however, that non-Arabic speakers can advance the forensics process using a technique called `gisting,' whereby English speakers let computer software highlight names in the data stream and then cross-reference them against a database of known persons.
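The cross-referencing idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Basis Technology's method: the watchlist, its identifiers and the function names `normalise` and `gist` are hypothetical, the names are assumed to be already transliterated into Latin script, and matching is a simple whole-word comparison after case and diacritic folding.

```python
import re
import unicodedata
from typing import Dict, List, Tuple


def normalise(text: str) -> str:
    """Fold case, strip combining marks and collapse whitespace for comparison."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def gist(text: str, watchlist: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return (name, record id) pairs for watchlist names that occur in the text."""
    haystack = normalise(text)
    hits = []
    for name, record_id in watchlist.items():
        # Word boundaries stop a short name from matching inside a longer word.
        pattern = r"\b" + re.escape(normalise(name)) + r"\b"
        if re.search(pattern, haystack):
            hits.append((name, record_id))
    return hits
```

Given a hypothetical watchlist such as `{"Jean Dupré": "P-001"}`, a document mentioning "JEAN DUPRE" would be flagged for a human analyst, while unrelated words that merely contain part of a name would not.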
"From there, full extraction of interesting text can be carried out, with the text translated and human enrichment techniques applied," he said, adding that by cataloguing data in the cloud, information can be added at high speed, preparing the stage for complete forensics analysis by linguistic analysts.
"Human enrichment of the data is a powerful step in the process of data analytics. It allows the intelligence services to home in on high quality data with minimal staff resources," he said.
"The end result is the computer-assisted analysis of names and contextual data allows a more refined analysis to be carried out and on a very tight budget," he added, noting that the use of gisting involves faster translations to be carried out - but without sacrificing any quality in the process.
So what can be learned from digital forensic linguistic analysis in the security space?
SCMagazineUK.com spoke to staff at Basis Technology's stand at the CT Expo event, where they explained that the firm's Rosette linguistics platform has been used as the main Asian-language linguistic technology behind Google's Chinese, Japanese and Korean search engines.
The Rosette platform is billed as using automated and state-of-the-art natural language processing techniques to improve information retrieval, text mining and other applications.
In use, Rosette provides capabilities such as identifying the language of incoming text, providing a normalised representation in Unicode, and locating names, places and other key concepts in a body of unstructured text.
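To give a feel for the first two of those capabilities, the Python sketch below uses only the standard `unicodedata` module. It does not use or represent the Rosette API; the function names are hypothetical, and guessing the dominant script from Unicode character names is a much cruder stand-in for true language identification.

```python
import unicodedata
from collections import Counter


def dominant_script(text):
    """Guess the dominant script from the first word of each letter's Unicode name."""
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            # Names look like "ARABIC LETTER MEEM" or "LATIN SMALL LETTER A".
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    return counts.most_common(1)[0][0] if counts else "UNKNOWN"


def normalise_unicode(text):
    """Return the NFKC form, folding compatibility variants to standard characters."""
    return unicodedata.normalize("NFKC", text)
```

In this sketch Arabic text yields `ARABIC`, Korean text yields `HANGUL`, and full-width Latin letters are folded to their ordinary ASCII forms by the normalisation step, so that later name matching compares like with like.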