Document security: Safe words

Feature by Rob Buckley

Choosing the right content management system is crucial to keeping your digital files secure. But it's only the first step. Rob Buckley reports.

Ever since the arrival of the PC, the paperless office has been a dream for many. Convert everything that was on paper to a digital equivalent and you can reduce storage requirements, locate and search through text with ease and make copies you can send to anyone, anywhere, provided they have a computer. Moving away from paper has its problems, however, many of which are rooted in the very ease of access that makes digital documents so attractive to users.

Much of the debate on enterprise document security has focused on content management systems. The terminology around these can be confusing: document management systems are typically used for paper as well as digital documents, while web content management systems only look after websites. Some attempts have been made to come up with all-encompassing terms, such as enterprise content management (ECM) systems. Yet even these leave out records management, the part of the process that's concerned with retention and deletion.

Whatever term is used, ECM systems are principally the middle part of the information lifecycle management (ILM) process, as storage vendors have chosen to describe it. The ECM can look after the storage and security surrounding a digital document once it's been loaded in. But the document still needs to be created and might be archived out of the ECM towards the end of the lifecycle. So other forms of security are needed.

"The first thing I recommend is looking at the lifecycle of information and how it's distributed," says Niek Ijzinga, managing consultant for information security and project manager for LogicaCMG's Security Competence Centre. "Then you can see how to control that lifecycle. That's not just a technical issue, although you need the right software to support it.

"It's also about business processes and cultural issues. It's nonsense to introduce ECM when you don't even know what information you're going to be storing in it," he points out.

Documents are usually either created electronically within desktop applications or scanned in from a hard copy. Whether other types, for example emails and web pages downloaded by end-users, need to be included in the document management system is something each organisation needs to consider as part of its ILM audit. Compliance requirements could well mandate that all these kinds of documents need to be stored and an audit trail kept. Transaction information is often a particular concern of financial services companies.

Digital documents are an easy prospect from the outset. Most ECMs come with plug-ins for standard applications, such as Microsoft Office and Lotus Notes, that force users to store new documents with appropriate permissions and metadata in the ECM. Paper documents make matters harder. These will usually be scanned in-house. There is, of course, the initial consideration of who has access to what kinds of documents, since some post might be confidential and only certain members of staff may be cleared to see it.

Finding the right solution
Then comes the question of which system to use. Smaller companies will often find that the average ECM, even something relatively modest such as Microsoft's SharePoint Services, which are built into Windows Server 2003, is overkill for their needs. More powerful ECMs with a full gamut of security mechanisms and features, such as those available from OpenText, IBM, EMC, Oracle/Stellent or Interwoven, will be way outside their budgets and management capabilities. Few choose to spend the time and consultancy money necessary to implement open-source systems such as Nuxeo or Alfresco.

Often, bespoke implementations or simple systems that take advantage of the permissions of a standard file server can be a viable, if cumbersome, alternative. However, these won't offer most of the high-end ECMs' standard tools - such as workflow, versioning, preventing the creation of multiple instances of the same document, indexing, audit trails and business rules. Security in the more basic systems can be lax as well: certain file servers will only provide permissions for individual directories, not for each individual document, for example.

A sufficiently sophisticated document management implementation will be able to put the output from the scanner directly into the ECM, attaching metadata, setting permissions and adding an audit trail as soon as it is created. Since most scanning software performs optical character recognition to capture text and make it searchable when saved, PDF tends to be the file format of choice at this stage. This means other security measures can be added, such as digital signatures and password protection.
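As a rough illustration of that ingest step, the following minimal sketch attaches metadata, sets permissions and opens an audit trail at the moment a scan is stored. Everything in it is hypothetical: the `ingest_scan` function, the `StoredDocument` class and the required metadata fields are invented for the example and do not correspond to any particular ECM's interface.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical policy: the metadata fields every scanned document must carry.
REQUIRED_METADATA = {"document_type", "owner"}


@dataclass
class StoredDocument:
    content: bytes
    metadata: dict
    readers: set
    audit_trail: list = field(default_factory=list)


def ingest_scan(content: bytes, metadata: dict, readers: set, operator_id: str) -> StoredDocument:
    """Store a scan only if its metadata is complete, and log who loaded it."""
    missing = REQUIRED_METADATA - set(metadata)
    if missing:
        raise ValueError(f"missing metadata: {sorted(missing)}")
    doc = StoredDocument(content=content, metadata=dict(metadata), readers=set(readers))
    # The first audit entry is written as part of creation, not afterwards.
    doc.audit_trail.append({
        "event": "created",
        "by": operator_id,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return doc


scan = ingest_scan(
    b"%PDF-1.4 ...",
    {"document_type": "invoice", "owner": "accounts"},
    readers={"accounts"},
    operator_id="scanner-op-17",
)
```

Note that the audit entry is only as trustworthy as the `operator_id` passed in: if several scanning staff share one login, the trail records the shared ID rather than the person.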

Security-conscious organisations will often set up a separate virtual local area network (VLAN) that integrates with Active Directory for their print and scan infrastructure, to ensure that no one can access anything other than the ECM, according to Paul Birkett, sales operations manager at Xerox Office Services.

However, technical superiority can lead many organisations to think they have their security bases covered at this point. Simon Harvey, technical marketing director of OpenText, argues that forcing users to add metadata at this point can be perceived as "a burden" that some will try to avoid. This will compromise audit trails that rely on this metadata. The only ways around this problem are training, reducing input time or some kind of incentive.

Stewart Mellor, a consultant for document management specialist Digital Vision, says that many companies with a scanning department have several staff who share one user ID to log into the ECM. "With an incoming paper document, you won't be sure who the person ultimately responsible for the document will be."

Some organisations might outsource the scanning process to another company. This creates integration and security problems of its own. Will the third party need access to the ECM? How will the connection - and the documents - be secured? How will metadata be added?

Inbuilt security measures
Once within the ECM, security will usually be good. Many models store files within a standard NT file system; others as binary large objects within a database. Access to the files and database will be via login and password, with permissions on each document linked to particular users, groups or roles. Often the ECM will store an ID and password for each user; this will typically alias the user to another authentication system, such as Active Directory.
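A minimal sketch of per-document permissions linked to users, groups or roles might look like the following. The `User`, `Document` and `is_allowed` names, and the tuple layout of the access-control list, are assumptions made for illustration; real ECMs each have their own permission model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    user_id: str
    groups: frozenset[str] = frozenset()
    roles: frozenset[str] = frozenset()


@dataclass
class Document:
    name: str
    # Each entry is (principal_kind, principal_name, action),
    # for example ("group", "hr", "read").
    acl: set = field(default_factory=set)


def is_allowed(user: User, doc: Document, action: str) -> bool:
    """Grant access if the user, one of their groups or one of their roles is listed."""
    if ("user", user.user_id, action) in doc.acl:
        return True
    if any(("group", g, action) in doc.acl for g in user.groups):
        return True
    return any(("role", r, action) in doc.acl for r in user.roles)


alice = User("alice", groups=frozenset({"hr"}))
bob = User("bob")
carol = User("carol", roles=frozenset({"records-manager"}))

contract = Document(
    "contract.pdf",
    acl={("group", "hr", "read"), ("role", "records-manager", "write")},
)
```

Here the permissions sit on the document itself, so two files in the same folder can have different readers, which a file server with directory-level permissions only cannot express.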

This approach isn't foolproof, however. Digital Vision's Mellor says that companies will often change their Active Directory's domain names, only to find that users can't access documents any more. Equally, changing the user ID and password in the ECM will prevent the use of a single sign-on.

Physical security, through encryption of the server hard drive, is available as an option in most ECMs, although many rely on a third-party solution for encryption. OpenText's Harvey cautions against using encryption as a rule since it slows down access to files. Meanwhile, Ijzinga insists that the performance loss isn't great and is typically not something most organisations are concerned about in an ECM.

For extra protection, some ECMs will hide where they store files by altering filenames or locations, either natively or by using the content-addressed capabilities of something like EMC's Centera. This gives the document a new name based on its metadata. If someone changes the file, the name changes, making it obvious when alterations have been made. However, a suitably skilled system administrator can circumvent this in some cases by interrogating the system to find out the new name for the file.
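The general idea of content addressing can be sketched as follows. This is an assumption-laden toy, not Centera's or any ECM's actual scheme: it derives the stored name from a SHA-256 hash of the file's bytes, and the `ContentAddressedStore` class is invented for the example.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return a name derived from the bytes themselves (SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


class ContentAddressedStore:
    """Toy in-memory store: objects are filed under the hash of their content."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        address = content_address(data)
        self._objects[address] = data
        return address

    def get(self, address: str) -> bytes:
        return self._objects[address]

    def verify(self, address: str) -> bool:
        """True if the stored bytes still hash to the name they are filed under."""
        return content_address(self._objects[address]) == address


store = ContentAddressedStore()
original = store.put(b"Contract v1: payment due in 30 days")
altered = store.put(b"Contract v1: payment due in 90 days")
```

Changing a single character produces a different address, so an altered file cannot quietly take the place of the original under the same name. Storing identical content twice yields the same address, which is also how such systems avoid duplicate copies.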

Despite these potential holes, security tends to remain strong within the ECM itself. Yet, few organisations operate in a way that allows them to keep all their documents in the ECM at all times. Documents may have to be worked on outside the office and sent to people in other companies. As Stuart Okin, a consultant at Accenture, points out, simply printing a document will create a copy that no longer lies within the ECM. "It's usually the simple stuff that causes the security problems. Some people will have implemented a great document management system. But they'll print stuff and leave it on their desk." Education and awareness are key, he says, with some fairly simple options going a long way. "You can put a big sign over a printer with 'Who's standing behind you?' on it."

Too many cooks ...
Okin also warns that, as collaboration between partners increases, new issues will arise. "You can assume there's an all-embracing document management that will keep everything totally within your security domain. But quite frankly, they didn't even have that when building the Euro Fighter."

To deal with this issue, organisations need to adopt measures for mobile security, such as encryption of laptop hard drives. They can use the capabilities of more sophisticated ECM systems to use some applications' own security features. Office documents can be password-protected and time-limited, for example, when the user checks the document out of the ECM.

Companies that worry about security and need a detailed audit trail once a document is outside the system might have to think about digital rights management (DRM) technology. "The majority of documents are on desktops, laptops and mobile devices," says Martin Lambert, CTO of information rights management at Oracle-owned Stellent, one of the vendors in this space.

"For all you know, 5,000 people might have accessed the file." The company's DRM software encrypts documents and embeds URLs that point to a Stellent information rights management (IRM) server. Anyone who wants to view an encrypted document will then need to install the Stellent IRM desktop agent, which can decrypt the document if the user has sufficient privileges.

However, an enterprise-wide DRM deployment can cost hundreds of thousands of pounds, so it's by no means the solution for everyone.

Archiving is the final part of ILM and this has its own security concerns. There are many horror stories of organisations that needed to restore from back-up, only to find that their tapes had been lost by their storage company - or worse still, swapped with another client's. Encryption is normally the best way to prevent lost media, whether that's a tape, a WORM disk or a disk, from becoming a security problem. Again, organisations need to consider who has the encryption keys and where they should be stored.

Although it's tempting to think that a document management or ECM system is the solution to any document security concerns, implementing one will raise its own issues that need just as careful consideration. No document management system will ever provide 100 per cent security, but reducing the associated risks is something that any IT or security manager can do with careful planning.


Online jeweller Cool Diamonds takes security very seriously, and this is reflected in the company's approach to document management.

"We have a whole series of systems," explains CEO Michel Einhorn. To gain physical access to the data stored in the document management system requires several authentications. "You need to pass two armoured doors. The first requires a card and a code, the second fingerprint authentication and another code. It's all armoured to withstand a rocket-propelled grenade."

The only computer that has access to the document management system for both data entry and retrieval lies behind the doors. Six branch offices have access via a VPN, but the computers needed to access that are behind similar physical security measures.

The document management system itself contains data such as credit card information and client purchases, as well as scanned paper documents. The company commissioned a bespoke system. "We didn't find anything that fitted the way we worked. So we designed the system around Linux, since I have zero confidence in Microsoft," says Einhorn. Under advice from a member of the Israeli army, particular attention was paid to ensuring access from outside the firewall was impossible.

The company began work on the system in 1999 and last overhauled it in April 2006. "As things grow, little holes appear, encryption turns out not to be so good. You really have to be thorough. You can have a fantastic system, but if there's one flaw ..." warns Einhorn.

Securing the building was "hugely expensive", says Einhorn. A combination of consultancy and development work necessary for the document management system cost the firm nearly a quarter of a million pounds. Still, Einhorn is clear that he doesn't believe his security is impenetrable. "Nothing is totally secure. All it takes is someone with a cameraphone."


When Google unveiled its Writely online word processor, now part of its Docs & Spreadsheets offering, many commentators saw it as the beginning of the end for Microsoft's Office suite. It wasn't the first program to run over the web, but with Google's might behind it, it was argued, all software would soon be delivered as services, overcoming the problems of installation, software upgrades, licence costs and a whole host of other typical enterprise deployment issues.

Writely, they predicted, would start a Web 2.0 revolution among office applications, since it includes collaboration features that allow multiple users to work on a document simultaneously and share it with others via blogs and chat applications, as well as versioning capabilities that let you roll back other users' changes.

So far, however, software as a service (SaaS) has yet to recruit the entire software industry to its side. But there have been some high-profile converts: has made inroads with its hosted CRM service; SAS uses on-demand business intelligence; and Microsoft offers various Windows Live services, including Office Live.

Online document collaboration tools, such as Writely, are rarer, although there are various products from vendors such as WebEx and HyperOffice, as well as gOffice, ThinkFree, the various Zoho applications and dabble db.

But just how secure is this kind of service? If all the company's documents are stored online and edited over the network, are they vulnerable?

The answer is that it's very much up to the company providing the service.

Google's Writely, for example, doesn't encrypt traffic using HTTPS - all text goes in the clear. Theoretically, therefore, anyone could intercept data and read it to find out what's going into the document. To use Writely, all that's needed is a suitable web browser and a Google account. Anyone can invite someone else to collaborate on a document and they in turn can invite others. There are no corporate lock-downs, no options to restrict who can do what.

Then there are Google's own terms and conditions and privacy policies: "Google reserves the right, but shall have no obligation, to pre-screen, flag, filter, refuse, modify or move any content available via Google services"; "Google reserves the right to syndicate content submitted, posted or displayed by you"; and "You agree that Google has no responsibility or liability for the deletion or failure to store any content and other communications maintained or transmitted by Google services."

As with all security, it's a question of how much risk you're prepared to face.

