Build-out and testing should be a structured and methodical process, but it can be compressed where appropriate, depending on the size and scale of the endeavour, to meet business imperatives.
I have had to speed up the build, test and acceptance process several times in my career to meet various ‘Ministerial Targets’ and other operational imperatives. The secret to success is knowing the business requirement and design thoroughly enough to know which corners can be cut and which cannot.
Then, when targets are met and the celebrations have ended, quietly and methodically put the corners that were cut back on, for longevity and proper operational service. This is not always an easy task. In practice it is not unknown for testing and acceptance to be put aside to meet quarterly sales targets and for a product to be rushed onto the market, sometimes with embarrassment as customers find themselves doing the beta testing!
All good project and programme managers know that a mature requirement and SOR schedule are key, and that without them nothing can be designed successfully. Without a properly documented and detailed design specification, a successful and meaningful build-out cannot be achieved - even then there is invariably the odd reference back to the design team and final tweaks to what is actually built operationally. This is why, for stage 4 (Through Life Operation), the very detailed “as built” documentation is essential for successful through-life operation and security.
All new technology brings some sort of risk, and where testing and acceptance into service (and major payment milestones) are involved, the pressure and persuasion to declare acceptance can be intense. A testing and acceptance plan is key, with proper milestones and contingency in the programme to allow for tweaks to the build and whatever re-testing is necessary. Testing and acceptance criteria should be agreed, and the necessary testing scenarios accepted, well in advance and not treated as last-minute activities.
Hopefully the requisite security testing will have begun at the design and development test rig stage with initial ‘hardening’. Hardening is the process of locking down all unnecessary functions and features that come with off-the-shelf hardware - active components such as routers, load balancers, switches, databases and firewalls - and application software, eliminating unwanted features and closing off security vulnerabilities.
Hardening must be applied very early, otherwise there can be unhappy faces. Too often the security lockdown is skipped in the early stages in order to get ‘early wins’ but, when it is eventually applied, the application software or database fails or runs badly.
As part of business acceptance, not only does the question “does it work and are the users/customers happy?” have to be answered with flying colours but, from a risk perspective, a second question must also be answered: have all risks been identified, managed and mitigated, and the residual risk accepted?
There will always be some residual risk that something unforeseen or undocumented has slipped in and was not found during testing and acceptance.
A key stage that must not be omitted for a business system meeting a complex international requirement is deciding how to test and accept it from the legal and regulatory perspective. This is sometimes hugely complex in practice.
Often overlooked or skimped is the vital process of proving that system backup and restore operates properly. This is not simply proving that backup is initiated and when the process is ended that tape or disc is stored. You have to prove that the data required to run the business can be “restored” and live operations resumed. This is an entirely different matter to having a backup tape or disc.
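As an illustration of the difference between having a backup and proving a restore, the following is a minimal sketch in Python. It assumes the data to be protected is a directory tree of files and that a restore has already been performed into a separate location by whatever backup product is in use; the function names `build_manifest` and `verify_restore` are hypothetical and not part of any backup tool. It records a checksum of every file before backup and compares the restored copy against that record.

```python
import hashlib
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Reads each file fully into memory; acceptable for a sketch,
            # but large files would be better hashed in chunks.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def verify_restore(source_manifest: dict[str, str], restored_root: Path) -> list[str]:
    """Return a sorted list of discrepancies; an empty list means the restore matches."""
    restored = build_manifest(restored_root)
    problems = []
    for name, digest in source_manifest.items():
        if name not in restored:
            problems.append(f"missing: {name}")
        elif restored[name] != digest:
            problems.append(f"changed: {name}")
    # Files present after restore that were never in the original.
    for name in restored.keys() - source_manifest.keys():
        problems.append(f"unexpected: {name}")
    return sorted(problems)
```

In use, the manifest would be taken from the live data at backup time and `verify_restore` run against the restored copy, ideally on a different machine. A matching manifest shows only that the files came back intact; it does not by itself prove that live operations can be resumed, which still has to be demonstrated by actually running the business application against the restored data.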
Provided there is what is called a ‘gold build’ of each core component of the operating system, database and application software, together with all other active components, the backup should work, but it still requires thorough testing.
Manual builds of complex software are a nightmare operationally because, with the best will in the world, even when following a script, individual operators will do something subtly different. Where this happens, a backup of one machine will not necessarily restore on, or even be recognised by, another machine unless they are all identical down to the BIOS level.
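One way to catch this kind of drift before it undermines a restore is to compare each machine's recorded component versions against the gold build. The sketch below, in Python, assumes such an inventory is available as a simple mapping of component name to version; the function name `diff_against_gold` and the component names and version strings in the example are hypothetical, chosen only for illustration.

```python
from typing import Optional


def diff_against_gold(
    gold: dict[str, str], machine: dict[str, str]
) -> dict[str, tuple[Optional[str], Optional[str]]]:
    """Return {component: (gold_version, machine_version)} for every mismatch.

    A value of None means the component is absent on that side.
    """
    drift = {}
    for component in sorted(gold.keys() | machine.keys()):
        expected = gold.get(component)
        actual = machine.get(component)
        if expected != actual:
            drift[component] = (expected, actual)
    return drift


# Hypothetical inventories: the second machine differs only at the BIOS level.
gold_build = {"bios": "1.2", "os": "10.4", "database": "12.1"}
machine_b = {"bios": "1.3", "os": "10.4", "database": "12.1"}
print(diff_against_gold(gold_build, machine_b))  # {'bios': ('1.2', '1.3')}
```

An empty result indicates the machine matches the gold build as far as the inventory goes; it is only as trustworthy as the inventory itself, so it complements restore testing and does not replace it.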
A backup can be made, but without very careful testing to demonstrate that a tape/disc can be restored both on the original device that made the backup and on a different one, “backup” can prove an in-service nightmare or be merely a self-delusional routine that will fail the ultimate test: disaster recovery.
The final acceptance test is not, “Does it meet its entire requirement schedule contractually, does it work and can it be operated securely with no apparent security vulnerabilities?” The final acid test of whether it is “fit to underpin the business” is, of course, “Can it be operated reliably on a day-to-day basis, cost-effectively, and supported without regular failures and without a huge operational and maintenance staff?”
Reputations are made or lost on this seemingly simple operational question.
Contributed by Tony Collings OBE, chairman, The ECA Group Limited