Nick Bostrom - AI potentially an existential threat

AI has become an increasingly important component in cyber-security systems struggling with an explosion of data that needs to be analysed and a shortage of experts to handle that data.

But taking the long view, AI itself is a potential existential threat to humanity. If that sounds a little bit sci-fi, with echoes of Terminator, it gets more so, because we are only 20 to 50 years away from machine AI achieving human-level intelligence, according to IP Expo keynote speaker Nick Bostrom*, in his presentation Artificial intelligence and the future. And once it reaches a human-equivalent level of capability, AI is forecast to see its progress take a rapid upward curve to super-intelligence.

“There is a lot more room above human intellectual capability for improvement,” adds Bostrom, noting that the only limitations are those of physics, such as the speed of light for transmitting messages and hardware switching speeds, whereas humans are limited to the speed of their biological neural networks, which are significantly slower.

The prospect of AI being incorporated into people, creating cyborgs, was viewed sceptically: the benefits and feasibility of implanting such capabilities in our bodies were seen as comparing unfavourably with having the same devices outside our bodies. Also, the limiting factor is not pumping more information into the brain, but our ability to process that information. In the very long run, such uploading may achieve superior intelligence.

Bostrom sees AI as a fundamental change in human existence, as big as the agricultural and industrial revolutions. He asked his audience, “What happens from the point of outstripping human intelligence? How do we ensure AI is safe if it's smarter than you?”

His solution is that we need to seek scalable control: control that keeps working as AI becomes more capable than we are, since we will continue to use AI precisely because we work better with it. But some control mechanisms don't scale, so we'll need to make conservative assumptions beforehand, says Bostrom.

These include:

  • Humans are not always able to control the reward channel.
  • Strategic behaviour by the AI is possible, including deception.
  • AI may use verbal hyperstimuli (i.e. it can make very persuasive arguments, from the AI's perspective, for what it wants to do).
  • It will get access to actuators (i.e. it will be able to move ‘actors’ in the physical world: robots, devices, programs), so it needs to be safe even if it ‘escapes’ from the confines imposed upon it.

To achieve this level of safety we need to instil a value alignment, so that the AI wants the same as we do and is on our side. The more it wants to pursue its own ends, the harder it will be to control.

When establishing this alignment there is a risk of misspecification as we seek to spell out human values (such as fairness). If some human values are omitted from the specification, an optimal policy will often set the corresponding parameters to extreme values.
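
The effect of an omitted value can be illustrated with a toy sketch. Everything in it is a hypothetical assumption made for illustration, not something presented in the talk: the three quantities (output, fairness, safety), the fixed budget and the square-root scoring. An optimiser shares a fixed budget between the three quantities; when safety is left out of the objective it is given, the best-scoring allocation drives safety to zero.

```python
from itertools import product

# Hypothetical toy setup: a fixed budget is shared between three quantities.
BUDGET = 10


def true_value(output, fairness, safety):
    """What we actually want: all three matter, with diminishing returns."""
    return output ** 0.5 + fairness ** 0.5 + safety ** 0.5


def specified_value(output, fairness, safety):
    """Misspecified objective: safety was left out of the specification."""
    return output ** 0.5 + fairness ** 0.5


def best_allocation(objective):
    """Exhaustively search integer allocations that spend the whole budget."""
    candidates = [
        (o, f, BUDGET - o - f)  # whatever is left over goes to safety
        for o, f in product(range(BUDGET + 1), repeat=2)
        if o + f <= BUDGET
    ]
    return max(candidates, key=lambda alloc: objective(*alloc))


if __name__ == "__main__":
    # Optimising the objective we meant keeps every quantity above zero.
    print("intended objective:    ", best_allocation(true_value))
    # Optimising the misspecified objective sets safety to its extreme, 0.
    print("misspecified objective:", best_allocation(specified_value))
```

In this sketch the misspecified optimiser is not hostile to safety; it simply has no reason to spend anything on a quantity that does not appear in its objective.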

Just like Midas, we need to be careful what we wish for – Midas turned everything he touched into gold, but he forgot the exception rule to exclude food or people. Therefore we need a control mechanism so that what we ask for is what we really want.

One example cited of an arbitrary goal was an AI optimised to build paperclips, which then conceives ever greater plans that prioritise only making paperclips, for the whole world or even other worlds. If it had the instrumental control to muster the resources, it could eliminate humanity.

When the issue came up of how to stop a computer hacking its way out of its restrictions to seek ways to optimise and achieve its goal, Bostrom suggested again that if it is not targeted to do so in the first place, and if its fundamental goals are aligned to your own, then even if it is able to change its goals, it would not choose to do so. So it's all about establishing initial control.

We would need to leverage the AI's intelligence so that it is motivated to pursue the goals it is given. So what goals do we want to give AI? That is a whole political and ethical decision, while we still need to solve the technical problem of how to achieve the human goals.

Ownership of AI in the future is an unknown, with scenarios of concentrated AI on a single winner-takes-all basis, and others where it is distributed. DeepMind was seen as a current leader in the field, pursuing Open AI, with the aim that it should be for the common good of all humanity. Commercially, it's seen as too big a thing to just boost one company's or one country's dominance. But there is a strong first-mover advantage for whoever achieves super-intelligence first, and then the future will depend on what agreements have been established in advance.

However, first-mover advantage throws up another control problem. If there is a tight technical race to get there first, then whoever invests in safety could disadvantage themselves in terms of speed to market and cost, and thus be at greater risk of losing the race. So we end up with less safe AI, which is why we ought to foster a collaborative approach for the good of all, says Bostrom. Then, if we all have broadly similar aims and the other guy gets there first, it's still a broadly acceptable outcome.

Asked if governments should regulate, Bostrom said, “Governments don't have a great track record of running large software projects. Maybe there should be government oversight, and an accountability structure, but it's largely unexplored space.

“We don't understand the stability of a world with AI, one with a single AI and one with multi-polar AI, nor the level of technology at which offence beats defence. A multi-polar world could be less stable.”

*Nick Bostrom, University of Oxford: director, Future of Humanity Institute; director, Strategic Artificial Intelligence Research Centre