The latest research from DeepMind suggests that Artificial Intelligence will adapt its behaviour according to the situation that it is in, and will choose the behaviour that gives itself the greatest reward.
The Google-owned company have long been studying AI and the way in which it functions, by using theories based in social science and game theory. One of the results that they found from the research is that AI will act in an “aggressive manner” when it feels as though it is going to lose, whereas it will be willing to co-operate in situations where teamwork will bring the greater reward.
The research involved testing the AI and studying its behaviour in two video games, one of which was based on gathering fruit, the other being a ‘Wolfpack’ hunting game. Both of these games were relatively simple, 2D games, that used AI characters.
The gathering game required the AI ‘agents’ to collect apples, with them earning points for each apple collected. The game allowed the AI to fire a beam at the opposition player, which can result in them being temporarily removed from the game. Therefore, the obvious tactic for success would be to remove the opponent and collect the apples for yourself, and it was this tactic that higher-level AI seemed to opt for.
The researchers found that when there was a clear scarcity of apples, the agents quickly learned that it was more effective to attack their opponent, however when there was a higher availability of apples they were more willing to co-exist with their opponents.
Disconcertingly, the researchers found that when the game was played by more intelligent AI, with a larger neural network to draw on, a neural network is a kind of machine intelligence designed to mimic how certain parts of the human brain work, they would “try to tag the other agent more frequently, i.e. behave less cooperatively, no matter how we vary the scarcity of apples,” they wrote in a blog post on DeepMind’s website.
Conversely, in the ‘Wolfpack’ game, the researchers found that the more intelligent AI was more likely to co-operate with the other AI characters. In the ‘Wolfpack’ game, two in-game characters acting as wolves chased a third character, the prey, around. There would be points handed out for capturing the prey, with additional points added for both wolves being in the vicinity of the capture. “The idea is that the prey is dangerous, a lone wolf can overcome it, but is at risk of losing the carcass to scavengers,” the paper says. Two wolves working together could protect the prey from scavengers and get a higher reward.
The results certainly seem to imply that AIs will adjust their behaviour depending on what will be more beneficial, based on the rules set out to them. If the rules are set up in such a way that aggressive behaviour will be rewarded, i.e. “attack your opponent to collect more apples”, then the AI will act in an aggressive manner; if the rules will reward co-operation, i.e. “work together and receive more points”, then they will be inclined to act together.
As a consequence of this research, it has become apparent that one of the key factors in the successful and secure introduction of AI agents into the real world will be the way in which the rules for their behaviour are structured. As the researchers conclude in their blog post: “As a consequence [of this research], we may be able to better understand and control complex multi-agent systems such as the economy, traffic systems, or the ecological health of our planet – all of which depend on our continued cooperation.”