Peering inside an AI’s brain will help us trust its decisions
AI is no longer the future—it’s now here in our living rooms and cars and, often, our pockets. As the technology has continued to expand its role in our lives, an important question has emerged: What level of trust can—and should—we place in these AI systems?
As AI systems have become more capable and widespread, a large segment of the public has been slow to trust the technology. A highly publicized study last year called the ethics of self-driving cars into question, concluding that most people wouldn’t want to ride in the cars because they don’t trust the systems making the decisions. Part of the problem is that artificial intelligence doesn’t make decisions in the same way that humans do: even the best image recognition algorithms can be tricked into seeing a robin or cheetah in images that are just white noise.
The biggest issue behind the lack of trust is that programmers cede some control when neural networks train themselves through automated learning, which makes the networks a so-called “black box”: in other words, they’re tough to understand. Because the system is not programmed with specific responses to commands, it could act in ways no one can predict when asked to make a choice.
So Grimm and his colleagues created a system that analyses an AI to show which part of an image it is focusing on when it decides what the image is depicting. Similarly, for a document-sorting algorithm, the system highlights which words the algorithm used to decide which category a particular document should belong to.
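To make the document-sorting case concrete, here is a minimal sketch of one common way to find the words a classifier relies on: remove each word in turn and measure how far the score drops. This is not necessarily how Grimm's system works, which the article does not detail. The keyword weights, the `sports_score` classifier and the sample sentence are all hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical keyword weights standing in for a trained document sorter.
SPORTS_WEIGHTS: Dict[str, float] = {"match": 2.0, "goal": 3.0, "league": 1.5}


def sports_score(words: List[str]) -> float:
    """Toy classifier: a higher score means 'more likely a sports document'."""
    return sum(SPORTS_WEIGHTS.get(w.lower(), 0.0) for w in words)


def word_importance(
    words: List[str], score: Callable[[List[str]], float]
) -> List[Tuple[str, float]]:
    """Leave-one-out attribution: score drop when each word is removed."""
    baseline = score(words)
    result = []
    for i, word in enumerate(words):
        without = words[:i] + words[i + 1:]  # the document minus one word
        result.append((word, baseline - score(without)))
    return result


doc = "The late goal decided the match".split()
ranked = sorted(word_importance(doc, sports_score), key=lambda p: p[1], reverse=True)
print(ranked[:2])  # [('goal', 3.0), ('match', 2.0)]
```

The words with the largest drops are the ones a tool of this kind would highlight. Words the classifier ignores score zero.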
As AI has become more pervasive, so too has the concern over whether we can trust it to reflect human values. A frequently cited example of how difficult this can be is the moral decision an autonomous car might have to make to avoid a collision.
When the AI was looking at images of horses, for example, Grimm’s analysis showed that it first paid close attention to the legs and then searched the image for where it thought a head might be, anticipating that the horse might be facing in different directions. The AI took a similar approach with images containing deer, but in those cases it specifically searched for antlers. The AI almost completely ignored parts of an image that it decided didn’t contain information that would help with categorisation.
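One simple way to produce this kind of "where is it looking" map is occlusion: blank out one patch of the image at a time and record how much the classifier's score falls. The sketch below illustrates that general idea under stated assumptions and is not a reconstruction of Grimm's tool. The `toy_score` function is a hypothetical classifier that only looks at the top-left corner of a 4x4 image.

```python
from typing import Callable, List

Image = List[List[float]]


def toy_score(img: Image) -> float:
    """Hypothetical classifier that only 'looks at' the top-left 2x2 corner."""
    return sum(img[r][c] for r in range(2) for c in range(2))


def occlusion_map(img: Image, score: Callable[[Image], float], patch: int = 2) -> Image:
    """Blank out each patch in turn and record how much the score drops."""
    baseline = score(img)
    height, width = len(img), len(img[0])
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            occluded = [row[:] for row in img]  # copy so the original is untouched
            for r in rows:
                for c in cols:
                    occluded[r][c] = 0.0
            drop = baseline - score(occluded)
            for r in rows:
                for c in cols:
                    heat[r][c] = drop  # every pixel in the patch shares the drop
    return heat


image = [[1.0] * 4 for _ in range(4)]
heat = occlusion_map(image, toy_score)
for row in heat:
    print(row)
# [4.0, 4.0, 0.0, 0.0]
# [4.0, 4.0, 0.0, 0.0]
# [0.0, 0.0, 0.0, 0.0]
# [0.0, 0.0, 0.0, 0.0]
```

Regions with a drop of zero are the parts of the image the classifier ignores, much like the areas the AI disregarded in the horse and deer pictures.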
Grimm and his colleagues also analysed an AI trained to play the video game Pong. They found that it ignored almost all of the screen and instead paid close attention to the two narrow columns along which the paddles moved. The AI paid so little attention to some areas that moving the paddle away from its expected location fooled it into thinking it was looking at the ball and not the paddle.
Grimm thinks that his tool could help people work out how AIs make their decisions. For example, it could be used to look at algorithms that detect cancer cells in lung scans, making sure that they don’t accidentally come up with the right answers by looking at the wrong bit of the image. “You could see if it’s not paying attention to the right things,” he says.
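A check of this kind could be sketched as follows: given an attention map and a region that an expert has marked as relevant, compute what share of the attention falls inside that region. The function name, the attention values and the marked region are all hypothetical. A real scan would need a real attention map and a clinician's annotation.

```python
from typing import List

Grid = List[List[float]]
Mask = List[List[bool]]


def attention_inside(heat: Grid, mask: Mask) -> float:
    """Fraction of total attention that falls inside the region marked True."""
    total = sum(sum(row) for row in heat)
    if total == 0:
        return 0.0  # no attention anywhere, so nothing falls inside
    inside = sum(
        h
        for heat_row, mask_row in zip(heat, mask)
        for h, m in zip(heat_row, mask_row)
        if m
    )
    return inside / total


# Hypothetical attention map: most weight sits in the right-hand column.
heat = [
    [0.0, 0.0, 1.0],
    [0.0, 1.0, 2.0],
]
# Hypothetical expert annotation: the relevant region is the middle column.
region = [
    [False, True, False],
    [False, True, False],
]
fraction = attention_inside(heat, region)
print(fraction)  # 0.25
```

A low fraction, as here, would suggest the model is reaching its answer while looking mostly at the wrong bit of the image.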
But first Grimm wants to use his tool to help AIs learn. By revealing when an AI is not paying attention to the right things, it would let AI trainers direct their software towards relevant bits of information.