If I drive through a stop sign, I might protest to the traffic officer that “I didn’t notice it,” or “it was blocked by a tree,” or “I thought it was a red flag.” And those claims might be perfectly true from what I remember of my experience: nothing feels more natural than explaining a mistake. But as plausible as these explanations may sound, the officer shouldn't believe them. The truth is that even I don't know the source of my error.
This is because the real explanation had something to do with computations performed by my brain. As I passed the stop sign, my neurons were busy encoding sensory inputs and combining them with memories of past experiences. We are just beginning to understand that these computations have natural mathematical descriptions utterly unlike the verbal “explanations” we find most convincing. The post-hoc descriptions of how we made a decision are sometimes pure fabrication, but even when they contain some truth (flags and signs do look similar), we can only trust them, practically, when they predict behavior as well as explain it.
Unfortunately, they usually do not. In experiment after experiment, unconscious internal processes and external influences predict behavior – in run-of-the-mill, moment-to-moment situations like these, if not in slow, high-level deliberations – better than verbal reports can. We don't rely on overt internal monologues as we drive, telling us to gauge the color or shape of an approaching sign. The decision happens on a deeper level, concealed from conscious experience. These facts force us to be skeptical whenever a person offers reasons for their behavior, whether that is a driver missing a sign, a doctor diagnosing an illness, or a juror weighing the evidence.
Yet people do become experts, given enough of the right experience, at making certain decisions. Our brains change in ways that allow us to make the right choices, even if we lack conscious access to those choices or changes. That is, we perform these tasks thanks to unconscious algorithms that we learn to modify.
Recent advances in artificial intelligence (AI) have begun to tell us what these algorithms might be, and how they might change with experience. A certain class of "machine-learning" methods currently produces the best models for how our brains learn to recognize patterns, so there is good reason to say these methods provide the best "explanations" for basic behaviors.
But these explanations are mathematical formulas, not sentences describing why the computations work. This may tempt us to describe what they do in words, as we describe our own behavior, in lieu of understanding the underlying algorithms. That's a major problem if we hope to truly explain ourselves or employ AIs that outperform us – as they are already beginning to do in driving cars, interpreting medical data, and playing games.
Between the inaccuracy of our verbal reports and the superiority of mindless machines, it is time to leave behind our comfortable notion of "explanation" as high-level description. Once we accept that unconscious processes dominate much of our familiar life, there are great advantages to speaking in the language of algorithms.
'Black' and glass boxes
Unlike a person, an AI algorithm cannot lie or misinterpret its subjective experience; if we ask the right question, we will learn the truth about how it reached a decision. These mathematical descriptions, rather than a forced translation into words, are where we should look when a self-driving car misses a stop sign or an AI doctor insists on a mistaken surgery.
Some have argued that these misfires will happen because machine-learning algorithms, which recognize patterns in larger datasets than a single human can process, inevitably learn unconscious human biases too. This fear has led to frequent claims that machine-learning-based AIs are “black boxes,” which we cannot trust to make decisions unless they provide simple reasons for us to accept or reject.
Part of this caution in speaking about AI comes from a good moral intuition: we should not let an AI drive unless we know it will make the decisions we want it to make. But these claims are misleading: AIs are only called “black boxes” because they do not emit language-level explanations of their choices. This is the same mistake we make so often with human behavior, thinking that what sounds simplest – an English sentence – bears any relation to the causal processes occurring in the brain or the machine. In fact, in their natural mathematical description, AIs are entirely transparent. We can assess every computation they perform.
So instead of demanding explanations as a traffic cop would, we must frame our questions concretely and quantitatively. How do we want the AIs to behave, and what algorithms can produce that pattern of behavior? For machine-learning algorithms, we must further decide which data to use for training, as biases in the data will produce biases in behavior. Thus, as with biological brains, explaining AI behavior will force us to confront issues of both hardwired “nature” and experiential “nurture.” Fortunately, if we take the same attitude toward AI as we should toward the brain, the truth is waiting for us.
Language in numbers
Studies of biological and artificial intelligence should therefore shake off the restrictions of language-level explanations. This does not, however, discount high-level human reasoning: it is responsible for our understanding of the world and all the benefits of science, technology, and government.
Furthermore, we need to communicate with each other about why certain approaches work and how they can be extended, even if the explanation for what a particular brain region or computation does cannot be converted to a textbook sentence. The sort of understanding within reach is that of pattern recognition, quick decision-making, and other tasks we perform so automatically that algorithms describe them more accurately than words. Explanations of high-level cognition are far more distant.
This is why we must be honest and precise in our language, and not fool ourselves into believing that current AIs could even tell us "how they decided" in English. Such an explanation would be just as misleading as my excuse to the traffic cop.
Free from this tangle of words and misconceptions, we can accept that AIs yield the real explanations for their behavior much more readily than we do: our brain is the true black box. Reflecting our inner machine, unconscious algorithms offer the keys to this darkest chamber.