“When Deep Blue, IBM’s chess-playing supercomputer, beat Garry Kasparov in 1997, computers were still just computers,” noted a recent NY Times Magazine article, “We Don’t Really Know How A.I. Works. That’s a Problem,” by freelance writer Oliver Whang. Deep Blue determined the best next move by simulating and assigning values to board positions up to 12 moves ahead — amounting to billions of positions — using algorithms explicitly programmed by its designers. “There was no mystery around what was going on inside them,” wrote Whang, “even though they were, in a way, intelligent.”
Fifteen years later, everything changed. In 2012, researchers at the University of Toronto developed AlexNet, a neural-network system that identified objects in images far more accurately than previous approaches. AlexNet’s success transformed AI research and accelerated the adoption of deep neural networks across a wide range of applications.
But there was a catch. Unlike Deep Blue, neural networks operate largely as black boxes. As these models become larger and more capable, they also become increasingly difficult to understand. These systems represent a new kind of machine intelligence whose internal reasoning processes remain poorly understood, even by the researchers building them.
This has led to the growing field of AI interpretability, whose goal is to better understand how modern AI systems actually work internally, especially as they are increasingly deployed in high-stakes applications ranging from healthcare and finance to law enforcement and military systems.
