After decades of promise and hype, artificial intelligence is finally reaching an inflection point of market success. It’s now seemingly everywhere. In the past few years, the necessary ingredients have come together to propel AI beyond the research labs into the marketplace: huge amounts of data; powerful, inexpensive computer technologies; and the advanced algorithms needed to analyze and extract insights from those oceans of data. This is evidenced by the number of companies embracing AI as a key part of their strategies, by the innovative, smart products and services they’re increasingly bringing to market, and by the volume of articles being written on the subject.
I’d like to discuss a couple of such articles I recently read, one on the major advances recently achieved in machine learning, and the other explaining the differences between machine learning and knowledge discovery.
This past December, the NY Times Magazine published a very interesting article, The Great AI Awakening by Gideon Lewis-Kraus. The article told the story of Google’s adoption of an AI-first strategy and, in particular, its use of deep machine learning to dramatically improve Google Translate, one of its more popular online services. In addition, the article nicely explains the technologies developed by the Google Brain research project to achieve its breakthrough improvements.
Machine learning, and related advances like deep learning, have played a major role in AI’s recent achievements. Machine learning gives computers the ability to learn by ingesting and analyzing large amounts of data instead of being explicitly programmed. It’s enabled the construction of AI algorithms that can be trained with lots and lots of sample inputs, which are subsequently applied to difficult AI problems like language translation and natural language processing.
According to Lewis-Kraus, Google achieved a quantum improvement in the quality of its machine translations when it recently switched to a new deep-learning-based system. Over time, the new translation model will become the common multilingual foundation for translating between different language pairs, rather than needing 150 different models, as was the case with the previous translation system. In the more distant future, the new translation model could potentially become the first step toward a general computational facility based on human language.
Machine learning grew out of decades-old research on neural networks, a method for having machines learn from data that’s loosely modeled on the way a biological brain, composed of large clusters of highly connected neurons, learns to solve problems. Based on each person’s life experiences, the synaptic connections among pairs of neurons get stronger or weaker.
Similarly, each artificial neural unit in a network is connected to many other such units, and the links can be statistically strengthened or decreased based on the data used to train the system, as opposed to being programmed with fixed rules. As new data is ingested, the system rewires itself based on whatever new patterns it now finds.
But, as has often been the case with AI, the capabilities of those early neural networks were way overhyped. One-layer neural networks, pretty much all that was possible with the computers then available, could only find simple patterns in the data, which was insufficient to address real-world machine learning problems.
Research on neural networks significantly declined until the advent of multi-layered networks and deep learning in the 1990s. While one-layer networks can only discover simple patterns, multilayered deep learning networks can look for patterns of patterns, with each successive layer looking for patterns in the output of the previous layer. Such multilayered neural networks are extraordinarily complicated, requiring huge amounts of data and very powerful computers to handle their training, something that’s now finally possible.
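A toy sketch can make the “patterns of patterns” idea concrete. In this minimal example the weights are hand-set for determinism (real machine learning would adjust them from training data): a single linear-threshold layer cannot compute XOR, but with one hidden layer, each hidden unit detects a simple pattern (OR, AND) and the output unit looks for a pattern in those patterns.

```python
# Minimal two-layer network computing XOR with hand-set weights.
# A single-layer threshold network cannot represent XOR, but a
# hidden layer of simple pattern detectors makes it easy: the
# output unit combines the hidden units' patterns.
# (Weights are hand-set here for determinism; in real machine
# learning they would be learned from training data.)

def step(x):
    """Threshold activation: fires (1) when its input is positive."""
    return 1 if x > 0 else 0

def xor_net(x1, x2):
    h_or = step(x1 + x2 - 0.5)       # hidden unit 1: "at least one input on"
    h_and = step(x1 + x2 - 1.5)      # hidden unit 2: "both inputs on"
    return step(h_or - h_and - 0.5)  # output: "one input on, but not both"

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_net(a, b))
```

The hidden layer here plays the role of the first pattern-finding layer; the output layer searches for a pattern in the hidden layer’s results, which is exactly what deeper networks do at much larger scale.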
To help illustrate the scale and complexity of these networks, Lewis-Kraus writes: “An average [human] brain has something on the order of 100 billion neurons. Each neuron is connected to up to 10,000 other neurons, which means that the number of synapses is between 100 trillion and 1,000 trillion… We’re still far from the construction of a network of that size, but Google Brain’s investment allowed for the creation of artificial neural networks comparable to the brains of mice.”
Machine learning is now so hot that those working in the field must be extra careful to avoid another round of hype and unfulfilled promises. They must set realistic expectations for what machine learning can accomplish, and explain its strengths and limitations, including when its use is called for and when other AI tools might be more appropriate, which brings me to the second AI article I recently read.
As it turned out, a few days after reading the NY Times Magazine article I was involved in a meeting at MIT where we started out thinking that the problem being addressed called for machine learning. But in fact, this particular problem required a different set of AI technologies: machine, or knowledge, discovery. The key differences between these two branches of AI were nicely explained in Machine Learning Versus Machine Discovery, an article by entrepreneur and computer scientist Raul Valdes-Perez.
“[T]he key idea of machine learning is that, given enough data with associated outcomes, together with notions of what data features are relevant to predicting those outcomes, software can be trained to make those associations in future cases.” On the other hand, machine discovery “deals with uncovering new knowledge that enlightens or guides human beings.” The key idea in machine discovery is the use of heuristics to rank alternatives within a complex problem space and then decide which branches to follow and which to ignore. It’s similar to scientific discovery, where researchers tie together important threads looking for relationships that will yield important results.
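The heuristic-ranking idea can be sketched with a small, invented toy problem (this is an illustration of the general technique, not anything from Valdes-Perez’s article): states are numbers, the available moves are “double” and “add 3,” and a distance-to-goal heuristic ranks the alternatives, deciding which branches to follow and which to ignore.

```python
import heapq

def discover(start, goal):
    """Best-first search: a heuristic ranks alternatives in the
    problem space; promising branches are followed, poor ones ignored."""
    heuristic = lambda n: abs(goal - n)  # rank alternatives by distance to goal
    frontier = [(heuristic(start), start, [start])]
    seen = set()
    while frontier:
        _, state, path = heapq.heappop(frontier)  # follow the best-ranked branch
        if state == goal:
            return path
        if state in seen or state > 2 * goal:  # ignore hopeless branches
            continue
        seen.add(state)
        for succ in (state * 2, state + 3):  # the two available moves
            heapq.heappush(frontier, (heuristic(succ), succ, path + [succ]))
    return None

print(discover(1, 22))  # -> [1, 4, 8, 16, 19, 22]
```

The heuristic lets the search reach the goal after expanding only a handful of states, rather than exhaustively enumerating the whole space, which is the same trade the researcher makes when tying together only the most promising threads.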
Beyond scientific research, discovery can also be viewed as the common sense rules that people often use to form judgements and make decisions in everyday life situations. In his article, Valdes-Perez uses such a simple situation to highlight the differences between learning and discovery.
Assume that a party host wants to introduce the guests to each other by identifying areas of common interests in order to stimulate conversation. The host has a list of all the guests present at the party, and for each guest, the host has gathered a number of data sources, such as LinkedIn and Facebook profiles and other biographical data. To automate the process, the host can then use machine learning to discover common patterns or features among all the guests. This is necessary, but not sufficient. Discovery has to then be applied to differentiate between the common patterns that make for good introductions and those that should be ignored.
Wise judgements lead to good introductions: “The three of you graduated from the same college around the same time.” Or “Both of you mentioned having served in the Peace Corps in Africa.” Or even, “You two are the only people here who know about machine learning.” But bad judgements could lead to: “Both of you have been divorced four times or more” (embarrassing), or “All of you are from the Midwest” (too unfocused) or “Your birthdays are in winter” (irrelevant).
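The host’s two steps could be sketched roughly as follows (all guest data and judgement rules below are invented for illustration, not taken from the article): the “learning” step surfaces every attribute shared by two or more guests, and the “discovery” step applies common-sense heuristics to keep only the patterns that make good introductions.

```python
from collections import defaultdict

# Hypothetical guests and attributes, invented for this sketch.
guests = {
    "Ana":   {"peace_corps", "midwest", "machine_learning"},
    "Ben":   {"peace_corps", "midwest"},
    "Carol": {"midwest", "machine_learning"},
    "Dev":   {"midwest", "winter_birthday"},
    "Eli":   {"winter_birthday"},
}

# Step 1 ("learning"): find every attribute shared by two or more guests.
shared = defaultdict(set)
for name, traits in guests.items():
    for trait in traits:
        shared[trait].add(name)
common = {t: who for t, who in shared.items() if len(who) >= 2}

# Step 2 ("discovery"): judgement heuristics separate good introductions
# from irrelevant or unfocused ones.
IRRELEVANT = {"winter_birthday"}  # no conversational value

def good_introduction(trait, who):
    if trait in IRRELEVANT:
        return False
    if len(who) > len(guests) / 2:  # shared by most of the room: too unfocused
        return False
    return True

intros = {t: who for t, who in common.items() if good_introduction(t, who)}
print(intros)
```

Here "midwest" is a genuine common pattern but gets filtered out as too unfocused, and "winter_birthday" as irrelevant, leaving only the patterns worth an introduction, which is the differentiation step that learning alone cannot supply.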
In other words, machine learning finds common patterns in data, and uses them to learn and adapt without being explicitly programmed, while machine discovery assists humans in extracting potentially useful and novel knowledge from the common patterns found in the data.
“Discovery requires studying the task logic (i.e., the space of possible solutions), the knowledge that prioritizes good paths within that space and algorithm design to make it all practical,” adds Valdes-Perez. “There is scope for innovation in the space being searched and the heuristics used. But the most innovation may come from novel, creative outputs on specific inputs, because automation enables exploring a much larger space of possibilities than people can practically consider…”
“Learning applies to many data-rich tasks… Discovery tends to be hand-crafted, more elaborate and rarer… Machine learning and discovery will remain close siblings, but productively living apart as they mature.”