Decision making - how individuals, groups and organizations go about selecting a course of action among several alternative scenarios - has long been a subject of study. Given the explosive growth of big data over the past decade, it’s not surprising that data-driven decision making is one of the most promising applications in the emerging discipline of data science.
In a recently published article, Data Science and its Relationship to Big Data and Data-Driven Decision Making, Foster Provost and Tom Fawcett succinctly define data-driven decision making as “the practice of basing decisions on the analysis of data rather than purely on intuition.” Equally succinctly, they view data science “as the connective tissue between data-processing technologies (including those for big data) and data-driven decision making.”
Looking deeper at data-driven decision making, it’s important to understand not only its promise but also its limits. When can we embed decisions into well-understood, automated processes? When does automation run into limits, so that data-driven decision making is better viewed as a tool to help people make smarter, more effective decisions? And what are its prospects as technology, big data and data science continue to advance?
Decisions span a wide range, from routine operational ones at one end to long-term strategic ones at the other. In between are many kinds of decisions, including non-routine ones made in response to new or unforeseen circumstances beyond the scope of operational processes, and tactical decisions about the adjustments required to implement longer-term strategies.
Because routine, day-to-day operational decisions are highly structured, IT and data analysis have long been applied to automate them, in areas such as logistics and inventory management, personalized marketing offers and recommendations, and fraud detection in financial transactions. For example, I recently traveled to a Latin American city I had not been to before. When I checked into the hotel, my credit card was not accepted. A few moments later, I received an automated call on my mobile phone, as well as an e-mail from the credit card company, asking me to verify that I was indeed trying to use my credit card in that particular city.
Similarly, when I logged into my e-mail account, I was first asked to verify my identity to make sure that I was the person logging in from a new location. These are concrete examples of operational, data-driven decision making that have been built into automated authentication and fraud detection processes.
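To make the idea concrete, here is a minimal sketch of how such a location check might work. It is only an illustration: real fraud detection systems weigh many more signals, and the names used here (CardProfile, decide, confirm) and the city names are hypothetical, not taken from any actual card issuer's system.

```python
from dataclasses import dataclass, field


@dataclass
class CardProfile:
    """Hypothetical per-card history: the cities where the card has been used."""
    known_cities: set[str] = field(default_factory=set)


def decide(profile: CardProfile, city: str) -> str:
    """Return 'approve' for a familiar city and 'verify' for an unfamiliar one."""
    if city in profile.known_cities:
        return "approve"
    return "verify"


def confirm(profile: CardProfile, city: str) -> None:
    """Remember a city once the cardholder has confirmed the transaction."""
    profile.known_cities.add(city)


# Illustrative data only: a card used so far in two cities.
profile = CardProfile(known_cities={"New York", "Boston"})

first_attempt = decide(profile, "Lima")   # unfamiliar city, so ask the cardholder
confirm(profile, "Lima")                  # cardholder answers the call or e-mail
second_attempt = decide(profile, "Lima")  # the city is now part of the history

print(first_attempt, second_attempt)
```

The point of the sketch is that the decision itself is a simple rule applied to accumulated data, which is why it can run with no human involved until the cardholder is asked to confirm.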
The more data we gather, and the more sophisticated the analysis, the more such decisions can be made with little or no human intervention. Over time, big data and advanced data science applications will enable us to take operational decision making to a whole new level in a wide variety of disciplines. In an online conversation, Reinventing Society in the Wake of Big Data, MIT Media Lab Professor Sandy Pentland talks about the promise of becoming a data-driven society:
“I believe that the power of Big Data is that it’s information about people’s behavior - it’s about customers, employees, and prospects for your new business, . . .” he says. “This Big Data comes from location data from your cell phone and transaction data about the things you buy with your credit card. It’s the little data breadcrumbs that you leave behind you as you move around in the world. . . Big data is increasingly about real behavior, and by analyzing this sort of data, scientists can tell an enormous amount about you. They can tell whether you are the sort of person who will pay back loans. They can tell you if you’re likely to get diabetes.”
These decision making applications require access to vast amounts of personal information, which leads to very serious concerns about privacy, data ownership and data control. It’s important that individuals are aware of and have final say about the use of the data collected about them. For example, we will likely be quite happy with the use of personal data in identity management applications, but might want to be more selective in how our data is used to send us marketing offers. Considerable research is needed as we learn how to strike the right balance between such data-driven decision making and privacy.
Beyond automated operational decisions, there are many situations where human intervention is required for a variety of reasons. In my personal credit card example, I could have talked to a customer service representative if the information I provided online had not been sufficient to properly verify my identity.
There are quite a number of interesting cases where people applied data analysis to uncover patterns that were useful in dealing with an unexpected new situation. For example, a few years ago, as Hurricane Frances was threatening a direct hit on Florida’s Atlantic coast, Walmart executives were trying to predict what kinds of merchandise they should stock in their stores in the affected areas, based on analysis of past purchases in other Walmart stores under similar conditions. Their experts mined the data and found that the stores would need certain products beyond the usual flashlights, batteries and bottles of water.
As Walmart’s then CIO noted: “We didn't know in the past that strawberry Pop-Tarts increase in sales, like seven times their normal sales rate, ahead of a hurricane, . . . And the pre-hurricane top-selling item was beer.” By “predicting what’s going to happen, instead of waiting for it to happen,” as she put it, “trucks filled with toaster pastries and six-packs were soon speeding down Interstate 95 toward Walmarts in the path of Frances. Most of the products that were stocked for the storm sold quickly, the company said.”
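The kind of analysis described here can be sketched as a simple sales lift calculation: average sales ahead of a storm divided by average sales on normal days. This is only a rough illustration and not Walmart's actual method. The function name sales_lift is hypothetical, and the numbers are invented so that one item shows roughly the sevenfold increase mentioned in the quote.

```python
from statistics import mean

# Invented daily unit sales for one store (illustration only, not real data).
normal_days = {
    "strawberry toaster pastries": [20, 22, 18],
    "flashlights": [10, 12, 8],
    "bottled water": [50, 55, 45],
}
pre_storm_days = {
    "strawberry toaster pastries": [140, 150, 130],
    "flashlights": [30, 28, 32],
    "bottled water": [100, 110, 90],
}


def sales_lift(normal: dict[str, list[int]],
               pre_storm: dict[str, list[int]]) -> dict[str, float]:
    """Ratio of average pre-storm sales to average normal sales, per item."""
    return {item: mean(pre_storm[item]) / mean(normal[item]) for item in normal}


lift = sales_lift(normal_days, pre_storm_days)

# Rank items by lift so the least obvious surges stand out.
ranked = sorted(lift, key=lift.get, reverse=True)
for item in ranked:
    print(f"{item}: {lift[item]:.1f}x normal sales")
```

Ranking by lift rather than by raw volume is what surfaces items that nobody would have thought to stock, since a product can sell in modest absolute numbers and still show the largest relative jump.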
Strategic decisions are aimed at shaping the future, that is, setting the long-term directions and policies of an organization. Making sound strategic decisions is one of the most important qualities of a great leader, and is thus a major component of leadership courses and seminars. But the use of big data and data science to help with strategic decisions is in its early stages, and considerably more research is needed to understand how to apply them in different contexts.
A Leader’s Framework for Decision Making, a 2007 Harvard Business Review paper by Dave Snowden and Mary Boone, offers a very good framework “which allows executives to see things from new viewpoints, assimilate complex concepts, and address real-world problems and opportunities. . . . Using this approach, leaders learn to define the framework with examples from their own organization’s history and scenarios of its possible future. This enhances communication and helps executives rapidly understand the context in which they are operating.”
The framework is designed to help leaders determine the overall context for making their strategic decisions, in particular whether it is ordered and complicated, or unordered and complex.
“Each domain requires different actions,” write Snowden and Boone. “Simple and complicated contexts assume an ordered universe, where cause-and-effect relationships are perceptible, and right answers can be determined based on the facts. Complex and chaotic contexts are unordered - there is no immediate relationship between cause and effect, and the way forward is determined based on emerging patterns. The ordered world is the world of fact-based management; the unordered world represents pattern-based management.”
The article explains each of these contexts in detail, as well as the decision making style most applicable to each. Data mining and similar analytical methods are most applicable in the ordered contexts, whether simple (having a clear cause-and-effect relationship that the analysis can uncover) or complicated (which, unlike simple contexts, may have multiple options and answers that can be analyzed, evaluated and compared prior to making a decision).
A complex context is quite different. The right decision cannot be ferreted out from the available information. The article points out that comparing a complicated and a complex context is like comparing a Ferrari with the Brazilian rainforest. “Ferraris are complicated machines, but an expert mechanic can take one apart and reassemble it without changing a thing. The car is static and the whole is the sum of its parts. The rainforest, on the other hand, is in constant flux - a species becomes extinct, weather patterns change, an agricultural project reroutes a water source - and the whole is far more than the sum of its parts.”
“Most situations and decisions in organizations are complex because some major change - a bad quarter, a shift in management, a merger or acquisition - introduces unpredictability and flux. In this domain we can understand why things happen only in retrospect. Instructive patterns, however, can emerge if the leader conducts experiments that are safe to fail. That is why, instead of attempting to impose a course of action, leaders must patiently allow the path forward to reveal itself. They need to probe first, then sense, then respond.”
One of the biggest pitfalls in leveraging data science to help make complex strategic decisions is mistakenly assuming that an unordered, unpredictable, complex context is in fact an ordered, predictable, complicated one. “This assumption, grounded in the Newtonian science that underlies scientific management, encourages simplifications that are useful in ordered circumstances. Circumstances change, however, and as they become more complex, the simplifications can fail. Good leadership is not a one-size-fits-all proposition.”
Neither is good data-driven decision making. With operational decisions, we have to learn to distinguish between those situations when decisions can be embedded in automated processes, and those that require human intervention. With strategic decisions we have to learn the difference between complicated but predictable contexts, and complex and intrinsically unpredictable ones. This is all part of what makes data science such an important and exciting discipline.