“Artificial intelligence (AI) is the pursuit of machines that are able to act purposefully to make decisions towards the pursuit of goals,” wrote Harvard professor David Parkes in A Responsibility to Judge Carefully in the Era of Prediction Decision Machines, an essay recently published as part of Harvard’s Digital Initiative. “Machines need to be able to predict to decide, but decision making requires much more. Decision making requires bringing together and reconciling multiple points of view. Decision making requires leadership in advocating and explaining a path forward. Decision making requires dialogue.”
In April 2017, I attended a seminar by University of Toronto professor Avi Goldfarb on the economic value of AI. Goldfarb explained that the best way to assess the impact of a radical new technology is to look at how it reduces the cost of a widely used function. For example, computers are essentially powerful calculators whose cost of digital operations has decreased dramatically over the past several decades. Over the years, we've learned to define all kinds of tasks in terms of digital operations, e.g., financial transactions, word processing, photography. Similarly, the Internet has drastically reduced the cost of communications and of access to all kinds of information, including text, pictures, music, and videos.
Viewed through this lens, the AI revolution amounts to a dramatic reduction in the cost of prediction, that is, of anticipating what is likely to happen in the future. Over the past decade, increasingly powerful and inexpensive computers, advanced machine learning algorithms, and the explosive growth of big data have enabled us to extract insights from all that data and turn them into valuable predictions.
Given the widespread role of predictions in business, government, and everyday life, AI is already having a major impact on many human activities. As was previously the case with arithmetic, communications, and access to information, we will be able to use predictions in all kinds of new applications. Over time, we'll discover that lots of tasks can be reframed as prediction problems.
The academic community is starting to pay attention to the very important and difficult questions underlying the shift from predictions to decisions. Last year, Parkes co-organized a workshop on Algorithmic and Economic Perspectives on Fairness. The workshop brought together researchers with backgrounds in algorithmic decision making, machine learning, and data science with policy makers, legal experts, economists, and business leaders.
As explained in the workshop report, algorithmic systems have long been used to help us make consequential decisions. Recidivism predictions date back to the 1920s, and automated credit scoring began in the middle of the 20th century. Not surprisingly, prediction algorithms are now used in an increasing variety of domains, including job applications, criminal justice, lending and insurance, medicine and public services.
This prominence of algorithmic methods has led to concerns about their fairness toward those whose behavior they're predicting: whether the algorithms systematically discriminate against individuals of a particular ethnicity or religion, whether they properly treat each person as an individual, and who decides how algorithms are designed and deployed.
These concerns have always been present whenever we make important decisions. What's new is the much, much larger scale at which we now rely on algorithms to help us make them. Human errors that may once have been idiosyncratic may now become systematic. Another consideration is the widespread use of algorithms across domains: prediction algorithms, such as credit scores, may now be used in contexts beyond their original purpose. Accountability is another serious issue. “Who is responsible for an algorithm’s predictions? How might one appeal against an algorithm? How does one ask an algorithm to consider additional information beyond what its designers already fixed upon?”
While fairness is viewed as subjective and difficult to measure, accuracy measurements are generally regarded as objective and unambiguous. “Nothing could be farther from the truth,” says the workshop report. “Decisions based on predictive models suffer from two kinds of errors that frequently move in opposite directions: false positives and false negatives. Further, the probability distribution over the two kinds of errors is not fixed but depends on the modeling choices of the designer. As a consequence, two different algorithms with identical false positive rates and false negative rates can make mistakes on very different sets of individuals with profound welfare consequences.”
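To make the report's point concrete, here's a minimal sketch in Python, with made-up labels and predictions rather than data from any real system, of two decision rules that share identical false positive and false negative rates yet make their mistakes on different individuals:

```python
# Hypothetical data: true outcomes for eight individuals and the
# predictions of two different decision rules.
labels  = [1, 1, 0, 0, 1, 1, 0, 0]
model_a = [1, 0, 0, 1, 1, 1, 0, 0]
model_b = [0, 1, 1, 0, 1, 1, 0, 0]

def error_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate)."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return fp / y_true.count(0), fn / y_true.count(1)

print(error_rates(labels, model_a))   # (0.25, 0.25)
print(error_rates(labels, model_b))   # (0.25, 0.25) -- identical rates

# ...yet the two rules harm different people.
mistakes_a = [i for i, (t, p) in enumerate(zip(labels, model_a)) if t != p]
mistakes_b = [i for i, (t, p) in enumerate(zip(labels, model_b)) if t != p]
print(mistakes_a, mistakes_b)         # [1, 3] vs. [0, 2]
```

On aggregate error metrics the two rules are indistinguishable, yet individuals 1 and 3 bear the mistakes under one rule and individuals 0 and 2 under the other, which is precisely the welfare difference the report highlights.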
Workshop participants were asked to identify and frame what they felt were the most pressing issues to ensure fairness in an increasingly data- and algorithmic-driven world. Let me summarize some of the key issues they came up with as well as questions to be further investigated.
Decision Making and Algorithms. It's not enough to focus on the fairness of algorithms because their output is just one of the inputs to a human decision maker. This raises a number of important questions: How do human decision makers interpret and integrate the output of algorithms? When they deviate from the algorithmic recommendation, is it in a systematic way? And which aspects of a decision process should be handled by an algorithm and which by a human to achieve fair outcomes?
Assessing Outcomes. It's very difficult to measure the impact of an algorithm on a decision because of indirect effects and feedback loops. Therefore, it's very important to monitor and evaluate actual outcomes. Can we properly understand the reasons behind an algorithmic recommendation? And how can we design automated systems that will do appropriate exploration in order to provide robust performance in changing environments?
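The workshop report doesn't prescribe a particular technique, but epsilon-greedy exploration is one standard way an automated decision system can keep its estimates current as conditions change. Here's a minimal, self-contained sketch; all names and numbers are illustrative:

```python
import random

def epsilon_greedy(estimates, epsilon=0.1):
    """Choose an action: usually the best-looking one, occasionally a random
    one, so the system keeps gathering evidence about every option."""
    if random.random() < epsilon:
        return random.randrange(len(estimates))                    # explore
    return max(range(len(estimates)), key=lambda a: estimates[a])  # exploit

# Running average of observed outcomes per action (hypothetical starting state).
estimates = [0.0, 0.0, 0.0]
counts = [0, 0, 0]

def update(action, outcome):
    """Fold a newly observed outcome into the running average for an action."""
    counts[action] += 1
    estimates[action] += (outcome - estimates[action]) / counts[action]
```

The occasional random choice is what lets the system notice when a previously inferior option has become the better one, at the cost of sometimes making a knowingly suboptimal decision, a trade-off with fairness implications of its own.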
Regulation and Monitoring. Poorly designed regulations may harm the very individuals they're intended to protect, as well as being costly for firms to implement. It's thus important to specify the precise way in which compliance will be monitored. How should recommendation systems be designed to provide users with more control? And could the regulation of algorithms lead firms to abandon algorithms in favor of less inspectable forms of decision making?
Educational and Workforce Implications. The study of fairness considerations as they relate to algorithmic systems is a fairly new area. It's thus important to understand the effect of different kinds of training on how well people will interact with AI-based decisions, as well as the management and governance structures for AI-based decisions. Are managers (or judges) who have some technical training more likely to use machine learning-based recommendations? What should software engineers learn about the ethical implications of their technologies? And what's the relationship between domain and technical expertise in thinking about these issues?
Algorithm Research. Algorithm design is a well-established area of research within computer science. At the same time, fairness questions are inherently complex and multifaceted and incredibly important to get right. How can we promote cross-field collaborations between researchers with domain expertise (moral philosophy, economics, sociology, legal scholarship) and those with technical expertise?
Interesting article. I remember when I was working at an insurance company there was a push to put automated underwriting into place. There was enormous pushback due to expected job loss. So for a multi-year period, algorithmic underwriting suggestions were reviewed by real underwriters to see whether the human or machine underwriting choices were more accurate given the guidelines. The algorithms won by a significant margin. But three challenges remain.
1. The validity of the data values chosen, and the consistent accuracy of their prediction of future events, is typically only reviewed if the actuaries see significant economic deviation, making the predictions generally good enough for profitability and regulatory scrutiny, but not always "fair".
2. There is no repeat challenge of the machine predictions against human underwriters, on the presumption that the machines won.
3. Most importantly, I believe, there is no systematic effort to put in place algorithmic back tests that check "if this algorithmic decision is correct, then these things should be true," especially for large quantities of decisions made over time, e.g., auto insurance underwriting decisions over 1-2 year periods. This would be used not so much for catching a mistake as for watching for drift in expected predictive accuracy that would prompt a human review; see the sketch below. This seems to be an area where the tension between correlation and causal statistical significance, and the potential misapplication of statistics, is significant.
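To sketch what I mean by a drift back test, something like the following could run alongside the production system; all thresholds and window sizes here are purely illustrative, not from any real underwriting system:

```python
from collections import deque

WINDOW = 500       # number of recent decisions to track (illustrative)
BASELINE = 0.92    # accuracy measured when the model was last validated (illustrative)
TOLERANCE = 0.03   # acceptable slippage before a human review is triggered

recent = deque(maxlen=WINDOW)  # rolling record of whether each prediction held up

def record(prediction_held: bool) -> None:
    """Log whether a decision's 'then these things should be true' checks
    actually came true once the outcome was known."""
    recent.append(prediction_held)

def needs_human_review() -> bool:
    """Flag drift: rolling accuracy has slipped below the validated baseline."""
    if len(recent) < WINDOW:
        return False           # not enough observed outcomes yet
    accuracy = sum(recent) / len(recent)
    return accuracy < BASELINE - TOLERANCE
```

The point isn't the specific numbers; it's that the review trigger is defined in advance rather than waiting for the actuaries to notice a significant economic deviation.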
Posted by: robert anderson | March 26, 2020 at 04:56 AM