When is AI appropriate?

July 11, 2016 Cathy O'Neil, mathbabe

I was invited last week to an event co-sponsored by the White House, Microsoft, and NYU called AI Now: The social and economic implications of artificial intelligence technologies in the near term. Many of the discussions were under the “Chatham House Rule,” which means I get to talk about the ideas without attributing any given idea to any person.

Before I talk about some of the ideas that came up, I want to mention that the definition of “AI” was never discussed. After a while I took it to mean anything that was technological that had an embedded flow chart inside it. So, anything vaguely computerized that made decisions. Even a microwave that automatically detected whether your food was sufficiently hot – and kept heating if it wasn’t – would qualify as AI under these rules.

In particular, all of the algorithms I studied for my book certainly qualified. And some of them, like predictive policing and recidivism risk models, Google search, and resume filtering algorithms, were absolutely talked about and referred to as AI.

One of the questions we posed was, when is AI appropriate? Is there a class of questions that AI should not be used for, and why? More interestingly, is there AI working and making decisions right now, in some context, that should be outlawed? Or at least put on temporary suspension?

[Aside: I’m so glad we’re actually finally discussing this. Up until now it seems like wherever I go it’s taken as a given that algorithms would be an improvement over human decision-making. People still automatically assume algorithms are more fair and objective than humans, and sometimes they are, but they are by no means perfect.]

We didn’t actually have time to thoroughly discuss this question, but I’m going to throw down the gauntlet anyway.

—

Take recidivism risk models. Julia Angwin and her team at ProPublica recently demonstrated that the COMPAS model, which was being used in Broward County, Florida (as well as many other places around the country), is racist. In particular, it has very different errors for blacks and for whites, with high “false positive” rates for blacks and high “false negative” rates for whites. This ends up meaning that blacks go to jail for longer, since that’s how recidivism risk scores are being used.
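To make “different errors for different groups” concrete, here is a minimal sketch of how one might tabulate false positive and false negative rates per group. The function name, the record layout, and the toy numbers are all made up for illustration; this is not ProPublica’s code or data, just the basic bookkeeping behind that kind of comparison.

```python
from collections import defaultdict

def error_rates_by_group(records):
    """records: iterable of (group, predicted_high_risk, reoffended) tuples.
    Returns {group: {"fpr": ..., "fnr": ...}} (None when a rate is undefined)."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, predicted, actual in records:
        if predicted and actual:
            counts[group]["tp"] += 1
        elif predicted and not actual:
            counts[group]["fp"] += 1   # flagged high risk, did not reoffend
        elif not predicted and actual:
            counts[group]["fn"] += 1   # flagged low risk, did reoffend
        else:
            counts[group]["tn"] += 1
    rates = {}
    for group, c in counts.items():
        negatives = c["fp"] + c["tn"]  # people who did not reoffend
        positives = c["fn"] + c["tp"]  # people who did reoffend
        rates[group] = {
            "fpr": c["fp"] / negatives if negatives else None,
            "fnr": c["fn"] / positives if positives else None,
        }
    return rates

# Hypothetical toy records (invented numbers, NOT real COMPAS data).
toy = (
    [("A", True, False)] * 4 + [("A", False, False)] * 6
    + [("A", True, True)] * 8 + [("A", False, True)] * 2
    + [("B", True, False)] * 2 + [("B", False, False)] * 8
    + [("B", True, True)] * 5 + [("B", False, True)] * 5
)
rates = error_rates_by_group(toy)
for group in sorted(rates):
    print(group, rates[group])
```

In this toy data, group A has the higher false positive rate and group B the higher false negative rate, which is the shape of disparity described above. Which rates count as “fair” is a separate question that the arithmetic alone cannot settle.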

So, do we throw out recidivism modeling altogether? After all, judges by themselves are also racist; a model such as the COMPAS model might actually be improving the situation. Then again, it might be making it worse. We simply don’t know without a monitor in place. (So, let’s get some monitors in place, people! Let’s see some academic work in this area!)

I’ve heard people call for removing recidivism models altogether, but honestly I think that’s too simple. I think we should instead have a discussion on what they show, why they’re used the way they are, and how they can be improved to help people.

So, if we’re seeing way more black (men) with high recidivism risk scores, we need to ask ourselves: why are black men deemed so much more likely to return to jail? Is it because they’re generally poorer and don’t have the start-up funds necessary to start a new life? Or don’t have job opportunities when they get out of prison? Or because their families and friends don’t have a place for them to stay? Or because the cops are more likely to re-arrest them because they live in poor neighborhoods or are homeless? Or because the model’s design itself is flawed? In short, what are we measuring when we build recidivism scores?

Second, why are recidivism risk models used to further punish people who are already so disadvantaged? What is it about our punitive and vengeful justice system that makes us punish people in advance for crimes they have not yet committed? It keeps them away from society even longer and further casts them into a cycle of crime and poverty. If our goal were to permanently brand and isolate a criminal class, we couldn’t look for a better tool. We need to do better.

Next, how can we retool recidivism models to help people rather than harm them? We could use the scores to figure out who needs resources the most in order to stay out of trouble after release, and to build evidence that we need to help people who leave jail rebuild their lives. How much do investments in education inside prison help people land a job once they get out? Do states that make it hard for employers to discriminate based on prior convictions – or for that matter on race – see better results for recently released prisoners? To what extent does “broken windows policing” in a neighborhood affect the recidivism rates for its inhabitants? These are all questions we need to answer, but we cannot answer them without data. So let’s collect the data.

—

Back to the question: when is AI appropriate? I’d argue that building AI is almost never inappropriate in itself, but interpreting results of AI decision-making is incredibly complicated and can be destructive or constructive, depending on how well it is carried out.

And, as was discussed at the meeting, most data scientists and engineers have little or no training in thinking about this stuff beyond optimization techniques and when to use linear versus logistic regression. That’s a huge problem, because part of AI – a big part – is the assumption that AI can solve every problem in essentially the same way. AI teams are, generally speaking, homogeneous in gender, class, and often race, and that monoculture gives rise to massive misunderstandings and narrow ways of thinking.

The short version of my answer is, AI can be made appropriate if it’s thoughtfully done, but most AI shops are not set up to be at all thoughtful about how it’s done. So maybe, at the end of the day, AI really is inappropriate, at least for now, and until we figure out how to involve more people and have a more principled discussion about what it is we’re really measuring with AI.

Comments (10)

ax42

July 11, 2016 at 9:53 am

Kahneman has added to this conversation recently, and he comes out strongly in favour of AI in a corporate setting.

http://knowledge.wharton.upenn.edu/article/nobel-winner-daniel-kahnemans-strategy-firm-can-think-smarter/

- Guest2
  
  July 11, 2016 at 10:25 pm
  
  “The indications from the research are unequivocal,” he said. “When it comes to decision-making, algorithms are superior to people.” “Algorithms are noise-free. People are not,” he said. “When you put some data in front of an algorithm, you will always get the same response at the other end.”
  
  Of all the crap I have heard Nobelists say, this has got to be the dumbest. Whatever Kahneman is peddling, he just stepped into it, big time. So, what cognitive sinkhole just swallowed him, I wonder. Any respect I had for the man instantly evaporated when I saw this. Too bad. I liked his approach; too bad he’s not able to apply it to himself!
  
  - Guest2
    
    July 13, 2016 at 7:39 pm
    
    Maybe Kahneman should read this series.
    
    https://www.propublica.org/series/machine-bias
    
goldenoj

July 11, 2016 at 10:13 am

Thanks for sharing this. I’m consistently amazed how people feel that AI (in this sense) is fairer than a person or group of people. It’s the carving in stone of one person or team’s ideas of what a good decision is. The microwave programmer who has decided what it means for your food to be hot, or the recidivism risk programmers who have decided whom it is acceptable to release. (BTW, what about a predictive model to support intervention before someone enters the crime-jail cycle?) We need to monitor these programs even more closely than people, because the mistakes they make aren’t ever one-time affairs; they will accumulate.

rob

July 11, 2016 at 12:19 pm

Are we modelling only the victims of racism or oppression? Model judges’ and their judgments’ racism. And cops, too. And while we’re at it, the abuses of finance and of heads of states. If the results match our intuitions, ask how those models get there. If they don’t match, let’s look at where models go wrong.

kevin

July 12, 2016 at 11:56 am

On the question of recidivism algorithms, I think this old paper remains interesting: http://islandia.law.yale.edu/ayres/A%20Market%20Test%20for%20Race%20Discrimination%20in%20Bail%20Setting.pdf. They find that courts seem to set bail too high for minority cases, but there’s another market to un-pick that, so we have competing systems with different degrees of model-vs-judgment.

CLB

July 12, 2016 at 11:58 am

“Up until now it seems like wherever I go it’s taken as a given that algorithms would be an improvement over human decision-making.”

That’s because in comparisons of actuarial/algorithmic versus clinical/expert judgement, the actuarial/algorithmic model wins overwhelmingly. Sometimes human judgement ties.
See: http://meehl.umn.edu/sites/g/files/pua1696/f/138cstixdawesfaustmeehl.pdf

Chaya Cooper

July 12, 2016 at 3:06 pm

Reblogged this on Chaya Cooper's Blog.

Neel Krishnaswami

July 13, 2016 at 5:02 am

Dear Cathy,

The EU’s new data protection regulation (to come into effect in 2018) adds a “right to explanation” for automated processes which significantly impact users. This is likely to make many widely used big data/machine learning techniques illegal. I’m very happy about this, not just as a citizen, but also as a computer scientist, since it is also likely to drive a lot of research effort into techniques which can produce proper explanations. I became aware of this via Bryce Goodman and Seth Flaxman’s workshop paper European Union regulations on algorithmic decision-making and a “right to explanation”.

One thing that does need to be sorted out (i.e., we need to have a big political debate about it) is how to implement nondiscrimination. This is really hard because structural injustices in society mean that decision-relevant variables are often also correlated with protected characteristics.

For example, before I went to grad school, I worked at a startup that built models to forecast home values, with the idea being that banks could avoid making bad loans by using our software. (As the great recession shows, we were not enormously successful at this.) Now, obviously, a property’s location matters for the purposes of estimating its value. But since America is highly segregated by race, a property’s location also gives you a lot of information about the race of its owner. So if you’re not careful it’s very easy to end up doing “algorithmic redlining”.
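A rough way to see the proxy problem is to ask how well location alone lets you guess the protected attribute, even when that attribute is never given to the model. The sketch below is only an illustration under invented assumptions: the helper name `proxy_accuracy`, the zip labels, and the counts are all hypothetical, not anything from the startup’s actual models.

```python
from collections import Counter, defaultdict

def proxy_accuracy(rows):
    """rows: iterable of (location, protected_attribute) pairs.
    Returns the fraction of rows whose protected attribute is guessed
    correctly by predicting the most common value at each location."""
    by_location = defaultdict(Counter)
    for location, attr in rows:
        by_location[location][attr] += 1
    # For each location, the majority guess is right for the majority count.
    correct = sum(c.most_common(1)[0][1] for c in by_location.values())
    total = sum(sum(c.values()) for c in by_location.values())
    return correct / total

# Hypothetical, highly segregated toy data (invented numbers).
toy_rows = (
    [("zip_1", "X")] * 9 + [("zip_1", "Y")] * 1
    + [("zip_2", "X")] * 2 + [("zip_2", "Y")] * 8
)

# Baseline: always guess the overall most common value, ignoring location.
baseline = max(Counter(attr for _, attr in toy_rows).values()) / len(toy_rows)
leak = proxy_accuracy(toy_rows)
print(f"baseline={baseline:.2f} with_location={leak:.2f}")
```

The wider the gap between the two numbers, the more a “race-blind” model that uses location can still behave as if it knew race. Dropping the protected column does not remove that information.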

We tried a lot of different things but it’s not something we were ever really 100% satisfied with — the real solution would have been desegregation, which is a disruptive innovation no startup can manage.

- Josh
  
  July 13, 2016 at 10:11 am
  
  I’m curious whether an EU law expert can clarify what will constitute an explanation deemed to satisfy this right. At this point, perhaps it is unknowable? If so, how will it be resolved?
  
  In the linked paper, the authors implicitly draw a line in one place, but other lines seem potentially consistent with the wording of the regulation:
  (1) more restrictive: the model has to identify a causal mechanism linking the input variables and the prediction. The authors gloss over this because…no data science model passes this test.
  (2) less restrictive: it might be considered sufficient for the data controller to simply explain which factors were input into the model and the form of the output. For example, “your income and utility bill payment history were used in the model that generated a percentage value which we interpret as the probability you will repay a loan.”
  