
Can we use data analysis to make policing less racist?

April 19, 2016

A couple of weeks ago there was a kerfuffle at Columbia, written up in the Columbia Spectator by Julie Chien. A machine learning course, taught in the CS department by Professor Satyen Kale, included an assignment asking students to “Help design RoboCop!” using Stop and Frisk data.

The title was ill-chosen. Kale meant it to be satirical, but his actual wording of the assignment didn’t make that clear at all, which is of course the danger with satire. Given the culture of CS, people misinterpreted and were outraged by it. This eventually led an organized group of students called ColorCode to issue a statement protesting the assignment; Kale then issued an apology, after which ColorCode issued a second statement.

I’m really glad this conversation is finally happening, even if the assignment was a disaster. I’ve been saying for years that the CS department at Columbia, like many CS departments everywhere, has an obligation to teach and think about the ethics of machine learning as well as the mathematical techniques. And although this was an awkward way to get it started, it’s absolutely critical that it gets done. Machine learning algorithms are not objective, because the data going into them are historical artifacts of racist police practices.

In other words, we need to revive this topic, and do it right. If I were teaching data science or machine learning at Columbia, I’d want to spend a week on the Stop, Question and Frisk data, which by the way is located here; I’ve been playing around with it for a few days now and it’s really not too hard to look into.

What do I think we could accomplish? Well, here’s something I read yesterday that might be expanded upon. Namely, a paper by Sharad Goel, Maya Perelman, Ravi Shroff, and David Alan Sklansky entitled Combatting Police Discrimination in the Age of Big Data.

The idea behind this paper, and a related project housed at Stanford, is to use the Stop and Frisk data in order to:

  1. gather statistical evidence that the Stop and Frisk practices were racist, by showing, for example, that the “hit rate” of finding a weapon was much lower for blacks than for whites, even in “high crime” neighborhoods, and
  2. develop simple algorithms that the police themselves could use to determine whether their individual biases were overstating the suspiciousness of a given person in a given situation. In other words, it’s an algorithm that is meant to help officers become less racist.

One of the best things about this article is the historical context it gives about the extent to which “reasonable suspicion” is a statistical construction. Judges have been inconsistent with this idea, but there might be an emerging understanding of whether, and in what contexts, it’s considered OK to stop and frisk someone given that the chance you’ll find a weapon is 1% or less.

Personally, I’m not sure it makes sense to equip police with an algorithm to be used in real time. There are obvious issues around gaming such a model, or otherwise learning to evade undesired outcomes. Another way of implementing it, which I think might be more promising, would be at the precinct level. Imagine looking into certain types of stops and frisks and noting when the hit rate is too low to warrant the imposition, which would (ideally) change the rules of stop and frisk themselves.

In other words, although I am excited about the idea of using data to track and help prevent racist practices, I don’t think we know exactly what that would look like in practice. But it’s something we desperately need to start thinking about. Let’s have the conversation!

  1. JSE
    April 19, 2016 at 11:37 am

    The irony here is that the actual story of RoboCop is pretty optimistic, proposing that a partnership of human and machine can, in the end, be a powerful force for justice, so long as the human doesn’t allow their ethics to be subjugated to the machine’s built-in priorities.


    • OCP
      April 24, 2016 at 1:49 pm

      “Optimistic” is maybe the strangest reading of RoboCop I’ve ever heard. Gleefully pessimistic is more like it. Verhoeven has said that his movie is a more-or-less explicit Christ story, except that in the original, after rising from the dead, Jesus ascends to heaven. In this version, in spite of his “win,” he’s still caged in a rusting shell owned by a company that’s still pretty evil. Powerful force for justice, sure, but one that’s completely unethical to create.


  2. Guest2
    April 19, 2016 at 7:17 pm

    I am troubled by the lack of attention to the key variable and how it is constructed moment by moment. Certainly race plays a role, but race itself is constructed locally, situation by situation. It is never a dependent variable without such construction; yet, in all this discussion, the means of its construction are hidden. This is a problem for me, since when we are ignorant of how race is socially and organizationally constructed, we lack the means to address how power is unequally distributed.


    • Guest2
      April 20, 2016 at 10:48 pm

      The politicization and media-izing of police violence detracts attention from the other ways our social and cultural institutions are racist. Why limit critique to what corporate media outlets tell us? Violence and discrimination are widespread.


  3. Guest2
    April 19, 2016 at 7:29 pm

    For example, here is a new patent on data collection — on a massive scale — and race does not appear here at all. It could very well be that other variables “explain” behavior — variables that can be more clearly defined.

    Click to access 20160086262.pdf

