How to fix recidivism risk models

January 5, 2017

Yesterday I wrote a post about the unsurprising discriminatory nature of recidivism models. Today I want to add to that post with an important goal in mind: we should fix recidivism models, not trash them altogether.

The truth is, the current justice system is fundamentally unfair, so throwing out algorithms because they are also unfair is not a solution. Instead, let’s improve the algorithms and then see if judges are using them at all.

The great news is that the paper I mentioned yesterday has three methods to do just that, and in fact there are plenty of papers that address this question with various approaches that get increasingly encouraging results. Here are brief descriptions of the three approaches from the paper:

  1. Massaging the training data. In this approach the training data is adjusted so that it carries less bias. In particular, the label is switched from + to - (that is, from the good outcome to the bad outcome) for some people in the preferred population, and from - to + for some people in the discriminated population. The paper explains how to choose these switches carefully (in the presence of continuous scorings with thresholds).
  2. Reweighing the training data. The idea here is that certain kinds of models let you attach a weight to each training example, and with a carefully chosen weighting system you can adjust for bias.
  3. Sampling the training data. This is similar to reweighing, except that the weights are nonnegative integers: each training example is dropped, kept once, or duplicated.
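
To make the reweighing idea concrete, here is a minimal sketch in Python. The post does not spell out how the weights are chosen, so the formula below is an assumption: it is one common choice, which weights each (group, label) combination by the frequency you would expect if group and label were independent, divided by the frequency actually observed. The group names `A` and `B`, the function names, and the toy data are all made up for illustration.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Return one weight per training row.

    Assumed formula for a row with group s and label c:
        count(s) * count(c) / (n * count(s, c))
    i.e. expected frequency under independence over observed frequency.
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    pair_counts = Counter(zip(groups, labels))
    return [
        group_counts[s] * label_counts[c] / (n * pair_counts[(s, c)])
        for s, c in zip(groups, labels)
    ]


def weighted_positive_rate(groups, labels, weights, group):
    """Weighted share of '+' labels within one group."""
    total = sum(w for s, w in zip(groups, weights) if s == group)
    positive = sum(
        w for s, c, w in zip(groups, labels, weights) if s == group and c == "+"
    )
    return positive / total


# Toy data (hypothetical): 'A' is the preferred group, 'B' the discriminated one.
# Unweighted, 4 of 5 people in A get '+', but only 2 of 5 in B.
groups = ["A"] * 5 + ["B"] * 5
labels = ["+", "+", "+", "+", "-", "+", "+", "-", "-", "-"]

weights = reweighing_weights(groups, labels)
rate_a = weighted_positive_rate(groups, labels, weights, "A")
rate_b = weighted_positive_rate(groups, labels, weights, "B")
print(rate_a, rate_b)  # after weighting, both groups have the same '+' rate
```

With these weights, over-represented combinations such as (A, +) count for less than one row and under-represented ones such as (A, -) count for more, so the weighted data shows no gap between the groups. Notice that the weights depend on group membership only in the training data; a model trained with them does not need to be told the group of the person it later scores.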

In all of these examples, the training data is “preprocessed” so that you can train a model on “unbiased” data, and importantly, at the time of usage, you will not need to know the protected status (race, for example) of the individual you’re scoring. This is, I understand, a legally critical assumption, since there are anti-discrimination laws which forbid you to “consider” someone’s race when deciding whether to hire them, and so on.

In other words, we’re constrained by anti-discrimination law to not use all the information that might help us avoid discrimination. This constraint, generally speaking, prevents us from doing as good a job as possible.


  1. We might not think that we need to “remove all the discrimination.” Maybe we stratify the data by violent crime convictions first, and then within each resulting bin we work to remove discrimination.
  2. We might also use the racial and class discrepancies in recidivism risk rates as an opportunity to experiment with interventions that might lower those discrepancies. In other words, why are there discrepancies, and what can we do to diminish them?
  3. To be clear, I do not claim that this is a trivial process. It will in fact require lots of conversations about the nature of justice and the goals of sentencing. Those are conversations we should have.
  4. Moreover, there’s the question of balancing the conflicting goals of various stakeholders, which makes this an even more complicated ethical question.
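
The stratification idea in the first remark above can be sketched as follows. This is only an illustration under made-up assumptions: the strata names, the groups `A` and `B`, the toy rows, and the function `gap_by_stratum` are all hypothetical, and the code measures the gap in good outcomes within each bin rather than removing it.

```python
from collections import defaultdict


def gap_by_stratum(rows, preferred, discriminated):
    """rows: iterable of (stratum, group, label), with label '+' or '-'.

    Returns {stratum: '+' rate of preferred group - '+' rate of discriminated group},
    skipping any stratum in which one of the two groups is absent.
    """
    counts = defaultdict(lambda: [0, 0])  # (stratum, group) -> [positives, total]
    for stratum, group, label in rows:
        cell = counts[(stratum, group)]
        cell[0] += 1 if label == "+" else 0
        cell[1] += 1

    gaps = {}
    for stratum in sorted({s for s, _ in counts}):
        p_pos, p_tot = counts[(stratum, preferred)]
        d_pos, d_tot = counts[(stratum, discriminated)]
        if p_tot and d_tot:
            gaps[stratum] = p_pos / p_tot - d_pos / d_tot
    return gaps


# Toy data (hypothetical), binned by whether there is a violent crime conviction.
rows = [
    ("none", "A", "+"), ("none", "A", "+"), ("none", "A", "+"), ("none", "A", "-"),
    ("none", "B", "+"), ("none", "B", "-"),
    ("violent", "A", "+"), ("violent", "A", "-"),
    ("violent", "B", "+"), ("violent", "B", "-"), ("violent", "B", "-"), ("violent", "B", "-"),
]

gaps = gap_by_stratum(rows, preferred="A", discriminated="B")
for stratum, gap in gaps.items():
    print(stratum, gap)
```

In this toy data the overall gap between the groups (4 of 6 versus 2 of 6) is larger than the gap inside either bin, because the groups are distributed differently across the bins. Deciding which part of the gap counts as discrimination to be removed is exactly the kind of question the remarks above are about.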
  1. January 5, 2017 at 12:43 pm

    With cameras on every corner and even watching us in our homes (since cameras linked to the internet can be easily hacked), this should be easy.

    Facial recognition software would link up each individual with their cradle-to-grave profile that corporations have been building on each of us.

    If the algorithm takes that cradle-to-grave profile into account during the profiling, then educated individuals, lifelong learners who are also avid readers, without a criminal record going back to birth, would be profiled with a very low risk factor for criminal activities as they walk down a street from camera to camera.

    But individuals with a long rap sheet who are not lifelong learners and who never checked a book out of a library in their lives would end up being tagged, and an undercover cop would be sent to follow them everywhere they went.

    If there aren’t enough cops, then flying drones could be used to follow the high-risk individuals as they go about their day. I’m talking about drones that are the size of a fly that would follow them into their house, into their bedrooms, into their bathrooms, etc.


  2. Roger Joseph Witte
    January 5, 2017 at 5:21 pm

    Perhaps we have the right models being used for the wrong purpose. Suppose, for example, we were using the models during sentencing to estimate the effects of various sentences on recidivism risk (not trying to rate recidivism independent of sentence), or using them later to target rehabilitation services. Not every game needs to be zero sum: if you craft solutions that benefit everyone involved, without losers, then it is less important that some benefit more than others.

