Two clarifications
First, I think I over-reacted to automated pricing models (thanks to my buddy Ernie Davis, who made me think harder about this). I don’t think immediate reaction to price changes is necessarily odious. It does change the dynamics of price optimization in weird ways, but upon reflection I don’t see how that would necessarily be bad for the general consumer, beyond the fact that Amazon will sometimes have weird disruptions, much like the flash crashes we’ve gotten used to on Wall Street.
Second, on the question of “accuracy versus discrimination”: I’ve now read the research paper that I believe is under consideration, and it’s more nuanced than my recent blog posts would suggest (thanks to Solon Barocas for help on this one).
In particular, the 2011 paper I referred to defines discrimination crudely, whereas this new article allows for different “base rates” of recidivism. To see the difference, consider a model that assigns a high risk score 70% of the time to blacks and 50% of the time to whites. Assume that, as a group, blacks recidivate at a 70% rate and whites at a 50% rate. The 2011 paper would define this as discriminatory, but the newer paper refers to it as “well calibrated.”
Then the question the article tackles is: can you simultaneously ask for a model to be well-calibrated, to have equal false positive rates for blacks and whites, and to have equal false negative rates? The answer is no, at least not unless the two groups have equal “base rates” or the predictor is perfect.
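Here’s a minimal numeric sketch of why those constraints collide. The two score levels (0.2 and 0.8) and the base rates are invented for illustration; nothing here is taken from the paper itself.

```python
# A toy check: within each group the score is perfectly calibrated
# (people scored 0.8 recidivate 80% of the time, people scored 0.2
# recidivate 20% of the time), but the groups have different base rates.
# The specific numbers are invented for illustration only.

def error_rates(base_rate, lo=0.2, hi=0.8):
    """Mix the two score levels so the group average equals its base rate,
    then treat the high score as 'flagged high risk' and compute error rates."""
    p_hi = (base_rate - lo) / (hi - lo)        # fraction flagged high risk
    p_lo = 1 - p_hi
    fpr = p_hi * (1 - hi) / (1 - base_rate)    # flagged, but didn't recidivate
    fnr = p_lo * lo / base_rate                # not flagged, but did recidivate
    return fpr, fnr

for group, base in [("group A", 0.7), ("group B", 0.5)]:
    fpr, fnr = error_rates(base)
    print(f"{group}: base rate {base:.0%}, FPR {fpr:.1%}, FNR {fnr:.1%}")

# group A: base rate 70%, FPR 55.6%, FNR 4.8%
# group B: base rate 50%, FPR 20.0%, FNR 20.0%
# Both scores are calibrated, yet the error rates diverge; they only line up
# when the base rates match (or the predictor is perfect).
```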
Some comments:
- This is still unsurprising. The three conditions above are mathematical constraints, and there’s no reason to expect that a bunch of genuinely different constraints can all be satisfied at once. The authors do the math and show that this intuition is correct.
- Many of my comments still hold. The most important one is the question of why the base rates for blacks and whites are so different. If it’s because of police practice, at least in part, or overall increased surveillance of black communities, then I’d argue “well-calibrated” is insufficient.
- We need to be putting the science into data science and examining questions like this. In other words, we cannot assume the data is somehow fixed in stone. All of this is a social construct.
This question has real urgency, by the way. New York Governor Cuomo yesterday announced the introduction of recidivism risk scoring systems to modernize bail hearings. This could be great if fewer people waste time in jail pending their hearings or trials, but if the people kept in jail are chosen on the basis that they’re poor or minority or both, that’s a problem.
“The answer is no, at least not in the presence of equal “base rates” or a perfect predictor.”
confuses me. Could it be that it’s the other way round and it should say absence?
Shit. Thanks.
Why don’t they factor the poverty rate, racism, bias, and segregation into that algorithm? And instead of throwing people in jail and/or denying parole based on recidivism rates by race, while doing nothing to deal with the causes of poverty, why not spend more money on early childhood education for preschool children who live in poverty and on literacy programs for all ages?
For instance, when a prison inmate is illiterate, make it mandatory that they cannot get out of jail until they are literate at a 9th-grade level or higher, provide classes to teach them to read, and make sure every prison has a well-stocked library.
The United States has the highest poverty rate among developed nations. 20 percent of children in the U.S., or 14.7 million, lived in poverty in 2013, according to a Pew Research study. … During this period, the poverty rate declined for Hispanic, white and Asian children. Among black children, however, the rate held steady at about 38 percent.
If the U.S. was serious about lowering crime rates, it would focus on what works to reduce poverty.
One example is what France did more than thirty years ago, when it funded and launched a national high-quality early childhood education system within its public school system. France did not turn early childhood education over to autocratic, for-profit corporations. Instead, France required the public school teachers who taught in its early childhood education system to be highly qualified, trained, and supported.
Thirty years later, poverty in France dropped from about 15 percent when the program was created to less than 7 percent.
Cuomo wants to reform the criminal justice system so as to ensure “access to a speedy trial”? That’s great!
Of course, on New Year’s Eve he vetoed an act to properly pay for underfunded public defenders, which passed the state Senate unanimously back in June. So basically he wants to just pipeline more people directly to prison.
http://www.politico.com/states/new-york/albany/story/2016/12/citing-cost-cuomo-vetoes-indigent-legal-defense-bill-108386
You’re doing a real public service raising these issues, as is the Politico team. I think the concerns go beyond your three bullet points, however. In particular, the second one could be a stronger claim.
With a “well-calibrated” recidivism model (or “race blind,” to use the terminology from http://papers.nips.cc/paper/6374-equality-of-opportunity-in-supervised-learning.pdf) that is equally accurate for two groups (one with a higher recidivism rate than the other), the prison population will not reflect the crimes actually committed, but the crime *rates* in the two communities.
Take a city with a population of 10N and a small housing estate with a population of N. Let the crime rates be 1:10, so an equal number of crimes is committed in the city as in the estate. If crimes are local, recidivism rates will match crime rates.
Each year there will be n crimes in the city (committed by city-dwellers) and the same number of crimes committed in the estate. So each year there is an intake of n city-criminals and n estate-criminals. Give all crimes a sentence of 2 years.
At the end of one year, give early release to everyone with a low recidivism score, which under a well-calibrated model will be all the city-dwellers. So the prison population goes to 0 city-criminals and n estate-criminals. After a new intake (this is a discrete model) the population goes to n city-criminals and 2n estate-criminals. So the prison population bounces between (0, n) and (n, 2n), even though identical numbers of city-criminals and estate-criminals are committing crimes. The prison population reflects not the crimes committed, but the crime rates.
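Here is a minimal simulation of that toy model. The function name, the 4-year horizon, and n = 100 are my own choices; the rules (2-year sentences, early release for everyone scored low risk, i.e. all city-dwellers) are the ones above.

```python
# Toy simulation of the model above: everyone gets a 2-year sentence,
# city-dwellers (low risk score) are released after 1 year, estate-dwellers
# serve the full term. The horizon and n are arbitrary choices.

def simulate(yearly_intake, years=4):
    """yearly_intake = (city, estate) admissions per year.
    Prints the (city, estate) prison population after each intake
    and after each round of releases."""
    cohorts = []  # cohorts[0] = this year's admissions still inside
    for year in range(years):
        cohorts.insert(0, list(yearly_intake))        # new intake arrives
        after_intake = tuple(map(sum, zip(*cohorts)))
        for c in cohorts:
            c[0] = 0             # early release for all city-dwellers
        cohorts = cohorts[:1]    # older cohorts have served the full 2 years
        after_release = tuple(map(sum, zip(*cohorts)))
        print(f"year {year}: after intake {after_intake}, after release {after_release}")

n = 100
simulate((n, n))
# year 0: after intake (100, 100), after release (0, 100)
# year 1: after intake (100, 200), after release (0, 100)
# ...and it stays in the (n, 2n) / (0, n) bounce described above.
```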
Now introduce a predictive policing model. As half the crimes take place in the estate, a well-calibrated model would send half the police there (so the police-per-population rate is 10x that of the city). Let’s say arrest rates increase by 10% for every extra unit of police-per-population.
Then if there are n crimes in the city and n crimes in the housing estate, the ratio of convicted criminals would be 1:2 (city:estate). Pass this through the prison system with the same recidivism software and we now have a prison population bouncing between (0, 2n) and (n, 4n), even though equal numbers of crimes (1:1) are committed in the two places.
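Plugging that policing-skewed intake into the same sketch (again, my toy numbers, not real data):

```python
# Same simulation, now with the predictive-policing intake of n city-criminals
# and 2n estate-criminals per year.
simulate((n, 2 * n))
# year 0: after intake (100, 200), after release (0, 200)
# year 1: after intake (100, 400), after release (0, 200)
# ...the (n, 4n) / (0, 2n) bounce, even though equal numbers of crimes are
# committed in the city and in the estate.
```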
So “well-calibrated” algorithms reinforce each other and produce an outcome (prison population) that increasingly reflects rates rather than numbers.
I haven’t seen much work on ecosystems of algorithms, which are likely to magnify the problems created by single algorithms. Your discussion in WMD is the closest I’ve seen and maybe Barocas gets to it as well.
A few other thoughts:
– the “fairness through awareness” program (which these new papers follow on from) requires collecting data on all protected attributes (so as to be fair about those attributes). I haven’t seen anyone talk about what happens when you try to protect multiple attributes at once: I suspect the loss in efficiency will increase and there will be more resistance to protecting them.
– there is a line in your book about justice, which is different from fairness. The whole point of justice being blind is that we must narrow our focus and exclude from consideration all these correlated attributes and factors. The response to charges of bias against WMDs seems to be “give us more data, more computers, better algorithms, and we can fix this.” But the question of justice seems to argue more directly against the whole notion of algorithmic governance. In fact, the better the algorithm, the more tempting it is to use it, but it will still violate principles of (blind) justice.
– the protected attributes list (race, gender, sexual orientation, disability…) is getting longer, with good reason. In an ideal world it would be longer still. A world of big data running “fairness through awareness” should be able to identify new attributes we could protect against (weight? eye shape? height? posture?). And if it did, would a “fair” algorithm (or “equalized odds,” in the language of Hardt and others in the paper I link to above) lose all its utility? That would be ironic.
I’m going to have to think about the model you have whereby a risk score is simply the reflection of the crime rate – I’m not sure it’s so simple but then again I’m not sure it isn’t.
Update: of course it is, actually. That’s what “well-calibrated” really means: we’re happy to let risk scores reflect the underlying arrest rates of the populations. And to the extent that neighborhoods are segregated by race, which they largely are, this is exactly what’s happening.
In any case, I agree with you that the most pernicious aspect of using recidivism risk algorithms in the first place is the very real possibility of creating feedback loops, especially in combination with predictive policing.
I keep coming back to the point that there are all these white-collar crimes that never get found, never have arrests attached to them, and are therefore invisible to all of this.
One point I’d make, though, is that it’s far better to talk about “arrest rate” than crime rate, for that exact reason. Equating the two is one of the big problems in this area.
I am sure the actual risk score is much more subtle, but for a toy model like this, let’s say each district (city, estate) has a homogeneous population and the individuals committing crimes are randomly selected from that population. Then the score from a well-calibrated model would be low for city-dwellers and high for estate-dwellers (the best description of the problem I have seen is this anonymous Google document: https://docs.google.com/document/d/1pKtyl8XmJH7Z09lxkb70n6fa2Fiitd7ydbxgCT_wCXs/edit?pref=2&pli=1 ).
The points you make about biases outside the model (like types of crimes) are essential. I’m just wondering how far we can go inside the model. I may, of course, have missed something obvious — it’s worth what you pay for it.
Price optimization requires communications channel isolation.
The idea of feedback loops reinforcing current inequalities is very interesting. What if predictive social services brought more people into contact with social workers and other support networks, and predictive servicing got more help to those most in need? Would this then create a positive feedback loop?
Or, more simply, if these algorithms were used to target people for additional assistance, rather than increased punishment, wouldn’t that resolve a lot of the ethical concerns here?
yes.