Uncollected Criminal Justice Data
This morning I was happy to stumble upon a new whitepaper put out on the Data & Civil Rights webpage entitled Open Data, the Criminal Justice System, and the Police Data Initiative and written by Robyn Caplan, Alex Rosenblat, and danah boyd.
The content concerns the White House initiative, which I am tangentially part of, to encourage police departments to “open up” more of their data. Ideally that would mean more information on crime rates, even though such data is often unreliable, because police departments are assessed on the basis of violent crime rates. Even more aspirationally, that would mean better data on how police officers and citizens interact on a daily basis.
But here’s the thing. You can’t open up data that you don’t collect. And most precincts don’t collect that level of data. That’s my biggest takeaway from the whitepaper, and it was also the theme of a talk I gave a couple of weeks ago at an “open data” conference.
In other words, we are starting too far downstream. When we ask police departments to “open up” their data, we are assuming they collect the data we want. But they only collect the data that makes them look efficient or successful. Other data collection efforts have failed because they are entirely voluntary.
So, it’s pretty well known that we don’t have a high-quality national register of fatal police shootings, and that the Guardian has a better one. But the problems don’t end there. Generally speaking, we also don’t know whether the public of a given precinct trusts their cops. That’s also uncollected data. We also have little information on what the conditions are for people who have been arrested.
Here’s what I’d like to see: high-quality data on the conditions at Rikers, beyond the surveillance video that the public has no access to. I volunteer to do the data analysis for free. I’m not holding my breath, though: they cannot even be trusted to count inmate fights.
Besides the data that police departments don’t collect, there is the problem of police reports that downgrade the crimes involved, as with the fall in NYC crime rates, orchestrated by deliberate downward classification of the underlying events.
On uncollected data, a human rights organization took to building a data warehouse on the disappearances in Argentina. The military never recorded what it did. But personnel assignments correlated with individual disappearances. The results were stunning. The military got amnesty, but the generals were unable to collect their retirement pay. Their individual culpability became obvious.
Every event can be recorded from various perspectives, even retrospectively. Lights can be turned on even in the oldest of events.
I think they do collect the data but maybe don’t want it public? In my experience as an elected official, when budget time comes around, both police and fire departments put out numbers to justify budget increases. They always want MORE public $$$. I was always asking if there was any way they could trim the budgets, but no: the unions will not let that happen, and the old fear factor of public safety amid lots of crime is a great incentive to raise taxes.
In another context, you never keep documents or data that you don’t want to be cross-examined on. If it is not there, it can’t be discovered. I would expect the same applies in any adversarial context.
The key is data availability and data quality. You have to ensure that the same type of data is collected using the same approach. Of course, there must be no incentive to corrupt the data. In addition, there need to be data quality and plausibility checks, as manually collected data is by definition flawed. Then you have a solid basis for reporting and analysis.