There’s a CNN video news story explaining how the NYC Mayor’s Office of Data Analytics is working with private start-up Placemeter to count and categorize New Yorkers, often with the help of private citizens who install cameras in their windows. Here’s a screenshot from the Placemeter website:
You should watch the video and decide for yourself whether this is a good idea.
Personally, it disturbs me, but perhaps because of my priors on how much we can trust other people with our data, especially when it’s in private hands.
To be more precise, there is, in my opinion, a contradiction coming from the Placemeter representatives. On the one hand they try to make us feel safe by saying that, after gleaning a body count from their video footage, they dump the data. But then they turn around and say that, in addition to counting people, they will also categorize people: gender, age, whether they are carrying a shopping bag or pushing a stroller.
That’s what they are talking about anyway, but who knows what else? Race? Weight? Will they use face recognition software? Who will they sell such information to? At some point, after mining videos enough, it might not matter if they delete the footage afterwards.
Since they are a private company I don’t think such information on their data methodologies will be accessible to us via Freedom of Information Laws either. Or, let me put that another way. I hope that MODA sets up their contract so that such information is accessible via FOIL requests.
I’m super excited about the recent “mood study” that was done on Facebook. It constitutes a great case study on data experimentation that I’ll use for my Lede Program class when it starts mid-July. It was first brought to my attention by one of my Lede Program students, Timothy Sandoval.
My friend Ernest Davis at NYU has a page of handy links to big data articles, and at the bottom (for now) there are a bunch of links about this experiment. For example, this one by Zeynep Tufekci does a great job outlining the issues, and this one by John Grohol burrows into the research methods. Oh, and here’s the original research article that’s upset everyone.
It’s got everything a case study should have: ethical dilemmas, questionable methodology, sociological implications, and questionable claims, not to mention a whole bunch of media attention and dissection.
By the way, if I sound gleeful, it’s partly because I know this kind of experiment happens on a daily basis at a place like Facebook or Google. What’s special about this experiment isn’t that it happened, but that we get to see the data. And the response to the critiques might be, sadly, that we never get another chance like this, so we have to grab the opportunity while we can.
There’s been a movement to make primary and secondary education run more like a business. Just this week in California, a lawsuit funded by Silicon Valley entrepreneur David Welch led to a judge finding that students’ constitutional rights were being compromised by the tenure system for teachers in California.
The thinking is that tenure removes the possibility of getting rid of bad teachers, and that bad teachers are what is causing the achievement gap between poor kids and well-off kids. So if we get rid of bad teachers, which is easier after removing tenure, then no child will be “left behind.”
The problem is, there’s little evidence that this very real achievement gap is caused by tenure, or even by teachers. So this is a huge waste of time.
As a thought experiment, let’s say we did away with tenure. This basically means that teachers could be fired at will, say through a bad teacher evaluation score.
An immediate consequence of this would be that many of the best teachers would get other jobs. You see, one of the appeals of teaching is getting a comfortable pension at retirement, but if you have no idea when you’re being dismissed, then it makes no sense to put in the 25 or 30 years to get that pension. Plus, what with all the crazy and random value-added teacher models out there, there’s no telling when your score will look accidentally bad one year and you’ll be summarily dismissed.
People with options and skills will seek other opportunities. After all, we wanted to make it more like a business, and that’s what happens when you remove incentives in business!
The problem is you’d still need teachers. So one possibility is to have teachers with middling salaries and no job security. That means lots of turnover among the better teachers as they get better offers. Another option is to pay teachers way more to offset the lack of security. Remember, the only reason teacher salaries have been low historically is that uber competent women like Laura Ingalls Wilder had no other options than being a teacher. I’m pretty sure I’d have been a teacher if I’d been born 150 years ago.
So we either have worse teachers or education doubles in price, both bad options. And, sadly, either way we aren’t actually addressing the underlying issue, which is that pesky achievement gap.
People who want to make schools more like businesses also enjoy measuring things, and one way they like measuring things is through standardized tests like achievement scores. They blame teachers for bad scores and they claim they’re being data-driven.
Here’s the thing though, if we want to be data-driven, let’s start to maybe blame poverty for bad scores instead:
I’m tempted to conclude that we should just go ahead and get rid of teacher tenure so we can wait a few years and still see no movement in the achievement gap. The problem with that approach is that we’ll see great teachers leave the profession and no progress on the actual root cause, which is very likely to be poverty and inequality, hopelessness and despair. Not sure we want to sacrifice a generation of students just to prove a point about causation.
On the other hand, given that David Welch has a lot of money and seems to be really excited by this fight, it looks like we might have no choice but to blame the teachers, get rid of their tenure, see a bunch of them leave, have a surprise teacher shortage, respond either by paying way more or reinstating tenure, and then, only then, finally gather the data showing that none of this has helped and that it has very possibly made things worse.
I’m too busy this morning for a real post but I thought I’d share a few things I’m reading today.
- Matt Stoller just came out with a long review of Timmy Geithner’s book: The Con Artist Wing of the Democratic Party. I like this because it explains some of the weird politics around, for example, the Mexican currency crisis that I only vaguely knew about.
- New York Magazine has a long profile of Stevie Cohen of SAC Capital insider trading fame: The Taming of the Trading Monster.
- The power of Google’s algorithms can make or break smaller websites: On the Future of Metafilter. See also How Google Is Killing The Best Site On The Internet.
- There is no such thing as a slut.
Here’s one recommendation related to discrimination:
Expand Technical Expertise to Stop Discrimination. The detailed personal profiles held about many consumers, combined with automated, algorithm-driven decision-making, could lead—intentionally or inadvertently—to discriminatory outcomes, or what some are already calling “digital redlining.” The federal government’s lead civil rights and consumer protection agencies should expand their technical expertise to be able to identify practices and outcomes facilitated by big data analytics that have a discriminatory impact on protected classes, and develop a plan for investigating and resolving violations of law.
First, I’m very glad this has been acknowledged as an issue; it’s a big step forward from the big data congressional subcommittee meeting I attended last year for example, where the private-data-for-services fallacy was leaned on heavily.
So yes, a great first step. However, the above recommendation is clearly insufficient to the task at hand.
It’s one thing to expand one’s expertise – and I’d be more than happy to be a consultant for any of the above civil rights and consumer protection agencies, by the way – but it’s quite another to expect those groups to be able to effectively measure discrimination, never mind combat it.
Why? It’s just too easy to hide discrimination: the models are proprietary, and some of them are not even apparent; we often don’t even know we’re being modeled. And although the report brings up discriminatory pricing practices, it ignores redlining and reverse-redlining issues, which are even harder to track. How do you know if you haven’t been made an offer?
Once they have the required expertise, we will need laws that allow institutions like the CFPB to deeply investigate these secret models, which means forcing companies like Larry Summers’s Lending Club to give access to them, where the definition of “access” is tricky. That’s not going to happen just because the CFPB asks nicely.
A fascinating and timely study just came out about the “Stand Your Ground” laws. It was written by Cheng Cheng and Mark Hoekstra, and is available as a pdf here, although I found out about it in a Reuters column written by Hoekstra. Here’s a longish but crucial excerpt from that column:
It is fitting that much of this debate has centered on Florida, which enacted its law in October of 2005. Florida provides a case study for this more general pattern. Homicide rates in Florida increased by 8 percent from the period prior to passing the law (2000-04) to the period after the law (2006-10). By comparison, national homicide rates fell by 6 percent over the same time period. This is a crude example, but it illustrates the more general pattern that exists in the homicide data published by the FBI.
The critical question for our research is whether this relative increase in homicide rates was caused by these laws. Several factors lead us to believe that laws are in fact responsible. First, the relative increase in homicide rates occurred in adopting states only after the laws were passed, not before. Moreover, there is no history of homicide rates in adopting states (like Florida) increasing relative to other states. In fact, the post-law increase in homicide rates in states like Florida was larger than any relative increase observed in the last 40 years. Put differently, there is no evidence that states like Florida just generally experience increases in homicide rates relative to other states, even when they don’t pass these laws.
We also find no evidence that the increase is due to other factors we observe, such as demographics, policing, economic conditions, and welfare spending. Our results remain the same when we control for these factors. Along similar lines, if some other factor were driving the increase in homicides, we’d expect to see similar increases in other crimes like larceny, motor vehicle theft and burglary. We do not. We find that the magnitude of the increase in homicide rates is sufficiently large that it is unlikely to be explained by chance.
In fact, there is substantial empirical evidence that these laws led to more deadly confrontations. Making it easier to kill people does result in more people getting killed.
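To make the Florida comparison concrete, here is a minimal sketch of the crude "difference in differences" arithmetic behind it, using only the two percentages quoted in the excerpt. This is not the authors' actual method, which is a regression across many states with controls; the function names are my own, and the log version is included only because the paper's graphs are in log homicide rates.

```python
import math

# Percent changes quoted in the excerpt (2000-04 versus 2006-10).
florida_change = 0.08    # Florida homicide rate rose 8 percent
national_change = -0.06  # national homicide rate fell 6 percent


def diff_in_diff_pct_points(treated, control):
    """Gap between the two percent changes, in percentage points."""
    return (treated - control) * 100


def diff_in_diff_log(treated, control):
    """Gap between the two changes measured on a log scale."""
    return math.log(1 + treated) - math.log(1 + control)


gap_points = diff_in_diff_pct_points(florida_change, national_change)
gap_log = diff_in_diff_log(florida_change, national_change)

print(f"Relative increase: {gap_points:.1f} percentage points")
print(f"Relative increase: {gap_log:.3f} log points")
```

So Florida's homicide rate ended up roughly 14 percentage points higher than it would have if it had simply followed the national trend. That assumption, that Florida would otherwise have tracked everyone else, is exactly what the rest of the analysis has to defend.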
If you take a look at page 33 of the paper, you’ll see some graphs of the data. Here’s a rather bad picture of them but it might give you the idea:
That red line is the same in each plot and refers to the log homicide rate in states without the Stand Your Ground law. The blue lines are showing how the log homicide rates looked for states that enacted such a law in a given year. So there’s a graph for each year.
In 2009 there’s only one “treatment” state, namely Montana, which has a population of 1 million, less than one third of one percent of the country. For that reason you see much less stable data. The authors did different analyses, sometimes weighted by population, which is good.
I have to admit, looking at these plots, the main thing I see in the data is that, besides Montana, we’re talking about states that have a higher homicide rate than usual, which could potentially indicate a confounding condition, and to address that (and other concerns) they conducted “falsification tests,” which is to say they studied whether crimes unrelated to Stand Your Ground type laws – larceny and motor vehicle theft – went up at the same time. They found that the answer is no.
The next point is that, although there seem to be bumps for 2005, 2006, and 2008 for the two years after the enactment of the law, there don’t seem to be any for 2007 and 2009. And then even those states go down eventually, but the point is they don’t go down as much as the rest of the states without the laws.
It’s hard to do this analysis perfectly, with so few years of data. The problem is that, as soon as you suspect there’s a real effect, you’d want to act on it, since it directly translates into human deaths. So your natural reaction as a researcher is to “collect more data” but your natural reaction as a citizen is to abandon these laws as ineffective and harmful.
Scott Hodge just came out with a column in the Wall Street Journal arguing that reducing income inequality is way too hard to consider. The title of his piece is Scott Hodge: Here’s What ‘Income Equality’ Would Look Like, and his basic argument is as follows.
First of all, the middle quintile already gets too much from the government as it stands. Second of all, we’d have to raise taxes to 74% for the top quintile to even stuff out. Clearly impossible, QED.
As to the first point, his argument, and his supporting data, is intentionally misleading, as I will explain below. As to his second point, he fails to mention that the top tax bracket has historically been much higher than 74%, even as recently as 1969, and the world didn’t end.
Hodge argues with data he took from a report from the CBO called The Distribution of Federal Spending and Taxes in 2006. This report distinguishes between transfers and spending. Here’s a chart to explain what that looks like, before taxes are considered and by quintile, for non-elderly households (page 5 of the report):
The stuff on the left corresponds to stuff like food stamps. The stuff in the middle is stuff like Medicaid. The stuff on the right is stuff like wars.
Here are a few things to take from the above:
- There’s way more general spending going on than transfers.
- Transfers are very skewed towards the lowest quintile, as would be expected.
- If you look carefully at the right-most graph, the light green version gives you a way of visualizing how much more money the top quintile has versus the rest.
Now let’s break this down a bit further to include taxes. This is a key chart that Hodge referred to from this report (page 6 of the report):
OK, so note that in the middle chart, for the middle quintile, people pay more in taxes than they receive in transfers. On the right chart, for the middle quintile, which includes all spending, the middle quintile is about even, depending on how you measure it.
Now let’s go to what Hodge says in his column (emphasis mine):
Looking at prerecession data for non-elderly households in 2006 in “The Distribution of Federal Spending and Taxes in 2006,” the CBO found that those in the bottom fifth, or quintile, of the income scale received $9.62 in federal spending for every $1 they paid in federal taxes of all kinds. This isn’t surprising, since people with low incomes pay little in taxes but receive a lot of transfers.
Nor is it surprising that households in the top fifth received 17 cents in federal spending for every $1 they paid in all federal taxes. High-income households hand over a disproportionate amount in taxes relative to what they get back in spending.
What is surprising is that the middle quintile—the middle class—also got more back from government than they paid in taxes. These households received $1.19 in government spending for every $1 they paid in federal taxes.
In the first paragraph Hodge intentionally conflates the concept of “transfers” and “spending”. He continues to do this for the next two paragraphs, and in the last sentence, it is easy to imagine a middle-quintile family paying $100 in taxes and receiving $119 in food stamps. This is of course not true at all.
What’s nuts about this is that it’s mathematically equivalent to complaining that half the population is below median intelligence. Duh.
Since we have a skewed distribution of incomes, and therefore a skewed distribution of tax receipts as well as transfers, then in the context of a completely balanced budget, we would expect the middle quintile – which has a below-mean average income – to pay slightly less than the government spends on them. It’s a mathematical fact as long as our federal tax system isn’t regressive, which it’s not.
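Here is a toy sketch of that argument. Everything in it is hypothetical: the five incomes (one stand-in household per quintile), the flat 20% tax, and the assumption that spending is split evenly across households are mine, not the CBO's. A flat tax is the weakest non-regressive case, so a progressive system would only widen the gap.

```python
from statistics import mean, median

# Hypothetical incomes, one representative household per quintile.
incomes = [15_000, 35_000, 55_000, 90_000, 250_000]
FLAT_TAX_RATE = 0.20  # assumed proportional (non-regressive) tax

taxes = [FLAT_TAX_RATE * income for income in incomes]

# Balanced budget: all revenue is spent, split evenly across households.
spending_per_household = sum(taxes) / len(incomes)

# Spending received per dollar of tax paid, for each quintile.
ratios = [spending_per_household / tax for tax in taxes]
middle_ratio = ratios[len(incomes) // 2]

print(f"median income {median(incomes):,} vs mean income {mean(incomes):,}")
for quintile, ratio in enumerate(ratios, start=1):
    print(f"quintile {quintile}: ${ratio:.2f} of spending per $1 of tax")
```

Because the median income sits below the mean, the middle household pays less than the average tax bill but gets the average share of spending, so its ratio comes out above $1. The size of the gap here depends entirely on my made-up incomes and on the even split; the only point is the direction.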
In other words, this guy is just framing stuff in a “middle class is lazy and selfish, what could rich people possibly be expected to do about that?” kind of way. Who is this guy anyway?
Turns out that Hodge is the President of the Tax Foundation, which touts itself as “nonpartisan” but which has gotten funding from Big Oil and the Koch brothers. I guess it’s fair to say he has an agenda.