Archive for the ‘arms race’ Category

Fingers crossed – book coming out next May

As it turns out, it takes a while to write a book, and then another few months to publish it.

I’m very excited today to tentatively announce that my book, which is tentatively entitled Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, will be published in May 2016, in time to appear on summer reading lists and well before the election.

Fuck yeah! I’m so excited.

p.s. Fight for 15 is happening now.

A/B testing in politics

As research for my book I’m studying the way people use big data techniques, mostly from the marketing world, in politics. So naturally I was intrigued by Kyle Rush’s blogpost about A/B testing on the Obama campaign. Kyle was the Deputy Director of Frontend Web Development at Obama for America.

In case you don’t know the lingo, A/B testing is a test done by marketers to decide which of two ad designs is more effective – the ad with the dark blue background or the ad with the dark red background, for example. But in this case it was more like, the ad with Obama’s family or the ad with Obama’s family and the American flag in the background.

The idea is, as a marketer, you offer your target audience both ads – actually, any individual in the target audience either sees ad A or ad B, randomly – and then, after enough people have seen the ads, you see which population responds more, and you go with that version. Then you move on to the next test, where you keep the characteristic that just won and you test some other aspect of the ad, like the font.
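To make the mechanics concrete, here is a minimal sketch of one such test in Python. Everything in it is an assumption for illustration: the response rates are made up, the function names are hypothetical, and the two-proportion z-test with a 1.96 cutoff is one common way to decide when "enough people have seen the ads", not necessarily what any campaign actually used.

```python
import math
import random

def assign_variant(rng):
    """Each visitor sees exactly one ad, chosen by a fair coin flip."""
    return "A" if rng.random() < 0.5 else "B"

def two_proportion_z(resp_a, n_a, resp_b, n_b):
    """z-statistic for the difference in response rates (B minus A)."""
    pooled = (resp_a + resp_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (resp_b / n_b - resp_a / n_a) / se

def run_test(true_rates, visitors, seed=0):
    """Simulate visitors; true_rates are hypothetical, unknown in real life."""
    rng = random.Random(seed)
    shown = {"A": 0, "B": 0}
    responded = {"A": 0, "B": 0}
    for _ in range(visitors):
        variant = assign_variant(rng)
        shown[variant] += 1
        if rng.random() < true_rates[variant]:
            responded[variant] += 1
    return shown, responded

# Made-up rates: ad A gets a response 4% of the time, ad B 6% of the time.
shown, responded = run_test({"A": 0.04, "B": 0.06}, visitors=20000)
z = two_proportion_z(responded["A"], shown["A"], responded["B"], shown["B"])

# Declare a winner only if the gap is unlikely to be noise (|z| > 1.96).
winner = "B" if z > 1.96 else "A" if z < -1.96 else None
print(shown, responded, round(z, 2), winner)
```

The winner then becomes the baseline for the next test, which is exactly the one-characteristic-at-a-time loop described above.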

As a mathematical testing framework, A/B testing is interesting and has structural complications – how do you know you’re getting a global maximum instead of a local maximum? In other words, if you’d first tested the font, and then the background color, would you have ended up with a “better ad”? What if there are 50 things you’d like to test, how do you decide which order to test them in?
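A toy example shows how testing one characteristic at a time can get stuck. The response rates below are invented so that the two features interact: red only helps when paired with a sans font. This is a sketch with hypothetical names and noise-free rates, not a claim about any real ad.

```python
from itertools import product

# Hypothetical true response rates for each (background, font) combination.
RATES = {
    ("blue", "serif"): 0.05,
    ("red", "serif"): 0.04,
    ("blue", "sans"): 0.04,
    ("red", "sans"): 0.07,
}
OPTIONS = [("blue", "red"), ("serif", "sans")]

def greedy(start, order):
    """Test one feature at a time, keeping the winner before moving on."""
    current = list(start)
    for i in order:
        current[i] = max(
            OPTIONS[i],
            key=lambda opt: RATES[tuple(current[:i] + [opt] + current[i + 1:])],
        )
    return tuple(current)

def exhaustive():
    """Try every combination and return the best one."""
    return max(product(*OPTIONS), key=RATES.get)

start = ("blue", "serif")
print(greedy(start, order=[0, 1]))  # background first, then font
print(greedy(start, order=[1, 0]))  # font first, then background
print(exhaustive())
```

Here both test orders stay at blue with serif, because changing either feature alone looks worse, while the best ad overall is red with sans. Trying every combination avoids that trap but does not scale: 50 two-way choices would mean 2**50 (about 10^15) distinct ads.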

But that’s not what interests me about Kyle’s Obama A/B testing blogpost. Rather, I’m fascinated by the definition of success that was chosen.

After all, an A/B test is all about which ad “works better,” so there has to be some way to measure success, and it has to be measured in real time if you want to go through many iterations of your ad.

In the case of the Obama campaign, there were two definitions of success, or maybe three: how often people signed up to be on Obama’s newsletter, how often they gave money, and how much money they gave. I infer this from Kyle’s braggy second sentence, “Overall we executed about 500 a/b tests on our web pages in a 20 month period which increased donation conversions by 49% and sign up conversions by 161%.” Those were the measures Kyle and his team were optimizing for.

Most of the blog post focused on getting people to donate more, and specifically on getting them to fill out the credit card donation page form. Here’s what they A/B tested:

Our plan was to separate the field groups into four smaller steps so that users did not feel overwhelmed by the length of the form. Essentially the idea was to get users to the top of the mountain by showing them a small incline rather than a steep slope.

What I find super interesting about this stuff (and of course this is not the only “data science” that was used in Obama’s campaign; there was a separate team focused on getting Facebook users to share their friends’ lists and such) is that nowhere is there even a slight nod to the question of whether this stuff will improve or even maintain democracy. They don’t even discuss how maintainable this is.

I mean, we gave the Obama analytics team lots of credit for stuff, but in the end what they did was optimize a bunch of people’s donation money. Is that something we should cheer? It seems more like an arms race with the Republican party, in which the Democrats pulled ahead temporarily. And all it means is that the fight for donations will be even more manipulative, by both sides, by the next presidential election cycle.

As Felix Salmon pointed out to me over beer and sausages last week, the problem with big data in politics is that the easiest thing you can measure in politics is money, which means everything is optimized to that metric of success, leaving all other considerations ignored and probably stifled. And yes, “sign ups” are also measurable, but they more or less correspond to people who will receive weekly or daily requests for money from the candidate.

Readers, please tell me I’m wrong. Or suggest a way we can measure something and optimize to something that is less cynical than the size of a war chest.

Categories: arms race, data science

Me & My Administrative Bloat

I am now part of the administrative bloat over at Columbia. I am non-faculty administration, tasked with directing a data journalism program. The program is great, and I’m not complaining about my job. But I will be honest, it makes me uneasy.

Although I’m in the Journalism School, which is in many ways separated from the larger university, I now have a view into how things got so bloated, and how they might stay that way: it’s not clear that, at the end of my 6-month gig, on September 16th, I could hand my job over to any existing person at the J-School. They might have to keep me on, or replace me with a real live full-time person in charge of this program.

There are good and less good reasons for that, but overall I think there exists a pretty sound argument for such a person to run such a program and to keep it good and intellectually vibrant. That’s another thing that makes me uneasy, although many administrative positions have less of an easy sell attached to them.

I was reminded of this fact of my current existence when I read this recent New York Times article about the administrative bloat in hospitals. From the article:

And studies suggest that administrative costs make up 20 to 30 percent of the United States health care bill, far higher than in any other country. American insurers, meanwhile, spent $606 per person on administrative costs, more than twice as much as in any other developed country and more than three times as much as many, according to a study by the Commonwealth Fund.

Compare that to this article entitled Administrators Ate My Tuition:

A comprehensive study published by the Delta Cost Project in 2010 reported that between 1998 and 2008, America’s private colleges increased spending on instruction by 22 percent while increasing spending on administration and staff support by 36 percent. Parents who wonder why college tuition is so high and why it increases so much each year may be less than pleased to learn that their sons and daughters will have an opportunity to interact with more administrators and staffers— but not more professors.

There are similarities and there are differences between the university and the medical situations.

A similarity is that people really want to be educated, and people really need to be cared for, and administrations have grown up around these basic facts, and at each stage they seem to be adding something either seemingly productive or vitally needed to contain the complexity of the existing machine, but in the end you have enormous behemoths of organizations that are much too complex and much too expensive. And as a reality check on whether that’s necessary, take a look at hospitals in Europe, or take a look at our own university system a few decades ago.

And that also points out a critical difference: the health care system is ridiculously complicated in this country, and in some sense you need all these people just to navigate it for a hospital. And ObamaCare made that worse, not better, even though it also has good aspects in terms of coverage.

The university system, by contrast, made itself complicated; it wasn’t externally forced into complexity, unless you count the U.S. News & World Report gaming that seems inescapable.

Obama has the wrong answer to student loan crisis

Have you seen Obama’s latest response to the student debt crisis (hat tip Ernest Davis)? He’s going to rank colleges based on some criteria to be named later to decide whether a school deserves federal loans and grants. It’s a great example of a mathematical model solving the wrong problem.

Now, I’m not saying there aren’t nasty leeches who are currently gaming the federal loan system. For example, take the University of Phoenix. It’s not a college system, it’s a business which extracts federal and private loan money from unsuspecting people who want desperately to get a good job some day. And I get why Obama might want to put an end to that gaming, and declare the University of Phoenix and its scummy competitors unfit for federal loans. I get it.

But unfortunately it won’t fix the problem. Because the real problem is the federal loan system in the first place, which has grown a shitton since I was in school:

[Chart: growth of the federal student loan system]


and in the meantime, our state and private schools are getting more and more expensive relative to the available grants:

[Chart: state and private school costs relative to available grants]



And state funding for public schools has decreased while tuition has increased especially since the financial crisis:

[Charts: state funding for public schools and tuition]

The bottom line is that we – and especially our children – need more state school funding much more than we need a ranking algorithm. The best way to bring down tuition rates at private schools is to give them competition from good state schools.

Categories: arms race, education, modeling

What Monsanto and college funds have in common


I recently read this letter to the editor written by Catharine Hill, the President of Vassar College, explaining why reducing family contributions in college tuition and fees isn’t a good idea. It was in response to this Op-Ed by Steve Cohen about the onerous “E.F.C.” system.

Let me dumb down the debate a bit for the sake of simplicity. Steve is on one side basically saying, “college costs too damn much, it didn’t use to cost this much!” and Catharine is on the other side saying, “colleges need to compete! If you’re not willing to pay then someone else will!”

Here’s the thing: there’s an arms race among colleges driving up costs. Through some perverse combination of gaming the U.S. News & World Report model and responding to the federal loan support incentive system, not to mention political decisions methodically removing funding from state colleges, college costs have been rising wildly.

And when you have an arms race, as I’ve learned from Tom Slee, the only solution is an armistice. In this case an armistice would translate into something like an agreement among colleges to set a maximum and reasonable tuition and fee structure. Sounds good, right? But an armistice won’t happen if the players in question are benefitting from the arms race. In this case parents are suffering but colleges are largely benefitting.


This recent Salon article detailing the big data approach that Monsanto is taking to their massive agricultural empire is in the same boat.

The idea is that Monsanto has bought up a bunch of big data firms and satellite firms to perform predictive analytics on a massive scale for farming. And they are offering farmers who are already internal to the Monsanto empire the chance to benefit from their models.

Farmers are skeptical of using the models, because they are worried about how much data Monsanto will be able to collect about them if they do.

But here’s the thing, farmers: Monsanto already has all your data, and will have it forever, due to their surveillance. They will know exactly what you plant, where, and how densely.

And what they are offering you is probably actually a benefit to you, but of course the more important thing for them is that they are explicitly creating an arms race between Monsanto farmers and non-Monsanto farmers.

In other words, if they give Monsanto farmers an extra boost, it will lead other farmers to the conclusion that, without such a boost, they won’t be able to keep up, and they will be forced into the Monsanto system by economic necessity.

Again an arms race, and again no armistice in sight, since Monsanto is doing this deliberately to boost its bottom line. Assuming their models are good, the only way for non-Monsanto farmers to avoid this is to build their own predictive models, but clearly that would require enormous investment.

Categories: arms race, modeling