The politics of data mining

June 22, 2013

At first glance, data miners inside governments, start-ups, corporations, and political campaigns are all doing basically the same thing. They’ll all need great engineering infrastructure, good clean data, a working knowledge of statistical techniques and enough domain knowledge to get things done.

We’ve seen recent articles that are evidence for this statement: Facebook data people move to the NSA or other government agencies easily, and Obama’s political campaign data miners have launched a new data mining start-up. I am a data miner myself, and I could honestly work at any of those places – my skills would translate, if not my personality.

I do think there are differences, though, and here I’m not talking about ethics or trust issues; I’m talking about pure politics[1].

Namely, the world of data mining is divided into two broad categories: people who want to cause things to happen and people who want to prevent things from happening.

I know that sounds incredibly vague, so let me give some examples.

In start-ups, irrespective of what you’re actually doing (what you’re actually doing is probably incredibly banal, like getting people to click on ads), you feel like you’re the first person ever to do it, at least on this scale, or at least with this dataset, and that makes it technically challenging and exciting.

Or, even if you’re not the first, at least what you’re creating or building is state-of-the-art and is going to be used to “disrupt” or destroy lagging competition. You feel like a motherfucker, and it feels great[2]!

The same thing can be said for Obama’s political data miners: if you read this article, you’ll know they felt like they’d invented a new field of data mining, and a cult along with it, and it felt great! And although it’s probably not true that they did something all that impressive technically, in any case they did a great job of applying known techniques to a different data set, and they got lots of people to allow access to their private information based on their trust of Obama, and they mined the fuck out of it to persuade people to go out and vote and to go out and vote for Obama.

Now let’s talk about corporations. I’ve worked in enough companies to know that “covering your ass” is a real thing, and can overwhelm a given company’s other goals. And the larger the company, the more the fear sets in, the more time is spent covering one’s ass, and the less time is spent inventing and staying state-of-the-art. If you’ve ever worked in a place where it takes months just to integrate two different versions of SalesForce you know what I mean.

Those corporate people have data miners too, and in the best case they are somewhat protected from the conservative, risk averse, cover-your-ass atmosphere, but mostly they’re not. So if you work for a pharmaceutical company, you might spend your time figuring out how to draw up the numbers to make them look good for the CEO so he doesn’t get axed.

In other words, you spend your time preventing something from happening rather than causing something to happen.

Finally, let’s talk about government data miners. If there’s one thing I learned when I went to the State Department Tech@State “Moneyball Diplomacy” conference a few weeks back, it’s that they are the most conservative of all. They spend their time worrying about a terrorist attack and how to prevent it. It’s all about preventing bad things from happening, and that makes for an atmosphere where causing good things to happen takes a back seat.

I’m not saying anything really new here; I think this stuff is pretty uncontroversial. Maybe people would quibble over when a start-up becomes a corporation (my answer: mostly they never do, but certainly by the time of an IPO they’ve already done it). Also, of course, there are ass-coverers in start-ups and there are risk-takers in corporations and maybe even in government, but they don’t dominate.

If you think through things in this light, it makes sense that Obama’s data miners didn’t want to stay in government and decided to go work on advertising stuff. And although they might have enough clout and buzz to get hired by a big corporation, I think they’ll find it pretty frustrating to be dealing with the cover-my-ass types that will hire them. It also makes sense that Facebook, which spends its time making sure no other social network grows enough to compete with it, works so well with the NSA.

1. If you want to talk ethics, though, join me on Monday at Suresh Naidu’s Data Skeptics Meetup where he’ll be talking about Political Uses and Abuses of Data.

2. This is probably why start-up guys are so arrogant.

  1. beewhy2012
    June 22, 2013 at 1:10 pm

    If you look at the “bad things from happening” side of the data mining coin, you need to take into account a much longer time frame than just dealing with immediate threats. So while data miners may be more or less happily engaged in the daily grind, many of the rest of us, ignorant as we are of the charms of the job, have other things that worry us.
    When we look back – over McCarthy, past Orwell, before the regimes of Stalin and Hitler, beyond the Star Chamber all the way back even to Cyrus, the Persian king, whose team of spies were called “the king’s ears” – it is obvious that there is a pressing need for the powerful to keep close tabs on the masses. At the same time, clearly visible from the above list, we see the huge potential for abuse and for massive moral and ethical compromise on those who act as the king’s ears. We know all too well how the victims of the Stasi and the NKVD fared. We understand the importance of the ongoing fight for freedom and its attendant will o’ the wisp, transparency.
    Projecting this timeline into the future along with its attendant trail of deceit, manipulation and oppression by those in power – beyond the next cool app or next year’s developments in handhelds or wrist communicators, past cerebral chip implants and the rest – we are left almost breathless with the overwhelming sense of the possibilities for betrayal that each of us and our descendants might personally experience at the hands of the faceless data miners and their secretive overlords.
    Whistleblower heroes like Snowden – and before him, Binney and Drake – have tried to stop the madness and to expose the darkness to light. It is to be hoped that more people who work in these shadowy worlds will have the courage and our wholehearted support to do the same over and over again.

  2. June 22, 2013 at 6:11 pm

    I read this post twice and still can’t find one single reference to sex anywhere in it. My weekend is now ruined. :)

    The powerful always keep a close eye on the masses and themselves. There is one obscure reference I found to a database being developed by the Koch bros called Themis.

    I say if they have databases, we need databases to watch them watching us from the shadows.

  3. Josh
    June 22, 2013 at 7:13 pm

    Many corporations have managers who know their organisations are infected with CYAitis and hire outsiders as one way to remedy that. That is one reason for the consulting industry. Perversely, hiring consultants can also be a direct CYA tactic.

    As to the start-up arrogance, I think there is a selection bias at work. Only those with the right (or is it wrong?) combination of arrogance, optimism, and foolishness try to build a business from scratch.

  4. June 22, 2013 at 11:55 pm

    “it takes months just to integrate two different versions of SalesForce you know what I mean”… that made me chuckle and it’s so true :) I like the way you described the CYA groups versus the risk takers and the arrogance. You are right: when you see something you created being used by others, we all get a rush, but the size of the rush and the arrogance varies tremendously :) Little rushes are ok, as we all need a “feel good” pump about our work that may have taken weeks and months to create.

    If you write software you are a data miner, but not to be confused with the “public” view and use that we see out there now, as programming is writing queries, which mine data somewhere, some place. Not to be too demented here, but it was interesting to hear the President stumble around a bit in his speeches explaining what metadata is while explaining the NSA activities. I’m still on my campaign, though, to license all data sellers and require a federal site where consumers can look up and see what kind of data they sell and to whom. We are owed that much, as the fix and the time and money it takes is always on our dime while the sellers make billions.

    Part 2 here is to levy an excise tax on those profiting and model it just like a quarterly sales tax. At least doing both steps would give law enforcement something to regulate with. A great way to get some money moved back over to the 99% side, as corporations and banks are making billions. Politics definitely play a role, and I’m convinced that all the politicians farting around with abortion bills are really the “digital illits” in disguise looking for anything to stand on, like almost literally. Government everywhere is so far behind the 8 ball, as they have no clue what the other side is “really” doing and how modeling for profit works.
