
No, Sandy Pentland, let’s not optimize the status quo

It was bound to happen. Someone was inevitably going to have to write this book, entitled Social Physics, and now someone has just up and done it. Namely, Alex “Sandy” Pentland, data science evangelist, director of MIT’s Human Dynamics Laboratory, and co-founder of the MIT Media Lab.

A review by Nicholas Carr

This article, entitled The Limits of Social Engineering, published in MIT’s Technology Review and written by Nicholas Carr (hat tip Billy Kaos), is more or less a review of the book. From the article:

Pentland argues that our greatly expanded ability to gather behavioral data will allow scientists to develop “a causal theory of social structure” and ultimately establish “a mathematical explanation for why society reacts as it does” in all manner of circumstances. As the book’s title makes clear, Pentland thinks that the social world, no less than the material world, operates according to rules. There are “statistical regularities within human movement and communication,” he writes, and once we fully understand those regularities, we’ll discover “the basic mechanisms of social interactions.”

By collecting all the data – credit card, sensor, cell phones that can pick up your moods, etc. – Pentland seems to think we can put the science into social sciences. He thinks we can predict a person like we now predict planetary motion.

OK, let’s just take a pause here to say: eeeew. How invasive does that sound? And how insulting is its premise? But wait, it gets way worse.

The next thing Pentland wants to do is use micro-nudges to affect people’s actions. Like paying them to act a certain way, and exerting social and peer pressure. It’s like Nudge in overdrive.

Vomit. But also not the worst part.

Here’s the worst part about Pentland’s book, from the article:

Ultimately, Pentland argues, looking at people’s interactions through a mathematical lens will free us of time-worn notions about class and class struggle. Political and economic classes, he contends, are “oversimplified stereotypes of a fluid and overlapping matrix of peer groups.” Peer groups, unlike classes, are defined by “shared norms” rather than just “standard features such as income” or “their relationship to the means of production.” Armed with exhaustive information about individuals’ habits and associations, civic planners will be able to trace the full flow of influences that shape personal behavior. Abandoning general categories like “rich” and “poor” or “haves” and “have-nots,” we’ll be able to understand people as individuals—even if those individuals are no more than the sums of all the peer pressures and other social influences that affect them.

Kill. Me. Now.

The good news is that the author of the article, Nicholas Carr, doesn’t buy it, and makes all sorts of reasonable complaints about this theory, like privacy concerns, and structural sources of society’s ills. In fact Carr absolutely nails it (emphasis mine):

Pentland may be right that our behavior is determined largely by social norms and the influences of our peers, but what he fails to see is that those norms and influences are themselves shaped by history, politics, and economics, not to mention power and prejudice. People don’t have complete freedom in choosing their peer groups. Their choices are constrained by where they live, where they come from, how much money they have, and what they look like. A statistical model of society that ignores issues of class, that takes patterns of influence as givens rather than as historical contingencies, will tend to perpetuate existing social structures and dynamics. It will encourage us to optimize the status quo rather than challenge it.

How to see how dumb this is in two examples

This brings to mind examples of models that do or do not combat sexism.

First, the orchestra audition example: in order to avoid nepotism, they started making auditioners sit behind a sheet. The result has been way more women in orchestras.

This is a model, even if it’s not a big data model. It is the “orchestra audition” model, and the most important thing about this example is that they defined success very carefully and made it all about one thing: sound. They decided to define the requirements for the job to be “makes good sounding music” and they decided that other information, like how they look, would be by definition not used. It is explicitly non-discriminatory.

By contrast, let’s think about how most big data models work. They take historical information about successes and failures and automate them – rather than challenging their past definition of success, and making it deliberately fair, they are if anything codifying their discriminatory practices in code.

My standard made-up example of this is close to the kind of thing actually happening and being evangelized in big data. Namely, a resume sorting model that helps out HR. But, using historical training data, this model notices that women don’t fare so well historically at the made-up company as computer programmers – they often leave after only 6 months and they never get promoted. A model will interpret that to mean they are bad employees and never look into structural causes. And moreover, as a result of this historical data, it will discard women’s resumes. Yay, big data!
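To make the made-up example concrete, here is a minimal sketch of how such a resume sorter could end up discarding women’s resumes. Everything in it is an assumption for illustration: the records in `HISTORY`, the `was_success` definition, the `threshold`, and the function names are all invented, and a real system would be far more elaborate. The point is only that a model which memorizes past outcomes inherits whatever definition of success produced them.

```python
from collections import defaultdict

# Hypothetical historical HR records for the made-up company:
# (gender, months_employed, was_promoted). All values are invented.
HISTORY = [
    ("F", 6, False),
    ("F", 5, False),
    ("F", 6, False),
    ("M", 36, True),
    ("M", 24, False),
    ("M", 48, True),
]

def was_success(months_employed, was_promoted):
    """The inherited, unexamined definition of success (an assumption)."""
    return months_employed > 12 or was_promoted

def fit_success_rates(history):
    """'Train' by memorizing the historical success rate of each group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [successes, total]
    for gender, months, promoted in history:
        counts[gender][0] += was_success(months, promoted)
        counts[gender][1] += 1
    return {group: s / n for group, (s, n) in counts.items()}

def screen(resumes, rates, threshold=0.5):
    """Keep only resumes whose group cleared the historical threshold."""
    return [r for r in resumes if rates[r["gender"]] >= threshold]

rates = fit_success_rates(HISTORY)
applicants = [
    {"name": "A", "gender": "F"},
    {"name": "B", "gender": "M"},
]
kept = screen(applicants, rates)

print(rates)                       # per-group historical success rates
print([r["name"] for r in kept])   # names of applicants who survive screening
```

Nothing in the code asks why the women left after 6 months. The structural cause never shows up as a variable, so it never shows up in the decision.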

Thanks, Pentland

I’m kind of glad Pentland has written such an awful book, because it gives me an enemy to rail against in this big data hype world. I don’t think most people are as far on the “big data will solve all our problems” spectrum as he is, but he and his book present a convenient target. And it honestly cannot surprise anyone that he is a successful white dude as well when he talks about how big data is going to optimize the status quo if we’d just all wear sensors to work and to bed.

  1. May 2, 2014 at 8:14 am

    The principal problem with Pentland’s thesis is that it assumes the social scientist stands outside of society and observes it objectively. Even social scientists don’t believe that anymore.

    • Guest2
      May 2, 2014 at 8:30 am
      • May 2, 2014 at 8:33 am

        Only to the easily impressionable

        • Guest2
          May 2, 2014 at 8:42 am

          I agree — emphatically so — with your opening premise, however, all the analytics you list presume to be objective — are you impugning the objectivity of your own products? Better yet, what is their epistemological basis?

        • May 2, 2014 at 9:05 am

          Crisis! Someone has challenged the epistemological basis of my blog. I may have trouble sleeping tonight.

          The epistemological basis of my blog is that people can do practical and useful things with advanced analytics.

        • Guest2
          May 2, 2014 at 12:58 pm

          Thank you for the response — if I understand it correctly, the social network of people that do “practical and useful things” is the basis of your work, and those that make this determination are your satisfied customers.

  2. May 2, 2014 at 8:25 am

    “Pentland may be right that our behavior is determined largely by social norms and the influences of our peers, but what he fails to see is that those norms and influences are themselves shaped by history, politics, and economics, not to mention power and prejudice”

    Even more importantly, I think, is that he fails to see that social norms arise from our deep evolutionary history, which is a biological phenomenon, not a physical one. They emerge from the interplay of our evolved emotions and behaviors, and millions of years of solving the problem of living successfully in groups.

    • Guest2
      May 2, 2014 at 8:33 am

      Evolutionary biology operates according to rules (Nature’s rules) that can be programmed. This helps Pentland’s argument — how does evolutionary biology help refute him?

      • May 2, 2014 at 9:13 am

        Don’t have time to get into an extended defense of my comment; just trying to make the point that the evolutionary forces underlying social norms predate and contributed to the historical, political, and economic forces.

    • Auntienene
      May 28, 2014 at 3:17 pm

      I’m just an ordinary innumerate but as I was reading the post, the thought popped into my head that he’s (Pentland) about 10,000 years too late to start gathering this data.

  3. Guest2
    May 2, 2014 at 8:25 am

    What is really amazing to me is that, every time someone trots out this theory as something new, they ignore the ways that it (“social engineering”) has been tried before.


    • Guest2
      May 2, 2014 at 8:37 am

      Phillip Ball has written elegantly about this approach in Critical Mass: How One Thing Leads to Another (2004), including the history of the quantitative modeling of society.


  4. Larry Headlund
    May 2, 2014 at 8:36 am

    “He thinks we can predict a person like we now predict planetary motion.”
    I wonder if the person making this analogy knows how difficult orbital dynamics can be? Do they know how long term unstable even moderately complex planetary systems are?

    • Guest2
      May 2, 2014 at 1:00 pm

      Yeah, this reminds me of Minority Report, and “precrime” arrests!


    • H. Alexander Ivey
      May 29, 2014 at 12:44 am

      Quite right!

      I suspect the equations for planetary motion are non-linear, correct? Ties to my comment, May 29, 2014 at 12:41 am, below.

  5. May 2, 2014 at 8:49 am

    Just another fool with a digital Skinner Box. Pentland’s arrogance in thinking that his vision and his models are reality is so outrageous as to be laughable. The only thing more dismaying than his childish utopianism is the fact that MIT would keep such a fool on the faculty.

    Either that, or he’s just another frackdemic cashing in on the Big Data bubble.

  6. May 2, 2014 at 9:11 am

    I’ve said it in comments on this blog before, and I fear I must say it again. If the most advanced science is unpublished science, we’re doomed as a civilization. I’m happy to see a consensus that this operator is a quack, and of course I hope that consensus isn’t wishful thinking…

  7. May 2, 2014 at 9:15 am

    Oh, and the denizens of this corner of the web probably heard this buzzword before I did (yesterday) but I’d still like to put it out there that another buzzword that seems to be coming down the tubes is “data darwinism.” A quick DuckDuckGo search definitely made me go eeeew. Vomit. Kill. Me. Now.

    Data science might not be real, but I suspect information asymmetry is.

  8. Christina Sormani
    May 2, 2014 at 10:02 am

    This whole idea, its successes and failures as well as its effects, is explored in Isaac Asimov’s Foundation Trilogy.

    • tdhawkes
      May 2, 2014 at 11:09 am

      Artists usually get ‘there’ first; ‘there’ being a holistic understanding of what is going on. They tend to see things as constellations of interacting variables, instead of linear progressions of cherry-picked variables.

  9. May 2, 2014 at 3:18 pm

    I wrote about something similar a few weeks ago, except prompted by the physics blogosphere rather than Pentland’s book. Personally, I am a bit afraid of science-folk who aren’t very in tune with society and treating humans humanely, wandering into policy with their ‘science klout’ and ‘fixing things’. I imagine it to play out like so many Bay Area social entrepreneurs that fix problems nobody had in ways that create more problems from those that did.

    However, I was a little bit disappointed with your critique. “Kill. Me. Now.” is funny, but not a very good response. If the paragraph is so bad that it warrants no response then why quote it?

    I would be interested in seeing a dissection of Pentland on his own terms. For example, it seems to me like his whole argument rests on “well, people before me thought that individual people are irreducibly complex and unpredictable, however the idiosyncrasies of individual behavior can actually be accounted for by looking at their immediate ‘neighbours’; thus the individual agents aren’t very complicated”. From this he magically concludes that things are predictable.

    This is an inexcusable offense for a computer scientist or even a ‘complexity theorist’ in the SFI sense of the word. Even if your agents are the simplest things possible (deterministic AND, OR, and NOT gates), it doesn’t mean that if you arrange them in some complicated network of influence that you can predict (without direct simulation) what that network will do in most cases. So it seems like one of the basic premises of the work relies on demolishing a straw man that wasn’t even blocking the way…

    However, I haven’t read the book, so my comments could be completely misplaced.

    • May 2, 2014 at 3:21 pm

      I addressed it in the next section with the two examples of models.

      • May 2, 2014 at 3:34 pm

        That is a good point. It is actually my bad for misinterpreting the first quote.

        I interpreted the quote as saying that “traditional categories used in the social sciences are badly designed” (i.e. applying just to the theoretical constructs). While I think you and the review author interpreted it as “traditional categories are irrelevant to outcomes”, and then rightly critiqued that by showing examples where the categories are not purely theoretical but actually embed themselves in the mind of decision makers and influence their decisions toward unfair ends.

        Sorry for the sloppy reading on my part.

        • May 2, 2014 at 3:35 pm

          Not at all. I think there are many ways to go about disagreeing with this guy.

  10. H. Alexander Ivey
    May 29, 2014 at 12:41 am

    What I quibble about is why mathematicians don’t say that Pentland’s math is wrong. His equations must be non-linear, high order ones, which, as a group, exhibit one very important property. Their output (their answers) is very sensitive to the value of the input numbers.

    Say to predict what a person will do in one hour’s time you have an equation that predicts what they will do after one minute of time. This equation, whatever its actual variables and formula, must be a non-linear one (trust me on this, the argument why would take too long for this posting). To get the answer for what the person is doing 60 minutes from now, first you plug in your numbers for the first minute, use that answer to plug into the same formula and get your answer for the second minute, ditto for the third minute, all the way to the 60th minute. After running the equation 60 times, you can predict what the person will be doing at the 60th minute. So far, so good. But here’s the kick. Let’s say x is the variable for the distance the person has moved during the one minute. You have lots of variables, but you only want to change one of them. So for the first prediction, you use x=1.00. You run the equation 60 times and you get an answer of the person is going to the bank. Ok. No problem.

    But you want to see what happens if you use x=1.01 as the first distance (instead of 1.00). You run the equation 60 times and you get an answer of the person is going to work! Not the same answer at all. Not even close. You only changed the input by 1%, but the second answer is changed way more than 1%.

    This property of sensitivity to the input values is a property that has been proven by mathematicians for over 100 years (Poincare I think in 1903). So what Pentland is talking about is just rubbish (mathematically).
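The iterate-60-times argument in the comment above can be tried out with a small sketch. This is not Pentland’s equation, which none of us has in hand. It swaps in the logistic map, a textbook non-linear recurrence, purely as a stand-in, and the starting values 0.100 and 0.101 (a 1% change, rescaled to fit the map’s 0-to-1 range) as well as the function names are assumptions for illustration.

```python
def logistic_step(x, r=4.0):
    """One application of the logistic map, a standard non-linear recurrence."""
    return r * x * (1.0 - x)

def trajectory(x0, steps=60):
    """Feed each answer back into the same formula, `steps` times."""
    xs = [x0]
    for _ in range(steps):
        xs.append(logistic_step(xs[-1]))
    return xs

base = trajectory(0.100)    # the original input
nudged = trajectory(0.101)  # the same input changed by 1%

# Gap between the two runs at every step, including the starting point.
gaps = [abs(a - b) for a, b in zip(base, nudged)]
initial_gap = gaps[0]
max_gap = max(gaps)

print("initial gap:", initial_gap)
print("largest gap over 60 steps:", max_gap)
print("final values:", base[-1], nudged[-1])
```

Within the first handful of steps the two runs are already more than 0.1 apart, on a scale where every value lies between 0 and 1, so the 1% nudge has stopped being a 1% difference long before the 60th step.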

