Christian Rudder’s Dataclysm



September 16, 2014 Cathy O'Neil, mathbabe

Here’s what I’ve spent the last couple of days doing: alternately reading Christian Rudder’s new book Dataclysm and proofreading a report by AAPOR. The report discusses the benefits, dangers, and ethics of using big data, which is mostly “found” data originally meant for some other purpose, as a replacement for public surveys, with their carefully constructed data collection processes and informed consent. The AAPOR folk have asked me to provide tangible examples of the dangers of using big data to infer things about public opinion, and I am tempted to simply ask them all to read Dataclysm as exhibit A.

Rudder is a co-founder of OKCupid, an online dating site. His book mainly pertains to how people search for love and sex online, and how they represent themselves in their profiles.

Here’s something that I will mention for context into his data explorations: Rudder likes to crudely provoke, as he displayed when he wrote this recent post explaining how OKCupid experiments on users. He enjoys playing the part of the somewhat creepy detective, peering into what OKCupid users thought was a somewhat private place to prepare themselves for the dating world. It’s the online equivalent of a video camera in a changing booth at a department store, which he defended not-so-subtly on a recent NPR show called On The Media, and which was written up here.

I won’t dwell on that aspect of the story because I think it’s a good and timely conversation, and I’m glad the public is finally waking up to what I’ve known for years is going on. I’m actually happy Rudder is so nonchalant about it because there’s no pretense.

Even so, I’m less happy with his actual data work. Let me tell you why I say that with a few examples.

Who are OKCupid users?

I spent a lot of time with my students this summer saying that a standalone number wouldn’t be interesting, that you have to compare that number to some baseline that people can understand. So if I told you how many black kids have been stopped and frisked this year in NYC, I’d also need to tell you how many black kids live in NYC for you to get an idea of the scope of the issue. It’s a basic fact about data analysis and reporting.

When you’re dealing with populations on dating sites and you want to conclude things about the larger culture, the relevant “baseline comparison” is how well the members of the dating site represent the population as a whole. Rudder doesn’t do this. Instead he just says there are lots of OKCupid users for the first few chapters, and then later on after he’s made a few spectacularly broad statements, on page 104 he compares the users of OKCupid to the wider internet users, but not to the general population.

It’s an inappropriate baseline, made too late. Because I’m not sure about you but I don’t have a keen sense of the population of internet users. I’m pretty sure very young kids and old people are not well represented, but that’s about it. My students would have known to compare a population to the census. It needs to happen.

How do you collect your data?

Let me back up to the very beginning of the book, where Rudder startles us by showing us that the men that women rate “most attractive” are about their age whereas the women that men rate “most attractive” are consistently 20 years old, no matter how old the men are.

Actually, I am projecting. Rudder never actually specifically tells us what the rating is, how it’s exactly worded, and how the profiles are presented to the different groups. And that’s a problem, which he ignores completely until much later in the book when he mentions that how survey questions are worded can have a profound effect on how people respond, but his target is someone else’s survey, not his OKCupid environment.

Words matter, and they matter differently for men and women. So for example, if there were a button for “eye candy,” we might expect women to choose more young men. If my guess is correct, and the term in use is “most attractive”, then for men it might well trigger a sexual concept whereas for women it might trigger a different social construct; indeed I would assume it does.

Since this isn’t a porn site, it’s a dating site, we are not filtering for purely visual appeal; we are looking for relationships. We are thinking beyond what turns us on physically and asking ourselves, who would we want to spend time with? Who would our family like us to be with? Who would make us be attractive to ourselves? Those are different questions and provoke different answers. And they are culturally interesting questions, which Rudder never explores. A lost opportunity.

Next, how does the recommendation engine work? I can well imagine that, once you’ve rated Profile A high, there is an algorithm that finds Profile B such that “people who liked Profile A also liked Profile B”. If so, then there’s yet another reason to worry that such results as Rudder described are produced in part as a result of the feedback loop engendered by the recommendation engine. But he doesn’t explain how his data is collected, how it is prompted, or the exact words that are used.

Here’s a clue that Rudder is confused by his own facile interpretations: men and women both state that they are looking for relationships with people around their own age or slightly younger, and they end up messaging people slightly younger than they are but not many, many years younger. So forty-year-old men do not message twenty-year-old women.

Is this sad sexual frustration? Is this, in Rudder’s words, the difference between what they claim they want and what they really want behind closed doors? Not at all. This is more likely the difference between how we live our fantasies and how we actually realistically see our future.

Need to control for population

Here’s another frustrating bit from the book: Rudder talks about how hard it is for older people to get a date but he doesn’t correct for population. And since he never tells us how many OKCupid users are older, nor does he compare his users to the census, I cannot infer this.

Here’s a graph from Rudder’s book showing the age of men who respond to women’s profiles of various ages:

We’re meant to be impressed with Rudder’s line, “for every 100 men interested in that twenty year old, there are only 9 looking for someone thirty years older.” But here’s the thing, maybe there are 20 times as many 20-year-olds as there are 50-year-olds on the site? In which case, yay for the 50-year-old chicks? After all, those histograms look pretty healthy in shape, and they might be differently sized because the population size itself is drastically different for different ages.
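To make that arithmetic concrete, here is a minimal sketch in Python. Only the 100-to-9 ratio comes from Rudder’s line; the user counts are hypothetical, since the book never reports how many users there are at each age, and the function name is just for illustration.

```python
# Hypothetical illustration: only the 100-vs-9 ratio comes from Rudder's line.
# The user counts per age are made up, because the book never reports them.

interested_men = {20: 100, 50: 9}    # men interested, per Rudder's quoted ratio
women_on_site = {20: 2000, 50: 100}  # ASSUMED: 20x as many 20-year-olds as 50-year-olds


def interest_per_woman(interested, population):
    """Return interested men per woman at each age (a per-capita rate)."""
    return {age: interested[age] / population[age] for age in interested}


rates = interest_per_woman(interested_men, women_on_site)
for age, rate in sorted(rates.items()):
    print(f"age {age}: {rate:.3f} interested men per woman")
```

Under these assumed counts the 50-year-olds come out ahead per capita, which is the opposite of the impression the raw 100-to-9 comparison gives. With different counts the answer could go the other way, which is exactly why the population sizes need to be reported.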

Confounding

One of the worst examples of statistical mistakes is his experiment in turning off pictures. Rudder ignores the concept of confounders altogether, which he again miraculously is aware of in the next chapter on race.

To be more precise, Rudder talks about the experiment when OKCupid turned off pictures. Most people went away when this happened but certain people did not:

Some of the people who stayed on went on a “blind date.” Those people, whom Rudder called the “intrepid few,” had a good time with people no matter how unattractive they were deemed to be based on OKCupid’s system of attractiveness. His conclusion: people are preselecting for attractiveness, which is actually unimportant to them.

But here’s the thing, that’s only true for people who were willing to go on blind dates. What he’s done is select for people who are not superficial about looks, and then collect data that suggests they are not superficial about looks. That doesn’t mean that OKCupid users as a whole are not superficial about looks. The ones that are just got the hell out when the pictures went dark.
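A toy simulation shows how this kind of self-selection works. Everything in it is an assumption made up for illustration: the “cares about looks” score, the rule for who leaves, and the number of users. None of it is OKCupid’s data or method.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# ASSUMED model: each user has a "cares about looks" score in [0, 1].
users = [random.random() for _ in range(10_000)]


def stays_when_photos_off(cares_about_looks):
    """ASSUMED behavior: the more a user cares about looks, the likelier they leave."""
    return random.random() > cares_about_looks


# Keep only the users who stick around once the pictures go dark.
stayed = [u for u in users if stays_when_photos_off(u)]

mean_all = sum(users) / len(users)
mean_stayed = sum(stayed) / len(stayed)

print(f"mean 'cares about looks', all users:  {mean_all:.2f}")
print(f"mean 'cares about looks', who stayed: {mean_stayed:.2f}")
```

In this made-up model the people who stay care noticeably less about looks than the user base as a whole, purely because of who chose to stay. Any conclusion drawn from the stayers alone would understate how much looks matter to everyone else.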

Race

This brings me to the most interesting part of the book, where Rudder explores race. Again, it ends up being too blunt by far.

Here’s the thing. Race is a big deal in this country, and racism is a heavy criticism to be firing at people, so you need to be careful, and that’s a good thing, because it’s important. The way Rudder throws it around is careless, and he risks rendering the term meaningless by not having a careful discussion. The frustrating part is that I think he actually has the data to have a very good discussion, but he just doesn’t make the case the way it’s written.

Rudder pulls together stats on how men of all races rate women of all races on an attractiveness scale of 1-5. It shows that non-black men find their own race attractive and non-black men find black women, in general, less attractive. Interesting, especially when you immediately follow that up with similar stats from other U.S. dating sites and – most importantly – with the fact that outside the U.S., we do not see this pattern. Unfortunately that crucial fact is buried at the end of the chapter, and instead we get this embarrassing quote right after the opening stats:

And an unintentionally hilarious 84 percent of users answered this match question:

Would you consider dating someone who has vocalized a strong negative bias toward a certain race of people?

in the absolute negative (choosing “No” over “Yes” and “It depends”). In light of the previous data, that means 84 percent of people on OKCupid would not consider dating someone on OKCupid.

Here Rudder just completely loses me. Am I “vocalizing” a strong negative bias towards black women if I am a white man who finds white women and Asian women hot?

Especially if you consider that, as consumers of social platforms and sites like OKCupid, we are trained to rank all the products we come across to ultimately get better offerings, it is a step too far for the detective on the other side of the camera to turn around and point fingers at us for doing what we’re told. Indeed, this sentence plunges Rudder’s narrative deeply into the creepy and provocative territory, and he never fully returns, nor does he seem to want to. Rudder seems to confuse provocation for thoughtfulness.

This is, again, a shame. The issues of what we are attracted to, what we can imagine doing, how we might imagine that will look to our wider audience, and how our culture informs those imaginings are all in play here, and a careful conversation about them could have been drawn out in a non-accusatory and much more useful way.

Categories: data science, feedback loop, news, rant, statistics

Comments (15)

pdehaye

September 16, 2014 at 8:32 am

Some of the examples here:
http://bigdata.fairness.io
were actually pretty interesting and thoughtful.

- Cathy O'Neil, mathbabe
  
  September 16, 2014 at 9:51 am
  
  Wow, thanks so much, that’s fantastic!
  
  Cathy
  
pdehaye

September 16, 2014 at 9:59 am

You are welcome…
There is also a conference going on at Stanford right now on a similar topic:
http://pacscenter.stanford.edu/content/ethics-data-conference

- Cathy O'Neil, mathbabe
  
  September 16, 2014 at 10:08 am
  
  Thanks! Too bad the conference is invitation only.
  
Lon Thomas

September 16, 2014 at 12:49 pm

graf 7 “he just says there are lots of people OKCupid for the first few chapters,” is that lots of OKCupid people? Just blew my mind a bit, trying to make OKCupid into an adverbial phrase, or I’m just misreading it entirely.

- Cathy O'Neil, mathbabe
  
  September 16, 2014 at 12:53 pm
  
  Haha sorry yes meant users.
  
Auros

September 16, 2014 at 3:44 pm

I agree that the OKCupid blog, which is most of the basis for the book, is kind of facile a lot of the time. As I recall, they did do a comparison of the user base against a broad picture of the US population at some point, though. And regarding the question of how “attractive” gets defined — they’ve addressed that point very directly. They used to have separate ratings, for appearance and personality, and they found the two to be ridiculously highly correlated. (See the “we experiment on human beings!” post, for the example of a soft-porn-y, totally text-free profile that got a 99th percentile “personality” rating, and the experiment with showing or suppressing text.)

http://blog.okcupid.com/index.php/we-experiment-on-human-beings/

My own suspicion about this is that people are much more willing to assign a “hot but dislikable” score than they are to assign a “nice but ugly” score. With the latter, they simply decline to rate at all. If you look at the graph of the looks vs personality scores, you’ll see that there are a bunch of “outliers” like {looks 5, personality 3}

- Auros
  
  September 16, 2014 at 3:49 pm
  
  Gah, stupid thing posted before I was done, and I can’t edit. 😛
  
  continuing from where I left off: …but there appear to be none of the form {looks 3, personality 5}. Possibly people also decline to assign {looks 5, personality 1 or 2}, because in that case, the personality is so bad that they don’t want the looks-rating to trigger a possible match.
  
  I think this actually is pretty good evidence that people are a little shallow about looks. They try to be kind about not assigning a low looks rating to somebody that seems nice, but they still don’t apply the high personality rating that might result in a match notice. They *do* assign the high rating to the person who they think is hot, as long as their personality isn’t so terrible that they think they won’t be able to tolerate them for even a few hours.
  
  Basically, people will reject based on either, but they’re much quicker to reject based on looks; to get rejected based on personality, you have to be truly awful. If the typical user sees somebody that they think is hot but kinda dull, they’ll still take a chance on that, most likely on the theory that they might have a fun couple dates and then not see the person again.
  
Santiago Ortiz (@moebio)

September 16, 2014 at 6:16 pm

I wanted to read the book, now I’m not sure. Please tell me if the following is true:

Let’s take the matrix of ages and attractiveness, which I find fascinating. This single object informs me about what ages in men seem to be more attractive for women of certain age, and vice-versa. But it might happen that the recommendation engine mechanically recommends Men of 40 to Women that are 35. And the book doesn’t give me any clue about it. In that case I’m afraid all statistics presented are so biased they are simply meaningless, a pale and distorted reflection of the recommendation engine, with a small percentage of real human behavior (out of people’s freedom when choosing among the ludicrous and biased sampling the recommendation offers, or by the audacious that go beyond the recommendations). But I, reader, can’t even measure the effect of one or the other.

mr mcknuckles

September 17, 2014 at 12:18 pm

Rudder made a similar error (assuming population on site was representative) in a post on his blog. It asked if people liked rough sex. The amount of men liking it on OKC went up steadily as they aged, while women didn’t. He assumed that meant men in general liked rough sex – instead of the other obvious assumption. Older men who liked rough sex might be more likely to be single.

sheisonthemove

September 17, 2014 at 2:11 pm

This is a really interesting critique. I’m currently reading Rudder’s book and as an OKC user, I was surprised by some of the basic misunderstandings he has of the way people use his site (specifically women). In evaluating the ways that straight/gay men/women use the site, he notes that straight women are less likely to list “casual sex” as one of the things they’re looking for. And rather than try to acknowledge why a woman might not want to advertise willingness for casual sex (i.e. increased likelihood of receiving harassing or explicit messages from other users) he writes it off as women being “relative prudes” or blaming it on society’s repression of female sexuality.

There were several points that I really wished he had elaborated more on how the data was collected. I really don’t think he explained the OKC rating system well enough. In fact, I was wondering if the data he cited about male/female attractiveness was drawn from the site’s “My Best Face” function, where other users rate profile photos (seen without the context of the user’s profile, so judging purely on appearance). In this way, a 45-year-old man would be rating women from a range of age groups.

Again, I’m not finished with the book, but I certainly find it interesting, if only as a glimpse into the ways that people are (mis)using the site.

- Cathy O'Neil, mathbabe
  
  September 17, 2014 at 2:15 pm
  
  That’s a really great point.
  
  I was also wondering how these 45-year-old men ended up rating 20-year-old women at all, considering that they generally asked to be matched with women in an age range closer to theirs. How does “My Best Face” work? Does a new face automatically get displayed if you rate one face? If it does, then there’s some kind of algorithm deciding which face to show you next, which was one of my biggest concerns (unless it’s totally random). Would love to know more, thanks!
  
  Cathy
  
  - sheisonthemove
    
    September 17, 2014 at 2:31 pm
    
    Essentially you upload potential profile pics and other OKC users can evaluate it, if I’m remembering correctly, by showing it up against another user’s profile pic. So given the choice between two anonymous users of the opposite sex, they must select the one whose picture appeals to them more.
    
    http://www.okcupid.com/mybestface
    
    Once a certain number of users evaluate your pictures, you get a “report” telling you which pic was the most liked with some information about what kinds of people liked those pictures. I don’t know if the program’s filtered in any way, so that say a 30 year old white man is only ever up against another white man in his 30s. And I have no idea if they somehow filter the users who can vote on a given person’s pics. Again, for all that they’re using our data, they aren’t particularly forthcoming about the methodology (which, I get, they’re preserving their investment, but a little transparency would be nice).
    
vznvzn

September 18, 2014 at 12:36 am

rudder is quite subversive & hilarious at times, have been following his okcupid blogs for years, and he’s not saying anything he hasnt said for years. he’s like a big data hacker and hacking not only on computers but with the people behind them. hes like the wild/ untamed/ unruly startup vs the stodgy dyed-in-the-wool corporation. a vital part of the emerging conversation on the topic. great stuff! hope he eventually gets interviewed by some comedian like conan obrien. predict that would be spectacular. maybe a new big data celebrity in the making? there ought to be someone other than mark zuckerberg who seems to be fleeing from the increasing klieg spotlight glare associated with the role!

Black Suburbans

September 18, 2014 at 6:06 am

Great critique, Cathy. I was put off by the jocular tone of Rudder’s “We Experiment” blog post–it seemed to mock people’s legitimate concerns. This article points out the same tone issue in the book, and that the data is not available to you and me: http://www.theverge.com/2014/9/11/6132023/okcupid-data-blog-is-back-in-book-form

It’s a bummer, really, because Christian Rudder has done other really cool stuff. He was the guy behind TheSpark.com (!), and I enjoy his indie rock band Bishop Allen.
