Here’s what I’ve spent the last couple of days doing: alternately reading Christian Rudder’s new book Dataclysm and proofreading a report by AAPOR which discusses the benefits, dangers, and ethics of using big data, which is mostly “found” data originally meant for some other purpose, as a replacement for public surveys, with their carefully constructed data collection processes and informed consent. The AAPOR folk have asked me to provide tangible examples of the dangers of using big data to infer things about public opinion, and I am tempted to simply ask them all to read Dataclysm as exhibit A.
Rudder is a co-founder of OKCupid, an online dating site. His book mainly pertains to how people search for love and sex online, and how they represent themselves in their profiles.
Here’s something that I will mention for context into his data explorations: Rudder likes to crudely provoke, as he displayed when he wrote this recent post explaining how OKCupid experiments on users. He enjoys playing the part of the somewhat creepy detective, peering into what OKCupid users thought was a somewhat private place to prepare themselves for the dating world. It’s the online equivalent of a video camera in a changing booth at a department store, which he defended not-so-subtly on a recent NPR show called On The Media, and which was written up here.
I won’t dwell on that aspect of the story because I think it’s a good and timely conversation, and I’m glad the public is finally waking up to what I’ve known for years is going on. I’m actually happy Rudder is so nonchalant about it because there’s no pretense.
Even so, I’m less happy with his actual data work. Let me tell you why I say that with a few examples.
Who are OKCupid users?
I spent a lot of time with my students this summer saying that a standalone number wouldn’t be interesting, that you have to compare that number to some baseline that people can understand. So if I told you how many black kids have been stopped and frisked this year in NYC, I’d also need to tell you how many black kids live in NYC for you to get an idea of the scope of the issue. It’s a basic fact about data analysis and reporting.
When you’re dealing with populations on dating sites and you want to conclude things about the larger culture, the relevant “baseline comparison” is how well the members of the dating site represent the population as a whole. Rudder doesn’t do this. Instead he just says there are lots of people on OKCupid for the first few chapters, and then later on after he’s made a few spectacularly broad statements, on page 104 he compares the users of OKCupid to the wider internet users, but not to the general population.
It’s an inappropriate baseline, made too late. Because I’m not sure about you but I don’t have a keen sense of the population of internet users. I’m pretty sure very young kids and old people are not well represented, but that’s about it. My students would have known to compare a population to the census. It needs to happen.
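To make the baseline point concrete, here is a minimal sketch in Python of the comparison I have in mind. Every number in it is invented for illustration (these are not OKCupid or census figures), and the function name `representation_ratio` is my own.

```python
# ASSUMPTION: hypothetical age shares, invented purely for illustration.
# They are NOT real OKCupid or census figures.
site_share = {"18-29": 0.55, "30-44": 0.30, "45-64": 0.13, "65+": 0.02}
census_share = {"18-29": 0.22, "30-44": 0.26, "45-64": 0.34, "65+": 0.18}


def representation_ratio(sample, baseline):
    """Return the sample share divided by the baseline share for each group.

    A ratio above 1 means the group is over-represented in the sample
    relative to the baseline; a ratio below 1 means it is under-represented.
    """
    return {group: sample[group] / baseline[group] for group in baseline}


ratios = representation_ratio(site_share, census_share)
for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}x the baseline share")
```

With made-up shares like these, the youngest group is heavily over-represented and the oldest is barely there, which is exactly the kind of thing a reader needs to know before accepting any claim about the larger culture.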
How do you collect your data?
Let me back up to the very beginning of the book, where Rudder startles us by showing us that the men that women rate “most attractive” are about their age whereas the women that men rate “most attractive” are consistently 20 years old, no matter how old the men are.
Actually, I am projecting. Rudder never actually specifically tells us what the rating is, how it’s exactly worded, and how the profiles are presented to the different groups. And that’s a problem, which he ignores completely until much later in the book when he mentions that how survey questions are worded can have a profound effect on how people respond, but his target is someone else’s survey, not his OKCupid environment.
Words matter, and they matter differently for men and women. So for example, if there were a button for “eye candy,” we might expect women to choose more young men. If my guess is correct, and the term in use is “most attractive”, then for men it might well trigger a sexual concept whereas for women it might trigger a different social construct; indeed I would assume it does.
Since this isn’t a porn site, it’s a dating site, we are not filtering for purely visual appeal; we are looking for relationships. We are thinking beyond what turns us on physically and asking ourselves, who would we want to spend time with? Who would our family like us to be with? Who would make us be attractive to ourselves? Those are different questions and provoke different answers. And they are culturally interesting questions, which Rudder never explores. A lost opportunity.
Next, how does the recommendation engine work? I can well imagine that, once you’ve rated Profile A high, there is an algorithm that finds Profile B such that “people who liked Profile A also liked Profile B”. If so, then there’s yet another reason to worry that such results as Rudder described are produced in part as a result of the feedback loop engendered by the recommendation engine. But he doesn’t explain how his data is collected, how it is prompted, or the exact words that are used.
Here’s a clue that Rudder is confused by his own facile interpretations: men and women both state that they are looking for relationships with people around their own age or slightly younger, and that they end up messaging people slightly younger than they are but not many many years younger. So forty year old men do not message twenty year old women.
Is this sad sexual frustration? Is this, in Rudder’s words, the difference between what they claim they want and what they really want behind closed doors? Not at all. This is more likely the difference between how we live our fantasies and how we actually realistically see our future.
Need to control for population
Here’s another frustrating bit from the book: Rudder talks about how hard it is for older people to get a date but he doesn’t correct for population. And since he never tells us how many OKCupid users are older, nor does he compare his users to the census, I cannot infer this.
Here’s a graph from Rudder’s book showing the age of men who respond to women’s profiles of various ages:
We’re meant to be impressed with Rudder’s line, “for every 100 men interested in that twenty year old, there are only 9 looking for someone thirty years older.” But here’s the thing, maybe there are 20 times as many 20-year-olds as there are 50-year-olds on the site? In which case, yay for the 50-year-old chicks? After all, those histograms look pretty healthy in shape, and they might be differently sized because the population size itself is drastically different for different ages.
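Here is that arithmetic as a short Python sketch. The 100 and 9 come from Rudder’s line; the 20-to-1 ratio of women on the site is my hypothetical from the paragraph above, not a real figure, and I am assuming his counts are totals across each age group.

```python
# From the book: for every 100 men interested in a 20-year-old,
# there are 9 interested in a 50-year-old.
interested_men = {20: 100, 50: 9}

# ASSUMPTION: hypothetical relative number of women of each age on the site
# (20 times as many 20-year-olds as 50-year-olds). Not a real figure.
women_on_site = {20: 20, 50: 1}


def interest_per_woman(interested, population):
    """Normalize raw interest counts by the size of each age group."""
    return {age: interested[age] / population[age] for age in interested}


per_woman = interest_per_woman(interested_men, women_on_site)
for age, value in per_woman.items():
    print(f"age {age}: {value} interested men per woman")
```

Under that hypothetical, the 50-year-olds come out at 9 interested men per woman against 5 for the 20-year-olds. The raw counts point one way and the population-adjusted numbers point the other, which is why the correction matters.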
One of the worst examples of statistical mistakes is his experiment in turning off pictures. Rudder ignores the concept of confounders altogether, which he again miraculously is aware of in the next chapter on race.
To be more precise, Rudder talks about the experiment when OKCupid turned off pictures. Most people went away when this happened but certain people did not:
Some of the people who stayed on went on a “blind date.” Those people, whom Rudder called the “intrepid few,” had a good time with people no matter how unattractive they were deemed to be based on OKCupid’s system of attractiveness. His conclusion: people are preselecting for attractiveness, which is actually unimportant to them.
But here’s the thing, that’s only true for people who were willing to go on blind dates. What he’s done is select for people who are not superficial about looks, and then collect data that suggests they are not superficial about looks. That doesn’t mean that OKCupid users as a whole are not superficial about looks. The ones that are just got the hell out when the pictures went dark.
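A toy model makes the selection effect easy to see. This is a minimal sketch with an invented population and an invented enjoyment formula; nothing here is OKCupid data, and the names `looks_weight`, `enjoyment`, and `looks_effect` are mine.

```python
from statistics import mean

# ASSUMPTION: a toy population. Three users do not care about looks
# (weight 0.0) and seven care a lot (weight 1.0). Invented for illustration.
users = [{"looks_weight": w} for w in (0.0,) * 3 + (1.0,) * 7]

# ASSUMPTION: hypothetical attractiveness scores of possible dates, 0 to 1.
partner_attractiveness = [0.1, 0.3, 0.5, 0.7, 0.9]


def enjoyment(user, attractiveness):
    """Toy model: a date is less fun with a less attractive partner,
    but only in proportion to how much this user cares about looks."""
    return 1.0 - user["looks_weight"] * (1.0 - attractiveness)


def looks_effect(group):
    """Average enjoyment with the most attractive partner minus
    average enjoyment with the least attractive partner."""
    best = mean(enjoyment(u, max(partner_attractiveness)) for u in group)
    worst = mean(enjoyment(u, min(partner_attractiveness)) for u in group)
    return best - worst


# When the pictures go dark, everyone who cares about looks leaves.
stayed = [u for u in users if u["looks_weight"] == 0.0]

print("looks effect among those who stayed:", round(looks_effect(stayed), 2))
print("looks effect among all users:", round(looks_effect(users), 2))
```

Among the people who stayed, the measured effect of looks is zero by construction, while in the full population it is large. Measuring only the stayers and generalizing to everyone is the mistake.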
This brings me to the most interesting part of the book, where Rudder explores race. Again, it ends up being too blunt by far.
Here’s the thing. Race is a big deal in this country, and racism is a heavy criticism to be firing at people, so you need to be careful, and that’s a good thing, because it’s important. The way Rudder throws it around is careless, and he risks rendering the term meaningless by not having a careful discussion. The frustrating part is that I think he actually has the data to have a very good discussion, but he just doesn’t make the case the way it’s written.
Rudder pulls together stats on how men of all races rate women of all races on an attractiveness scale of 1-5. It shows that non-black men find their own race attractive and non-black men find black women, in general, less attractive. Interesting, especially when you immediately follow that up with similar stats from other U.S. dating sites and – most importantly – with the fact that outside the U.S., we do not see this pattern. Unfortunately that crucial fact is buried at the end of the chapter, and instead we get this embarrassing quote right after the opening stats:
And an unintentionally hilarious 84 percent of users answered this match question:
Would you consider dating someone who has vocalized a strong negative bias toward a certain race of people?
in the absolute negative (choosing “No” over “Yes” and “It depends”). In light of the previous data, that means 84 percent of people on OKCupid would not consider dating someone on OKCupid.
Here Rudder just completely loses me. Am I “vocalizing” a strong negative bias towards black women if I am a white man who finds white women and asian women hot?
Especially if you consider that, as consumers of social platforms and sites like OKCupid, we are trained to rank all the products we come across to ultimately get better offerings, it is a step too far for the detective on the other side of the camera to turn around and point fingers at us for doing what we’re told. Indeed, this sentence plunges Rudder’s narrative deeply into the creepy and provocative territory, and he never fully returns, nor does he seem to want to. Rudder seems to confuse provocation for thoughtfulness.
This is, again, a shame. The issues of what we are attracted to, what we can imagine doing, how we might imagine that will look to our wider audience, and how our culture informs those imaginings are all in play here, and could have been drawn out in a non-accusatory and much more useful way.
I am pushing an unusual way of considering economic health. I call it “distributional thinking.” It requires that you not aggregate everything into one statistic, but rather take a few samples from different parts of the distribution and consider things from those different perspectives.
So instead of saying “things are great because the economy has expanded at a rate of 4%” I’d like us to think about more individual definitions of “great.”
For example, it’s a good time to be rich right now. Really good. The stock market keeps hitting all-time highs, the jobs market is great in tech, and it’s still absolutely possible to hide wealth in off-shore tax havens.
It’s not so good to be middle class. Wages are stagnant and have been forever, and jobs are drying up due to automation and a lack of even maintenance-level infrastructure work. Colleges are super expensive, and the best the government can do is fiddle around the edges with interest rates.
It’s a really bad time to be poor in this country. Jobs are hard to find and conditions are horrible. There are more and more arrests for petty crimes as the violent crime rate goes down. Those petty crime arrests lead to big fees and sometimes jail time if you can’t pay the fee. Look at Ferguson as an example of what this kind of frustration can lead to.
Once you are caught in the court system, private probation companies act as abusive debt collectors, and nobody controls their fees, which can be outrageous. To be clear, we let this happen in the name of saving money: private for-profit companies like this guarantee that they won’t cost anything to the local government because they make the people on probation pay for services.
And even though that’s an outrageous and predatory system, it’s not likely to go away. Once they are officially branded as criminals, the poor often lose their voting rights, which means they have little political recourse to protect themselves. On the flip side, they are largely silent about their struggles for the same reason.
Once you think about our economic health this way, you realize how comparatively meaningless the GDP is. It is no longer a good proxy for true economic health, where all classes would be more or less better off as it went up.
And until we get on the same page, where we all go up and down together, it is a mathematical fact that no one statistic could possibly capture the progress we are or are not making. Instead, we need to think distributionally.
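Here is a minimal sketch of what I mean, in Python. The incomes are invented for illustration (in thousands of dollars, ten people standing in for a whole economy) and are chosen so that the total grows by roughly 4%.

```python
from statistics import median

# ASSUMPTION: hypothetical incomes in thousands of dollars, sorted from
# poorest to richest. Invented for illustration, not real data.
last_year = [12, 15, 30, 40, 45, 50, 60, 80, 200, 1000]
this_year = [11, 14, 29, 40, 45, 50, 60, 82, 215, 1050]


def growth(before, after):
    """Fractional change from `before` to `after`."""
    return (after - before) / before


# The single aggregate statistic: growth of the total.
aggregate = growth(sum(last_year), sum(this_year))

# Samples from different parts of the distribution.
bottom = growth(last_year[0], this_year[0])
middle = growth(median(last_year), median(this_year))
top = growth(last_year[-1], this_year[-1])

print(f"aggregate: {aggregate:+.1%}")
print(f"bottom:    {bottom:+.1%}")
print(f"middle:    {middle:+.1%}")
print(f"top:       {top:+.1%}")
```

In this toy economy the aggregate grows by a bit over 4% while the person at the bottom loses ground and the median does not move at all. The headline number is almost entirely a story about the top.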
Any time I see an article about the evaluation system for teachers in New York State, I wince. People get it wrong so very often. Yesterday’s New York Times article written by Elizabeth Harris was even worse than usual.
First, her wording. She mentioned a severe drop in student reading and math proficiency rates statewide and attributed it to a change in the test to the Common Core, which she described as “more rigorous.”
The truth is closer to “students were tested on stuff that wasn’t in their curriculum.” And as you can imagine, if you are tested on stuff you didn’t learn, your score will go down (the Common Core has been plagued by a terrible roll-out, and the timing of this test is Exhibit A). Wording like this matters, because Harris is setting up her reader to attribute the falling scores to bad teachers.
Harris ends her piece with a reference to a teacher-tenure lawsuit: ‘In one of those cases, filed in Albany in July, court documents contrasted the high positive teacher ratings with poor student performance, and called the new evaluation system “deficient and superficial.” The suit said those evaluations were the “most highly predictive measure of whether a teacher will be awarded tenure.”’
In other words, Harris is painting a picture of undeserving teachers sneaking into tenure in spite of not doing their job. It’s ironic, because I actually agree with the statement that the new evaluation system is “deficient and superficial,” but in my case I think it is overly punitive to teachers – overly random, really, since it incorporates the toxic VAM model – but in her framing she is implying it is insufficiently punitive.
Let me dumb Harris’s argument down even further: How can we have 26% English proficiency among students and 94% effectiveness among teachers?! Let’s blame the teachers and question the legitimacy of tenure.
Indeed, after reading the article I felt like looking into whether Harris is being paid by David Welch, the Silicon Valley dude who has vowed to fight teacher tenure nationwide. More likely she just doesn’t understand education and is convinced by simplistic reasoning.
In either case, she clearly needs to learn something about statistics. For that matter, so do other people who drag out this “blame the teacher” line whenever they see poor performance by students.
Because here’s the thing. Beyond obvious issues like switching the content of the tests away from the curriculum, standardized test scores everywhere are hugely dependent on the poverty levels of students. Some data:
It’s not just in this country, either:
The conclusion is that, unless you think bad teachers have somehow taken over poor schools everywhere and booted out the good teachers, and good teachers have taken over rich schools everywhere and booted out the bad teachers (which is supposed to be impossible, right?), poverty has much more of an effect than teachers.
Just to clarify this reasoning, let me give you another example: we could blame bad journalists for lower rates of newspaper readership at a given paper, but since newspaper readership is going down everywhere we’d be blaming journalists for what is a cultural issue.
Or, we could develop a process by which we congratulate specific policemen for a reduced crime rate, but then we’d have to admit that crime is down all over the country.
I’m not saying there aren’t bad teachers, because I’m sure there are. But by only focusing on rooting out bad teachers, we are ignoring an even bigger and harder problem. And no, it won’t be solved by privatizing and corporatizing public schools. We need to address childhood poverty. Here’s one more visual for the road:
There was a recent New York Times op-ed by Sonja Starr entitled Sentencing, by the Numbers (hat tip Jordan Ellenberg and Linda Brown) which described the widespread use – in 20 states so far and growing – of predictive models in sentencing.
The idea is to use a risk score to help inform sentencing of offenders. The risk is, I guess, supposed to tell us how likely the person is to commit another act in the future, although that’s not specified. From the article:
The basic problem is that the risk scores are not based on the defendant’s crime. They are primarily or wholly based on prior characteristics: criminal history (a legitimate criterion), but also factors unrelated to conduct. Specifics vary across states, but common factors include unemployment, marital status, age, education, finances, neighborhood, and family background, including family members’ criminal history.
I knew about the existence of such models, at least in the context of prisoners with mental disorders in England, but I didn’t know how widespread it had become here. This is a great example of a weapon of math destruction and I will be using this in my book.
A few comments:
- I’ll start with the good news. It is unconstitutional to use information such as family member’s criminal history against someone. Eric Holder is fighting against the use of such models.
- It is also presumably unconstitutional to jail someone longer for being poor, which is what this effectively does. The article has good examples of this.
- The modelers defend this crap as “scientific,” which is the worst abuse of science and mathematics imaginable.
- The people using this claim they only use it as a way to mitigate sentencing, but letting a bunch of rich white people off easier because they are not considered “high risk” is tantamount to sentencing poor minorities more.
- It is a great example of confused causality. We could easily imagine a certain group that gets arrested more often for a given crime (poor black men, marijuana possession) just because the police have that practice for whatever reason (Stop & Frisk). The model would then consider any such man at a higher risk of repeat offending, but that’s not because any particular person is actually more likely to do it, but because the police are more likely to arrest that person for it.
- It also creates a negative feedback loop on the most vulnerable population: the model will impose longer sentencing on the population it considers most risky, which will in turn make them even riskier in the future, if “length of time in prison previously” is used as an attribute in the model, which it surely is.
- Not to be cynical, but considering my post yesterday, I’m not sure how much momentum will be created to stop the use of such models, considering how discriminatory it is.
- Here’s an extreme example of preferential sentencing which already happens: rich dude Robert H Richards IV raped his 3-year-old daughter and didn’t go to jail because the judge ruled he “wouldn’t fare well in prison.”
- How great would it be if we used data and models to make sure rich people went to jail just as often and for just as long as poor people for the same crime, instead of the other way around?
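The confused causality point in the list above can be sketched in a few lines of Python. All the rates are hypothetical, chosen only to show the mechanism: two groups behave identically, but one is policed more heavily.

```python
# ASSUMPTION: both groups commit the offense at the same true rate.
# Hypothetical number, chosen only to illustrate the mechanism.
true_offense_rate = 0.10

# ASSUMPTION: hypothetical chance that an offense leads to an arrest,
# which differs only because of policing practice.
arrest_prob = {"heavily_policed": 0.50, "lightly_policed": 0.10}


def observed_arrest_rate(offense_rate, p_arrest):
    """Arrests per person: what a model trained on arrest records sees."""
    return offense_rate * p_arrest


observed = {
    group: observed_arrest_rate(true_offense_rate, p)
    for group, p in arrest_prob.items()
}

ratio = observed["heavily_policed"] / observed["lightly_policed"]
for group, rate in observed.items():
    print(f"{group}: {rate:.1%} arrested")
print(f"apparent risk ratio: {ratio:.1f}x, with identical behavior")
```

A model fit to arrest records would score the heavily policed group as several times riskier even though, by construction, nobody in it is more likely to offend. The score is measuring the police, not the person.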
Here’s what comes up in conversations at my Occupy meetings a lot: systemic racism.
Maybe once a week on average, whether we are talking about the criminal justice system, or the court system, or the educational system, or standardized tests, or chronic employment problems, or welfare rhetoric, or homelessness. There are many very well-informed people in my group who can speak eloquently and convincingly about how the system itself, not any particular person (although they do exist), discriminates against minorities in this country.
As a group we cheered when Ta-Nehisi Coates came out with his Atlantic piece entitled The Case for Reparations. So much resonated, especially the parts about widespread reverse redlining of mortgages to minorities in the run-up to the credit crisis. And it finally taught me how to think about affirmative action.
Another thing that comes up sometimes, although less often: how white people, even liberals like Elizabeth Warren, don’t talk about racism anymore. They want to address education inequalities through class-based or income-based measures rather than race-based ones. They talk about unemployment and joblessness and the need for criminal justice reform without referring to the enormous and glaring racial disparities.
I’m left feeling a lot like I felt in 7th grade social studies when we studied the period of mass genocide of American Indians and called it “Manifest Destiny.”
This recent study entitled Racial Disparities in Incarceration Increase Acceptance of Punitive Policies might explain why white people are so reluctant to talk about racism. Namely, because white people react strangely when you specifically point out systemic racism (they are OK with it).
So in other words, if you tell them how many people are incarcerated in this country compared to other countries, they think it’s terrible and we should stop putting so many people in jail. But if you tell them most of those prisoners (60% in New York City) are black, then they’re less likely to think it’s terrible. They also remember the number wrong, thinking it’s higher than it is. Here’s a succinct summary from this Vox article:
The question seems to be which instinct wins out: the belief that our prison system isn’t fair, or the assumption that a prisoner must be a criminal. According to the study, when whites are primed to think of prisoners as black, it’s the latter that wins out.
The conclusion of the Vox article is this: politicians and activists have figured out that, if they want to agitate for criminal justice reform, they can’t mention systemically racist unfairness, because that just doesn’t upset powerful people enough. Instead, they need to focus on important stuff like saving money, which is how you get white people up in arms. That’s what flies in the focus groups, apparently.
It explains why Elizabeth Warren doesn’t talk about race when she talks about student loans, preferring to talk about “young people”, even though the problem is worse for non-Asian minorities. Similarly, Obama is targeting for-profit colleges without reference to race (but with reference to veterans!) even though for-profit colleges notoriously target minorities.
The problem with understanding stuff like this is that it’s primarily used to be politically cunning, which is not enough. I’d like to talk about how to get people to directly confront racism, starting with liberals.
Today I’d like to rant about a pattern I’ve noticed.
Namely, I have a bunch of female friends and acquaintances that I consider feisty, informed, and argumentative sorts. People who are fun to be around and who know how to stick up for themselves, know how to spot misogyny and paternalism in all contexts, and most of all know how to dismiss such nonsense when it appears, and then get on with whatever they were doing.
And then they get pregnant and they lose most if not all of those properties. They get doctors who tell them what to eat, and how much, even though they’ve been doing quite well feeding themselves for 30 odd years without help. They get doctors who tell them how much pain medication they should have during labor, when it’s months and months before labor and we don’t even know what’s gonna happen. What gives?
Here’s a guess. Partly it’s the baby hormones that make you generally confused when you’re pregnant. The other part is that the stakes are high, and you are not an expert, so you defer to your baby doctor. Plus there’s all those ridiculous and scary pregnancy books out there which just serve to make women neurotic and should be burned. Oh and sometimes the doctors are women so they don’t seem paternalistic. But that’s what it is:
But here’s the thing, there’s not much evidence about exactly how you should eat when you’re pregnant, unless you are doing something absolutely weird. And, in spite of what a no-drugs doctor might suggest, it’s not all that dangerous to babies to have pain meds. In fact it’s super safe to have a baby now compared to the past, both for you and your baby. And thank goodness for that.
On the flip side, a doctor has no business dictating to you that you will have an epidural either, which is what happened to my mom back in the 1970’s. It’s really your choice, and you should decide.
So if you have one of those pushy-ass doctors, fuck ‘em. This is your body, you get to decide that stuff. Go get a new doctor.
And to be sure, I’m not saying you shouldn’t inform yourself about risks and signs of pre-eclampsia and other truly important stuff, but for goodness sakes don’t forget your feminist training. It’s not just your baby here, it’s also you, and yes you deserve to eat food you want to eat and to moderate pain if it gets overwhelming. You will be happier, your baby will be just fine, and she or he won’t remember a thing. Consider it training for how to be a mom later.
Today I read this article written by Allie Gross (hat tip Suresh Naidu), a former Teach for America teacher whose former idealism has long been replaced by her experiences in the reality of education in this country. Her article is entitled The Charter School Profiteers.
It’s really important, and really well written, and just one of the articles in the online magazine Jacobin that I urge you to read and to subscribe to. In fact that article is part of a series (here’s another which focuses on charter schools in New Orleans) and it comes with a booklet called Class Action: An Activist Teacher’s Handbook. I just ordered a couple of hard copies.
I’d really like you to read the article, but as a teaser here’s one excerpt, a rant which she completely backs up with facts on the ground:
You haven’t heard of Odeo, the failed podcast company the Twitter founders initially worked on? Probably not a big deal. You haven’t heard about the failed education ventures of the person now running your district? Probably a bigger deal.
When we welcome schools that lack democratic accountability (charter school boards are appointed, not elected), when we allow public dollars to be used by those with a bottom line (such as the for-profit management companies that proliferate in Michigan), we open doors for opportunism and corruption. Even worse, it’s all justified under a banner of concern for poor public school students’ well-being.
While these issues of corruption and mismanagement existed before, we should be wary of any education reformer who claims that creating an education marketplace is the key to fixing the ills of DPS or any large city’s struggling schools. Letting parents pick from a variety of schools does not weed out corruption. And the lax laws and lack of accountability can actually exacerbate the socioeconomic ills we’re trying to root out.