…the McAuliffe campaign invested heavily in both the data and the creative sides to ensure it could target key voters with specialized messages. Over the course of the campaign, he said, it reached out to 18 to 20 targeted voter groups, with nearly 4,000 Facebook ads, more than 300 banner display ads, and roughly three dozen different pre-roll ads — the ads seen before a video plays — on television and online.
Now I want you to close your eyes and imagine what kind of numbers we will see for the current races, not to mention the upcoming presidential election.
What’s crazy to me about the Times article is that it never questions the implications of this movement. The biggest problem, it seems, is that the analytics have surpassed the creative work of making ads: there are too many segments of populations to tailor the political message to, and not enough marketers to massage those particular messages for each particular segment. I’m guessing that there will be more money and more marketers in the presidential campaign, though.
Translation: politicians can and will send different messages to individuals on Facebook, depending on what they think we want to hear. Not that politicians follow through with all their promises now – they don’t, of course – but imagine what they will say when they can make a different promise to each group. We will all be voting for slightly different versions of a given story. We won’t even know when the politician is being true to their word – which word?
This isn’t the first manifestation of different messages to different groups, of course. Romney’s “47%” speech was a famous example of tailored messaging to super rich donors. But on the other hand, it was secretly recorded by a bartender working the event. There will be no such bartenders around when people read their emails and see ads on Facebook.
I’m not the only person worried about this. For example, ProPublica studied this in Obama’s last campaign (see this description). But given the scale of the big data political ad operations now in place, there’s no way they – or anyone, really – can keep track of everything going on.
There are lots of ways that “big data” is threatening democracy. Most of the time, it’s by removing open discussions of how we make decisions and giving them to anonymous and inaccessible quants; think evidence-based sentencing or value-added modeling for teachers. But these political campaign ads are a more direct attack on the concept of a well-informed public choosing their leader.
The American Enterprise Institute, a conservative think tank, is releasing a report today. It’s called For richer, for poorer: How family structures economic success in America, and there is also an event in DC today from 9:30am til 12:15pm that will be livestreamed. The report looks at statistics, broken down by race and income level, on how marriage is associated with increased hours worked and income, for men especially.
It uses a technique called the “fixed-effects model,” and since I’d never studied that I took a look at it on the Wikipedia page, and in this worked-out example on Josh Blumenstock’s webpage of massage prices in various cities, and in this example, on Richard Williams’s webpage, where it’s also a logit model, for girls in and out of poverty.
The critical thing to know about fixed effects models is that we need more than one snapshot of an object of interest – in this case a person who is or isn’t married – in order to use that person as a control against themselves. So in 1990 Person A is 18 and unmarried, but in 2000 he is 28 and married, and makes way more money. Similarly, in 1990 Person B is 18 and unmarried, but in 2000 he is 28 and still unmarried, and makes more money but not quite as much more money as Person A.
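To make the "control against themselves" idea concrete, here is a minimal sketch in Python of the simplest case: two snapshots per person, where a fixed-effects estimate with a common year effect boils down to comparing each person's change in income. The incomes, the `panel` layout, and the function names are all my own assumptions for illustration; none of it comes from the AEI report.

```python
from statistics import mean

# Hypothetical two-snapshot panel: (person, year, married, income).
# Every number here is invented for illustration; none comes from the AEI report.
panel = [
    ("A", 1990, 0, 15000.0),
    ("A", 2000, 1, 45000.0),
    ("B", 1990, 0, 15000.0),
    ("B", 2000, 0, 35000.0),
]

def within_person_changes(rows):
    """Return {person: (change in married, change in income)} between snapshots."""
    by_person = {}
    for person, year, married, income in rows:
        by_person.setdefault(person, []).append((year, married, income))
    changes = {}
    for person, snapshots in by_person.items():
        # Sort by year so the earlier snapshot comes first.
        (_, m0, y0), (_, m1, y1) = sorted(snapshots)
        changes[person] = (m1 - m0, y1 - y0)
    return changes

def marriage_association(rows):
    """Extra income growth of people who married, relative to those who did not."""
    changes = within_person_changes(rows).values()
    married = [dy for dm, dy in changes if dm == 1]
    stayed_single = [dy for dm, dy in changes if dm == 0]
    return mean(married) - mean(stayed_single)

estimate = marriage_association(panel)
print(estimate)
```

Because each person is compared only with their earlier self, anything fixed about them (where they grew up, say) drops out of the difference. What does not drop out is anything that changes along with marital status, like hours worked, which is why the number is still an association and not a cause.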
The AEI report cannot claim causality – and even notes as much on page 8 of their report – so instead they talk about a bunch of “suggested causal relationships” between marriage and income. But really what they are seeing is that, as men get more hours at work, they also tend to get married. Not sure why the married thing would cause the hours, though. As women get married, they tend to work fewer hours. I’m guessing this is because pregnancy causes both.
The AEI report concludes, rightly, that people who get married, and come from homes where there were married parents, make more money. But that doesn’t mean we can “prescribe” marriage to a population and expect to see that effect. Causality is a bitch.
On the other hand, that’s not what the AEI says we should do. Instead, the AEI is recommending (what else?) tax breaks to encourage people to get married. Most bizarre of their suggestions, at least to me, is to expand tax benefits for single, childless adults to “increase their marriageability.” What? Isn’t that also an incentive to stay single and childless?
What I’m worried about is that this report will be cleverly marketed, using the phrase “fixed effects,” to make it seem like they have indeed proven “mathematically” that individuals, yet again, are to be blamed for the structural failure of our nation’s work problems, and if they would only get married already we’d all be ok and have great jobs. All problems will be solved by tax breaks.
Panelists included Facebook CTO Mike Schroepfer, Google’s SVP of Search Alan Eustace, GoDaddy CEO Blake Irving, and Intuit CTO Tayloe Stansbury. The advice was stale and trite and included things like “speak up,” “lean in,” and “get excited about your ideas like men do.”
By far the best part was the audience response – I wish I’d been there just for that part.
There was a Bingo game on the phrases that were anticipated:
What male allies should really be doing, step 1
Here’s the thing. If you haven’t seen this video of gamer Anita Sarkeesian speaking at the Feminist Frequency conference (hat tip Josh Vekhter), go take a look. It’s a fantastic and articulate diatribe against sexism and misogyny, and it ends with a super reasonable request of the men in the audience and in the world:
Trust women who say they experience sexism.
What’s amazing to me is how hard this is to hear for men in my life. When I repeated this to a couple of them, they actually said that I didn’t experience the stuff that I had. It was kind of nuts, and I had to point out to them that they were failing on the most basic level.
Yes, it requires empathy, and observation, and yes it sucks, because once you start seeing it you will be disappointed in the world. Tough shit, it’s reality.
What male allies should really be doing, step 2
Once men start trusting the women they love and admire and work with, then the next thing they can do is start acting on that knowledge.
I don’t know how many times I’ve been the target of sexism in front of other men and somehow it’s my job to confront it and deal with it. Men, step the fuck up and, when you see sexism happening, once you can manage that, defend the target and put a stop to it. Speak up and defend your friend, or your wife, or your daughter, or your colleague. Thanks.
Yesterday at the Alt Banking group we discussed the recent Koch brothers article from Rolling Stone Magazine, written by Tim Dickinson. You should read it now if you haven’t already.
There are tons of issues that came up, but one of them in particular was the control of information that the Koch brothers maintain over their activities. If you read the article, you realize that the brothers are die-hard libertarians but at some point realized that saying out loud that they are die-hard libertarians was working against them, specifically in terms of getting into trouble for polluting the environment with their chemical factories, so instead they started talking about how much they love the environment and work to protect it.
It’s not that they stopped polluting, it’s that their rhetoric changed. In fact there’s no reason to think they stopped polluting, since they still had plenty of regulators going after them for various violations. Since their apparent change of heart they’ve also decided to be publicly philanthropic, giving money to hospitals, and Lincoln Center, and even PBS (see how that worked out on Stephen Colbert).
The problem with all this window dressing is that people are actually starting to think the Koch brothers may be good guys after all, and what with the fancy lawyers that the Koch brothers hire to control information about them, the public view is very skewed.
For example, how many economists have they bought and inserted into universities nationwide? We will never really know. There’s no way we can keep a score sheet with “good deeds” on one side and “shitty deeds” on the other. We don’t have enough information for the second side.
The exception to this information control is when they get in trouble with regulators and it becomes a matter of public record. And thank goodness those court documents exist, and thank goodness investigative journalist Tim Dickinson did all the work he did to explain it to us.
A couple of conclusions. First, we complain a lot about the bank settlements for the misdeeds of the big banks. Nobody went to jail, and the system is just as likely to repeat this kind of thing again as it was in 2005. But another problem with this out-of-court settlement process, we now realize, is that we actually don’t know what happened except in big, vague terms. There will be no Tim Dickinson reporting on big banks.
Second, the connection to Detroit. Right now there are 15,000 residents of Detroit whose water has been shut down, basically so they can privatize the water system with the best deal from Wall Street. They owe less than $10 million, on average a measly $540. The United Nations has called this water shutoff a violation of the human rights of the people of Detroit.
If you feel bad about that, you can donate to someone’s water bill directly, which is kind of neat.
Or is it? Shouldn’t Obama be declaring a state of emergency in Detroit? Wouldn’t we be doing that in another city that had 15,000 residents without water? Why is this an exception to that rule? Because the victims are poor? Don’t we recognize Detroit as a place where it’s unusually difficult to find work? Are we going to allow people to shut off heat as well, once winter comes?
Once you think about it, the idea of a “private solution” to the Detroit water emergency seems wrong. In fact, you can almost imagine David Koch coming to the rescue here, as part of his “positive optics” campaign, and bailing out the Detroit citizens and then, for good measure, buying up the water system altogether. A hero!
And if you’re in that mode, you can think about the asymptotic limit of that approach, whereby a few very rich people gradually take control of resources, and then there are intermittent famines of various types in different cities, and the rich people swoop in and heroically save the day whilst scooping up even more ownership of what used to be public infrastructure. And we might thank them every time, because it was a dire situation and they didn’t really need to do that with all their money.
It’s frustrating to live in a country that has so many resources but which can’t seem to get it together to meet the basic human needs of its citizens. We need a basic income, at least for the people in Detroit, at least right now.
This recent NYTimes article entitled Health Researchers Will Get $10.1 Million to Counter Gender Bias in Studies spelled out a huge problem that kind of blows me away as a statistician (and as a woman!).
Namely, they have recently decided over at the NIH, which funds medical research in this country, that we should probably check to see how women’s health is affected by drugs, and not just men’s. They’ve decided to give “extra money” to study this special group, namely females.
Here’s the bizarre and telling explanation for why most studies have focused on men and excluded women:
Traditionally many investigators have worked only with male lab animals, concerned that the hormonal cycles of female animals would add variability and skew study results.
Let’s break down that explanation, which I’ve confirmed with a medical researcher is consistent with the culture.
If you are afraid that women’s data would “skew study results,” that means you think the “true result” is the result that works for men. Because adding women’s data would add noise to the true signal, that of the men’s data. What?! It’s an outrageous perspective. Let’s take another look at this reasoning, from the article:
Scientists often prefer single-sex studies because “it reduces variability, and makes it easier to detect the effect that you’re studying,” said Abraham A. Palmer, an associate professor of human genetics at the University of Chicago. “The downside is that if there is a difference between male and female, they’re not going to know about it.”
Ummm… yeah. So instead of testing the effect on women, we just go ahead and optimize stuff for men and let women just go ahead and suffer the side effects of the treatment we didn’t bother to study. After all, women only comprise 50.8% of the population, they won’t mind.
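Here is a toy illustration in Python of what Palmer's "downside" looks like in numbers. The trial data below is entirely made up (a hypothetical drug that helps men and mildly hurts women), and `effect` is just a difference in group means, not any particular study's method.

```python
from statistics import mean

# Hypothetical trial outcomes (higher is better): (sex, treated, outcome).
# These numbers are invented to illustrate the point; they are not real data.
trial = [
    ("M", True, 7.0), ("M", True, 8.0), ("M", False, 5.0), ("M", False, 6.0),
    ("F", True, 4.0), ("F", True, 5.0), ("F", False, 5.0), ("F", False, 6.0),
]

def effect(rows):
    """Average outcome of treated subjects minus average outcome of controls."""
    treated = [outcome for _, is_treated, outcome in rows if is_treated]
    control = [outcome for _, is_treated, outcome in rows if not is_treated]
    return mean(treated) - mean(control)

male_only = effect([row for row in trial if row[0] == "M"])
female_only = effect([row for row in trial if row[0] == "F"])
pooled = effect(trial)

print(male_only, female_only, pooled)
```

In this made-up example the male-only study reports a clean benefit, and nothing in it can tell you that the same drug makes the women worse off. You only see that if women are in the study and you look at them separately.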
This is even true for migraine research, even though two-thirds of migraine sufferers are women.
One reason they like to exclude women: they have periods, and they even sometimes get pregnant, which is confusing for people who like to have clean statistics (on men’s health). In fact my research contact says that traditionally, this bias towards men in clinical trials was said to protect women because they “could get pregnant” and then they’d be in a clinical trial while pregnant. OK.
I’d like to hear more about who is and who isn’t in clinical trials, and why.
My first memory is of my father throwing a plate of eggs at my mother’s head, like a frisbee. My mother had to duck to get out of the way, and the plate exploded on the wall behind her. His eggs hadn’t been cooked well enough, and this was his way of expressing that to my mother, who had cooked them. Then he punched his hand through a glass window. Blood and glass fragments were everywhere. I was 4 years old. I remember running to my bed and crying, and the already familiar feeling of hiding in fear.
My mother was a battered woman who didn’t leave her abuser. And that meant a bunch of things for her and for me and my brother. I cannot explain her reasoning, because I was a small child when most of the abuse occurred. But I can tell you it’s common enough, and it’s not even that hard to understand.
One of the aspects of this decision – to stay with your abuser or not – that I haven’t been hearing a lot of recently, in this whole Ray Rice-inspired nationwide conversation about violence against women, is the economics of it. The worst of my father’s behavior happened when he was unemployed and desperately unhappy with how his life was turning out. Once he got on his feet again he didn’t take stuff out on his wife as much or as often. I imagine that is typical, but what it means is that it’s extra hard to imagine managing a second household, with small children, on one salary, when it’s already a huge struggle to manage one. The economic reality of leaving your husband has to be understood.
Even so, the abuse didn’t completely stop, and it’s not like my mother never considered leaving my father. I remember I went away for a month, to communist Budapest, when I was turning 13, the summer of 1985. When I came back my mother told me that my father had pushed her down the stairs. Then she asked me if she should leave him. I said yes, but then she didn’t do it.
I will probably never really forgive her for asking me that, for putting that kind of responsibility on a child like that, and then not following through. Especially now that I have kids of my own that age, it seems outrageous to put that kind of decision on their plate, or even seem to. It was my last day of childhood, the day I realized there were no responsible people in my family, and that I would have to step up and be the person who negotiated reasonable boundaries or, failing that, call the cops. From then on I was my mother and my brother’s protector.
If anyone ever asks me why I am not intimidated by anyone, I think of that moment. When you are a 13-year-old girl who has decided to stand up for your mother and brother against a large and very strong man, who often becomes an enraged and unreasonable bully, you forget about fear and intimidation, because it’s just something you cannot think about.
Many years later, after I left college, my father engaged me in a series of ritualized revisionist history lessons. Every Christmas, every Thanksgiving, maybe even on July 4th, he would bring up the bad old days and he’d mention how much I’d hated him when I was a teenager, and how he hadn’t deserved it, and how even when he’d been abusive to my mother, she had hit him first, and he hadn’t really wanted to do it but there it is. He often distorted facts, and he never explained why he was doing this.
It always sounded so bizarre to me – how could it matter that my mother had hit him first, not to mention that it was unbelievably hard to imagine? How could that be an excuse for what kind of fear and rage he had manifested on her body and on our family for so long? Answer: it isn’t an excuse.
It was very confusing, these inaccurate family history lessons in sermon form. It made me so angry I never could do anything except stay silent. I didn’t even correct him when he lied about the details, because he was evidently saying all of this more for him than for me.
It took me years to figure out why this conversation kept happening, but I think I finally know now. He was working through his guilt with me as his chosen audience. He was, in a sense, asking for my forgiveness. I never gave it, but what those conversations did accomplish for him was almost the same: he made it my problem for being so unkind as to not forgive him. After all, my mother had forgiven him, why couldn’t I? Looking back, I felt increasing pressure to forgive, but I never gave in. I didn’t even really know how.
Here’s why I’m thinking about this now. This Ray Rice and Adrian Peterson conversation, which I’ve been listening to on sports radio, has gotten me to thinking about this stuff. I am listening to these football guys, these pinnacles of macho masculinity, talking about men who abuse women and children, and describing it as unforgivable. Thank god for those men.
Because here’s the thing. It is unforgivable, but until now I hadn’t realized that I was allowed to think so. I’ve been feeling so guilty for so long at not being able to forgive my father, I never realized that I could just be okay with it. But now I do, and I don’t forgive him, and I never will.
After much deliberation, I’ve finally decided to publish this. To be clear, I’m not doing so to hurt my father or my mother. I’m writing it in hopes that by reading this, people will realize that this kind of thing happens everywhere, to all kinds of people, and that it’s always fucked up and wrong. We need to know that, the NFL needs to know that, and policy makers need to know that. We need to create stronger laws around this, that don’t buckle when the women refuse to press charges.
If this happened to you as a kid, it wasn’t your fault, and you don’t have to forgive if you can’t or you don’t want to, and even if you don’t forgive them, you will probably still love them. Human beings are really good at conflicting emotions. Focus on not being like that yourself. My proudest accomplishment is that I have not perpetuated the cycle of violence on my own family. And good luck.
Here’s what I’ve spent the last couple of days doing: alternately reading Christian Rudder’s new book Dataclysm and proofreading a report by AAPOR which discusses the benefits, dangers, and ethics of using big data, which is mostly “found” data originally meant for some other purpose, as a replacement for public surveys, with their carefully constructed data collection processes and informed consent. The AAPOR folk have asked me to provide tangible examples of the dangers of using big data to infer things about public opinion, and I am tempted to simply ask them all to read Dataclysm as exhibit A.
Rudder is a co-founder of OKCupid, an online dating site. His book mainly pertains to how people search for love and sex online, and how they represent themselves in their profiles.
Here’s something that I will mention for context into his data explorations: Rudder likes to crudely provoke, as he displayed when he wrote this recent post explaining how OKCupid experiments on users. He enjoys playing the part of the somewhat creepy detective, peering into what OKCupid users thought was a somewhat private place to prepare themselves for the dating world. It’s the online equivalent of a video camera in a changing booth at a department store, which he defended not-so-subtly on a recent NPR show called On The Media, and which was written up here.
I won’t dwell on that aspect of the story because I think it’s a good and timely conversation, and I’m glad the public is finally waking up to what I’ve known for years is going on. I’m actually happy Rudder is so nonchalant about it because there’s no pretense.
Even so, I’m less happy with his actual data work. Let me tell you why I say that with a few examples.
Who are OKCupid users?
I spent a lot of time with my students this summer saying that a standalone number wouldn’t be interesting, that you have to compare that number to some baseline that people can understand. So if I told you how many black kids have been stopped and frisked this year in NYC, I’d also need to tell you how many black kids live in NYC for you to get an idea of the scope of the issue. It’s a basic fact about data analysis and reporting.
When you’re dealing with populations on dating sites and you want to conclude things about the larger culture, the relevant “baseline comparison” is how well the members of the dating site represent the population as a whole. Rudder doesn’t do this. Instead he just says there are lots of OKCupid users for the first few chapters, and then later on after he’s made a few spectacularly broad statements, on page 104 he compares the users of OKCupid to the wider internet users, but not to the general population.
It’s an inappropriate baseline, made too late. I don’t know about you, but I don’t have a keen sense of the population of internet users. I’m pretty sure very young kids and old people are not well represented, but that’s about it. My students would have known to compare a population to the census. It needs to happen.
How do you collect your data?
Let me back up to the very beginning of the book, where Rudder startles us by showing us that the men that women rate “most attractive” are about their age whereas the women that men rate “most attractive” are consistently 20 years old, no matter how old the men are.
Actually, I am projecting. Rudder never actually specifically tells us what the rating is, how it’s exactly worded, and how the profiles are presented to the different groups. And that’s a problem, which he ignores completely until much later in the book when he mentions that how survey questions are worded can have a profound effect on how people respond, but his target is someone else’s survey, not his OKCupid environment.
Words matter, and they matter differently for men and women. So for example, if there were a button for “eye candy,” we might expect women to choose more young men. If my guess is correct, and the term in use is “most attractive”, then for men it might well trigger a sexual concept whereas for women it might trigger a different social construct; indeed I would assume it does.
Since this isn’t a porn site, it’s a dating site, we are not filtering for purely visual appeal; we are looking for relationships. We are thinking beyond what turns us on physically and asking ourselves, who would we want to spend time with? Who would our family like us to be with? Who would make us be attractive to ourselves? Those are different questions and provoke different answers. And they are culturally interesting questions, which Rudder never explores. A lost opportunity.
Next, how does the recommendation engine work? I can well imagine that, once you’ve rated Profile A high, there is an algorithm that finds Profile B such that “people who liked Profile A also liked Profile B”. If so, then there’s yet another reason to worry that such results as Rudder described are produced in part as a result of the feedback loop engendered by the recommendation engine. But he doesn’t explain how his data is collected, how it is prompted, or the exact words that are used.
Here’s a clue that Rudder is confused by his own facile interpretations: men and women both state that they are looking for relationships with people around their own age or slightly younger, and that they end up messaging people slightly younger than they are but not many many years younger. So forty year old men do not message twenty year old women.
Is this sad sexual frustration? Is this, in Rudder’s words, the difference between what they claim they want and what they really want behind closed doors? Not at all. This is more likely the difference between how we live our fantasies and how we actually realistically see our future.
Need to control for population
Here’s another frustrating bit from the book: Rudder talks about how hard it is for older people to get a date but he doesn’t correct for population. And since he never tells us how many OKCupid users are older, nor does he compare his users to the census, I cannot infer this.
Here’s a graph from Rudder’s book showing the age of men who respond to women’s profiles of various ages:
We’re meant to be impressed with Rudder’s line, “for every 100 men interested in that twenty year old, there are only 9 looking for someone thirty years older.” But here’s the thing, maybe there are 20 times as many 20-year-olds as there are 50-year-olds on the site? In which case, yay for the 50-year-old chicks? After all, those histograms look pretty healthy in shape, and they might be differently sized because the population size itself is drastically different for different ages.
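Here is that back-of-the-envelope correction as a small Python sketch. The 100 and 9 are Rudder's numbers as quoted above; the site population counts are purely hypothetical, chosen to match my "20 times as many" guess, since the book doesn't give them.

```python
def interest_per_woman(interested_men, women_on_site):
    """Interested men per woman of a given age on the site."""
    return interested_men / women_on_site

# Rudder's raw comparison: 100 men interested in 20-year-olds vs 9 in 50-year-olds.
men_interested = {20: 100, 50: 9}

# Hypothetical site population, assuming 20 times as many 20-year-old women
# as 50-year-old women. These counts are NOT from the book.
women_on_site = {20: 2000, 50: 100}

per_capita = {
    age: interest_per_woman(men_interested[age], women_on_site[age])
    for age in men_interested
}

# How much interest a 50-year-old gets relative to a 20-year-old, per person.
relative_interest = per_capita[50] / per_capita[20]
print(per_capita, relative_interest)
```

Under that assumption each 50-year-old actually gets about 1.8 times the interest of each 20-year-old. I'm not claiming that's what is happening, only that the raw 100-versus-9 comparison can't rule it out without the population counts.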
One of the worst examples of statistical mistakes is his experiment in turning off pictures. Rudder ignores the concept of confounders altogether, which he again miraculously is aware of in the next chapter on race.
To be more precise, Rudder talks about the experiment when OKCupid turned off pictures. Most people went away when this happened but certain people did not:
Some of the people who stayed on went on a “blind date.” Those people, which Rudder called the “intrepid few,” had a good time with people no matter how unattractive they were deemed to be based on OKCupid’s system of attractiveness. His conclusion: people are preselecting for attractiveness, which is actually unimportant to them.
But here’s the thing, that’s only true for people who were willing to go on blind dates. What he’s done is select for people who are not superficial about looks, and then collect data that suggests they are not superficial about looks. That doesn’t mean that OKCupid users as a whole are not superficial about looks. The ones that are just got the hell out when the pictures went dark.
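A quick simulation shows how this kind of selection can manufacture the result. This is a sketch in Python under assumptions I made up: every user gets a random "superficiality" score, only the least superficial users stay when the pictures go dark, and how much someone enjoys a date depends on their partner's looks in proportion to that score. None of this is OKCupid's data or method.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate(n_users=20000, stay_cutoff=0.2, seed=0):
    """Return (correlation for all users, correlation for users who stayed)."""
    rng = random.Random(seed)
    everyone, stayers = [], []
    for _ in range(n_users):
        superficiality = rng.random()  # 0 = looks don't matter, 1 = only looks matter
        partner_looks = rng.random()   # attractiveness of the blind date
        chemistry = rng.random()       # everything else about the date
        # Hypothetical model: looks matter in proportion to superficiality.
        enjoyment = superficiality * partner_looks + (1 - superficiality) * chemistry
        everyone.append((partner_looks, enjoyment))
        # Only the least superficial users stick around for a blind date.
        if superficiality < stay_cutoff:
            stayers.append((partner_looks, enjoyment))

    def looks_vs_enjoyment(pairs):
        return pearson([looks for looks, _ in pairs], [fun for _, fun in pairs])

    return looks_vs_enjoyment(everyone), looks_vs_enjoyment(stayers)

corr_everyone, corr_stayers = simulate()
print(corr_everyone, corr_stayers)
```

Among the people who stayed, looks barely predict enjoyment, while across the whole simulated population they predict it quite well. If you only ever observe the stayers, you will conclude that looks don't matter, and you will be wrong about everyone else.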
This brings me to the most interesting part of the book, where Rudder explores race. Again, it ends up being too blunt by far.
Here’s the thing. Race is a big deal in this country, and racism is a heavy criticism to be firing at people, so you need to be careful, and that’s a good thing, because it’s important. The way Rudder throws it around is careless, and he risks rendering the term meaningless by not having a careful discussion. The frustrating part is that I think he actually has the data to have a very good discussion, but he just doesn’t make the case the way it’s written.
Rudder pulls together stats on how men of all races rate women of all races on an attractiveness scale of 1-5. It shows that non-black men find their own race attractive and non-black men find black women, in general, less attractive. Interesting, especially when you immediately follow that up with similar stats from other U.S. dating sites and – most importantly – with the fact that outside the U.S., we do not see this pattern. Unfortunately that crucial fact is buried at the end of the chapter, and instead we get this embarrassing quote right after the opening stats:
And an unintentionally hilarious 84 percent of users answered this match question:
Would you consider dating someone who has vocalized a strong negative bias toward a certain race of people?
in the absolute negative (choosing “No” over “Yes” and “It depends”). In light of the previous data, that means 84 percent of people on OKCupid would not consider dating someone on OKCupid.
Here Rudder just completely loses me. Am I “vocalizing” a strong negative bias towards black women if I am a white man who finds white women and asian women hot?
Especially if you consider that, as consumers of social platforms and sites like OKCupid, we are trained to rank all the products we come across to ultimately get better offerings, it is a step too far for the detective on the other side of the camera to turn around and point fingers at us for doing what we’re told. Indeed, this sentence plunges Rudder’s narrative deeply into the creepy and provocative territory, and he never fully returns, nor does he seem to want to. Rudder seems to confuse provocation with thoughtfulness.
This is, again, a shame. The issues of what we are attracted to, what we can imagine doing, how we might imagine that will look to our wider audience, and how our culture informs those imaginings are all in play here, and a careful conversation could have drawn them out in a non-accusatory and much more useful way.