I heard an NPR report yesterday with Emily Steel, reporter from the Financial Times, about what kind of attributes make you worth more to advertisers. She has developed an ingenious online calculator here, which you should go play with.
As you can see, it cares about things like whether you’re about to have a kid or are a new parent, as well as whether you’ve got a disease for which a well-developed industry of predatory marketing exists.
For example, you can bump up your worth to $0.27 from the standard $0.0007 if you’re obese, and another $0.10 if you admit to being the type to buy weight-loss products. And of course data warehouses can only get that much money for your data if they know about your weight, which they may or may not, depending on whether you’ve tipped them off by, say, buying weight-loss products.
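The arithmetic behind a calculator like this is simple enough to sketch in a few lines. The base price and the two increments below come from the article; everything else (the function name, the way attributes combine) is my own guess at how such a calculator might work:

```python
# Toy sketch of an ad-value calculator. The $0.0007 base, the $0.27 figure
# for obesity, and the $0.10 weight-loss premium come from the article;
# the structure is hypothetical.

def data_worth(obese=False, buys_weight_loss=False):
    """Estimate the value of one person's data to advertisers, in dollars."""
    # Being obese bumps the worth from the generic base price to $0.27
    worth = 0.27 if obese else 0.0007
    # Admitting you buy weight-loss products adds another $0.10
    if buys_weight_loss:
        worth += 0.10
    return round(worth, 4)

print(data_worth())                                   # generic user
print(data_worth(obese=True, buys_weight_loss=True))  # the article's example
```

The point of the sketch is just that the premiums stack: each additional predatory-marketing-adjacent attribute the data warehouse knows about multiplies what your profile fetches.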
The calculator doesn’t know everything, and you can experiment with how much it does know, but some of the default assumptions are that it knows my age, gender, education level, and ethnicity. Plenty of assumed information to, say, build an unregulated version of a credit score to bypass the Equal Credit Opportunity Act.
Every now and then you see a published result that has exactly the right kind of data, in sufficient amounts, to make the required claim. It’s rare but it happens, and as a data lover, when it happens it is tremendously satisfying.
Today I want to share an example of that happening, namely with this paper entitled Regulating Consumer Financial Products: Evidence from Credit Cards (hat tip Suresh Naidu). Here’s the abstract:
We analyze the effectiveness of consumer financial regulation by considering the 2009 Credit Card Accountability Responsibility and Disclosure (CARD) Act in the United States. Using a difference-in-difference research design and a unique panel data set covering over 150 million credit card accounts, we find that regulatory limits on credit card fees reduced overall borrowing costs to consumers by an annualized 1.7% of average daily balances, with a decline of more than 5.5% for consumers with the lowest FICO scores. Consistent with a model of low fee salience and limited market competition, we find no evidence of an offsetting increase in interest charges or reduction in volume of credit. Taken together, we estimate that the CARD Act fee reductions have saved U.S. consumers $12.6 billion per year. We also analyze the CARD Act requirement to disclose the interest savings from paying off balances in 36 months rather than only making minimum payments. We find that this “nudge” increased the number of account holders making the 36-month payment value by 0.5 percentage points.
That’s a big savings for the poorest people. Read the whole paper, it’s great, but first let me show you some awesome data broken down by FICO score bins:
This data, and the results in this paper, fly directly in the face of the myth that if you regulate away predatory fees in one way, they will pop up in another way. That myth is based on the assumption of a competitive market with informed participants. Unfortunately the consumer credit card industry, as well as the small business card industry, is not filled with informed participants. This is a great example of how asymmetric information causes predatory opportunities.
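The difference-in-difference logic the paper relies on can be sketched in a few lines: compare the change in borrowing costs for accounts affected by the CARD Act, before versus after, against the change for a comparison group over the same period. The numbers below are made up for illustration and are not the paper’s data:

```python
# Difference-in-differences on made-up fee data (not the paper's numbers).
# "treated" = accounts subject to the CARD Act fee limits;
# "control" = a comparison group assumed unaffected by the rule change.

fees = {
    ("treated", "before"): 14.0,   # hypothetical average monthly fees
    ("treated", "after"):   9.0,
    ("control", "before"):  6.0,
    ("control", "after"):   5.5,
}

treated_change = fees[("treated", "after")] - fees[("treated", "before")]
control_change = fees[("control", "after")] - fees[("control", "before")]

# Netting out the control group's change removes the common time trend,
# leaving the estimated effect of the policy itself.
did_estimate = treated_change - control_change
print(did_estimate)
```

The subtraction of the control group’s change is exactly what lets the authors claim the fee reductions were caused by the CARD Act rather than by whatever else was happening to credit cards in 2009.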
A fascinating and timely study just came out about the “Stand Your Ground” laws. It was written by Cheng Cheng and Mark Hoekstra, and is available as a pdf here, although I found out about it in a Reuters column written by Hoekstra. Here’s a longish but crucial excerpt from that column:
It is fitting that much of this debate has centered on Florida, which enacted its law in October of 2005. Florida provides a case study for this more general pattern. Homicide rates in Florida increased by 8 percent from the period prior to passing the law (2000-04) to the period after the law (2006-10). By comparison, national homicide rates fell by 6 percent over the same time period. This is a crude example, but it illustrates the more general pattern that exists in the homicide data published by the FBI.
The critical question for our research is whether this relative increase in homicide rates was caused by these laws. Several factors lead us to believe that laws are in fact responsible. First, the relative increase in homicide rates occurred in adopting states only after the laws were passed, not before. Moreover, there is no history of homicide rates in adopting states (like Florida) increasing relative to other states. In fact, the post-law increase in homicide rates in states like Florida was larger than any relative increase observed in the last 40 years. Put differently, there is no evidence that states like Florida just generally experience increases in homicide rates relative to other states, even when they don’t pass these laws.
We also find no evidence that the increase is due to other factors we observe, such as demographics, policing, economic conditions, and welfare spending. Our results remain the same when we control for these factors. Along similar lines, if some other factor were driving the increase in homicides, we’d expect to see similar increases in other crimes like larceny, motor vehicle theft and burglary. We do not. We find that the magnitude of the increase in homicide rates is sufficiently large that it is unlikely to be explained by chance.
In fact, there is substantial empirical evidence that these laws led to more deadly confrontations. Making it easier to kill people does result in more people getting killed.
If you take a look at page 33 of the paper, you’ll see some graphs of the data. Here’s a rather bad picture of them but it might give you the idea:
That red line is the same in each plot and refers to the log homicide rate in states without the Stand Your Ground law. The blue lines are showing how the log homicide rates looked for states that enacted such a law in a given year. So there’s a graph for each year.
In 2009 there’s only one “treatment” state, namely Montana, which has a population of 1 million, less than one third of one percent of the country. For that reason you see much less stable data. The authors did different analyses, sometimes weighted by population, which is good.
I have to admit, looking at these plots, the main thing I see in the data is that, besides Montana, we’re talking about states that have a higher homicide rate than usual, which could potentially indicate a confounding condition. To address that (and other concerns) the authors conducted “falsification tests”: they studied whether crimes unrelated to Stand Your Ground type laws – larceny and motor vehicle theft – went up at the same time. They found that the answer is no.
The next point is that, although there seem to be bumps for 2005, 2006, and 2008 in the two years after the enactment of the law, there don’t seem to be for 2007 and 2009. And then even those states go down eventually, but the point is they don’t go down as much as the rest of the states without the laws.
It’s hard to do this analysis perfectly, with so few years of data. The problem is that, as soon as you suspect there’s a real effect, you’d want to act on it, since it directly translates into human deaths. So your natural reaction as a researcher is to “collect more data” but your natural reaction as a citizen is to abandon these laws as ineffective and harmful.
Scott Hodge just came out with a column in the Wall Street Journal arguing that reducing income inequality is way too hard to consider. The title of his piece is Scott Hodge: Here’s What ‘Income Equality’ Would Look Like, and his basic argument is as follows.
First of all, the middle quintile already gets too much from the government as it stands. Second of all, we’d have to raise taxes to 74% for the top quintile to even things out. Clearly impossible, QED.
As to the first point, his argument, and his supporting data, is intentionally misleading, as I will explain below. As to his second point, he fails to mention that the top tax bracket has historically been much higher than 74%, even as recently as 1969, and the world didn’t end.
Hodge argues with data he took from a report from the CBO called The Distribution of Federal Spending and Taxes in 2006. This report distinguishes between transfers and spending. Here’s a chart to explain what that looks like, before taxes are considered and by quintile, for non-elderly households (page 5 of the report):
The stuff on the left corresponds to stuff like food stamps. The stuff in the middle is stuff like Medicaid. The stuff on the right is stuff like wars.
Here are a few things to take from the above:
- There’s way more general spending going on than transfers.
- Transfers are very skewed towards the lowest quintile, as would be expected.
- If you look carefully at the right-most graph, the light green version gives you a way of visualizing how much more money the top quintile has versus the rest.
Now let’s break this down a bit further to include taxes. This is a key chart that Hodge referred to from this report (page 6 of the report):
OK, so note that in the middle chart, people in the middle quintile pay more in taxes than they receive in transfers. In the right chart, which includes all spending, the middle quintile comes out about even, depending on how you measure it.
Now let’s go to what Hodge says in his column (emphasis mine):
Looking at prerecession data for non-elderly households in 2006 in “The Distribution of Federal Spending and Taxes in 2006,” the CBO found that those in the bottom fifth, or quintile, of the income scale received $9.62 in federal spending for every $1 they paid in federal taxes of all kinds. This isn’t surprising, since people with low incomes pay little in taxes but receive a lot of transfers.
Nor is it surprising that households in the top fifth received 17 cents in federal spending for every $1 they paid in all federal taxes. High-income households hand over a disproportionate amount in taxes relative to what they get back in spending.
What is surprising is that the middle quintile—the middle class—also got more back from government than they paid in taxes. These households received $1.19 in government spending for every $1 they paid in federal taxes.
In the first paragraph Hodge intentionally conflates the concepts of “transfers” and “spending”. He continues to do this for the next two paragraphs, and by the last sentence it is easy for a reader to imagine a middle-quintile family paying $100 in taxes and receiving $119 in food stamps. This is of course not true at all.
What’s nuts about this is that it’s mathematically equivalent to complaining that half the population is below median intelligence. Duh.
Since we have a skewed distribution of incomes, and therefore a skewed distribution of tax receipts as well as transfers, in the context of a completely balanced budget we would expect the middle quintile – which has a below-mean average income – to pay slightly less in taxes than the government spends on it. It’s a mathematical fact as long as our federal tax system isn’t regressive, which it’s not.
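To see the arithmetic, take a toy population with a right-skewed income distribution, a flat (proportional) tax, and spending divided equally per household. All the numbers below are invented; only the mechanism matters: the middle household’s income is below the mean, so its proportional tax bill comes in below the equal per-household spending.

```python
# Toy illustration: skewed incomes + non-regressive (here, flat) tax
# + equal per-household spending means the middle of the distribution
# gets back more than it pays. All numbers are invented.

incomes = [10, 20, 30, 50, 140]        # right-skewed: mean is 50, median is 30
tax_rate = 0.20                        # flat tax, for simplicity
revenue = tax_rate * sum(incomes)      # balanced budget: spend all revenue
spending_each = revenue / len(incomes) # equal spending per household

middle_tax = tax_rate * incomes[2]     # what the median household pays
print(spending_each, middle_tax)       # spending exceeds the middle's taxes
```

Because the mean income (50) sits above the median (30), the median household’s flat-tax bill is below the per-household spending level by construction, with no laziness or mooching required.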
In other words, this guy is just framing stuff in a “middle class is lazy and selfish, what could rich people possibly be expected do about that?” kind of way. Who is this guy anyway?
Turns out that Hodge is the President of the Tax Foundation, which touts itself as “nonpartisan” but which has gotten funding from Big Oil and the Koch brothers. I guess it’s fair to say he has an agenda.
One thing I learned on the “Public Facing Math” panel at the JMM was that I needed to know more about the Common Core, since so much of the audience was very interested in discussing it and since it was actually a huge factor in the public’s perception of math, both in the sense of the high school math curriculum and in the context of the associated mathematical models related to assessments. In fact at that panel I promised to learn more about the Common Core and I urged other mathematicians in the room to do the same.
If you don’t know anything about Diane Ravitch, you should. She’s got a super interesting history in education – she’s an education historian – and in particular has worked high up, as the U.S. Assistant Secretary of Education and on the National Assessment Governing Board, which supervises the National Assessment of Educational Progress.
What’s most interesting about her is that, as a high ranking person in education, she originally supported the Bush “No Child Left Behind” policy but now is an outspoken opponent of it as well as Obama’s “Race to the Top“, which she claims is an extension of the same bad idea.
Ravitch writes an incredibly interesting blog on education issues and, what’s most interesting to me, assessment issues.
Ravitch in Westchester
Let me summarize her remarks in a free-form and incomplete way. If you want to know exactly what she said and how she said it, watch the video, and feel free to skip the first 16 minutes of introductions.
She doesn’t like the Common Core initiative and mentions that the Common Core standards were developed by Gates Foundation people, mostly not experienced educators, many of them associated with the testing industry. So there’s a suspicion right off the bat that the material is overly academic and unrealistic for actual teachers in actual classrooms.
She also objects to the idea of any fixed and untested set of standards. No standard is perfect, and this one is rigid. At the very least, if we need a “one solution for all” kind of standard, it needs to be under constant review and testing and open to revisions – a living document to change with the times and with the needs and limits of classrooms.
So now we have an unrealistic and rigid set of standards, written by outsiders with vested interests, and it’s all for the sake of being able to test everyone to death. She also made some remarks about the crappiness of the Value-Added Model similar to stuff I’ve mentioned in the past.
The Common Core initiative, she explains, exposes an underlying and incorrect mindset, which is that testing makes kids learn, and more testing makes kids learn faster. That setting a high bar makes kids suddenly be able to jump higher. The Common Core, she says, is that higher bar. But just because you raise standards doesn’t mean people suddenly know more.
In fact, she got a leaked copy of last year’s Common Core test and saw that its 5th grade version is similar to a current 8th grade standardized test. So it’s very much this “raise the bar” setup. And it points to the fact that standardized testing is used as punishment rather than as a diagnostic.
In other words, if we were interested in finding out who needs help and giving them help, we wouldn’t need harder and harder tests, we’d just look at who is struggling with the current tests and go help them. But because it’s all about punishment, we need to add causality and blame to the environment.
She claims that poverty causes kids to underperform in school, and that blaming teachers for the effects of poverty is a huge distraction and meaningless for those kids. In fact, she asks, what is going to happen to all of those kids who fail the Common Core standards? What is going to become of them if we don’t allow them to graduate? And how do we think we are helping them? Why do we spend so much time developing these fancy tests and assessments instead of figuring out how to help them graduate?
She also points out that the blame game going on in this country is fueled by bad facts.
For example, there is no actual educational emergency in this country. In fact, test scores and graduation rates have never been higher for each racial group. And, although we are always made to be afraid vis-à-vis our “international competition” (great recent example of this here), we actually have historically never scored at the top of international rankings. But we didn’t think that meant we weren’t competitive 50 years ago, so why do we suddenly care now?
She provides the answer. Namely, if people are convinced there is an emergency in education, then the private companies – test prep and testing companies as well as companies that run charter schools – stand to make big money from our response and from straight up privatization.
The statistical argument that poverty causes educational delays is ready to be made. If we want to “fix our educational system” then we need to address poverty, not scapegoat teachers.
I’m going to strike now, while the conversational iron is hot, and ask people to define success for a calculus MOOC.
I’ve already mostly explained why in this recent post, but just in case you missed it, I think mathematics is being threatened by calculus MOOCs, and although maybe in some possible futures this wouldn’t be a bad thing, in others it definitely would.
One way it could be a really truly bad thing is if the metric of success were as perverted as we’ve seen happen in high school teaching, where Value-Added Models have no defined metric of success and are tearing up a generation of students and teachers, creating the kind of opaque, confusing, and threatening environment where code errors lead to people getting fired.
And yes, it’s kind of weird to define success in a systematic way given that calculus has been taught in a lot of places for a long time without such a rigid concept. And it’s quite possible that flexibility should be built into the definition, so as to acknowledge that different contexts need different outcomes.
Let’s keep things as complicated as they need to be to get things right!
The problem with large-scale models is that they are easier to build if you have some fixed definition of success against which to optimize. And if we mathematicians don’t get busy thinking this through, my fear is that administrations will do it for us, and will come up with things based strictly on money and not so much on pedagogy.
So what should we try?
Here’s what I consider to be a critical idea to get started:
- Calculus teachers should start experimenting with teaching calculus in different ways. Do randomized experiments with different calculus sections that meet at comparable times (I say comparable because I’ve noticed that people who show up for 8am sections are typically more motivated students, so don’t pit them against 4pm sections).
- Try out a bunch of different possible definitions of success, including the experience and attitude of the students and the teacher.
- So for example, it could be how students perform on the final, which should be consistent for both sections (although to do that fairly you need to make sure the MOOC you’re using covers the critical material to do the final).
- Or it could be partly an oral exam or long-form written exam, testing whether students have learned to discuss the concepts (keeping in mind that we have to compare the “MOOC” students to the standardly taught students).
- Design the questions you will ask your students and yourself before the semester begins so as to practice good model design – we don’t want to decide on our metric after the fact. A great way to do this is to keep a blog with your plan carefully described – that will timestamp the plan and allow others to comment.
- Of course there’s more than one way to incorporate MOOCs in the curriculum, so I’d suggest more than one experiment.
- And of course the success of the experiment will also depend on the teaching style of the calc prof.
- Finally, share your results with the world so we can all start thinking in terms of what works and for whom.
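As a sketch of what the analysis for one such experiment might look like: randomly assign students to a MOOC-supplemented section or a standard section, give both the same final, and compare mean scores with a two-sample test. The scores below are simulated, not real data, and the section labels and numbers are entirely hypothetical:

```python
import random
import statistics

random.seed(0)

# Simulated final-exam scores for two randomly assigned sections
# (entirely made-up data, just to show the shape of the comparison).
standard = [random.gauss(72, 10) for _ in range(40)]
mooc     = [random.gauss(74, 10) for _ in range(40)]

def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return (mean_a - mean_b) / ((var_a / len(a) + var_b / len(b)) ** 0.5)

t = welch_t(mooc, standard)
print(round(t, 2))
```

The exam score is of course only one of the candidate definitions of success listed above; attitude surveys or oral-exam rubrics would need their own comparable measurements, decided on before the semester starts.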
One last comment. One might complain that, if we do this, we’re actually speeding on our own deaths by accelerating the MOOCs in the classroom. But I think it’s important we take control before someone else does.
A couple of days ago I was listening to a recorded webinar on K-12 student data privacy. I found out about it through an education blog I sometimes read called deutsch29, where the blog writer was complaining about “data cheerleaders” on a panel and how important issues are sure to be ignored if everyone on a panel is on the same, pro-data and pro-privatization side.
Well, as it turns out, deutsch29 was almost correct. Most of the panelists were super bland and pro-data-collection by private companies. But the first panelist, Joel Reidenberg of Fordham Law School, reported on the state of data sharing in this country, the state of the law, and the gulf between the two.
I will come back to his report in another post, because it’s super fascinating, and in fact I’d love to interview that guy for my book.
One thing I wanted to mention was the high-level discussion that took place in the webinar on what regulation is for. Specifically, the following important question was asked:
Does every parent have to become a data expert in order to protect their children’s data?
The answer was different depending on who answered it, of course, but one answer that resonated with me was that that’s what regulation is for, it exists so that parents can rely on regulation to protect their children’s privacy, just as we expect HIPAA to protect the integrity of our medical data.
I started to like this definition – or attribute, if you will – of regulation, and I wondered how it relates to other kinds of regulation, like in finance, as well as how it would work if you’re arguing with people who hate all regulation.
First of all, I think that the financial industry has figured out how to make things so goddamn complicated that nobody can figure out how to regulate anything well. Moreover, they’ve somehow, at least so far, also been able to insist things need to be this complicated. So even if regulation were meant to allow people to interact with the financial system and at the same time “not be experts,” it’s clearly not wholly working. But what I like about it anyway is the emphasis on this issue of complexity and expertise. It took me a long time to figure out how big a problem that is in finance, but with this definition it goes right to the heart of the issue.
Second, as for the people who argue for de-regulation, I think it helps there too. Most of the time they act like everyone is an omniscient free agent who spends all their time becoming expert on everything. And if that were true, then it’s possible that regulation wouldn’t be needed (although transparency is key too). The point is that we live in a world where most people have no clue about the issues of data privacy, never mind when it’s being shielded by ridiculous and possibly illegal contracts behind their kids’ public school system.
Finally, in terms of the potential for protecting kids’ data: here the private companies like InBloom and others are way ahead of regulators, but it’s not because of complexity on the issues so much as the fact that regulators haven’t caught up with technology. At least that’s my optimistic feeling about it. I really think this stuff is solvable in the short term, and considering it involves kids, I think it will have bipartisan support. Plus the education benefits of collecting all this data have not been proven at all, nor do they really require such shitty privacy standards even if they do work.
I’m incredibly excited to announce that I am writing a book called Weapons of Math Destruction for Random House books, with my editor Amanda Cook. There will also be a subtitle which we haven’t decided on yet.
Here’s how this whole thing went down. First I met my amazing book agent Jay Mandel from William Morris through my buddy Jordan Ellenberg. As many of you know, Jordan is also writing a book but it’s much farther along in the process and has already passed the editing phase. Jordan’s book is called How Not To Be Wrong and it’s already available for pre-order on Amazon.
Anyhoo, Jay spent a few months with me telling me how to write a book proposal, and it was a pretty substantial undertaking actually and required more than just an outline. It was like a short treatment of all the chapters, but with two chapters pretty much filled in, including the first, and as you know the first is kind of like an advertisement for the whole rest of the book.
Then, once that proposal was ready, Jay started what he hoped would be a bidding war for the proposal among publishers. He had a whole list of people he talked to from all over the place in the publishing world.
What actually happened though was that Amanda Cook from Crown Publishing, which is part of Random House, was the first person who was interested enough to talk to me about it, and then we hit it off really well, and she made a pre-emptive offer for the book, so the full-on bidding war didn’t end up needing to happen. And then just last week she announced the deal in what’s called Publishers Marketplace, which is for people inside publishing to keep abreast of the deals and news. The actual link is here, but it’s behind a paywall, so Amanda got me a screenshot:
If that font is too small, it says something like this:
Harvard math Ph.D., former Wall Street quant, and advisor to the Occupy movement Cathy O’Neil’s WEAPONS OF MATH DESTRUCTION, arguing that mathematical modeling has become a pervasive and destructive force in society—in finance, education, medicine, politics, and the workplace—and showing how current models exacerbate inequality and endanger democracy and how we might rein them in, to Amanda Cook at Crown in a pre-empt by Jay Mandel at William Morris Endeavor (NA).
So as you can tell I’m incredibly excited about the book, and I have tons of ideas about it, but of course I’d love my readers to weigh in on crucial examples of models and industries that you think might get overlooked.
Please, post a comment or send me an email (located on my About page) with your favorite example of a family of models (Value Added Model for teachers is already in!) or a specific model (Value-at-Risk model in finance is already in!) that is illustrative of feedback loops, or perverted incentives, or creepy modeling, or some such concept that you imagine I’ll be writing about (or should be!). Thanks so much for your input!
One last thing. I’m aiming to finish the writing part by next Spring, and then the book is actually released about 9 months later. It takes a while. I’m super glad I have had the experience of writing a technical book with O’Reilly as well as the homemade brew Occupy Finance with my Occupy group so I know at least some of the ropes, but even so this is a bit more involved.
There is a movement afoot in New York (and other places) to allow private companies to house and mine tons of information about children and how they learn. It’s being touted as a great way to tailor online learning tools to kids, but it also raises all sorts of potential creepy modeling problems, and one very bad sign is how secretive everything is in terms of privacy issues. Specifically, it’s all being done through school systems and without consulting parents.
In New York it’s being done through InBloom, which I already mentioned here when I talked about big data and surveillance. In that post I related an EducationNewYork report which quoted an official from InBloom as saying that the company “cannot guarantee the security of the information stored … or that the information will not be intercepted when it is being transmitted.”
The issue is super important and timely, and parents have been left out of the loop, with no opt-out option, and are actively fighting back, for example with this petition from MoveOn (h/t George Peacock). And although the InBloomers claim that no data about their kids will ever be sold, that doesn’t mean it won’t be used by third parties for various mining purposes and possibly marketing – say for test prep tools. In fact that’s a major feature of InBloom’s computer and data infrastructure, the ability for third parties to plug into the data. Not cool that this is being done on the down-low.
Who’s behind this? InBloom is funded by the Bill & Melinda Gates foundation and the operating system for inBloom is being developed by the Amplify division (formerly Wireless Generation) of Rupert Murdoch’s News Corp. More about the Murdoch connection here.
Wait, who’s paying for this? Besides Gates and Murdoch, New York has spent $50 million in federal grants to set up the partnership with InBloom. And it’s not only New York that is pushing back, according to this Salon article:
InBloom essentially offers off-site digital storage for student data—names, addresses, phone numbers, attendance, test scores, health records—formatted in a way that enables third-party education applications to use it. When inBloom was launched in February, the company announced partnerships with school districts in nine states, and parents were outraged. Fears of a “national database” of student information spread. Critics said that school districts, through inBloom, were giving their children’s confidential data away to companies who sought to profit by proposing a solution to a problem that does not exist. Since then, all but three of those nine states have backed out.
Finally, according to this nydailynews article, Bill de Blasio is coming out on the side of protecting children’s privacy as well. That’s a good sign, let’s hope he sticks with it.
I’m not against using technology to learn, and in fact I think it’s inevitable and possibly very useful. But first we need to have a really good, public discussion about how this data is being shared, controlled, and protected, and that simply hasn’t happened. I’m glad to see parents are aware of this as a problem.
Women are underrepresented in businesses like Goldman Sachs and JP Morgan Chase, especially in the upper management. Why is that?
Many women never go into finance in the first place, and of course some of them do go in but leave. Why are they leaving, though? Is it because they don’t like success? Or they don’t like money? Are they forgetting to lean in sufficiently?
Here’s another possibility, which I dig. They’re less willing to sacrifice their ethics than their male colleagues for the sake of money and business success.
Last Friday I read this paper entitled Who Is Willing to Sacrifice Ethical Values for Money and Social Status? Gender Differences in Reactions to Ethical Compromises and written by Jessica A. Kennedy and Laura J. Kray. It offers ethical distaste problems as at least one contributing reason we don’t see as many women as we might otherwise.
Please read the paper for details, I’m only giving a very brief overview without figures of statistical significance. They have three experiments.
First they saw who was interested in jobs that involved major ethical compromises. Turns out that women were way less interested than men.
Second, to check whether that was because of the ethical compromises or because of the “job” part, they had different kinds of job descriptions and found that, in the presence of a culture of good ethics, women were just as interested in a job as men.
Third, they checked on the existing assumptions about the connection between ethics and various kinds of jobs, like the law, medicine, and “business”. Turns out women associate compromised ethics with business, but less so with law and medicine.
Conclusion: we can attribute some of the lack of women in business to a combination of assumed and real ethical compromises.
First, I love that this paper was written by two women. Maybe that’s what it took for such a common-sense idea to be tested.
Secondly, I think this paper should be kept in mind when we read things about how companies that are diverse are more successful. It’s probably because they are nice places to be that women and others are there, which in turn makes them more successful. It also explains why, when companies set out to be diverse, they often have so much trouble. They want to achieve diversity without changing their underlying culture.
Thirdly, I’m going to have to admit that men are under enormous pressure to succeed at all costs, which could explain why they’re more willing to become ethically compromised to be successful. That says something about our crazy expectations of men in this culture which I think we need to address. I say that as a mother of three sons.
Finally, whenever I hear someone talking about “leaning in” from now on, I will ask them, “lean in to what?”.
Yesterday I read an interesting paper entitled Social influence and the collective dynamics of opinion formation, written by Mehdi Moussaïd, Juliane E. Kämmer, Pantelis P. Analytis, and Hansjörg Neth, about how opinions and strength of conviction spread in a crowd with many interactions, and how consensus is reached. I found the paper on Twitter through Steven Strogatz’s feed.
First they worked on individuals, and how they might update their opinion on some topic upon hearing of someone else’s opinion. They chose super unpolitical questions like, “what is the melting point of aluminum?”.
The interesting thing they did was to track both the opinion and the conviction – how sure someone was.
As expected, people did update their opinion if they heard someone else had a somewhat similar opinion, especially if that other person had a stronger conviction. They tended to ignore opinions that were super different, especially if the convictions were weaker. Sometimes they even adopted the other person’s opinion, if it wasn’t too different and if their original conviction was very low. But most of the time they ignored stuff.
What was also interesting, and what we will get back to, is that when they heard other people had similar opinions to their own, their conviction went up without their opinion changing.
Next they used a computer simulation to see how opinions would propagate if no new information was introduced but many interactions occurred, if everyone acted the same in terms of updating opinions, and if they did so time after time.
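If you want a feel for how that kind of simulation works, here’s a toy version in python. To be clear, the update rules and thresholds below are my guesses at the flavor of the paper’s model, not its actual parameters:

```python
import random

TRUTH = 660.0  # e.g. the melting point of aluminum, in Celsius

def make_agent():
    # opinion: a noisy guess; conviction: how sure the agent is (1 = unsure, 6 = certain)
    return {"opinion": random.gauss(TRUTH, 200), "conviction": random.randint(1, 3)}

def interact(a, b):
    """One-way influence of agent b on agent a (invented rules, for illustration only)."""
    gap = abs(a["opinion"] - b["opinion"])
    if gap < 50:
        # similar opinion: conviction ratchets up, opinion barely moves
        a["conviction"] = min(6, a["conviction"] + 1)
        a["opinion"] += 0.1 * (b["opinion"] - a["opinion"])
    elif b["conviction"] > a["conviction"] and gap < 200:
        # more confident partner, not too far away: compromise toward them
        a["opinion"] += 0.5 * (b["opinion"] - a["opinion"])
    # otherwise: ignore (the most common outcome)

random.seed(0)
agents = [make_agent() for _ in range(100)]
for _ in range(5000):
    a, b = random.sample(agents, 2)
    interact(a, b)

avg_conviction = sum(ag["conviction"] for ag in agents) / len(agents)
print(f"average conviction after interactions: {avg_conviction:.1f}")
```

Notice that conviction only ever ratchets up in this sketch, never down, which is exactly the mechanism behind the result described next: everybody ends up confident whether or not the consensus lands anywhere near the truth.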
So what were the results? I’ll explain a couple, please read the paper for more details, it’s short.
The most interesting to me was that, at the end of the day, after many interactions, the convictions of the group always ended up high even if the answer was wrong. This is because, when people heard similar opinions, their convictions rose, but if they heard differing opinions their convictions didn’t lower. But the end result is that, although high conviction correlated with being correct at the start, it had no correlation with being correct by the end.
In fact, conviction correlated to consensus rather than correctness after a few interactions. The takeaway is that, in the presence of not much information, strong convictions might just imply lots of local agreement.
The next result they found was that the dynamical system that was the opinion making soup had two kinds of attractors. Namely, small groups of “experts,” defined as people with very strong convictions but who were not necessarily correct (think Larry Summers), and large groups of people with low convictions but who all happen to agree with each other.
The fact that these two populations are attractors was named by the authors as “the expert effect” and “the majority effect” respectively. And if fewer than 15% of the population were experts, in the presence of a majority, the majority effect dominated.
Finally, the presence of random noise, which corresponds to people with random opinions and random conviction levels, weakened both of the above effects. If 70% or more of the population was noise, then the two effects described above vanished.
Thoughts on the paper
- One thing I’ve thought about a lot from working with my Occupy group is how opinions form on a given issue. Since we’re going for informed opinions, we very deliberately start out with a learning phase, which could last a long time depending on the complexity of the subject. We also have a thing against experts, although we do have to trust our sources when we read up on a topic. So it’s kind of a balancing act at all times.
- Also, of course, most opinions are not 1-dimensional. I can’t express my opinion on the Fed on a scale from 1 to 100, for example.
- Also, it’s not clear that I update my opinion on issues in exactly the same way each time I hear someone else’s. On the other hand I do continually revise my opinion on stuff.
- The study didn’t look at super political issues. I wonder if it’s different. I guess one of the big differences is in how often someone is truly neutral on a political topic. Maybe you could even define a topic as political in this context somehow, or at least build a test for the politicalness of a topic.
- Let’s assume it also works for political topics. Then the “I heard this so many times it must be true” effect seems to be directly in line with the agenda of Fox News. Also there’s the expert effect going on there as well.
- In any case it’s interesting to note that, if you’re trying to affect opinions, you might either go with “informing and educating the general public” on something or “building up a sufficient squad of experts” on that same thing, where experts are people with super strong opinions who have the ability to interact with lots of people.
The raison d’être of hedge funds is to make the markets efficient. Or at least that’s one of the raisons d’être, the others being 1) to get rich and 2) to leave early on Fridays in the summer (resp. winter) to get a jump on traffic to the Hamptons (resp. ski area, possibly in Kashmir).
And although having efficient markets sounds like a great thing, it makes sense to ask what that would look like from the perspective of a non-insider.
This recent Wall Street Journal article on high-tech snooping does a pretty good job setting the tone here. First, the kind of thing they’re doing:
Genscape is at the vanguard of a growing industry that employs sophisticated surveillance and data-crunching technology to supply traders with nonpublic information about topics including oil supplies, electric-power production, retail traffic and crop yields.
Next, who they’re doing it for:
The techniques, which are perfectly legal, represent the latest advance in the longtime Wall Street practice of searching for every possible trading advantage. But the high cost of much of the new information—Genscape’s oil-supply report costs $90,000 a year—means that some forms of trading are becoming even more the province of firms with substantial resources.
Let’s put these two things together from the perspective of the public. The market is getting information from hidden cameras and sensors, and all that information is being fed to “the market” via proprietary hedge funds via channels we will never tap into. The end result is that the prices of commodities are being adjusted to real-world events more and more quickly, but these are events that are not truly known to the real world.
[Aside: I'm going to try to avoid talking about the "true price" of things like gas, because I think that's pretty much a fool's errand. In any case, let me just say that, in addition to the potentially realtime sensor information that goes into a commodity's price, we also have people trading on it because they are adjusting their exposure to some other historically correlated or anti-correlated instrument, or because they've decided to liquidate their books, or because they've decided the Fed has changed its macroeconomic policy, or because Spain needs to deal with its bank problems, or because someone wants to take money out of the market to rent their summer house in the Hamptons. In other words, I'm not ready to argue that we're getting close to the "true price" of gas here. It's just tradable information like any other.]
I am now prepared, as you hopefully are as well, to question what good this all does for people like us, who are not privy to the kind of expensive information required to make these trades. From our perspective, nothing happens, the price fluctuates, and the market is deemed efficient. Is this actually an improvement over the alternative version where something happens, and then the price adjusts? It’s an expensive arms race, taking up vast resources, where things have only become more opaque.
How vast are those resources? Having worked in finance, I know the answer is a shit-ton, if it is profitable in a short-term edgy kind of way. Just as those guys dug a hole through mountains to make the connection between New York and Chicago a few nanoseconds faster, they will go to any length to get the newest info on the market, as long as it is deemed to have a profitable edge in some time frame – i.e. the amount of time it will take a flood of competitors to do the same thing.
Just as there’s a kind of false myth that most of the web is porn, I’d like to perpetuate a new somewhat false myth that most data gathering and mining happens for the benefit of trading. And if that’s false now, let’s talk about it again in 100 years, when the market for celebrities is mature, and you can make money shorting a bad marriage.
I’m wondering something kind of stupid this morning, which is why we don’t track people’s insulin levels. Or maybe we do and I don’t know about it? In which case, how do I get myself a Quantified Self insulin tracker?
Here’s a bit of background to explain why I’m asking this. Insulin levels in your blood regulate the speed at which sugar in your blood is turned into fat, and similarly they regulate how quickly fat cells release fat into the bloodstream.
If someone without diabetes eats something sweet, their insulin levels shoot up to clear the sugar from the blood so that blood sugar levels don’t get toxic. Because what’s really important, as diabetics know, is that blood sugar levels stay neither too high nor too low. Type I diabetics don’t make insulin, and Type II diabetics don’t respond to insulin properly.
OK so here’s the thing. I’m pretty sure my insulin levels are normally a bit high, and that when I eat sugar or bread they spike up dramatically. And although that might sound like the opposite of diabetes, it’s actually the precursor to diabetes, where my body goes nuts making insulin and then my organs become resistant to it.
But first of all, I’d like to know if I’m right – are my insulin levels elevated? – and second of all, I’d like to know how much my body reacts, insulin-wise, to eating and drinking various things. For instance, I drink a lot of coffee, and I’d like to know what that does to my insulin levels. And what about Coke Zero?
I am probably going to be disappointed here. I know that blood sugar, the critical level for diabetics to keep track of, is still pretty hard to measure, although recent continuous monitors do exist and are helping. So if anyone knows of a “continuous insulin monitor” please tell me!
One last word about why insulin. I am fairly convinced that insulin levels – combined with a measure of insulin resistance – would explain a lot about why certain people retain fat where others don’t on the same diet. And it would be such a relief to stop arguing about willpower and start understanding this stuff scientifically.
One last thing. Please do not comment below telling me how to lose weight or talking about how to have more willpower: I will delete such comments! Please do comment on the scientific issues around the mechanisms of insulin, data collection, and potentially modeling with that data.
There’s a new breed of models out there nowadays that reads your face for subtle expressions of emotions, possibly stuff that normal humans cannot pick up on. You can read more about it here, but suffice it to say it’s a perfect target for computers – something that is free information, that can be trained over many many examples, and then deployed everywhere and anywhere, even without our knowledge since surveillance cameras are so ubiquitous.
Plus, there are new studies that show that, whether you’re aware of it or not, a certain “gut feeling”, which researchers can get at by asking a few questions, will expose whether your marriage is likely to work out.
Let’s put these two together. I don’t think it’s too much of a stretch to imagine that surveillance cameras strategically placed at an altar can now make predictions on the length and strength of a marriage.
I guess it brings up the following question: is there some information we are better off not knowing? I don’t think knowing my marriage is likely to be in trouble would help me keep the faith. And every marriage needs a good dose of faith.
I heard a radio show about Huntington’s disease. There’s no cure for it, but there is a simple genetic test to see if you’ve got it, and it usually starts in adulthood so there’s plenty of time for adults to see their parents degenerate and start to worry about themselves.
But here’s the thing: only 5% of people who have a 50% chance of having Huntington’s actually take that test. For them the value of not knowing that information is larger than the value of knowing it. Of course knowing you don’t have it is better still, but until that happens the ambiguity is preferable.
Maybe what’s critical is that there’s no cure. I mean, if there were a therapy that would help Huntington’s disease sufferers delay or ameliorate it, I think we’d see far more people taking that genetic marker test.
And similarly, if there were ways to save a marriage that is at risk, we might want to know on the altar what the prognosis is. Right?
I still don’t know. Somehow, when things get that personal and intimate, I’d rather be left alone, even if an algorithm could help me “optimize my love life”. But maybe that’s just me being old-fashioned, and maybe in 100 years people will treat their computers like love oracles.
My friend Jordan Ellenberg sent me an article yesterday entitled Coin-flip judgement of psychopathic prisoners’ risk.
It was written by Seena Fazel, a researcher at the department of psychiatry at Oxford, and it concerns his research into the currently used predictive risk models for violence, repeat offense, and the like, which are supposedly tailored to people who have mental disorders like psychopathy.
Turns out there are a lot of these models, and they’re in use today in a bunch of countries. I did not know that. And they’re not just being used as extra, “good to know” information, but rather as a tool to assess important decisions for the prisoner. From the article:
Many US states use such tools to assess sexual offending risk and to help decide whether to exercise their powers to detain sexual offenders indefinitely after a prison term ends.
In England and Wales, these tools are part of the admission criteria for centres that treat people with dangerous and severe personality disorders. Outside North America, Europe and Australasia, similar approaches are increasingly popular, particularly in clinical settings, and there has been a steady growth of research from middle-income countries, such as China, documenting their use.
Also turns out, according to a meta-analysis done by Fazel, that these models don’t work very well, especially for the highest risk most violent population. And what’s super troubling is, as Fazel says, “In practice, the high false-positive rate probably means that some offenders spend longer in prison and secure hospital than their true risk would suggest.”
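To see why false positives pile up here, it helps to do the Bayesian back-of-envelope. Here’s a sketch in python – the base rate, sensitivity, and specificity are made-up round numbers for illustration, not figures from Fazel’s meta-analysis:

```python
# Back-of-envelope: even a seemingly decent instrument misfires badly
# when the outcome being predicted (serious reoffending) is relatively rare.
# All three numbers below are illustrative assumptions, not from the meta-analysis.
base_rate   = 0.10   # fraction of the assessed population who will actually reoffend
sensitivity = 0.70   # fraction of true reoffenders the tool flags as "high risk"
specificity = 0.70   # fraction of non-reoffenders the tool correctly clears

population = 10_000
reoffenders     = base_rate * population                          # 1,000 people
true_positives  = sensitivity * reoffenders                       # 700 flagged correctly
false_positives = (1 - specificity) * (population - reoffenders)  # 2,700 flagged wrongly

ppv = true_positives / (true_positives + false_positives)
print(f"flagged 'high risk': {true_positives + false_positives:.0f}")
print(f"of those, actual reoffenders: {ppv:.0%}")
```

At those assumed numbers, a tool that’s right 70% of the time in both directions still gets it wrong for roughly four out of five people it flags as “high risk,” simply because the outcome is rare.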
Talk about creepy.
This seems to be yet another example of a mathematical obfuscation and intimidation that gives people a false sense of having a good tool at hand. From the article:
Of course, sensible clinicians and judges take into account factors other than the findings of these instruments, but their misuse does complicate the picture. Some have argued that the veneer of scientific respectability surrounding such methods may lead to over-reliance on their findings, and that their complexity is difficult for the courts. Beyond concerns about public protection, liberty and costs of extended detention, there are worries that associated training and administration may divert resources from treatment.
The solution? Get people to acknowledge that the tools suck, and have a more transparent method of evaluating them. In this case, according to Fazel, it’s the researchers who are over-estimating the power of their models. But especially where it involves incarceration and the law, we have to maintain an adherence to a behavior-based methodology. It doesn’t make sense to put people in jail an extra 10 years because a crappy model said so.
This is a case, in my opinion, for an open model with a closed black box data set. The data itself is extremely sensitive and protected, but the model itself should be scrutinized.
Tonight I’m going to be on a panel over at Columbia’s Journalism School called Algorithmic Accountability Reporting: On the Investigation of Black Boxes. It’s being organized by Nick Diakopoulos, Tow Fellow and previous guest blogger on mathbabe. You can sign up to come here and it will also be livestreamed.
Unlike some panel discussions I’ve been on, where the panelists talk about some topic they choose for a few minutes each and then there are questions, this panel will be centered around a draft of a paper coming from the Tow Center at Columbia. First Nick will present the paper and then the panelists will respond to it. Then there will be Q&A.
I wish I could share it with you but it doesn’t seem publicly available yet. Suffice it to say it has many elements in common with Nick’s guest post on raging against the algorithms, and its overall goal is to understand how investigative journalism should handle a world filled with black box algorithms.
Super interesting stuff, and I’m looking forward to tonight, even if it means I’ll miss the New Day New York rally in Foley Square tonight.
As many of you are aware, food stamps were recently cut in this country. This has had a brutal effect on people and families and on neighborhood food pantries, which are being swamped with new customers and increased need among their existing customers.
One thing that I come away with when I read articles describing this problem is how often they detail individuals who have been diagnosed with diabetes but can no longer afford to pay for appropriate food for their condition.
As a person with a family history of diabetes, and someone who has been actively avoiding sugars and carbs to control my blood sugar for the past couple of years, I have a tremendous amount of sympathy for these struggling people.
Let me put it another way. Eating well in this country is expensive, and I’ve had to spend real money on food here in New York City to avoid sugary and fast carb-laden food. I don’t think I could have done that on a skimpy food budget. It’s especially hard to imagine budgeting healthy food on a withering food stamp budget.
Because here’s the thing, and it’s not a secret: shitty food is cheap. If I need to buy lots of food (read: calories) for a small amount of money, I can do it easily, but it will be hell for my blood sugar control. I’m guessing I’d be a full-blown diabetic by now if I were poor and on food stamps.
And that brings me to my nerd question of the morning. How much money are we really saving by decreasing the food stamp allowance in this country, if we consider how many more people will be diagnosed diabetic as a result of the decreased quality of their diet? And how many people’s diabetes will get worse, and how much will that cost?
It’s not over, either: apparently more cuts are coming over the next 10 years (maybe by $4 billion, maybe by $40 billion). And although diabetes care costs have gone up 40% in the last 5 years ($245 billion in 2012 from $174 billion in 2007), that doesn’t mean they won’t go up way more in the next 10.
I’m not an expert on how this all works, but the scale is right – we’re talking billions of dollars nationally, so not small potatoes, and of course we’re also talking about people’s quality of life. Never mind in a moral context – I’m definitely of the mind that people should be able to eat – I’m wondering if the food stamp cuts make sense in a dollars and cents context.
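Here’s the crudest possible version of the breakeven calculation I have in mind, in python. The cut size is the low-end figure cited above; the per-patient cost is my own assumption, not a published number:

```python
# Rough breakeven sketch: how many extra diabetes cases would it take
# for the food stamp cuts to cost more in medical care than they save?
# The cut size comes from the figures above; the per-patient cost is assumed.
annual_cut_savings = 4e9        # low-end estimate of cuts: $4 billion per year
cost_per_diabetic  = 10_000     # assumed annual medical cost per diabetic, in dollars

breakeven_cases = annual_cut_savings / cost_per_diabetic
print(f"breakeven: {breakeven_cases:,.0f} additional diabetes cases per year")
```

At those assumed numbers the cuts stop saving money once they produce 400,000 extra diabetes cases a year. Feel free to fiddle with the per-patient cost, since that’s the made-up part.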
Please tell me if you know of an analysis in this direction.
Today I’d like to discuss a recent article from the Atlantic entitled “They’re watching you at work” (hat tip Deb Gieringer).
In the article they describe what they call “people analytics,” which refers to the new suite of managerial tools meant to help find and evaluate employees of firms. The first generation of this stuff happened in the 1950s, and relied on stuff like personality tests. It didn’t seem to work very well and people stopped using it.
But maybe this new generation of big data models can be super useful? Maybe they will give us an awesome way of more efficiently throwing away people who won’t work out and keeping those who will?
Here’s an example from the article. Royal Dutch Shell sources ideas for “business disruption” and wants to know which ideas to look into. There’s an app for that, apparently, written by a Silicon Valley start-up called Knack.
Specifically, Knack had a bunch of the ideamakers play a video game, and they presumably also were given training data on which ideas historically worked out. Knack developed a model and was able to give Royal Dutch Shell a template for which ideas to pursue in the future based on the personality of the ideamakers.
From the perspective of Royal Dutch Shell, this represents a huge time savings. But from my perspective it means that whatever process the dudes at Royal Dutch Shell developed for vetting their ideas has now been effectively set in stone, at least for as long as the algorithm is being used.
I’m not saying they won’t save time, they very well might. I’m saying that, whatever their process used to be, it’s now embedded in an algorithm. So if they gave preference to a certain kind of arrogance, maybe because the people in charge of vetting identified with that, then the algorithm has encoded it.
One consequence is that they might very well pass on really excellent ideas that happened to have come from a modest person – no discussion necessary on what kind of people are being invisibly ignored in such a set-up. Another consequence is that they will believe their process is now objective because it’s living inside a mathematical model.
The article compares this to the “blind auditions” for orchestras example, where people are kept behind a curtain so that the listeners don’t give extra consideration to their friends. Famously, the consequence of blind auditions has been way more women in orchestras. But that’s an extremely misleading comparison to the above algorithmic hiring software, and here’s why.
In the blind auditions case, the people measuring the musician’s ability have committed themselves to exactly one clean definition of readiness for being a member of the orchestra, namely the sound of the person playing the instrument. And they accept or deny someone, sight unseen, based solely on that evaluation metric.
Whereas with the idea-vetting process above, the training data consisted of “previous winners,” who presumably had to go through a series of meetings and convince everyone in the meeting that their idea had merit, and that they could manage the team to try it out, and all sorts of other things. Their success relied, in other words, on a community’s support of their idea and their ability to command that support.
In other words, imagine that, instead of listening to someone playing trombone behind a curtain, their evaluation metric was to compare a given musician to other musicians that had already played in a similar orchestra and, just to make it super success-based, had made first seat.
Then you’d have a very different selection criterion, and a very different algorithm. It would be based on all sorts of personality issues, and community bias and buy-in issues. In particular you’d still have way more men.
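Here’s a toy python sketch of that failure mode. Everything in it is invented: candidates get a true “quality” and an irrelevant “swagger” trait, drawn independently, and the historical committee rewarded both:

```python
import random

random.seed(1)

def candidate():
    # true quality and a superficial trait ("swagger") are independent by construction
    return {"quality": random.random(), "swagger": random.random()}

# Historical selection: the committee rewarded swagger as well as quality,
# so the "previous winners" label bakes the bias in.
history = [candidate() for _ in range(10_000)]
winners = [c for c in history if 0.5 * c["quality"] + 0.5 * c["swagger"] > 0.7]

# A naive "model": score new candidates by similarity to the average past winner.
avg_quality = sum(c["quality"] for c in winners) / len(winners)
avg_swagger = sum(c["swagger"] for c in winners) / len(winners)

print(f"average winner quality: {avg_quality:.2f}")
print(f"average winner swagger: {avg_swagger:.2f}")
# Both averages come out well above 0.5: an algorithm trained on these labels
# will screen for swagger just as enthusiastically as for quality.
```

The point is that nothing in the training step can tell the difference between the trait that mattered and the trait the old committee happened to like.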
The fundamental difference here is one of transparency. In the blind auditions case, everyone agrees beforehand to judge on a single transparent and appealing dimension. In the black box algorithms case, you’re not sure what you’re judging things on, but you can see when a candidate comes along that is somehow “like previous winners.”
One of the most frustrating things about this industry of hiring algorithms is how unlikely it is to actively fail. It will save time for its users, since after all computers can efficiently throw away “people who aren’t like people who have succeeded in your culture or process” once they’ve been told what that means.
The most obvious consequence of using this model, for the companies that use it, is that they’ll get more and more people just like the people they already have. And that’s surprisingly unnoticeable for people in such companies.
My conclusion is that these algorithms don’t make things objective, they make things opaque. And they embed our old cultural problems in new mathematical models, giving us a false badge of objectivity.
I’m lucky to be working with a super fantastic python guy on this, and the details are under wraps, but let’s just say it’s exciting.
So I’m looking to showcase a few good models to start with, preferably in python, but the critical ingredient is that they’re open source. They don’t have to be great, because the point is to see their flaws and possibly to improve them.
- For example, I put in a FOIA request a couple of days ago to get the current teacher value-added model from New York City.
- A friend of mine, Marc Joffe, has an open source municipal credit rating model. It’s not in python but I’m hopeful we can work with it anyway.
- I’m in search of an open source credit scoring model for individuals. Does anyone know of something like that?
- They don’t have to be creepy! How about a Nate Silver-style weather model?
- Or something that relies on open government data?
- Can we get the Reinhart-Rogoff model?
The idea here is to get the model, not necessarily the data (although even better if it can be attached to data and updated regularly). And once we get a model, we’d build interactives with the model (like this one), or at least the tools to do so, so other people could build them.
At its core, the point of open models is this: you don’t really know what a model does until you can interact with it. You don’t know if a model is robust unless you can fiddle with its parameters and check. And finally, you don’t know if a model is the best possible unless you’ve let people try to improve it.
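To make “fiddle with its parameters” concrete, here’s about the smallest open model I can imagine, in python – a made-up two-weight score plus a perturbation check, with no connection to any real credit model:

```python
# A minimal "open model": a toy score with visible parameters,
# plus a robustness check that perturbs each weight and watches the output.
def score(income, debt, weights):
    return weights["income"] * income - weights["debt"] * debt

weights = {"income": 0.4, "debt": 0.6}          # invented parameters
applicant = {"income": 50.0, "debt": 30.0}      # invented applicant

base = score(applicant["income"], applicant["debt"], weights)
print(f"base score: {base:.1f}")

for name in weights:
    bumped = dict(weights)
    bumped[name] *= 1.10   # perturb one parameter by 10%
    moved = score(applicant["income"], applicant["debt"], bumped)
    print(f"{name} +10% moves the score by {moved - base:+.1f}")
```

A base score of 2 that swings by roughly ±2 when you nudge a single weight by 10% is telling you the model isn’t robust – and you only find that out because the weights are exposed.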
I often talk about the modeling war, and I usually mean the one where the modelers are on one side and the public is on the other. The modelers are working hard trying to convince or trick the public into clicking or buying or consuming or taking out loans or buying insurance, and the public is on the other, barely aware that they’re engaging in anything at all resembling a war.
But there are plenty of other modeling wars that are being fought by two sides which are both sophisticated. To name a couple, Anonymous versus the NSA and Anonymous versus itself.
Here’s another, and it’s kind of bland but pretty simple: Twitter bots versus Twitter.
This war arose from the fact that people care about how many followers someone on Twitter has. It’s a measure of a person’s influence, albeit a crappy one for various reasons (and not just because it’s being gamed).
The high impact of the follower count means it’s in a wannabe celebrity’s best interest to juice their follower numbers, which introduces the idea of fake Twitter accounts to game the model. This is an industry in itself, with an associated arms race of spam filters to get rid of the fakes. The question is, who’s winning this arms race and why?
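For flavor, here’s what the crudest first pass at the filter side of that arms race might look like in python. The features and thresholds are entirely invented, and certainly not Twitter’s actual rules:

```python
def looks_like_bot(account):
    """Toy spam heuristics (invented for illustration, not Twitter's actual filter)."""
    ratio = account["following"] / max(account["followers"], 1)
    return (
        ratio > 20                   # follows hundreds, followed by almost no one
        or account["tweets"] == 0    # account that never tweets
        or account["age_days"] < 2   # brand-new account
    )

accounts = [
    {"followers": 3, "following": 900, "tweets": 0, "age_days": 1},        # spammy
    {"followers": 500, "following": 400, "tweets": 2000, "age_days": 900}, # normal
]
print([looks_like_bot(a) for a in accounts])  # → [True, False]
```

Of course, any fixed heuristic like this is exactly what the spammers then learn to route around, which is what makes it an arms race rather than a one-time fix.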
Twitter has historically made some strides in finding and removing such fake accounts with the help of some modelers who actually bought the services of a spammer and looked carefully at what their money bought them. Recently though, at least according to this WSJ article, it looks like Twitter has spent less energy pursuing the spammers.
This raises the question: why? After all, Twitter has a lot theoretically at stake. Namely, its reputation, because if everyone knows how gamed the system is, they’ll stop trusting it. On the other hand, that argument only really holds if people have something else to use instead as a better proxy of influence.
Even so, considering that Twitter has a bazillion dollars in the bank right now, you’d think they’d spend a few hundred thousand a year to prevent their reputation from being too tarnished. And maybe they’re doing that, but the spammers seem to be happily working away in spite of that.
And judging from my experience on Twitter recently, there are plenty of active spammers which actively degrade the user experience. That brings up my final point, which is that the lack of competition argument at some point gives way to the “I don’t want to be spammed” user experience argument. At some point, if Twitter doesn’t maintain standards, people will just not spend time on Twitter, and its proxy of influence will fall out of favor for that more fundamental reason.