Cathy O'Neil, mathbabe

Sunday morning music videos

February 19, 2012 Cathy O'Neil, mathbabe Comments off

Adele spoof on Gingrich:

The House of the Rising Sun, nerdstyle (h/t Emil):

Categories: Uncategorized

ECB trades crap for slightly less crappy crap

February 18, 2012 Cathy O'Neil, mathbabe 1 comment

Yesterday I read this New York Times article on how the ECB is trading its short term Greek bonds, with Greece, for longer term bonds.

Specifically, in order to avoid holding bonds that Greece is officially planning to “voluntarily” default on, the ECB is turning in that super crappy crap for other bonds that Greece hasn’t yet decided how much they’ll default on.

Just to spell it out even more, the plan to get private bondholders more excited about trusting the European bond market has been this:

have the ECB step in (around the beginning of 2012) and provide liquidity and faith in the bond market,
negotiate that the Greek bonds maturing in March 2012 are given a 70% haircut,
make sure credit default swaps on those bonds are not activated (why we need it to be “voluntary”),
change the terms of the bonds’ contracts so that the holdouts of this voluntary deal can be safely ignored, and
have the ECB trade those bonds for longer-dated bonds at the last minute so they don’t actually have to take losses.

I’m not sure about you, but if I’m a private European debt holder my confidence in the bond market is not stronger right now. The argument for why the ECB is doing this is that they aren’t allowed to be seen giving money to Greece, by their charter. It’s odd to me that this charter, of all the various rules that have been broken here, is the one that is being fixated on as the important one we can’t break.

There are complicated politics going on, I am sure. I’m no expert in European politics, but this is about as European and about as political as things get.

Ignoring all of that, as a private bondholder, I’m putting an “ECB back-door swap” premium on all of my European debt from now on. Except maybe for German debt since I think Germany would rather jump out of the Euro altogether than default on its debt. But every other country is fair game. Bottom line is I short French debt today.

Categories: finance, news

How Harvard is failing its students

February 17, 2012 Cathy O'Neil, mathbabe 33 comments

In a recent Bloomberg article, Ezra Klein argues that Harvard and the other Ivy Leagues are failing their students because the students end up confused about what they can do with themselves after college and end up going to Wall Street firms as a way of making themselves marketable. From the article:

For many kids, college represents an end goal. Once you get into a good college, you’ve made it, and everyone stops worrying about you. You’re encouraged to take classes in subjects like English literature and history and political science, all of which are fine and interesting, but none of which leave you with marketable skills. After a few years of study, you suddenly find it’s late in your junior year, or early in your senior year, and you have no skills pointing to the obvious next step.

What Wall Street figured out is that colleges are producing a large number of very smart, completely confused graduates. Kids who have ample mental horsepower, incredible work ethics and no idea what to do next. So the finance industry takes advantage of that confusion, attracting students who never intended to work in finance but don’t have any better ideas about where to go.

He then talks about how the investment banks make the application process formal, which is something that these kids are good at, and also that Wall Street promises to build them into people with careers and options. He also points out that some kids go into other jobs with formal application processes, like Teach for America, so it’s not all about the money, at least not for all of them, and he concludes by saying how Harvard should change:

My hunch is that we have underemphasized the need to learn skills, rather than simply learn, while in college. The fact that Teach for America — which pays almost nothing and can place its hires far from cosmopolitan hot spots — is one of the few recruiting systems competitive with Wall Street suggests that graduates are open to paths that aren’t remotely as remunerative as finance and aren’t based in New York or San Francisco. They’re just not seeing all that many of them.

Although I agree with some of his diagnosis, I don’t agree with his solution of learning more “skills” in college.

As an aside, as I learned from Karen Ho’s excellent book about investment banking, Liquidated, and also from people I’ve met, the skills you learn on Wall Street as freshmen analysts are primarily bullshitting skills and Excel skills. These most definitely should not be taught at college.

I think he is right about these kids being comfortable with the “formal process” of applying to investment banks etc., but I don’t think he dives deep enough into why this is true. The fact is, the kids who get into Harvard nowadays are, generally speaking, professional test takers. They are moreover dependent on outside metrics for evaluating themselves. If you took away tests and grading systems, these kids would be desperately unhappy, because that’s how they’ve been trained all their lives to think about their self-worth.

When I was a tutor at one of the undergrad houses at grad school, I was incredibly impressed with the international group of undergrads I was in charge of; their credentials, even at the age of 20, were amazing, and their knowledge and self-possession were stunning. Same with the high school kids I taught at math camp last summer. But one thing I saw time and time again was how much they needed to please some outside authority. It’s like they never decided whether they themselves liked their major or whether it was a good fit; it was instead about whether they’d be successful and whether it would be an impressive path for them. So, external metrics of success.

Here’s my diagnosis. These kids are vulnerable to Wall Street investment firms and to things like Teach for America because they have application processes at all. But life, normal adult life, doesn’t have an application process. You actually, at some point, need to figure out what you want to do and what makes you happy. You need to take a leap of faith that your native talents and desires will land you in a reasonable and interesting place.

Actually, you don’t ever have to decide that; you could just keep doing what you think looks good to other people and pleases your parents or friends, without regard to whether it fulfills you at all. That’s kind of what’s happening, I think, with the 36% of the Princeton undergrads going to finance.

As for what Harvard et al can do about this, I would suggest trying to send the message in one of their core curriculum classes, that it’s not only about what you’re good at, it’s also about what makes you happy. I’m not sure those kids have ever really been told that. Being told that might not make a huge difference, but it’s a good start.

And instead of teaching them new “skills,” they should be told about options outside of school, and meet people who are employed doing interesting things with their liberal arts education. Have them talk about the way they made their way there, forged a path, and felt insecure about doing something weird but did it anyway. In other words, present them with role models who are living out their lives on their own terms, with independent thoughts.

Categories: finance, news, rant

A modeled student

February 16, 2012 Cathy O'Neil, mathbabe 6 comments

There’s a recent article from Inside Higher Ed (hat tip David Madigan) which focuses on a new “Predictive Analytics Reporting Framework” that tracks students’ online learning and predicts their outcomes, like whether they will finish the classes they’re taking or drop out. Who’s involved? The University of Phoenix among others:

A broad range of institutions (see factbox) are participating. Six major for-profits, research universities and community colleges — the sort of group that doesn’t always play nice — are sharing the vault of information and tips on how to put the data to work.

I don’t know about you, but I’ve read the Wikipedia article about for-profit universities and I don’t have a great feeling about their goals. In the “2010 Pell Grant Fraud controversy” section you can find this:

Out of the fifteen sampled, all were found to have engaged in deceptive practices, improperly promising unrealistically high pay for graduating students, and four engaged in outright fraud, per a GAO report released at a hearing of the Health, Education, Labor and Pensions Committee held on August 4, 2010.

Anyhoo, back to the article. They track people online and make suggestions for what classes people may want to take:

The data set has the potential to give institutions sophisticated information about small subsets of students – such as which academic programs are best suited for a 25-year-old male Latino with strength in mathematics, for example. The tool could even become a sort of Match.com for students and online universities, Ice said.

That makes me wonder: what would I have been told to do as a white woman with strength in math, if such a program had existed when I went to college? Maybe I would have been pushed to become something that historical data said I’d be best suited for? Maybe something safe, like actuarial work? What if this had existed when my mother was at MIT in applied math in the early ’60s? Would they have had a suggestion for her?

Aside from snide remarks, let me make two direct complaints about this idea. First, I despise the idea of funneling people into chutes and ladders-type career projections based on their external attributes rather than their internal motives and desires. This kind of model, which like all models is based on historical data, is potentially a way to formally adopt racist and sexist policies. It codifies discrimination.

The second complaint: this is really all about money. In the article they mention that the model has already helped them decide whether Pell grants are being issued to students “correctly”:

Students can only receive the maximum Pell Grant award when they take 12 credit hours, which “forces people into concurrency,” said Phil Ice, vice president of research and development for the American Public University System and the project’s lead investigator. “So the question becomes, is the current federal financial aid structure actually setting these individuals up for failure?”

In other words, it looks like they are going to try to use the results of this model to persuade the government to change the way Pell Grants are distributed. Now, I’m not saying that the Pell Grant program is perfect; maybe it should be changed. But I am saying that this model is all about money and helping these online universities figure out which students will be most profitable. I’m familiar with constructing such models, because I was a quant at a hedge fund once and I know how these guys think. You can bet this model is proprietary, too; you wouldn’t want people to see too much of how they are being funneled, since it might get awkward.

The article doesn’t shy away from such comparisons either. From the article:

The project appears to have built support in higher education for the broader use of Wall Street-style slicing and dicing of data. Colleges have resisted those practices in the past, perhaps because some educators have viewed “data snooping” warily. That may be changing, observers said, as the project is showing that big data isn’t just good for hedge funds.

Just to be clear, they are saying it’s also good for for-profit institutions, not necessarily the students in them.

I’d like to see a law passed that forced such models to be open-sourced at the very least. The Bill and Melinda Gates Foundation is funding this; who knows how to reach those guys to make this request?

Categories: data science, finance, math education, news

How Big Pharma Cooks Data: The Case of Vioxx and Heart Disease

February 15, 2012 Cathy O'Neil, mathbabe 11 comments

This is cross posted from Naked Capitalism.

Yesterday I caught a lecture at Columbia given by statistics professor David Madigan, who explained to us the story of Vioxx and Merck. It’s fascinating and I was lucky to get permission to retell it here.

Disclosure

Madigan has been a paid consultant to work on litigation against Merck. He doesn’t consider Merck to be an evil company by any means, and says it does lots of good by producing medicines for people. According to him, the following Vioxx story is “a line of work where they went astray”.

Yet Madigan’s own data strongly suggests that Merck was well aware of the fatalities resulting from Vioxx, a blockbuster drug that earned them $2.4b in 2003, the year before it “voluntarily” pulled it from the market in September 2004. What you will read below shows that the company set up standard data protection and analysis plans which they later either revoked or didn’t follow through with, they gave the FDA misleading statistics to trick them into thinking the drug was safe, and set up a biased filter on an Alzheimer’s patient study to make the results look better. They hoodwinked the FDA and the New England Journal of Medicine and took advantage of the public trust which ultimately caused the deaths of thousands of people.

The data for this talk came from published papers, internal Merck documents that he saw through the litigation process, FDA documents, and SAS files with primary data coming from Merck’s clinical trials. So not all of the numbers I will state below can be corroborated, unfortunately, due to the fact that this data is not all publicly available. This is particularly outrageous considering the repercussions that this data represents to the public.

Background

The process for getting a drug approved is lengthy, requires three phases of clinical trials before getting FDA approval, and often takes well over a decade. Before the FDA approved Vioxx, fewer than 20,000 people tried the drug, versus 20,000,000 people after it was approved. Therefore it’s natural that rare side effects are harder to see beforehand. Also, it should be kept in mind that for the sake of clinical trials, they choose only people who are healthy outside of the one disease which is under treatment by the drug, and moreover they only take that one drug, in carefully monitored doses. Compare this to after the drug is on the market, where people could be unhealthy in various ways and could be taking other drugs or too much of this drug.
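To make the sample-size point concrete, here is a minimal sketch in Python. The side-effect rate of 1 in 50,000 is a made-up number chosen purely for illustration (it is not from the talk), and the model assumes each patient is affected independently; only the 20,000 and 20,000,000 patient counts come from the text above.

```python
def prob_at_least_one_event(rate: float, n_patients: int) -> float:
    """Chance of seeing one or more events among n_patients,
    assuming each patient is independently affected with probability `rate`."""
    return 1.0 - (1.0 - rate) ** n_patients

# HYPOTHETICAL rate, for illustration only (not from the talk).
ASSUMED_RATE = 1 / 50_000

p_trials = prob_at_least_one_event(ASSUMED_RATE, 20_000)       # pre-approval exposure
p_market = prob_at_least_one_event(ASSUMED_RATE, 20_000_000)   # post-approval exposure

print(f"pre-approval:  {p_trials:.2f}")
print(f"post-approval: {p_market:.2f}")
```

Under that assumed rate, the trial population would more likely than not show zero cases, while the post-approval population would almost certainly show some.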

Vioxx was supposed to be a new “NSAID” drug without the bad side effects. NSAIDs are painkillers like Aleve, ibuprofen, and aspirin, which have the unfortunate side effect of gastro-intestinal problems (though only among a subset of long-term users, such as people with advanced arthritis who take painkillers daily to treat chronic pain). The goal was to find a painkiller without the GI side effects. The underlying scientific goal was to find a COX-2 inhibitor without the COX-1 inhibition, since scientists had realized in 1991 that COX-2 suppression corresponded to pain relief whereas COX-1 suppression corresponded to GI problems.

Vioxx introduced and withdrawn from the market

The timeline for Vioxx’s introduction to the market was accelerated: they started work in 1991 and got approval in 1999. They pulled Vioxx from the market in 2004 in the “best interest of the patient”. It turned out that it caused heart attacks and strokes. The stock price of Merck plummeted and $30 billion of its market cap was lost. There was also an avalanche of lawsuits, one of the largest resulting in a $5 billion settlement which was essentially a victory for Merck, considering they made a profit of $10 billion on the drug while it was being sold.

The story Merck will tell you is that they “voluntarily withdrew” the drug on September 30, 2004. In a placebo-controlled study of colon polyps in 2004, it was revealed that over a time period of 1200 days, 4% of the Vioxx users suffered a “cardiac, vascular, or thoracic event” (CVT event), which basically means something like a heart attack or stroke, whereas only 2% of the placebo group suffered such an event. In a group of about 2400 people, this was statistically significant, and Merck had no choice but to pull their drug from the market.
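As a rough check on the “statistically significant” claim, here is a sketch of a two-proportion z-test in Python. The summary above gives only the 4% and 2% rates and a total of about 2400 people, so the even 1200/1200 split between arms and the resulting event counts are my assumptions, and the z-test is my choice of test, not necessarily the one used in the study.

```python
import math

def two_proportion_z_test(events_a: int, n_a: int, events_b: int, n_b: int):
    """Return (z, two-sided p-value) for H0: both groups share one event rate."""
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# ASSUMPTION: arms of 1200 each; 4% and 2% of 1200 give 48 and 24 events.
z, p = two_proportion_z_test(events_a=48, n_a=1200, events_b=24, n_b=1200)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

Under those assumed counts the difference sits comfortably past the usual 0.05 cutoff, which is consistent with the claim that Merck had no room left to argue.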

It should be noted that, on the one hand, Merck should be applauded for checking for CVT events in a colon polyps study. On the other hand, in 1997, at the International Consensus Meeting on COX-2 Inhibition, a group of leading scientists issued a warning in their Executive Summary that it was “… important to monitor cardiac side effects with selective COX-2 inhibitors”. Moreover, an internal Merck email as early as 1996 stated that there was a “… substantial chance that CVT will be observed.” In other words, Merck knew to look out for such things. Importantly, however, there was no subsequent insert in the medicine’s packaging that warned of possible CVT side-effects.

What the CEO of Merck said

What did Merck say to the world at that point in 2004? You can look for yourself at the four-and-a-half-hour Congressional hearing (seen on C-SPAN) which took place on November 18, 2004. Starting at 3:27:10, the then-CEO of Merck, Raymond Gilmartin, testifies that Merck “puts patients first” and “acted quickly” when there was reason to believe that Vioxx was causing CVT events. Gilmartin also went on the Charlie Rose show and repeated these claims, even going so far as to state that the 2004 study was the first time they had a study which showed evidence of such side effects.

How quickly did they really act though? Were there warning signs before September 30, 2004?

Arthritis studies

Let’s go back to the time in 1999 when Vioxx was FDA approved. In spite of the fact that it was approved for a rather narrow use, mainly for arthritis sufferers who needed chronic pain management and were having GI problems on other meds (keeping in mind that Vioxx was way more expensive than ibuprofen or aspirin, so why would you use it unless you needed to), Merck nevertheless launched an ad campaign with Dorothy Hamill and spent $160m (compare that with Budweiser which spent $146m or Pepsi which spent $125m in the same time period).

As I mentioned, Vioxx was approved faster than usual. At the time of its approval, the completed clinical studies had only been 6- or 12-week studies; no longer term studies had been completed. However, there was one underway at the time of approval, namely a study which compared Aleve with Vioxx for people suffering from osteoarthritis and rheumatoid arthritis.

What did the arthritis studies show? These results, which were available in late 2003, showed that the CVT events were more than twice as likely with Vioxx as with Aleve (CVT event rates of 32/1304 = 0.0245 with Vioxx, 6/692 = 0.0087 with Aleve, with a p-value of 0.01). As we see, this directly refutes CEO Gilmartin’s statement that they didn’t have evidence until 2004 and acted quickly when they did.

In fact they had evidence even before this, if they had bothered to put it together (they stated a plan to do such statistical analyses, but so far there’s no evidence that they actually did these promised analyses).

In a previous study (“Table 13”), available in February of 2002, they could have seen that, comparing Vioxx to placebo, the CVT event rate was 27/1087 = 0.0248 with Vioxx versus 5/633 = 0.0079 with placebo, with a p-value of 0.01. So, three times as likely.

In fact, there was an even earlier study (“1999 plan”), results of which were available in July of 2000, where the Vioxx CVT event rate was 10/427 = 0.0234 versus a placebo event rate of 1/252 = 0.0040, with a p-value of 0.05 (so more than 5 times as likely). A p-value of 0.05 is the conventional threshold for statistical significance. So actually they knew to be very worried as early as 2000, but maybe they… forgot to do the analysis?
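For readers who want to redo this arithmetic, here is a sketch in Python that recomputes the risk ratio and a p-value for each of the three comparisons, using only the event counts quoted above. The talk did not say which test produced its p-values, so the two-sided Fisher exact test below is my own choice, and its output need not match the quoted figures exactly.

```python
from math import comb

def fisher_exact_two_sided(events_a: int, n_a: int, events_b: int, n_b: int) -> float:
    """Two-sided Fisher exact p-value for a 2x2 table of events by group."""
    total_events = events_a + events_b
    total_n = n_a + n_b
    denom = comb(total_n, total_events)

    def table_prob(k: int) -> float:
        # Hypergeometric probability that group A holds k of the events.
        return comb(n_a, k) * comb(n_b, total_events - k) / denom

    observed = table_prob(events_a)
    lo = max(0, total_events - n_b)
    hi = min(total_events, n_a)
    # Sum every table at least as unlikely as the observed one.
    return sum(p for p in (table_prob(k) for k in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))

# (label, Vioxx events, Vioxx patients, comparator events, comparator patients)
studies = [
    ("vs Aleve (late 2003)", 32, 1304, 6, 692),
    ("vs placebo (Feb 2002)", 27, 1087, 5, 633),
    ("vs placebo (Jul 2000)", 10, 427, 1, 252),
]

results = {}
for label, ev_v, n_v, ev_c, n_c in studies:
    risk_ratio = (ev_v / n_v) / (ev_c / n_c)
    p_value = fisher_exact_two_sided(ev_v, n_v, ev_c, n_c)
    results[label] = (risk_ratio, p_value)
    print(f"{label}: risk ratio {risk_ratio:.2f}, p = {p_value:.3f}")
```

The point is less the exact p-values than the fact that this is a few lines of Stats 101 on numbers the company already had in hand.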

The FDA and pooled data

Where was the FDA in all of this?

They showed the FDA some of these numbers. But they did something really tricky. Namely, they kept the “osteoarthritis study” results separate from the “rheumatoid arthritis study” results. Each alone was not quite statistically significant, but together they were amply statistically significant. Moreover, they introduced a third category of study, namely the “Alzheimer’s study” results, which looked pretty insignificant (more on that below though). When you pooled all three of these study types together, the overall significance was just barely not there.
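Why does splitting matter so much? Here is a toy sketch in Python. The counts are invented purely to show the mechanism and are NOT Merck’s data (the separate per-study numbers aren’t given here), and the two-proportion z-test is my choice of test.

```python
import math

def two_sided_p(events_a: int, n_a: int, events_b: int, n_b: int) -> float:
    """Two-sided p-value from a two-proportion z-test (normal approximation)."""
    pooled = (events_a + events_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (events_a / n_a - events_b / n_b) / std_err
    return math.erfc(abs(z) / math.sqrt(2))

# INVENTED counts for illustration only; these are NOT Merck's data.
# Each tuple: (drug events, drug patients, control events, control patients)
study_one = (12, 500, 5, 500)
study_two = (12, 500, 5, 500)

p_one = two_sided_p(*study_one)
p_two = two_sided_p(*study_two)

# Pooling simply adds the counts of the two studies together.
pooled_counts = tuple(a + b for a, b in zip(study_one, study_two))
p_pooled = two_sided_p(*pooled_counts)

print(f"study one alone: p = {p_one:.3f}")
print(f"study two alone: p = {p_two:.3f}")
print(f"pooled:          p = {p_pooled:.3f}")
```

With these made-up counts, neither study clears the conventional 0.05 threshold alone, but the pooled table does, because the same effect size is measured on twice as many patients.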

It should be mentioned that there was no apparent reason to separate the different arthritic studies, and there is evidence that they did pool such study data in other places as a standard method. That they didn’t pool those studies for the sake of their FDA report is incredibly suspicious. That the FDA didn’t pick up on this is probably due to the fact that they are overworked lawyers, and too trusting on top of that. That’s unfortunately not the only mistake the FDA made (more below).

Alzheimer’s Study

So the Alzheimer’s study kind of “saved the day” here. But let’s look into this more. First, note that the average age of the 3,000 patients in the Alzheimer’s study was 75, it was a 48-month study, and that the total number of deaths for those on Vioxx was 41 versus 24 on placebo. So actually, on the face of it, it sounds pretty bad for Vioxx.

There were a few contributing reasons why the numbers got so mild by the time the study’s result was pooled with the two arthritis studies. First, when really old people die, there isn’t always an autopsy. Second, although there was supposed to be a DSMB (data and safety monitoring board) as part of the study, and one was part of the original proposal submitted to the FDA, this was dropped surreptitiously in a later FDA update. This meant there was no third party keeping an eye on the data, which is not standard operating procedure for a massive drug study and was a major mistake, possibly the biggest one, by the FDA.

Third, and perhaps most importantly, Merck researchers created an added “filter” to the reported CVT events, which meant they needed the doctors who reported the CVT event to send their info to the Merck-paid people (“investigators”), who looked over the documents to decide whether it was a bona fide CVT event or not. The default was to assume it wasn’t, even though standard operating procedure would have the default assuming that there was such an event. In all, this filter removed about half the initially reported CVT events, and Vioxx patients had their CVT event status revoked about twice as often as placebo patients. Note that the “investigator” in charge of checking the documents from the reporting doctors was paid $10,000 per patient. So presumably they wanted to continue to work for Merck in the future.

The effect of this “filter” was that, instead of it seeming 1.5 times as likely to have a CVT event if you were taking Vioxx, it seemed like it was only 1.03 times as likely, with a high p-value.

If you remove the ridiculous filter from the Alzheimer’s study, then you see that as of November 2000 there was statistically significant evidence that Vioxx caused CVT events in Alzheimer patients.

By the way, one extra note. Many of the 41 deaths in the Vioxx group were dismissed as “bizarre” and therefore unrelated to Vioxx. Namely, car accidents, falling off ladders, accidentally eating bromide pills. But at this point there’s evidence that Vioxx actually accelerates Alzheimer’s disease itself, which could explain those so-called bizarre deaths. This is not to say that Merck knew that, but rather that one should not immediately dismiss a statistically significant result just because it doesn’t make intuitive sense.

VIGOR and the New England Journal of Medicine

One last chapter in this sad story. There was a large-scale study, called the VIGOR study, with 8,000 patients. It was published in the New England Journal of Medicine on November 23, 2000. See also this NPR timeline for details. The authors admitted, in a deceptively roundabout way, that Vioxx had 4 times the number of CVT events as Aleve, though they didn’t show the graphs which would have emphasized this point. They hinted that this is either because Aleve is protective against CVT events or because Vioxx causes them, but left it open.

But Bayer, which owns Aleve, issued a press release saying something like, “if Aleve is protective for CVT events then it’s news to us.” Bayer, it should be noted, has every reason to want people to think that Aleve is protective against CVT events. This problem, and the dubious reasoning explaining it away, was completely missed by the peer review system; if it had been spotted, Vioxx would have been forced off the market then and there. Instead, Merck purchased 900,000 preprints of this article from the NE Journal of Medicine, which is more than the number of practicing doctors in the U.S. In other words, the Journal was used as a PR vehicle for Merck.

The paper emphasized that Aleve had twice the rate of ulcers and bleeding, at 4%, whereas Vioxx had a rate of only 2% among chronic users. When you compare that to the elevated rate of heart attack and death (0.4% to 1.2%) of Vioxx over Aleve, though, the reduced ulcer rate doesn’t seem all that impressive.

A bit more color on this paper. It was written internally by Merck, after which non-Merck authors were found. One of them is Loren Laine. Loren helped Merck develop a sound-bite interview which was 30 seconds long and was sent to the news media and run like a press interview, even though it actually happened in Merck’s New Jersey office (with a backdrop to look like a library) with a Merck employee posing as a neutral interviewer. Some smart lawyer got the outtakes of this video made available as part of the litigation against Merck. Check out this YouTube video, where Laine and the fake interviewer scheme about spin and Laine admits they were being “cagey” about the renal failure issues that were poorly addressed in the article.

The damage done

Also in the Congressional testimony I mentioned above is Dr. David Graham, who speaks passionately from minute 41:11 to minute 53:37 about Vioxx and how it is a symptom of a broken regulatory system. Please take 10 minutes to listen if you can.

He claims a conservative estimate is that 100,000 people have had heart attacks as a result of using Vioxx, leading to between 30,000 and 40,000 deaths (again conservatively estimated). He points out that this 100,000 is 5% of Iowa, and in terms people may understand better, this is like 4 aircraft falling out of the sky every week for 5 years.

According to this blog, the noticeable downwards blip in overall death count nationwide in 2004 is probably due to the fact that Vioxx was taken off the market that year.

Conclusion

Let’s face it, nobody comes out looking good in this story. The peer review system failed, the FDA failed, Merck scientists failed, and the CEO of Merck misled Congress and the people who had lost their husbands and wives to this damaging drug. The truth is, we’ve come to expect this kind of behavior from traders and bankers, but here we’re talking about issues of death and quality of life on a massive scale, and we have people playing games with statistics, with academic journals, and with the regulators.

Just as the financial system has to be changed to serve the needs of the people before the needs of the bankers, the drug trial system has to be changed to lower the incentives for cheating (and massive death tolls) just for a quick buck. As I mentioned before, it’s still not clear that they would have made less money, even including the penalties, if they had come clean in 2000. They made a bet that the fines they’d need to eventually pay would be smaller than the profits they’d make in the meantime. That sounds familiar to anyone who has been following the fallout from the credit crisis.

One thing that should be changed immediately: the clinical trials for drugs should not be run or reported on by the drug companies themselves. There has to be a third party which is in charge of testing the drugs and has the power to take the drugs off the market immediately if adverse effects (like CVT events) are found. Hopefully they will be given more power than risk firms are currently given in finance (which is none). In other words, it needs to be more than reporting, it needs to be an active regulatory power, with smart people who understand statistics and do their own state-of-the-art analyses, although as we’ve seen above even just Stats 101 would sometimes do the trick.

Categories: data science, news

Today is Volcker Day

February 14, 2012 Cathy O'Neil, mathbabe 2 comments

This is a guest post by George Bailey, who is part of Occupy the SEC. I just want to insert here a congratulations to Occupy the SEC for submitting their public comments letter yesterday, and to point out that the organization SIFMA below is the same SIFMA I mentioned here and here (those guys are everywhere, defending the interests of the banks).

Today is “Volcker Day” and Paul Volcker was on a tear.

Mr Volcker added in a formal submission to regulators Monday that “proprietary trading is not an essential commercial bank service that justifies taxpayer support,” and that banks should stop “stonewalling.”

He went on:

“There should not be a presumption that evermore market liquidity brings a public benefit,” Volcker, 84, wrote in a letter submitted yesterday to regulators in defense of the rule curtailing banks’ bets on asset prices with their own money. “At some point, great liquidity, or the perception of it, may itself encourage more speculative trading” (see here and here for the full story).

But then Jamie Dimon came along and bitch slapped Tall Paul. Ouch.

“Paul Volcker by his own admission has said he doesn’t understand capital markets,” Dimon told Francis in the Fox Business interview. “He has proven that to me.”

SIFMA, on behalf of the industry, took over to explain in detail just what it is that Mr. Volcker doesn’t understand in their comment letter. They reiterate their dire warning about the devastating effects on ‘corporate liquidity’ from the Volcker Rule. Yet surprisingly, no non-financial corporate bond issuers filed any comments to acknowledge or object to this danger.

In fact, there are no comment letters from any non-financial companies. They did haul out the widely lampooned Oliver Wyman study to bolster their comment that ‘corporate’ America would suffer horribly if Volcker is enacted. But that just serves to remind us again that the corporate bond liquidity that will be affected is the liquidity in dodgy financial company ‘corporate’ bonds, like CDOs and other drek. They conclude the only solution is a rewrite. They request the rule makers go back and start all over again.

The SIFMA comment letter runs to 175 pages. I haven’t read all the other financial company letters, but the ones I’ve skimmed conform to SIFMA’s position.

The Occupy the SEC comment letter logs in at 325 pages and oddly enough draws the exact opposite conclusions to each of SIFMA’s objections. It’s an interesting contrast. For some reason (some familiarity with the subject matter and public interest primarily) the group seems to have understood and articulated Volcker’s (and the electorate’s) intent pretty effectively.

Of the comment letters received, about 90% are from financial institutions, and another 5% are from foreign governments objecting to the priority the US regulators have gifted to US traders in US Government Bonds. The remaining 5% are from ordinary folks, like Mr. Volcker, Occupy the SEC and other public interest groups.

Its interesting that 95% of the comments reflect the views of the 1%, and the views of the 99% are embodied in the comments of the remaining 5% of commenters. I’m confident the regulators will recognize that, for all its complexity, the rules are comprehensible and can be refined to serve the public’s demand for control over a runaway financial system.

Categories: #OWS, finance, guest post, hedge funds, news

Mathematics has an Occupy moment

February 13, 2012 Cathy O'Neil, mathbabe 26 comments

The Occupy Wall Street movement means a lot of things to a lot of people, but one of the things it pretty much universally represents is the concept of agency.

Instead of sitting passively by and allowing a dysfunctional system to detract from a culture, the participants in Occupy want to object, to reform the system, and if that doesn’t work, to build a new system. And the crucial point is that they feel that they have the right (if not obligation) to do so. Moreover, they wish to construct a new paradigm built on democratic understanding of the shared goals of the system itself, rather than letting whomever is in power decide how things work and who benefits.

I feel like there’s an analogy to be drawn between this process and what’s happening now in the fight between mathematicians and Elsevier, and for that matter the publishing world (as has been pointed out, Springer has the same issues as Elsevier, even though people like Springer a lot more).

It may seem like the fight against Elsevier is only a small part of the mathematics system, in that it’s really only one publisher of many, and some people (like the journal of Topology) have already gone ahead and started new journals that don’t share the more toxic properties that the Elsevier journals have. I don’t think that narrow view is justified.

In fact, part of tearing down Elsevier has to include a broader understanding of how antiquated the entire academic publishing world is, which immediately begs the question of what we need to build to replace it. This is not unlike the Occupy movement’s goal to replace the current financial system with another which would primarily serve the needs of the citizens and only secondarily the desires of bankers. A tall order to be sure, but luckily for mathematicians their system is less complicated, and moreover the community is much more empowered.

Why am I waxing so poetic over this struggle? Because, at the heart of the question of “what is the new system” is the even more fundamental question, “what do we, as a community, wish to treasure and what do we wish to discard?”. After all, we already have arXiv, or in other words a repository of everything, and the question then becomes, how do we sort out the good stuff from the crap?

I want to stop right there and examine that question, because it’s already quite loaded. Let’s face it, people don’t always agree on what it means for something to be good versus crap, and if there was ever a time to examine that question it’s now.

Here’s a thought experiment I’d like you to do with me. Since leaving academic mathematics, I’ve realized the enormous value of being able to explain mathematical concepts to broader audiences, and I’ve been left with the distinct impression that such a skill is underappreciated inside academic mathematics. In the past 8 months, since starting this blog, I’ve become sort of a hybrid mathematician and journalist, and it’s kind of cool, if unfocused. But what if I decided to really focus on the journalism side of mathematics inside mathematics, would that be appreciated?

So the thought experiment is this. Imagine if, every 6 months, I moved to a new field of mathematics and acted as a mathematical journalist, interviewing the people in math about their work, their field, where it’s going, what the important questions are, etc., and at the end of the 6 month gig I wrote an expository article that explained that field to the rest of the mathematicians. If I did that every 6 months for 20 years, I’d have covered 40 fields. Assuming I’m as good at explaining things as I say I am, I’d have really opened up these fields to a larger audience (albeit still math folks), which may allow for better communication between fields, or may avoid redundant work between fields, or may simply enrich the understanding of what’s going on. From my perspective, the work I’d be doing would really be mathematics, and would further the overall creation of mathematics.

However, think about those expository articles I’d be writing. They wouldn’t be original, nor would they be particularly hard- if anything the goal would be for people to understand them. Would they ever get published in a top journal (as of now)? I don’t think so. And please don’t suggest that papers like this, written by famous people in their fields, have been well-received. This is true but I claim more a result of the reputation of the writers than because of the content.

Let’s go back to the question of how we sort papers on arXiv. For some people, this question is really confusing and even scary. They fear that any system besides the one now in place would devalue contributions that are more technical, harder, and less accessible over results that are easy, flashy, and amenable to pop culture sound bites. I exaggerate for effect, but this is the gist of worries I’ve been hearing. For these people, whom I will call “the traditionalists”, the most they want to do is to circumvent the publishers’ fees but otherwise keep intact the referee system, whereby there are gatekeepers who choose experts to anonymously review papers. The publishers are the organizers of this system, and by inviting people to be editors for their journals essentially anoint the gatekeepers.

I actually think those traditionalists should be afraid, but not exactly for the reasons that they think. Instead of worrying that their hard, technical papers won’t be appreciated, they should worry that other, totally different kinds of skills will be appreciated. Of course in the end it’s the same result, namely that the top universities may not forever be populated exclusively by people who prove wonderfully difficult, original and ground-breaking results. They could also include people who are the great story-tellers of mathematics and are appreciated for their gifts of understanding and disseminating mathematics, as well as their broad understanding of the field.

In other words, a democratic system actually looks different from an oligarchy, and that’s not necessarily bad, although the oligarchs may think it is.

I’m going to make a prediction, namely that there will be two different systems in place in 15 years. Neither will involve traditional publishers, but one of them will keep that refereeing system intact whereas the other will be more of a crowd-sourced referee system. Maybe it will be something like this idea of Yann LeCun, for example. Maybe it will be better for women. That would be cool.

By the way, I want to be clear that I’m not suggesting all papers are written equally. There really are people who make huge contributions to their fields through proving hard, creative theorems. I just think there are also people who contribute to mathematics in other ways, that also require hard work and excellent skills. And there aren’t just two skills, of course; I just simplified matters for this discussion.

The discussion of the future of academic publishing is raging, as I posted about here. And that discussion is really important in itself, and the fact that so many people are participating in it, and figuring out the shared values of the mathematics community, is democracy in action. I fully believe we are witnessing a historic moment, and it’s weirdly, and happily, happening without police intervention, pepper spray, or drum circles.

Categories: #OWS, math

New online course: model thinking

February 12, 2012 Cathy O'Neil, mathbabe 14 comments

There’s a new course starting soon, taught by Scott Page, about “model thinking” (hat tip David Laxer). The course web site is located here and some preview lectures are here. From the course description:

In this class, I present a starter kit of models: I start with models of tipping points. I move on to cover models explain the wisdom of crowds, models that show why some countries are rich and some are poor, and models that help unpack the strategic decisions of firm and politicians.

The models cover in this class provide a foundation for future social science classes, whether they be in economics, political science, business, or sociology. Mastering this material will give you a huge leg up in advanced courses. They also help you in life.

In other words, this guy is seriously ambitious. Usually around people who are this into modeling I get incredibly suspicious and skeptical, and this is no exception. I’ve watched the first two videos and I’ve come across the following phrases:

Models make us think better
Models are better than we are
Models make us humble

The third one is particularly strange since his evidence that models make us humble seems to come from the Dutch tulip craze, where a linear model of price growth was proven wrong, and the recent housing boom, where people who modeled housing prices as always going up (i.e. most people) were wrong.

I think I would have replaced the above with the following:

Models can make us come to faster conclusions, which can work as rules of thumb, but beware of when you are misapplying such shortcuts
Models make us think we are better than we actually are: beware of overconfidence in what is probably a ridiculous oversimplification of what may be a complicated real-world situation
Models sometimes fail spectacularly, and our overconfidence and misapplication of models helps them do so.

So in other words I’m looking forward to disagreeing with this guy a lot.

He seems really nice, by the way.

I should also mention that in spite of anticipating disagreeing fervently with this guy, I think what Coursera is doing by putting up online courses is totally cool. Check out some of their other offerings here.

Categories: data science, math education, open source tools

How unsupervised is unsupervised learning?

February 11, 2012 Cathy O'Neil, mathbabe 2 comments

I was recently at a Meetup and got into a discussion with Joey Markowitz about the difference between supervised, unsupervised, and partially (semi-) supervised learning.

For those who haven’t heard of this stuff, a bit of explanation. These are general categories of models. In every model there’s input data, and in some models there’s also a known quantity you are trying to predict, starting from the input data.

Not surprisingly, supervised learning is what finance quants do, because they always know what they’re going to predict: the money. Unsupervised means you don’t really know what you are looking for in advance. A good example of this is “clustering” algorithms, where you input the data and the number of clusters and the algorithm finds the “best” way of clustering the data into that many clusters (with respect to some norm in N-space where N is the number of attributes of the input data). As a toy example, you could have all your friends write down how much they like various kinds of foods (tofu, broccoli, garlic, ice cream, buttered toast) and after clustering you might find a bunch of people live in the “we love tofu, broccoli, and garlic” cluster and the others live over in the “we love ice cream and buttered toast” cluster.
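To make the toy example concrete, here is a minimal sketch of k-means, which is one common clustering algorithm (the discussion above doesn’t name a specific one, so that choice is mine). The friends’ names and their 0-10 ratings are made up for illustration, and squared Euclidean distance stands in for “some norm in N-space”, with N = 5 foods.

```python
import random

# Hypothetical ratings (0-10) for: tofu, broccoli, garlic, ice cream, buttered toast.
friends = {
    "Ann": (9, 8, 9, 2, 1),
    "Bob": (8, 9, 7, 1, 2),
    "Cat": (9, 9, 8, 3, 2),
    "Dan": (1, 2, 2, 9, 9),
    "Eve": (2, 1, 3, 8, 9),
    "Fay": (1, 3, 1, 9, 8),
}

def squared_distance(a, b):
    """Squared Euclidean distance between two rating vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means: note that k, the number of clusters, is an input."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k of the data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest center.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Move each center to the average of the points assigned to it.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(mean_point(members) if members else centers[c])
        if new_centers == centers:
            break  # nothing moved, so we have converged
        centers = new_centers
    return labels, centers

names = list(friends)
labels, centers = kmeans([friends[n] for n in names], k=2)
for name, label in zip(names, labels):
    print(name, "-> cluster", label)
```

With these made-up ratings the tofu-broccoli-garlic people end up in one cluster and the ice cream and buttered toast people in the other. Nobody told the algorithm what the clusters mean, but somebody did choose the foods, the distance, and the number of clusters.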

I hadn’t heard of the phrase “partially supervised learning,” but it turns out it just means you train your model both on labeled and unlabeled data. Usually there’s a domain expert who doesn’t have time to classify all of the data, but the algorithm is augmented by their partial information. So, again a toy example, if the algorithm is classifying photographs, it may help for a human to go through some of them and classify them “porn” vs. “not porn” (because I know it when I see it).
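Here is a rough sketch of one simple way to train on labeled and unlabeled data together: self-training with a nearest-centroid classifier. This is my choice of technique for illustration, not necessarily what anyone uses on real photographs. The two-dimensional “features”, the labels “ok” and “flag”, and all the numbers are hypothetical.

```python
def centroid(points):
    """Coordinate-wise average of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def self_train(labeled, unlabeled, max_rounds=10):
    """labeled: list of (point, label) pairs from the domain expert.
    unlabeled: list of points nobody had time to classify.
    Returns one guessed label per unlabeled point."""
    classes = sorted({label for _, label in labeled})
    guesses = [None] * len(unlabeled)
    for _ in range(max_rounds):
        # Each class center uses the expert's labels plus our current guesses.
        centers = {}
        for c in classes:
            members = [p for p, label in labeled if label == c]
            members += [p for p, g in zip(unlabeled, guesses) if g == c]
            centers[c] = centroid(members)
        # Re-guess every unlabeled point from the updated centers.
        new_guesses = [
            min(classes, key=lambda c: squared_distance(p, centers[c]))
            for p in unlabeled
        ]
        if new_guesses == guesses:
            break  # guesses stopped changing
        guesses = new_guesses
    return guesses

# The expert only had time to label two items (hypothetical data).
labeled = [((1.0, 1.0), "ok"), ((9.0, 9.0), "flag")]
unlabeled = [(2.0, 1.0), (1.0, 3.0), (3.0, 3.0), (8.0, 8.0), (7.0, 9.0), (9.0, 7.0)]

guesses = self_train(labeled, unlabeled)
print(guesses)
```

The expert’s two labels anchor what each class means, and the unlabeled points refine where the class centers sit.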

Joey had some interesting thoughts about what’s really going on with supervised vs. unsupervised; he claims that “unsupervised” should really be called “indirectly supervised”. He followed up with this email:

I currently think about unsupervised learning as indirectly supervised learning. The primary reason is because once you implement an unsupervised learning algorithm it eventually becomes part of a large package, and that larger package is evaluated. Indirectly you can back out from the package evaluation the effectiveness of different implementations/seeds of the unsupervised learning algorithm.

So simply put, the unsupervised learning algorithm is only unsupervised in isolation, and indirectly supervised once part of a larger picture. If you distill this further the evaluation metric for unsupervised algorithms are project specific and developed through error analysis whereas for supervised algorithms the metric is specific to the algorithm, irrespective to the project.

supervised learning: input data -> learning algorithm -> problem non-specific cost metric -> output

unsupervised learning: input data -> learning algorithm -> problem specific cost metric -> output

The main question is… once you formulate evaluation metric for an unsupervised algorithm specific to your project… can it still be called unsupervised?

This is a good question. One stupid example of this is that, if in the tofu-broccoli-ice cream example above, we had forced three clusters instead of the more natural two clusters, then after we look at the result we may say, shit this is really a two-cluster problem. That moment when we switch the number of clusters to two is, of course, supervising the so-called unsupervised process.
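A small sketch of what that “looking at the result” might involve, under my own assumptions: the ratings are made up, the partitions are written down by hand rather than found by an algorithm, and the cost is the within-cluster sum of squared distances to each cluster’s center, which is one common choice among many.

```python
def within_cluster_cost(clusters):
    """Sum of squared distances from each point to its own cluster's center."""
    total = 0.0
    for members in clusters:
        n = len(members)
        center = [sum(p[i] for p in members) / n for i in range(len(members[0]))]
        total += sum((x - c) ** 2 for p in members for x, c in zip(p, center))
    return total

# Hypothetical ratings: tofu, broccoli, garlic, ice cream, buttered toast.
veggie = [(9, 8, 9, 2, 1), (8, 9, 7, 1, 2), (9, 9, 8, 3, 2)]
sweet = [(1, 2, 2, 9, 9), (2, 1, 3, 8, 9), (1, 3, 1, 9, 8)]

cost_by_k = {
    1: within_cluster_cost([veggie + sweet]),          # everyone lumped together
    2: within_cluster_cost([veggie, sweet]),           # the "natural" split
    3: within_cluster_cost([[veggie[0], veggie[2]],    # a forced third cluster
                            [veggie[1]], sweet]),
}
for k, cost in sorted(cost_by_k.items()):
    print("clusters:", k, "cost:", round(cost, 2))
```

Going from one cluster to two cuts the cost enormously, while forcing a third barely helps. The person who reads those numbers and says “this is really a two-cluster problem” is doing the supervising.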

I think though that Joey’s remark runs deeper than that, and is perhaps an example of how we trick ourselves into thinking we’ve successfully algorithmized a process when in fact we have made an awful lot of choices.

Categories: data science

What’s going on: Greece and mortgages

February 10, 2012 Cathy O'Neil, mathbabe 5 comments

There are two very confusing but important issues that you should be paying attention to in the news right now. Luckily, Naked Capitalism is covering this stuff for you (and for me).

First, there’s the mortgage settlement, which was agreed on yesterday or maybe two days ago, and which sucks in a lot of ways for poor homeowners but not for the banks. To see the top twelve reasons to hate the mortgage settlement, check out this post from Naked Capitalism.

Second, the Greek debt situation is not yet under control, and no matter what they do over there in Europe they can’t seem to admit it. Here’s a Naked Capitalism post from a couple of days ago, coupled with a new Bloomberg article that kind of says how awful that situation is.

I took all our money out of the money market account a few days ago because it’s not FDIC insured and because I really really don’t know what’s going to happen in Europe. Just saying.

Categories: finance, news

The future of academic publishing

February 10, 2012 Cathy O'Neil, mathbabe 6 comments

I’ve been talking a lot to mathematicians in the past few days about the future of mathematics publishing (partly because I gave a talk about Math in Business out at Northwestern).

It’s an exciting time: mathematicians seem really fed up with a particularly obnoxious Dutch publisher called Elsevier (tag line: “we charge this much because we can”), and a bunch of people have been boycotting them, both for submissions (they refuse to submit papers to the journals Elsevier publishes) and for editing (they resign as editors or refuse offers). One such mathematician is my friend Jordan, for example.

Here’s a page that simply collects information about the boycott. As you can see by looking at it, there’s an absolutely exploding amount of conversation around this topic, and rightly so: the publishing system in academic math is ancient and completely outdated. For one thing, nobody I’ve talked to actually reads journals anymore, they all read preprints from arXiv, and so the only purpose publishers provide right now is a referee system, but then again the mathematicians themselves do the refereeing. So publishers are more like the organizers of refereeing than anything else.

What’s next? Some people are really excited to start something completely new (I talked about this a bit already here and here) but others just want the same referee system done without all the money going to publishers. I think it would be a great start, but who would do the organizing and get to choose the referees etc? It’s both lots of work and potentially lots of bias in an already opaque system. Maybe it’s time for some crowd-sourcing in reviewing? That’s also work to set up and could potentially be gamed (if you send all your friends online to review your newest paper for example).

We clearly need to discuss.

For example, here’s a post (hat tip Roger Witte) about using arXiv.org as a collector of papers and putting a referee system on top of it, which would be called arXiv-review.org. There’s an infant google+ discussion group about what that referee system would look like.

Update: here’s another discussion taking place.

Are there other online discussions going on? Please comment if so, I’d like to know about them. I’m looking forward to what happens next!

Categories: open source tools, rant

As predicted: watered down insider trading bill

February 9, 2012 Cathy O'Neil, mathbabe 2 comments

Yesterday I posted about the insider trading bill which, in addition to making it illegal for politicians to trade on their insider knowledge, was also going to force “political intelligence firms” to register as lobbyists. Note that this is simply a form of transparency- they, people who work mostly for hedge funds and private equity, didn’t have to stop getting insider information, they’d just need to admit that they were getting it. But I guess that’s TMI from their perspective. From the Wall Street Journal article:

Rep. Eric Cantor, the No. 2 House Republican, plans to bring his version of the Stop Trading on Congressional Knowledge Act, or Stock Act, to the floor of the GOP-controlled chamber on Thursday, using a procedure that will prevent lawmakers from voting on major amendments. It is expected to pass by a wide margin.

…

At issue are changes Mr. Cantor made shortly before midnight Tuesday, when he unveiled his amendment to a bill that sailed through the Senate last week.

Most notably, Mr. Cantor cut a provision that would require people who mine Washington for market-moving information to disclose their activities in the same fashion as lobbyists. The provision covering what is known as the political-intelligence industry was opposed by Wall Street and its Washington lobbyists, including the Securities Industry and Financial Markets Association (SIFMA), which mounted an effort to kill it.

Just to be clear on who is writing legislation nowadays: they are called SIFMA, and they represent the players in the financial industry. You may remember them from this post, where they hired the research firm Oliver Wyman to investigate the impact of the Volcker Rule for a congressional hearing. Shockingly, that research firm thought the Volcker Rule should be watered down.

What exactly is the argument this guy Cantor is using to defend this change? I’d love to hear him come out and say, “I did it because SIFMA told me to”. How come we don’t get to see that argument made and defended? No wonder people don’t like or trust Congress. Even so I’ll give the last word to one of their members:

The House Democrat who has pushed for the legislation for the past six years—Rep. Louise Slaughter (D., N.Y.)—opposed the GOP-backed changes.

Ms. Slaughter said in a statement that the Cantor-backed version of the insider-trading bill was crafted “in secret, behind closed doors, brokering deals for special interests.” She added: “How ironic—insiders now appear to be writing a bill meant to ban insider trading.”

Categories: finance, news, rant

#OWS upcoming events

February 9, 2012 Cathy O'Neil, mathbabe Comments off

Hear ye, hear ye, there will be an Occupy Town Square event this coming Saturday. Please come and help us reconstruct Zuccotti Park inside a church at 86th and Amsterdam for the afternoon. Here’s the flyer:

Also, there will be a march from Liberty Plaza to the Fed and the SEC to celebrate our very own Occupy the SEC’s submission of their Volcker Rule public comments, next Monday, February 13th, at 4:30pm.

Here’s the schedule:
4-4:30pm: Assemble at Liberty Plaza
5pm: March to the Fed (33 Liberty Street)
5:30pm: March to the SEC’s NY Office (3 World Financial Center, Suite 400)

Finally, the Alt Banking working group now has a twitter feed.

Categories: #OWS, news

This month’s Sky Mall: a sneak peek

February 8, 2012 Cathy O'Neil, mathbabe 3 comments

I know I’m not the only person who loves Sky Mall magazine for those moments when you realize that you’re not allowed to use your electronic devices, that you have nothing at all physical to read, and that the plane won’t be airborne for 30 minutes due to runway congestion.

To tell you the truth it’s been a while since I’ve moseyed up to lean on it for psychological support so I was a bit hesitant- I didn’t know what to expect. Forgive my lack of faith.

Bottomline: Sky Mall has never disappointed me, which is more than I can say for most celebrated cultural icons. I want to share just a few of the highlights of this issue, and I hope you appreciate my using up my precious 30 minutes of free in-air wifi (update: clear your cookies for another half hour) to do so:

The Fleece Poncho With A Pillow (actual name) (see picture above). Best product description ever: The Fleece Poncho With A Pillow is an all-in-one fleece poncho-style blanket with a pillow attached.
The Spongester (picture below). From the description: Made from the same steel as an industrial sink with labeled slots for your “good sponge” (utensils & dishes) and “evil sponge” (sink, counter, cat dish). Until now I (naively) didn’t realize that sponges had morals. I feel so… foolish.
Touchless Sensor Seat (with video!!) (picture below): For only $159.99 you can get an automatic sensor that lifts and lowers the toilet seat for you. It may seem like this price is a bit steep but think about it some more: it sure beats a divorce attorney.

Categories: Uncategorized

More Money than God

February 8, 2012 Cathy O'Neil, mathbabe 1 comment

This is a guest post from an anonymous friend. Actually it was a letter to me that I thought was hilarious and got permission to post.

———————————————————

Dear Cathy,

Earlier I mentioned that I was reading “More Money than God”, which might have been construed as an endorsement, so, in case you haven’t read it already, I thought I would save you some time by summarizing it:

Chapter 1: It wasn’t us! It was the banks! Those guys!

Chapters 2, …, N-2: All the hedge fund dudes you have heard of are* sages both of human nature and of economics. When they destroy foreign currencies, it’s to correct bad governments. When they attempt to short foreign currencies but fail, it’s because they (Soros) care deeply about these developing countries and are using their money to help support them. They are huge philanthropists. They increase economic stability by being contrarian. The only time they are outsmarted is when they are outsmarted by other hedge fund titans.

Chapter N-1: Take that, banks! Ha! In your FACE!!! Too bad you weren’t more like hedge funds. That would’ve never happened to a hedge fund.

Chapter N: Don’t regulate hedge funds. Regulating hedge funds would be bad for the economy and for philanthropy. There’s no need for hedge funds to be regulated. Regulate the banks or something else but for God’s sake not hedge funds. Also: no regulation!

Acknowledgments: thanks to Rubin and all my other buddies at CFR, and at Blackstone, and to Paul Tudor Jones, and all the other hedge fundies who supported me while I wrote this book for 3 years.

* They are now, but in the 60s when hedge funds started the whole “hedging” and “long-short” thing was just a distraction from organized insider trading over corned-beef sandwiches. But no one ever insider trades anymore. Except for Raj, who’s clearly not a real hedge fund guy. Who eats SIM cards? We’re not those kind of thugs.

Categories: finance, guest post

Politicians and insider trading

February 8, 2012 Cathy O'Neil, mathbabe 1 comment

There’s shit going down in Washington now around the proposed ban on insider trading of politicians (which for some weird reason up til now hasn’t been illegal). According to this New York Times article, the proposed legislation would also require certain “political intelligence firms” to register as lobbyists, and that has gotten them up in a huff. From the article:

“Hedge funds, private equity funds and investment advisers — many of which are not currently registered under the Lobbying Disclosure Act — might now be required either to register or to alter their business practices to avoid the need for registration,” the bulletin said. “If, for example, a hedge fund calls a Congressional committee staffer to gather information about the status of a bill that relates to the fund’s investment decisions, the fund may need to register.”

If you can judge someone by their enemies, then this bill seems kind of like my new best friend. Let’s wait to see how much it’s watered down in the next few days:

House Republicans and their floor leader, Representative Eric Cantor of Virginia, said they would amend the bill, going to the House floor this week, to strengthen it.

But Representative Louise M. Slaughter, Democrat of New York, said, “I think ‘strengthening’ here is a euphemism for ‘weakening.’”

Categories: finance, news, rant

Preggers

February 8, 2012 Cathy O'Neil, mathbabe 1 comment

The below video resonates with me, but trust me when I say it’s all about the hormones, and we do get over it, at least after weaning. In any case, I apologize (hat tip Jordan Ellenberg).

While I’m here, though, I would like to say one thing that non-pregnant people do to pregnant people, which is desex them. The maternity clothes industry was part of this until recently, making all maternity dresses (and they were all dresses) look like school-girl uniforms.

It’s like, now that you’re pregnant I’m going to treat you like an innocent child who’s never had a dirty thought in her life. But, people, how do you think we got this way?

But it’s a more general phenomenon, and you kind of act like an idiot in part because people treat you like one.

Categories: rant

Opacity, noise, and overpopulation in finance

February 7, 2012 Cathy O'Neil, mathbabe 4 comments

This is a guest post by Mekon:

When you come in to work nowadays, you have to read the blogs. The other day, two blogs I like to read both had pieces about Freddie Mac and whether it had inappropriately bet against people refinancing their homes. I’ll spare you the details, which live in the highly technical world of mortgage securitization, but the issue is that Freddie Mac had a large position in “inverse floaters,” which are worth more when people don’t refinance.

The first piece says this is fishy, because Freddie Mac also makes rules on who gets to refinance and who doesn’t. So they have lots of incentive to make the rules more stringent, block people from refinancing, and profit by doing so.

But the second piece says there’s nothing fishy here at all: Freddie Mac is probably holding the inverse floaters to hedge interest rate risk. That is, they might need them just to be neutral to interest rates (people prepay when interest rates go down), because the rest of their book is exposed the other way.

How do you tell who’s right?

The first thing to realize is that they’re actually disagreeing on facts. This isn’t like the usual economic disagreements, where people argue over principles (whether the Fed should worry about unemployment as well as inflation) or things you can’t prove (how bad the economy would have gotten without the stimulus). It should be easy to settle this one: take Freddie’s book and see how it goes up and down when interest rates go down/stay the same/go up and people prepay more or less.

I imagine we haven’t done this because we don’t have the book.

Some opacity in finance may be unavoidable, but sometimes it’s completely unnecessary and self-inflicted. These are government enterprises! Why don’t we make their books transparent? If we can’t do it right away, what about with some kind of time lag? We’re talking about their positions from 2010, for heaven’s sake!

The second thing – forgive me if I’m off base here, I’m a fan of both blogs – is that it doesn’t seem like either one of them has fully done their homework (to be fair, without being able to see into Freddie’s book, it’s not clear how they could have). Both sites followed up with more detail, but nothing that seems definitive – put another way, I still can’t tell who’s right.

I’d like to see people be more sure about the facts before publishing conclusions. I thought maybe this was just me, but then I ran across a paper by Andrew Lo which makes much the same point (see the last section). Andrew looks at 21 different books about the financial crisis and compares the range of conclusions they draw to Rashomon. And, like the Freddie example, he finds no agreement on the underlying facts. I hear his frustration when he urges: “By working with a common set of facts, we have a much better chance of responding more effectively and preparing more successfully for future crises.” Amen.

Finally, if you’ll indulge me, a little sociology. If you’ve been around finance for a while, I think you’ll agree with me that people being on loose ground with their arguments and a bit quick on the draw with their conclusions is more the norm than the exception. Put another way, there’s an awful lot of noise in finance. Why is this?

This blog has focused a lot on how finance today is both complicated and opaque. One thing I’d add is that finance is overpopulated. I don’t just mean that we’d be better off if smart people thought more about curing cancer and avoiding famine and less about executing trades a millisecond faster or securitizing and selling some kind of risk that’s never been traded before. (But duh.)

What I mean is that finance today is so complicated and opaque that it requires extremely specialized skills to understand what’s going on. At the same time, the field employs way more people than could ever have those specialized skills. End result: many people working in finance don’t really understand it. Which makes noise an accepted part of the culture. Which in turn makes it even harder to understand what the hell is going on.

I don’t know how to fix this, but wouldn’t you feel a lot better about our financial system if we could (1) make it simpler, and (2) cut the number of people needed to operate it in half?

Categories: finance, guest post

Women in math

February 6, 2012 Cathy O'Neil, mathbabe 35 comments

This is crossposted from Naked Capitalism.

A study recently came out which was entitled “Can stereotype threat explain the gender gap in mathematics performance and achievement?”. One of the authors created and posted a video describing the paper, which you can view here.

As a preview, there seem to be four main points of the paper and the video:

The papers on stereotype threat normalize with respect to SAT scores, which is bad.
Evidence for stereotype threat is therefore weak.
We should therefore stop putting all of our resources into combating stereotype threat.
We should instead do something easy like combating stereotypes themselves.

Before we go into the details of the paper, we need a bit of context. For that reason, this post is split into three parts. The first addresses a meta-issue, namely that of the “null hypothesis” in this discussion. A frustration that I have, and that I think is shared by many of the women I know in math, is that the (often unspoken) working hypothesis is that in fact women are just not as talented, and it is somehow up to us women to prove otherwise, presumably by convincing men that we’re geniuses.

The authors of the above paper fall prey to this disingenuous line of thought, by proclaiming stereotype threat is an insufficient explanation but not offering any alternative explanations. This sets up a kind of implied false dichotomy: if it isn’t explained by such and such, it must mean girls are dumb.

Not only does this undermine serious intellectual debate, but it often turns people off from entering the debate in the first place, because they sense the manipulative nature of the discussion. But that’s a pity, since, with the correct assumption, namely that women and men have equal talents but things are holding back women, we could probably make lots of progress on what those things are.

The second part is directly related not to the paper but to the blog post which referenced the paper, which changed the conversation from “math performance gap” to the question of “why there are no women math geniuses”. This is an interesting twist, and in my opinion warrants addressing separately.

In the third part I argue directly against the paper and its conclusions.

1. The Null Hypothesis

Needless to say, I think the onus is on the scientific community to prove that women aren’t as mathematically talented as men. In other words, I do not accept the defensive position that I need to prove we are as smart: the null hypothesis is that a series of effects, one of them stereotype threat, explains any perceived difference in talent.

In his now famous lecture at NBER in 2005, Larry Summers putatively discusses the issue of why there are fewer tenured women in science and math departments at top universities. However, if you read the transcript, you will note that, when he gets to the “different availability of aptitude at the high end” part, he does us a favor of sorts by admitting what his underlying working hypothesis is: that girls aren’t as good at math. His argument using standard deviations of test scores is ridiculous, especially if you consider 1) how differently women do versus men on the same test in different conditions, 2) how much that difference has itself changed over time, and of course 3) the question of what the tests themselves are measuring.

To test why this null hypothesis is so damaging, my friend Catherine Good suggested the following thought experiment: imagine if he’d gone up to the podium and, instead of saying that women aren’t all that good at math, and that this was partly explained by the fact that when he gave boyish toys to his twin girls they took care of them instead of constructing things, he had substituted race for gender. Here’s the passage:

There may also be elements, by the way, of differing, there is some, particularly in some attributes, that bear on engineering, there is reasonably strong evidence of taste differences between little girls and little boys that are not easy to attribute to socialization. I just returned from Israel, where we had the opportunity to visit a kibbutz, and to spend some time talking about the history of the kibbutz movement, and it is really very striking to hear how the movement started with an absolute commitment, of a kind one doesn’t encounter in other places, that everybody was going to do the same jobs. Sometimes the women were going to fix the tractors, and the men were going to work in the nurseries, sometimes the men were going to fix the tractors and the women were going to work in the nurseries, and just under the pressure of what everyone wanted, in a hundred different kibbutzes, each one of which evolved, it all moved in the same direction. So, I think, while I would prefer to believe otherwise, I guess my experience with my two and a half year old twin daughters who were not given dolls and who were given trucks, and found themselves saying to each other, look, daddy truck is carrying the baby truck, tells me something. And I think it’s just something that you probably have to recognize.

It raises the question: why did the women in the kibbutzes quit working on tractors? The way Larry tells his story, he makes it clear he thinks that it’s because the women wanted it that way (thus his story about the twins). But surely it is as plausible that: 1) men, having a vested interest in proving their manhood (which they do, and which in cultures around the world leads to certain types of work being seen as “manly”), weren’t keen on day care duty and/or 2) women were hesitant to cross the lines of gender stereotype (it might lead them to be perceived as being masculine, or even worse, emasculating). And it also isn’t hard to imagine that parents ooh and ahh more when small children play with what are perceived to be gender-appropriate toys and are quietly or even vocally uncomfortable when boys play with dolls and girls play with trucks.

One last word about the null hypothesis and why I’m so devoted to this issue: when I and two other girls (and, as it happens, no boys) in the 6th grade did well enough to go into a special, advanced 7th grade algebra class, my (female) teacher brought us up to the front of the room and told the three of us “I don’t see why you would challenge yourselves like this anyway since you are girls, and you won’t be needing math when you grow up.” I was the only one of the three of us to actually choose that class, and I was the only girl in the algebra class. One of my friends was one of two women in a class of 45 students studying artificial intelligence at Yale. She was expecting praise for being one of only two students to get a program to work on a particularly tough assignment. Instead, she was accused by the professor of stealing the code from her male classmate. She left the major. Until stories like this become rare, or even uncommon, I will assume that there’s too much cultural influence to figure out the real story.

Going back to Larry Summers, his lecture did two things: 1) it breathed new life into the age-old stereotype that women aren’t as good at math as men, and 2) it attributed that difference to an underlying innate ability difference- that is, he conveyed a “fixed ability mindset” regarding math (more on mindsets below). As the leader of an educational institution he introduced the two ideas that together are like a powder keg: they can undermine women’s feelings of belonging in math, which in turn informs their mathematics achievement and intrinsic motivation to remain in math.

Now more about Catherine Good. She talked at that same conference where Larry Summers put his foot in his mouth; in fact she was the speaker after Larry at that conference, and she was talking about her paper that gives evidence that the above “powder keg” message tends to push women out of math (but Larry didn’t stick around long enough to hear her talk, unfortunately). She is also an expert on stereotype threat and helped me look at the study. More on her thoughts below, but I still want to talk about the concept of “genius.”

2. Women and the concept of genius

Let’s define, as one of the commenters does from the blog, a “genius woman in math” to be any woman who has won a Fields Medal. Since there are no women who have won Fields Medals (versus 52 men), this is a pretty tight definition. I would argue, and I might in another post, that even without the above definition, the concept of “genius” is a social construct which is rarely if ever applied to women, except perhaps after they’re dead. Please comment with counterexamples if you know of any.

So here’s what I think. There are lots of reasons that women don’t win Fields Medals. I will name a few.

Fields Medals are awarded to mathematicians under the age of 40, for some reason, and women mathematicians typically do good work into their retirement age, whereas men usually do their best work young (this also explains why Harvard has so much trouble hiring women- by the time they are convinced the woman is a genius, she’s 55 and has grandchildren and frankly probably sees the offer as tokenism).
The commenter who defined a “math genius” as a Fields Medalist said that it would be an objective measure. But Fields Medals are awarded by a bunch of guys who decide what’s important and who’s responsible for the important results. In other words it’s a political process.
Women don’t care as much about winning Fields Medals. This matters, because I know of men who explicitly worked on problems in order to win the Fields Medal (you know who you are). It’s a serious and bizarre case of narrow focus.
Why is math genius defined so narrowly? I would personally define it more broadly (a topic for another post), and there’d be plenty of women geniuses. With my definition, though, I’d guess that women who are geniuses have lots of options and they often choose something they consider more personally rewarding than an academic job.
Women’s intelligence may also manifest in different ways: note that most of the assholes on Wall Street are men. This kind of makes sense since women are typically not as driven by testosterone and competitiveness. This doesn’t mean they aren’t geniuses or that they couldn’t have done the work the men on Wall Street did (my experience proves that).
The Fields Medal distorts the mathematical process itself, by implying that there’s a single superstar who swoops in and solves the problem that all the other people were incapable of doing. In fact mathematics as a field is an enormous collaboration, a scientific project, where everyone depends on the community around them for coming up with questions, defining the “interestingness” of questions, and giving context to results. The idea that there’s one winner out of all of this, or even one metric by which we could measure such a winner, is silly. See this post from Quomodocumque.
Another point about genius (in any domain): research is showing that to truly express one’s genius takes thousands of hours of practice. So genius may be a latent trait but will never be expressed without many hours of hard work. This point is very often lost and is related to women in that their apparent geniusness depends to a large extent on how supportive their environment is for all that investment of time.

3. The paper against stereotype threat

I am finally ready to address (with Catherine’s help) the issues of the paper in question, which I will repeat:

The papers on stereotype threat normalize with respect to SAT scores which is bad

In fact the author “discards” a bunch of stereotype threat studies on these grounds. However, it is totally standard to normalize with respect to some other metric (would you rather we didn’t normalize to anything?), and in fact it essentially penalizes the studies, since it has been shown that stereotype threat is in play even for the SATs. On the other hand, the standard for normalizing (this is called “including a covariate”) is that the groups being compared should not differ significantly in the covariate, presumably because it’s harder to argue that you are in fact correcting for that aspect. Because men and women sometimes do differ significantly in SAT scores, including them as covariates could be a technical violation of the rules of conducting a so-called ANCOVA.

Is this what the author is complaining about specifically? Did he, for example, check to see if the samples in the “discarded” studies actually differ in the covariate? It seems he’s making the assumption that they did, but it’s not clearly stated that they did. It’s certainly not a given that the men and women in these studies did differ in the covariate, and he needs to make that precise. If they did not, then there’s no valid argument against using SAT scores.
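To make the check described above concrete, here is a minimal sketch of what "do the groups differ in the covariate?" could look like. It is not the procedure used in the paper or in any of the studies; the function name `welch_t` and the scores are hypothetical, made up purely for illustration. It computes a Welch t statistic for the difference in mean covariate (say, SAT math score) between the men and women in a single sample.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for the difference in means of two samples."""
    # Standard error of the difference, using each group's own sample variance.
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical covariate values (made-up SAT math scores), not data from any study.
men_sat = [640, 700, 680, 720, 660, 690]
women_sat = [650, 690, 670, 710, 665, 695]

t = welch_t(men_sat, women_sat)
print(f"Welch t for the covariate: {t:.2f}")
```

A t statistic near zero would suggest that, in that particular sample, the groups are roughly balanced on the covariate. That is the thing one would want to see checked study by study before discarding anything, though a small statistic in a small sample is of course not proof of balance.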

Evidence for stereotype threat is therefore weak.

There is ample evidence that stereotype threat is very real. Keep in mind that the authors of this study have not shown evidence against stereotype threat, but have simply complained that they don’t like the existing studies for it. And their standard for what “replicates” the original study is overly stringent- they only wanted to include studies that found significant interactions between gender and condition. Interactions are easiest to find when you have a “crossover effect” (e.g. males are higher in condition A but lower in condition B), but often we find “span effects” in which the males and females may be equal in condition A but differ in condition B. This can also be an example of stereotype threat. For example, in a paper written by Catherine, she didn’t find a significant interaction (males and females performed equally in condition A) but when the stereotype threat was reduced, women outperformed men. To discount this and other studies as not providing evidence of stereotype threat simply because an “interaction” wasn’t found is playing games with statistics.
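The difference between a "crossover effect" and a "span effect" can be sketched with a few lines of arithmetic. This is only an illustration under made-up numbers: the cell means and the helper `interaction_contrast` are hypothetical, and a real analysis would test the interaction with a proper model rather than eyeball a contrast. The interaction in a two-by-two design is the gender gap in condition A minus the gender gap in condition B.

```python
def interaction_contrast(cell_means):
    """Difference of gender gaps across conditions: (M_A - F_A) - (M_B - F_B)."""
    gap_a = cell_means[("male", "A")] - cell_means[("female", "A")]
    gap_b = cell_means[("male", "B")] - cell_means[("female", "B")]
    return gap_a - gap_b


# Hypothetical mean test scores per cell (made up for illustration).
# Crossover: males higher in condition A, lower in condition B.
crossover = {
    ("male", "A"): 60, ("female", "A"): 50,
    ("male", "B"): 50, ("female", "B"): 60,
}
# Span: equal in condition A, women ahead in condition B.
span = {
    ("male", "A"): 55, ("female", "A"): 55,
    ("male", "B"): 50, ("female", "B"): 60,
}

print("crossover contrast:", interaction_contrast(crossover))
print("span contrast:", interaction_contrast(span))
```

With these numbers the gap in condition B is identical in both patterns, yet the span pattern produces an interaction contrast half the size of the crossover one. A smaller contrast is harder to detect at a given sample size, which is why requiring a significant interaction tends to screen out span-type results.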

We should therefore stop putting all of our resources into combating stereotype threat.

Nobody who studies stereotype threat claims it explains everything. It is part of a larger picture. The good news is that there are interventions for it (described below).

We should instead do something easy like combating stereotypes themselves.

The idea that it’s “easy” to combat stereotypes is completely naive. There are tons of ways that stereotyping is understood to be very difficult, if not impossible, to get rid of. Some of them have to do with an evolutionary need to simplify first impressions of people (i.e. categorize) so that we can tell if they are an immediate threat to our safety. This may be the most baffling part of the whole thing, because the authors should really know better.

I want to end on a positive note, because the news is actually pretty good. There is a way to combat stereotype threat, and I’ve tried it and it works. To understand it, it helps to think about the way people think about intelligence itself. As a simplification, people either think that intelligence is fixed and rigid (you’re either born with it or you’re not) or they think that intelligence is malleable and can be learned and practiced.

It turns out that if someone believes the latter “malleable intelligence” view, then they work hard and are hopeful and stereotype threat is to a large extent alleviated. Whereas if they’re convinced of the former mindset for intelligence, the effect of stereotype threat is more pronounced. In situations where the stereotype is salient (“girls are bad at math” is salient when taking a math test), the situation itself can convey a mindset of fixed ability and all the hallmark responses that go along with that mindset then follow. To encourage a malleable view of intelligence can help combat that fixed view and thus the threat of the stereotype.

The way I used this information was as follows. I started a class in teaching proof techniques at Barnard College (there were both Barnard students and Columbia students in the class). At the beginning of every class for the first two weeks I described how mathematicians aren’t born knowing how to prove things, but rather they learn techniques, and practice them until they are proficient. Note I wasn’t directly confronting or addressing stereotypes, but rather setting up the mindset where the studies have shown stereotypes have less negative power.

The class went great, and is still going on. I will post soon about my experiences starting that class and others like it.

Categories: math education, women in math

Raise capital gains and stop flying

February 5, 2012 Cathy O'Neil, mathbabe 7 comments

There are two totally unrelated stories I want to discuss this morning; I hope you’ll forgive me.

First, take a look at this post, written by David Brin, which argues for higher capital gains tax. He points out VC’s or angel investors, in combination with entrepreneurs, are the true “job creators”, and also invest their money in a truly risky way, whereas generic rich people who only invest in established companies are taking risks but not on the same level. Yet these two classes of people are taxed at the same rate. I guess the counterarguments would be that they, the VC’s, also get more payoff (when things work out) and that they couldn’t make their investments without the fleet of passive rich people ready to invest if and when the company succeeds. Even so I think there’s a real difference.

It reminds me that, when I worked at D.E. Shaw and Lehman fell, there were lots of discussions around the water cooler about what the reaction would be by policy makers and regulators. The consensus fear was that the capital gains tax rate for hedge fund workers would be removed within weeks, if not days. Note this tax loophole allows hedge fund quants and traders to pay less taxes on their take-home pay than bankers across the street doing the same job. I don’t really know anyone who defends it, not even people who benefit from it. Please correct me if I’m wrong. Update: mostly people below the MD (managing director) level at hedge funds actually don’t get this benefit. It primarily applies to “buy and hold” people like VC’s, private equity, and long term debt firms.

Another argument I enjoy from Brin’s post is the refutation of lowering taxes in general to entice investment by rich people. As he said:

Supply Side assumes that the rich have a zillion other uses for their cash and thus have to be lured into investing it! Now ponder that nonsense statement. Roll it around and try to imagine it making a scintilla of sense! Try actually asking a very rich person. Once you have a few mansions and their contents and cars and boats and such, actually spending it all holds little attraction. Rather, the next step is using the extra to become even richer. Naturally, you invest it. Whatever the tax rates, you invest it, seeking maximum return.

This is absolutely true, and one of the funny things about (many of) the rich quants I know: they are obsessed with growing their pile, to the point of focusing more on money now that they’re rich than they ever did when they were poor physics or math graduate students. To be fair, to make the whole argument for raising taxes you’d need to consider the global response, whereby rich people essentially arb the tax systems of the various countries in search of the maximum return. Even so, I’m pretty sure the answer is not to try to compete with Caribbean island nations on how low we can tax.

Second, check out this fantastic article from the Wall Street Journal about how people respond to environmental impact issues by consuming more. In the article they describe what’s called the “Prius Fallacy: a belief that switching to an ostensibly more benign form of consumption turns consumption itself into a boon for the environment”. I love it, first of all because it’s completely snarky and second of all because it’s really true and annoying. My favorite line:

Even if you think that climate change is a left-wing crock, this ought to be a matter of gnawing concern. Global energy use is growing faster than population. It’s expected to double by midcentury, and most of the growth will be in fossil fuels. Disasters like the BP oil spill attract world-wide attention, but the main environmental, economic and geopolitical challenge with petroleum isn’t the oil that goes into the ocean; it is the oil we continue to use exactly as we intend.

By the way, I don’t claim to be particularly low-impact on the world myself: I’m flying to Amsterdam in March with my entire family, which definitely puts me on the earth’s shit list (turns out it’s all about airplane travel). For that matter I work at a company that makes it easier for consumers to buy airplane tickets. But at least I don’t pretend that buying a Prius or replacing my kitchen counters with less eco-unfriendly material makes me a good person (by the way, once you’ve got eco-unfriendly kitchen counters the damage is done. The best thing you can do for the environment at that point is never ever remodel your kitchen again. Can you handle that?!).

If I had my way, we’d know the fossil-fuel impact of every activity we engage in, and we’d be able to put ourselves on a fossil-fuel diet. Those people who carefully recycle their milk containers and buy local but also fly to East Asia every chance they get would be in for some major belt-tightening.

Categories: finance, news, rant
