mathbabe

High frequency trading: Update

July 18, 2011 Cathy O'Neil, mathbabe 4 comments

I’d like to make an update to my earlier rant about high frequency trading. I got an awesome comment from someone in finance that explains that my main point is invalid, namely:

…the statement that high frequency traders tend to back away when the market gets volatile may be true, but it is demonstrably true that other, non-electronic, non-high-frequency, market makers do and have done exactly the same thing historically (numerous examples included 1987, 1998, various times in the mortgage crisis, and just the other morning in Italian government bonds when they traded 3 points wide for I believe over an hour). While there is an obligation to make markets, in general one is not obliged to make markets at any particular width; and if there were such an obligation, the economics of being a marketmaker would be really terrible, because you would be saying that at certain junctures you are obliged to be picked off (typically exactly when that has the greatest chance of bankrupting your enterprise).

My conclusion is that it’s not a clear-cut case that high-frequency traders actually increase the risk.

By the way, just in case it’s not clear: one of the main reasons I am blogging in the first place is so that people will set me straight if I’m wrong about the facts. So please do comment if you think I’m getting things wrong.

Categories: finance, hedge funds, news

High frequency trading

July 18, 2011 Cathy O'Neil, mathbabe 2 comments

This morning there was an article in the New York Times describing high frequency traders- what they do and how they want people to like them. I’m of the mind that there’s not much to like.

NOTE: Please see update!

High frequency traders are basic, old-fashioned opportunists. They buy somewhere and try to sell somewhere else at a higher price. They have expensive technology and colocate next to exchanges to deal with speed-of-light issues, shaving tiny fractions of a second off their trades. They notice a currency change in Brazil and trade on it in the US before anyone else notices. That kind of thing.

They will tell you that they are useful to the market, because they have set the bid-ask spread smaller than it used to be. Back in the day, there were official “market makers” who would maintain a book of certain instruments, and would be the go-to person for anyone who wanted to buy or sell. In return for the service they would charge a fee, which would be this so-called spread. Moreover, they were required to offer to buy and to sell in all kinds of trading environments (the spreads could get pretty wide of course).

It’s true that those spreads have gotten smaller since high-frequency traders have come to dominate. They have substantially replaced the old-school market makers and claim to be doing a better job. However, it’s also true that high-frequency traders aren’t required to be there. So when the going gets tough they completely vanish. This happens in moments of panic, and it can easily be true that their ability to vanish at will can also create more panics more often (I’d love some evidence to support or deny this theory), since from their perspective, at the first sign of weirdness, they may as well pull out until the dust settles.

The analogy I like to come up with is a little story about chores. Suppose you have someone who comes and helps you with your cleaning, mostly dishes, every day, for a small fee. Since you have kids and a job, the small fee seems to be worth it. After a while someone else comes along and offers to do your dishes every day! for free!! What a deal! You can’t resist. However, it turns out that, if the kitchen actually gets really dirty and needs to be mopped up or seriously cleaned, the free-dishes guy is nowhere to be found and you’re on your own, just when all the kids are sick and there’s a product release at work. Maybe not such a great deal after all.

Categories: news, rant

Math contests kind of suck

July 17, 2011 Cathy O'Neil, mathbabe 77 comments

I’m going to annoy quite a few people with this post, but I’ve been thinking about this for a while and it comes down to this: I think math contests for kids kind of suck.

Here’s the short version of my argument.

Math contests discourage most people who take them, because most people don’t get close to winning, and in particular give those people the impression that because they lost a contest they don’t “have it” when it comes to math. At the same time, although they are encouraging for a few people, it’s not clear to me that the kind of encouragement they give those kids is healthy. Finally, they are bad for women.

Now I will argue this more thoroughly.

The way math contests are set up nowadays, they start in middle school, at the school level, and if a student does well at a given test they move on to a larger stage, perhaps at the state level, and they typically culminate in a national test, or sometimes even an international test (in the case of the IMO).

This system sets up nearly all the participating students for a feeling afterwards of having not been good enough. It encourages competition over collaboration, which is a huge problem in my opinion, but even worse, it tends to make young people feel like they aren’t smart enough to be mathematicians. It is in fact well-documented that people seem to think that one is either born good at math or not, in spite of the fact that there’s ample evidence that practicing math competition-type problems makes you good at them (why else would Stuyvesant kids consistently beat other kids? Is it really possible that smart people somehow know to be born in New York?). The bottom line is that these extremely young, impressionable kids get early impressions that the contests are measuring their genetic abilities, and that they aren’t cutting it.

When I was in middle school, there were no math contests. I was lucky enough to have a great teacher in 7th grade, who let us nerds debate amongst ourselves for an entire class whether 0.999999… is equal to 1 or not. He put himself in the position of a mediator. It was a great moment for me, and made me realize how much creativity and originality could be involved in the process of making and understanding math.

When I got to high school, I was on the math team, and although I wasn’t bad, I also wasn’t good – and I felt bad about that, consistently. In fact there were definitely moments when I doubted my chances at becoming a mathematician. It is really a testament to my internal love for mathematics, combined with finding this math camp that I’m teaching at now, that motivated me to become a mathematician. If I had not had that 7th grade teacher, and if I had had earlier experiences being so-so at math contests, it’s possible I would have been turned off of math altogether.

Perhaps you are thinking, well of course there’s a selection process for math contests, because they select for people who are good at math! I discussed this with another mathematician today and he refined that argument as follows: some people are good at understanding concepts but can’t work out the details, and some people are good at working out details by rote but don’t understand the concepts- you can’t really be a good mathematician without both, and perhaps the contests select for the details people, but after all you need that aspect.

But I would go further: although I agree you can’t be a good mathematician without both, I don’t think the contests select for the details people. They actually select for people who do or don’t understand the concepts (probably do for the higher level tests) but who in any case are extremely fast at the details. I have never been particularly fast at working out the details of something from the conceptual understanding (for example, it takes me a long time to solve a 7x7x7 Rubik’s cube) but it turns out the Rubik’s cube doesn’t mind. And in fact mathematics in real life isn’t a timed test- the idea that you need to be original and creative really quickly is just a silly, arbitrary way to select for talent.

I guess if you could have math competitions that aren’t timed then I might start being okay with them. Especially if they were collaborative.

The reason I claim math contests are bad for women is that women are particularly susceptible to feelings that they aren’t good enough or talented enough to do things, and of course they are susceptible to negative girls-in-math stereotypes to begin with. It’s not really a mystery to me, considering this, that fewer girls than boys win these contests – they don’t practice them as much, partly because they aren’t expected by others, nor do they expect themselves, to be good at them. It’s even possible that boys’ brains develop differently which makes them faster at certain things earlier- I don’t know and I don’t care, because I don’t think that the speed issue is correlated to later deep thought or mathematical creativity.

Finally, I don’t necessarily think that winning math contests is even all that good for the winners either. In spite of the fact that many of my favorite people are mathematicians who were excellent at contests, I also know quite a few people who were absolutely dominant in math contests in their youth who really seemed to suffer later on from that, especially in grad school. From my armchair psychologist’s perspective, I think it’s because they got addicted to the rush of doing math really fast and really well, and winning all these prizes, and when they get to grad school and realize how hard math really is, they can’t stand it.

One related complaint to this rant: it seems like there is way more money out there for math contests for young people than there is for math enrichment programs like the program I’m working at now (I’m looking at you, NSF). Why is this? Probably a combination of the fact that it’s easier to organize, it seems quantitatively measurably “successful” because there’s a winner at the end, and maybe even because it makes the United States look good compared to other countries to have a winning IMO team- in other words, spin. Booo! How about throwing a little bit of money towards programs that sponsor a sense of collaborative, exploratory mathematics and which encourage women?

Before I get people too riled up, I will say this in favor of math contests: they do tend to expose kids to different kinds of math than is normally offered in their classrooms, which can be really great, and expansive, for kids that have drab math curriculums with drab teachers. Lots of kids first find out there’s math beyond quadratic equations by going to a math contest. That’s cool, but can’t we do it in a better way?

Categories: math education

What is an earnings surprise?

July 17, 2011 Cathy O'Neil, mathbabe 1 comment

One of my goals for this blog is to provide a minimally watered-down resource for technical but common financial terms. It annoys me when I see technical jargon thrown around in articles without any references.

My audience for a post like this is someone who is somewhat mathematically trained, but not necessarily mathematically sophisticated, and certainly not knowledgeable about finance. I already wrote a similar post about what it means for a statistic to be seasonally adjusted here.

By way of very basic background, publicly traded companies (i.e. companies you can buy stock on) announce their earnings once a quarter. They each have a different schedule for this, and their stock price often has drastic movements after the announcement, depending on if it’s good news or bad news. They usually make their announcement before or after trading hours so that it’s more difficult for news to leak and affect the price in weird ways minutes before and after the announcement, but even so most insider trading is centered around knowing and trading on earnings announcements before the official announcement. (Don’t do this. It’s really easy to trace. There are plenty of other ways to illegally make money on Wall Street that are harder to trace.)

In fact, there’s so much money at stake that there’s a whole squad of “analysts” whose job it is to anticipate earnings announcements. They are supposed to learn lots of qualitative information about the industry and the company and how it’s managed etc. Even so most analysts are pretty bad at forecasting earnings. For that reason, instead of listening to a specific analyst, people sometimes take an average of a bunch of analysts’ opinions in an effort to harness the wisdom of crowds. Unfortunately the opinions of analysts are probably not independent, so it’s not clear how much averaging is really going on.

The bottom line of the above discussion is that the concept of an earnings surprise is really only borderline technical, because it’s possible to define it in a super naive, model-free way, namely as the difference between the “consensus among experts” and the actual earnings announcement. However, there’s also a way to quantitatively model it, and the model will probably be as good as or better than most analysts’ predictions. I will discuss this model now.

[As an aside, if this model works as well as or better than most analysts’ opinions, why don’t analysts just use this model? One possible answer is that, as an analyst, you only get big payoffs if you make a big, unexpected prediction which turns out to be true; you don’t get much credit for being pretty close to right most of the time. In other words you have an incentive to make brash forecasts. One example of this is Meredith Whitney, who got famous for saying in October 2007 that Citigroup would get hosed. Of course it could also be that she’s really pretty good at learning about companies.]

An earnings surprise is the difference between the actual earnings, known on day t, and a forecast of the earnings, known on day t-1. So how do we forecast earnings? A simple and reasonable way to start is to use an autoregressive model, which is a fancy way of saying do a regression to tell you how past earnings announcements can be used as signals to predict future earnings announcements. For example, at first blush we may use the last earnings announcement as a best guess for the coming one. But then we may realize that companies tend to drift in the same direction for some number of quarters (we would find this kind of thing out by pooling data over lots of companies over lots of time), so we would actually care not just about what the last earnings announcement was but also the previous one or two or three. [By the way, this is essentially the same first step I want to use in the diabetes glucose level model, when I use past log levels to predict future log levels.]

The difference between two quarters ago and last quarter gives you a sense of the derivative of the earnings curve, and if you take a second difference over the past three (an alternating sum with the middle quarter counted twice) you get a sense of the curvature or acceleration of the earnings curve.
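To make those two signals concrete, here is a minimal sketch. The function names and the sample numbers are made up for illustration, and the weights (1, -2, 1) are the standard second-difference weights, which is my reading of what is meant here rather than something spelled out in the post.

```python
def first_difference(earnings):
    """Slope estimate: last quarter minus the quarter before it.

    `earnings` is a list of quarterly results, oldest first.
    """
    return earnings[-1] - earnings[-2]


def second_difference(earnings):
    """Curvature estimate from the last three quarters, weights (1, -2, 1)."""
    return earnings[-1] - 2 * earnings[-2] + earnings[-3]


# Hypothetical earnings per share in cents, oldest first.
quarterly_cents = [100, 110, 125]
print(first_difference(quarterly_cents))   # 15: earnings are rising
print(second_difference(quarterly_cents))  # 5: and rising faster than before
```

A second difference of zero would mean earnings are moving in a straight line; a positive one means the growth itself is speeding up.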

It’s even possible you’d want to use more than three past data points, but in that case, since the number of coefficients you are regressing is getting big, you’d probably want to place a strong prior on those coefficients in order to reduce the degrees of freedom; otherwise we would be fitting the coefficients to the data too much and we’d expect it to lose predictive power. I will devote another post to describing how to put a prior on this kind of thing.
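One common way to express a prior like this is ridge regression, which amounts to a zero-centered Gaussian prior on each coefficient. The sketch below is only that: an illustration of the general idea using NumPy, not necessarily the prior this post has in mind. The name `ridge_fit` and the `penalty` parameter are hypothetical.

```python
import numpy as np


def ridge_fit(X, y, penalty):
    """Least squares with an L2 penalty on the coefficients.

    Solves (X^T X + penalty * I) beta = X^T y. A larger penalty
    corresponds to a stronger prior pulling every coefficient toward
    zero; penalty = 0 gives ordinary least squares back (when X^T X
    is invertible).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + penalty * np.eye(n_features), X.T @ y)
```

The penalty is a knob you would tune on held-out data: too small and you are back to overfitting, too large and the model ignores the past earnings altogether.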

Once we have as good a forecast of the earnings knowing past earnings as we can get, we can try adding macroeconomic or industry-specific signals to the model and see if we get better forecasts – such signals would bring up or bring down the earnings for the whole industry. For example, there may be some manufacturing index we could use as a proxy to the economic environment, or we could use the NASDAQ index for the tech environment.

Since there is never enough data for this kind of model, we would pool all the data we had, for all the quarters and all the companies, and run a causal regression to estimate our coefficients. Then we would calculate an earnings forecast for a specific company by plugging in the past few quarterly results of earnings for that company.
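Here is a rough sketch of the pooling step, assuming three lags and plain least squares via NumPy as a stand-in for the regression. All the names (`build_lagged_rows`, `fit_pooled_ar`, `forecast`, `earnings_surprise`) are made up for illustration, and a real version would add the prior and the macro signals discussed above.

```python
import numpy as np


def build_lagged_rows(series, n_lags):
    """Turn one company's earnings history (oldest first) into regression rows."""
    rows, targets = [], []
    for t in range(n_lags, len(series)):
        # Most recent quarter first: series[t-1], series[t-2], ...
        rows.append([series[t - k] for k in range(1, n_lags + 1)])
        targets.append(series[t])
    return rows, targets


def fit_pooled_ar(histories, n_lags=3):
    """Pool every company's rows and fit one shared set of coefficients."""
    X, y = [], []
    for series in histories.values():
        rows, targets = build_lagged_rows(series, n_lags)
        X.extend(rows)
        y.extend(targets)
    design = np.column_stack([np.ones(len(X)), np.array(X, dtype=float)])
    coeffs, *_ = np.linalg.lstsq(design, np.array(y, dtype=float), rcond=None)
    return coeffs  # intercept, then one weight per lag


def forecast(coeffs, recent):
    """`recent` holds the company's last n_lags results, most recent first."""
    return float(coeffs[0] + np.dot(coeffs[1:], recent))


def earnings_surprise(actual, predicted):
    """Actual earnings minus the forecast made before the announcement."""
    return actual - predicted
```

Note that each row only uses quarters strictly before the one being predicted, so the fitted model never peeks at the announcement it is trying to forecast.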

Categories: data science, finance, hedge funds, news

I love math nerd kids

July 15, 2011 Cathy O'Neil, mathbabe 9 comments

So I’m almost at the end of my second week here at HCSSiM, and the pathetic truth is I already miss these kids. They are so freaking adorable, and of course I miss my own kids so much, that the emotional turmoil of the situation combines to create the reality that I am actually nostalgic for each moment with them before that moment happens. Pathetic!! It’s something about identifying with their nerdy selves finding each other and figuring out that they have a community of nerds that accepts them… whatever, now I’m tearing up. Pitiful.

As for what I’m teaching them, the first week it was number theory, number theory, and more number theory. Can you tell I like number theory? At the end of the first week I looked around and I saw a bunch of earnest faces wondering if I was going to prove yet another thing about relatively prime numbers and solving polynomials modulo n and I thought to myself, these kids are going to think there are no other examples of proof by induction! How shameless! So this week I talked about graph theory. Next week: I’m going back to number theory. Yes I know, but it’s AWESOME. I’m going to talk about Farey numbers and continued fractions and maybe the Pell equation. They will know all about the golden ratio and maybe we’ll even measure each other’s faces. I can’t wait.

Last night we went to the director’s house and ate corn on the cob (we made the kids husk the corn- did you know teenagers today have mostly never husked corn before in their lives?) and pizza and we played “Mafia,” which was hilarious and sweetly innocent.

This weekend is “Yellow Pig day” at the ~~camp~~ program, which is a day where we celebrate yellow pigs and the number 17. We take this incredibly seriously, including making t-shirts with yellow pigs, having a 4-hour (feels like 17) talk about interesting properties of the number 17, and finally, singing yellow pig carols and eating a yellow pig cake at the end. It’s a wild time for math nerd kids. They will remember this and each other for the rest of their lives. Woohoo!!

Did I mention that I was a minor celebrity last night because I solved a 7x7x7 Rubik’s cube in front of them? This is status at its best. I even showed them my trick, and one of the kids came back to me at breakfast this morning proudly displaying his cube with a 3-cycle. Update: he has solved his entire cube using 3-cycles. Now he’s moving on to a dodecahedron puzzle.

LOVE these kids.

Categories: math education, rant, women in math

Motivating transparency: what we could do about too big to fail

July 13, 2011 Cathy O'Neil, mathbabe 5 comments

In this previous post, I promised a follow-up post about how we can devise a system in which large banks are actually motivated to be transparent about what is inside their portfolios. We have also discussed why the current system doesn’t work this way and that the banks have every reason to obfuscate their holdings, and in fact make loads of money by doing so. This makes appropriate external risk management difficult or impossible.

I have actually thought about this problem quite a bit since that post, and I (and a friend in finance) have come up with two quasi ideas, which hopefully together add up to be as good as one complete idea. The first comes under the category, “add stuff to what we have now”, whereas the second comes under the category, “initiate a new system which will over time replace the one we have”. Both of these systems rely on a good understanding of the underlying problem of the current system, namely the concept of “too big to fail.”

If you’re reading this and you have comments about either idea, please do comment. We are hoping for lots of feedback so we can improve the details.

Too Big to Fail

Recall the way it works when hedge funds want to trade stuff: they have prime brokers, i.e. banks like Deutsche and Goldman Sachs and Bank of America (see list of the biggies here). When the brokers don’t like the trade, or think it’s not sufficiently liquid, or think that the hedge fund may fail for any reason, they demand that the hedge funds post margin. That way if the bet goes sour there is a limited amount of risk that the brokerage could lose. As soon as a position starts to look riskier, which could happen because of recent volatility or lack of price transparency, the amount of margin that needs to be posted normally increases, putting pressure on the hedge fund to liquidate suspicious assets.

In other words, there is a real cost to hedge funds for trading in illiquid or complex securities, namely their cash is tied up in bank accounts with their brokers. This is not to say that they don’t take large risks, but there is a limit on how much risk they can take because of the “posting margin” system.

By contrast, big banks don’t post margins. They trade with hedge funds, of course, since hedge funds trade with them, but it’s the banks who demand margin, not the hedge funds (actually there’s a historical exception to this rule, namely Paulson’s hedge fund demanded margin from its brokers during the 2008 financial crisis).

This asymmetrical situation begs the question, why do hedge funds have to post margin but the big banks don’t? Two reasons: first, banks have access to Federal funds, and second, they are deemed too big to fail. [I admit I don’t know exactly why the access to Federal funds is granted to banks, nor do I understand exactly what the effect is. But I do think it’s a pertinent fact which is why I’ve included it here. Please do comment if you know more! Also note it may be a red herring since Goldman Sachs didn’t have access to Fed funds until the crisis.]

This “too big to fail” guarantee is a huge problem, which has only gotten more precise (since we’ve seen the bailout and now everyone knows the guarantee is there) and larger (because, in the end, the net result of all the 2008 crisis is fewer, larger banks) and about which absolutely nothing seems to be getting done. The disingenuous whining of greedy bankers like Jamie Dimon serves as a smokescreen for the fact that, if anything, banks are presumably waltzing into the next phase of their life with more power and fewer checks than they could have dreamed about in August 2008.

Idea #1: make banks post margins

“Too big to fail” means that it is assumed that the bank will be rescued by the government if it makes huge bad bets that threaten to bring it down. Two of the reasons the government can be counted on to bail out banks are first, that the deposits of normal Americans are at risk, which is discussed below in Idea #2, and second, that a bankruptcy would be catastrophically complicated, which we discuss here. One result of the guarantee is that hedge funds don’t bother demanding margins, which makes the banks riskier, which makes the “too big to fail” guarantee even worse.

What if the lawmakers enforced a symmetry of posted margins? We have to be precise, because actually there are different kinds of margins that traders are forced to post.

First, there’s the margin you post in the sense of “keep $x as a deposit for the position”, the thinking being that even if things go south, the broker could liquidate at something better than $x below current marked price in a hurry. This is the initial margin.
Next there’s the “your position lost $10 today, so you need to give me $10” (this is called variation margin). This is the most likely way to get margin called.

The idea here is to require brokers to post initial margin just as hedge funds do now. More precisely, the idea would be to let the two parties negotiate on the initial margin, which could be more for hedge funds since they may well be riskier, but then once it’s set to have complete symmetry of variation margin.
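As a toy illustration of what symmetric variation margin would mean day to day, here is a small sketch. Everything in it is hypothetical (the function name, the "fund"/"bank" labels, the sample numbers), and it ignores initial margin, netting across positions and every real-world contract detail.

```python
def variation_margin_calls(daily_pnl_for_fund, symmetric=True):
    """Return a (payer, amount) pair for each day's mark-to-market move.

    `daily_pnl_for_fund` is the position's daily gain (+) or loss (-)
    from the hedge fund's side. With symmetric=False only the fund ever
    posts, which is the one-sided arrangement described above.
    """
    calls = []
    for pnl in daily_pnl_for_fund:
        if pnl < 0:
            calls.append(("fund", -pnl))   # fund lost, fund posts
        elif pnl > 0 and symmetric:
            calls.append(("bank", pnl))    # bank lost, bank posts
        else:
            calls.append((None, 0))        # nobody is called
    return calls


print(variation_margin_calls([-10, 4, 0]))
# [('fund', 10), ('bank', 4), (None, 0)]
print(variation_margin_calls([-10, 4, 0], symmetric=False))
# [('fund', 10), (None, 0), (None, 0)]
```

The only difference between the two runs is the day the position moves against the bank: under symmetry the bank has to hand over cash too.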

Occasionally, in risky environments, the initial margin of $x is increased, which causes a lot of unraveling, and possibly cascading waves of problems which set off a panic. We’d need to have rules about how often this can happen to prevent the “symmetry of variation margin” rule from being bypassed with lots of initial margin modifications. The symmetry aspect should keep the margin contracts from allowing this to happen too often.

The overall goal would be to devise a system that would:

1. Encourage the posting and calling of (variation) margins,
2. Encourage sufficient sizing of initial margin,
3. Encourage early calls and liquidating if there is doubt that a variation margin call could be met, and
4. Simplify the bankruptcy rules on ownership of assets, especially for illiquid or complex assets.

The initial margin can be thought of as the dollar amount a price could move by between a margin call and it being paid. It should not be thought of as an asset for either party (and therefore the accounting of the various margins should be carefully considered, but I’m no accounting expert), and certainly should not be able to be recycled to buy more stuff, i.e. add to one’s leverage, or offered towards capital requirements. Moreover, if it is indeed symmetric, that would mean if a bank claims to only need to post n dollars in initial margin, then the hedge fund can turn around and use that same number for that same trade, at least up to an understood discount.

As for bankruptcy, we should start with the following. When a margin call is made by one side and it isn’t met, the person making the call:

1. keeps ALL the margin,
2. gets the security, and
3. is a (super-senior level of seniority) claimant to the variation margin they posted with the counterparty.

Moreover, rules 1 and 2 above do not go into a bankruptcy filing if one occurs (in particular, if the security is a swap, it’s just torn up). This is a key point since that means the bankruptcy is simplified and at the same time the security is back in liquid hands. Overall, this setup, or one like it, encourages hedge funds to margin call frequently (banks already do that), which is a good thing, and as described above is a further incentive to invest in liquid, non-complex securities, which in the end creates transparency.

The above idea doesn’t deal directly with desired property 2, and may well cause margins to be lower. One possibility to encourage margins to be of sufficient size would be to allow either party to “put” the security in question on to the other party at a cost of giving up the initial margin posted.

Idea #2: grow a separate system of utility deposit banks

Besides incredibly complicated bankruptcy filings with infinitely many counterparties, one of the major reasons those banks really are too big to fail is that they hold deposits, and the government doesn’t want people to worry that their life savings are at risk, causing a run on the banks and chaos. Another way to get around this, at least eventually, is to create new “utility banks” at the state level which do not trade securities (beyond very basic ones like interest rate swaps and treasuries), don’t take large risks, and have FDIC guarantees on savings.

In order to get consumers to switch to banks like this, the government should intentionally create incentives for people to transfer their deposits from “too big to fail” banks to these utility banks. A list of incentives could start with reasonable, transparent fees, and the eventual loss of FDIC insurance guarantee at non-utility banks. Then people who want to stay with risk-taking banks can do so knowing that, as long as bankruptcy laws eventually get simplified, the “too big to fail” guarantee will in fact be gone.

Moreover, another layer of separation between depositors and utility banks should be the requirement that, even with the restricted kinds of trades allowed for utility banks, they should be done in separate corporate entities (since banks are always a mishmash of many companies anyway).

This idea is not new, and can be seen for example in this article. In fact it is incredibly obvious: admit that what we have now is a guarantee for a get-out-of-jail card for greedy bankers, and transfer that guarantee to a banking system that we’ve created to be boring, along the lines of the post office.

Categories: finance, hedge funds

Bank accounting link

July 12, 2011 Cathy O'Neil, mathbabe Comments off

I wanted to share this link with you; it is both interesting and relevant to another post I’m working on (a follow up to this one) that will describe two ideas I’m contemplating regarding how to systematically change the way big banks are motivated to behave in the presence of the “too big to fail” guarantee.

Its goal is to describe how banks will behave in a given situation with a mortgage, but the thought process generalizes quite well to how banks behave in general, and in particular how accounting considerations trump utility to the depositors and even the long-term shareholders. It also explains, to those of us who were wondering, why Obama’s mortgage modification plan was never going to work.

Categories: finance, news, rant

Short Post!

July 11, 2011 Cathy O'Neil, mathbabe 8 comments

I’ve been told my posts are intimidatingly long, what with the twitter generation’s sound bite attention span. Normally I’d say, screw that! It’s because my ideas are so freaking nuanced they can’t be condensed to under a paragraph without losing their essence!

But today I acquiesce; here’s a short post containing at most one idea.

Namely, I’ve been getting pretty strong reactions online and offline regarding my post about whether an academic math job is a crappy job. I just want to set the record straight: I’m not even saying it’s a crappy job, I’m simply talking about someone else’s essay which describes it that way. But moreover, even if I were saying that, I would only be saying it’s crappy (which I’m not) compared to other jobs that very very smart mathy people could get. Obviously in the grand scheme of things it’s a very good job- safe working conditions, regular hours, well-respected, etc., and many people in this world have far crappier jobs and would love a job with those conditions. But relative to other jobs that math people could be getting, it may not be the best.

Many professors of math (you know who you are) have this weird narrow world view, that they feed their students, which goes something like, “if you want to be a success, you should be exactly like me (which is to say, an academic)”. So anyone who gets educated in a math department is apt to run into all these people who define success as getting tenure in an academic math department, and they just don’t know about or consider other kinds of gigs. It would be nice if there was a way to get a more balanced view of the pros and cons of all of the options.

Categories: finance, internet startup, math education, rant, women in math

Weekend Reading

July 8, 2011 Cathy O'Neil, mathbabe 1 comment

FogOfWar and I have compiled a short list of weekend reading for you that you may enjoy:

What’s the right way to think about China’s economy?
Is Japan’s “lost decades” a media myth?
Can I hear a FUCK YEAH for Elizabeth Warren? I feel a follow-up post coming on how much she rocks.
Get ready to be depressed by how few natural resources there really are.
This essay really pins Robert Rubin to the wall in a totally awesome way. I will add more in another post.
The Republicans are holding the entire nation for ransom over the possibility of default. Is it all political posturing? Or is it for the sake of the insanely shitty idea of a tax repatriation holiday? Here’s another article about this crappy idea; when Bloomberg makes you out as a selfish bastard then you know you’re a truly selfish bastard. I’m convinced that the politicians (and union leaders) arguing for this are just counting on the average person not understanding the actual issues well enough to know how evil it is (and how much kickback they must be getting). Another example of asymmetric information that really gets my goat.
I think it’s fair to say we all need a little more of this in our lives.

Categories: finance, FogOfWar, hedge funds, news, rant

Adding-up rules and Hockey Sticks

July 7, 2011 Cathy O'Neil, mathbabe 2 comments

So I’m at the math program HCSSiM, teaching for three weeks in a “workshop,” which means I am responsible for teaching 12 teenagers the basic language and techniques of math- things like induction, proof by contradiction, the pigeon-hole principle, and how to correctly use phrases like “without loss of generality we can assume…” and “the following is a well-defined function…”, as well as familiarity with basic group theory, graph theory, number theory, cardinality, and fun things like Pascal’s triangle.

It’s really beautiful, classical math, and the students are eager and fantastically bright. They are my temporary brood, and I adore them and feed them chocolate at evening problem sets.

It’s also a fine opportunity to do some silly math doodling just for fun, the only rules being you can’t use a computer to look anything up until you’re done, and you can only use the stuff your kids at the program already learned. I’m going to describe what my mom and I, and then a junior (Amber Verser) and senior (Benji Fisher) staff member at the math program, figured out in the last couple of days. It’s super cool and, it turns out, at least 400 years old.

One of the most common examples of proof by induction is the formula for the sum of the counting numbers up to n:

1 + 2 + 3 + … + n = n(n+1)/2

And then, once you figure that out, you move on to the next case:

1^2 + 2^2 + 3^2 + … + n^2 = n(n+1)(2n+1)/6.

If you’re really into it, you can put the next case on the problem set:

1^3 + 2^3 + 3^3 + … + n^3 = (n(n+1)/2)^2.
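Here is a quick numerical sanity check of the three formulas, as a minimal Python sketch. It is not a proof, and the helper name power_sum is my own:

```python
def power_sum(n, d):
    """Brute-force 1^d + 2^d + ... + n^d."""
    return sum(i**d for i in range(1, n + 1))

# Compare the brute-force sums against the closed forms for small n.
for n in range(50):
    assert power_sum(n, 1) == n * (n + 1) // 2
    assert power_sum(n, 2) == n * (n + 1) * (2 * n + 1) // 6
    assert power_sum(n, 3) == (n * (n + 1) // 2) ** 2
```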

Two obvious patterns are emerging when you add up successive dth powers up to n.

It’s a polynomial of degree d+1, and
The roots of the polynomial are symmetric about -1/2 (mom noticed this!).

How do you prove those two facts?

If you think it’s totally easy, stop reading now and give it a shot. There are about a million things you could try and none of them seem to work. I’ll wait.

…okay, let’s say you gave up, or already know, or don’t care. (Why are you reading still if you don’t care?!)

First let’s generalize the question to, if we add up values of some degree d polynomial for values i=0, 1, 2, …, n, then we want to prove the result is a degree d+1 polynomial in n. That this is equivalent to the first statement above is pretty easy to see by just re-arranging the terms of the double sum over i and over the terms of the polynomial in question. But it still seems like you need to know at least the answer to the question of what is a formula for 0^d + 1^d + 2^d + … + n^d, which is of course where we started.

But that’s where Pascal’s triangle comes in! We can generate Pascal’s triangle by the familiar “add up two consecutive numbers and put the answer below,” but we also can think of the element on the nth row and kth (tilted) column of Pascal’s triangle as the number of ways to choose k things from n things, which is referred to as “n choose k”, and where we start both the row and column counts at 0, not at 1. That definition satisfies the addition law because, if we have n things, we can label one as “special,” and then the size k subsets of the n things divide into two categories: the size k subsets that contain the special guy and the ones that don’t. If they do, then we need only find k-1 other things in the remaining n-1 size set, and the number of ways to do that is given by the element on row n-1 and column k-1. If they don’t contain the special guy, we need to find k things in the remaining n-1 size set, and the number of ways to do that is given by the element on row n-1 and column k.

On the other hand, we also know a formula for the numbers in Pascal’s triangle: the guy on the nth row and kth column is given by a degree k polynomial in n, namely n!/(k!(n-k)!). (This is because we can label all of the guys 1 through n, and just take the first k guys, and there are n! ways to label n things, but we don’t actually care about the order among the first k or among the last n-k.)

For example, in the second column, where we are looking at “n choose 2” for various n, we have the equation n(n-1)/2. This is a LOT like n^2 but has extra terms sticking on the end of lower order. When you’re looking at the third column, you’re working with the formula n(n-1)(n-2)/6, which is like the basic polynomial n^3 with extra stuff. In other words, the formula for “n choose k” is a degree k polynomial in n which we can think of as being a stand-in for n^k. Awesome.

The last ingredient is something called the “Hockey Stick Theorem,” which you gotta love just because of the name. It states that if we add up the values along a column, from the top of the rows down to the nth row, then the sum will be the number just below and to the right, and the entire picture will resemble a hockey stick.

The proof of the Hockey Stick Theorem is trivial- the answer is of course the sum of the two above it, and we have one in the sum already, but the other isn’t… but that other is the sum of the two above it, one of which is again already in the sum but the other isn’t… and you keep going until you get to the top edge of Pascal’s triangle, where the missing number is just 0.
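If you want to see the Hockey Stick Theorem in action before trusting the argument, here is a small Python sketch that checks it for small rows and columns. It uses the standard library’s math.comb (Python 3.8 and later); the function name hockey_stick_holds is just something I made up:

```python
from math import comb

def hockey_stick_holds(n, k):
    """Check that column k, summed from row 0 down to row n, equals row n+1, column k+1."""
    # comb(i, k) is 0 when k > i, which matches the empty top of the column.
    return sum(comb(i, k) for i in range(n + 1)) == comb(n + 1, k + 1)

assert all(hockey_stick_holds(n, k) for n in range(30) for k in range(30))
```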

Why does the Hockey Stick Theorem give us what we want? Going back to our generalized statement, we want to show the sum of values on a (any) degree d polynomial for i = 0, 1, 2, …, n is a degree d+1 polynomial. Well, use the dth column and make a hockey stick from the top to row n. Then the sum is on the (n+1)st row, in the (d+1)st column, which we know is a degree d+1 polynomial in n. Woohoo!

One way of looking at this is that we were actually asking the wrong question: instead of asking what the sum of the dth powers is we should have perhaps been asking what the sum of the dth column of Pascal’s triangle is; in other words, there is a better basis for the vector space of polynomials than x^d, namely “x choose d”. In fact, if there were an agreement in the world that actually the “x choose d” polynomials should be the standard basis (by the way, these basis polynomials would be called “Pascalinomials”!), then the hockey stick theorem would be the last word on how those things add up. As it stands, to figure out the actual formula for the sum of the dth powers for i=0, 1, 2, …, n, we need to write the first row of the change-of-basis matrix from one basis to the other.
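To make the change of basis concrete, here is one possible Python sketch. It is my own illustration, not what we did at the program: it finds the coefficients of x^d in the “x choose k” basis by taking repeated forward differences of the values 0^d, 1^d, …, d^d (Newton’s forward difference formula), and then adds up each column with the Hockey Stick Theorem. The function names are made up, and I only mean it for d at least 1.

```python
from math import comb

def binomial_basis_coeffs(d):
    """Coefficients c_0..c_d with x^d = sum_k c_k * C(x, k), via forward differences at 0."""
    values = [i**d for i in range(d + 1)]
    coeffs = []
    while values:
        coeffs.append(values[0])
        # Replace the list by its successive differences (one entry shorter each time).
        values = [b - a for a, b in zip(values, values[1:])]
    return coeffs

def power_sum_via_hockey_stick(n, d):
    """0^d + 1^d + ... + n^d, using sum_{i=0}^{n} C(i, k) = C(n+1, k+1) column by column."""
    return sum(c * comb(n + 1, k + 1)
               for k, c in enumerate(binomial_basis_coeffs(d)))
```

For example, binomial_basis_coeffs(2) gives the coefficients for x^2 = (x choose 1) + 2(x choose 2), so the sum of squares up to n is (n+1 choose 2) + 2(n+1 choose 3).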

As for the second question, we simply need to extend the definition of the sum F(n) of dth powers from 0 to n to the case where n is negative, by iteratively using the relation:

F(n) = F(n-1) + n^d, or

F(n-1) = F(n) – n^d.

Then we have F(0) = 0, F(-1) = 0, F(-2) = (-1)^(d+1), F(-3) = (-1)^(d+1) – (-2)^d = (-1)^(d+1)(1^d + 2^d) …, and it’s easy to prove that, for any n,

F(n) = (-1)^(d+1)F(-n-1).

This means that if we have a root at -1/2 + a, we also have a root at -1/2 – a = -(-1/2 +a) -1.
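Here is a small Python sketch that extends F to negative n using the relation above and checks the symmetry numerically. It assumes d is at least 1 (so that 0^d = 0), and the implementation details are my own:

```python
def F(n, d):
    """Sum of i^d for i = 0..n, extended to negative n via F(n-1) = F(n) - n^d."""
    if n >= 0:
        return sum(i**d for i in range(n + 1))
    # Step down from F(0) = 0, applying F(m-1) = F(m) - m^d for m = 0, -1, ..., n+1.
    total = 0
    for m in range(0, n, -1):
        total -= m**d
    return total

# Check F(n) = (-1)^(d+1) * F(-n-1) on a small range.
for d in range(1, 7):
    for n in range(-20, 21):
        assert F(n, d) == (-1) ** (d + 1) * F(-n - 1, d)
```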

Categories: math education

Does an academic job in math really suck?

July 6, 2011 Cathy O'Neil, mathbabe 10 comments

My cousin recently sent me a link to this article about women in science. Actually it’s really about jobs in science, and how much they suck, and how women are too practical to want them. It’s definitely interesting- and pretty widely read, as well, although I’d never seen it. It makes a few excellent points, especially about the crappy amount of money and feedback one gets as an academic, two issues which were definitely part of my personal decision to leave my academic career.

I think his overall argument, though, is simultaneously too practical-minded and not practical-minded enough. And although his essay is about science, I’ll concentrate on how it relates to math.

It’s too practical in that it doesn’t really understand the attraction- the nearly carnal desire- people have to math. It essentially assumes that after some amount of time, maybe 20 years, people will lose interest in their subject, perhaps because they are getting poorly paid.

Is this really true? Maybe for some people this is true, but the nerds I know are nerds for life – they don’t wake up one day thinking math isn’t cool after all. And from what I know about people, they acclimate pretty thoroughly to their standard of living by the time they are 40.

It’s not practical enough, though, because it doesn’t get at one of the most important reasons women leave math, namely because they are married and maybe have kids and they simply can’t be that person who moves across the country for a visiting semester in Berkeley because their husband has a job already and it’s not in Berkeley.

[As a side note, if someone wants to actually encourage women in math, and they are loaded, I would encourage them to set up a fund that would pay costs for quality childcare and airplane tickets for kids when women go to math conferences. You don’t even need to help organize the babysitting, just pay for it. It would help out a lot of young women and free them up to go to way more conferences, evening the playing field with young men.]

In fact there are plenty of women who are super nerdy and would love to go do math across the country, but when it comes to choosing between that lifestyle and having a family life, they will choose the family life more often than not. Really it’s the “nomadic monk” system itself that is crappy for women at that moment, even if they are theoretically happy to be a poor nerd for the rest of their lives.

I have another complaint (which will make it sound like I don’t like the essay but actually I do). It says that people in science don’t have the ability to switch careers, essentially because they don’t have the money. But that’s really not true, at least in math, and I’m a testament to the possibility of switching careers. One thing a nerd is really good at is learning new things quickly.

I also thought that there was something missing about the alternative jobs he mentions, in industry or otherwise, which is that, yes you do get paid better outside of academics, but on the other hand pretty much any nonacademic job requires you to have a boss, which can be really fine or really horrible, and restricts your vacation time to 3 or 4 weeks. By contrast the quality of life as an academic is, if not luxurious, at least much more under one’s control.

Categories: math education, women in math

Glucose Prediction Model: absorption curves and dirty data

July 5, 2011 Cathy O'Neil, mathbabe 7 comments

In this post I started visualizing some blood glucose data using python, and in this post my friend Daniel Krasner kindly rewrote my initial plots in R.

I am attempting to show how to follow the modeling techniques I discussed here in order to try to predict blood glucose levels. Although I listed a bunch of steps, I’m not going to be following them in exactly the order I wrote there, even though I tried to make them in more or less the order we should at least consider them.

For example, it says first to clean the data. However, until you decide a bit about what your model will be attempting to do, you don’t even know what dirty data really means or how to clean it. On the other hand, you don’t want to wait too long to figure something out about cleaning data. It’s kind of a craft rather than a science. I’m hoping that by explaining the steps the craft will become apparent. I’ll talk more about cleaning the data below.

Next, I suggested you choose in-sample and out-of-sample data sets. In this case I will use all of my data for my in-sample data since I happen to know it’s from last year (actually last spring) so I can always ask my friend to send me more recent data when my model is ready for testing. In general it’s a good idea to use at most two thirds of your data as in-sample; otherwise your out-of-sample test is not sufficiently meaningful (assuming you don’t have that much data, which always seems to be the case).
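For what it’s worth, here is a minimal Python sketch of the two-thirds rule of thumb. It assumes the samples are already in time order and splits chronologically, which is my assumption rather than something stated above; the function name is made up:

```python
def split_in_out(samples):
    """Split time-ordered samples: at most the first two thirds in-sample, the rest out-of-sample."""
    cut = (2 * len(samples)) // 3  # integer arithmetic, so the in-sample part never exceeds 2/3
    return samples[:cut], samples[cut:]

in_sample, out_of_sample = split_in_out(list(range(9)))
```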

Next, I want to choose my predictive variables. First, we should try to see how much mileage we can get out of predicting future blood glucose levels with past glucose levels. Keeping in mind that the previous post had us using log levels instead of actual glucose levels, since then the distribution of levels is more normal, we will actually be trying to predict log glucose levels (log levels) knowing past log glucose levels.

One good stare at the data will tell us there’s probably more than one past data point that will be needed, since we see that there are pretty consistent moves upwards and downwards. In other words, there is autocorrelation in the log levels, which is to be expected, but we will want to look at the derivative of the log levels in the near past to predict the future log levels. The derivative can be computed by taking the difference of the most recent log level and the previous one to that.
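As a minimal sketch of what these two signals look like in Python (the readings below are made up for illustration and are not from the actual data set, and the function name is hypothetical):

```python
import math

def log_level_features(glucose):
    """For each time t >= 1, return (log level at t, log level at t minus log level at t-1)."""
    logs = [math.log(g) for g in glucose]
    return [(logs[t], logs[t] - logs[t - 1]) for t in range(1, len(logs))]

readings = [100.0, 110.0, 121.0]  # made-up glucose values, each 10% above the last
features = log_level_features(readings)
```

Since each made-up reading is 10% higher than the previous one, both differences come out to log(1.1), which is one nice thing about working in log levels.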

Once we have the best model we can with just knowing past log levels, we will want to add reasonable other signals. The most obvious candidates are the insulin intakes and the carb intakes. These are presented as integer values with certain timestamps. Focusing on the insulin for now, if we know when the insulin is taken and how much, we should be able to model how much insulin has been absorbed into the blood stream at any given time, if we know what the insulin absorption curve looks like.

This leads to the question of, what does the insulin (rate of) absorption curve look like? I’ve heard that it’s pretty much bell-shaped, with a maximum at 1.5 hours from the time of intake; so it looks more or less like a normal distribution’s probability density function. It remains to guess what the maximum height should be, but it very likely depends linearly on the amount of insulin that was taken. We also need to guess at the standard deviation, although we have a pretty good head start knowing the 1.5 hours clue.
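Here is a rough Python sketch of that guess, not a fitted model. The 1.5 hour peak comes from the clue above; the 0.5 hour standard deviation is a pure placeholder I picked for illustration, and the function names are made up. The height scales linearly with the dose, and doses taken in the future contribute nothing, so it can be used causally.

```python
import math

PEAK_HOURS = 1.5   # from the text: maximum absorption rate about 1.5 hours after intake
SIGMA_HOURS = 0.5  # assumption: placeholder width, to be guessed or fit later

def absorption_rate(hours_since_dose, units):
    """Bell-shaped (normal pdf) absorption rate; height is linear in the dose."""
    if hours_since_dose < 0:
        return 0.0  # the dose hasn't been taken yet
    z = (hours_since_dose - PEAK_HOURS) / SIGMA_HOURS
    return units * math.exp(-0.5 * z * z) / (SIGMA_HOURS * math.sqrt(2 * math.pi))

def total_absorption_rate(t, doses):
    """Sum the curves for all doses, given as (time_in_hours, units) pairs."""
    return sum(absorption_rate(t - t0, units) for t0, units in doses)
```

One caveat with this shape: a normal curve centered at 1.5 hours has a little bit of mass before time zero, which gets cut off here, so the total absorbed is slightly less than the dose.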

Next, the carb intakes will be similar to the insulin intake but trickier, since there is more than one type of carb and different types get absorbed at different rates, but are all absorbed by the bloodstream in a vaguely similar way, which is to say like a bell curve. We will have to be pretty careful to add the carb intake model, since probably the overall model will depend dramatically on our choices.

I’m getting ahead of myself, which is actually kind of good, because we want to make sure our hopeful path is somewhat clear and not too congested with unknowns. But let’s get back to the first step of modeling, which is just using past log glucose levels to predict the next glucose level (we will later try to expand the horizon of the model to predict glucose levels an hour from now).

Looking back at the data, we see gaps and we see crazy values sometimes. Moreover, we see crazy values more often near the gaps. This is probably due to the monitor crapping out near the end of its life and also near the beginning. Actually the weird values at the beginning are easy to take care of- since we are going to work causally, we will know there had been a gap and the data just restarted, so we will know to ignore the values for a while (we will determine how long shortly) until we can trust the numbers. But it’s much trickier to deal with crazy values near the end of the monitor’s life, since, working causally, we won’t be able to look into the future and see that the monitor will die soon. This is a pretty serious dirty data problem, and the regression we plan to run may be overly affected by the crazy crapping-out monitor problems if we don’t figure out how to weed them out.
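The easy half of this, ignoring readings for a while after the data restarts, can be sketched in Python as follows. The thresholds are placeholders I made up (a gap is anything over 10 minutes between readings, and the warm-up lasts 60 minutes); the real values still have to be determined from the data. It does nothing about the harder end-of-life problem.

```python
def trusted_mask(times_min, gap_min=10.0, warmup_min=60.0):
    """Causally flag readings: False during the warm-up after the start or after a gap."""
    mask = []
    restart = None  # time at which the feed most recently (re)started
    prev = None
    for t in times_min:
        if prev is None or t - prev > gap_min:
            restart = t  # first reading ever, or first reading after a gap
        mask.append(t - restart >= warmup_min)
        prev = t
    return mask
```

Each flag depends only on timestamps up to and including the current one, so nothing here peeks into the future.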

There are two things that may help. First, the monitor also has a data feed which is trying to measure the health of the monitor itself. If this monitor monitor is good, it may be exactly what we need to decide, “uh-oh the monitor is dying, stop trusting the data.” The second possible saving grace is that my friend also measured his blood glucose levels manually and inputted those numbers into the machine, which means we have a way to check the two sets of numbers against each other. Unfortunately he didn’t do this every five minutes (well actually that’s a good thing for him), and in particular during the night there were long gaps of time when we don’t have any manual measurements.

A final thought on modeling. We’ve mentioned three sources of signals, namely past blood glucose levels, insulin absorption forecasts, and carbohydrate absorption forecasts. There are a couple of other variables that are known to affect blood glucose levels, namely the time of day and the amount of exercise that the person is doing. We won’t have access to exercise, but we do have access to timestamps. So it’s possible we can incorporate that data into the model as well, once we have some idea of how the glucose is affected by the time of day.

Categories: data science, open source tools

Cookies

July 4, 2011 Cathy O'Neil, mathbabe 9 comments

About three months ago I started working at an internet company which hosts advertising platforms. It’s a great place to work, with a bunch of fantastically optimistic, smart people who care about their quality of life. I’m on the tech team along with the team of developers which is led by this super smart, cool guy who looks like Keanu Reeves from the Matrix.

I’ve learned a few things about how the internet works and how information is collected about people who are surfing the web, and the bottom line is I clear my cookies now after every session of browsing. Now that I know the ways information travels the risks of retaining cookies seem to outweigh the benefits. First I’ll explain how the system works and then I’ll try to make a case for why it’s creepy, and finally, why you may not care at all.

Basically you should think of yourself, when you surf the web, as analogous to someone on the subway coming home from Macy’s with those enormous red and white shopping bags. You are a walking advertisement for your past, your consumer tastes, and your style, not to mention your willingness to purchase. Moreover, beyond that, you are also carrying around information about your political beliefs, religious beliefs, and temperament. The longer you browse between cookie cleanings, the more precise a picture you’ve painted of yourself for the sites you visit and for third parties (explained below) who get their hands on your information.

Just to give you a flavor of what I’m talking about, you probably are already aware that when you go to a site like, say, Amazon, the site assigns you a cookie to recognize you as a guest; when you return a week later it knows you and says, “Hi, Catherine!”. That’s on the low end of creepy since you have an account with Amazon and it’s convenient for the site to not ask you who you are every time you visit.
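To make that mechanism a little more concrete, here is a toy Python sketch using the standard library’s http.cookies module. The cookie name visitor_id, the value abc123, and the ACCOUNTS table are all hypothetical; this is an illustration of the general idea, not how Amazon or any real site does it.

```python
from http.cookies import SimpleCookie

# Hypothetical server-side table mapping cookie values to account names.
ACCOUNTS = {"abc123": "Catherine"}

def set_cookie_header(visitor_id):
    """Build the header a site would send to tag a browser on its first visit."""
    cookie = SimpleCookie()
    cookie["visitor_id"] = visitor_id
    return cookie.output()

def greet(cookie_header_value):
    """Parse the Cookie header the browser sends back and look the visitor up."""
    cookie = SimpleCookie()
    cookie.load(cookie_header_value)
    morsel = cookie.get("visitor_id")
    name = ACCOUNTS.get(morsel.value) if morsel else None
    return f"Hi, {name}!" if name else "Hello, guest!"
```

If you clear your cookies, the browser sends nothing back, and the site falls through to the guest greeting.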

However, you may not be aware that Amazon can also see and parse the cookies that other sites, like Google (correction: a reader has pointed out to me that Google doesn’t let this happen, sorry. I was getting confused between the cookie and the “referring url”, which tells a site where the user has come from when they first get to the site. That does contain Google search terms), place on your web signature. In other words Amazon, or any other site that knows how to look, can figure out what other sites’ label of you says. Some cookies are encrypted but not all of them, and I think the general rule is to not encrypt- after all, the people who have the tools to read the cookies all benefit from that information being easy to read. From the perspective of Google, moreover, this information is helping improve your user experience. It should be added that Google and many other companies give you the option of opting out of receiving cookies, but to do so you have to figure out it’s happening and then how to opt out (which isn’t hard).

One last layer of cookie collection is this: there are other companies which lurk on websites (like Amazon, although I’m not an expert on exactly when and where this happens) which can also see your cookies and tag you with additional cookies, or even change your existing cookies (this is considered rude but not prevented). This is where, for me, the creep factor gets going. Those third parties certainly have less riding on their brand, since of course you don’t even see them, so they have less motivation to act honorably with the information they collect about you. For the most part, though, they are just looking to see what kind of advertisement you may be weak for and, once they figure it out, they show you exactly that model of showerhead that you searched for three weeks ago but decided was too expensive to buy. If you want to stop seeing that freaking showerhead popping up everywhere, clear thy cookies.

Here’s why I don’t like this; it’s not about the ubiquitous showerhead, which is just annoying. Think about rich people and how they experience their lives. I touched on this in a previous post about working at D.E. Shaw, but to summarize, rich people think they are always right, and that’s a pretty universal rule, which is to say anyone who becomes rich will probably succumb to that pretty quickly. Why, though? My guess is that everyone around them is aware of their money and is always trying to make them happy in the hope that they at some point could have some of that money. So they effectively live in a cocoon of rightness, which after a while seems perfectly logical and normal.

How that concept manifests itself in this conversation about cookies is that, in a small but meaningful way, that’s exactly what happens to the user when he or she is browsing the web with lots of cookies. Every time Joe encounters a site, the site and all third-party advertisers have the ability to see that Joe is a Republican gun-owner, and the ads shown to Joe will be absolutely in line with that part of the world. Similarly the cookies could expose Dan as a liberal vegetarian and he sees ads that never shake his foundations. It’s like we are funneled into a smaller and smaller world and we see less and less that could challenge our assumptions. This is an isolating thought, and it’s really happening.

At the same time, people sometimes want to be coddled, and I’m one of those people. Sometimes I enjoy it when my favorite yarn store advertises absolutely gorgeous silk-cashmere blends at me, or shows me to a rant against greedy bankers, and no I’d rather not replace them with Viagra ads. So it’s also a question of how much does this matter. For me it matters, but I also like New York City because it is dirty and gritty and all these people from all over the world live there and sweat on each other on the subway and it makes me feel like part of a larger community- I like to mix it up and have it mixed up.

I’d also like to mention another kind of reason you may want to clear your cookies: you get better deals. A general rule of internet advertising is that you don’t need to show good deals to loyalists. So if you don’t have cookies proving you have an account on Netflix, you may get an advertisement offering you three free months of membership. Or if you want to get more free articles on the New York Times website, clear your cookies and the site will have no idea who you are. There are many examples like this.

Lastly, I’d like to point out that you probably don’t need to worry about this. After all, many browsers will clear your cookies but also clear your usernames and passwords, and you may never be able to get some of those back. And maybe you don’t mind being coddled while online. Maybe it’s the one place where you get to feel understood. Why question that?

Categories: internet startup, news

Fair Foods

July 3, 2011 Cathy O'Neil, mathbabe 3 comments

This post will only be indirectly quantitative, and not a rant, so I guess that means I will have to either apologize or change my mission statement. Sorry. Oh and by the way I do have lots of ideas for quantitative blogs coming up, topics to include:

clear your cookies! how internet companies track your every click
update on the diabetes model
is being a mathematician just a crappy job?
shout-outs to other nerd bloggers who are sending me readers

So yesterday I loaded up the (rental) car to the brim, with my mom, my two older sons, a guitar (for me) and an air conditioning unit (for my mom), and drove out to Amherst for the math program I’m teaching in for three weeks.

Before I left I visited my friend Nancy at Fair Foods in Dorchester.

I drove to her house early, getting there at maybe 8:30am. She wasn’t home- she had me meet her at a church near Codman Square, where she was making a drop. When I got there I helped her unload a van full of maybe 40 or so boxes of vegetables and fruit, with a few 50-pound bags of carrots and potatoes. She got on the van and handed me the boxes and I carried them over to a sidewalk, while the woman, Marie, who was accepting the drop, carried some smaller boxes into the basement. Nancy introduced me to Marie as her daughter, and introduced Marie to me as the beautiful, wise Haitian woman who was a professional cook and would turn all of these vegetables into a delicious feast for her congregation. Nancy and Marie talked about the church, and the fact that it was shared between two different congregations, one Haitian immigrant and one African-American, and how the church was run.

After a while it didn’t seem like Marie was going to get the help she was expecting to carry the larger boxes into the basement, so Nancy and I moved all of the boxes down there, temporarily rigging a window to be a de facto dumb waiter to avoid three corners and some stairs. There were tomatoes, white potatoes, red potatoes, carrots, ugli fruit, limes, lettuce, string beans, wax beans, and others I can’t remember. Almost all of these were in great condition, but some needed sorting before going into the feast. Marie asked for corn for the 4th of July- since the food that is collected is surplus, a given request may be hard to fill, especially around a holiday, which Nancy explained. But then she said that if we got corn we would call Marie right away.

After we finished unloading the van I was soaked in sweat; it reminded me of how incredibly strong I’d gotten working one summer for Nancy, unloading trucks all day (as well as loading them at the Chelsea Produce Market every morning at 7) and driving around the city in the big yellow truck making drops to churches, senior centers, and youth centers, and holding dollar-a-bag sites in vacant parking lots and sidestreets. That was in 1992; and Nancy, who was born in 1950, has been doing the program ever since, with various people’s help.

Nancy mentioned that before I’d gotten there she had gone into the church and listened to the singing and the praying of the Haitian congregation, and that it had been seriously beautiful. Marie insisted on us coming inside. We sat in the pews as the woman leading the small prayer group of about 8 people, mostly women, was talking to one woman who was clearly in distress. Perhaps she was in mourning. They were speaking in Creole, which I don’t understand (although I know some French so every now and then I can pick up a word or two), but it was viscerally moving how kindly the leader was speaking to the sad woman seated in front of her. After she allowed that woman to finish, she looked up and welcomed us in English and asked us our names. Marie explained in Creole something about us, probably that we had just brought in the food for the July 4th meal, and we were instantly welcomed by the entire group. After that they told us they were wrapping up their prayer session and would stand and have a group prayer.

Everyone stood, except for the mourning woman who was holding her head in her hands. And at once everyone started praying, but the interesting thing was they were all saying different prayers, and it was fascinating to watch and listen to how they could be both praying together and praying individually. I could make out a few words from Marie’s prayer, which near the beginning was quiet and included lots of words like “please” and “hope”, but which, like everyone else’s, became louder and more fervent and contained more words like “thank you” and “hallelujah”. It ended by everyone holding their hands up to the front of them and giving thanks. Everyone ended at exactly the same time.

After the prayer group ended, there were lots of hugs and hand shaking. Many of the women wanted to talk to Nancy and she probably ended up hugging and being hugged by everyone there. There was a deep human connection inside that little church, which is pretty different from my normal assumptions about piousness and rules-based religions. Connection and empathy.

After we left the church we went to a playground and sat and had coffee together, and Nancy laid something down that was pretty thick. She talked about her disillusionment with her generation- the hippy generation- how they made all these promises but then didn’t follow through- the words she uses are “didn’t apply themselves.” She talked about having faith in her generation up to the “We Are the World” moment, and then waiting, and seeing nothing come out of it, and how bitter that had made her feel, how disappointed. She said it took her years to get over that, and now she feels like those years of her life, until recently in fact, are in some sense unaccounted for, both because she’s been sick and because she was somewhat paralyzed with anger.

She went on to say that she’s in a new phase now, she’s accepted the lazy fact of life that the people she was counting on, if anything, have made the world a worse place, not a better one, but that she’s decided to love them and love the world anyway, and to continue to make human connections with individuals, because it makes her have faith in a different way, a more diffuse but a stronger faith that won’t be disappointed.

It’s interesting to me that Nancy would ever describe her life as unaccounted for or her feelings as bitter. When I met her in 1989, she had been diagnosed with MS and lived in a huge old house with very little working anything (and what was working she’d installed herself- wired the electricity and installed plumbing). She had a great Dane and a broken-down donated truck, and when I came to her we spent the whole night cleaning out and reorganizing the truck. Whenever the truck’s insurance was due, or the phone was about to be cut off, we’d get a check for $50 and it would be a miracle, and I always felt like if I was ever going to believe in something it would be because of her.

I fell in love with her and with her approach to problem solving- namely, do the right thing, and go figure how to with bare knuckles and sweat. Over the years she’s been better or worse off with her health, but she’s never given up and, to be honest, I never sensed bitterness from her. Maybe these are relative notions, that bitterness from her is like frustration from someone else. Unaccountability from the woman who moves tons of food a week, that will otherwise be thrown away, into the homes of impoverished, mostly immigrant households, who know her and appreciate her act of kindness and take part in that act, would mean… what? to other people. Hard to say.

Categories: news, rant

Did you have a happy childhood?

July 2, 2011 Cathy O'Neil, mathbabe 6 comments

For whatever reason, I’ve been thinking about my childhood recently. Partly it’s the post I wrote about why I chose to call myself “mathbabe”, partly it’s an old essay of Jonathan Franzen’s that got me all riled up (in a good way). Plus I’m traveling to the math camp of my youth to teach, and stopping on the way in Harvard Square at my parents’ house; that’s enough to make you reconsider your memories in short order.

I have never understood what people mean when they talk about carefree, happy childhoods. I think I’ve always assumed this to be some kind of ironic joke, or maybe a plastered-over memory, a convenient approach to pain management. While it’s true that children have fewer responsibilities than grown-ups, it’s really not the responsibilities of adulthood that weigh me down (says the woman with three kids), or ever did. For me it was the constant awareness of my helplessness and impotence, my inability to decide my own fate, my feeling of having to wait forever for freedom, that got to me. I was also teased, but not relentlessly, and I did have friends, and moreover I wasn’t thought of (I don’t think) as a worrying child. From the outside people may have imagined me as a normal albeit nerdy kid. However, I always identified with the oppressed, and I had a keen sense of fairness which was constantly being challenged by reality. When we studied the “Manifest Destiny” in third grade, it killed me to think of the white man’s assumptions. When I saw a kid getting bullied at school, it tore me up that I didn’t know how to put an end to it and no teachers bothered. The list goes on, you get the idea. Also, I had an internal standard that was painfully high- I wanted to become a musician, a pianist, but never thought I’d be good enough, and I questioned my creativity, since what I really wanted to do was compose. When I decided to become a mathematician I started worrying about my thesis (I was 16). By the way, lest people get the wrong impression, my parents never put pressure on me to play music (in fact they openly discouraged me since it was expensive) and thought my worrying about my thesis was downright amusing. This was all internally generated. In short, I was a struggler, at best of times a striver, but never ever carefree and happy.

I have always been attracted to other people who struggle and strive; for the most part my closest friends are, like me, in constant flux with respect to their identities and their goals and even the interpretation of the most basic cultural assumptions like toenail polish and the role of the FDIC.

This brings me to the Franzen essay, where he talks about being isolated in childhood as a reader, and spending the rest of your life trying to find and form a community with other isolated readers. As an aside, Franzen makes a distinction in this essay between isolated readers and isolated math or technology nerds. He basically said that math nerds are isolated because they are autistic, incapable of social interaction, whereas readers are isolated because they feel more deeply and can’t relate to artificiality. I’m not sure whether to argue that math nerds aren’t all autistic or just count myself as both a reader and a math nerd and be proud of out-isolating Franzen, no easy task. Basically, I agree with Franzen. From my perspective upon meeting someone I am always looking for that inner torture, the hallmark of an examined life. It doesn’t make you happy, perhaps, but it makes you real, and moreover interesting.

But here’s the thing, I was blindsided this week by the discovery that my husband, of all people, had a happy childhood. He insists on this, even when I ask him if perhaps he’s misremembering his inner turmoil– he claims no. He moreover avers that, at the age of 12, he decided to become a mathematician and has never looked back, never once questioned that decision. Is this possible? That I’m married to a man who had a happy childhood? For all I know, it is true and moreover it may be exactly why I have a happy marriage. Maybe strugglers need to be married to non-strugglers to maintain some kind of balance. I don’t know, I’m still thinking about it. It does explain something that I’ve always been confused by, though- when my husband comes across an ethical or moral decision, he does so painlessly and makes a decision instantaneously. I now think this is because he just doesn’t think about things like that in between those moments, and so he’s got a clarity of consciousness which allows him to make snap decisions. When I come across such dilemmas, I am much more confused and ambivalent. I usually decide it’s a matter of opinion. I’m wondering if it’s this element of our differences that makes our marriage work.

Categories: rant

Asymmetrical Information

July 2, 2011 Cathy O'Neil, mathbabe 2 comments

From my experience, there are only a few basic kinds of trading models encountered on Wall Street. These are:

chasing dumb money, which I’ve described already,
asymmetrical information, which I want to talk about today,
market-making,
providing “insurance”,
seasonality, which I’ve touched on, and
taking advantage of macroeconomic misalignment (think Soros’s pound trade)

In other posts I intend to go into more detail in the above categories, as well as devote a post to the question of how trading models fail (there also seem to be only a few basic categories for that).

Finance nerd readers: please tell me if I’m missing something!

The concept of asymmetrical information is incredibly simple: I know more than you so I can make a more informed assessment of the value of some underlying contract. This could mean I know inside information about a company and trade before the announcement (illegal but common), or that I know the likelihood of bankruptcy for a company is higher than the market seems to think, or that the underlying mortgages of a packaged security are likely to default.

I could go on, and probably will in another post, but I’d like to make a very basic point, which is this: a lot of money is made every day via asymmetrical information, and in particular there’s a major motivation to obfuscate data in order to create asymmetry. One of the missions of this blog is to uncover and expose major, unreasonable examples of obfuscated information that I know about.

At this point it’s critical to differentiate between two things which typically get confused by non-nerds. Namely, the difference between a technical but thorough explanation and true information obfuscation. A technical explanation, if thorough, can be worked through eventually by someone with enough expertise, or someone who is motivated enough to get that expertise, whereas true information obfuscation just doesn’t provide enough details to really know anything.

The worst is when you are given pretty specific technical information, but which only explains half of the story. This leads to an imprecise false sense of security, which I suspect underlies most of the very large mistakes we’ve seen in finance in the last few years.

For example, let’s talk about the bank stress tests in the United States in 2009. They were conducted in two distinct phases. In the first, a bunch of economists were asked to write down two scenarios. The first was kind of a prediction of how 2009 and 2010 would play out, and the second was a more negative scenario. Okay so far, even though economists aren’t all that pessimistic as people (more on this in another post). The scenarios were averaged in some way and then publicly posted. The good news is, if you thought the scenarios were unrealistic, you’d at least know how to complain about them. The bad news is that they are pretty vague, only really specifying the GDP growth and the unemployment rate.

In the second phase, the banks were allowed to predict the impact of those two scenarios on their portfolios using their own internal models, which were not made public. Here’s the white paper if you don’t believe me. So, in the name of asymmetrical information, why is this a problem? Here are a few reasons:

Banks had bad internal risk models
Banks had clear motivation to mark their portfolios to their advantage
The fact that their methods weren’t made public gives them ample cover to do whatever they wanted

There are two reasons I say that banks had bad internal risk models. The first reason is the one you know about already- they evidently bought a whole bunch of toxic securities leading up to 2008 and seemed to have no idea about the risks. But moreover, my personal experience working in the risk field is that banks used external risk modeling companies as a rubber stamp, essentially to placate those worrywarts who insisted on obsessing about risks. To be more precise without getting anyone into trouble, it was commonplace for banks to not notice when a model at a risk software company had very basic problems and would spit out nonsensical numbers. It was almost as if you couldn’t trust the banks to look at their risk numbers at all. This isn’t true of every bank at all times, but as a general rule when models had major problems it was hedge funds, not banks, who would bring attention to those problems. Moreover, the banks did not seem to have internal risk modeling across their desks. In other words, a trading desk which trades a certain kind of instrument may have some risk monitoring in place (mostly to bound the amount of trading of that type), but when it comes to understanding systemic risk across instrument types, the external risk companies were the source.

It is obvious that banks were motivated to mark their portfolios to their advantage. The ultimate result of the bank stress tests was possible additional capital requirements, which they clearly wanted to avoid. This temptation meant it would benefit them to make every assumption of their risk model liberal to their cause.

Finally, they didn’t expose their methods- not even to explain in general terms how they dealt with, say, interest rate risks across instrument types. This meant that only the Fed people involved got to decide how honest the banks were. This is the opposite of what is needed in this situation. There is no reasonable need to keep these methodologies secret from the general public, since it is we who are on the hook if their methods are flawed, as we have seen.

Here’s where I admit that it’s actually really hard to come up with good methodologies to measure the impact of vague GDP growth and unemployment estimates. But that admission is only going to add to my rant, because my overall point is that the instruments themselves have been created to make that hard. They are examples, especially tranched mortgage-backed securities but others as well, of intentional obfuscation for the sake of creating asymmetrical information. Instead of living in a world where banks who own things like this are allowed to measure them at their whim, and benefit from that obfuscation, we need to create a system where they are penalized for having illiquid or complex instruments.

And here’s where I admit that I’m not an expert on all of these instruments – some would say I don’t have the right to talk about how they should be assessed. Yet again, I choose to use that fact to add to my rant: if, after working for four years in finance as a quant at a hedge fund and then a researcher and account manager at a risk company, I can’t have an opinion about how to assess risk, then the system is too freaking complicated.

Categories: finance, hedge funds

Women on a board of directors: let’s use Bayesian inference

June 30, 2011 Cathy O'Neil, mathbabe 6 comments

I wanted to show how to perform a “women on the board of directors” analysis using Bayesian inference. What this means is that we need to form a “prior” on what we think the distribution of the answer could be, and then we update our prior with the data available. In this case we simplify the question we are trying to answer: given that we see a board with 3 women and 7 men (so 10 total), what is the fraction of women available for the board of directors in the general population? The reason we may want to answer this question is that then we can compare the answer to other available answers, derived other ways (say by looking at the makeup of upper level management) and see if there’s a bias.

In order to illustrate Bayesian techniques, I’ve simplified it further to be a discrete question. So I’ve pretended that there are only 11 answers you could possibly have, namely that the fraction of available women (in the population of people qualified to be put on the board of directors) is 0%, 10%, 20%, …, 90%, or 100%.

Moreover, I’ve put the least judgmental prior on the situation, namely that there is an equal chance for any of these 11 possibilities. Thus the prior distribution is uniform:

We have absolutely no idea what the fraction of qualified women is.

The next step is to update our prior with the available data. In this case we have the data point that there is a board with 3 women and 7 men. In this case we are sure that there are some women and some men available, so the updated probability of there being 0% women or 100% women should both be zero (and we will see that this is true). Moreover, we would expect to see that the most likely fraction will be 30%, and we will see that too. What Bayesian inference gives to us, though, is the relative probabilities of the other possibilities, based on the likelihood that one of them is true given the data. So for example if we are assuming for the moment that 70% of the qualified people are women, what is the likelihood that the board ends up being 3 women and 7 men? We can compute that as (0.70)^3*(0.30)^7. We multiply that by 1/11, the probability that 70% is the right answer (according to our prior) to get the “unscaled posterior distribution”, or the likelihoods of each possibility. Here’s a graph of these numbers when I do it for all 11 possibilities:

We learn the relative likelihoods of the outcome "3 out of 10" given the various ratios of women

In order to make this a probability distribution we need to make sure the total adds up to 1, so we scale to get the actual posterior distribution:

We scale these to add up to 1

What we observe is, for example, that it’s about twice as likely that 50% of the qualified people are women as it is that 10% of them are, even though those answers are equally distant from the best guess of 30%. This kind of “confidence of error” is what Bayesian inference is good for. Also, keep in mind that if we had had a more informed prior the above graph would look different; for example we could use the above graph as a prior for the next time we come across a board of directors. In fact that’s exactly how this kind of inference is used: iteratively, as we travel forward through time collecting data. We typically want to start out with a prior that is pretty mild (like the uniform distribution above) so that we aren’t skewing the end results too much, and let the data speak for itself. In fact priors are typically of the form, “things should vary smoothly”; more on what that could possibly mean in a later post.
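As a minimal sketch of that iterative use, the update step can be wrapped in a function and applied twice, feeding the first posterior back in as the second prior. This is an illustration under stated assumptions, not part of the original analysis: the helper name `update` and the second board (2 women, 8 men) are made up for the example.

```python
import numpy as np

def update(prior, women, men, ratios):
    """One Bayesian update on a discrete grid of candidate ratios of women."""
    # likelihood of seeing this board under each candidate ratio
    likelihood = ratios**women * (1 - ratios)**men
    unscaled = prior * likelihood
    # scale so the posterior adds up to 1
    return unscaled / unscaled.sum()

# the 11 candidate ratios 0.0, 0.1, ..., 1.0 and the uniform prior
ratios = np.linspace(0.0, 1.0, 11)
prior = np.full(11, 1.0 / 11)

# first board: 3 women and 7 men (the data point from the post)
first = update(prior, 3, 7, ratios)

# hypothetical second board: 2 women and 8 men, using the first posterior as the prior
second = update(first, 2, 8, ratios)

for r, p1, p2 in zip(ratios, first, second):
    print("%.1f  %.4f  %.4f" % (r, p1, p2))
```

Because each update just multiplies in a likelihood and rescales, seeing the two boards one after the other gives the same posterior as seeing 5 women and 15 men all at once.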

Here’s the python code I wrote to make these graphs:

#!/usr/bin/env python

from matplotlib.pylab import *
from numpy import *

# plot prior distribution:
# (align = "edge" makes each bar start at its ratio, so the ticks at +0.05 sit mid-bar)
figure()
bar(arange(0,1.1,0.1), array([1.0/11]*11), width = 0.1, align = "edge", label = "prior probability distribution")
xticks(arange(0,1.1,0.1) + 0.05, ["%.1f" % x for x in arange(0,1.1,0.1)] )
xlim(0, 1.1)
legend()
show()

# compute likelihoods for each of the 11 possible ratios of women:
likelihoods = []
for x in arange(0, 1.1, 0.1):
    likelihoods.append(x**3*(1-x)**7)

# plot unscaled posterior distribution:
figure()
bar(arange(0,1.1,0.1), array([1.0/11]*11)*array(likelihoods), width = 0.1, align = "edge", label = "unscaled posterior probability distribution")
xticks(arange(0,1.1,0.1) + 0.05, ["%.1f" % x for x in arange(0,1.1,0.1)] )
xlim(0, 1.1)
legend()
show()

# plot scaled posterior distribution:
figure()
bar(arange(0,1.1,0.1), array([1.0/11]*11)*array(likelihoods)/sum(array([1.0/11]*11)*array(likelihoods)), width = 0.1, align = "edge", label = "scaled posterior probability distribution")
xticks(arange(0,1.1,0.1) + 0.05, ["%.1f" % x for x in arange(0,1.1,0.1)] )
xlim(0, 1.1)
legend()
show()

Here’s the R code that Daniel Krasner wrote for these graphs:

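# uniform prior over the 11 possible ratios of women
prior = rep((1/11), 11)

barplot(prior, width = .1, col="blue", main = "prior probability distribution")

# likelihood of a 3-women, 7-men board under each ratio
likelihoods = c()
for (x in seq(0, 1.0, by = .1)) {
  likelihoods = c(likelihoods, (x^3)*((1-x)^7))
}

# prior times likelihood, before scaling
unscaled = prior*likelihoods

barplot(unscaled, width = .1, col="blue", main = "unscaled posterior probability distribution")

# scale so the posterior adds up to 1
barplot(unscaled/sum(unscaled), width = .1, col="blue", main = "scaled posterior probability distribution")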

Categories: data science, open source tools

Cora Sadosky

June 29, 2011 Cathy O'Neil, mathbabe 5 comments

I was looking through an old photo album (the kind where there are sticky pages and actual physical photos- it looks like an ancient technology now) and I came across one of my favorites of all time- a picture of me being embraced and supported by Cora Sadosky on one side and Barry Mazur on the other. This picture was taken in 1993 in Vancouver, where I received the Alice T. Schafer prize. It was a critical moment for me, and both of those people have influenced me profoundly. Barry became my thesis advisor; part of the reason I went into number theory was to become his student (the other part was this book).

Cora became my mathematical role model and spiritual mother. I already wrote earlier about how going to math camp when I was 14 changed my life and made me realize there is a whole community of math nerds out there and that I belonged to that nerd community. Well, Cora, whom I met when I was 21, was the person that made me realize there is a community of women mathematicians, and that I was also welcome to that world.

Actually it was something I didn’t even really want to know at the time. After all, I was happy to be a successful math undergraduate at UC Berkeley, frolicking in the graduate student lounge and partaking in tea every day at 3:00. Who cares that I was a woman? It seemed antiquated to me, almost crude, to mention my gender. When I got word that I’d won the prize, my reaction was essentially, “is there money?” (there was a bit).

And when I meet young women in math nowadays with that attitude, I am happy for them, really very happy for them. To live in that state of not caring what your gender is in mathematics is a kind of bliss, that lasts until the very moment it stops. My greatest wish for future generations of women in math is for that bliss to never stop.

And yet. I went to Vancouver and met Cora and learned about Alice Schafer and her struggles and successes as a trailblazer for women in math, and I felt really honored to be collecting an award in her name. And I felt honored to have met Cora, whose obvious passion for mathematics was absolutely awe-inspiring. She was the person who first explained to me that, as women mathematicians, we will keep growing, keep writing, and keep getting better at math as we grow older (unlike men who typically do their best work when they’re 29), and we absolutely have to maintain a purpose and a drive and fortitude for that highest call, the struggle of creation.

I kept up with Cora over the years. Every now and then she’d write to me and send me pushy little maternal notes reminding me to work hard and stay strong and productive. And I’d write to her with news of my life and my growing family and sometimes when I visited D.C. I’d meet her and we’d have lunch or dinner and talk about ideas and great books we’d read and how much we loved each other.

When I googled her this morning, I found out she’d died about 6 months ago. You can read about her difficult and inspiring mathematical career in this biography. It made me cry and made me think about how much the world needs role models like Cora.

Categories: women in math

Woohoo!

June 27, 2011 Cathy O'Neil, mathbabe 3 comments

First of all, I changed the theme of the blog, because I am getting really excellent comments from people but I thought it was too difficult to read the comments and to leave comments with the old theme. This way you can just click on the words “Go to comments” or “Leave a comment” which is a bit more self-evident to design-ignorant people like me. Hope you like it.

Next, I had a bad day today, but I’m very happy to report that something has raised my spirits. Namely, Jake Porway from Data Without Borders and I have been corresponding, and I’ve offered to talk to prospective NGOs about data, what they should be collecting depending on what kind of studies they want to be able to perform, and how to store and revise data. It looks like it’s really going to happen!

In fact his exact words were: I will definitely reach out to you when we’re talking to NPOs / NGOs.

Oh, and by the way, he also says I can blog about our conversations together as well as my future conversations with those NGOs (as long as they’re cool with it), which will be super interesting.

Oh, yeah. Can I get a WOOHOO?!?

Categories: data science, open source tools

Better risk modeling: motivating transparency

June 27, 2011 Cathy O'Neil, mathbabe Comments off

In a previous post, I wrote about what I see as the cowardice and small-mindedness of the U.S. government and in particular the regulators for not demanding daily portfolios of all large investors. Of course this goes for the governments in Europe as well, and especially right now. The Economist had a good article this past Friday which attempted to quantify the results of a Greek default, but there were major holes, especially in the realm of “who owns the CDS contracts on Greek bonds, and how many are there?”. This fear of the unknown is a root cause of the current political wrangling which will probably end in a postponement of resolving the Greek situation; the question is whether the borrowed time will be used properly or squandered.

It’s ridiculous that nobody knows where the risk lies, but as a friend of mine pointed out to me last week at lunch, it probably won’t be enough to demand the portfolios daily, even if you had the perfect quantitative risk model available to you to plug them into. Why? Because if “transparency” is what the regulators demand, then “transparency” is what they would get – in the form of obfuscated lawyered-up holding lists.

In other words, let’s say a bank has a huge pile of mortgage-backed securities of dubious value on their books, but doesn’t want to accept losses on them. If they knew they’d have to start giving their portfolio to the SEC daily instead of quarterly, it would change the rules of the game. They’d have to hide these holdings by pure obfuscation rather than short-term month- or quarter-end legal finagling. So for example, they could invest in company A, which invests in company B, which happens to have a bunch of mortgage-backed securities of dubious value, but which is too small to fall under the “daily reporting” rules. This is just an example but is probably an accurate portrayal of the kind of thing that would happen with enough lead time and enough lawyers.

What we actually want is to set up a system whereby banks and hedge funds are motivated to be transparent. Read this as: will lose money if they aren’t transparent, because that’s the only motivation that they respond to.

In some sense, as my friend reminded me, we don’t need to worry about hedge funds as much as about banks. This is because hedge funds do their trades through brokerages, which force margin calls on trades that they deem risky. In other words, they pay for their risk through margins on a trade-by-trade, daily basis. If you are thinking, “wait, what about LTCM? Isn’t that a hedge fund that got away with murder and almost blew up the system and didn’t seem to have large margins in place?” then the answer is, “yeah but brokers don’t get fooled (as much) by hedge funds anymore”. In other words, brokers, who are major players in the financial game, are the policemen of hedge funds.

There are two major limits to the above argument. Firstly, hedge funds purposefully use multiple brokers simultaneously so that nobody knows their entire book, so to the extent that risk of portfolio isn’t additive (it isn’t), this policing method isn’t complete. Secondly, it is only a local kind of risk issue- it doesn’t clarify risk given a catastrophic event (like a Greek default), but rather a more work-a-day “normal circumstances” market risk.
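To make the non-additivity point concrete, here is a small sketch, not a real risk model. It assumes risk is measured as the standard deviation of daily P&L, and every number in it (position sizes, the 2% volatilities, the 0.9 correlation) is made up for illustration; `book_risk` is a hypothetical helper, not something any broker actually uses.

```python
import numpy as np

def book_risk(positions, vols, corr):
    """Standard deviation of daily P&L for dollar positions, given vols and a correlation matrix."""
    exposures = np.asarray(positions) * np.asarray(vols)
    return float(np.sqrt(exposures @ np.asarray(corr) @ exposures))

# hypothetical market: two assets, each with 2% daily volatility, correlation 0.9
vols = [0.02, 0.02]
corr = [[1.0, 0.9], [0.9, 1.0]]

broker_a = [100.0, 0.0]       # broker A only sees a long position in asset 1
broker_b = [0.0, -100.0]      # broker B only sees a short position in asset 2
whole_book = [100.0, -100.0]  # what the fund actually holds

risk_a = book_risk(broker_a, vols, corr)
risk_b = book_risk(broker_b, vols, corr)
risk_total = book_risk(whole_book, vols, corr)

print("broker A sees:", risk_a)
print("broker B sees:", risk_b)
print("sum of the two views:", risk_a + risk_b)
print("risk of the whole book:", risk_total)
```

In this made-up case the whole book is much less risky than the two slices suggest, because the short hedges the long. Flip the sign of the second position and the book becomes nearly as risky as the sum. Neither broker can tell which case it is in from its own slice.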

Even so, what about the banks? Are there any brokers measuring the risk of their activities and investments? Since the banks are the brokers, we have to look elsewhere… I guess that would have to be at the government, and the regulators themselves, maybe the FDIC… in any case, people decidedly not players in the financial game, not motivated by pay-off, and therefore not prone to delving into the asperger-inspiring details of complicated structured products to search out lies or liberal estimates.

The goal then is to create a new kind of market which allows insiders to bet on the validity of banks’ portfolios. You may be saying, “hey isn’t that just the stock price of the bank itself?”, and to answer that I’d refer you to this article which does a good job explaining how little information and power is actually being exercised by stockholders.

I will follow up this post with another more technical one where I will attempt to describe the new market and how it could (possibly, hopefully) function to motivate transparency of banks. But in the meantime, feel free to make suggestions!

Categories: finance, hedge funds, news
