
We’re not just predicting the future, we’re causing the future

October 23, 2012

My friend Rachel Schutt, a statistician at Google who is teaching the Columbia Data Science course this semester (the one I've been blogging every Thursday morning), recently wrote a blog post about 10 important issues in data science, and one of them is the title of my post today.

This idea that our predictive models cause the future is part of the modeling feedback loop I blogged about here: once we've chosen a model, especially one that models human behavior (which includes the financial markets), people immediately start gaming it in one way or another, both weakening the effect the model is predicting and distorting the system itself. This is important and often overlooked when people build models.

How do we get people to think about these things more carefully? I think it would help to have a checklist of the properties a well-designed model should have.

I got this idea recently as I've been writing a talk about how math is used outside academia (which you guys have helped me with). In it, I give a bunch of examples of models along with a few basic properties of well-designed models.

It was interesting just composing that checklist, and I’ll likely blog about this in the next few days, but needless to say one thing on the checklist was “evaluation method”.

Obvious point: if you have a model with no well-defined evaluation method, then you're fucked. In fact, I'd argue, you don't really even have a model until you've chosen and defended your evaluation method (I'm talking to you, value-added teacher modelers).

But what I now realize is that part of the evaluation method of the model should consist of an analysis of how the model can or will be gamed and how that gaming can or will distort the ambient system. It’s a meta-evaluation of the model, if you will.

Example: as soon as regulators agree to measure a firm's risk with 95% VaR using a 0.97 decay factor, there are all sorts of ways for companies to hide risk. That's why the parameters (95, 0.97) cannot be fixed if we want a reasonable assessment of risk.
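To make the example concrete, here's a minimal sketch of an exponentially weighted historical VaR calculation in Python. The function name, the parameters, and the fake daily_pnl history are all mine for illustration; this is one common way such a number gets computed, not anyone's production risk engine.

```python
import numpy as np

def ewma_var(pnl, confidence=0.95, decay=0.97):
    """Exponentially weighted historical VaR (illustrative sketch).

    pnl: 1-d array of daily P&L, oldest observation first.
    The decay factor down-weights older days, so anyone who knows the
    fixed parameters knows exactly which days matter and by how much.
    """
    pnl = np.asarray(pnl, dtype=float)
    ages = np.arange(len(pnl) - 1, -1, -1)   # most recent day has age 0
    weights = decay ** ages
    weights /= weights.sum()

    # Sort P&L from worst to best, carry the weights along,
    # and read off the (1 - confidence) weighted quantile.
    order = np.argsort(pnl)
    cum_weights = np.cumsum(weights[order])
    idx = np.searchsorted(cum_weights, 1.0 - confidence)
    return -pnl[order][idx]                  # VaR reported as a positive loss

# Fake fat-tailed history, just to have something to run on.
daily_pnl = np.random.standard_t(df=3, size=500) * 1e6
print(ewma_var(daily_pnl))                   # 95% VaR with decay 0.97
```

Once the (0.95, 0.97) pair is carved in stone, a desk knows exactly which tail observations drive the number and how quickly yesterday's blowup rolls out of the weights.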

This is obvious to most people upon reflection, but it's not systematically studied, because it's not required as part of an evaluation method for VaR. A reasonable evaluation method for VaR is indeed to ask whether the 95% loss level is breached only 5% of the time, but that clearly doesn't tell the whole story.
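Here's what that basic backtest amounts to, in the same hedged spirit (the names and conventions are mine): count the days on which the realized loss was worse than the VaR the model reported, and compare the breach rate to 5%.

```python
import numpy as np

def breach_rate(realized_pnl, var_forecasts):
    """Fraction of days on which the realized loss exceeded the VaR forecast.

    realized_pnl[i] is the P&L on day i; var_forecasts[i] is the (positive)
    VaR the model reported before day i started.
    """
    realized = np.asarray(realized_pnl, dtype=float)
    var = np.asarray(var_forecasts, dtype=float)
    breaches = realized < -var       # a loss worse than the forecast
    return breaches.mean()
```

For a 95% VaR the breach rate should come out near 5%: much higher and the model understates risk, much lower and it's too conservative. Neither outcome tells you whether the number is being gamed.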

One easy way to get around this is to require a whole range of confidence levels for VaR as well as a whole range of decay factors. It's not that much more work, and it is much harder to game. In other words, it's a robustness measurement for the model.
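A sketch of what I mean, reusing the hypothetical ewma_var function and daily_pnl history from the first snippet: sweep a grid of confidence levels and decay factors instead of pinning the model to one point.

```python
# Sweep the parameters instead of fixing (95%, 0.97). A book that looks
# fine at the official point on the grid but blows up at neighboring
# points is a hint that positions were tuned to the fixed parameters.
confidences = [0.90, 0.95, 0.99]
decays = [0.94, 0.97, 1.00]          # 1.00 means equal weighting

for c in confidences:
    for d in decays:
        print(f"VaR({c:.0%}, decay={d:.2f}) = {ewma_var(daily_pnl, c, d):>14,.1f}")
```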

Categories: data science, finance, rant
  1. JP Onstwedder
    October 23, 2012 at 10:43 am

Which is why, certainly at the bank where I work, internal and regulatory evaluations of risk models cover the entire range of confidence levels, historical sample periods, etc., and are far more realistic than the simple backtesting we used to do many years ago. We're not complete idiots, you know 🙂


    • October 23, 2012 at 1:24 pm

      Cool. Next stop: stop pretending returns are normally distributed, especially for credit instruments, and use 2008 data in your history to get a good sense of what could go bad. Oh, and take into account interest rate risk and counter-party risk.


  2. JP Onstwedder
    October 25, 2012 at 7:53 am

Hate to disappoint you… but we make adjustments for non-normality, although I suspect the way we do that (measure the 99th percentile from long-run data and work out the multiplier to the number of standard deviations from a normal distribution) isn't terribly clever. I don't work on traded credit so I can't speak to that. But we do have counter-party risk in our risk numbers, just not as part of market risk; instead we run extensive stress tests on market factors to see how counter-party exposure changes – the hard part is assessing the correlation between the two, as we generally don't have enough information to do so.

What's your best suggestion for an improvement on MC-based VaR, by the way? For, say, commodities markets…

