Causal versus causal


October 16, 2012 Cathy O'Neil, mathbabe

Today I want to talk about the different ways the word “causal” is thrown around by statisticians versus finance quants, because it’s both confusing and really interesting.

But before I do, can I just take a moment to be amazed at how pervasive Gangnam Style has become? When I first posted the video on August 1st, I had no idea how much of a sensation it was destined to become. Here’s the Google trend graph for “Gangnam” versus “Obama”:

It really hit home last night as I was reading a serious Bloomberg article on the economic implications of Gangnam Style whilst the song was playing in the background at the playoff game between the Cardinals and the Giants.

Back to our regularly scheduled program. I’m first going to talk about how finance quants think about “causal models” and second how statisticians do. This has come out of conversations with Suresh Naidu and Rachel Schutt.

Causal modeling in finance

When I learned how to model causally, it meant something very simple: I never used “future information” to make a prediction about the future. I strictly used information from the past, and only information that was actually available to me at the time, to make predictions about the future. In other words, as I trained a model, I always had in mind a timestamp for the “present time,” and every piece of data I used had a timestamp of availability before that present time, so that I could use it to make a statement about what I thought would happen after that present time. If I did this carefully, then my model was termed “causal.” It respected time, and in particular it didn’t have great-looking predictive power just because it was peeking ahead.
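To make the no-peeking rule concrete, here is a minimal Python sketch (an illustration only, not code from any actual trading system). One easy way to peek ahead by accident is to normalize a series with a mean and standard deviation computed over the whole sample. The helper names `causal_zscores` and `lookahead_zscores`, the `min_history` parameter, and the toy series are all assumptions made up for this example.

```python
from statistics import mean, pstdev


def causal_zscores(values, min_history=2):
    """Z-score each value using only observations strictly before it."""
    scores = []
    for t, value in enumerate(values):
        history = values[:t]  # everything with an earlier "timestamp"
        if len(history) < min_history:
            scores.append(None)  # not enough past data to normalize yet
            continue
        mu, sigma = mean(history), pstdev(history)
        scores.append((value - mu) / sigma if sigma > 0 else 0.0)
    return scores


def lookahead_zscores(values):
    """Z-score using the full sample: this peeks at the future."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]


# Toy series (hypothetical data): the last point is a surprise.
series = [1.0, 2.0, 3.0, 4.0, 100.0]
causal = causal_zscores(series)
peeking = lookahead_zscores(series)

# Re-run both on the series as it looked before the surprise arrived.
causal_before = causal_zscores(series[:4])
peeking_before = lookahead_zscores(series[:4])
```

In this sketch the causal scores for the first four points are the same whether or not the fifth point has arrived, while the full-sample scores for those same points change once it does. That change is the signature of a model that has been looking at the future.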

Causal modeling in statistics

By contrast, when statisticians talk about a causal model, they mean something very different. Namely, they mean a model that shows that something caused something else to happen. For example, if we saw that certain plants in a certain soil all died but those in a different soil lived, they’d want to know if the soil caused the death of the plants. To answer this kind of question in an ideal situation, statisticians set up randomized experiments where the only difference between the treatments is that one condition (i.e. the type of soil, but not how often you water the plants or the type of sunlight they get). When they can’t set it up perfectly (say because it involves people dying instead of plants) they do the best they can.
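A small simulation can show why the random assignment matters. This is a sketch under invented assumptions: the survival probabilities, the hidden watering variable, and the function name `survival_gap` are all hypothetical. When soil is assigned by coin flip, the gap in survival rates should land near the soil effect built into the simulation. When well-watered plants systematically end up in the good soil, the same comparison overstates it.

```python
import random

TRUE_SOIL_EFFECT = 0.3  # assumed boost in survival probability from good soil


def survival_gap(n, randomized, seed=0):
    """Survival rate in good soil minus survival rate in bad soil."""
    rng = random.Random(seed)
    survived = {True: 0, False: 0}
    count = {True: 0, False: 0}
    for _ in range(n):
        water = rng.random()  # hidden factor in [0, 1) that also helps plants
        if randomized:
            good_soil = rng.random() < 0.5  # coin flip, unrelated to watering
        else:
            good_soil = water > 0.5  # well-watered plants get the good soil
        p_survive = 0.2 + TRUE_SOIL_EFFECT * good_soil + 0.4 * water
        alive = rng.random() < p_survive
        count[good_soil] += 1
        survived[good_soil] += alive
    return survived[True] / count[True] - survived[False] / count[False]


randomized_gap = survival_gap(20000, randomized=True)
confounded_gap = survival_gap(20000, randomized=False)
print(f"randomized: {randomized_gap:.2f}, confounded: {confounded_gap:.2f}")
```

The only thing that changes between the two runs is how plants get their soil, which is exactly the part of the experiment the statistician is trying to control.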

The differences and commonalities

On the one hand, both concepts refer to and depend on time. There’s no way X caused Y to happen if X happened after Y. But whereas in finance we only care about time, in statistics there’s more to it.

So for example, if there’s a third underlying thing that causes both X and Y, but X happens before Y, then the finance people are psyched because they have a way of betting on the direction of Y: just keep an eye on X! But the statisticians are not amused, since there’s no way to prove causality in this case unless you get your hands on that third thing.
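Here is a toy version of that situation in Python, offered as a sketch with made-up numbers. A hidden variable `z` drives both `x` (seen first) and `y` (seen later), and `x` has no effect on `y` at all. The function names `simulate` and `hit_rate`, the noise levels, and the sample size are assumptions for illustration.

```python
import random


def simulate(n, intervene, seed=0):
    """Draw (x, y) pairs where a hidden z causes both; x never affects y."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)  # the third underlying thing
        x = z + rng.gauss(0, 0.5)  # observed first
        if intervene:
            x = rng.gauss(0, 1)  # we set X ourselves, ignoring z
        y = z + rng.gauss(0, 0.5)  # observed later, depends on z only
        xs.append(x)
        ys.append(y)
    return xs, ys


def hit_rate(xs, ys):
    """Fraction of the time the sign of x matches the sign of the later y."""
    return sum((x > 0) == (y > 0) for x, y in zip(xs, ys)) / len(xs)


observed = hit_rate(*simulate(20000, intervene=False))
intervened = hit_rate(*simulate(20000, intervene=True))
print(f"watching X: {observed:.2f}, forcing X: {intervened:.2f}")
```

Watching X gives a hit rate well above a coin flip, which is all the finance quant needs. Forcing X to a value of our choosing brings the hit rate back to about one half, which is why the statistician refuses to say X causes Y.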

Although I understand wanting to know the underlying reasons things happen, I have a personal preference for the finance definition, which is just plain easier to understand and test, and usually the best we can do with real world data. In my experience the most interesting questions relate to things that you can’t set up experiments for. So, for example, it’s hard to know whether blue-collar presidents would impose less elitist policy than millionaires, because we only have millionaires.

Moreover, it usually is interesting to know what you can predict for the future knowing what you know now, even if there’s no proof of causation, and not only because you can maybe make money betting on something (but that’s part of it).

Categories: data science, statistics

Comments (43)

JSE

October 16, 2012 at 7:47 am

Your reasons for preferring the first thing to the second thing make sense, but I think it’s weird to call the first thing “causal.” Isn’t there something else you could call it?

Zubin

October 16, 2012 at 8:33 am

Forecasting

- Cathy O'Neil, mathbabe
  
  October 16, 2012 at 8:38 am
  
  Most of the forecasts people talk about aren’t causal in the financial sense. In particular they normalize their data with non-causal estimates of standard deviation and/or mean of variables.
  
  - Zubin
    
    October 16, 2012 at 10:57 am
    
    Not sure what you mean by noncausal (using future periods to calculate mean?), but your overall theme of financial causality is well aligned with Granger causality.
    
    While easier to understand, it is unfortunately not sufficient in areas where people need to go beyond prediction and instead need to understand the structure of the system generating the data.
    
Mary O'Keeffe

October 16, 2012 at 9:17 am

Public policy folks care passionately about identifying causality in the second sense of the word because they care about *changing the world for the better*, so they want to identify the causal levers to make that happen. (A great example: Esther Duflo, the director of MIT’s global Poverty Action Lab.)

Finance folks, by contrast, are perfectly content to have the world go to hell in a handbasket, as long as they can make winning bets on the direction in which it is going. They have some nerve calling their forecasting models “causal”!

- michaelkleber
  
  October 16, 2012 at 9:46 am
  
  Yes, this! But Cathy, surely finance folks also want to know whether their models are “really” predictive (sense 2), because only then can you hope to change X and have Y vary as a result. Right? I mean, if you’re making trades that are large enough that the fact that you made a trade is going to change market conditions, isn’t knowing this vital?
  
Jonathan

October 16, 2012 at 9:20 am

I agree that “causal” should only be used in the second sense. Yes, you can make a lot of money on models that predict but causal implies something more than that.

“Predictive” would be much better for the first meaning.

mathematrucker

October 16, 2012 at 9:33 am

Under the safe assumption that

“there is always some pervasive pop act”

the pervasiveness of Gangnam Style is not amazing at all. (What mostly amazes me is that you caught it so early – perhaps having children helps with that.)

Gangnam Style’s main distinction from its predecessors is geographical. Despite major differences in style and genre, Psy’s rapid ascent to global stardom sort of reminds me of Kurt Cobain’s.

charles sereno

October 16, 2012 at 12:03 pm

Medieval scholastics were also concerned about causality and time but not so much about making money in the market. They had a problem. Aristotle (THE PHILOSOPHER) claimed the world was infinite (always existed). Biblical accounts said otherwise. Their 13th century arguments pro and con (Aquinas vs. Bonaventure) are quaint but still interesting today.
PS: A suggestion to distinguish the so-called Nobel Prize in Economics from the other Nobel Prizes — Uncapitalize “Nobel,” as in “nobel” Prize.

- Jonathan
  
  October 16, 2012 at 12:11 pm
  
  How about nouveau Nobel (it was started much later) or Riksbank Nobel?
  
  - charles sereno
    
    October 16, 2012 at 12:17 pm
    
    Maybe just putting Nobel in quotes (sarcastic) would do.
    
Michael A. Lewis

October 16, 2012 at 12:28 pm

I’m neither a finance expert nor a statistician but a quantitative social scientist. I guess this makes me something of an “applied statistician.” I have two comments. First, there is a large literature in statistics and increasingly the social sciences on claims of causality, in the statistician’s sense, based on observational data (instead of the randomized experiments you refer to). Observational data is, of course, the type one typically finds in finance. Judea Pearl and others have a lot to say about this that you might find interesting. Second, near the end of your post you use the phrase “proof of causality.” But if a statistician is being careful I don’t think they would ever claim that a study, even one based on a randomized experiment, proves causality, assuming the word “proof” is being used the way it is in mathematics.

- Cathy O'Neil, mathbabe
  
  October 16, 2012 at 4:50 pm
  
  Sorry I shouldn’t have been so casual with the word proof. I meant they’d have no evidence. Thanks!
  
Lou Puls (@MonkeeRench)

October 16, 2012 at 12:33 pm

Cathy O’Neil, mathbabe :
Most of the forecasts people talk about aren’t causal in the financial sense. In particular they normalize their data with non-causal estimates of standard deviation and/or mean of variables.

…and nobody has yet proven that financial transactions and their statistics have any realistic connection with the normal (gaussian) probability structures used to justify statistical models based on such convenient, linearized, simple-minded badmath.

charles sereno

October 16, 2012 at 4:36 pm

As briefly as possible, old geezer catching up with your Gangnam refs — An insult to Korean girls with “daikon” legs, Jews with oversize noses, almost an infinity of others. Scented shit. Fare well.

- Cathy O'Neil, mathbabe
  
  October 16, 2012 at 4:51 pm
  
  Charles, I interpret the video as satire. It’s enjoyable and catchy if you do that, or at least I think so.
  
  - charles sereno
    
    October 16, 2012 at 4:55 pm
    
    OK. Personally, I get annoyed. Yes, catchy.
    
  - mathematrucker
    
    October 18, 2012 at 2:54 pm
    
    I think so too. Pop is pop.
    
    - mathematrucker
      
      October 18, 2012 at 2:58 pm
      
      (Should have started that reply with @Cathy – I was agreeing with her that the vid is enjoyable and catchy.)
      
N-Sphere Thingy

October 17, 2012 at 11:23 pm

“There’s no way X caused Y to happen if X happened after Y.”

John Kramer’s Fanclub disagrees!!

araybold

October 18, 2012 at 2:47 pm

This caused me to think back to Greenspan’s testimony before Congress after the 2008 debacle: “The models just stopped working”, or words to that effect. Eighteen months later, he came out with a book with all sorts of explanations as to why his world view was right all along, but at the time he was nonplussed.

There’s a superficiality to the understanding that ‘financial causality’ gives, and could that be part of the reason why economists, as a group, seem incapable of predicting even big changes that are just around the corner?

- N-Sphere Thingy
  
  October 18, 2012 at 10:15 pm
  
  Who can predict changes all that much better though?
  
  Did “regular” folks see the real estate bubble? If they did, why are there so many with underwater mortgages and crushing debt burdens? Why did they let real estate become their only source of wealth? Why did home sales keep rising even after real estate price growth exceeded record highs if it was so obvious we were in a bubble? Did thousands of aunts and uncles make millions off of shorting CDO and MBS?
  
  For that matter, how many non-economists, out of the millions of people active in the US economy, took precise steps to prepare for this calamity 4/5 years in advance? After all, it’s easy to see things coming right? Well I am semi-confident that I can tell you with near certainty: a very, very tiny fraction. Because if more did, the bubble would have popped a lot sooner.
  
  Going back further- why didn’t factory workers start leaving their jobs for computer science programs and MBAs in droves, given that they saw the massive lay-offs that were coming their way in the 2000s?
  
  Why didn’t more real estate workers move into healthcare and education? After all, it was obvious that real estate was going to be in trouble and healthcare/education were going to be booming 10 years ago right?
  
  Why do people complain about their employers being bought out by private equity funds and there being lay-offs when the future is so easy to predict? They should have been prepared in advance.
  
  Why are the Florange workers in France so upset? Didn’t they realize a few years ago that ArcelorMittal was going to buy their factory and shut it down? They should have started saving up for this inevitability!
  
  For that matter, let’s nail your colors to the mast. What will happen in 5 years? Where will GDP be for each OECD country, where will real estate prices be in each state, where will relative employment shares be, what new technologies will be in mass production, what will be the hourly wage in manufacturing?
  
- N-Sphere Thingy
  
  October 18, 2012 at 10:35 pm
  
  BTW I’m not trying to be a jerk, it’s just that economists are very intelligent and if we want to claim that others can just see things that they can’t (or if economists are just totally oblivious to obvious things), we need to justify that claim very carefully.
  
  I myself am not an economist but I know most of them are a lot smarter than me and think a lot more consistently than I do.
  
  - araybold
    
    October 19, 2012 at 8:53 am
    
    I am amazed, firstly by how thoroughly you have misread my short comment (nowhere do I say it was obvious beforehand that the events of 2008 were coming, let alone that “regular” folk were better at foreseeing it than economists), and secondly by your belief that it is unreasonable to expect economists to do any better than “regular” folk in foreseeing the immediate economic future, or to notice a major developing instability in economic affairs.
    
    I do not think it is controversial to say that, with some notable exceptions, the economic community was largely blindsided by the way events unfolded in 2008 – Greenspan’s comments at the time seem to make this clear. Compare this to the way climate scientists have steadily built up a compelling case for the existence and risks of anthropogenic climate change, even before the evidence was clear. It is not unreasonable to speculate about what makes economic forecasting so difficult.
    
    There is a certain irony in your statement “we need to justify that claim very carefully” about something I did not even suggest!
    
    - N-Sphere Thingy
      
      October 19, 2012 at 6:45 pm
      
      In that case, I have no idea what you are saying. What is it you are suggesting, exactly?
      
      First of all, you talked of “big changes just around the corner”. That sure sounds like you thought the events of 2008 were obvious and obviously coming. If that is NOT what you are saying, then what’s your point?
      
      If you are saying that “big changes just around the corner” are NOT obvious (not sure why you used that expression, if that’s the case) then why are you so surprised that economists were “blindsided” by them?
      
      You need to make up your mind. Were the changes, being big and just around the corner, easy to predict or were they not? If they were so easy to predict, you need to explain why so few other people saw them coming. If they weren’t so easy to predict, then why are you so shocked that economists didn’t predict them?
      
      Either way, it doesn’t really make sense.
      
      Second, with what precision do you expect economic predictions to be made? What is your benchmark, exactly? Because some models do a decent job of predicting next-quarter GDP and unemployment. Is that your target?
      
      Third, the thing about climate change is really hard to reconcile. What is its role in your argument?
      
      Economists have built up lots and lots of theories that predict a lot of phenomena very successfully (conditional convergence, capital flows, etc.).
      
      Why aren’t you comparing those to the climate change theories instead?
      
      What is it about the sudden shock in 2008 that makes it more akin to climate change than any of the other economic theories and fields?
      
      How about instead we compare the accuracy with which economists predict recessions with the accuracy with which geologists can predict earthquakes?
      
      That seems a lot more reasonable. Why aren’t you upset that after decades of research, we still have no way of accurately predicting the magnitude and location of major earthquakes?
      
      I am sorry, I just don’t think you have any kind of specific complaint, just a general gripe about predictive failure that exists in every science that deals with complex systems.
      
    - N-Sphere Thingy
      
      October 19, 2012 at 7:00 pm
      
      And just to reinforce the point about short-term predictive power in other sciences…
      
      Why aren’t you concerned that geneticists can’t tell us when we’ll get cancer?
      
      Why aren’t you concerned that epidemiologists failed to predict the extent to which HIV would spread in different populations?
      
      What about the failure of volcanologists to predict the behavior of Eyjafjallajökull? Or for that matter, the gross difficulties of accurately predicting volcano eruptions’ magnitude and timing in general?
      
      People have some kind of crazy double standard. When other sciences’ predictive power is limited by complexity, it’s all good. No problemo. When economists’ predictive power is limited by complexity, it’s a sign of a fatal flaw that requires us to uproot the entire approach to the science.
      
    - mathematrucker
      
      October 19, 2012 at 11:10 pm
      
      Re 2008, Barron’s published a short article circa late 2005 (that’s a guess – I do know it was relatively early though) that strongly suggested we were in the midst of a classic real estate bubble. Still relatively new to subjects financial, I read it carefully, fascinated by the accompanying graph and what the article was saying.
      
      Naturally the article predicted bust – since that’s what bubbles do – but whether it even hinted at any likelihood of calamity, I don’t remember. (One thing I do know is, regrettably, I didn’t use the information to place any bets.)
      
    - N-Sphere Thingy
      
      October 19, 2012 at 11:18 pm
      
      So you’re saying at least 1 author (or even 2) out of 170 million working Americans nailed it? Did they get the extent of the recession? The year it would start? How much wealth would be lost? In what sectors? You don’t think that speaks more to the difficulty of predicting these things than anything else?
      
    - mathematrucker
      
      October 20, 2012 at 12:58 am
      
      Extremely few people nailed real early what was going on. Michael Burry did (and no, Dr. Greenspan, it’s pretty obvious he didn’t “just get lucky”.)
      
      I mainly just mentioned the Barron’s article since its take was both unusual at that time, and, as time told, very accurate.
      
      But let me sort of toot my own horn here for a moment. As evening was approaching on Sunday September 14th, 2008, I was chatting with a close childhood friend of mine and his dad at a family picnic of theirs (held at his sister’s place) on Whidbey Island, WA. My friend, a partner in a Florida CPA firm, had traveled to Seattle a couple days before to attend our 30-year high school reunion. His dad, a retired CEO, had a distinguished career in the Alaska canning industry. Both are longtime, staunchly pro-business Republicans.
      
      Since as a trucker I don’t get to chat with guys with this kind of cred all that often (except on here!), when a pause came into the conversation, I couldn’t help but ask both of them – in a genuinely worried tone – “so, what do you think’s going to happen with the economy?”
      
      This would have been right around 6 PM Pacific, 9 PM Eastern. According to Wikipedia, Lehman filed 4 hours and 45 minutes later.
      
      My friend’s dad replied. “Do you know the difference between a recession and a depression?” Since the question was obviously a setup, after some hesitation I finally went along and said “No – what is it?” I’m sure everyone here’s probably heard this before (I hadn’t at the time), but the punch line is, “A recession is when my neighbor is out of a job. A depression is when I’M out of a job!”
      
      He told it well and the three of us immediately broke into laughter.
      
      Considering what was going on in New York as we spoke, laughed, and sipped our wine, I’ve often thought about that brief portion of our conversation as something that would probably fit in perfectly as a little side scene in a movie about the meltdown.
      
    - N-Sphere Thingy
      
      October 20, 2012 at 2:01 am
      
      Okay, how does any of that address the topic?
      
      Which, at this point, is itself uncertain but to me boils down to this: everyone seems to have a complaint about economics’ predictive power but no one I know actually knows what exactly their complaint amounts to or why they even have it.
      
      It goes like this: “Economics didn’t predict the crisis!”
      “Okay…”
      “Other sciences predict stuff.”
      “Well no, that’s just what you’d be led to believe if your understanding of science is based on what is in the media and pop literature. Other sciences also get things wrong most of the time. In fact, even in the purest sciences there are serious disagreements about fundamental topics, methodology, and…”
      “I saw the crisis coming though!”
      “Why didn’t you make millions off of shorting CDOs?”
      “Nouriel Roubini predicted the crisis.”
      “Okay, that’s one guy out of thousands or millions of non-economists. Do you understand that just by pure chance it is quite likely that one guy out of millions may get things right and…”
      “Economics didn’t predict the crisis!”
      “…”
      
    - mathematrucker
      
      October 20, 2012 at 2:06 pm
      
      Perhaps my undergraduate analysis professor had it right when we were discussing the fundamental theorem of calculus and for comic relief, he also tossed in the fundamental theorem of economics: “Nobody knows what the hell is going on.”
      
    - mathematrucker
      
      October 20, 2012 at 2:41 pm
      
      N-Sphere Thingy, for the record, I don’t really disagree with any of your points. In general it’s really hard to predict anything at all with much precision.
      
      But sometimes you don’t need much precision to succeed. It’s pretty obvious Michael Burry didn’t “just get lucky”. His case is well documented in a Michael Lewis book and 60 Minutes interview. This guy really did his homework. He put lots of money on the line based on what he saw clearly – because he really did his homework – was going on with mis-rated CDO’s.
      
      His hard work was rewarded so well, he will never again have to deal with doubting, complaining customers in his life.
      
    - araybold
      
      October 20, 2012 at 9:08 am
      
      Mathematrucker, that is interesting. Is it possible that Nouriel Roubini wrote the article, or that it was a precis of what Roubini had said in Fortune and elsewhere?
      
      I don’t think the existence, and eventual end, of the housing bubble was a big surprise, but the consensus seems to have been that the fallout would be mild and short. It was the cascading contagion that blindsided mainstream economics, and that was what Greenspan was referring to in his comments to Congress.
      
      Roubini, for one, was making a cogent case for there being a coming calamity, so it is reasonable to ask what factors prevented this reasoning from penetrating the mainstream. One possibility is an undue faith in other models, models that turned out to be wrong, as Greenspan acknowledged.
      
    - mathematrucker
      
      October 20, 2012 at 2:17 pm
      
      It may have been Roubini who authored the article. It tempts me to resubscribe just so I can search and find it in the online archives. It only took up somewhere between a quarter and a half of the page, and included a small graph displaying statistics that made the article’s main thesis seem indisputable.
      
      Maybe my memory has, with hindsight, embellished some of its predictive power. What makes the memory of it so personally significant for me is I really took note of it at the time, without acting on it in any financially beneficial way.
      
    - mathematrucker
      
      October 20, 2012 at 3:28 pm
      
      One last comment: my hunch is that using any “model” whatsoever (with any more than just a tiny amount of sophistication) is probably nowhere near to being necessary to make accurate predictions about lots of things.
      
      During the Asian financial crisis of the late 1990s, about all my mostly uneducated (about booms and busts, at the time) mind took away from the news about what caused it, can be summed up in just two words: “bad loans”.
      
      So if you can call it one, my own personal, very unsophisticated “model” of the Asian financial crisis was: “bad loans = financial collapse”.
      
      Looking back at it now, that model is all one would really have needed to make fairly accurate predictions about the meltdown.
      
    - araybold
      
      October 21, 2012 at 10:59 am
      
      The librarian at any public library would probably be pleased to help you find it.
      
    - mathematrucker
      
      October 21, 2012 at 11:24 am
      
      You’re right. By golly I’ll take a shot at it and report back. Las Vegas isn’t known for its academics, but I’ve found the public library system (Las Vegas-Clark County) here amazingly useful. In particular it has an excellent interlibrary loan department.
      
    - mathematrucker
      
      October 21, 2012 at 4:37 pm
      
      The LV library unfortunately doesn’t keep Barron’s going back very far at all, except they do have microfilm going up to 2004, which alas is not quite recent enough. For now I’m holding off on pursuing this any further.
      
Asymptosis

October 21, 2012 at 11:38 am

It gets even better when you start dealing with human expectations of the future. A future event that’s sufficiently predictable can actually cause that event (or some other event) to happen today. See this recent post by Scott Sumner (which I’m not endorsing, just pointing to):

http://www.themoneyillusion.com/?p=17115

nilakantan n s

November 11, 2012 at 12:41 am

your reason for preferring one over the other is not correct. in fact, there is a lot of work carried out by various researchers treating time as one of the underlying variables and establishing causality. leading work has been carried out by Granger and others. once causality is established you can no longer depend upon time as the causal variable and the prediction power definitely comes down.

isomorphismes

November 16, 2012 at 1:38 pm

So for example, if there’s a third underlying thing that causes both X and Y, but X happens before Y, then the finance people are psyched because they have a way of betting on the direction of Y: just keep an eye on X!

Right, but if you don’t understand the mechanism for why X causes Y, then all of a sudden the correlation may disappear.

- Cathy O'Neil, mathbabe
  
  November 16, 2012 at 1:45 pm
  
  By that time my bonus will be in!
  
  - isomorphismes
    
    November 16, 2012 at 2:18 pm
    
    ha!
    