Just to be clear, if I’m a hedge fund who owns Greek bonds right now, and say I’ve hedged my exposure using CDSs, then why the fuck would I go along with a voluntary write-down of Greek debt??
From my perspective, if I do go along with it, I lose an asston of money on my bonds and my CDSs don’t get triggered because the write-down is considered “voluntary”. If I don’t go along with it, and enough other hedge funds also don’t go along with it, I either get paid in full or the CDSs I already own get triggered and I get paid in full (unless the counterparty who wrote the CDS goes under, but there’s always that risk).
Bottom line: I don’t go along with it.
None of this political finagling will change my mind. No argument for the stability of the European Union will change my mind. In fact, I will feel like arguing, hey if you force an involuntary voluntary write-down, then you are essentially making the meaning of CDS protection null and void. This is tantamount to ignoring legal contracts. And I’d have a pretty good point.
How’s this: let this shit go down, and start introducing a system that works, with a CDS market that is either reasonably regulated or nonexistent.
In the meantime, if I’m a Greek citizen, I’m wondering if I’ll ever be living in a country that has a consistent stock of aspirin again.
When you are modeling for the sake of real-time decision-making you have to keep updating your model with new data, ideally in an automated fashion. Things change quickly in the stock market or the internet, and you don’t want to be making decisions based on last month’s trends.
One of the technical hurdles you need to overcome is the sheer size of the dataset you are using to first train and then update your model. Even after aggregating your model with MapReduce or what have you, you can end up with hundreds of millions of lines of data just from the past day or so, and you’d like to use it all if you can.
The problem is, of course, that over time the accumulation of all that data is just too unwieldy, and your Python or Matlab or R script, combined with your machine, can’t handle it all, even with a 64-bit setup.
Luckily with exponential downweighting, you can update iteratively; this means you can take your new aggregated data (say a day’s worth), update the model, and then throw that data away altogether. You don’t need to save the data anywhere, and you shouldn’t.
As an example, say you are running a multivariate linear regression. I will ignore bayesian priors (or, what is an example of the same thing in a different language, regularization terms) for now. Then in order to have an updated coefficient vector $\beta$, you need to update your “covariance matrix” $X^{\top} X$ and the other term $X^{\top} y$ (which must have a good name but I don’t know it) and simply compute
$$\beta = (X^{\top} X)^{-1} X^{\top} y.$$
So the problem simplifies to, how can we update $X^{\top} X$ and $X^{\top} y$?
As I described before in this post for example, you can use exponential downweighting. Whereas before I was expounding on how useful this method is for helping you care about new data more than old data, today my emphasis is on the other convenience, which is that you can throw away old data after updating your objects of interest.
So in particular, we will follow the general rule in updating an object $T$ that it’s just some part old, some part new:
$$T(t+1) = \lambda T(t) + (1-\lambda) T(t, t+1),$$
where by $T(t)$ I mean the estimate of the thing $T$ at time $t$, and by $T(t, t+1)$ I mean the estimate of the thing $T$ given just the data between time $t$ and time $t+1$.
The speed at which I forget data is determined by my choice of $\lambda$ and should be determined by the market this model is being used in. For example, currency trading is fast-paced, and long-term bonds not as much. How long does it take the market to forget news or to acclimate to new news? The same kind of consideration should be used in modeling the internet. How quickly do users change their behaviors? This could depend on the season as well: things change quickly right after Christmas shopping season is done compared to the lazy summer months.
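One way to make the choice of the forgetting parameter concrete is to think in terms of a half-life: the number of update steps after which a chunk of data carries half the weight it had when it arrived. The half-life framing, the function names, and the numbers below are assumptions made for illustration; this is a minimal sketch, not a recommendation for any particular market.

```python
def decay_from_half_life(half_life_steps):
    """Return lam such that lam ** half_life_steps == 0.5."""
    return 0.5 ** (1.0 / half_life_steps)


def update(old_estimate, new_estimate, lam):
    """Some part old, some part new: lam * old + (1 - lam) * new."""
    return lam * old_estimate + (1.0 - lam) * new_estimate


# Hypothetical half-lives, chosen only to contrast fast and slow forgetting.
fast_lam = decay_from_half_life(2)    # forgets quickly
slow_lam = decay_from_half_life(60)   # forgets slowly

# Feed the same value four times; the estimate climbs toward it
# as the initial 0.0 is forgotten.
estimate = 0.0
for daily_value in [1.0, 1.0, 1.0, 1.0]:
    estimate = update(estimate, daily_value, fast_lam)
```

With a half-life of 2 steps, the starting value has lost three quarters of its weight after 4 updates; with a half-life of 60 it would barely have moved.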
Specifically, I want to give an example of this update rule for the covariance matrix $X^{\top} X$, which really isn’t a true covariance matrix because I’m not scaling it correctly, but I’ll ignore that because it doesn’t matter for this discussion.
Namely, I claim that after updating $X^{\top} X$ with the above exponential downweighting rule, I have the covariance matrix of data that was itself exponentially downweighted. This is totally trivial but also kind of important: it means that we are not creating some kind of new animal when we add up covariance matrices this way.
Just to be really dumb, start with a univariate regression example, so where we have a single signal $x$ and a single response $y$. Say we get our first signal $x_1$ and our first response $y_1$. Our first estimate for the covariance matrix is $M(1) = x_1^2$.
Now we get a new piece of data $(x_2, y_2)$, and we want to downweight the old stuff, so we multiply $x_1$ and $y_1$ by some number $\mu$. Then our signal vector looks like $[\mu x_1, x_2]$ and the new estimate for the covariance matrix is
$$M(2) = \mu^2 x_1^2 + x_2^2 = \mu^2 M(1) + M(1, 2),$$
where by $M(t)$ I mean the estimate of the covariance matrix at time $t$ as above. Up to scaling this is the exact form from above, where $\lambda = \frac{\mu^2}{1+\mu^2}$.
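The univariate claim is easy to check numerically. The sketch below, with made-up signal values and a made-up downweighting number `mu`, compares the recursive update against the sum of squares of signals that were downweighted directly. It also checks the "up to scaling" remark for two data points.

```python
def recursive_estimate(signals, mu):
    """Downweight the old estimate by mu**2, then add the new squared signal."""
    m = 0.0
    for x in signals:
        m = mu ** 2 * m + x ** 2
    return m


def direct_estimate(signals, mu):
    """Downweight each signal by mu once per later arrival, then sum squares."""
    n = len(signals)
    weighted = [mu ** (n - 1 - i) * x for i, x in enumerate(signals)]
    return sum(w * w for w in weighted)


signals = [1.5, -0.5, 2.0, 0.25]  # made-up numbers
mu = 0.9                          # hypothetical downweighting scalar

recursive = recursive_estimate(signals, mu)
direct = direct_estimate(signals, mu)

# "Up to scaling": the some-part-old, some-part-new form with
# lam = mu**2 / (1 + mu**2) is the two-point estimate divided by (1 + mu**2).
lam = mu ** 2 / (1 + mu ** 2)
normalized_two = lam * signals[0] ** 2 + (1 - lam) * signals[1] ** 2
```

The two routes agree up to floating-point error, which is the whole point: the recursion is the covariance term of downweighted data, not a new animal.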
Things to convince yourself of:
- This works when we move from $n$ pieces of data to $n+1$ pieces of data.
- This works when we move from a univariate regression to a multivariate regression and we’re actually talking about square matrices.
- Same goes for the $X^{\top} y$ term in the same exact way (except it ends up being a column matrix rather than a square matrix).
- We don’t really have to worry about scaling; this uses the fact that everything in sight is quadratic in $\mu$, the downweighting scalar, and the final product we care about is $\beta = (X^{\top} X)^{-1} X^{\top} y$, where, if we did decide to care about scalars, we would multiply $X^{\top} y$ by the appropriate scalar but then end up dividing by that same scalar when we find the inverse of $X^{\top} X$.
- We don’t have to update one data point at a time. We can instead compute the “new part” of the covariance matrix and the other thingy for a whole day’s worth of data, downweight our old estimate of the covariance matrix and other thingy, and then get a new version for both.
- We can also incorporate bayesian priors into the updating mechanism, although you have to decide whether the prior itself needs to be downweighted or not; this depends on whether the prior is coming from a fading prior belief (like, oh I think the answer is something like this because all the studies that have been done say something kind of like that, but I’d be convinced otherwise if the new model tells me otherwise) or if it’s a belief that won’t be swayed (like, I think newer data is more important, so if I use lagged values of the quarterly earnings of these companies then the more recent earnings are more important and I will penalize the largeness of their coefficients less).
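The day-at-a-time version of the list above can be sketched in plain Python. Everything here is illustrative: the helper names, the two-signal data, and the `decay` value (which plays the role of the squared downweighting scalar) are assumptions, priors are ignored, and the scaling is left out since it cancels in the final product. In practice you would reach for a linear algebra library rather than the hand-rolled solver.

```python
def gram(rows):
    """The 'new part' of the square matrix for one chunk of signal rows."""
    k = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]


def moment(rows, responses):
    """The 'new part' of the column vector for one chunk."""
    k = len(rows[0])
    return [sum(r[i] * y for r, y in zip(rows, responses)) for i in range(k)]


def solve(a, b):
    """Solve a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (aug[i][n] - tail) / aug[i][i]
    return beta


def update(xtx, xty, rows, responses, decay):
    """Downweight the old estimates by decay, then add the chunk's new parts."""
    new_xtx, new_xty = gram(rows), moment(rows, responses)
    k = len(new_xty)
    xtx = [[decay * xtx[i][j] + new_xtx[i][j] for j in range(k)] for i in range(k)]
    xty = [decay * xty[i] + new_xty[i] for i in range(k)]
    return xtx, xty


# Made-up data: two signals, responses generated exactly as y = 2*x1 - 3*x2.
days = [
    [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    [(2.0, 1.0), (1.0, 3.0)],
    [(0.5, -1.0), (-2.0, 0.5), (3.0, 2.0)],
]
decay = 0.8  # hypothetical value

xtx = [[0.0, 0.0], [0.0, 0.0]]
xty = [0.0, 0.0]
for rows in days:
    responses = [2.0 * x1 - 3.0 * x2 for x1, x2 in rows]
    xtx, xty = update(xtx, xty, rows, responses, decay)
    # After this point the day's rows and responses are no longer needed.

beta = solve(xtx, xty)
```

Because the made-up responses are an exact linear function of the signals, the recovered coefficients do not depend on the decay; with noisy data they would lean toward whatever the most recent days say.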
End result: we can cut our data up into bite-size chunks our computer can handle, compute our updates, and chuck the data. If we want to maintain some history we can just store the “new parts” of the matrix and column vector per day. Then if we later decide our downweighting was too aggressive or not sufficiently aggressive, we can replay the summation. This is much more efficient as storage than holding on to the whole data set, because it depends only on the number of signals in the model (typically under 200) rather than the number of data points going into the model. So for each day you store a 200-by-200 matrix and a 200-by-1 column vector.
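Replaying the summation might look like the sketch below. To keep it short it uses a single signal, so each day's stored parts are just two numbers; with many signals they would be the square matrix and column vector described above. The data, the function name, and the two decay values are made up for illustration.

```python
def replay(daily_parts, decay):
    """Re-run the downweighted sum over the stored per-day new parts."""
    sxx, sxy = 0.0, 0.0
    for day_sxx, day_sxy in daily_parts:
        sxx = decay * sxx + day_sxx
        sxy = decay * sxy + day_sxy
    return sxy / sxx  # univariate coefficient


# Made-up raw data for three days: lists of (signal, response) pairs.
raw_days = [
    [(1.0, 2.1), (2.0, 3.9)],
    [(1.5, 3.2), (0.5, 0.8)],
    [(2.5, 5.5), (1.0, 1.8)],
]

# Store only two numbers per day; the raw data can then be discarded.
daily_parts = [
    (sum(x * x for x, _ in day), sum(x * y for x, y in day))
    for day in raw_days
]

# Changing our mind about the downweighting only requires the stored parts.
beta_aggressive = replay(daily_parts, decay=0.5)
beta_gentle = replay(daily_parts, decay=0.95)
```

The storage cost per day is fixed by the number of signals, no matter how many data points went into that day.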
In the most recent New Yorker, there’s an article which basically says that, although “no-judgment” brainstorming sounds great, it doesn’t actually produce better ideas. That in fact you need to be able to criticize each other’s half-baked plans to get real innovation.
The idea that a bunch of people who have been instructed that no idea is too banal to speak out loud will eventually move beyond the obvious into creative territory is certainly attractive, mostly because it’s so hopeful: in this world everyone can participate in innovation. And in fact it may be true that everyone can be creative, but I agree that it won’t generally happen in the standard brainstorming meeting.
As usual I have lots of opinions about this, and lots of experience, so I’ll just go ahead and say what I think.
When does working in a group work?
- When people are sufficiently technical for the discussion, although not completely informed: it’s helpful to have someone with great technical skills or domain knowledge but who hasn’t thought through the issue, so they can question all of the assumptions as they come up to speed. In my experience this is when some of the best ideas happen.
- When people are more interested in getting to the answer than in impressing the people around them. This sounds too obvious to mention, but as we will see below it’s actually almost impossible to achieve in a largish group at an ambitious or successful company.
- When people know the people around them will be able to follow somewhat vague arguments and help them make those arguments precise.
- Alternatively when people know that others will gladly find flaws in ideas that are essentially bullshit. When everyone has agreed to call each other’s bullshit in a supportive way, and has taken on that role aggressively, you have a good dynamic.
Why does the “no-judgment” rule fail?
- When you aren’t being critical, you never get to the reasons why things are obviously a bad idea, so you never get to a new idea. What no-judgment brainstorming removes is exactly the critical part: the friction supplied by the other people who call you on a bad idea. Without it, it’s just a bunch of people talking in a room, distracting you from thinking well by the loudness of their voices.
- When you have a bunch of successful people who have never failed, nobody actually lowers their guard. This idea of the super-achieving educational 1% is described in this recent New York Times article. I’ve seen this phenomenon close-up many times. People who are academic superstars absolutely hate taking risks and hate being wrong: life is a competition and they need to win every time. (update: there are plenty of people who were really freaking good at school that aren’t like this; when I say “academic superstars” I want to incorporate the idea that these people identify their success in school and/or other arenas that have metrics of success, like contests or high-quality brands (Harvard, McKinsey, etc.), as part of their identity.)
- Finally, the setup of the brainstorm is necessarily shallow and doesn’t require follow-through. In my experience only germs of good ideas can possibly occur in meetings. Lots of good germs have been left to rot on whiteboards. It would be wicked useful to try to rank ideas at the end of a meeting (by a show of hands, for example), but the “no-judgment” rule also prevents this.
Asian educational systems often get criticized for being so non-individualistic that they repress originality. True. But a system where the individual is promoted as special in every way also represses originality, because narcissists brook no argument.
This recent “Room for Debate” discussion in the New York Times brings up this issue beautifully. The idea is that schools are more and more being seen as companies, where the students and parents (especially the parents) are seen as the customer. The customer is always right, of course, and the schools are expected to tailor themselves to please everyone. It’s the opposite of learning how to disagree, learning how to be a member of society, and learning how to be wrong.
Interestingly, some of the best experiences I’ve had recently in the successful brainstorming arena have come from the #OWS Alternative Banking group I help organize. It’s made up of a bunch of citizens, many of whom are experts, but not all, and many of whom are experts in different corners of finance. The fact that people come to a meeting to talk policy and finance on Sunday afternoons means they are obviously interested, and the fact that no two people seem to agree on anything completely makes for feisty and productive debates.
If you haven’t been following the drama of the possible mortgage settlement between the big banks that committed large-scale mortgage fraud and the state Attorneys General, then get yourself over to Naked Capitalism right away. What could end up being the biggest boondoggle coming out of the credit crisis is unfolding before us.
The very brief background story is this. Banks made huge bets on the housing market through securitized products (mortgage backed securities which were then often repackaged or rerepackaged). The underlying loans were often given to people with very hopeful expectations about the future of the housing market, like that it would only go up. In the meantime, the banks did very bad jobs of keeping track of the paperwork. In addition to that, many of the loans were actually fraudulent and a very large number of them were ridiculous, with resetting interest rates that were clearly unaffordable.
Fast forward to post-credit crisis, when people were having trouble with their monthly bills. The banks made up a bunch of paperwork that they’d lost or had never been made in the first place (this is called “robo-signing”). The judges at foreclosures got increasingly skeptical of the shoddy paperwork and started balking (to be fair, not all of them).
Who’s on the hook for the mistakes the banks made? The home owners, obviously, and also the investors in the securitized products, but most critically the taxpayer, through Fannie and Freddie, who are insuring these ridiculous mortgages.
So what we’ve got now is an effort by the big banks to come to a “settlement” with the states to pay a small fee (small in the context of how much is at stake) to get out of all of this mess, including all future possible findings of fraud or misdeeds. The settlement terms have been so outrageously bank-friendly that a bunch of state Attorneys General have been pushing back, with the help of prodding from the people.
Meanwhile, the Obama administration would love nothing more than to be able to claim they cleaned up the mess and made the banks pay. But that story seriously depends on people not really understanding the scale of the problem and the meaning of the fine print of the proposed settlement.
If you want to learn more recent details about this potential tragedy, this post from Naked Capitalism got me so entranced that I actually missed my subway stop on the way to work and had to walk uptown from Canal. From the post:
The story did not outline terms, but previous leaks have indicated that the bulk of the supposed settlement would come not in actual monies paid by the banks (the cash portion has been rumored at under $5 billion) but in credits given for mortgage modifications for principal modifications. There are numerous reasons why that stinks. The biggest is that servicers will be able to count modifying first mortgages that were securitized toward the total. Since one of the cardinal rules of finance is to use other people’s money rather than your own, this provision virtually guarantees that investor-owned mortgages will be the ones to be restructured. Why is this a bad idea? The banks are NOT required to write down the second mortgages that they have on their books. This reverses the contractual hierarchy that junior lienholders take losses before senior lenders. So this deal amounts to a transfer from pension funds and other fixed income investors to the banks, at the Administration’s instigation.
Another reason the modification provision is poorly structured is that the banks are given a dollar target to hit. That means they will focus on modifying the biggest mortgages. So help will go to a comparatively small number of grossly overhoused borrowers, no doubt reinforcing the “profligate borrower” meme.
But those criticisms assume two other things: that the program is actually implemented. The experience with past consent decrees in the mortgage space is that the servicers get a legal get out of jail free card, a release, and do not hold up their end of the deal. Similarly, we’ve seen bank executives swear in front of Congress in late 2010 that they had stopped robosigning, which turned out to be a brazen lie. So here, odds favor that servicers will pretty much do nothing except perhaps be given credit for mortgage modifications they would have made anyhow.
Interestingly, Romney has gone on record siding with the homeowners. The following is a Romney quote:
The banks are scared to death, of course, because they think they’re going to go out of business… They’re afraid that if they write all these loans off, they’re going to go broke. And so they’re feeling the same thing you’re feeling. They just want to pretend all of this is going to get paid someday so they don’t have to write it off and potentially go out of business themselves.
This is cascading throughout our system and in some respects government is trying to just hold things in place, hoping things get better… My own view is you recognize the distress, you take the loss and let people reset. Let people start over again, let the banks start over again. Those that are prudent will be able to restart, those that aren’t will go out of business. This effort to try and exact the burden of their mistakes on homeowners and commercial property owners, I think, is a mistake.
“This effort” must refer to the mortgage settlement. I’m with Romney on this one.
In 50 years, when we look back at this period of time, we may be able to describe it like this:
The financial system got high on profits from unreasonably priced homes and mortgages, underestimating risk, and securitization fees. When the truth came out they paid a pittance to escape their mistakes, transferring the cost to homeowners and the taxpayer and leaving the housing market utterly inflated and confused. The entire charade lasted decades and was in the name of not acknowledging what everyone already knew, namely that the banks were effectively insolvent.
In one of my first posts ever, I talked about seasonal adjustment models and how they can work. I was sick of seeing that phrase go unexplained in the news all the time.
If I had been a bit more thoughtful, maybe I could have also mentioned various ways seasonal adjustment models could screw things up, or more precisely be screwed up by weird events. Luckily, a spokesperson from Goldman Sachs recently did that for me, and it was mentioned in this Bloomberg article. Those GS guys are smart, and would only mention this to Bloomberg if they thought everyone on the street knew it anyway, but I still appreciate them strewing their crumbs (it occurs to me that they might be trading on people’s overreactions to inflated good news right now).
Recall my frustration with seasonal adjustment models: they typically don’t tell you how many years of data they use, and how much they weight each year. But it’s safe to say that for statistics like unemployment and manufacturing, multiple years are used and more recent years are at least as important as older years. So events in the market that occurred in 2008 are still powerfully present in the seasonal adjustment model.
That means that, when the model is deciding what to expect, it looks at the past few years and kind of averages them. One of those years was 2008 when all hell broke loose, Lehman fell, TARP came into being, and Fannie, Freddie, and AIG were seized by the government. Lots of people lost their jobs and the housing and building industries went into freefall.
So the model thinks that’s a big deal, and compares what happened this year to that (and to the other years in the model, but that year dominates since it was such an extreme event), and decides we’re looking good. Here’s a picture from the Kansas Fed of the raw vs. seasonally adjusted manufacturing index results, from July 2001 to December 2011:
As one of my readers has already commented (darn, you guys are fast!), this just refers to manufacturing near Kansas, but the point I’m trying to make is still valid, namely that the seasonal adjustments clearly pale in comparison to the actual catastrophic event in 2008. However, that event still informs the seasonal adjustment model afterwards.
Because of the “golden rule” I mentioned in my post, namely that seasonal adjustment needs to on average (or at least in expectation) not add bias to the actual numbers, if things look better than they should in the second half of the year, that means they will look worse than they should in the first half of the year.
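To see the mechanism in miniature, here is a toy additive seasonal adjustment. It is an assumption-laden sketch, not the model any agency actually uses: the seasonal factor for each month is just the equally weighted average, over the years of history, of that month's deviation from its own year's mean, and all index values are made up.

```python
def seasonal_factors(history):
    """Average, over years, of each month's deviation from that year's mean."""
    factors = []
    for month in range(12):
        deviations = [year[month] - sum(year) / 12 for year in history]
        factors.append(sum(deviations) / len(deviations))
    return factors


# Made-up index values: three flat years plus one with a second-half collapse.
flat_year = [100.0] * 12
crash_year = [100.0] * 6 + [80.0] * 6
history = [flat_year, flat_year, crash_year, flat_year]

factors = seasonal_factors(history)

# A perfectly flat new year, adjusted with factors learned from that history.
raw = [100.0] * 12
adjusted = [value - factor for value, factor in zip(raw, factors)]
```

In this toy setup the factors sum to zero across the year, so the adjustment adds no bias on average. The one crash year makes the model expect a weak second half, so the flat new year reads as 102.5 in the second half and 97.5 in the first half, even though nothing changed in the raw numbers.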
So be prepared for some crappy statistics coming out soon!
I still wish they’d just show us the graphs for the past 10 years and let us decide whether it’s good news.
Prediction: in the next 10 years we will see the majority of major universities start master’s degree programs, or Ph.D. programs, in data science, data analytics, business analytics, or the like. They will exist somewhere in the intersection of the fields of statistics, operations research, computer science, and business. They will teach students how to use machine learning algorithms and various statistical methods, and how to design expert systems. Then they will send these newly minted data scientists out to work at McKinsey, Google, Yahoo, and possibly Data Without Borders.
The questions yet unanswered:
- Relevance: will they also teach the underlying theory well enough so that the students will know when the techniques are applicable?
- Skepticism: will they in general teach enough about robustness in order for the emerging data scientists to be sufficiently skeptical of the resulting models?
- Ethics: will they incorporate understanding the impact of the models so that students will think to understand the ethical implications of modeling? Will they have a well-developed notion of the Modeler’s Hippocratic Oath by then?
- Open modeling: will they focus narrowly on making businesses more efficient or will they focus on developing platforms which are open to the public and allow people more views into the models, especially when the models in question affect that public?
Open questions. And important ones.
Here’s one that’s already been started at the University of North Carolina, Charlotte.