## Topology of financial modeling

After my talk on Monday there were lots of questions and comments, which is always awesome (will blog the contents soon).

One person in the audience asked me if I’d ever heard of CompTop, which I hadn’t. And actually, even though I vaguely understand what they’re talking about, I still don’t understand it sufficiently to blog about it. But it reminds me of something else which I *would* like to blog about, and which combines topology and modeling.

Maybe they’re even the same thing! But if so (especially if so), I’d like to get my idea down onto electronic paper before I read theirs. This is kind of like my thing about not googling something until you’ve tried to work it out for yourself.

So here’s the setup. In different fields in finance, there’s a “space” you work in. I worked in Futures, which you’ve heard of because when they talk about the price of barrels of oil going up (or maybe down, but you don’t hear about it as much when that happens), they are actually talking about futures prices. This also happens with basic food prices such as corn and wheat; corn and oil are linked of course through ethanol production. There are also futures on the S&P (or any other major stock index), bonds, currencies, other commodities, or even options on stock indices.

The general idea, which is given away by the name, is that when you buy a futures contract, you are placing a bet on the future price of something. Futures were started as a way for farmers to hedge their risks when they were growing food. But clearly other things have happened since then.

There’s a way of measuring the dimension of this space of instruments, which is less trivial than counting them. For example, there is a “2 year U.S. bond” future as well as a “5 year U.S. bond” future, and you may guess (and you’d be right) that these don’t really represent independent dimensions.

Indeed there’s a concept of independence coming from statistics that one can use (so, statistical independence), which is pretty subjective in that it depends on what time period and how much data you use to measure it (and lately we’ve seen less independence in general). But even so, you can go blithely forward and count how many dimensions your space has, and you would generally get something like 15, at least before the credit crisis hit. This process is called PCA, and I’ll write a post on it sometime.

Depending on which instruments you counted, and how liquid you expected them to be, you could get a few more “independent” instruments, but you also may be fooling yourself with idiosyncratic noise caused by those instruments being not very liquid. So there are some subtleties.
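To make the dimension count concrete, here is a minimal sketch, assuming NumPy and purely synthetic data. The function name `effective_dimension`, the 95% variance threshold, and the two hidden "factors" are all illustrative assumptions, not anything from a real desk. Strictly speaking, PCA finds uncorrelated directions rather than fully independent ones, and the answer depends on the threshold you pick, which is one face of the subjectivity mentioned above.

```python
import numpy as np

def effective_dimension(returns, variance_threshold=0.95):
    """Count the principal components needed to explain a given share of variance.

    `returns` is an array of shape (n_days, n_instruments).
    """
    # Work with the correlation matrix so volatile instruments do not dominate.
    corr = np.corrcoef(returns, rowvar=False)
    # eigvalsh returns eigenvalues in ascending order; reverse to descending.
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]
    explained = np.cumsum(eigenvalues) / eigenvalues.sum()
    # First index at which the cumulative explained share reaches the threshold.
    return int(np.searchsorted(explained, variance_threshold) + 1)

# Synthetic example: six instruments, but only two underlying drivers.
# Think of three bond futures moving together and three grain futures moving together.
rng = np.random.default_rng(0)
n_days = 2000
factors = rng.standard_normal((n_days, 2))
loadings = np.array([
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0],  # hypothetical "rates" factor
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],  # hypothetical "grains" factor
])
noise = 0.05 * rng.standard_normal((n_days, 6))
returns = factors @ loadings + noise

dim = effective_dimension(returns)
print(f"{returns.shape[1]} instruments, effective dimension {dim}")
```

With illiquid instruments the noise term would be much larger, and small eigenvalues would start to look like extra "independent" dimensions, which is the fooling-yourself problem described above.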

Once you have your space measured in terms of dimension, you can choose a basis and look at things along the basis vectors. You can see how your different models behave, for example. You might see how the bond model you worked on places no bet on the basis vectors corresponding to lean hog futures.
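As a rough illustration of looking at a model along the basis vectors, the sketch below projects a made-up bond model onto the principal directions of a made-up covariance matrix. The instruments, the numbers, and the name `bond_model` are hypothetical, and NumPy is assumed.

```python
import numpy as np

# Hypothetical instruments and a made-up covariance matrix of their returns.
instruments = ["2y bond", "5y bond", "lean hogs"]
cov = np.array([
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Columns of `basis` are orthonormal principal directions,
# ordered by ascending variance (the eigh convention).
variances, basis = np.linalg.eigh(cov)

# A hypothetical bond model: equal bets on the two bond futures, none on hogs.
bond_model = np.array([0.5, 0.5, 0.0])

# Coordinates of the model's bets in the principal basis.
exposures = basis.T @ bond_model

for k in range(len(variances)):
    loadings = dict(zip(instruments, np.round(basis[:, k], 3).tolist()))
    print(f"variance {variances[k]:.2f}  loadings {loadings}  exposure {exposures[k]:+.3f}")
```

In this toy setup the direction dominated by lean hogs gets an exposure of essentially zero, and so does the "2 year versus 5 year" spread direction; all of the model's bet sits on the common bond direction.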

That made me wonder about the following question. If we can measure the space of instruments, can we also measure the space of models? Is this some kind of dual? If so, is there some kind of natural upper bound on the number of (independent) models we could ever have which all make profit?

Note there’s also a way of making sure that models are statistically independent, so this part of the question is well-defined. But it’s not clear what property of the space of instruments you are measuring when you ask for a model on that space which “makes profit”.

Another related question is whether such a question can really only be asked at a given *time horizon* (if it can be asked at all). I’ll explain.

The *horizon* of a model is essentially how long you expect a given bet to last in terms of time. For example, a weekly horizon model is something you’d typically only see on a slow-moving instrument class like bonds. There are plenty of daily models on equities, but there are also incredibly hyper fast “high frequency” models, say on currencies, which care about the speed of light and how different computers in the same room, being at different internal temperatures, can’t place consistent timestamps on ticker data.

These different horizons have such different textures that it makes me wonder whether an upper bound on the number of profitable models, if one exists at all, exists separately at each horizon.

Another related question: what about topological weirdness inside the space of instruments? If you plot some of this (take as a baby model three instruments that are essentially independent, choose a time horizon, and plot the simultaneous returns) the main characteristic you’ll see is that it’s a bounded blob. But inside that blob are certainly inconsistencies; in particular the density is not everywhere the same. Is the lack of consistency a signal that there’s a model there? Does the market know about holes, for example? Maybe not, which would mean that the space of (profitable) models is perhaps better understood as a space whose basis consists of something like “holes in the instrument space”, rather than a dual.

This is verging on something like what CompTop is talking about. Maybe. I’ll have to go read what they’re doing now.

Yep, Cathy, what you say in the last paragraph is very much in the spirit of CompTop / persistent homology. To give a simple example: suppose there are two instruments you’re tracking in time, and suppose (for some reason) that they are connected by some kind of predator-prey type relationship, which tends to produce a cycle. This will NOT show up as a correlation between the two instruments, but it might be detectable in persistent homology, as long as things aren’t too noisy, the sampling rate is sufficiently high relative to the length of the cycle, etc.
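A toy version of this example, using only the Python standard library: two series that trace out an idealized, noiseless circle have essentially zero correlation, yet every sample sits far from the center of the point cloud, so the cloud has a hole. This is not persistent homology, just a crude stand-in for the kind of structure it is designed to detect, and a real predator-prey cycle would be neither circular nor noiseless. All names and parameters here are illustrative assumptions.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Idealized cycle: 4 full periods sampled at 100 points per period.
n_samples, n_cycles = 400, 4
times = [2 * math.pi * n_cycles * k / n_samples for k in range(n_samples)]
prey = [math.cos(t) for t in times]
predator = [math.sin(t) for t in times]  # lags the prey by a quarter period

correlation = pearson(prey, predator)

# Crude "hole" check: how close does any sample come to the cloud's centroid?
center_x = sum(prey) / n_samples
center_y = sum(predator) / n_samples
min_radius = min(math.hypot(x - center_x, y - center_y)
                 for x, y in zip(prey, predator))

print(f"correlation: {correlation:.6f}")
print(f"closest sample to centroid: {min_radius:.6f}")
```

The correlation comes out at roughly zero while no point lands anywhere near the middle of the cloud. A correlation matrix would miss this relationship entirely, but the empty center is a topological feature of the plotted returns.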

This is the third time I have tried to make this post and the idea it contains still isn’t as well-formed as I would like; my return to work is exhausting:

When I am looking for duals to some kind of mathematical object there are a couple of useful starting points.

Categorical duals (which include partial order duals and vector space duals as special cases) are useful because they are so general, and they can be studied by identifying the object itself as a category, or by finding the ambient category in which the object resides; see the book Abstract and Concrete Categories: The Joy of Cats for an introduction to category theory.

Closely related are the Stone dualities between spaces and algebras.

Since we are looking at multi-dimensional spaces where there is no universal orthogonal basis available, I want to see if the objects are some kind of manifold (which would allow me to apply Stone dualities), and since the spaces relate to probability and statistics, I am expecting them to be measure spaces (which might help me to clarify the topology of the objects).

Any recommendations for a solid modern computational topology text which includes coverage of relevant persistent homology (e.g. Edelsbrunner and Harer [2010])?