What is “publicly available data”?

September 7, 2011

As many of you know, I am fascinated with the idea of an open source ratings model, set up to compete with the current big three ratings agencies: S&P, Moody’s, and Fitch. Please check out my previous posts here and here about this idea.

For that reason, I’ve recently embarked on the following thought experiment: what would it take to start such a thing? As is the case with most things quantitative and real-world, the answer is data. Lots of it.

There’s good news and bad news. The good news is there are perfectly reasonable credit models that use only “publicly available data”, which is to say data that can theoretically be gleaned from the quarterly filings that companies are required to file. The bad news is, the SEC filings, although available on the web, are completely useless unless you have a team of accounting professionals working with you to understand them.

Indeed, what actually happens if you work at a financial firm and want to implement a credit model based on “publicly available information” is the following: you pay a data company like Compustat good money for a clean data feed to work with. They charge a lot for this, and for good reason: the SEC doesn’t require companies to standardize their accounting terms, even within an industry, and even over time (so the same company can change the way it does its accounting from quarter to quarter). Here’s a link for the white paper (called The Impact of Disparate Data Standardization on Company Analysis) which explains the standardization process that they go through to “clean the data”. It’s clearly a tricky thing requiring true accounting expertise.

To sum up the situation, in order to get “publicly available data” into usable form we need to give a middle-man company like Compustat thousands of dollars a year. Wait, WTF?!!? How is that publicly available?

And who is this benefitting? Obviously it benefits Compustat itself, in that there even is a business to be made from converting publicly available data into usable data. Next, it obviously benefits the companies not to have to conform to standards: it’s easier for them to hide stuff they don’t like (this is discussed in the first section of Compustat’s white paper referred to above), and they get to choose each quarter whichever presentation suits them best. So… um… does it benefit anyone besides them? Certainly not any normal person who wants to understand the creditworthiness of a given company. Who is the SEC working for anyway?

I’ve got an idea. We should demand that publicly available data be usable. Standard format, standard terminology, and if there are unavoidable differences across industries (which I imagine there are, since some companies store goods and others just deal in information, for example), then there should be fully open-source translation dictionaries written in some open-source language (Python!) that one can use to standardize the overall data. And don’t tell me it can’t be done, since Compustat already does it.
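
To make the idea concrete, here is a minimal sketch of what such a translation dictionary might look like. Everything in it is an assumption for illustration: the industry names, the reported labels, the standard terms, the numbers, and the function name `standardize` are all made up, and real filings are far messier than this.

```python
# Hypothetical sketch: every label, industry key, and number below is invented
# for illustration and does not come from any real filing or data vendor.

# One translation dictionary per industry: reported label -> standard term.
TRANSLATIONS = {
    "retail": {
        "Net sales": "revenue",
        "Merchandise inventories": "inventory",
        "Long-term debt": "long_term_debt",
    },
    "software": {
        "Total revenues": "revenue",
        "Notes payable, non-current": "long_term_debt",
    },
}


def standardize(filing, industry):
    """Map a filing's reported line items onto standard terms.

    Returns (standardized, unmapped). Labels with no known translation
    are set aside for a human (ideally an accountant) to review rather
    than being silently dropped.
    """
    mapping = TRANSLATIONS[industry]
    standardized = {}
    unmapped = {}
    for label, value in filing.items():
        term = mapping.get(label)
        if term is None:
            unmapped[label] = value
        else:
            # Sum values when several reported labels map to one standard term.
            standardized[term] = standardized.get(term, 0) + value
    return standardized, unmapped


example_filing = {
    "Net sales": 1200,
    "Merchandise inventories": 300,
    "Store closing costs": 15,
}
clean, leftovers = standardize(example_filing, "retail")
print(clean)
print(leftovers)
```

The point of returning the unmapped items separately is that the dictionaries would never be complete on day one; anything the code doesn’t recognize becomes a to-do item for whoever maintains the open-source dictionary for that industry.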

The SEC should demand that companies file in a standard way. If there really are more than a couple of standard terms, then demand that the company report in each standard way. I’m sure the accountants of the company have this data; it’s just a question of requiring them to report it.

  1. human mathematics
    September 7, 2011 at 10:37 pm

    Well, what would it take to build the simplest working prototype with the available data?

    For example, say I put up a website with scores and the links to the reasoning and data behind each. What is the next step from there?

    • human mathematics
      September 8, 2011 at 12:43 pm

      Also, remember that accountants can collaborate in open source just as well as mathematicians can. So another smallest working prototype would be to pick just one security and rate it with the open-source team. Sovereign debt might be a good choice in terms of newsworthiness, but the first target could be any security. Accountants do pro bono work and they have itches to scratch just like programmers do.

  2. human mathematics
    September 8, 2011 at 1:43 am

    As with Sarbox, more transparency is good but usually not free.

  3. September 8, 2011 at 8:47 am

    Surely the authority of the ‘big three’ credit rating agencies has its origins in the patronage of the SEC. If the SEC wanted the publicly available data to be standardised, it has the power to insist (albeit not retrospectively). So the big issues for an ‘open-source’ credit rating system are ‘What is the opinion of the SEC?’ and possibly ‘Why is it of that opinion?’
