Archive for December, 2012

I totally trust experts, actually

I lied yesterday, as a friend at my Occupy meeting pointed out to me last night.

I made it seem like I look into every model before trusting it, and of course that’s not true. I eat food grown and prepared by other people daily. I go on airplanes and buses all the time, trusting that they will work and that they will be driven safely. I still have my money in a bank, and I also hire an accountant and sign my tax forms without reading them. So I’m a hypocrite, big-time.

There’s another thing I should clear up: I’m not claiming I understand everything about climate research just because I talked to an expert for 2 or 3 hours. I am certainly not an expert, nor am I planning to become one. Even so, I did learn a lot, and the research I undertook was incredibly useful to me.

So, for example, my father is a climate change denier, and I have heard him give a list of scientific facts to argue against climate change. I asked my expert to counter-argue these points, and he did so. I also asked him to explain the underlying model at a high level, which he did.

My conclusion wasn’t that I’ve looked carefully into the model and it’s right, because that’s not possible in such a short time. My conclusion was that this guy is trustworthy and uses logical argument, which he’s happy to share with interested people, and moreover he manages to defend against deniers without being intellectually defensive. In the end, I’m trusting him, an expert.

On the other hand, if I met another person with a totally different conclusion, who also impressed me as intellectually honest and curious, then I’d definitely listen to that guy too, and I’d be willing to change my mind.

So I do imbue models and theories with a limited amount of trust depending on how much sense they makes to me. I think that’s reasonable, and it’s in line with my advocacy of scientific interpreters. Obviously not all scientific interpreters would be telling the same story, but that’s not important – in fact it’s vital that they don’t, because it is a privilege to be allowed to listen to the different sides and be engaged in the debate.

If I sat down with an expert for a whole day, like my friend Jordan suggests, to determine if they were “right” on an issue where there’s argument among experts, then I’d fail, but even understanding what they were arguing about would be worthwhile and educational.

Let me say this another way: experts argue about what they don’t agree on, of course, since it would be silly for them to talk about what they do agree on. But it’s their commonality that we, the laypeople, are missing. And that commonality is often so well understood that we could understand it rather quickly if it was willingly explained to us. That would be a huge step.

So I wasn’t lying after all, if I am allowed to define the “it” that I did get at in the two hours with an expert. When I say I understood it, I didn’t mean everything, I meant a much larger chunk of the approach and method than I’d had before, and enough to evoke (limited) trust.

Something I haven’t addressed, which I need to think about more (please help!), is the question of what subjects require active skepticism. On of my commenters, Paul Stevens, brought this up:

… For me, lay people means John Q Public – public opinion because public opinion can shape policy. In practice, this only matters for a select few issues, such as climate change or science education. There is no impact to a lay person not understanding / believing in the Higgs particle for example.

On trusting experts, climate change research, and scientific translators

Stephanie Tai has written a thoughtful response on Jordan Ellenberg’s blog to my discussion with Jordan regarding trusting experts (see my Nate Silver post and the follow-up post for more context).

Trusting experts

Stephanie asks three important questions about trusting experts, which I paraphrase here:

  1. What does it take to look into a model yourself? How deeply must you probe?
  2. How do you avoid being manipulated when you do so?
  3. Why should we bother since stuff is so hard and we each have a limited amount of time?

I must confess I find the first two questions really interesting and I want to think about them, but I have a very little patience with the last question.

Here’s why:

  • I’ve seen too many people (individual modelers) intentionally deflect investigations into models by setting them up as so hard that it’s not worth it (or at least it seems not worth it). They use buzz words and make it seem like there’s a magical layer of their model which makes it too difficult for mere mortals. But my experience (as an arrogant, provocative, and relentless questioner) is that I can always understand a given model if I’m talking to someone who really understands it and actually wants to communicate it.
  • It smacks of an excuse rather than a reason. If it’s our responsibility to understand something, then by golly we should do it, even if it’s hard.
  • Too many things are left up to people whose intentions are not reasonable using this “too hard” argument, and it gives those people reason to make entire systems seem too difficult to penetrate. For a great example, see the financial system, which is consistently too complicated for regulators to properly regulate.

I’m sure I seem unbelievably cynical here, but that’s where I got by working in finance, where I saw first-hand how manipulative and manipulated mathematical modeling can become. And there’s no reason at all such machinations wouldn’t translate to the world of big data or climate modeling.

Climate research

Speaking of climate modeling: first, it annoys me that people are using my “distrust the experts” line to be cast doubt on climate modelers.

People: I’m not asking you to simply be skeptical, I’m saying you should look into the models yourself! It’s the difference between sitting on a couch and pointing at a football game on TV and complaining about a missed play and getting on the football field yourself and trying to figure out how to throw the ball. The first is entertainment but not valuable to anyone but yourself. You are only adding to the discussion if you invest actual thoughtful work into the matter.

To that end, I invited an expert climate researcher to my house and asked him to explain the climate models to me and my husband, and although I’m not particularly skeptical of climate change research (more on that below when I compare incentives of the two sides), I asked obnoxious, relentless questions about the model until I was satisfied. And now I am satisfied. I am considering writing it up as a post.

As an aside, if climate researchers are annoyed by the skepticism, I can understand that, since football fans are an obnoxious group, but they should not get annoyed by people who want to actually do the work to understand the underlying models.

Another thing about climate research. People keep talking about incentives, and yes I agree wholeheartedly that we should follow the incentives to understand where manipulation might be taking place. But when I followed the incentives with respect to climate modeling, they bring me straight to climate change deniers, not to researchers.

Do we really think these scientists working with their research grants have more at stake than multi-billion dollar international companies who are trying to ignore the effect of their polluting factories on the environment? People, please. The bulk of the incentives are definitely with the business owners. Which is not to say there are no incentives on the other side, since everyone always wants to feel like their research is meaningful, but let’s get real.

Scientific translators

I like this idea Stephanie comes up with:

Some sociologists of science suggest that translational “experts”–that is, “experts” who aren’t necessarily producing new information and research, but instead are “expert” enough to communicate stuff to those not trained in the area–can help bridge this divide without requiring everyone to become “experts” themselves. But that can also raise the question of whether these translational experts have hidden agendas in some way. Moreover, one can also raise questions of whether a partial understanding of the model might in some instances be more misleading than not looking into the model at all–examples of that could be the various challenges to evolution based on fairly minor examples that when fully contextualized seem minor but may pop out to someone who is doing a less systematic inquiry.

First, I attempt to make my blog something like a platform for this, and I also do my best to make my agenda not at all hidden so people don’t have to worry about that.

This raises a few issues for me:

  • Right now we depend mostly on press to do our translations, but they aren’t typically trained as scientists. Does that make them more prone to being manipulated? I think it does.
  • How do we encourage more translational expertise to emerge from actual experts? Currently, in academia, the translation to the general public of one’s research is not at all encouraged or rewarded, and outside academia even less so.
  • Like Stephanie, I worry about hidden agendas and partial understandings, but I honestly think they are secondary to getting a robust system of translation started to begin with, which would hopefully in turn engage the general public with the scientific method and current scientific knowledge. In other words, the good outweighs the bad here.

Open data is not a panacea

I’ve talked a lot recently about how there’s an information war currently being waged on consumers by companies that troll the internet and collect personal data, search histories, and other “attributes” in data warehouses which then gets sold to the highest bidders.

It’s natural to want to balance out this information asymmetry somehow. One such approach is open data, defined in Wikipedia as the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyrightpatents or other mechanisms of control.

I’m going to need more than one blog post to think this through, but I wanted to make two points this morning.

The first is my issue with the phrase “freely available to everyone to use”. What does that mean? Having worked in futures trading, where we put trading machines and algorithms in close proximity with exchanges for large fees so we can get to the market data a few nanoseconds before anyone else, it’s clear to me that availability and access to data is an incredibly complicated issue.

And it’s not just about speed. You can have hugely important, rich, and large data sets sitting in a lump on a publicly available website like wikipedia, and if you don’t have fancy parsing tools and algorithms you’re not going to be able to make use of it.

When important data goes public, the edge goes to the most sophisticated data engineer, not the general public. The Goldman Sachs’s of the world will always know how to make use of “freely available to everyone” data before the average guy.

Which brings me to my second point about open data. It’s general wisdom that we should hope for the best but prepare for the worst. My feeling is that as we move towards open data we are doing plenty of the hoping part but not enough of the preparing part.

If there’s one thing I learned working in finance, it’s not to be naive about how information will be used. You’ve got to learn to think like an asshole to really see what to worry about. It’s a skill which I don’t regret having.

So, if you’re giving me information on where public schools need help, I’m going to imagine using that information to cut off credit for people who live nearby. If you tell me where environmental complaints are being served, I’m going to draw a map and see where they aren’t being served so I can take my questionable business practices there.

I’m not saying proponents of open data aren’t well-meaning, they often seem to be. And I’m not saying that the bad outweighs the good, because I’m not sure. But it’s something we should figure out how to measure, and in this information war it’s something we should keep a careful eye on.

Suggested New Year’s resolution: start a blog

I was thinking the other day how much I’ve gotten out of writing this blog. I’m incredibly grateful for it, and I want you to consider starting a blog too. Let’s go through the pros and cons:


  1. A blog forces you to articulate your thoughts rather than having vague feelings about issues.
  2. This means you get past things that are bothering you.
  3. You also get much more comfortable with writing, because you’re doing it rather than thinking about doing it.
  4. If your friends read your blog you get to hear what they think.
  5. If other people read your blog you get to hear what they think too. You learn a lot that way.
  6. Your previously vague feelings and half-baked ideas are not only formulated, but much better thought out than before, what with all the feedback. You’ll find yourself changing your mind or at least updating and modifying lots of opinions.
  7. You also get to make new friends through people who read your blog (this is my favorite part).
  8. Over time, instead of having random vague thoughts about things that bug you, you almost feel like you have a theory about the things that bug you (this could be a “con” if you start feeling all bent out of shape because the world is going to hell).


  1. People often think what you’re saying is dumb and they don’t resist telling you (you could think of this as a “pro” if you enjoy growing a thicker skin, which I do). 
  2. Once you say something dumb, it’s there for all time, in your handwriting, and you’ve gone on record saying dumb things (that’s okay too if you don’t mind being dumb).
  3. It takes a pretty serious commitment to write a blog, since you have to think of things to say that might interest people (thing you should never say on a blog: “Sorry it’s been so long since I wrote a post!”).
  4. Even when you’re right, and you’ve articulated something well, people can always dismiss what you’ve said by claiming it can’t be important since it’s just a blog.

Advice if you’ve decided to go ahead and start a blog

  1. Set aside time for your blog every day. My time is usually 6-7am, before the kids wake up.
  2. Keep notes for yourself on bloggy subjects. I write a one-line gmail to myself with the subject “blog ideas” and in the morning I search for that phrase and I’m presented with a bunch of cool ideas.
  3. For example I might write something like, “Can I pay people to not wear moustaches?” and I leave a link if appropriate.
  4. I try to switch up the subject of the blog so I don’t get bored. This may keep my readers from getting bored but don’t get too worried about them because it’s distracting.
  5. My imagined audience is almost always a friend who would forgive me if I messed something up. It’s a friendly conversation.
  6. Often I write about something I’ve found myself explaining or complaining about a bunch of times in the past few days.
  7. Anonymous negative comments happen, and are often written by jerks. Try to not take them personally.
  8. Try to accept criticism if it’s helpful and ignore it if it’s hurtful. And don’t hesitate to delete hurtful comments. If that jerk wants a platform, he or she can start his or her own goddamn blog.
  9. Never feel guilty towards your blog. It’s inanimate. If you start feeling guilty then think about how to make it more playful. Take a few days off and wait until you start missing your blog, which will happen, if you’re anything like me.
Categories: musing

Corporations don’t act like people

Corporations may be legally protected like people, but they don’t act selfishly like people do.

I’ve written about this before here, when I was excitedly reading Liquidated by Karen Ho, but recent overheard conversations have made me realize that there’s still a feeling out there that “the banks” must not have understood how flawed the models were because otherwise they would have avoided them out of a sense of self-preservation.

Important: “the banks” don’t think or do things, people inside the banks think and do things. In fact, the people inside the banks think about themselves and their own chances of getting big bonuses/ getting fired, and they don’t think about the bank’s future at all. The exception may be the very tip top brass of management, who may or may not care about the future of their institutions just as a legacy reputation issue. But in any case their nascent reputation fears, if they existed at all, did not seem to overwhelm their near-term desire for lots of money.

Example: I saw Robert Rubin on stage well before the major problems at Citi in a discussion about how badly the mortgage-backed securities market was apt to perform in the very near future. He did not seem to be too stupid to understand what the conversation was about, but that didn’t stop him from ignoring the problem at Citigroup whilst taking in $126 million dollars. The U.S. government, in the meantime, bailed out Citigroup to the tune of $45 billion with another guarantee of $300 billion.

Here’s a Bloomberg BusinessWeek article excerpt about how he saw his role:

Rubin has said that Citigroup’s losses were the result of a financial force majeure. “I don’t feel responsible, in light of the facts as I knew them in my role,” he told the New York Times in April 2008. “Clearly, there were things wrong. But I don’t know of anyone who foresaw a perfect storm, and that’s what we’ve had here.”

In March 2010, Rubin elaborated in testimony before the Financial Crisis Inquiry Commission. “In the world of trading, the world I have lived in my whole adult life, there is always a very important distinction between what you could have reasonably known in light of the facts at the time and what you know with the benefit of hindsight,” he said. Pressed by FCIC Executive Director Thomas Greene about warnings he had received regarding the risk in Citigroup’s mortgage portfolio, Rubin was opaque: “There is always a tendency to overstate—or over-extrapolate—what you should have extrapolated from or inferred from various events that have yielded warnings.”

Bottomline: there’s no such thing as a bank’s desire for self-preservation. Let’s stop thinking about things that way.

Categories: finance, rant

Consumer segmentation taken to the extreme

I’m up in Western Massachusetts with the family, hidden off in a hotel with a pool and a nearby yarn superstore. My blogging may be spotty for the next few days but rest assured I haven’t forgotten about mathbabe (or Aunt Pythia).

I have just enough time this morning to pose a thought experiment. It’s in three steps. First, read this Reuters article which ends with:

Imagine if Starbucks knew my order as I was pulling into the parking lot, and it was ready the second I walked in. Or better yet, if a barista could automatically run it out to my car the exact second I pulled up. I may not pay more for that everyday, but I sure as hell would if I were late to a meeting with a screaming baby in the car. A lot more. Imagine if my neighborhood restaurants knew my local, big-tipping self was the one who wanted a reservation at 8 pm, not just an anonymous user on OpenTable. They might find some room. And odds are, I’d tip much bigger to make sure I got the preferential treatment the next time. This is why Uber’s surge pricing is genius when it’s not gouging victims of a natural disaster. There are select times when I’ll pay double for a cab. Simply allowing me to do so makes everyone happy.

In a world where the computer knows where we are and who we are and can seamlessly charge us, the world might get more expensive. But it could also get a whole lot less annoying. ”This is what big data means to me,” Rosensweig says.

Second, think about just how not “everyone” is happy. It’s a pet peeve of mine that people who like their personal business plan consistently insist that everybody wins, when clearly there are often people (usually invisible) who are definitely losing. In this case the losers are people whose online personas don’t correlate (in a given model) with big tips. Should those people not be able to reserve a table at a restaurant now? How is that model going to work?

And now I’ve gotten into the third step. It used to be true that if you went to a restaurant enough, the chef and the waitstaff would get to know you and might even keep a table open for you. It was old-school personalization.

What if that really did start to happen at every restaurant and store automatically, based on your online persona? On the one hand, how weird would that be, and on the other hand how quickly would we all get used to it? And what would that mean for understanding each other’s perspectives?

Categories: data science, modeling, musing

Whom can you trust?

My friend Jordan has written a response to yesterday’s post about Nate Silver. He is a major fan of Silver and contends that I’m not fair to him:

I think Cathy’s distrust is warranted, but I think Silver shares it.  The central concern of his chapter on weather prediction is the vast difference in accuracy between federal hurricane forecasters, whose only job is to get the hurricane track right, and TV meteorologists, whose very different incentive structure leads them to get the weather wrong on purpose.  He’s just as hard on political pundits and their terrible, terrible predictions, which are designed to be interesting, not correct.

To this I’d say, Silver mocks TV meteorologists and political pundits in a dismissive way, as not being scientific enough. That’s not the same as taking them seriously and understanding their incentives, and it doesn’t translate to the much more complicated world of finance.

In any case, he could have understood incentives in every field except finance and I’d still be mad, because my direct experience with finance made me understand it, and the outsized effect it has on our economy makes it hugely important.

But Jordan brings up an important question about trust:

But what do you do with cases like finance, where the only people with deep domain knowledge are the ones whose incentive structure is socially suboptimal?  (Cathy would use saltier language here.)  I guess you have to count on mavericks like Cathy, who’ve developed the domain knowledge by working in the financial industry, but who are now separated from the incentives that bind the insiders.

But why do I trust what Cathy says about finance?

Because she’s an expert.

Is Cathy OK with this?

No, Cathy isn’t okay with this. The trust problem is huge, and I address it directly in my post:

This raises a larger question: how can the public possibly sort through all the noise that celebrity-minded data people like Nate Silver hand to them on a silver platter? Whose job is it to push back against rubbish disguised as authoritative scientific theory?

It’s not a new question, since PR men disguising themselves as scientists have been around for decades. But I’d argue it’s a question that is increasingly urgent considering how much of our lives are becoming modeled. It would be great if substantive data scientists had a way of getting together to defend the subject against sensationalist celebrity-fueled noise.

One hope I nurture is that, with the opening of the various data science institutes such as the one at Columbia which was a announced a few months ago, there will be a way to form exactly such a committee. Can we get a little peer review here, people?

I do think domain-expertise-based peer review will help, but not when the entire field is captured, like in some subfields of medical research and in some subfields of economics and finance (for a great example see Glen Hubbard get destroyed in Matt Taibbi’s recent blogpost for selling his economic research).

The truth is, some fields are so yucky that people who want to do serious research just leave because they are disgusted. Then the people who remain are the “experts”, and you can’t trust them.

The toughest part is that you don’t know which fields are like this until you try to work inside them.

Bottomline: I’m telling you not to trust Nate Silver, and I would also urge you not to trust any one person, including me. For that matter don’t necessarily trust crowds of people either. Instead, carry a healthy dose of skepticism and ask hard questions.

This is asking a lot, and will get harder as time goes on and as the world becomes more complicated. On the one hand, we need increased transparency for scientific claims like projects such as runmycode provide. On the other, we need to understand the incentive structure inside a field like finance to make sure it is aligned with its stated mission.

Categories: data science, finance, modeling