Stockholm Tutorial on Data Science

I’m super excited to be teaching a day-long technical tutorial in data science in Stockholm in one month. Stockholm is gorgeous and Sweden is an amazing country. Last time I was there with the entire family, the husband was giving talks the whole time and the little guy had an ear infection, so it was kind of a bust (although not entirely; I did become the bus queen of Stockholm). This time I’m going alone. Cheese fondue and meatballs will be eaten.

Here’s the flier for the event:


According to my calculation, 1000 SEK is equivalent to $120. That’s with coffee and lunch though, so I feel like as long as I explain k-nearest neighbors we’re good. Also, this is a draft of the flier. I told them to change the “prior knowledge” to be less focused on statistics. After all, data scientists are not all stats majors.

So far there have been around 20 people who have signed up, mostly affiliated with Statistics Sweden, the Swedish government agency responsible for producing official statistics regarding Sweden, established in 1749. This means I’ll be addressing the important question, what’s the difference between statistics and data science?

Well, it’s kind of hard to answer that question abstractly. I need to supply examples of realistic “found data” which we use in data science. So that’s my plan for the day, to create a few iPython notebooks with examples of the kind of data and algorithmic techniques that you’d typically find in nature. I think once these statisticians see those examples they will be comfortable knowing how much better off they are in Sweden measuring the inflation rate (currently at -0.2%) than we are trying to understand whether people like specific brand names by scouring Twitter.

However! I’m totally not above stealing other people’s examples to make my points, so if you know of a nice example or two that involves scraping (or API’ing), cleaning, and algorithmizing, and especially if it does all this in Python, then please make suggestions. Otherwise I’ll look up some topics in my book and try to do it myself.

[Update: Holy crap look at this repository of iPython notebooks which explain data science stuff! Amazing.]

I definitely want to spend at least some time showing the audience how much the answer can depend on seemingly benign choices of hyperparameters and so on. If I end up with good examples I’ll be sure to share them here.
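To make that concrete, here is a minimal sketch in plain Python of how a k-nearest neighbors answer can flip as k changes. It is not one of the tutorial notebooks; the toy points, the labels, and the `knn_predict` helper are all made up for illustration.

```python
from collections import Counter


def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training points by squared Euclidean distance to the query.
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Invented toy data: ((x, y), label).
train = [
    ((0.1, 0.0), "A"),
    ((0.5, 0.0), "B"),
    ((0.0, 0.6), "B"),
    ((1.0, 0.0), "A"),
    ((0.0, 1.1), "A"),
    ((2.0, 2.0), "B"),
    ((2.5, 2.0), "B"),
]
query = (0.0, 0.0)

# Same data, same query, different "benign" choice of k.
predictions = {k: knn_predict(train, query, k) for k in (1, 3, 5, 7)}
for k, label in predictions.items():
    print(f"k={k}: predicted {label}")
```

On this toy data the predicted label alternates as k grows, which is the kind of sensitivity worth showing an audience before anyone trusts a single run.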

Beyond my tutorial, I’m also giving a keynote talk at an associated conference taking place at the very nice Hotel Sign, which is also where they’re putting me up.

Categories: Uncategorized

Aunt Pythia’s advice

Readers, so glad to be back with you this week, and many apologies for missing last week, but I was arranging my yarn collection.

It's all on now. Username cathyoneil.

I’m back now, though, and reading interesting articles about the real life of a sex worker (not arousing, as it turns out) and recording my weekly Slate Money podcast (I’m particularly proud of this week’s episode on Disparate Impact).

Enjoy today’s column! And afterwards, please:

ask Aunt Pythia any question at all at the bottom of the page!

By the way, if you don’t know what the hell Aunt Pythia is talking about, go here for past advice columns and here for an explanation of the name Pythia.


Dear Aunt Pythia,

Must I go to my grandmother’s funeral? I do not really even like her.

Greek Girl

Dear GG,

Let’s think this through. Your grandmother is dead, so she won’t mind if you don’t come to her funeral. Really the only people who are going to be bothered are the people in your family. If they are going to the trouble of having a funeral at all, I’d guess they think people should come to it. So your primary consideration, to my mind, is how much you feel obligated to them (assuming you care what they think about you in the first place).

Next, I don’t think you need to actually like someone to go to their funeral, but at the same time, if someone was really cruel to you, it’s totally acceptable to skip it. From the tone of your letter I’m guessing she wasn’t really horrible, though. So that’s not an easy out.

Finally, it may be difficult to get to, expensive, or time consuming. And you may be a busy person who doesn’t have extra time and/or money. If true, send your regrets and tell your family how much you’re looking forward to seeing them soon, at a happier time.

If it’s nearby and convenient, and your family really cares that you’re there, I’d say you’re stuck.

Good luck!

Aunt Pythia


Dear Aunt Pythia,

In this age of hyper-macho global finance, how come individual stock markets such as the NYSE have ‘trading hours’ instead of just being open 24/7? Are there no computerized trading algorithms that are willing to sacrifice their family life to stay at work until 4 am on a Sunday?

Just Idly Musing

Dear JIM,

Great question. The technology is there, certainly. But why then isn’t the trading happening?

The answer is more or less, people don’t trade 24 hours a day because people aren’t already trading 24 hours a day. It takes a certain amount of liquidity for trading to be efficient, and without that you end up with large spreads between buy and sell and nobody wants to feel like they’re wasting money.

Of course, the algorithms could run all day and night, but at the end of the day people watch over those algorithms (really!) and they want to sleep. Plus, it’s actually true that most people sleep at basically the same time in the same time zone, and that people in the U.S. are more likely to care about U.S. stocks.

The flip side of that is that soon after the NYSE closes, the Asian market opens, then the European market. So it’s not like there’s a lot of downtime as it is.

Aunt Pythia


Aunt Pythia,

I don’t believe in imposter syndrome. It’s all the rage to tell us successful women how we have imposter syndrome and many successful women are saying this about themselves as if this is somehow rooted in their psyche.

I am a successful woman and I’ve discovered that what happens when you reach a certain level of success is a huge backlash. That is, I was permitted to be successful from my quiet little corner where people could just appreciate my work and grant me their benevolence. But when my success went too far, and I left that corner and stepped up as an equal to my former benefactors, I began to have everything I did questioned and lowered.

Now, some of my former benefactors, the ones who have truly stellar positions in society, they are still benefactors because I am still far beneath them. Thanks to these truly well located folks telling me my work is better than ever and they expect even more from me, I have had the confidence not to develop imposter syndrome.

If I was left with all the trashing my cohort has showered upon me since I joined them, I could well develop all the symptoms of that syndrome but not because I have a psych problem but out of mistreatment.

Shouldn’t there be a term for this? It’s not quite battered worker syndrome or battered employee syndrome, because I’m speaking of someone who is very successful. It’s not imposter syndrome because I don’t feel like an imposter. But it is something and it infuriates me and it is very, very common.

Fortunate Uber Cunt Kicked Effrontery Down


They say to “Lean In,” but I say, to what? To these douchebags? I’d rather not.

So yes, I think you’re right. When it’s called “Imposter Syndrome,” it’s often a way for people to dismiss us as inwardly insecure and, therefore, incompetent. It’s used as an excuse to explain the mysterious forces which keep us from succeeding further, in fact.

On the other hand, it sucks for everyone at a certain level, and you have to be just totally focused on success beyond anything else no matter what, whether you’re a man or a woman. So there’s that too. Said another way, if I were a man I still wouldn’t want to be in that rat race, personally.

My advice to you is, call it “being an Uber Cunt that nobody can handle” and refer to it – breezily and often – as a superpower.

Auntie P


Dear Aunt Pythia,

I have a question about the fair trade of blowjobs. (I also must acknowledge before I move on that I was never sure if just licking someone’s genitals w/o them getting off on it is considered a ‘blowjob’. I use it here more as a pre-sex tool than as a way to come.)

I enjoy giving them. I don’t see it as a chore, or something my partner needs to earn. (I’ve even given unsolicited blowjobs at first dates!) My latest partner of several years is more stingy about giving blowjobs though. He still makes the sex interesting with finger-play, etc, but I don’t know why he doesn’t constantly offer a blowjob into the sex like I do.

I tried bringing this up a few times, but he kinda avoided the subject with comments like “I am sorry, I know.” – I should also add “I am sorry” is his first response to anything.

But even without getting them, I like giving blowjobs. Though lately, I have been thinking if I should appropriate my blowjobs. Should blowjobs only be traded on a one-for-one basis so that one party doesn’t get exploited? Am I adding to the sexism in the world by giving non-deserving men blowjobs? Is this a bigger issue than I think it is?

What is your take on blowjobs?

Being Lewd Or Wicked Sexy?


Amazing question and sign-off. And I think the “unsolicited first-date blowjob” is a generous concept that will earn you quite a few fans among my readership. We are on your side!

I’d say a straight-up conversation with said partner is called for. Specifically, ask him what the conditions are that make him want to give you a blowjob, and how you can achieve them more often. Who knows, he might be squeamish about certain smells which you can solve with a quick shower. What a shame, after all, if that’s all it would take and you just don’t know. Communication, communication, communication.

Now, as part of that conversation, you should add that, because of the unequal blowjobbery in your relationship, you’ve found yourself thinking somewhat and surprisingly quid pro quo in the blowjob department. This will probably spur him to action, as the urgency of the situation will immediately be revealed. You don’t have to directly threaten him, mind you, just mention that the count is off, the blowjob equity is lacking, and you need some relief.

Or else, maybe you do need to threaten? I mean, try the talk first, but I do think reciprocity in bed is a basic requirement of a good relationship, and if he’s not up for it (as it were!), plenty of other men would be.

And: wicked sexy, not lewd.

Aunt Pythia


Dear Aunt Pythia,

I’m thinking about buying one of those test kits that takes a sample of your DNA and reports your ancestry. There are a few companies that sell kits:, 23andme, and National Geographic’s Genographic Project.

I’m wondering whether I should be concerned about my DNA data being misused in any way. Would you do it? Why or why not? More info here.  

If you did get yourself genetically tested, what percent Neanderthal would you wager you are?

DNA Data Skeptic

Dear Skeptic,

Not sure. I don’t think I’d be too worried about my DNA being used, but that’s likely because I’m not financially insecure, I’m a US citizen, and I have health insurance. I think other people might be more worried. And even if the company I gave my DNA to doesn’t sell it or something, there’s always the chance they’d get hacked. So I’d go in thinking that my DNA would in fact be public knowledge.

On the other hand, I’m also not particularly interested in my heritage, so the very small interest would not overwhelm the small risk, and I’d end up not doing it.

Here’s the question I was hoping you’d ask: would I send away my DNA to get it tested for possible hereditary diseases? And the answer there is a firm no, because as I learned reading this article, the results on those kinds of tests are terribly inaccurate and vary wildly depending on the company’s methods. This is not yet science. And I’m not sure if the ancestry thing is better or worse.

Come to think of it, I might suggest you do it just to see how the answers vary depending on the company.

Aunt Pythia


Readers? Aunt Pythia loves you so much. She wants to hear from you – she needs to hear from you – and then tell you what for in a most indulgent way. Will you help her do that?

Please, pleeeeease ask her a question. She will take it seriously and answer it if she can.

Click here for a form for later or just do it now:

The tricky thing about disparate impact

Today I’m fascinated by the story described in this three-part American Banker series on the Consumer Financial Protection Bureau’s (CFPB’s) use of disparate impact, written by Rachel Witkowski. Disparate impact, according to the article, is a legal theory that says lenders can be penalized if they have a neutral policy that creates an adverse impact against a protected class of borrowers, regardless of intent.

Witkowski reports on the CFPB trying to understand and punish auto lenders for their process for figuring out fees and interest rates on auto loans. In general, the auto dealers, who work in partnerships with auto lenders, have discretion to add on some interest rate and pocket the difference. They seem to be pocketing fatter differences for certain populations, specifically black car buyers.

The problem is, it’s hard to measure exactly how much fatter and who is getting screwed, by how much. And in the world of law and punishment, it’s not enough to prove that there’s been a disparate impact – you have to actually make restitution to the victims. So for example, the CFPB is in discussions with Ally Financial over exactly this problem, and the question is how much money do they give to which borrowers as a refund.

The first reason this is hard to get right is that auto dealers and lenders don’t actually collect race information, in contrast to mortgage lending, where it’s a requirement of the lending process, specifically to ward against redlining. So the CFPB, in its investigation, has to rely on proxy data like zip codes and names to guess the race of a given borrower. In fact their methodology is described in this white paper, but unsurprisingly the auto lenders under scrutiny complain it is not sufficiently transparent.
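Here is a rough sketch of how two proxies can be combined into one guess. This is not taken from the white paper; it is a naive Bayes combination that assumes surname and zip code are independent once race is known, and every number in it is invented for illustration. The function name `combine_proxies` is hypothetical.

```python
def combine_proxies(prior, given_surname, given_zip):
    """Naive Bayes combination of two proxies.

    P(race | surname, zip) is proportional to
    P(race | surname) * P(race | zip) / P(race),
    assuming surname and zip are independent once race is known.
    """
    unnormalized = {
        race: given_surname[race] * given_zip[race] / prior[race]
        for race in prior
    }
    total = sum(unnormalized.values())
    return {race: weight / total for race, weight in unnormalized.items()}


# All numbers below are invented for illustration, not census figures.
prior = {"black": 0.13, "white": 0.62, "other": 0.25}
given_surname = {"black": 0.30, "white": 0.60, "other": 0.10}
given_zip = {"black": 0.50, "white": 0.30, "other": 0.20}

posterior = combine_proxies(prior, given_surname, given_zip)
for race, p in posterior.items():
    print(f"{race}: {p:.3f}")
```

Note that the output is a probability per borrower, never a certainty, which is exactly why the refund question gets messy.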

What that translates into is the possibility that some white car buyers will get refunded accidentally and some black car buyers won’t, even if there were shenanigans going on with their car loan. From my perspective as a data person, this tells me that, as long as we have problems like this, we should probably require race to be recorded in a car loan.

That’s not the only problem, though. The thing about these modern cases of measuring disparate impact is that it’s a model, and models are extremely squishy things. Two people asked to build a disparate impact model on the same data will likely come up with different answers, because all sorts of decisions have to be made on the way. From the article:

Each financial regulator has its own method for determining disparities and harm in fair-lending cases, and each of those cases can differ depending on the business model of the bank and what variables the regulators will consider. The Federal Reserve, for instance, generally adds controls, such as geography, to the statistical model if the bank’s business model indicates that certain pricing criteria can influence the price or markup, according to a 2013 Fed presentation.
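To see how much a single modeling decision can matter, here is a toy sketch with invented loans where the measured gap in dealer markup shrinks once you control for region. The data, the group labels, and the `gap` helper are all hypothetical, and no regulator's actual model is this simple.

```python
from statistics import mean

# Invented loans: (group, region, dealer markup in percentage points).
loans = [
    ("B", "X", 3.0), ("B", "X", 3.0), ("B", "X", 3.0), ("A", "X", 2.5),
    ("B", "Y", 1.5), ("A", "Y", 1.0), ("A", "Y", 1.0), ("A", "Y", 1.0),
]


def gap(rows):
    """Mean markup of group B minus mean markup of group A."""
    b = [markup for group, _, markup in rows if group == "B"]
    a = [markup for group, _, markup in rows if group == "A"]
    return mean(b) - mean(a)


# Model 1: no controls, just compare the two groups overall.
raw_gap = gap(loans)

# Model 2: control for region by averaging the within-region gaps.
regions = sorted({region for _, region, _ in loans})
controlled_gap = mean([gap([row for row in loans if row[1] == r]) for r in regions])

print(f"raw gap: {raw_gap:.2f}, gap controlling for region: {controlled_gap:.2f}")
```

Both versions find a disparity on this toy data, but they disagree on its size, and the size is what determines the settlement.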

Given this uncertainty, plus the uncertainty of the race of the borrowers, you end up firmly in a land of statistics, where each borrower is assigned a probability of being minority and a probability of having been screwed. Then the question becomes, do we err on the side of under- or over-refunding these borrowers? The lenders, who are paying for this all, tend to lean on the side of not giving any money away at all unless we’re sure.
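A small sketch of that trade-off, under loud assumptions: each borrower gets a made-up probability of being minority, a made-up probability of having been overcharged, and a made-up overcharge amount, and the two probabilities are treated as independent. The policy names are mine, not the CFPB's.

```python
# Invented borrowers: (id, P(minority), P(overcharged), overcharge in dollars).
borrowers = [
    ("b1", 0.90, 0.80, 500.0),
    ("b2", 0.60, 0.50, 300.0),
    ("b3", 0.20, 0.90, 400.0),
    ("b4", 0.95, 0.95, 200.0),
]


def p_owed(p_minority, p_overcharged):
    """Probability a refund is owed, assuming the two events are independent."""
    return p_minority * p_overcharged


def expected_value_payout(rows):
    """Pay everyone their overcharge scaled by the probability it is owed."""
    return sum(p_owed(pm, po) * amount for _, pm, po, amount in rows)


def threshold_payout(rows, cutoff):
    """Pay the full overcharge, but only to borrowers above the cutoff."""
    return sum(amount for _, pm, po, amount in rows if p_owed(pm, po) >= cutoff)


print("expected-value policy:", expected_value_payout(borrowers))
for cutoff in (0.25, 0.5, 0.9):
    print(f"threshold {cutoff}:", threshold_payout(borrowers, cutoff))
```

A strict cutoff ("only pay when we're sure") shrinks the total bill and leaves likely victims unpaid, while a loose one refunds people who probably weren't harmed. Where the cutoff sits is a policy choice, not a statistical one.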

In this particular story, specifically in part 3, there’s even an expert consultant named Dr. Bernard Siskin who happens to work for both sides – the banks and the CFPB. The excuse for that questionable arrangement is that there aren’t enough statisticians who can do this work (my hand is raised!), but the end result is that Siskin seems to help the banks complain about exactly this issue: which version of the disparate impact model is to be used, and what kind of attributes will be controlled for, so that they can each get the least expensive settlement.

Here’s my theory. This is a big new field in statistics and data science, and this is just the tip of the iceberg. We will be seeing a large amount of work being done and tools being made which aim to measure and audit processes and algorithms, whether they are auto loans that discriminate against minority borrowers or car computers that bypass emissions tests. And we will have to develop standards by which we measure a company’s work. The standards won’t be perfect, mind you, and people will end up getting away with certain things, but at least we won’t have the gaming that’s obviously going on now, because there will be a set way, hopefully reasonably thought out, to measure discrimination, or lying, or cheating, or what have you.

That’s the field I want to go into. Building models that call bullshit on other models.

Strata and swag

Yesterday I gave a 5-minute lightning talk at a corporate big data conference here in New York called Strata+Hadoop World, put on by O’Reilly and Cloudera.

My talk was part of a session run by DataKind, aimed at talking about the ethics of algorithms. My 5 minutes were taken up discussing 5 ideas:

  • In order to do good with data, first you have to not do bad. Data scientists aren’t trained to think through the ethics and social impact of their work, so this is non-trivial.
  • We haven’t actually figured out the difference between correlation and causation. That means, in the context of social algorithms, that we blame the victim constantly. Think about the HR algorithm that decides never to hire another woman engineer because it notices how badly women engineers fare in the workplace.
  • Or, we could take the example of the justice system, where we use recidivism algorithms to figure out that poor black people are more likely to be arrested, and we decide to punish them even more as a result, instead of asking why the justice system isn’t serving to help them as much as it helps white or rich people.
  • Or, we could take the example of teacher assessment, where we blame teachers for student test scores, even though they have little power over them.
  • Conclusion: data scientists are de facto policy makers. We shouldn’t be.

So, the talk I gave was sparsely attended, with maybe 40 people in the room (which is actually more than we expected). I was happy to see those people, and many of them were earnest and thoughtful, to be sure. Danah Boyd spoke in the second session, as usual very eloquently, and I felt like there were far too few people in the room compared to who might benefit from hearing her.

But let’s face it, Strata is a celebration of big data in the corporate setting, and few people there were spending too much time fretting about ethics. It was dominated by its expo room, where dozens of data science platforms extending the hype of the power of big data were set to sell you magical thinking. There were also a few groups doing good stuff, to be sure, but the overall feel was similar to how it felt back in 2011, except bigger.

Not to be cynical! There’s plenty of other stuff going on that wasn’t in 2011, so really it’s fine. And plus, I did manage to meet up with some colorful ladies:

Picture taken by my buddy Debbie Berebichez

and I picked up an enormous amount of Strata swag (more here) because teenage sons:

This one is the cutest. Most of the other t-shirts I got had silly puns.

If I had stayed longer I could have gotten plenty of free beer and food, not to mention more pens than I could ever use. There were even lego data science characters, but to get those I had to stay to listen to the pitch, which was a dealbreaker for me.

Conclusion: Strata fills a niche not unlike the New York Coffee Festival. Almost completely frivolous but fun for the participants, as long as you don’t get caffeine poisoning.

Guest Post: how to be a data scientist at a non-profit

September 30, 2015

This is a guest post by John Santerre, a 5th year Ph.D. student at the University of Chicago in the Computer Science Department. Previously a photojournalist, John has worked with nonprofits and NGOs off and on for the last ten years. This summer he served as a Research Programmer at The Eric and Wendy Schmidt Data Science for Social Good Fellowship. His master’s work, under Prof. Lek-heng Lim, involved the use of Hodge Decomposition for rank disambiguation while his Ph.D. work is at Argonne National Laboratory and involves scalable Machine Learning techniques for use on Cancer and Anti-Microbial Resistance (AMR).

The recent mathbabe post What can a non-academic mathematician do that makes the world a better place struck a chord with me. Over the last ten years I’ve worked as a photojournalist on and off with nonprofits photographing everything from Sigourney Weaver and Anna Wintour to documenting drug dealers in Puerto Rico, rebel fighters in Burma/Myanmar, and the UN Peacekeeping effort in Haiti. I was so involved with nonprofit work, I founded my own, just to provide photography services to other nonprofits. Most recently I spent the summer as research programmer for the Eric and Wendy Schmidt Data Science for Social Good Fellowship. I thought my experience might offer a small amount of insight, at least for those who are truly new to working with nonprofits or “social good” organizations.

The most rewarding and challenging aspect of working at a nonprofit is the responsibility you bear to educate the organization about the limitations and potential you present. This can’t be overstated. The organization you choose to work with will have any number of teachers, secretaries, drivers, programmers, and support staff all with clearly delineated job descriptions. As the outsider who, as in T.S. Eliot’s poem, has “Come … to tell you all, I shall tell you all,” you will almost certainly be alone in your role. In fact, that is the explicit reason we seek out such opportunities: working at the “tip of the spear” presents an opportunity for our skills to be uniquely impactful. Offering insights that the organization wouldn’t have access to, or perhaps cannot afford, can be vastly fulfilling.

However, this brings with it an inherent Faustian bargain. Just as the violinist Joshua Bell was ignored in the DC Metro but adored in the philharmonic [1], so too will many of your finely tuned skills fall on comparatively (computationally?) deaf ears. In my experience, you have/get to “check” your craft at the door. In my role as a photographer this often meant my most useful skill was taking comparatively simple photographs [2]. Now it often means my technical contribution to nonprofits is less than my potential. In fact, the clients are more than happy to explicitly express that. Often they are looking for a “sanity check” and understand that I am overqualified for their problem. In fact they often seek out and will only work with someone who they perceive is overqualified. I personally don’t mind this. In the right organization you can build on your own personal skill sets. In fact, I’ve never been challenged in the ways I had expected. Even so, there are a host of fairly common challenges and insights that are orthogonal to my craft that have kept appearing over the last ten years which I’ll share.

1. You are alone.

You will likely be a solo consultant who has no one to brainstorm with, no one to advocate for an agenda with, and no one to share the burden of the tumultuous experience of making sense of a new work environment. To top all of that off, you are often working on a necessarily compressed schedule. Ryan Kappadal, a statistics professor at the Air Force Institute of Technology (10+ years of experience in data science) told me this summer that the Air Force integrates its data science teams into other units in pairs. It was instantly obvious to me how much more impressive it is for an organization to watch and interact with two professionals debating approaches and building a strategy out loud rather than listening to a pitch from a single perspective.

2. You will likely have different metrics of success.

Currently I work on classifying Anti-Microbial Resistance (AMR) at Argonne National Lab, a topic so important that the UN, POTUS, and WHO have all identified it as a top threat facing global humanity. When I explained my work to a ‘data scientist’ at a start-up I noted a 95% accuracy of one classifier. They responded slightly dismissively, “Is 95% high enough?” Sure, it’s “just” the k-means classification accuracy on the MNIST data set, but it’s also high enough to consistently identify (despite low sample size) the gene regions that confer resistance to a particular antimicrobial. This provides evidence of the likely mechanism – i.e., cell wall transport – that has mutated, thereby implying possible counter strategies for the biologists.

Perhaps more humorously, I traveled to Cambodia to work with a nonprofit at “Smoke Mountain”, the continually burning garbage dump in Phnom Penh. The most useful service I provided for them was photographing their Christmas card. I climbed atop a nearby building and photographed the children spelling out “Thank You!” in human gymnastic positions. Not exactly what I was expecting after traveling 1/2 way around the world!

I cannot stress this topic enough, so I will harp on one more little point. In my mind, the inculcation required to become a specialist in our craft can blind us to the impact our client requires. In NGO work the objective is to build a shared skill set between you and the organization, rather than develop new insights into the problem. Photographers are constantly looking for new ways to restructure the frame. Similarly, machine learning people are constantly trying to find new ways to approach the problems. But working with organizations requires a different metric. I have to judge my contribution not by the number of trailing digits of prediction accuracy, but by the impact I have on how biologists approach the problem of AMR. I love the craft of both ML and photography, so stepping out of the role where the craft is the most important thing is always hard, but when appropriate it can be vastly rewarding.

3. Different organizations have surprisingly similar needs.

This summer at the DSSG, my role, along with another programmer, was to build a “best practices” pipeline across the DSSG fellows and individual teams [3]. Freed from providing results to a client we could write maintainable clean code, while simultaneously “looking over their shoulders” for similarities between workflows. While each group was different, there was surprising consistency across groups, especially in terms of client interaction. That is another way of saying a sampling of 3-5 such organizations will give you a good sense of what this work is actually like.

4. Social good doesn’t require a 501c3 status.  

It can be more rewarding and impactful to provide sophisticated technical services to a for-profit start-up preventing relapse in drug addiction (i.e. than providing rudimentary analysis consulting with a nonprofit following bird migrations [4]. It can be more fulfilling working for a growing for-profit but non-partisan organization like targeting voter engagement, than it is to work for an issue-based partisan nonprofit. Or maybe it’s not for you. We are fortunate enough to work in a time where both types of organizations require our services.

In summary, with such tremendous need for the intersection of statistics and computer science, I find I am overwhelmed with options, but only if I am flexible in what role I expect to provide. Having “a voice” as a photographer or a focused specialty as an academic are hallmarks of advanced practitioners for good reason. These are the contributions that move the field forward. Conversely, serving as an advocate is a generalist position. Recognizing that helps me to find unique ways to ensure both my professional progress and that the organization’s needs are met.

1.  While people walked past him IRL, he did manage to get 160k YouTube views of his being ignored, however!

2.  A.K. Kimoto traveled through northern Afghanistan as photographer for UNICEF taking simple portraits that were very much needed.  Later he used the connections he gained to return and photograph this work, a far more subtle and evocative collection of imagery that was very close to his heart.  

3. It’s a (not quite alpha stage) Python grid search library across models and parameters. We named it Diogenes and gave it the tongue-in-cheek slogan “Searching for an honest classifier”.

4.  Although, I used to watch the fall migrations mountain-side, and .csv’s of the data might make for an interesting weekend!
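For readers who haven't met the idea in footnote 3, here is a generic sketch of a grid search across models and parameters. It does not use or imitate the Diogenes API; the two toy model families, the `search_space` layout, and the data are all invented for illustration.

```python
from itertools import product


def threshold_model(cutoff):
    """Toy model family 1: predict 1 when x is at or above a cutoff."""
    return lambda x: int(x >= cutoff)


def band_model(low, high):
    """Toy model family 2: predict 1 when x falls inside [low, high]."""
    return lambda x: int(low <= x <= high)


# Each model family maps to the parameter values we want to try.
search_space = {
    threshold_model: {"cutoff": [1.0, 2.0, 3.0]},
    band_model: {"low": [0.0, 1.5], "high": [2.5, 4.0]},
}

# Invented data: (feature, label).
data = [(0.5, 0), (1.0, 0), (1.8, 1), (2.2, 1), (3.5, 0), (4.0, 0)]


def accuracy(model, rows):
    return sum(model(x) == y for x, y in rows) / len(rows)


def grid_search(space, rows):
    """Score every (model family, parameter combination) pair."""
    results = []
    for factory, grid in space.items():
        names = sorted(grid)
        for values in product(*(grid[name] for name in names)):
            params = dict(zip(names, values))
            score = accuracy(factory(**params), rows)
            results.append((score, factory.__name__, params))
    best = max(results, key=lambda result: result[0])
    return best, results


best, results = grid_search(search_space, data)
print("tried", len(results), "combinations; best:", best)
```

For brevity this scores on the same data it searches over; a real search would score on held-out data, or the "honest classifier" it finds won't be very honest.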

Yarn Confessions

September 29, 2015

Readers, you might have missed me for the past few days. I know I’ve missed you (and so has Aunt Pythia).

Well, full disclosure on what’s been happening is in order: I’ve been organizing my yarn collection.

Yes, it’s true, I have a deeply alarming amount of yarn, which has hitherto been gathered in smallish bags tucked all over the house, in every nook, cranny, and corner.

Well, with the help of my good friend Elena, who is starting a side business to help people organize their homes (ask me for an amazing reference!), I have officially tamed the yarn beast.

Just to give you a sense of the sprawl, here are just the “odds and ends” yarns in the blue or purple spectrum:


Pretty much every ball here represents a project I have long finished. I always buy a bit too much yarn and then keep the extra. Yes, that’s my foot and boob shelf.

Of course, I have odds and ends in other colors too:

For whatever reason I keep buying lots of green yarn. I have like 5 green sweaters in my closet.

Of course, not all my yarn is in the “odds and ends” category. I have whole bags of unopened yarn I got on sale during one of my winter excursions to Webs, the biggest and best yarn store in the world (you can tell it’s a big deal because it owns the “” url). Here’s an example:

I also love orange, as a color, but nothing orange ever looks good on me.

Anyhoo, the entire collection is here, feel free to take a look. I am beyond shame and embarrassment at this point, because at some point, when all my yarn was splayed across my entire living room and dining room, I realized how amazingly beautiful it all is and how much I’ve gotten out of my hobby over the years.

Also, I think I might actually have more yarn than the average yarn shop, so there’s that option as well, if I’m ever really broke. Plus, now that my yarn is so nicely organized and tidy, I’d even be able to show it to people.

So there you have it, yarn confessions. Life is too short to be ashamed of your passion.

Obsessed with VW

September 25, 2015

So I’m kind of obsessed with the VW story. Specifically, I want to know what happened back in 2009 when they started cheating. What was that conversation like? And how many people were privy to the deception? And how did they think it was going to go undetected?

In case you haven’t read all available articles on this like I have, the Vox article is really informative. Here are the key facts:

  1. Diesel cars are better at gas mileage, worse at polluting out nitrogen oxide (NOx). We care more about NOx in the US than they do in Europe, which is why there’s so much more diesel in Europe.
  2. But recently we’ve started caring about gas mileage, so there’s been a spot for diesel cars that can pass the NOx emissions test, which most diesel cars cannot (at least for a given price).
  3. VW blew everyone away with a diesel car that seemed to have good gas mileage and good emissions test results.
  4. They were discovered by an independent group, the International Council on Clean Transportation, which hired people to stick a probe up the tailpipe of some VW cars and drive them from Seattle to San Diego. The NOx levels turned out to be up to 35 times higher than was allowed. And that group wanted to know how VW did it so they could copy them.
  5. Basically they were discovered because their results were “too good to be true.”

So, back to my question. How did the decision get made, that they’d just cheat? Didn’t they know they’d eventually get discovered? It’s kind of like when teachers and principals change the results on students’ tests so they can get a bonus: short term thinking, and kind of obvious if you track erasure marks.

Or… another way of looking at this was that they really didn’t think they’d get caught. The evidence they’d need for that theory is that they cheated all the time like this and had never gotten caught, or they knew others did.

Another possibility: it was a small group of engineers who did this, looking for a large bonus. This kind of thing happens all the time in finance, where you cook the books in a short term way to get a pay-out. Could this be true? Certainly many of the raw ingredients were already available – surely the software already existed for the engineers to test performance and emissions under all types of conditions, so putting it together with a simple “if” statement wouldn’t be too hard. But that raises the question of how they’d explain it to their boss.
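To show how little code that “if” statement would take, and why a road test exposes it, here is a purely hypothetical toy. It has nothing to do with VW’s actual software; the limit, the emission numbers, and the function names are all invented.

```python
NOX_LIMIT = 1.0  # hypothetical regulatory limit, arbitrary units


def toy_emissions(on_test_bench: bool) -> float:
    """Toy controller that behaves differently when it thinks it is being tested."""
    if on_test_bench:
        return 0.5   # emissions controls fully on (made-up number)
    return 10.0      # controls relaxed on the road (made-up number)


def audit(measure, limit=NOX_LIMIT):
    """Compare a lab measurement with an on-road measurement of the same car."""
    lab = measure(True)
    road = measure(False)
    return {
        "lab_passes": lab <= limit,
        "road_passes": road <= limit,
        "road_to_lab_ratio": road / lab,
    }


report = audit(toy_emissions)
print(report)
```

The lab result alone looks great. It only falls apart when someone measures the same car outside the conditions the software was written to recognize, which is what the probe-up-the-tailpipe road trip did.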

In any case, there’s an internal VW story – or perhaps industry-wide story – here that I’d love to hear.
