
Aunt Pythia’s advice, hungover edition

You might notice that Aunt Pythia’s advice is getting posted later than usual. That’s because Aunt Pythia is a wee bit slow on the uptake this morning due to a mighty exciting and exhausting week followed by celebrations of said week. Please bear with her as she gives groggy, possibly irrelevant suggestions to your lovely, deep, and heartfelt questions.

And please, after reading her worse-than-usual advice this morning/afternoon, think of something to ask Aunt Pythia at the bottom of the page!

By the way, if you don’t know what the hell Aunt Pythia is talking about, go here for past advice columns and here for an explanation of the name Pythia.


Dear Aunt Pythia,

I seriously consider the “Ask Aunt Pythia” series as the greatest and bloggiest thing on the blogging planet (granted, I explored only a part of it, and this is only an individual opinion).

Is this the right place to say it?

Mount Trouillet With Love

Dear MTWL,

Why yes, yes it is. Thank you darling.

Aunt Pythia


Dear Aunt Pythia,

As a grad student, I feel guilty constantly. Guilty that I am probably not spending enough time on my research, guilty that I don’t spend enough time on teaching, guilty that I sleep too much… You get the idea.

To have a successful academic career, how much should one be working, assuming average intelligence? Also, how should one avoid feeling guilty all the time?

A Grad Student Who Loves To Sleep

Dear AGSWLTS,


Sleep sounds like a gooooooood idea right about now, I think I will.

One of the things I don’t miss about being an academic is the constant guilt I imposed upon myself. It was all me, and I can’t blame anyone else. I can blame nothing except possibly the intense and competitive environment, which again, I chose to live in.

It was, I guess, the internal drive to write papers and stay abreast of my field, and without it I might never have done those things, but it sucked. I don’t even think I could summon up guilt feelings like that if I tried nowadays. Instead I do things out of sheer excitement about the ideas. I guess sometimes I feel frustrated that I haven’t had time to do the stuff I want to, but that frustration is definitely preferable to the old guilt. And come to think of it, a much more efficient way to work too.

My advice to you is to give yourself one day a week to do stuff that you just totally love, and banish guilt from your life. You might end up getting more done that way, and then you could expand it to two days a week, who knows. Tell me how that works for you!

Auntie P

p.s. Please work on your sign-offs. “AGSWLTS” means nothing to me.

p.p.s. Never skimp on sleep. Skimp on reading Aunt Pythia, but never skimp on sleep.


Dear Aunt Pythia,

I have lived in a different country in each decade of my life and currently use three different languages on an every day basis. No language do I master well, especially in speaking and listening. The doctor says that I am healthy, and I try to study and practice as much as possible. But, I have communication difficulties in any language. Should a more drastic action be taken? For example, find a job that requires more oral communication. Or, move back to my mother tongue country and try to reactivate my native language ability? 


Smurf, or Schtroumpf

Dear Smurf/Schtroumpf,

I just wanna start this out by saying how very much I enjoyed the smurfs as a child. It was weird, the show was never very good but I always ascribed to those little blue creatures much more interesting lives than they seemed to have. At the end of each episode I remember thinking, “and now they’ll go back to even more interesting things they do in their village in the woods with mushroom houses.”

I think that was their magic, in fact, to seem more interesting than they are. Smallish confession for Aunt Pythia readers: I have been doing my best to summon up a similar more-interesting-than-she-seems cachet pretty much all my life. That’s right, everything I’ve ever done or ever will do goes back to my fascination with the smurfs, and especially papa smurf, who always seemed wiser than even Alan Greenspan back in the day (“NOT LONG NOW!”).

As for your question, I’m of the opinion that people get good at what they focus on and what they are patient for. If you really want to focus on getting good at a given language, then you’ll need to stop moving countries and just forgive yourself for not already knowing stuff you don’t know, it will come with time. My husband, who is not particularly good with languages, has gotten really good at English since I met him 20 years ago.

Stay blue!

Aunt Pythia


Dear Aunt Pythia,

Your thoughts on the mathematical community being possibly less empathetic than average really hit home for me, because my experiences of being trans and attempting to do math have been really pretty miserable.

So with that said, let’s confront some cissexism:

Plenty of human females have penises. Trans women are female.
Plenty of human males have vaginas. Trans men are male.
(and of course such porn exists)

Talking about sexism in science is interesting. But we can (and should!) do it without erasing the experiences and existence of trans people, whose gender and sex are valid and real.

Further reading here.


Cisnormativity Is Silly

Dear CIS,

Thanks for the corrections, CIS! You are totally correct that I ignored trans women in my recent piece about female penes.

And although I strive to be empathetic, ignoring someone is a common way to be the opposite. And so I apologize, and I’ll try to be more thoughtful in the future. Thanks for writing!

Aunt Pythia


Please submit your well-specified, fun-loving, cleverly-abbreviated question to Aunt Pythia!

Categories: Aunt Pythia

I am boycotting Amazon

I have been doing some reading about the Amazon/Hachette battle and I have come to the conclusion that Amazon has become a huge bully. I also wasn’t impressed by how they treat employees, how they monitor and surveil them, and a host of other problems. For that reason I’m boycotting Amazon for my shopping as well as my blogging habits, so no more direct links.

Update: I’m actually still going to use their EC2 services as part of the Lede Program. Not sure how to avoid that actually, and I’d welcome suggestions.

Categories: rant

How Not To Be Wrong by Jordan Ellenberg

You guys are in for a treat. In fact I’m jealous of you.

I had a little secret about my survival in grad school, and that secret has a name, and that name is Jordan Ellenberg. We used to meet every Tuesday and Thursday to study schemes at the CallaLily Cafe a few blocks from the Science Center on Kirkland Street, and even though that sounds kind of dull, it was a blast. It was what kept me sane at Harvard.

You see, Jordan has an infectious positivity about him, which balances my rather intense suspicions, and moreover he’s hilariously funny. He’s really somewhere between a mathematician and a stand-up comedian, and to be honest I don’t know which one he’s better at, although he is a deeply talented mathematician.

The reason I’m telling you this is that he’s written a book, called How Not To Be Wrong, and available for purchase starting today, which is a delight to read and which will make you understand why I survived graduate school. In fact nobody will ever let me complain again once they’ve read this book, because it reads just like Jordan talks. In reading it, I felt like I was right back at CallaLily, singing Prince’s “Sexy MF” and watching Jordan flirt with the cashier lady again. Aaaah memories.

So what’s in the book? Well, he talks a lot about math, and about mathematicians, and the lottery, and in fact he has this long riff which starts out with lottery math, then goes to error-correcting codes and then to made-up languages and then to sphere packing and then arrives again at lotteries. And it’s brilliant and true and beautiful and also funny.

I have a theory about this book that you could essentially open it up to any page and begin to enjoy it, since it is thoroughly enjoyable and the math is cumulative but everywhere so well explained that it wouldn’t take long to follow along, and pretty soon you’d be giggling along with Jordan at every ridiculous footnote he’s inserted into his narrative.

In other words, every page is a standalone positive and ontological examination of the beauty and surprise of mathematical discovery. And so, if you are someone who shares with Jordan a love for mathematics, you will have a consistently great time with this book. In fact I’m imagining that you have an uncle or a mom who loves math or science, in which case this would be a seriously perfect gift to them, but of course you could also give that gift to yourself. I mean, this is a guy who can make Nazi jokes funny, and he does.

Having said that, the magic of the book is that it’s not just a collection of wonderful mathy tidbits. Jordan also has a point about the act of scrutinizing something in a logical and mathematical fashion. That act itself is courageous and should be appreciated, and he explains why, and he tells us how much we’ve already benefited from people in the past who have had the bravery to do so. He appreciates them and we should too.

And yet, he also sends the important message that it’s not an elitist crew of the usual genius suspects, that in fact we can all do this in our own capacity. It’s a great message and, if it ends up allowing people to re-examine their need for certainty in an uncertain world, then Jordan will really end up doing good. Fingers crossed.

That’s not to say it’s a perfect book, and I wanted to argue with points on basically every other page, but mostly in a good, friendly, over-drinks kind of way, which is provocative but not annoying. One exception I might make came on page 256: no, Jordan, municipal bonds do not always get paid back, and no, stocks do not always go up, not even in expectation. In fact, the extent to which both of those statements seem true to many people is the result of many cynical political acts, and it is damaging, mostly to people like retired civil servants. Don’t go there!

Another quibble: Jordan talks about how public policy makers make proclamations in the face of uncertainty, and he has a lot of sympathy and seems to think they should keep doing this. I’m on the other side on this one. Telling people to avoid certain foods and then changing stances seems more damaging than helpful and it happens constantly. And it’s often tied to industry and money, which also doesn’t impress.

Even so, even when I strongly disagree with Jordan, I always want to have the conversation. He forces that on the reader because he’s so darn positive and open-minded.

A few more goodies that I wanted to adore without giving too much away. Jordan does a great job with something he calls “The Great Square of Men” and Berkson’s Fallacy: it will explain to many many women why they are not finding the man they’re looking for. He also throws out a bone to nerds like me when he almost proves that every pig is yellow, and he absolutely kills it, stand-up comedian style, when comparing Ross Perot to a small dark pile of oats. Holy crap he was on a roll there.

So here’s one thing I’ve started doing since reading the book. When I give my 5-year-old son his dessert, it’s in the form of Hershey Drops, which are kind of like fat M&M’s. I give him 15 and I ask him to count them to make sure I got it right. Sometimes I give him 14 to make sure he’s paying attention. But that’s not the new part. The new part is something I stole from Jordan’s book.

The new part is that some days I ask him, “do you want me to give you 3 rows of 5 drops?” And I wait for him to figure out that’s enough and say “yes!” And the other days I ask him “do you want me to give you 5 rows of 3 drops?” and I again wait. And in either case I put the drops out in a rectangle.

And last night, for the first time, he explained to me in a slightly patronizing voice that it doesn’t matter which way I do it because it ends up being the same, because of the rectangle formation and how you look at it. And just to check I asked him which would be more, 10 rows of 7 drops or 7 rows of 10 drops, and he told me, “duh, it would be the same because it couldn’t be any different.”

And that, my friends, is how not to be wrong.
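For readers who like to see it in code, the rectangle argument can be sketched in a few lines of Python. This is only an illustration, and the helper names (`lay_out`, `turn`, `count`) are made up here rather than taken from Jordan’s book: lay the drops out in rows, look at the same rectangle from the side, and count again.

```python
def lay_out(rows, cols):
    """Lay out drops in a rectangle: a list of rows, each drop shown as 'o'."""
    return [["o"] * cols for _ in range(rows)]


def turn(rectangle):
    """View the same rectangle from the side, so rows become columns."""
    return [list(column) for column in zip(*rectangle)]


def count(rectangle):
    """Count every drop in the rectangle, one row at a time."""
    return sum(len(row) for row in rectangle)


three_rows_of_five = lay_out(3, 5)
five_rows_of_three = turn(three_rows_of_five)

# Turning the rectangle moves no drops, so the count cannot change.
print(count(three_rows_of_five), count(five_rows_of_three))
print(count(lay_out(10, 7)), count(turn(lay_out(10, 7))))
```

Each line prints the same number twice, which is the 5-year-old’s point: turning the rectangle doesn’t add or remove any drops.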

Categories: math, math education

The Lede Program has started!

Yesterday was the first day of the Lede Program and so far so awesome. After introducing ourselves – and the 17 students are each amazing – we each fired up an EC2 server on the Amazon cloud (in Northern Virginia) and cloned a pre-existing disk image, and we got an inspirational speech from Matt Jones about technological determinism and the ethical imperative of reproducibility. Then Adam Parrish led the class in a fun “Hello, world!” exercise on the iPython notebook. In other words, we rocked out.

Today we’ll hear from Soma about some bash command line stuff, file systems, and some more basic python. I can’t wait. Our syllabi are posted on github.

Categories: data journalism

Obama has the wrong answer to student loan crisis

Have you seen Obama’s latest response to the student debt crisis (hat tip Ernest Davis)? He’s going to rank colleges based on some criteria to be named later to decide whether a school deserves federal loans and grants. It’s a great example of a mathematical model solving the wrong problem.

Now, I’m not saying there aren’t nasty leeches who are currently gaming the federal loan system. For example, take the University of Phoenix. It’s not a college system, it’s a business which extracts federal and private loan money from unsuspecting people who want desperately to get a good job some day. And I get why Obama might want to put an end to that gaming, and declare the University of Phoenix and its scummy competitors unfit for federal loans. I get it.

But unfortunately it won’t fix the problem. Because the real problem is the federal loan system in the first place, which has grown a shitton since I was in school:



and in the meantime, our state and private schools are getting more and more expensive relative to the available grants:




And state funding for public schools has decreased while tuition has increased, especially since the financial crisis:


The bottom line is that we – and especially our children – need more state school funding much more than we need a ranking algorithm. The best way to bring down tuition rates at private schools is to give them competition at good state schools.

Categories: arms race, education, modeling

Weekly Slate Money podcast

Aunt Pythia is bowing out today from an exhausting week, and she extends her apologies.

But if you are looking for opinionated advice, please feel free to try out my recent Slate Money podcast with Felix Salmon and Jordan Weissmann. This week I complain about Ben Bernanke, I talk reparations, and complain about white collar crime going unpunished. Last week was also great, because I got to complain about Tim Geithner. And two weeks ago we started the podcast talking Alibaba and the minimum wage.

If you enjoy listening, please subscribe via iTunes and also, please rate the podcast on iTunes so we get more traffic.

Categories: #OWS

The business of big data audits: monetizing fairness

I gave a talk to the invitation-only NYC CTO Club a couple of weeks ago about my fears about big data modeling, namely:

  • that big data modeling is discriminatory,
  • that big data modeling increases inequality, and
  • that big data modeling threatens democracy.

I had three things on my “to do” list for the audience of senior technologists, namely:

  • test internal, proprietary models for discrimination,
  • help regulators like the CFPB develop reasonable audits, and
  • get behind certain models being transparent and publicly accessible, including credit scoring, teacher evaluations, and political messaging models.

Given the provocative nature of my talk, I was pleasantly surprised by the positive reception I was given. Those guys were great – interactive, talkative, and very thoughtful. I think it helped that I wasn’t trying to sell them something.

Even so, I shouldn’t have been surprised when one of them followed up with me to talk about a possible business model for “fairness audits.” The idea is that, what with the recent bad press about discrimination in big data modeling (some of the audience had actually worked with the Podesta team), there will likely be a business advantage to being able to claim that your models are fair. So someone should develop those tests that companies can take. Quick, someone, monetize fairness!

One reason I think this might actually work – and more importantly, be useful – is that I focused on “effects-based” discrimination, which is to say testing a model by treating it like a black box and seeing what outputs it gives for different inputs. In other words, I want to give a resume-sorting algorithm different resumes with similar qualifications but different races. An algorithmically induced randomized experiment, if you will.
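To make that concrete, here is a minimal sketch in Python of what such a black-box test might look like. Everything in it is hypothetical: `toy_model` is a made-up stand-in for a proprietary resume scorer (with a bias planted on purpose), the resumes are randomly generated, and `paired_audit` is just one simple way to run the comparison, not an established audit standard.

```python
import random
from statistics import mean


def toy_model(resume):
    """Hypothetical black box. The audit only calls it and never reads this code."""
    score = 0.1 * resume["years_experience"]
    if resume["group"] == "B":  # bias planted on purpose so the audit has something to find
        score -= 0.2
    return score


def paired_audit(model, resumes, attribute, value_a, value_b):
    """Score each resume twice, changing only `attribute`, and average the gaps."""
    gaps = []
    for resume in resumes:
        score_a = model({**resume, attribute: value_a})
        score_b = model({**resume, attribute: value_b})
        gaps.append(score_a - score_b)
    return mean(gaps)


rng = random.Random(0)  # fixed seed so the made-up resumes are reproducible
resumes = [{"years_experience": rng.randint(0, 20)} for _ in range(1000)]

gap = paired_audit(toy_model, resumes, "group", "A", "B")
print(f"average score gap, A minus B: {gap:.3f}")
```

Because each pair of resumes is identical except for the one attribute, any consistent gap in scores has to come from how the model treats that attribute. A real model might pick up on proxies like names or zip codes rather than an explicit field, so a real audit would need to vary those too.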

From the business perspective, a test that allows a model to remain a black box feels safe, because it does not require true transparency, and allows the “secret sauce” to remain secret.

One thing, though. I don’t think it makes too much sense to have a proprietary model for fairness auditing. In fact the way I was imagining this was to develop an open-source audit model that the CFPB could use. What I don’t want, and which would be worse than nothing, would be if some private company developed a proprietary “fairness audit” model that we cannot trust but that would claim to solve the very real problems listed above.

Update: something like this is already happening for privacy compliance in the big data world (hat tip David Austin).

The next time I’m a criminal

Next time I’m tried for a criminal act, I think I’d like to be tried as a big bank.

Then I can pay a smallish fine for my misdeeds – pocket change for me, a cost of doing business really – and be assured that none of my business partners will stop hanging out with me or stop doing business with me, and in fact all punishments will likely be waived.

In any case, I’d prefer not to be tried as a poor person, where I’d likely be charged money I don’t have for my free lawyer, for any time I spend in jail, and possibly extra for a full jury. And if I didn’t have the money I’d have to spend extra time in jail, until I came up with the money I don’t have.

And since all people are equal under the law, even corporations, it shouldn’t matter who I choose to be, so I choose to be a big bank.

Categories: Uncategorized

Ignore data, focus on power

I get asked pretty often whether I “believe” in open data. I tend to murmur a response along the lines of “it depends,” which doesn’t seem too satisfying to me or to the person I’m talking to. But this morning, I’m happy to say, I’ve finally come up with a kind of rule, which isn’t universal. It focuses on power.

Namely, I like data that shines light on powerful people. Like the Sunlight Foundation tracks money and politicians, and that’s good. But I tend to want to protect powerless people, like people who are being surveilled with sensors and their phones. And the thing is, most of the open data focuses on the latter. How people ride the subway or how they use the public park or where they shop.

Something in the middle is crime data, where you have a compilation of people being stopped by the police (powerless) and the police themselves (powerful). But here as well you’ll notice an asymmetry on identifying information. Looking at Stop and Frisk data, for example, there’s a precinct to identify the police officer, but no badge number, whereas there’s a bunch of identifying information about the person being stopped which is recorded.

A lot of the time you won’t even find data about powerful people. Worker bees get scored but the managers are somehow above scoring. Bloomberg never scored his lieutenants or himself even when he insisted that teachers should be scored. I like to keep an eye on who gets data collected about them. The power is where the data isn’t.

I guess my point is this. Data and data modeling are not magical tools. They are in fact crude tools, and so to focus on them is misleading and distracting from the real show, which is always about power (and/or money). It’s a boondoggle to think about data when we should be thinking about when and how a model is being wielded and who gets to decide.

One of the biggest problems we face is that all this data is being collected and saved now and the models haven’t even been invented yet. That’s why there’s so much urgency in getting reasonable laws in place to protect the powerless.

How do we prevent the next Tim Geithner?

When you hate on certain people and things as long as I’ve hated on the banking system and Tim Geithner, you start to notice certain things. Patterns.

I read Tim Geithner’s book Stress Test last week, and instead of going through and sharing all the pains of reading it, which were many, I’m going to make one single point.

Namely, Tim was unqualified for his jobs as head of the NY Fed, during the crisis, and then as Obama’s Treasury Secretary. He says so a bunch of times and I believe him. You should too.

He even is forced at some point to admit he had no idea what banks really did, and since he needed someone or something to blame for his deep ignorance, he somehow manages to say that Brooksley Born was right, that derivatives should have been regulated, but that since she was at the CFTC everybody (read: Geithner’s heroes Larry Summers and Robert Rubin) dismissed her out of hand, and that as a result he had no ability to look into the proliferating shadow banking or stuff going on at all the investment banks and hedge funds. So it was kind of her fault that he wasn’t forced to understand stuff, even though she warned people, and when shit got real, all he could do was preserve the system because the alternative would be chaos. And people should fucking thank him. That’s his 600 page book in a nutshell.

Let’s put aside Tim Geithner’s mistakes and his narrow outlook on what could have been done better, and even what Dodd-Frank should accomplish, for a moment. It’s hard to resist complaining about those things, but I’ll do my best.

The truth is, Tim Geithner was a perfect product of the system. He was an effect, not a cause.

When I dwell on the fact that he got the NY Fed job with no in-the-weeds knowledge or experience on how banks operate, I see no reason, not one single reason, to think it’s not going to happen again.

What’s going to prevent the next NY Fed bank head from being as unqualified as Tim Geithner?

Put it another way: how could we possibly expect the people running the regulators and the Treasury and the Fed to actually understand the system, when they are appointed the way they are? In case you missed it, the main qualification currently is their ability to get along with Larry Summers and Robert Rubin and to look like a banker.

Before you go telling me I’m asking for a Goldman Sachs crony to take over all these positions, I’m not. It’s actually not impossible to understand this system for a curious, smart, skeptical, and patient person who asks good questions and has the power to make meetings with heads of trading floors. And you don’t have to become captured when you do that. You can remember that it’s your job to understand and regulate the system, that it’s actually a perfectly reasonable way to protect the country. From bankers.

Here’s a scary thought, which would be going in the exact wrong direction: we have Hillary Clinton as president and she brings in all the usual suspects to be in charge of this stuff, just like Obama did. Ugh.

I feel like a questionnaire is in order for anyone being considered for one of these jobs. Things like, how does overnight lending work, and what is being used for collateral, and what have other countries done in moments of financial crisis, and how did that work out for them, and what is a collateralized debt obligation and how does one assess the associated risks and who does that and why. Please suggest more.

Categories: #OWS, finance

Aunt Pythia’s advice

Aunt Pythia is sad this morning, folks. You see, she just found out that her favorite show, Psych, was canceled after its eighth season. What is Aunt Pythia going to do, folks? Besides rewatching old episodes, that is. And don’t say watch the Mentalist, the premise of that show is absurd.

I am also pissed that the rapture index is so damn high. Suck it, rapture index!

In spite of her funk, Aunt Pythia is going to muddle through for her readers. Anything for you people!

Please, after reading her advice, think of something to ask Aunt Pythia at the bottom of the page!

By the way, if you don’t know what the hell Aunt Pythia is talking about, go here for past advice columns and here for an explanation of the name Pythia.


Dear Aunt Pythia,

Why don’t we spend more time naked?

People seem to enjoy their bodies a lot more when they spend a lot of time naked, by confronting themselves to being judged, they free themselves from the general canons of beauty that are imposed on them. Yet nowadays most people who show their bodies are the ones matching those canons of beauty- or you have to join one of those creepy nude beaches.

There’s also something very primal and aesthetic about being (even partly) naked, both of which I’d argue we all need more of.

While I understand that we need to come to work with clothes on, that even inside we can’t spend all day juggling about the pendents, there’s surely something we can do. What practical measure would you suggest to expose more of everyone’s skin without having to move further south or joining a sauna club?

Thanks for your enlightenment,


Dear Dekan,

You had me until you mentioned “creepy nude beaches.” Can’t have it both ways. Anyway, there are plenty of nice nude beaches, like the one close to the math department at the University of British Columbia in Vancouver.

As for how we can all be naked more and accept our bodies in all their glories and pendants, I think about it this way. It’s not the clothes that are the barrier, it’s our mindset. And we can change our mindsets without touching our clothes, or anyone else’s. Finding beauty in everyone’s body, including our own, is a kind of mindset challenge that is both entertaining and uplifting. And clothes do not pose much of a barrier here, they’re generally speaking very thin.

Aunt Pythia


Hi, Aunt Pythia!

Mother Jones has just published an article called, “Is New Mexico Gov. Susana Martinez the Next Sarah Palin?”. The article is reasonable—obviously not positive about Governor Martinez, but it seems like an acceptable piece of partisan journalism—but the title bothers me.

Nobody would have thought to say, “Is [Man in Politics] the Next Sarah Palin?” If the article were in fact about Chris Christie, I wouldn’t have a problem with the comparison to Palin.

Clearly it is relevant that Governor Martinez is a woman, insofar as Republicans are madly scrambling to find someone, anyone, who isn’t a white man to run for national office. People are allowed to note that she’s a she, and maybe there’s an interesting article and headline in there. But I don’t think this Palin headline is OK. It bothers me that the media (Mother Jones, of all publications!) finds it so easy to make fun of women for being women. Am I supposed to find it funny?

So I ask: do you think Mother Jones screwed up here?

Extremely Sensitive Liberal

Dear ESL,

First of all, the article itself is good. It talks about things Martinez has in common with Palin, Bush, and Christie. It has insane audio recordings (how did they get those?). And the truth is – of course! – politicians do have profiles, and it’s an important part of their draw. I don’t think the writer of the article, Andy Kroll, over-emphasized it.

Second, the title is often chosen to get clicks, even without consulting the writer. I’ve heard more than one journalist complain about their title being changed on them to something less than appropriate for their article. So we can blame Mother Jones but not necessarily Andy Kroll if we think the title is terrible.

But, is the title terrible? I think not. There are lots of commonalities between Martinez and Palin. They are both vindictive, narcissistic, and uninformed. They are also both women and being used in a cynical way by the Republican Party for their gender. Not that the Democratic Party doesn’t also do that.

I guess what I’m saying is that you can be offended, but I advise you instead to be a bit more realistic about how politics actually works.

Aunt Pythia


Dear Aunt Pythia,

I’d like to apply, please.

  • Approx 8/10 (but sometimes only 2/10, and occasionally 11/10)
  • In the right company, this can reach 4/10
  • Only about 3/10, I’m afraid, but I could scrub up a bit
  • Neither – both bore me
  • I’d rather things stay polite? 
  • Politics, just
  • The Antarctic
  • No
  • A cat person
  • Yes

Best Regards,

Male And Deluded

Dear MAD,

It took me a second but I figured out you are answering my “new matchmaking questions” from this Aunt Pythia column. Let me provide a more complete conversation for myself and my readers (your answers are the bullets under each question):

  1. How sexual are you? (super important question)
    • Approx 8/10 (but sometimes only 2/10, and occasionally 11/10)
  2. How much fun are you? (people are surprisingly honest when asked this)
    • In the right company, this can reach 4/10
  3. How awesome do you smell? (might need to invent technology for this one)
    • Only about 3/10, I’m afraid, but I could scrub up a bit
  4. What bothers you more: the big bank bailout or the idea of increasing the minimum wage?
    • Neither – both bore me
  5. Do you like strong personalities or would you rather things stay polite?
    • I’d rather things stay polite? 
  6. What do you love arguing about more: politics or aesthetics?
    • Politics, just
  7. Where would you love to visit if you could go anywhere?
    • The Antarctic
  8. Do you want kids?
    • No
  9. Dog person or cat person?
    • A cat person
  10. Do you sometimes wish the girl could be the hero, and not always fall for the hapless dude at the end?
    • Yes

You had me until cat person. Wrong answer!!

Auntie P


Dear Aunt Pythia,

What’s the deal with casual sex?

As a society, we tend to both condemn it and desire it. It sounds good to be free spirited and enjoy our bodies, but at the same time we are usually scared (and excited) to share that level of intimacy with a stranger.

Common wisdom seems to suggest that there’s not that much to be gained from it, especially if you are already in a working relationship. So why does everyone (you included?) keep coming back to it all the time? Is it just a way to talk about sex? I don’t get it.

Currently Amused, Slightly Under Aroused Level.

Dear CASUAL,


Nothing to be gained from casual sex? What common wisdom manual have you been reading? Have you never watched Up In The Air?

My theory is that certain things are hugely important but you’re not supposed to talk about them. Or rather, you can talk about them in the abstract but not admit you’re involved personally. For women, casual sex is in this category, and for men it isn’t. The mismatch is borne out by the lying statistics we see all the time when (straight) men and women estimate the number of their partners and the averages do not match, a mathematical impossibility unless a bunch of men had sex with a bunch of old ladies that just died.
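If you want to check the arithmetic behind that, here is a small Python sketch. It assumes a made-up closed population with equal numbers of men and women, where every partnership is one man with one woman; the names and sizes are invented for illustration. Every partnership adds exactly one to some man’s count and one to some woman’s count, so the two totals are forced to agree.

```python
import random

rng = random.Random(1)  # fixed seed so the made-up population is reproducible
men = [f"m{i}" for i in range(50)]
women = [f"w{i}" for i in range(50)]

# Each partnership is one (man, woman) pair; the set keeps a pair from being counted twice.
partnerships = {(rng.choice(men), rng.choice(women)) for _ in range(400)}

partners_of_men = {m: 0 for m in men}
partners_of_women = {w: 0 for w in women}
for m, w in partnerships:
    partners_of_men[m] += 1    # one more partner for him...
    partners_of_women[w] += 1  # ...and one more for her

total_men = sum(partners_of_men.values())
total_women = sum(partners_of_women.values())
avg_men = total_men / len(men)
avg_women = total_women / len(women)
print(avg_men, avg_women)
```

The two averages come out identical no matter how the partnerships are distributed. They could only differ if the groups had different sizes or if some partners sat outside the surveyed population, which is where the old ladies come in.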

Also, I don’t think the fear of sharing intimacy with strangers holds too many people back. In fact I don’t think anything holds people back as a general rule. There’s a whole bunch of casual sex happening all the freaking time, all around us.

Finally, yes, it is just an excuse to talk about sex. Thanks for providing it.

Aunt Pythia


Please submit your well-specified, fun-loving, cleverly-abbreviated question to Aunt Pythia!

Categories: Aunt Pythia

The Future of Data Journalism, I hope

I’m really excited about the Lede Program I’ve been working on at the Journalism School at Columbia. We’ve just now got a full and wonderful faculty and a full pilot class of 16 brilliant and excited students.

So now that we’ve gotten set up, what are we going to do?

Well, the classes are listed here, but let me say it in a few words: for the first half of the summer, we’re going to teach the students how to use data and build models in context. We’ll teach them to script in Python and use GitHub for their code and homework. We’ll teach them how to use APIs, how to scrape data when there are no APIs, and by the end of the first half of the summer they will know how to build their own API. They will submit projects in IPython notebooks to meet the highest standard of reproducibility and transparency.

In the second half of the summer, they will learn more about algorithms, on the one hand, and how deeply to distrust algorithms, on the other. I’ll be teaching them a class invented by Mark Hansen which he called “the Platform”:

This begins with the idea that computing tools are the products of human ingenuity and effort. They are never neutral and they carry with them the biases of their designers and their design process. “Platform studies” is a new term used to describe investigations into the relationships between computing technologies and the creative or research products they help generate. Understanding how data, code and algorithm affect creative practices can be an effective first step toward critical thinking about technology. This will not be purely theoretical, however, and specific case studies (technologies) and project work will make the ideas concrete.

In order to teach this I’ll need lots of guest lecturers on bias and in particular the politics behind modeling. Emanuel Derman has kindly offered to give one of the first guest lectures. Please suggest more!

Now, it’s easier to criticize than it is to create, and I don’t want to train a whole generation of journalists that they should just swear off mathematical modeling altogether. But I do want to make sure they are skeptical and understand the need for robustness and transparency. For that reason I’m also looking for great examples of reproducible data journalism (please provide them!).

For example, this is a great video, but where are the calculations that have been made that support it? And what assumptions went into it?

In other words, to make this a truly great video, we would need to be able to scrutinize those calculations and for that matter the data sources and the data. Then we could have a conversation about under what conditions private companies should be allowed to rely on food stamp programs for their workers.

Now I’m not claiming that all journalism is necessarily data journalism. Sometimes we’re simply talking about one person with one set of facts around them, and that’s also hugely important. For example, and in the same vein as the above video, take a look at this Reuters blog post written by Danish McDonald’s worker and activist Louise Marie Rantzau, who earns $21 per hour and has great benefits. Pretty much all you need to know is that she exists.

So here’s what I hope: that we start having conversations that are somewhat more based on evidence, which relies crucially on a separate discussion about what constitutes evidence. I’m hoping that we stop hiding misleading arguments behind opaque calculations and start talking about which assumptions are valid, and why we chose one model or algorithm over another, and how sensitive the conclusions are to different reasonable assumptions. I hope that, as we share our code and try out different approaches, we find ourselves acknowledging certain ground-level truths that we can agree on and then – not that we’ll stop arguing – but we might better understand why we disagree on other things.

Categories: data journalism

Was Jill Abramson’s firing a woman thing?

Need to be both nerdy and outraged today.

I’ve noticed something. When something shitty happens to me, and I’m complaining to a group of friends about it, I sometimes say something like “that only happened to me because I’m a woman.”

Now, first of all, I want to be clear, I’m no victim. I don’t let sexism get me down. In fact when I say something like that it usually is a coping mechanism to separate that person’s actions from my own actions, and to help me figure out what to do next. Usually I let it slide off of me and continue on my merry way.

But here’s where it’s weird. If I’m with a bunch of women friends, their immediate reaction is always the same: “hell yes, that bitch/ bastard is just a sexist fuck.” But if I’m with a bunch of male friends of mine, the reaction is very likely to be different: “oh, I don’t think there’s any reason to assume it was sexist. That guy/ girl is just an asshole.”

What it comes down to is priors. My prior is that there is sexism in the world, and it happens all the fucking time, especially to women with perceived power (or to women with no power whatsoever), and so when someone treats me or someone else badly, I do assume we should look into the sexism angle. It’s a natural choice, and Occam’s razor suggests it is involved.

So when Jill Abramson got fired, a bunch of the world’s women were like, those fuckers fired her because she is a powerful, take-no-bullshit woman, and if she’d been a man she would have been expected to act like a dick, but because she’s a woman they couldn’t handle it.

And a bunch of the world’s men were like, wow, I wonder what happened?

So, yes, now I have a prior on people’s priors on sexism, and I think men’s and women’s sexism priors are totally different. I can even explain it.

Men are men, so they don’t experience sexism. So they don’t update their priors like women do. Plus, because there is rarely a moment when an event or reaction is officially deemed “sexist,” men even categorize events differently than women (as discussed above), so even when they do update their prior, it is differently updated, partly because their prior is that nothing is sexist unless proven to be, since it’s so freaking unlikely, according to their prior.
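To make the priors argument concrete, here is a small sketch of a single Bayesian update. All of the numbers are hypothetical, chosen only to illustrate the shape of the argument: two observers see the same event and agree on how likely it is with and without sexism, but start from different priors.

```python
def posterior(prior, p_event_if_sexism, p_event_if_not):
    """Bayes' rule: P(sexism | event) for a given prior P(sexism)."""
    numerator = prior * p_event_if_sexism
    evidence = numerator + (1 - prior) * p_event_if_not
    return numerator / evidence


# Hypothetical likelihoods, shared by both observers.
P_EVENT_IF_SEXISM = 0.8
P_EVENT_IF_NOT = 0.4

# Hypothetical priors: one observer thinks sexism is common,
# the other thinks it is rare unless proven otherwise.
for label, prior in [("high prior", 0.5), ("low prior", 0.05)]:
    updated = posterior(prior, P_EVENT_IF_SEXISM, P_EVENT_IF_NOT)
    print(f"{label}: {prior:.2f} -> {updated:.2f}")
```

With these made-up numbers both observers move in the same direction, but the one who started with a low prior still ends up thinking sexism is unlikely. And that is the generous case, where the event gets counted as evidence at all.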

Ezra Klein isn’t speculating, for example, but Emily Bell is, and I’m with her. That just strengthened my priors about other people’s priors.

Categories: journalism, statistics

Reading Geithner’s Stress Test

I’m reading Tim Geithner’s new book Stress Test: Reflections on Financial Crises in preparation for a discussion in this week’s Slate Money podcast. I also plan to write a review here.

I don’t want to say too much because I’m not even halfway through but here’s one thing: Geithner is surprisingly honest about certain things and predictably dishonest, or at least misleading, about other things.

And although at first I thought it would be purely painful to read this book, since I don’t have any respect for the guy, now I’m glad I’m doing it, because it exposes so much about how the old boys network operates. It’s material.

Categories: finance

May 13, 2014

I am unfortunately too late to show you the website itself (hat tip Ernest Davis) but luckily Forbes has a good article on the recent parody of the combination of Google and Nest, created by the German activist organization Peng Collective.

The putative products for Google-Nest included:

  1. Google Trust: Data insurance, because accidents will always happen, and we at Google won’t protect your data but we will do our best to protect you after the fact. “Opt in for total protection”
  2. Google Hug: An app about connections. It always knows where you are and how you feel at any given moment, and it crowdsources hug matches nearby. “There for one another” 
  3. Google Bee: A personal drone equipped with livestreaming video capacity, to watch over your home and family. Also takes out the garbage. “Your little friend in the sky”
  4. Google Bye: Sustaining your digital life after you die. Plus informing your friends of your death by text. Google takes the best quotes from the dead one’s emails and puts them up on their wall. “Be remembered”

Here’s the video of the Germans (including a supposed Google “data security evangelist”) spoofing on Google and pretending to be from Google at the conference re:publica in Germany. And it’s pretty convincing:

Categories: musing

Text laundering

This morning I received this link on plagiarism software via the Columbia Journalism School mailing list – which is significantly more interesting than most faculty mailing lists, I might add.

In the article, the author, Neuroskeptic, describes the smallish bit of work one has to go through to “launder” text in order for the standard plagiarism detector software to deem it original. The example Neuroskeptic gives us is, ironically, from a research ethics program in Britain called PIE which Neuroskeptic is accusing of plagiarizing text:

PIE Original: You are invited to join the Publication Integrity and Ethics (herein referred to as PIE) as one of its founding members. PIE, a not-for profit organisation, offers free membership to all interested individuals. Please join us and become part of this exciting new movement in the world of publishing ethics; it is the professional home for authors, reviewers, editorial board members and editors-in-chief.

Neuroskeptic: You are invited to join Publication Integrity and Ethics (herein referred to as PIE) and become one of its founding members. PIE, a not-for profit organisation, offers interested individuals free membership. Please join this exciting new movement in the publishing ethics world; PIE is the professional home for reviewers, editorial board members, authors, and editors-in-chief.

This second, laundered piece of text got through software called Grammarly, and the first one didn’t.

Neuroskeptic made his or her point, and PIE has been adequately shamed into naming their sources. But I think the larger point is critical.

Namely, this is the problem with having standard software for things like plagiarism. If everyone uses the same tools, then anyone can launder their text sufficiently to jump through all the standard hoops and then be satisfied that they won’t get caught. You just keep running your text through the software, adding “the’s” and changing the beginning of sentences, until it comes out with a green light.
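Here is a rough sketch of what that loop might look like. To be clear, the detector below is a toy I made up (it scores the fraction of three-word runs shared with the source) and is not how Grammarly or any real product works; the function names, the threshold, and the list of rewrites are all assumptions for illustration.

```python
def shingles(text, n=3):
    """Return the set of n-word runs ("shingles") in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap(candidate, source, n=3):
    """Toy detector score: share of the candidate's shingles found in the source."""
    cand = shingles(candidate, n)
    if not cand:
        return 0.0
    return len(cand & shingles(source, n)) / len(cand)


def launder(text, source, rewrites, threshold=0.5):
    """Apply (old, new) phrase swaps one at a time until the toy detector passes."""
    for old, new in rewrites:
        if overlap(text, source) < threshold:
            break  # green light, stop editing
        text = text.replace(old, new)
    return text, overlap(text, source) < threshold


source = ("please join us and become part of this exciting new movement "
          "in the world of publishing ethics")
rewrites = [
    ("join us and become part of", "join"),
    ("in the world of publishing ethics", "in the publishing ethics world"),
]
laundered, passed = launder(source, source, rewrites)
print(passed, laundered)
```

The point of the sketch is that the launderer never needs to understand the text. It only needs query access to the same detector everyone else uses, plus a supply of cheap edits.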

The rules aren’t that you can’t plagiarize, but instead that you can’t plagiarize without adequate laundering.

This reminds me of my previous post on essay correction software. As soon as you have a standard for that, or even a standard approach, you can automate writing essays that will get a good grade, by iteratively running a grading algorithm (or a series of grading algorithms) on your piece and adding big words or whatever until you get an “A” on all versions. You might need to input the topic and the length of the essay, but that’s it.

And if you think that someone smart enough to code this up deserves an A just for the effort, keep in mind that you can buy such software as well. So really it’s about who has money.

Far from believing in MOOCs destroying the college campus and having everything online, in my cynical moments I’m starting to believe we’re going to have to talk to and test people face to face to be sure they aren’t using algorithms to cheat on tests.

Of course once our brains are directly wired into the web it won’t make a difference.

Categories: feedback loop, modeling

Aunt Pythia’s advice

Aunt Pythia is super excited to have you on her nerd advice bus this morning.

And readers, you knew the time would come that Aunt Pythia would be saying this, but the time is now, peoples: we’re talking about female penes. I’m saving it for the end, but for those of you who are too impatient, you can go ahead and jump to the bottom of the page.

That is, as long as you remember to:

think of something to ask Aunt Pythia at the bottom of the page!

By the way, if you don’t know what the hell Aunt Pythia is talking about, go here for past advice columns and here for an explanation of the name Pythia.


Dear Aunt Pythia,

De Blasio recently released his tax return. He had over 200K in income but paid only 8.3% in taxes. How does that work?

Guy who actually pays taxes in NYC

Dear Guy,

I’m no accountant but I suspect it has to do with “taking a loss” on the value of their home – which is pretty much a one-time accounting trick – as the article you provided described. Also, mortgage payments are tax deductible, which is a regressive tax and should be slowly phased out starting immediately.

By the way, if you’re getting all huffy about de Blasio, pardon me if I get much huffier about Bloomberg.

Aunt Pythia


Dear Aunt Pythia,

I am a female physics PhD student. A colleague once said to me that “If women want to be respected, they should not show cleavage.” What do you think?

Breasts of Oppression

Dear Breasts,

I’ve always thought quite the reverse. Namely, if men had boobs, they’d be showing them off all the time.

Different people can disagree about this, but my feeling is that cleavage is a kind of power, and some men and women find that threatening and/or distracting, and they don’t like feeling threatened or distracted so they make up nonsense about respectability (people in general don’t like to acknowledge threats).

That doesn’t mean women shouldn’t use that power, it just means they should be aware of it and make sure they are in control of their power. I think it’s like men and muscles or height. You see tall men standing up to make a point and to use their height to their advantage. And you never hear a woman tell a man not to show off his height if he wants to be respected. In fact, that sounds ridiculous.

My response to someone who said that would be to laugh in their face, honestly.

That’s not to say one can’t go too far. It’s not a linear, “cleavage is good so more cleavage is better” situation. At a certain point it can go too far, just as a man who wears muscle shirts and is constantly flexing his biceps in meetings – yes I’ve seen that happen – would be ridiculous.

Aunt Pythia


Dear Aunt Pythia,

Why can’t there be a different “shadow” banking system that’s only casting shadows because it’s out in broad daylight and is transparent? Why can’t groups of ordinary people pool their money and lend it to each other at market rates via a community oriented savings bank? Did these types of banks ever exist? If not, shouldn’t we be able to create them today with all of our cleverness and technology?


Dear RobFromAvon,

First, it’s a great thought experiment. What are the obstacles to having a bunch of people create a network of loans? Mostly it’s trust: if someone doesn’t pay back their loan, or someone in charge of holding money just goes missing or spends it, the community needs a recourse. That’s where the government comes in with its legal and justice system. Not to mention FDIC insurance in case the banking entity somehow can’t give you back your deposits.

In other words, banks really do something, at least in conjunction with the threat of jailtime, and you might not want to depend on your neighbor for that function.

That said, there is a growing movement afoot to simplify banking down to very basic functions as you describe, namely through Public Banks, although sometimes the customers are businesses and municipalities rather than individual consumers. Also don’t forget credit unions.

Aunt Pythia


Hi Aunt Pythia,

My husband (a mathematician) and I (a stay-at-home-mom once employed in **cough** finance) love reading your blog.

My husband is a manic-depressive. When we got married 8 years ago, we didn’t know that. And believe me, it sucked.

We made it for 3 years but that’s probably because my job had me away from home 14 hours a day. When I took him to the doctor, the diagnosis became life-changing. Now when he has a relapse, we usually see it coming and I am rooting for him instead of wondering who I’m married to.

That is, up until we had a baby last year. Our doctor is worried that emotional changes and the disruption to our schedule increase the risk of a relapse. He started a higher dosage combined with an additional daily pill.

This new level of prophylactic medication is impacting his work.

If it were just me and him, I would say, bring it, relapse! Now that we understand the relapses, they aren’t confusing or hurtful, at least not to _me_. But we are both worried about how it might affect our new family.

There must be lots of people doing research math with a variety of mental illnesses and different family arrangements, but it hasn’t been easy talking about it in his department. It doesn’t help that the relationship between mental illness and math is sometimes made out to be somewhat glamorous. What is your take on the relationship between mental illness and math research? Our other question is: what can we do so that he can work? I personally am in the “lock the werewolf in his office on full moons” camp, but we will run anything that could help by our psychiatrist. A big thanks from both of us!

Hoping it’s a nonexclusive decision, life or work

Dear Hoping,

Sometimes I worry that I am unqualified to give advice in certain situations. In this case I’m not worried because I am absolutely sure that I’m unqualified to give you advice. So instead of advice, I’ll make some observations and leave it at that.

First observation: kids are resilient. They don’t need their families to be happy and perfect all the time. In fact some strife and tension is good for kids, especially when that strife is resolved and the love is there. So I’d make sure that you two explain to your child or children, even before you are sure they can understand it, that things will be all right and that you love each other and that you’re rooting for daddy to get through this tough time and that you’re sure he will. Give kids the message not that disturbances will never occur, but that they will blow over. The fact that you’ve figured out why this stuff happens is critical and revelatory, and I’m sure that as much as it helps you it will also help your family.

Second observation: math communities are not much more likely than other communities to understand or accept people with mental health problems. Math people have their quirks, and it’s possible that they are somewhat more likely to have mental health problems – and if you believe Hollywood depictions of us, we are much more likely – but when it comes to compassion and empathy I don’t think we’re there. In fact I might argue that we are a bit more on the Aspy spectrum than your usual crowd, and that makes us less empathetic in general about other kinds of mental health issues. I guess what I’m saying is that I’m not surprised your husband is having trouble talking about this at work.

My final, very very practical observation is that if you keep close track of the dosage, and the environment, and various life events, then you might be able to track the illness and find patterns that will help you balance the symptoms with the ability to do research. Another approach might be to research ways to help him realize his research potential while fully dosed – after all, his intelligence is still there. Maybe it would help to go for a brisk walk first thing in the morning and then spend 3 hours alone with a notepad? You probably already tried that, but I’m just saying you can optimize within any given set of constraints.

Again, I have no qualifications for this advice, so talk to professionals. I’m just a practical-minded person.

Finally, good luck! I’m really proud of you guys, and you in particular for leaving finance.

Aunt Pythia


Dear Aunt Pythia,

I just read this article about a group of insects with “female penes.” This question is two-fold: First, how would human sexuality be different if women penetrated men and forcibly extracted their sperm to impregnate themselves? Secondly, what sort of ramifications will this have on porn? (Is there already female + penis, male + vagina porn out there?)

Also, please feel free to give a scathing remark to the researcher who comments that females with penes are more “macho” than other females. Ugh.

Person Of Random News

Dear PORN,

I deeply love this article, and your sign-off, and you as a person. You have made me happy, PORN.

Readers! You now have a standard to live up to! Please take note of the perfect Aunt Pythia question.

My favorite line in the article is where a researcher describes the findings as “really, really exciting.” Also, this on the side of the article:

[Screenshot from the side of the article]

Pretty much everything about this article is a hoot.

And look, I realize that human females typically don’t have penises – at least until they buy them (update! by this I meant strap-ons! I had no intention of being cissexist and apologies if that’s how it came off) – but in terms of sucking sperm out of the men in order to get pregnant, that’s pretty much what all my girlfriends do once they want kids. Tell me if I’m wrong, ladies. It’s a matter of perspective of course, but there you have it. There are really more commonalities than differences here.

As for the macho comment, I think that’s kind of dumb and/or tautological. Macho just means manly, and for most people, manliness is at least confounded with, if not defined by, the presence of a penis. So if a woman has a penis, people are going to say she’s macho, especially if said penis has been built to last 70 hours and has spikes. HOLY SHIT! Tell me that’s not “really, really exciting.”

Bring it on, PORN! More articles, please!

Aunt Pythia


Please submit your well-specified, fun-loving, cleverly-abbreviated question to Aunt Pythia!

Categories: Aunt Pythia

On Slate Money Podcast starting Saturday

There has been very little press that I can find but if you look reeeeally closely you’ll see this recent article from the New York Times, with the following line:

The digital magazine Slate will start two new podcasts in the next week: The Gist, with the former NPR reporter Mike Pesca, its first daily podcast intended to deliver news and opinion to afternoon drive-time listeners, and Money, hosted by the financial writer Felix Salmon.

Moreover, if you squint your eyes a wee bit, you might notice that I’m actually working with Felix Salmon and Jordan Weissmann on the Money podcast, starting this Saturday.

And look, I don’t listen to podcasts – yet – but maybe you do, so I thought you might like to know. I’m looking forward to doing this because it’s fun and forces me to think about various interesting topics.

Categories: data journalism, podcast

Inside the Podesta Report: Civil Rights Principles of Big Data

I finished reading Podesta’s Big Data Report to Obama yesterday, and I have to say I was pretty impressed. I credit some special people that got involved with the research of the report like Danah Boyd, Kate Crawford, and Frank Pasquale for supplying thoughtful examples and research that the authors were unable to ignore. I also want to thank whoever got the authors together with the civil rights groups that created the Civil Rights Principles for the Era of Big Data:

  1. Stop High-Tech Profiling. New surveillance tools and data gathering techniques that can assemble detailed information about any person or group create a heightened risk of profiling and discrimination. Clear limitations and robust audit mechanisms are necessary to make sure that if these tools are used it is in a responsible and equitable way.
  2. Ensure Fairness in Automated Decisions. Computerized decisionmaking in areas such as employment, health, education, and lending must be judged by its impact on real people, must operate fairly for all communities, and in particular must protect the interests of those that are disadvantaged or that have historically been the subject of discrimination. Systems that are blind to the preexisting disparities faced by such communities can easily reach decisions that reinforce existing inequities. Independent review and other remedies may be necessary to assure that a system works fairly.
  3. Preserve Constitutional Principles. Search warrants and other independent oversight of law enforcement are particularly important for communities of color and for religious and ethnic minorities, who often face disproportionate scrutiny. Government databases must not be allowed to undermine core legal protections, including those of privacy and freedom of association.
  4. Enhance Individual Control of Personal Information. Personal information that is known to a corporation — such as the moment-to-moment record of a person’s movements or communications — can easily be used by companies and the government against vulnerable populations, including women, the formerly incarcerated, immigrants, religious minorities, the LGBT community, and young people. Individuals should have meaningful, flexible control over how a corporation gathers data from them, and how it uses and shares that data. Non-public information should not be disclosed to the government without judicial process.
  5. Protect People from Inaccurate Data. Government and corporate databases must allow everyone — including the urban and rural poor, people with disabilities, seniors, and people who lack access to the Internet — to appropriately ensure the accuracy of personal information that is used to make important decisions about them. This requires disclosure of the underlying data, and the right to correct it when inaccurate.

This was signed off on by multiple civil rights groups listed here, and it’s a great start.

One thing I was not impressed by: the only time the report mentioned finance was to say that, in finance, they are using big data to combat fraud. In other words, finance was kind of seen as an industry standing apart from big data, and using big data frugally. This is not my interpretation.

In fact, I see finance as having given birth to big data. Many of the mistakes we are making as modelers in the big data era, which require the Civil Rights Principles as above, were made first in finance. Those modeling errors – and when not errors, politically intentional odious models – were created first in finance, and were a huge reason we first had the mortgage-backed-securities rated with AAA ratings and then the ensuing financial crisis.

In fact finance should have been in the report standing as a worst case scenario.

One last thing. The recommendations coming out of the Podesta report are lukewarm and are even contradicted by the contents of the report, as I complained about here. That’s interesting, and it shows that politics played a large part in what the authors could include as acceptable recommendations to the Obama administration.

Categories: data science, modeling

Podesta’s Big Data report to Obama: good but not great

This week I’m planning to read Obama’s new big data report written by John Podesta. So far I’ve only scanned it and read the associated recommendations.

Here’s one recommendation related to discrimination:

Expand Technical Expertise to Stop Discrimination. The detailed personal profiles held about many consumers, combined with automated, algorithm-driven decision-making, could lead—intentionally or inadvertently—to discriminatory outcomes, or what some are already calling “digital redlining.” The federal government’s lead civil rights and consumer protection agencies should expand their technical expertise to be able to identify practices and outcomes facilitated by big data analytics that have a discriminatory impact on protected classes, and develop a plan for investigating and resolving violations of law.

First, I’m very glad this has been acknowledged as an issue; it’s a big step forward from the big data congressional subcommittee meeting I attended last year for example, where the private-data-for-services fallacy was leaned on heavily.

So yes, a great first step. However, the above recommendation is clearly insufficient to the task at hand.

It’s one thing to expand one’s expertise – and I’d be more than happy to be a consultant for any of the above civil rights and consumer protection agencies, by the way – but it’s quite another to expect those groups to be able to effectively measure discrimination, never mind combat it.

Why? It’s just too easy to hide discrimination: the models are proprietary, and some of them are not even apparent; we often don’t even know we’re being modeled. And although the report brings up discriminatory pricing practices, it ignores redlining and reverse-redlining issues, which are even harder to track. How do you know if you haven’t been made an offer?

Once they have the required expertise, we will need laws that allow institutions like the CFPB to deeply investigate these secret models, which means forcing companies like Larry Summers’s Lending Club to give access to them, where the definition of “access” is tricky. That’s not going to happen just because the CFPB asks nicely.

Categories: modeling, news
