Author Archive

Aunt Pythia’s advice

Aunt Pythia is so very pleased to bring you more of her pearls of wisdom this nearly-believably-spring morning.

In celebration of above-freezing temperature, she’s extra cheerful and she welcomes the clouds and drizzle. After all, late March showers bring late April flowers, or something like that! Let there be blooming and cleansing!

And after you enjoy Aunt Pythia’s wisdom, and possibly after you clean out the front closet, please don’t forget to:

think of something to ask Aunt Pythia at the bottom of the page!

By the way, if you don’t know what the hell Aunt Pythia is talking about, go here for past advice columns and here for an explanation of the name Pythia.


Dear Aunt Pythia,

What the hell is goin’ on with Bitcoin? Will it survive into the future (or something else akin to it) or is it ultimately doomed???

Bitcoin Boogie

p.s. – I hope you realize you’ll probably have more success explaining quantum mechanics to me than Bitcoin.

Dear BB,

I promise not to try to explain Bitcoin’s underlying algorithms. But I think I can still answer your questions.

First of all, Bitcoin has been in the news lately in bad or confusing ways, first with the exchange (Mt. Gox) that went bankrupt, and second because regulators and institutional authorities are having trouble figuring out what Bitcoins are.

Even so, think of these hiccups as growing pains, according to Coinbase co-founder and former Goldman Sachs foreign exchange trader Fred Ehrsam, quoted as saying inspiring things like:

I would go to the bathroom and trade bitcoin on my smartphone and then return to my real desk to do my real job trading real currency.

If you don’t know about it, Coinbase is the “digital wallet” company that you’d probably sign up with if you wanted Bitcoins and you weren’t a huge nerd or a criminal willing to do things on the technical downlow: it makes owning Bitcoins easy, like signing up for a normal checking account.

And they are seeing lots of people joining, and they just got Overstock to accept Bitcoins as payment. So Ehrsam and people like him are pretty positive, and you never know.

Between you and me, though, I think the biggest competitor out there is Google, which has plans to allow people to share money over gmail (hat tip Suresh Naidu). Instead of paying heavy fees, you – guess what – tell Google about your checking accounts and other financial information. I see this potentially competing with banks, Apple, and of course PayPal, which sucks.

I hope that helps!

Aunt Pythia


Dear Aunt Pythia,

I am originally from a country where it’s normal to be sentimental. I am easily moved to tears and worry that this annoys others around me. Of course I can take counter-measures, for example I try to steady myself if the music is becoming emotional or before viewing some breathtakingly beautiful scenery, or, when news about a disaster or a sad film is being shown on the television I discreetly leave the room before it affects me.

I would like to be strong enough to withstand what appears to provoke no reaction in people here. Do you have any suggestion?

Too Sensitive

Dear Too,

I hear you, I’m a huge crier too. I blame the Irish side of my family.

What I do is I playfully prepare people I’m around, for their own comfort, and especially when they are not familiar with this side of me. So when I feel some sentimentality coming on, I’ll announce, “Hey I’m about to totally cry, because that’s what I do! Please bear with me and please ignore the tears, I’ll be OK in 10 minutes or less.” and then I’ll laugh, usually out of embarrassment.

That way they will know I realize it’s about me, not them, and that they’re not responsible to comfort me in any way. It works great, and it’s easy for me to do because I’m an extrovert. If you’re shy, it’s going to be harder, but the alternative is often that you have to explain yourself while you’re crying, which I think is worse.

Good luck!

Auntie P


Dear Aunt Pythia,

I am but a humble traveler trying to win you over with a Firefly reference and desperately seeking your advice.

Come July, I will find myself in New York for a week. I will be in need of a place to stay and some things to do while I’m visiting your fine city.

I have been looking on Airbnb for a place to stay rather than a hotel or a hostel but am overwhelmed by all the options. Do I stay in Brooklyn, or Lower Manhattan? Harlem or the Upper West Side? I am a young data analyst from New Zealand; what do I know of New York neighborhoods?

And then there is the sightseeing, do I go and tick off all the tourist spots or are there better things for me to do with my time? Do you know any secret spots filled with good food, great coffee and devoid of the fanny-pack wearing, obnoxiously-photographing tourist hordes?


Seeking Habitation In New York

P.S. In New Zealand we call fanny-packs ‘bum-bags’. A fanny in NZ is something entirely different!


I don’t know from Firefly, sorry. But I’ll answer you anyway and let readers add their opinions.

I’d suggest you stay in a different neighborhood every night or two. That way you get to see more of New York, and any annoyance is short-lived. Most of your time will be spent traveling from place to place, so pack light. Make sure at least one night is in Astoria, Queens, which is just cool and kind of the epitome of the melting pot.

The reason I suggest this is that, for me, official tourist destinations are incredibly boring and expensive for what they offer (and what they offer is bum-bag bearing tourists, which you can already see in NZ anyway). I mean, if you think you’ll regret not going to the top of the Empire State Building, then by all means go, but go 10 minutes before they open and depart quickly.

Authentic sight-seeing in New York City consists, in my opinion, of walking through neighborhoods and checking out bars and restaurants and the local cultural gathering places. Look for live music in each neighborhood you stay in, if you like that sort of thing. Or if you are into food, make a plan for a foodie tour of each neighborhood. Yum!

Aunt Pythia


Dear Aunt Pythia,

In searching online dating profiles in New York City (I live nearby), I encounter a bunch of profiles of finance professionals working in, say, investment banking. After reading your blog, I have become convinced that people who work in banking

1) are morally bankrupt
1.5) are swindlers
2) are not very thoughtful in regards to the concerns of the 99%
3) are greedy
4) are arrogant … they think they are the best and the brightest, and point to the fake wealth they created to justify their salary
5) are overworked, stressed out at work, and their job is slowly killing them physically and emotionally
6) have expectations of a lavish lifestyle (nothing wrong with that, just not for me…I can’t compete, and perhaps mo money mo problems)

Am I right or am I right? Should I even bother expressing an interest in these profiles?

Just Pondering

Dear Just,

There are two questions here, which I’d like to pose separately.

First, are investment bankers morally bankrupt swindlers who ignore lesser folk and hate their jobs?

Second, how do I optimize my chances of finding love – or at least great sex with a tolerable partner – on an online dating site?

The answer to the first question is, of course not. There are plenty of people in finance and even in investment banking who are perfectly nice and even sensitive and empathetic. On the other hand, there is some story explaining why they’re there, and it often exposes a weird side to them. On the other other hand, who here doesn’t have a weird side? On the whole I’d say, never disqualify someone on one attribute, especially if they otherwise seem great and you find yourself liking them at a basic human level.

The answer to the second question is a lot trickier, though, and is related to the first in the following sense: if you are playing the numbers – which is all you can do on these websites – then you might well decide to avoid investment bankers. After all, you only have so much time and so many free Friday nights, and you want to optimize for best chance of liking someone. All you have is demographic information like their job and age, and even if you gather more information through emails, you might first want to filter out red flags, and you might find “investment banker” to be a red flag.

As an aside, I would love someone to do a quantitative and qualitative investigation to see how people have changed their dating and mating habits through online dating. It seems like the most profound area of the internet affecting cultural practices.

My bottom-line suggestion is to try to find a date through a friend of a friend. Good luck!

Aunt Pythia


Please submit your well-specified, fun-loving, cleverly-abbreviated question to Aunt Pythia!


Categories: Aunt Pythia

Interview with a high school principal on the math Common Core

In my third effort to understand the Common Core State Standards (CC) for math, I interviewed an old college friend, Kiri Soares, who is the principal and co-founder of the Urban Assembly Institute of Math and Science for Young Women. Here’s a transcript of the interview, which took place earlier this month. My words are in italics below.


How are high school math teachers in New York City currently evaluated?

Teachers are now evaluated on 2 things:

  1. First, measures of teacher practice, which are based on observations, in turn based on some rubric. Right now it’s the Danielson Rubric. This is a qualitative measure. In fact it is essentially an old method with a new name.
  2. Second, measures of student learning, which are supposed to be “objective”. Overall this is worth 40% of the teacher’s score, but it is separated into two 20% parts, where teachers choose the methodology of one part and principals choose the other. Some stuff is chosen for principals by the city. Any time there is a state test we have to choose it. In terms of the teachers’ choices, there are two ways to get evaluated: goals or growth. Goals are based on a given kid, and the teachers can guess they will get a certain slightly lower score or higher score for whatever reason. Otherwise, it’s a growth-based score. Teachers can also choose from an array of assessments (state tests, performance tests, and third party exams). They can also choose the cohort (their own kids/the grade/the school). The city also chose performance tasks in some instances.

Can you give me a concrete example of what a teacher would choose as a goal?

At the beginning of the year you give diagnostic tests to students in your subject. Based on what a given kid scored in September, you extrapolate a guess for their performance in the June test. So if a kid has a disrupted homelife you might guess lower. Teachers’ goal setting is based on these guesses.

So in other words, this is really just a measurement of how well teachers guess?

Well they are given a baseline and teachers set goals relative to that, but yes. And they are expected to make those guesses in November, possibly well before homelife is disrupted. It definitely makes things more complicated. And things are pretty complicated. Let me say a bit more.

The first three weeks of school are all testing. We test math, social studies, science, and English in every grade, and overall, depending on teacher/principal selections, it can take up to 6 weeks, although not in a given subject. Foreign language and gym teachers are also getting measured, by the way, based on those other tests. These early tests are diagnostic tests.

Moreover, they are new types of tests, which are called performance-based assessments, and they are based on writing samples with prompts. They are theoretically better quality because they go deeper, they aren’t just bubble standardized tests, but of course they had no pre-existing baseline (like the state tests) and thus had to be administered as diagnostic. Even so, we are still trying to predict growth based on them, which is confusing since we don’t know how to predict performance on new tests. Also, we don’t even know how we can consistently grade such essay-based tests, despite “norming protocols”, which is yet another source of uncertainty.

How many weeks per year is there testing of students?

The last half of June is gone, a week in January, and 2-3 weeks in the high school in the beginning per subject. That’s a minimum of 5 weeks per subject per year, out of a total of 40 weeks. So one eighth of teacher time is spent administering tests. But if you think about it, for the teachers, it’s even more. They have to grade these tests too.

I’ve been studying the rhetoric around the CC. So far I’ve listened to Diane Ravitch stuff, and to Bill McCallum, the lead writer of the math CC. They have very different views. McCallum distinguishes three things, and once they are separated like that, Ravitch’s argument doesn’t make sense.

Namely, he separates standards, curriculum, and testing. People complain about testing and say that CC standards make testing easier, and we already have too much testing, so CC is a bad thing. But McCallum makes this point: good standards also make good testing easier.

What do you think? Do teachers see those as three different things? Or is it a package deal, where all three things are rolled into one in terms of how they’re presented?

It’s much easier to think of those three things as vertices of a triangle. We cannot make them completely isolated, because they are interrelated.

So, we cannot make the CC good without curriculum and assessment, since there’s a feedback loop. Similarly, we cannot have aligned curriculum without good standards and assessment, and we cannot have good tests without good standards and curriculum. The standards have existed forever. The Common Core is an attempt to create a set of nationwide standards. For example, without a coherent national curriculum it might seem OK to teach creationism in place of evolution in some states. Should that be OK?

CC is attempting to address this, in our global economy, but it hasn’t even approached science for clear political reasons. Math and English are the least political subjects so they started with those. This is a long time coming, and people often think CC refers to everything but so far it’s really only 40% of a kid’s day. Social studies CC standards are actually out right now, but they are very new.

Next, the massive machine of curriculum starts getting into play, as does the testing. I have CC standards and the CC-aligned test, but not curriculum.

Next, you’re throwing into the picture teacher evaluation aligned to CC tests. Teachers are freaking out now – they’re thinking, my curriculum hasn’t been CC-aligned for many years, what do I do now? By the way, importantly, none of the high school curriculum in NY State is actually CC-aligned now. DOE recommendations for the middle school happened last year, and DOE people will probably recommend this year for high school, since they went into talks with publication houses last year to negotiate CC curriculum materials.

The real problem is this: we’ve created these new standards to make things more difficult and more challenging without recognizing where kids are in the present moment. Say I’m a former 5th grader: the old standards were expecting something from me that I got used to, and it wasn’t very much. Now I’m in 6th grade, there are all these raised expectations, and there’s no attention to the gap.

Bottom line, everybody is freaking out – teachers, students, and parents.

Last year saw the first CC-aligned ELA and math tests. Everybody failed. They rolled out the test before any CC curriculum.

From the point of view of NYC teachers, this seems like a terrorizing regime, doesn’t it?

Yes, because the CC roll-out is rigidly tied to the tests, which are in turn rigidly tied to evaluations of teachers. So the teachers are worried they are automatically going to get a “failure” on that vector.

Another way of saying this is that, if teacher evaluations were taken out of the mix, we’d have a very different roll-out environment. But as it is, teachers are hugely anxious about the possibility that their kids might fail both the city and state tests, and that would give the teacher an automatic “failure” no matter how good their teacher observations are.

So if I’m a special ed teacher of a bunch of kids reading at 4th and 5th grade level even though they’re in 7th grade, I’m particularly worried with the introduction of the new and unknown CC-aligned tests.

So is that really what will happen? Will all these teachers get failing evaluation scores?

That’s the big question mark. I doubt there will be massive failure, though. I think given that the scores were so clustered in the middle/low muddle last year, they are going to add a curve and not allow so many students to fail.

So what you’re pointing out is that they can just redefine failure?

Exactly. It doesn’t actually make sense to fail everyone. Probably 75% of the kids got 2’s or 1’s out of a 4-point scale. What does failure mean when everyone fails? It just means the test was too hard, or that what the kids were being taught was not relevant to the test.

Let’s dig down to the three topics. As far as you’ve heard from the teachers, what’s good and bad about CC?

My teachers are used to the CC. We rolled out standards-based grading three years ago, so our math and ELA teachers were well adjusted, and our other subject teachers were familiar. The biggest change is that what used to be 9th grade math is now expected of the 8th grade. And the biggest complaint I’ve heard is that it’s too much stuff – nobody can teach all that. But that’s always been true about every set of standards.

Did they get rid of anything?

Not sure, because I don’t know what the elementary level CC standards did. There was lots of shuffling in the middle school, and lots of emphasis on algebra and algebraic thinking. Maybe they moved data and stats to earlier grades.

So I believe that my teachers in particular were more prepared. In other schools, where teachers weren’t explicitly being asked to align themselves to standards, it was a huge shock. For them, it used to be solely about Regents, and also Regents exams are very predictable and consistent, so it was pretty smooth sailing.

Let’s move on to curriculum. You mentioned there is no CC-aligned curriculum in NY. I also heard NY state has recently come out against the CC, did you hear that?

Well what I heard is that they previously said this year’s 9th graders (class of 2017) would be held accountable but now the class of 2022 will be. So they’ve shifted accountability to the future.

What does accountability mean in this context?

It means graduation requirements. You need to pass 5 Regents exams to graduate, and right now there are two versions of some of those exams: one CC-aligned, one old-school. The question is who has to pass the CC-aligned versions to graduate. Now the current 9th grade could take either the CC-aligned or “regular” Regents in math.

I’m going to ask my 9th grade students to take both so we can gather information, even though it means giving them 3 extra hours of tests. Most of my kids pass 2 Regents in 9th grade, 2 in 10th, and 3 in 11th, and then they’re supposed to be done. They only take those Regents tests in senior year that they didn’t pass earlier.

What are the good and bad things about testing?

What’s bad is how much time is lost, as we’ve already said. And also, it’s incredibly stressful. You and I went to school and we had one big college test that was stressful, namely the SAT. In terms of us finishing high school, that was it. For these kids it’s test, test, test, test. I don’t think it’s actually improved the quality of college students across the country. 20 years ago NY was the only one that had extra tests except California achievement tests, which I guess we sometimes took as well.

Another way to say it is that we did take some tests but it didn’t take 5 weeks.

And it wasn’t high stakes for the teacher!

Let’s go straight there: what are the good/bad things for the teachers with all these tests?

Well it definitely makes the teachers more accountable. Even teachers think this: there is a cadre of protected teachers in the city, and the principals didn’t want to take the time to get rid of them, so they’d excess them out of the schools, and they would stay in the system.

Now with testing it has become much more the principal’s responsibility to get rid of bad teachers. The number of floating teachers is going down.

How did they get rid of the floaters?

A lot of different ways. They made them go into the schools, take interviews, they made their quality of life not great, and a lot of them left or retired or found jobs. Principals took up the mantle as well, and they started to do due diligence.

Sounds like the incentive system for over-worked principals was wrong.

Yes, although the reason it became easier for the principals is because now we have data. So if you’re coming in as ineffective and I also have attendance data and observation data, I can add my observational data (subjective albeit rubric based) and do something.

If I may be more skeptical, it sounds like this data gathering was used as a weapon against teachers. There were probably lots of good teachers that have bad numbers attached to them that could get fired if someone wanted them to be fired.

Correct, except those good teachers generally have principals who protect them.

You could give everyone a bad number and then fire the people you want, right?


Is that the goal?

Under Bloomberg it was.

Is there anything else you want to mention? 

I think testing needs to be dialed down but not disappear. Education is a bi-polar pendulum and it never stops in the middle. We’re on an extreme but let’s not get rid of everything. There is a place for testing.

Let’s get our CC standards, curriculum, and testing reasonable and college-aligned and let’s keep it reasonable. Let’s do it with standards across states and let’s make sure it makes sense.

Also, there are some new tests coming out, called PARCC assessments, that are adaptive tests aligned to the CC. They are supposed to replace Regents down the line and be national.

Here’s what bothers me about that. It’s even harder to investigate the experience of the student with adaptive tests.

I’m not sure there’s enough technology to actually do this anyway very soon. For example, we were given $10,000 for 500 students. That’s not going to go far unless it takes 2 weeks to administer the test. But we are investing in our technology this year. For example, I’m looking forward to buying textbooks and getting my updates pushed instead of having to buy new books every year.

Last question. They are redoing the SAT because rich kids are doing so much better. Are they just trying to get in on the test prep game? Because, here’s the thing, there’s no test that can’t be gamed that’s also easy to grade. It’s gotta depend on the letters and grades. We keep trying to shortcut that.

Listen, this is what I tell the kids. What’s going to matter to you is the letter of recommendation, so don’t be a jerk to your fellow students or to the teachers. Next, are you going to be able to meet the minimum requirements? That’s what the SAT is good for. It defines a lower bound.

Is it a good lower bound though?

Well, I define the lower bound as 1000 in total. My kids can target that. It’s a reasonable low bar.

To what extent do your students – mostly inner-city, black girls interested in math and science – suffer under the wholly gamed SAT system?

It serves to give them a point of self-reference with the rest of the country. You have to understand, they, like most kids in the nation, don’t have a conception of themselves outside of their own experience. The SAT serves that purpose. My kids, like many others, have the dream of Ivy League minus the understanding of where they actually stand.

So you’re saying their estimates of their chances are too high?

Yes, oftentimes. They are the big fish in a well-defined pond. At the very least, the SAT helps give them perspective.

Thanks so much for your time, Kiri.

The Lede Program: An Introduction to Data Practices

It’s been tough to blog what with jetlag and a new job, and continuing digestive issues stemming from my recent trip, which have prevented me from drinking coffee. It really isn’t until something like this happens that I realize how very much I depend on caffeine for my early morning blogging. I really cherish that addiction like a child. Don’t tell my other kids.

Speaking of my new job, the website for the Lede Program: An Introduction to Data Practices is now live, as is the application. Very cool.

We’re holding information sessions about the program next Monday and Thursday at 1:00pm at the Stabile Center, on the first floor of the Journalism School which is in Pulitzer Hall. Please join us and please spread the news.

We are also still looking for teachers for the program, and we’ve fixed the summer classes, which will be:

  1. Basic computing,
  2. Data and databases,
  3. Algorithms, and
  4. The platform

I’m really excited about all of these but probably most about the last one, where we will investigate biases inherent in data, systems, and platforms and how they affect our understanding of objective truth. Please tell me if you know someone who might be great for teaching any of these. They are intense, seven-week classes (either from the end of May to mid-July or from mid-July to the end of August) which will meet 3 hours twice a week each.

Categories: data journalism, education

Tia Pythia’s advice

Aunt Pythia is coming to you from Costa Rica, where she’s been on vacation all week and is officially 100% sunburnt, relaxed, and happy, except for the occasional digestive issue.

To commemorate the occasion and location she’s temporarily changed her name to “Tia Pythia”, but don’t worry, you can expect consistently obnoxious and over-the-top advice coming from her. She hasn’t lost her edge, even in 95-degree heat!

After you enjoy her column (and the copious fruity drinks!) today please don’t forget to:

think of something to ask Tia Pythia at the bottom of the page!

By the way, if you don’t know what the hell Tia Pythia is talking about, go here for past advice columns and here for an explanation of the name Pythia.


Dear Tia Pythia,

I am a graduate student still early in my career, at a university that I am quite happy with (people, subject areas good; geography tolerable for 5 years). However, my (would-be, as-of-yet-unofficial) adviser is moving to a more prestigious, if less outwardly friendly, math department (the way it was described to me is, my current institution is solidly “tier n” while they are considering moving to a “tier n-1” school). They have offered to bring me with them, but I am nervous about a) whether I could cut it at a more competitive place and b) even if I could, whether 3-4 years of relative misery is worth a “more prestigious” degree.

I’m very excited about the research that my adviser is doing and the field in general, and the prospect of more favorable geography along with a higher “payoff” (in terms of where my degree is from) is attractive, but I’m still seeking sage advice in case I haven’t thought of it in a certain way.

Also, a more direct question: could you have done everything you’ve done since getting your PhD if it hadn’t been from Harvard, but from some middle-of-the-road school?

With love,

Future Anxiety Revealed, Troubling Situation


Nice sign-off, and it kind of makes up for a super long letter.

As I’ve written about recently, not all graduate school experiences are the same. Even so, Harvard wasn’t known as the most friendly department and I made it work for me, partly because of the location and the fact that I could make friends outside math. It helped that I grew up in the area and knew people like Nancy from Fair Foods and crucial information like where the best yarn stores were.

I’d suggest visiting the place and seeing if it can work. And importantly, try to make it work. Having an exciting advisor you trust is crucial to the graduate school experience, so I would definitely do my best if I were you to stick with him or her.

In terms of prestige, I definitely think it helps me personally, but I’m never sure how much of that is because I’m a woman – it definitely still seems true that you have to be top-notch to impress people if you’re female whereas men often get the benefit of the doubt.

Also it depends on whether you’re talking to someone inside math or outside math, because outsiders don’t have a definite sense of ranking and also don’t usually care too much. So it also depends on what you want to do with your life after school.

Good luck!

Tia Pythia


Dear Aunt Pythia,

I’ve never had a girlfriend. I’ve worked for a long time (years) to get one. It hasn’t been a complete waste of time from a social skills and self-improvement point of view, but I haven’t gotten anywhere, and the lack of success has taken its toll.

Logically, I tell myself that it has no bearing on my worth as a human being (compare Isaac Newton and Charlie Sheen for example), that I should enjoy being young and single, and I don’t get as depressed about it as I once did, but we all know that logic isn’t everything. And to clarify, I know I neither need nor am entitled to a girlfriend (I know a lot of guys in my situation do), and my life is satisfying – or getting there – on my own.

I fear that when I’m older, I’ll look back on these years, my most sexually fertile as well as my most “fun” ones, and see barrenness, when others see great memories with lovers. And I’ll be constantly reminded that they do. I just feel so… tired, or deadened sometimes when thinking about it. What can I do about this?

Draußen vor der Tür

Dear Draußen,

Pardon me for cutting about 85% of your letter, it was just too long. I’m in a short question – short answer kind of mood this morning. Something about the last day of a vacation. And I didn’t cut out the part about Charlie Sheen, because honestly I don’t get it and I’m wondering if readers do and could comment on a possible interpretation.

Look, I have sympathy for your situation. As a guy in physics (part of what I cut), I’m sure you spend most of your time around other guys. It must be tough to meet nice women.

But at the same time, I guess I’m wondering what it is you’ve been doing to try to meet women, and importantly how you’ve talked to them when you’ve met them. From the 85% of your letter that I cut, I can tell you spend a lot of time thinking about yourself and what you should do with your life.

But to be honest, unless someone is already your friend, they probably don’t care about that stuff. At all. If I meet someone who starts talking about that stuff, I find a quick reason to depart.

You need to make sure you have opinions about other things besides yourself. Like, do you read the paper? What do you think about Ukraine? Or the new SAT? Make sure you are not too self-involved and that you have truly interesting opinions and things to say before you meet women. Even better: have ideals. Have plans to fix problems. That’s interesting! That’s possibly even sexy!

Another idea: try reading How Not to Be a Dick: An Everyday Etiquette Guide (hat tip Becky Jaffe). And I’m not saying that because I think you’re a dick (although I’m also not saying you aren’t a dick!) but because it has lots of great points about communication and making sure you’re coming across well. I know I learned something reading it!

One last thing: it doesn’t have to be work. Find something you like doing that lots of women also like doing, and go enjoy yourself. Joy is extremely catchy. Worst case you make some new friends.

Good luck,

Tia Pythia


Dear Aunt Pythia,

What do nerdy women talk about at lunch?

I’m a woman who recently started working in an office after a few years of working at home. It’s an educational technology company, and I’m in the unique and fortunate position of being both an educator and a software tester in training.

Most days when I go to the communal kitchen to heat my lunch, many of the women from the education team are engaging in loud discussions about their kids. I’ve started noticing that most of the men educators and people from other departments bury themselves in a book or take their food back to their desks. As a non-parent, I’m sometimes curious about life on the other side, but there’s such a thing as too much information. Today after yet another round of hearing about children’s eating habits, sleep habits, etc, I took my plate and headed for the tech zone for more stimulating conversation.

Any suggestions on things I can ask my coworkers with kids to steer the topic in another direction? I would like to get to know them better and pick their brains on their career paths and aspirations. Or am I better off spending more time with the mostly male tech geeks and absorbing their lingo?

Lunch Uncomfortable Need Conversation Help


Dear LUNCH,

First, congrats on the new job, it sounds cool.

Second, to be honest I have never encountered this problem personally because I’m such a freaking loudmouth. I pretty much just barge into conversations and change the topic if I’m bored. I often even tell people they’re really boring and need to spice up their conversation, preferably with sex. Nobody ever seems to complain that they want to talk more about their sleeping or non-sleeping kids. Not sure they like me, but whatevs.

I mean, I don’t literally interrupt the conversation, because I’m not totally rude. But I’ll wait for a good moment and just jump in with something off-topic like the new SAT or Putin or whether House of Cards is too cynical or not cynical enough.

My advice: come prepared with a short list of 4 non-parenting but general topics and see how they fly. I’m guessing they are themselves just bored and talking about that stuff out of habit and will welcome new blood.

Also, engage the men as well, especially if you go with sex.

Good luck!

Tia Pythia


Dear Aunt Pythia,

An acquaintance has started sending emails urging his “friends” to call upon their political leaders to oppose immigration reform. The first time, I assumed his message was spam and deleted it. I replied to the second message saying that I thought his account had been hacked. He replied that he had indeed sent it, explaining his position. After another message, which he forgot to BCC, one of the recipients replied with a well-reasoned rebuttal. The spammer’s response was to remove that person from his contact list.

Of course, everyone is entitled to their opinion, but I was initially confused because:
1. I met this person through an organization that celebrates cultural diversity.
2. His wife and stepson are immigrants.

His reasons for opposing immigration reform:
1. He was forced into early retirement because his employer went out of business.
2. Big business is profiting off cheap illegal labor, taking better paying jobs from Americans.
3. “Those people” are migrating northward, taking over, etc. (His wife is from a country bordering Europe, and the immigrants he opposes come from other places.)

So, the question is, should I ignore the emails, ask him to stop, or attempt to find common ground? I would generally ignore such spam, but I consider his wife a good friend. What kinds of holes can I point out in his argument, like the many forms of corporate greed?

Stop Propagating Antagonizing Messages

Dear SPAM,

Awesome sign-off. Both topical and sensical. Seriously, you should hold a master class in these motherfuckers.

In terms of your quandary, I’ll tell you what I’d do. I’d ask him politely to take you off the email list, and I’d never discuss it again with him. I’d continue to be friends with his wife.

Here’s why. He’s gotten it into his head that he lost his job for an abstract political reason. In doing so, he’s made it incredibly personal, and no amount of factual evidence is going to change his stance. You are not going to change it either.

Maybe at some point something will change it, but it will be emotional and deeply personal to him, not something you can effect.

Better yet, just build a filter to send his emails to trash and never think about it again.

Tia Pythia


Dear Tia Pythia,

From your remark of blowing off steam at a conference I remembered this article. Have you read it? It’s very informative and fun.

If conferences like JMM were to have bowls of condoms at the end of the tables where you pick up your badge, do you think people would get the idea and pocket a handful, then use them?

Open Relationships Rock

Dear ORR,

Wait, what article? That’s super unfair.

Tia Pythia


Please submit your well-specified, fun-loving, cleverly-abbreviated question to Aunt Pythia!

Categories: Aunt Pythia

Billionaire money and academic freedom

If you haven’t seen this recent New York Times article by William Broad, entitled Billionaires With Big Ideas Are Privatizing American Science, then go take a look. It generalizes to all of scientific research my recent post entitled Billionaire Money in Mathematics.

My favorite part of Broad’s article is the caption of the video at the top, which sums it up nicely:

Funding the Future: As government financing of basic science research has plunged, private donors have filled the void, raising questions about the future of research for the public good.

In his article Broad makes a bunch of great points.

First, the fact that rich people generally ask for research into topics they care about (“personal setting of priorities”) to the detriment of basic research. They want flashy stuff, bang for their buck.

Second, academics interested in getting funding from these rich people have to learn to market themselves. From the article:

The availability of so much well-financed ambition has created a new kind of dating game. In what is becoming a common narrative, researchers like to describe how they begged the federal science establishment for funds, were brushed aside and turned instead to the welcoming arms of philanthropists. To help scientists bond quickly with potential benefactors, a cottage industry has emerged, offering workshops, personal coaching, role-playing exercises and the production of video appeals.

If you think about it, the two issues above are kind of wrapped up together. Flashy academic content goes hand in hand with flashy marketing. Let’s say goodbye to the true nerd who doesn’t button up their cardigan correctly. And I don’t know about you but I like those nerds. My mom is one of them.

This morning I thought of another way to express this issue, from the point of view of the individual scientist or mathematician, that might have profound resonance where the above just sounds annoying.

Namely, I believe that academic freedom itself is at stake. Let me explain.

I’m the last person who would defend our current tenure system. It’s awful for women, especially those who want kids, and it often breeds a kind of arrogant laziness post-tenure. Even so, there are good things about it, and one of them is academic freedom.

And although theoretically you can have academic freedom without tenure, it is certainly easier with it (example from this piece: “In Oklahoma, a number of state legislators attempted to have Anita Hill fired from her university position because of her testimony before the U.S. Senate. If not for tenure, professors could be attacked every time there’s a change in the wind.”).

But as we’ve seen recently, tenure-track positions are quickly declining in number, even as the number of teaching positions is growing. This is the academic analog of how we’ve seen job growth in the US but it’s majority shitty jobs. And as I’ve predicted already, this trend is surely going to continue as we scale education through MOOCs.

The dwindling number of tenured positions means there is an increasing number of people trying to do research dependent upon outside grants and funding, without the safety net of tenure. These people often lose their jobs when their funding flags, as we’ve recently seen at Columbia.

Now let’s put these two trends together. We’ve got fewer and fewer tenured jobs and more and more researchers precariously dependent on outside funding, and we’ve got rich people funding their own tastes and proclivities.

Where does academic freedom shake out in that picture? I’m going to say nowhere.

Categories: education, math, math education

Optimizing for Einstein and other homo-erotic theories

Jointly posted with Naked Capitalism.

At 41, I’m a grown woman. I’ve had enough weird and bad experiences as a woman in the mathematics part of “STEM,” inside and outside of academia, that my skin is relatively thick, a fact I’m proud of. Most of the time I let stuff roll off of me.

Even so, there are certain things that really get under my skin. Examples include terrible advice to young anxious women, and anything having to do with Princeton, New Jersey.

The recent appearance of the “Princeton Mom” Susan Patton (more about her below) has created a perfect storm inside me and I feel I have to comment, at the risk of giving her book more buzz. Note this post is not at all quantitative or even nerdy, except for some free market chit-chat which doesn’t really count. Instead it is the kind of straight-up ranting that I allow myself from time to time on mathbabe. If you want a more scientific and polite takedown, please see this Huffington Post article.

Princeton, New Jersey

There are two kinds of people in the world: people who hate Princeton, New Jersey, and people who are über successful white men (and sometimes Asian men). And I guess there’s a third kind, the people who have never visited Princeton.

I know that sounds histrionic, and I’ll make some caveats later on, but bear with me, it’s coming from personal experience.

I spent one horrific year (the academic year 1997-1998) as a visiting graduate student in the Princeton math department. Coming from the Harvard math department, I’d been socialized to think that spending all night in the library reading musty old French mathematical manuscripts was cool, and the very least one could do to impress one’s advisor.

In other words, I knew from male-dominated macho nerd culture. I girded myself for more of the same when I got to Princeton. But Princeton turned that up quite a few notches, and it wasn’t pretty. And it might have had something to do with being newly married, but that kind of makes my point stronger, not weaker, as you will see.

The first thing I noticed was that there were no other women in the math department. Well, that’s not quite true, since there were secretaries, and there was one female professor, who I never once spotted, and there was one other female graduate student, at least in theory, but it took me weeks and weeks to run into her.

But even so, I was kind of used to that, being an experienced math nerd. I would normally just make do with hanging out with the social nerd boys. Unfortunately I couldn’t find any. It seemed like a department that either selected for anti-social people or efficiently turned them into anti-social people after they arrived.

As an illustration, let me tell you about the most social experience among graduate students I ever witnessed. It started out as a joyous scene: an enthusiastic young man bounded into the common room (which was almost always empty and didn’t really deserve the name “common room” at all) holding a book. He was showing off his newly bound thesis to an unusually large crowd of fellow graduate students – maybe 7 other men.

Instead of congratulating him, someone from the crowd grabbed the thesis and immediately and loudly proclaimed he’d found a typo. Everyone laughed. Long pause. The guy took his thesis and walked out of the room.

As you might imagine, I didn’t spend too much time in the math department. Instead, naïf that I was, I gave myself the task of finding friendly people I could truly connect with in the cultural wasteland that was Princeton Township.

The problem was, it felt like a village frozen in time. Of the perhaps 7 people I got to the point of trusting enough to share my desire for connection, no fewer than 3 of them suggested I join a church (that always made me wonder, what do Jewish people do in Princeton?), and the other 4 suggested I have a child in order to have company and something to do with myself. No shit. Human being as hobby.

I could go on – I could describe the pathetic attempt to attend a female graduate student mixer (“canceled for lack of attendants”) or the desperate time I sought counseling from the sole campus Mental Health Professional. Her exact words: “If it helps, I think I eventually see every female graduate student at Princeton.” Me: “Yes, it helps! I’m getting the FUCK out of here.” And I did.

I’ve been back once or twice, mostly to see the one person I became fond of in my year-long visit, and I am always amazed to see how little has changed. The last time I went, I attended a conference at the Institute for Advanced Study, and after lunch one afternoon I was in the cafeteria there, looking for coffee, when someone (a man! an oldish white man!) asked me to “find more plates, please” because there were no more clean ones. I looked down at my clothes: was I wearing a kitchen staff uniform like other people working the kitchen? Not at all, but I did suspiciously have my boobs with me. I must be kitchen staff.

Hey, I might be wrong

Other people have been to Princeton in the past 15 years, and some of them tell me it’s gotten somewhat better, and there are sightings of more than one woman at a time in the math department, and so on. I mean, the standards are super low, so “better” doesn’t necessarily mean much, but then again I don’t want to make it seem impossibly fixed. I’m glad the President of Princeton was a woman.

On the other hand, another friend of mine had this to say about a very recent visit (less than 3 months ago):

I was a job candidate there. Put up at that Inn. Eating by myself, and there was a long table in the center of the room - all white men, many in bow ties, I swear. They were talking loudly about curriculum changes in the humanities over time, and what a shame it was that they couldn’t teach the classics anymore, laughing about having to teach world literature, etc. And everyone serving them was black. It was disgusting.

My theory of Princeton

I have a kind of fun theory of why Princeton is like this. The short version is that the culture has optimized for producing “geniuses,” which started with Einstein. In fact, Einstein’s success story also pinpoints the moment that time froze there. It was like the lesson learned for the town was that, if they could only keep the place exactly like it was the moment Einstein entered Princeton, then maybe it would be a breeding ground for many many more geniuses to make the town proud.

So that’s what’s happened: everything that is done there is done in the hope that more Einsteins will pop up among the population. Would-be geniuses are worshipped in weird ways, and anyone who is not themselves a genius candidate has to tailor themselves to those who are.

And since by definition geniuses are not women – and nor are minority men – we know what their roles turn out to be. Women, at least white women, are seen as useful in as much as they can have man-children who may grow up to be geniuses. Everyone else is even less crucial.

Do you think I’m being too harsh? Perhaps. To be honest, there is a space for white men to be tagged as successful without being full-blown geniuses, especially if they’re undergraduates. Namely, if they are potentially super rich, preferably by working in finance. In any case it’s all about the successful male narrative. There is no room for any other narrative.

Why am I talking shit about Princeton?

Here’s the thing. I have come to appreciate Princeton, in a wry way (“If you’re suicidal,” one character says, “and you don’t actually kill yourself, you become known as ‘wry.’ ”), and only as long as I’m not actually there. It is such a perfect example of old-fashioned, fucked up shit. You can’t make that stuff up.

But you can point to it and say, I will never live like that. It’s become a convenient counterfactual for me personally.

But not everyone has my perspective. My biggest fear nowadays about Princeton is that people are not sufficiently up front about how awful it is, and because of that people are sometimes tricked into visiting or even moving there.

It is because of this fear that I’m writing this essay, so that I might be able to warn people away from that place, and possibly other places like it, although I don’t know of any. I’m a one-person anti-PR machine, but there’s only so much I can do.

Susan Patton to the rescue

It turns out my job is getting easier, thanks to Susan Patton, self-proclaimed “Princeton Mom”.

As if to amplify my complaints about Princeton, Patton has come out with yet more advice for girls who are aspiring to be Princeton wives. Her new advice to young women is to get fake boobs and whatever other plastic surgery is deemed necessary in high school so you can attract a man in college.

Let’s back up for just a moment, though. Who is this woman?

You have heard of Susan Patton. She’s the confused bitch that wrote a now-famous letter to undergraduate women telling them to stop thinking about careers and start getting engaged whilst in college.

Oh, and she also suggested in a recent Valentine’s Day column (subtitle: “Young women in college need to smarten up and start husband-hunting.”) in the Wall Street Journal (where else!?) that, if you want men to marry you, you shouldn’t fuck them too soon, because, in her words, “men won’t buy the cow if the milk is free.”

Yes, she said that. I’ve got two responses to that tidbit. First, this:

mooooooo, motherfucker, moooooooooooo!!

Next, Aunt Pythia mentioned this but it bears repeating: Patton is objectifying women by calling them cows.

She’s doing the same when she tells young women to get boob jobs in high school. That’s in fact the name of her game. She is insisting that women abandon any hope of intellectual curiosity, goals or ambitions while they are still teenagers and start in on a desperate competition to be a Princeton wife.

Why is Patton so nuts?

By her own account, Susan Patton married the wrong guy – a non-Princeton guy – and later got divorced. She’s bitter about her lack of foresight. In some sense this is just a pathetic story about one sad person.

But in another way it’s not. I’ve been reading a super interesting book called Why Love Hurts: a Sociological Explanation that explains why Susan Patton has some things right. In fact she’s kind of brilliant, but for obviously weird reasons, and her plan to deal with the issues she rightly raises is completely fucked up.

Here’s what she’s understood: there has been a revolution in mating rituals and partnering, and it has become a competition, and it has become increasingly important to be sexually attractive to win this competition. And although it’s not the only competition young women are enduring in college, it’s the one she’s fixated on.

In fact to a large extent we’ve gone from a social contract partnering society to a kind of pseudo-free market partnering society. The results of that transition include various things like how men and women see themselves, and specifically how they (women, not men) blame themselves for failed relationships, and moreover how they are incentivized (or not) to get married, or have kids, or importantly, to keep their word.

One of the most interesting points, at least as it pertains to Susan Patton, is that whereas men used to need to get married and have children to assert their masculinity, this is no longer true.

Nowadays, according to this theory, the men in question increasingly assert their masculinity to each other through the sexual attractiveness of their girlfriends, and they don’t care very much whether they get married and have kids, or at least they don’t feel any urgency (which gives rise to both “the noncommittal man” and “the woman who loves too much”).

So when Patton tells women to get boob jobs, she’s essentially telling them to improve their odds in that existing free market. It’s not about sexual gratification, or even “self confidence” for the women. It’s really a homo-erotic, all-male issue: be something that other men will be jealous of. And what is the measure of their jealousy? That other men are responding sexually to “my” woman. So this means men are focusing on signs of sexual responses in other men and deriving gratification from them.

Here’s what Patton has tragically wrong, though. Given that you’re willing to toss out your personal and intellectual growth for the sake of winning this competition, even given that, which is a sad way to approach life, it still doesn’t have a chance of working.

Because, once we’ve acknowledged and entered this free market for sexual and romantic partnership, it’s simply not going to work in this day and age to expect the men to want to get married when they’re 20 years old, and it’s also certainly not going to work to withhold sex from 20-year-old men and expect them to marry you. It’s just not where 20-year-old men are at in this system. In fact by doing those things a woman is signaling desperation, which – as is explained in this book – works against a given woman, not for her.

Patton and my theory

I’d like to square her advice with my optimized-for-geniuses theory of Princeton.

The main point of my theory is that it’s all about the men, and specifically, it’s all about the successful male narrative. Whereas before it was enough for women to subjugate their personality, personal ambitions, and long-term goals for the purpose of potential geniuses and/or rich finance guys, Patton is now calling for women to also mutilate their bodies for the cause.

As a signaling device, it indicates real hunger for the role. As some guy said:

Fake boobs say, ‘I objectify myself, therefore I have no problem with you doing the same.’

But as I mentioned above, it is a failed signaling device. It’s an indication that the cultural worship of men has gone too far in Princeton, New Jersey. I’m hopeful that the smell of desperation will be so obvious that people will have to take a closer look and scrutinize the culture.

I’d also like to start a petition to demand that the Wall Street Journal make up for publishing Patton’s column by also printing this excellent essay on getting laid really well when you’re a divorced fat woman. We need an antidote.

Categories: rant

Let’s not replace the SAT with a big data approach

The big news about the SAT is that the College Board, which makes the SAT, has admitted there is a problem, which is widespread test-prep and gaming. As I talked about in this post, the SAT mainly serves to sort people by income.

It shouldn’t be a surprise to anyone when a weak proxy gets gamed. Yesterday I discussed this very thing in the context of Google’s PageRank algorithm, and today it’s student learning aptitude. The question is, what do we do next?

Rick Bookstaber wrote an interesting post yesterday (hat tip Marcos Carreira) with an idea to address the SAT problem with the same approach that I’m guessing Google is taking to the PageRank problem, namely by abandoning the poor proxy and getting a deeper, more involved one. Here’s Bookstaber’s suggestion:

You would think that in the emerging world of big data, where Amazon has gone from recommending books to predicting what your next purchase will be, we should be able to find ways to predict how well a student will do in college, and more than that, predict the colleges where he will thrive and reach his potential.  Colleges have a rich database at their disposal: high school transcripts, socio-economic data such as household income and family educational background, recommendations and the extra-curricular activities of every applicant, and data on performance ex post for those who have attended. For many universities, this is a database that encompasses hundreds of thousands of students.

There are differences from one high school to the next, and the sample a college has from any one high school might be sparse, but high schools and school districts can augment the data with further detail, so that the database can extend beyond those who have applied. And the data available to the colleges can be expanded by orders of magnitude if students agree to share their admission data and their college performance on an anonymized basis. There already are common applications forms used by many schools, so as far as admission data goes, this requires little more than adding an agreement in the college applications to share data; the sort of agreement we already make with Facebook or Google.

The end result, achievable in a few years, is a vast database of high school performance, drilling down to the specific high school, coupled with the colleges where each student applied, was accepted and attended, along with subsequent college performance. Of course, the nature of big data is that it is data, so students are still converted into numerical representations.  But these will cover many dimensions, and those dimensions will better reflect what the students actually do. Each college can approach and analyze the data differently to focus on what they care about.  It is the end of the SAT version of standardization. Colleges can still follow up with interviews, campus tours, and reviews of musical performances, articles, videos of sports, and the like.  But they will have a much better filter in place as they do so.

Two things about this. First, I believe this is largely already happening. I’m not an expert on the usage of student data at colleges and universities, but the peek I’ve had into this industry tells me that the analytics are highly advanced (please add related comments and links if you have them!). And they have more to do with admissions and college aid – and possibly future alumni giving – than any definition of academic success. So I think Bookstaber is being a bit naive and idealistic if he thinks colleges will use this information for good. They already have it and they’re not.

Secondly, I want to think a little bit harder about when the “big, deeper data” approach makes sense. I think it does for teachers to some extent, as I talked about yesterday, because after all it’s part of a job to get evaluated. For that matter I expect this kind of thing to be part of most jobs soon (but it will be interesting to see when and where it stops – I’m pretty sure Bloomberg will never evaluate himself quantitatively).

I don’t think it makes sense to evaluate children in the same way, though. After all, we’re basically talking about pre-consensual surveillance, not to mention the collection and mining of information far beyond the control of the individual child. And we’re proposing to mine demographic and behavioral data to predict future success. This is potentially much more invasive than just one crappy SAT test. Childhood is a time which we should try to do our best to protect, not quantify.

Also, the suggestion that this is less threatening because “the data is anonymized” is misleading. Stripping out names in historical data doesn’t change or obscure the difference between coming from a rich high school or a poor one. In the end you will be judged by how “others like you” performed, and in this regime the system gets off the hook but individuals are held accountable. If you think about it, it’s exactly the opposite of the American dream.
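
To make the “others like you” point concrete, here is a minimal sketch with made-up numbers. It assumes the crudest possible model, one that predicts college GPA from the average of past students from the same high school. The school labels, GPAs, and function names are all hypothetical; nothing here comes from Bookstaber’s post or from any real admissions system.

```python
from collections import defaultdict
from statistics import mean

# Toy, invented "anonymized" history: no names, just a school label
# and the college GPA that student later earned.
history = [
    ("rich_high_school", 3.6),
    ("rich_high_school", 3.4),
    ("rich_high_school", 3.5),
    ("poor_high_school", 2.9),
    ("poor_high_school", 3.1),
    ("poor_high_school", 2.7),
]


def group_averages(records):
    """Average the outcome for each school label."""
    by_school = defaultdict(list)
    for school, gpa in records:
        by_school[school].append(gpa)
    return {school: mean(gpas) for school, gpas in by_school.items()}


def predict(school, averages):
    """Predict an applicant's GPA using only the group they belong to."""
    return averages[school]


averages = group_averages(history)

# Two applicants we know nothing about except where they went to school.
applicant_a = predict("rich_high_school", averages)
applicant_b = predict("poor_high_school", averages)

print(f"rich-school applicant predicted GPA: {applicant_a:.2f}")
print(f"poor-school applicant predicted GPA: {applicant_b:.2f}")
```

Nobody’s name appears anywhere in the data, and yet the two applicants get different predictions before they have done anything at all. A real model would use many more columns, but the mechanism is the same.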

I don’t want to be naive. I know colleges will do what they can to learn about their students and to choose students to make themselves look good, at least as long as the US News & World Report rankings exist. I’d like to make it a bit harder for them to do so.

The endgame for PageRank

First there was Google Search, and then pretty quickly SEOs came into existence.

SEOs are marketing people hired by businesses to bump up the organic rankings for that business in Google Search results. That means they pay people to make their website more attractive and central to Google Search so they don’t have to pay for ads but will get visitors anyway. And since lots of customers come from search results, this is a big deal for those businesses.

Since Google Search was based on a pretty well-known, pretty open algorithm called PageRank which relies on ranking the interestingness of pages by their links, SEOs’ main jobs were to add links and otherwise fiddle with links to and from the websites of their clients. This worked pretty well at the beginning and the businesses got higher rank and they didn’t have to pay for it, except they did have to pay for the SEOs.

But after a while Google caught on to the gaming and adjusted its search algorithm, and SEOs responded by working harder at gaming the system (see more history here). It got more expensive but still kind of worked, and nowadays SEOs are a big business. And the algorithm war is at full throttle, with some claiming that Google Search results are nowadays all a bunch of crappy, low-quality ads.

This is to be expected, of course, when you use a proxy like “link” to indicate something much deeper and more complex like “quality of website”. Since it’s so high stakes, the gaming acts to decouple the proxy entirely from its original meaning. You end up with something that is in fact the complete opposite of what you’d intended. It’s hard to address except by giving up the proxy altogether and going for something much closer to what you care about.
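
For readers who want to see the gaming in action, here is a small sketch of the textbook version of PageRank (power iteration with a damping factor of 0.85). It is a simplification and an assumption on my part, not Google’s production algorithm, and the tiny “web” and page names below are invented for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# A hypothetical four-page web. Nobody links to "client".
honest_web = {
    "news": ["blog", "shop"],
    "blog": ["news"],
    "shop": ["news"],
    "client": ["news"],
}

# The SEO adds three throwaway pages whose only purpose is to link to the client.
gamed_web = dict(honest_web)
for i in range(3):
    gamed_web[f"farm{i}"] = ["client"]

before = pagerank(honest_web)
after = pagerank(gamed_web)

print(f"client rank before link farm: {before['client']:.3f}")
print(f"client rank after link farm:  {after['client']:.3f}")
```

The client’s content is identical in both runs. The only thing that changed is the number of links pointing at it, which is exactly why a link is such a gameable proxy for quality.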

Recently my friend Jordan Ellenberg sent me an article entitled The Future of PageRank: 13 Experts on the Dwindling Value of the Link. It’s an insider article, interviewing 13 SEO experts on how they expect Google to respond to the ongoing gaming of the Google Search algorithm.

The experts don’t all agree on the speed at which this will happen, but there seems to be some kind of consensus that Google will stop relying on links as such and will go to user behavior, online and offline, to rank websites.

If correct, this means that we can expect Google to pump all of our email, browsing, and even GPS data to understand our behaviors in a minute fashion in order to get at a deeper understanding of how we perceive “quality” and how to monetize that. Because, let’s face it, it’s all about money. Google wants good organic searches so that people won’t abandon its search engine altogether so it can sell ads.

So we’re talking GPS on your Android, or sensor data, and everything else it can get its hands on through linking up various data sources (which as I read somewhere is why Google+ still exists at all, but I can’t seem to find that article on Google).

It’s kind of creepy all told, and yet I do see something good coming out of it. Namely, it’s what I’ve been saying we should be doing to evaluate teachers, instead of using crappy and gameable standardized tests. We should go deeper and try to define what we actually think makes a good teacher, which will require sensors in the classroom to see if kids are paying attention and are participating and such.

Maybe Google and other creepy tech companies can show us the way on this one, although I don’t expect them to explain their techniques in detail, since they want to stay a step ahead of SEOs.

Categories: data science, modeling

Working at the Columbia Journalism School

I’m psyched to say that, as of today, I’m helping start a data journalism program at the Columbia J-School. It’s a one or two semester post-bacc program to get people into data, coding, and visualizations who are starting from non-technical fields. It starts this summer and runs through the end of the year.

And although it’s being held in the J-School, it’s not only meant for journalists. The idea is that people from other humanities who see value in working with data can enroll in the program and emerge competent with data.

There’s no time to waste, as the program starts soon (May 27th) and we don’t even quite have a name for it (suggestions welcome!). We’re also looking for students and teachers. What we do have is plenty of great plans of what to teach, lots of institutional support, and some scholarship money.


Categories: data journalism

Julia Angwin’s Dragnet Nation

I recently devoured Julia Angwin‘s new book Dragnet Nation: A Quest for Privacy, Security, and Freedom in a World of Relentless Surveillance. I actually met Julia a few months ago and talked to her briefly about her upcoming book when I visited the ProPublica office downtown, so it was an extra treat to finally get my hands on the book.

First off, let me just say this is an important book, and provides a crucial and well-described view into the private data behind the models that I get so worried about. After reading this book you have a good idea of the data landscape as well as many of the things that can currently go wrong for you personally with the associated loss of privacy. So for that reason alone I think this book should be widely read. It’s informational.

Julia takes us along her journey of trying to stay off the grid, and for me the most fascinating parts are her “data audit” (Chapter 6), where she tries to figure out what data about her is out there and who has it, and the attempts she makes to clean the web of her data and generally speaking “opt out”, which starts in Chapter 7 but extends beyond that when she makes the decision to get off of gmail and LinkedIn. Spoiler alert: her attempts do not succeed.

From the get go Julia is not a perfectionist, which is a relief. She’s a working mother with a web presence, and she doesn’t want to live in paranoid fear of being tracked. Rather, she wants to make the trackers work harder. She doesn’t want to hand herself over to them on a silver platter. That is already very very hard.

In fact, she goes pretty far, and pays for quite a few different esoteric privacy services; along the way she explores questions like how you decide to trust the weird people who offer those services. At some point she finds herself with two phones – including a “burner”, which made me think she was a character in House of Cards – and one of them was wrapped up in tin foil to avoid the GPS tracking. That was a bit far for me.

Early on in the book she compares the tracking of a U.S. citizen with what happened under Nazi Germany, and she makes the point that the Stasi would have been amazed by all this technology.

Very true, but here’s the thing. The culture of fear was very different then, and although there’s all this data out there, important distinctions need to be made: both what the data is used for and the extent to which people feel threatened by that usage are very different now.

Julia brought these up as well, and quoted sci-fi writer David Brin: The key question is, who has access? and what do they do with it?

Probably the most interesting moment in the book was when she described the so-called “Wiretapper’s Ball”, a private conference of private companies selling surveillance hardware and software to governments to track their citizens. Like maybe the Ukrainian government used such stuff when they texted warning messages to protesters.

She quoted the Wiretapper’s Ball organizer Jerry Lucas as saying “We don’t really get into asking, ‘Is in the public’s interest?’”.

That’s the closest the book got to what I consider the critical question: to what extent is the public’s interest being pursued, if at all, by all of these data trackers and data miners?

And if the answer is “to no extent, by anyone,” what does that mean in the longer term? Julia doesn’t go much into this from an aggregate viewpoint, since her perspective is both individual and current.

At the end of the book, she makes a few interesting remarks. First, it’s just too much work to stay off the grid, and moreover it’s become entirely commoditized. In other words, you have to either be incredibly sophisticated or incredibly rich to get this done, at least right now. My guess is that, in the future, it will be more about the latter category: privacy will be enjoyed only by those people who can afford it.

Julia also mentions near the end that, even though she didn’t want to get super paranoid, she found herself increasingly inside a world based on fear and well on her way to becoming a “data survivalist,” which didn’t sound pleasant. It is not a lot of fun to be the only person caring about the tracking in a world of blithe acceptance.

Julia had some ways of measuring a tracking system, which she refers to as a “dragnet”, which seems to me a good place to start:

It’s a good start.

The sun goes around the earth

Periodically you have people conducting surveys to prove how dumb people are. Questions are of the form: Is Germany in Africa? Is the earth less than 1000 years old?

I hate these surveys, and I’m usually able to ignore the obnoxious and unscientific nature of them, except when they also ask the following question: Does the sun go around the earth?

Here’s my reproduction of the imaginary conversation if I encounter such a pollster:

Pollster: Does the sun go around the earth?

Me: It depends on your frame of reference, but yes, if I’m standing on the earth, and I look up in the sky, I will observe the sun going around the earth in a wobbly path, although before I let you go I need to make the point that it would be quite a bit simpler to understand the model of the solar system whereby the earth and other planets revolve around the sun and spin while they do so.

Pollster: Yes or no question, ma’am, what’s it gonna be?

Me: Yes, I guess.

Pollster: You are so ignorant!

Categories: Uncategorized

SAT overhaul

There’s a good New York Times article by Todd Balf entitled The Story Behind the SAT Overhaul (hat tip Chris Wiggins).

In it is described the story of the new College Board President David Coleman, and how he decided to deal with the biggest problem with the SAT: namely, that it was pretty easy to prepare for the test, and the result was that richer kids did better, having more resources – both time and money – to prepare.

Here’s a visual from another NY Times blog on the issue:


Here’s my summary of the story.

At this point the SAT serves mainly to sort people by income. It’s no longer an appropriate way to gauge “IQ” as it was supposed to be when it was invented. Not to mention that colleges themselves have been playing a crazy game with respect to gaming the U.S. News & World Report college ranking model via their SAT scores. So it’s one feedback loop feeding into another.

How can we deal with this? One way is to stop using it. The article describes some colleges that have made SAT scores optional. They have not suffered, and they have more diversity.

But since the College Board makes their livelihood by testing people, they were never going to just shut down. Instead they’ve decided to explicitly make the SAT about content knowledge that they think high school students should know to signal college readiness.

And that’s good, but of course one can still prepare for that test. And since they’re acknowledging that now, they’re trying to set up the prep to make it more accessible, possibly even “free”.

But here’s the thing, it’s still online, and it still involves lots of time and attention, which still saps resources. I predict we will still see incredible efforts towards gaming this new model, and it will still break down by income, although possibly not quite as much, and possibly we will be training our kids to get good at slightly more relevant stuff.

I would love to see more colleges step outside the standardized testing field altogether.

Categories: modeling, statistics

Could we use eminent domain to help suffering homeowners? (#OWS)

Here are two things you might have some trouble believing if you read the papers regularly and find yourself convinced we are in a housing recovery. First, there are still huge numbers of homeowners on the brink of, or just starting to enter, foreclosure. Second, many of the banks foreclosing on those properties do not have clear legal ownership over the mortgages in question.

Obama should have addressed the first problem through TARP way back in 2008. In fact mortgage modification was an intention of TARP that was promised to Congress when it passed the second half of the money, but it never happened. Instead Obama came up with the garbage called HAMP, which has been dreadfully implemented and is possibly a net harmful program.

Even without Obama, we should have seen a willingness to renegotiate debt. After all, we can negotiate credit card debt, and businesses routinely renegotiate their mortgages. Why are private home mortgages kept airtight? I guess the banks see it as in their interest not to allow negotiations, and whatever the banks want, the banks seem to get.

The second problem, which is essentially one of botched paperwork (explained here), is probably technically the job of some regulator to deal with, but nobody wants to “blow up the system” so nobody is dealing with it. This is especially ironic considering how often we hear about the so-called sanctity of the contract.

The result of these huge looming problems is that banks got bailed out and the system never got cleared of its actual debt and paperwork problems.

Enter the concept of using eminent domain to force these two issues. Strike Debt, an offshoot of Occupy Wall Street, is pushing this in a few nationwide court cases, for example in Richmond, California.

More recently, and what inspired this post this morning, is a plan cooked up by Strike Debt using eminent domain to force courts to clear up broken chains of title, written by Hannah Appel and JP Massar.

This idea is on its face unappealing, given the history of that crude tool eminent domain. Everyone I meet has their own stories, but start here for a short list of eminent domain abuses.

And it might not work, either. A district judge might not want to deal with the complexity of the issue and might just let the bad paperwork through.

For that matter, many concerns have been voiced about the practicality of this approach, and one that deeply resonates with me is the idea of using it against current mortgages – i.e. mortgages where the homeowner is up-to-date with payment. Using eminent domain in such a case could set a precedent whereby, even though someone has been taking care of their property, the city uses eminent domain to condemn it based on historical data which implies the owner is likely to neglect their property. That would not be good enough. As far as I know the current plan only uses mortgages where there have been missed payments, though.

The bottom line is this: we’re in a situation where all these homeowners are being crushed with unreasonable monthly payments, and hugely inflated principals, where the legal ownership of the mortgage itself is under question, and nobody seems to want to do squat about it. Maybe it’s time a crude tool is used against a cruel enemy.

Categories: #OWS, finance, musing

Aunt Pythia’s advice

Aunt Pythia missed you very much last week and is ever so grateful to return today. And although she usually takes on four questions from readers, today she feels like switching it up and taking on three but making them extra delicious. She hopes you agree that this was the correct choice. Plus she’s running out of questions again, so she’s conserving.

In other words, after you enjoy Aunt Pythia’s wisdom, please don’t forget to:

think of something to ask Aunt Pythia at the bottom of the page!

By the way, if you don’t know what the hell Aunt Pythia is talking about, go here for past advice columns and here for an explanation of the name Pythia.


Dear Aunt Pythia,

So about that Valentine’s Day article which you asked us to ask about… so many questions!

1. In consecutive paragraphs, she says that educated men want “younger, less challenging women” and then that educated women will be frustrated with someone who “just can’t keep up with you or your friends.” Question: is this more insulting to women or to men?

2. She says that “College is the best place to look for your mate. It is an environment teeming with like-minded, age-appropriate single men with whom you already share many things.” Is she talking about STDs here?

3. Did she actually write the sentence “Men won’t buy the cow if the milk is free.”?

4. She writes, “And if you fail to identify ‘the one’ while you’re in college, don’t worry—there’s always graduate school.” So she’s encouraging the old MRS degree. Question: what year was this article written?

That’s all I’ve got for now… I can’t bear to read any more of it!

Woman Turning Forty

Dear WTF,

First, may I express deep satisfaction and pleasure at both your willingness to hate on this article with me and your gorgeous and appropriate acronym. Nicely done, we should hang out. Plus we are age-appropriate, so I’m sure Susan Patton would approve. In fact, here’s a picture of Susan Patton approving or not:

She actually looks like she’s reserving judgment in a baffled way.

On to the questions:

1. Great point, but I’d have to go with “equally insulting to all human persons” here. The basic assumption she makes is that people can be meaningfully measured by external attributes such as age and education level. Some of the stupidest people I’ve ever met were at Harvard and MIT, and some of the wisest – and in some sense, most threatening – people I’ve met are young children, who can really say it like it is. As to the assumption that men are only interested in young, less challenging women, I’m going to assume that’s the way she raises her sons to be, and I pity them.

2. I mean, look. I’m not saying you shouldn’t take lovers in college, and experiment with STDs for that matter, when it suits you and you have the time and interest. In fact you should fool around as much as you care to, and it’s a natural thing to do considering how many hormones are knocking about. But the idea that you should feel like you’re already late to the critical party if you graduate from college without a fiancee is just putrid advice. People make desperate and bad choices when they are insecure, boxed in, and panicking for time. The way I see it, getting people to marry young is a kind of social control that old people exert on the young, before they really know how to say “fuck this particular model of conformity”.

3. OMG yes she did, and guess what? That’s sexual objectification, pure and simple, and it’s not empowering. If she doesn’t see that, she should watch this video with Caroline Heldman, the chair of the Politics department at Occidental College. In fact everyone should, it blew me away.

4. I’m eyeballing the answer as before 1920, the year women were given the right to vote.

Thanks again for the opportunity to vent!

Aunt Pythia


Dear Aunt Pythia,

You asked for questions on the Susan Patton column. This is barely a question, but here you go.

I have a lot of “alpha” traits that may be stereotypically associated with males. Your posts on being an alpha female have definitely helped me understand some aspects of myself and why it can be confusing for me when I interact with other women, so thanks for that.

For example, my ego likes it when I’m the smartest one in a group, or earn the most money in a relationship or something. But that isn’t always actually what will make me happiest/best off. I am an amateur musician, and I have learned to enjoy being in a musical group where I am the weakest link. I don’t like being a burden to the other people in the group, but if I’m the worst, that means I’m making music with a bunch of people who are even better than I am, so I am making really great music. (And of course I work hard to improve and play as well as I possibly can.) I don’t like playing music with people who are so much better that they will hate the experience, but if I’m the worst by a little bit, it’s perfect for me. Sure, it would give me a little ego boost to be the best and look down on the other people, but that ego boost isn’t as good as the feeling of making better music.

Likewise, if my family’s earnings were limited to 2x, where x is my salary, I would be worse off than if I had a partner who made more money than I did (assuming that money can buy happiness, which it basically can). But in the Patton piece, she talks about the old trope that men don’t want to be out-earned by their partners. My question is, what’s the deal with that? Why are people (stereotypically males, I guess) so threatened by having a partner who earns more than they do, or who is smarter than they are?

Another Alpha Female

Dear AAF,

I just want to make a couple of remarks before getting to your question. First of all, everyone likes feeling like a smart person in a group, and second of all, not everyone is willing to be the worst player in a band. So good for you for being willing to put yourself out there, and alpha female or not, people need to challenge themselves. Plus keep in mind many people – maybe even all – will think they’re the worst person in a band, because they notice their own mistakes more than they notice other people’s.

As for the money thing, I think there are two effects going on here. First, there’s a very temporary “attributes seem important” effect when you first meet someone. This was illustrated recently by various reports (e.g. this) on how people create artificial filters in their online dating profiles – things like height, weight, and education requirements. As it turns out, people are much more restrictive online than in real life, partly because of the nature of the information that is available to online daters.

So just as you think you want a tall guy when you fill out a form, if you meet someone in real life who is two inches shorter than you but makes you laugh yourself silly, you will not even notice his height. And just as a man might abstractly be seeking a woman who earns just a little bit less than he does – although I’m not sure men think about it explicitly like this – there’s a good chance he will fall in love based on how she smiles when she plays guitar rather than her paycheck.

There may be a longer term intimidation problem as well, where men and women are accustomed to the idea that the man should be in some way dominant. For example, I still think that men are less likely to leave bad jobs because they have more of a sense of duty towards their images as workers. I’m not sure how to address this in a relationship except to advise women to find a man who loves his job.

Finally, I don’t think anyone ever thinks they’re “not as smart” as their partner. It’s a combination of the multidimensionality of intelligence and human nature that we all find ways in which we’re plenty smart with respect to our long-term friends and partners. I guess the exception might be if both people work in the same exact field and so one dimension of smarts is overemphasized. In that case I’d suggest working in different jobs or at least focusing on other kinds of talents whenever possible.

Aunt Pythia


Dear Aunt Pythia,

Isn’t fairness at least as quantifiable as happiness? Why have no fairness rankings of nations been published? If psychologists can study happiness, then surely sociologists can study fairness.

Elvis Von Essende Nicholas Friedrich Lester Otto Widener IV


Well, depending on what you mean by fairness, there have been a few attempts. For just plain income inequality, we have what’s called the Gini coefficient with an associated map:

In 2009, USA had a terribly high Gini coefficient. Most recently it was measured at 0.486, the very top of that bin.
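For anyone curious what sits behind a number like 0.486, here is a minimal sketch of one common way to compute a Gini coefficient from a list of incomes. The function name `gini` and the toy income lists are made up for illustration, and official statistics are built from survey data with weighting and adjustments that this sketch ignores.

```python
def gini(incomes):
    """Gini coefficient of a list of non-negative incomes.

    0 means everyone has the same income; values near 1 mean
    almost all income goes to one person.
    """
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # Sorted-rank form of the formula:
    #   G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,  i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Toy examples (hypothetical incomes, not real data)
equal_society = [1, 1, 1, 1]       # everyone earns the same
winner_takes_all = [0, 0, 0, 10]   # one person earns everything
print(gini(equal_society), gini(winner_takes_all))
```

With only four people the winner-takes-all case tops out at (n - 1)/n = 0.75 rather than 1, which is a reminder that the coefficient depends on how the population is sliced.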

For other concepts of fairness like “given your situation at birth, what’s your situation later on?” you have the concept of mobility, and here’s a graph of that by city from the New York Times:

Did you have something else in mind?

Aunt Pythia


Please submit your well-specified, fun-loving, cleverly-abbreviated question to Aunt Pythia!

Categories: Aunt Pythia

An attempt to FOIL request the source code of the Value-added model

Last November I wrote to the Department of Education to make a FOIL request for the source code for the teacher value-added model (VAM).


To explain why I’d want something like this, I think the VAM model sucks and I’d like to explore the actual source code directly. The white paper I got my hands on is cryptically written (take a look!) and doesn’t explain what the actual sensitivities to inputs are, for example. The best way to get at that is the source code.

Plus, since the New York Times and other news outlets published teachers’ VAM scores after a long battle and a FOIA request (see details about this here), I figured it’s only fair to also publicly release the actual black box which determines those scores.

Indeed without knowledge of what the model consists of, the VAM scoring regime is little more than a secret set of rules, with tremendous power over teachers and the teacher union, and also incorporates outrageous public shaming as described above.

I think teachers deserve better, and I want to illustrate the weaknesses of the model directly on an open models platform.

The FOIL request

Here’s the email I sent on 11/22/13:

Dear Records Access Officer for the NYC DOE,

I’m looking to get a copy of the source code for the most recent value-added teacher model through a FOIA request. There are various publicly available descriptions of such models, for example here, but I’d like the actual underlying code.

Please tell me if I’ve written to the correct person for this FOIA request, thank you very much.

Cathy O’Neil

Since my FOIL request

In response to my request, on 12/3/13, 1/6/14, and 2/4/14 I got letters saying stuff was taking a long time since my request was so complicated. Then yesterday I got the following response:

If you follow the link you’ll get another white paper, this time from 2012-2013, which is exactly what I said I didn’t want in my original request.

I wrote back, not that it’s likely to work, and after reminding them of the text of my original request I added the following:

What you sent me is the newer version of the publicly available description of the model, very much like my link above. I specifically asked for the underlying code. That would be in a programming language like python or C++ or java.

Can you to come back to me with the actual code? Or who should I ask?

Thanks very much,

It strikes me as strange that it took them more than 3 months to send me a link to a white paper instead of the source code as I requested. Plus I’m not sure what they mean by “SED”; I’m guessing it means these guys, but I’m not sure exactly who to send a new FOIL request to.

Am I getting the runaround? Any suggestions?

Categories: modeling, statistics

Speaking tonight at NYC Open Data

March 6, 2014

Tonight I’ll be giving a talk at the NYC Open Data Meetup, organized by Vivian Zhang. I’ll be discussing my essay from last year entitled On Being a Data Skeptic, as well as my Doing Data Science book. I believe there are still spots left if you’d like to attend. The details are as follows:

When: Thursday, March 6, 2014, 7:00 PM to 9:00 PM

Where: Enigma HQ, 520 Broadway, 11th Floor, New York, NY (map)


  • 6:15pm: Doors Open for pizza and casual networking
  • 7:00pm: Workshop begins
  • 8:30pm: Audience Q&A

Categories: data science

Gaming the (risk/legal) system

A while back I was talking to some math people about how credit default swaps (CDSs), by their very nature, contain risk that is generally speaking undetectable with standard risk models like Value-at-Risk (VaR).

It occurred to me then that I could put it another way: that perhaps credit default swaps might have been deliberately created by someone who knew all about the standard risk models to game the system. VaR was commercialized in the mid-1990s and CDSs existed around the same time, but didn’t take off for a decade or so until after VaR became super widespread, which makes it hard to prove without knowing the actors.

For that matter it is reasonable to assume something less deliberate occurred: that a bunch of weird instruments were created and those which hid risk the most thrived, kind of an evolutionary approach to the same theory.
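To make the VaR blind spot concrete, here is a minimal sketch using historical VaR, which is just one of several ways to compute it. Everything here is hypothetical: the function name `historical_var`, the two toy profit-and-loss series, and the payoff sizes are invented for illustration and are not calibrated to any real CDS book. The point is only that a position which collects small premiums most days and takes a huge hit on rare days can look safer under VaR than an ordinary portfolio.

```python
def historical_var(pnl, confidence=0.95):
    """Historical VaR: the loss at the given confidence quantile.

    pnl is a list of daily profits (positive) and losses (negative).
    The result is a loss expressed as a positive number, so a
    negative VaR means even the 'bad' quantile day was a gain.
    """
    losses = sorted(-x for x in pnl)            # ascending, losses positive
    index = int(confidence * len(losses))       # position of the quantile
    index = min(index, len(losses) - 1)         # stay inside the list
    return losses[index]


# Hypothetical protection seller: earns 1 unit of premium on 995 days,
# pays out 1000 units on the 5 days when a default hits.
cds_seller = [1.0] * 995 + [-1000.0] * 5

# Hypothetical ordinary portfolio: gains 2 on 900 days, loses 10 on 100 days.
ordinary = [2.0] * 900 + [-10.0] * 100

for name, pnl in [("cds_seller", cds_seller), ("ordinary", ordinary)]:
    print(name,
          "95% VaR:", historical_var(pnl, 0.95),
          "99% VaR:", historical_var(pnl, 0.99),
          "worst day:", max(-x for x in pnl),
          "total P&L:", sum(pnl))
```

In this toy setup the default days make up only 0.5% of the sample, so they sit beyond both the 95% and 99% cutoffs and the protection seller's VaR never sees them, even though that position loses money overall and the ordinary one does not.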

I was reminded recently of this conspiracy theory when Joe Burns talked to my Occupy group last Sunday about his recent book, Reviving the Strike. He talked about the history of strikes as a tool of leverage, and how much less frequently we’ve seen large-scale strikes and industry-wide strikes. He made the point that the legality of strikes has historically been uncorrelated to the existence of strikes – that strikers cannot necessarily wait for the legal system to catch up with the needs of the worker. Sometimes strikers need to exert pressure on legislation.

Anyhoo, one question that came up in Q&A was how, in this world of subsidiaries and franchises, can workers strike against the upper management with control over the actual big money? After all, McDonald’s workers work for franchisees who are often not well-off. The real money lives in the mother company but is legally isolated from the franchises.

Similarly, with Walmart, there are massive numbers of workers that don’t work directly for Walmart but do work in the massive supply chain network set up and run by Walmart. They would like to hold Walmart responsible for their working conditions. How does that work?

It seems like the same VaR/CDS story as above. Namely, the legal structure of McDonald’s and Walmart almost seems deliberately set up to avoid legal responsibility to disgruntled workers. So maybe first you had the legal system, then lawyers set up the legal construction of the supply chain and workers such that striking workers could only strike against powerless figures, especially in the McDonald’s case (since Walmart has plenty of workers working for the mother company as well).

Last couple of points. First, only long-term, powerful enterprises can go to the trouble of gaming such large systems. It’s an artifact of the age of the corporation.

And finally, I feel like it’s hard to combat. We could try to improve our risk or legal system but that makes them – probably – even more complicated, which in turn gives massive corporations more ways to game them. Not to be a cynic, but I don’t see a solution besides somehow separately sidestepping our personal risk exposure to these problems.

Categories: finance

How much is your data worth?

I heard an NPR report yesterday with Emily Steel, reporter from the Financial Times, about what kind of attributes make you worth more to advertisers. She has developed an ingenious online calculator here, which you should go play with.

As you can see it cares about things like whether you’re about to have a kid or are a new parent, as well as if you’ve got some disease where the industry for that disease is well-developed in terms of predatory marketing.

For example, you can bump up your worth to $0.27 from the standard $0.0007 if you’re obese, and another $0.10 if you admit to being the type to buy weight-loss products. And of course data warehouses can only get that much money for your data if they know about your weight, which they may or may not if you don’t buy weight-loss products.

The calculator doesn’t know everything, and you can experiment with how much it does know, but some of the default assumptions are that it knows my age, gender, education level, and ethnicity. Plenty of assumed information to, say, build an unregulated version of a credit score to bypass the Equal Credit Opportunity Act.

Here’s a price list with more information from the biggest data warehouser of all, Acxiom.

Categories: data science, modeling

Report from an MSRI MOOC conversation

I am back from Berkeley where I attended a couple of hours of conversations about MOOCs last Friday up at MSRI.

It was a panel discussion given mostly by math and stats people who themselves run MOOCs, and I was wondering if the people who are involved have a better sense of the side effects and feedback loops involved in the process. After all, I’m claiming that the MOOC Revolution will lead to the end of math research, and I wanted to be proven wrong.

Unfortunately, I left feeling like I have even more evidence that my fears will be realized.

I think the critical moment came when Ani Adhikari spoke. Professor Adhikari is in the second semester of giving her basic stats MOOC, and from how she described it, she is incredibly good at it, and there’s a social network aspect of the class which seems like it’s going really well – she says she spends 30 minutes to an hour a day on it herself, interacting with students. I think she said 28,000 students took it her first semester in addition to her in-class students at Berkeley. I know and respect Professor Adhikari personally, as I taught for her at the Berkeley Mills summer program for women many years ago. I know how devoted she is to good teaching.

Even so, she lost me late in the discussion when she explained that EdX, the platform which hosts her stats MOOC, wanted to offer her class three times a year without her participation. She said something to the effect that MOOC professors had to be “extra vigilant” about this outrageous idea and guard against it at all costs.

After all, she said, at the end of the day the MOOC videos are something like a fancy textbook, and we don’t hand out textbooks and claim they are courses, so we by the same token cannot hand out MOOC videos (and presumably the social networks associated with them) and claim they are courses.

When I pressed her in the Q&A session as to how exactly she was going to remain vigilant against this threat, she said she has a legal contract with EdX that prevented them from offering the course without her approval.

And I’m happy for her and her great contract, but here are two questions for her and for the community.

First, how long until someone in math or stats makes a kick-ass MOOC and doesn’t remember to have that air-tight legal contract? Or has an actual legal battle with EdX and realizes their lawyers are not as expensive? Or believes that “information should be free” and does it with the express intention of letting the MOOC be replayed forever?

Second, how much sense does it make to claim that you and your presence are super critical to the success of a MOOC if 28,000 people took this class and you interacted at most one hour a day? Can you possibly claim that the average student benefitted from your presence? It seems to me that the value proposition for the average MOOC student is very similar whether you are there or not.

Overall the impression I got from the speakers, who were mostly MOOC evangelists and involved with MOOCs themselves, was that they loved MOOCs because MOOCs were working for them. They weren’t looking much beyond that point to side effects.

There was one exception, namely Susan Holmes, who listed some side effects of MOOCs including a decreased need for math Ph.D.’s. Unfortunately the conversation didn’t dwell on this, though, and it happened at the very end of the day.

Here’s what I’d like to see: a conversation at MSRI about the future of math research funding in the context of MOOCs and a reduced NSF, where hopefully we come up with something besides “Jim Simons”. It’s extra ironic that the conversation, if it happens, would be held in the Simons Theater.

Categories: math education

Data journalism

I’m in Berkeley this week, where I gave two talks (here are my slides from Monday’s talk on recommendation engines, and here are my slides from Tuesday’s talk on modeling) and I’ve been hanging out with math nerds and college friends and enjoying the amazing food and cafe scene. This is the freaking life, people.

Here’s what’s been on my mind lately: the urgent need for good data journalism. If you read this Washington Post blog by Max Fisher you will get at one important angle of the problem. The article talks about the need for journalists to be competent in basic statistics and exploratory data analysis to do reasonable reporting on data, in this case the state of journalistic freedoms.

And you might think that, as long as journalists report on other stuff that’s not data heavy, they’re safe. But I’d argue that the proliferation of data is leaking into all corners of our culture, and basic data and computing literacy is becoming increasingly vital to the job of journalism.

Here’s what I’m not saying (a la Miss Disruption): learn to code, journalists, and everything will be cool. To be clear, having data skills is necessary but not sufficient.

So it’s more like, if you don’t learn to code, and even more importantly if you don’t learn to be skeptical of the models and the data, then you will have yet another obstacle between you and the truth.

Here’s one way to think about it. A few days ago I wrote a post about different ways to define and regulate discriminatory acts. On the one hand you have acts or processes that are “effectively discriminatory” and on the other you have acts or processes that are “intentionally discriminatory.”

In this day and age, we have complicated, opaque, and proprietary models: in other words, a perfect hiding place for bad intentions. It would be idiotic for someone with the intention of being discriminatory to do so outright. It’s much easier to embed such a thing in an opaque model where it will seem unintentional and will probably never be discovered at all.

But how is an investigative journalist going to even approach that? The first thing they need is to arm themselves with the right questions and the right attitude. And it wouldn’t hurt if they or their team could perform a test on the data and algorithm as well.

I’m not saying that we’re going to suddenly have do-everything superhuman journalists. Just as the list of job requirements for data scientists is outrageously long and nobody can be expert at everything, we will have to form teams of journalists that as a whole have lots of computing and investigative expertise.

The alternative is that the models go unchallenged, which is a really bad idea.

Here’s a perfect example of what I think needs to happen more: when ProPublica reverse-engineered Obama’s political messaging model.

Categories: data journalism
