I am pushing an unusual way of considering economic health. I call it “distributional thinking.” It requires that you not aggregate everything into one statistic, but rather take a few samples from different parts of the distribution and consider things from those different perspectives.
So instead of saying “things are great because the economy has expanded at a rate of 4%” I’d like us to think about more individual definitions of “great.”
For example, it’s a good time to be rich right now. Really good. The stock market keeps hitting all-time highs, the jobs market is great in tech, and it’s still absolutely possible to hide wealth in off-shore tax havens.
It’s not so good to be middle class. Wages are stagnant and have been forever, and jobs are drying up due to automation and a lack of even maintenance-level infrastructure work. Colleges are super expensive, and the best the government can do is fiddle around the edges with interest rates.
It’s a really bad time to be poor in this country. Jobs are hard to find and conditions are horrible. There are more and more arrests for petty crimes as the violent crime rate goes down. Those petty crime arrests lead to big fees and sometimes jail time if you can’t pay the fee. Look at Ferguson as an example of what this kind of frustration can lead to.
Once you are caught in the court system, private probation companies act as abusive debt collectors, and nobody controls their fees, which can be outrageous. To be clear, we let this happen in the name of saving money: private for-profit companies like this guarantee that they won’t cost anything to the local government because they make the people on probation pay for services.
And even though that’s an outrageous and predatory system, it’s not likely to go away. Once they are officially branded as criminals, the poor often lose their voting rights, which means they have little political recourse to protect themselves. On the flip side, they are largely silent about their struggles for the same reason.
Once you think about our economic health this way, you realize how comparatively meaningless the GDP is. It is no longer a good proxy for true economic health, where all classes would be more or less better off as it went up.
And until we get on the same page, where we all go up and down together, it is a mathematical fact that no one statistic could possibly capture the progress we are or are not making. Instead, we need to think distributionally.
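To make that concrete, here is a minimal sketch of what sampling from different parts of the distribution looks like. The two income lists are invented purely for illustration (they are not real data), and the choice of the 10th percentile, median, and 90th percentile is just one assumption about which parts of the distribution to look at.

```python
from statistics import mean, median, quantiles

# Hypothetical household incomes (in $1,000s) for two consecutive years.
# These numbers are made up for illustration only.
year_1 = [12, 18, 25, 32, 40, 48, 55, 70, 120, 400]
year_2 = [11, 17, 24, 31, 39, 47, 55, 72, 140, 520]


def summarize(incomes):
    """Return the aggregate (mean) alongside a few points of the distribution."""
    deciles = quantiles(incomes, n=10)  # nine cut points: 10th .. 90th percentile
    return {
        "mean": mean(incomes),
        "p10": deciles[0],
        "median": median(incomes),
        "p90": deciles[-1],
    }


def growth(before, after):
    """Relative change from one year to the next."""
    return (after - before) / before


before, after = summarize(year_1), summarize(year_2)
for name in before:
    print(f"{name:>6}: {growth(before[name], after[name]):+.1%}")
```

With these made-up numbers the mean grows while the median and the 10th percentile both shrink, which is exactly the situation where a single aggregate tells the wrong story for most of the people in it.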
Aunt Pythia is ginormously and ridonkulously excited to be here. She just got back from a nifty bike ride to the other side of the Hudson and took this picture of this amazing city on this amazing day:
OK, so full disclosure. Aunt Pythia kind of blew her load, so to speak, on the sex questions last week, so she’s making do with coyly answering nerdy questions. Because that’s what we got.
I hope you enjoy her efforts, and even if you despise them – especially if you despise them – don’t forget to:
please think of something to ask Aunt Pythia at the bottom of the page!
Hi Aunt Pythia,
I’m a math student at MIT, where you did a postdoc. I’m also into computers, and am considering working in some finance classes. I could see myself being happy working for some big financial company that I don’t really care about, as long as I have interesting problems to work on, make a ton of money, and have bright people I get to work with.
My interests right now are in very pure math; I get chills just thinking about category-theoretic concepts. I’m planning to learn commutative algebra and algebraic geometry soon. I’m also likely to take stochastic calculus.
What kind of math did you do? Any tips on whether taking the pure math I love will be of use, or at least get me “cred” with financial companies?
I do love math, and seeing that you did math at MIT and have seen this world of things, maybe you have some advice to offer me.
Thank you dearly.
Don’t do it!
Don’t take the math to get “cred” with financial companies. Do what is sexy and beautiful to you. If you love category theory, do that, then do algebra and algebraic geometry. I did number theory in the form of arithmetic algebraic geometry myself. It’s awesomely beautiful and I don’t regret one moment of it.
Let’s say you do decide to go into the “real world.” At the end of the day, if you can do that math stuff we’ve been talking about, you can learn other stuff too. So I’m not going to worry about you on the technical side of things.
On the other side of things, I’d like you to rethink the idea that you “don’t mind who you work for as long as you have interesting problems.” Is that really true? Once you leave pure math there are real applications of your work, and they affect real people. Shit gets real real quick and stuff matters, and I urge you to think it through some more.
Dear Aunt Pythia,
Do all mathematicians visualize their problems? From a logical viewpoint there are a lot of mathematical spaces that don’t map onto an imagined 3d workspace, but based on limited conversations with working mathematicians, they seem to me to do it at least at some stage of problem solving.
(I’m more of a physicist who visualizes nearly everything so maybe I’m misreading them.)
Most, but not all. I once had a conversation with someone who couldn’t understand my drawing of a geometric map between spaces. I was explaining the concept visually (or at least I thought I was!) but he forced me to write it down with double sums and formulas, and I thought that was the weirdest thing ever, but that’s how it became understandable to him.
In general we do think visually, although we really can’t think beyond three dimensions (even though we pretend we can). I guess time makes it 4. Most geometers I know, ironically, don’t have a very good working sense of 3 dimensions, and definitely don’t have a good sense of direction!
Come to think of it my sample is too small, so I’m mostly just saying that for fun. It would be neat to get actual statistics on that. Maybe if I’m ever pulled into going to JMM again I’ll make people fill out forms. Oh wait, I’m going to JMM this January.
I can ask about this, it’s a nice question! Readers, what else should I poll math nerds on?
Dear Aunt Pythia,
I’m an American mutt and for a while I was annoyed when people asked “Where are you from” or “What’s your nationality”. I think I was sensitive to it because kids wanted to narrow down exactly which ethnic slurs to use. But as an adult, mostly people are just curious, and I’m happy to share since I’m curious about them too.
When I meet someone with an accent, I’m curious about them and their background, what it’s like in their home country, how they came to the US, etc.
What is an appropriate way to ask about someone’s ethnic background or country of origin? It seems like you should be able to ask anyone this question; it just seems rude when that person is different from you. Do you know what I mean?
WHy Ask That Rude qUestion
I like the subtle sign-off!
Here’s the thing, I think you nailed it. If your intention is to be mean, then don’t ask it. If your intention is to be friendly and to make a connection, then go ahead and ask it! I always ask cabbies where they come from, and then I get to learn about their countries. I have never experienced someone who doesn’t want to talk to me about their home country, and I’ve made quite a few friends. I’ve been invited to so many countries for visits, and that is always so incredibly generous and sweet! People are amazing.
Of course, some people just don’t do this kind of small-talk, and I get that too. It’s not for everyone. But it’s super fun for us extroverts.
Dear Aunt Pythia,
First off, your blog is both entertaining and informative, and you’ve found the sweet spot combination of the two that makes it addictive.
I find your work with the Lede program at Columbia fascinating and relevant to the growing, amorphous “big data” movement. I am a frequent visitor of websites such as Fivethirtyeight, which Nate Silver has rebranded as a news source that derives its stories from statistics and big data analytics. Even other sources, such as The Atlantic, have begun to follow suit and incorporate large statistical analyses into some of their stories. This experiment of basing our news stories on statistics brings hope that we can move closer to the ideal of an unbiased account.
In light of this new format (and your school), what sources do you consider the best? Are there any that you visit to get an insightful statistical perspective on the news? Or do you side with the criticism that many of these sites fuel a sensationalist, biased view of the world intended to spawn viral stories?
Will we ever find the right place for statistics in the news?
Considering unbiased reality in our ubiquitous (news)stories
Holy crap, nice sign-off. And thanks for being addicted to mathbabe! All my evil plans are working. Time to start on the next phase… moo-hooo-hahahahahaha.
OK, so here’s the thing. We will never have unbiased accounts. Never. At the very least we will have bias in the way that data is collected.
What I’ve spent the summer talking to my students about is getting used to the fact that there will always be biases, and how we therefore do our best to be at least somewhat aware of them, and try very hard not to obscure them. Transparency is the new objectivity!
This is of course disappointing to people who want there to be “one truth,” but that’s how science is. After a while we get used to the disappointment and we can all appreciate some really good signal/noise ratios.
As for the right place for statistics in the news, I think we’re figuring that out right now, and I’m excited to be part of it. And holy shit, have you seen the new ProPublica work on the Louisiana coast? Those guys are killing it.
Please submit your well-specified, fun-loving, cleverly-abbreviated question to Aunt Pythia!
Any time I see an article about the evaluation system for teachers in New York State, I wince. People get it wrong so very often. Yesterday’s New York Times article written by Elizabeth Harris was even worse than usual.
First, her wording. She mentioned a severe drop in student reading and math proficiency rates statewide and attributed it to a change in the test to the Common Core, which she described as “more rigorous.”
The truth is closer to “students were tested on stuff that wasn’t in their curriculum.” And as you can imagine, if you are tested on stuff you didn’t learn, your score will go down (the Common Core has been plagued by a terrible roll-out, and the timing of this test is Exhibit A). Wording like this matters, because Harris is setting up her reader to attribute the falling scores to bad teachers.
Harris ends her piece with a reference to a teacher-tenure lawsuit: ‘In one of those cases, filed in Albany in July, court documents contrasted the high positive teacher ratings with poor student performance, and called the new evaluation system “deficient and superficial.” The suit said those evaluations were the “most highly predictive measure of whether a teacher will be awarded tenure.”’
In other words, Harris is painting a picture of undeserving teachers sneaking into tenure in spite of not doing their job. It’s ironic, because I actually agree with the statement that the new evaluation system is “deficient and superficial.” But I think it is overly punitive to teachers – overly random, really, since it incorporates the toxic VAM model – whereas her framing implies it is insufficiently punitive.
Let me dumb Harris’s argument down even further: How can we have 26% English proficiency among students and 94% effectiveness among teachers?! Let’s blame the teachers and question the legitimacy of tenure.
Indeed, after reading the article I felt like looking into whether Harris is being paid by David Welch, the Silicon Valley dude who has vowed to fight teacher tenure nationwide. More likely she just doesn’t understand education and is convinced by simplistic reasoning.
In either case, she clearly needs to learn something about statistics. For that matter, so do other people who drag out this “blame the teacher” line whenever they see poor performance by students.
Because here’s the thing. Beyond obvious issues like switching the content of the tests away from the curriculum, standardized test scores everywhere are hugely dependent on the poverty levels of students. Some data:
It’s not just in this country, either:
The conclusion is that, unless you think bad teachers have somehow taken over poor schools everywhere and booted out the good teachers, and good teachers have taken over rich schools everywhere and booted out the bad teachers (which is supposed to be impossible, right?), poverty has much more of an effect than teachers.
Just to clarify this reasoning, let me give you another example: we could blame bad journalists for lower rates of newspaper readership at a given paper, but since newspaper readership is going down everywhere we’d be blaming journalists for what is a cultural issue.
Or, we could develop a process by which we congratulate specific policemen for a reduced crime rate, but then we’d have to admit that crime is down all over the country.
I’m not saying there aren’t bad teachers, because I’m sure there are. But by only focusing on rooting out bad teachers, we are ignoring an even bigger and harder problem. And no, it won’t be solved by privatizing and corporatizing public schools. We need to address childhood poverty. Here’s one more visual for the road:
For a while now I’ve been thinking I should build a decision tree for deciding which algorithm to use on a given data project. And yes, I think it’s kind of cool that “decision tree” would be an outcome on my decision tree. Kind of like a nerd pun.
I’m happy to say that I finally started work on my algorithm decision tree, thanks to this website called gliffy.com which allows me to build flowcharts with an easy online tool. It was one of those moments when I said to myself, this morning at 6am, “there should be a start-up that allows me to build a flowchart online! Let me google for that” and it totally worked. I almost feel like I willed gliffy.com into existence.
So here’s how far I’ve gotten this morning:
I looked around the web to see if I’m doing something that’s already been done and I came up with this:
I appreciate the effort but this is way more focused on the size of the data than I intend to be, at least for now. And here’s another one that’s even less like the one I want to build but is still impressive.
Because here’s what I want to focus on: what kind of question are you answering with which algorithm? For example, with clustering algorithms you are, you know, grouping similar things together. That one’s easy, kind of, although plenty of projects have ended up being clustering or classifying algorithms whose motivating questions did not originally take on the form “how would we group these things together?”.
In other words, the process of getting at algorithms from questions is somewhat orthogonal to the normal way algorithms are introduced, and for that reason it’s taking me some time to decide what the questions are that I need to ask in my decision tree. Right about now I’m wishing I had taken notes when my Lede Program students asked me to help them with their projects, because embedded in those questions were some great examples of data questions in search of an algorithm.
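Just to show the shape of the thing, here is a rough sketch of a question-first tree as a nested dictionary. Every question and every leaf in it is a placeholder I made up for illustration; it is not the flowchart itself, and the hard part (choosing the right questions) is exactly what is still undecided.

```python
# A tiny question-first decision tree. All questions and leaves below are
# hypothetical placeholders, not the contents of the actual flowchart.
TREE = {
    "question": "Do you already have labels for the thing you care about?",
    "yes": {"leaf": "classification (for example, a decision tree)"},
    "no": {
        "question": "Are you trying to group similar things together?",
        "yes": {"leaf": "clustering"},
        "no": {"leaf": "not decided yet: this branch needs more questions"},
    },
}


def walk(tree, answers):
    """Follow a list of yes/no answers (True/False) down the tree.

    Returns the suggested algorithm family if a leaf is reached, or the
    next unanswered question if the answers run out first.
    """
    node = tree
    for answer in answers:
        if "leaf" in node:
            break
        node = node["yes" if answer else "no"]
    return node.get("leaf", node.get("question"))


print(walk(TREE, [True]))          # a leaf: a classification suggestion
print(walk(TREE, [False, True]))   # a leaf: clustering
print(walk(TREE, [False]))         # not at a leaf yet, so the next question
```

Keeping the tree as plain data rather than a pile of if-statements means new questions can be slotted in without touching the code that walks it.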
Please give me advice!
I don’t usually blog about my kids, but my 14-year-old son has explicitly given me his blessing to post his recent stand-up performance at the Gotham Comedy Club:
The look he gives the audience at the end is my favorite part.
Yesterday I read a book written by Carole Marshall which she called Stubborn Hope: Memoir of an Urban Teacher (thanks to Ernest Davis for sending it to me). Just to give you an idea of how quick this read is, I read it before class. I think it took about 1 hour and 10 minutes in all.
In a nutshell, it was the story of a really hard-working and dedicated urban school teacher who learned how to teach reading skills, and prose and poetry writing skills to her poverty-stricken students in the urban Providence, RI area. She develops curriculum, making it relevant to the kids, and gets them to read every night and to aspire to college. The school that she mostly taught at is profiled in this article from the Brown Daily Herald.
She’s a really good writer herself, and she profiles a bunch of her students with enough details to make you feel enormous empathy for their struggles. In other words, she makes this shit very very real. After reading this you stop wondering why we see a strong negative correlation between standardized test scores and poverty levels, because it is so obvious.
You might want to check out this video to get a satirical idea of what this woman was like and what she was dealing with (hat tip Jenn Rubinovitz):
Here’s the thing. We need nice white ladies in our schools! And of course nice other people too.
But we are presently losing such dedicated people. Carole Marshall, the author of these memoirs, quit teaching after the school system she worked in was taken over by the mindless testing zombies. She describes her experience like this:
After spending years refining strategies for getting my students to become enthusiastic readers and writers on thoughtful, relevant curriculum, I was being forced to teach canned curriculum purchased for millions of dollars from textbook publishers who knew nothing about urban teaching.
School and district administrators roamed the halls and classrooms, taking notes on shiny new iPads, to make sure teachers were on the same page every day as every other teacher in our grade and subject in the district. All the activities we had used in the past to open our students to a world beyond the narrow constraints of their neighborhoods were no longer permitted; they were seen as time wasted. Every path to good teaching was effectively blocked off.
It had become impossible to do the things with students that I believe teachers need to be able to do. What was going on in the classrooms could no longer be called teaching. When I realized that, it was a sad day. At the end of that year, I left teaching.
That was in 2012, I believe. Since then she’s become more aware of the national disaster that is defined by the testing insanity. She even worked for a time with a test prep company based in Florida that was clearly scamming for the $5 million consultant fee and removing cherry-picked students from important classes so the school would look like it had improved based on the arbitrary measure of the month.
We are so used to pointing at examples of bad and defeated teachers and saying that they are the problem, and that a strict and regimented system of curriculum will improve the classrooms for the students of such teachers. And maybe in some cases that is true.
But when we do that we also push out really talented and inspirational teachers like Carole Marshall. It is painful to imagine how many great teachers have left the educational system because of No Child Left Behind and Race To The Top. Come to think of it, that would be a great data journalism project.
Last Friday Gillian Tett ran a profoundly disturbing article in the Financial Times entitled Mapping Crime – Or Stirring Hate? (hat tip Marcos Carreira). I’m sad to say this, given how much respect I normally have for her coverage of the financial crisis.
In the article, Tett describes the predictive policing model used by the Chicago police force, which told the police where to go to find criminals based on where people had been arrested in the past.
Her article reads like an advertisement for racist profiling. First she deftly and indirectly claims the model is super successful at lowering the murder rate without actually coming out and saying so (since she actually has only correlative evidence):
And when Weis launched the programme in early 2010, together with a clever policeman-cum-computer expert called Brett Goldstein, it delivered impressive results. In the first year the murder rate fell 5 per cent and then continued to tumble. Indeed by the summer of 2011 it looked as if Chicago’s annual death toll would soon drop below 400, the lowest since 1965. “The homicide rates for that summer were just crazy low compared to what we had been,” Weis observes.
But then, following his departure from the force, the programme was wound down in late 2011. And, tragically, the murder rate immediately rose again.
Here’s the thing: it’s really hard to actually know why murder rates go up and down. In New York City we’ve been using Stop & Frisk as the violent crime rates have been steadily falling in this city (and many others), and for a long time Bloomberg credited the Stop & Frisk practice for that. But when Stop & Frisk rates went down, murder rates didn’t shoot up. Just saying. And that’s ignoring the question of how reliable the police data is, which is another issue. Let’s take a look at her evidence for a longer time frame:
The reason I’m pointing out her bad statistics is that she needs them to set up the following, truly disturbing paragraphs (emphasis mine):
But while racism is rightly deemed unacceptable, computer programs pose more subtle questions. If a spreadsheet forecast has a racial imbalance, is this likely to reinforce existing human biases, or racial profiling? Or is a weather map of crime simply a neutral tool? To put it another way, does the benefit of using predictive policing outweigh any worries about political risk?
Personally, I think it does. After all, as the former CPD computer experts point out, the algorithms in themselves are neutral. “This program had absolutely nothing to do with race… but multi-variable equations,” argues Goldstein. Meanwhile, the potential benefits of predictive policing are profound.
No, Gillian Tett, there is no such thing as a neutral tool. No algorithm focused on human behavior is neutral. Anything which is trained on historical human behavior embeds and codifies historical and cultural practices. Specifically, this means that the fact that black Americans are nearly four times as likely as whites to be arrested on charges of marijuana possession even though the two groups use the drug at similar rates would be seen by such a model (or rather, by the people who deploy the model) as a fact of nature that is neutral and true. But it is in fact a direct consequence of systemic racism.
Put another way: if we had allowed a model to be used for college admissions in 1870, we’d still have 0.7% of women going to college. Thank goodness we didn’t have big data back then!
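Here is a deliberately crude sketch of that point. It is not the Chicago model or any real system; the "model" just learns, for each group, how often the historical decision was yes. The group labels and counts are invented, with group "B" set to a 0.7% historical rate to echo the figure above.

```python
from collections import defaultdict


def fit_base_rates(records):
    """Learn, per group, the historical fraction of positive outcomes."""
    counts = defaultdict(lambda: [0, 0])  # group -> [positives, total]
    for group, outcome in records:
        counts[group][0] += outcome
        counts[group][1] += 1
    return {group: pos / total for group, (pos, total) in counts.items()}


# Hypothetical history: 1,000 past decisions per group. The outcomes record
# what decision-makers did, not what applicants were capable of.
history = (
    [("A", 1)] * 300 + [("A", 0)] * 700
    + [("B", 1)] * 7 + [("B", 0)] * 993
)

model = fit_base_rates(history)
for group, rate in sorted(model.items()):
    print(f"group {group}: predicted acceptance rate {rate:.1%}")
```

Nothing in the code mentions why group "B" was accepted so rarely. The model simply hands the old rate back as a prediction, and a fancier multi-variable version trained on the same outcomes would still be fitting those same historical decisions.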
This is very scary to me, when even Gillian Tett, who famously predicted the financial crisis in 2006, can be fooled. We clearly have a lot of work to do.