## Gender And The Harvard Math Department

*This is a guest post by Meena Boppana, a junior at Harvard and former president of the Harvard Undergraduate Math Association (HUMA). Meena is passionate about addressing the gender gap in math and has co-led initiatives including the Harvard math survey and the founding of the Harvard student group Gender Inclusivity in Math (GIIM).*

I arrived at Harvard in 2012 head-over-heels in love with math. Encouraged to think mathematically since I was four years old by my feminist mathematician dad, I had even given a TEDx talk in high school declaring my love for the subject. I was certainly qualified and excited enough to be a math major.

Which is why, three years later, I think about how it is that virtually all my female friends with insanely strong math backgrounds (e.g. math competition stars) decided not to major in math (I chose computer science). This year, there were no female students in Math 55a, the most intense freshman math class, and only two female students graduating with a primary concentration in math. There are also a total of zero tenured women faculty in Harvard math.

So, I decided to do some statistical sleuthing and co-directed a survey of Harvard undergraduates in math. I was inspired by the work of Nancy Hopkins and other pioneering female scientists at MIT, who quantified gender inequities at the Institute – even measuring the square footage of their offices – and sparked real change. We got a 1/3 response rate among all math concentrators at Harvard, with 150 people in total (including related STEM concentrations) filling it out.

The main finding of our survey analysis is that the dearth of women in Harvard math is far more than a “pipeline issue” stemming from high school. So, the tale that women are coming in to Harvard knowing less math and consequently not majoring in math is missing much of the picture. Women are dropping out of math during their years at Harvard, with female math majors writing theses and continuing on to graduate school at far lower rates than their male math major counterparts.

And it’s a cultural issue. Our survey indicated that many women would like to be involved in the math department and aren’t, most women feel uncomfortable as a result of the gender gap, and women feel uncomfortable in math department common spaces.

The simple act of talking about the gender gap has opened the floodgates to great conversations. I had always assumed that because no one was talking about the gender gap, no one cared. But after organizing a panel on gender in the math department which drew 150 people with a roughly equal gender split and students and faculty alike, I realized that my classmates of all genders feel more disempowered than apathetic.

The situation is bad, but certainly not hopeless. Together with a male freshman math major, I am founding a Harvard student group called Gender Inclusivity in Math (GIIM). The club has the two-fold goal of increasing community among women in math, including dinners, retreats, and a women speaker series, and also addressing the gender gap in the math department, continuing the trend of surveys and gender in math discussions. The inclusion of male allies is central to our club mission, and the support from male allies at the student and faculty level that we have received makes me optimistic about the will for change.

Ultimately, it is my continued love for math which has driven me to take action. Mathematics is too beautiful and important to lose 50 percent (or much more when considering racial and class-based inequities) of the potential population of math lovers.

## Nick Kristof is not Smarter than an 8th Grader

*This is a post by Eugene Stern, originally posted on his blog sensemadehere.wordpress.com.*

About a week ago, Nick Kristof published this op-ed in the New York Times. Entitled Are You Smarter than an 8th Grader, the piece discusses American kids’ underperformance in math compared with students from other countries, as measured by standardized test results. Kristof goes over several questions from the 2011 TIMSS (Trends in International Mathematics and Science Study) test administered to 8th graders, and highlights how American students did worse than students from Iran, Indonesia, Ghana, Palestine, Turkey, and Armenia, as well as traditional high performers like Singapore. “We all know Johnny can’t read,” says Kristof, in that finger-wagging way perfected by the current cohort of New York Times op-ed columnists; “it appears that Johnny is even worse at counting.”

The trouble with this narrative is that it’s utterly, demonstrably false.

My friend Jordan Ellenberg pointed me to this blog post, which highlights the problem. In spite of Kristof’s alarmism, it turns out that American eighth graders actually did quite well on the 2011 TIMSS. You can see the complete results here. Out of 42 countries tested, the US placed 9th. If you look at the scores by country, you’ll see a large gap between the top 5 (Korea, Singapore, Taiwan, Hong Kong, and Japan) and everyone else. After that gap comes Russia, in 6th place, then another gap, then a group of 9 closely bunched countries: Israel, Finland, the US, England, Hungary, Australia, Slovenia, Lithuania, and Italy. Those made up, more or less, the top third of all the countries that took the test. Our performance isn’t mind-blowing, but it’s not terrible either. So what the hell is Kristof talking about?

You’ll find the answer here, in a list of 88 publicly released questions from the test (not all questions were published, but this appears to be a representative sample). For each question, a performance breakdown by country is given. When I went through the questions, I found that the US placed in the top third (top 14 out of 42 countries) on 45 of them, the middle third on 39, and the bottom third on 4. This seems typical of the kind of variance usually seen on standardized tests. US kids did particularly well on statistics, data interpretation, and estimation, which have all gotten more emphasis in the math curriculum lately. For example, 80% of US eighth graders answered this question correctly:

Which of these is the best estimate of (7.21 × 3.86) / 10.09?

(A) (7 × 3) / 10 (B) (7 × 4) / 10 (C) (7 × 3) / 11 (D) (7 × 4) / 11

More American kids knew that the correct answer was (B) than Russians, Finns, Japanese, English, or Israelis. Nice job, kids! And let’s give your teachers some credit too!
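For anyone who wants to double-check the estimate, here is a quick Python sketch. The variable names are mine, not anything from the TIMSS materials; it simply compares each answer choice against the exact value.

```python
# Exact value of the expression from the test question.
exact = (7.21 * 3.86) / 10.09

# The four rounded estimates offered as answer choices.
choices = {
    "A": (7 * 3) / 10,
    "B": (7 * 4) / 10,
    "C": (7 * 3) / 11,
    "D": (7 * 4) / 11,
}

# Pick the choice whose value is closest to the exact answer.
best = min(choices, key=lambda label: abs(choices[label] - exact))
print(best, round(exact, 2), choices[best])
```

The exact value comes out near 2.76, and (B), at 2.8, is the closest of the four.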

But Kristof isn’t willing to do either. He has a narrative of American underperformance in mind, and if the overall test results don’t fit his story, he’ll just go and find some results that do! Thus, the examples in his column. *Kristof literally went and picked the two questions out of 88 on which the US did the worst, and highlighted those in the column*. (He gives a third example too, a question in which the US was in the middle of the pack, but the pack did poorly, so the US’s absolute score looks bad.) And, presto! — instead of a story about kids learning stuff and doing decently on a test, we have yet another hysterical screed about Americans “struggling to compete with citizens of other countries.”

Kristof gives no suggestions for what we can actually do better, by the way. But he does offer this helpful advice:

Numeracy isn’t a sign of geekiness, but a basic requirement for intelligent discussions of public policy. Without it, politicians routinely get away with using statistics, as Mark Twain supposedly observed, the way a drunk uses a lamppost: for support rather than illumination.

So do op-ed columnists, apparently.

## How many NYC teachers are arbitrarily punished by the VAM? About 578 per year.

There’s been an important update in the thought experiment I started yesterday. Namely, a reader (revuluri) has provided me with a link to show how many teachers are considered “ineffective,” which was my shorthand for scoring either third or fourth in the four categories.

According to page 5 of this document, that percentage was 16% in 2011-2012, 17% in 2012-2013, and 16% in 2013-2014. We’ll take this to mean that the true cutoff is about 16.3%. Using my formula from yesterday, that means that after 4 years, about

$$1 - (1-p)^4 - 4p(1-p)^3 \approx 0.127, \qquad p = 0.163,$$

or 12.7% of teachers going up for tenure in the new system will be arbitrarily denied tenure based only on their VAM score.

How many people is that in a given year? Well, this document explains that in 2000, 9,000 teachers were hired and in 2008, 6,000 teachers were hired. I’ll assume my best guess for “teachers hired” in a given year is something between those two numbers, but I’ll also assume it’s closer to the latter since it is more recent information. Say 7,000 new teachers per year.

Of course, not all of them go up for tenure. There’s attrition. Say 35% of those teachers leave before the tenure decision is made (also guessing from this document). That leaves us with about 4,550 teachers going up for tenure each year, and 12.7% of them is 578 people.
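Here is a rough Python sketch of the arithmetic, for anyone who wants to plug in updated numbers. It assumes, as my reading of the rule, that a teacher is denied tenure when they land in the bottom group in at least two of four years, and that the years are independent coin flips with probability 16.3%. That assumption reproduces the 12.7% figure. The function and variable names are made up for illustration.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of at least k 'hits' in n independent trials, each with chance p."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

p_ineffective = 0.163   # assumed yearly chance of a bottom-two rating
hired_per_year = 7000   # crude guess from the post
attrition = 0.35        # crude guess from the post

# Assumption: denial means two or more bad ratings in four years.
denial_rate = prob_at_least(2, 4, p_ineffective)
up_for_tenure = hired_per_year * (1 - attrition)
denied = up_for_tenure * denial_rate

print(f"denial rate: {denial_rate:.3f}")
print(f"teachers up for tenure: {up_for_tenure:.0f}")
print(f"arbitrarily denied per year: {denied:.0f}")
```

With the unrounded rate the count lands at about 577; rounding the rate to 12.7% first gives the 578 quoted above. Either way, the estimate is only as good as the guesses feeding it.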

So, according to my crude estimates, about 578 people will be denied tenure simply based on this random number generator we call VAM. And as my reader said, this says nothing about the hard-to-measure damage done to all the good teachers trying to teach their kids but having to deal with this standardized testing nonsense. It’s a wonder anyone is willing to work here.

Please comment if you have updated numbers for anything here.

## I accept mathematical bribes

Last Friday I traveled to American University and gave an evening talk, where I met Jeffrey Hakim, a mathematician and designer who openly bribed me.

Don’t worry, it’s not that insidious. He just showed me his nerdy math wallet and said I could have one too if I blogged about it. I obviously said yes. Here’s my new wallet:

You might notice there is writing and pictures on my new wallet! They are mathematical, which is why I don’t feel bad about accepting this bribe: it’s all in the name of education and fun with mathematics. Let me explain the front and back of the wallet.

The front is a theorem:

$$1^3 + 2^3 + \cdots + n^3 = (1 + 2 + \cdots + n)^2$$

Here’s the thing, I’ve proven this. I have even assigned it to my students in the past to prove. We always use induction. This kind of identity is kind of made for induction, no? Don’t you think?

Well Jeffrey Hakim had an even better idea. His proof of Nicomachus’s Theorem is represented as a picture on the back of the wallet:

Here’s what I’d like you all to do: go think about why this is a proof of the above identity. Come back if you can’t figure it out, but if you can, just go ahead and pat yourself on your back and don’t bother reading the rest of this blogpost because it’s just going to explain the proof.

I’ll give you all a moment…

OK almost ready?

OK cool here’s why this is a proof.

First, convince yourself that this “pattern,” of building a frame of square boxes around the above square, can be continued. In other words, it’s a square of 4 1×1 boxes, framed by 2×2 boxes, framed by 3×3 boxes, and so on. It could go on forever this way, because if you focus on one side of the outside of the third layer, there are 4 3×3 boxes, so length $4 \times 3 = 12$, and we need it to also be the length inside the 4th frame, which has 3 boxes of length 4. Since $3 \times 4 = 12$, we’re good. And that generalizes when it’s the $k$th layer, of course, since the outside of the $k$th layer will have $k+1$ boxes, each of length $k$, making the inside of the $(k+1)$st have $k$ boxes, each of length $k+1$.

OK, now here’s the actual trick. *What is the area of this box?*

I claim there are two ways to measure the area, and one of the ways will give you the left hand side of Nicomachus’s Theorem but the other way will give you the right hand side of Nicomachus’s Theorem.

To be honest, it’s just one bit more complicated than that. Namely, the first way gives you something that’s 4 times bigger than the left hand side of Nicomachus’s Theorem and the second way gives you something 4 times bigger than the right hand side of Nicomachus’s Theorem.

Why don’t you go think about this for a few minutes, because the clue might be all you need to figure it out.

Or, perhaps you just want me to go ahead and explain it. I’ll do that! That’s why I got the wallet!

OK, now imagine isolating the top right quarter of the above figure. Like this:

That’s a square, obviously, so its area is the square of the length of any side. But if you go along the bottom, the length is obviously $1 + 2 + 3 + 4$, which means the area is the square of that, $(1 + 2 + 3 + 4)^2$.

And since we know we can generalize the original figure to go up to $n$ instead of just 4, one quarter of the figure will have area $(1 + 2 + \cdots + n)^2$, which is to say the entire figure will have area $4(1 + 2 + \cdots + n)^2$.

That’s 4 times the right-hand side of the theorem, so we’re halfway done!

Next, we will compute the area of the original figure a different way, namely by simply adding up and counting all the differently colored squares that make it up. Assume that we continue changing colors every time we get a new layer.

So, there are 4 1×1 squares, and there are 8 2×2 squares, and there are 12 3×3 squares, and there are 16 4×4 squares. In the generalized figure, there would be $4k$ squares of size $k \times k$ in the $k$th layer.

So if you look at the area of the generalized figure which is all one color, say the $k$th color, it will be of the form

$$4k \cdot k^2 = 4k^3.$$

That means the overall generalized figure, with $n$ layers, will have total area:

$$\sum_{k=1}^{n} 4k^3 = 4\left(1^3 + 2^3 + \cdots + n^3\right).$$

Since that’s just 4 times the left-hand side of the theorem, we’re done.
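If you’d like a sanity check on the two ways of counting, here is a small Python sketch. It is not a proof, just a numerical check; the function names are mine, and I’m assuming the figure has `n` layers, with `4k` squares of side `k` in layer `k`, as described above.

```python
def area_by_layers(n):
    """Add up the colored squares: layer k has 4k squares, each of area k*k."""
    return sum(4 * k * k**2 for k in range(1, n + 1))

def area_by_quarters(n):
    """Four quarter-squares, each with side length 1 + 2 + ... + n."""
    side = sum(range(1, n + 1))
    return 4 * side**2

def area_by_outer_side(n):
    """The whole figure is a square whose side is n + 1 boxes of length n."""
    return (n * (n + 1)) ** 2

for n in range(1, 6):
    print(n, area_by_layers(n), area_by_quarters(n), area_by_outer_side(n))
```

All three columns agree for every `n` you try, which is the identity on the wallet multiplied by 4.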

Notes:

- this would be a fun thing to do with a kid.
- there’s more math inside the wallet which I haven’t gotten to yet.
- After staring at the picture for a few more minutes, I just realized the total area of the whole (generalized) thing is obviously $(n(n+1))^2$, which is to say that either the left-hand side or right-hand side of the original identity is one fourth of that. Cool!

## Guest post: Be more careful with the vagina stats in teaching

*This is a guest post by Courtney Gibbons, an assistant professor of mathematics at Hamilton College. You can see her teaching evaluations on ratemyprofessor.com. She would like you to note that she’s been tagged as “hilarious.” Twice.*

Lately, my social media has been blowing up with stories about gender bias in higher ed, especially course evaluations. As a 30-something, female math professor, I’m personally invested in this kind of issue. So I’m gratified when I read about well-designed studies that highlight the “vagina tax” in teaching (I didn’t coin this phrase, but I wish I had).

These kinds of studies bring the conversation about bias to the table in a way that academics can understand. We can geek out on experimental design and on the fact that the research is peer-reviewed and therefore passes some basic legitimacy tests.

Indeed, the conversation finally moves out of the realm of folklore, where we have “known” for some time that students expect women to be nurturing in addition to managing the class, while men just need to keep class on track.

Let me reiterate: as a young woman in academia, I want deans and chairs and presidents to take these observed phenomena seriously when evaluating their professors. I want to talk to my colleagues and my students about these issues. Eventually, I’d like to “fix” them, or at least game them to my advantage. (Just kidding. I’d rather fix them.)

However, let me speak as a mathematician for a minute here: bad interpretations of data don’t advance the cause. There’s beautiful link-bait out there that justifies its conclusions on the flimsy “hey, look at this chart” understanding of big data. Benjamin M. Schmidt created a really beautiful tool to visualize data he scraped from the website ratemyprofessor.com through a process that he sketches on his blog. The best criticisms and caveats come from Schmidt himself.

What I want to examine is the response to the tool, both in the media and among my colleagues. USAToday, HuffPo, and other sites have linked to it, citing it as yet more evidence to support the folklore: students see men as “geniuses” and women as “bossy.” It looks like they found some screenshots (or took a few) and decided to interpret them as provocatively as possible. After playing with the tool for a few minutes, which wasn’t even hard enough to qualify as sleuthing, I came to a very different conclusion.

If you look at the ratings for “genius” and then break them down further to look at positive and negative reviews separately, the word occurs predominantly in negative reviews. I found a few specific reviews, and they read “you have to be a genius to pass,” or something along those lines.

[Don’t take my word for it — search google for:

rate my professors “you have to be a genius”

and you’ll see how students use the word “genius” in reviews of professors. The first page of hits is pretty much all men.]

Here’s the breakdown for “genius”:

Similar results occur with “brilliant”:

Now check out “bossy” and negative reviews:

I thought that the phrase “terrible teacher” was more illuminating, because it’s more likely in reference to the subject of the review, and we’ve got some meaningful occurrences:

Who’s doing this reporting, and why aren’t we reading these reports more critically? Journalists, get your shit together and report data responsibly. Academics, be a little more skeptical of stories that simply post screenshots of a chart coupled with inciting prose from conclusions drawn, badly, from hastily scanned data.

Is this tool useless? No. Is it fun to futz around with? Yes.

Is it being reported and understood well? Resounding no!

I think even our students would agree with me: that’s just f*cked up.

## Male nerd privilege

I recently read this essay by Laurie Penny (hat tip Jordan Ellenberg) about male nerd privilege. Her essay stemmed from comment 171 of Scott Aaronson’s blogpost about whether MIT professor Walter Lewin, who was found to be harassing women, should also have had his OpenCourseWare physics course taken down. Aaronson says no.

Personally, I think it should, because if I’m a woman who was harassed by that dude, I don’t want to see physics represented by my harasser up on MIT’s website; it would not make me feel welcome to the MIT community. Physics is a social community activity, after all, just like mathematics, and it is important to feel safe doing physics in that community. Plus the courses will be available on YouTube and other places, it’s not like the physics represented in the course has been lost to humanity.

Anyhoo, I did really want to talk about white male nerd privilege. Penny makes a bunch of good points in her essay, but I think she misses a big opportunity as well.

Quick summary. Aaronson talks about how he spent his youth and formative years terrified, since he was a shy nerd boy. Penny talks about how she did too, but then on top of it had to deal with structural sexism. Good point, and entirely true in my experience. Her best line:

At the same time, I want you to understand that that very real suffering does not cancel out male privilege, or make it somehow alright. Privilege doesn’t mean you don’t suffer, which, I know, totally blows.

So, I had two responses to her piece.

First was, she was complaining about her childhood, but she wasn’t even fat! I mean, *GAWD*. She was complaining about being too *skinny*, of all things. Plus it’s not clear whether or not she came from an abusive home. So I’ve got like, at least two complaints up on her. She thinks *she’s* had it bad?!

My point being, we can’t actually win when we count up all the ways we were miserable. Because the truth is, most people were actually miserable in their childhood, or soon after it, or at some time. And by comparing that stuff we just get stuck in a cycle of feeling competitively sorry for ourselves and pointing fingers. We need to sympathize, not only with our former selves, but with other people.

And although she does end the essay with the idea that we have to transcend all of our personal bruises and wrongs, and call each other human, and forget our resentments, it doesn’t seem like she’s giving us a path towards that.

Because, and here’s my second point, she doesn’t do the big thing of naming all of *her* privileges. Like, that nerds get good jobs. And that white people get loads of resources and attention and benefit of the doubt just for being white. At the end of the day, we are privileged to be sitting around talking about privilege. We are not worried about dying of hunger or exposure.

When Aaronson complains that naming male privilege is shaming, I’m prone to agree, at least if it’s done like this. What I’d propose is to figure out a way to talk about these structural problems in an aspirational way. How can we help make things fairer? How can we move this problem to the next level? Scott, you’re wicked smart, want to be on a taskforce with me?

Would it help if we gave it another name? Basic human rights, perhaps? Because that’s what we’re talking about, at the end of the day. The right to be free, to not get shot by the police, the right to hold a good job and care for your family, stuff like that.

Of course, there are plenty of people who are unwilling to move to the next level because they don’t acknowledge the structural racism, sexism, and other stuff at all. They don’t see the current situation as problematic. But on the other hand, there are loads of people who do, and Aaronson is clearly one of them.

As for problems for women in STEM, we’ve already studied this and we all know that both men *and* women are sexist, so it’s obviously not a blame game here. Instead, it’s a real cultural conundrum which we would like to approach thoughtfully and we’d like to make progress on as a team.

## Notices of the AMS is killing it

I am somewhat surprised to hear myself say this, but this month’s Notices of the AMS is killing it. Generally speaking I think of it as rather narrowly focused but things seem to be expanding and picking up. Scanning the list of editors, they do seem to have quite a few people that want to address wider public issues that touch and are touched by mathematicians.

First, there’s an article about how the h-rank of an author is basically just the square root of the number of citations for that author. It’s called *Critique of Hirsch’s Citation Index: A Combinatorial Fermi Problem* and it’s written by Alexander Yong. Doesn’t surprise me too much, but there you go, people often fall in love with new fancy metrics that turn out to be simple transformations of old discarded metrics.

Second, and even more interesting to me, there’s an article that explains the mathematical vapidness of a widely cited social science paper. It’s called *Does Diversity Trump Ability? An Example of the Misuse of Mathematics in the Social Sciences* and it’s written by Abby Thompson. My favorite part of the paper:

Oh, and here’s another excellent take-down of a part of that paper:

Let me just take this moment to say, right on, Notices of the AMS! And of course, right on Alexander Yong and Abby Thompson!

## The green-eyed/ blue-eyed puzzle/ conundrum

Today I want to share a puzzle that my friend Aaron Abrams told me a few days ago. I’m sure some of you have heard it before, but it’s confusing me, so I’m asking for your help.

**Set-up**

Here’s the setup. There’s an island of people, all of whom have either blue eyes or green eyes. By social convention they never discuss eye color, because there’s a tragic rule that states that, if you ever figure out your eye color, you have to leave the island within 24 hours. Oh, and there are no mirrors.

OK, get it? So think of the island as pretty small, maybe 100 people, so you know everyone else’s eye color but not your own.

Here’s what happens next. Some castaway arrives by swimming onto the island, stays for a few days and hangs out with the folks there eating island food and having island parties, and then after building himself a boat he prepares to leave. Not being trained in the social customs of the island, on the day he leaves he says, “hey, it’s good to see some people with green eyes here!”.

**Puzzle**

So the puzzle is, what happens next?

Here’s what’s obvious. If you are a person who only sees blue eyes, you know by his statement that you must have green eyes. So you have to leave the next day.

But actually he said “some people.” So even if you only see *one other person* with green eyes, then you have to leave, with that other green-eyed person, after one day.

With me so far?

But hey, what if you see *two* other people with green eyes? Well, you might think you’re safe, and you’d wait to see them leave together the next day. But what if they *don’t* leave after one day? That must mean that *you also have green eyes*. Then all three of you have to leave, after two days. Get it?

Then you work by induction. If you see N other people with green eyes, they should all leave after N-1 days, or else you have green eyes too and all (N+1) of you leave after N days.
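The induction can be sketched as a tiny Python simulation. This is only my way of modeling it, and it rests on an assumption: the castaway’s “some people” makes it common knowledge that at least 2 islanders have green eyes, and every day that passes with nobody leaving raises that commonly known minimum by one. The function name and parameters are invented for illustration.

```python
def days_until_green_eyed_leave(num_green, announced_minimum=2):
    """Days after the announcement until the green-eyed islanders leave."""
    if num_green < announced_minimum:
        raise ValueError("the castaway's statement would have been false")

    known_minimum = announced_minimum  # what everyone knows everyone knows
    seen_by_each_green = num_green - 1  # a green-eyed person sees all the others
    days = 1
    while seen_by_each_green >= known_minimum:
        # Nobody could deduce anything today, so nobody leaves,
        # and tomorrow everyone knows there is at least one more.
        known_minimum += 1
        days += 1
    # Each green-eyed islander sees fewer than the known minimum,
    # so the missing green-eyed person must be themselves.
    return days

for greens in (2, 3, 17):
    print(greens, days_until_green_eyed_leave(greens))
```

Under these assumptions, seeing N other green-eyed people means leaving after N days, so 17 green-eyed islanders all leave after 16 days.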

**Conundrum**

OK, so here’s the conundrum. The guy who started this whole mess really didn’t do much. He just stated what was obvious to everyone already on the island, namely that some people had green eyes. I mean, yes, if there were really only 2 people with green eyes, then he clearly added real information, because both those people had thought only 1 person had green eyes.

But just for the fun of it, let’s assume there were 17 people with green eyes. Then the guy really didn’t add information. And yet, 16 days after the guy left, so do all the green-eyed islanders. So really the guy just started a count-down more than anything.

So, is that it? Is that what happened? Or was the original set-up inconsistent? Is it not an equilibrium at all? Or is it an unstable equilibrium?

**Saying**

In any case, Aaron and his friend Jamie have developed a saying, *it’s a green-eyed/ blue-eyed thing*, which means it’s an apparently information-free fact which changes everything. I think I’ll use that.

## Guest Post: Bring Back The Slide Rule!

*This is a guest post by Gary Cornell, a mathematician, writer, publisher, and recent founder of StemForums.*

I was having a wonderful ramen lunch with the mathbabe and, as is all too common when two broad minded Ph.D.’s in math get together, we started talking about the horrible state math education is in for both advanced high school students and undergraduates.

One amusing thing we discovered pretty quickly is that we had independently come up with the same (radical) solution to at least part of the problem: throw out the traditional sequence which goes through first and second year calculus and replace it with a unified probability, statistics, calculus course where the calculus component was only for the smoothest of functions and moreover the applications of calculus are only to statistics and probability. Not only is everything much more practical and easier to motivate in such a course, students would hopefully learn a skill that is essential nowadays: how to separate out statistically good information from the large amount of statistical crap that is out there.

Of course, the downside is that the (interesting) subtleties that come from the proofs, the study of non-smooth functions and for that matter all the other stuff interesting to prospective physicists like DiffEQ’s would have to be reserved for different courses. (We also were in agreement that Gonick’s beyond wonderful *“Cartoon Guide To Statistics”* should be required reading for all the students in these courses, but I digress…)

The real point of this blog post is based on what happened next, but first you have to know I’m more or less one generation older than the mathbabe. This meant I was both able and willing to preface my next point with the words: “You know when I was young, in one way students were much better off because…” Now it is well known that using this phrase to preface a discussion often poisons the discussion but occasionally, as I hope in this case, some practices from days gone by can, if brought back, help solve some of today’s educational problems.

By the way, and apropos of nothing, there is a cure for people prone to too frequent use of this phrase: go quickly to YouTube and repeatedly make them watch Monty Python’s Four Yorkshiremen until cured:

Anyway, the point I made was that I am a member of the last generation of students who had to use slide rules. Another good reference is: here. Both these references are great and I recommend them. (The latter being more technical.) For those who have never heard of them, in a nutshell, a slide rule is an analog device that uses logarithms under the hood to do (sufficiently accurate in most cases) approximate multiplication, division, roots etc.

The key point is that using a slide rule *requires* the user to keep track of the “order of magnitude” of the answers, because slide rules *only* give you four or so significant digits and leave placing the decimal point to you. This meant students of my generation, when taking science and math courses, were continuously exposed to order of magnitude calculations, and you just couldn’t escape from having to make them *all* the time. Students nowadays, not so much. Calculators have made skill at doing order of magnitude calculations (or Fermi calculations as they are often lovingly called) an add-on rather than a baseline skill and that is a really bad thing. (Actually my belief that bringing back slide rules would be a *good thing* goes back a ways: when I was a Program Director at the NSF in the 90’s, I actually tried to get someone to submit a proposal which would have been called “On the use of a hand held analog device to improve science and math education!” Didn’t have much luck.)
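For readers who have never held one, here is a rough Python sketch of what a slide rule does under the hood. It is an illustration, not a model of any particular slide rule: the function name and the three-digit default are my own choices. The rule adds lengths proportional to logarithms to get the leading digits, and the power of ten is bookkeeping the user has to do in their head.

```python
import math

def slide_rule_multiply(a, b, digits=3):
    """Multiply two positive numbers the slide-rule way.

    Returns (mantissa, exponent): the leading digits read off the scale,
    and the power of ten the user must track separately.
    """
    if a <= 0 or b <= 0:
        raise ValueError("this sketch only handles positive numbers")

    # Split each number into a mantissa in [1, 10) and a power of ten.
    exp_a = math.floor(math.log10(a))
    exp_b = math.floor(math.log10(b))
    mant_a = a / 10**exp_a
    mant_b = b / 10**exp_b

    # Sliding the scales adds the logarithms of the mantissas.
    length = math.log10(mant_a) + math.log10(mant_b)
    carry = 0
    if length >= 1:  # ran off the end of the scale: wrap around
        length -= 1
        carry = 1

    # Only a few digits can be read off the scale by eye.
    mantissa = round(10**length, digits - 1)
    exponent = exp_a + exp_b + carry
    if mantissa >= 10:  # rounding pushed us up to the next decade
        mantissa /= 10
        exponent += 1
    return mantissa, exponent

mantissa, exponent = slide_rule_multiply(7.21, 3.86)
print(f"{mantissa} x 10^{exponent}")
```

The device hands you the `mantissa`; the `exponent` is the order of magnitude estimate, which is exactly the part calculators quietly took over.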

Anyway, if you want to try a slide rule out, alas, good vintage slide rules have become collectible and so expensive (because baby boomers like me are buying the ones we couldn’t afford when we were in high school), but the nice thing is there are lots of sites like this one which show you how to make your own.

Finally, while I don’t think they will ever be as much fun as using a slide rule, you could still allow calculators in classrooms.

Why? Because it would be trivial to have a mode in the TI calculator or the Casio calculator that all high school students seem to use, called “significant digits only.” With the right kind of problems this mode would *require* students to do order of magnitude calculations because they would never be able to enter trailing or leading zeroes and we could easily stick them with problems having a lot of them!

But calculators really bug me in classrooms, and so I can’t resist pointing out one last flaw in their omnipresence: it makes students believe in the possibility of ridiculously high precision results in the real world. After all, nothing they are likely to encounter in their work (and certainly not in their lives) will ever need (or even have) 14 digits of accuracy and, more to the point, when you see a high precision result in the real world, it is likely to be totally bogus when examined under the hood.

## I love math and I hate the Fields Medal

I’ve loved math since I can remember. When I was 5 I played with spirographs and learned about periodicity, which made me understand prime numbers as colorful patterns on a page. I always thought 5-fold symmetry was the most beautiful.

In high school I was incredibly lucky to attend HCSSiM and learn about the wonders of solving the Rubik’s cube with group theory.

Then I got to college at UC Berkeley and in my second semester was privileged to learn algebra (and later, Galois Theory!) from Ken Ribet, who became my very good friend. He brought me to have dinner with all sorts of amazing mathematicians, like Serge Lang and J.P. Serre and Barry Mazur and John Tate and of course his Berkeley colleagues Hendrik Lenstra and Robert Coleman and many others. Many of the main characters behind the story of solving Fermat’s Last Theorem were people I had met at dinner parties at Ken’s house, including of course Ken himself. Math was discussed in between slices of Cheese Board Pizza and fresh salad mixes from the Berkeley Bowl.

How lucky was I?!?

And I knew it, at least partially. Really the best thing about these generous and wonderful people was how joyful they were about the serious business of doing math. It was a pleasure to them, and it made them smile and even appear wistful if I’d mention my difficulties with tensor products, say.

They were incredibly inviting to me, and honestly I was spoiled. I had been invited into this society because I loved math and because I was devoting myself to it, and that was enough for them. Math is, after all, not an individual act, it is a community effort, and progress is to be celebrated and adored. And it wasn’t just any community, it was a really really nice group of guys who loved what they did for a living and wanted other cool and smart people to join.

I mention all this because I want to clarify how fucking cool it can be to be a mathematician, and what kind of group involvement and effort it can feel like, even though many of the final touches on the proofs are made inside closed offices. Being part of such a community, where math is so revered and celebrated, it is its own reward to be able to prove a theorem and tell your friends about it.

Hey, guess what?

This is true too! We always suspected it but now we can use it! How cool is that?

Now that I’ve explained how much I love math (and I still love math very much), let me explain why I hate the Fields Medal. Namely, because that group effort is utterly lost and is replaced with a synthetic and false myth of the individual genius working in isolation.

Here’s the thing, and I can say this now pretty confidently, journalism has rules about writing stories that don’t really work for math. When journalists are told to “put a face on the story,” they end up with all face and no story.

How else is a journalist going to write about progress in some esoteric field? The mathematics itself is naturally not within arm’s reach: mathematics is by nature deep and uses multiple layers of metaphor and notation which even trained mathematicians grapple with, never mind a new result on the very far edge of what is known. So it makes sense that the story becomes about the mathematician himself or herself.

It’s not just journalists, though. Certain mathematicians do their best to represent research mathematics, and sometimes it’s awesome, sometimes it kind of works, and sometimes it ends up being laughably or even embarrassingly simplistic. That’s the thing about math, it’s deep. It’s hard to boil down to a nut graf.

So here’s the thing, the Fields Medal is easy to understand (“it’s the Nobel Prize for math!”) but it’s incredibly and dangerously misleading. It gives the impression that we have these superstars who “have it” and then we have a bunch of wandering nerds who “don’t really have it.” That stereotype is a bad advertisement for mathematics and for mathematicians, who are actually much more generous and community-spirited than that.

Plus, now that I’m in full rant mode, can I just mention that the 40-year-old age limit for the award is *just terrible* and obviously works against certain people, especially women or men who take parenting seriously? I am not even going to explain that because it’s so freaking clear, and as a 42-year-old woman myself, may I say I’m just getting started. And yes, the fact that a woman has won the Fields Medal is a good thing, but it’s a silver lining on an otherwise big old rain cloud which I do my best to personally blow away.

And, lest I seem somehow mean to the Fields Medal winners, of course they are great mathematicians! Yes, yes they are! They’re all great, and there are many great mathematicians who never get awards, and doing great math and making progress is its own reward, and those mathematicians who do great work tend to be the ones who already have lots of resources and don’t need more, but I’m not saying they shouldn’t be celebrated, because they’re awesome, no question about it.

Here’s what I’d like to see: serious outward-facing science journalism centered around, or at least instructive towards, the incredible *collaborative* effort that is modern mathematics.

## Love StackOverflow and MathOverflow? Now there’s StemForums!

Everyone I know who codes uses stackoverflow.com for absolutely everything.

Just yesterday I met a cool coding chick who was learning python and pandas (of course!) with the assistance of stackoverflow. It is exactly what you need to get stuff working, and it’s better than having a friend to ask, even a highly knowledgeable friend, because your friend might be busy or might not know the answer, or even if your friend knew the answer her answer isn’t cut-and-paste-able.

If you are someone who has never used stackoverflow for help, then let me explain how it works. Say you want to know how to load a JSON file into python but you don’t want to write a script for that because you’re pretty sure someone already has. You just search for “import json into python” and you get results with vote counts:
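For the record, the kind of cut-and-paste-able answer that search tends to turn up boils down to the standard library’s `json` module. Here’s a minimal sketch; the file name `example.json` and the sample contents are made up so that the snippet runs on its own.

```python
import json
import os
import tempfile

def load_json(path):
    """Read a JSON file and return the parsed Python object."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# Made-up sample data, written to a temporary file so the example is runnable.
sample = {"topic": "json", "votes": [3, 1, 2]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    loaded = load_json(path)

print(loaded["votes"])
```

In real life you’d skip the temporary file and just call `load_json` on whatever file you already have.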

Also, every math nerd I know uses and contributes to mathoverflow.net. It’s not just for math facts and questions, either, there are interesting discussions going on there all the time. Here’s an example of a comment in response to understanding the philosophy behind the claimed proof of the ABC Conjecture:

OK well hold on tight because now there’s a new online forum, but not about coding and not about math. It’s about all the other STEM subjects, which since we’ve removed math might need to be called STE subjects, which is not catchy.

It’s called stemforums.com, and it is being created by a team led by Gary Cornell, mathematician, publisher at Apress, and beloved Black Oak bookstore owner.

So far only statistics is open, but other stuff is coming very soon. Specifically it covers, or soon will cover, the following fields:

- Statistics
- Biology
- Chemistry
- Cognitive Sciences
- Computer Sciences
- Earth and Planetary Sciences
- Economics
- Science & Math Education
- Engineering
- History of Science and Mathematics
- Applied Mathematics, and
- Physics

I’m super excited for this site, it has serious potential to make people’s lives better. I wish it had a category for Data Sciences, and for Data Journalism, because I’d probably be more involved in those categories than most of the above, but then again most data science-y questions could be inserted into one of the above. I’ll try to be patient on this one.

Here’s a screen shot of an existing Stats question on the site:

The site doesn’t have many questions, and even fewer answers, but as I understand it the first few people to get involved are eligible for Springer books, so go check it out.

## Nerding out: RSA on an iPython Notebook

Yesterday was a day filled with secrets and codes. In the morning, at The Platform, we had guest speaker Columbia history professor Matthew Connelly, who came and talked to us about his work with declassified documents. Two big and slightly depressing take-aways for me were the following:

- As records have become digitized, it has gotten easy for people to get rid of archival records in large quantities. Just press delete.
- As records have become digitized, it has become easy to trace the access of records, and in particular the leaks. Connelly explained that, to some extent, Obama’s harsh approach to leakers and whistleblowers might be explained as simply “letting the system work.” Yet another way that technology informs the way we approach human interactions.

After class we had section, in which we discussed the Computer Science classes some of the students are taking next semester (there’s a list here) and then I talked to them about prime numbers and the RSA crypto system.

I got really into it and wrote up an iPython Notebook which could be better but is pretty good, I think, and works out one example completely, encoding and decoding the message “hello”.

The underlying file is here but if you want to view it on the web just go here.
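If you just want the flavor without opening the notebook, here is a minimal textbook-RSA sketch that encodes and decodes “hello”. It is not the notebook’s code: the tiny primes, the public exponent, and the character-by-character encoding are all assumptions chosen to keep the example short, and nothing this small is remotely secure.

```python
# Toy RSA with tiny primes (assumed values, for illustration only).
p, q = 61, 53
n = p * q                   # public modulus
phi = (p - 1) * (q - 1)     # Euler's totient of n
e = 17                      # public exponent, coprime to phi
d = pow(e, -1, phi)         # private exponent: inverse of e mod phi (Python 3.8+)

def encrypt(message):
    """Encrypt each character's code point separately: c = m^e mod n."""
    return [pow(ord(ch), e, n) for ch in message]

def decrypt(cipher):
    """Recover each character: m = c^d mod n."""
    return "".join(chr(pow(c, d, n)) for c in cipher)

cipher = encrypt("hello")
print(cipher)
print(decrypt(cipher))
```

Encrypting one character at a time is a simplification that leaks letter frequencies, so think of this as a demonstration of the modular arithmetic rather than of how anyone should actually send secrets.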

## The platonic solids

I managed to record this week’s Slate Money podcast early so I could drive up to HCSSiM for July 17th, and the Yellow Pig Day celebration. I missed the 17 talk but made it in time for yellow pig carols and cake.

This morning my buddy Aaron decided to let me talk to the kids in the last day of his workshop. First Amber is working out the formula for the Euler Characteristic of a planar graph with the kids and after that I’ll help them count the platonic solids using stereographic projection. If we have time we’ll talk about duals (update: we had time!).
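For anyone who wants to check the count at home, here is a sketch of the standard argument in code, which may not be exactly the route we took in the workshop. Suppose every face has `p` sides and `q` faces meet at every vertex. Then `p*F = 2*E = q*V`, and plugging that into Euler’s formula `V - E + F = 2` forces `1/p + 1/q > 1/2`. The function name and the search bound are my own choices.

```python
from fractions import Fraction

def platonic_solids(max_value=10):
    """List (p, q, V, E, F) for every pair with 1/p + 1/q > 1/2.

    p = sides per face, q = faces meeting at each vertex.
    max_value is an arbitrary search bound; no larger pair can qualify.
    """
    solids = []
    half = Fraction(1, 2)
    for p in range(3, max_value + 1):
        for q in range(3, max_value + 1):
            excess = Fraction(1, p) + Fraction(1, q) - half
            if excess > 0:
                edges = 1 / excess        # from V - E + F = 2
                vertices = 2 * edges / q  # since q*V = 2*E
                faces = 2 * edges / p     # since p*F = 2*E
                solids.append((p, q, int(vertices), int(edges), int(faces)))
    return solids

for p, q, v, e, f in platonic_solids():
    print(f"p={p} q={q}: V={v} E={e} F={f}")
```

Swapping `p` and `q` turns each solid into its dual, which is why the cube pairs with the octahedron and the dodecahedron with the icosahedron, while the tetrahedron is its own dual.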

Tonight at Prime Time I’ll play a game or two of Nim with them.

## Correlation does not imply equality

One of the reasons I enjoy my blog is that I get to try out an argument and then see if readers 1) poke holes in my argument, 2) misunderstand my argument, or 3) misunderstand something tangential to my argument.

Today I’m going to write about an issue of the third kind. Yesterday I talked about how I’d like to see the VAM scores for teachers directly compared to other qualitative scores or other VAM scores so we could see how reliably they regenerate various definitions of “good teaching.”

The idea is this. Many mathematical models are meant to replace a human-made model that is deemed too expensive to work out at scale. Credit scores were like that; take the work out of the individual bankers’ hands and create a mathematical model that does the job consistently well. The VAM was originally intended as such – in-depth qualitative assessments of teachers are expensive, so let’s replace them with a much cheaper option.

So all I’m asking is, how good a replacement is the VAM? Does it generate the same scores as a trusted, in-depth qualitative assessment?

When I made the point yesterday that I haven’t seen anything like that, a few people mentioned studies that show *positive correlations* between the VAM scores and principal scores.

But here’s the key point: *positive correlation does not imply equality.*

Of course sometimes positive correlation is good enough, but sometimes it isn’t. It depends on the context. If you’re a trader that makes thousands of bets a day and your bets are positively correlated with the truth, you make good money.

But on the other side, if I told you that there’s a ride at a carnival that has a positive correlation with not killing children, that wouldn’t be good enough. You’d want the ride to be safe. It’s a higher standard.

I’m asking that we make sure we are using that second, higher standard when we score teachers, because their jobs are increasingly on the line, so it matters that we get things right. Instead we have a machine that nobody understands that is *positively correlated* with things we do understand. I claim that’s not sufficient.

Let me put it this way. Say your “true value” as a teacher is a number between 1 and 100, and the VAM gives you a noisy approximation of your value, which is 24% correlated with your true value. And say I plot your value against the approximation according to VAM, and I do that for a bunch of teachers, and it looks like this:

So maybe your “true value” as a teacher is 58 but the VAM gave you a zero. That would not just be frustrating to you, since it’s taken as an important part of your assessment. You might even lose your job. And you might get a score of zero many years in a row, even if your true score stays at 58. It’s increasingly unlikely, to be sure, but given enough teachers it is bound to happen to a handful of people, just by statistical reasoning, and if it happens to you, you will not think it’s unlikely at all.
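You can get a feel for how weak a 24% correlation is with a quick simulation. This is a sketch under assumptions of my own choosing: both the “true value” and the score are in standardized units (mean 0, standard deviation 1) rather than on a 1 to 100 scale, the noise is Gaussian, and the teachers are entirely made up. It is not real VAM data.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def simulate(rho=0.24, n=10_000, seed=0):
    """Simulated teachers whose score has correlation rho with their true value."""
    rng = random.Random(seed)
    true_values, scores = [], []
    for _ in range(n):
        t = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        true_values.append(t)
        scores.append(rho * t + math.sqrt(1 - rho ** 2) * noise)
    return true_values, scores

true_values, scores = simulate()
r = pearson(true_values, scores)

# Teachers who are truly above average but land in the bottom 10% of scores.
cutoff = sorted(scores)[len(scores) // 10]
unlucky = sum(1 for t, s in zip(true_values, scores) if t > 0 and s < cutoff)

print(f"sample correlation: {r:.2f}")
print(f"above-average teachers scored in the bottom decile: {unlucky}")
```

The correlation is positive, just as advertised, and yet plenty of simulated teachers who are better than average end up with scores near the bottom.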

In fact, if you’re a teacher, you should demand a scoring system that is consistently the same as a system you understand rather than positively correlated with one. If you’re working for a teachers’ union, feel free to contact me about this.

One last thing. I took the above graph from this post. These are actual VAM scores for the same teacher in the same year but for two different classes in the same subject – think 7th grade math and 8th grade math. So neither score represented above is “ground truth” like I mentioned in my thought experiment. But that makes it even more clear that the VAM is an insufficient tool, because it is only 24% correlated *with itself*.

## Why Chetty’s Value-Added Model studies leave me unconvinced

Every now and then when I complain about the Value-Added Model (VAM), people send me links to recent papers written by Raj Chetty, John Friedman, and Jonah Rockoff, like this one entitled *Measuring the Impacts of Teachers II: Teacher Value-Added and Student Outcomes in Adulthood* or its predecessor *Measuring the Impacts of Teachers I: Evaluating Bias in Teacher Value-Added Estimates*.

I think I’m supposed to come away impressed, but that’s not what happens. Let me explain.

Their data set of student scores starts in 1989, well before the current value-added teaching climate began. That means teachers weren’t teaching to the test like they are now. Therefore saying that the current VAM works because a retrograded VAM worked in 1989 and the 1990’s is like saying I must like blueberry pie now because I used to like pumpkin pie. It’s comparing apples to oranges, or blueberries to pumpkins.

I’m surprised by the fact that the authors don’t seem to make any note of the difference in data quality between pre-VAM and current conditions. They should know all about feedback loops; any modeler should. And there’s nothing like telling teachers they might lose their job to create a mighty strong feedback loop. For that matter, just consider all the cheating scandals in the D.C. area where the stakes were the highest. Now that’s a feedback loop. And by the way, I’ve never said the VAM scores are totally meaningless, but just that they are not precise enough to hold individual teachers accountable. I don’t think Chetty et al address that question.

So we can’t trust old VAM data. But what about recent VAM data? Where’s the evidence that, in this climate of high-stakes testing, this model is anything but random?

If it were a good model, we’d presumably be seeing a comparison of current VAM scores and current other measures of teacher success and how they agree. But we aren’t seeing anything like that. Tell me if I’m wrong, I’ve been looking around and I haven’t seen such comparisons. And I’m sure they’ve been tried, it’s not rocket science to compare VAM scores with other scores.

The lack of such studies reminds me of how we never hear about scientific studies on the results of Weight Watchers. There’s a reason such studies never see the light of day, namely because whenever they do those studies, they decide they’re better off not revealing the results.

And if you’re thinking that it would be hard to know exactly how to rate a teacher’s teaching in a qualitative, trustworthy way, then yes, that’s the point! It’s actually not obvious how to do this, which is the real reason we should never trust a so-called “objective mathematical model” when we can’t even decide on a definition of success. We should have the conversation of what comprises good teaching, and we should involve the teachers in that, and stop relying on old data and mysterious college graduation results 10 years hence. What are current 6th grade teachers even supposed to *do* about studies like that?

Note I do think educators and education researchers should be talking about these questions. I just don’t think we should punish teachers arbitrarily to have that conversation. We should have a notion of best practices that slowly evolve as we figure out what works in the long-term.

So here’s what I’d love to see, and what would be convincing to me as a statistician: all sorts of qualitative ways of measuring teachers, alongside their VAM scores, so that we could compare them and make sure they agree with each other *and themselves* over time. In other words, at the very least we should demand an explanation of how some teachers get totally ridiculous and inconsistent scores from one year to the next and from one VAM to the next, even in the same year.

The way things are now, the scores aren’t sufficiently sound to be used for tenure decisions. They are too noisy. And if you don’t believe me, consider that statisticians and some mathematicians agree.

We need some ground truth, people, and some common sense as well. Instead we’re seeing retired education professors pull statistics out of thin air, and it’s an all-out war of supposed mathematical objectivity against the civil servant.

## How Not To Be Wrong by Jordan Ellenberg

You guys are in for a treat. In fact I’m jealous of you.

I had a little secret about my survival in grad school, and that secret has a name, and that name is Jordan Ellenberg. We used to meet every Tuesday and Thursday to study schemes at the CallaLily Cafe a few blocks from the Science Center on Kirkland Street, and even though that sounds kind of dull, it was a blast. It was what kept me sane at Harvard.

You see, Jordan has an infectious positivity about him, which balances my rather intense suspicions, and moreover he’s hilariously funny. He’s really somewhere between a mathematician and a stand-up comedian, and to be honest I don’t know which one he’s better at, although he is a deeply talented mathematician.

The reason I’m telling you this is that he’s written a book, called How Not To Be Wrong, and available for purchase starting today, which is a delight to read and which will make you understand why I survived graduate school. In fact nobody will ever let me complain again once they’ve read this book, because it reads just like Jordan talks. In reading it, I felt like I was right back at CallaLily, singing Prince’s “Sexy MF” and watching Jordan flirt with the cashier lady again. Aaaah memories.

So what’s in the book? Well, he talks a lot about math, and about mathematicians, and the lottery, and in fact he has this long riff which starts out with lottery math, then goes to error-correcting codes and then to made-up languages and then to sphere packing and then arrives again at lotteries. And it’s brilliant and true and beautiful and also funny.

I have a theory about this book that you could essentially open it up to any page and begin to enjoy it, since it is thoroughly enjoyable and the math is cumulative but everywhere so well explained that it wouldn’t take long to follow along, and pretty soon you’d be giggling along with Jordan at every ridiculous footnote he’s inserted into his narrative.

In other words, every page is a standalone positive and ontological examination of the beauty and surprise of mathematical discovery. And so, if you are someone who shares with Jordan a love for mathematics, you will have a consistently great time with this book. In fact I’m imagining that you have an uncle or a mom who loves math or science, in which case this would be a seriously perfect gift to them, but of course you could also give that gift to yourself. I mean, this is a guy who can make nazi jokes funny, and he does.

Having said that, the magic of the book is that it’s not just a collection of wonderful mathy tidbits. Jordan also has a point about the act of scrutinizing something in a logical and mathematical fashion. That act itself is courageous and should be appreciated, and he explains why, and he tells us how much we’ve already benefited from people in the past who have had the bravery to do so. He appreciates them and we should too.

And yet, he also sends the important message that it’s not an elitist crew of the usual genius suspects, that in fact we can all do this in our own capacity. It’s a great message and, if it ends up allowing people to re-examine their need for certainty in an uncertain world, then Jordan will really end up doing good. Fingers crossed.

That’s not to say it’s a perfect book, and I wanted to argue with points on basically every other page, but mostly in a good, friendly, over-drinks kind of way, which is provocative but not annoying. One exception I might make came on page 256: no, Jordan, municipal bonds do not always get paid back, and no, stocks do not always go up, not even in expectation. In fact, the extent to which both of those statements seem true to many people is the result of many cynical political acts and is damaging, mostly to people like retired civil servants. Don’t go there!

Another quibble: Jordan talks about how public policy makers make proclamations in the face of uncertainty, and he has a lot of sympathy and seems to think they should keep doing this. I’m on the other side on this one. Telling people to avoid certain foods and then changing stances seems more damaging than helpful and it happens constantly. And it’s often tied to industry and money, which also doesn’t impress.

Even so, even when I strongly disagree with Jordan, I always want to have the conversation. He forces that on the reader because he’s so darn positive and open-minded.

A few more goodies that I wanted to adore without giving too much away. Jordan does a great job with something he calls “The Great Square of Men” and Berkson’s Fallacy: it will explain to many many women why they are not finding the man they’re looking for. He also throws out a bone to nerds like me when he almost proves that every pig is yellow, and he absolutely kills it, stand-up comedian style, when comparing Ross Perot to a small dark pile of oats. Holy crap he was on a roll there.

So here’s one thing I’ve started doing since reading the book. When I give my 5-year-old son his dessert, it’s in the form of Hershey Drops, which are kind of like fat M&M’s. I give him 15 and I ask him to count them to make sure I got it right. Sometimes I give him 14 to make sure he’s paying attention. But that’s not the new part. The new part is something I stole from Jordan’s book.

The new part is that some days I ask him, “do you want me to give you 3 rows of 5 drops?” And I wait for him to figure out that’s enough and say “yes!” And the other days I ask him “do you want me to give you 5 rows of 3 drops?” and I again wait. And in either case I put the drops out in a rectangle.

And last night, for the first time, he explained to me in a slightly patronizing voice that it doesn’t matter which way I do it because it ends up being the same, because of the rectangle formation and how you look at it. And just to check I asked him which would be more, 10 rows of 7 drops or 7 rows of 10 drops, and he told me, “duh, it would be the same because it couldn’t be any different.”

And that, my friends, is how not to be wrong.

## Is math an art or a science?

I left academic math in 2007, but I still identify as a mathematician. That’s just how I think about the world, through a mathematician’s mindset, whatever that means.

Wait what *does* that mean? How do I characterize the mathematician’s mindset? I’ve struggled in the past to try, but a few days ago, a part of it got a little bit easier.

I was talking to my friend Matt Jones – an historian of science, actually – about the turf wars inside computer science surrounding functional versus object oriented programming. It seems like questions about which one is better or when is one more appropriate than the other have become so political that they are no longer inside the scientifically acceptable realm.

But that kind of reminded me of the turf war of the bayesian versus frequentist statisticians. Or the fresh water versus salt water economists. Or possibly the string theorists versus the non-string theorists in physics.

What’s going on in all of those fields, as best I can understand, is that different groups within the field have different assumptions about what the field may assume and what it’s trying to accomplish, and they fight over the validity of those sets of assumptions. The fights themselves, which are often emotional and brutal, expose the underlying assumptions in certain ways. Matt told me that historians often get at a field’s assumptions through these wars.

Here’s the thing, though, math doesn’t have that. I’m not saying there are no turf wars at all in math, there certainly are, but they aren’t political in nature exactly. They are aesthetic.

In the context of mathematics, where nothing can be considered truly inappropriate as long as the assumptions are clear, it’s all about whether something is beautiful or important, not whether it is valid. Validity has no place in mathematics per se, which plays games with logical rules and constructs. I could go off and build a weird but internally logical universe on my own, and no mathematician would complain it’s invalid, they’d only complain it’s unimportant if it doesn’t tie back to their field and help them prove a theorem.

I claim that this turf war issue is a characterizing issue of the field of mathematics versus the other sciences, and makes it more of an art than a science.

To finish my argument I’d need to understand more about how artistic fields fight, and in particular that their internally hurled insults focus more on aesthetics than on validity, say in composition or painting. I can’t imagine it otherwise, but who knows. Readers, please chime in with evidence in either direction.

## Interview with a middle school math teacher on the Common Core

Today’s post is an email interview with Fawn Nguyen, who teaches math at Mesa Union Junior High in southern California. Fawn is on the leadership team for UCSB Mathematics Project that provides professional development for teachers in the Tri-County area. She is a co-founder of the Thousand Oaks Math Teachers’ Circle. In an effort to share and learn from other math teachers, Fawn blogs at Finding Ways to Nguyen Students Over. She also started VisualPatterns.org to help students develop algebraic thinking, and more recently, she shares her students’ daily math talks to promote number sense. When Fawn is not teaching or writing, she is reading posts on mathblogging.org as one of the editors. She sleeps occasionally and dreams of becoming an architect when all this is done.

Importantly for the below interview, Fawn is not being measured via a value-added model. My questions are italicized.

——

*I’ve been studying the rhetoric around the mathematics Common Core State Standard (CCSS). So far I’ve listened to Diane Ravitch stuff, I’ve interviewed Bill McCallum, the lead writer of the math CCSS, and I’ve also interviewed Kiri Soares, a New York City high school principal. They have very different views. Interestingly, McCallum distinguished three things: standards, curriculum, and testing. *

*What do you think? Do teachers see those as three different things? Or is it a package deal, where all three things are rolled into one in terms of how they’re presented?*

I can’t speak for other teachers. I understand that the standards are not meant to be the curriculum, but the two are not mutually exclusive either. They can’t be. Standards inform the curriculum. This might be a terrible analogy, but I love food and cooking, so maybe the standards are the major ingredients, and the curriculum is the entrée that contains those ingredients. In the show Chopped on Food Network, the competing chefs must use all 4 ingredients to make a dish – and the prepared foods that end up on the plates differ widely in taste and presentation. We can’t blame the ingredients when the dish is blandly prepared any more than we can blame the standards when the curriculum is poorly written.

Similarly, the standards inform testing. Test items for a certain grade level cover the standards of that grade level. I’m not against testing. I’m against bad tests and a lot of it. By bad, I mean multiple-choice items that require more memorization than actual problem solving. But I’m confident we can create good multiple-choice tests because realistically a portion of the test needs to be of this type due to costs.

The three – standards, curriculum, and testing – are not a “package deal” in the sense that the same people are not delivering them to us. But they go together, otherwise what is school mathematics? Funny thing is we have always had the three operating in schools, but somehow the Common Core State Standards (CCSS) seem to get all the blame for the anxieties and costs connected to testing and curriculum development.

*As a teacher, what’s good and bad about the CCSS?*

I see a lot of good in the CCSS. This set of standards is not perfect, but it’s much better than our state standards. We can examine the standards and see for ourselves that the integrity of the standards holds up to their claims of being embedded with mathematical focus, rigor, and coherence.

Implementation of CCSS means that students and teachers can expect consistency in what is being taught at each grade level across state boundaries. This is a nontrivial effort in addressing equity. This consistency also helps teachers collaborate nationwide, and professional development for teachers will improve and be more relevant and effective.

I can only hope that textbooks will be much better because of the inherent focus and coherence in CCSS. A kid can move from Maine to California and not have to see different state outlines on his textbooks as if he’d taken on a new kind of mathematics in his new school. I went to a textbook publishers fair recently at our district, and I remain optimistic that better products are already on their way.

We had every state create its own assessment, now we have two consortia, PARCC and Smarter Balanced. I’ve gone through the sample assessments from the latter, and they are far better than the old multiple-choice items of the CST. Kids will have to process the question at a deeper level to show understanding. This is a good thing.

What is potentially bad about the CCSS is the improper or lack of implementation. So, this boils down to the most important element of the Common Core equation – the teacher. There is no doubt that many teachers, myself included, need sustained professional development to do the job right. And I don’t mean just PD in making math more relevant and engaging, and in how many ways we can use technology, I mean more importantly, we need PD in content knowledge.

It is a perverse notion to think that anyone with a college education can teach elementary mathematics. Teaching mathematics requires *knowing* mathematics. To know a concept is to understand it backward and forward, inside and outside, to recognize it in different forms and structures, to put it into context, to ask questions about it that lead to more questions, to know the mathematics *beyond* this concept. That reminds me of something a 6th grader said to me just recently as we were working on our unit on dividing by a fraction. She said, “My elementary teacher lied to me! She said we always get a smaller number when we divide two numbers.”

Just because one can make tuna casserole does not make one a chef. (Sorry, I’m hungry.)

*What are the good and bad things for kids about testing?*

Testing is only good for kids when it helps them learn and become more successful – that the feedback from testing should inform the teacher of next moves. Testing has become such a dirty word because we over test our kids. I’m still in the classroom after 23 years, yet I don’t have the answers. I struggle with telling my kids that I value them and their learning, yet at the end of each quarter, the narrative sum of their learning is a letter grade.

Then, in the absence of helping kids learn, testing is bad.

*What are the good/bad things for the teachers with all these tests?*

Ideally, a good test that measures what it’s supposed to measure should help the teacher and his students. Testing must be done in moderation. Do we really need to test kids at the start of the school year? Don’t we have the results from a few months ago, right before they left for summer vacation? Every test takes time away from learning.

I’m not sure I understand why testing is bad for teachers aside from lost instructional minutes. Again, I can’t speak for other teachers. But I do sense heightened anxiety among some teachers because CCSS is new – and newness causes us to squirm in our seats and doubt our abilities. I don’t necessarily see this as a bad thing. I see it as an opportunity to learn content at a deeper conceptual level and to implement better teaching strategies.

If we look at anything long and hard enough, we are bound to find the good and the bad. I choose to focus on the positives because I can’t make the day any longer and I can’t have fewer than 4 hours of sleep a night. I want to spend my energies working with my administrators, my colleagues, my parents to bring the best I can bring into my classroom.

*Is there anything else you’d like to add?*

The best things about CCSS for me are not even the standards – they are the 8 Mathematical Practices. These are life-long habits that will serve students well, in all disciplines. They’re equivalent to the essential cooking techniques, like making roux and roasting garlic and braising kale and shucking oysters. Okay, maybe not that last one, but I just got back from New Orleans, and raw oysters are awesome.

I’m excited to continue to share and collaborate with my colleagues locally and online because we now have a common language! We teachers do this very hard work – day in and day out, late into the nights and into the weekends – because we love our kids and we love teaching. But we need to be mathematically competent first and foremost to teach mathematics. I want the focus to always be about the kids and their learning. We start with them; we end with them.

## Interview with a high school principal on the math Common Core

In my third effort to understand the Common Core State Standards (CC) for math, I interviewed an old college friend, Kiri Soares, who is the principal and co-founder of the *Urban Assembly Institute of Math and Science for Young Women*. Here’s a transcript of the interview, which took place earlier this month. My words are in italics below.

——

*How are high school math teachers in New York City currently evaluated?*

Teachers are now evaluated on 2 things:

- First, *measures of teacher practice*, which are based on observations, in turn based on some rubric. Right now it’s the Danielson Rubric. This is a qualitative measure. In fact it is essentially an old method with a new name.
- Second, *measures of student learning*, which are supposed to be “objective”. Overall this is worth 40% of the teacher’s score but it is separated into two 20% parts, where teachers choose the methodology of one part and principals choose the other. Some stuff is chosen for principals by the city. Any time there is a state test we have to choose it. In terms of the teachers’ choices, there are two ways to get evaluated: goals or growth. Goals are based on a given kid, and the teachers can guess they will get a certain slightly lower score or higher score for whatever reason. Otherwise, it’s a growth-based score. Teachers can also choose from an array of assessments (state tests, performance tests, and third party exams). They can also choose the cohort (their own kids/the grade/the school). The city also chose performance tasks in some instances.

*Can you give me a concrete example of what a teacher would choose as a goal?*

At the beginning of the year you give diagnostic tests to students in your subject. Based on what a given kid scored in September, you extrapolate a guess for their performance in the June test. So if a kid has a disrupted homelife you might guess lower. Teachers’ goal setting is based on these guesses.

*So in other words, this is really just a measurement of how well teachers guess?*

Well they are given a baseline and teachers set goals relative to that, but yes. And they are expected to make those guesses in November, possibly well before homelife is disrupted. It definitely makes things more complicated. And things are pretty complicated. Let me say a bit more.

The first three weeks of school are all testing. We test math, social studies, science, and English in every grade, and overall, depending on teacher/principal selections, it can take up to 6 weeks, although not in a given subject. Foreign language and gym teachers are also getting measured, by the way, based on those other tests. These early tests are diagnostic tests.

Moreover, they are new types of tests, which are called performance-based assessments, and they are based on writing samples with prompts. They are theoretically better quality because they go deeper, they aren’t just bubble standardized tests, but of course they had no pre-existing baseline (like the state tests) and thus had to be administered as diagnostic. Even so, we are still trying to predict growth based on them, which is confusing since we don’t know how to predict performance on new tests. We also don’t even know how we can consistently grade such essay-based tests, despite “norming protocols”, which is yet another source of uncertainty.

*How many weeks per year is there testing of students?*

The last half of June is gone, a week in January, and 2-3 weeks in the high school in the beginning per subject. That’s a minimum of 5 weeks per subject per year, out of a total of 40 weeks. So one eighth of teacher time is spent administering tests. But if you think about it, for the teachers, it’s even more. They have to grade these tests too.
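The arithmetic behind the "one eighth" figure can be checked with a rough sketch. The variable names are made up for illustration, and counting "the last half of June" as about 2 weeks is an assumption, as is taking the low end of the quoted "2-3 weeks" in the fall.

```python
from fractions import Fraction

# Assumption: "the last half of June" is counted as roughly 2 weeks.
june_weeks = 2
january_weeks = 1
# Low end of the "2-3 weeks" of diagnostic testing at the start of the year.
fall_diagnostic_weeks_min = 2
school_year_weeks = 40

# Minimum number of weeks per subject per year spent on testing.
min_testing_weeks = june_weeks + january_weeks + fall_diagnostic_weeks_min

# Share of the school year, kept as an exact fraction.
share = Fraction(min_testing_weeks, school_year_weeks)

print(min_testing_weeks, share)
```

Using the high end of the fall range (3 weeks) would push the share to 6 out of 40 weeks, and none of this counts the time teachers spend grading.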

*I’ve been studying the rhetoric around the CC. So far I’ve listened to Diane Ravitch stuff, and to Bill McCallum, the lead writer of the math CC. They have very different views. McCallum distinguished three things, which when they are separated like that, Ravitch doesn’t make sense.*

*Namely, he separates standards, curriculum, and testing. People complain about testing and say that CC standards make testing easier, and we already have too much testing, so CC is a bad thing. But McCallum makes this point: good standards also make good testing easier.*

*What do you think? Do teachers see those as three different things? Or is it a package deal, where all three things are rolled into one in terms of how they’re presented?*

It’s much easier to think of those three things as vertices of a triangle. We cannot make them completely isolated, because they are interrelated.

So, we cannot make the CC good without curriculum and assessment, since there’s a feedback loop. Similarly, we cannot have aligned curriculum without good standards and assessment, and we cannot have good tests without good standards and curriculum. The standards have existed forever. The Common Core is an attempt to create a set of nationwide standards. For example, without a coherent national curriculum it might seem OK to teach creationism in place of evolution in some states. Should that be OK?

CC is attempting to address this, in our global economy, but it hasn’t even approached science for clear political reasons. Math and English are the least political subjects so they started with those. This is a long time coming, and people often think CC refers to everything but so far it’s really only 40% of a kid’s day. Social studies CC standards are actually out right now, but they are very new.

Next, the massive machine of curriculum starts getting into play, as does the testing. I have CC standards and the CC-aligned test, but not curriculum.

Next, you’re throwing into the picture teacher evaluation aligned to CC tests. Teachers are freaking out now – they’re thinking, my curriculum hasn’t been CC-aligned for many years, what do I do now? By the way, importantly, none of the high school curriculum in NY State is actually CC-aligned now. DOE recommendations for the middle school happened last year, and DOE people will probably recommend this year for high school, since they went into talks with publication houses last year to negotiate CC curriculum materials.

The real problem is this: we’ve created these new standards to make things more difficult and more challenging without recognizing where kids are in the present moment. Say I’m a former 5th grader, and the old standards were expecting something from me that I got used to, and it wasn’t very much. Now I’m in 6th grade, there are all these raised expectations, and there’s no attention to the gap.

Bottom line, everybody is freaking out – teachers, students, and parents.

Last year saw the first CC-aligned ELA and math tests. Everybody failed. They rolled out the test before any CC curriculum.

*From the point of view of NYC teachers, this seems like a terrorizing regime, doesn’t it?*

Yes, because the CC roll-out is rigidly tied to the tests, which are in turn rigidly tied to evaluations of teachers. So the teachers are worried they are automatically going to get a “failure” on that vector.

Another way of saying this is that, if teacher evaluations were taken out of the mix, we’d have a very different roll-out environment. But as it is, teachers are hugely anxious about the possibility that their kids might fail both the city and state tests, and that would give the teacher an automatic “failure” no matter how good their teacher observations are.

So if I’m a special ed teacher of a bunch of kids reading at 4th and 5th grade level even though they’re in 7th grade, I’m particularly worried with the introduction of the new and unknown CC-aligned tests.

*So is that really what will happen? Will all these teachers get failing evaluation scores?*

That’s the big question mark. I doubt there will be massive failure, though. I think given that the scores were so clustered in the middle/low muddle last year, they are going to add a curve and not allow so many students to fail.

*So what you’re pointing out is that they can just redefine failure?*

Exactly. It doesn’t actually make sense to fail everyone. Probably 75% of the kids got 2’s or 1’s on a 4-point scale. What does failure mean when everyone fails? It just means the test was too hard, or that what the kids were being taught was not relevant to the test.

*Let’s dig down to the three topics. As far as you’ve heard from the teachers, what’s good and bad about CC?*

My teachers are used to the CC. We rolled out standards-based grading three years ago, so our math and ELA teachers were well adjusted, and our other subject teachers were familiar. The biggest change is that what used to be 9th grade math is now expected of the 8th grade. And the biggest complaint I’ve heard is that it’s too much stuff – nobody can teach all that. But that’s always been true about every set of standards.

*Did they get rid of anything?*

Not sure, because I don’t know what the elementary level CC standards did. There was lots of shuffling in the middle school, and lots of emphasis on algebra and algebraic thinking. Maybe they moved data and stats to earlier grades.

So I believe that my teachers in particular were more prepared. In other schools, where teachers weren’t explicitly being asked to align themselves to standards, it was a huge shock. For them, it used to be solely about Regents, and also Regents exams are very predictable and consistent, so it was pretty smooth sailing.

*Let’s move on to curriculum. You mentioned there is no CC-aligned curriculum in NY. I also heard NY state has recently come out against the CC, did you hear that?*

Well what I heard is that they previously said this year’s 9th graders (class of 2017) would be held accountable but now the class of 2022 will be. So they’ve shifted accountability to the future.

*What does accountability mean in this context?*

It means graduation requirements. You need to pass 5 Regents exams to graduate, and right now there are two versions of some of those exams: one CC-aligned, one old-school. The question is who has to pass the CC-aligned versions to graduate. Now the current 9th grade could take either the CC-aligned or “regular” Regents in math.

I’m going to ask my 9th grade students to take both so we can gather information, even though it means giving them 3 extra hours of tests. Most of my kids pass 2 Regents in 9th grade, 2 in 10th, and 3 in 11th, and then they’re supposed to be done. They only take those Regents tests in senior year that they didn’t pass earlier.

*What are the good and bad things about testing?*

What’s bad is how much time is lost, as we’ve already said. And also, it’s incredibly stressful. You and I went to school and we had one big college test that was stressful, namely the SAT. In terms of us finishing high school, that was it. For these kids it’s test, test, test, test. I don’t think it’s actually improved the quality of college students across the country. 20 years ago NY was the only one that had extra tests except California achievement tests, which I guess we sometimes took as well.

*Another way to say it is that we did take some tests but it didn’t take 5 weeks.*

And it wasn’t high stakes for the teacher!

*Let’s go straight there: what are the good/bad things for the teachers with all these tests?*

Well it definitely makes the teachers more accountable. Even teachers think this: there is a cadre of protected teachers in the city, and the principals didn’t want to take the time to get rid of them, so they’d excess them out of the schools, and they would stay in the system.

Now with testing it has become much more the principal’s responsibility to get rid of bad teachers. The number of floating teachers is going down.

*How did they get rid of the floaters?*

A lot of different ways. They made them go into the schools, take interviews, they made their quality of life not great, and a lot of them left or retired or found jobs. Principals took up the mantle as well, and they started to do due diligence.

*Sounds like the incentive system for over-worked principals was wrong.*

Yes, although the reason it became easier for the principals is because now we have data. So if you’re coming in as ineffective and I also have attendance data and observation data, I can add my observational data (subjective albeit rubric based) and do something.

*If I may be more skeptical, it sounds like this data gathering was used as a weapon against teachers. There were probably lots of good teachers that have bad numbers attached to them that could get fired if someone wanted them to be fired.*

Correct, except those good teachers generally have principals who protect them.

*You could give everyone a bad number and then fire the people you want, right?*

Correct.

*Is that the goal?*

Under Bloomberg it was.

*Is there anything else you want to mention?*

I think testing needs to be dialed down but not disappear. Education is a bi-polar pendulum and it never stops in the middle. We’re on an extreme but let’s not get rid of everything. There is a place for testing.

Let’s get our CC standards, curriculum, and testing reasonable and college-aligned and let’s keep it reasonable. Let’s do it with standards across states and let’s make sure it makes sense.

Also, there are some new tests coming out, called PARCC assessments, that are adaptive tests aligned to the CC. They are supposed to replace Regents down the line and be national.

*Here’s what bothers me about that. It’s even harder to investigate the experience of the student with adaptive tests.*

I’m not sure there’s enough technology to actually do this anyway very soon. For example, we were given $10,000 for 500 students. That’s not going to go far unless it takes 2 weeks to administer the test. But we are investing in our technology this year. For example, I’m looking forward to buying textbooks and getting my updates pushed instead of having to buy new books every year.

*Last question. They are redoing the SAT because rich kids are doing so much better. Are they just trying to get in on the test prep game? Because, here’s the thing, there’s no test that can’t be gamed that’s also easy to grade. It’s gotta depend on the letters and grades. We keep trying to shortcut that.*

Listen, this is what I tell the kids. What’s going to matter to you is the letter of recommendation, so don’t be a jerk to your fellow students or to the teachers. Next, are you going to be able to meet the minimum requirements? That’s what the SAT is good for. It defines a lower bound.

*Is it a good lower bound though?*

Well, I define the lower bound as 1000 in total. My kids can target that. It’s a reasonable low bar.

*To what extent do your students – mostly inner-city, black girls interested in math and science – suffer under the wholly gamed SAT system?*

It serves to give them a point of self-reference with the rest of the country. You have to understand, they, like most kids in the nation, don’t have a conception of themselves outside of their own experience. The SAT serves that purpose. My kids, like many others, have the dream of Ivy League minus the understanding of where they actually stand.

*So you’re saying their estimates of their chances are too high?*

Yes, oftentimes. They are the big fish in a well-defined pond. At the very least, the SAT helps give them perspective.

*Thanks so much for your time Kiri.*

## Billionaire money and academic freedom

If you haven’t seen this recent New York Times article by William Broad, entitled *Billionaires With Big Ideas Are Privatizing American Science*, then go take a look. It generalizes to all of scientific research my recent post entitled *Billionaire Money in Mathematics*.

My favorite part of Broad’s article is the caption of the video at the top, which sums it up nicely:

Funding the Future: As government financing of basic science research has plunged, private donors have filled the void, raising questions about the future of research for the public good.

In his article Broad makes a bunch of great points.

First, rich people generally ask for research into topics they care about (“personal setting of priorities”) to the detriment of basic research. They want flashy stuff, bang for their buck.

Second, academics interested in getting funding from these rich people have to learn to market themselves. From the article:

The availability of so much well-financed ambition has created a new kind of dating game. In what is becoming a common narrative, researchers like to describe how they begged the federal science establishment for funds, were brushed aside and turned instead to the welcoming arms of philanthropists. To help scientists bond quickly with potential benefactors, a cottage industry has emerged, offering workshops, personal coaching, role-playing exercises and the production of video appeals.

If you think about it, the two issues above are kind of wrapped up together. Flashy academic content goes hand in hand with flashy marketing. Let’s say goodbye to the true nerd who doesn’t button up their cardigan correctly. And I don’t know about you but I like those nerds. My mom is one of them.

This morning I thought of another way to express this issue, from the point of view of the individual scientist or mathematician, that might have profound resonance where the above just sounds annoying.

Namely, I believe that academic freedom itself is at stake. Let me explain.

I’m the last person who would defend our current tenure system. It’s awful for women, especially those who want kids, and it often breeds a kind of arrogant laziness post-tenure. Even so, there are good things about it, and one of them is academic freedom.

And although theoretically you can have academic freedom without tenure, it is certainly easier with it (example from this piece: “In Oklahoma, a number of state legislators attempted to have Anita Hill fired from her university position because of her testimony before the U.S. Senate. If not for tenure, professors could be attacked every time there’s a change in the wind.”).

But as we’ve seen recently, tenure-track positions are quickly declining in number, even as the number of teaching positions is growing. This is the academic analog of how we’ve seen job growth in the US but it’s majority shitty jobs. And as I’ve predicted already, this trend is surely going to continue as we scale education through MOOCs.

The dwindling number of tenured positions means there is an increasing number of people trying to do research dependent upon outside grants and funding, and without the safety net of tenure. These people often lose their jobs when their funding flags, as we’ve recently seen at Columbia.

Now let’s put these two trends together. We’ve got fewer and fewer tenured jobs and more and more research jobs that are precariously dependent on outside funding, and we’ve got rich people funding their own tastes and proclivities.

Where does academic freedom shake out in that picture? I’m going to say nowhere.