I remember when I moved to New York in 2005. I found it intimidating and shocking how aggressively people vied for seats on the subway. I live near Columbia so the 1 train is my line, and of course everyone thinks their subway line is the most overused and crazy line, but in this case I’m right. I came from Boston, where we have subways too, four little itty bitty ones, and we are extremely polite to each other and, in particular, we never touch. By contrast here were these New Yorkers not only touching but literally squeezing into these tiny seats and sweating all over each other in the summer.
After about 3 months of living here I got really into it. I was in love with this city, and every gritty thing about it, and I considered the shared experience of the subway a sign of a larger public communistic love. Here they were, people from all walks of life, sharing their sweat! Isn’t it beautiful?
That kind of admiration only grew in the two years I stayed a professor at Barnard, which meant I almost never left the cozy neighborhood of Morningside Heights, so subway rides were rather rare, amusing events. I loved the subway and I developed theories about when people start talking on the subway (in three situations: 1) someone who is incredibly smelly gets off the train and everyone needs to talk about how smelly they were, 2) someone who is incredibly sick and coughing up a lung gets off the train and everyone has to talk about how sick and nasty they were, and 3) the train stops in the tunnel and the announcer tells us we have no idea when we will be able to move, and everyone has to talk about their stuck-in-a-tunnel-during-9/11 experiences.)
As soon as I started working at D.E. Shaw in midtown, and commuted during rush hour, I got real. I figured out exactly where to stand, and I mean exactly where on each platform, to maximize my chances of getting a seat once the train came. I figured out, depending on how many people were on which platform in Times Square, and the subsequent stations as we passed them, what the recent train traffic pattern had been in terms of the express 2/3 train and my local 1 train, and sometimes I’d do crazy things like get off the express train early to get on the 1 train because I’d anticipate that if I waited til 96th street like everyone else, there would be no chance I could get on the 1 train. Actually looking back, I almost never sat down at all during these commutes, even when I was pregnant.
Which brings me to the turn in my story. When I was heavily pregnant, commuting on the subway was actually hellish. I had no balance, and felt vulnerable, and being squished up against people with no place to hold on was really scary. For the most part commuters are a selfish bunch, and people sitting would pretend not to notice me, so they wouldn’t have to give up their seat. I promised myself I’d never be that jerk.
For the last two weeks of my pregnancy I took a cab to work every day, but even so coming home was another story, since it’s hard to get a cab in Times Square at 5pm. I remember one time some asshole in a suit actually ran to grab a cab that had stopped for me, and he beat me because… I was 9 months pregnant and couldn’t keep up with him. I started crying, on the street, until this nice pedicab guy pulled over and asked me if he could help. I told him I lived all the way uptown and he biked me around until he found me a cab; he refused to let me pay. I still love that guy.
Once I started down the road of getting up for pregnant people, though, it was a short logical step to never sitting down again. After all, there are all kinds of hidden reasons people may need to sit down more than I do. What if their feet are killing them after standing all day at work? What if they have balance problems?
For a while I decided it was okay to sit if everyone else had an available seat. That seemed safe. But then I’d be sitting there, spaced out or reading, with a sea of empty seats around me, and all of a sudden a huge group of people would converge and somehow I’d be face to face with someone with a murderous look which said, you motherfucker you’re sitting in my seat. In the end, it’s become my policy to just never sit down.
I do of course still think about the question of where’s the best place to stand in the subway. This is a whole different optimization play, which for intellectual property reasons I won’t share with you all, since I don’t want more competition than I already have. Just one hint: don’t get on in the middle of the car. Always get on at one of the ends.
Best article ever about beards.
First, it needs to be said that, as I have learned in this book I’m reading, it’s probably a bad idea to make statements about learning when you make “cohort-to-cohort comparisons” instead of following actual students along in time. In other words, if you compare how well the 3rd grade did in a test one year to the next, then for the most part the difference could be explained by the fact that they are different populations or demographics. Indeed the College Board, which administers the SAT, explains that the scores went down this year because more and more diverse kids are taking the test. So that’s encouraging, and it makes you think that the statement “SAT scores went down” is in this case pretty meaningless.
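To see how a cohort-to-cohort comparison can mislead, here’s a minimal sketch with made-up numbers (the group names, sizes, and scores are all hypothetical, not College Board data). Every group’s average goes up, but the overall average goes down, simply because the mix of test takers changed.

```python
# Hypothetical numbers, chosen only to illustrate the composition effect.
# Each cohort maps group name -> (number of test takers, mean score).
last_year = {"group_a": (800, 520.0), "group_b": (200, 440.0)}
this_year = {"group_a": (700, 522.0), "group_b": (300, 443.0)}


def overall_mean(cohort):
    """Mean score over the whole cohort, weighted by group size."""
    total_takers = sum(n for n, _ in cohort.values())
    total_points = sum(n * mean for n, mean in cohort.values())
    return total_points / total_takers


mean_last = overall_mean(last_year)
mean_this = overall_mean(this_year)

# Per-group change: positive for both groups in this made-up example.
group_changes = {
    name: this_year[name][1] - last_year[name][1] for name in last_year
}

print("per-group change:", group_changes)
print("overall change:", mean_this - mean_last)
```

In this toy setup both groups improve by 2 or 3 points while the overall mean falls, which is exactly why “scores went down” says little on its own.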
But is it meaningless for that reason?
Keep in mind that these are small differences we’re talking about, but with a pretty huge sample size overall. Even so, it would be nice to see some error bars and see the methodology for computing them.
What I’m really worried about though is the “equating” part of the process. That’s the process by which they decide how to compare tests from year to year, mostly by having questions in common that are ungraded. At least that’s what I’m guessing, it’s actually not clear from their website.
My first question is, are they keeping in mind the errors for the equating process? (I find it annoying how often people, when they calculate errors, only calculate based on the very last step they take in a very sketchy overall process with many steps.) For example, is their equating process so good that they can really tell us with statistical significance that American Indians as a group did 2 points worse on the writing test (see this article for numbers like this)? I am pretty sure that’s a best guess with significant error bars.
Additional note: found this quote in a survey paper on equating methodologies (top of page 519):
Almost all test-equating studies ignore the issue of the standard error of the equating
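To make the worry about ignored equating error concrete, here’s a rough sketch of what happens when you fold an equating error into the error bar on a 2-point difference. Every number below is hypothetical (I don’t know the real standard deviations, subgroup sizes, or equating errors), and adding the errors in quadrature assumes the two sources are independent, which is my assumption and not anything stated by the College Board.

```python
import math

# All numbers below are hypothetical, picked only to show the mechanics.
sd = 110.0                        # assumed spread of individual scores
n_last, n_this = 40_000, 40_000   # assumed subgroup sizes in each year
observed_drop = 2.0               # a 2-point year-to-year difference
equating_se = 1.0                 # assumed standard error of the equating

# Error from sampling alone: the "very last step" of the process.
sampling_se = math.sqrt(sd**2 / n_last + sd**2 / n_this)

# Error including the equating step, assuming the two are independent.
total_se = math.sqrt(sampling_se**2 + equating_se**2)

z_sampling_only = observed_drop / sampling_se
z_with_equating = observed_drop / total_se

print(f"z, sampling error only: {z_sampling_only:.2f}")
print(f"z, with equating error: {z_with_equating:.2f}")
```

With these made-up inputs the drop clears the usual 1.96 threshold when only sampling error is counted, and fails to clear it once the equating error is included.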
Second, I’m really worried about the equating process and its error bars for the following reason: the number of repeat testers varies widely depending on the demographic, and also from year to year. How then can we assess performance on the “linking questions” (the questions that are repeated on different tests) if some kids (in fact the kids more likely to be practicing for the test) are seeing them repeatedly? Is that controlled for, and how? Are they removing repeat testers?
This brings me to my main complaint about all of this. Why is the SAT equating methodology not open source? Isn’t the proprietary “intellectual property” in the test itself? Am I missing a link? I’d really like to take a look. Even better of course if the methodology is open source (as in there’s an available script which actually computes the scores starting with raw data) and the data is also available with anonymization of course.
Being a mathematician, I find myself forced to consider statements like “higher taxes kill jobs” as statements of theorems with missing stated assumptions. How could you fill in the assumptions and prove this theorem?
First I think about extreme cases- sometimes extreme situations need fewer assumptions, they kind of spill out as obvious. So here’s one, the tax rate is at 80%, what would happen if we raised taxes? My first reaction is, 80%!? That must mean you have way too much government and regulation and for those reasons businesses are probably already quite pinned down and don’t have lots of freedom- don’t tax them more, that will make their good ideas (if they have them) all the more suffocated. Just think of the paperwork you’d need to go through in a society that government-heavy, to hire someone.
What’s another extreme case? How about taxes are super low, more like fees for doing business. Then no, I don’t think raising them a moderate amount would kill jobs at all, in fact it may introduce enough government to make things less wild west and safer for businesses to operate.
So in other words at some level I buy the anti-regulation anti-government angle. I don’t want super duper high taxes because I think it encourages too much bureaucracy and that stuff is boring (but some amount of it is necessary to make things safe).
Moreover I’m assuming that governments generally use taxes to protect people from food poisoning and the like, regulate to force companies to play fair, and as social safety nets when things go bad, and that they’re not particularly efficient. Those of course are my assumptions, which anyone can disagree with.
But in terms of proving my theorem, I’m stuck thinking it’s more like, there’s some point in between very low and very high taxes where it gradually becomes true that raising taxes more will indeed start to kill jobs.
How about our situation now? Right now we have pretty low taxes by historical measures, and moreover the known loopholes mean that businesses (especially big ones with fancy lawyers) pay much less than their stated tax rate.
Why, in this case, would a moderate bump in their tax rates kill jobs?
Here’s a possible argument: if higher taxes actually encourage more regulation, then that could be a major problem for smaller businesses, who don’t have the margin for dealing with hiring that many lawyers for compliance issues. Although this article argues that “regulation kills jobs” is an invalid statement in general.
Pet peeve of mine: when you hear conservatives talk about killing jobs, they often frame it in terms of struggling small businesses, often run by a woman. But it’s easy enough to imagine introducing taxes and regulation that go easier on small businesses, so as to avoid smothering them. It’s really the huge businesses that we want to see start hiring, and it’s the huge businesses that pay so little in taxes.
Here’s another one: if you raise taxes people will spend their cash on taxes instead of hiring people. But wait, that doesn’t apply right now when we have so much frigging cash on hand (and hidden in other countries). In other words, companies are not not hiring people for cash flow reasons, it’s because they don’t see the demand.
In the end I can’t see how to prove or even argue that theorem, assuming today’s conditions. Would love to hear the argument I’m missing.
I was delighted to meet a huge number of fun, hopeful, and excited nerds throughout the day. Since my talk was pretty early in the morning, I was able to relax afterwards and just enjoy all the questions and remarks that people wanted to discuss with me.
Some were people with lots of data, looking for data scientists who could analyze it for them, others were working with packs of data scientists (herds? covens?) and were in search of data. It was fun to try to help them find each other, as well as to hear about all the super nerdy and data-driven businesses that are getting off the ground right now. It certainly was an optimistic tone, I didn’t feel like we were in the middle of a double-dip recession for the entire day (well, at least til I got home and looked at the Greek default news).
Conferences like these are excellent; they allow people to get together and learn each other’s languages, as well as the new tools and techniques in use or in development. They also save people lots of time, make fast connections that would otherwise be difficult or impossible, and of course sometimes inspire great new ideas. Too bad they are so expensive!
I also learned that there’s such a thing as a “data scientist in residence,” a position held of course by very few people, which is the equivalent in academic math of having a gig at the Institute for Advanced Study in Princeton. Wow. I still haven’t decided whether I’d want such a cushy job. After all, I think I learn the most when I have reasonable pressure to get stuff done with actual data. On the other hand maybe that much freedom would allow one to do really cool stuff. Dunno.
I’m reading an interesting book by Douglas Harris about the value-added model movement, called Value-added Measures in Education, available here from Harvard Education Press. Harris goes into a very reasonable critique of how “snapshot” views of students, teachers, and schools are a very poor assessment of teacher ability, since they are absolute measurements rather than changes in knowledge. It’s kind of like comparing the Dow to the S&P and concluding that you should definitely invest in Dow stocks since they are ten times better. It’s all about the return on a test score or an index, not the absolute number, when you are trying to gauge learning or profit.
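Here’s the Dow versus S&P point as a tiny sketch. The index levels are invented for illustration (they are not real quotes); the only thing that matters is that the “bigger” index can have the smaller return.

```python
def pct_return(start, end):
    """Fractional change from start to end, e.g. 0.02 means 2%."""
    return (end - start) / start


# Hypothetical index levels at the start and end of some period.
dow_start, dow_end = 11_000.0, 11_220.0
sp_start, sp_end = 1_100.0, 1_133.0

dow_return = pct_return(dow_start, dow_end)
sp_return = pct_return(sp_start, sp_end)

print(f"Dow level is {dow_start / sp_start:.0f}x the S&P level")
print(f"Dow return: {dow_return:.1%}, S&P return: {sp_return:.1%}")
```

The same logic applies to test scores: a school with high absolute scores can still be the one where students gained less over the year.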
His goal in the book is to explain how value-added models work, how they measure learning, and how they take into account things like poverty level and other circumstances beyond the control of the school or the teachers. In his introduction he also promises not to be unreasonable about applying the results of these tests beyond where it makes sense. He certainly seems to be a smart guy; smart enough to know about errors and the problems with badly set up incentives – he uses the financial crisis as a model of how not to do it. I’m hopeful!
Here’s what I am interested in talking about today, which is how the “standardized” gets into standardized testing, because already at this point the mathematical modeling is pretty tricky (and involves lots of choices). There are many ways a test is ultimately standardized, assuming for simplicity that it’s a national test given at many grade levels yearly (pretend it’s an SAT that every grade takes):
- the test is normalized for being harder or easier than it was last year, for each grade’s test separately, and sometimes per question as well,
- the grading is normalized so that a student who learns exactly as much “as is expected” gets the same grade from year to year, and
- the grading is further normalized so that a student who gets 10 more points than expected in 3rd grade is doing as well as if she got 10 extra points in 4th grade.
One way of accomplishing all of the above would be to draw a histogram of raw results per year and per grade and normalize that distribution of raw scores by some standard mean and standard deviation, just as you would make a normal distribution standard, i.e. mean 0 and standard deviation 1. In fact, go ahead and demean it and divide by the standard deviation. That’s the first thing I’d do.
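As a minimal sketch of that first step, here is the demean-and-divide recipe applied to two years of raw scores. The scores are invented, and I’m using the population standard deviation from Python’s standard `statistics` module; a real test would involve far more students and probably a different target mean and spread.

```python
from statistics import mean, pstdev


def standardize(raw_scores):
    """Subtract the mean and divide by the standard deviation."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)
    if sigma == 0:
        raise ValueError("all scores are identical; cannot standardize")
    return [(score - mu) / sigma for score in raw_scores]


# Hypothetical raw scores for the same grade in two different years.
raw_2010 = [48, 52, 55, 60, 65]
raw_2011 = [43, 47, 50, 55, 60]  # every score is 5 points lower

z_2010 = standardize(raw_2010)
z_2011 = standardize(raw_2011)

print([round(z, 3) for z in z_2010])
print([round(z, 3) for z in z_2011])
```

By construction each year comes out with mean 0 and standard deviation 1, no matter what the raw scores looked like.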
But if you actually do that, then you lose lots of the information you are actually trying to glean. Namely, how could you then conclude if students are doing better or worse than last year? I’m sure you’ve seen the recent news that SAT scores have fallen this year from last. I guess my question is, how can they tell? If we do something as simple as what I suggested, then the definition of doing as well “as is expected” is that you did “as well as the average person did”. But clearly this is not what the SAT people do, since they claim people aren’t doing as well as they used to. So how are they standardizing their test?
It isn’t really explained here or here, but there are clues. Namely, if you give 3rd and 4th graders some of the same questions on a given year, then you can infer how much better 4th graders do on those questions than 3rd graders do, and you can use that as a proxy for how to scale between grades (assuming that those questions represent the general questions well). Next, since you can’t repeat questions (at least questions that count towards the score) between years, because the stakes are too high and people would cheat, you can instead have ungraded sections that have repeated questions which give you a standard against which to compare between years. In fact the SAT does have ungraded sections, and so did the GREs as I recall, and my guess is this is why.
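Here’s a crude sketch of how repeated ungraded questions could be used to compare years. To be clear, this is my guess at the simplest possible version (a mean shift through the common questions), not the SAT’s actual method; the function name, the `scale` parameter, and all the scores are hypothetical.

```python
from statistics import mean


def mean_equate(graded_a, anchor_a, graded_b, anchor_b, scale=1.0):
    """Shift year-B graded scores onto year A's scale.

    The anchor scores come from the repeated, ungraded questions.
    `scale` is an assumed conversion: graded points per anchor point.
    """
    # How much stronger or weaker cohort B looks on the common questions.
    ability_shift = scale * (mean(anchor_b) - mean(anchor_a))
    # How much the raw graded averages differ between the two years.
    raw_shift = mean(graded_b) - mean(graded_a)
    # Whatever ability does not explain is blamed on test difficulty.
    difficulty_shift = raw_shift - ability_shift
    return [score - difficulty_shift for score in graded_b]


# Hypothetical data: both cohorts do equally well on the anchor
# questions, but year B's graded form yields scores 6 points lower.
graded_a, anchor_a = [50, 60, 70], [10, 12, 14]
graded_b, anchor_b = [44, 54, 64], [10, 12, 14]

adjusted_b = mean_equate(graded_a, anchor_a, graded_b, anchor_b)
print(adjusted_b)
```

In this toy case the whole 6-point gap gets attributed to a harder test, so the adjusted year-B scores line up with year A. If the year-B cohort had also done worse on the anchor questions, part of the gap would survive the adjustment as a genuine drop.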
That brings up the question, do all standardized tests have ungraded sections? Is there some other clever way to get around this problem? Also in my mind, how well does standardization work, and what is a way to test it?
This week has been particularly confusing when it comes to the European debt crisis. It’s complicated enough to think about the various countries, with their various current debt problems, future debt problems, and austerity plans, not to mention how they typically interact at the political level versus how the average citizen is affected by it all. But this week we’ve seen weird and coordinated intervention by a bunch of central banks to address a so-called “liquidity crisis”.
What is this all about? Is it actually a credit crisis disguised as a liquidity crisis? Is it just another stealth way to bail out huge banks?
I’m going to take a stab at answering these questions, at the risk of talking out of my ass (and when has that ever stopped me?).
Finance is a big messy system, and it’s hard to know where to begin on the merry-go-round of confusion, but let’s start with European banks since they are the ones in need of funding.
European banks have lots of euros on hand, just as American banks have lots of dollars, because of the actual deposits they hold. However, European banks invest in American things (like businesses) that need them to come up with short term funding denominated in dollars. Similarly American banks invest in Europe, but that’s not really relevant to the discussion yet.
How do European banks get these short term (3 month) loans? Historically they do a large majority of it through money-markets: much of the money people have in banks is funneled to huge vats called money markets, and the fund managers of those vats are very very conservatively trying to make a bit of interest on them. In fact they were burned in the credit crisis, when they famously “broke the buck” on Lehman short-term loans.
Well, guess what, those same American money managers are avoiding European short-term loans right now, because they are super afraid of losing money on them. So that source of funding has dried up. Note that this is a credit problem: the money market managers do not trust the banks to be around in 3 months.
Another source of funding for the European banks’ American investments has been just to use their euros, exchange them to dollars (the currency market is very very large and liquid, especially on this particular exchange), then wait until the term of the short-term financing is over, and then convert the dollars back to euros. What actually happens, in fact, is that they borrow euros (at the going rate of 1%), do the exchange, then financing, and then get their money back in the future.
The guys who work at the European banks and who do this short-term financing aren’t allowed to take on the risk that the exchange rate is going to violently change between now and when the term of the short-term financing is over. Therefore they need to hedge the risk, which means they have to have a guarantee that the dollars they get out at the end of the term will be turned into a reasonable number of euros.
This kind of guarantee is called a currency swap, and the market for those is also very large and liquid, but has been less liquid recently because of the one-sidedness of this problem: European banks need short-term dollars but American banks don’t need euros at the same rate at the same maturity. So the end result is that the swaps are very very expensive for European banks.
Let’s put this another way, the way that seems strangest and most confusing: right now the European banks can borrow at 1% in euros but at 4% in dollars (for three month maturity), and more generally the demand for USD seems to be skyrocketing recently from all over the place. Does this mean there’s an arbitrage opportunity somewhere? The swaps market is at 3% so no obvious arbitrage. More likely it means that the markets are expecting the exchange rate to drastically change, or at least they are pricing in the risk of it changing violently in the very near future. (The strangest thing to me is why it hasn’t just changed the spot exchange rate as well.)
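The no-arbitrage arithmetic can be written out in a few lines. I’m treating all three quoted numbers as annualized rates, reading the 3% as the annualized cost of the swap hedge, and using simple interest over a quarter of a year; those readings, and the $100 million notional, are my assumptions for illustration.

```python
# Rates quoted above, treated as annualized (an assumption).
eur_borrow_rate = 0.01   # borrowing in euros
usd_borrow_rate = 0.04   # borrowing directly in dollars
swap_cost_rate = 0.03    # hedging euros into dollars via the swap

term_years = 0.25        # three-month maturity
notional_usd = 100_000_000  # hypothetical amount of dollar funding

# Route 1: borrow dollars directly.
direct_cost = notional_usd * usd_borrow_rate * term_years

# Route 2: borrow euros, convert to dollars, and pay for the swap hedge.
synthetic_rate = eur_borrow_rate + swap_cost_rate
synthetic_cost = notional_usd * synthetic_rate * term_years

rate_gap = round(usd_borrow_rate - synthetic_rate, 10)

print(f"direct dollar funding cost:    ${direct_cost:,.0f}")
print(f"synthetic dollar funding cost: ${synthetic_cost:,.0f}")
print("rate gap:", rate_gap)
```

Under these assumptions the two routes cost the same, so the cheap euro rate does not translate into cheap hedged dollars.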
By the way, a pet peeve or two I have with people talking about arbitrage: firstly, many people use the term so loosely it means nothing at all, as when they take risk over time (exposing themselves to the possibility of an exchange rate change for example). But even here, I’m misusing the term, since in an arbitrage it’s literally supposed to be a way to make money risk-free, but the whole point of my post is that this is really all about counter-party risk! In other words, there’s no arbitrage opportunity to get into contracts with people where you’d make money except if they go bankrupt tomorrow, when there’s a good chance that will happen.
The bottom line is that although the ECB and the Fed and the other central banks have spun this as a coordinated effort to help out a liquidity-squeezed but functional market, it doesn’t pass the smell test. What’s actually happening is that the shoddy accounting and investments of French banks and others are not being trusted by American money market managers, who are wise to them.
One more thing: the collateral being asked of the European banks is purportedly of low standard, which is to say the ECB is allowing things like Greek debt as collateral, which wouldn’t pass muster with other institutions (or with U.S. money markets!). In that sense this can be seen as a stealth bailout, although I think not the first one in Europe under that definition. This isn’t going away until they figure out how to deal with the Greek debt problem.