Teaching scores released
Anyone who reads this blog regularly knows how detestable I think it is that the teacher value-added model scores are being released but the underlying model is not.
We are being shown scores of teachers and we are even told the scores have a wide margin of error: someone who gets a 30 out of 100 could next year get a 70 out of 100 and nobody would be surprised (see this article).
Just to be clear, the underlying model doesn’t actually use a definition of a good teacher beyond what the score is. In other words, this model isn’t being trained by looking at examples of what a “good teacher” is. Instead, it is derived from another model which predicts students’ test scores taking into account various factors. At the very most you can say the teacher model measures the ability teachers have to get their kids to score better or worse than expected on some standardized tests. Call it a “teaching to the test model”. Nothing about learning outside the test. Nothing about inspiring their students or being a role model or teaching how to think or preparing for college.
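Since the real model isn’t public, here is only a toy sketch in Python of what “better or worse than expected” could mean. I’m assuming the simplest possible version: each student has a predicted score and an actual score, a teacher’s raw score is the average gap between the two, and teachers are then ranked from 0 to 100. The function names, the ranking rule and the numbers are all made up for illustration and are not the city’s actual method.

```python
from collections import defaultdict
from statistics import mean


def value_added(records):
    """Average (actual - predicted) score per teacher.

    records: iterable of (teacher, predicted_score, actual_score).
    This is a hypothetical simplification, not the real model.
    """
    residuals = defaultdict(list)
    for teacher, predicted, actual in records:
        residuals[teacher].append(actual - predicted)
    return {teacher: mean(gaps) for teacher, gaps in residuals.items()}


def percentile_ranks(scores):
    """Rank teachers from 0 (lowest raw score) to 100 (highest)."""
    ordered = sorted(scores, key=scores.get)
    n = len(ordered)
    if n == 1:
        return {ordered[0]: 50.0}
    return {teacher: 100.0 * i / (n - 1) for i, teacher in enumerate(ordered)}


# Invented data: (teacher, predicted score, actual score) for six students.
records = [
    ("A", 70, 75), ("A", 80, 83),
    ("B", 60, 58), ("B", 90, 90),
    ("C", 75, 75), ("C", 65, 67),
]
raw = value_added(records)
ranks = percentile_ranks(raw)
```

Notice that nothing in this calculation knows anything about a teacher except how far the test scores landed from the prediction.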
A “wide margin of error” on this value-added model then means they have trouble actually deciding if you are good at teaching to the test or not. It’s an incredibly noisy number and is affected by things like whether this year’s standardized tests were similar to last year’s.
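To get a feel for how noise alone can move someone from a 30 to a 70, here is a small simulation sketch. It assumes every teacher has a fixed “true” effect and that each year’s measured score is that effect plus independent random noise; the normal distributions, the noise levels and the 40-point threshold are my own assumptions, not anything taken from the published scores.

```python
import random


def percentile_ranks(values):
    """Return a 0-100 rank for each value, in the original order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, i in enumerate(order):
        ranks[i] = 100.0 * position / (len(values) - 1)
    return ranks


def share_of_big_swings(n_teachers, noise_sd, swing=40.0, seed=0):
    """Fraction of teachers whose rank moves by at least `swing` points
    between two simulated years, even though their true effect is fixed."""
    rng = random.Random(seed)
    true_effect = [rng.gauss(0.0, 1.0) for _ in range(n_teachers)]
    year1 = percentile_ranks([t + rng.gauss(0.0, noise_sd) for t in true_effect])
    year2 = percentile_ranks([t + rng.gauss(0.0, noise_sd) for t in true_effect])
    swings = sum(abs(a - b) >= swing for a, b in zip(year1, year2))
    return swings / n_teachers


quiet = share_of_big_swings(1000, noise_sd=0.1)  # noise small next to true effect
noisy = share_of_big_swings(1000, noise_sd=2.0)  # noise swamps true effect
```

In this toy setup the teachers themselves never change between the two years, so every jump in rank comes purely from the noise.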
Moreover, for an individual teacher with an actual score, being told there’s a wide margin of error is not helpful at all. On the other hand, if the model were open source (and hopefully the individual scores not public), then a given teacher could actually see their margin of error directly: it could even be spun as a way of seeing how to “improve”. Put another way, we’d actually be giving teachers tools to work with such a model, rather than simply making them targets.
Update: Here’s an important comment from a friend of mine who works directly with New York City math teachers:
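As a rough idea of what “seeing your margin of error directly” might look like, here is a sketch that bootstraps one teacher’s score. It assumes the teacher has access to each student’s gap between actual and predicted score, and that the teacher’s score is just the average gap; both are assumptions on my part, and the resampling approach is a generic statistical technique, not something the city has said it uses.

```python
import random
from statistics import mean


def bootstrap_interval(residuals, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for one teacher's average residual.

    residuals: per-student (actual - predicted) gaps for a single class.
    """
    rng = random.Random(seed)
    # Resample the class with replacement and recompute the average each time.
    means = sorted(
        mean(rng.choices(residuals, k=len(residuals)))
        for _ in range(n_resamples)
    )
    lo_index = int((1 - level) / 2 * n_resamples)
    hi_index = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_index], means[hi_index]


# Invented gaps for a class of eight students.
class_residuals = [5, -3, 8, -6, 2, 0, -4, 7]
low, high = bootstrap_interval(class_residuals)
```

With a class this small, the interval comes out wide compared to the average itself, which is the kind of thing a teacher could only check if the model and its inputs were open.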
Thanks for commenting on this. I work with lots of public school math teachers around New York City, and have a sense of which of them are incredible teachers who inspire their students to learn, and which are effective at teaching to the test and managing their behavior.
Curiosity drove me to it, but I checked out their ratings. The results are disappointing and discouraging. The ones who are sending off intellectually engaged children to high schools were generally rated average or below, while the ones who are great classroom managers and prepare their lessons with priority to the tests were mostly rated as effective or above.
Besides the huge margin of uncertainty in this model, it’s clear that it misses many dimensions of great teaching. Worse, this model, now published, is an incentive for teachers to develop their style even more towards the tests.
If you don’t believe me or Japheth, listen to Bill Gates, who is against publicly shaming teachers (but loves the models). From his New York Times op-ed from last week:
Many districts and states are trying to move toward better personnel systems for evaluation and improvement. Unfortunately, some education advocates in New York, Los Angeles and other cities are claiming that a good personnel system can be based on ranking teachers according to their “value-added rating” — a measurement of their impact on students’ test scores — and publicizing the names and rankings online and in the media. But shaming poorly performing teachers doesn’t fix the problem because it doesn’t give them specific feedback.
If nothing else, the Bloomberg administration should look into statistics on whether teaching has become a more or less attractive profession since the mayor started publicly shaming teachers. Has introducing the models and publicly displaying the results had the intended effect of keeping good teachers and getting rid of bad ones, Mayor Bloomberg?