## How many NYC teachers are arbitrarily punished by the VAM? About 578 per year.

There’s been an important update in the thought experiment I started yesterday. Namely, a reader (revuluri) has provided me with a link to show how many teachers are considered “ineffective,” which was my shorthand for scoring either third or fourth in the four categories.

According to page 5 of this document, that percentage was 16% in 2011-2012, 17% in 2012-2013, and 16% in 2013-2014. We’ll take this to mean that the true cutoff is about 16.3%. Using my formula from yesterday, that means that after 4 years, about 12.7% of teachers going up for tenure in the new system will be arbitrarily denied tenure based only on their VAM score.
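For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes, as one reading that reproduces the 12.7% figure, that a teacher is denied tenure when at least two of the four yearly VAM ratings land in the bottom band, and that each year is an independent draw with probability 16.3%. The function name `prob_at_least_k_bad` is made up for illustration.

```python
from math import comb


def prob_at_least_k_bad(p, years=4, k=2):
    """Probability of k or more bottom-band ratings in `years` independent years.

    ASSUMPTION: the "at least 2 of 4" rule and year-to-year independence are
    illustrative assumptions, not a quotation of the law.
    """
    return sum(
        comb(years, j) * p**j * (1 - p) ** (years - j)
        for j in range(k, years + 1)
    )


# Average of the three reported yearly percentages.
p_ineffective = (0.16 + 0.17 + 0.16) / 3
print(round(p_ineffective, 3))               # 0.163
print(round(prob_at_least_k_bad(0.163), 3))  # 0.127
```

Trying a different cutoff only means passing a different `p`.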

How many people is that in a given year? Well, this document explains that in 2000, 9,000 teachers were hired and in 2008, 6,000 teachers were hired. I’ll assume my best guess for “teachers hired” in a given year is something between those two numbers, but I’ll also assume it’s closer to the latter since it is more recent information. Say 7,000 new teachers per year.

Of course, not all of them go up for tenure. There’s attrition. Say 35% of those teachers leave before the tenure decision is made (also guessing from this document). That leaves us with about 4,550 teachers going up for tenure each year, and 12.7% of them is 578 people.
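The headcount arithmetic is easy to rerun with other guesses. A minimal sketch, where the function name and the defaults (7,000 hires, 35% attrition, 12.7% denial rate) are just the rough estimates from this post, not official figures:

```python
def teachers_denied_per_year(hired=7000, attrition=0.35, p_denied=0.127):
    """Return (teachers up for tenure, teachers denied) for one hiring cohort.

    ASSUMPTION: all defaults are the crude guesses used in the post.
    """
    up_for_tenure = hired * (1 - attrition)
    return up_for_tenure, up_for_tenure * p_denied


up, denied = teachers_denied_per_year()
print(round(up), round(denied))  # 4550 578

# How much the answer moves across the range of hiring numbers mentioned above.
for hired in (6000, 7000, 9000):
    _, d = teachers_denied_per_year(hired=hired)
    print(hired, round(d))
```

The estimate scales linearly with each input, so a better hiring or attrition number can be dropped straight in.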

So, according to my crude estimates, about 578 people will be denied tenure simply based on this random number generator we call VAM. And as my reader said, this says nothing about the hard-to-measure damage done to all the good teachers trying to teach their kids but having to deal with this standardized testing nonsense. It’s a wonder anyone is willing to work here.

Please comment if you have updated numbers for anything here.

We should also consider how this, in NYC, will disproportionately fall upon teachers who are black and Hispanic. Up to 50% of NYC teachers in high-poverty schools are minorities themselves.

Maybe, just maybe, we should consider the 28% of NYC public school students who don’t graduate.

http://schools.nyc.gov/Accountability/data/GraduationDropoutReports/default.htm

Public school is not a jobs program for teachers, or even a jobs program for minority teachers; it is supposed to educate – something it does very poorly. And don’t bring up funding before looking at the data and in particular per pupil spending.

You are assuming causation. How does that graduation rate change over time? How does it compare with other places?

I am not assuming anything. You assume they were good teachers. I know the system does not work and doing the same thing will not change it. Ideally they would try several options across a random sample of schools and pick the best, then do it again,…

But looking at public schools as a jobs program certainly will not improve the school system.

Hey, watch the tone. I’m looking at the public schools as a system where we might want to attract people that have a sense of fairness.

I agree about the randomized experiments. Not sure why we cannot accomplish that.

Great! We should! And we should look at the multitude of factors that contribute to student achievement and dropout rates that are well outside of school.

I’ve looked extensively at pupil funding, and I understand that, unlike the consumer marketplace, in education you frequently have to spend a LOT more and still get “WORSE” results because funding an education is not like buying a car. A huge number of factors go into why we spend what we spend “per pupil” and a flat comparison of district spending connected to graduation rates and test results is about as crude an analysis as possible.

We can always say that the public schools fail to educate. Hell, Harvard and Yale don’t do a good job of education, either. (They choose excellent students instead.) Even when I was in high school, the saying was, “Mass education is miseducation.”

However, the comparison of the US with other countries is misleading. The US has not done as good a job of primary and secondary education as other advanced countries since the 1950s, if not earlier. But if you compare how well we are doing now with how well we did, say 25 years ago, we are doing much better, especially with minority students. And the high school dropout rate has fallen as part of that picture.

Today’s high school graduates are about as academically intelligent as college graduates of their grandparents’ generation. (See the Flynn Effect.)

How good are our primary and secondary school teachers? It would be great if we required them to have master’s degrees, and paid them appropriately, but we don’t, and we don’t want to. Everybody says they want good education, but they don’t want to pay for it.

In this country we worship the market. Well, the market says that you get what you pay for. But who listens?

Except that VAM is not a random number generator. I would expect that teachers who teach the wrong order of operations or the incorrect way to add fractions to their students will get low scores on VAM, since a very large portion of their students will be doing poorly on the exams directly because they were incorrectly taught. My kids had teachers doing this. I won’t even mention the mathematical errors made by certified high school math teachers taking master’s classes with me. I can say the percent who were doing high school level mathematics confidently incorrectly was closer to 20 percent.

“Except that VAM is not a random number generator”

So how do you explain the fact that you can look at the same teacher teaching the same subject to students in consecutive grades, and find that their VAM scores don’t self-correlate? How can it be that the same person, over the same school year, can be awesome at teaching 6th grade math, but suck at teaching 7th grade math?

http://garyrubinstein.teachforus.org/2012/02/28/analyzing-released-nyc-value-added-data-part-2/

I would prefer to see teachers have to perform well on the exact exams they are teaching to the kids. But their union forbids retesting the teachers with the state exam to see which are actually qualified to teach a particular subject. So one has to wait for their students to perform poorly (which happens when kids have uneducated parents who can’t correct what they have been taught incorrectly). Teachers can still have high VAM scores if their students have educated parents. This is why the VAM scores appear random. Because some bad teachers can slip through with high VAM scores.

While I agree the proposed system is stupid, I would love to come up with a system that allows, say, the worst 1-2% of teachers to be fired, tenured or not. The trick is how to find them! Any ideas??

It seems pretty clear to me that a very small percentage of teachers have such a large negative effect on their students that they need to be gotten rid of quickly. (The same goes for people at any sufficiently large company, where some sort of law of large numbers applies, and their effect on their co-workers.)

VAM is probably a pretty bad model, but I do not think it is a true random number generator. It probably has some nonzero positive correlation to performance. I think you should add that correlation to your model and do a little sensitivity analysis.
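One way to sketch the sensitivity analysis suggested here is a small simulation. Everything in it is an assumption for illustration: each teacher has a fixed “true quality” drawn from a standard normal, each year’s VAM score is correlated with that quality at level `rho` plus independent noise, the bottom 16.3% of scores count as ineffective, and two or more ineffective years out of four means denial. With `rho = 0` it reduces to the pure random number generator case.

```python
import random
from statistics import NormalDist


def simulate_denials(rho, p_cut=0.163, years=4, n_teachers=20_000, seed=0):
    """Return (overall denial rate, denial rate among above-average teachers).

    ASSUMPTION: toy model, not the actual VAM. Yearly score is
    rho * quality + sqrt(1 - rho^2) * noise, so scores stay standard normal.
    """
    rng = random.Random(seed)
    threshold = NormalDist().inv_cdf(p_cut)  # scores below this are "ineffective"
    noise_scale = (1 - rho**2) ** 0.5
    denied = denied_good = n_good = 0
    for _ in range(n_teachers):
        quality = rng.gauss(0, 1)
        bad_years = sum(
            rho * quality + noise_scale * rng.gauss(0, 1) < threshold
            for _ in range(years)
        )
        is_denied = bad_years >= 2
        denied += is_denied
        if quality > 0:  # "good" here just means above-average true quality
            n_good += 1
            denied_good += is_denied
    return denied / n_teachers, denied_good / n_good


for rho in (0.0, 0.3, 0.6):
    overall, among_good = simulate_denials(rho)
    print(f"rho={rho}: overall={overall:.3f}, above-average={among_good:.3f}")
```

The interesting column is the second one: it shows how quickly the share of above-average teachers denied tenure shrinks as the assumed correlation grows.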

Some batched responses to comments above:

First, graduation rates are a terrible metric to use for high school success. Political pressure is applied to raise graduation rates, and the downstream effect is always to make the standards easier (or commit fraud on tests, etc.). Currently NYC boasts great graduation rates, yet 80% of those graduates entering CUNY can’t pass an 8th-grade level entry algebra test. The Regents test scores are basically just fictional BS from what I understand now (scaled to a new level each year, set post-test, that arranges a certain percent to pass automatically). Likewise, CUNY responded to pressure for higher graduation rates with its recent Pathways initiative (which took a 92% no-confidence vote from faculty) to reduce credit-hours and wipe out science labs and language requirements at several schools; the next step is likely to strike out the universal expectation of passing algebra, or even basic arithmetic, for college graduates.

http://www.villagevoice.com/2013-04-03/news/system-failure-the-collapse-of-public-education/

Second, there’s something about the Flynn Effect (claims that people are getting smarter) that doesn’t sync up with this degeneration of standards. I’m pretty sure my grandfather and his brothers and sisters (very proud of their high school diplomas) would have laughed at the college algebra tests we give now. The ETS released a report in January finding that U.S. millennials are demonstrably weaker in literacy, numeracy, and even technology than prior generations:

http://www.ets.org/s/research/29836/

Third, if I’m in a dark mood then I’ll point out that U.S. entrants to education programs have been dead-last in any kind of proficiency testing among college students consistently for the last century. Many probably couldn’t even follow the demonstration here, or assess whether or why the teaching job may not be rewarding for them. Of course, the lack of structure or support at their institutions is more important, but generally most are not in a position to tell or explain if their received materials to teach a subject like math are correct/coherent or not.

http://qz.com/334926/your-college-major-is-a-pretty-good-indication-of-how-smart-you-are/

Wow. This is going all kinds of places. A few thoughts:

1) Randomized controlled trials (RCTs) have done some great stuff — in lab settings and medicine, and even in social science (see the work of JPAL at http://www.povertyactionlab.org for some examples). However, to fetishize them (as I see happening too much in education) is unwise. They are incredibly expensive, incredibly hard to do, weak in terms of statistical power, and give little useful guidance on policy. There are long timescales, high peer effects, and too many outcome variables that we care about — all big differences from the “medical model”. So I strongly disagree that this would be “ideal”.

2A) The REALLY bad effects of the use of VAM in teacher evaluation are SO BAD (for example, the American vogue for highly “reliable” assessments that narrow and lower the thinking that’s actually assessed) that I disagree we need to stick it with responsibility for other misdeeds. In particular (if I read it correctly) the NYS law only “ties the hands” if the teacher’s growth measure is in the ineffective band. So this (6% rather than 16.3%) means the number is more like 2% of all teachers. But if you cut this down for teachers of tested subjects in tested grades I think it’s way fewer than 140. In any case, this is such a small sin of VAM use that I don’t think this is the best line of attack. Perhaps from a journalistic perspective it makes more sense. Thoughts, Cathy?

2B) One robust finding is that principals are “easy graders” in the observational component of the evaluation process (visible in the NYSED slides I linked, as well as many other places). I’ve come to think that this ISN’T due to imperfect design or principals’ skills that need to be built (though these may both be true); it’s because HIGH STAKES have been attached, and a Taylorism-based framework of consequences has been built such that the visible, short-term consequences of falsely identifying a teacher as ineffective are so bad that it’s better to err on the side of “grading easy”. If we had a spirit that it was about continually improving together, we’d see more productive critique and actionable feedback. (This seems another manifestation of Campbell’s law; it’s not just about “gaming” but about “insurance”.)

3) Teachers “performing well” on student exams sounds appealing, but it misunderstands both some basic ideas of assessment (inference and validity) and the content knowledge that teachers need. There’s more (and DIFFERENT) knowledge than what students or other practitioners need. There’s a lot of research on “mathematical knowledge for teaching” (see Shulman, Ball, Hill, and many others). But I think the more fundamental problem here is that we (Americans) seem to think effective teaching is about knowledge transmission, and so it should be straightforward. Teaching is complex and often hard. Many of the factors and situations that come to a head in large urban districts like NYC intensify the challenge while scoring some “own goals” in terms of organization and culture.
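The 2% figure in point 2A can be checked with a short sketch, under the same assumed rule as the post’s estimate (denial after at least two bottom-band years out of four, with years treated as independent). The function name is hypothetical, and the 7,000 hires per year is the post’s own rough guess.

```python
from math import comb


def prob_at_least_two_bad(p, years=4):
    """Chance of two or more bottom-band years out of `years` independent years.

    ASSUMPTION: the "at least 2 of 4" rule is an illustrative reading, not the law's text.
    """
    return sum(
        comb(years, j) * p**j * (1 - p) ** (years - j)
        for j in range(2, years + 1)
    )


for cutoff in (0.163, 0.06):
    share = prob_at_least_two_bad(cutoff)
    # 7000 is the post's guess at yearly hires, before any attrition.
    print(f"cutoff={cutoff:.3f}: share={share:.3f}, of 7,000 hires={7000 * share:.0f}")
```

With the 6% cutoff the share comes out at roughly 2%, which is where the figure of about 140 (2% of 7,000) comes from before narrowing to tested subjects and grades.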

revuluri, I agree that the problems with test-and-punish go beyond the problems with VAM. A basic question is what scale to use to measure student success, how this is decided, etc.

One clause in the new teacher evaluation law calls for “student growth scores” even for teachers whose course does not have a state test. What this will mean is hard to say. Does this mean that art and music teachers will also have VAM measures? It seems to depend on what the next NYS Education Commissioner decides. The Commissioner also apparently decides on where the cut-off is for VAM scores.

The position is currently vacant, and apparently the Board of Regents appoints the Commissioner.

Cathy, I would urge you to speak out, not only at this blog, but also in public settings, at rallies if there are any in NYC. There are other problems with so-called education reform. But you are especially well-qualified to speak on the problems with VAM.

I’d be into that.

So, if you argue at all for VAMs, you’re arguing that one teacher’s effects should dictate the outcome for all of the other teachers.

On top of all of the other problems with y=mx+b-t (with “t” being last year’s score), which is all a VAM essentially is (let’s hear it for 8th grade algebra!), English teacher VAM scores account for as much of the VAM outcomes in science and social studies teachers as do the actual teachers of those subjects.

See this very nice open access study that came out recently on this: http://epaa.asu.edu/ojs/article/view/1761

Our VAM policy system is holding teachers accountable for subjects they don’t teach. How much of your probability of being fired do you want attributable to your co-workers? Perhaps some, as it’s your job to work together and help each other out, but equal?

So to argue for VAMs is to assume a linear direct effect of teaching on learning, even on the 10% of a test score that’s up for explaining by a school at all, which just doesn’t hold up as all that useful. Berliner and Glass 2014 lay out the main arguments and more here http://www.amazon.com/Myths-Threaten-Americas-Public-Schools/dp/0807755249

revuluri, I am not seeing the link to the slides you mentioned when talking about principals being easy graders. My state (NM) recently sent all of our principals through training based on a similar claim. After their training, they had to rate a video of a teacher ‘correctly’ in order to earn the right to evaluate their own teachers. Almost every principal in the state came back saying they were apparently grading too easy, even on this no stakes (for the teacher) evaluation of a random unnamed classroom teacher. The pressure on principals the entire time was to evaluate teachers closer to the ‘objectively correct’ lower score. But there is no ‘objectively correct’ score for a teacher or their classroom presentation, and trying to convince principals that there is such a thing on the strength of a rubric (and then further obscuring the subjective nature of evaluation behind VAM) just seems like an effort to drive down scores and disguise the subjective nature of claims about teacher quality.

And lest you think that I am just bitter about the state forcing me into bad reviews, my evaluation from my local admin was solid but mediocre, whereas the ‘objective’ outside observer ranked me very high. In talking with other highly effective teachers who were rated much lower than me by the state observer (and roughly equivalent by our principal), it seems like the big difference is that I made a point to have some conversations with her outside of the formal evaluation in which we discussed my classroom decisions and the theory underlying them. While I am happy with the results, it makes me even less likely to believe that there is any objective truth behind the claim that principals are lax evaluators, precisely because evaluation of a complex process like teaching is an inherently subjective process. This, in turn, causes me to suspect that those who make claims about how principals are objectively lax are simply masking an agenda (possibly even from themselves).

And at the point that the scatterplot of teacher VAM scores looks utterly random, as the chart of MS teachers teaching more than one level of math did, it really is fair to say that it is a random number generator. You and I can come up with all sorts of anecdotes about teachers who cannot teach math (especially from elementary teachers). You cannot convince me that there are sufficient numbers of MS teachers that can teach one level of math very well and another terribly that the random scatter from that chart makes any kind of sense for anything other than a random number generator.

Here’s an article from American Educator, 2008, on the advantages of teacher peer review in the evaluation process. Note finding (1) on p. 7: principals do not have the time or ability to properly evaluate teachers. And finding (6) on p. 37: a panel of consulting teachers (CT) is far more likely to recommend nonrenewal for underperforming teachers (12% compared to about 1% for principals):

Click to access goldstein.pdf

I’m sorry I was out of internet range when this was published… As a retired superintendent of schools with 35 years of experience in the field I am maddened by the conventional wisdom that schools are populated with bad teachers and administrators do nothing to “weed out” bad teachers. This overlooks the reality that many administrators non-renew poor and/or marginal teachers BEFORE they receive a continuing contract and many poor and/or marginal teachers decide to resign and seek work elsewhere after finding that teaching is hard work. Gathering data on this is difficult because not all voluntary resignations are the result of ineffectiveness and in many states school boards are not asked to act on non-renewals.

VAM got traction because of this perception that bad teachers abound and also because politicians seek fast, cheap, and visible solutions to complicated problems. Thank you for pointing out the mathematical and statistical flaws of this politically convenient means of “solving” the problems of public education.
