## Are SAT scores going down?

September 23, 2011

I wrote here about standardized tests like the SAT. Today I want to spend a bit more time on them, since they've been in the news and it's pretty confusing to know what to think.

First, it needs to be said that, as I've learned in the book I'm reading, it's probably a bad idea to make statements about learning based on "cohort-to-cohort comparisons" instead of following actual students over time. In other words, if you compare how well the 3rd grade did on a test one year to how it did the next, then for the most part the difference could be explained by the fact that these are different populations with different demographics. Indeed the College Board, which administers the SAT, explains that scores went down this year because a larger and more diverse pool of kids is taking the test. So that's encouraging, and it makes you think the statement "SAT scores went down" is in this case pretty meaningless.

But is it meaningless for that reason?

Keep in mind that these are small differences we're talking about, but with a pretty huge overall sample size. Even so, it would be nice to see some error bars, along with the methodology for computing them.
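To see why the sample size matters here, consider a quick back-of-the-envelope calculation. The numbers below are illustrative assumptions, not College Board figures: an SAT section score SD of roughly 110 points and on the order of 1.5 million test takers per year.

```python
import math

# Rough sketch: pure sampling error on a year-to-year mean-score difference.
# These numbers are assumptions for illustration, not official figures.
sd = 110.0        # assumed SD of a section score, in points
n = 1_500_000     # assumed number of test takers in one year

se_mean = sd / math.sqrt(n)        # standard error of one year's mean
se_diff = math.sqrt(2) * se_mean   # SE of the difference of two yearly means

print(f"SE of one year's mean: {se_mean:.3f} points")
print(f"SE of the difference:  {se_diff:.3f} points")
```

With samples this big, the sampling error on a mean difference is a small fraction of a point, so even a 1-point drop looks "significant" on that basis alone. Which is exactly why the error from the rest of the pipeline, not the sampling, is the interesting part.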

What I'm really worried about, though, is the "equating" part of the process. That's the process by which they decide how to compare tests from year to year, mostly by including ungraded questions that the forms have in common. At least that's my guess; it's actually not clear from their website.
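For concreteness, here is a toy version of common-item ("anchor") equating, the general idea described above. To be clear, this is not the College Board's actual procedure, which isn't public; it's a minimal sketch of the simplest mean-equating logic, and all the names and numbers are invented.

```python
# Toy sketch of common-item ("anchor") equating.  The gap between two
# groups on the shared, unscored anchor questions estimates the *ability*
# difference between the groups; whatever gap remains on the scored items
# is attributed to form difficulty, and the new form's scores are shifted
# to remove it.  This is an illustration, not the College Board's method.

def mean_equate(old_unique_mean, old_anchor_mean,
                new_unique_mean, new_anchor_mean):
    """Return the additive adjustment for the new form's scores."""
    ability_gap = new_anchor_mean - old_anchor_mean    # group difference
    observed_gap = new_unique_mean - old_unique_mean   # group + form difference
    form_effect = observed_gap - ability_gap           # form difficulty difference
    return -form_effect                                # shift that removes it

# Example: the new group is 1 point weaker on the anchors, but scores
# 3 points lower on the new form's scored items, so the new form looks
# about 2 points harder and new-form scores get +2.
adjust = mean_equate(old_unique_mean=500.0, old_anchor_mean=500.0,
                     new_unique_mean=497.0, new_anchor_mean=499.0)
print(adjust)  # 2.0
```

Notice that everything hinges on the anchor items giving an unbiased read on group ability, which is exactly the assumption questioned below.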

My first question is: are they keeping track of the errors introduced by the equating process? (I find it annoying how often people, when they calculate errors, only account for the very last step of a sketchy overall process with many steps.) For example, is their equating process so good that they can really tell us, with statistical significance, that American Indians as a group did 2 points worse on the writing test (see this article for numbers like this)? I'm pretty sure that's a best guess with significant error bars.
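The "only the last step" complaint can be made precise: if a reported score passes through several processing steps with roughly independent errors, the variances add, and quoting only one step's error understates the total. The step names and magnitudes below are made up for illustration.

```python
import math

# Sketch of error propagation across a multi-step scoring pipeline.
# For independent steps, variances add.  All numbers here are invented
# to illustrate the point, not taken from any real analysis.
step_ses = {
    "sampling": 0.13,    # SE from the finite sample of test takers
    "equating": 1.50,    # SE of the equating function itself
    "scaling":  0.50,    # SE from mapping raw to scaled scores
}

total_se = math.sqrt(sum(se ** 2 for se in step_ses.values()))
print(f"total SE: {total_se:.2f} points")
```

Under these made-up assumptions, the combined SE is around 1.6 points, which would make a reported 2-point group difference very hard to call statistically significant.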

Additional note: I found this quote in a survey paper on equating methodologies (top of page 519):

> Almost all test-equating studies ignore the issue of the standard error of the equating function.

Second, I'm really worried about the equating process and its error bars for the following reason: the number of repeat testers varies widely by demographic, and also from year to year. How, then, can we assess performance on the "linking questions" (the questions repeated on different tests) if some kids (in fact, the kids most likely to be practicing for the test) are seeing them repeatedly? Is that controlled for, and how? Are they removing repeat testers?
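A toy calculation shows why repeat testers could matter. Suppose (and these numbers are pure invention) that having seen the anchor questions before is worth a 10-point boost on them, and 20% of a group is retesting:

```python
# Toy illustration of the repeat-tester worry.  If some test takers have
# already seen the anchor questions, their inflated anchor scores make
# the whole group look stronger than it is, which pushes the equating
# adjustment in the wrong direction.  All numbers are invented.
true_anchor_mean = 499.0   # what the group would score seeing items cold
practice_boost = 10.0      # assumed bump for having seen the items before
repeat_fraction = 0.20     # assumed share of the group retesting

observed_anchor_mean = true_anchor_mean + repeat_fraction * practice_boost
print(observed_anchor_mean)  # 501.0
```

Under these assumptions the group looks 2 points stronger on the anchors than it really is, so an equating step that trusts the anchors would miscorrect by that same couple of points, which is the size of the group differences being reported.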

This brings me to my main complaint about all of this. Why is the SAT equating methodology not open source? Isn't the proprietary "intellectual property" the test questions themselves? Am I missing a link? I'd really like to take a look. Even better, of course, would be a truly open-source methodology (as in, an available script that actually computes the scores starting from raw data) along with the data itself, suitably anonymized.

1. September 23, 2011 at 10:50 am

ETS is a monopoly, so it is not accountable to you or anyone else. They have zero motivation to be transparent. It’s easier for them to ask the public to trust them that their methodologies are legit, since only 0.1% of the population would even think to question it.

Also, although SAT data might be used to compare cohorts from different times, I doubt that it was *designed* with this purpose in mind. For the sole purpose of college admissions, standardization is a much simpler problem. For this purpose, a score only needs to carry a fairly consistent meaning over the course of a few years rather than an entire decade.
