
War of the machines, college edition

April 14, 2013

A couple of people have sent me this recent essay (hat tip Leon Kautsky) written by Elijah Mayfield on the education technology blog e-Literate, described on their About page as “a hobby weblog about educational technology and related topics that is maintained by Michael Feldstein and written by Michael and some of his trusted colleagues in the field of educational technology.”

Mayfield’s essay is entitled “Six Ways the edX Announcement Gets Automated Essay Grading Wrong”. He’s referring to the recent announcement, which was written about in the New York Times last week, about how professors will soon be replaced by computers in grading essays. He claims they got it all wrong and there’s nothing to worry about.

I wrote about this idea too, in this post, and he hasn’t addressed my complaints at all.

First, Mayfield’s points:

  • Journalists sensationalize things.
  • The machine is identifying things in the essays that are associated with good writing vs. bad writing, much like it might learn to distinguish pictures of ducks from pictures of houses.
  • It’s actually not that hard to find the duck and has nothing to do with “creativity” (look for webbed feet).
  • If the machine isn’t sure, it can spit the essay back to the professor to read (if the professor is still employed).
  • The machine doesn’t necessarily reward big vocabulary words, except when it does.
  • You’d need thousands of training examples (essays on a given subject) to make this actually work.
  • What’s really wonderful is that a student can get all his or her many drafts graded instantaneously, which no professor would be willing to do.

Here’s where I’ll start, with this excerpt from near the end:

“Can machine learning grade essays?” is a bad question. We know, statistically, that the algorithms we’ve trained work just as well as teachers for churning out a score on a 5-point scale.  We know that occasionally it’ll make mistakes; however, more often than not, what the algorithms learn to do are reproduce the already questionable behavior of humans. If we’re relying on machine learning solely to automate the process of grading, to make it faster and cheaper and enable access, then sure. We can do that.

OK, so we know that the machine can grade essays written for human consumption pretty accurately. But it hasn’t had to deal with essays written for machine consumption yet. There’s major room for gaming here, and only a matter of time before there’s a competing algorithm to build a great essay. I even know how to train that algorithm. Email me privately and we can make a deal on profit-sharing.

And considering that students will be able to get their drafts graded as many times as they want, as Mayfield advertised, this will only be easier. If I build an essay that I think should game the machine, by putting in lots of (relevant) long vocabulary words and erudite phrases, then I can always double check by having the system give me a grade. If it doesn’t work, I’ll try again.

And the essays built this way won’t get caught via the fraud detection software that finds plagiarism, because any good essay-builder will only steal smallish phrases.
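The resubmit-until-it-works loop described above can be sketched in a few lines. This is a toy illustration under loud assumptions: `toy_grade` is a made-up stand-in for the grader that just rewards longer words, and `FANCY` is a hypothetical substitution table. Neither reflects how edX or any real grading system works; the point is only that unlimited instant feedback turns gaming into a simple search.

```python
import random

# Hypothetical stand-in for the automated grader: a black box returning a
# score from 1 to 5. It rewards longer average word length, which is an
# assumption made for illustration, not how any real grader works.
def toy_grade(essay: str) -> int:
    words = essay.split()
    if not words:
        return 1
    avg_len = sum(len(w) for w in words) / len(words)
    return max(1, min(5, round(avg_len - 2)))

# Hypothetical table of "erudite" replacements for plain words.
FANCY = {
    "use": "utilize",
    "show": "demonstrate",
    "big": "substantial",
    "help": "facilitate",
    "good": "commendable",
}

def game_grader(essay, grade, target=5, max_tries=200, seed=0):
    """Resubmit edited drafts until the grade reaches target or tries run out."""
    rng = random.Random(seed)
    best, best_score = essay, grade(essay)
    for _ in range(max_tries):
        if best_score >= target:
            break
        words = best.split()
        candidates = [i for i, w in enumerate(words) if w in FANCY]
        if not candidates:
            break  # nothing left to swap
        i = rng.choice(candidates)
        words[i] = FANCY[words[i]]
        draft = " ".join(words)
        score = grade(draft)  # the free, instant feedback on each draft
        if score >= best_score:  # keep any draft that does not score worse
            best, best_score = draft, score
    return best, best_score

start = "we use data to show big gains and help good schools"
best, score = game_grader(start, toy_grade)
print(toy_grade(start), "->", score)
print(best)
```

The essay-builder never needs to know how the grader works. It only needs the grader to answer as often as it is asked.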

One final point. The fact that the machine-learning grading algorithm only works when it’s been trained on thousands of essays points to yet another depressing trend: large-scale classes with the same exact assignments every semester so last year’s algorithm can be used, in the name of efficiency.

But that means last year’s essay-building algorithm can be used as well. Pretty soon it will just be a war of the machines.

Categories: data science, modeling, musing, news
  1. April 14, 2013 at 8:03 am

    Distinguishing ducks from houses?! That’s grading an essay? The algorithm can’t even tell a small bird from a large inanimate structure? And this helps students learn good writing, how? So what if the algorithm’s results are correlated with grades? How can any student learn to communicate by writing under these conditions? This approach will only generate pompous asses, not writers.


  2. Brian
    April 14, 2013 at 10:39 am

    Well, of course it will be a war of the machines. I, for one, welcome our new essay-writing overlords.


    Just wait till we have computers spitting out essays of pseudo babble that even the postmodern literary folks can’t pull apart.


  3. April 14, 2013 at 1:29 pm

    I really like your blog and the reality you offer; I consider it valuable for sure. I recently commented on an article over at Forbes, which publicly stated a while back that bots and algos would be generating some of their news articles. I asked the author whether he or the bot wrote the column, which was basically done to promote a book about the rise of the machines :) I got a human answer for sure, but I still don’t know whether the bot did the original book review or not :)

    I wrote on my blog a short while back about a man who got a patent for a technology he calls “Long Tail”. He brags about using it to write 800,000 books at once, related to rare diseases, and put them up for sale on Amazon. So do we go to the next level with machine-generated material for study too? He stated that his patented technology has a few other purposes for generating material outside of books.

    I think all of this really brings up some good questions for sure. Narrative Science is the company that Forbes is using for their automated content, and I found their customer page interesting :) It’s “award winning”, the page states :) Anyway, those are just a couple of other “machine generating” technologies that are out there, so what machines will be programmed to work with other machines? I don’t know if that is a good question or not.


    You are so right on it being the war of the machines…


  4. April 15, 2013 at 1:04 am

    It’s worth mentioning that Mayfield, author of said essay, is vested in promoting grading technologies. He even started a company around it; his About blurb reads, “Founder of LightSIDE Labs, a small company in Pittsburgh focusing on machine learning for automated writing assessment in education.”

    Essentially his article is a plea to not throw the baby out with the bathwater, that there’s some good in all this. Perhaps. He hasn’t proven his case.


  5. April 15, 2013 at 8:19 am

    Worst-case scenario analyses are appropriate in security research, analyses of running times of algorithms and, maybe, privacy implications of “open data” (I certainly thank Mathbabe for the wonderfully expressive “thinking like an asshole” phrase).

    But it is not at all obvious that the same kind of analysis is useful in educational contexts. Yes, we should be aware of feedback loops. Yes, we just have to look at cheating on high-stakes standardized tests to have an idea of what can and will go wrong if we introduce computer-aided evaluations unthinkingly into schools. But another framing is to ask what could be done with the technology by people dedicated to improving education.

    I like the answer Justin Reich gave a year ago: http://blogs.edweek.org/edweek/edtechresearcher/2012/04/grading_automated_essay_scoring_programs-_part_iii_classrooms.html . He proposes to use machine-graded essays as an aid in formative assessment activities, enabling teachers to assess their students in ways not possible right now.

