
New creepy model: job hiring software

April 10, 2013

Warmup: Automatic Grading Models

Before I get to my main take-down of the morning, let me warm up with an appetizer of sorts: have you been hearing a lot about new models that automatically grade essays?

Does it strike you that there’s something wrong with that idea, but you don’t know what it is?

Here’s my take. While it’s true that it’s possible to train a model to grade essays much the way a professor does now, that doesn’t mean we can introduce automatic grading – at least not if the students in question know that’s what we’re doing.

There’s a feedback loop, whereby if the students know their essays will be automatically graded, then they will change what they’re doing to optimize for good automatic grades rather than, say, a cogent argument.

For example, a student might download a grading app themselves (wouldn’t you?) and run their essay through the machine until it gets a great grade. Not enough long words? Put them in! No need to make sure the sentences make sense, because the machine doesn’t understand grammar!

This is, in fact, a great example of why people need to take into account the (obvious once you think about them) feedback loops that their models will enter into in actual use.
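
To make the loop concrete, here’s a minimal sketch in Python (the grading rule below is completely made up by me, not anything a real vendor has published) of a student gaming a naive grader that rewards length and fancy vocabulary:

    # Toy grader: the score rises with word count and with "fancy" words.
    def grade_essay(text):
        words = text.split()
        if not words:
            return 0.0
        long_words = sum(1 for w in words if len(w) >= 8)
        length_score = min(len(words) / 300, 1.0) * 60   # saturates at 300 words
        vocab_score = min(long_words / len(words) / 0.2, 1.0) * 40
        return length_score + vocab_score

    essay = "Cats are nice. " * 20  # a weak essay, honestly written
    padding = "notwithstanding multifaceted considerations "  # grade bait

    # Run the essay through the app until it gets a great grade,
    # without ever improving the argument.
    while grade_essay(essay) < 90:
        essay += padding

    print(round(grade_essay(essay), 1))  # 90+, gibberish and all

Not one word of the argument gets better; the machine is satisfied by word-stuffing alone.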

Job Hiring Models

Now on to the main course.

In this week’s Economist there is an essay about the new, widely used job hiring software and how awesome it is. It’s so efficient! It removes the biases of those pesky recruiters! Here’s an excerpt from the article:

The problem with human-resource managers is that they are human. They have biases; they make mistakes. But with better tools, they can make better hiring decisions, say advocates of “big data”.

So far “the machine” has made observations such as:

  • Good if the candidate uses a browser you have to download, like Chrome.
  • Having a criminal record is not as bad as one might expect.
  • Neutral on job hopping.
  • Great if you live nearby.
  • Good if you are on Facebook.
  • Bad if you’re on Facebook and every other social networking site as well.

Now, I’m all for learning to fight against our biases and hiring people who might not otherwise be given a chance. But I’m not convinced that this will happen very often – the people using the software can always train the model to include their biases and then point to the machine and say, “The machine told me to do it.”
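
To see how that laundering works, here’s a minimal sketch (the recruiter, the features, and all the data are invented by me): train even the dumbest possible model on a biased recruiter’s past decisions and it hands the bias right back.

    import random

    random.seed(0)

    # Invented history: the recruiter screens on skill, but also
    # (unfairly) penalizes qualified candidates who live far away.
    def biased_recruiter(skill, lives_far):
        if skill <= 0.5:
            return False  # unqualified candidates are never hired
        return random.random() < (0.3 if lives_far else 0.9)

    candidates = [(random.random(), random.random() < 0.5) for _ in range(10000)]
    labels = [biased_recruiter(skill, far) for skill, far in candidates]

    # The simplest possible "model": the empirical hire rate per group,
    # restricted to equally qualified candidates (skill > 0.5).
    def modeled_hire_rate(far):
        group = [hired for (skill, f), hired in zip(candidates, labels)
                 if skill > 0.5 and f == far]
        return sum(group) / len(group)

    print(f"hire rate, nearby:   {modeled_hire_rate(False):.2f}")  # about 0.90
    print(f"hire rate, far away: {modeled_hire_rate(True):.2f}")   # about 0.30

Identically qualified candidates, wildly different scores, and a recruiter who can now shrug and blame the machine.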

What I really object to, however, is the accumulating amount of data that is being collected about everyone by models like this.

It’s one thing for an algorithm to take my CV in and note that I misspelled my alma mater, but it’s a different thing altogether to scour the web for my online profile trail (via Acxiom, for example), to look up my credit score, and maybe even to see my persistence score as measured by my past online education activities (soon available for your 7-year-old as well!).

As a modeler, I know how hungry the model can be. It will ask for all of this data and more. And it will mean that nothing you’ve ever done wrong, no fuck-up that you wish to forget, will ever be forgotten. You can no longer reinvent yourself.

Forget mobility, forget the American Dream, you and everyone else will be funneled into whatever job and whatever life the machine has deemed you worthy of. WTF.

Categories: data science, modeling, rant
  1. Michael Thaddeus
    April 10, 2013 at 7:22 am

    Just a quibble — the machine does understand grammar, i.e. syntax. What it doesn’t understand is semantics.

    Anyway, yes this sudden burst of optimism about automated grading would be hilarious if it weren’t so frightening. See this NYT story:

    • April 10, 2013 at 7:49 am

      Machines can’t understand anything, since they lack minds. At best they can execute highly sophisticated pattern recognition.

      • April 12, 2013 at 2:56 pm

        And what is the difference?

        • April 12, 2013 at 4:25 pm

          Humor, nuance, sarcasm, poetic license, …. 🙂

        • April 12, 2013 at 5:20 pm

          All patterns I recognize … even if I don’t always understand them. 😉

        • April 12, 2013 at 7:18 pm

          Right! 😉

  2. dave
    April 10, 2013 at 7:41 am

    What would Daniel Kahneman say? And let’s just cut to the chase and provide DNA samples #gattaca. Finally, who is composing these models, and will their job prospects increase?

  3. April 10, 2013 at 7:48 am

    Agreed on all points, but I also would add something far more sinister: The transformation of education from an activity focused on the development of our intellects–the core of our humanness and therefore humanity–into a normalizing of our behavior according to our wanna-be corporate masters.

    The goal of writing is to express one’s thoughts in a way that enables others (i.e., humans) to at least grasp, if not intimately “share”, those thoughts. Writing, like speech, is a fundamental human activity that is critical for civilized societies to exist. To teach writing, then, is to help a student develop their skills to understand, organize, and express their thoughts. Traditionally, this was done by teaching the Trivium (logic, grammar, and rhetoric) to enable students to think clearly, express themselves accurately, and argue convincingly.

    Since the goal of writing is to extend one’s internal understanding to another through written expression, students need skilled and experienced human colleagues to hone their skills. At best, a computer can only offer a program that executes some sort of pattern analysis and scoring according to a correlative model that is based on someone’s (or some group’s) view of what “good” writing looks like. In other words, since computers don’t “think” (a minor point that seems lost on most journalists, business types, and teachers), the computer can only try to find correlative similarities between submitted writings and some aggregate model of writing. To the computer, the submitted writing is just a string of encoded electrical impulses that are manipulated according to some algorithm to produce an encoded output that is displayed in a form that we can perceive. But to call that “understanding” is a travesty.

    In fact, that defeats the very purpose of writing, since there can be no hope of establishing an understanding–and hence communication–between the writer and the reviewer. Computers don’t understand anything; hence any output from the computer can’t convey any understanding of the subject essay. The teacher who reads only the computer’s assessment thus can’t understand the writing. How, then, can this serve the purpose of training and coaching good writers?

    The quotes from the Economist seem to be almost verbatim statements from a marketing rep. The “bias” angle is really marketing speciousness at its worst. The issue in grading fairness isn’t bias, it’s prejudice. Yes, we all have biases; so what? Good teachers, like all good readers and writers, understand their biases and consciously keep them in check. There are graders out there who are fair, and that’s the goal. And since writing is about communication between human beings, and not some rule-based, formalized, abstract exercise like mathematics or symbolic logic, fairness is what matters.

    Worse, who says the computer (really the underlying algorithm) isn’t biased itself? The algorithm has to be based on a model of “good” writing. But machines can’t judge writing; so, the model has to be based on “biased” human beings! And bringing in extraneous information about social sites, credit scores, etc. only shows that the bias of the computer will be far more extensive than any human teacher.

    So, if the idea of computer grading actually defeats the purpose of writing, and if the very programs that do the “grading” are likely to be at least as biased as human graders, if not more so, then what’s the point? Well, profits for the companies that make the software are important. But there’s much more at stake. Replacing writing teachers is just another nail in the coffin of humanities education. Education will become still more like technical training. Worse, as you noted, Cathy, by forcing students to “write to the model”, the owners of the code, like Bill Gates and Rupert Murdoch, will actively mold the minds of the young to their own models of humanity. And we’re talking about some pretty withered human beings here.

    Time for home schooling.

    • Thads
      April 10, 2013 at 9:38 am

      Oddly enough, just this week a childhood friend invited me to get involved in developing educational software for his company, which is wholly owned by Rupert Murdoch.

      • April 10, 2013 at 9:42 am

        Mole?

        • Kurt S
          April 10, 2013 at 2:41 pm

          You already know people that work there…

        • Kurt S
          April 10, 2013 at 2:44 pm

          I’m amused by their example of something that can go horribly wrong with this software. Not hiring the wrong person or a psychopath or not hiring a great person, but:

          they can go horribly wrong. Peter Cappelli of the University of Pennsylvania’s Wharton School of Business recalls a case where the software rejected every one of many good applicants for a job because the firm in question had specified that they must have held a particular job title—one that existed at no other company.

          HORRIBLE!

        • Kurt S
          April 10, 2013 at 2:45 pm

          Whoops, not sure why WordPress put this here instead of as a top level comment.

      • April 10, 2013 at 9:55 am

        Just say no! Save your soul! 🙂

  4. April 10, 2013 at 8:06 am

    In a MOOC that I took part in, the grading of discursive answers was done by peers. You had to grade two or three other students, and the software told you what had to be present to give each grade. It seemed to work well, but it would have been great to see the raw data for that, to see how much graders agreed and stuff like that.
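
    If the raw grades were ever released, the check would be simple; here’s a sketch, with invented grades standing in for the real ones, of the agreement numbers I’d want to see:

        from itertools import combinations

        # Invented raw data: submission -> grades from its 2-3 peer graders.
        peer_grades = {
            "essay_01": [3, 3, 2],
            "essay_02": [1, 2],
            "essay_03": [3, 3, 3],
            "essay_04": [0, 2, 1],
        }

        pairs = [(a, b) for grades in peer_grades.values()
                 for a, b in combinations(grades, 2)]
        exact = sum(a == b for a, b in pairs) / len(pairs)
        mean_gap = sum(abs(a - b) for a, b in pairs) / len(pairs)
        print(f"exact agreement: {exact:.0%}, mean gap: {mean_gap:.2f} points")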

    • araybold
      April 10, 2013 at 10:01 am

      Were you not concerned by the fact that your grades were, at least in part, being decided by people who presumably lacked an in-depth knowledge of the subject matter? I would be interested in knowing how that could possibly work.

      • April 13, 2013 at 7:26 am

        Not too much, for their grades were also being decided by me. And there was an appeals process. The honest truth is that my grades didn’t matter nearly as much to me as just learning about the course material. But perhaps I’m lucky in that I didn’t need the qualification or anything like that.

  5. Linda
    April 10, 2013 at 9:18 am

    One had always understood that the point of writing was communication, transmission of information or sentiment from one mind to another. It is hard to imagine sitting down to write something that you know isn’t going to be read, even by an overworked and bored graduate student. This seems to be a category mistake, a misunderstanding of what writing is. The spookiest part about computer essay grading, as presented in the NY Times story, is that the student submits an essay and receives a grade instantaneously–so clearly the writer is not even “communicating” with the machine–the essay is being evaluated on surface features. In the old human-grading system, a student improved an essay by tightening the argument, using well-chosen examples, employing fresh language rather than relying on cliches–it doesn’t seem that these things are what the computer will be checking for.

    • April 10, 2013 at 9:19 am

      Exactly!

    • araybold
      April 10, 2013 at 10:42 am

      I am sorry to say, but the dirty little secret of the testing industry is that the human grading of essays has already been dumbed-down into a superficial rote procedure. I have seen comments confirming this from people employed by the companies that score the SAT essays, and Dr. Perelman of MIT studied the essay answers about five years ago and found a very strong correlation of score with length – the essays might as well have been weighed instead of read. In this light, the ability of software to score these essays similarly to humans is no particular achievement, and no indication of quality.

      There is a general obsession in management, which I tentatively blame on business schools, for metrics that are ‘objective’ and ‘repeatable’ without regard to whether they are meaningful (being ‘gameable’ is just one way in which metrics can be rendered meaningless.)
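
      Anyone sitting on a pile of human-graded essays can rerun Perelman’s check; here’s a sketch, with a tiny invented corpus standing in for real graded essays:

          from statistics import correlation  # Python 3.10+

          # Invented (essay, score) pairs; with real data, load graded essays here.
          corpus = [
              ("Brevity.", 1),
              ("A short essay made of a handful of plain words.", 2),
              ("A middling essay that rambles on for a while, circling its "
               "point without quite landing it, but filling space nicely.", 3),
              ("A long essay that goes on and on, piling clause upon clause, "
               "example upon example, paragraph upon paragraph, until the "
               "sheer weight of words starts to look like an argument, which "
               "is precisely what a length-obsessed scorer rewards.", 4),
          ]

          lengths = [len(essay.split()) for essay, _ in corpus]
          scores = [score for _, score in corpus]
          print(f"length-score correlation: {correlation(lengths, scores):.2f}")
          # Near 1.0 on real data would mean the essays might as well
          # have been weighed instead of read.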

      • Leon
        April 11, 2013 at 6:48 am

        That reminds me of the observation about software engineering’s (mostly former) obsession with quality metrics, e.g. cyclomatic complexity. Namely, it was discovered that nearly every metric described so far is highly correlated with the length of the program, and that if you knew the length of the program you could basically ignore all other metrics as inputs to your model.

        Then of course, it’s been fairly well known (at least among people who actually do the programming) that these models aren’t worth anything once you take them too seriously, e.g. http://dilbert.com/strips/comic/1995-11-13/

  6. araybold
    April 10, 2013 at 10:22 am

    The flip side of not being able to have your mistakes forgotten is that nothing noteworthy you do will count for anything, if it is unusual (as noteworthy things generally are). Even if that achievement could somehow be adequately encoded in a form acceptable to the model, it will not have been trained on sufficient comparable examples for it to be taken into account.

  7. April 10, 2013 at 11:48 am

    A lot like running ‘blog’ or web copywriting through a black box to optimize search engine crawl! Takes a little of my writing creativity away – but who wants THAT anymore?!? Get those eyeballs (but what if it isn’t compelling, interesting, engaging once they get there).

  8. Becky Jaffe
    April 10, 2013 at 1:21 pm

    I recently had a run-in with some grading software that illustrates your point. I am an educational consultant who works with high school students individually. One of my juniors at a competitive college-prep high school showed me the feedback he had received on a 10-page expository paper on the topic of Chinese immigration, for which he had received a C-. I reviewed the feedback and was stymied for a good twenty minutes as to how to explain why it was so far off target. His teacher had misidentified run-on sentences and sentence fragments throughout, giving the student confusing and inaccurate grammatical feedback with no commentary on the content or flow of ideas. The feedback seemed desultory and baffling until the student explained that the paper had been graded not by a person but by a software program, which I can only conclude was written by a computer programmer with no grasp of English grammar.

    This is a student who struggles with all aspects of writing: mechanics, thesis development, saliency, transitions from one idea to the next, etc., and who would benefit from writing instruction from an actual human – presumably what the $20,000-a-year price tag for his private education is intended for. From my student’s perspective, he spent two months working diligently on a research paper that no human bothered to read, a demotivating exercise in futility that became the ultimate lesson of the assignment. How can we expect our students to extend effort toward their own educations when we as educators aren’t willing to meet them halfway?

    • April 10, 2013 at 1:24 pm

      Holy shit can you write a blog post about this?

    • Artie Prendergast-Smith
      April 10, 2013 at 3:50 pm

      Following up on your closing question, the next logical step will be students using software to write their papers (à la this beauty), and instructors using software to grade them. The whole process will produce mountains of gibberish, but nobody will care, because everyone will get an A!

    • Nathanael
      April 14, 2013 at 9:17 pm

      You need to publish the name of the prep school so as to destroy its reputation. This is *important*.

  9. Constantine Costes
    April 10, 2013 at 10:49 pm

    I am a little disappointed in the Onion, which, as far as I know, failed to predict this trend.

  10. April 11, 2013 at 9:46 am

    This exact idea has been recognised in social media as well, except Businessweek predicts that social media is going in the exact opposite direction. Should be interesting to see where these two paradigms collide.

    http://www.businessweek.com/articles/2013-02-07/snapchat-and-the-erasable-future-of-social-media

  11. GSo
    April 11, 2013 at 9:50 am

    Your only hope for a fresh start is a witness protection program.

  12. Corporate Serf
    April 12, 2013 at 10:57 am

    Beginning of the end for universities and large bureaucratic companies

  13. April 13, 2013 at 8:11 pm

    Have you seen this post arguing in favor of machine grading of essays? I’d like to hear what you think of the arguments presented there:
    http://mfeldstein.com/si-ways-the-edx-announcement-gets-automated-essay-grading-wrong/

  14. Nathanael
    April 14, 2013 at 9:25 pm

    In the short run, the result is as you describe.
    In the medium run, people hiring — and actually pretty much everyone — will learn to ignore or manipulate the garbage in the database, as it gives them a competitive advantage to do so. However, this leads to a system where hiring is based on nepotism, fraud, and bias.

    So, I think the results will be terrible, just not in the way you expect. People will start discounting the contents of the database *extremely quickly*, but the result will be a reversion to “You get jobs by knowing someone who knows how to override the database”.

    • Nathanael
      April 14, 2013 at 9:27 pm

      Please note that I’ve dealt with lots and lots of companies that have some sort of large computer database which supposedly has lots of information in it. Over the years, the general policy has changed: nowadays, if you say “Look, the data in that is wrong. Fix it or else,” *bottom-rung employees* are authorized to just change it. It’s *assumed* that the database is full of errors!

      Think about how useless this renders all that money spent on the databases. Also think who this situation benefits most: the fraud artists.

  15. abababer
    April 16, 2013 at 6:25 pm

    @mathbabe: As far as creepy software goes, http://www.yesware.com/ is totally worth a look.
