Archive for the ‘rant’ Category

Workplace Personality Tests: a Cynical View

There’s a frightening article in the Wall Street Journal by Lauren Weber about personality tests people are now forced to take to get shitty jobs in customer calling centers and the like. Some statistics from the article include: 8 out of 10 of the top private employers use such tests, and 57% of employers overall in 2013, a steep rise from previous years.

The questions are meant to be ambiguous so you can’t game them if you are an applicant. For example, yes or no: “I have never understood why some people find abstract art appealing.”

At the end of the test, you get a red light, a yellow light, or a green light. Red lighted people never get an interview, and yellow lighted may or may not. Companies cited in the article use the tests to disqualify more than half their applicants without ever talking to them in person.

The argument for these tests is that, after deploying them, turnover has gone down by 25% since 2000. The people who make and sell personality tests say this is because they’re controlling for personality type and “company fit.”

I have another theory about why people no longer leave shitty jobs, though. First of all, the recession has made people’s economic lives extremely precarious. Nobody wants to lose a job. Second of all, now that everyone is using arbitrary personality tests, the power of the worker to walk off the job and get another job the next week has gone down. By the way, the usage of personality tests seems to correlate with a longer waiting period between applying and starting work, so there’s that disincentive as well.

Workplace personality tests are nothing more than voodoo management tools that empower employers. In fact I’ve compared them in the past to modern day phrenology, and I haven’t seen any reason to change my mind since then. The real “metric of success” for these models is the fact that employers who use them can fire a good portion of their HR teams.

Categories: data science, modeling, rant

Fingers crossed – book coming out next May

As it turns out, it takes a while to write a book, and then another few months to publish it.

I’m very excited today to tentatively announce that my book, which is tentatively entitled Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy, will be published in May 2016, in time to appear on summer reading lists and well before the election.

Fuck yeah! I’m so excited.

p.s. Fight for 15 is happening now.

Predatory credit score-based insurance fees

I’ve been looking into who uses credit scores – FICO scores or other alternative scores – and I’ve found that the insurance industry is a major user.

Homeowners insurance rates, for example, vary wildly by state depending on what kind of credit score you have, often more than doubling for people with poor credit versus people with excellent credit. This is in spite of the fact that homeowners insurance applies not to the payments of mortgages but rather to the contents of an apartment or home.

Similarly, auto insurance rates vary by credit score, even though someone with a poor credit score isn’t obviously a bad driver. For example, in Maryland, people with bad credit scores can be charged 40% more just for having bad credit scores.

Statistics like this make me wonder, how much of this price discrimination comes from the insurance companies trying to understand and account for actual risk, and how much comes from their understanding that poorer people have fewer options and will simply pay predatory rates?

And just in case you’re a believer in free markets and fair competition, and think such predatory behavior would be whisked away in a competitive market, insurance companies actually target people who don’t shop around and charge them more. In other words, it’s not a free market if not everyone actually has good information.

Tell me if you have more examples like this, I’m a collector!

Everyone hates college administrators

If you were wondering why I didn’t blog yesterday, which you probably weren’t (confession: I don’t read other people’s blogs and I don’t listen to any podcasts. So I would never, ever ask anyone to read my blog or listen to my podcast), it was because I was completely confused and irritated by this NYTimes opinion piece on the rising cost of college, written by University of Colorado Law Professor Paul Campos.

I really think the Times needs to either have footnotes or hyperlinks in their opinion pieces, because this guy was playing so fast and loose with his numbers that I had really no idea what he was talking about most of the time. That’s saying something considering that this, the cost of college and its causes, is something I have spent many hours thinking about and researching.

So what happened was, I didn’t have time to completely formulate my opposition to why his reasoning was muddled and confusing. I spent way too much time trying to figure out where he was getting his data. Waste of time.

Good news, though, my Slate Money co-host Jordan Weissmann has done all that work for us, in his piece aptly entitled The New York Times Offers One of the Worst Explanations You’ll Read of Why College Is So Expensive. Who says procrastination doesn’t work?

As usual, if you’ve ever listened to my podcast (and this isn’t a request for you to do so!), I don’t agree completely with Jordan. However, my delta of agreement with Jordan is very manageable compared to the delta of disagreement I had with Campos. Basically I would quibble with laying any of the blame at the feet of instructors, but since he barely does that, let’s just go with his awesome take-down.

Take-down of what? Well, Campos basically hates college administrators, and pretends there are no other problems in the world except them. It’s a mistake that he doesn’t have to make.

I mean really, who doesn’t hate college administrators? As a former college administrator myself, I know it’s universal; I certainly hated myself the entire time.

But that doesn’t mean there are no other factors at all. Reduced public money for colleges is in fact a huge problem, especially when you pair it with the increased federal aid money going to students at corrupt for-profit colleges. Corinthian obtained $1.4 billion in federal grant and loan dollars in 2010 alone, more than the 10 University of California campuses combined for that same year. This system is in terrible need of repair.

Instead of simply hating on college admin, or rather, in addition to hating on admin, can we start thinking about an alternative no-frills state college system that is truly affordable and gives honest and basic instructions without trying to compete on the US News & World Reports stage?

Categories: education, rant

The arbitrary punishment of New York teacher evaluations

The Value-Added Model for teachers (VAM), currently in use all over the country, is a terrible scoring system, as I’ve described before. It is approximately a random number generator.

Even so, it’s still in use, mostly because it wields power over the teacher unions. Let me explain why I say this.

Cuomo’s new budget negotiations with the teachers’ union came up with the following rules around teacher tenure, as I understand them (readers, correct me if I’m wrong):

  1. It will take at least 4 years to get tenure,
  2. A teacher must get at least 3 “effective” or “highly effective” ratings in those four years,
  3. A teacher’s yearly rating depends directly on their VAM score: they are not allowed to get an “effective” or “highly effective” rating if their VAM score comes out as “ineffective.”

Now, I’m ignoring everything else about the system, because I want to distill the effect of VAM.

Let’s think through the math of how likely it is that you’d be denied tenure based only on this random number generator. We will assume only that you otherwise get good ratings from your principal and outside observations. Indeed, Cuomo’s big complaint is that 98% of teachers get good ratings, so this is a safe assumption.

My analysis depends on what qualifies as an “ineffective” VAM score, i.e. what the cutoff is. For now, let’s assume that 30% of teachers receive “ineffective” in a given year, because it has to be some number. Later on we’ll see how things change if that assumption is changed.

That means that 30% of the time, a teacher will not be able to receive an “effective” score, no matter how else they behave, and no matter what their principals or outside observations report for a given year.

Think of it as a biased coin flip, and 30% of the time – for any teacher and for any year – it lands on “ineffective”, and 70% of the time it lands on “effective.” We will ignore the other categories because they don’t matter.

How about if you look over a four year period? To avoid getting any “ineffective” coin flips, you’d need to get “effective” every year, which would happen 0.70^4 = 24% of the time. In other words, 76% of the time, you’d get at least one “ineffective” rating just by chance. 

But remember, you don’t need to get an “effective” rating for all four years; you are allowed one “ineffective” rating. The chance of exactly one “ineffective” coin flip and three “effective” flips is 4 (1-0.70) 0.70^3 = 41%.

Adding those two scenarios together, it means that 65% of the time, over a four year period, you’d get sufficient VAM scores to receive tenure. But it also means that 35% of the time you wouldn’t, through no fault of your own.
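For anyone who wants to check that arithmetic, here is a minimal Python sketch. The names are made up for illustration, and the 30% “ineffective” rate is the assumption from above, not a known cutoff.

```python
from math import comb

P_EFFECTIVE = 0.70  # assumed: 30% of teachers land on "ineffective" in a given year
YEARS = 4           # length of the tenure window


def prob_exactly(k_ineffective, p_effective=P_EFFECTIVE, years=YEARS):
    """Chance of exactly k "ineffective" coin flips over the window (binomial)."""
    q = 1 - p_effective
    return comb(years, k_ineffective) * q**k_ineffective * p_effective**(years - k_ineffective)


none_bad = prob_exactly(0)        # "effective" in all four years
one_bad = prob_exactly(1)         # exactly one "ineffective" year
sufficient = none_bad + one_bad   # at most one "ineffective" year

print(f"no ineffective years: {none_bad:.0%}")
print(f"exactly one:          {one_bad:.0%}")
print(f"enough for tenure:    {sufficient:.0%}")
print(f"denied by chance:     {1 - sufficient:.0%}")
```

Run as written, this reproduces the 24%, 41%, 65% and 35% figures.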

This is the political power of a terrible scoring system. More than a third of teachers are being arbitrarily chosen to be punished by this opaque and unaccountable test.

Let’s go back to my assumption, that 30% of teachers are deemed “ineffective.” Maybe I got this wrong. It directly impacts my numbers above. If the overall probability of being deemed “effective” is p, then the overall chance of getting sufficient VAM scores will be p^4 + 4 p^3 (1-p).

So if I got it totally wrong, and 98% of teachers are described as effective by the VAM model, this would mean almost all teachers get sufficient VAM scores.
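To see how sensitive the result is to the cutoff, the formula can be tabulated for a few values of p. This is only a sketch: the function name is mine, and the values of p are hypothetical, since I don’t know the real one.

```python
def tenure_chance(p):
    """p^4 + 4 p^3 (1 - p): chance of at most one "ineffective" flip in four years."""
    return p**4 + 4 * p**3 * (1 - p)


# Hypothetical shares of teachers the VAM calls "effective" in a given year.
for p in (0.70, 0.80, 0.90, 0.98):
    ok = tenure_chance(p)
    print(f"p = {p:.2f}: sufficient {ok:.1%}, arbitrarily denied {1 - ok:.1%}")
```

At p = 0.98 the chance of being denied by the coin flips alone falls to roughly 0.2%, and it climbs quickly as p drops.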

On the other hand, remember that the reason VAM is being pushed so hard by people is that they don’t like it when evaluation systems think too many people are effective. In fact, they’d rather see arbitrary and random evaluation than see most people get through unscathed.

In other words, it is definitely more than 2% of teachers that are called “ineffective,” but I don’t know the true cutoff.

If anyone knows the true cutoff, please tell me so I can compute anew the percentage of teachers that are arbitrarily being kept from tenure.

Categories: education, rant, statistics

A critique of a review of a book by Bruce Schneier

I haven’t yet read Bruce Schneier’s new book, Data and Goliath: The Hidden Battles To Collect Your Data and Control Your World. I plan to in the coming days, while I’m traveling with my kids for spring break.

Even so, I already feel capable of critiquing this review of his book (hat tip Jordan Ellenberg), written by Columbia Business School Professor and Investment Banker Jonathan Knee. You see, I’m writing a book myself on big data, so I feel like I understand many of the issues intimately.

The review starts out flattering, but then it hits this turn:

When it comes to his specific policy recommendations, however, Mr. Schneier becomes significantly less compelling. And the underlying philosophy that emerges — once he has dispensed with all pretense of an evenhanded presentation of the issues — seems actually subversive of the very democratic principles that he claims animates his mission.

That’s a pretty hefty charge. Let’s take a look into Knee’s evidence that Schneier wants to subvert democratic principles.


First, he complains that Schneier wants the government to stop collecting and mining massive amounts of data in its search for terrorists. Knee thinks this is dumb because it would be great to have lots of data on the “bad guys” once we catch them.

Any time someone uses the phrase “bad guys,” it makes me wince.

But putting that aside, Knee is either ignorant of or is completely ignoring what mass surveillance and data dredging actually create: the false positives, the time and money and attention, not to mention the potential for misuse and hacking. Knee’s opinion is simply that we normal citizens, Schneier included, just don’t know enough to have an opinion on whether it works, in spite of Schneier knowing Snowden pretty well.

It’s just like waterboarding – Knee says – we can’t be sure it isn’t a great fucking idea.

Wait, before we move on, who is more pro-democracy, the guy who wants to stop totalitarian social control methods, or the guy who wants to leave it to the opaque authorities?

Corporate Data Collection

Here’s where Knee really gets lost in Schneier’s logic, because – get this – Schneier wants corporate collection and sale of consumer data to stop. The nerve. As Knee says:

Mr. Schneier promotes no less than a fundamental reshaping of the media and technology landscape. Companies with access to large amounts of personal data would be “automatically classified as fiduciaries” and subject to “special legal restrictions and protections.”

That these limits would render illegal most current business models — under which consumers exchange enhanced access by advertisers for free services – does not seem to bother Mr. Schneier.

I can’t help but think that Knee cannot understand any argument that would threaten the business world as he knows it. After all, he is a business professor and an investment banker. Things seem pretty well worked out when you live in such an environment.

By Knee’s logic, even if the current business model is subverting democracy – which I also argue in my book – we shouldn’t tamper with it because it’s a business model.

The way Knee paints Schneier as anti-democratic is by using the classic fallacy in big data which I wrote about here:

Although professing to be primarily preoccupied with respect of individual autonomy, the fact that Americans as a group apparently don’t feel the same way as he does about privacy appears to have little impact on the author’s radical regulatory agenda. He actually blames “the media” for the failure of his positions to attract more popular support.

Quick summary: Americans as a group do not feel this way because they do not understand what they are trading when they trade their privacy. Commercial and governmental interests, meanwhile, are all united in convincing Americans not to think too hard about it. There are very few people devoting themselves to alerting people to the dark side of big data, and Schneier is one of them. It is a patriotic act.

Also, yes Professor Knee, “the media” generally speaking writes down whatever a marketer in the big data world says is true. There are wonderful exceptions, of course.

So, here’s a question for Knee. What if you found out about a threat on the citizenry, and wanted to put a stop to it? You might write a book and explain the threat; the fact that not everyone already agrees with you wouldn’t make your book anti-democratic, would it?


The rest of the review basically boils down to, “you don’t understand the teachings of the Reverend Dr. Martin Luther King Junior like I do.”

Do you know about Godwin’s law, which says that as soon as someone invokes the Nazis in an argument about anything, they’ve lost the argument?

I feel like we need another, similar rule, which says, if you’re invoking MLK and claiming the other person is misinterpreting him while you have him nailed, then you’ve lost the argument.

Creepy big data health models

There’s an excellent Wall Street Journal article by Joseph Walker, entitled Can a Smartphone Tell if You’re Depressed? that describes a lot of creepy new big data projects going on now in healthcare, in partnership with hospitals and insurance companies.

Some of the models come in the form of apps, created and managed by private, third-party companies that try to predict depression in, for example, postpartum women. They don’t disclose what they are doing to many of the women, or the extent of what they’re doing, according to the article. They own the data they’ve collected at the end of the day and, presumably, can sell it to anyone interested in whether a woman is depressed. For example, future employers. To be clear, this data is generally not covered by HIPAA.

Perhaps the creepiest example is a voice analysis model:

Nurses employed by Aetna have used voice-analysis software since 2012 to detect signs of depression during calls with customers who receive short-term disability benefits because of injury or illness. The software looks for patterns in the pace and tone of voices that can predict “whether the person is engaged with activities like physical therapy or taking the right kinds of medications,” Michael Palmer, Aetna’s chief innovation and digital officer, says.

Patients aren’t informed that their voices are being analyzed, Tammy Arnold, an Aetna spokeswoman, says. The company tells patients the calls are being “recorded for quality,” she says.

“There is concern that with more detailed notification, a member may alter his or her responses or tone (intentionally or unintentionally) in an effort to influence the tool or just in anticipation of the tool,” Ms. Arnold said in an email.

In other words, in the name of “fear of gaming the model,” we are not disclosing the creepy methods we are using. Also, considering that the targets of this model are receiving disability benefits, I’m wondering if the real goal is to catch someone off their meds and disqualify them for further benefits or something along those lines. Since they don’t know they are being modeled, they will never know.

Conclusion: we need more regulation around big data in healthcare.

Categories: data journalism, modeling, rant
