What you tweet could cost you
Yesterday I came across this Reuters article by
The title employs a common marketing rule: frame bad news as good news. Instead of saying “Big data shifts costs to pessimistic tweeters,” mention only those who will benefit.
So, what’s going on? In the usual big data fashion, it’s not entirely clear. But the idea is that your future health will be predicted from your tweets, and your premium will go up if the prediction is bad news. From the article:
In a study cited by the Swiss group last month, researchers found Twitter data alone a more reliable predictor of heart disease than all standard health and socioeconomic measures combined.
Geographic regions represented by particularly high use of negative-emotion and expletive words corresponded to higher occurrences of fatal heart disease in those communities.
To be clear, no insurance company is currently using Twitter data against anyone (or for anyone), at least not openly. The idea outlined in the article is that people could set up accounts to share their personal data with companies, such as insurers, as a way of showing off their healthiness. They’d be using a company like digi.me to do this. Monetize your data and so on. Of course, that would be the case at the beginning, to train the algorithm. Later on, who knows?
While we’re on the topic of Twitter, I don’t know if I’ve had time to blog about University of Maryland Computer Science Professor Jennifer Golbeck. I met Professor Golbeck in D.C. last month when she interviewed me at Busboys and Poets. During that discussion she mentioned her paper, Predicting Personality from Social Media Text, in which she inferred personality traits from Twitter data. Here’s the abstract:
This paper replicates text-based Big Five personality score predictions generated by the Receptiviti API—a tool built on and tied to the popular psycholinguistic analysis tool Linguistic Inquiry and Word Count (LIWC). We use four social media datasets with posts and personality scores for nearly 9,000 users to determine the accuracy of the Receptiviti predictions. We found Mean Absolute Error rates in the 15–30% range, which is a higher error rate than other personality prediction algorithms in the literature. Preliminary analysis suggests relative scores between groups of subjects may be maintained, which may be sufficient for many applications.
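To make that error figure concrete: mean absolute error is just the average gap between predicted and actual scores. Here is a minimal sketch in Python. The function name and the numbers are made up for illustration and are not from the paper, and I'm assuming scores on a 0 to 100 scale with the error expressed as a percentage of that range.

```python
def mean_absolute_error_pct(predicted, actual, scale_min=0.0, scale_max=100.0):
    """Mean absolute error, expressed as a percentage of the score range."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    # Average the absolute gap between each prediction and the true score.
    total_gap = sum(abs(p - a) for p, a in zip(predicted, actual))
    mae = total_gap / len(predicted)
    # Rescale so the result is comparable across differently sized scales.
    return 100.0 * mae / (scale_max - scale_min)


# Hypothetical extraversion scores for five users (invented numbers).
predicted = [55.0, 40.0, 70.0, 30.0, 65.0]
actual = [35.0, 60.0, 50.0, 45.0, 90.0]

error_pct = mean_absolute_error_pct(predicted, actual)
print(f"MAE: {error_pct:.1f}% of the scale")
```

In this toy example the predictions are off by 20 points on average, which sits inside the 15–30% range the abstract reports. On a 100-point scale, a miss of that size can separate an average score from a fairly extreme one.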
Here’s how the topic came up. I was mentioning Kyle Behm, a young man I wrote about in my book who was denied a job based on a “big data” personality test. The case is problematic. It could represent a violation of the Americans with Disabilities Act, and a lawsuit filed in court is pending.
What Professor Golbeck demonstrates with her research is that, in the future, employers won’t even need to notify applicants that their personalities are being scored. It could happen without their knowledge, through their social media posts and other culled information.
I’ll end with this quote from Christian Mumenthaler, CEO of Swiss Re, one of the insurance companies dabbling in Twitter data:
I personally would be cautious what I publish on the internet.