What you tweet could cost you

October 24, 2016 Cathy O'Neil, mathbabe

Yesterday I came across this Reuters article by Brenna Hughes Neghaiwi:

In insurance Big Data could lower rates for optimistic tweeters.

The title employs a common marketing rule: frame bad news as good news. Instead of saying “Big data shifts costs to pessimistic tweeters,” mention only those who will benefit.

So, what’s going on? In the usual big data fashion, it’s not entirely clear. But the idea is that your future health will be predicted from your tweets, and your premium will go up if the prediction is bad news. From the article:

In a study cited by the Swiss group last month, researchers found Twitter data alone a more reliable predictor of heart disease than all standard health and socioeconomic measures combined.

Geographic regions represented by particularly high use of negative-emotion and expletive words corresponded to higher occurrences of fatal heart disease in those communities.

To be clear, no insurance company is currently using Twitter data against anyone (or for anyone), at least not openly. The idea outlined in the article is that people could set up accounts to share their personal data with companies such as insurers, as a way of showing off their healthiness. They’d use a company like digi.me to do this. Monetize your data and so on. Of course, that would be the case at the beginning, to train the algorithm. Later on, who knows.

While we’re on the topic of Twitter, I don’t know if I’ve had time to blog about University of Maryland Computer Science Professor Jennifer Golbeck. I met Professor Golbeck in D.C. last month when she interviewed me at Busboys and Poets. During that discussion she mentioned her paper, Predicting Personality from Social Media Text, in which she inferred personality traits from Twitter data. Here’s the abstract:

This paper replicates text-based Big Five personality score predictions generated by the Receptiviti API—a tool built on and tied to the popular psycholinguistic analysis tool Linguistic Inquiry and Word Count (LIWC). We use four social media datasets with posts and personality scores for nearly 9,000 users to determine the accuracy of the Receptiviti predictions. We found Mean Absolute Error rates in the 15–30% range, which is a higher error rate than other personality prediction algorithms in the literature. Preliminary analysis suggests relative scores between groups of subjects may be maintained, which may be sufficient for many applications.
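For readers who haven't met the metric, here is a minimal sketch of what a Mean Absolute Error in that range could look like. I'm assuming scores on a 0 to 100 scale and an error expressed as a percentage of that range; the abstract doesn't spell out the normalization, and the function name and numbers below are made up for illustration, not taken from the paper.

```python
def mean_absolute_error_pct(predicted, actual, scale_min=0.0, scale_max=100.0):
    """Average absolute gap between predicted and actual scores,
    as a percentage of the score range (an assumed normalization)."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two equal-length, non-empty lists")
    total = sum(abs(p - a) for p, a in zip(predicted, actual))
    mae = total / len(predicted)
    return 100.0 * mae / (scale_max - scale_min)


# Made-up scores for four hypothetical users, not data from the paper.
actual = [20.0, 40.0, 60.0, 80.0]
predicted = [40.0, 60.0, 80.0, 100.0]  # every prediction is 20 points too high

error_pct = mean_absolute_error_pct(predicted, actual)

# Check whether the predictions still rank the users in the same order.
indices = range(len(actual))
same_order = (sorted(indices, key=lambda i: predicted[i])
              == sorted(indices, key=lambda i: actual[i]))

print(error_pct, same_order)
```

In this toy case every individual score is off by a fifth of the scale, yet the users are still ranked in the right order, which is one way to read the abstract's remark that relative scores may be maintained.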

Here’s how the topic came up. I was mentioning Kyle Behm, a young man I wrote about in my book who was denied a job based on a “big data” personality test. The case is problematic. It could represent a violation of the Americans with Disabilities Act, and a lawsuit is pending.

What Professor Golbeck demonstrates with her research is that, in the future, employers won’t even need to notify applicants that their personalities are being scored at all; it could happen without their knowledge, through their social media posts and other culled information.

I’ll end with this quote from Christian Mumenthaler, CEO of Swiss Re, one of the insurance companies dabbling in Twitter data:

I personally would be cautious what I publish on the internet.


Comments (7)

mathematrucker

October 24, 2016 at 9:46 am

The Reuters title may be more than half full: a new branch of online reputation management could spring up aimed at minimizing insurance premiums. The mere absence of tweets might not maximize savings.

medicalquackblog

October 24, 2016 at 10:57 am

Insurers really don’t need Twitter data; they are drowning in data and have more on file about you than you probably realize. There was an article in a Harvard publication talking about the $5 billion a year one subsidiary of United Healthcare makes selling data. Just under IMS, United Healthcare is the 2nd largest health data broker/seller in the US. They have around 300 subsidiaries that do a lot of different things, and those subs and their data get connected and a lot of that gets sold.

It’s not ethical. See how pharmacy benefit managers collect all kinds of data about you and even front-run you to the drug store; always ask for the cash price. The amount of behavior data insurers already collect about you is staggering, so ugly that people don’t even want to hear about it. Cigna and United Healthcare have class action suits coming up over using all their data to get you to pay more at the drug store. Look at what Quest labs is collecting and selling about you after buying/licensing software from a subsidiary of United Healthcare, too. By the way, your next Quest Lab bill will come from yet another subsidiary of United Healthcare. Read about medication adherence prediction scores: yup, you get a secret score every time you fill a prescription, and that’s all sold to your insurer too, or in the case of United it’s all in-house data, as they own their PBM, OptumRX.

http://ducknetweb.blogspot.com/2016/10/cigna-united-healthcare-face-class_16.html

When it comes to predictions, United is the king of risk assessments, and the incest of the company with HHS/CMS is long-standing; other insurers don’t get the red carpet that is rolled out to United. That comes from a former CMS modeling analyst who had to live, eat and breathe it until recent retirement. They may not have room to do a lot of work with Twitter data, as they have lakes of data right now.

Lars

October 24, 2016 at 11:12 am

“What Professor Golbeck demonstrates with her research is that, in the future, the employers won’t even need to notify applicants that their personalities are being scored at all, it could happen without their knowledge, ”

I thought she demonstrated that she doesn’t know what real scientific evidence is.

rsterbal

October 24, 2016 at 12:33 pm

I doubt the claims for Twitter have a significant n value or reproducibility, which makes me wonder why it was published without those warnings.

- Lars
  
  October 25, 2016 at 10:26 am
  
  Because there is no meaningful peer review when it comes to this sort of thing.
  
  It is not science in any meaningful sense of the word. By and large, the people doing it don’t even know what real science is.
  
  A lot of it is just complete BS.
  
  As a scientist, one of the things that I find most disturbing about many “big data” applications and the people who push them is that they give real science and real scientists a very bad name.
  
Allen K.

October 24, 2016 at 2:10 pm

I imagine “at least not secretly” should be “at least not openly”?

- Cathy O'Neil, mathbabe
  
  October 24, 2016 at 2:12 pm
  
  yes thanks
  