In NYTimes’ Room for Debate

This morning I’m in the New York Times, having written a short opinion piece on the following Facebook-centered theme:

How to Stop the Spread of Fake News

My actual opinion is entitled Social Media Companies Like Facebook Need to Hire Human Editors.

screen-shot-2016-11-22-at-6-36-57-am

Tell me what you think!

Categories: Uncategorized

Miami Book Festival

I had a great time this weekend in Miami, attending the delightful Miami Book Festival with other longlisters (and winners!) of the National Book Award. We each read for about 5 minutes. Here’s a picture of me perched on the edge of the stage Saturday afternoon, getting ready to read, with many cool non-fiction writers:

miamibookfair

 

Before my reading, the wonderful Karan Mahajan brought me to a graffiti art area called Wynwood Walls, and we were amazed by the spray-painted walls:

[four photos of the spray-painted walls]

I was supposed to go to a party after that but I made a detour to South Beach, hanging out with the amazing Jonathan Rabb at the Clevelander:

screen-shot-2016-11-21-at-12-22-48-pm

clevelander2585.jpg

After about 3 mojitos and many many performance artists I fell asleep at about 8pm.

Conclusion: I’m not cool enough for cool things like Miami, but I had a great time.

Facebook should hire me to audit their algorithm

There’s lots of post-election talk that Facebook played a large part in the election, despite Zuckerberg’s denials. Here are some of the various theories going around:

  1. People shared fake news on their walls, and sometimes Facebook’s “trending algorithm” also messed up and shared fake news. This fake news was created by Russia or by Eastern European teenagers and it distracts and confuses people and goes viral.
  2. Political advertisements had deep influence through Facebook, and it worked for Trump even better than it worked for Clinton.
  3. The echo chamber effect, called the “filter bubble,” made people hyper-partisan and the election became all about personality and conspiracy theories instead of actual policy stances. This has been confirmed by a recent experiment on swapping feeds.

If you ask me, I think “all of the above” is probably most accurate. The filter bubble effect is the underlying problem, and at its most extreme you see fake news and conspiracy theories, and a lot of middle ground of just plain misleading, decontextualized headlines that have a cumulative effect on your brain.

Here’s a theory I have about what’s happening and how we can stop it. I will call it “engagement proxy madness.”

It starts with human weakness. People might claim they want “real news” but they are actually very likely to click on garbage gossip rags with pictures of Kardashians or “like” memes that appeal to their already held beliefs.

From the perspective of Facebook, clicks and likes are proxies for interest. Since we click on crap so much, Facebook (and the rest of the online ecosystem) interprets that as a deep interest in crap, even if it’s actually simply exposing a weakness we wish we didn’t have.

Imagine you’re trying to cut down on sugar, because you’re pre-diabetic, but there are M&M’s literally everywhere you look, and every time you stress-eat an M&M, invisible nerds exclaim, “Aha! She actually wants M&M’s!” That’s what I’m talking about, but where you replace M&M’s with listicles.

This human weakness now combines with technological laziness. Since Facebook doesn’t have the interest, commercially or otherwise, to dig deeper into what people really want in a longer-term sense, our Facebook environments eventually get filled with the media equivalent of junk food.

Also, since Facebook dominates the media advertising world, it creates feedback loops in which newspapers are stuck creating junky clickbait stories so they can beg for crumbs of advertising revenue.

This is really a very old story, about how imperfect proxies, combined with influential models, lead to distortions that undermine the original goal. And here the goal was, originally, pretty good: to give people a Facebook feed filled with stuff they’d actually like to see. Instead they’re subjected to immature rants and conspiracy theories.
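To make the feedback loop concrete, here is a toy sketch. It is not Facebook's algorithm, which is private; it only assumes that each round the feed is re-weighted in proportion to the clicks each kind of content received. The function names, the click-through rates, and the starting share are all made-up illustrative values.

```python
def next_junk_share(junk_share, junk_ctr=0.30, news_ctr=0.10):
    """Re-weight the feed in proportion to clicks (a hypothetical rule).

    junk_share: fraction of the feed that is junk this round.
    junk_ctr / news_ctr: assumed click-through rates, chosen for illustration.
    """
    junk_clicks = junk_share * junk_ctr
    news_clicks = (1.0 - junk_share) * news_ctr
    return junk_clicks / (junk_clicks + news_clicks)


def simulate(initial_junk_share=0.20, rounds=5):
    """Return the junk share of the feed after each round of re-weighting."""
    shares = [initial_junk_share]
    for _ in range(rounds):
        shares.append(next_junk_share(shares[-1]))
    return shares


if __name__ == "__main__":
    for round_number, share in enumerate(simulate()):
        print(f"round {round_number}: junk is {share:.0%} of the feed")
```

In this toy model the junk share only ever goes up, because the proxy (clicks) rewards whatever gets clicked most, regardless of what people would say they want.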

 

Of course, maybe I’m wrong. I have very little evidence that the above story is true beyond my experience of Facebook, which is increasingly echo chamber-y, and my observation of hyper-partisanship overall. It’s possible this was entirely caused by something else. I’d keep an open mind if there were evidence that Facebook’s influence on this system is minor.

Unfortunately, Facebook’s data is private and so I cannot audit their algorithm for the effect as an interested observer. That’s why I’d like to be brought in as an outside auditor. The first step in addressing this problem is measuring it.

I already have a company, called ORCAA, which is set up for exactly this: auditing algorithms and quantitatively measuring effects. I’d love Facebook to be my first client.

As for how to address this problem if we conclude there is one: we improve the proxies.

Guest post: we should not get out-imagined again

This is an anonymous guest post.

I am a member of Cathy’s Occupy group and, like a lot of people, I had a really bad week. By Sunday I thought I was feeling better. It seemed some of the sadness and shock had passed, and I was developing a resolve about how to move forward.

Then I had a really weird experience Sunday night. I came into the City to attend a black-tie event at the Waldorf in support of an organization I really like, even if it raises a lot of its money from the .001%.

As a labor lawyer and Occupier, it is not my crowd. But I usually find the event amusing. They serve sushi for cocktails and pour Maker’s Mark into wine glasses, like it is wine.

I walked in and immediately felt strange, actually felt really sick. It was like being in an historical re-enactment, precisely because everything was the same. I went for the Maker’s Mark early. It only made the disembodied feeling worse. Nothing, nothing, had changed from years past. The beautiful young women in the exquisite dresses were the same. The conversation among the supremely confident looking men seemed the same.

I was not the same.

I got another drink and went to my table. Then, like everyone else, I rose for the National Anthem.

I started feeling super weird though, because everyone else was carrying on so completely normally. I thought of kneeling like Colin Kaepernick, but figured my wife would kill me. Then came “My Country, ’Tis of Thee” and the room just started swirling.

I sat down and took a breath. They started introducing the first honoree: Hank Greenberg. Yes, the guy in charge of AIG until shortly before it blew up the world economy. The guy who sued the government alleging that the $182 billion bailout his company got was on inadequately advantageous terms. In other words, one of the guys most responsible for elite behaviors that led to this most awful eruption of fear, resentment and hate, that led to last Tuesday.

And all I heard was the introduction of him as a “Great American”.

I walked briskly out of the room, then ran fast through the hotel halls and down Park Ave to my car. Where I sat, for I don’t know how long, and just cried. Cried like a fucking baby. Cried for having to look out my car window at what seemed now like an unfamiliar place, cried for the kid in the Bronx who doesn’t know yet about the threat people think he poses to “law and order,” cried for the family in some far off country that doesn’t know about the charade war coming their way to assuage an angry people losing its collective mind over broken empty promises, cried for all the people who, after I’m gone, will live on a chaotic planet my purposefully ignorant country cooked. Damn, I cried hard.

Then I stopped. And I felt a lot better.

It feels so weird to share this publicly, because it is really embarrassing. I really did all that. But I figured out what it was and wanted to say it out loud. It is moral injury. It is real. It hurts. It can make you cry. Don’t try to pretend otherwise. But also take solace that the only way to treat it is to do good anyway.

There are going to be a lot of opportunities. But we should not get out-imagined again, as I surely was. We should shoot really high this time, be really creative about the good we can do.

For example, if you think our national government will remain awful for a long time, you are probably right. So think locally and globally. What stops us from creating real “sanctuary cities,” ones that are sanctuaries in a much wider sense, to all of the people he has declared hated or who otherwise just reject him? And why can’t we make contact, seek advice, and give aid to the 99.9% of the world that is far more affected by this than us, and got no vote? Again, they are the ones who will otherwise get bombed when he starts dumb wars to distract from his mindless policies; and get drowned and fried when he turns up the temperature on the already sizzling planet.

And remember, 2017 is an election year in New York City. Yes, there is another election coming up which it would feel really good to WIN. Let’s demand better. The Left has never said “think nationally” … no, it has always been Local and Global.

Feel the pain; it is real; cry; and then gather a stronger opposing force to treat it by occupying the spaces that remain up for the taking.

WMD press from Germany, Israel, and the Netherlands

November 15, 2016

The Netherlands’ Vrij Nederland, written by Gerard Janssen:

dutch-wmd-press

dutch2

dutch-3

Germany’s Die Tageszeitung, written by Ingo Arzt:

German WMD article INGO ARZT.png

Israel’s Calcalist, written by Uri Pasovsky:

israel-1

israel-2

israel-3

israel-4

israel-5

israel-6

Guest post: the foreclosure vote

This is a guest post by Tom Adams, who spent over 20 years in the securitization business and now works as an attorney and consultant and expert witness on MBS, CDO and securitization related issues.

I don’t expect anyone to really come up with the perfect explanation for why Clinton lost and Trump won the presidential election. But I do spend some time looking at these maps:

realtytracforeclosure

screen-shot-2016-11-14-at-6-34-06-am

The first map is from RealtyTrac, and indicates the states with the largest foreclosure inventory in 2012. The second is a map of the key battleground states. In 2008 and 2012, Obama won these states. In 2016 Clinton lost them. There are a lot of similarities between those two maps.

Even in the best economic environment, residential mortgage foreclosure is a long, messy process. The massive wave of foreclosures that hit these regions after the financial crisis had enormous consequences economically. They also had a tremendous, painful impact on the families and neighborhoods of the people affected, directly and indirectly, by the foreclosures.

A rise in the number of suicides has been tied to the wave of foreclosures. Large swaths of neighborhoods were plagued by falling property values, blighted abandoned homes and a sense of uncertainty and, perhaps, doom. I often think about the effect the foreclosure crisis had on the children of affected families and the impact on children of watching families, neighbors and classmates going through the painful process.

I was involved, to a small degree, with homeowners, activists and lawmakers that tried to deal with the issues and problems in the foreclosure crisis, some of which is documented in David Dayen’s excellent new book, “Chain of Title”. As Dayen documents, the government response to the issues was ultimately terribly unsatisfying and, at best, had the effect of sweeping the issue under the carpet.

The consequences of the government’s response played out in this presidential election.

Clinton was aware of the problems caused by the wave of foreclosures: last fall the NY Times reported that the campaign was frustrated that the crisis had displaced so many homeowners that their database of voters was disrupted. Perhaps this is why the campaign’s get-out-the-vote efforts in Michigan, Wisconsin, Minnesota and other states were much less effective than the campaign had hoped. Some reports were that up to [25%] of the voters the campaign contacted were actually Republicans or potential Trump voters. In fairness, Clinton was probably more concerned about the economic plight of affected homeowners and communities than she was about the technological issues it caused, but that was hardly the dominant campaign message.

How much of an impact would a compassionate outreach have had on these neighborhoods? It’s also worth remembering that the people hit by the foreclosure crisis were generally middle class – prior to the crisis they owned homes, held jobs, were members of the community. Where were they by the time the 2016 election came around?

Certainly, it’s a complicated issue and made more complicated by the fact that the Obama Administration didn’t cover themselves in accolades during the mess. But what if she had said something like this while campaigning in the battleground states:

“While I appreciate the efforts of the Obama administration to address the foreclosure crisis, the Home Affordable Modification Program simply has not provided the relief needed by many families. That is why I strongly support the creation of an Office of the Homeowner Advocate to help struggling families who have been wrongly denied assistance, or who have had difficulties navigating the extremely stressful system of avoiding foreclosure. The Office of the Homeowner Advocate will not only give Vermonters a strong voice in the process, but it will identify ways to make the HAMP program work better,”

That, unfortunately, is a statement from Bernie Sanders, in 2010, rather than from Clinton (Sanders continued to make it a focus of his primary efforts in 2016 as well).

Or perhaps Clinton could have spoken out in support of the frustrated community groups that sought to participate in the HUD auctions of distressed loans, only to lose out time and again to hedge funds, many of which were run by bankers who were directly involved in the financial crisis.

Maybe she was reluctant to get too involved in the issue because she tried to talk about it back in the 2008 primary and ended up being tagged as too close to Wall Street. On several occasions in foreclosure states like Nevada, she seemed to cede the issue of the financial crisis to Sanders and focused her efforts on minority outreach instead. But in states like Florida, where many homeowners still remained underwater on their mortgages in 2016, the issue appeared to still be on the minds of voters on the eve of the election.

Of course, it’s easy to second guess the campaign now. I, and many others, spent hours over several years trying to get the Obama Administration or state governments to improve their response to the foreclosure crisis. By 2016, many of the people I worked with back in 2011 to 2013 on housing issues were exhausted and frustrated. I can only imagine how the people living with the foreclosure crisis must have felt.

Still, a few thousand votes in three key states would have been enough to change the outcome of the election. And when you compare these maps, it’s hard not to see the lost opportunities.

The Models Were Telling Us Trump Could Win

This is a post by Eugene Stern, originally posted on his blog sensemadehere.wordpress.com.

Nate Silver got the election right.

Modeling this election was never about win probabilities (i.e., saying that Clinton is 98% likely to win, or 71% likely to win, or whatever). It was about finding a way to convey meaningful information about uncertainty and about what could happen. And, despite the not-so-great headline, this article by Nate Silver does a pretty impressive job.

First, let’s have a look at what not to do. This article by Sam Wang (Princeton Election Consortium) explains how you end up with a win probability of 98-99% for Clinton. First, he aggregates the state polls, and figures that if they’re right on average, then Clinton wins easily (with over 300 electoral votes I believe). Then he looks for a way to model the uncertainty. He asks, reasonably: what happens if the polls are all off by a given amount? And he answers the question, again reasonably: if Trump overperforms his polls by 2.6%, the election becomes a toss-up. If he overperforms by more, he’s likely to win.

But then you have to ask: how much could the polls be off by? And this is where Wang goes horribly wrong.

The uncertainty here is virtually impossible to model statistically. US presidential elections don’t happen that often, so there’s not much direct history, plus the challenges of polling are changing dramatically as fewer and fewer people are reachable via listed phone numbers. Wang does say that in the last three elections, the polls have been off by 1.3% (Bush 2004), 1.2% (Obama 2008), and 2.3% (Obama 2012). So polls being off by 2.6% doesn’t seem crazy at all.

For some inexplicable reason, however, Wang ignores what is right in front of his nose, picks a tiny standard error parameter out of the air, plugs it into his model, and basically says: well, the polls are very unlikely to be off by very much, so Clinton is 98-99% likely to win.
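To see how much the headline number depends on that one parameter, here is a minimal sketch. It is not Wang's actual model. It assumes the polling error is normally distributed around zero and that Clinton wins whenever the error in Trump's favor stays below the 2.6-point toss-up threshold quoted above. The function name and the list of sigma values are illustrative assumptions.

```python
import math


def clinton_win_probability(lead_pts, sigma_pts):
    """P(polling error in Trump's favor < lead), assuming a zero-mean normal error.

    lead_pts: how far Trump must beat his polls to reach a toss-up.
    sigma_pts: assumed standard deviation of the polling error (the
               parameter the whole forecast hinges on).
    """
    z = lead_pts / sigma_pts
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


TOSS_UP_LEAD = 2.6  # points, the threshold quoted from Wang's article

if __name__ == "__main__":
    for sigma in (1.0, 2.0, 3.0, 4.0):  # hypothetical choices of sigma
        p = clinton_win_probability(TOSS_UP_LEAD, sigma)
        print(f"assumed error sigma = {sigma:.1f} pts -> win probability {p:.1%}")
```

Under these assumptions a sigma of about 1 point gives a win probability above 99%, while a sigma of 2 to 4 points gives roughly 75% to 90%. The output is mostly a restatement of whichever sigma went in.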

Always be wary of models, especially models of human behavior, that give probabilities of 98-99%. Always ask yourself: am I anywhere near 98-99% sure that my model is complete and accurate? If not, STOP, cross out your probabilities because they are meaningless, and start again.

How do you come up with a meaningful forecast, though? Once you accept that there’s genuine uncertainty in the most important parameter in your model, and that trying to assign a probability is likely to range from meaningless to flat-out wrong, how do you proceed?

Well, let’s look at what Silver does in this article. Instead of trying to estimate the volatility as Wang does (and as Silver also does on the front page of his web site, people just can’t help themselves), he gives a careful analysis of some possible specific scenarios. What are some good scenarios to pick? Well, maybe we should look at recent cases of when nationwide polls have been off. OK, can you think of any good examples? Hmm, I don’t know, maybe…

brexit-headlines

Aiiieeee!!!!

Look at the numbers in that Sun cover. Brexit (Leave) won by 4%, while the polls before the election were essentially tied, with Remain perhaps enjoying a slight lead. That’s a polling error of at least 4%. And the US poll numbers are very clear: if Trump overperforms his polls by 4%, he wins easily.

In financial modeling, where you often don’t have enough relevant history to build a good probabilistic model, this technique — pick some scenarios that seem important, play them through your model, and look at the outcomes — is called stress testing. Silver’s article does a really, really good job of it. He doesn’t pretend to know what’s going to happen (we can’t all be Michael Moore, you know), but he plays out the possibilities, makes the risks transparent, and puts you in a position to evaluate them. That is how you’re supposed to analyze situations with inherent uncertainty. And with the inherent uncertainty in our world increasing, to say the least, it’s a way of thinking that we all better start becoming really familiar with.
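As a rough illustration of stress testing (not Silver's actual analysis), the sketch below plays three polling-error scenarios through a deliberately crude model. It treats the 2.6-point toss-up threshold from Wang's article as Clinton's cushion and subtracts a uniform shift toward Trump in each scenario. The scenario names, the `outcome` helper, and the half-point toss-up band are assumptions made for illustration.

```python
TOSS_UP_THRESHOLD = 2.6  # points of Trump overperformance that make it a toss-up

# Scenario name -> assumed uniform polling error in Trump's favor, in points.
SCENARIOS = {
    "polls exactly right": 0.0,
    "2012-sized miss, in Trump's favor": 2.3,
    "Brexit-sized miss": 4.0,
}


def outcome(shift_pts, band=0.5):
    """Classify a scenario; anything within `band` points of the threshold is a toss-up."""
    cushion = TOSS_UP_THRESHOLD - shift_pts
    if cushion > band:
        return "Clinton wins"
    if cushion < -band:
        return "Trump wins"
    return "toss-up"


if __name__ == "__main__":
    for name, shift in SCENARIOS.items():
        print(f"{name} (shift {shift:.1f} pts): {outcome(shift)}")
```

No probabilities are attached to the scenarios. The point is to lay out what happens under each one and leave the reader to judge how plausible each is.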

The models were plain as day. What the numbers were telling us was that if the polls were right, Clinton would win easily, but if they were underestimating Trump’s support by anywhere near a Brexit-like margin, Trump would win easily. Shouldn’t that have been the headline? Wouldn’t you have liked to have known that? Isn’t it way more informative than saying that Clinton is 98% or 71% likely to win based on some parameter someone plucked out of thin air?

We should have been going into this election terrified.

It’s Time to Smell the Shit

People voted for Trump because he was speaking to them about their pain, and making unreasonable promises about how great the future would be for them.

At the same time he was unforgivably awful to all sorts of subpopulations of Americans. The people who voted for him either embraced that hate or ignored it.

This means two things for the rest of us.

First, it means we need to help Trump voters smell their particular shit, which is going to be hard for them, because many of them actually trusted Trump’s promises. That means we document all the ways their expectations have been unmet in the next four years. We have to keep track of the inevitable blame game that Trump is so good at, where he will vilify random people when he fails to deliver his promises.

Second, it means we need to carefully watch all those people who were willing to embrace the hate; they have been empowered and could be truly dangerous, especially when the shit first gets smelled. Nor can we rely on those people who don’t think of themselves as racist but who ignored the hate. They are willing to remain passive in the face of hatred, exactly what we cannot do. People, we need to protect one another, and in particular we need to protect the most vulnerable among us.

How do we document and protect? It starts with citizen journalism. As individuals, we need to use our phones, our blogs, and our conversations as opportunities to speak clearly about what we witness.

We need to train ourselves to intervene when we see someone get singled out for their religion or the color of their skin. We all need to get off of the toxic echo chamber that is Facebook and engage with people we happen to meet in a coffee shop. Who knows, we might disagree with them, but that shouldn’t stop us from communicating civilly. We need to travel away from our cities and interact with people outside our normal lives.

We need to support independent journalism (choose your two favorite) and civil rights groups (ACLU, Legal Defense Fund) so they can do their jobs, which are crucially important.

We also need to organize locally to do more. This means more than a protest march. It’s a long game, and it needs to be strategic. It needs to reimagine the Democratic party as well as a strategy to empower unionization or some other form or forms of working class solidarity. In my Occupy group we’re going to watch this video soon to know what that might look like.

There’s real risk that if we don’t document and protect, we’ll have a disappointed and angry mob casting their anger and blame on minorities with impunity.

We can do this. We can smell the shit together.

We are all activists now

Go bake your pie, your lasagna. Get your comfort food made, and check on the kids.

And then contribute to your favorite, most hard-hitting independent journalism organization, if you have money to spare. Look to the future, don’t dwell. Ignore conversations about what happened, about the mathematics of polling, of demographic nonsense. It’s time to prepare for whatever the hell is happening next. It’s up to us to focus, to value information over propaganda. Nobody else is going to do that for us.

Because we are all activists now.

Aise’s Voting Guide

This is a voting guide my son Aise put together for me. He’s not old enough to vote so he made it to influence my vote. I thought he did a nice job distilling some real information, so I got his permission to post it here.

Federal:

Presidential: Dan Vacek (Legal Marijuana Now)

New York Senate: Alex Merced (Libertarian)

  • He wants to legalize all drugs
  • He wants to have a very open border policy.
    • He says illegal immigration is like a black market: if you make something more or less legal, the black market will go away.

New York House District 10: Jerrold Nadler (Democratic)

  • He has been a reliable progressive Democrat. One example of this was his voting against laws that would have helped along the TPP. He also voted to stop the expansion of military spending and voted to keep the Iran deal together.
  • His one opponent is Philip Rosenthal, who favors entitlement reform and tearing up the Iran agreement, and whose website has the words on it “When America retreats, evil advances.”

State:

State Senate District 30: Bill Perkins (Democrat)

  • He is running against Jon Girodes who was arrested for running a scam by taking people’s money to rent out an apartment and then not returning the money or giving them the apartment.
  • He sponsored legislation to allow people 16 and over to donate an organ.
  • He sponsored legislation to regulate emissions from cars.
  • He sponsored legislation to give inmates translation services in parole hearings.
  • He sponsored legislation to make it easier for the disabled receiving social security money to avoid rent hikes.

State Assembly District 69: Daniel O’Donnell (Democratic)

  • He supported legislation to allow convicted felons to vote once they completed their sentence.
  • He supported legislation to expand eligibility for “shock incarcerations.” A shock incarceration is when someone serves time in a treatment facility instead of prison.

Fake News, False Information, and Stupid Polls

Fake News

Facebook uses an algorithm to decide what you see. It’s proprietary, but my guess is that it’s optimized to keep you on Facebook for as long as possible.

This wouldn’t be a problem but becomes one when we realize that people get their news from Facebook.

news-from-facebook

When you optimize to something, and when you ignore something else, that other thing can be expected to balloon beyond recognition. We’ve seen that with ballooning tuition for colleges because of the US News & World Report, for example.

In this example, the thing Facebook has ignored is “truth.” The result is a proliferation of fake news:

screen-shot-2016-11-07-at-6-24-35-am

False Information

Beyond simply fake news, there are tons of hyperpartisan articles that make use of false information. These pseudo-news sites have popped up simply to exist on Facebook and to game the Facebook algorithm.

hyperpartisan

When Facebook started 12 years ago, there was a much healthier journalism industry. It’s now much less healthy, in no small part because of the ad dollars that now pour into Facebook. What will another 12 years bring? I’m worried that we won’t have real news anymore even if we search for it.

screen-shot-2016-11-07-at-6-26-36-am

This is bad for democracy, because people are constantly being misinformed or hysterically informed. It’s pushing people further into their corners, or pushing them off of Facebook and politics entirely.

Stupid Polls

Finally, the polling conversations are out of hand. We tune into our favorite radio shows to hear about policy and instead we hear about poll numbers, or even worse, debates between poll watchers about whose poll is more accurate. That’s not news.

We have obsessed over the college educated white Iowan women’s vote for long enough, and we need to enter a new phase where we discuss actual issues. Leave the polling to campaigns. Mona said it best:

https://embed.theguardian.com/embed/video/commentisfree/video/2016/jun/01/political-polls-bad-for-democracy-heres-why-video

 

What can we do?

Here are some ideas that might help a little but won’t solve everything. Tell me yours.

  1. Facebook absolutely must acknowledge its role in the spread of misinformation. They need to act as editors. This will take an army of workers, but there are plenty of journalists who are looking for jobs, and Facebook makes tons of money, so there’s no actual problem besides the will of Facebook.
  2. Beyond that, Facebook needs to redesign its algorithm so that people don’t only see things they already agree with. This echo chamber (or “filter bubble,” as Eli Pariser described it in his 2011 book) has had a terrible effect on political partisanship. We’ve ended up thinking people who don’t agree with us are actually bad people. Facebook should redesign its platform so that we talk and listen to each other more.
  3. We need to demand that media stop fixating on polls. If we can’t outlaw them, at the very least we can complain and move our attention to real information.

SLA-NY PrivCo Spotlight Award!

Tonight I’m taking my adorable husband with me to accept an award from the Special Libraries Association of New York called the PrivCo Spotlight Award. Here’s the description of the award and their reasoning in choosing me, from their website:

This award celebrates website founders and bloggers, curators of distinctive collections, solo librarians, mentors and teachers, conference organizers, and librarians typically working outside the traditional scope of SLA-NY award consideration. As a data scientist and author of the “MathBabe” blog, we feel Cathy O’Neil strongly embodies the spirit of SLA NY. Her book Weapons of Math Destruction was published in 2016 and has been nominated for the 2016 National Book Award for Nonfiction.

I’m a huge fan of librarians – they are, in my opinion, the original ethical data scientists – so I’m both honored and psyched for the award and for the chance to meet the kind people at SLA-NY as well as the other award winners.

A good use of big data: to help struggling students

There’s an article by Anya Kamenetz that’s been forwarded to me by a bunch of people (I think first by Becky Jaffe), entitled How One University Used Big Data To Boost Graduation Rates.

The article centers on an algorithm being used by Georgia State University to identify students in danger of dropping out of school. Once identified, the school pairs those wobbly students with advisers to try to help them succeed. From the article:

A GPS alert doesn’t put a student on academic probation or trigger any automatic consequence. Instead, it’s the catalyst for a conversation.

The system prompted 51,000 in-person meetings between students and advisers in the past 12 months. That’s three or four times more than was happening before, when meetings were largely up to the students.

The real work was in those face-to-face encounters, as students made plans with their advisers to get extra tutoring help, take a summer class or maybe switch majors.

I wrote a recent book about powerful, secret, destructive algorithms that I called WMD’s, short for Weapons of Math Destruction. And naturally, a bunch of people have written to me asking if I thought the algorithm from this article would qualify as a WMD.

In a word, no.

Here’s the thing. One of the hallmark characteristics of a WMD is that it punishes the poor, the unlucky, the sick, or the marginalized. This algorithm does the opposite – it offers them help.

Now, I’m not saying it’s perfect. There could easily be flaws in this model, and some people are not being offered help who really need it. That can be seen as a kind of injustice, if others are receiving that help. But that’s the worst case scenario, and it’s not exactly tragic, and it’s a mistake that might well be caught if the algorithm is trained over time and modified to new data.

According to the article, the new algorithmic advising system has resulted in quite a few pieces of really good news:

  • Graduation rates are up 6 percentage points since 2013.
  • Graduates are getting that degree an average half a semester sooner than before, saving an estimated $12 million in tuition.
  • Low-income, first-generation and minority students have closed the graduation rate gap.
  • And those same students are succeeding at higher rates in tough STEM majors.

But to be clear, the real “secret sauce” in this system is the extraordinary amount of advising that’s been given to the students. The algorithm just directed that work.

A final word. This algorithm, which identifies struggling students and helps them, is an example I often use in explaining that an algorithm is not inherently good or evil.

In other words, this same algorithm could be used for evil, to punish the badly off, and a similar one nearly was in the case of Mount St. Mary’s University in Maryland. I wrote about that case as well, in a post entitled The Mount St. Mary’s Story is just so terrible.

Categories: Uncategorized

Pseudoscience at Gate B6

This is a guest post written by Matt Freeman, an epidemiologist and nurse practitioner. His fields are adolescent and men’s health. He holds a doctorate in nursing from Duke University, a master’s in nursing from The Ohio State University, a master’s in epidemiology and public health from The Yale School of Medicine, and a Bachelor of Arts from Brandeis University. His blog is located at www.medfly.org.

It was mid-morning on a Saturday. I had only hand luggage, and had checked in online the day before. I arrived at the small airport exactly one hour before departure. I was a bit annoyed that the flight was delayed, but otherwise not expecting too much trouble. It was a 90-minute flight on a 70-seat regional jet.

By my best estimate, there were 80 passengers waiting to enter the security checkpoint. Most seemed to be leisure travelers: families with little kids, older adults. There was an abundance of sunburn and golf shirts.

The queue inched along. As I looked around, anxiety was escalating. There was a lot of chatter about missing flights; several people were in tears knowing that they would certainly have their travel plans fall into disarray.

One TSA employee with two stripes on his lapels walked his way through the increasingly antsy crowd.

“What is the province of your destination?” he asked the woman next to me.

“Province?”

“Yes, which province? British Columbia? Ontario?”

Confused, the woman replied, “I’m going to Houston. I don’t know what province that’s in.”

The TSA agent scoffed. He moved on to the next passenger. “The same question for you, ma’am. What is the province of your destination?”

The woman didn’t speak, handing over her driver’s license and boarding card, assuming that was what he wanted. He stared back with disdain.

There are no flights from this airport to Canada.

When it was my turn, I volunteered, “I’m going to Texas, not Canada.”

“What are the whereabouts of your luggage?” he asked.

“Their whereabouts? My bag is right here next to me.”

“Yes, what are its whereabouts?”

“It’s right here.”

“And that’s its whereabouts?”

This was starting to seem like a grammatical question.

“And about its contents? Are you aware of them?”

“Yes,” I replied, quizzically.

He moved on.

I missed my flight. The woman next to me met the same fate. She cried. I cringed. We pleaded with the airline agent for clemency. The plane pushed back from the gate with many passengers waiting to be asked about the whereabouts of their belongings or their province of destination.

The agent asking the strange questions and delaying the flights was a part of the SPOT program.

The SPOT Program

In 2006, the US Transportation Security Administration (TSA) introduced SPOT, or “Screening of Passengers by Observation Techniques.” The concept was to identify nonverbal indicators that a passenger was engaged in foul play. Some years after the program started, the US Government Accountability Office (GAO) declared that “no scientific evidence exists to support the detection of or inference of future behavior.”

SPOT is expensive too. The GAO reported that the program has cost more than $900 million since its inauguration. That is just the cost of training staff and operating the program, not the costs incurred by delayed or detained passengers.

The “Science” Behind Behavioral Techniques

The SPOT program was developed from multiple sources, but the most prominent psychologist in the field is Paul Ekman, PhD.

Ekman published Emotion in the Human Face, which demonstrated that six basic human emotions (anger, sadness, fear, happiness, surprise, and disgust) are universally expressed on the human face. Ekman had travelled to New Guinea to show that facial expressions did not vary across geography or culture.

Ekman’s theory was undisputed for 20 years until Lisa Feldman Barrett, PhD, showed that Ekman’s research required observers to select from the list of six emotions. When observers were asked to analyze emotions without a list, there was some reliability in the recognition of happiness and fear. The other emotions could not be distinguished.

When confronted with skepticism from scientists, Ekman declined to release the details of his research for peer review.

Charles Honts, PhD, attempted to replicate Ekman’s findings at the University of Utah. No dice. Ekman’s “secret” findings could not be replicated. Maria Hartwig, PhD, a psychologist at City University of New York’s John Jay College of Criminal Justice, described Ekman’s work as “a leap of gargantuan dimensions not supported by scientific evidence.”

When asked directly, a TSA analyst pointed to the work of David Givens, PhD, an anthropologist and author. Givens has published popular works on body language, but he explained that the TSA never specified which elements of his theories it had adopted, and never asked him.

The TSA’s Response

When asked for statistics, TSA analyst Carl Maccario cited one anecdote of a passenger who was “rocking back and forth strangely” and was later found to have been carrying fuel bottles that contained flammable materials. The TSA described these items as “the makings of a pipe bomb,” but there was no evidence that the passenger was doing anything other than carrying a dangerous substance in his hand luggage. There was nothing to suggest that he planned to hurt anyone.

A single anecdote is not research, and this was a weak story at best.

When the GAO investigated further, they analyzed the data of 232,000 passengers who were identified by “behavioral detection” as cause for concern. Of the 232,000, there were 1,710 arrests. These arrests were mostly due to outstanding arrest warrants, and there is no evidence that any were ever linked to terrorist activity.
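To put those two figures in proportion, here is a minimal Python sketch of the arithmetic. The variable names are mine, chosen only for illustration; the inputs are the two GAO numbers quoted above and nothing else.

```python
# Figures quoted above from the GAO analysis.
referred = 232_000   # passengers flagged by "behavioral detection"
arrests = 1_710      # arrests among those flagged passengers

# Share of flagged passengers who were arrested for any reason at all.
arrest_rate = arrests / referred

# How many passengers were flagged for each arrest, on average.
flagged_per_arrest = referred / arrests

print(f"Arrest rate among flagged passengers: {arrest_rate:.2%}")
print(f"Passengers flagged per arrest: about {round(flagged_per_arrest)}")
```

Even this small rate overstates the program's usefulness, since the arrests were mostly for outstanding warrants rather than anything related to aviation security.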

What Criteria Are Used in the SPOT Program?

In 2015, The Intercept published the TSA’s worksheet for behavioral detection officers. Here it is:

[Image: the TSA’s behavioral detection worksheet]

The TSA’s behavioral detection mathematical model is as frightening as it is hilarious. Among other things, the model seeks to identify whistling and shaving.

If I score myself before a typical flight, I earn eight points, which assigns me to the highest risk category. According to the paperwork, I should be referred for extensive screening and law enforcement should be notified.

Considering that the criteria include yawning, whistling, a subjectively fast “eye blink rate,” “strong body odor” and head turning, just about everyone reaches the SPOT threshold.
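The mechanics of an additive checklist like this can be sketched in a few lines of Python. This is only an illustration: the point values and the threshold below are hypothetical placeholders that I made up, not the numbers on the TSA worksheet, and the function names are mine.

```python
# HYPOTHETICAL point values: the behaviors are named in the text above,
# but these weights are invented for illustration only.
HYPOTHETICAL_POINTS = {
    "yawning": 1,
    "whistling": 1,
    "fast eye blink rate": 1,
    "strong body odor": 1,
    "head turning": 1,
}

# HYPOTHETICAL referral threshold, also invented for illustration.
HYPOTHETICAL_THRESHOLD = 3


def score(observed_behaviors):
    """Add up the points for every observed behavior on the checklist."""
    return sum(HYPOTHETICAL_POINTS.get(b, 0) for b in observed_behaviors)


def is_flagged(observed_behaviors):
    """A passenger is referred once the total reaches the threshold."""
    return score(observed_behaviors) >= HYPOTHETICAL_THRESHOLD


# A tired traveler looking around for the right gate.
tired_traveler = ["yawning", "head turning", "fast eye blink rate"]
print(score(tired_traveler), is_flagged(tired_traveler))
```

Because each item is common and the points simply add up, an ordinary traveler crosses the line with a few unremarkable behaviors, which is the problem described above.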

The Risk of Scoring

Looking past the absence of evidence, there are further problems with the SPOT worksheet. “Scored” decisions can detract from common sense. It does not matter if a hijacker or terrorist fails to whistle or blinks at a normal rate if he or she blows up the airplane.

The Israeli Method

As an Israeli national, I became accustomed to the envied security techniques employed at Israel’s four commercial airports.

The agents employed by the Israel Airports Authority (IAA) do indeed “profile” passengers, but their efforts are often quicker, easier, and arguably more sensitive.

The questions are usually reasonable and fast. “Where have your bags been since you packed them?” “Did anyone give you anything to take with you?” “Are you carrying anything that could be used as a weapon?”

The IAA is cautious about race and religion. The worst attack on Israeli air transportation took place in 1972 at Ben Gurion Airport. Twenty-six people were killed. The assailants were Japanese, posing as tourists. Since that attack, the IAA has attempted to include ethnicity and religion only as components of its screening process.

Although many have published horror stories, the overwhelming majority of passengers do not encounter anything extraordinary at Israeli airports. The agents are usually young, bubbly, right out of their army service, and eager to show off any language skills they may have acquired.

Is There a Better Answer?

Israel does not publish statistics, and I could not tell you if their system is any better. The difference is one of attitude: most of the IAA staff are kind, calm, and not interested in hassling anyone. They do not care how fast you are blinking or if you shave.

Given the amount of air travel to, from, and within the United States, I doubt that questioning passengers would ever work. The TSA lacks the organization, multilingual skills, and service mentality of the Israel Airports Authority.

The TSA already has one answer, but they chose not to use it in my case. I am a member of the Department of Homeland Security’s “Global Entry” program. This means that I was subject to a background check, interview, and fingerprinting. The Department of Homeland Security vetted my credentials and deemed that I did not present any extraordinary risks, and could therefore use its “PreCheck” lane. But this airport had decided to close its PreCheck lane that day. And their SPOT agent had no knowledge that I had already been vetted through databases and fingerprints… arguably a more reliable system than having him determine if I blinked too rapidly.

Until 2015, the PreCheck program also meant that one need not pass through a full-body scanning machine, in part because the machines are famously slow and inaccurate. They are particularly problematic for those with disabilities and other medical conditions. But the TSA decided that it would switch to random use of full body scanners even for those passengers who had already been vetted. Lines grew longer; no weapons have been discovered.

Looking Forward

  1. The SPOT program has been proven to be ineffective. There is no rational reason to keep it in place.
  2. There must not be quotas or incentives for detailed searches and questioning in the absence of probable cause.
  3. Passengers consenting to a search should have the right to know what the search entails, particularly if it involves odd interrogation techniques that can lead to missing one’s flight.
  4. The TSA should respect previous court rulings that the search process begins when a passenger consents to being searched. Asking questions outside of the TSA’s custodial area of the airport is questionable for legal reasons.
  5. Reduce lines. Get the queue moving quickly, thereby reducing the opportunity for an attack. The attacks in Rome and Vienna took place more than three decades ago, yet neither they nor the more recent attack in Brussels changed TSA policy.
  6. Stratified screening, such as the PreCheck program, makes sense. But if TSA staff elect to ignore the program, then it is no longer useful.

Categories: Uncategorized

What you tweet could cost you

Yesterday I came across this Reuters article by Brenna Hughes Neghaiwi: 

In insurance Big Data could lower rates for optimistic tweeters.

 

The title employs a common marketing rule: frame bad news as good news. Instead of saying “Big data shifts costs to pessimistic tweeters,” mention only those who will benefit.

So, what’s going on? In the usual big data fashion, it’s not entirely clear. But the idea is that your future health will be measured by your tweets, and your premium will go up if it’s bad news. From the article:

In a study cited by the Swiss group last month, researchers found Twitter data alone a more reliable predictor of heart disease than all standard health and socioeconomic measures combined.

Geographic regions represented by particularly high use of negative-emotion and expletive words corresponded to higher occurrences of fatal heart disease in those communities.

To be clear, no insurance company is currently using Twitter data against anyone (or for anyone), at least not openly. The idea outlined in the article is that people could set up accounts to share their personal data with companies like insurance companies, as a way of showing off their healthiness. They’d be using a company like digi.me to do this. Monetize your data and so on. Of course, that would be the case at the beginning, to train the algorithm. Later on, who knows.

While we’re on the topic of Twitter, I don’t know if I’ve had time to blog about University of Maryland Computer Science Professor Jennifer Golbeck. I met Professor Golbeck in D.C. last month when she interviewed me at Busboys and Poets. During that discussion she mentioned her paper, Predicting Personality from Social Media Text, in which she inferred personality traits from Twitter data. Here’s the abstract:

This paper replicates text-based Big Five personality score predictions generated by the Receptiviti API—a tool built on and tied to the popular psycholinguistic analysis tool Linguistic Inquiry and Word Count (LIWC). We use four social media datasets with posts and personality scores for nearly 9,000 users to determine the accuracy of the Receptiviti predictions. We found Mean Absolute Error rates in the 15–30% range, which is a higher error rate than other personality prediction algorithms in the literature. Preliminary analysis suggests relative scores between groups of subjects may be maintained, which may be sufficient for many applications.
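For readers who haven’t met the term, Mean Absolute Error is just the average size of the gap between each predicted score and the true one. Here is a minimal Python sketch; the scores are invented for illustration (on a 0 to 1 scale) and are not taken from the paper, and the function name is mine.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all paired scores."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up personality scores for four people, on a 0 to 1 scale.
actual_scores = [0.80, 0.35, 0.60, 0.50]      # e.g. from a questionnaire
predicted_scores = [0.60, 0.55, 0.45, 0.75]   # e.g. inferred from posts

mae = mean_absolute_error(actual_scores, predicted_scores)
print(f"Mean Absolute Error: {mae:.0%}")
```

In this toy example the predictions are off by 20 points out of 100 on average, which sits inside the 15–30% range the abstract reports.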

Here’s how the topic came up. I was mentioning Kyle Behm, a young man I wrote about in my book who was denied a job based on a “big data” personality test. The case is problematic. It could represent a violation of the Americans with Disabilities Act, and a lawsuit filed in court is pending.

What Professor Golbeck demonstrates with her research is that, in the future, employers won’t even need to notify applicants that their personalities are being scored at all. It could happen without their knowledge, through their social media posts and other culled information.

I’ll end with this quote from Christian Mumenthaler, CEO of Swiss Re, one of the insurance companies dabbling in Twitter data:

I personally would be cautious what I publish on the internet.

Categories: Uncategorized

At the Wisconsin Book Festival!

I arrived in Madison last night and had a ridiculously fantastic meal at Forequarter thanks to my friends Shamus and Jonny of the Underground Food Collective.

Delicious!

I’m here to give a talk at the Wisconsin Book Festival, which will take place today at noon, and I’m excited to have my buddy Jordan Ellenberg introduce me at my talk.

I’ll also stop by beforehand at WORT for a conversation with Patty Peltekos on her show called A Public Affair, as well as afterwards at the local NPR station, WPR, for a show called To The Best of Our Knowledge. These might be recorded; I don’t know when they’re airing.

What a city! Very welcoming and fun. I should visit more often.

Categories: Uncategorized

Facebook’s Child Workforce

I’ve become comfortable with my gadfly role in technology. I know that Facebook would characterize their new “personalized learning” initiative, Summit Basecamp, as innovative if not downright charitable (hat tip Leonie Haimson). But again, gadfly.

What gets to me is how the students involved – about 20,000 students in more than 100 charter and traditional public schools – are really no more than an experimental and unpaid workforce, spending classroom hours training the Summit algorithm and getting no guarantee in return of real learning.

Their parents, moreover, are being pressured to sign away all sorts of privacy rights for those kids. And, get this, Basecamp “require[s] disputes to be resolved through arbitration, essentially barring a student’s family from suing if they think data has been misused.” Here’s the quote from the article that got me seriously annoyed, from the Summit CEO Diane Tavenner herself:

“We’re offering this for free to people,” she said. “If we don’t protect the organization, anyone could sue us for anything — which seems crazy to me.”

To recap. Facebook gets these kids to train their algorithm for free, whilst removing them from their classroom time, offering no evidence that they will learn anything, making sure that they’ll be able to use the children’s data for everything short of targeted ads, and also ensuring the parents can’t even hire a lawyer to complain. That sounds like a truly terrible deal.

Here’s the thing. The kids involved are often poor, often minority. They are the most surveilled generation and the most surveilled subpopulation out there, ever. We have to start doing better for them than unpaid work for Facebook.

Categories: Uncategorized

Guest post: An IT insider’s mistake

This is a guest post by an IT Director for a Fortune 500 company who has worked with many businesses and government agencies.

It was my mistake. My daughter’s old cell phone had died. My wife offered to get a new phone from Verizon for me and then give my daughter my old phone. Since I work with Microsoft it made sense for me to get the latest Nokia Lumia model. It’s a great looking phone, with a fantastic camera, and a much bigger screen than my old model. I told my wife not to wipe all the data off my old phone but just to get the phone numbers switched; we could then delete all my contacts from my old phone. While you can remove an email account on the phone, you can’t change the account that is associated with Windows Phone’s cloud. So my daughter manually deleted all my phone contacts and added her own to my old phone, but before that I had synced my new phone to the cloud and downloaded all my contacts to it. Within 24 hours, the Microsoft Azure cloud had re-synced both phones, so all the deletions my daughter made propagated to my new phone.
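For anyone wondering how a deletion on one phone can wipe another, here is a toy Python model of two devices tied to the same cloud account. It is a simplified sketch under my own assumptions, not how Microsoft’s sync service is actually implemented; the class names, contact names and numbers are all hypothetical.

```python
class CloudAccount:
    """Toy stand-in for the single cloud account both phones share."""

    def __init__(self):
        self.contacts = {}


class Phone:
    """Toy phone that uploads local edits, then mirrors the cloud copy."""

    def __init__(self, account):
        self.account = account
        self.contacts = {}
        self.pending = []  # local edits not yet uploaded

    def add(self, name, number):
        self.contacts[name] = number
        self.pending.append(("add", name, number))

    def delete(self, name):
        self.contacts.pop(name, None)
        self.pending.append(("delete", name, None))

    def sync(self):
        # Step 1: replay local edits, deletions included, against the cloud.
        for op, name, number in self.pending:
            if op == "add":
                self.account.contacts[name] = number
            else:
                self.account.contacts.pop(name, None)
        self.pending.clear()
        # Step 2: replace the local copy with whatever the cloud now holds.
        self.contacts = dict(self.account.contacts)


account = CloudAccount()
old_phone = Phone(account)
old_phone.add("Client A", "555-0101")   # my contacts
old_phone.add("Client B", "555-0102")
old_phone.sync()

new_phone = Phone(account)
new_phone.sync()                        # new phone downloads my contacts

old_phone.delete("Client A")            # my daughter clears the old phone
old_phone.delete("Client B")
old_phone.add("Best Friend", "555-0199")
old_phone.sync()                        # her edits reach the cloud
new_phone.sync()                        # and land on my new phone

print(new_phone.contacts)
```

The cloud has no idea that two different people are now using the same account. A deletion is just another edit to replicate, so a sync is not a backup.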

I lost all my contacts.

I panicked, went back to the Verizon store, and they told me that we had to flash my old phone to factory settings. But they didn’t have a way for me to get my contacts back. And they had no way for me to contact Microsoft directly to get them back either. The Windows Phone website lists no contact phone number for customer support – Microsoft relies on the phone carriers to provide this, apparently believing that being a phone manufacturer doesn’t require you to have a call center that can resolve consumer issues. I see this as a policy flaw.

I faced the painstaking process of figuring out how to get my phone contacts back, maybe one at a time.

But the whole cloud syncing episode made me think about how we’ve come to trust that we can keep everything on our phones without thinking about backing it up adequately. In 2012, the Wired reporter Mat Honan described how a hacker systematically deleted all his personal information, including baby photos, that he had saved to the cloud from his Apple devices. The big three phone manufacturers now (Apple, Google and Microsoft) have a lot of personal information in their clouds about all of us cell phone users. Each company, on its own, can create a Kevin Bacon style “six degrees of separation” contacts map that would make the NSA proud. While I lost more than 100 phone contacts, each one of those people would likely have a similar number, or more, plugged into their phones, and so on. If the big three (AGM, not to be confused with Annual General Meetings) colluded, they could even create a real-time locator map showing where all our contacts are right now, all around the world. Think of the possibilities for tracking: cheating spouses, late lunches at work, what time you quit drinking at the local, what sporting events you go to, which clients your competitors are meeting with, etc. Microsoft’s acquisition of LinkedIn makes this sharing of information even more powerful. Now they’ll have our phone numbers and email contacts and some professional correspondence too.
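To show how little it takes to build such a map, here is a minimal Python sketch that computes degrees of separation from one person using a breadth-first search over contact lists. The names and the tiny address books are hypothetical, and this is an illustration of the idea, not a description of what any of these companies actually runs.

```python
from collections import deque


def degrees_of_separation(contacts, start):
    """Breadth-first search: fewest hops from `start` to everyone reachable."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for friend in contacts.get(person, ()):
            if friend not in distance:
                distance[friend] = distance[person] + 1
                queue.append(friend)
    return distance


# Hypothetical address books: each key is a phone owner, each value
# is the list of people stored in that owner's contacts.
address_books = {
    "me": ["ana", "bo"],
    "ana": ["cy"],
    "bo": ["cy", "di"],
    "cy": ["eve"],
}

separation = degrees_of_separation(address_books, "me")
print(separation)
```

With real address books of 100 or more entries each, the number of people within a few hops grows very quickly, which is why a company holding everyone’s contacts can map so much from so little.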

I don’t trust Google. Their motto, “don’t be evil,” almost begs the question: why do they have to remind themselves of that? Some years ago they were reported to be scanning emails written to and from Gmail accounts. Spying on what your customers think of as private correspondence strikes me as evil. And just last week Yahoo admitted to doing the same thing on behalf of the government, scanning for a very specific search phrase. I hope the NSA got their suspect with that request, and it wasn’t just a trial balloon to see how far they could go with pressuring the big data providers and aggregators. Yes, I can see the guys in suits and dark glasses approaching Marissa Mayer: “Trust us, this will save lives. We believe there’s the risk of an imminent terrorist attack.” I hope they arrest someone and bring charges, if only to justify Marissa’s position.

So why do I bring all that up? I believe we need consumer personal data protection rights. Almost like credit reporting. The big three (AGM) personal data aggregators and Facebook and LinkedIn collect a lot of personal data about each of us. We should have the right to know what they keep about us, and to possibly correct that record, like we do with the credit bureaus. We should be able to get a free digital copy of our personal data at least annually. The personal data aggregators should also have to report who they share that information with, and in what form. Do they pass along our phone contact information or email accounts to third-party providers, or license that information to other companies to help them do their business? The Europeans are ahead of America in protecting privacy rights on the internet, with the right to be forgotten, and the right to correct data. We should not be left behind in making our lives safer from invasion of our privacy and loss of personal security.

We need to know. The personal data aggregators need to be held to higher standards.

Categories: Uncategorized

New America event next Monday

Hey D.C. folks!

I’ll be back in your area next Monday for an event at the New America Foundation from noon to 1:30pm. It will also be livestreamed.

It’s going to be a panel discussion with some super interesting folks:

David Robinson is co-founder and principal at Upturn, a team of technologists working to give people a meaningful voice in how technology shapes their lives. David leads the firm’s work on automated decisions in the criminal justice system.

Rachel Levinson-Waldman is senior counsel to the Liberty and National Security Program at the Brennan Center for Justice. She is an expert on surveillance technology and national security issues, and a frequent commentator on the intersection of policing, technology and civil rights.

Daniel Castro is vice president at the Information Technology and Innovation Foundation (ITIF) and director of ITIF’s Center for Data Innovation. He was appointed by U.S. Secretary of Commerce Penny Pritzker to the Commerce Data Advisory Council.

K. Sabeel Rahman is an assistant professor of law at Brooklyn Law School, an Eric and Wendy Schmidt fellow at New America, and a Four Freedoms fellow at the Roosevelt Institute. He is the author of Democracy Against Domination (Oxford University Press 2017), and studies the history, values, and policy strategies that animate efforts to make our society more inclusive and democratic, and our economy more equitable.

Also, I wrote an essay for New America in preparation for the event, entitled Alien Algorithms.

I hope I see you next Monday!

Categories: Uncategorized

Three upcoming NY events, starting tonight at Thoughtworks

I’ve got three upcoming New York events I wanted people to know about.

Thoughtworks/Data-Pop Alliance tonight

First, I’ll be speaking tonight starting at 6:30pm at Thoughtworks, at 99 Madison Ave. It’s co-hosted by Data-Pop Alliance, and after giving a brief talk about my book I’ll be joined for a panel discussion by Augustin Chaintreau (Columbia University), moderated by Emmanuel Letouzé (Data-Pop Alliance and MIT Media Lab). There will be Q&A as well. More here.

Betaworks next week

Next I’ll be talking with the folks at Betaworks about my book next Thursday evening, starting at 6:30pm, at 29 Little West 12th Street. You can get more information and register for the event here.

Data & Society in two weeks

Finally, if the world of New York City data hasn’t gotten sick of hearing from me, I’ll be giving a “Databite” (with whiskey!) at Data & Society the afternoon of Wednesday, October 26th, starting at 4pm. Data & Society is located at 36 West 20th Street, 11th Floor. I will update this post with a registration link when I have it.

Categories: Uncategorized