It deals with the question of whether current anti-discrimination law is equipped to handle the kind of unintentional discrimination and digital redlining we see emerging in some “big data” models (and that we suspect are hidden in a bunch more). See for example this post for more on this concept.
The short answer is no, our laws are not equipped.
Here’s the abstract:
This article addresses the potential for disparate impact in the data mining processes that are taking over modern-day business. Scholars and policymakers had, until recently, focused almost exclusively on data mining’s capacity to hide intentional discrimination, hoping to convince regulators to develop the tools to unmask such discrimination. Recently there has been a noted shift in the policy discussions, where some have begun to recognize that unintentional discrimination is a hidden danger that might be even more worrisome. So far, the recognition of the possibility of unintentional discrimination lacks technical and theoretical foundation, making policy recommendations difficult, where they are not simply misdirected. This article provides the necessary foundation about how data mining can give rise to discrimination and how data mining interacts with anti-discrimination law.
The article carefully steps through the technical process of data mining and points to different places within the process where a disproportionately adverse impact on protected classes may result from innocent choices on the part of the data miner. From there, the article analyzes these disproportionate impacts under Title VII. The Article concludes both that Title VII is largely ill equipped to address the discrimination that results from data mining. Worse, due to problems in the internal logic of data mining as well as political and constitutional constraints, there appears to be no easy way to reform Title VII to fix these inadequacies. The article focuses on Title VII because it is the most well developed anti-discrimination doctrine, but the conclusions apply more broadly because they are based on the general approach to anti-discrimination within American law.
I really appreciate this paper, because it’s about an area I know almost nothing about: discrimination law and the standards of evidence for discrimination.
Sadly, what this paper makes clear to me is how far we are from anything resembling what we need to actually address the problems. For example, even in this paper, where the writers are well aware that training on historical data can unintentionally codify discriminatory treatment, they still seem to assume that the people who build and deploy models will “notice” this treatment. In my experience working in advertising, that’s not actually what happens. We don’t measure the effects of our models on our users. We only see whether we have gained an edge in terms of profit, which is very different.
Essentially, as modelers, we don’t humanize the people on the other side of the transaction, which prevents us from worrying about discrimination or even being aware of it as an issue. It’s so far from “intentional” that it’s almost a ridiculous accusation to make. Even so, it may well be a real problem and I don’t know how we as a society can deal with it unless we update our laws.
Specifically, he looked at data on the 2013 cab rides in New York City, which was released under a FOIL request, and he stalked celebrities Bradley Cooper and Jessica Alba (and discovered that neither of them tipped the cabby). He also stalked a man who went to a slew of NYC titty bars: he found out where the guy lived and even got a picture of him.
Previously, some other civic hackers had identified the cabbies themselves, because the original dataset had scrambled the medallions, but not very well.
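To see why weak scrambling fails, here is a minimal sketch. It rests on two assumptions that are mine and not stated above: that the medallions were scrambled with an unsalted hash (MD5 here), and that a medallion looks like digit, letter, digit, digit. The function names are hypothetical. When the space of possible IDs is that small, anyone can hash every candidate and read the originals back off a lookup table.

```python
import hashlib
import string
from itertools import product


def scramble(medallion: str) -> str:
    """Hypothetical 'anonymization': an unsalted MD5 hash of the ID."""
    return hashlib.md5(medallion.encode("ascii")).hexdigest()


def build_lookup() -> dict:
    """Hash every possible ID of the assumed form digit-letter-digit-digit."""
    table = {}
    for d1, letter, d2, d3 in product(
        string.digits, string.ascii_uppercase, string.digits, string.digits
    ):
        plain = f"{d1}{letter}{d2}{d3}"
        table[scramble(plain)] = plain
    return table


# Only 10 * 26 * 10 * 10 = 26,000 candidates, so this takes well under a second.
lookup = build_lookup()

# A made-up medallion, scrambled as it might appear in a released dataset.
scrambled_value = scramble("7B42")
recovered = lookup[scrambled_value]
```

The fix is not a better hash function. With so few possible inputs, any deterministic, unkeyed scrambling can be reversed the same way.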
The point he was trying to make was that we should not assume that “anonymized” datasets actually protect privacy. Instead we should learn how to use more thoughtful approaches to anonymizing stuff, and he proposes a method called “differential privacy,” which he explains here. It involves adding noise to the data, in a certain way, so that at the end any given person doesn’t risk too much of their own privacy by being included in the dataset versus being not included in the dataset.
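For a feel of what “adding noise in a certain way” can mean, here is a minimal sketch of the Laplace mechanism for a counting query, a standard textbook form of differential privacy. It is not necessarily the method the linked explanation describes, and the record layout and function names are made up for illustration. Adding or removing one person changes a count by at most 1, so noise with scale 1/epsilon is enough to mask any single individual’s presence.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of records matching the predicate.

    A count has sensitivity 1 (one person changes it by at most 1),
    so the Laplace noise scale is 1 / epsilon.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical ride records: every fourth ride has a zero tip.
rides = [{"tip": 0.0 if i % 4 == 0 else 2.5} for i in range(100)]
rng = random.Random(0)

# Smaller epsilon means more noise and more privacy.
noisy_no_tip_count = private_count(rides, lambda r: r["tip"] == 0.0, epsilon=0.5, rng=rng)
```

Even this simplest case has knobs (epsilon, the sensitivity of the query) that have to be set correctly for the guarantee to hold, and releasing individual ride records rather than counts is a much harder problem.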
Bottom line, it’s actually pretty involved mathematically, and although I’m a nerd and it doesn’t intimidate me, it does give me pause. Here are a few concerns:
- It means that most people, for example the person in charge of fulfilling FOIL requests, will not actually understand the algorithm.
- That means that, if there’s a requirement that such a procedure be used, that person will have to use and trust a third party to implement it. This leads to all sorts of problems in itself.
- Just to name one, depending on what kind of data it is, you have to implement differential privacy differently. There’s no doubt that a complicated mapping of datatype to methodology will be screwed up when the person doing it doesn’t understand the nuances.
- Here’s another: the third party may not be trustworthy and may have created a backdoor.
- Or they just might get it wrong, or do something lazy that doesn’t actually work, and they can get away with it because, again, the user is not an expert and cannot accurately evaluate their work.
Altogether I’m imagining that this is at best an expensive solution for very important datasets, and won’t be used for your everyday FOIL requests like taxicab rides unless the culture around privacy changes dramatically.
Today I’m super excited to go to the launch party of danah boyd’s Data and Society. Data and Society has a bunch of cool initiatives but I’m particularly interested in their Council for Big Data, Ethics, and Society. They were the people who helped make the Podesta Report on Big Data as good as it was. There will be a mini-conference this afternoon that I’m looking forward to very much. Brilliant folks doing great work and talking to each other across disciplinary lines, can’t get enough of that stuff.
This coming Saturday I’ll be moderating a panel called Spotlight on Data-Driven Journalism: The job of a data journalist and the impact of computational reporting in the newsroom at the New York Press Club Conference on Journalism. The panelists are going to be great:
- John Keefe @jkeefe, Sr. editor, data news & J-technology, WNYC
- Maryanne Murray @lightnosugar, Global head of graphics, Reuters
- Zach Seward @zseward, Quartz
- Chris Walker @cpwalker07, Dir., data visualization, Mic News
The full program is available here.
In mid-December I’m on a panel myself at the Fairness, Accountability, and Transparency in Machine Learning Conference in Montreal. This conference seems to directly take up the call of the Podesta Report I mentioned above, and seeks to provide further research into the dangers of “encoding discrimination in automated decisions”. Amazing! So glad this is happening and that I get to be part of it. Here are some questions that will be taken up at this one-day conference (more information here):
- How can we achieve high classification accuracy while eliminating discriminatory biases? What are meaningful formal fairness properties?
- How can we design expressive yet easily interpretable classifiers?
- Can we ensure that a classifier remains accurate even if the statistical signal it relies on is exposed to public scrutiny?
- Are there practical methods to test existing classifiers for compliance with a policy?
This recent paper written by Gary King, Jennifer Pan, and Margaret Roberts explores the way social media posts are censored in China. It’s interesting, take a look, or read this article on their work.
Here’s their abstract:
Existing research on the extensive Chinese censorship organization uses observational methods with well-known limitations. We conducted the first large-scale experimental study of censorship by creating accounts on numerous social media sites, randomly submitting different texts, and observing from a worldwide network of computers which texts were censored and which were not. We also supplemented interviews with confidential sources by creating our own social media site, contracting with Chinese firms to install the same censoring technologies as existing sites, and—with their software, documentation, and even customer support—reverse-engineering how it all works. Our results offer rigorous support for the recent hypothesis that criticisms of the state, its leaders, and their policies are published, whereas posts about real-world events with collective action potential are censored.
Interesting that they got so much help from the Chinese to censor their posts. Also keep in mind a caveat from the article:
Yu Xie, a sociologist at the University of Michigan, Ann Arbor, says that although the study is methodologically sound, it overemphasizes the importance of coherent central government policies. Political outcomes in China, he notes, often rest on local officials, who are evaluated on how well they maintain stability. Such officials have a “personal interest in suppressing content that could lead to social movements,” Xie says.
I’m a sucker for reverse-engineering powerful algorithms, even when there are major caveats.
If you think Ello is the newest safest social media platform, you might want to think again.
Or at the very least, go ahead and read this piece by my data journalist buddy Meredith Broussard, entitled “’Ello, social media newcomer! Goodbye, data security fears?” Meredith has read the fine print in Ello’s security policy, and it’s not great news.
I am somewhat surprised to hear myself say this, but this month’s Notices of the AMS is killing it. Generally speaking I think of it as rather narrowly focused but things seem to be expanding and picking up. Scanning the list of editors, they do seem to have quite a few people that want to address wider public issues that touch and are touched by mathematicians.
First, there’s an article about how the h-index of an author is basically just a constant times the square root of the number of citations for that author. It’s called Critique of Hirsch’s Citation Index: A Combinatorial Fermi Problem and it’s written by Alexander Yong. Doesn’t surprise me too much, but there you go, people often fall in love with new fancy metrics that turn out to be simple transformations of old discarded metrics.
Second, and even more interesting to me, there’s an article that explains the mathematical vapidness of a widely cited social science paper. It’s called Does Diversity Trump Ability? An Example of the Misuse of Mathematics in the Social Sciences and it’s written by Abby Thompson. My favorite part of the paper:
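To make the comparison concrete, here is a small sketch that computes an h-index directly and sets it next to a square-root estimate. The citation counts are made up, and the constant 0.54 is the rule of thumb as I recall it from Yong’s article, so treat it as an assumption and check the paper before leaning on it.

```python
import math


def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def sqrt_estimate(citations, constant=0.54):
    """Estimate the h-index from the total citation count alone.

    The default constant is an assumed rule of thumb, not a derived value.
    """
    return constant * math.sqrt(sum(citations))


# Hypothetical citation counts for one author's papers.
papers = [50, 30, 22, 15, 10, 8, 6, 5, 3, 2, 1, 1, 0]

actual = h_index(papers)           # 6
estimate = sqrt_estimate(papers)   # about 6.68, from 153 total citations
```

For this invented profile the two numbers land close together, which is the point: the fancier metric carries little information beyond the total citation count.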
Oh, and here’s another excellent take-down of a part of that paper:
Let me just take this moment to say, right on, Notices of the AMS! And of course, right on Alexander Yong and Abby Thompson!
Here’s an interesting paper I’m reading this morning (hat tip Suresh Naidu) entitled Chameleons: The Misuse of Theoretical Models in Finance and Economics written by Paul Pfleiderer. The paper introduces the useful concept of chameleon models, defined in the following diagram:
Pfleiderer provides some examples of chameleon models, and also takes on the Milton Friedman argument that we shouldn’t judge a model by its assumptions but rather by its predictions (personally I think this is largely dependent on the way a model is used; the larger the stakes, the more the assumptions matter).
I like the term, and I think I might use it. I also like the point he makes that it’s really about usage. Most models are harmless until they are used as political weapons. Even the value-added teacher model could be used to identify school systems that need support, although in the current climate of distorted data due to teaching to the test and cheating, I think the signal is probably very slight.