Search Results

Keyword: ‘algorithm’

TikTok’s Algorithm Cannot Be Trusted

September 21, 2020

My newest Bloomberg column is one in which I explain what I know about recommendation engines, which concludes with my claim that whoever controls TikTok’s algorithm can of course tamp down or emphasize whatever kind of content they want, misinformation or otherwise (and to be clear, being able to manipulate recommendation algorithms is in general a good thing!):

TikTok’s Algorithm Can’t Be Trusted

If it operates like other recommendation engines, it can be used for good or for evil.

Read my other Bloomberg columns here.

Categories: Uncategorized

IB’s grading algorithm is a huge mess

In my newest Bloomberg column, I wrote about a boy named Hadrien, interested in studying engineering, whose future has been put in doubt by the International Baccalaureate Organization’s new grading algorithm, which assigns grades in a secret, powerful, and destructive manner. This qualifies it as a “weapon of math destruction”:

This Grading Algorithm Is Failing Students

The International Baccalaureate’s experience offers a cautionary tale.

You can read more of my Bloomberg columns here.


The Truth About Algorithms

I’m animated!!



The Bail System Sucks. Algorithms Might Not Help.

My newest Bloomberg View piece just went up:

Big Data Alone Can’t Fix a Broken Bail System

Philadelphia should think twice about its risk-assessment algorithm.


For all of my Bloomberg View pieces, go here.


Gaydar algorithms and ethics

September 25, 2017

My latest Bloomberg View article is out. I interviewed Michal Kosinski, gaydar algorithm author, about the ethical responsibilities of data scientists:

‘Gaydar’ Shows How Creepy Algorithms Can Get

Read my other Bloomberg View columns here.


Look Who’s Fighting Our Algorithmic Overlords

August 30, 2017

I wrote a new Bloomberg View column about some of the tools to fight bad algorithms:

Look Who’s Fighting Our Algorithmic Overlords

Take a look at my older Bloomberg View columns here.


Criminal Algorithms

A piece I wrote for the Observer over in the UK just dropped, as part of my book’s softcover launch over there. Here it is:

How can we stop algorithms telling lies?


How to Manage Our Algorithmic Overlords

My latest piece in Bloomberg View just came out:

How to Manage Our Algorithmic Overlords


Guest post: the age of algorithms

Artie has kindly allowed me to post his thoughtful email to me regarding my NYU conversation with Julia Angwin last month.

This is a guest post by Arthur Doskow, who is currently retired but remains interested in the application and overapplication of mathematical and data-oriented techniques in business and society. Artie has a BS in Math and an MS that is technically in Urban Engineering, but the coursework was mostly in Operations Research. He spent the largest part of his professional life working for a large telco (that need not be named) on protocols, interconnection testing and network security. He is a co-inventor on several patents. He also volunteers as a tutor.

Dear Dr. O’Neil and Ms. Angwin,

I had the pleasure of watching the livestream of your discussion at NYU on February 15. I wanted to offer a few thoughts. I’ll try to be brief.

  1. Algorithms are difficult, and the ones that were discussed were being asked to make difficult decisions. Although it was not discussed, it would be a mistake to assume a priori that there is an effective mechanized and quantitative process by which good decisions can be made with regard to any particular matter. If someone cannot describe in detail how they would evaluate a teacher, or make a credit decision or a hiring decision or a parole decision, then it’s hard to imagine how they would devise an algorithm that would reliably perform the function in their stead. While it seems intuitively obvious that there are better teachers and worse teachers, reformed convicts and likely recidivist criminals and other similar distinctions, it is not (or should not be) equally obvious that the location of an individual on these continua can be reliably determined by quantitative methods. Reliance on a quantitative decision methodology essentially replaces a (perhaps arbitrary) individual bias with what may be a reliable and consistent algorithmic bias. Whether or not that represents an improvement must be assessed on a situation by situation basis.
  1. Beyond this stark “solvability” issue, of course, are the issues of how to set objectives for how an algorithm should perform (this was discussed with respect to the possible performance objectives of a parole evaluation system) and the devising, validating and implementing of a prospective system. This is a significant and demanding set of activities for any organization, but the alternative of procuring an outsourced “black box” solution requires, at the least, an understanding and an assessment of how these issues were addressed.
  1. If an organization is considering outsourcing an algorithmic decision system, the RFP process offers them an invaluable opportunity to learn and assess how a proposed system is designed and how it will work – What inputs does it use? How does its decision engine operate? How has it been validated? How will it cover certain test cases? Where has it been used? To what effect? Etc. Organizations that do not take advantage of an RFP process to ask these detailed questions and demand thorough and responsive answers have only themselves to blame.
  1. While a developers’ code of ethics is certainly a good thing, the development, marketing and support of a proposed solution is a shared task for which all members of the team must share responsibility – coders, system designers and specifiers, testers, marketers, trainers, support staff, executives. There is no single point of responsibility that can guarantee either a correct or an ethical implementation. Perhaps, in the same way that a CEO must personally sign off on all financial filings, the CEO of a company offering an evaluative system should be required to sign off on the legality, effectiveness and accuracy of claims made regarding the system.
  1. Software contracts are notoriously developer-friendly, basically absolving the developer of all possible consequences arising out of the use of their product. This needs to change, particularly in the case of systems sold as “black box” solutions to a purchaser’s needs, and contracts should be negotiated in which the developer retains significant responsibility and liability.
  1. As I think was pointed out, there is a broad range of analysis and modeling techniques, ranging from expert systems that seek to encode human knowledge, to heuristic learning systems such as neural nets. While heuristic systems have the potential to ferret out non-intuitive relationships, their results obviously require a much higher degree of scrutiny. Part of me wonders how IBM and Watson would do at developing decision systems.
  1. Extensive testing and analysis should be required before any system “goes live”. It is disappointing to hear that “algorithm auditing” does not seem to be a thriving business, and, depending on the definition of “algorithm auditing”, I may be suggesting even more. Perhaps “algorithm testing” would be a more attractive sounding service name. Beyond requiring an analytical assessment of underlying data requirements and assessment algorithms, systems should be tested using an extensive set of test cases. Test cases should be assessed in advance by other (e.g., human expert) means, and system results should be examined for plausibility and for sanity. Another set of test cases should assess performance with extreme (e.g., best case, worst case) scenarios to check for system sanity. Another possibility is “side by side” testing, in which the system will “shadow” the current implementation, either concurrently or in retrospect and the results will be compared.
  1. Psychological and other pre-employment tests, described in Weapons of Math Destruction, are problematic in two ways. First is whether it is appropriate to conduct them at all, and second is whether they are effective in their stated purpose (i.e., to select the best prospective employees, or those best matched to the position in question). Certainly, competency testing is an appropriate part of candidate selection, but whether psychological characteristics are a component of competency is arguable, at best. At the very least, however, such testing should be assessed as to whether it predicts what it claims to predict, and whether that characteristic is emblematic of work effectiveness. How to conduct such testing would require some creativity. Testing could be conducted on an “incoming class” of employees, whether prior to hiring, or after hiring with the test results being sequestered (neither reported to company management nor used in any evaluation process). After some period (1 – 2 years), the qualitative measures of employee performance and effectiveness could be compared to the sequestered test results and examined for correlation. Another possibility would be to identify a disinterested company with employees performing similar work. (By disinterested, I mean disinterested in using the evaluative test in question.) Employees of that company could be asked to undergo “risk free” testing, with results again being sequestered from their employer. The quantitative test results could then be compared to the qualitative measures of employee performance and effectiveness used by that employer. Whatever one thinks of such testing, as Weapons of Math Destruction correctly points out, to the extent to which it is used, efforts should be made to test and improve its efficacy. To the extent that such testing is promoted by an outside party, that party should be ready, willing and able to demonstrate observed effectiveness.
  1. An interesting alternative to a proprietary black box system would be what might be called a meta-system, a configurable engine which would allow its procurer to specify the inputs, weightings and the manner in which they are used to formulate a decision, perhaps offering a drag and drop software interface to specify the decision algorithm. Such a system would leave the fundamentals of the decision algorithm design to the purchasing company, but simply facilitate its implementation.
  1. One must always be cautious about the possibility of inherent bias in data. As a simple example, recidivism is most easily estimated by the proportion of released convicts who are re-arrested. But if recidivism is actually defined by the percentage of released convicts who return to criminal life, then the estimate is likely skewed in several ways. Some recidivists will be caught; others will not. For example, some types of crime are more heavily investigated than others, leading to higher re-arrest rates. Further, even among perpetrators of the same crime, investigation and enforcement may well be targeted more to some areas than to others.
  1. As was pointed out during the discussion, being fair, being humane may cost money. And this is the real issue with many algorithms. In economists’ terms, the inhumanity associated with an algorithm could be referred to as an externality. Optimization has its origins with the solutions to problems in the inanimate world, how to inspect mass produced parts for flaws, how to cut a board to obtain the most salable pieces of lumber, how to minimize the lengths of circuit traces on a PC board. There were problems that touched on human behavior, scheduling issues, or traveling salesman type problems, but not to the extent that they ignored humane considerations. We are now at the point where we have human beings being compared to poisonous Skittles, and where life altering decisions of great import (hiring, firing, parole, assessment, scheduling, etc.) are being subjected to optimization processes, often of questionable validity, which objectify people, view them as resources or threats, and give little or no consideration to the very human consequences of their deployment. Assuming that your good work can drive to this consensus, there is a fork in the road as to how it can be addressed. One way would be to attempt to implement humane costs, benefits and constraints into the models being deployed and optimize on that basis. The other is to stand back and monitor applications for their human costs and attempt to address them iteratively. Or, as Yogi said, you can come to the fork and take it.

Age of Algorithms: Data, Democracy and the News Event at NYU Journalism 2/15

Next Wednesday evening I’ll be talking data, democracy, and the news with the amazing Julia Angwin at the NYU Journalism School moderated by Robert Lee Hotz. More information here.

Please come! Or if you can’t come, you can watch the livestream.



Bloomberg post: When Algorithms Come for Our Children

Hey all, my second column came out today on Bloomberg:

When Algorithms Come for Our Children

Also, I reviewed a book called Data for the People by Andreas Weigend for Science Magazine. My review has a long name:

A tech insider’s data dreams will resonate with the like-minded but neglect issues of access and equality


Algorithmic collusion and price-fixing

There’s a fascinating article (hat tip Jordan Weissmann) today about how algorithms can achieve anti-competitive collusion. Entitled Policing the digital cartels and written by David J Lynch, it profiles a classic cinema poster seller that admitted to setting up algorithms for pricing with other poster sellers to keep prices high.

That sounds obviously illegal, and moreover it took work to accomplish. But not all such algorithmic collusion is necessarily so intentional. Here’s the critical paragraph which explains this issue:

As an example, he cites a German software application that tracks petrol-pump prices. Preliminary results suggest that the app discourages price-cutting by retailers, keeping prices higher than they otherwise would have been. As the algorithm instantly detects a petrol station price cut, allowing competitors to match the new price before consumers can shift to the discounter, there is no incentive for any vendor to cut in the first place.

We also don’t seem to have the legal tools to address this:

“Particularly in the case of artificial intelligence, there is no legal basis to attribute liability to a computer engineer for having programmed a machine that eventually ‘self-learned’ to co-ordinate prices with other machines.”


Recidivism risk algorithms are inherently discriminatory

A few people have been sending me, via Twitter or email, this unsurprising article about how recidivism risk algorithms are inherently racist.

I say unsurprising because I’ve recently read a 2011 paper by Faisal Kamiran and Toon Calders entitled Data preprocessing techniques for classification without discrimination, which explicitly describes the trade-off between accuracy and discrimination in algorithms in the presence of biased historical data (Section 4, starting on page 8).

In other words, when you have a dataset that has a “favored” group of people and a “discriminated” group of people, and you’re deciding on an outcome that has historically been awarded to the favored group more often – in this case, it would be a low recidivism risk rating – then you cannot expect to maximize accuracy and keep the discrimination down to zero at the same time.

Discrimination is defined in the paper as the difference in percentages of people who get the positive treatment among all people in the same category. So if 50% of whites are considered low-risk and 30% of blacks are, that’s a discrimination score of 0.20.
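
As a rough illustration of that definition, here is a minimal Python sketch. The function names and the two groups of 100 people are my own assumptions, chosen only to reproduce the 50% versus 30% example; this is not code from the paper.

```python
def positive_rate(outcomes):
    """Fraction of a group that received the favorable outcome (True)."""
    return sum(outcomes) / len(outcomes)


def discrimination_score(favored, discriminated):
    """Difference in favorable-outcome rates between the two groups."""
    return positive_rate(favored) - positive_rate(discriminated)


# Hypothetical data: True means "rated low-risk".
white = [True] * 50 + [False] * 50   # 50% rated low-risk
black = [True] * 30 + [False] * 70   # 30% rated low-risk

score = discrimination_score(white, black)
print(round(score, 2))
```

A score of zero would mean both groups get the favorable rating at the same rate, which is the "no discrimination" end of the trade-off.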

The paper goes on to show that the trade-off between accuracy and discrimination, which can be achieved through various means, is linear or sub-linear depending on how it’s done. Which is to say, for every 1% loss of discrimination you can expect to lose a fraction of 1% of accuracy.

It’s an interesting paper, well written, and you should take a look. But in any case, what it means in the case of recidivism risk algorithms is that any algorithm that is optimized for “catching the bad guys,” i.e. accuracy, which these algorithms are, and completely ignores the discrepancy between high risk scores for blacks and for whites, can be expected to be discriminatory in the above sense, because we know the data to be biased*.

* The bias is due to the history of heightened scrutiny of black neighborhoods by police which we know as broken windows policing, which makes blacks more likely to be arrested for a given crime, as well as the inherent racism and classism in our justice system itself that was so brilliantly explained by Michelle Alexander in her book The New Jim Crow, which makes them more likely to be severely punished for a given crime.


Facebook should hire me to audit their algorithm

There’s lots of post-election talk that Facebook played a large part in the election, despite Zuckerberg’s denials. Here are some of the various theories going around:

  1. People shared fake news on their walls, and sometimes Facebook’s “trending algorithm” also messed up and shared fake news. This fake news was created by Russia or by Eastern European teenagers and it distracts and confuses people and goes viral.
  2. Political advertisements had deep influence through Facebook, and it worked for Trump even better than it worked for Clinton.
  3. The echo chamber effect, called the “filter bubble,” made people hyper-partisan and the election became all about personality and conspiracy theories instead of actual policy stances. This has been confirmed by a recent experiment on swapping feeds.

If you ask me, I think “all of the above” is probably most accurate. The filter bubble effect is the underlying problem, and at its most extreme you see fake news and conspiracy theories, and a lot of middle ground of just plain misleading, decontextualized headlines that have a cumulative effect on your brain.

Here’s a theory I have about what’s happening and how we can stop it. I will call it “engagement proxy madness.”

It starts with human weakness. People might claim they want “real news” but they are actually very likely to click on garbage gossip rags with pictures of Kardashians or “like” memes that appeal to their already held beliefs.

From the perspective of Facebook, clicks and likes are proxies for interest. Since we click on crap so much, Facebook (and the rest of the online ecosystem) interprets that as a deep interest in crap, even if it’s actually simply exposing a weakness we wish we didn’t have.

Imagine you’re trying to cut down on sugar, because you’re pre-diabetic, but there are M&M’s literally everywhere you look, and every time you stress-eat an M&M, invisible nerds exclaim, “Aha! She actually wants M&M’s!” That’s what I’m talking about, but where you replace M&M’s with listicles.

This human weakness now combines with technological laziness. Since Facebook doesn’t have the interest, commercially or otherwise, to dig in deeper to what people really want in a longer-term sense, our Facebook environments eventually get filled with the media equivalent of junk food.

Also, since Facebook dominates the media advertising world, it creates feedback loops in which newspapers are stuck in the loop of creating junky clickbait stories so they can beg for crumbs of advertising revenue.

This is really a very old story, about how imperfect proxies, combined with influential models, lead to distortions that undermine the original goal. And here the goal was, originally, pretty good: to give people a Facebook feed filled with stuff they’d actually like to see. Instead they’re subjected to immature rants and conspiracy theories.


Of course, maybe I’m wrong. I have very little evidence that the above story is true beyond my experience of Facebook, which is increasingly echo chamber-y, and my observation of hyper-partisanship overall. It’s possible this was entirely caused by something else. I’m open to evidence that Facebook’s influence on this system is minor.

Unfortunately, Facebook’s data is private and so I cannot audit their algorithm for the effect as an interested observer. That’s why I’d like to be brought in as an outside auditor. The first step in addressing this problem is measuring it.

I already have a company, called ORCAA, which is set up for exactly this: auditing algorithms and quantitatively measuring effects. I’d love Facebook to be my first client.

As for how to address this problem if we conclude there is one: we improve the proxies.


Donald Trump is like a biased machine learning algorithm

Bear with me while I explain.

A quick observation: Donald Trump is not like normal people. In particular, he doesn’t have any principles to speak of, that might guide him. No moral compass.

That doesn’t mean he doesn’t have a method. He does, but it’s local rather than global.

Instead of following some hidden but stable agenda, I would suggest Trump’s goal is simply to “not be boring” at Trump rallies. He wants to entertain, and to be the focus of attention at all times. He’s said as much, and it’s consistent with what we know about him. A born salesman.

What that translates to is a constant iterative process whereby he experiments with pushing the conversation this way or that, and he sees how the crowd responds. If they like it, he goes there. If they don’t respond, he never goes there again, because he doesn’t want to be boring. If they respond by getting agitated, that’s a lot better than being bored. That’s how he learns.

A few consequences. First, he’s got biased training data, because the people at his rallies are a particular type of weirdo. That’s one reason he consistently ends up saying things that totally fly within his training set – people at rallies – but rub the rest of the world the wrong way.

Next, because he doesn’t have any actual beliefs, his policy ideas are by construction vague. When he’s forced to say more, he makes them benefit himself, naturally, because he’s also selfish. He’s also entirely willing to switch sides on an issue if the crowd at his rallies seem to enjoy that.

In that sense he’s perfectly objective, as in morally neutral. He just follows the numbers. He could be replaced by a robot that acts on a machine learning algorithm with a bad definition of success – or in his case, a penalty for boringness – and with extremely biased data.

The reason I bring this up: first of all, it’s a great way of understanding how machine learning algorithms can give us stuff we absolutely don’t want, even though they fundamentally lack prior agendas. Happens all the time, in ways similar to the Donald.

Second, some people actually think there will soon be algorithms that control us, operating “through sound decisions of pure rationality” and that we will no longer have use for politicians at all.

And look, I can understand why people are sick of politicians, and would love them to be replaced with rational decision-making robots. But that scenario means one of three things:

  1. Controlling robots simply get trained by the people’s will and do whatever people want at the moment. Maybe that looks like people voting with their phones or via the chips in their heads. This is akin to direct democracy, and the problems are varied – I was in Occupy after all – but in particular it means that people are constantly weighing in on things they don’t actually understand. That leaves them vulnerable to misinformation and propaganda.
  2. Controlling robots ignore people’s will and just follow their inner agendas. Then the question becomes, who sets that agenda? And how does it change as the world and as culture changes? Imagine if we were controlled by someone from 1000 years ago with the social mores from that time. Someone’s gonna be in charge of “fixing” things.
  3. Finally, it’s possible that the controlling robot would act within a political framework to be somewhat but not completely influenced by a democratic process. Something like our current president. But then getting a robot in charge would be a lot like voting for a president. Some people would agree with it, some wouldn’t. Maybe every four years we’d have another vote, and the candidates would be both people and robots, and sometimes a robot would win, sometimes a person. I’m not saying it’s impossible, but it’s not utopian. There’s no such thing as pure rationality in politics, it’s much more about picking sides and appealing to some people’s desires while ignoring others.

Auditing Algorithms

Big news!

I’ve started a company called ORCAA, which stands for O’Neil Risk Consulting and Algorithmic Auditing and is pronounced “orcaaaaaa”. ORCAA will audit algorithms and conduct risk assessments for algorithms, first as a consulting entity and eventually, if all goes well, as a more formal auditing firm, with open methodologies and toolkits.

So far all I’ve got is a webpage and a legal filing (as an S-Corp), but no clients.

No worries! I’m busy learning everything I can about the field, small though it is. Today, for example, my friend Suresh Naidu suggested I read this fascinating study, referred to by those in the know as “Oaxaca’s decomposition,” which separates differences of health outcomes for two groups – referred to as “the poor” and the “nonpoor” in the paper – into two parts: first, the effect of “worse attributes” for the poor, and second, the effect of “worse coefficients.” There’s also a worked-out example of children’s health in Viet Nam which is interesting.

The specific formulas they use depend crucially on the fact that the underlying model is a linear regression, but the idea doesn’t: in practice, we care about both issues. For example, with credit scores, it’s obvious we’d care about the coefficients – the coefficients are the ingredients in the recipe that takes the input and gives the output, so if they fundamentally discriminate against blacks, for example, that would be bad (but it has to be carefully defined!). At the same time, though, we also care about which inputs we choose in the first place, which is why there are laws about not being able to use race or gender in credit scoring.
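
To make the “attributes” versus “coefficients” split concrete, here is a small Python sketch of one common two-part form of the decomposition for a linear regression with a single input. Everything in it is an assumption for illustration: the data is invented, the function names are mine, and the study’s exact formulas may differ from this version.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one input)."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope


def decompose_gap(xs_a, ys_a, xs_b, ys_b):
    """Split the gap in mean outcomes (group A minus group B) into two parts."""
    a_a, b_a = fit_line(xs_a, ys_a)
    a_b, b_b = fit_line(xs_b, ys_b)
    mx_a, mx_b = mean(xs_a), mean(xs_b)
    # Part 1: group B has "worse attributes" (different average input).
    attributes = b_a * (mx_a - mx_b)
    # Part 2: group B has "worse coefficients" (a different recipe).
    coefficients = (a_a - a_b) + (b_a - b_b) * mx_b
    gap = mean(ys_a) - mean(ys_b)
    return gap, attributes, coefficients


# Made-up data: the input might be income, the outcome a health measure.
nonpoor_x, nonpoor_y = [4, 5, 6, 7], [6.0, 7.0, 8.0, 9.0]   # y = 2 + 1.0 * x
poor_x, poor_y = [1, 2, 3, 4], [1.5, 2.0, 2.5, 3.0]         # y = 1 + 0.5 * x

gap, attributes, coefficients = decompose_gap(nonpoor_x, nonpoor_y, poor_x, poor_y)
print(gap, attributes, coefficients)
```

Because a least-squares line passes through the point of means, the two parts add up to the full gap. That is exactly the property that leans on the model being a linear regression.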

And, importantly, this analysis won’t necessarily tell us what to do about the differences we pick up. Indeed many of the tests I’ve been learning about and studying have that same limitation: we can detect problems but we don’t learn how to address them.

If you have any suggestions for me on methods for either auditing algorithms or for how to modify problematic algorithms, I’d be very grateful if you’d share them with me.

Also, if there are any artists out there, I’m on the market for a logo.


Gerrymandering algorithms

I’ve been thinking about gerrymandering recently, specifically how to design algorithms to gerrymander and to detect gerrymandering.

Whence “Gerrymander”?

First things first. According to Wikipedia (and my friend Michael Thaddeus), the term “Gerrymander” is a mash-up of a dude named Elbridge Gerry and the word “salamander.” It was concocted when Gerry got made fun of for his crazy districting of Massachusetts back in 1812 to push out the power of the Federalists:


It’s true, this is depicted as a dragon. But believe me, someone thought it looked like a salamander.


How To Gerrymander

Think about it. In this crazy pseudo-democratic world of ours, we’re still voting locally and asking delegates to ride their horses to a centralized location to cast a vote for the group. The system was invented well before the internet, and it is a ridiculous and unnecessary artifact from the days when information didn’t travel well. In particular, it means you can manipulate voting at the local level, by gaming the definition of the district boundaries.

So, let’s imagine you’re in charge of drawing up districts, and you want to rig it for your party. That means you’d like your party to win as often as possible and lose as seldom as possible per district. If you think about it for a while, you’ll come up with the following observation: you should win by a thin margin but lose huge.

Theoretically that would mean building districts – a lot of them – that are 51% in your favor, and then other districts that are 100% against you.

In reality, you can’t count on anything these days, so you might want to create slightly wider margins, of maybe 55% your party, and there might be rules about how connected districts must be, so you’ll never achieve 100% loss districts.
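
Here is a toy Python sketch of that “win thin, lose huge” strategy. It ignores geography and connectedness rules entirely, and the function name, parameters and numbers are all made up for illustration.

```python
def draw_districts(supporters, n_districts, votes_to_win):
    """Return per-district supporter counts for one party.

    Wins as many districts as possible with exactly votes_to_win supporters
    each, then dumps the leftover supporters into the remaining districts.
    Assumes the party is a minority, so at least one district is left over.
    """
    won = min(n_districts, supporters // votes_to_win)
    plan = [votes_to_win] * won
    leftover = supporters - votes_to_win * won
    lost = n_districts - won
    for i in range(lost):
        # Spread the leftover supporters as evenly as possible.
        plan.append(leftover // lost + (1 if i < leftover % lost else 0))
    return plan


# Hypothetical state: 10 districts of 100 voters, and our party has 450
# supporters (45% of the vote). We aim for 55-45 wins.
district_size = 100
plan = draw_districts(supporters=450, n_districts=10, votes_to_win=55)
seats = sum(1 for s in plan if s > district_size / 2)
print(plan, seats)
```

With these made-up numbers, a party with 45% of the vote takes 8 of the 10 seats, and the two districts it loses are lost 95 to 5.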


How Not To Gerrymander

On the other side of the same question, we might ask ourselves, is there a better way? And the answer is, absolutely yes. Besides just counting all votes equally, we could draw up districts to contain similar numbers of voters and to be more or less “compact.”

If you don’t know what that really means, you can go look at the work of a computer nerd named Brian Olsen, who built a program to do just this.


Before and after Brian Olsen gets his hands on Pennsylvania


Detecting Gerrymandering

The concept of compactness is pretty convincing, and has led some to define gerrymandering to be, in effect, a measurement of the compactness of districts. More formally, there’s a so-called “Gerrymander Score” that is defined as the ratio of the perimeter to the area of districts, with some fudge factor which allows for things like rivers and coastlines.
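
A bare-bones Python sketch of a perimeter-to-area ratio follows. It treats a district as a simple polygon given by its corner coordinates, and it leaves out the fudge factor for rivers and coastlines. The function names and the two example shapes are my own assumptions, not the official score.

```python
import math


def perimeter(points):
    """Total edge length of a polygon given as a list of (x, y) corners."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:] + points[:1]))


def area(points):
    """Polygon area via the shoelace formula."""
    twice = sum(x1 * y2 - x2 * y1
                for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]))
    return abs(twice) / 2


def perimeter_to_area(points):
    """Higher values mean a less compact shape (for districts of equal area)."""
    return perimeter(points) / area(points)


# Two hypothetical districts with the same area (16 square units).
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
sliver = [(0, 0), (16, 0), (16, 1), (0, 1)]

print(perimeter_to_area(square), perimeter_to_area(sliver))
```

One caveat with the raw ratio: it changes with the size of the shape, so it only compares districts of similar area. That may be one reason real scores adjust it.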

Another approach is a “Gerrymander statistical bias” test, namely the difference between the mean and the median. Here you take the results of an election by district, and you rank them from lowest to highest for your party. So a district that only voted 4% for your party would go on the left end, and a district that voted 95% for your party would go on the right end. Now look at the “middle” district, and see how much that district voted for your party. Say it’s 47%. Then, if your party won 55% of the vote overall in the state – the mean in this case is 55% – there’s a big difference between 55 and 47, and you can perhaps cry foul.

I mean, this seems like a pretty good test, since if you think back to what we would do to gerrymander, in the ideal world (for us) we’d get a bunch of districts with 45% for the other side and then a few with 99% for the other side, and the median would be 45% even if the other side had way more voters overall.
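
The mean-minus-median test is short enough to sketch in Python. The district numbers below are invented to match the scenario just described, and using the average of district shares as the statewide share assumes every district has the same turnout.

```python
from statistics import mean, median


def mean_median_gap(district_shares):
    """Mean district vote share minus the median district vote share."""
    return mean(district_shares) - median(district_shares)


# Hypothetical vote shares (in percent) for the *other* side after we
# gerrymander: eight districts they narrowly lose, two they win hugely.
other_side = [45] * 8 + [99] * 2

print(mean(other_side), median(other_side))
print(round(mean_median_gap(other_side), 1))
```

In this made-up state the other side gets a majority of the vote but sits at 45% in the median district. A large positive gap like that is what the test flags.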

Problems With Gerrymandering Detection 

There’s a problem, though, which was detected by, among other people, political scientists Jowei Chen and Jonathan Rodden. Namely, if you run scenarios on non-gerrymandered, compact districts, you don’t get very “fair” results as defined by the above statistical bias test.

This is because, in reality, Democrats are more clustered than Republicans. Democrats are quite concentrated in cities and college towns, and then they are more sparse elsewhere. They, in essence, gerrymander themselves.

Said another way, if you build naive districts that are compact (so their Gerrymander Scores are good) then there will be automatic “Gerrymander statistical bias” problems. Oy vey.

Which is not to say that there isn’t actual effective and nasty Gerrymandering going on. There is, in North Carolina, Florida, Pennsylvania, Texas and Michigan for the Republicans and in California, Maryland and Illinois for the Democrats.

But what it means overall is that there’s no reason to believe we’ll ever get out of this stupid districting system, because it gives an inherent advantage to Republicans. So why would they agree to tossing it?

Sketchy genetic algorithms are the worst

Math is intimidating. People who meet me and learn that I have a Ph.D. in math often say, “Oh, I suck at math.” It’s usually half hostile, because they’re not proud of this fact, and half hopeful, like they want to believe I must be some kind of magician if I’m good at it, and I might share my secrets.

Then there’s medical science, specifically anything around DNA or genetics. It’s got its own brand of whiz-bang magic and intimidation. I know because, in this case, I’m the one who is no expert, and I kind of want anything at all to be true of a “DNA test.” (You can figure out everything that might go wrong and fix it in advance? Awesome!)

If you combine those two things, you’ve basically floored almost the entire population. They remove themselves from even the possibility of critique.

That’s not always a good thing, especially when the object under discussion is an opaque and possibly inaccurate “genetic algorithm” that is in widespread use to help people make important decisions. Today I’ve got two examples of this kind of thing.

DNA Forensics Tests

The first example is something I mentioned a while ago, which was written up beautifully in the Atlantic by Matthew Shaer. Namely, DNA forensics.

There seem to be two problems in this realm. First, there’s a problem around contamination. The tests have gotten so sensitive that it’s increasingly difficult to know if the DNA being tested comes from the victim of a crime, the perpetrator of the crime, the forensics person who collected the sample, or some random dude who accidentally breathed in the same room three weeks ago. I am only slightly exaggerating.

Second, there’s a problem with consistency. People claiming to know how to perform such tests get very different results from other people claiming to know how to perform them. Here’s a quote from the article that sums up the issue:

“Ironically, you have a technology that was meant to help eliminate subjectivity in forensics,” Erin Murphy, a law professor at NYU, told me recently. “But when you start to drill down deeper into the way crime laboratories operate today, you see that the subjectivity is still there: Standards vary, training levels vary, quality varies.”

Yet the results are being used to put people away. In fact, jurors are extremely convinced by DNA evidence. From the article:

A researcher in Australia recently found that sexual-assault cases involving DNA evidence there were twice as likely to reach trial and 33 times as likely to result in a guilty verdict; homicide cases were 14 times as likely to reach trial and 23 times as likely to end in a guilty verdict.

Opioid Addiction Risk

The second example I have today comes from genetic testing of “opioid addiction risk.” It was written up in Medpage Today by Kristina Fiore, and I’m pretty sure someone sent it to me but I can’t figure out who (please comment!).

The article discusses two new genetic tests, created by companies Proove and Canterbury, which claim to accurately assess a person’s risk of becoming addicted to painkillers.

They don’t make their accuracy claims transparent (93% for Proove), and scientists not involved with the companies peddling the algorithms are skeptical for all sorts of reasonable reasons, including historical difficulty reproducing results like this.

Yet they are still being marketed as a way of saving money on workers’ comp systems, for example. So in other words, people in pain who are rated “high risk” might be denied pain meds on the basis of a test that has no scientific backing but sounds convincing.

Enough with this intimidation. We need new standards of evidence before we let people wield scientific tools against people.

Algorithms are as biased as human curators

The recent Facebook trending news kerfuffle has made one thing crystal clear: people trust algorithms too much, more than they trust people. Everyone’s focused on how the curators “routinely suppressed conservative news,” and they’re obviously assuming that an algorithm wouldn’t be like that.

That’s too bad. If I had my way, people would have paid much more attention to the following lines in what I think was the breaking piece by Gizmodo, written by Michael Nunez (emphasis mine):

In interviews with Gizmodo, these former curators described grueling work conditions, humiliating treatment, and a secretive, imperious culture in which they were treated as disposable outsiders. After doing a tour in Facebook’s news trenches, almost all of them came to believe that they were there not to work, but to serve as training modules for Facebook’s algorithm.

Let’s think about what that means. The curators were doing their human thing for a time, and they were fully expecting to be replaced by an algorithm. So any anti-conservative bias that they were introducing at this preliminary training phase would soon be taken over by the machine learning algorithm, to be perpetuated for eternity.
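Here is a deliberately crude sketch of that hand-off in Python. It is not a claim about how Facebook's system works. The curator log, the story labels, and the "model" (which just memorizes how often curators promoted each kind of story) are all hypothetical, chosen only to show how bias in the training labels carries over into the trained algorithm.

```python
from collections import defaultdict

# Hypothetical curator decisions: (kind_of_story, promoted_by_curator).
curator_log = (
    [("other", True)] * 8 + [("other", False)] * 2
    + [("conservative", True)] * 3 + [("conservative", False)] * 7
)

def train(log):
    """Learn the promotion rate the human curators used for each kind of story."""
    counts = defaultdict(lambda: [0, 0])  # kind -> [promoted, total]
    for kind, promoted in log:
        counts[kind][0] += int(promoted)
        counts[kind][1] += 1
    return {kind: promoted / total for kind, (promoted, total) in counts.items()}

def predict(model, kind):
    """Promote a new story if curators promoted most stories of that kind."""
    return model[kind] >= 0.5

model = train(curator_log)
print(model)
print(predict(model, "other"), predict(model, "conservative"))
```

Once trained, the model makes the same calls the curators did, with no human in the loop to blame and no obvious place to look for the bias.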

I know most of my readers already know this, but apparently it’s a basic fact that hasn’t reached many educated ears: algorithms are just as biased as human curators. Said another way, we should not be offended when humans are involved in a curation process, because it doesn’t make that process inherently more or less biased. Like it or not, we won’t understand the depth of bias of a process unless we scrutinize it explicitly with that intention in mind, and even then it would be hard to make such a thing well defined.

I’ll stop calling algorithms racist when you stop anthropomorphizing AI

I was involved in an interesting discussion the other day with other data scientists on the mistake people make when they describe a “racist algorithm”. Their point, which I largely agreed with, is that algorithms are simply mirroring back to us what we’ve fed them as training data, and in that sense they are no more racist than any other mirror. And yes, it’s a complicated mirror, but it’s still just a mirror.

This issue came up specifically because there was a recent story about how, if you google image search “professional hairstyles for work,” you’ll get this:

[Screenshot: Google image search results for “professional hairstyles for work”]

but if you google image search “unprofessional hairstyles for work” you’ll instead get this:

[Screenshot: Google image search results for “unprofessional hairstyles for work”]

This is problematic, but it’s also clearly not the intention of the Google engineering team, or the Google image search algorithm, to be racist. It is instead a reflection of what we as a community have presented to that algorithm as “training data.” So in that sense we should blame ourselves, not the algorithm. The algorithm isn’t (intentionally) racist, because it’s not intentionally anything.

And although that’s true, it’s also dodging some other truth about how we talk about AI and algorithms in our society (and since we don’t differentiate appropriately between AI and algorithms, I’ll use them interchangeably).

Namely, we anthropomorphize AI all the time. Here’s a screenshot of what I got when I google image searched the phrase “AI”:

[Screenshot: Google image search results for “AI”]

Out of the above images, only a couple of them do not have some reference to human brains or bodies.

In other words, we are marketing AI as if it’s human. And since we do that, we treat it and react to it as we would a quasi-human. That means when it seems racist, we’re going to say the AI is racist. And I think that, all things considered, it’s fair to do this, even though there’s no intention there.

Speaking of intention and blame, even though I do not suspect any Google employee of making their algorithms prone to this kind of problem, I still think they should have an internal team that’s on the lookout for this kind of thing and addresses it. In the same way, as a parent, I am constantly on the lookout for my kids getting the wrong ideas about racism or other prejudices, and I correct their mistakes. And I know I’m anthropomorphizing the Google algorithms when I talk about them like children, but what can I say, I am a sucker for marketing.
