How do you quantify morality?

December 20, 2016 Cathy O'Neil, mathbabe

Lately I’ve been thinking about technical approaches to measuring, monitoring, and addressing discrimination in algorithms.

To do this, I consider the different stakeholders and the relative harm they will suffer depending on mistakes made by the algorithm. It turns out that’s a really robust approach, and one that’s basically unavoidable. Here are three examples to explain what I mean.

1. AURA is an algorithm that is being implemented in Los Angeles with the goal of finding child abuse victims. Here the stakeholders are the children and the parents, and the relative harm we need to quantify is the possibility of taking a child away from parents who would not have abused that kid (bad) versus not removing a child from a family that does abuse them (also bad). I claim that, unless we decide on the relative size of those two harms – so, if you assign “unit harm” to the first, then you have to decide what the second harm counts as – and then optimize to it using that ratio in the penalty function, then you cannot really claim you’ve created a moral algorithm. Or, to be more precise, you cannot say you’ve implemented an algorithm in line with a moral decision. Note, for example, that arguments like this are making the assumption that the ratio is either 0 or infinity, i.e. that one harm matters but the other does not.
2. COMPAS is a well-known algorithm that measures recidivism risk, i.e. the risk that a given person will end up back in prison within two years of leaving it. Here the stakeholders are the police and civil rights groups, and the harms we need to measure against each other are the possibility of a criminal going free and committing another crime versus a person being jailed in spite of the fact that they would not have gone on to commit another crime. ProPublica has been going head to head with COMPAS’s maker, Northpointe, but unfortunately, the relative weight of these two harms is being sidestepped by both sides.
3. Michigan recently acknowledged its automated unemployment insurance fraud detection system, called Midas, was going nuts, accusing upwards of 20,000 innocent people of fraud while filling its coffers with (mostly unwarranted) fines, which it’s now using to balance the state budget. In other words, the program deeply undercounted the harm of charging an innocent person with fraud while it was likely overly concerned with missing out on a fraud fine payment that it deserved. Also it was probably just a bad program.

If we want to build ethical algorithms, we will need to weigh harms against each other and quantify their relative weights. That’s a moral decision, and it’s hard to agree on. Only after we have that difficult conversation can we optimize our algorithms to those choices.
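One way to make the ratio explicit, as a minimal sketch and not a description of how AURA, COMPAS, or Midas actually work: assume the model outputs a calibrated probability `p` that the bad outcome will occur, and assume hypothetical harm weights for a wrongful intervention (false positive) and a missed case (false negative). Intervening then has lower expected harm exactly when `p` exceeds `harm_fp / (harm_fp + harm_fn)`. The function names and the example weights are illustrative assumptions.

```python
def decision_threshold(harm_fp: float, harm_fn: float) -> float:
    """Probability above which intervening has lower expected harm.

    harm_fp: assumed harm of intervening when nothing bad would have happened.
    harm_fn: assumed harm of not intervening when something bad does happen.
    """
    return harm_fp / (harm_fp + harm_fn)


def should_intervene(p: float, harm_fp: float, harm_fn: float) -> bool:
    """Compare the expected harm of acting with the expected harm of waiting."""
    expected_harm_if_act = (1 - p) * harm_fp   # we act, but the person was innocent
    expected_harm_if_wait = p * harm_fn        # we wait, and the harm occurs
    return expected_harm_if_act < expected_harm_if_wait


if __name__ == "__main__":
    # Unit harm for a wrongful intervention; vary the weight of a missed case.
    for ratio in (0, 1, 4):
        print(ratio, decision_threshold(1, ratio))
```

With unit harm for a wrongful intervention and a missed case weighted at 4, the threshold falls to 0.2. A weight of 0 pushes it to 1.0 (never intervene), which is the degenerate "ratio is 0" position; the threshold only encodes a moral choice once someone has picked the ratio.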


Comments (25)

Anthony Barnes

December 20, 2016 at 8:25 am

It might be good to compare the benefit of the “tough love” programs also, as there is a counterintuitive nature of kindness as doled out by hard teachers, tough parents and drill instructors, just as there is real harm from cheatable welfare, false school testing (see Atlanta Schools) and acceptance of people into schools that they aren’t prepared for.

- Klondike Jack
  
  December 20, 2016 at 12:00 pm
  
  What exactly does false school testing mean here? Is it the non-validity of the tests, the coercive nature of individual and school level evaluations based on those bad tests or the cheating that was done to mitigate harm in response to the irrationality of the testing system? FYI, the (very poorly investigated) test cheating scandal in DC under Michelle Rhee far surpasses the one in Atlanta. Search on “Michelle Rhee’s reign of error”, an excellent investigative piece by John Merrow.
  
JamesNT

December 20, 2016 at 8:29 am

Another problem will be ratios and the fact that any algorithm will always cause some harm. For example, let’s say Midas is correct 99% of the time and has evaluated 1 million people. Well, by mathematical definition, that’s 10,000 screw ups. 10k people can make a lot of noise in the media. So even once algos are made more efficient than humans, the harm debate will be with us for some time.

JamesNT
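The arithmetic in this comment can be checked with a quick sketch. It uses integer math to avoid floating-point rounding, and the function name is made up for illustration. Note that it counts all wrong decisions together and says nothing about how they split between false accusations and missed fraud.

```python
def expected_errors(cases: int, accuracy_percent: int) -> int:
    """Number of wrong decisions if accuracy_percent of all cases are decided correctly."""
    return cases * (100 - accuracy_percent) // 100


if __name__ == "__main__":
    print(expected_errors(1_000_000, 99))  # 10000
```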

- Drew
  
  December 20, 2016 at 2:10 pm
  
  As someone that had to deal with the UIA in Michigan, I can tell you there are even more issues with the system than what was reported. Some of the issues came down to people not entering “correct” data.
  
  Suppose you just got laid off from a company. UIA will ask how much vacation you have. If you say 80 hours and your company said 10 days, YOU are WRONG!!!! Fraud fine! If you used your vacation time and the company hasn’t updated the system, and you answer zero days, and the company says 10 days, FRAUD FINE!
  
  Another question is, What is your rate of pay? If you said $20.00 and your company said $20.0345, Fraud Fine! Do you know your pay down to the 100th of a penny?
  
  Beyond that, the computer systems for filling out the info will shut down at given times. Cuz, computers need time off every day too. So, suppose you need to fill out a form. If you started at 7:50pm and the system goes down at 8:00pm, you need to be done when it thinks it’s 8:00pm. If not, you’re done! No service on Sundays, cuz the puters are at church all day.
  
  Should you call for service, and you call in at 7:59am, no one answers. They “open” @ 8:00am. If you call at 8:01am, you get a busy signal unless you are lucky. No automated report telling you that you are #4 in line and have a wait of 40 mins.
  
  It even sent me a tax statement claiming they paid me $2,000+. The system decided I was not eligible for unemployment benefits for the reasons above. They claimed they used a direct deposit to my “account”, even though they didn’t have my bank account info!
  
abekohen

December 20, 2016 at 9:13 am

Ethics? Whose ethics? In a world where some consider female genital mutilation ethical, I think it all depends on who has the power.

Years ago when I took an education class at CCNY, one professor spoke highly of the ethics of adult men having sex with pre-pubescent boys. Need I say more?

Lars

December 20, 2016 at 11:29 am

“Saint Algorithm”

The algorithm’s just
It’s people who just ain’t
So listen here, we must
To algorithm saint

Roger Joseph Witte

December 20, 2016 at 11:37 am

Sounds like the epidemiological calculations to determine whether a screening test or a vaccination program is safe and justified. So make enquiries with specialists in medical ethics to see if they have any useful tools.

My best guess is that we will find that fine tuning these parameters falls into the nasty gap between personal ethical priorities and the social consensus. (Arrow’s theorem etc. guarantees this gap exists).

In my professional life, many parameters of the models used to appraise transport schemes fall into the gap between science and politics. Examples would be the monetary value of a traffic accident, or of a minute of travel time. In both the US and the UK the government issues advice on these (to see the UK guidance, search for WebTAG). The guidance becomes de facto mandatory and, while not removing the politics, at least gives a level playing field.

You could also argue that it is appropriate for elected representatives to be the arbiters, although the real decisions are made by career civil servants with appropriate expertise. The subtleties are lost on many politicians so the democratic control is limited.

- Robert Brown
  
  December 23, 2016 at 12:33 pm
  
  I came here to mention this. The textbook example is mammograms for breast cancer screening. Mammography has a particular rate of false positives and false negatives, each of which is associated with various costs. False positive mammograms cause patients to undergo unnecessary surgical procedures (usually a biopsy), which are invasive and undesirable for the patient, and cost resources. False negatives can mean delayed diagnoses of actual cancer.
  
  For any test it is always trivially possible to achieve a zero false positive rate (always return a negative result) or zero false negative rate (always return positive). Generally it is possible to fairly smoothly vary your criteria between these extremes to trade one type of error for the other. A receiver operating characteristic (ROC) curve illustrates the tradeoff, and allows you to choose an optimal operating point.
  
  The mathematical machinery for this is well established. The tricky part is assigning costs and benefits to each outcome.
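The machinery described in this comment can be sketched in a few lines of plain Python. The scores and labels are made-up toy data, and the function names and cost weights are assumptions for illustration only; a real analysis would use a proper held-out sample and agreed costs. The sketch traces the ROC points for every candidate threshold, then picks the threshold with the lowest total cost for a given pair of error costs.

```python
from typing import List, Tuple

# Made-up toy data: model scores and true outcomes (1 = condition present).
SCORES = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
LABELS = [0, 0, 1, 0, 1, 0, 1, 1]


def error_counts(scores: List[float], labels: List[int], threshold: float) -> Tuple[int, int]:
    """Count (false positives, false negatives) when score >= threshold is flagged."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp, fn


def candidate_thresholds(scores: List[float]) -> List[float]:
    """Every distinct score, plus infinity (the 'always return negative' test)."""
    return sorted(set(scores)) + [float("inf")]


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float, float]]:
    """(threshold, false positive rate, true positive rate) per candidate threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in candidate_thresholds(scores):
        fp, fn = error_counts(scores, labels, t)
        points.append((t, fp / negatives, (positives - fn) / positives))
    return points


def cheapest_threshold(scores: List[float], labels: List[int],
                       cost_fp: float, cost_fn: float) -> Tuple[float, float]:
    """Return (threshold, total cost) minimizing cost_fp * FP + cost_fn * FN."""
    best_t, best_cost = float("inf"), float("inf")
    for t in candidate_thresholds(scores):
        fp, fn = error_counts(scores, labels, t)
        cost = cost_fp * fp + cost_fn * fn
        if cost < best_cost:  # strict: keep the lowest threshold on ties
            best_t, best_cost = t, cost
    return best_t, best_cost


if __name__ == "__main__":
    for point in roc_points(SCORES, LABELS):
        print(point)
    print(cheapest_threshold(SCORES, LABELS, cost_fp=1, cost_fn=5))
    print(cheapest_threshold(SCORES, LABELS, cost_fp=5, cost_fn=1))
```

On this toy sample, weighting a missed case five times as heavily as a false alarm selects a threshold of 0.3, while the reverse weighting selects 0.8. The code is the easy part; the two cost arguments are where the disagreement lives.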
  
Lloyd Lofthouse

December 20, 2016 at 12:36 pm

What these algorithms do is create/legalize a preemptive strike.

For instance, the Bush Doctrine legalized the preemptive strike based on presumed evidence of an alleged future threat. This came out of G. W. Bush’s National Security Strategy. There were four main points highlighted as the core to the Bush Doctrine: Preemption, Military Primacy, New Multilateralism, and the Spread of Democracy.

These algorithms are based on the same theory.

How many innocent people have died, been tortured, and suffered because of the Bush Doctrine? And are these algorithms an extension of that doctrine, used on the citizens of the United States instead of presumed threats from other countries and/or extremist groups or individuals?

Have you ever been angry, lost your temper, and made a threat verbally or in an e-mail that you later regretted when you calmed down? Obviously most people react like this but never act on that flare-up of temper, but now we live in an age where there are corporations that gather facts on every individual in the United States and/or the world who is linked to the internet. How many of those angry e-mail reactions will end up in that growing global database that is fed into an algorithm that might make a prediction “you” are a threat and must be removed?

Taking this a step further, what happens if the United States becomes a theocracy ruled by fundamentalist Christian billionaire autocrats (I’m thinking of Betsy DeVos, the Walton family, the Koch brothers, etc.) and they want to ferret out anyone who doesn’t fit their definition of proper behavior (private or public), and you crossed their Biblical flaming bush; will they have you collected and sent to a re-education camp to turn you into a “proper” Christian or will they just eliminate you?

It’s easy to extend this even further. Anyone not connected to the internet with a personal file that feeds those algorithms is automatically collected and sent to prison because they offer a potential threat that can’t be defined.

- Aaron Lercher
  
  December 20, 2016 at 5:49 pm
  
  In a way, Lloyd is right. In ethics there are at least two distinct ways of evaluating people’s actions. Consequentialist ethical principles (like utilitarianism) allow harms and benefits to be added, although there might be different ideas of “addition” as well as different accounts of what counts as “harm” or “benefit.”
  
  But non-consequentialist principles do not allow for adding or trading off of harms and benefits. People normally think of “rights” in this way. Thus (using a standard thought experiment) even if 5 lives could be saved by transplanting organs from someone who didn’t volunteer, most of us would reject this as a violation of the moral rights of the person who’d be cut up. (If the person volunteered, as a heroic person might do, that entirely changes the example.)
  
  Lloyd’s point is that once we start calculating where the trade-off point is, this calculation justifies the harms that are less than the trade-off point. There was a lot of this talk during the debate over torture in the George W. Bush Administration. My point is that, thankfully, this isn’t the only way to think about ethics.
  
  In addition, other things are relevant, such as how much is known about harms and benefits that are being counted. But that wasn’t the main topic.
  
  - Cathy O'Neil, mathbabe
    
    December 20, 2016 at 5:51 pm
    
    Well I’m no ethicist but I can tell you that the current crop of data scientists aren’t either, and they’re not even making these decisions explicit.
    
    - abekohen
      
      December 20, 2016 at 5:59 pm
      
      And is that worse than having an abortion foe or a slave owner using his ethics in an algorithm?
      
    - Drew
      
      December 20, 2016 at 6:13 pm
      
      Between the ASA and many new stats programs, they have ethical standards and/or teach ethics as part of the curriculum.
      
      I prefer a cost benefit analysis. If you fine someone improperly, and they can get back 10x what they were fined plus you pay for all court costs and attorneys fees, you can eliminate a lot of these issues. Proving guilt is a lot cheaper than assuming guilt.
      
      BTW, the state of Michigan sent me a tax bill for my mom’s “business” with fines. They estimated what she made, the tax she should have paid, and the fines, for summer 2016. She died spring 2014. They argued we still owed because we didn’t report to them she died. All the other govt agencies knew she was dead, including other branches of the state govt. We didn’t get tax bills for summer 2014 nor 2015. Yeah stupid programmers!
      
  - Aaron Lercher
    
    December 20, 2016 at 5:59 pm
    
    https://plato.stanford.edu/entries/consequentialism/ has a long discussion of variants of the “transplant” thought experiment. But this article disappointingly gives up on defining “consequentialism,” which would seem to be part of the purpose of an encyclopedia article.
    
    Philosophy can be frustrating!
    
Juan

December 20, 2016 at 2:08 pm

What you’re describing seems to be known as utilitarianism, which I fully agree with.

Michael L.

December 20, 2016 at 4:09 pm

Things are a bit more complicated than you suggested because I think in each of your examples you’ve left out some stakeholders. Take the child abuse example. Parents and their kids aren’t the only stakeholders. Residents of Los Angeles, or perhaps the entire country, are also stakeholders, to the extent they are emotionally affected when they hear about cases of child abuse. The same is true when it comes to hearing that someone who was released from prison has committed another crime, or hearing that someone has defrauded the unemployment insurance fund. If you agree, do you have thoughts about how an algorithm would incorporate these other stakeholders’ interests?

Also, just because more than one party has a stake in something doesn’t mean we should weight those different stakes equally. Maybe some group’s stake should be considered more important than some other group’s and the algorithm should reflect this. And this, of course, raises the question of how these different weights should be decided.

Mykel G. Larson

December 20, 2016 at 5:23 pm

Attachment theory, particularly the work that came out of the Oregon Social Learning Center, is a step in the right direction in encoding behavior as being positive, negative or neutral.

The problem is rooted in the criterion problem and the difficulty with encoding and classifying observation, however it may manifest itself. How objective should observation be, hmmm? Are there standards? Won’t there always be bias in some form or another? How does one mathematically account for bias if it’s encoded in the data so deep that it cannot be separated or parsed out? These are things I mull over in the realm of economics and behavior. Data can tell a story but only if it’s good data.

RTG

December 20, 2016 at 11:27 pm

I think there’s another assumption in this discussion: that you can even identify all the potential harms that could result from implementing an algorithm, in order to quantify them and feed them into the model. The nature of policy (which these algorithms seem to be aimed at making more objective) is that it’s always imperfect. Rarely are there only two competing interests in a decision to balance. Even in the first example, you are not only balancing between removing children from safe homes or leaving them in abusive ones. There are also the indirect costs associated with expending resources removing a child from a safe home, which delays removing a different child from an abusive one.

Dan C.

December 21, 2016 at 2:16 am

Some proposals:
(1) “First, do no harm.”
(2) “It is better that ten guilty persons escape than that one innocent suffer”
(3) “Off with their heads”

Anon

December 21, 2016 at 8:59 am

As stakeholders in item 2, you could also add the criminals and their families, and more generally the accused and their families.

Lars

December 21, 2016 at 10:00 am

Algorithms are basically human decisions once removed because some person(s) had to put in the (supposedly) “moral” decision criteria to begin with when they were developing the algorithms.

The primary difference between decisions made by humans and ones made by algorithms is that the latter have zero accountability.

If an innocent person is harmed by the latter, neither those who came up with the algorithm nor those who implemented it will be held responsible.

In fact, other than “efficiency”, escape from responsibility is one of the primary “advantages” of computer-based decision-making.

dmf

December 22, 2016 at 10:32 pm

always good to try and factor in all possible/knowable consequences, and if yer going to be inputting data from social sciences check their methods/math/stats otherwise garbage in will be garbage out: https://en.wikipedia.org/wiki/Replication_crisis

mitchki

December 23, 2016 at 8:42 am

Cathy, I like your approach. Here is an attempt by Google research to address discrimination in threshold classification problems, like the granting of loans. Instead of explicitly including multiple stakeholders, they suggest at the end of the piece that a fair solution might set the threshold for various groups so that an equal proportion of loans will successfully be paid. https://research.google.com/bigpicture/attacking-discrimination-in-ml/

- Cathy O'Neil, mathbabe
  
  December 23, 2016 at 8:44 am
  
  That’s definitely not the same thing, nor is it fair, in my opinion. It’s essentially the same reasoning that sends blacks to prison for longer because they are more likely to get arrested in the future. In particular it ignores systemic discrimination as one of many influencing factors of future events.
  
  - mitchki
    
    December 23, 2016 at 2:32 pm
    
    You’re right, it’s not the same thing, and it does nothing to undo the effects of previous systematic discrimination.
    
    However, I am heartened to see Google even attempting to address the issue of discrimination. If we can keep the conversation going then we perhaps can move them even farther towards fairness. Unfortunately, I didn’t see a comments section on the post.
    