Data science code of conduct, Evgeny Morozov
I’m going on an 8-day long trip to Seattle with my family this morning and I’m taking the time off from mathbabe. But don’t fret! I have a crack team of smartypants skeptics who are writing for me while I’m gone. I’m very much looking forward to seeing what Leon and Becky come up with.
In the meantime, I’ll leave you with two things I’m reading today.
First, a proposed Data Science Code of Professional Conduct. I don’t know anything about the guys at Rose Business Technologies who wrote it except that they’re from Boulder Colorado and have had lots of fancy consulting gigs. But I am really enjoying their proposed Data Science Code. An excerpt from the code after they define their terms:
(c) A data scientist shall rate the quality of evidence and disclose such rating to client to enable client to make informed decisions. The data scientist understands that evidence may be weak or strong or uncertain and shall take reasonable measures to protect the client from relying and making decisions based on weak or uncertain evidence.
(d) If a data scientist reasonably believes a client is misusing data science to communicate a false reality or promote an illusion of understanding, the data scientist shall take reasonable remedial measures, including disclosure to the client, and including, if necessary, disclosure to the proper authorities. The data scientist shall take reasonable measures to persuade the client to use data science appropriately.
(e) If a data scientist knows that a client intends to engage, is engaging or has engaged in criminal or fraudulent conduct related to the data science provided, the data scientist shall take reasonable remedial measures, including, if necessary, disclosure to the proper authorities.
(f) A data scientist shall not knowingly:
- fail to use scientific methods in performing data science;
- fail to rank the quality of evidence in a reasonable and understandable manner for the client;
- claim weak or uncertain evidence is strong evidence;
- misuse weak or uncertain evidence to communicate a false reality or promote an illusion of understanding;
- fail to rank the quality of data in a reasonable and understandable manner for the client;
- claim bad or uncertain data quality is good data quality;
- misuse bad or uncertain data quality to communicate a false reality or promote an illusion of understanding;
- fail to disclose any and all data science results or engage in cherry-picking;
Read the whole Code of Conduct here (and leave comments! They are calling for comments).
Second, my favorite new Silicon Valley curmudgeon is named Evgeny Morozov, and he recently wrote an opinion column in the New York Times. It’s wonderfully cynical and makes me feel like I’m all sunshine and rainbows in comparison – a rare feeling for me! Here’s an excerpt (h/t Chris Wiggins):
Facebook’s Mark Zuckerberg concurs: “There are a lot of really big issues for the world that need to be solved and, as a company, what we are trying to do is to build an infrastructure on top of which to solve some of these problems.” As he noted in Facebook’s original letter to potential investors, “We don’t wake up in the morning with the primary goal of making money.”
Such digital humanitarianism aims to generate good will on the outside and boost morale on the inside. After all, saving the world might be a price worth paying for destroying everyone’s privacy, while a larger-than-life mission might convince young and idealistic employees that they are not wasting their lives tricking gullible consumers to click on ads for pointless products. Silicon Valley and Wall Street are competing for the same talent pool, and by claiming to solve the world’s problems, technology companies can offer what Wall Street cannot: a sense of social mission.
Read the whole thing here.