Links (with annotation)

I’ve been heads down writing this week but I wanted to share a bunch of great stuff coming out.

  1. Here’s a great interview with machine learning expert Michael Jordan on various things including the big data bubble (hat tip Alan Fekete). I had a similar opinion over a year ago on that topic. Update: here’s Michael Jordan ranting about the title for that interview (hat tip Akshay Mishra). I never read titles.
  2. Have you taken a look at Janet Yellen’s speech on inequality from last week? She was at a conference in Boston about inequality when she gave it. It’s a pretty amazing speech – she acknowledges the increasing inequality, for example, and points at four systems we can focus on as reasons: childhood poverty and public education, college costs, inheritances, and business creation. One thing she didn’t mention: quantitative easing, or anything else the Fed has actual control over. Plus she hid behind the language of economics in terms of how much to care about any of this or what she or anyone else could do. On the other hand, maybe it’s the most we could expect from her. The Fed has, in my opinion, already been overreaching with QE and we can’t expect it to do the job of Congress.
  3. There’s a cool event at the Columbia Journalism School tomorrow night called #Ferguson: Reporting a Viral News Story (hat tip Smitha Corona) which features sociologist and writer Zeynep Tufekci among others (see for example this article she wrote), with Emily Bell moderating. I’m going to try to go.
  4. Just in case you didn’t see this, Why Work Is More And More Debased (hat tip Ernest Davis).
  5. Also: Poor kids who do everything right don’t do better than rich kids who do everything wrong (hat tip Natasha Blakely).
  6. Jesse Eisinger visits the defense lawyers of the big banks and writes about his experience (hat tip Aryt Alasti).

After writing this list, with all the hat tips, I am once again astounded at how many awesome people send me interesting things to read. Thank you so much!!

Guest post: The dangers of evidence-based sentencing

This is a guest post by Luis Daniel, a research fellow at The GovLab at NYU where he works on issues dealing with tech and policy. He tweets @luisdaniel12. Crossposted at the GovLab.

What is Evidence-based Sentencing?

For several decades, parole and probation departments have been using research-backed assessments to determine the best supervision and treatment strategies for offenders, in an effort to reduce the risk of recidivism. In recent years, state and county justice systems have started to apply these risk and needs assessment tools (RNAs) to other parts of the criminal process.

Of particular concern is the use of automated tools to determine imprisonment terms. This relatively new practice of incorporating RNA information into the sentencing process is known as evidence-based sentencing (EBS).

What the Models Do

The parameters used to determine risk vary by state, and most EBS tools use information that has been central to sentencing schemes for many years, such as an offender’s criminal history. However, an increasing number of states have been using static factors such as gender, age, marital status, education level, employment history, and other demographic information to determine risk and inform sentencing. Especially alarming is the fact that the majority of these risk assessment tools do not take an offender’s particular case into account.

This practice has drawn sharp criticism from Attorney General Eric Holder, who says “using static factors from a criminal’s background could perpetuate racial bias in a system that already delivers 20% longer sentences for young black men than for other offenders.” In its annual letter to the US Sentencing Commission, the Attorney General’s Office states that “utilizing such tools for determining prison sentences to be served will have a disparate and adverse impact on offenders from poor communities already struggling with social ills.” Other concerns cite the probable unconstitutionality of using group-based characteristics in risk assessments.

Where the Models Are Used

It is difficult to quantify precisely how many states and counties currently use these instruments, although at least 20 states have implemented some form of EBS. States that have adopted some sort of EBS at the state or county level (for any type of sentencing decision: parole, imprisonment, etc.) include Pennsylvania, Tennessee, Vermont, Kentucky, Virginia, Arizona, Colorado, California, Idaho, Indiana, Missouri, Nebraska, Ohio, Oregon, Texas, and Wisconsin.

The Role of Race, Education, and Friendship

Overwhelmingly, states do not include race in their risk assessments, since there seems to be a general consensus that doing so would be unconstitutional. However, even though these tools do not take race into consideration directly, many of the variables they do use, such as economic status, education level, and employment, correlate with race. African-Americans and Hispanics are already disproportionately incarcerated, and determining sentences based on these variables might cause further racial disparities.

The very socioeconomic characteristics used in risk assessments, such as income and education level, are already strong predictors of whether someone will go to prison. For example, high school dropouts are 47 times more likely to be incarcerated than people of a similar age who received a four-year college degree. It is reasonable to suspect that courts that include education level as a risk predictor will further exacerbate these disparities.

Some states, such as Texas, take peer relations into account and consider associating with other offenders a “salient problem.” Considering that Texas ranks fourth among states in the rate of people under some form of correctional control (parole, probation, etc.), and that the rate for black males in the United States is 1 in 11, it is likely that this metric would disproportionately affect African-Americans.

Sonja Starr’s paper

In some cases, socioeconomic and demographic variables receive significant weight. In her forthcoming paper in the Stanford Law Review, Sonja Starr provides a telling example of how these factors are used in presentence reports. From her paper:

For instance, in Missouri, pre-sentence reports include a score for each defendant on a scale from -8 to 7, where “4-7 is rated ‘good,’ 2-3 is ‘above average,’ 0-1 is ‘average,’ -1 to -2 is ‘below average,’ and -3 to -8 is ‘poor.’” Unlike most instruments in use, Missouri’s does not include gender. However, an unemployed high school dropout will score three points worse than an employed high school graduate—potentially making the difference between “good” and “average,” or between “average” and “poor.” Likewise, a defendant under age 22 will score three points worse than a defendant over 45. By comparison, having previously served time in prison is worth one point; having four or more prior misdemeanor convictions that resulted in jail time adds one point (three or fewer adds none); having previously had parole or probation revoked is worth one point; and a prison escape is worth one point. Meanwhile, current crime type and severity receive no weight.

Starr argues that such simple point systems “linearize” each variable’s effect. In the underlying regression models used to calculate risk, a variable’s effect does not translate linearly into changes in the probability of recidivism, but the point system treats it as though it did.
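To see what this means in practice, here is a minimal sketch (with a completely made-up logistic model, not Missouri’s actual instrument): adding one point to the underlying risk score shifts the predicted probability of recidivism by very different amounts depending on where the offender already sits, whereas a point system treats every one-point shift as identical.

```python
import math

def recidivism_prob(score):
    """Toy logistic model: probability of recidivism given a linear risk score.
    The coefficients are invented purely for illustration."""
    return 1 / (1 + math.exp(-score))

# A one-unit increase in the score (say, "unemployed" adds a point) changes
# the predicted probability very differently depending on the baseline:
for baseline in (-3.0, 0.0, 3.0):
    before = recidivism_prob(baseline)
    after = recidivism_prob(baseline + 1.0)
    print(f"baseline {baseline:+.1f}: {before:.2f} -> {after:.2f} "
          f"(change {after - before:+.2f})")

# Roughly +0.07, +0.23, and +0.03 respectively: the same "one point" is far
# from a constant effect on risk.
```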

Another criticism Starr makes is that these tools predict an individual’s risk based on group averages. They can estimate with reasonable precision the average recidivism rate of all offenders who share the defendant’s characteristics, but that does not necessarily make them useful for predicting what a particular individual will do.

The Future of EBS Tools

The Model Penal Code is currently being revised and is set to include these risk assessment tools in the sentencing process. According to Starr, this is a serious development, both because it reflects the increased support for these practices and because of the Model Penal Code’s great influence on states’ penal codes. Attorney General Eric Holder has already spoken against the practice, but it will be interesting to see whether his successor continues this campaign.

Even if EBS can accurately measure risk of recidivism (which, according to Starr, is uncertain), does that mean a longer prison sentence will result in fewer offenses after the offender is released? EBS does not seek to answer this question. Further, if knowing there is a harsh penalty for a particular crime deters people from committing it, wouldn’t adding more uncertainty to sentencing (EBS tools are not always transparent and are sometimes proprietary) effectively remove that deterrent?

Even though many questions remain unanswered and several people have been critical of the practice, there seems to be great support for these instruments. They are especially easy to support when they are overwhelmingly regarded as progressive and scientific, a characterization Starr refutes. While there is certainly a place for data analytics and actuarial methods in the criminal justice system, it is important that such research be applied with the appropriate caution. Or perhaps not at all. Even if the tools had full statistical support, the risk of further exacerbating an already disparate criminal justice system should be enough to halt this practice.

Both Starr and Holder believe there is a strong case to be made that the risk prediction instruments now in use are unconstitutional. But EBS has strong advocates, so it’s a difficult subject. Ultimately, evidence-based sentencing determines a person’s sentence not based on what the person has done, but on who that person is.

Big Data’s Disparate Impact

Take a look at this paper by Solon Barocas and Andrew D. Selbst entitled Big Data’s Disparate Impact.

It deals with the question of whether current anti-discrimination law is equipped to handle the kind of unintentional discrimination and digital redlining we see emerging in some “big data” models (and that we suspect are hidden in a bunch more). See for example this post for more on this concept.

The short answer is no, our laws are not equipped.

Here’s the abstract:

This article addresses the potential for disparate impact in the data mining processes that are taking over modern-day business. Scholars and policymakers had, until recently, focused almost exclusively on data mining’s capacity to hide intentional discrimination, hoping to convince regulators to develop the tools to unmask such discrimination. Recently there has been a noted shift in the policy discussions, where some have begun to recognize that unintentional discrimination is a hidden danger that might be even more worrisome. So far, the recognition of the possibility of unintentional discrimination lacks technical and theoretical foundation, making policy recommendations difficult, where they are not simply misdirected. This article provides the necessary foundation about how data mining can give rise to discrimination and how data mining interacts with anti-discrimination law.

The article carefully steps through the technical process of data mining and points to different places within the process where a disproportionately adverse impact on protected classes may result from innocent choices on the part of the data miner. From there, the article analyzes these disproportionate impacts under Title VII. The Article concludes both that Title VII is largely ill equipped to address the discrimination that results from data mining. Worse, due to problems in the internal logic of data mining as well as political and constitutional constraints, there appears to be no easy way to reform Title VII to fix these inadequacies. The article focuses on Title VII because it is the most well developed anti-discrimination doctrine, but the conclusions apply more broadly because they are based on the general approach to anti-discrimination within American law.

I really appreciate this paper, because it’s an area I know almost nothing about: discrimination law and the standards for evidence of discrimination.

Sadly, what this paper explains to me is how very far away we are from anything resembling what we need to actually address the problems. For example, even in this paper, where the writers are well aware that training on historical data can unintentionally codify discriminatory treatment, they still seem to assume that the people who build and deploy models will “notice” this treatment. From my experience working in advertising, that’s not actually what happens. We don’t measure the effects of our models on our users. We only see whether we have gained an edge in terms of profit, which is very different.

Essentially, as modelers, we don’t humanize the people on the other side of the transaction, which prevents us from worrying about discrimination or even being aware of it as an issue. It’s so far from “intentional” that it’s almost a ridiculous accusation to make. Even so, it may well be a real problem and I don’t know how we as a society can deal with it unless we update our laws.
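For what it’s worth, here is a minimal sketch of what “measuring the effects of our models on our users” could look like in the disparate impact framing the paper uses: compute each protected group’s rate of favorable model outcomes and compare it to the best-treated group’s rate, flagging ratios below the conventional four-fifths threshold from employment law. The data, group labels, and threshold below are hypothetical, not anything from the paper.

```python
from collections import defaultdict

def favorable_rates(decisions):
    """decisions: iterable of (group, got_favorable_outcome) pairs from a model's output."""
    totals, favorable = defaultdict(int), defaultdict(int)
    for group, outcome in decisions:
        totals[group] += 1
        favorable[group] += int(outcome)
    return {g: favorable[g] / totals[g] for g in totals}

def adverse_impact_flags(decisions, threshold=0.8):
    """Compare each group's favorable-outcome rate to the best-treated group;
    a ratio under ~0.8 is the conventional 'four-fifths' red flag."""
    rates = favorable_rates(decisions)
    best = max(rates.values())
    return {g: (rate / best, rate / best < threshold) for g, rate in rates.items()}

# Hypothetical audit: group B sees the favorable outcome far less often than group A.
sample = [("A", True)] * 80 + [("A", False)] * 20 + [("B", True)] * 50 + [("B", False)] * 50
print(adverse_impact_flags(sample))  # B's ratio is 0.625, so it gets flagged
```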

Aunt Pythia’s advice

Quick, get on the bus! Hurry!

Aunt Pythia is gonna be super fast this morning because she’s got crepes to make and apples to pick.

And then many, many apple pies to bake.

Are you ready? Belts buckled? OK great, let’s do this. And afterwards:

please think of something to ask Aunt Pythia at the bottom of the page!

By the way, if you don’t know what the hell Aunt Pythia is talking about, go here for past advice columns and here for an explanation of the name Pythia.

——

Dear Aunt Pythia,

Now I’m dying to know – what are some Dan Savage answers that you disagree with?? Say, what are your top 3?

An obliging – and curious! – good friend

Dear Ao-ac-gf,

First, let me say I’m glad this is a written word thing and I don’t have to pronounce your name.

Second, I only disagree with Dan Savage on (pretty much) one thing. And he’s a gay man, and without meaning to offend may I say he has typical gay man aesthetics, coming from mostly interacting with other men. You see this in fashion as well, which is dominated by gay men.

Which is to say, he’s really judgmental about fatness. And I find it peculiar, coming from a man who is pro-sex and anti-shame on most topics. As is typical of people who are judgy about fatness, he claims it comes from a place of worrying about health, which, first of all, I object to strenuously as a super healthy fat woman, and which, secondly, strikes me as almost comically parallel to how people complain about gayness and hide behind some weird argument that it’s for the sake of the gay person’s soul.

UPDATE: please read this totally awesome essay on the subject.

That’s pretty much it. In almost every other way I agree with Dan Savage. And also, I haven’t read his stuff for a while, so who knows, maybe he’s had a total change of heart, and maybe he embraces fat ladies such as myself nowadays (although, not literally, I’m sure).

XOXOX good friend!

Aunt Pythia

——

Dear Aunt Pythia,

I’m currently in a data quality job that I was promoted into for sheer enthusiasm and work ethic. It’s turned into a data quality/analysis/reporting and visualisation role (can you guess I’m at a non-profit?). I’ve taught myself advanced Excel, some data visualisation, and how to manage our database since being promoted. I love knowing what all the data shows and being able to explain why certain things are happening. However, I want to excel at my job, and with no prior training (I’m not even a graduate yet) I find it so stressful, as I feel I’m always one step behind.

Currently to improve my skills… (I have your book on my wishlist) I follow your blog and several others in similar fields and I’ve read books on Tableau/Excel/dashboard design and books on how to think statistically. I’m going through the entirety of the maths section on Khan Academy. I’m also studying part time so I will be a graduate soon and I have done some statistics in this course but it’s all been related to psychology experiments (I started the course before being promoted).

Unfortunately no one else in my organisation does anything similar or is in any kind of position to train or mentor me. Would you be able to recommend other books/blogs/online courses or even ways of thinking/learning skills that might be useful?

Girl drowning in data

Dear Girl,

Whoa! You rock! Let’s hear it for enthusiasm and work ethic, sister!

And hey, I even have advice: check out the GitHub repo for my data journalism program from this past summer; there’s lots of good stuff there. Also make sure you’ve taken a look at Statistics Done Wrong. And also, the drafts of my book are all on my blog.

Good luck!

Auntie P

——

Dear Aunt Pythia,

I am fed up with being single, and I am fed up with dating mathematicians, because the aftermath is too awkward. I’d like to try online dating, but I’m too embarrassed to tell my friends. But I feel that I need to tell someone to stay safe. Do you have any suggestions?

Currently Unsure of my Prospects In Dating

Dear CUPID,

OK let me just plug dating mathematicians in spite of the fact that you’ve decided to give up on them. They are actually super nice.

Come to think of it, before I met my husband, I decided on three rules for my next boyfriend and publicly announced them to my friends:

  1. Had to be at least 30 (because younger men were so freaking immature),
  2. Had to love his job (because men who don’t love their job are so freaking insecure)
  3. Couldn’t be a mathematician (because it’s so freaking awkward after breakups)

Then, after I met my mathematician husband and people pointed out my hypocrisy, I’d always say, “two out of three ain’t bad, amIright?”. So in other words, I’m totally fine with your proclamation that you’re done with math people, guys or girls, as long as you are willing to bend the rules for the right nerd.

Back to online dating. Yes, I think it makes sense for at least one of your friends to know about your online activities before you start meeting strangers in night clubs. But I don’t really see why that’s embarrassing, maybe because I’m not easily embarrassed, but also because EVERYONE DOES ONLINE DATING. Seriously, I don’t know anyone who hasn’t tried that.

Why don’t you talk to a friend you trust and ask them what they think of online dating, and kind of poke the topic around a bit. I think you will be surprised to learn that it’s very common, and not at all embarrassing. And once you start doing it, with the disclosure to a good friend who will notice if you go missing, please be aware of the problems with online dating that have nothing to do with safety.

Good luck!

Aunt Pythia

——

Dear Aunt Pythia,

Our daughter has recently started watching way too much Faux News and blaming everything wrong in her life on “the liberals.” Not wanting to damage our relationship with her or our grandkids, my wife and I tend not to respond to her tea-partyish pronouncements. Alas, our silence is characterized as “uncomfortable,” and if we look at one another we’re presumed to be eye-rolling. I am afraid the whole thing may be escalating to the point that the kids start to see us as the villains responsible for the tensions in the air. The alternatives to silence appear to be: responding truthfully, which would probably get us ejected, or feigning agreement (i.e., lying), which we simply will not do. Agreeing honestly with minor details only gets us pressed for our positions on the larger issues, and we’re back to those two choices. Any ideas you have would be welcome.

Virtually Unspeaking Leftish Parents In No-win Exercise

Dear VULPINE,

What a foxy sign-off!!!

OK, so this is your daughter, right? Not your daughter-in-law? So presumably you raised her? And presumably she knows all about how leftish you guys are?

If so, it’s a weird situation. My best guess, from way over here in unspeakably leftish territory, is that she harbors hostility toward you two and wants to blame you for her problems, but the closest she can get to blaming you is blaming people like you, namely liberals.

Even if I’m wrong, there really does seem to be more than enough blame and hostility to go around in the above description, mostly coming from her, but also being passed around like a hot potato by all concerned. If I were you I’d focus on the underlying hostility, although maybe not talk directly about it with her. Some ideas:

  1. Maybe you could have dinner with just her (or with her husband if he’s around) and talk about how you guys don’t have to agree about everything to get along as a family. Focus on the interactions rather than the details of what you don’t agree about. Try to make a plan with her to avoid hot topics and enjoy your time together. Plan an apple-picking trip!
  2. If that’s too direct, think about what she’s actually accomplishing when she makes “tea-partyish pronouncements”. Does she do this right after something happens to embarrass her or put a spotlight on her vulnerabilities? Is there a pattern to the behaviors? Understanding what gives rise to those moments might help you defuse them. And if you can’t defuse them, it still might help you to know when things are coming up. Plan ahead about what you will say to change the subject.
  3. You can try to address the frustration by giving her lots of love in other ways. In other words, just find things where you guys get along and stick with them. Try to make a habit out of emphasizing common ground. Maybe you all love certain kinds of food or entertainment? Karaoke?
  4. If all those distraction methods fail, I think an articulate discussion of polite (even if strenuous!) disagreement is great for kids. And it shouldn’t ban you from spending time with the kids either, if you keep it relatively civilized.
  5. Here’s what might get you into real trouble: if you ever tell the grandkids what you really think when their mom isn’t around. That will get back to her and she will feel betrayed and might take away your private time with the grandkids. I think the disagreements have to happen out in the open in front of everyone.
  6. Finally, it just might not be possible. If she is on a tear for being hostile and blaming, then that’s what she’s gonna do. Some people are just filled with anger and there’s nothing anyone can do about it. I would just try the other stuff and if they don’t work try to be there for the grandkids, especially when they’re going through puberty.

Good luck, grandpa! I hope this was somewhat helpful.

Aunt Pythia

——

Dear Aunt Pythia,

CA has just adopted legislation to require that colleges require students to give positive consent before sex. In other words, lack of protest does not constitute consent. The change seems appropriate, but I wonder about the basic structure of the system.

My question: why are schools responsible rather than the police and does this empirically make the situation better? Are there fewer incidents, faster prosecution, more victim support, etc, because the universities are involved or does it function to shield perpetrators from criminal punishment?

Sorry this is only a quasi-sex question.

Sex Questions Unlikely In Near Term

Dear SQUINT,

THANK YOU! THANK YOU THANK YOU THANK YOU THANK YOU THAAAANK YOOOOUUUU!!

I’m on the verge of making a huge rant about this issue. I’ll probably still do it actually, but yes, yes yes. Here’s an imaginary Q&A I have with myself on a daily basis.

Why are schools responsible? Mostly for historical reasons: towns don’t want to have to hire extra police to deal with the nuisance problems (think: vomit everywhere) that proliferate on campus, so schools are like, “we got this!”.

Does this make sense? It does for actual nuisance problems, but not for violent crime. In fact, it leads to ridiculous situations where professors of philosophy are expected to decide whether something was a sex crime or just really terrible sex by asking whether it’s really possible for someone to be ass-raped without lubrication. Yes, it is.

Why don’t students go straight to the real police when there is a violent crime committed against them? Partly because the campus police are nearby and present, but mostly because the “real” police are not sufficiently responsive to their complaints.

So doesn’t that mean that there are two entirely different systems available to 19-year-old rape victims, depending on whether they happen to be college students or not? Yes, and it’s bullshit, and elitist, although neither system actually works for the victims.

So what should we do? We should require that claims of violent crimes on campuses go straight to the real police and we should also require that real police learn how to do their jobs when it comes to rape, so it’s a fair system for all 19-year-olds.

Aunt Pythia

——

Please submit your well-specified, fun-loving, cleverly-abbreviated question to Aunt Pythia!

Click here for a form.

Categories: Aunt Pythia

De-anonymizing what used to be anonymous: NYC taxicabs

Thanks to Artem Kaznatcheev, I learned yesterday about the recent work of Anthony Tockar in exploring the field of anonymization and deanonymization of datasets.

Specifically, he looked at the 2013 cab rides in New York City, which were released in response to a FOIL request, and he stalked celebrities Bradley Cooper and Jessica Alba (and discovered that neither of them tipped the cabbie). He also stalked a man who went to a slew of NYC titty bars: he found out where the guy lived and even got a picture of him.

Previously, some other civic hackers had identified the cabbies themselves, because the original dataset had scrambled the medallions, but not very well.
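To give a flavor of why the scrambling failed (the released identifiers were reportedly run through a plain, unsalted hash; the details below are my reconstruction, not the civic hackers’ actual code): when the space of possible identifiers is tiny, you can simply hash every candidate and build a reverse lookup table.

```python
import hashlib
import string
from itertools import product

def md5_hex(s):
    return hashlib.md5(s.encode()).hexdigest()

# Medallion numbers follow a handful of short formats (e.g. digit, letter,
# digit, digit), so the candidate space is small enough to hash in seconds.
def build_reverse_table():
    table = {}
    digits, letters = string.digits, string.ascii_uppercase
    for d1, l, d2, d3 in product(digits, letters, digits, digits):
        medallion = f"{d1}{l}{d2}{d3}"
        table[md5_hex(medallion)] = medallion
    return table

reverse = build_reverse_table()
# Any "anonymized" hash in the released data can now be looked up directly:
print(reverse.get(md5_hex("5X55")))  # -> "5X55"
```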

The point he was trying to make was that we should not assume that “anonymized” datasets actually protect privacy. Instead, we should learn to use more thoughtful approaches to anonymization, and he proposes a method called “differential privacy,” which he explains here. It involves adding noise to the data, in a certain way, so that in the end no individual risks too much of their own privacy by being included in the dataset versus not being included.
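For a taste of what “adding noise in a certain way” means, here is a minimal sketch of the Laplace mechanism, the textbook construction for releasing a differentially private count (this is the standard recipe, not necessarily what Tockar did with the taxi data): the noise scale is the amount one person can change the answer (the sensitivity) divided by the privacy budget epsilon.

```python
import numpy as np

def dp_count(true_count, epsilon, sensitivity=1.0):
    """Release a count with Laplace noise: one person joining or leaving the
    dataset changes the count by at most `sensitivity`, so noise of scale
    sensitivity / epsilon masks any single person's contribution."""
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# e.g. "how many pickups happened at this corner after midnight?" released
# under a fairly strict privacy budget:
print(dp_count(true_count=42, epsilon=0.1))
```

A smaller epsilon means more noise and stronger privacy; a larger epsilon means the released number is closer to the truth but leaks more about individuals.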

Bottom line: it’s actually pretty involved mathematically, and although I’m a nerd and it doesn’t intimidate me, it does give me pause. Here are a few concerns:

  1. It means that most people, for example the person in charge of fulfilling FOIL requests, will not actually understand the algorithm.
  2. That means that, if such a procedure is required, that person will have to rely on and trust a third party to implement it. This leads to all sorts of problems in itself.
  3. Just to name one, depending on what kind of data it is, you have to implement differential privacy differently. There’s no doubt that a complicated mapping of datatype to methodology will be screwed up when the person doing it doesn’t understand the nuances.
  4. Here’s another: the third party may not be trustworthy and may have created a backdoor.
  5. Or they just might get it wrong, or do something lazy that doesn’t actually work, and they can get away with it because, again, the user is not an expert and cannot accurately evaluate their work.

Altogether I’m imagining that this is at best an expensive solution for very important datasets, and won’t be used for your everyday FOIL requests like taxicab rides unless the culture around privacy changes dramatically.

Even so, super interesting and important work by Anthony Tockar. Also, if you think that’s cool, take a look at my friend Luis Daniel‘s work on de-anonymizing the Stop & Frisk data.

Bad Paper by Jake Halpern

Yesterday I finished Jake Halpern’s new book, Bad Paper: Chasing Debt From Wall Street To The Underground.

It’s an interesting series of close-up descriptions of the people who have been buying and selling revolving debt since the credit crisis, as well as of the actual business of debt collecting. He talks about the very real problems, for debt collectors, of having no proof of the debt, of having other people who have stolen your debt trying to collect on it at the same time, and of course of the fact that some debt collectors resort to illegal threats and misleading statements to get debtors – or possibly ex-debtors, it’s never entirely clear – to pay up or suffer the consequences. It’s an arms race of quasi-legal and illegal cultural practices.

Halpern does a good job explaining the plight of the debt collectors, including the people hired for the call centers. It’s the poor pitted against the poorer here, a dirty fight where information asymmetry is absolutely essential to the profit margin of any given tier of the system.

Halpern outlines those tiers well, along with the interesting lingo created by this subculture centered, at least until recently, in Buffalo, New York. People at the top are the credit card companies themselves or hedge funds that buy from credit card companies; in other words, people who get “fresh debt” lists in the form of Excel spreadsheets, where the people listed have recently stopped paying and might still have some resources to pull from. Then there are people who deal in older debt, which is harder to collect on. After that are people who hold yet older debt, which may or may not be stolen, so other collectors might simultaneously be picking over the carcasses. At the very bottom of the pile, from Halpern’s perspective, come the lawyers. They bring debtors to court and try to garnish wages.

Somewhat buried at the very end of Halpern’s book is some quite useful information for debtors. For example, if you ever get dragged to court by a debt collection lawyer:

  1. definitely show up (or else they will just garnish your wages)
  2. ask for proof that they own the debt and for documentation of the charges behind it. They will likely not have such documentation, and the case will be dismissed.

Overall Bad Paper is a good book, and it explains a lot of interesting and useful information, but from my perspective, being firmly on the side of (most of) the debtors, everyone who gets a copy of the book should also get a copy of Strike Debt’s Debt Resistors’ Operation Manual, which has way more useful information, and even form letters, for the debtor.

As far as real solutions go, we see the usual problems: underfunded and impotent regulators at the FTC, the CFPB, and the Attorney General’s office, as well as ridiculously small fines, when violators are actually caught, that amount to a fraction of the profit already made through illegal tactics. Everyone is feasting, even when they don’t find much meat on the bones.

Given how big a problem this is, and how many people are being pursued by debt collectors, you’d think we might set up a system of incentives so lawyers could make money by nailing illegal collection tactics instead of just leveraging outdated information and trying to squeeze poor people out of their paychecks.

The bigger problem, once again, is that so many people are flat broke and largely go into debt for things like emergency expenses. And yes, of course there are people who buy a bunch of things they don’t need and then refuse to pay off their debts – Halpern profiles one such person – but the vast majority of the people we’re talking about are the struggling poor. It would be nice to see our country become a place where we don’t need so much damn debt in the first place; then the scavengers wouldn’t have so many rubbish piles to live off of.

Categories: #OWS, economics, journalism

Upcoming data journalism and data ethics conferences

Today

Today I’m super excited to go to the opening launch party of danah boyd’s Data and Society. Data and Society has a bunch of cool initiatives, but I’m particularly interested in their Council for Big Data, Ethics, and Society. They were the people who helped make the Podesta Report on Big Data as good as it was. There will be a mini-conference this afternoon that I’m looking forward to very much. Brilliant folks doing great work and talking to each other across disciplinary lines; can’t get enough of that stuff.

This weekend

This coming Saturday I’ll be moderating a panel called Spotlight on Data-Driven Journalism: The job of a data journalist and the impact of computational reporting in the newsroom at the New York Press Club Conference on Journalism. The panelists are going to be great:

  • John Keefe @jkeefe, Sr. editor, data news & J-technology, WNYC
  • Maryanne Murray @lightnosugar, Global head of graphics, Reuters
  • Zach Seward @zseward, Quartz
  • Chris Walker @cpwalker07, Dir., data visualization, Mic News

The full program is available here.

December 12th

In mid-December I’m on a panel myself at the Fairness, Accountability, and Transparency in Machine Learning Conference in Montreal. This conference seems to directly take up the call of the Podesta Report I mentioned above, and seeks to provide further research into the dangers of “encoding discrimination in automated decisions”. Amazing! So glad this is happening and that I get to be part of it. Here are some questions that will be taken up at this one-day conference (more information here):

  • How can we achieve high classification accuracy while eliminating discriminatory biases? What are meaningful formal fairness properties?
  • How can we design expressive yet easily interpretable classifiers?
  • Can we ensure that a classifier remains accurate even if the statistical signal it relies on is exposed to public scrutiny?
  • Are there practical methods to test existing classifiers for compliance with a policy?