Strata and swag

October 1, 2015

Yesterday I gave a 5-minute lightning talk at a corporate big data conference here in New York called Strata+Hadoop World, put on by O’Reilly and Cloudera.

My talk was part of a session run by DataKind, aimed at talking about the ethics of algorithms. My 5 minutes were taken up discussing 5 ideas:

  • In order to do good with data, first you have to not do bad. Data scientists aren’t trained to think through the ethics and social impact of their work, so this is non-trivial.
  • We haven’t actually figured out the difference between correlation and causation. That means, in the context of social algorithms, that we blame the victim constantly. Think about the HR algorithm that decides never to hire another woman engineer because it notices how badly women engineers fare in the workplace.
  • Or, we could take the example of the justice system, where we use recidivism algorithms to figure out that poor black people are more likely to be arrested, and we decide to punish them even more as a result, instead of asking why the justice system isn’t serving to help them as much as it helps white or rich people.
  • Or, we could take the example of teacher assessment, where we blame teachers for student test scores, even though they have little power over them.
  • Conclusion: data scientists are de facto policy makers. We shouldn’t be.

So, the talk I gave was sparsely attended, with maybe 40 people in the room (which is actually more than we expected). I was happy to see those people, and many of them were earnest and thoughtful, to be sure. Danah Boyd spoke in the second session, as usual very eloquently, and I felt like there were far too few people in the room compared to how many might benefit from hearing her.

But let’s face it, Strata is a celebration of big data in the corporate setting, and few people there were spending too much time fretting about ethics. It was dominated by its expo room, where dozens of data science platforms, all feeding the hype about the power of big data, were set up to sell you magical thinking. There were also a few groups doing good stuff, to be sure, but the overall feel was similar to how it felt back in 2011, except bigger.

Not to be cynical! There’s plenty of other stuff going on that wasn’t in 2011, so really it’s fine. And plus, I did manage to meet up with some colorful ladies:

Picture taken by my buddy Debbie Berebichez

and I picked up an enormous amount of Strata swag (more here) because teenage sons:

This one is the cutest. Most of the other t-shirts I got had silly puns.

If I had stayed longer I could have gotten plenty of free beer and food, not to mention more pens than I could ever use. There were even Lego data science characters, but to get those I would have had to stay and listen to the pitch, which was a dealbreaker for me.

Conclusion: Strata fills a niche not unlike the New York Coffee Festival. Almost completely frivolous but fun for the participants, as long as you don’t get caffeine poisoning.

  1. October 1, 2015 at 8:33 am

    I like the picture of the colorful ladies. Say hello to Debbie for me.

    At a meetup this week I got a Cloudera tee shirt that said: “Data is the new bacon.” What does that mean? That data is not kosher? I don’t get it.


    • October 1, 2015 at 8:38 am

      Maybe it’s like Kevin Bacon, everyone’s favorite co-star …


  2. October 1, 2015 at 8:36 am

    (sp) lightning

    (unless you meant it to be en-lightening, maybe …)


  3. Zathras
    October 1, 2015 at 9:32 am

    I am in the big data/corporate setting myself. One of the biggest time savers I’ve found in the last year has been to make myself the contact for big data vendors, instead of the higher-ups. What has happened in the past is that these big data vendors make pitches to higher execs where they lie, flat-out lie, about how quickly and automatically their platform turns data into money for the company. I then have to spend hours upon hours debunking those lies. Now, I can debunk those lies at the initial pitch, which saves me an enormous amount of time.


  4. Aaron Lercher
    October 1, 2015 at 1:31 pm

    This makes me look forward to reading your book, Cathy.

    It sounds like you didn’t have time to discuss your challenging argument:
    In the context of social analysis, correlation -> causation -> blame.

    Too bad there wasn’t time for discussion.

    One response might be: “I’m not a policy maker. I just do my job. It’s not my job to blame anyone. I write reports that are very cautious about inference, so I avoid the path you are warning against.”

    Maybe that’s what you are hoping to hear, but I suspect not. If that’s the conclusion, it would not affect anyone’s behavior or judgment, except maybe making reports more rigorous.

    It’s possible that the conclusion you want for your argument is that data scientists *are* policy makers, whether they want it or not, unless they quit their jobs. This conclusion would seem to obligate data scientists to make judgments, and to work for good agendas and policies, not bad ones.

    That conclusion is a tough sell, but I suspect it’s really the conclusion you want.


  5. October 4, 2015 at 1:06 am

    I was bummed I couldn’t attend your talk because I was speaking at the same time. Am looking forward to seeing the video from your talk though – such a critical topic and especially for Strata folks (even if they don’t realize it).

