Navigating the mindset for data journalism
I’ve been working my butt off this summer starting up a data journalism program and teaching in it. I couldn’t ask for a better crew of students and instructors: engaged, intelligent, brave, and eager to learn. And my class has been amazing, due to the incredible guest speakers who have given their time to us. On Tuesday we were honored to have danah boyd come talk about her new book It’s Complicated, and yesterday Julie Steele talked to us about visualization and how our technological tools affect our design, which was fabulous and also super useful for the class projects.
I feel like it’s the picture-perfect situation for the emerging field of data journalism to be defined and developed. Even so, there are real obstacles to getting this right that I hadn’t anticipated. Let me focus on obstacles that exist within the academy, since that’s what I’ve been confronting these past few weeks and months.
Basically, as everyone knows, academia is severely partitioned between departments, both physically and culturally. Data journalism sits more or less between journalism and computer science, and both of those fields have cultures that are unintentionally hostile to a thriving new descendant. Let me exaggerate for effect, which is what I do.
In cartoonish form, introductory computer science classes are competitive weeder classes that promote a certain kind of narrow, clever, problem-solving approach. If you get your code to work, and work fast, you’re done, and you move quickly to the next question because there’s an avalanche of work and technical issues to plow through.
You don’t get that much time to think, and you almost never address the question of how to do things differently, or why syntax is inconsistent between different parts of Python. Nor do you ask why a computer language is the way it is, how it could have been designed differently, or what history made it so, because you don’t have time and you have to learn learn learn. In other words, it’s kind of the least context-laden and most content-heavy way of learning that you can imagine. You impress people by what you can make work, and how fast, and it is a deep but narrow way of working, kind of like efficient well-digging.
Now let’s paint an equally exaggerated vision of the journalist training. A good journalist collects a ton of information to create a kind of palette for the topic in question, and dives straight into ambiguity or history or bias or contradiction to learn even more, and then starts to build a thesis after such comprehensive information collection has occurred. In other words, the context is what makes a topic interesting and important and newsworthy, and the human and gripping example is critical to illustrate the topic as well as to make it into a story rather than a set of facts. You impress people by your ability to synthesize an incredible breadth of knowledge and then find the hook that makes it a compelling story and draw it out and make it real. This is a broad filtering method where you don’t take the next step until you know you should.
To make it even more dumbed down, journalists are ever aware of the things they know they don’t know, and desperately want to fill in their knowledge gaps because otherwise they feel fraudulent, like they’re jumping to unwarranted conclusions. Computer scientists don’t care about not knowing things as long as their programs work. They can be blithe with respect to messy human details, which of course means they sometimes don’t notice or figure out their data has selection bias because they got an answer, but also means they are super efficient.
Now you can see why it’s a tough thing to teach journalists to code, and it’s also a tough thing to expect coders to become journalists. Both sides emphasize a kind of learning and a definition of success that the other side is blind to.
What would a middle ground look like? In the ideal scenario, it would be a place that appreciates and uses the power of data and programming but also spends the time learning the history and searching out the inherent human bias of data collection and analysis. That scenario is exciting, but it clearly takes time to build and represents a real investment both by the academic institutions that build it and by the media that eventually hire the data journalists coming from it.
In other words, the outside world has to actually want to hire the emerging thoughtful fruit of that labor. That brings me to other problems for data journalism that largely live outside the academic world, which I might blog about at some other time.