In an effort to further develop my data journalism skills I’ve been exploring communities of practice in search of technical help, feedback, advice and inspiration. As a journalist who doesn’t code (yet) I’m finding it daunting to infiltrate a field dominated by programmers and already cross-skilled reporters whose expertise level surpasses mine. It’s useful to read online forums, but difficult to contribute when the ethos of the group requires posting specific, practical questions and answers about issues beyond my skill set.
I decided to try the direct route and emailed Michelle Minkoff after reading her tweets and her posts on the NICAR discussion forum. She seemed approachable, and I could relate to her description of data-driven storytelling at PBS:
We’re pioneering the concept of DataStories, which combine the visual power of data visualizations with the structured organization traditionally associated with data applications, and add a layer of editorial contextualization to enable Web users to learn something new about their world that is most relevant to them.
Michelle was kind enough to respond with excellent advice, which I’m passing along here for other people who may be in the same position as I am. I’ve stripped out most of the specific feedback about my work, but if you’re interested in the context, you can see it here, here and here.
Showing people what to look at
From your work, it sounds like you’ve been concentrating a lot on tools. Your experimental portfolio does a really nice job of walking through different story forms, but in addition to changing forms due to technical limitations, try different forms and see which fits the story best. To understand more about how this process works, and gain more background on using data viz to tell the story, as opposed to adding it on as a sidecar (all too common in the journalism world these days), I would suggest watching Amanda Cox’s talk here (http://nmd.arkena.tv/012898356641464/go-figure). It’s very inspiring and has a lot of great information. Cox is a member of the graphics team at the New York Times, but comes from a statistics background (she previously worked at the Bureau of Labor Statistics, analyzing economic data and summarizing for the public), and she’s exceptional at thinking about the best way to visualize the story that the data tells.
Data provided in technical forms
You’ve done a great job using a variety of non-technical tools to perform data journalism tasks. When you mention that you really need a drive to get this done, you hit the nail on the head. Whether it’s finding tools, figuring out what the heck XML is, or figuring out how to do Web programming, none of these skills are magic. But you do have to keep pushing at them. I would think you’d be hard pressed to encourage a data source to make its data more accessible to journalists without technical backgrounds. The technical formats are actually extremely helpful in terms of enabling a journalist to analyze the data, once you figure out how they work. We’re lucky that so much data is available, and able to be manipulated. I wish I had an easy answer for you, but you just have to figure it out. Open up a file and look for repeatable patterns. Structured data should have a pattern, if it’s good data. You can open an XML file right in a Google spreadsheet, and maybe even JSON, too. But even just start looking at the files in TextEdit (Mac) or Notepad (Windows). Start to figure out what the rows and columns would be, and try to wrap your head around how it works.
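To make the "rows and columns" idea concrete, here is a minimal sketch in Python. The sample XML, the tag names and the helper xml_to_rows are all invented for illustration and are not from Michelle’s email. It assumes the file repeats one element per record, which is the kind of pattern she suggests looking for.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, made up for this example.
SAMPLE_XML = """
<budget>
  <item>
    <category>Parks</category>
    <amount>1200000</amount>
  </item>
  <item>
    <category>Libraries</category>
    <amount>800000</amount>
  </item>
</budget>
"""


def xml_to_rows(xml_text, record_tag):
    """Turn each repeated <record_tag> element into one row (a dict)."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for record in root.iter(record_tag):
        # Each child element acts as a column: tag name -> text value.
        rows.append({child.tag: (child.text or "").strip() for child in record})
    return rows


rows = xml_to_rows(SAMPLE_XML, "item")
for row in rows:
    print(row)
```

Each repeated item element becomes a row, and the tags inside it (category, amount) become the column names. Note that the values come back as text, so amounts would still need converting to numbers before doing any math.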
What to learn next
In terms of data analysis, your next big step should probably be figuring out a way to ask questions of the data. In a city budget, which category is getting the most money? Which category has the biggest change from last year? The biggest dollar/pound/other currency change? The biggest percent change? A great book that will introduce you to simple math that will help you find stories is Sarah Cohen’s Numbers in the Newsroom. Learn how to manipulate data, either by starting to understand formulas in Excel, or by looking into something called SQL. SQL will get you into the world of how to query, or ask questions of, a database. So, if you have a list of many salaries, and want a list of people receiving more than $300,000, you would write something like: SELECT * FROM salaries WHERE salary > 300000;
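The budget questions above come down to a little arithmetic, which can be sketched in Python as follows. The categories and figures are hypothetical, invented only to show that the biggest change in dollars and the biggest change in percent can point to different categories.

```python
# Hypothetical budget figures: category -> (last year, this year).
budget = {
    "Police": (5_000_000, 5_500_000),
    "Parks": (1_000_000, 1_300_000),
    "Libraries": (800_000, 760_000),
}


def changes(last_year, this_year):
    """Return the absolute change and the percent change from last year."""
    absolute = this_year - last_year
    percent = absolute / last_year * 100
    return absolute, percent


# Which category is getting the most money this year?
most_money = max(budget, key=lambda c: budget[c][1])

# Which category changed the most in currency terms (up or down)?
biggest_absolute = max(budget, key=lambda c: abs(changes(*budget[c])[0]))

# Which category changed the most in percent terms (up or down)?
biggest_percent = max(budget, key=lambda c: abs(changes(*budget[c])[1]))

print("Most money:", most_money)
print("Biggest currency change:", biggest_absolute)
print("Biggest percent change:", biggest_percent)
```

With these made-up numbers, Police has both the largest budget and the largest change in dollars, while Parks has the largest percent change. That gap between the two measures is often where a story sits.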
If you’d like to look into that, the easiest database to get started with is SQLite, which also comes as a Firefox extension, so you can work with it right in your browser: https://addons.mozilla.org/en-US/firefox/addon/sqlite-manager/ Working through the tutorials here should give you a handle on the actual SQL language (SQL stands for Structured Query Language): http://sqlzoo.net/
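For anyone who would rather try the salary query without installing anything extra, here is a minimal sketch using the sqlite3 module that ships with Python. The table layout and the names and salaries in it are assumptions made up for this example; only the shape of the SELECT statement follows Michelle’s advice.

```python
import sqlite3

# An in-memory SQLite database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, invented for illustration.
conn.execute("CREATE TABLE salaries (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO salaries VALUES (?, ?)",
    [
        ("Alice Example", 320000),
        ("Bob Example", 95000),
        ("Carol Example", 410000),
    ],
)

# Ask the question: who receives more than $300,000?
high_earners = conn.execute(
    "SELECT * FROM salaries WHERE salary > 300000 ORDER BY salary DESC"
).fetchall()
conn.close()

for name, salary in high_earners:
    print(name, salary)
```

The ORDER BY clause is an addition here, used so the highest salary comes first. Dropping it leaves the same query as the one in the text.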
As for visualization, start looking into some things called Google Chart Tools (static graphs, http://code.google.com/apis/chart/) and Google Visualization API (interactive graphs; http://code.google.com/apis/visualization/documentation/gallery.html). They do require some coding, but reading examples and following them will teach you a lot about visualization. Both tools have something called a Chart Playground or Chart Wizard, which lets you adjust examples to see how the charts work. http://code.google.com/apis/chart/docs/chart_wizard.html I still use these tools almost every day, even though I can now “code for real” — whatever that means.
Thank you Michelle!