For a few years in the mid-1990’s, I was the geek in the corner of the newsroom—the in-house “computer-assisted reporter” at The Times-Picayune in New Orleans. Many papers big and small had somebody like me: a reporter, often self-trained or schooled at camps such as those offered by Investigative Reporters and Editors, who spent his days gathering massive databases from government agencies and working them into stories. It was commonly called “computer-assisted project journalism,” or given some other self-inflating moniker, and produced some stories that were both good and very, very big.
In the world of computer-assisted journalism, I had gone from zero to 100, starting with almost no knowledge of databases and spreadsheets, and soon tackling projects with records in the millions. My biggest took weeks to complete (okay, maybe months) and started with 60 million Louisiana Medicaid claims records. Over several painstaking weeks, I whittled those 60 million records down to a few million records relating to doctors, a few million others relating to hospitals, a few million more relating to nursing homes. Given the state of technology at the time, I sometimes started a database search when I was preparing to leave at night, hopeful that it would be completed by the time I got back to my desk in the morning. Other times, I had six computers, side-by-side, all tied up in the newsroom.
The end result of this number crunching was a series of stories that revealed doctors who were billing for 30 hours of care per day and hospitals that were billing for millions of dollars of care that they were ineligible to provide.
Technology is a lot better today, and I know a bit more about computers—meaning a major project like that would take far less time. But for most of the past several years, I covered a beat for The Wall Street Journal—and, as one of many such beat reporters, I simply didn’t have the time to dive into massive computer-assisted projects. While at the Journal, I did complete a few big computer-assisted reporting (CAR) stories—although none approached the scope of earlier work.
In April, I moved onto another job, as an investigative reporter for the Knight Ridder news bureau in Washington, D.C., which should give me the opportunity to do the big CAR stories again. But what I realized during my years at the Journal is that CAR tools can be used in the smallest of ways in regular beat coverage.
This, of course, isn’t news to many, many reporters who incorporate databases and spreadsheets into coverage of the financial markets, or government budgets, or campaign finance. But there are plenty of reporters—I’d guess more than half—who have yet to work their way around an Excel spreadsheet, which is actually a simple-to-use tool that can free them from relying on a source for analysis, thus allowing them to see their own patterns emerge or to find entirely new ones.
Using Technology to Get Answers
Hardly a day goes by that I don’t open an Excel spreadsheet for some story or another. Often, it’s to manage a list of information already available on the Web. The Food and Drug Administration (FDA), for example, was part of my Journal beat, and that agency dumps huge masses of information onto its Web site. While much of it is valuable, it’s also hard to analyze for the kind of patterns that might make for a story.
An example: The FDA monitors the drug advertisements that companies run for their products, making sure they don’t overstate a drug’s effectiveness or downplay its risks. They write violation letters to companies that break the rules, and they list hundreds of these letters on their Web site, going back several years. I had noticed some companies getting slapped more than once with violation letters, so I wanted to see which companies and which drugs had been cited the most.
How to do this? I could print out all the information and manually scroll through each one, counting them up with little marks. But there are hundreds of drugs out there, so the risk of missing one in such a count is real. And it would also take several mind-numbing hours. What I wanted was the actual, computerized list from the FDA that should detail each violation letter along with its date, company and drug. Since a form of that list was on the Web site, it had to exist in another form in somebody’s computer at the agency, and I could have put in a Freedom of Information Act (FOIA) request to get it. But I could be well into retirement before the FDA might actually fulfill an FOIA request (it has one of the slowest response times in the federal government).
A simple Excel tool let me bypass that process entirely. With a couple of clicks and drags, I picked up each yearly list of violation letters and pasted them into a single Excel spreadsheet file. From there, it’s easy to sort them in whichever way you want. After alphabetizing them by drug name, for example, I could see that several had been cited many times. By telling Excel to perform a simple count of how often each drug name occurred, I could see that one drug had been cited 14 times by the FDA, another 11 times.
Companies, it seems, were running ads that pushed the limits, hoping they wouldn’t be caught. If they were, they dumped that ad and replaced it with another that also went too far. FDA authority was weak, making it difficult to prevent companies from using such a strategy. Once I had the basic numbers, I could zero in on which letters I wanted to completely review and which companies I wanted to interview. It was a good little enterprise story—nothing major, certainly, but interesting enough to be given nice play in the paper and be picked up by competitors.
The most interesting thing is how simple it was to complete. Doing the analysis of drug-ad violations by hand would have taken several hours, maybe a day, and the risk of messing up was real. Waiting for the FDA to respond to the FOIA request would have taken—who knows? Clicking and pasting into an Excel spreadsheet took minutes, and even after running and re-running and running again the analysis (to make sure I hadn’t botched it), the whole chore took at most a couple of hours.
There were other simple CAR stories I did at the Journal. This year I performed a count of the thousands of “warning letters” the FDA has sent in the past 10 years. What I found is that the number it sends to companies is down drastically this year, which we explained was due to Bush Administration changes. Similarly, while covering the steel industry, an examination of import records showed that many of the steelmakers that complained about steel imports were, in fact, importing themselves—and we could say, down to the pound, how much of hot-rolled, cold-rolled, and slab they had brought in. The computer work in both stories took at most a half day, but formed the backbone for solid stories.
The key to doing such stories is to think small and to take a simple class in how to use Excel. Such basic knowledge will allow reporters to go far beyond their current abilities and lay the groundwork for someday doing a major CAR project that might demand more powerful computer software. One day, I might decide I need to analyze 60 million Medicaid records and, if my editors don’t laugh me out of the room when I say I need a few months to do it, I know I’ll be able to. But until then, there are plenty of other small stories waiting to be discovered.
Chris Adams is a reporter with the Knight Ridder Washington, D.C. bureau.