Can Data Mining Catch Terrorists?

New theories on how to look at data point to ways to catch terrorists, but it's no easy task

When Gen. Michael Hayden faced Congress last week for a pre-confirmation grilling as President Bush's nominee to lead the Central Intelligence Agency, he started by calling intelligence gathering "the football in American political discourse" since the 9/11 terrorist attacks. Then, when pressed about a megadatabase of phone records of U.S. citizens allegedly compiled under his watch at the National Security Agency, Hayden punted.

The nominee declined to discuss the sensitive issue in open session or otherwise address the wide--and sometimes wild--speculation about how much phone data the feds have collected and what they're doing with it. The issue got new life in a May 11 story in USA Today, which reported that AT&T, BellSouth, and Verizon had turned over phone call records of tens of millions of Americans to populate the NSA database. The purpose of the data collection, according to USA Today, is to identify potential terrorist activity. But privacy advocates teed off on how such a database might be misused.

The technology certainly exists to assemble a massive phone-records database, but it's not clear whether the NSA has the volume of data it would need to get a complete picture of terrorist activity or the data mining algorithms necessary to tell the difference between calls among friends and those among terrorists. Businesses and government agencies routinely mine multiterabyte databases to create meaning out of minutiae. But the stealth nature of the terrorism business would make connecting the dots far harder.

Indeed, it's unclear just what data the NSA has in hand. BellSouth and Verizon both denied sharing phone records in bulk with the intelligence agency; AT&T was vague about its participation; and Bush was noncommittal on whether such a database even exists.

Here's what we do know: The NSA is a sophisticated user of database technology--Larry Ellison has long said the NSA is one of Oracle's earliest customers. We also know that government agencies are intensely interested in data mining. A 2004 survey by the Government Accountability Office found that federal agencies were engaged in or planning 199 data mining projects, including 122 involving personal data. A database of phone records wouldn't be hard to create; the data exists.

We also know that terrorists make phone calls. After the 2001 attacks, the government determined that the 19 terrorists had made 206 international calls from the United States, according to press reports. A logical step for data analysts would be to search through phone records to see if there are other networks of people whose calls followed similar patterns.

Social Connections

In data mining, the practice of looking for underlying connections between people is called social network analysis. Phone data is useful because it helps expose relationships and associations among different groups. With social network analysis, contacts are commonly laid out graphically to illustrate connections and find patterns. At the simplest level, this could be shown as links similar to the spokes of a wheel leading to one source, indicating that a person holds a leadership position within a terrorist cell. Looking deeper, it could uncover relationships, such as two suspected terrorists linked only through a third, unknown person.
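
To make the two patterns concrete, here is a minimal sketch in Python of how an analyst might look for them in call records. It is an illustration only, not a description of any agency's actual method: the call records, the single-letter names, and the helper functions build_graph, hub, and intermediaries are all hypothetical, and "most distinct contacts" is just one simple stand-in for a hub.

```python
# Hypothetical call records as (caller, callee) pairs. All names are invented.
CALLS = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),  # A sits at the center of a wheel
    ("B", "X"), ("X", "F"),                          # B and F are linked only through X
]

def build_graph(calls):
    """Turn call records into an undirected contact graph: person -> set of contacts."""
    graph = {}
    for caller, callee in calls:
        graph.setdefault(caller, set()).add(callee)
        graph.setdefault(callee, set()).add(caller)
    return graph

def hub(graph):
    """Return the person with the most distinct contacts (ties broken alphabetically)."""
    return max(sorted(graph), key=lambda person: len(graph[person]))

def intermediaries(graph, p, q):
    """Return the people who talk to both p and q when p and q never call each other."""
    contacts_p = graph.get(p, set())
    contacts_q = graph.get(q, set())
    if q in contacts_p:
        return set()  # directly connected, so no go-between is needed
    return contacts_p & contacts_q

graph = build_graph(CALLS)
print(hub(graph))                       # the center of the wheel
print(intermediaries(graph, "B", "F"))  # the third person linking B and F
```

In this toy data set, A is the hub with four contacts, and X is the only link between B and F. Real phone data would be vastly larger and noisier, and a busy hub is as likely to be a pizza shop as a cell leader, which is part of why the task is so hard.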
