In the March issue of DTI, Sharon Weinberger takes a look at how various branches of the federal government are teaming up with the defense industry to invent new ways to read and analyze your Tweets and Facebook posts, looking for patterns that may—or may not—be there.
“The D.C. earthquake marked a new high watermark, which was 7,500 tweets per second,” says Sean Love, the geospatial business development director for Northrop Grumman. “The people in New York actually knew about the earthquake via Twitter before they felt the vibration; so the news passed faster than the tremors through the ground.”
For many working in the U.S. military and the intelligence community (IC), this realization—that Twitter and other forms of social media can provide an early warning indicator, like radar spotting an aircraft—is spurring a sudden interest in research to tap this new information revolution. The departments of Homeland Security and Defense, and the IC, have all started projects designed to harness social media to provide early alerts of potential crises, the next Arab Spring or military conflict.
Mining information from microblogs such as Twitter, and from other forms of social media ranging from Facebook posts to Wikipedia entries, is not entirely new. Commercial companies have been moving swiftly into social media monitoring for several years, generating algorithms to predict everything from fast-moving clothing trends to movie ticket sales. The field, often referred to as predictive analytics, involves mining large volumes of public data to make forecasts.
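At its simplest, the predictive-analytics idea described above amounts to fitting a model to counts mined from public data and extrapolating forward. The sketch below is illustrative only: it fits a least-squares line to invented daily mention counts and projects one day ahead. Real systems use far richer models and far more data.

```python
def linear_forecast(counts):
    """Fit a least-squares line through (day, count) pairs
    and return the estimate for the next day."""
    n = len(counts)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(counts) / n
    # Standard least-squares slope and intercept.
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, counts))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return slope * n + intercept  # extrapolate to day n

# Invented daily mention counts for some topic being monitored.
daily_mentions = [120, 150, 170, 220, 260]
print(round(linear_forecast(daily_mentions)))  # → 289
```

A rising trend line like this is the crudest possible "early warning" signal; the point is only to show the shape of the approach, not any vendor's method.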
For the past two years, Northrop has been working on its own proprietary technology that it describes as an “open source ingest engine,” which collects publicly available data, such as Twitter posts. “One of the nastiest data problems that we have is open-source data,” says Northrop’s Love, pointing to the deluge of data available on the Web, ranging from simple Google searches to the so-called “deep web.”
The company’s open-source exploitation system crawls through all the available online data on a particular subject, and then performs an automated triage, sorting and narrowing data. One test case used by Northrop to demonstrate its technology concerns the Zetas drug cartel in Mexico. The company collected data from 300 websites, ranging from those of the U.S. and Mexican governments to social media. “We’re able to extract out people, organizations, places, etc., and start to make some sense and curate the data as it’s being ingested,” says Steve Relitz, principal investigator for the Northrop project.
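The extraction step Relitz describes—pulling people, organizations and places out of ingested text—is a named-entity recognition task. The toy sketch below shows the idea using simple gazetteer lookups; the entity lists are invented for illustration, and nothing here reflects Northrop's proprietary system, which would rely on trained models and curated data at far larger scale.

```python
import re
from collections import defaultdict

# Hypothetical gazetteers for illustration only; a production system
# would use trained named-entity models and much larger curated lists.
GAZETTEERS = {
    "organization": {"Zetas"},
    "place": {"Mexico", "Nuevo Laredo"},
}

def extract_entities(text):
    """Return gazetteer terms found in text, grouped by entity type."""
    found = defaultdict(set)
    for etype, terms in GAZETTEERS.items():
        for term in terms:
            # Word-boundary match to avoid firing on partial words.
            if re.search(r"\b" + re.escape(term) + r"\b", text):
                found[etype].add(term)
    return dict(found)

sample = "Reports tie the Zetas to violence in Nuevo Laredo, Mexico."
print(extract_entities(sample))
```

Running this over each ingested document and accumulating the results is the essence of the "curate the data as it's being ingested" step, however much more sophisticated the real pipeline is.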
Read the whole thing here.