Nipping at the heels of yesterday’s story about the software that automatically writes news articles comes another technological innovation changing the shape of journalism: software that reads news articles.
Kalev Leetaru of the University of Illinois found that using the Nautilus SGI supercomputer to analyze news stories can help predict major world events. His experiment was retrospective: he fed the computer millions of articles, from which it detected a deteriorating national sentiment towards Libya and Egypt before the revolutions in those countries. The system was also able to narrow down Osama Bin Laden’s location to within 125 miles before he was found and killed last May.
More than 100 million articles were gathered for this study from various sources, including the New York Times archive and two organizations that monitor local media output worldwide, the Open Source Center and BBC Monitoring. The system searched for two primary things in the articles: mood and location. Words such as “nice” or “horrible” were used to measure mood, and geocoding converted mentions of places such as “Cairo” or “Pakistan” to plottable coordinates.
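To make the two measurements concrete, here is a minimal Python sketch of word-list mood scoring and place-name lookup. It is an illustration only, not Leetaru's actual method: the word lists, the `tone_score` and `geocode` function names, and the small gazetteer with approximate coordinates are all assumptions invented for this example.

```python
import re

# Hypothetical mini word lists; the study's real lexicons are not described here.
POSITIVE = {"nice", "good", "peaceful"}
NEGATIVE = {"horrible", "terrible", "violent"}

# Hypothetical gazetteer: place name -> approximate (latitude, longitude).
GAZETTEER = {
    "cairo": (30.04, 31.24),
    "pakistan": (30.38, 69.35),
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tone_score(text):
    """Return (positive words - negative words) / total words, or 0.0 if empty."""
    words = tokenize(text)
    if not words:
        return 0.0
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return (positive - negative) / len(words)


def geocode(text):
    """Map each known place name mentioned in the text to its coordinates."""
    return {word: GAZETTEER[word] for word in tokenize(text) if word in GAZETTEER}


article = "A horrible day in Cairo"
print(tone_score(article))
print(geocode(article))
```

Averaging such scores over all articles that mention a place in a given month would give the kind of tone-over-time curve described in the article, though a real system would need far larger word lists and a way to tell apart places that share a name.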
For countries that experienced the “Arab Spring,” the supercomputer produced graphs that showed a noticeable decline in media sentiment both inside and outside each country. Before President Mubarak’s resignation, the tone of media coverage of Egypt fell to one of its lowest points in 30 years, signaling something that the U.S. government did not foresee. As Leetaru told BBC News, the U.S. president’s continued support of Mubarak showed that even high-level analysis suggested Mubarak wasn’t going anywhere. The graph, however, suggests otherwise.
Leetaru’s next step is developing technology to allow this system to forecast major world events, rather than just analyzing them after the fact. He compares it to economic forecasting algorithms, as well as meteorology, in that none of those systems (including his) are perfect, but using them is far better than just guessing.