By analyzing the news, researchers can predict natural disasters
Researchers have developed software that predicts when and where disease outbreaks may occur, based on more than two decades of New York Times articles and other Internet data, Mashable reports. The software was developed by researchers at Microsoft and the Technion - Israel Institute of Technology.
The system shows strong results when tested on historical data. For example, reports of a drought in Angola in 2006 triggered a warning about a possible cholera outbreak in the country, because previous events had taught the system that cholera outbreaks were more likely in the years after a drought. A second cholera warning for Angola was triggered by news of storms in Africa in early 2007; less than a week later there were reports that cholera had actually spread in the region. In similar trials involving the prediction of disease, violence, and a significant number of deaths, the system's warnings were correct in 70–90% of cases.
In the future, the system could help humanitarian organizations deal better with outbreaks of disease or other problems, says Eric Horvitz, a scientist and co-director of Microsoft Research. Horvitz conducted the study in collaboration with Kira Radinsky, a researcher at the Technion - Israel Institute of Technology.
According to Horvitz, the system's current performance is good enough to suggest that an improved version could be used in real conditions. The system was developed using 22 years of the New York Times news archive, from 1986 to 2007, and it also uses data from the Web to learn what leads to notable events.
“One of the sources we found useful was DBpedia, in which crowdsourcing provides information from Wikipedia in a structured form,” says Radinsky. “We can understand or see the location of places in news articles, how much people earn there, and even information about politics.” Other sources included WordNet, which helps the system understand the meaning of words, and OpenCyc, a general knowledge database.
All of these sources provide valuable context that is not available in the news itself and that is necessary for learning general rules about which events precede others. For example, the system can infer a link between events in cities in Rwanda and Angola, based on the fact that both are African countries with similar GDP and other shared characteristics. This approach led the system to conclude that, in predicting cholera outbreaks, one should take into account the location of a country or city, the share of its area covered by water, population density, GDP, and whether there was a drought in the previous year.
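The idea of learning which events tend to precede others, and generalizing across places with similar characteristics, can be illustrated with a minimal Python sketch. This is not the researchers' method, which the article does not describe in detail. The record fields, the toy data, the `outbreak_rate` and `should_warn` helpers, and the 0.5 threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionYear:
    """One hypothetical observation: a region in a given year."""
    region: str
    dense_population: bool
    low_gdp: bool
    drought_previous_year: bool
    cholera_outbreak: bool


def outbreak_rate(records, **conditions):
    """Share of matching records in which an outbreak occurred.

    Returns None when no record matches the conditions.
    """
    matching = [
        r for r in records
        if all(getattr(r, name) == value for name, value in conditions.items())
    ]
    if not matching:
        return None
    return sum(r.cholera_outbreak for r in matching) / len(matching)


def should_warn(records, threshold=0.5, **conditions):
    """Raise a warning if the historical outbreak rate reaches the threshold."""
    rate = outbreak_rate(records, **conditions)
    return rate is not None and rate >= threshold


# Invented toy history; the values carry no real-world meaning.
HISTORY = [
    RegionYear("A", True, True, True, True),
    RegionYear("A", True, True, False, False),
    RegionYear("B", True, True, True, True),
    RegionYear("B", True, True, True, False),
    RegionYear("C", False, False, True, False),
    RegionYear("C", False, False, False, False),
]

# Pool regions that share a characteristic (low GDP) instead of treating
# each region separately, then compare years that did and did not follow a drought.
rate_after_drought = outbreak_rate(
    HISTORY, low_gdp=True, drought_previous_year=True
)
rate_without_drought = outbreak_rate(
    HISTORY, low_gdp=True, drought_previous_year=False
)
```

Pooling records by shared attributes rather than by region name is what lets a rule observed in one place raise a warning in a similar place. A real system would need far more data and a proper statistical model instead of raw frequencies.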
The idea of predicting disease outbreaks is not new, nor is the concept of data mining for forecasting, but the scope of this project potentially makes it very useful. Because the system can successfully correlate events and generalize from the data well enough to produce useful results, it could be applied in a variety of fields.