Featured Research

from universities, journals, and other organizations

Finding real value in big data for public health

Date:
July 2, 2014
Source:
San Diego State University
Summary:
Media reports of public health breakthroughs from big data have been largely oversold, according to a new study. But don't throw away that data just yet. The authors maintain that the promise of big data can be fulfilled by tweaking existing methodological and reporting standards. In the study, the research team demonstrates this by revising the inner plumbing of the Google Flu Trends (GFT) digital disease surveillance system, which was heavily criticized last year after producing erroneous forecasts.

A graph depicting Google Flu Trends.
Credit: Image courtesy of San Diego State University

Media reports of public health breakthroughs made possible by big data have been largely oversold, according to a new study published in the American Journal of Preventive Medicine.

"Many studies deserve praise for being the first of their kind, but if we actually began relying on the claims made by big data surveillance in public health, we would come to some peculiar conclusions," said John W. Ayers, San Diego State University Graduate School of Public Health research professor and senior author of the study. "Some of these conclusions may even pose serious public health harm."

But don't throw away that data just yet.

The authors maintain that the promise of big data can be fulfilled by tweaking existing methodological and reporting standards. In the study, Ayers and his colleagues demonstrate this by revising the inner plumbing of the Google Flu Trends (GFT) digital disease surveillance system, which was heavily criticized last year after producing erroneous forecasts.

"Assuming you can't use big data to improve public health is simply wrong," added Ayers. "Existing shortcomings are a result of methodologies, not the data themselves."

A solution for Google Flu Trends

In the first external revision proposed to GFT, Ayers and co-researchers David Zhang, Mauricio Santillana (both with Harvard University), and Benjamin Althouse (with the Santa Fe Institute) explored new methods for using open-sourced, publicly available Google search archives to forecast influenza, an approach that can serve as a blueprint for fixing broader shortcomings in public health surveillance.

To address GFT's problems, the team significantly beefed up the existing GFT model. First, rather than relying on a single trend that represents a group of influenza search queries, they monitored changes in individual search queries, giving some queries more algorithmic weight than others based on how much they improved predictions when compared with patient data collected by health agencies.

Second, instead of relying on investigator opinion for periodic updates to the model, the team built in automatic updating, which uses artificial intelligence techniques to adjust the weight given to each query every week so as to maximize predictive accuracy.
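The two changes described above can be illustrated with a small sketch. The article does not say which artificial intelligence techniques the team used, so the code below substitutes a plain online gradient-descent update on squared error as a stand-in. The function names, the learning rate, and the synthetic data are all hypothetical and are not taken from the study.

```python
"""Illustrative sketch only: per-query weights that are re-tuned every week.

The article does not name the team's technique, so a plain online
gradient-descent update on squared error is used here as a stand-in.
All names and numbers are hypothetical.
"""


def predict(weights, bias, query_volumes):
    """Estimate the influenza-like illness (ILI) rate from individual query volumes."""
    return bias + sum(w * x for w, x in zip(weights, query_volumes))


def weekly_update(weights, bias, query_volumes, observed_ili, learning_rate=0.5):
    """Shift each query's weight in the direction that reduces this week's error."""
    error = predict(weights, bias, query_volumes) - observed_ili
    new_weights = [w - learning_rate * error * x
                   for w, x in zip(weights, query_volumes)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias


def run_surveillance(weekly_queries, weekly_ili, learning_rate=0.5):
    """Forecast each week before seeing its patient data, then update the weights."""
    weights = [0.0] * len(weekly_queries[0])
    bias = 0.0
    forecasts = []
    for query_volumes, observed_ili in zip(weekly_queries, weekly_ili):
        forecasts.append(predict(weights, bias, query_volumes))
        weights, bias = weekly_update(
            weights, bias, query_volumes, observed_ili, learning_rate
        )
    return forecasts, weights, bias


if __name__ == "__main__":
    # Synthetic data: query 0 tracks illness, query 1 is an unrelated
    # (false-positive) query whose volume has nothing to do with illness.
    queries = [[(i % 5) / 5, (i % 3) / 3] for i in range(2000)]
    ili = [2.0 * q[0] for q in queries]
    _, final_weights, _ = run_surveillance(queries, ili)
    print([round(w, 2) for w in final_weights])
```

In this toy setup the weight on the informative query should settle near 2 while the weight on the unrelated query shrinks toward 0. That is the behavior the article attributes to the revised model when a query stops tracking illness.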

During the 2009 H1N1 pandemic and 2012/13 season -- two critically important periods of influenza surveillance in the United States -- the alternative method yielded more accurate influenza predictions than GFT every week, and was typically more accurate than GFT during other influenza seasons.

"With these tweaks, GFT could live up to the high expectations it originally aspired to," Ayers said. "Still, the greatest strength of our model is how the queries being used to describe influenza trends are changing over time as search patterns change in the population or the model occasionally underperforms due to false-positive queries."

For example, during the 2012/2013 season, GFT predicted that 10.6% of the population had influenza-like illness when only 6.1% did, according to patient records. The team's alternative significantly reduced the error in that prediction, estimating that 7.7% of people would have the flu. Within two weeks the model self-updated, considerably changing the weight given to certain queries that spiked during that time and improving its future performance.
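To put those three percentages side by side, the short calculation below compares each estimate with the patient-record figure. The numbers come from the article. The helper names and the choice of error measures are assumptions made for illustration, and the study itself may report accuracy differently.

```python
# Figures quoted in the article for the 2012/2013 season
# (percent of the population with influenza-like illness).
OBSERVED = 6.1      # patient records
GFT = 10.6          # Google Flu Trends estimate
ALTERNATIVE = 7.7   # the team's revised model


def absolute_error(predicted, observed):
    """Gap between estimate and observation, in percentage points."""
    return abs(predicted - observed)


def relative_error(predicted, observed):
    """Gap expressed as a fraction of the observed value."""
    return abs(predicted - observed) / observed


def error_reduction(baseline_prediction, new_prediction, observed):
    """Fraction by which the new estimate shrinks the baseline's absolute error."""
    baseline = absolute_error(baseline_prediction, observed)
    improved = absolute_error(new_prediction, observed)
    return (baseline - improved) / baseline


if __name__ == "__main__":
    for name, estimate in (("GFT", GFT), ("Alternative", ALTERNATIVE)):
        print(
            f"{name}: off by {absolute_error(estimate, OBSERVED):.1f} points "
            f"({relative_error(estimate, OBSERVED):.1%} of the observed rate)"
        )
    print(f"Error reduction: {error_reduction(GFT, ALTERNATIVE, OBSERVED):.1%}")
```

On the article's figures, GFT overshoots by about 4.5 percentage points and the alternative by about 1.6, which works out to an absolute error roughly 64% smaller.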

What's next for big data

"Big data is no substitute for good methods, and consumers need to better discern good from bad methods," Ayers said. To achieve these ends, he and his colleagues added that digital disease surveillance researchers need greater transparency in the reporting of studies and better methods when using big data in public health.

"When dealing with big data methods, it is extremely important to make sure they are transparent and free," co-author Althouse added. "Reproducibility and validation are keystones of the scientific method, and they should be at the center of the big data revolution."

Importantly, these criticisms shouldn't be taken as an indictment of the promise of big data, or of the early attempts to wrangle it into something beneficial for the public, Ayers said. Now that the initial hype is wearing off, researchers can begin seriously exploring and testing the strengths and limitations of existing models and sharpening their methodologies.

"We certainly don't want any single entity or investigator, let alone Google -- who has been at the forefront of developing and maintaining these systems -- to feel like they are unfairly the targets of our criticism," Ayers said. "It's going to take the entire community recognizing and rectifying existing shortcomings. When we do, big data will certainly yield big impacts."


Story Source:

The above story is based on materials provided by San Diego State University. Note: Materials may be edited for content and length.


Journal Reference:

  1. Mauricio Santillana, D. Wendong Zhang, Benjamin M. Althouse, John W. Ayers. What Can Digital Disease Detection Learn from (an External Revision to) Google Flu Trends? American Journal of Preventive Medicine, 2014; DOI: 10.1016/j.amepre.2014.05.020

Cite This Page:

San Diego State University. "Finding real value in big data for public health." ScienceDaily. ScienceDaily, 2 July 2014. <www.sciencedaily.com/releases/2014/07/140702122432.htm>.
