Featured Research

from universities, journals, and other organizations

Hopkins-Led Team Developing New Ways To Handle Flood Of Data

Date: September 27, 1999
Source: Johns Hopkins University
Summary: The fountain of scientific data has become a fire hose and is turning into a raging river. A Johns Hopkins-led consortium is working on ways to handle the information overload faced by scientists.

The fountain of information at the heart of science has become a fire hose, and it will soon swell to river-like volumes. The CERN particle collider in Geneva, Switzerland, for instance, currently produces more than 1 petabyte, or about 1,000,000,000,000,000 bytes, of information every year. By contrast, the words and other text in all the books in the Library of Congress add up to only about one-thousandth of that amount, or 1 terabyte (1 trillion bytes). And CERN is just one example of the tremendous information-generating power of modern science.

"Our current ways of doing science are very much based on the concept that our data sets are so small that we can sort of 'eyeball' the whole thing and locate the interesting data," says Alexander Szalay, Alumni Centennial Professor of Physics and Astronomy at The Johns Hopkins University. "And with the data sets we are getting in an increasing number of areas of science, this is just not going to be feasible. So we have to do something drastically different."

Szalay leads an interdisciplinary team of researchers developing new ways to store, access and search large volumes of data. Participants in the Hopkins-led collaborative include scientists from Caltech, the U.S. Department of Energy's Fermilab and Microsoft Corp. They have been working together for several years already; this month they will receive the first formal support for their efforts in a 3-year, $2.5 million grant from the National Science Foundation.

"This problem is of course much bigger than astronomy or particle physics," Szalay says. "I think this is actually becoming more a problem for the whole society. We are choking on information, and we have to sort out the relevant from the irrelevant. So I think what we're doing is a very interesting test bed for experimenting with new technologies that could have broader applications elsewhere."

Particle physicists were among the first to have to deal with huge quantities of information. Their work to manage that information led to the development of tools and techniques that found uses beyond the realm of the physics lab, notes Aihud Pevsner, Jacob P. Hain Professor of Physics and Astronomy at Johns Hopkins and a member of the collaborative.

"To help work with large data sets at CERN, Tim Berners-Lee invented in 1989 what later became the World Wide Web," says Pevsner. "He did it because the tools that they had at the time were inadequate for the distribution of the data sets they were working with."

Pevsner, a particle physicist, will be one of 500 American physicists working at the Large Hadron Collider (LHC) at CERN, the world's most powerful particle collider. The LHC is expected to produce 100-petabyte data sets.

Szalay is a researcher for the Sloan Digital Sky Survey (SDSS), an effort he calls the "cosmic genome project," which will map everything visible in several large chunks of the northern and southern sky. SDSS starts next year, and Szalay estimates that by the time it ends it will have produced 40 terabytes of data with a 2-terabyte catalog.

Such a high volume of data reduces the chances that astronomers will miss gathering important information, but it also makes it harder to find that information among what's been gathered. "When you have so much data that it chokes you, you have to keep breaking it up into smaller chunks until it no longer chokes you," Szalay says.

Developing better ways to break down large quantities of information is the first major component of research under the NSF grant. The SDSS information, for example, might be broken up both by the area of the sky that the data comes from and by the color of the objects observed in the sky. The challenge, though, is to make sure that this process of partitioning the data improves the scientists' abilities to see important patterns and irregularities in the data.
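
The idea of splitting survey data by sky area and by color can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the collaborative's actual scheme: the bin sizes, the field names (`ra`, `dec`, `color`) and the sample objects are all hypothetical.

```python
from collections import defaultdict

# Hypothetical bin sizes; the article does not specify any.
SKY_CELL_DEG = 10.0  # width of one sky cell, in degrees
COLOR_BIN = 0.5      # width of one color bin

def partition_key(ra_deg, dec_deg, color):
    """Map an object to a (sky cell, sky cell, color bin) bucket."""
    ra_cell = int((ra_deg % 360.0) // SKY_CELL_DEG)
    dec_cell = int((dec_deg + 90.0) // SKY_CELL_DEG)
    color_bin = int(color // COLOR_BIN)
    return (ra_cell, dec_cell, color_bin)

def build_partitions(objects):
    """Group objects so that similar ones are stored together."""
    partitions = defaultdict(list)
    for obj in objects:
        key = partition_key(obj["ra"], obj["dec"], obj["color"])
        partitions[key].append(obj)
    return partitions

def query(partitions, ra_cell, dec_cell, color_bin):
    """Read a single bucket instead of scanning the whole data set."""
    return partitions.get((ra_cell, dec_cell, color_bin), [])

# Made-up sample objects, for illustration only.
objects = [
    {"id": 1, "ra": 12.0, "dec": 5.0, "color": 0.3},
    {"id": 2, "ra": 15.0, "dec": 7.5, "color": 0.4},
    {"id": 3, "ra": 200.0, "dec": -30.0, "color": 1.2},
]
partitions = build_partitions(objects)
nearby_ids = [obj["id"] for obj in query(partitions, 1, 9, 0)]
```

Objects 1 and 2 lie in the same patch of sky and have similar colors, so they land in the same bucket, and a query about that region touches only that bucket.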

"We want to try to make it possible for data that will be of interest to the same kinds of queries to be ‘located' close together so they are easier to find," says Ethan Vishniac, director of the Johns Hopkins Center for Astrophysical Sciences, also a collaborative member.

Another concern is that these huge chunks of information will probably be stored at geographically different locations. Some next-generation science projects involve so much information, according to Szalay, that it cannot be brought to researchers across computer networks. Arranging ways to simultaneously access data in these different locations without ever bringing it together in one database, a technique called "distributed processing," is the second major component of research supported by the NSF grant.
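
One way to picture working with data that never leaves its home site is to have each site compute a small summary and ship only that summary to the researcher. The Python sketch below assumes this approach; the site names, the values and the helper functions are hypothetical, and the sites are simulated with an in-memory dictionary rather than real network connections.

```python
# Hypothetical sites, each holding its own local measurements.
SITES = {
    "site_a": [18.2, 19.5, 21.0],
    "site_b": [17.8, 22.4],
    "site_c": [20.1],
}

def local_summary(values, max_value):
    """Runs where the data lives; returns only (count, total) of matches."""
    matching = [v for v in values if v <= max_value]
    return len(matching), sum(matching)

def distributed_mean(sites, max_value):
    """Combine the per-site summaries; raw records stay at their sites."""
    summaries = [local_summary(values, max_value) for values in sites.values()]
    count = sum(c for c, _ in summaries)
    total = sum(t for _, t in summaries)
    return total / count if count else None

mean_bright = distributed_mean(SITES, 20.0)
```

Only two numbers per site cross the network in this sketch, however large each site's holdings are.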

The third component of the NSF grant will improve a technique called "parallel" querying. This involves searching in different locations at the same time, not unlike sending out an army of librarians to search several large libraries at once. Researchers will strive to make these search agents smarter and more independent by improving the software they use. To test their efforts at dealing with these challenges, researchers will use data from the SDSS, from the CERN particle collider and from GALEX, a sky-mapping survey that covers the same areas as SDSS but measures different forms of radiation.
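
The "army of librarians" picture corresponds to sending the same query to every location at once and merging the answers. A minimal Python sketch, assuming the standard library's thread pool and three made-up in-memory archives in place of real remote stores:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical archives standing in for separate storage locations.
ARCHIVES = {
    "archive_1": [{"id": 1, "kind": "galaxy"}, {"id": 2, "kind": "star"}],
    "archive_2": [{"id": 3, "kind": "quasar"}, {"id": 4, "kind": "galaxy"}],
    "archive_3": [{"id": 5, "kind": "star"}],
}

def search_archive(name, kind):
    """One 'librarian': scan a single archive for objects of one kind."""
    return [(name, rec["id"]) for rec in ARCHIVES[name] if rec["kind"] == kind]

def parallel_search(kind):
    """Send one search per archive concurrently, then merge the results."""
    names = sorted(ARCHIVES)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        # pool.map yields results in the same order as `names`.
        per_archive = pool.map(lambda n: search_archive(n, kind), names)
        return [hit for hits in per_archive for hit in hits]

galaxies = parallel_search("galaxy")
```

With in-memory lists the threads gain little; the pattern pays off when each search is a slow request to a remote archive, which is the situation the article describes.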

"Data sets that are astronomical in every sense of that word are great test beds for computer scientists to experiment with to develop novel techniques for visualizing, organizing, and querying information," says Michael Goodrich, Hopkins professor of computer science and a member of the collaborative.

Additional collaborators include physicist Harvey Newman, research scientist Julian Bunn and astronomer Chris Martin of Caltech; physicist Thomas Nash of Fermilab; computer scientist Jim Gray of Microsoft; and astronomers Ani Thakar and Peter Kunszt of Hopkins.

The $2.5 million NSF grant is one of 31 announced by NSF as part of a new effort to support "knowledge and distributed intelligence" projects. The grants are focused on efforts to apply new computer technology across multidisciplinary areas in science and engineering.


Story Source:

The above story is based on materials provided by Johns Hopkins University. Note: Materials may be edited for content and length.


Cite This Page:

Johns Hopkins University. "Hopkins-Led Team Developing New Ways To Handle Flood Of Data." ScienceDaily. ScienceDaily, 27 September 1999. <www.sciencedaily.com/releases/1999/09/990924115544.htm>.
Johns Hopkins University. (1999, September 27). Hopkins-Led Team Developing New Ways To Handle Flood Of Data. ScienceDaily. Retrieved October 21, 2014 from www.sciencedaily.com/releases/1999/09/990924115544.htm
Johns Hopkins University. "Hopkins-Led Team Developing New Ways To Handle Flood Of Data." ScienceDaily. www.sciencedaily.com/releases/1999/09/990924115544.htm (accessed October 21, 2014).
