Big data: Searching large amounts of data quickly and efficiently

Date:
March 1, 2013
Source:
University Saarland
Summary:
Scientific institutes and companies alike collect enormous amounts of data, and traditional database management systems are often unable to cope with it. Suitable tools for retrieving information from big data are lacking. Computer scientists have now developed an approach that allows large amounts of data to be searched quickly and efficiently.

Computer scientists from Saarbrücken have developed an approach which enables searching large amounts of data in a fast and efficient way.
Credit: Image courtesy of University Saarland

Scientific institutes and companies alike collect enormous amounts of data, and traditional database management systems are often unable to cope with it. Suitable tools for retrieving information from big data are lacking. Computer scientists from Saarbrücken have developed an approach that allows large amounts of data to be searched quickly and efficiently.

The term "big data" refers to digital information so large and so complex that conventional database technology cannot process it. It is not only scientific institutes such as the nuclear research center CERN that store huge amounts of data. Companies like Google and Facebook do so as well, and they analyze the data to make better strategic decisions for their business. A New York Times article published last year showed how successful such an analysis can be: it reported that the US-based company Target, by analyzing the buying patterns of a young woman, knew about her pregnancy before her father did.

The data to be analyzed is distributed across several servers on the internet, and search queries are sent to these servers in parallel. Traditional database management systems do not fit every use case: either they cannot cope with big data, or they overwhelm the user. Data analysts therefore favor tools that are based on the open-source software framework Apache Hadoop and use its efficient file system, HDFS. These tools do not require expert knowledge. "If you are used to the programming language Java, you can already do a lot with it," explains Jens Dittrich, professor of information systems at Saarland University. But he adds that Hadoop cannot query big datasets as efficiently as database systems that are designed for parallel processing.
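As a rough illustration of sending one query to several servers in parallel, the following Python sketch uses in-memory lists as stand-ins for servers. It is not Hadoop code: the partitions, the `scan_partition` and `parallel_query` functions, and the sample rows are all hypothetical. Note that each partition is still scanned row by row, which is the kind of work that becomes expensive on large datasets.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions standing in for data held on separate servers.
PARTITIONS = [
    [("alice", 31), ("bob", 17)],
    [("carol", 45), ("dave", 22)],
    [("erin", 19), ("frank", 52)],
]


def scan_partition(rows, min_age):
    """Full scan of one partition: check every row against the predicate."""
    return [row for row in rows if row[1] >= min_age]


def parallel_query(partitions, min_age):
    """Send the same query to every partition at once and merge the answers."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        # pool.map returns the partial results in partition order.
        partial = pool.map(lambda rows: scan_partition(rows, min_age), partitions)
        return [row for part in partial for row in part]


result = parallel_query(PARTITIONS, 30)
```

Here `result` holds the matching rows from all three partitions, merged in partition order.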

The solution developed by Dittrich and his colleagues is the "Hadoop Aggressive Indexing Library," or HAIL for short. It stores enormous amounts of data in HDFS in such a way that queries are answered up to 100 times faster. The researchers use a method familiar from the telephone book: the entries are sorted by surname so that you do not have to read the complete list of names. This sorting of the names produces the so-called index. The researchers generate such an index for the datasets they distribute across several servers. In contrast to the telephone book, however, they sort the data by several criteria at once and store it multiple times. "The more criteria you provide, the higher the probability that you find the specified data very fast," Dittrich explains. "To use the telephone book example again, it means that you have six different books. Every one contains a different sorting of the data -- according to name, street, ZIP code, city and telephone number. With the right telephone book you can search according to different criteria and will succeed faster."

In addition, Dittrich and his research group managed to generate the indexes at no additional cost. They organized the indexing so that it requires no additional computing time and causes no delay. Even the additional storage space required is low.
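The telephone-book idea can be sketched in a few lines of Python. This is a toy illustration, not HAIL's code: it keeps one sorted copy of the records per criterion and answers a lookup with a binary search on the matching copy instead of reading every entry. The field names, sample records, and the `build_replicas` and `lookup` functions are all hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical record layout: (name, street, zip_code, city, phone)
FIELDS = ("name", "street", "zip_code", "city", "phone")

RECORDS = [
    ("Schmidt", "Hauptstrasse 5", "66111", "Saarbruecken", "0681-1001"),
    ("Becker", "Bahnhofstrasse 12", "66424", "Homburg", "06841-2002"),
    ("Mueller", "Am Markt 3", "66111", "Saarbruecken", "0681-3003"),
    ("Weber", "Schulstrasse 8", "66740", "Saarlouis", "06831-4004"),
]


def build_replicas(records, fields):
    """Return one sorted copy of the records per field, plus its sort keys."""
    replicas = {}
    for pos, field in enumerate(fields):
        ordered = sorted(records, key=lambda r: r[pos])
        keys = [r[pos] for r in ordered]
        replicas[field] = (keys, ordered)
    return replicas


def lookup(replicas, field, value):
    """Binary-search the copy sorted on `field` for rows equal to `value`."""
    keys, ordered = replicas[field]
    lo = bisect_left(keys, value)
    hi = bisect_right(keys, value)
    return ordered[lo:hi]


replicas = build_replicas(RECORDS, FIELDS)
by_zip = lookup(replicas, "zip_code", "66111")   # uses the ZIP-sorted "book"
by_name = lookup(replicas, "name", "Weber")      # uses the name-sorted "book"
```

Each lookup touches only a logarithmic number of keys in the chosen copy. The trade-off in this toy version is that every copy costs extra memory and sorting time, which is the overhead the researchers report avoiding in their system.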

The researchers will present their results at the CeBIT trade fair in Hannover, starting on 5 March.


Story Source:

The above story is based on materials provided by University Saarland. Note: Materials may be edited for content and length.


Cite This Page:

University Saarland. "Big data: Searching large amounts of data quickly and efficiently." ScienceDaily. ScienceDaily, 1 March 2013. <www.sciencedaily.com/releases/2013/03/130301122503.htm>.
University Saarland. (2013, March 1). Big data: Searching large amounts of data quickly and efficiently. ScienceDaily. Retrieved April 24, 2014 from www.sciencedaily.com/releases/2013/03/130301122503.htm
University Saarland. "Big data: Searching large amounts of data quickly and efficiently." ScienceDaily. www.sciencedaily.com/releases/2013/03/130301122503.htm (accessed April 24, 2014).
