Getting to the bottom of statistics: Software utilizes data from the Internet for interpreting statistics

Date:
July 16, 2012
Source:
Technische Universität Darmstadt
Summary:
Interpreting the results of statistical surveys, such as Transparency International's corruption indices, is not always a simple matter. As Dr. Heiko Paulheim of the Knowledge Engineering Group at TU Darmstadt's Computer Science Department put it, "Although methods that unearth explanations for statistics are available, they are confined to the data contained in the statistics themselves. Background information is not taken into account. That is what led us to the idea of applying data-mining methods that we had been studying here to the semantic web in order to obtain further background information that will allow us to learn more from statistics."

Explain-a-LOD helps to interpret statistics such as the Corruption Perceptions Index published by Transparency International.
Credit: Diagram: Transparency International

The "Explain-a-LOD" tool that Paulheim developed accesses linked open data (LOD), i.e., enormous collections of publicly available, semantically linked data on the Internet, and uses that data to automatically formulate hypotheses for interpreting arbitrary types of statistics. First, the statistics to be interpreted are read into Explain-a-LOD. The tool then automatically searches the pools of linked open data for associated records and adds them to the initial data set. Paulheim explained, "If, for example, the country 'Germany' is listed in the corruption-index data, LOD records that contain information on Germany will be identified and further attributes generated, such as its population, its membership in the EU and OECD, or the total number of companies domiciled there. Attributes that are unlikely to yield useful hypotheses will be automatically deleted in order to reduce the volume of the enriched statistics."
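The enrichment stage described above can be pictured with a minimal sketch. It is not Explain-a-LOD's actual code: a small in-memory dictionary stands in for the linked open data cloud, every name and figure is a made-up placeholder, and the pruning rule (drop attributes that are mostly missing or never vary) is an assumed heuristic, since the article does not say how unhelpful attributes are identified.

```python
# Sketch only: a small in-memory dict stands in for the linked open data cloud.
# All names and figures below are illustrative placeholders, not real data.

def enrich(rows, background, key="country"):
    """Join each statistics row with the background attributes found for its entity."""
    enriched = []
    for row in rows:
        extra = background.get(row[key], {})  # no record found -> nothing added
        enriched.append({**row, **extra})
    return enriched


def prune(rows, keep=("country", "index"), min_coverage=0.5):
    """Drop generated attributes that are mostly missing or never vary.

    Assumed heuristic: the article does not describe the real selection rule.
    """
    attributes = {name for row in rows for name in row} - set(keep)
    useful = set(keep)
    for name in attributes:
        values = [row[name] for row in rows if name in row]
        covered = len(values) / len(rows) >= min_coverage
        varies = len(set(values)) > 1
        if covered and varies:
            useful.add(name)
    return [{k: v for k, v in row.items() if k in useful} for row in rows]


# The original statistics: one entity and one target value per row.
statistics = [
    {"country": "A", "index": 80},
    {"country": "B", "index": 35},
    {"country": "C", "index": 60},
]

# Hypothetical background records keyed by entity name.
background = {
    "A": {"oecd_member": True, "planet": "Earth", "population": 5_000_000},
    "B": {"oecd_member": False, "planet": "Earth", "population": 40_000_000},
    "C": {"oecd_member": True, "planet": "Earth", "motto": "n/a"},
}

enriched = prune(enrich(statistics, background))
for row in enriched:
    print(row)
```

In this toy data set, "planet" is dropped because it is identical for every country, and "motto" is dropped because it is known for only one of the three.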

Once that preprocessing is complete, Explain-a-LOD proceeds to the second stage and automatically formulates hypotheses based on the enriched statistics. The methods employed include simple correlation analyses as well as other methods for recognizing regularities in statistical data, which allow more complex hypotheses covering more than a single attribute to be formulated. Users are then presented with the resulting hypotheses as phrases such as "OECD-member countries have low corruption indices" if a correlation exists between the attribute "OECD membership" and the target attribute "corruption index." This works regardless of whether the original statistics contained any reference to countries' OECD membership; Explain-a-LOD automatically takes that background knowledge into account.
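As a rough illustration of the "simple correlation analyses" mentioned above, the following sketch computes a Pearson correlation between each numeric or boolean attribute and the target, then turns strong correlations into a phrase. The choice of Pearson's coefficient, the threshold, the phrase template, and the sample rows are all assumptions made for this example; the article does not specify how Explain-a-LOD scores or words its hypotheses.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def hypotheses(rows, target, threshold=0.5):
    """Return (attribute, r, phrase) for attributes strongly correlated with the target."""
    names = sorted({name for row in rows for name in row} - {target})
    results = []
    for name in names:
        # Only numeric or boolean attributes are considered in this sketch.
        pairs = [
            (float(row[name]), float(row[target]))
            for row in rows
            if isinstance(row.get(name), (bool, int, float))
        ]
        if len(pairs) < 3:
            continue
        r = pearson(*zip(*pairs))
        if abs(r) >= threshold:
            level = "high" if r > 0 else "low"
            results.append((name, r, f"high {name} goes with {level} {target}"))
    return results


# Placeholder rows; here a higher "index" is taken to mean more perceived corruption.
rows = [
    {"country": "A", "oecd_member": True, "radio_stations": 120, "index": 20},
    {"country": "B", "oecd_member": False, "radio_stations": 15, "index": 70},
    {"country": "C", "oecd_member": True, "radio_stations": 90, "index": 30},
    {"country": "D", "oecd_member": False, "radio_stations": 10, "index": 80},
]

found = hypotheses(rows, target="index")
for name, r, phrase in found:
    print(f"{phrase} (r = {r:.2f})")
```

A single-attribute correlation like this covers only the simplest case; the more complex, multi-attribute hypotheses the article mentions would need other pattern-recognition methods.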

Surprising and useful hypotheses

Paulheim and his colleagues have thoroughly tested their approach on various kinds of statistics, including Mercer's standard-of-living study and Transparency International's corruption index. Paulheim noted, "What one obtains are mixtures of obvious and surprising hypotheses, such as 'cities where temperatures do not exceed 21°C during the month of May have high standards of living,' 'capital cities generally have lower standards of living than other cities,' or 'countries that have few schools and few radio stations have high corruption indices.'" An evaluation of the results by test subjects confirmed that impression. Paulheim added, "The test subjects perceived the resulting hypotheses as largely surprising, as well as nontrivial, and, very frequently, as useful." However, the test subjects had serious doubts about the trustworthiness of the hypotheses, which, Paulheim noted, was partly attributable to the unsatisfactory quality of some of the data in the open-data cloud.

Explain-a-LOD has been presented at several international conferences over the past few months. The tool received the "Best In-Use Paper" and "Best Demo" awards at the Extended Semantic Web Conference 2012, held on Crete in late May. Several upgrades to Explain-a-LOD are planned, among them further attribute-generation algorithms and facilities for accessing additional data pools from the LOD cloud.

Further information: http://www.ke.tu-darmstadt.de/resources/explain-a-lod


Story Source:

The above story is based on materials provided by Technische Universität Darmstadt. Note: Materials may be edited for content and length.


Cite This Page:

Technische Universität Darmstadt. "Getting to the bottom of statistics: Software utilizes data from the Internet for interpreting statistics." ScienceDaily. ScienceDaily, 16 July 2012. <www.sciencedaily.com/releases/2012/07/120716091925.htm>.
Technische Universität Darmstadt. (2012, July 16). Getting to the bottom of statistics: Software utilizes data from the Internet for interpreting statistics. ScienceDaily. Retrieved October 25, 2014 from www.sciencedaily.com/releases/2012/07/120716091925.htm
Technische Universität Darmstadt. "Getting to the bottom of statistics: Software utilizes data from the Internet for interpreting statistics." ScienceDaily. www.sciencedaily.com/releases/2012/07/120716091925.htm (accessed October 25, 2014).
