Featured Research

from universities, journals, and other organizations

Computer program assesses quality of Wikipedia entries

Date:
August 6, 2014
Source:
Inderscience Publishers
Summary:
Wikipedia, the free, online collaborative encyclopedia, is an important source of information. However, while its volunteer editors endeavor to maintain high standards, there are occasional problems with the veracity of content, deliberate vandalism and incomplete entries. Computer scientists have now devised a software algorithm that can automatically check a particular entry and rank it according to quality.

Wikipedia, the free, online collaborative encyclopedia, is an important source of information. However, while its volunteer editors endeavor to maintain high standards, there are occasional problems with the veracity of content, deliberate vandalism and incomplete entries. Writing in the International Journal of Information Quality, computer scientists in China have devised a software algorithm that can automatically check a particular entry and rank it according to quality.

Jingyu Han and Kejia Chen of Nanjing University of Posts and Telecommunications explain that the quality of data on Wikipedia has for many years been the focus of user attention. Its detractors suggest that it can never be a valid information source in the way that a proprietary encyclopedia might be, because the contributors and editors are not under the direct control of a single publisher with a vested interest in quality control. Its supporters suggest that the social nature of contributions and edits, together with the online tracking of changes, is one of Wikipedia's greatest strengths rather than a weakness.

Nevertheless, it would quiet the detractors if there were a way to quantify the quality of Wikipedia entries in an objective and automated manner. Now, Han and Chen have turned to Bayesian statistics to help them create just such a system. The notion of finding evidence based on an analysis of probabilities was first described by the 18th-century mathematician and theologian Thomas Bayes. Pierre-Simon Laplace then used Bayesian probabilities to pioneer a new statistical method. Today, Bayesian analysis is commonly used to assess the content of emails, estimate the probability that a message is spam (junk mail), and filter it from the user's inbox if that probability is high.
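
To make the spam-filtering idea concrete, here is a minimal sketch of Bayes' rule applied to a single word in an email. It is only an illustration, not the method of Han and Chen or of any particular mail filter. The function name, the probabilities and the 0.8 threshold are all hypothetical values chosen for the example.

```python
def posterior_spam(prior_spam, p_word_given_spam, p_word_given_ham):
    """Bayes' rule: P(spam | word) from a prior and two likelihoods."""
    # Total probability of seeing the word in any message (spam or not).
    evidence = (p_word_given_spam * prior_spam
                + p_word_given_ham * (1.0 - prior_spam))
    return p_word_given_spam * prior_spam / evidence


# Hypothetical numbers: 40% of mail is spam, and the word appears in
# 50% of spam messages but only 5% of legitimate ones.
p = posterior_spam(prior_spam=0.4, p_word_given_spam=0.5, p_word_given_ham=0.05)

# Hypothetical filtering rule: move the message out of the inbox
# when the posterior probability of spam exceeds 0.8.
should_filter = p > 0.8
```

With these assumed numbers the posterior is 0.2 / 0.23, or roughly 0.87, so the message would be filtered. A real filter would combine evidence from many words rather than one.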

Han and Chen have now used a dynamic Bayesian network (DBN) to analyze the content of Wikipedia entries in a similar manner. They apply multivariate Gaussian distribution modeling to the DBN analysis, which gives them a distribution of the quality of each article so that entries can be ranked. Very low-ranking entries might be flagged for editorial attention to raise their quality. By contrast, high-ranking entries could be marked in some way as the definitive entry so that they are not subsequently overwritten with lower-quality information.
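
The ranking step can be illustrated with a much simpler stand-in for the published model. The sketch below leaves out the dynamic Bayesian network entirely. It only shows how multivariate Gaussian densities over quality features could be turned into a ranking. The two features (think of them as completeness and accuracy scores between 0 and 1), the class parameters and the article names are all hypothetical and are not taken from the paper.

```python
import math


def log_gaussian_2d(x, mean, cov):
    """Log-density of a 2-D Gaussian with full covariance [[a, b], [b, d]]."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (x - mean)^T cov^{-1} (x - mean), using the
    # closed-form inverse of a symmetric 2x2 matrix.
    quad = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return -0.5 * quad - math.log(2.0 * math.pi) - 0.5 * math.log(det)


# Hypothetical (mean, covariance) pairs for two quality classes.
CLASSES = {
    "high": ((0.8, 0.8), ((0.02, 0.01), (0.01, 0.02))),
    "low": ((0.3, 0.3), ((0.04, 0.0), (0.0, 0.04))),
}


def quality_score(features):
    """Log-likelihood ratio: positive values favor the 'high' class."""
    return (log_gaussian_2d(features, *CLASSES["high"])
            - log_gaussian_2d(features, *CLASSES["low"]))


# Hypothetical articles, each described by two feature scores.
articles = {"A": (0.85, 0.75), "B": (0.35, 0.25), "C": (0.6, 0.6)}

# Rank articles from most to least likely to be high quality.
ranking = sorted(articles, key=lambda name: quality_score(articles[name]),
                 reverse=True)
```

In this toy setup, articles at the bottom of `ranking` would be the candidates to flag for editorial attention, and those at the top the candidates to mark as definitive.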

The team has tested its algorithm on sets of several hundred articles, comparing the computer's automated quality assessment with assessment by a human user. The algorithm outperforms a human user by up to 23 percent in correctly classifying the quality rank of a given article in the set, the team reports. Using a computerized system to provide a quality standard for Wikipedia entries would avoid the need to have people classify each entry subjectively. It could thus improve the standard of entries and provide a basis for an improved reputation for the online encyclopedia.


Story Source:

The above story is based on materials provided by Inderscience Publishers. Note: Materials may be edited for content and length.


Journal Reference:

  1. Jingyu Han, Kejia Chen. Ranking Wikipedia article's data quality by learning dimension distributions. International Journal of Information Quality, 2014; 3 (3): 207 DOI: 10.1504/IJIQ.2014.064056

Cite This Page:

Inderscience Publishers. "Computer program assesses quality of Wikipedia entries." ScienceDaily. ScienceDaily, 6 August 2014. <www.sciencedaily.com/releases/2014/08/140806142211.htm>.
Inderscience Publishers. (2014, August 6). Computer program assesses quality of Wikipedia entries. ScienceDaily. Retrieved October 22, 2014 from www.sciencedaily.com/releases/2014/08/140806142211.htm
Inderscience Publishers. "Computer program assesses quality of Wikipedia entries." ScienceDaily. www.sciencedaily.com/releases/2014/08/140806142211.htm (accessed October 22, 2014).
