Featured Research

from universities, journals, and other organizations

Mathematical Distribution Links Open Source Software And Literature

Date: February 2, 2009
Source: ETH Zurich
Summary: The frequency of words in texts, the size of companies and the linking together of components in Linux software distributions show approximately the same mathematical distribution: they obey Zipf’s law. Researchers have tested how this happens in Linux programs.

The number of packages (y axis) to which more than C links point (x axis). On the double logarithmic scale, all four Debian Linux distributions that were studied yield straight lines with a slope of approximately -1, which corresponds to Zipf's law.
Credit: Image courtesy of ETH Zurich

The frequency of words in texts, the size of companies and the linking together of components in Linux software distributions show approximately the same mathematical distribution: they obey Zipf’s law. ETH Zurich researchers tested how this happens in Linux programs.

In the first half of the twentieth century, the American linguist George Kingsley Zipf studied how often each word occurs in literary texts. A few words were very frequent, e.g. "the" and "and", but the majority of words occurred only rarely. The resulting pattern could be expressed in figures: the most frequent word occurred about twice as often as the second most frequent and three times as often as the third most frequent, i.e. the frequency of a word was inversely proportional to its rank. This has since been called Zipf’s law.
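
The inverse relationship between rank and frequency can be illustrated with a minimal Python sketch. The function names `rank_frequencies` and `zipf_prediction` and the toy sample text are assumptions made for this illustration; they do not come from Zipf's work or from the study.

```python
from collections import Counter


def rank_frequencies(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    counts = Counter(text.lower().split())
    ranked = counts.most_common()
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ranked, start=1)]


def zipf_prediction(top_count, rank):
    """Count that Zipf's law predicts at a given rank: top_count / rank."""
    return top_count / rank


if __name__ == "__main__":
    # Hypothetical toy text, constructed so that the counts are 6, 3, 2, 1.
    sample = "the " * 6 + "and " * 3 + "of " * 2 + "zipf"
    top = rank_frequencies(sample)[0][2]
    for rank, word, count in rank_frequencies(sample):
        print(rank, word, count, zipf_prediction(top, rank))
```

In this constructed sample the second and third words occur exactly one half and one third as often as the first; real texts follow the pattern only approximately.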

Scientists have discovered that this distribution holds true – more or less – for quite different systems, e.g. the numbers of visitors to web sites, the size of towns and the size of companies in numerous countries. Researchers suspected that this recurring pattern is associated with the growth process of the systems being studied.

Free raw material thanks to open source

Doctoral student Thomas Maillart and Didier Sornette, Professor at the Chair of Entrepreneurial Risks, together with Sebastian Späth and Georg von Krogh, Professor at the Chair of Strategic Management and Innovation at ETH Zurich, have now demonstrated empirically the conditions under which a distribution obeying Zipf’s law occurs. They did this by examining the links between Linux software packages.

Their results were published in the scientific journal Physical Review Letters and mentioned in Nature as a Research Highlight.

In an earlier publication, Sornette had already suggested carrying out an empirical test of Zipf’s law. When searching for a subject for his thesis, his doctoral student Thomas Maillart came across an article about open source software by Sebastian Späth and Georg von Krogh. Maillart realised that it contained data with which the proposed origin of Zipf’s law could be tested.

Linux is an operating system similar to Microsoft Windows or Mac OS. Many versions of it are available to download free of charge via the Internet. Each Linux distribution consists of various software packages, which thus represent free raw material for the scientists to use in their research. Debian Linux – the distribution studied by the ETH Zurich researchers – comprised only 474 packages in 1996, whereas there were already more than 18,000 in 2007.

Characteristic distribution arises as a result of growth

The packages are networked by numerous links through which they call one another. First, Maillart examined for four versions of Debian whether the number of incoming links per package obeys Zipf’s law. This was confirmed (see graphic). The scientists then studied how the number of links pointing to a package develops over time. They assumed a proportional growth pattern: the more links that already lead to a package, the faster the number of links increases.
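
The proportional growth rule can be sketched as a small simulation. This is a generic rich-get-richer toy under stated assumptions, not the model used in the study: the function names, the `new_package_prob` parameter and the rule that every new package starts with one link are all hypothetical choices made for illustration.

```python
import random


def grow_links(steps, new_package_prob=0.1, seed=0):
    """Simulate incoming-link counts under a rich-get-richer rule.

    At each step either a new package appears with a single link
    (an assumption of this sketch), or one new link is attached to an
    existing package chosen with probability proportional to the number
    of links it already has.
    """
    rng = random.Random(seed)
    links = [1]  # start with one package that has one incoming link
    for _ in range(steps):
        if rng.random() < new_package_prob:
            links.append(1)
        else:
            # weights=links makes heavily linked packages more likely targets
            target = rng.choices(range(len(links)), weights=links, k=1)[0]
            links[target] += 1
    return links


def packages_with_more_than(links, c):
    """Number of packages with more than c incoming links."""
    return sum(1 for n in links if n > c)


if __name__ == "__main__":
    links = grow_links(20000)
    for c in (1, 10, 100):
        print(c, packages_with_more_than(links, c))
```

Plotting `packages_with_more_than(links, c)` against `c` on a double logarithmic scale gives the same kind of curve as the graphic described in the article, although the slope of this toy depends on the assumed parameters.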

The evaluation of the Linux package data showed that the researchers’ model was correct. In new packages, the number of links deviated from Zipf’s law, and the characteristic distribution arose only as a result of the growth of the Linux distribution. A condition that the researchers had used in their model was also confirmed: the fluctuation in the number of links pointing to a package becomes larger as that number grows. Consequently, the number can drop to zero again even if it is very large, which means that the package is no longer being used.
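
The role of size-dependent fluctuations can be sketched in the same spirit. The following is an illustrative random walk, not the researchers' model: the name `evolve_links` and the `drift` and `volatility` values are assumptions chosen only to show that both the average change and the random swing scale with the current number of links.

```python
import random


def evolve_links(initial_links, steps, drift=0.01, volatility=0.3, seed=1):
    """Follow one package's link count under random proportional growth.

    Each step changes the count by an amount whose average (drift) and
    random spread (volatility) are both multiplied by the current count,
    so larger counts fluctuate more in absolute terms.
    """
    rng = random.Random(seed)
    links = float(initial_links)
    history = [links]
    for _ in range(steps):
        change = links * (drift + volatility * rng.gauss(0.0, 1.0))
        links = max(0.0, links + change)  # a link count cannot be negative
        history.append(links)
        if links == 0.0:
            break  # no links remain: the package is no longer used
    return history


if __name__ == "__main__":
    path = evolve_links(1000, 200)
    print(len(path) - 1, "steps simulated, final link count:", path[-1])
```

Because every step is proportional to the current count, a package with many links faces swings large enough to bring the count down sharply, which is the qualitative point made above.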

Conclusions on entrepreneurial risks

Thomas Maillart describes himself as a risk manager. He says that he had already calculated risks as a Civil Engineering student at EPFL, where these risks were connected with the safety of building structures. He then worked in a company insuring Internet risks. He has now written the paper on Zipf’s law in the context of his thesis on Internet risks at the Chair of Entrepreneurial Risks at ETH Zurich.

Being able to estimate the growth of Linux packages is exciting from an entrepreneurial point of view. However, the significance of the paper extends far beyond this specialist area, because the findings apply to all systems obeying Zipf’s law, for example to the size of companies: by analogy with the number of links pointing to a Linux package, a company’s size provides no certainty that the company will survive, as the financial crisis has confirmed.


Story Source:

The above story is based on materials provided by ETH Zurich. Note: Materials may be edited for content and length.


Journal Reference:

  1. Maillart et al. Empirical Tests of Zipf’s Law Mechanism in Open Source Linux Distribution. Physical Review Letters, 2008; 101 (21): 218701 DOI: 10.1103/PhysRevLett.101.218701

Cite This Page:

ETH Zurich. "Mathematical Distribution Links Open Source Software And Literature." ScienceDaily. ScienceDaily, 2 February 2009. <www.sciencedaily.com/releases/2009/01/090127215432.htm>.
ETH Zurich. (2009, February 2). Mathematical Distribution Links Open Source Software And Literature. ScienceDaily. Retrieved October 22, 2014 from www.sciencedaily.com/releases/2009/01/090127215432.htm
ETH Zurich. "Mathematical Distribution Links Open Source Software And Literature." ScienceDaily. www.sciencedaily.com/releases/2009/01/090127215432.htm (accessed October 22, 2014).
