Search Engines Biased, Out-Of-Date, And Index No More Than 16% Of The Web

July 12, 1999 — A new NEC Research Institute study analyzes the accessibility and distribution of information on the web. The study was conducted by Dr. Steve Lawrence and Dr. C. Lee Giles and will appear in the July 8 issue of the journal Nature.



-- LOW COVERAGE -- Search engine coverage has decreased substantially since December 1997, with no engine indexing more than about 16% of the publicly indexable web.

-- UNEQUAL ACCESS -- Search engines are more likely to index sites that have more links to them (more 'popular' sites). They are also typically more likely to index US sites than non-US sites, and more likely to index commercial sites than educational sites.

-- OUT-OF-DATE -- Indexing of new or modified pages by just one of the major search engines can take months.

-- AMOUNT OF INFORMATION -- The publicly indexable web contains about 800 million pages encompassing about 15 terabytes of data (about 6 terabytes of textual content after removing HTML tags, comments, and extra whitespace); it also contains about 180 million images.

-- TYPE OF INFORMATION -- 83% of sites contain commercial content and 6% contain scientific/educational content. Only 1.5% of sites contain pornographic content.
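The headline figures above imply a few derived quantities that are easy to check with simple arithmetic. The following is a minimal sketch, not part of the study itself: the constant and function names are hypothetical, and it assumes 1 terabyte = 10^12 bytes, which the release does not specify.

```python
# Approximate figures quoted in the summary above.
TOTAL_PAGES = 800_000_000         # publicly indexable pages
MAX_COVERAGE_PERCENT = 16         # best coverage by any single engine
TOTAL_BYTES = 15 * 10**12         # 15 terabytes (assumes 1 TB = 10**12 bytes)
TEXT_BYTES = 6 * 10**12           # 6 terabytes of text after stripping markup
TOTAL_IMAGES = 180_000_000        # images on the publicly indexable web


def derived_figures():
    """Return rough quantities implied by the quoted figures (illustrative only)."""
    return {
        # Upper bound on pages indexed by any one engine.
        "max_pages_indexed": TOTAL_PAGES * MAX_COVERAGE_PERCENT // 100,
        # Average page size in kilobytes (1 KB = 1000 bytes here).
        "avg_page_kb": TOTAL_BYTES / TOTAL_PAGES / 1000,
        # Share of the raw data that is textual content.
        "text_fraction": TEXT_BYTES / TOTAL_BYTES,
        # Average number of images per page.
        "images_per_page": TOTAL_IMAGES / TOTAL_PAGES,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value}")
```

Under these assumptions, no engine indexes more than roughly 128 million pages, the average page is about 18.75 KB, text makes up about 40% of the raw data, and there are about 0.225 images per page. These are back-of-the-envelope values derived from rounded inputs, so they should be read as rough estimates only.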

The web is transforming society, and search engines are an important part of that process. For example, consumers use search engines to locate and buy goods or to research decisions such as choosing a vacation destination, selecting a medical treatment, or deciding how to vote in an election.

Search engine indexing and ranking may have economic, social, political, and scientific effects. For example, indexing and ranking of online stores can substantially affect their economic viability; delayed indexing of scientific research can lead to duplicated work or slower progress; and delayed or biased indexing may affect social or political decisions.

One of the great promises of the web is to equalize access to information. As the web fast becomes a major communications medium, attention should be paid to the accessibility of information on the web, in order to minimize unequal access to information, and maximize the benefits of the web for society.

For more information see http://wwwmetrics.com.

###

The NEC Research Institute conducts long-term, fundamental research in computer and physical sciences. The mission of the Institute is to contribute significant new understanding of computer and communication (C&C) technologies for the future. Institute research activities have a long-term goal of significant advances in the understanding of intelligence and information processing in biological and machine systems, and in the physical and system aspects of future computer architectures.


Story Source:

The above story is reprinted from materials provided by NEC Research Institute.

Note: Materials may be edited for content and length. For further information, please contact the source cited above.


