Around The World In 800 Billion Bases: Sanger Institute Genetic Records Are World's Biggest

Date:
January 19, 2006
Source:
Wellcome Trust Sanger Institute
Summary:
This week, the Wellcome Trust Sanger Institute's World Trace Archive database of DNA sequences hit one billion entries. The Trace Archive is a store of all the sequence data produced and published by the world scientific community, including the Sanger Institute's own prodigious output as a world-leading genomics institution.

On Tuesday 17 January 2006, the Wellcome Trust Sanger Institute's World Trace Archive database of DNA sequences hit one billion entries.
Credit: Image courtesy of Wellcome Trust Sanger Institute

This week, the Wellcome Trust Sanger Institute's World Trace Archive database of DNA sequences hit one billion entries. The Trace Archive is a store of all the sequence data produced and published by the world scientific community, including the Sanger Institute's own prodigious output as a world-leading genomics institution.

To grasp how much data is in the Archive, imagine printing it: as a single line of text, it would stretch around the world more than 250 times, and on pages of A4 it would produce a stack of paper two-and-a-half times as high as Mount Everest.

Each entry is a piece of genetic information averaging 864 characters in length. Scientists can search these sequences and piece them together to build up the complete genetic information of organisms - mice, fish, flies, bacteria and, of course, humans.
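The "800 billion bases" of the headline follows roughly from these two figures. A minimal sketch of the arithmetic, assuming each character stands for one base and that the quoted average applies to all one billion entries (the constant and function names are illustrative, not from the source):

```python
# Figures quoted in the article.
ENTRIES = 1_000_000_000   # one billion trace records
AVG_LENGTH = 864          # average characters per record (assumed: one character = one base)


def total_bases(entries: int = ENTRIES, avg_length: int = AVG_LENGTH) -> int:
    """Approximate total number of bases: record count times average record length."""
    return entries * avg_length


if __name__ == "__main__":
    # Prints the estimate with thousands separators.
    print(f"{total_bases():,} bases")
```

Under these assumptions the estimate is 864 billion bases, which is in line with the rounded figure in the headline.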

The Archive is 22 terabytes in size and doubling every ten months - perhaps the largest single scientific database in Europe, if not the world.
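To see what doubling every ten months would imply, the growth can be written as size = 22 x 2^(months / 10). The sketch below is an extrapolation only: it assumes the doubling rate stays constant, which the article states for the present but does not promise for the future. The names are illustrative.

```python
CURRENT_SIZE_TB = 22.0    # size quoted in the article, in terabytes
DOUBLING_MONTHS = 10.0    # doubling period quoted in the article


def projected_size_tb(months_ahead: float) -> float:
    """Projected size in terabytes, assuming the doubling rate stays constant."""
    return CURRENT_SIZE_TB * 2 ** (months_ahead / DOUBLING_MONTHS)


if __name__ == "__main__":
    for months in (0, 10, 20, 30):
        print(f"after {months:2d} months: {projected_size_tb(months):.0f} TB")
```

On this assumption the Archive would reach 44 terabytes after ten months and 88 terabytes after twenty.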

Martin Widlake, Database Services Manager at the Wellcome Trust Sanger Institute, said: "At 22 000 GB the Trace Archive is in the Top Ten UNIX databases in the world. That's not bad for a research organisation of 850 employees in the countryside just outside Cambridge."

"It is possibly the biggest single (acknowledged) scientific RDBMS database in Europe, if not the world."

All the data are freely available to the world scientific community (http://trace.ensembl.org/), as a resource to geneticists all over the globe. When a researcher is studying a disease or gene, they can download the genetic information known about the area they are studying.

The data are being actively used by biomedical researchers in academic and commercial organizations. The three internet domains that make most use of the Trace Archive are .com, .edu and .uk. Dotcoms are responsible for about 80% of downloads each week - mostly as big 'customers', taking vast chunks each visit. Next are US university researchers, followed by UK scientists.

Trace data are the raw results of genetic research. They allow researchers to identify and study genes, to reveal variations (mutations) in genes and to study similarity to genes in other organisms. These are vital starting points for studying and better understanding the biology of health and disease.

By any comparison, the billion records stand above many other familiar repositories. The British Library holds 13 million items; the US Library of Congress holds 115 million items. The Trace Archive holds one billion chunks of unique information.

"Accessing the data becomes a larger and larger problem as the dataset grows," continued Martin Widlake. "At present it is simple and very quick to access a record if you know its unique identifier as issued by the Sanger Institute, the US National Center for Biotechnology Information (NCBI) database, or the 'name' of the trace as given by the organization that originally sequenced that piece of genetic information."

"Scanning the whole dataset for a single genetic sequence, which is a lot like searching for a single sentence in the contents of the British Library, is a massive task. However, the team at the Sanger Institute are working on new methods to make the data easier to search and access."
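The contrast Widlake draws, quick access by identifier versus a slow scan of the whole dataset, can be illustrated with a toy example. This is a sketch only: the identifiers, sequences and function names are invented for illustration, and an in-memory dictionary stands in for the real database, whose implementation the article does not describe.

```python
from typing import Dict, List, Optional

# Hypothetical toy records (identifier -> sequence); not real Trace Archive data.
TRACES: Dict[str, str] = {
    "trace-0001": "ACGTTGCAAC",
    "trace-0002": "GGCATTACGA",
    "trace-0003": "TTGACCGTAA",
}


def find_by_id(trace_id: str) -> Optional[str]:
    """Keyed lookup: goes straight to one record, however many records exist."""
    return TRACES.get(trace_id)


def scan_for_sequence(query: str) -> List[str]:
    """Full scan: every record must be inspected to find the ones containing the query."""
    return [trace_id for trace_id, sequence in TRACES.items() if query in sequence]


if __name__ == "__main__":
    print(find_by_id("trace-0002"))
    print(scan_for_sequence("TTAC"))
```

With three records the difference is invisible; with a billion, the keyed lookup still touches one record while the scan must read them all, which is why searching by sequence is the harder problem.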

The data are held in duplicate, with the NCBI also maintaining a copy: with two sites holding it, a single disaster cannot wipe out the only copy of this vital and heavily used database.


Story Source:

The above story is based on materials provided by Wellcome Trust Sanger Institute. Note: Materials may be edited for content and length.


Cite This Page:

Wellcome Trust Sanger Institute. "Around The World In 800 Billion Bases: Sanger Institute Genetic Records Are World's Biggest." ScienceDaily. ScienceDaily, 19 January 2006. <www.sciencedaily.com/releases/2006/01/060118090322.htm>.
Wellcome Trust Sanger Institute. (2006, January 19). Around The World In 800 Billion Bases: Sanger Institute Genetic Records Are World's Biggest. ScienceDaily. Retrieved October 21, 2014 from www.sciencedaily.com/releases/2006/01/060118090322.htm
Wellcome Trust Sanger Institute. "Around The World In 800 Billion Bases: Sanger Institute Genetic Records Are World's Biggest." ScienceDaily. www.sciencedaily.com/releases/2006/01/060118090322.htm (accessed October 21, 2014).
