Featured Research

from universities, journals, and other organizations

Computer database compresses DNA sequences used in medical research

Date:
November 15, 2009
Source:
Inderscience
Summary:
Researchers in Egypt have developed a technique to compress DNA sequences of the kind used in medical research so that they take up far less space in a computer database, without any loss of information.

Researchers in Egypt have developed a technique to compress DNA sequences of the kind used in medical research so that they take up far less space in a computer database, without any loss of information. The approach is described in detail in a forthcoming issue of the International Journal of Bioinformatics Research and Applications.

Molecular sequence databases, such as those at EMBL, GenBank, and Entrez, contain millions of DNA sequences that fill many thousands of gigabytes of computer storage. With almost every new scientific publication in genetics and related sciences, a new sequence is added, and the rate at which the data are accumulating is on the rise.

These sequences play a vital role in medical research, disease diagnosis, and the design and development of new drugs.

DNA sequences are composed of just four different bases, labelled A, C, G, and T. Each base can be represented in computer code by a two-digit binary number, in other words by two bits: A (00), C (01), G (10), and T (11). At first glance, one might imagine that this is the most efficient way to store DNA sequences.
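
As a rough illustration of the two-bit representation described above, the following Python sketch packs four bases into each byte. The bit codes come from the article; the function names `pack` and `unpack` and the sample sequence are made up for this example, and it shows only the baseline encoding, not the researchers' algorithm.

```python
# Two-bit codes taken from the article: A (00), C (01), G (10), T (11).
CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = "ACGT"  # position i holds the base whose code is i


def pack(seq: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte (hypothetical helper)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        # Left-align a short final chunk so bases are always read from the top bits.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, n_bases: int) -> str:
    """Recover the first n_bases bases from packed bytes (hypothetical helper)."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases[:n_bases])


seq = "GATTACA"  # made-up sample sequence
packed = pack(seq)
print(len(seq), "bases ->", len(packed), "bytes")  # 7 bases -> 2 bytes
print(unpack(packed, len(seq)) == seq)  # True
```

The number of bases has to be stored alongside the packed bytes, because the final byte may contain padding bits that would otherwise be read as extra A's.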

DNA sequences, however, are not random. They contain repeating sections, palindromes, and other features that can be represented by fewer bits than are required to spell out the complete sequence in binary. A repeat could be abbreviated to, say, the binary equivalent of "six times G", which would be a few bits shorter than explicitly writing "GGGGGG" in binary. Similarly, palindromes could be encoded by reference to their complementary pattern in the DNA sequence rather than being written out in full.
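
The "six times G" idea is essentially run-length encoding, and it can be sketched in a few lines of Python. This is a generic textbook technique shown under stated assumptions, not the algorithm from the paper; the function names and sample sequences are invented for illustration. The sketch also assumes that "palindrome" is meant in the usual DNA sense of a stretch that equals its own reverse complement.

```python
from itertools import groupby


def run_length_encode(seq: str) -> list:
    """Collapse each run of identical bases into a (base, count) pair."""
    return [(base, len(list(run))) for base, run in groupby(seq)]


def run_length_decode(pairs: list) -> str:
    """Expand (base, count) pairs back into the original string."""
    return "".join(base * count for base, count in pairs)


# Translation table pairing each base with its complement (A-T, C-G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand read in the opposite direction."""
    return seq.translate(COMPLEMENT)[::-1]


seq = "AAGGGGGGTC"  # made-up sample containing a run of six G's
pairs = run_length_encode(seq)
print(pairs)  # [('A', 2), ('G', 6), ('T', 1), ('C', 1)]
print(run_length_decode(pairs) == seq)  # True

# GAATTC equals its own reverse complement, so it is a DNA palindrome.
print(reverse_complement("GAATTC") == "GAATTC")  # True
```

Run-length encoding only saves space when runs are long enough to outweigh the cost of storing the count, which is one reason practical DNA compressors combine several kinds of pattern matching.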

Many computer users are familiar with compression software that can remove "redundant" code from a music file -- to produce an mp3 -- or an image -- to make a jpg. However, these compression methods lose information. Less familiar to many users are lossless compression methods such as FLAC for sound files, TIFF for images, and the "zip" format for documents and other files. Lossless compression exploits the repeats, palindromes and patterns present in the digital data to reduce the overall size of the file in question.
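
To make "lossless" concrete, the sketch below uses Python's standard `zlib` module, which implements the DEFLATE method commonly used inside zip files, to compress a sequence and then restore it byte for byte. The sequence is artificial and deliberately repetitive, so the size reduction it shows says nothing about real genomic data or about the researchers' algorithm.

```python
import zlib

# A deliberately repetitive, made-up sequence stored as one ASCII byte per base.
original = ("ACGTTGCA" * 500).encode("ascii")

compressed = zlib.compress(original, 9)  # 9 = highest compression level
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
print(restored == original)  # True: every byte is recovered
```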

Now, Taysir Soliman of the Faculty of Computer and Information, at Assiut University, and colleagues Tarek Gharib, Alshaimaa Abo-Alian, and M.A. El Sharkawy of the Faculty of Computer and Information Sciences, at Ain Shams University, have developed a Lossless Compression Algorithm (LCA) that works with digitized DNA sequences to reduce the amount of computer storage needed for each sequence.

LCA achieves a better compression ratio than existing compression algorithms for DNA, such as GenCompress, DNACompress, and DNAPack, the team says. The same approach could also be used for protein sequences.

The compression algorithm may also have direct application in DNA research, the team suggests. They are now investigating ways in which the results of the compression might be used to differentiate between sections of a DNA sequence that code for proteins and those in the sequence that do not, so-called non-coding regions.


Story Source:

The above story is based on materials provided by Inderscience. Note: Materials may be edited for content and length.


Journal Reference:

  1. Soliman et al. A Lossless Compression Algorithm for DNA sequences. International Journal of Bioinformatics Research and Applications, 2009; 5 (6): 593 DOI: 10.1504/IJBRA.2009.029040

Cite This Page:

Inderscience. "Computer database compresses DNA sequences used in medical research." ScienceDaily. ScienceDaily, 15 November 2009. <www.sciencedaily.com/releases/2009/11/091111120105.htm>.
Inderscience. (2009, November 15). Computer database compresses DNA sequences used in medical research. ScienceDaily. Retrieved February 27, 2015 from www.sciencedaily.com/releases/2009/11/091111120105.htm
Inderscience. "Computer database compresses DNA sequences used in medical research." ScienceDaily. www.sciencedaily.com/releases/2009/11/091111120105.htm (accessed February 27, 2015).
