Featured Research

from universities, journals, and other organizations

Software Program Developed At Georgia Tech Adds New Capabilities To Decipher Genomes Of Higher Organisms

Date:
December 2, 1998
Source:
Georgia Institute Of Technology
Summary:
A software program that has been successfully annotating the genes of common bacteria since 1992 is now capable of finding genes in higher organisms. It is particularly useful for finding human genes in anonymous human DNA sequences. Researchers deciphered the complete bacterial genome of Haemophilus influenzae using GeneMark, a software program developed at Georgia Tech.

Understanding the genomes of key microorganisms may increase understanding of human genetics because lower organisms have some genes that correspond to human genes. Also, scientists can design new drugs based on knowledge of disease-causing bacteria.

The original software program, called GeneMark, uses probabilistic mathematical models to predict the locations of genes on a strand of DNA. GeneMark was developed by Dr. Mark Borodovsky, a professor of biology at the Georgia Institute of Technology. It has become the world's most-used software program for deciphering bacterial DNA and has proven itself 98 percent accurate.

Borodovsky's latest development uses GeneMark.hmm, a refined version of the original program, as its base to make more sophisticated predictions for the genomes of eukaryotic, or higher, organisms.

"Deciphering bacterial DNA is simpler than deciphering human DNA since its genes run continuously, without gaps," Borodovsky explained. "The genes of human DNA may be divided into pieces, called exons, with non-coding genetic material between the exons. These spacers in the genes, called introns, were hard to detect by a computer algorithm. Also, eukaryotic DNA is much longer, with an average gene size of 10,000 nucleotides."

Therefore, the predictions of where eukaryotic genes lie on a strand of DNA must include predictions of the boundaries between the exons, which contain the genetic information, and introns, which are the non-coding regions.

To create a computer program to achieve this, Borodovsky employed a class of probabilistic mathematical models called hidden Markov models, or HMMs. A recent grant from the National Institutes of Health (NIH) is funding the incorporation of HMMs into GeneMark, making the program responsive to the boundaries between exons and introns.
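As a rough illustration of how a hidden Markov model can place exon-intron boundaries, the sketch below runs the standard Viterbi algorithm on a two-state toy model. It is not GeneMark.hmm: the states, the probabilities in START, TRANS and EMIT, and the example sequence are all invented for this illustration, and a real gene finder uses far richer models.

```python
import math

STATES = ("exon", "intron")

# Hypothetical toy parameters, chosen only so the example is easy to follow.
START = {"exon": 0.5, "intron": 0.5}
TRANS = {
    "exon": {"exon": 0.9, "intron": 0.1},
    "intron": {"exon": 0.1, "intron": 0.9},
}
EMIT = {
    "exon": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},    # assumed GC-rich
    "intron": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},  # assumed AT-rich
}


def viterbi(seq):
    """Return the most probable state path for seq under the toy HMM."""
    if not seq:
        return []
    # Work in log space to avoid numeric underflow on long sequences.
    best = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for base in seq[1:]:
        prev_best, best, pointers = best, {}, {}
        for s in STATES:
            pred = max(STATES, key=lambda p: prev_best[p] + math.log(TRANS[p][s]))
            best[s] = (prev_best[pred] + math.log(TRANS[pred][s])
                       + math.log(EMIT[s][base]))
            pointers[s] = pred
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def boundaries(path):
    """Positions where the predicted state changes (exon-intron boundaries)."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]


toy_sequence = "GCGCGCGC" + "ATATATATAT" + "GCGCGCGC"
toy_path = viterbi(toy_sequence)
toy_boundaries = boundaries(toy_path)
```

The model prefers to stay in its current state, so it switches only when the evidence from several consecutive bases outweighs the cost of a transition. That is the basic reason an HMM can mark boundaries rather than labeling each base in isolation.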

Borodovsky developed GeneMark.hmm with Dr. Alexander Lukashin, a researcher who works in the lab. A test of the program demonstrated its "state-of-the-art accuracy," said Borodovsky, meaning, when tested against current means of finding eukaryotic genes, GeneMark.hmm performed at least as well as the best current methods.

"GeneMark.hmm is more than a static software program or product," Borodovsky noted. "It is rather an approach for DNA sequence analysis that is under continuous development."

It is already being used to annotate parts of the genomes of five eukaryotic organisms, including humans, nematodes, fruit flies and a plant in the mustard family.

Borodovsky will present his latest results at the International Workshop on Genomic Sequence Analysis on Dec. 1-4 at the Isaac Newton Institute for Mathematical Sciences at the University of Cambridge in England.

GeneMark.hmm will fill a need, as evidenced by early demand from scientists, Borodovsky said. Even before a description of GeneMark.hmm had been published in a scientific journal, almost 30 researchers expressed interest to one of Borodovsky's graduate students, John Besemer, who gave a poster presentation on GeneMark.hmm at a recent conference on the eukaryotic organism Chlamydomonas reinhardtii.

Meanwhile, Borodovsky has recorded his research in predicting gene coding regions in a chapter of the new book "Organization of the Prokaryotic Genome," soon to be published by the American Society for Microbiology. The chapter is called "Statistical Predictions in Genuine Coding Regions."

Borodovsky, a Russian emigre, conceived the idea for GeneMark while still living in Russia in the 1980s. He envisioned a software program based on Markov models to manage the vast amounts of genetic information scientists were churning out.

The Russian mathematician Andrey Markov introduced his models early in the 20th century. Borodovsky believed Markov models could characterize genes by the frequency with which certain combinations of bases occur in known genes, as opposed to non-coding DNA. Therefore, these probabilistic models could be applied to DNA sequences to predict where genes would lie on DNA.
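The idea of telling genes from non-genes by base-combination frequencies can be sketched with two first-order Markov chains, one trained on coding examples and one on non-coding examples, compared by a log-odds score. This is a simplified illustration and not GeneMark's actual model; the training sequences, the pseudocount and the function names are assumptions made up for the example.

```python
import math

BASES = "ACGT"


def train_markov(sequences, pseudocount=1.0):
    """Estimate first-order transition probabilities P(next base | previous base)."""
    counts = {a: {b: pseudocount for b in BASES} for a in BASES}
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    model = {}
    for prev, row in counts.items():
        total = sum(row.values())
        model[prev] = {nxt: count / total for nxt, count in row.items()}
    return model


def log_odds(seq, coding_model, noncoding_model):
    """Sum of log-likelihood ratios; positive means 'looks more like coding'."""
    score = 0.0
    for prev, nxt in zip(seq, seq[1:]):
        score += math.log(coding_model[prev][nxt] / noncoding_model[prev][nxt])
    return score


# Invented toy training data: real models are trained on known genes.
coding_examples = ["ATGGCGGCGGCCGCGGGC", "ATGGGCGCCGCGGCGGCC"]
noncoding_examples = ["ATATTTAATATATTTAAA", "TTTAAATATATAATTTAT"]

coding_model = train_markov(coding_examples)
noncoding_model = train_markov(noncoding_examples)

gc_rich_score = log_odds("GCGGCCGCG", coding_model, noncoding_model)
at_rich_score = log_odds("TATAATTTA", coding_model, noncoding_model)
```

With this toy data, a fragment that resembles the coding examples gets a positive score and one that resembles the non-coding examples gets a negative score. Sliding such a score along a long sequence is one simple way to flag candidate gene regions.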

When scientists sequence DNA, they are left with strings of nucleotides. To make sense of them, the strings must be separated into genes and non-coding regions, and the genes then translated into protein sequences.
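To make the last step concrete, here is a minimal sketch of translating a stretch of DNA into protein using the standard genetic code. It takes the naive approach of reporting every open reading frame that runs from an ATG start codon to a stop codon on one strand. That is a stand-in for real gene prediction, not the method GeneMark uses, and the function name and min_codons parameter are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def find_orfs(dna, min_codons=2):
    """Return (start, end, protein) for each ATG-to-stop reading frame found.

    Only the given strand is scanned; end is the index just past the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i
            elif CODON_TABLE.get(codon) == "*":
                # Unknown codons (for example, ones containing N) become "X".
                protein = "".join(
                    CODON_TABLE.get(dna[j:j + 3], "X") for j in range(start, i, 3)
                )
                if len(protein) >= min_codons:
                    orfs.append((start, i + 3, protein))
                start = None
    return orfs


example_orfs = find_orfs("CCATGGCTTGGAAATAACC")
```

Each returned protein string is the kind of sequence that can then be compared against a database of known proteins.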

Since 1992, researchers from around the world have sent their sequenced DNA fragments via e-mail to Georgia Tech's GeneMark e-mail server, which predicts locations of genes. After mapping gene locations, the computer program compares each newly predicted protein sequence to known ones in a database. This comparison helps determine protein function. The protein analysis is done in collaboration with the National Center for Biotechnology Information at the NIH.

GeneMark has proven itself a powerful tool for finding bacterial genes, in particular. Researchers at the Institute for Genomic Research have used GeneMark to find genes in the complete genomes of numerous common bacteria.

GeneMark Genesis, the refined version of GeneMark, which Borodovsky developed with graduate student William Hayes, was used to find genes in genomes of the bacteria Methanococcus jannaschii and Helicobacter pylori. There were no experimentally studied segments of M. jannaschii available to train the Markov models upon which gene prediction is based in GeneMark. So the new program "learned Markov models from anonymous sequences based on the grammar of the genetic code," Borodovsky said.

Borodovsky's work is at the forefront of a new interdisciplinary field called bioinformatics, which uses mathematical methods and computers to answer many important biological questions. Bioinformatics can also help discover genes and design new drugs. Borodovsky is spearheading development of Georgia Tech's new master's degree program in bioinformatics, the first such program in the United States.


Story Source:

The above story is based on materials provided by Georgia Institute Of Technology. Note: Materials may be edited for content and length.


Cite This Page:

Georgia Institute Of Technology. "Software Program Developed At Georgia Tech Adds New Capabilities To Decipher Genomes Of Higher Organisms." ScienceDaily. ScienceDaily, 2 December 1998. <www.sciencedaily.com/releases/1998/12/981202075222.htm>.
Georgia Institute Of Technology. (1998, December 2). Software Program Developed At Georgia Tech Adds New Capabilities To Decipher Genomes Of Higher Organisms. ScienceDaily. Retrieved November 26, 2014 from www.sciencedaily.com/releases/1998/12/981202075222.htm
Georgia Institute Of Technology. "Software Program Developed At Georgia Tech Adds New Capabilities To Decipher Genomes Of Higher Organisms." ScienceDaily. www.sciencedaily.com/releases/1998/12/981202075222.htm (accessed November 26, 2014).
