Bioinformatician helps biologists find key genes

Date:
October 7, 2014
Source:
South Dakota State University
Summary:
It’s like looking for a needle in a haystack. Scientists searching for the gene or gene combination that affects even one plant or animal characteristic must sort through massive amounts of data, according to a professor of mathematics and statistics. He leads a bioinformatics research group, which provides the expertise that plant and animal scientists need to uncover how genes and proteins affect cell functions.

It's like looking for a needle in a haystack.

Scientists searching for the gene or gene combination that affects even one plant or animal characteristic must sort through massive amounts of data, according to associate professor Xijin Ge of the mathematics and statistics department at South Dakota State University.

"Biologists used to study one gene at time, but now they can look at tens of thousands of genes at once." Ge said. Just one experiment to analyze gene expression can produce one terabyte of sequence data. "That's a little beyond many biologists' comfort zone."

He leads the bioinformatics research group, which provides the expertise that SDSU plant and animal scientists need to uncover how genes and proteins affect cell functions.

Setting up the experiments

Typically, scientists consult with Ge when planning their studies. After examining what they want to investigate, the researchers decide which techniques to use to obtain the data and draw up a plan to analyze it.

"It's critical to have the statistician and biologist working together," noted plant science professor Fedora Sutton, who worked with Ge on identifying gene interactions that account for freeze resistance in winter wheat. "He is able to say, based on statistical rules and regulations, this is where this has to be."

Using the same technique on one sample is not enough, Sutton pointed out. Multiple samples must be grown under the same conditions and then analyzed to have biological replicates. Ge explained that experiments must be designed to gather biological replicates, which come from separate samples, rather than technical replicates, which are repeated measurements of the same sample.

Once the technique to gather data is chosen and a plan of data analyses is created, Ge said, "we can figure out how many replicates are needed."
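The article does not say how the group arrives at a replicate count. As a rough illustration only, the sketch below uses a common textbook approximation for comparing two groups with a normal model. The function name, the default significance level and power, and the example numbers are all assumptions, not details from Ge's work.

```python
import math
from statistics import NormalDist

def replicates_per_group(effect_size, std_dev, alpha=0.05, power=0.80):
    """Approximate replicates needed per group for a two-sided,
    two-group comparison under a normal approximation (illustrative)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    n = 2 * ((z_alpha + z_beta) * std_dev / effect_size) ** 2
    return math.ceil(n)

# Hypothetical numbers: detect a difference twice the size of the
# sample-to-sample variability.
print(replicates_per_group(effect_size=2.0, std_dev=1.0))
```

In a study covering tens of thousands of genes, the significance level would normally be set much lower to account for the number of tests, which pushes the required number of replicates up.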

Analyzing megabytes of data

"Bioinformatics is an important tool to zoom in on the target gene networks," said Xing-You Gu, who collaborated with Ge to identify genes that are associated with seed dormancy in weedy rice.

Weeds survive adverse environmental conditions because of strong seed dormancy, Gu explained. "To devise new weed management strategies, we need to understand the molecular genetic mechanisms of seed dormancy."

Gu used a map-based cloning strategy and then Ge applied bioinformatics tools, such as statistical tests and clustering, to find the candidate genes. This task involved looking at 30,000 to 40,000 genes, which can produce three to four million data points, according to Ge.
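The article does not name the statistical tests used. One simple possibility, sketched here purely as an assumption, is to compute a Welch t-statistic for each gene between two groups of samples and rank the genes by how strongly they differ. The gene names, group labels and expression values below are made up for illustration.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch t-statistic for two groups of measurements of one gene."""
    diff = mean(group_a) - mean(group_b)
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    if se == 0:
        # No variation within either group: undefined, handled explicitly.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se

def rank_genes(data):
    """Order gene names from most to least different between the groups."""
    return sorted(data, key=lambda g: abs(welch_t(*data[g])), reverse=True)

# Hypothetical expression values: (dormant samples, non-dormant samples).
data = {
    "gene1": ([10, 12, 11], [30, 29, 31]),
    "gene2": ([5, 9, 7], [6, 8, 7]),
    "gene3": ([20, 25, 30], [28, 33, 38]),
}
ranking = rank_genes(data)
print(ranking)
```

A real analysis would also turn each statistic into a p-value and correct for testing tens of thousands of genes at once; that step is left out of this sketch.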

To determine which genes are responsible, Ge must first eliminate those data points that contain noise and then "focus on the reliable signals because we're looking at so many genes." Sometimes nearly half the data are eliminated.
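How the noisy data points are identified is not described in the article. A minimal sketch of one common approach, assuming that a gene counts as reliable only when its expression reaches a minimum level in enough samples, might look like this. The thresholds, function name and values are hypothetical.

```python
def filter_reliable_genes(expression, min_level=10.0, min_samples=2):
    """Keep genes expressed at or above min_level in at least min_samples samples."""
    kept = {}
    for gene, values in expression.items():
        n_detected = sum(1 for v in values if v >= min_level)
        if n_detected >= min_samples:
            kept[gene] = values
    return kept

# Hypothetical expression levels for four genes across four samples.
example = {
    "geneA": [120.0, 98.0, 110.0, 105.0],
    "geneB": [0.0, 2.0, 1.0, 0.0],
    "geneC": [15.0, 3.0, 22.0, 4.0],
    "geneD": [11.0, 1.0, 0.0, 2.0],
}
reliable = filter_reliable_genes(example)
print(sorted(reliable))  # ['geneA', 'geneC']
```

In this toy example half of the genes are dropped, which mirrors the proportion Ge says is sometimes eliminated.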

Visualizing gene expression

Ge uses data-mining algorithms to find patterns of interest to the scientists. Typically, Ge's analysis produces a visual representation of the statistically significant data.

One of Sutton's visuals was a heat map that showed genes whose expression was increased, or upregulated, in red; those that were shut down, or downregulated, in green; and those unaffected in black. This allowed her to identify six genes as potential markers, which will then help breeders develop more lines of freeze-resistant winter wheat.
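The red, green and black scheme can be illustrated with a short sketch. It assumes, since the article does not say, that each gene is colored by its log2 fold change between a control and a treated sample, with a cutoff of a twofold change. The function name, cutoff and pseudocount are illustrative choices.

```python
import math

def heatmap_color(control, treated, threshold=1.0, pseudocount=1.0):
    """Assign a heat-map color from the log2 fold change of one gene."""
    # The pseudocount avoids division by zero for unexpressed genes.
    log2_fc = math.log2((treated + pseudocount) / (control + pseudocount))
    if log2_fc >= threshold:
        return "red"    # upregulated
    if log2_fc <= -threshold:
        return "green"  # downregulated
    return "black"      # unaffected

print(heatmap_color(10, 50))  # red
print(heatmap_color(50, 10))  # green
print(heatmap_color(20, 22))  # black
```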

"We are trying to explain what's going on in the cell," Ge said. "We have to make the data tell a story."

After identifying the genes, the researchers "want to piece together the jigsaw puzzle and figure out the common characteristics of the affected genes," Ge explained. This will allow them to identify the sub-systems, or pathways, that are regulated.

About SDSU Bioinformatics Research Group

The Bioinformatics Research Group, led by Dr. Xijin Ge, is devoted to using the tools of mathematics, computer science, and the biological sciences to explore the frontiers of the natural world. Research focuses on using, discovering and implementing statistical, machine learning and data mining algorithms to find patterns of interest within the mass of publicly available biological data. Members of our group are involved in studying evolutionary comparative genomics, text mining, and analysis of gene expression data.


Story Source:

Materials provided by South Dakota State University. Note: Content may be edited for style and length.


Cite This Page:

South Dakota State University. "Bioinformatician helps biologists find key genes." ScienceDaily. ScienceDaily, 7 October 2014. <www.sciencedaily.com/releases/2014/10/141007184204.htm>.
South Dakota State University. (2014, October 7). Bioinformatician helps biologists find key genes. ScienceDaily. Retrieved April 14, 2024 from www.sciencedaily.com/releases/2014/10/141007184204.htm
South Dakota State University. "Bioinformatician helps biologists find key genes." ScienceDaily. www.sciencedaily.com/releases/2014/10/141007184204.htm (accessed April 14, 2024).
