Featured Research

from universities, journals, and other organizations

UC Santa Cruz provides access to encyclopedia of the human genome

Date:
September 5, 2012
Source:
University of California - Santa Cruz
Summary:
The ENCODE project has enabled scientists to assign specific functions to 80 percent of the human genome, providing new insights into the mechanisms of gene regulation and giving biomedical researchers a solid genetic foundation for understanding how the body works in health and disease. The project's data coordination center at UCSC has made all of the ENCODE data available for public use through the UCSC Genome Browser.

A massive international collaboration has enabled scientists to assign specific functions to 80 percent of the human genome, providing new insights into the mechanisms of gene regulation and giving biomedical researchers a solid genetic foundation for understanding how the body works in health and disease.

The results of the Encyclopedia of DNA Elements (ENCODE) project are described in a coordinated set of 30 papers published in several journals on September 5, 2012. Scientists at the University of California, Santa Cruz, have operated the Data Coordination Center for ENCODE since an initial pilot project began in 2003, and they have made all of the ENCODE data available for public use through the UCSC Genome Browser.

"Our job was to gather data from 32 labs running different types of experiments on a staggering array of cells and tissues, and we had to establish a common data language so we could get it all into a single database that scientists across the world could use. We also developed a lot of new ways of looking at the data, creating search and visualization tools so that people could find the data most relevant to them," said Jim Kent, director of the UCSC Genome Browser project and head of the ENCODE Data Coordination Center.

ENCODE is supported by the National Human Genome Research Institute (NHGRI), one of the National Institutes of Health. Hundreds of researchers across the United States, United Kingdom, Spain, Singapore, and Japan performed more than 1,600 sets of experiments on 147 types of tissue using technologies standardized across the consortium. In total, ENCODE generated more than 15 trillion bytes of raw data, and the data analysis consumed the equivalent of more than 300 years of compute time.

"We've come a long way, and we have learned an incredible amount by integrating the different types of data that ENCODE produced, which was done at a scale never before achieved in biology. This data integration was one of the keys to the success of the project," said Ewan Birney of the European Bioinformatics Institute in the United Kingdom, lead analysis coordinator of the ENCODE data.

For Kent and his data coordination team at UCSC's Center for Biomolecular Science and Engineering, the scale of the project presented many challenges. To start with, they had to coordinate a small army of researchers who were producing data in labs around the world. "We had five data wranglers who traveled around to the labs, probably four conference calls a week at the height of it, plus large group meetings twice a year, and countless emails and Skype calls," Kent said.

Researchers were able to map more than 4 million regulatory regions in the human genome where proteins specifically interact with the DNA. These findings represent a significant advance in understanding the precise and complex controls over how and when genes are active within a cell.

"The regulatory elements are responsible for ensuring that you get crystalline protein in the lens of your eye and hemoglobin in your blood, and not the other way around," Kent said. "It's quite complex. The information processing and the intelligence of the genome reside in the regulatory elements. With this project, we probably went from understanding less than five percent to now around 75 percent of them."

The ENCODE data are rapidly becoming a fundamental resource for researchers working to understand human biology and disease. More than one hundred papers using ENCODE data have already been published by investigators who were not part of the ENCODE project. For example, researchers studying the genetic basis of human diseases use genome-wide association studies to identify disease-associated variants, or markers, in the genome, and they are using the ENCODE resource in an effort to determine which of the many specific variants identified in a study actually contribute to disease. These disease-associated variants map not only to protein-coding regions of the genome, but more often to the non-coding regions of the genome, the vast tracts of sequence between genes where ENCODE has identified many regulatory sites.

"As much as nine out of 10 times, disease-linked genetic variants are not in protein-coding regions," said Mike Pazin, an ENCODE program director at NHGRI. "Far from being 'junk' DNA, this regulatory DNA clearly makes important contributions to human disease."

The coordinated publication set includes one main integrative paper and five other papers in the journal Nature; 18 papers in Genome Research; and six papers in Genome Biology. The ENCODE data are so complex that the three journals have developed a pioneering way to present the information in an integrated form that they call "threads." Since the same topics were addressed in different ways in different papers, a new website will allow anyone to follow a topic through all of the papers in the ENCODE publication set in which it appears. In addition to the "threaded papers," six review articles are being published in the Journal of Biological Chemistry, and other affiliated papers in Science, Cell, and other journals.

Despite the enormous size of the data set described in this historic set of publications, it does not comprehensively describe all of the functional elements in all of the different types of cells in the human body. Much additional work needs to be done, and ENCODE is about to be renewed for an additional four years. During the next phase, ENCODE will increase the depth of the catalog with respect to the types of functional elements and cell types studied. It will also develop new tools for more sophisticated analyses of the data.


Story Source:

The above story is based on materials provided by University of California - Santa Cruz. The original article was written by Tim Stephens. Note: Materials may be edited for content and length.


Cite This Page:

University of California - Santa Cruz. "UC Santa Cruz provides access to encyclopedia of the human genome." ScienceDaily. ScienceDaily, 5 September 2012. <www.sciencedaily.com/releases/2012/09/120905135004.htm>.
University of California - Santa Cruz. (2012, September 5). UC Santa Cruz provides access to encyclopedia of the human genome. ScienceDaily. Retrieved July 31, 2014 from www.sciencedaily.com/releases/2012/09/120905135004.htm
University of California - Santa Cruz. "UC Santa Cruz provides access to encyclopedia of the human genome." ScienceDaily. www.sciencedaily.com/releases/2012/09/120905135004.htm (accessed July 31, 2014).

