Bioinformatics software developed to predict effect of cancer-associated mutations

Software analyzes 40,000 proteins per minute

June 30, 2016
University of the Basque Country
A new piece of software has been developed that analyzes mutations in proteins. These mutations are potential inducers of diseases such as cancer. The tool is a free, easy-to-use, versatile and, above all, fast bioinformatics application capable of analyzing and combining the information from 40,000 proteins within one minute.

A mutation of amino acid 144 of the UPSP21 protein (which has over 500 amino acids) causes this protein to accumulate in the nucleus of a cell instead of remaining outside it. The image, obtained by means of fluorescence microscopy, shows the protein in green and the cell nucleus in blue.
Credit: UPV/EHU

Proteins consist of chains of amino acids, and within each chain it is possible to identify short amino acid sequences with a discrete function, called functional motifs. Some of these motifs have already been described; others have yet to be characterized. When a functional motif is altered by a mutation, the change could influence the development of a disease such as cancer. Checking for possible changes in a protein is one of the first steps in researching its function. Given that the current draft of the human proteome comprises over 40,000 proteins, searching for mutations in each one is a monumental task.

That is why three researchers began work on a bioinformatics tool. José Antonio Rodríguez had the biological question; Asier Fullaondo, the knowledge of bioinformatics tools and databases; and Gorka Prieto, the programming skills. Initially, these PhD holders developed a piece of software (WREGEX, available to the scientific community on the UPV/EHU's server) that can be used to search for and predict 'functional motifs' (small groups of amino acids that perform specific tasks in a protein) automatically. They tested the programme by predicting motifs that move a protein from the nucleus to the cytoplasm of a cell, the so-called 'nuclear export signals'. This research phase ended in 2014 with the publication of a paper in the journal Bioinformatics. But, as José Antonio Rodríguez pointed out, "in research the answer to one question opens the door to more questions." The question on that occasion was: which proteins contain a functional motif that is mutated in cancer?
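To give a rough idea of what searching a protein sequence for a motif involves, here is a minimal Python sketch. It is not the WREGEX algorithm: the pattern is a simplified, textbook-style approximation of a nuclear export signal (four hydrophobic residues separated by short spacers), the function name `find_motifs` is hypothetical, and the sequence is a toy example chosen for illustration.

```python
import re

# ASSUMPTION: a simplified, illustrative export-signal-like pattern,
# not the pattern or scoring scheme that WREGEX actually uses.
PHI = "[LIVFM]"  # hydrophobic residues
MOTIF = re.compile(
    "(?=(" + PHI + ".{2,3}" + PHI + ".{2,3}" + PHI + "." + PHI + "))"
)


def find_motifs(sequence):
    """Return (start, matched_text) pairs for each candidate motif.

    Start positions are 1-based, as is usual for amino acid numbering.
    The lookahead lets candidates that overlap each other be reported.
    """
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(sequence)]


# Toy sequence (hypothetical, for illustration only).
print(find_motifs("GSNELALKLAGLDINKTE"))
```

A plain pattern match like this only yields candidates; a tool of this kind still has to rank them, and the candidates then need experimental confirmation.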

The team took another step and combined the information on the sequences of all known human proteins with the COSMIC catalogue, which gathers the mutations linked to cancer. The result was a new version (WREGEX 2.0) that compares a normal protein with its mutant counterpart in order to predict 'functional motifs' that have been modified and could be linked to cancer. "You may also have experience in how a motif functions and want to find out which proteins it could appear in and whether it appears modified in cancer. With this software you can obtain candidates to start studying," explained Gorka Prieto.
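The idea of comparing a normal protein with its mutant version can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the WREGEX 2.0 implementation: the motif pattern is a simplified stand-in, the helper names (`apply_substitution`, `motifs_lost`) are hypothetical, and the sequence and mutation are invented rather than taken from the COSMIC catalogue.

```python
import re

# ASSUMPTION: simplified export-signal-like pattern, for illustration only.
PHI = "[LIVFM]"  # hydrophobic residues
MOTIF = re.compile(
    "(?=(" + PHI + ".{2,3}" + PHI + ".{2,3}" + PHI + "." + PHI + "))"
)


def find_motifs(sequence):
    """Return the set of (1-based start, matched_text) motif candidates."""
    return {(m.start() + 1, m.group(1)) for m in MOTIF.finditer(sequence)}


def apply_substitution(sequence, position, new_residue):
    """Replace the residue at a 1-based position with new_residue."""
    index = position - 1
    return sequence[:index] + new_residue + sequence[index + 1:]


def motifs_lost(wild_type, position, new_residue):
    """List wild-type candidates that no longer match at the same start
    position once the single-residue substitution has been applied."""
    mutant = apply_substitution(wild_type, position, new_residue)
    mutant_starts = {start for start, _ in find_motifs(mutant)}
    return sorted(
        hit for hit in find_motifs(wild_type) if hit[0] not in mutant_starts
    )


# Hypothetical toy protein and hypothetical mutation (leucine 9 to alanine).
print(motifs_lost("GSNELALKLAGLDINKTE", 9, "A"))
```

In this sketch a substitution that swaps one hydrophobic residue for another leaves the candidate intact, whereas replacing it with a non-hydrophobic residue removes the match; a real tool would score such changes rather than treat them as all-or-nothing.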

Once the bioinformatics programme had been developed, it had to be tested, and to do this they carried out a 'cell export assay'. They again chose various candidate motifs that could be responsible for moving a protein out of the cell nucleus. They checked that the candidates worked and, after modifying them according to the tumour mutations described in the COSMIC catalogue, ran the assay again. In this way, they confirmed that the candidates acted as export signals, that the mutations affected the way they worked, and that the software's predictions were therefore valid.

The tool therefore combines three types of information: protein sequences, functional motifs and cancer mutations. "One of the main features of WREGEX 2.0 is that it can simultaneously study highly complex proteomes with masses of proteins and combine that information, in the case of the trial, with cancer mutations; but the door is open for using other databases containing information about other types of mutations. The advantage, moreover, is that 40,000 proteins a minute can be analysed, while with other programs the analysis of a single protein took several minutes," explained Asier Fullaondo. With this software it is thus possible to predict whether an alteration in a protein may influence the development of a disease, and not only cancer.

So far, thirteen research studies have used this tool, and researchers in China, Japan, Korea, Germany and the United States have accessed the server. In the meantime, the multidisciplinary team formed by the three PhD holders is already planning further work to improve the tool.

Story Source:

Materials provided by University of the Basque Country. Note: Content may be edited for style and length.

Journal Reference:

  1. Gorka Prieto, Asier Fullaondo, Jose A. Rodríguez. Proteome-wide search for functional motifs altered in tumors: Prediction of nuclear export signals inactivated by cancer-related mutations. Scientific Reports, 2016; 6: 25869 DOI: 10.1038/srep25869

Cite This Page:

University of the Basque Country. "Bioinformatics software developed to predict effect of cancer-associated mutations: Software analyzes 40,000 proteins per minute." ScienceDaily. ScienceDaily, 30 June 2016.