Advancing the application of genomic sequences through 'Kmasker plants'

January 21, 2020
Leibniz Institute of Plant Genetics and Crop Plant Research
The correct assembly of plant genomes can be hampered by a large amount of repetitive sequences. Researchers have developed a bioinformatics tool for the automatic detection of repetitive genome regions, based on the identification of k-mers (nucleotide sequences of a pre-determined length).

The development of next-generation sequencing (NGS) has enabled researchers to investigate genomes that might previously have been considered too complex or too expensive. Nevertheless, the analysis of complex plant genomes, which often contain an enormous number of repetitive sequences, is still a challenge. Bioinformatics researchers from the Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Martin Luther University Halle-Wittenberg (MLU) and the Leibniz Institute of Plant Biochemistry (IPB) have therefore published "Kmasker plants," a program that identifies repetitive sequences and thus facilitates the analysis of plant genomes.

In bioinformatics, the term k-mer is used to describe a nucleotide sequence of a certain length "k." By defining and counting such sequences, researchers can quantify repetitive sequences in the genome they are studying and assign them to corresponding positions. As early as 2014, researchers at IPK in Gatersleben used this approach to develop the in-silico (computer-based) tool "Kmasker." It was used to detect repetitions in the characterisation of the barley genome (Schmutzer et al., 2014).
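The counting approach described above can be illustrated with a minimal Python sketch. It is not the Kmasker implementation: the function names, the in-memory `Counter` index and the simple count threshold are assumptions made for illustration, and a real tool would use a far more memory-efficient k-mer index.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping k-mer (substring of length k) in a sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_profile(query, counts, k):
    """For each start position in the query, look up how often the k-mer
    beginning there occurs in the counted data set."""
    query = query.upper()
    return [counts.get(query[i:i + k], 0) for i in range(len(query) - k + 1)]


def mask_repeats(query, counts, k, threshold):
    """Replace with 'N' every base covered by a k-mer whose count
    exceeds the (hypothetical) repeat threshold."""
    masked = list(query)
    for i, count in enumerate(kmer_profile(query, counts, k)):
        if count > threshold:
            masked[i:i + k] = "N" * k
    return "".join(masked)
```

Here `kmer_profile` corresponds to assigning counts to positions in the sequence of interest, and `mask_repeats` shows how positions with high counts could be flagged as repetitive.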

The use of NGS is becoming more and more important, but the error-free composition of complex genomes from NGS results is still a challenge. For this reason, the researchers recently decided to revive and expand this initial proof-of-concept project. Under the leadership of Dr. Thomas Schmutzer, formerly from the research group "Bioinformatics and Information Technology" at IPK and now affiliated with the MLU, scientists from the MLU, the IPK, Wageningen University & Research and the IPB Halle worked in close cooperation on the redesign and development of "Kmasker plants." This collaboration was largely supported by the two service centres "GCBN" and "CiBi" from the German Network for Bioinformatics Infrastructure "de.NBI."

"Kmasker plants" allows for the rapid and reference-free screening of nucleotide sequences using genome-wide derived k-mers. Extending the previous version, the bioinformatics tool now also enables comparative studies between different cultivars or closely related species, and supports the identification of sequences suitable as fluorescence in situ hybridisation (FISH) probes or CRISPR/Cas9-specific guide RNAs. Furthermore, "Kmasker plants" has been published with a web service that contains pre-computed indices for selected economically important crop plants, such as barley and wheat. Dr. Schmutzer emphasises that "this tool will enable plant researchers all over the world to test plant genomes and thus, for example, identify repeat free parts of their sequence of interest." He also believes that the enhanced features will make it possible to detect candidate sequence regions that have multiplied in the genome of one species but are missing, or occur in lower copy numbers, in other species. Such copy-number differences are common and contribute to phenotypic variation of agronomic importance in various crops. A prominent example is the Vrn-H2 gene, which is present in a single copy in winter barley but missing in spring barley lines.
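The comparative idea in the paragraph above, finding regions that are more abundant in one cultivar or species than in another, can be sketched as a per-position ratio of k-mer counts. This is an illustrative assumption, not the method used by "Kmasker plants": the function names and the pseudocount are hypothetical, and real data would first need to be normalised for sequencing depth.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping k-mer in a sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def copy_number_ratio(query, counts_a, counts_b, k, pseudocount=1):
    """Per-position ratio of k-mer counts in data set A versus data set B.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a k-mer is absent from one data set.
    """
    query = query.upper()
    ratios = []
    for i in range(len(query) - k + 1):
        kmer = query[i:i + k]
        a = counts_a.get(kmer, 0) + pseudocount
        b = counts_b.get(kmer, 0) + pseudocount
        ratios.append(a / b)
    return ratios
```

Ratios well above 1 would point to sequence that is amplified in data set A relative to B, and ratios well below 1 to sequence that is missing or rarer in A.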

The "Kmasker plants" web service is now available as part of the IPK Crop Analysis Tool Suite (CATS) and therefore as a service of the de.NBI Service Platform. Alternatively, the "Kmasker plants" source code can be accessed directly and installed via GitHub.

Story Source:

Materials provided by Leibniz Institute of Plant Genetics and Crop Plant Research. Note: Content may be edited for style and length.

Journal Reference:

  1. Sebastian Beier, Chris Ulpinnis, Markus Schwalbe, Thomas Münch, Robert Hoffie, Iris Koeppel, Christian Hertig, Nagaveni Budhagatapalli, Stefan Hiekel, Krishna M. Pathi, Goetz Hensel, Martin Grosse, Sindy Chamas, Sophia Gerasimova, Jochen Kumlehn, Uwe Scholz, Thomas Schmutzer. Kmasker plants – a tool for assessing complex sequence space in plant species. The Plant Journal, 2020; DOI: 10.1111/tpj.14645

Cite This Page:

Leibniz Institute of Plant Genetics and Crop Plant Research. "Advancing the application of genomic sequences through 'Kmasker plants'." ScienceDaily. ScienceDaily, 21 January 2020.
