Improvement of the banana "Musa acuminata" reference sequence using NGS data and semi-automated bioinformatics methods
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61389030%3A_____%2F16%3A00459912" target="_blank" >RIV/61389030:_____/16:00459912 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1186/s12864-016-2579-4" target="_blank" >http://dx.doi.org/10.1186/s12864-016-2579-4</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1186/s12864-016-2579-4" target="_blank" >10.1186/s12864-016-2579-4</a>
Alternative languages
Result language
English
Original language name
Improvement of the banana "Musa acuminata" reference sequence using NGS data and semi-automated bioinformatics methods
Original language description
Background: Recent advances in genomics indicate functional significance of a majority of genome sequences and their long-range interactions. As a detailed examination of genome organization and function requires a very high quality genome sequence, the objective of this study was to improve the reference genome assembly of banana (Musa acuminata).
Results: We have developed a modular bioinformatics pipeline to improve genome sequence assemblies, which can handle various types of data. The pipeline comprises several semi-automated tools. Unlike classical automated tools that are based on global parameters, the semi-automated tools offer an expert mode in which the user decides on suggested improvements through local compromises. The pipeline was used to improve the draft genome sequence of Musa acuminata. Genotyping by sequencing (GBS) of a segregating population and paired-end sequencing were used to detect and correct scaffold misassemblies. Long insert size paired-end reads identified scaffold junctions and fusions missed by automated assembly methods. GBS markers were used to anchor scaffolds to pseudo-molecules with a new bioinformatics approach that avoids the tedious step of marker ordering during genetic map construction. Furthermore, a genome map was constructed and used to assemble scaffolds into super scaffolds. Finally, a consensus gene annotation was projected on the new assembly from two pre-existing annotations. This approach reduced the total Musa scaffold number from 7513 to 1532 (i.e. by 80 %), with an N50 that increased from 1.3 Mb (65 scaffolds) to 3.0 Mb (26 scaffolds). In the new assembly, 89.5 % of the sequence was anchored to the 11 Musa chromosomes, compared to the previous 70 %.
Conclusion: The release of the Musa acuminata reference genome version 2 provides a platform for detailed analysis of banana genome variation, function and evolution. Bioinformatics tools developed in this work can be used to improve genome sequence assemblies in other species.
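The assembly statistics quoted in the description (scaffold count reduction and N50 with its scaffold count) can be illustrated with a minimal sketch. It is not part of the authors' pipeline; the function names `n50_l50` and `percent_reduction` are hypothetical, and the scaffold lengths in the usage lines are made-up toy values. N50 is taken here in its usual sense: the length of the scaffold at which the cumulative length, counted from the longest scaffold down, first reaches half of the total assembly length. The number of scaffolds needed to get there corresponds to the "65 scaffolds" and "26 scaffolds" figures in the description.

```python
from typing import Iterable, Tuple


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of scaffold lengths.

    N50: length of the scaffold at which the running total, taken from
         the longest scaffold downwards, first reaches half the assembly.
    L50: number of scaffolds needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise AssertionError("unreachable for non-empty input")


def percent_reduction(before: int, after: int) -> float:
    """Percentage decrease from `before` to `after`."""
    return (before - after) / before * 100


# Toy scaffold lengths (illustrative only, not Musa data).
toy_lengths = [10, 8, 6, 4, 2]
n50, l50 = n50_l50(toy_lengths)

# Scaffold counts quoted in the description: 7513 before, 1532 after.
reduction = round(percent_reduction(7513, 1532), 1)
```

With the scaffold counts from the description, `reduction` comes out at 79.6, which matches the rounded "by 80 %" figure.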
Czech name
—
Czech description
—
Classification
Type
J<sub>x</sub> - Unclassified - Peer-reviewed scientific article (Jimp, Jsc and Jost)
CEP classification
EB - Genetics and molecular biology
OECD FORD branch
—
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organization
Others
Publication year
2016
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
BMC Genomics
ISSN
1471-2164
e-ISSN
—
Volume of the periodical
17
Issue of the periodical within the volume
MAR 16
Country of publishing house
US - UNITED STATES
Number of pages
12
Pages from-to
—
UT code for WoS article
000372091000004
EID of the result in the Scopus database
—