Multi-branch Convolutional Neural Network for Identification of Small Non-coding RNA genomic loci
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14740%2F20%3A00114619" target="_blank" >RIV/00216224:14740/20:00114619 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1038/s41598-020-66454-3" target="_blank" >https://doi.org/10.1038/s41598-020-66454-3</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1038/s41598-020-66454-3" target="_blank" >10.1038/s41598-020-66454-3</a>
Alternative languages
Result language
English
Original language name
Multi-branch Convolutional Neural Network for Identification of Small Non-coding RNA genomic loci
Original language description
Genomic regions that encode small RNA genes exhibit characteristic patterns in their sequence, secondary structure, and evolutionary conservation. Convolutional Neural Networks are a family of algorithms that can classify data based on learned patterns. Here we present MuStARD, an application of Convolutional Neural Networks that can learn patterns associated with user-defined sets of genomic regions and scan large genomic areas for novel regions exhibiting similar characteristics. We demonstrate that MuStARD is a generic method that can be trained on different classes of human small RNA genomic loci without the need for domain-specific knowledge, owing to the automated feature and background selection processes built into the model. We also demonstrate the ability of MuStARD to identify functional elements across species by predicting mouse small RNAs (pre-miRNAs and snoRNAs) using models trained on the human genome. MuStARD can be used to filter small RNA-Seq datasets for identification of novel small RNA loci, both within and across species, as demonstrated in three use cases of human, mouse, and fly pre-miRNA prediction. MuStARD is easy to deploy and extend to a variety of genomic classification questions. Code and trained models are freely available at gitlab.com/RBP_Bioinformatics/mustard.
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
The result was created during the implementation of more than one project. More information is available in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2020
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Scientific Reports
ISSN
2045-2322
e-ISSN
—
Volume of the periodical
10
Issue of the periodical within the volume
1
Country of publishing house
GB - UNITED KINGDOM
Number of pages
10
Pages from-to
1-10
UT code for WoS article
000559960100004
EID of the result in the Scopus database
2-s2.0-85086354140