ToTem: a tool for variant calling pipeline optimization
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F65269705%3A_____%2F18%3A00068775" target="_blank" >RIV/65269705:_____/18:00068775 - isvavai.cz</a>
Alternative codes found
RIV/00216224:14740/18:00101855 RIV/61989592:15310/18:73588794
Result on the web
<a href="https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2227-x" target="_blank" >https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2227-x</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1186/s12859-018-2227-x" target="_blank" >10.1186/s12859-018-2227-x</a>
Alternative languages
Result language
English
Title in the original language
ToTem: a tool for variant calling pipeline optimization
Result description in the original language
Background: High-throughput bioinformatics analyses of next generation sequencing (NGS) data often require challenging pipeline optimization. The key problem is choosing appropriate tools and selecting the best parameters for optimal precision and recall. Results: Here we introduce ToTem, a tool for automated pipeline optimization. ToTem is a stand-alone web application with a comprehensive graphical user interface (GUI). ToTem is written in Java and PHP with an underlying connection to a MySQL database. Its primary role is to automatically generate, execute and benchmark different variant calling pipeline settings. Our tool allows an analysis to be started from any level of the process and with the possibility of plugging in almost any tool or code. To prevent an over-fitting of pipeline parameters, ToTem ensures the reproducibility of these by using cross validation techniques that penalize the final precision, recall and F-measure. The results are interpreted as interactive graphs and tables allowing an optimal pipeline to be selected, based on the user's priorities. Using ToTem, we were able to optimize somatic variant calling from ultra-deep targeted gene sequencing (TGS) data and germline variant detection in whole genome sequencing (WGS) data. Conclusions: ToTem is a tool for automated pipeline optimization which is freely available as a web application at https://totem.software.
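The benchmarking metrics named in the description (precision, recall and F-measure) can be illustrated with a minimal Python sketch. This is not ToTem's code (the description states that ToTem is written in Java and PHP); the function name `benchmark`, the `Metrics` type, the example variants, and the choice to match variants exactly on (chromosome, position, ref, alt) are all assumptions made for illustration.

```python
from typing import NamedTuple, Set, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


class Metrics(NamedTuple):
    precision: float
    recall: float
    f_measure: float


def benchmark(called: Set[Variant], truth: Set[Variant]) -> Metrics:
    """Compare a pipeline's calls against a ground-truth set (exact-match assumption)."""
    tp = len(called & truth)   # true positives: called and present in the truth set
    fp = len(called - truth)   # false positives: called but absent from the truth set
    fn = len(truth - called)   # false negatives: in the truth set but not called
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F-measure here is the harmonic mean of precision and recall (F1).
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return Metrics(precision, recall, f_measure)


# Made-up example data: 6 true variants, 4 calls, 3 of which are correct.
truth = {
    ("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A"),
    ("chr2", 75, "T", "C"), ("chr3", 10, "A", "T"), ("chr3", 20, "G", "C"),
}
called = {
    ("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A"),
    ("chr4", 5, "C", "A"),
}
m = benchmark(called, truth)
print(m)
```

Ranking candidate pipeline settings by such metrics on held-out data, rather than on the data used for tuning, is the general idea behind the cross-validation step the description mentions; how ToTem applies its penalty is not specified here.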
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10609 - Biochemical research methods
Result linkages
Project
The result was created during the realization of multiple projects. More information is available in the Projects tab.
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of implementation
2018
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Periodical name
BMC Bioinformatics
ISSN
1471-2105
e-ISSN
—
Periodical volume
19
Issue of the periodical within the volume
JUN 2018
Country of the periodical's publisher
GB - United Kingdom of Great Britain and Northern Ireland
Number of pages of the result
9
Pages from-to
243
UT WoS code of the article
000436517200004
EID of the result in the Scopus database
2-s2.0-85049074706