Determining Optical Mapping Errors by Simulations
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989592%3A15110%2F21%3A73609650" target="_blank" >RIV/61989592:15110/21:73609650 - isvavai.cz</a>
Nalezeny alternativní kódy
RIV/61989100:27240/21:10248027 RIV/00098892:_____/21:N0000071
Výsledek na webu
<a href="https://academic.oup.com/bioinformatics/article/37/20/3391/6275255" target="_blank" >https://academic.oup.com/bioinformatics/article/37/20/3391/6275255</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1093/bioinformatics/btab259" target="_blank" >10.1093/bioinformatics/btab259</a>
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Determining Optical Mapping Errors by Simulations
Popis výsledku v původním jazyce
Motivation: Optical mapping is a complementary technology to traditional DNA sequencing technologies, such as next-generation sequencing (NGS). It provides genome-wide, high-resolution restriction maps from single, stained molecules of DNA. It can be used to detect large and small structural variants, copy number variations and complex rearrangements. Optical mapping is affected by different kinds of errors in comparison with traditional DNA sequencing technologies. It is important to understand the source of these errors and how they affect the obtained data. This article proposes a novel approach to modeling errors in the data obtained from the Bionano Genomics Inc. Saphyr system with Direct Label and Stain (DLS) chemistry. Some studies have already addressed this issue for older instruments with nicking enzymes, but we are unaware of a study that addresses this new system. Results: The main result is a framework for studying errors in the data obtained from the Saphyr instrument with DLS chemistry. The framework’s main component is a simulation that computes how major sources of errors for this instrument (a false site, a missing site and resolution errors) affect the distribution of fragment lengths in optical maps. The simulation is parametrized by variables describing these errors and we are using a differential evolution algorithm to evaluate parameters that best fit the data from the instrument. Results of the experiments manifest that this approach can be used to study errors in the optical mapping data analysis.
Název v anglickém jazyce
Determining Optical Mapping Errors by Simulations
Popis výsledku anglicky
Motivation: Optical mapping is a complementary technology to traditional DNA sequencing technologies, such as next-generation sequencing (NGS). It provides genome-wide, high-resolution restriction maps from single, stained molecules of DNA. It can be used to detect large and small structural variants, copy number variations and complex rearrangements. Optical mapping is affected by different kinds of errors in comparison with traditional DNA sequencing technologies. It is important to understand the source of these errors and how they affect the obtained data. This article proposes a novel approach to modeling errors in the data obtained from the Bionano Genomics Inc. Saphyr system with Direct Label and Stain (DLS) chemistry. Some studies have already addressed this issue for older instruments with nicking enzymes, but we are unaware of a study that addresses this new system. Results: The main result is a framework for studying errors in the data obtained from the Saphyr instrument with DLS chemistry. The framework’s main component is a simulation that computes how major sources of errors for this instrument (a false site, a missing site and resolution errors) affect the distribution of fragment lengths in optical maps. The simulation is parametrized by variables describing these errors and we are using a differential evolution algorithm to evaluate parameters that best fit the data from the instrument. Results of the experiments manifest that this approach can be used to study errors in the optical mapping data analysis.
Klasifikace
Druh
J<sub>imp</sub> - Článek v periodiku v databázi Web of Science
CEP obor
—
OECD FORD obor
10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)
Návaznosti výsledku
Projekt
—
Návaznosti
I - Institucionalni podpora na dlouhodoby koncepcni rozvoj vyzkumne organizace
Ostatní
Rok uplatnění
2021
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název periodika
BIOINFORMATICS
ISSN
1367-4803
e-ISSN
1460-2059
Svazek periodika
37
Číslo periodika v rámci svazku
20
Stát vydavatele periodika
GB - Spojené království Velké Británie a Severního Irska
Počet stran výsledku
7
Strana od-do
3391-3397
Kód UT WoS článku
000733829400001
EID výsledku v databázi Scopus
2-s2.0-85134965137