A bioinformatic platform to integrate target capture and whole genome sequences of various read depths for phylogenomics

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F60076658%3A12310%2F21%3A43904322" target="_blank" >RIV/60076658:12310/21:43904322 - isvavai.cz</a>
Alternative codes found
RIV/60077344:_____/21:00547831
Result on the web
<a href="https://onlinelibrary.wiley.com/doi/10.1111/mec.16240" target="_blank" >https://onlinelibrary.wiley.com/doi/10.1111/mec.16240</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1111/mec.16240" target="_blank" >10.1111/mec.16240</a>

Alternative languages

Result language
English
Title in original language
A bioinformatic platform to integrate target capture and whole genome sequences of various read depths for phylogenomics
Result description in original language
The increasing availability of short-read whole genome sequencing (WGS) provides unprecedented opportunities to study ecological and evolutionary processes. Although loci of interest can be extracted from WGS data and combined with target sequence data, this requires suitable bioinformatic workflows. Here, we test different assembly and locus extraction strategies and implement them in secapr, a pipeline that processes short-read data into multilocus alignments for phylogenetics and molecular ecology analyses. We integrate the processing of data from low-coverage WGS (<30x) and target sequence capture into a flexible framework, while optimizing de novo contig assembly and locus extraction. Specifically, we test different assembly strategies by contrasting their ability to recover loci from targeted butterfly protein-coding genes, using four data sets: a WGS data set at three average coverages (10x, 5x and 2x) and a data set for which these loci were enriched prior to sequencing via target sequence capture. Using the resulting de novo contigs, we account for potential errors within contigs and infer phylogenetic trees to evaluate the ability of each assembly strategy to recover species relationships. We demonstrate that using multiple k-mer sizes simultaneously for assembly results in the highest yield of extracted loci from de novo assembled contigs, while data sets derived from sequencing read depths as low as 5x recover the expected species relationships in phylogenetic trees. By making the tested assembly approaches available in the secapr pipeline, we hope to inspire future studies to incorporate complementary data and make an informed choice on the optimal assembly strategy.
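The average coverages named in the description (10x, 5x and 2x) can be related to read counts with the standard definition of mean depth: total sequenced bases divided by genome size. The sketch below is illustrative only. The record does not say how the lower-coverage data sets were produced, and the genome size, read length, read count and function names are hypothetical values chosen for the example, not figures from the study or part of secapr.

```python
def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth = total sequenced bases / genome size."""
    return n_reads * read_length / genome_size


def subsample_fraction(current_coverage: float, target_coverage: float) -> float:
    """Fraction of reads to keep to reduce current coverage to the target."""
    if target_coverage > current_coverage:
        raise ValueError("target coverage exceeds current coverage")
    return target_coverage / current_coverage


# Hypothetical numbers for illustration only (not taken from the study).
genome_size = 400_000_000   # assumed 400 Mb genome
read_length = 150           # assumed read length in bases
n_reads = 40_000_000        # assumed number of reads

cov = mean_coverage(n_reads, read_length, genome_size)

# Fraction of reads to retain for each coverage level named in the description.
fractions = {target: subsample_fraction(cov, target) for target in (10, 5, 2)}

for target, frac in fractions.items():
    print(f"{target}x: keep {frac:.1%} of reads")
```

Under these assumed inputs the starting depth is 15x, so a 5x data set would correspond to keeping roughly one third of the reads.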

Classification

Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10618 - Ecology

Result linkages

Project
<a href="/cs/project/GJ20-18566Y" target="_blank" >GJ20-18566Y: The importance of interspecific interactions in the diversification of Neotropical butterflies at macroevolutionary and microevolutionary scales</a>
Linkages
I - Institutional support for the long-term conceptual development of a research organisation

Other

Year of implementation
2021
Data confidentiality code
S - Complete and truthful data on the project are not subject to protection under special legal regulations

Data specific to the result type

Periodical name
Molecular Ecology
ISSN
0962-1083
e-ISSN
—
Periodical volume
30
Issue number within the volume
23
Publisher country
US - United States of America
Number of pages
15
Pages from-to
6021-6035
UT WoS article code
000712974600001
EID of the result in the Scopus database
2-s2.0-85118255763

Similar results (10)

Plant virome analysis in search for new viruses: experience from CRI Prague, Czech republic
The contribution of mitochondrial metagenomics to large-scale data mining and phylogenetic analysis of Coleoptera
A guide to carrying out a phylogenomic target sequence capture project
