Prot2HG: a database of protein domains mapped to the human genome
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11130%2F20%3A10410925" target="_blank" >RIV/00216208:11130/20:10410925 - isvavai.cz</a>
Nalezeny alternativní kódy
RIV/00064203:_____/20:10410925
Výsledek na webu
<a href="https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=SvKLP3gY2a" target="_blank" >https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=SvKLP3gY2a</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1093/database/baz161" target="_blank" >10.1093/database/baz161</a>
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Prot2HG: a database of protein domains mapped to the human genome
Popis výsledku v původním jazyce
Genetic variation occurring within conserved functional protein domains warrants special attention when examining DNA variation in the context of disease causation. Here we introduce a resource, freely available at www.prot2hg.com, that addresses the question of whether a particular variant falls onto an annotated protein domain and directly translates chromosomal coordinates onto protein residues. The tool can perform a multiple-site query in a simple way, and the whole dataset is available for download as well as incorporated into our own accessible pipeline. To create this resource, National Center for Biotechnology Information protein data were retrieved using the Entrez Programming Utilities. After processing all human protein domains, residue positions were reverse translated and mapped to the reference genome hg19 and stored in a MySQL database. In total, 760 487 protein domains from 42 371 protein models were mapped to hg19 coordinates and made publicly available for search or download (www.prot2hg.com). In addition, this annotation was implemented into the genomics research platform GENESIS in order to query nearly 8000 exomes and genomes of families with rare Mendelian disorders (tgp-foundation.org). When applied to patient genetic data, we found that rare (<1%) variants in the Genome Aggregation Database were significantly more annotated onto a protein domain in comparison to common (>1%) variants. Similarly, variants described as pathogenic or likely pathogenic in ClinVar were more likely to be annotated onto a domain. In addition, we tested a dataset consisting of 60 causal variants in a cohort of patients with epileptic encephalopathy and found that 71% of them (43 variants) were propagated onto protein domains. In summary, we developed a resource that annotates variants in the coding part of the genome onto conserved protein domains in order to increase variant prioritization efficiency. Database URL: www.prot2hg.com.
Název v anglickém jazyce
Prot2HG: a database of protein domains mapped to the human genome
Popis výsledku anglicky
Genetic variation occurring within conserved functional protein domains warrants special attention when examining DNA variation in the context of disease causation. Here we introduce a resource, freely available at www.prot2hg.com, that addresses the question of whether a particular variant falls onto an annotated protein domain and directly translates chromosomal coordinates onto protein residues. The tool can perform a multiple-site query in a simple way, and the whole dataset is available for download as well as incorporated into our own accessible pipeline. To create this resource, National Center for Biotechnology Information protein data were retrieved using the Entrez Programming Utilities. After processing all human protein domains, residue positions were reverse translated and mapped to the reference genome hg19 and stored in a MySQL database. In total, 760 487 protein domains from 42 371 protein models were mapped to hg19 coordinates and made publicly available for search or download (www.prot2hg.com). In addition, this annotation was implemented into the genomics research platform GENESIS in order to query nearly 8000 exomes and genomes of families with rare Mendelian disorders (tgp-foundation.org). When applied to patient genetic data, we found that rare (<1%) variants in the Genome Aggregation Database were significantly more annotated onto a protein domain in comparison to common (>1%) variants. Similarly, variants described as pathogenic or likely pathogenic in ClinVar were more likely to be annotated onto a domain. In addition, we tested a dataset consisting of 60 causal variants in a cohort of patients with epileptic encephalopathy and found that 71% of them (43 variants) were propagated onto protein domains. In summary, we developed a resource that annotates variants in the coding part of the genome onto conserved protein domains in order to increase variant prioritization efficiency. Database URL: www.prot2hg.com.
Klasifikace
Druh
J<sub>imp</sub> - Článek v periodiku v databázi Web of Science
CEP obor
—
OECD FORD obor
30103 - Neurosciences (including psychophysiology)
Návaznosti výsledku
Projekt
<a href="/cs/project/NV16-30206A" target="_blank" >NV16-30206A: Celogenomové a RNA masivně paralelní sekvenování jako nástroj pro objasnění příčin vzácných typů dědičných neuropatií.</a><br>
Návaznosti
S - Specificky vyzkum na vysokych skolach<br>I - Institucionalni podpora na dlouhodoby koncepcni rozvoj vyzkumne organizace
Ostatní
Rok uplatnění
2020
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název periodika
Database : the journal of biological databases and curation
ISSN
1758-0463
e-ISSN
—
Svazek periodika
neuveden
Číslo periodika v rámci svazku
April
Stát vydavatele periodika
GB - Spojené království Velké Británie a Severního Irska
Počet stran výsledku
7
Strana od-do
baz161
Kód UT WoS článku
000548893500001
EID výsledku v databázi Scopus
2-s2.0-85083404177