Multiple Conformer Descriptors for QSAR Modeling
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989592%3A15110%2F21%3A73610368" target="_blank" >RIV/61989592:15110/21:73610368 - isvavai.cz</a>
Result on the web
<a href="https://onlinelibrary.wiley.com/doi/10.1002/minf.202060030" target="_blank" >https://onlinelibrary.wiley.com/doi/10.1002/minf.202060030</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1002/minf.202060030" target="_blank" >10.1002/minf.202060030</a>
Alternative languages
Result language
English
Title in original language
Multiple Conformer Descriptors for QSAR Modeling
Result description in original language
The most widely used QSAR approaches are based mainly on 2D molecular representations, which ignore the stereoconfiguration and conformational flexibility of compounds. 3D QSAR uses a single conformer of each compound, and choosing that conformer reasonably is difficult. 4D QSAR uses multiple conformers to overcome the issues of 2D and 3D methods. However, many existing 4D QSAR models require conformers to be pre-aligned, while alignment-independent approaches often ignore the stereoconfiguration of compounds. In this study we propose a QSAR modeling approach based on transforming chirality-aware 3D pharmacophore descriptors of individual conformers into a set of latent variables representing the whole conformer set of a molecule. This is achieved by clustering together all conformers of all training set compounds. The final representation of a compound is a bit string encoding the cluster membership of its conformers. In our study we used Random Forest, but this representation can be combined with any machine learning method. We compared this approach with conventional 2D and 3D approaches on multiple data sets and investigated the sensitivity of the proposed approach to its tuning parameters: the number of conformers and the number of clusters.
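The encoding step described above can be illustrated with a minimal sketch. It is not the authors' implementation: plain k-means stands in for the clustering method (which the description does not name), the two-dimensional vectors are hypothetical placeholders for the chirality-aware 3D pharmacophore descriptors, and the names `kmeans`, `nearest`, `encode` and `training` are invented for this example.

```python
import random
from math import dist


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in); returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster went empty
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def encode(conformers, centroids):
    """Bit string: bit i is 1 if any conformer of the molecule falls in cluster i."""
    bits = [0] * len(centroids)
    for conformer in conformers:
        bits[nearest(conformer, centroids)] = 1
    return bits


# Hypothetical toy data: molecule name -> descriptor vectors of its conformers.
training = {
    "mol_a": [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1)],
    "mol_b": [(5.2, 4.9), (4.8, 5.0)],
    "mol_c": [(0.1, 0.0), (10.0, 0.2)],
}

# Cluster the conformers of all training compounds together, then encode each
# molecule by the clusters its conformers occupy.
pooled = [c for confs in training.values() for c in confs]
centroids = kmeans(pooled, k=3)
fingerprints = {name: encode(confs, centroids) for name, confs in training.items()}
```

The resulting bit strings have one position per cluster, so they could be passed as fixed-length feature vectors to Random Forest or any other learner. A new compound would be encoded with the centroids learned from the training set, for example `encode([(0.1, 0.1)], centroids)`.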
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10608 - Biochemistry and molecular biology
Result continuities
Project
<a href="/cs/project/LTARF18013" target="_blank" >LTARF18013: Improving the success rate of primary screening of biologically active compounds using computational models</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of implementation
2021
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Periodical name
Molecular Informatics
ISSN
1868-1743
e-ISSN
—
Periodical volume
40
Issue number within the volume
11
Publisher country
DE - Federal Republic of Germany
Number of pages
11
Pages from-to
"unpaginated"
UT WoS code of the article
000680535600001
EID of the result in the Scopus database
2-s2.0-85112051056