CYPlebrity: Machine learning models for the prediction of inhibitors of cytochrome P450 enzymes
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F60461373%3A22310%2F21%3A43923453" target="_blank" >RIV/60461373:22310/21:43923453 - isvavai.cz</a>
Result on the web
<a href="https://www.sciencedirect.com/science/article/pii/S0968089621003965" target="_blank" >https://www.sciencedirect.com/science/article/pii/S0968089621003965</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.bmc.2021.116388" target="_blank" >10.1016/j.bmc.2021.116388</a>
Alternative languages
Result language
English
Original language name
CYPlebrity: Machine learning models for the prediction of inhibitors of cytochrome P450 enzymes
Original language description
The vast majority of approved drugs are metabolized by the five major cytochrome P450 (CYP) isozymes, 1A2, 2C9, 2C19, 2D6 and 3A4. Inhibition of CYP isozymes can cause drug-drug interactions with severe pharmacological and toxicological consequences. Computational methods for the fast and reliable prediction of the inhibition of CYP isozymes by small molecules are therefore of high interest and relevance to pharmaceutical companies and a host of other industries, including the cosmetics and agrochemical industries. Today, a large number of machine learning models for predicting the inhibition of the major CYP isozymes by small molecules are available. With this work we aim to go beyond the coverage of existing models, by combining data from several major public and proprietary sources. More specifically, we used up to 18815 compounds with measured bioactivities to train random forest classification models for the individual CYP isozymes. A major advantage of the new data collection over existing ones is the better representation of the minority class, the CYP inhibitors. With the new data collection we achieved inhibitor-to-non-inhibitor ratios in the order of 1:1 (CYP1A2) to 1:3 (CYP2D6). We show that our models reach competitive performance on external data, with Matthews correlation coefficients (MCCs) ranging from 0.62 (CYP2C19) to 0.70 (CYP2D6), and areas under the receiver operating characteristic curve (AUCs) between 0.89 (CYP2C19) and 0.92 (CYPs 2D6 and 3A4). Importantly, the models show a high level of robustness, reflected in a good predictivity also for compounds that are structurally dissimilar to the compounds represented in the training data. The best models presented in this work are freely accessible for academic research via a web service.
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10608 - Biochemistry and molecular biology
Result continuities
Project
<a href="/en/project/LM2018130" target="_blank" >LM2018130: National Infrastructure for Chemical Biology</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2021
Confidentiality
S - The complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
BIOORGANIC & MEDICINAL CHEMISTRY
ISSN
0968-0896
e-ISSN
—
Volume of the periodical
46
Issue of the periodical within the volume
46
Country of publishing house
GB - UNITED KINGDOM
Number of pages
11
Pages from-to
—
UT code for WoS article
000701659400006
EID of the result in the Scopus database
—