Transfer Learning Allows Accurate RBP Target Site Prediction with Limited Sample Sizes

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14310%2F23%3A00131807" target="_blank" >RIV/00216224:14310/23:00131807 - isvavai.cz</a>
Result on the web
<a href="https://www.mdpi.com/2079-7737/12/10/1276" target="_blank" >https://www.mdpi.com/2079-7737/12/10/1276</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.3390/biology12101276" target="_blank" >10.3390/biology12101276</a>

Alternative languages

Result language
English
Title in original language
Transfer Learning Allows Accurate RBP Target Site Prediction with Limited Sample Sizes
Result description in original language
RNA-binding proteins are vital regulators in numerous biological processes. Their dysfunction can result in diverse diseases, such as cancer or neurodegenerative disorders, making the prediction of their binding sites of high importance. Deep learning (DL) has brought about a revolution in various biological domains, including the field of protein–RNA interactions. Nonetheless, several challenges persist, such as the limited availability of experimentally validated binding sites to train well-performing DL models for the majority of proteins. Here, we present a novel training approach based on transfer learning (TL) to address the issue of limited data. Employing a sophisticated and interpretable architecture, we compare the performance of our method trained using two distinct approaches: training from scratch (SCR) and utilizing TL. Additionally, we benchmark our results against the current state-of-the-art methods. Furthermore, we tackle the challenges associated with selecting appropriate input features and determining optimal interval sizes. Our results show that TL enhances model performance, particularly in datasets with minimal training data, where satisfactory results can be achieved with just a few hundred RNA binding sites. Moreover, we demonstrate that integrating both sequence and evolutionary conservation information leads to superior performance. Additionally, we showcase how incorporating an attention layer into the model facilitates the interpretation of predictions within a biologically relevant context.

Classification

Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

Project
<a href="/cs/project/EF18_053%2F0016952" target="_blank" >EF18_053/0016952: Postdoc2MUNI</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)<br>I - Institutional support for the long-term conceptual development of a research organization

Others

Year of implementation
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Periodical name
Biology
ISSN
2079-7737
e-ISSN
2079-7737
Periodical volume
12
Issue of the periodical within the volume
10
Country of the periodical's publisher
CH - Swiss Confederation
Number of pages
19
Pages from-to
1-19
UT WoS code of the article
001090019100001
EID of the result in the Scopus database
—

Similar results (10)

Cryptic binding site prediction with protein language models
When will RNA get its AlphaFold moment?
Exploration of protein sequence embeddings for protein-ligand binding site detection
