Testing the Effectiveness of the Diagnostic Probing Paradigm on Italian Treebanks

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3ARWFTI84I" target="_blank" >RIV/00216208:11320/23:RWFTI84I - isvavai.cz</a>
Result on the web
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85151097172&doi=10.3390%2finfo14030144&partnerID=40&md5=838b086eb0b3bfe825fb38678010927b" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85151097172&doi=10.3390%2finfo14030144&partnerID=40&md5=838b086eb0b3bfe825fb38678010927b</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.3390/info14030144" target="_blank" >10.3390/info14030144</a>

Alternative languages

Result language
English
Title in original language
Testing the Effectiveness of the Diagnostic Probing Paradigm on Italian Treebanks
Result description in original language
"The outstanding performance recently reached by neural language models (NLMs) across many natural language processing (NLP) tasks has steered the debate towards understanding whether NLMs implicitly learn linguistic competence. Probes, i.e., supervised models trained using NLM representations to predict linguistic properties, are frequently adopted to investigate this issue. However, it is still questioned if probing classification tasks really enable such investigation or if they simply hint at surface patterns in the data. This work contributes to this debate by presenting an approach to assessing the effectiveness of a suite of probing tasks aimed at testing the linguistic knowledge implicitly encoded by one of the most prominent NLMs, BERT. To this aim, we compared the performance of probes when predicting gold and automatically altered values of a set of linguistic features. Our experiments were performed on Italian and were evaluated across BERT’s layers and for sentences with different lengths. As a general result, we observed higher performance in the prediction of gold values, thus suggesting that the probing model is sensitive to the distortion of feature values. However, our experiments also showed that the length of a sentence is a highly influential factor that is able to confound the probing model’s predictions. © 2023 by the authors."

Classification

Type
J<sub>SC</sub> - Article in a periodical indexed in the SCOPUS database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

Project
—
Continuities
—

Others

Year of implementation
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Name of the periodical
"Information (Switzerland)"
ISSN
2078-2489
e-ISSN
—
Volume of the periodical
14
Issue of the periodical within the volume
3
Country of the periodical's publisher
US - United States of America
Number of pages of the result
144
Pages from-to
1-144
UT WoS code of the article
—
EID of the result in the Scopus database
2-s2.0-85151097172
2-s2.0-85151097172

Similar results (10)

Is Multilingual BERT Fluent in Language Generation?
Naturalistic Causal Probing for Morpho-Syntax
Enhancing deep neural networks with morphological information
