Assessing BERT’s ability to learn Italian syntax: a study on null-subject and agreement phenomena

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3AXH2VDDW6" target="_blank" >RIV/00216208:11320/23:XH2VDDW6 - isvavai.cz</a>
Result on the web
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85105463216&doi=10.1007%2fs12652-021-03297-4&partnerID=40&md5=7c005e809e4bf3ce2c40bc3a356f9e0a" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85105463216&doi=10.1007%2fs12652-021-03297-4&partnerID=40&md5=7c005e809e4bf3ce2c40bc3a356f9e0a</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s12652-021-03297-4" target="_blank" >10.1007/s12652-021-03297-4</a>

Alternative languages

Result language
English
Title in original language
Assessing BERT’s ability to learn Italian syntax: a study on null-subject and agreement phenomena
Result description in original language
"The work presented in this paper investigates the ability of BERT neural language model pretrained in Italian to embed syntactic dependency relationships into its layers, by approximating a Dependency Parse Tree. To this end, a structural probe, namely a supervised model able to extract linguistic structures from a language model, has been trained leveraging the contextual embeddings from the layers of BERT. An experimental assessment has been performed using an Italian version of BERT-base model and a set of datasets for Italian labelled with Universal Dependencies formalism. The results, achieved using standard metrics of dependency parsers, have shown that a knowledge of the Italian syntax is embedded in central-upper layers of the BERT model, according to what observed in literature for the English case. In addition, the probe has been also used to experimentally evaluate the BERT model behaviour in case of two specific syntactic phenomena in Italian, namely null-subject and subject-verb-agreement, showing better performance than an Italian state-of-the-art parser. These findings can open a path for the development of new hybrid approaches, exploiting the probe to integrate or improve limits or weaknesses in analysing articulated constructions of Italian syntax, traditionally complex to be parsed. © 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature."
Title in English
Assessing BERT’s ability to learn Italian syntax: a study on null-subject and agreement phenomena
Result description in English
"The work presented in this paper investigates the ability of BERT neural language model pretrained in Italian to embed syntactic dependency relationships into its layers, by approximating a Dependency Parse Tree. To this end, a structural probe, namely a supervised model able to extract linguistic structures from a language model, has been trained leveraging the contextual embeddings from the layers of BERT. An experimental assessment has been performed using an Italian version of BERT-base model and a set of datasets for Italian labelled with Universal Dependencies formalism. The results, achieved using standard metrics of dependency parsers, have shown that a knowledge of the Italian syntax is embedded in central-upper layers of the BERT model, according to what observed in literature for the English case. In addition, the probe has been also used to experimentally evaluate the BERT model behaviour in case of two specific syntactic phenomena in Italian, namely null-subject and subject-verb-agreement, showing better performance than an Italian state-of-the-art parser. These findings can open a path for the development of new hybrid approaches, exploiting the probe to integrate or improve limits or weaknesses in analysing articulated constructions of Italian syntax, traditionally complex to be parsed. © 2021, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature."

Classification

Type
J<sub>SC</sub> - Article in a periodical indexed in the SCOPUS database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

Project
—
Continuities
—

Others

Publication year
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Name of the periodical
"Journal of Ambient Intelligence and Humanized Computing"
ISSN
1868-5137
e-ISSN
—
Volume of the periodical
14
Issue of the periodical within the volume
1
Country of the periodical's publisher
US - United States of America
Number of pages
15
Pages from-to
289-303
UT WoS code of the article
—
EID of the result in the Scopus database
2-s2.0-85105463216

Similar results (10)

Probing Cross-lingual Transfer of XLM Multi-language Model
Integrating graph embedding and neural models for improving transition-based dependency parsing
On the evolution of syntactic information encoded by BERT's contextualized representations
