Do Not Fire the Linguist: Grammatical Profiles Help Language Models Detect Semantic Change
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F22%3AD9D7L695" target="_blank" >RIV/00216208:11320/22:D9D7L695 - isvavai.cz</a>
Result on the web
<a href="https://aclanthology.org/2022.lchange-1.6" target="_blank" >https://aclanthology.org/2022.lchange-1.6</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.18653/v1/2022.lchange-1.6" target="_blank" >10.18653/v1/2022.lchange-1.6</a>
Alternative languages
Result language
English
Title in the original language
Do Not Fire the Linguist: Grammatical Profiles Help Language Models Detect Semantic Change
Result description in the original language
Morphological and syntactic changes in word usage (as captured, e.g., by grammatical profiles) have been shown to be good predictors of a word's meaning change. In this work, we explore whether large pre-trained contextualised language models, a common tool for lexical semantic change detection, are sensitive to such morphosyntactic changes. To this end, we first compare the performance of grammatical profiles against that of a multilingual neural language model (XLM-R) on 10 datasets, covering 7 languages, and then combine the two approaches in ensembles to assess their complementarity. Our results show that ensembling grammatical profiles with XLM-R improves semantic change detection performance for most datasets and languages. This indicates that language models do not fully cover the fine-grained morphological and syntactic signals that are explicitly represented in grammatical profiles. An interesting exception is the set of test sets where the time spans under analysis are much longer than the time gap between them (for example, century-long spans with a one-year gap between them). Morphosyntactic change is slow, so grammatical profiles do not detect change in such cases. In contrast, language models, thanks to their access to lexical information, are able to detect fast topical changes.
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2022
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the article in the proceedings
Proceedings of the 3rd Workshop on Computational Approaches to Historical Language Change
ISBN
978-1-955917-42-1
ISSN
—
e-ISSN
—
Number of pages
14
Pages from-to
54-67
Publisher name
Association for Computational Linguistics
Place of publication
—
Event location
Dublin, Ireland
Event date
1. 1. 2022
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
—