Increasing the quality and quantity of source language data for unsupervised cross-lingual POS tagging.

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F13%3A10194631" target="_blank" >RIV/00216208:11320/13:10194631 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—

Alternative languages

Result language
English
Title in original language
Increasing the quality and quantity of source language data for unsupervised cross-lingual POS tagging.
Result description in original language
Bilingual corpora offer a promising bridge between resource-rich and resource-poor languages, enabling the development of natural language processing systems for the latter. English is often selected as the resource-rich language, but another choice might give better performance. In this paper, we consider the task of unsupervised cross-lingual POS tagging, and construct a model that predicts the best source language for a given target language. In experiments on 9 languages, this model improves on using a single fixed source language. We then show that further improvements can be made by combining information from multiple source languages.
Title in English
Increasing the quality and quantity of source language data for unsupervised cross-lingual POS tagging.
Result description in English
Bilingual corpora offer a promising bridge between resource-rich and resource-poor languages, enabling the development of natural language processing systems for the latter. English is often selected as the resource-rich language, but another choice might give better performance. In this paper, we consider the task of unsupervised cross-lingual POS tagging, and construct a model that predicts the best source language for a given target language. In experiments on 9 languages, this model improves on using a single fixed source language. We then show that further improvements can be made by combining information from multiple source languages.

Classification

Type
D - Article in proceedings
CEP field
IN - Informatics
OECD FORD field
—

Result linkages

Project
<a href="/cs/project/GBP103%2F12%2FG084" target="_blank" >GBP103/12/G084: Centrum pro multi-modální interpretaci dat velkého rozsahu</a>
Linkages
P - Research and development project financed from public funds (with a link to CEP)

Other

Year of implementation
2013
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Article name in the proceedings
Proceedings of the 6th International Joint Conference on Natural Language Processing
ISBN
978-4-9907348-0-0
ISSN
—
e-ISSN
—
Number of pages
7
Pages from-to
1243-1249
Publisher name
Asian Federation of Natural Language Processing
Place of publication
Nagoya, Japan
Event location
Nagoya, Japan
Event date
14 October 2013
Event type by nationality
CST - National event
UT WoS article code
—

Similar results (10)

Unsupervised Stem-based Cross-lingual Part-of-Speech Tagging for Morphologically Rich Low-Resource Languages
Text-Inductive Graphone-Based Language Adaptation for Low-Resource Speech Synthesis
Improving BERTScore for Machine Translation Evaluation Through Contrastive Learning
