On the art of taming and exploiting parallel tags in a multilingual corpus
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F12%3A10132260" target="_blank" >RIV/00216208:11210/12:10132260 - isvavai.cz</a>
Result on the web
<a href="http://utkl.ff.cuni.cz/~rosen/public/2010_unitags_slavicorp.pdf" target="_blank" >http://utkl.ff.cuni.cz/~rosen/public/2010_unitags_slavicorp.pdf</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
On the art of taming and exploiting parallel tags in a multilingual corpus
Original language description
Multilingual parallel corpora can be annotated with monolingual tools, such as morphosyntactic taggers. However, even taggers for typologically similar languages often use incompatible tagsets, which results in a conceptual and formal variety of tags within a single corpus. Retraining taggers on data annotated with a common tagset is not a realistic option. Differences between tagsets are often rooted in different linguistic perspectives rather than in real distinctions between the languages, which means there is a good chance of finding common ground. Moreover, a different perspective may provide additional information missing in one tagset but present in another. Our first goal is to delegate the task of dealing with multiple tagsets to an abstract interlingual representation of linguistic categories. Ideally, each tag in every language-specific tagset used in the corpus is linked to a position in a tangled hierarchy of concepts. To accommodate the different perspectives, the hierarchy takes thre
Czech name
—
Czech description
—
Classification
Type
J<sub>x</sub> - Unclassified - Peer-reviewed scientific article (Jimp, Jsc and Jost)
CEP classification
AI - Linguistics
OECD FORD branch
—
Result continuities
Project
—
Continuities
Z - Research plan (with a link to CEZ)
Others
Publication year
2012
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Prace Filologiczne
ISSN
0138-0567
e-ISSN
—
Volume of the periodical
63
Issue of the periodical within the volume
—
Country of publishing house
PL - POLAND
Number of pages
16
Pages from-to
241-256
UT code for WoS article
—
EID of the result in the Scopus database
—