Cross-lingual dependency transfer with harmonized Indian language treebanks
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F14%3A10289406" target="_blank" >RIV/00216208:11320/14:10289406 - isvavai.cz</a>
Result on the web
<a href="http://tlt13.sfs.uni-tuebingen.de/tlt13-proceedings.pdf" target="_blank" >http://tlt13.sfs.uni-tuebingen.de/tlt13-proceedings.pdf</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Cross-lingual dependency transfer with harmonized Indian language treebanks
Original language description
One of the most important aspects of cross-lingual dependency transfer is how differences in annotation style, which often lead to parsing accuracy being underestimated, are handled. The emerging trend is that the annotation styles of different language treebanks can be harmonized into one style, so that cumbersome manual transformation rules can be avoided. In this paper, we use harmonized treebanks (POS tagsets and dependency structures of the original treebanks mapped to a common style) for inducing dependencies in a cross-lingual setting. We transfer dependencies using delexicalized parsers that use harmonized versions of the original treebanks. We apply this approach to five Indian languages (Hindi, Urdu, Telugu, Bengali and Tamil) and show that the best performance in delexicalized parsing can be obtained when the transfer takes place from Indian language (IL) to IL treebanks.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
<a href="/en/project/LM2010013" target="_blank" >LM2010013: LINDAT-CLARIN: Institute for analysis, processing and distribution of linguistic data</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2014
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of 13th International Workshop on Treebanks and Linguistic Theories (TLT13)
ISBN
978-3-9809183-9-8
ISSN
—
e-ISSN
—
Number of pages
12
Pages from-to
160-171
Publisher name
University of Tübingen
Place of publication
Tübingen, Germany
Event location
Tübingen, Germany
Event date
Dec 12, 2014
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—