Using an SVM Ensemble System for Improved Tamil Dependency Parsing
The result's identifiers
Result code in IS VaVaI
RIV/00216208:11320/12:10130049 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F12%3A10130049)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Using an SVM Ensemble System for Improved Tamil Dependency Parsing
Original language description
Dependency parsing has been shown to improve NLP systems in certain languages and in many cases helps achieve state-of-the-art results in NLP applications, in particular applications for free word order languages. Morphologically rich languages are often short on training data or require much larger amounts of training data because of the increased size of their lexicon. This paper examines a new approach for addressing morphologically rich languages that have little training data to start with. Using Tamil as our test language, we create 9 dependency parse models with a limited amount of training data. Using these models, we train an SVM classifier that uses only the model agreements as features. We apply this SVM classifier on an edge-by-edge basis to form an ensemble parse tree. Using only model agreements as features allows this method to remain language independent and applicable to a wide range of morphologically rich languages. We show a statistically significant 5.44% improvement over the averag
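The edge-by-edge ensemble idea in the description can be illustrated with a small sketch. This is not the authors' implementation: the feature encoding (one binary feature per model, set when that model proposed the candidate head), the tiny hinge-loss trainer standing in for an SVM library, the use of 3 toy models instead of 9, and all function names are assumptions made for illustration only.

```python
import random

def agreement_features(candidate_head, model_heads):
    """One binary feature per model: 1.0 if that model proposed this head.
    (Assumed encoding; the record does not specify the exact features.)"""
    return [1.0 if h == candidate_head else 0.0 for h in model_heads]

def score(w, b, x):
    """Linear decision value w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(samples, labels, epochs=200, lam=0.01, lr=0.1, seed=0):
    """Hinge-loss subgradient descent with L2 shrinkage.
    A stand-in for a real SVM library, kept dependency-free."""
    rng = random.Random(seed)
    w, b = [0.0] * len(samples[0]), 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            margin = y * score(w, b, x)
            w = [wj * (1.0 - lr * lam) for wj in w]  # regularization step
            if margin < 1.0:                         # hinge loss is active
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b

def build_training_set(parses_per_sentence, gold_per_sentence):
    """Turn every (token, candidate head) pair into a labeled sample:
    +1 if the candidate matches the gold head, -1 otherwise."""
    samples, labels = [], []
    for parses, gold in zip(parses_per_sentence, gold_per_sentence):
        for tok in range(len(gold)):
            model_heads = [p[tok] for p in parses]
            for cand in sorted(set(model_heads)):
                samples.append(agreement_features(cand, model_heads))
                labels.append(1 if cand == gold[tok] else -1)
    return samples, labels

def ensemble_parse(parses, w, b):
    """Pick, for each token, the proposed head with the highest score.
    Note: independent per-edge choices do not guarantee a well-formed tree."""
    heads = []
    for tok in range(len(parses[0])):
        model_heads = [p[tok] for p in parses]
        candidates = sorted(set(model_heads))
        heads.append(max(
            candidates,
            key=lambda c: score(w, b, agreement_features(c, model_heads)),
        ))
    return heads

# Toy data: heads[i] is the head of token i+1, 0 means root.
# Model 0 is right where models 1 and 2 agree on a wrong head.
train_parses = [[[2, 0, 2], [3, 0, 2], [3, 0, 2]]]
train_gold = [[2, 0, 2]]
X, y = build_training_set(train_parses, train_gold)
w, b = train_linear_svm(X, y)
combined = ensemble_parse([[2, 0, 2], [2, 0, 1], [2, 0, 1]], w, b)
```

In this toy setup the classifier can learn to trust one model over a majority of the others, which plain majority voting cannot do. Because the features carry only agreement information and no lexical or morphological cues, the same sketch would apply unchanged to another language, which matches the language-independence argument in the description.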
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
—
Continuities
R - Project of the EC Framework Programme
Others
Publication year
2012
Confidentiality
S - Complete and truthful project data are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
ACL 2012 Joint Workshop on Statistical Parsing and Semantic Processing of Morphologically Rich Languages
ISBN
978-1-937284-30-5
ISSN
—
e-ISSN
—
Number of pages
6
Pages from-to
72-77
Publisher name
Association for Computational Linguistics
Place of publication
Jeju, Korea
Event location
Jeju, Korea
Event date
Jul 12, 2012
Type of event by nationality
CST - National event
UT code for WoS article
—