Using an SVM Ensemble System for Improved Tamil Dependency Parsing
Result identifiers
Result code in IS VaVaI
RIV/00216208:11320/12:10130049 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F12%3A10130049)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in original language
Using an SVM Ensemble System for Improved Tamil Dependency Parsing
Result description in original language
Dependency parsing has been shown to improve NLP systems in certain languages and in many cases helps achieve state-of-the-art results in NLP applications, in particular applications for free word order languages. Morphologically rich languages are often short on training data or require much larger amounts of training data due to the increased size of their lexicon. This paper examines a new approach for addressing morphologically rich languages with little training data to start. Using Tamil as our test language, we create 9 dependency parse models with a limited amount of training data. Using these models, we train an SVM classifier using only the model agreements as features. We use this SVM classifier on an edge-by-edge decision to form an ensemble parse tree. Using only model agreements as features allows this method to remain language independent and applicable to a wide range of morphologically rich languages. We show a statistically significant 5.44% improvement over the averag
Title in English
Using an SVM Ensemble System for Improved Tamil Dependency Parsing
Result description in English
Dependency parsing has been shown to improve NLP systems in certain languages and in many cases helps achieve state-of-the-art results in NLP applications, in particular applications for free word order languages. Morphologically rich languages are often short on training data or require much larger amounts of training data due to the increased size of their lexicon. This paper examines a new approach for addressing morphologically rich languages with little training data to start. Using Tamil as our test language, we create 9 dependency parse models with a limited amount of training data. Using these models, we train an SVM classifier using only the model agreements as features. We use this SVM classifier on an edge-by-edge decision to form an ensemble parse tree. Using only model agreements as features allows this method to remain language independent and applicable to a wide range of morphologically rich languages. We show a statistically significant 5.44% improvement over the averag
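The edge-by-edge ensemble described above can be illustrated with a minimal sketch. It is not the paper's implementation: the pairwise agreement encoding, the one-classifier-per-parser setup, the Pegasos-style linear SVM trainer, the toy data with three parsers instead of nine, and all function names are assumptions made for illustration only.

```python
import random
from itertools import combinations

def agreement_features(heads):
    """One binary feature per pair of parsers: 1.0 if both propose the same head.
    (Assumed encoding of "model agreements"; the record does not specify it.)"""
    return [1.0 if heads[i] == heads[j] else 0.0
            for i, j in combinations(range(len(heads)), 2)]

def train_linear_svm(X, y, epochs=50, lam=0.01, seed=0):
    """Linear SVM (hinge loss) trained by Pegasos-style subgradient descent.
    Labels y are +1 / -1. The last weight acts as a bias term."""
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            x = X[i] + [1.0]
            margin = y[i] * sum(wi * xi for wi, xi in zip(w, x))
            w = [wi * (1.0 - eta * lam) for wi in w]      # regularization step
            if margin < 1.0:                               # hinge-loss step
                w = [wi + eta * y[i] * xi for wi, xi in zip(w, x)]
    return w

def score(w, feats):
    """Signed distance-like score of a feature vector under weights w."""
    return sum(wi * xi for wi, xi in zip(w, feats + [1.0]))

def ensemble_head(heads, svms):
    """For one token, trust the parser whose SVM scores highest."""
    feats = agreement_features(heads)
    best = max(range(len(heads)), key=lambda m: score(svms[m], feats))
    return heads[best]

def ensemble_parse(sentence_heads, svms):
    """Apply the edge-by-edge decision to every token of a sentence."""
    return [ensemble_head(heads, svms) for heads in sentence_heads]

# Toy training data (hypothetical): heads proposed by 3 parsers, then the gold head.
train = [([2, 2, 5], 2), ([1, 3, 3], 3), ([4, 4, 4], 4),
         ([0, 2, 2], 2), ([3, 3, 1], 3), ([5, 1, 5], 5)]
X = [agreement_features(heads) for heads, _ in train]

# One SVM per parser: does this parser's proposed edge match the gold edge?
svms = []
for m in range(3):
    y = [1 if heads[m] == gold else -1 for heads, gold in train]
    svms.append(train_linear_svm(X, y))

# Combine the three parsers' proposals for a two-token sentence.
print(ensemble_parse([[2, 2, 0], [0, 1, 1]], svms))
```

Because the features record only which parsers agree, nothing in the sketch depends on Tamil word forms or tags. Note that choosing each edge independently does not by itself guarantee a well-formed tree; a real system would need to check or repair the result.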
Classification
Type
D - Article in proceedings
CEP field
IN - Informatics
OECD FORD field
—
Result linkages
Project
—
Linkages
R - Project of the EC Framework Programme
Other
Year of implementation
2012
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the article in the proceedings
ACL 2012 Joint Workshop on Statistical Parsing and Semantic Processing of Morphologically Rich Languages
ISBN
978-1-937284-30-5
ISSN
—
e-ISSN
—
Number of pages
6
Pages from-to
72-77
Publisher name
Association for Computational Linguistics
Place of publication
Jeju, Korea
Event location
Jeju, Korea
Event date
12 July 2012
Type of event by nationality
CST - National event
UT WoS article code
—