Improving Dependency Parsing by Filtering Linguistic Noise

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F13%3A10188878" target="_blank" >RIV/00216208:11210/13:10188878 - isvavai.cz</a>
Result on the web
<a href="http://link.springer.com/chapter/10.1007%2F978-3-642-40585-3_37" target="_blank" >http://link.springer.com/chapter/10.1007%2F978-3-642-40585-3_37</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-642-40585-3_37" target="_blank" >10.1007/978-3-642-40585-3_37</a>

Alternative languages

Result language
English
Title in original language
Improving Dependency Parsing by Filtering Linguistic Noise
Result description in original language
In this paper, we describe a way to improve stochastic dependency parsing by simplifying both the training data and new text to be parsed. Many parsing errors are due to the limited size of the training data, where most of the words of a given language occur seldom or not at all, so the parser cannot learn their syntactic properties. By defining narrow classes of words with identical syntactic properties and replacing members of these classes by one representative, we facilitate the language modeling done by the parser and improve its accuracy. In our experiment, a 17.8% decrease in form variability in the training data of the Czech dependency treebank PDT led to an 8.1% relative error reduction.
Title in English
Improving Dependency Parsing by Filtering Linguistic Noise
Result description in English
In this paper, we describe a way to improve stochastic dependency parsing by simplifying both the training data and new text to be parsed. Many parsing errors are due to the limited size of the training data, where most of the words of a given language occur seldom or not at all, so the parser cannot learn their syntactic properties. By defining narrow classes of words with identical syntactic properties and replacing members of these classes by one representative, we facilitate the language modeling done by the parser and improve its accuracy. In our experiment, a 17.8% decrease in form variability in the training data of the Czech dependency treebank PDT led to an 8.1% relative error reduction.
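
The replacement step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the word classes below (weekday names, city names), the function names, and the sample sentence are all hypothetical and chosen only for illustration. The record does not say how the classes were defined for PDT. The sketch maps every member of a class to one representative and then measures the drop in the number of distinct forms and the relative error reduction.

```python
from typing import Dict, List

# Hypothetical classes (an assumption, not taken from the paper):
# representative -> members assumed to share its syntactic behaviour.
WORD_CLASSES: Dict[str, List[str]] = {
    "Monday": ["Tuesday", "Wednesday", "Thursday", "Friday"],
    "Prague": ["Brno", "Ostrava", "Pilsen"],
}


def build_replacement_table(classes: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert the class definition into a member -> representative lookup."""
    table: Dict[str, str] = {}
    for representative, members in classes.items():
        for member in members:
            table[member] = representative
    return table


def simplify(tokens: List[str], table: Dict[str, str]) -> List[str]:
    """Replace each class member by its representative; keep other tokens."""
    return [table.get(token, token) for token in tokens]


def form_variability_decrease(original: List[str], simplified: List[str]) -> float:
    """Relative drop in the number of distinct word forms."""
    before = len(set(original))
    after = len(set(simplified))
    return (before - after) / before


def relative_error_reduction(baseline_accuracy: float, new_accuracy: float) -> float:
    """Share of the baseline error removed by the new system."""
    baseline_error = 1.0 - baseline_accuracy
    new_error = 1.0 - new_accuracy
    return (baseline_error - new_error) / baseline_error


table = build_replacement_table(WORD_CLASSES)
sentence = ["On", "Tuesday", "we", "fly", "to", "Brno",
            "and", "on", "Friday", "to", "Prague"]
simplified_sentence = simplify(sentence, table)
```

The same lookup table would have to be applied to both the training data and the text to be parsed, since the abstract states that both are simplified. The original forms can be kept alongside the simplified ones so they can be restored in the parser output.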

Classification

Type
D - Article in proceedings
CEP field
AI - Linguistics
OECD FORD field
—

Result continuities

Project
<a href="/cs/project/GA13-27184S" target="_blank" >GA13-27184S: Treebank češtiny na základě gramatiky</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)

Others

Year of implementation
2013
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Title of the article in the proceedings
Text, Speech, and Dialogue
ISBN
978-3-642-40584-6
ISSN
0302-9743
e-ISSN
—
Number of pages
7
Pages from-to
288-294
Publisher name
Springer
Place of publication
Berlin
Event location
Plzeň
Event date
1 September 2013
Event type by nationality
WRD - Worldwide event
UT WoS article code
—

Similar results (10)

Weakly Supervised Headline Dependency Parsing
Indonesian Dependency Treebank: Annotation and Parsing
Using Parallel Features in Parsing of Machine-Translated Sentences for Correction of Grammatical Errors