On Disambiguation in Czech Corpora
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F00%3A00002818" target="_blank" >RIV/00216224:14330/00:00002818 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
On Disambiguation in Czech Corpora
Original language description
Lemma disambiguation means finding the basic word form, typically the nominative singular for nouns or the infinitive for verbs. We developed a multistrategy method for lemma disambiguation of unannotated text. The method is based on a combination of inductive logic programming and instance-based learning. We present results for the most important subtasks of lemma disambiguation for the Czech language. Although no expert knowledge of Czech grammar was used, the accuracy reaches 90%, with a fraction of words remaining ambiguous. We also present first results of tag disambiguation.
Czech name
—
Czech description
—
Classification
Type
V<sub>x</sub> - Unclassified - Research report containing classified information
CEP classification
JC - Computer hardware and software
OECD FORD branch
—
Result continuities
Project
<a href="/en/project/VS97028" target="_blank" >VS97028: Natural Language Processing Laboratory (with applications supporting education of people with limited sight)</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2000
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Number of pages
12
Place of publication
Brno (CZE)
Publisher/client name
FI MU
Version
—