DMoG : A Data-Based Morphological Guesser
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F21%3A00123251" target="_blank" >RIV/00216224:14330/21:00123251 - isvavai.cz</a>
Result on the web
<a href="https://nlp.fi.muni.cz/raslan/raslan21.pdf#page=143" target="_blank" >https://nlp.fi.muni.cz/raslan/raslan21.pdf#page=143</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
DMoG : A Data-Based Morphological Guesser
Original language description
We present a novel corpus-based approach to the lemmatization of unknown words. The tool learns affix patterns from annotated data and uses them to predict other word forms that should be present in the corpus. The lemma candidate is then taken from the pattern whose predicted forms are actually found in the corpus. We present a prototype implementation and an initial evaluation on Czech, which shows promising results.
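The idea in the description can be illustrated with a minimal Python sketch. This is not the DMoG implementation: the function names `learn_patterns` and `guess_lemma`, the restriction to suffixes only, and the scoring by a simple count of attested forms are all assumptions made for illustration, and the toy Czech data is invented for the example.

```python
from collections import defaultdict
from os.path import commonprefix


def learn_patterns(annotated):
    """Hypothetical helper: turn (form, lemma) pairs into suffix patterns.

    Each pattern is (lemma_suffix, set of form suffixes), taken relative to
    the longest prefix shared by a lemma and all of its observed forms.
    """
    forms_by_lemma = defaultdict(set)
    for form, lemma in annotated:
        forms_by_lemma[lemma].add(form)
    patterns = set()
    for lemma, forms in forms_by_lemma.items():
        stem = commonprefix([lemma, *forms])
        lemma_suffix = lemma[len(stem):]
        form_suffixes = frozenset(f[len(stem):] for f in forms)
        patterns.add((lemma_suffix, form_suffixes))
    return patterns


def guess_lemma(word, patterns, corpus_vocab):
    """Hypothetical helper: pick the lemma whose pattern is best attested.

    For every pattern with a suffix matching the word, predict the sibling
    forms and count how many of them occur in the corpus vocabulary.
    Returns None when no predicted form is found at all.
    """
    best_hits, best_lemma = 0, None
    for lemma_suffix, form_suffixes in patterns:
        for suffix in form_suffixes:
            if not word.endswith(suffix) or len(word) == len(suffix):
                continue
            stem = word[:len(word) - len(suffix)]
            predicted = {stem + s for s in form_suffixes} - {word}
            hits = len(predicted & corpus_vocab)
            if hits > best_hits:
                best_hits, best_lemma = hits, stem + lemma_suffix
    return best_lemma


# Toy annotated data (invented for this example).
annotated = [
    ("žena", "žena"), ("ženy", "žena"), ("ženu", "žena"), ("ženě", "žena"),
    ("město", "město"), ("města", "město"), ("městu", "město"), ("městem", "město"),
]
patterns = learn_patterns(annotated)

# Word forms observed in an unannotated corpus (also invented).
corpus_vocab = {"lampu", "lampy", "lampa", "lampě", "okno", "okna", "oknu", "oknem"}

lampu_lemma = guess_lemma("lampu", patterns, corpus_vocab)
oknem_lemma = guess_lemma("oknem", patterns, corpus_vocab)
```

In this toy setup the suffix "-u" of "lampu" matches both learned patterns, but only the pattern learned from "žena" predicts sibling forms that mostly occur in the corpus, so its lemma candidate is preferred. A real system would need to handle prefixes, stem alternations, and ties between patterns, none of which this sketch attempts.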
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10200 - Computer and information sciences
Result continuities
Project
<a href="/en/project/LM2018101" target="_blank" >LM2018101: Digital Research Infrastructure for the Language Technologies, Arts and Humanities</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2021
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Recent Advances in Slavonic Natural Language Processing (RASLAN 2021)
ISBN
9788026316701
ISSN
2336-4289
e-ISSN
—
Number of pages
4
Pages from-to
135-138
Publisher name
Tribun EU
Place of publication
Brno
Event location
Brno
Event date
Jan 1, 2021
Type of event by nationality
EUR - European event
UT code for WoS article
—