Character-level and syntax-level models for low-resource and multilingual natural language processing
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3AFFMM4XDT" target="_blank" >RIV/00216208:11320/23:FFMM4XDT - isvavai.cz</a>
Result on the web
<a href="https://edoc.ub.uni-muenchen.de/32094/" target="_blank" >https://edoc.ub.uni-muenchen.de/32094/</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.5282/edoc.32094" target="_blank" >10.5282/edoc.32094</a>
Alternative languages
Result language
English
Original language name
Character-level and syntax-level models for low-resource and multilingual natural language processing
Original language description
This thesis investigates cross-lingual links for improving the processing of low-resource languages with language-agnostic models at the character and syntax level. Specifically, we propose to (i) use orthographic similarities and transliteration between Named Entities and rare words in different languages to improve the construction of Bilingual Word Embeddings (BWEs) and named entity resources, and (ii) exploit multiparallel corpora for projecting labels from high- to low-resource languages, thereby gaining access to weakly supervised processing methods for the latter.
Czech name
—
Czech description
—
Classification
Type
O - Miscellaneous
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2023
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations