Character-level and syntax-level models for low-resource and multilingual natural language processing

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3AFFMM4XDT" target="_blank" >RIV/00216208:11320/23:FFMM4XDT - isvavai.cz</a>
Result on the web
<a href="https://edoc.ub.uni-muenchen.de/32094/" target="_blank" >https://edoc.ub.uni-muenchen.de/32094/</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.5282/edoc.32094" target="_blank" >10.5282/edoc.32094</a>

Alternative languages

Result language
English
Title in the original language
Character-level and syntax-level models for low-resource and multilingual natural language processing
Result description in the original language
"This thesis investigates cross-lingual links for improving the processing of low-resource languages with language-agnostic models at the character and syntax level. Specifically, we propose to (i) use orthographic similarities and transliteration between Named Entities and rare words in different languages to improve the construction of Bilingual Word Embeddings (BWEs) and named entity resources, and (ii) exploit multiparallel corpora for projecting labels from high- to low-resource languages, thereby gaining access to weakly supervised processing methods for the latter."
Title in English
Character-level and syntax-level models for low-resource and multilingual natural language processing
Result description in English
"This thesis investigates cross-lingual links for improving the processing of low-resource languages with language-agnostic models at the character and syntax level. Specifically, we propose to (i) use orthographic similarities and transliteration between Named Entities and rare words in different languages to improve the construction of Bilingual Word Embeddings (BWEs) and named entity resources, and (ii) exploit multiparallel corpora for projecting labels from high- to low-resource languages, thereby gaining access to weakly supervised processing methods for the latter."

Classification

Type
O - Other results
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result linkages

Project
—
Linkages
—

Other

Year of implementation
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Similar results (10)

Quality of Word Vectors and Its Impact on Named Entity Recognition in Czech
Named Entity Recognition in the Romanian Legal Domain
Automatický přepis zvukových archívů pro účely vyhledávání informací (Automatic transcription of audio archives for information retrieval)