Pronunciation Ambiguities in Japanese Kanji
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3AEZFI84TF" target="_blank" >RIV/00216208:11320/23:EZFI84TF - isvavai.cz</a>
Result on the web
<a href="https://aclanthology.org/2023.cawl-1.7/" target="_blank" >https://aclanthology.org/2023.cawl-1.7/</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.18653/v1/2023.cawl-1.7" target="_blank" >10.18653/v1/2023.cawl-1.7</a>
Alternative languages
Result language
English
Title in original language
Pronunciation Ambiguities in Japanese Kanji
Result description in original language
"Japanese writing is a complex system, and a large part of the complexity resides in the use of kanji. A single kanji character in modern Japanese may have multiple pronunciations, either as native vocabulary or as words borrowed from Chinese. This causes a problem for text-to-speech synthesis (TTS) because the system has to predict which pronunciation of each kanji character is appropriate in the context. The problem is called homograph disambiguation. To solve the problem, this research provides a new annotated Japanese single kanji character pronunciation data set and describes an experiment using the logistic regression (LR) classifier. A baseline is computed to compare with the LR classifier accuracy. This experiment provides the first experimental research in Japanese single kanji homograph disambiguation. The annotated Japanese data is freely released to the public to support further work."
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
—
Linkages
—
Other
Publication year
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article name in the proceedings
"Proceedings of the Workshop on Computation and Written Language"
ISBN
978-1-959429-90-6
ISSN
—
e-ISSN
—
Number of pages
11
Pages from-to
50-60
Publisher name
ACL
Place of publication
Aarhus, Denmark
Event location
Aarhus, Denmark
Event date
1 January 2023
Type of event by nationality
WRD - Worldwide event
UT WoS code of the article
—