Named Entity Recognition in Vietnamese Tweets

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F15%3A86096567" target="_blank" >RIV/61989100:27240/15:86096567 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1007/978-3-319-21786-4_18" target="_blank" >http://dx.doi.org/10.1007/978-3-319-21786-4_18</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-319-21786-4_18" target="_blank" >10.1007/978-3-319-21786-4_18</a>

Alternative languages

Result language
English
Title in original language
Named Entity Recognition in Vietnamese Tweets
Result description in original language
Named entity recognition (NER) is the task of detecting named entities in documents and categorizing them into predefined classes such as Person (PER), Location (LOC), and Organization (ORG). Many approaches have been proposed for this problem, both for formal texts such as news or authorized web content and for short texts such as posts on online social networks. However, those texts were written in languages other than Vietnamese. In this paper, we propose a method for NER in Vietnamese tweets. Because tweets are noisy, irregular, and short, and contain acronyms and spelling errors, NER in tweets is a challenging task. Our method first normalizes tweets and then applies a learning model to recognize named entities using six different types of features. We built a training set of more than 40,000 named entities and a testing set of 2,446 named entities to evaluate our system. The experimental results show that our system achieves encouraging performance with 82.3%
Title in English
Named Entity Recognition in Vietnamese Tweets
Result description in English
Named entity recognition (NER) is the task of detecting named entities in documents and categorizing them into predefined classes such as Person (PER), Location (LOC), and Organization (ORG). Many approaches have been proposed for this problem, both for formal texts such as news or authorized web content and for short texts such as posts on online social networks. However, those texts were written in languages other than Vietnamese. In this paper, we propose a method for NER in Vietnamese tweets. Because tweets are noisy, irregular, and short, and contain acronyms and spelling errors, NER in tweets is a challenging task. Our method first normalizes tweets and then applies a learning model to recognize named entities using six different types of features. We built a training set of more than 40,000 named entities and a testing set of 2,446 named entities to evaluate our system. The experimental results show that our system achieves encouraging performance with 82.3%
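The two-stage idea in the description (normalize the tweet, then extract per-token features for a learning model) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abbreviation lexicon, the normalization rules, and the feature names are hypothetical and are not the six feature types or the resources used in the paper, which the description does not list. No learning model is included; the feature dictionaries are the kind of input a sequence labeller would typically consume.

```python
import re

# Hypothetical abbreviation lexicon (illustrative only, not from the paper).
ABBREVIATIONS = {"hn": "Hà Nội", "sg": "Sài Gòn", "ko": "không"}

URL_RE = re.compile(r"https?://\S+")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # a character repeated three or more times


def normalize(tweet: str) -> list[str]:
    """Strip URLs, squeeze elongated characters, expand known abbreviations."""
    text = URL_RE.sub(" ", tweet)
    text = REPEAT_RE.sub(r"\1", text)  # e.g. "vuiiii" becomes "vui"
    tokens: list[str] = []
    for tok in text.split():
        # An expansion may contain several words, so split it again.
        tokens.extend(ABBREVIATIONS.get(tok.lower(), tok).split())
    return tokens


def token_features(tokens: list[str], i: int) -> dict[str, object]:
    """Illustrative surface and context features for the token at position i."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "is_title": tok.istitle(),
        "is_mention": tok.startswith("@"),
        "is_hashtag": tok.startswith("#"),
        "prev": tokens[i - 1].lower() if i > 0 else "<s>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }


if __name__ == "__main__":
    toks = normalize("Toi o HN vuiiii qua http://t.co/abc")
    for idx in range(len(toks)):
        print(toks[idx], token_features(toks, idx))
```

Running normalization before feature extraction means that capitalization and context features are computed on the expanded, cleaned tokens rather than on the raw noisy text, which is the motivation the description gives for the normalization step.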

Classification

Type
D - Article in proceedings
CEP field
IN - Informatics
OECD FORD field
—

Result continuities

Project
—
Continuities
S - Specific research at universities

Others

Publication year
2015
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Article name in the proceedings
Lecture Notes in Computer Science. Volume 9197
ISBN
978-3-319-21785-7
ISSN
0302-9743
e-ISSN
—
Number of pages
11
Pages from-to
205-215
Publisher name
Springer Verlag
Place of publication
London
Event location
Beijing
Event date
4 August 2015
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—

Similar results (10)

Normalization of Vietnamese Tweets on Twitter
Sebastian, Basti, Wastl?! Recognizing Named Entities in Bavarian Dialectal Data
Annotating the Tweebank Corpus on Named Entity Recognition and Building NLP Models for Social Media Analysis
