Context-Based Bigram Model for POS Tagging in Hindi: A Heuristic Approach

Result identifiers

  • Result code in IS VaVaI

    RIV/00216208:11320/23:NXJGVCU7 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3ANXJGVCU7)

  • Result on the web

    https://doi.org/10.1007/s40745-022-00434-4

  • DOI - Digital Object Identifier

    10.1007/s40745-022-00434-4 (http://dx.doi.org/10.1007/s40745-022-00434-4)

Alternative languages

  • Result language

    English

  • Title in English

    Context-Based Bigram Model for POS Tagging in Hindi: A Heuristic Approach

  • Result description in English

    "In the domain of natural language processing, part-of-speech (POS) tagging is the most important task. It plays a vital role in applications like sentiment analysis, text summarization, opinion mining, etc. POS tagging is the process of assigning POS information (noun, pronoun, verb, etc.) to a given word. This information is considered in the context of the word's relationship with the surrounding words. Hindi is a very popular language in countries like India, Nepal, the United States, Mauritius, etc. The majority of Indians are accustomed to reading and writing in Hindi. They also use Hindi for writing on social media such as Twitter, Facebook, WhatsApp, etc. POS tagging is the most important phase in analyzing such Hindi text from social media. Text written in Hindi is ambiguous in nature and rich in morphology, which makes identification of POS information challenging. In this article, a heuristic-based approach is proposed for identifying POS information. The proposed method deploys a context-based bigram model that creates a bigram sequence based on the relationship with the adjacent words. Subsequently, it selects the most likely POS information for a word based on both the forward and reverse bigram sequences. The experimental result of the proposed heuristic approach is compared with existing state-of-the-art techniques such as the hidden Markov model, decision tree, conditional random fields, support vector machine, neural network, and recurrent neural networks. Finally, it is observed that the proposed heuristic approach for POS tagging in Hindi outperforms the existing techniques and attains an accuracy of 94.3%."
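The description above states only that the tag for a word is chosen from forward and reverse bigram sequences; it does not give the procedure. The following is a minimal sketch of one way such a tagger could look, not the authors' method. The function names (`train`, `tag`), the vote-summing rule, the unigram fallback, the `UNK` default, and the romanized toy corpus with its tag labels are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def train(tagged_sentences):
    """Collect tag counts keyed by forward and reverse word bigrams (assumed scheme)."""
    forward = defaultdict(Counter)  # (previous word, word) -> tag counts
    reverse = defaultdict(Counter)  # (word, next word) -> tag counts
    unigram = defaultdict(Counter)  # word -> tag counts, used only as a fallback
    for sent in tagged_sentences:
        words = ["<s>"] + [w for w, _ in sent] + ["</s>"]
        for i, (word, pos) in enumerate(sent, start=1):
            forward[(words[i - 1], word)][pos] += 1
            reverse[(word, words[i + 1])][pos] += 1
            unigram[word][pos] += 1
    return forward, reverse, unigram

def tag(words, model, default="UNK"):
    """Pick, for each word, the tag with the most combined forward + reverse evidence."""
    forward, reverse, unigram = model
    padded = ["<s>"] + list(words) + ["</s>"]
    result = []
    for i, word in enumerate(words, start=1):
        votes = Counter()
        votes.update(forward.get((padded[i - 1], word), {}))
        votes.update(reverse.get((word, padded[i + 1]), {}))
        if not votes:  # neither bigram was seen in training: fall back to the word alone
            votes.update(unigram.get(word, {}))
        result.append((word, votes.most_common(1)[0][0] if votes else default))
    return result

# Hypothetical romanized toy corpus: "sona" is ambiguous (gold / to sleep).
TOY_CORPUS = [
    [("sona", "NOUN"), ("mehnga", "ADJ"), ("hai", "VERB")],
    [("mujhe", "PRON"), ("sona", "VERB"), ("hai", "VERB")],
]
model = train(TOY_CORPUS)
print(tag(["mujhe", "sona", "hai"], model))
print(tag(["yah", "sona", "mehnga", "hai"], model))
```

In the second call the word before "sona" was never seen in training, so the forward bigram gives no evidence and the reverse bigram ("sona", "mehnga") alone decides the tag. This is the kind of case where looking in both directions helps.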

Classification

  • Type

    Jost - Other articles in peer-reviewed periodicals

  • CEP field

  • OECD FORD field

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result linkages

  • Project

  • Linkages

Other

  • Year of implementation

    2023

  • Data confidentiality code

    S - Complete and truthful data on the project are not subject to protection under special legal regulations

Data specific to the result type

  • Name of the periodical

    "Annals of Data Science"

  • ISSN

    2198-5812

  • e-ISSN

  • Volume of the periodical

    ""

  • Issue of the periodical within the volume

    2023

  • Country of the periodical's publisher

    US - United States of America

  • Number of pages of the result

    32

  • Pages from-to

    347-378

  • UT WoS code of the article

  • EID of the result in the Scopus database