Context-Based Bigram Model for POS Tagging in Hindi: A Heuristic Approach
The result's identifiers
Result code in IS VaVaI
RIV/00216208:11320/23:NXJGVCU7 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3ANXJGVCU7)
Result on the web
https://doi.org/10.1007/s40745-022-00434-4
DOI - Digital Object Identifier
10.1007/s40745-022-00434-4
Alternative languages
Result language
English
Original language name
Context-Based Bigram Model for POS Tagging in Hindi: A Heuristic Approach
Original language description
"In the domain of natural language processing, part-of-speech (POS) tagging is the most important task. It plays a vital role in applications like sentiment analysis, text summarization, opinion mining, etc. POS tagging is a process of assigning POS information (noun, pronoun, verb, etc.) to the given word. This information is considered in the context of their relationship with the surrounding words. Hindi is very popular language in countries like India, Nepal, United States, Mauritius, etc. Majority of Indians are accustomed to Hindi for reading and writing. They also use Hindi for writing on social media such as Twitter, Facebook, WhatsApp, etc. POS tagging is the most important phase to analyze these Hindi text from social media. The text scripted in Hindi is ambiguous in nature and rich in morphology. It makes identification of POS information challenging. In this article, a heuristic based approach is proposed for identifying POS information. The proposed method deployed a context-based bigram model that create a bigram sequence based on the relationship with the adjacent words. Subsequently, it selects the most likelihood POS information for a word based on both the forward and reverse bigram sequences. The experimental result of the proposed heuristic approach is compared with existing state-of-the-art techniques like hidden Markov model, decision tree, conditional random fields, support vector machine, neural network, and recurrent neural networks. Finally, it is observe that the proposed heuristic approach for POS tagging in Hindi outperforms the existing techniques and attains an accuracy of 94.3%."
Czech name
—
Czech description
—
Classification
Type
Jost - Miscellaneous article in a specialist periodical
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2023
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
"Annals of Data Science"
ISSN
2198-5812
e-ISSN
—
Volume of the periodical
""
Issue of the periodical within the volume
2023
Country of publishing house
US - UNITED STATES
Number of pages
32
Pages from-to
347-378
UT code for WoS article
—
EID of the result in the Scopus database
—