HPS: High precision stemmer
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F15%3A43922745" target="_blank" >RIV/49777513:23520/15:43922745 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1016/j.ipm.2014.08.006" target="_blank" >http://dx.doi.org/10.1016/j.ipm.2014.08.006</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1016/j.ipm.2014.08.006" target="_blank" >10.1016/j.ipm.2014.08.006</a>
Alternative languages
Result language
English
Original language name
HPS: High precision stemmer
Original language description
Research into unsupervised ways of stemming has resulted, in the past few years, in the development of methods that are reliable and perform well. Our approach further shifts the boundaries of the state of the art by providing more accurate stemming results. The idea of the approach consists in building a stemmer in two stages. In the first stage, a stemming algorithm based upon clustering, which exploits the lexical and semantic information of words, is used to prepare large-scale training data for the second-stage algorithm. The second-stage algorithm uses a maximum entropy classifier. The stemming-specific features help the classifier decide when and how to stem a particular word. In our research, we have pursued the goal of creating a multi-purpose stemming tool. Its design opens up possibilities of solving non-traditional tasks such as approximating lemmas or improving language modeling. However, we still aim at very good results in the traditional task of information retrieval. T
Czech name
—
Czech description
—
Classification
Type
J<sub>x</sub> - Unclassified - Peer-reviewed scientific article (Jimp, Jsc and Jost)
CEP classification
JD - Use of computers, robotics and its application
OECD FORD branch
—
Result continuities
Project
<a href="/en/project/ED1.1.00%2F02.0090" target="_blank" >ED1.1.00/02.0090: NTIS - New Technologies for Information Society</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)<br>S - Specific research at universities
Others
Publication year
2015
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Information Processing and Management
ISSN
0306-4573
e-ISSN
—
Volume of the periodical
51
Issue of the periodical within the volume
1
Country of publishing house
NL - THE KINGDOM OF THE NETHERLANDS
Number of pages
24
Pages from-to
68-91
UT code for WoS article
000345491900005
EID of the result in the Scopus database
—