Score Normalization Methods Applied to Topic Identification
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F14%3A43922926" target="_blank" >RIV/49777513:23520/14:43922926 - isvavai.cz</a>
Result on the web
<a href="http://link.springer.com/chapter/10.1007/978-3-319-10816-2_17" target="_blank" >http://link.springer.com/chapter/10.1007/978-3-319-10816-2_17</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-319-10816-2_17" target="_blank" >10.1007/978-3-319-10816-2_17</a>
Alternative languages
Result language
angličtina
Original language name
Score Normalization Methods Applied to Topic Identification
Original language description
Multi-label classification plays the key role in modern categorization systems. Its goal is to find a set of labels belonging to each data item. In the multi-label document classification unlike in the multi-class classification, where only the best topic is chosen, the classifier must decide if a document does or does not belong to each topic from the predefined topic set. We are using the generative classifier to tackle this task, but the problem with this approach is that the threshold for the positive classification must be set. This threshold can vary for each document depending on the content of the document (words used, length of the document, ...). In this paper we use the Unconstrained Cohort Normalization, primary proposed for speaker identification/verification task, for robustly finding the threshold defining the boundary between the correct and the incorrect topics of a document. In our former experiments we have proposed a method for finding this threshold inspired by ano
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
JD - Use of computers, robotics and its application
OECD FORD branch
—
Result continuities
Project
<a href="/en/project/LM2010013" target="_blank" >LM2010013: LINDAT-CLARIN: Institute for analysis, processing and distribution of linguistic data</a><br>
Continuities
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)<br>S - Specificky vyzkum na vysokych skolach
Others
Publication year
2014
Confidentiality
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Data specific for result type
Article name in the collection
Text, Speech, and Dialogue, 17th International Conference, TSD 2014, Brno, Czech Republic, September 8-12, 2014. Proceedings
ISBN
978-3-319-10815-5
ISSN
0302-9743
e-ISSN
—
Number of pages
8
Pages from-to
133-140
Publisher name
Springer
Place of publication
Heidelberg
Event location
Brno, Czech Republic
Event date
Sep 8, 2014
Type of event by nationality
WRD - Celosvětová akce
UT code for WoS article
—