K-best Viterbi Semi-supervized Active Learning in Sequence Labelling
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F67985807%3A_____%2F17%3A00478628" target="_blank" >RIV/67985807:_____/17:00478628 - isvavai.cz</a>
Result on the web
<a href="http://ceur-ws.org/Vol-1885/144.pdf" target="_blank" >http://ceur-ws.org/Vol-1885/144.pdf</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in original language
K-best Viterbi Semi-supervized Active Learning in Sequence Labelling
Result description in original language
In application domains where a large amount of unlabelled data exists but obtaining labels is expensive, active learning is a useful way to select which data should be labelled. In addition to its traditional, successful use in classification and regression tasks, active learning has also been applied to sequence labelling. According to the standard active learning approach, the sequences whose labelling would be most informative should be labelled. However, labelling an entire sequence may be inefficient, because the labels of some of its parts can be predicted by a model, and labelling such parts brings little new information. Therefore, in this paper we investigate a sequence labelling approach in which, for a sequence selected for labelling, the labels of most tokens are predicted by a model and only the tokens that the model cannot predict with sufficient confidence are labelled. Those tokens are identified using the k-best Viterbi algorithm.
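The idea of using the k-best Viterbi algorithm to find low-confidence tokens can be illustrated with a minimal sketch. It is not the authors' implementation: the toy hidden Markov model, the names `k_best_viterbi` and `uncertain_positions`, and the rule "a token is uncertain if the k best label sequences disagree on it" are all assumptions made for illustration, since the description above does not specify the model or the exact selection criterion.

```python
import heapq
import math


def k_best_viterbi(obs, states, log_start, log_trans, log_emit, k):
    """Return up to k best state sequences as (log_score, path) pairs.

    Assumes obs is non-empty and all tables hold log-probabilities.
    """
    # beams[s] holds up to k (log_score, path) candidates that end in state s.
    beams = {s: [(log_start[s] + log_emit[s][obs[0]], (s,))] for s in states}
    for o in obs[1:]:
        new_beams = {}
        for s in states:
            # Extend every kept candidate of every previous state into s.
            candidates = [
                (score + log_trans[prev][s] + log_emit[s][o], path + (s,))
                for prev in states
                for score, path in beams[prev]
            ]
            new_beams[s] = heapq.nlargest(k, candidates)
        beams = new_beams
    finals = [cand for s in states for cand in beams[s]]
    return heapq.nlargest(k, finals)


def uncertain_positions(k_best):
    """Token positions on which the k best paths disagree (assumed criterion)."""
    paths = [path for _, path in k_best]
    return [
        t for t in range(len(paths[0]))
        if len({path[t] for path in paths}) > 1
    ]


def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {
        key: to_log(val) if isinstance(val, dict) else math.log(val)
        for key, val in table.items()
    }


# Hypothetical toy model: "N" = part of a name, "O" = other token.
STATES = ("N", "O")
LOG_START = to_log({"N": 0.5, "O": 0.5})
LOG_TRANS = to_log({"N": {"N": 0.6, "O": 0.4}, "O": {"N": 0.3, "O": 0.7}})
LOG_EMIT = to_log({
    "N": {"john": 0.5, "may": 0.4, "runs": 0.1},
    "O": {"john": 0.1, "may": 0.4, "runs": 0.5},
})
OBS = ["john", "may", "runs"]

if __name__ == "__main__":
    best = k_best_viterbi(OBS, STATES, LOG_START, LOG_TRANS, LOG_EMIT, k=2)
    for log_score, path in best:
        print(path, math.exp(log_score))
    print("tokens to hand-label:", [OBS[t] for t in uncertain_positions(best)])
```

In this toy model the two best label sequences differ only at the second token, so only "may" would be sent to the annotator and the remaining labels would be taken from the model. With `k=1` no token is flagged, which shows how the choice of k controls how much human labelling is requested.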
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
<a href="/cs/project/GA17-01251S" target="_blank" >GA17-01251S: Metalearning for extraction of rules with numerical consequents</a><br>
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of implementation
2017
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article name in the proceedings
Proceedings ITAT 2017: Information Technologies - Applications and Theory
ISBN
978-1974274741
ISSN
1613-0073
e-ISSN
—
Number of pages
9
Pages from-to
144-152
Publisher name
Technical University & CreateSpace Independent Publishing Platform
Place of publication
Aachen & Charleston
Event location
Martinské hole
Event date
22. 9. 2017
Event type by nationality
EUR - European event
UT WoS code of the article
—