Multimodal Fusion: Combining Visual and Textual Cues for Concept Detection in Video
The result's identifiers
Result code in IS VaVaI
RIV/68407700:21240/15:00224288 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21240%2F15%3A00224288)
Result on the web
http://link.springer.com/chapter/10.1007%2F978-3-319-14998-1_13
DOI - Digital Object Identifier
10.1007/978-3-319-14998-1_13 (http://dx.doi.org/10.1007/978-3-319-14998-1_13)
Alternative languages
Result language
English
Original language name
Multimodal Fusion: Combining Visual and Textual Cues for Concept Detection in Video
Original language description
Visual concept detection is one of the most active research areas in multimedia analysis. Its goal is to assign, to each elementary temporal segment of a video, a confidence score for each target concept (e.g., forest, ocean, sky). Establishing such associations between the video content and the concept labels is a key step toward semantics-based indexing, retrieval, and summarization of videos, as well as deeper analysis (e.g., video event detection). Due to its significance for the multimedia analysis community, concept detection is the topic of international benchmarking activities such as TRECVID. While video is typically a multimodal signal composed of visual content, speech, audio, and possibly also subtitles, most research has so far focused on exploiting the visual modality. In this chapter we introduce fusion and text analysis techniques for harnessing automatic speech recognition (ASR) transcripts or subtitles for improving the results
Czech name
—
Czech description
—
Classification
Type
C - Chapter in a specialist book
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
—
Continuities
I - Institutional support for the long-term conceptual development of a research organization
Others
Publication year
2015
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Book/collection name
Multimedia Data Mining and Analytics
ISBN
978-3-319-14997-4
Number of pages of the result
16
Pages from-to
295-310
Number of pages of the book
454
Publisher name
Springer International Publishing AG
Place of publication
Cham
UT code for WoS chapter
—