Speeding up the multimedia feature extraction: a comparative study on the big data approach
Result identifiers
Result code in IS VaVaI
RIV/00216224:14330/17:00094702 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F17%3A00094702)
Result on the web
http://dx.doi.org/10.1007/s11042-016-3415-1
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s11042-016-3415-1" target="_blank" >10.1007/s11042-016-3415-1</a>
Alternative languages
Result language
English
Title in original language
Speeding up the multimedia feature extraction: a comparative study on the big data approach
Description in original language
The current explosion of multimedia data is significantly increasing the amount of potential knowledge. However, getting to the actual information requires novel content-based techniques, which in turn require a time-consuming extraction of indexable features from the raw data. To deal with large datasets, this task needs to be parallelized, but there are multiple approaches to choose from, each with its own benefits and drawbacks. Several parameters must also be taken into consideration, for example the amount of available resources, the size of the data, and their availability. In this paper, we empirically evaluate and compare approaches based on Apache Hadoop, Apache Storm, Apache Spark, and Grid computing, employed to distribute the extraction task over an outsourced and distributed infrastructure.
Title in English
Speeding up the multimedia feature extraction: a comparative study on the big data approach
Description in English
The current explosion of multimedia data is significantly increasing the amount of potential knowledge. However, getting to the actual information requires novel content-based techniques, which in turn require a time-consuming extraction of indexable features from the raw data. To deal with large datasets, this task needs to be parallelized, but there are multiple approaches to choose from, each with its own benefits and drawbacks. Several parameters must also be taken into consideration, for example the amount of available resources, the size of the data, and their availability. In this paper, we empirically evaluate and compare approaches based on Apache Hadoop, Apache Storm, Apache Spark, and Grid computing, employed to distribute the extraction task over an outsourced and distributed infrastructure.
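The description above names Apache Hadoop, Apache Storm, Apache Spark, and Grid computing as the compared platforms for distributing the extraction task. As a purely illustrative aid (not code from the paper), the following minimal PySpark sketch shows the general pattern of distributing feature extraction over a list of image paths; the extract_features function, the HDFS paths, and the dummy 128-dimensional descriptor are hypothetical placeholders.

    # Illustrative sketch only: distribute feature extraction over image paths
    # with Apache Spark. extract_features() and the HDFS paths are placeholders,
    # not the extractor or infrastructure evaluated in the paper.
    from pyspark import SparkContext

    def extract_features(image_path):
        # Stand-in for a real descriptor extractor (e.g. a deep-network or
        # MPEG-7 extractor); returns the path with a dummy 128-dim vector.
        return image_path, [0.0] * 128

    if __name__ == "__main__":
        sc = SparkContext(appName="FeatureExtraction")
        # Each line of the input file is assumed to hold one image path.
        paths = sc.textFile("hdfs:///datasets/images/paths.txt")
        features = paths.map(extract_features)
        features.saveAsTextFile("hdfs:///datasets/images/features")
        sc.stop()

In practice, the relative cost of such a job depends on the parameters discussed in the description (available resources, dataset size, data availability), which is what the paper evaluates empirically across the four platforms.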
Classification
Type
Jimp - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
GBP103/12/G084: Centre for the multi-modal interpretation of large-scale data (/cs/project/GBP103%2F12%2FG084)
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Year of implementation
2017
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Multimedia Tools and Applications
ISSN
1380-7501
e-ISSN
—
Volume of the periodical
76
Issue of the periodical within the volume
5
Country of the publisher of the periodical
US - United States of America
Number of pages of the result
21
Pages from-to
7497-7517
UT WoS code of the article
000397278400062
EID of the result in the Scopus database
2-s2.0-84960356866