Occam's Razor for Big Data? On Detecting Quality in Large Unstructured Datasets
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F60076658%3A12310%2F19%3A43900048" target="_blank" >RIV/60076658:12310/19:43900048 - isvavai.cz</a>
Výsledek na webu
<a href="https://www.mdpi.com/2076-3417/9/15/3065/htm" target="_blank" >https://www.mdpi.com/2076-3417/9/15/3065/htm</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.3390/app9153065" target="_blank" >10.3390/app9153065</a>
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Occam's Razor for Big Data? On Detecting Quality in Large Unstructured Datasets
Popis výsledku v původním jazyce
Detecting quality in large unstructured datasets requires capacities far beyond the limits of human perception and communicability and, as a result, there is an emerging trend towards increasingly complex analytic solutions in data science to cope with this problem. This new trend towards analytic complexity represents a severe challenge for the principle of parsimony (Occam's razor) in science. This review article combines insight from various domains such as physics, computational science, data engineering, and cognitive science to review the specific properties of big data. Problems for detecting data quality without losing the principle of parsimony are then highlighted on the basis of specific examples. Computational building block approaches for data clustering can help to deal with large unstructured datasets in minimized computation time, and meaning can be extracted rapidly from large sets of unstructured image or video data parsimoniously through relatively simple unsupervised machine learning algorithms. Why we still massively lack in expertise for exploiting big data wisely to extract relevant information for specific tasks, recognize patterns and generate new information, or simply store and further process large amounts of sensor data is then reviewed, and examples illustrating why we need subjective views and pragmatic methods to analyze big data contents are brought forward. The review concludes on how cultural differences between East and West are likely to affect the course of big data analytics, and the development of increasingly autonomous artificial intelligence (AI) aimed at coping with the big data deluge in the near future.
Název v anglickém jazyce
Occam's Razor for Big Data? On Detecting Quality in Large Unstructured Datasets
Popis výsledku anglicky
Detecting quality in large unstructured datasets requires capacities far beyond the limits of human perception and communicability and, as a result, there is an emerging trend towards increasingly complex analytic solutions in data science to cope with this problem. This new trend towards analytic complexity represents a severe challenge for the principle of parsimony (Occam's razor) in science. This review article combines insight from various domains such as physics, computational science, data engineering, and cognitive science to review the specific properties of big data. Problems for detecting data quality without losing the principle of parsimony are then highlighted on the basis of specific examples. Computational building block approaches for data clustering can help to deal with large unstructured datasets in minimized computation time, and meaning can be extracted rapidly from large sets of unstructured image or video data parsimoniously through relatively simple unsupervised machine learning algorithms. Why we still massively lack in expertise for exploiting big data wisely to extract relevant information for specific tasks, recognize patterns and generate new information, or simply store and further process large amounts of sensor data is then reviewed, and examples illustrating why we need subjective views and pragmatic methods to analyze big data contents are brought forward. The review concludes on how cultural differences between East and West are likely to affect the course of big data analytics, and the development of increasingly autonomous artificial intelligence (AI) aimed at coping with the big data deluge in the near future.
Klasifikace
Druh
J<sub>imp</sub> - Článek v periodiku v databázi Web of Science
CEP obor
—
OECD FORD obor
10102 - Applied mathematics
Návaznosti výsledku
Projekt
—
Návaznosti
I - Institucionalni podpora na dlouhodoby koncepcni rozvoj vyzkumne organizace
Ostatní
Rok uplatnění
2019
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název periodika
Applied Sciences-Basel
ISSN
2076-3417
e-ISSN
—
Svazek periodika
9
Číslo periodika v rámci svazku
15
Stát vydavatele periodika
CH - Švýcarská konfederace
Počet stran výsledku
27
Strana od-do
—
Kód UT WoS článku
000482134500120
EID výsledku v databázi Scopus
—