A Baseline for General Music Object Detection with Deep Learning
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F18%3A10390086" target="_blank" >RIV/00216208:11320/18:10390086 - isvavai.cz</a>
Result on the web
<a href="https://www.mdpi.com/2076-3417/8/9/1488" target="_blank" >https://www.mdpi.com/2076-3417/8/9/1488</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.3390/app8091488" target="_blank" >10.3390/app8091488</a>
Alternative languages
Result language
English
Original language name
A Baseline for General Music Object Detection with Deep Learning
Original language description
Deep learning is bringing breakthroughs to many computer vision subfields including Optical Music Recognition (OMR), which has seen a series of improvements to musical symbol detection achieved by using generic deep learning models. However, so far, each such proposal has been based on a specific dataset and different evaluation criteria, which has made it difficult to quantify the new deep learning-based state of the art and to assess the relative merits of these detection models on music scores. In this paper, a baseline for general detection of musical symbols with deep learning is presented. We consider three datasets of heterogeneous typology but with the same annotation format and three neural models of different nature, and establish their performance in terms of a common evaluation standard. The experimental results confirm that direct music object detection with deep learning is indeed promising, but at the same time illustrate some of the domain-specific shortcomings of the general detectors. A q
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/en/project/GBP103%2F12%2FG084" target="_blank" >GBP103/12/G084: Center for Large Scale Multi-modal Data Interpretation</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2018
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Applied Sciences
ISSN
2076-3417
e-ISSN
—
Volume of the periodical
8
Issue of the periodical within the volume
9
Country of publishing house
CH - SWITZERLAND
Number of pages
21
Pages from-to
1488-1508
UT code for WoS article
000445760200077
EID of the result in the Scopus database
2-s2.0-85052803034