Analyzing Wav2Vec 1.0 Embeddings for Cross-Database Parkinson’s Disease Detection and Speech Features Extraction
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21460%2F24%3A00377182" target="_blank" >RIV/68407700:21460/24:00377182 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.3390/s24175520" target="_blank" >https://doi.org/10.3390/s24175520</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.3390/s24175520" target="_blank" >10.3390/s24175520</a>
Alternative languages
Result language
angličtina
Original language name
Analyzing Wav2Vec 1.0 Embeddings for Cross-Database Parkinson’s Disease Detection and Speech Features Extraction
Original language description
Advancements in deep learning speech representations have facilitated the effective use of extensive unlabeled speech datasets for Parkinson’s disease (PD) modeling with minimal annotated data. This study employs the non-fine-tuned wav2vec 1.0 architecture to develop machine learning models for PD speech diagnosis tasks, such as cross-database classification and regression to predict demographic and articulation characteristics. The primary aim is to analyze overlapping components within the embeddings on both classification and regression tasks, investigating whether latent speech representations in PD are shared across models, particularly for related tasks. Firstly, evaluation using three multi-language PD datasets showed that wav2vec accurately detected PD based on speech, outperforming feature extraction using mel-frequency cepstral coefficients in the proposed cross-database classification scenarios. In cross-database scenarios using Italian and English-read texts, wav2vec demonstrated performance comparable to intra-dataset evaluations. We also compared our cross-database findings against those of other related studies. Secondly, wav2vec proved effective in regression, modeling various quantitative speech characteristics related to articulation and aging. Ultimately, subsequent analysis of important features examined the presence of significant overlaps between classification and regression models. The feature importance experiments discovered shared features across trained models, with increased sharing for related tasks, further suggesting that wav2vec contributes to improved generalizability. The study proposes wav2vec embeddings as a next promising step toward a speech-based universal model to assist in the evaluation of PD.
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
20601 - Medical engineering
Result continuities
Project
<a href="/en/project/LX22NPO5107" target="_blank" >LX22NPO5107: National institute for Neurological Research</a><br>
Continuities
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)
Others
Publication year
2024
Confidentiality
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Data specific for result type
Name of the periodical
Sensors
ISSN
1424-8220
e-ISSN
1424-8220
Volume of the periodical
24
Issue of the periodical within the volume
17
Country of publishing house
CH - SWITZERLAND
Number of pages
22
Pages from-to
1-22
UT code for WoS article
001311558300001
EID of the result in the Scopus database
2-s2.0-85203878761