Healthc Inform Res. 2020 Oct;26(4):274-283. doi: 10.4258/hir.2020.26.4.274.

Analysis of Smartphone Recordings in Time, Frequency, and Cepstral Domains to Classify Parkinson’s Disease

Affiliations
  • 1 Department of Biomedical Engineering, Mohammed V University in Rabat, Morocco
  • 2 Electronic Systems Sensors and Nanobiotechnologies (E2SN), ENSET, Mohammed V University in Rabat, Morocco

Abstract


Objectives
Parkinson’s disease (PD) is the second most common neurodegenerative disorder; it affects more than 10 million people worldwide. Detecting PD usually requires a professional assessment by an expert, and investigation of the voice as a biomarker of the disease could be effective in speeding up the diagnostic process.
Methods
We distinguished PD patients from healthy controls (HC) using a large sample of 18,210 smartphone recordings. The recordings were processed with an audio processing technique to create a final dataset of 80,594 instances and 138 features from the time, frequency, and cepstral domains. This dataset was preprocessed and normalized to build baseline machine-learning models with four classifiers: linear support vector machine, K-nearest neighbor, random forest, and extreme gradient boosting (XGBoost). We divided the dataset into training and held-out test sets, then used stratified 5-fold cross-validation and four performance measures (accuracy, sensitivity, specificity, and F1-score) to assess the models. We applied two feature selection methods, analysis of variance (ANOVA) and least absolute shrinkage and selection operator (LASSO), to reduce the dimensionality of the dataset by selecting the subset of features that maximizes classifier performance.
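The evaluation protocol described above (normalization, a held-out test set, stratified 5-fold cross-validation, and four performance measures) can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes scikit-learn, uses synthetic data in place of the 138 voice features, uses random forest as a stand-in for the four classifiers, and the 80/20 split ratio, the seed, and the helper names `four_measures` and `evaluate` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def four_measures(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F1, with class 1 (PD) as positive."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": float((tp + tn) / (tp + tn + fp + fn)),
        "sensitivity": float(tp / (tp + fn)) if (tp + fn) else 0.0,
        "specificity": float(tn / (tn + fp)) if (tn + fp) else 0.0,
        "f1": float(2 * tp / (2 * tp + fp + fn)) if (2 * tp + fp + fn) else 0.0,
    }


def evaluate(X, y, n_splits=5, seed=0):
    # Hold out a stratified test set (the 20% ratio is an assumption).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    # Scaling lives inside the pipeline so it is refit on each training fold.
    model = make_pipeline(
        StandardScaler(),
        RandomForestClassifier(n_estimators=100, random_state=seed),
    )
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    fold_scores = []
    for fit_idx, val_idx in cv.split(X_train, y_train):
        model.fit(X_train[fit_idx], y_train[fit_idx])
        predictions = model.predict(X_train[val_idx])
        fold_scores.append(four_measures(y_train[val_idx], predictions))
    cv_mean = {
        name: float(np.mean([scores[name] for scores in fold_scores]))
        for name in fold_scores[0]
    }
    # Refit on the whole training set, then score the unseen test set once.
    model.fit(X_train, y_train)
    test_scores = four_measures(y_test, model.predict(X_test))
    return cv_mean, test_scores


# Synthetic stand-in for the voice-feature matrix (label 1 = PD, 0 = HC).
X, y = make_classification(n_samples=400, n_features=20, n_informative=8, random_state=0)
cv_mean, test_scores = evaluate(X, y)
print(cv_mean)
print(test_scores)
```

Keeping the scaler inside the pipeline means each validation fold is normalized with statistics learned only from its training folds, which avoids leaking information into the cross-validated estimates.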
Results
LASSO outperformed ANOVA with almost the same number of features. With 33 features, XGBoost achieved a maximum accuracy of 95.31% on the training data and 95.78% on unseen data.
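To illustrate how the two selectors can be compared at a similar feature budget, the sketch below picks features from a 138-column matrix with an ANOVA F-test and with LASSO coefficients. It is an assumption-laden example, not the study's implementation: the data are synthetic, scikit-learn is assumed, LASSO is fitted as a regression on the 0/1 labels, and the penalty `alpha=0.01` is a hypothetical value. Only the counts of 138 and 33 features come from the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel, SelectKBest, f_classif
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

N_FEATURES = 138  # feature count reported in the abstract
N_SELECTED = 33   # feature budget reported in the abstract

# Synthetic stand-in for the normalized voice-feature matrix.
X, y = make_classification(
    n_samples=500, n_features=N_FEATURES, n_informative=20, random_state=0
)
X = StandardScaler().fit_transform(X)

# ANOVA: rank each feature by its one-way F statistic and keep the top k.
anova = SelectKBest(score_func=f_classif, k=N_SELECTED).fit(X, y)
anova_idx = np.flatnonzero(anova.get_support())

# LASSO: the L1 penalty drives uninformative coefficients to zero; keep at
# most N_SELECTED of the features whose coefficients remain non-zero.
lasso = SelectFromModel(
    Lasso(alpha=0.01, max_iter=10000), max_features=N_SELECTED
).fit(X, y)
lasso_idx = np.flatnonzero(lasso.get_support())

shared = np.intersect1d(anova_idx, lasso_idx)
print(len(anova_idx), len(lasso_idx), len(shared))
```

ANOVA scores each feature in isolation, whereas LASSO fits all features jointly, so the two subsets generally overlap only in part even when their sizes match.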
Conclusions
Developing a smartphone-based system that implements machine-learning techniques is an effective way to diagnose PD using the voice as a biomarker.

Keyword

Parkinson Disease, Voice Disorders, Telemedicine, Machine Learning, Classification

Figure

  • Figure 1 Cohort selection steps using the demographic survey and the medical timepoint of the records.

  • Figure 2 Feature extraction process and the machine-learning process. PD: Parkinson’s disease, SVM: support vector machine, KNN: k-nearest neighbor, XGBoost: extreme gradient boosting.

