parkinson-disease-prediction
Predictive Diagnosis of Parkinson Disease
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (8.1%) to scientific vocabulary
Repository
Predictive Diagnosis of Parkinson Disease
Basic Info
- Host: GitHub
- Owner: alexksh2
- Language: R
- Default Branch: main
- Size: 74.2 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Predictive Diagnosis of Parkinson Disease
Parkinson’s disease (PD) is a neurodegenerative disease that results in uncontrollable movements and behavioural changes (National Institute on Aging 2022). Over 1 in 4 people with PD are first misdiagnosed with a different condition; nearly 50% of them are given treatment for the incorrectly diagnosed condition, and over 34% report worse health as a result (Media, P. A. 2019).
Research has indicated that characterising prediagnosis PD and predicting the disease's progression early are essential for preventive interventions, risk stratification and understanding of the disease pathology (Yuan et al. 2021).
Therefore, several machine learning classification models (Logistic Regression, K-Nearest Neighbors, Classification and Regression Tree, and Random Forest) were fitted to the dataset described below to assess how accurately Parkinson's disease can be predicted.
Information About Dataset Used
Source of the dataset: https://www.kaggle.com/datasets/vikasukani/parkinsons-disease-data-set/code
This dataset is composed of a range of biomedical voice measurements from 31 people, 23 of whom have Parkinson's disease (PD). Each column in the table is a particular voice measure, and each row corresponds to one of 195 voice recordings from these individuals, identified by the "name" column.
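Since the 195 recordings come from only 31 people, it can be useful to check how many recordings each person contributes. The following is a minimal base R sketch, assuming the "name" values look like `phon_R01_S01_1` (a subject code followed by a recording number). That pattern and the toy values are assumptions for illustration and are not taken from this README.

```r
# Toy stand-in for the "name" column; the naming pattern is an assumption.
recording_names <- c("phon_R01_S01_1", "phon_R01_S01_2", "phon_R01_S02_1",
                     "phon_R01_S02_2", "phon_R01_S02_3")

# Drop the trailing "_<recording number>" to obtain one ID per subject.
subject_id <- sub("_[0-9]+$", "", recording_names)

# Count the recordings that belong to each subject.
recordings_per_subject <- table(subject_id)
print(recordings_per_subject)

# Number of distinct subjects in the toy data.
n_subjects <- length(unique(subject_id))
print(n_subjects)
```

On the real data the same two steps would be applied to the dataset's own "name" column instead of the toy vector.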
Attribute Information
- name: ASCII subject name and recording number
- MDVP:Fo(Hz): Average vocal fundamental frequency
- MDVP:Fhi(Hz): Maximum vocal fundamental frequency
- MDVP:Flo(Hz): Minimum vocal fundamental frequency
- MDVP:Jitter(%), MDVP:Jitter(Abs), MDVP:RAP, MDVP:PPQ, Jitter:DDP: Several measures of variation in fundamental frequency
- MDVP:Shimmer, MDVP:Shimmer(dB), Shimmer:APQ3, Shimmer:APQ5, MDVP:APQ, Shimmer:DDA: Several measures of variation in amplitude
- NHR, HNR: Two measures of the ratio of noise to tonal components in the voice
- status: Health status of the subject (1 = Parkinson's, 0 = healthy)
- RPDE, D2: Two nonlinear dynamical complexity measures
- DFA: Signal fractal scaling exponent
- spread1, spread2, PPE: Three nonlinear measures of fundamental frequency variation
Model Results
Here TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives.
Overall Accuracy Rate: The share of all records that the model classifies correctly = (TP + TN) / (TP + TN + FP + FN)
Precision: The share of predicted positives that are actually positive = TP / (TP + FP)
Recall: The share of actual positives that the model finds = TP / (TP + FN)
F1-score: The harmonic mean of precision and recall = 2 × Precision × Recall / (Precision + Recall)
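The four metrics above can be computed directly from confusion-matrix counts. This is a minimal base R sketch; the function name `classification_metrics` and the counts passed to it are hypothetical and are not results from this project. The F1-score is computed as the harmonic mean of precision and recall.

```r
# Compute accuracy, precision, recall and F1 from confusion-matrix counts.
classification_metrics <- function(tp, tn, fp, fn) {
  accuracy  <- (tp + tn) / (tp + tn + fp + fn)
  precision <- tp / (tp + fp)
  recall    <- tp / (tp + fn)
  f1        <- 2 * precision * recall / (precision + recall)
  c(accuracy = accuracy, precision = precision, recall = recall, f1 = f1)
}

# Hypothetical counts, for illustration only.
m <- classification_metrics(tp = 40, tn = 45, fp = 10, fn = 5)
print(round(m, 4))
```

With these illustrative counts, accuracy is 0.85 and precision is 0.8, while recall (40/45) and F1 (80/95) come out slightly higher than precision.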
1. Logistic Regression:
F1-score = 0.88
Overall Accuracy Rate = 21 / 24
False Positive = 2
False Negative = 1
2. K-Nearest Neighbors Algorithm:
F1-score = 0.7272727
Overall Accuracy Rate = 18 / 24
False Positive = 2
False Negative = 4
3. Classification and Regression Tree Model:
F1-score = 0.8461538
Overall Accuracy Rate = 20 / 24
False Positive = 3
False Negative = 1
Analysis of CART Model:
a. If the record has spread1 less than -0.54, the model predicts that the patient does not suffer from Parkinson's disease
b. If the record has spread1 larger than 0.48, the model predicts that the patient suffers from Parkinson's disease
c. Other decision criteria can be derived from the CART model diagram
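The two spread1 rules listed above can be written as a small function. This is only a sketch: the function name `classify_by_spread1` is hypothetical, the thresholds are assumed to apply to spread1 on whatever scale the model was trained on, and values between the two thresholds are returned as `NA` because the remaining splits of the tree are not given in the text.

```r
# Apply only the two spread1 rules quoted in the text (thresholds -0.54 and 0.48).
# Records between the thresholds stay undetermined (NA) in this sketch.
classify_by_spread1 <- function(spread1) {
  ifelse(spread1 < -0.54, "healthy",
         ifelse(spread1 > 0.48, "parkinson", NA_character_))
}

# Illustrative inputs, not taken from the dataset.
example_pred <- classify_by_spread1(c(-1.2, 0.0, 0.9))
print(example_pred)
```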
4. Random Forest:
F1-score = 0.8695652
Overall Accuracy Rate = 21 / 24
False Positive = 1
False Negative = 2
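The reported figures can be cross-checked against each other. Since F1 = 2TP / (2TP + FP + FN), the number of true positives implied by each reported F1-score is TP = F1 × (FP + FN) / (2 × (1 − F1)). The sketch below applies this to the four models, assuming that Parkinson's is the positive class; the column names `tp_implied`, `tn_implied` and `f1_check` are introduced here for illustration and are derived values, not figures reported by the project.

```r
# Figures as reported above (24 records per model).
results <- data.frame(
  model   = c("Logistic Regression", "K-Nearest Neighbors", "CART", "Random Forest"),
  f1      = c(0.88, 0.7272727, 0.8461538, 0.8695652),
  correct = c(21, 18, 20, 21),
  fp      = c(2, 2, 3, 1),
  fn      = c(1, 4, 1, 2)
)

# Solve F1 = 2TP / (2TP + FP + FN) for TP and round to a whole count.
results$tp_implied <- round(with(results, f1 * (fp + fn) / (2 * (1 - f1))))
results$tn_implied <- results$correct - results$tp_implied

# Recompute F1 from the implied counts to confirm it matches the reported value.
results$f1_check <- with(results, 2 * tp_implied / (2 * tp_implied + fp + fn))
print(results)
```

Under that assumption the implied true-positive counts are 11, 8, 11 and 10, and the recomputed F1-scores agree with the reported ones, so the listed accuracy, error counts and F1-scores are mutually consistent.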
Conclusion:
Logistic Regression and Random Forest are the two best-performing classification models in this analysis, each classifying 21 of 24 records correctly. However, the random seed was observed to have a significant effect on the performance of every model. More data records are therefore needed before the models can be compared reliably.
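The seed sensitivity noted above can be illustrated by repeating a random split under different seeds and comparing the resulting accuracies. This is a minimal base R sketch on synthetic data: the data-generating process, the single feature `x`, the helper `split_accuracy` and the choice of 20 seeds are all assumptions, and only the 24-record evaluation size is taken from the results above.

```r
# Synthetic stand-in data: 195 rows and one informative feature (assumptions).
set.seed(1)
n <- 195
x <- rnorm(n)
status <- rbinom(n, size = 1, prob = plogis(1 + 1.5 * x))
dat <- data.frame(x = x, status = status)

# Accuracy of a logistic regression on one random train/test split.
split_accuracy <- function(seed, data, n_test = 24) {
  set.seed(seed)
  test_idx <- sample(nrow(data), n_test)
  fit <- glm(status ~ x, data = data[-test_idx, ], family = binomial)
  prob <- predict(fit, newdata = data[test_idx, ], type = "response")
  mean((prob > 0.5) == (data$status[test_idx] == 1))
}

# Repeat the split under 20 different seeds and look at the spread.
acc <- sapply(1:20, split_accuracy, data = dat)
print(range(acc))
print(1 / 24)  # share of accuracy contributed by a single held-out record
```

With only 24 held-out records, each misclassification shifts accuracy by about 4.2 percentage points, which is one reason results on a split of this size can vary noticeably from seed to seed.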
Citations
Media, P. A. (2019, December 30). Quarter of Parkinson’s sufferers were wrongly diagnosed, says charity. The Guardian. https://www.theguardian.com/society/2019/dec/30/quarter-of-parkinsons-sufferers-were-wrongly-diagnosed-says-charity
Yuan, W., Beaulieu-Jones, B., Krolewski, R., Palmer, N., Veyrat-Follet, C., Frau, F., Cohen, C., Bozzi, S., Cogswell, M., Kumar, D., Coulouvrat, C., Leroy, B., Fischer, T. Z., Sardi, S. P., Chandross, K. J., Rubin, L. L., Wills, A.-M., Kohane, I., & Lipnick, S. L. (2021). Accelerating diagnosis of Parkinson’s disease through risk prediction. BMC Neurology, 21(1). https://doi.org/10.1186/s12883-021-02226-4
National Institute on Aging. (2022, April 14). Parkinson’s disease: Causes, Symptoms, and Treatments. National Institute on Aging. https://www.nia.nih.gov/health/parkinsons-disease
Owner
- Login: alexksh2
- Kind: user
- Repositories: 1
- Profile: https://github.com/alexksh2
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Alex"
    given-names: "Khoo Shien How"
    orcid: "https://orcid.org/0000-0000-0000-0000"
title: "Parkinson Disease Prediction"
version: 2.0.4
date-released: 2023-11-2
url: "https://github.com/alexksh2/Parkinson-Disease-Prediction"