machine_learning_models
Model scripts created in Python using JupyterLab
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (18.2%) to scientific vocabulary
Repository
Model scripts created in Python using JupyterLab
Basic Info
- Host: GitHub
- Owner: AydenMQ
- License: MIT
- Language: Jupyter Notebook
- Default Branch: main
- Size: 261 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 4
Metadata Files
README.md
MachineLearningModels
Model scripts created in Python using JupyterLab.
- Code produced by: Ayden McCarthy
- Manuscript Title: "Machine Learning Models Predict Assessment Outcomes From Military Physical Employment Standards Via a Physical Test Battery"
- Program of Study: PhD
- Institution: Macquarie University
- Year: 2024
- DOI: 10.5281/zenodo.14043036
Model Suite
This repository contains four Jupyter notebooks developed as part of a scientific manuscript. Each notebook trains one model type (Ridge regression, support vector regression, random forest, or multilayer perceptron) to predict specific performance metrics. The notebooks share a common structure so that the analysis is reproducible and transparent for researchers and practitioners. The suite supports applications in performance modelling, prediction, and cross-validation for various populations.
Installation
To get started, clone this repository and install the required Python packages. Ensure that you have installed Python 3 and that JupyterLab is set up. Use the following commands:
```bash
git clone https://github.com/AydenMQ/Machine_Learning_Models.git
cd Machine_Learning_Models
pip install -r requirements.txt
```
The last command expects a requirements.txt file in the repository folder. If the file is missing, create a requirements.txt file with the following content:
```text
pandas==2.2.2
numpy==1.26.4
seaborn==0.12.2
matplotlib==3.8.3
shap==0.43.0
mlxtend==0.23.1
scipy==1.13.0
scikit-learn==1.4.2
```
Pinning these versions keeps your environment compatible with the package versions used in the notebooks.
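To confirm that the installed packages match the pins, a small check along the following lines may help. This is a minimal sketch using only the standard library; `PINNED` and `find_mismatches` are illustrative names, not part of the repository.

```python
from importlib import metadata

# Pins copied from the requirements.txt content listed in this README.
PINNED = {
    "pandas": "2.2.2",
    "numpy": "1.26.4",
    "seaborn": "0.12.2",
    "matplotlib": "3.8.3",
    "shap": "0.43.0",
    "mlxtend": "0.23.1",
    "scipy": "1.13.0",
    "scikit-learn": "1.4.2",
}


def find_mismatches(pins):
    """Return {package: (expected, found)} for every package whose installed
    version differs from its pin; found is None when it is not installed."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


for name, (expected, found) in find_mismatches(PINNED).items():
    print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

If the loop prints nothing, every pinned package is installed at the expected version.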
Models
This repository includes four machine learning models, each in a separate Jupyter Notebook (.ipynb): Support Vector Regression, Ridge, Random Forest, and Multilayer Perceptron. The models were developed for robust predictive modelling. Each notebook handles training, hyperparameter tuning, and output generation on an unseen testing data set.
Notebook Structure
Each model notebook follows the same logical flow:
- Data Import: Load the dataset, which may contain performance metrics, demographic variables, or other features.
- Pre-processing: Standardise the training data set.
- Model Training: Train the specified model, using a grid search to tune hyperparameters.
- Validation: Use cross-validation (where applicable) to assess the model's performance and generalization capabilities.
- Results Output: Save the results, including RMSE and other relevant metrics, to CSV or JSON for reporting.
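The flow above can be sketched end to end for the Ridge case. This is an illustrative outline under stated assumptions, not the notebooks' actual code: the data are synthetic, the `alpha` values are placeholders, and the variable names are invented for the example.

```python
import json

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Data import: synthetic stand-in for the real CSV data (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))
y = X @ np.array([1.5, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.5, size=120)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Pre-processing and model in one pipeline, so the scaler is fitted
# on the training portion of each cross-validation fold only.
pipeline = Pipeline([("scaler", StandardScaler()), ("model", Ridge())])

# Model training and validation: grid search with 5-fold cross-validation.
search = GridSearchCV(
    pipeline,
    param_grid={"model__alpha": [0.01, 0.1, 1.0, 10.0]},  # illustrative values
    scoring="neg_root_mean_squared_error",
    cv=5,
)
search.fit(X_train, y_train)

# Results output: evaluate once on the held-out test set and report as JSON.
test_rmse = float(np.sqrt(mean_squared_error(y_test, search.predict(X_test))))
results = {
    "best_alpha": search.best_params_["model__alpha"],
    "cv_rmse": float(-search.best_score_),
    "test_rmse": test_rmse,
}
print(json.dumps(results, indent=2))
```

Wrapping the scaler and the model in a single `Pipeline` is one way to avoid leaking test-fold statistics into the standardisation step; the notebooks may organise this differently.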
Usage
Running the Notebooks
Open each notebook in JupyterLab and run the cells sequentially. The cells must be executed in order to train and evaluate the model.
- Start JupyterLab:
```bash
jupyter lab
```
- Open the desired notebook (e.g., Ridge_Model.ipynb), and run all cells or step through cells to review intermediate results. Note: Make sure the training and testing sets are saved as .csv files and that their file names match the names used in the notebook.
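Before running a notebook, a quick pre-flight check can catch a missing or misnamed data file. The sketch below uses only the standard library; the file names `training_set.csv` and `testing_set.csv` are hypothetical and should be replaced with the names your notebook reads.

```python
from pathlib import Path


def missing_csv_files(file_names, folder="."):
    """Return the names that are not .csv files or do not exist in folder."""
    folder = Path(folder)
    return [
        name
        for name in file_names
        if not name.lower().endswith(".csv") or not (folder / name).is_file()
    ]


# Hypothetical file names; replace them with the ones your notebook reads.
expected = ["training_set.csv", "testing_set.csv"]
problems = missing_csv_files(expected)
if problems:
    print("Missing or misnamed files:", ", ".join(problems))
```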
Parameter Tuning
Each model can be tuned for optimal performance using scikit-learn’s GridSearchCV or by adjusting values manually within the notebook. Refer to the hyperparameters section of the notebook to change the parameters, for example alpha, which sets the regularization strength for Ridge.
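As a rough illustration, search grids for the four model types could be declared as follows. The hyperparameter values are assumptions chosen for the example and are not the settings used in the manuscript; `MODEL_GRIDS` and `make_search` are hypothetical names.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR

# Illustrative grids only; the values are assumptions, not the manuscript's.
MODEL_GRIDS = {
    "ridge": (Ridge(), {"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}),
    "svr": (
        SVR(),
        {"C": [0.1, 1.0, 10.0], "epsilon": [0.01, 0.1], "kernel": ["linear", "rbf"]},
    ),
    "random_forest": (
        RandomForestRegressor(random_state=0),
        {"n_estimators": [100, 300], "max_depth": [None, 5, 10]},
    ),
    "mlp": (
        MLPRegressor(max_iter=2000, random_state=0),
        {"hidden_layer_sizes": [(16,), (32, 16)], "alpha": [1e-4, 1e-2]},
    ),
}


def make_search(name, cv=5):
    """Build an unfitted GridSearchCV for one of the four model types."""
    estimator, grid = MODEL_GRIDS[name]
    return GridSearchCV(estimator, grid, scoring="neg_root_mean_squared_error", cv=cv)
```

Calling `make_search("ridge").fit(X_train, y_train)` and then reading `best_params_` returns the best combination found for that grid, where `X_train` and `y_train` stand for your own training data.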
Cross-validation
Each notebook uses cross-validation to assess how robustly the model performs on unseen data (the testing phase). Results include cross-validated RMSE for consistent performance evaluation.
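A cross-validated RMSE can be obtained with scikit-learn's `cross_val_score`. The following is a minimal sketch on synthetic data, assuming 5 shuffled folds and a Ridge model with `alpha=1.0`; the notebooks may use a different fold scheme and different settings.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the real training set (assumption).
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 4))
y = X @ np.array([2.0, -1.0, 0.0, 0.5]) + rng.normal(scale=0.3, size=100)

model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
folds = KFold(n_splits=5, shuffle=True, random_state=42)

# scikit-learn maximises scores, so RMSE is returned as a negative number.
neg_rmse = cross_val_score(
    model, X, y, cv=folds, scoring="neg_root_mean_squared_error"
)
fold_rmse = -neg_rmse

print(f"RMSE per fold: {np.round(fold_rmse, 3)}")
print(f"Mean RMSE: {fold_rmse.mean():.3f} (SD {fold_rmse.std(ddof=1):.3f})")
```

Reporting the spread across folds alongside the mean gives a sense of how stable the estimate is.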
Output Files
The notebooks generate output files that include:
- SHAP: SHAP values showing which features most influence the model's predictions.
- Performance Metrics: Error metrics such as RMSE.
- Prediction Outputs: Predicted vs. actual performance values for the test dataset.
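The metric and prediction outputs listed above could be written out roughly as follows. This is a standard-library sketch; the function names and the output file naming pattern are assumptions made for illustration, not the notebooks' conventions.

```python
import csv
import json
import math
from pathlib import Path


def rmse(actual, predicted):
    """Root mean squared error of two equal-length, non-empty sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def save_outputs(actual, predicted, out_dir, model_name):
    """Write predictions to CSV and metrics to JSON; return the RMSE."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    score = rmse(actual, predicted)

    # Predicted vs. actual values, one row per test observation.
    with open(out_dir / f"{model_name}_predictions.csv", "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["actual", "predicted"])
        writer.writerows(zip(actual, predicted))

    # Summary metrics for reporting.
    with open(out_dir / f"{model_name}_metrics.json", "w") as fh:
        json.dump({"model": model_name, "rmse": score}, fh, indent=2)

    return score
```

For example, `save_outputs(y_test, y_pred, "outputs", "ridge")` would create `outputs/ridge_predictions.csv` and `outputs/ridge_metrics.json`, where `y_test` and `y_pred` are lists of actual and predicted values.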
Contributions
Contributions are welcome. If you have suggestions for model improvements, feature requests, or bug fixes, please fork the repository and submit a pull request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Owner
- Login: AydenMQ
- Kind: user
- Repositories: 1
- Profile: https://github.com/AydenMQ
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "McCarthy"
    given-names: "Ayden"
    orcid: "https://orcid.org/0000-0001-8927-7484"
title: "AydenMQ/Machine_Learning_Models: Machine Learning Models Version 2"
version: 2.0.0
doi: 10.5281/zenodo.14038000
date-released: 2024-11-05
url: "https://github.com/AydenMQ/Machine_Learning_Models"
GitHub Events
Total
- Release event: 4
- Push event: 17
- Create event: 6
Last Year
- Release event: 4
- Push event: 17
- Create event: 6