lung-cancer-risk-prediction-with-machine-learning-models
https://github.com/meetjariwala10/lung-cancer-risk-prediction-with-machine-learning-models
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.8%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: MeetJariwala10
- Language: Jupyter Notebook
- Default Branch: main
- Size: 446 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Lung-Cancer-Risk-Prediction-with-Machine-Learning-Models
This repository implements machine learning models for lung cancer risk prediction inspired by the paper:
Dritsas, E.; Trigka, M. (2022). Lung Cancer Risk Prediction with Machine Learning Models. Big Data and Cognitive Computing, 6(4), 139.
DOI: 10.3390/bdcc6040139
The paper presents a comparative analysis of several classifiers (including Naive Bayes, SVM, Random Forest, and Rotation Forest) on a publicly available dataset and reports that the Rotation Forest classifier performs best in terms of accuracy, precision, recall, F-Measure, and AUC.
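For readers unfamiliar with the metrics named above, the following is a minimal sketch of how they can be computed for a binary classifier. It is not taken from the paper or the notebook: the function names `classification_metrics` and `auc_from_scores` and the toy inputs are assumptions made for illustration, and the definitions are the standard positive-class ones (the paper may report class-weighted averages instead).

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F-measure from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


def auc_from_scores(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

The pairwise AUC above is quadratic in the number of samples, which is fine for a small tabular dataset; library implementations use a rank-based computation instead.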
Overview
Key features of this repository:
- Data Preprocessing: Balancing the dataset using SMOTE.
- Feature Analysis: Evaluating feature importance using methods like gain ratio and random forest.
- Modeling: Training a variety of classification models such as Naive Bayes, Bayesian network, logistic regression, SVM, Random Forest, and Rotation Forest.
- Evaluation: Assessing models with metrics including accuracy, precision, recall, F-Measure, and AUC via 10-fold cross-validation in the Weka environment.
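The balancing step listed above relies on SMOTE, which creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates that idea in plain Python. It is not the `imblearn` implementation and not the notebook's code; the function name `smote_sketch`, its parameters, and the toy data are assumptions made for illustration.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples from a list of minority-class points.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Pick a random position between the base point and its neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy usage with made-up 2-D points (not the lung cancer dataset).
minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(minority_points, n_new=6, k=2)
```

With cross-validation, oversampling is usually applied to the training folds only, so that synthetic samples do not leak into the held-out fold.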
The project is implemented in a Jupyter Notebook (MLMINIPROJECT.ipynb) that contains the code and experiments.
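The gain ratio named under Feature Analysis ranks a categorical feature by its information gain about the class label, normalised by the feature's own entropy. The sketch below shows that calculation under the standard definition. The helper names `entropy` and `gain_ratio` and the toy inputs are assumptions for illustration and are not taken from the notebook.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by split information."""
    n = len(labels)
    # Expected entropy of the labels after splitting on each feature value.
    conditional = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += len(subset) / n * entropy(subset)
    info_gain = entropy(labels) - conditional
    # Split information is the entropy of the feature itself.
    split_info = entropy(feature)
    # A constant feature carries no information; report 0 instead of dividing by 0.
    return info_gain / split_info if split_info > 0 else 0.0
```

A feature that separates the classes perfectly scores 1.0 on a balanced binary label, while a feature that is independent of the label scores 0.0.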
Prerequisites and Installation
To run the code in this repository, please ensure you have the following:
- Python 3.x installed.
- Required Python libraries such as:
  - numpy
  - pandas
  - scikit-learn
  - imblearn (for SMOTE)
  - matplotlib or seaborn (for plotting)
You can install these dependencies using pip:
```bash
pip install numpy pandas scikit-learn imbalanced-learn matplotlib seaborn
```
Owner
- Name: Meet Jariwala
- Login: MeetJariwala10
- Kind: user
- Repositories: 1
- Profile: https://github.com/MeetJariwala10
Coding Enthusiast
Citation (CITATIONS.bib)
@Article{bdcc6040139,
AUTHOR = {Dritsas, Elias and Trigka, Maria},
TITLE = {Lung Cancer Risk Prediction with Machine Learning Models},
JOURNAL = {Big Data and Cognitive Computing},
VOLUME = {6},
YEAR = {2022},
NUMBER = {4},
ARTICLE-NUMBER = {139},
URL = {https://www.mdpi.com/2504-2289/6/4/139},
ISSN = {2504-2289},
ABSTRACT = {The lungs are the center of breath control and ensure that every cell in the body receives oxygen. At the same time, they filter the air to prevent the entry of useless substances and germs into the body. The human body has specially designed defence mechanisms that protect the lungs. However, they are not enough to completely eliminate the risk of various diseases that affect the lungs. Infections, inflammation or even more serious complications, such as the growth of a cancerous tumor, can affect the lungs. In this work, we used machine learning (ML) methods to build efficient models for identifying high-risk individuals for incurring lung cancer and, thus, making earlier interventions to avoid long-term complications. The suggestion of this article is the Rotation Forest that achieves high performance and is evaluated by well-known metrics, such as precision, recall, F-Measure, accuracy and area under the curve (AUC). More specifically, the evaluation of the experiments showed that the proposed model prevailed with an AUC of 99.3%, F-Measure, precision, recall and accuracy of 97.1%.},
DOI = {10.3390/bdcc6040139}
}
GitHub Events
Total
- Push event: 6
- Create event: 2
Last Year
- Push event: 6
- Create event: 2