Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.8%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: MeetJariwala10
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 446 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 11 months ago · Last pushed 11 months ago
Metadata Files
  • Readme
  • Citation

README.md

Lung-Cancer-Risk-Prediction-with-Machine-Learning-Models

This repository implements machine learning models for lung cancer risk prediction inspired by the paper:

Dritsas, E.; Trigka, M. (2022). Lung Cancer Risk Prediction with Machine Learning Models. Big Data and Cognitive Computing, 6(4), 139.
DOI: 10.3390/bdcc6040139

The paper compares several classifiers, including Naive Bayes, SVM, Random Forest, and Rotation Forest, on a publicly available dataset and reports that Rotation Forest performs best in terms of accuracy, precision, recall, F-Measure, and AUC.
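For readers unfamiliar with the five metrics named above, the following is a minimal sketch of how they are defined for a binary classifier. The labels and scores are made-up toy values, not results from the paper or from this repository, and the function names `binary_metrics` and `auc` are illustrative.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F-measure for 0/1 labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        # F-measure is the harmonic mean of precision and recall.
        "f_measure": 2 * precision * recall / denom if denom else 0.0,
    }


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): threshold the scores at 0.5 to get predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = binary_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)
print(metrics)
```

Accuracy, precision, recall, and F-measure depend on the chosen threshold, while AUC depends only on how the scores rank positives against negatives.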


Overview

The repository covers four stages of the workflow:

  • Data Preprocessing: Balancing the dataset using SMOTE.
  • Feature Analysis: Ranking features by importance using gain ratio and Random Forest.
  • Modeling: Training a variety of classification models such as Naive Bayes, Bayesian Network, Logistic Regression, SVM, Random Forest, and Rotation Forest.
  • Evaluation: Assessing models with metrics including accuracy, precision, recall, F-Measure, and AUC via 10-fold cross-validation in the Weka environment.
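The gain ratio named under Feature Analysis is information gain divided by split information (the entropy of the feature itself). The following is a minimal sketch for categorical features, assuming toy data with hypothetical column names (`smoking`, `gender`, `cancer`); it is not taken from the notebook.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())


def gain_ratio(feature, target):
    """Information gain of `feature` about `target`, normalised by split information."""
    total = len(target)
    groups = {}
    for f, t in zip(feature, target):
        groups.setdefault(f, []).append(t)
    # Expected entropy of the target after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(target) - conditional
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy columns, for illustration only.
smoking = ["yes", "yes", "yes", "no", "no", "no", "yes", "no"]
gender = ["m", "f", "m", "f", "m", "f", "m", "f"]
cancer = [1, 1, 1, 0, 0, 0, 1, 0]

print("smoking:", round(gain_ratio(smoking, cancer), 3))
print("gender:", round(gain_ratio(gender, cancer), 3))
```

In this toy data `smoking` separates the classes perfectly and so scores higher than `gender`. Dividing by split information penalises features that fragment the data into many small groups.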

The project is implemented in a Jupyter Notebook (MLMINIPROJECT.ipynb) that contains the code and experiments.


Prerequisites and Installation

To run the code in this repository, please ensure you have the following:

  • Python 3.x installed.
  • Required Python libraries such as:
    • numpy
    • pandas
    • scikit-learn
    • imbalanced-learn (imported as imblearn; provides SMOTE)
    • matplotlib or seaborn (for plotting)
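To see which of these libraries are already available in the current environment, a small check along the following lines may help. It is a sketch using only the standard library; the helper name `missing_packages` is illustrative and not part of this repository.

```python
from importlib.util import find_spec

# Import names, which can differ from pip names:
# scikit-learn imports as sklearn, imbalanced-learn as imblearn.
REQUIRED = ["numpy", "pandas", "sklearn", "imblearn", "matplotlib", "seaborn"]


def missing_packages(names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("Missing:", ", ".join(missing) if missing else "none")
```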

You can install these dependencies using pip:

```bash
pip install numpy pandas scikit-learn imbalanced-learn matplotlib seaborn
```

Owner

  • Name: Meet Jariwala
  • Login: MeetJariwala10
  • Kind: user

Coding Enthusiast

Citation (CITATIONS.bib)

@Article{bdcc6040139,
AUTHOR = {Dritsas, Elias and Trigka, Maria},
TITLE = {Lung Cancer Risk Prediction with Machine Learning Models},
JOURNAL = {Big Data and Cognitive Computing},
VOLUME = {6},
YEAR = {2022},
NUMBER = {4},
ARTICLE-NUMBER = {139},
URL = {https://www.mdpi.com/2504-2289/6/4/139},
ISSN = {2504-2289},
ABSTRACT = {The lungs are the center of breath control and ensure that every cell in the body receives oxygen. At the same time, they filter the air to prevent the entry of useless substances and germs into the body. The human body has specially designed defence mechanisms that protect the lungs. However, they are not enough to completely eliminate the risk of various diseases that affect the lungs. Infections, inflammation or even more serious complications, such as the growth of a cancerous tumor, can affect the lungs. In this work, we used machine learning (ML) methods to build efficient models for identifying high-risk individuals for incurring lung cancer and, thus, making earlier interventions to avoid long-term complications. The suggestion of this article is the Rotation Forest that achieves high performance and is evaluated by well-known metrics, such as precision, recall, F-Measure, accuracy and area under the curve (AUC). More specifically, the evaluation of the experiments showed that the proposed model prevailed with an AUC of 99.3%, F-Measure, precision, recall and accuracy of 97.1%.},
DOI = {10.3390/bdcc6040139}
}



GitHub Events

Total
  • Push event: 6
  • Create event: 2
Last Year
  • Push event: 6
  • Create event: 2