article-breast-cancer-classification-boosting
https://github.com/joaomh/article-breast-cancer-classification-boosting
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic links in README
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (0.2%) to scientific vocabulary
Last synced: 10 months ago
Repository
Basic Info
- Host: GitHub
- Owner: joaomh
- License: MIT
- Language: Jupyter Notebook
- Default Branch: main
- Size: 3.13 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
- Created: over 1 year ago
- Last pushed: over 1 year ago
Metadata Files
- License
- Citation
Owner
- Name: João Pinheiro
- Login: joaomh
- Kind: user
- Company: Itaú Unibanco @itau
- Website: https://www.youtube.com/2001Engenharia
- Repositories: 6
- Profile: https://github.com/joaomh
My biggest life goal is to improve people's lives through technology and teaching :)
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this work, please cite it as below."
authors:
  - family-names: Herrera Pinheiro
    given-names: João Manoel
  - family-names: Becker
    given-names: Marcelo
title: "Breast Cancer Classification Using Gradient Boosting Algorithms Focusing on Reducing the False Negative and SHAP for Explainability"
abstract: "Cancer is one of the leading causes of death among women worldwide, and breast cancer accounts for the highest number of cancer cases and, consequently, deaths. However, deaths can be prevented by early detection and, consequently, early treatment. Any advance in detecting or predicting this kind of cancer therefore matters for a healthier life. Many studies focus on models with high accuracy in cancer prediction, but accuracy alone is not always a reliable metric. This study investigates the performance of different boosting-based machine learning algorithms for predicting breast cancer, focusing on the recall metric. Boosting algorithms have proven to be an effective tool for detecting diseases. A dataset from the University of California, Irvine (UCI) repository was used to train and test the classifiers on its attributes. The main objective of this study is to use state-of-the-art boosting algorithms such as AdaBoost, XGBoost, CatBoost and LightGBM to predict and diagnose breast cancer and to identify the most effective in terms of recall, ROC-AUC, and the confusion matrix. Furthermore, previous studies have applied Optuna, a library for hyperparameter optimization, to individual algorithms such as XGBoost or LightGBM, but no prior research has examined all four boosting algorithms together within a unified Optuna framework. We also use the SHAP method to improve the interpretability of our model, which can be used as a support tool to identify and predict breast cancer. We were able to improve AUC or recall for all the models and reduce the number of false negatives for AdaBoost and LightGBM; the final AUC was above 99.41% for all models."
journal: "Inteligencia Artificial"
volume: 28
issue: 75
year: 2024
month: 12
start: 63
end: 80
doi: 10.4114/intartif.vol28iss75pp63-80
url: https://journal.iberamia.org/index.php/intartif/article/view/1637
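Illustration (not part of the repository): the abstract argues that accuracy alone can be an unreliable metric and that recall and false negatives matter more for diagnosis. The minimal sketch below shows that effect with plain Python. The labels are hypothetical toy values, not results from the paper, and the function names `confusion_counts`, `accuracy`, and `recall` are assumptions made for this example only.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1  # malignant case correctly flagged
        elif truth == positive:
            fn += 1  # malignant case missed (false negative)
        elif pred == positive:
            fp += 1  # benign case wrongly flagged
        else:
            tn += 1  # benign case correctly cleared
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp, fn):
    """Fraction of positive (malignant) cases that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical toy data: 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(f"accuracy={accuracy(tp, fp, tn, fn):.2f}")
print(f"recall={recall(tp, fn):.2f}, false negatives={fn}")
```

In this toy case 8 of 10 predictions are correct, yet 2 of the 3 malignant cases are missed, which is why a recall-focused evaluation can rank models differently from an accuracy-focused one.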
GitHub Events
Total
- Watch event: 2
- Push event: 1
- Fork event: 1
- Create event: 2
Last Year
- Watch event: 2
- Push event: 1
- Fork event: 1
- Create event: 2