https://github.com/adamouization/breast-cancer-detection-code

Common deep learning pipeline for the Breast Cancer Detection Dissertation

Science Score: 46.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
✓ DOI references: Found 3 DOI reference(s) in README
✓ Academic publication links: Links to zenodo.org
✓ Committers with academic emails: 2 of 5 committers (40.0%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (14.4%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: Adamouization
License: bsd-2-clause
Language: Python
Default Branch: master
Homepage: http://doi.org/10.5281/zenodo.3975093
Size: 1.13 MB

Statistics

Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Releases: 1

Created about 6 years ago · Last pushed almost 6 years ago

Metadata Files

Readme License

Breast Cancer Detection in Mammograms using Deep Learning Techniques - Common Pipeline Code

Repository containing the code written in common for the Breast Cancer Detection in Mammograms using Deep Learning Techniques dissertation. This code was further extended individually by each group member to get results by testing new deep learning techniques.

Usage on a GPU lab machine

Clone the repository:

cd ~/Projects
git clone https://github.com/Adamouization/Breast-Cancer-Detection-Code

Set up the virtual environment that installs TensorFlow 2 with CUDA 10 for Python (needed for GPU usage):

cd libraries/tf2
tar xvzf tensorflow2-cuda-10-1-e5bd53b3b5e6.tar.gz
sh build.sh

Activate the virtual environment:

source /cs/scratch/<username>/tf2/venv/bin/activate

Create the output and saved_models directories to store the results:

mkdir output
mkdir saved_models

cd into the src directory and run the code:

python main.py [-h] -d DATASET -m MODEL [-r RUNMODE] [-i IMAGESIZE] [-v]

where:
* -h is a flag for help on how to run the code.
* DATASET is the dataset to use. Must be either mini-MIAS or CBIS-DDSM.
* MODEL is the model to use. Must be either basic or advanced.
* RUNMODE is the mode to run in (train or test). Default value is train.
* IMAGESIZE is the image size to feed into the CNN model (small: 512x512 px, or large: 2048x2048 px). Default value is small.
* -v is a flag that enables verbose mode, which prints additional statements for debugging purposes.
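
The options listed above could be parsed with argparse along these lines. This is a minimal sketch under stated assumptions, not the repository's actual parser: the build_parser name, the dest names and the help strings are hypothetical, and only the short flags, allowed values and defaults come from the usage description.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented usage line (not the project's code)."""
    parser = argparse.ArgumentParser(prog="main.py")
    # Required options: dataset and model.
    parser.add_argument("-d", dest="dataset", required=True,
                        choices=["mini-MIAS", "CBIS-DDSM"],
                        help="Dataset to use.")
    parser.add_argument("-m", dest="model", required=True,
                        choices=["basic", "advanced"],
                        help="Model to use.")
    # Optional options with the documented defaults.
    parser.add_argument("-r", dest="runmode", default="train",
                        choices=["train", "test"],
                        help="Mode to run in.")
    parser.add_argument("-i", dest="imagesize", default="small",
                        choices=["small", "large"],
                        help="small = 512x512 px, large = 2048x2048 px.")
    parser.add_argument("-v", dest="verbose", action="store_true",
                        help="Print additional debugging statements.")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list so the sketch runs without a real command line.
    args = build_parser().parse_args(["-d", "mini-MIAS", "-m", "basic"])
    print(args.dataset, args.model, args.runmode, args.imagesize, args.verbose)
```

Restricting each option with choices makes argparse reject unsupported values before any training code runs.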

Dataset usage

mini-MIAS dataset

This example uses the mini-MIAS dataset. After cloning the project, navigate to the data/mini-MIAS directory (it should contain 3 files).
Create the images_original and images_processed directories inside it:

cd data/mini-MIAS/
mkdir images_original
mkdir images_processed

Move to the images_original directory and download the raw un-processed images:

cd images_original
wget http://peipa.essex.ac.uk/pix/mias/all-mias.tar.gz

Unzip the dataset then delete all non-image files:

tar xvzf all-mias.tar.gz
rm -rf *.txt
rm -rf README
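
The same extract-then-clean step can be scripted in Python. The following is a minimal sketch that mirrors the shell commands above; the extract_and_clean function name is hypothetical and is not part of the repository.

```python
import tarfile
from pathlib import Path
from typing import List


def extract_and_clean(archive: Path, dest: Path) -> List[Path]:
    """Extract a .tar.gz archive into dest, then delete *.txt files and README."""
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest)
    # Same clean-up as `rm -rf *.txt` and `rm -rf README`.
    for extra in list(dest.glob("*.txt")) + [dest / "README"]:
        if extra.is_file():
            extra.unlink()
    # Return whatever files remain in the destination directory.
    return sorted(p for p in dest.iterdir() if p.is_file())
```

For example, extract_and_clean(Path("all-mias.tar.gz"), Path(".")) run from images_original would leave the images alongside the downloaded archive.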

Move back up one level and move to the images_processed directory. Create 3 new directories there (benign_cases, malignant_cases and normal_cases):

cd ../images_processed
mkdir benign_cases
mkdir malignant_cases
mkdir normal_cases
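
The directory layout built by the mkdir commands in this section can also be created in one go with pathlib. This is a minimal sketch; the create_layout function and the CLASS_DIRS constant are hypothetical names, and only the directory names come from the steps above.

```python
from pathlib import Path

# Class sub-directories expected under images_processed.
CLASS_DIRS = ("benign_cases", "malignant_cases", "normal_cases")


def create_layout(dataset_dir: Path) -> None:
    """Create images_original and images_processed/<class> under dataset_dir."""
    (dataset_dir / "images_original").mkdir(parents=True, exist_ok=True)
    for name in CLASS_DIRS:
        (dataset_dir / "images_processed" / name).mkdir(parents=True, exist_ok=True)


if __name__ == "__main__":
    # Run from the repository root; safe to re-run because of exist_ok=True.
    create_layout(Path("data") / "mini-MIAS")
```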

Now run the Python script that processes the dataset and makes it usable with TensorFlow and Keras:

python3 ../../../src/dataset_processing_scripts/mini-MIAS-initial-pre-processing.py

DDSM and CBIS-DDSM datasets

These datasets are very large (exceeding 160 GB) and more complex to use than the mini-MIAS dataset. Downloading and pre-processing them is therefore not covered by this README.

Our generated CSV files to use these datasets can be found in the /data/CBIS-DDSM directory, but the mammograms will have to be downloaded separately. The DDSM dataset can be downloaded here, while the CBIS-DDSM dataset can be downloaded here.

Authors

Adam Jaamour
Ashay Patel
Shuen-Jen Chen

Owner

Name: Adam Jaamour
Login: Adamouization
Kind: user
Location: United Kingdom
Company: @NewDayTechnology

Website: www.adam.jaamour.com
Twitter: Adamouization
Repositories: 43
Profile: https://github.com/Adamouization

💻 Data Scientist @NewDayTechnology 🧠 MSc AI @ Uni of St Andrews 📓 BSc Computer Science @ Uni of Bath 💼 Former SWE @ Scuderia Alpha Tauri F1 Team

Committers

Last synced: about 1 year ago

All Time

Total Commits: 104
Total Committers: 5
Avg Commits per committer: 20.8
Development Distribution Score (DDS): 0.423
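
The figures above can be reproduced from the per-committer commit counts in the Top Committers table. The sketch below assumes that the Development Distribution Score is defined as 1 minus the top committer's share of all commits; that definition is an assumption rather than something stated on this page, although it is consistent with the reported 0.423, and the function name is hypothetical.

```python
from typing import Sequence


def development_distribution_score(commit_counts: Sequence[int]) -> float:
    """Assumed definition: 1 - (commits by top committer / total commits)."""
    total = sum(commit_counts)
    if total == 0:
        # No commits in the period: report 0.0, as in the Past Year block.
        return 0.0
    return 1 - max(commit_counts) / total


if __name__ == "__main__":
    counts = [60, 25, 12, 5, 2]  # commits per committer, from the Top Committers table
    print(sum(counts))                                       # total commits
    print(sum(counts) / len(counts))                         # average commits per committer
    print(round(development_distribution_score(counts), 3))  # DDS
```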

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Adam Jaamour	a**m@j**m	60
Ashayp31	p**5@g**m	25
shuenjen	s**n@g**m	12
ap316	a**6@p**k	5
sjc29	s**9@p**k	2

Committer Domains (Top 20 + Academic)

pc5-030-l.cs.st-andrews.ac.uk: 1
pc5-028-l.cs.st-andrews.ac.uk: 1
jaamour.com: 1

Issues and Pull Requests

Last synced: about 1 year ago

All Time

Total issues: 0
Total pull requests: 19
Average time to close issues: N/A
Average time to close pull requests: less than a minute
Total issue authors: 0
Total pull request authors: 3
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 19
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

Pull Request Authors

Adamouization (12)
shuenjen (5)
Ashayp31 (1)

Top Labels

Issue Labels

Pull Request Labels

Dependencies

libraries/tf2/tensorflow2-cuda-10-1-e5bd53b3b5e6/requirements.txt pypi

tensorflow-gpu >=2.1

requirements.txt pypi

Jinja2 ==2.11.2
Keras-Applications ==1.0.8
Keras-Preprocessing ==1.1.2
Markdown ==3.2.2
MarkupSafe ==1.1.1
Pillow ==7.1.2
PyWavelets ==1.1.1
Pygments ==2.6.1
QtPy ==1.9.0
Send2Trash ==1.5.0
Werkzeug ==1.0.1
absl-py ==0.9.0
astor ==0.8.1
astunparse ==1.6.3
attrs ==19.3.0
backcall ==0.2.0
bleach ==3.1.5
cachetools ==4.1.0
certifi ==2020.6.20
chardet ==3.0.4
cycler ==0.10.0
decorator ==4.4.2
defusedxml ==0.6.0
entrypoints ==0.3
gast ==0.2.2
google-auth ==1.18.0
google-auth-oauthlib ==0.4.1
google-pasta ==0.2.0
grpcio ==1.30.0
h5py ==2.10.0
idna ==2.9
imageio ==2.8.0
importlib-metadata ==1.6.1
imutils ==0.5.3
install ==1.3.3
ipykernel ==5.3.0
ipython ==7.15.0
ipython-genutils ==0.2.0
ipywidgets ==7.5.1
jedi ==0.17.1
joblib ==0.15.1
jsonschema ==3.2.0
jupyter ==1.0.0
jupyter-client ==6.1.3
jupyter-console ==6.1.0
jupyter-core ==4.6.3
kiwisolver ==1.2.0
matplotlib ==3.2.2
mistune ==0.8.4
nbconvert ==5.6.1
nbformat ==5.0.7
networkx ==2.4
notebook ==6.0.3
numpy ==1.18.5
oauthlib ==3.1.0
opencv-python ==4.2.0.34
opt-einsum ==3.2.1
packaging ==20.4
pandas ==1.0.5
pandocfilters ==1.4.2
parso ==0.7.0
pexpect ==4.8.0
pickleshare ==0.7.5
prometheus-client ==0.8.0
prompt-toolkit ==3.0.5
protobuf ==3.12.2
ptyprocess ==0.6.0
pyasn1 ==0.4.8
pyasn1-modules ==0.2.8
pydicom ==2.0.0
pyparsing ==2.4.7
pyrsistent ==0.16.0
python-dateutil ==2.8.1
pytz ==2020.1
pyzmq ==19.0.1
qtconsole ==4.7.5
requests ==2.24.0
requests-oauthlib ==1.3.0
rsa ==4.6
scikit-image ==0.17.2
scikit-learn ==0.23.1
scipy ==1.4.1
seaborn ==0.10.1
six ==1.15.0
tb-nightly ==2.3.0a20200626
tensorboard ==2.1.1
tensorboard-plugin-wit ==1.6.0.post3
tensorflow ==2.1.0
tensorflow-estimator ==2.1.0
tensorflow-gpu ==2.2.0
tensorflow-io ==0.12.0
termcolor ==1.1.0
terminado ==0.8.3
testpath ==0.4.4
tf-estimator-nightly ==2.3.0.dev2020062601
tf-nightly ==2.5.0.dev20200626
threadpoolctl ==2.1.0
tifffile ==2020.6.3
tornado ==6.0.4
traitlets ==4.3.3
urllib3 ==1.25.9
wcwidth ==0.2.5
webencodings ==0.5.1
widgetsnbextension ==3.5.1
wrapt ==1.12.1
zipp ==3.1.0
