attention-based-models-for-hyper-kvasir

Automatic and accurate analysis of medical images is a subject of great importance in our current society. In particular, this work focuses on gastrointestinal endoscopy images, as the study of these images helps to detect possible health conditions in those regions.

https://github.com/richardesp/attention-based-models-for-hyper-kvasir

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓
CITATION.cff file
Found CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
✓
DOI references
Found 1 DOI reference(s) in README
○
Academic publication links
○
Academic email domains
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (13.2%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: richardesp
License: other
Language: Jupyter Notebook
Default Branch: main
Homepage:
Size: 77.2 MB

Statistics

Stars: 3
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 1

Created about 3 years ago · Last pushed over 1 year ago

Metadata Files


README.md

Attention-based Models for Hyper-Kvasir

Installation
Project Structure
Usage
Dataset
License
Acknowledgments

Installation

To set up the environment and install the necessary dependencies, follow these steps:

Clone the repository:

```bash
git clone https://github.com/your-username/Attention-based-models-for-Hyper-Kvasir.git
cd Attention-based-models-for-Hyper-Kvasir
```

Create a Conda environment:

```bash
conda create --name hyperkvasir_env python=3.8
conda activate hyperkvasir_env
```

Install the required packages:

```bash
conda install --file requirements.txt
```

Project Structure

```bash
.
├── LICENSE.txt
├── README.md
├── main.py
├── notebooks
│   ├── Pre-processing notebooks...
│   ├── Model training notebooks...
│   └── Visualization and analysis notebooks...
├── requirements.txt
├── scripts
│   └── download_dataset.sh
└── src
    ├── base
    │   └── base_make_dataset.py
    ├── data
    │   └── make_dataset.py
    ├── features
    │   └── build_features.py
    ├── models
    │   ├── predict_model.py
    │   └── train_model.py
    ├── utils
    │   ├── check_gpu_arm.py
    │   └── get_vprint.py
    └── visualization
        └── visualize.py
```

Usage

Any training or prediction on images can be performed by executing the provided notebooks in the notebooks directory. Navigate to the specific notebook that matches your desired operation:

For preprocessing tasks, refer to the notebooks prefixed with 1.x-rep.
For model training, especially with Vision Transformers (ViT), refer to the notebooks prefixed with 2.x-rep.
For initial runs with pre-trained Vision Transformers, use the notebook 3.0-rep-pre-trained-vit-initial-run.ipynb.
For visualization and analysis, including attention extraction, UMAP, and importance correlation plots, refer to the notebooks prefixed with 5.x-rep.
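The prefix convention above can be used to locate the notebooks for a given stage programmatically. The following is a minimal sketch, not part of the repository: the function name `find_notebooks` is hypothetical, and it assumes that file names follow the pattern `<stage>.<minor>-rep-<description>.ipynb`, as the single full file name listed above suggests.

```python
from pathlib import Path


def find_notebooks(notebooks_dir, stage):
    """Return the sorted notebooks whose name starts with '<stage>.' and contains '-rep'.

    Assumption: names look like '1.0-rep-something.ipynb'. Adjust the
    pattern if the actual file names in the repository differ.
    """
    pattern = f"{stage}.*-rep*.ipynb"
    return sorted(Path(notebooks_dir).glob(pattern))


# Example usage (hypothetical path):
# for nb in find_notebooks("notebooks", 5):
#     print(nb.name)
```

The returned paths can then be opened in Jupyter in order, which matches the stage numbering used above.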

Dataset

The dataset used in this project can be found at Hyper-Kvasir Dataset.

Dataset Details

The dataset can be split into four distinct parts:

- Labeled image data
- Unlabeled image data
- Segmented image data
- Annotated video data

Each part is further described below:

Labeled images

In total, the dataset contains 10,662 labeled images stored in JPEG format. The images can be found in the images folder. The class of each image corresponds to the folder it is stored in (e.g., the 'polyp' folder contains all polyp images, the 'barretts' folder contains all images of Barrett’s esophagus, etc.). The number of images per class is not balanced, which is a general challenge in the medical field because some findings occur more often than others. This adds an additional challenge for researchers, since methods applied to the data should also be able to learn from a small amount of training data. The labeled images represent 23 different classes of findings.
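Because the label of each image is the name of its folder, the class imbalance can be inspected directly from the directory layout. The sketch below is an illustration only: the function names, the `exclude` parameter and the inverse-frequency weighting scheme are assumptions introduced here, not something this project prescribes. The `unlabeled` folder is skipped because it sits alongside the class folders.

```python
from collections import Counter
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg"}


def count_images_per_class(images_root, exclude=("unlabeled",)):
    """Count JPEG files per class, using the parent folder name as the label."""
    root = Path(images_root)
    counts = Counter()
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Skip anything stored under an excluded folder such as 'unlabeled'.
        if any(part in exclude for part in path.relative_to(root).parts[:-1]):
            continue
        counts[path.parent.name] += 1
    return counts


def inverse_frequency_weights(counts):
    """Weight each class by total / (num_classes * count); rare classes get larger weights."""
    total = sum(counts.values())
    num_classes = len(counts)
    return {label: total / (num_classes * n) for label, n in counts.items()}
```

Weights of this kind are one common way to compensate for imbalance during training; resampling or augmentation of the rare classes are alternatives.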

Unlabeled Images

In total, the dataset contains 99,417 unlabeled images. The unlabeled images can be found in the unlabeled folder, which is a subfolder of the images folder and sits alongside the labeled image folders. In addition to the unlabeled image files, we also provide the extracted global features and cluster assignments in the Hyper-Kvasir GitHub repository as Attribute-Relation File Format (ARFF) files. ARFF files can be opened and processed using, for example, the WEKA machine learning library, or they can easily be converted into comma-separated values (CSV) files.
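As an illustration of the ARFF-to-CSV conversion mentioned above, here is a minimal sketch that uses only the Python standard library. It is deliberately simplified: it assumes a dense ARFF file with unquoted, space-free attribute names and plain comma-separated data rows, and it does not handle the sparse `{...}` row format. The function name `arff_to_csv` is hypothetical.

```python
import csv


def arff_to_csv(arff_path, csv_path):
    """Convert a simple dense ARFF file to CSV and return the column names."""
    header = []
    in_data = False
    with open(arff_path, encoding="utf-8") as src, \
            open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for raw in src:
            line = raw.strip()
            if not line or line.startswith("%"):  # blank line or ARFF comment
                continue
            if not in_data:
                keyword = line.lower()
                if keyword.startswith("@attribute"):
                    # '@attribute <name> <type>': keep only the name.
                    header.append(line.split(None, 2)[1].strip("'\""))
                elif keyword.startswith("@data"):
                    writer.writerow(header)
                    in_data = True
            else:
                # Data rows are already comma-separated values.
                writer.writerow(next(csv.reader([line], skipinitialspace=True)))
    return header
```

For files that use quoted names, nominal types with spaces or sparse rows, a full ARFF reader such as the one in WEKA is the safer choice.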

Segmented Images

We provide the original image, a segmentation mask, and a bounding box for 1,000 images from the polyp class. In the mask, the pixels depicting polyp tissue, the region of interest, are represented by the foreground (white mask), while the background (in black) does not contain polyp pixels. The bounding box is defined as the outermost pixels of the found polyp. For this segmentation set, we have two folders, one for images and one for masks, each containing 1,000 JPEG-compressed images. The bounding boxes for the corresponding images are stored in a JavaScript Object Notation (JSON) file. The image and its corresponding mask have the same filename. The images and files are stored in the segmented images folder. It is important to point out that the segmented images have duplicates in the images folder of polyps since the images were taken from there.
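The relationship between a mask and its bounding box (the outermost polyp pixels) can be sketched as follows. This is an illustration under stated assumptions, not the dataset's own tooling: the mask is represented as a nested list of grayscale values, the output key names are hypothetical and may not match the dataset's JSON file, and a threshold is applied because JPEG compression can leave mask pixels that are not exactly 0 or 255.

```python
def mask_bounding_box(mask, threshold=127):
    """Return the box spanned by the outermost foreground pixels, or None if there are none.

    `mask` is a list of rows of grayscale values (0 = background, 255 = polyp).
    Coordinates are inclusive pixel indices with the origin at the top-left.
    """
    rows = [y for y, row in enumerate(mask) if any(v > threshold for v in row)]
    if not rows:
        return None  # no foreground pixels in this mask
    width = len(mask[0])
    cols = [x for x in range(width) if any(row[x] > threshold for row in mask)]
    return {"xmin": cols[0], "ymin": rows[0], "xmax": cols[-1], "ymax": rows[-1]}


# Example usage with a tiny synthetic mask:
example = [
    [0, 0, 0, 0, 0],
    [0, 255, 0, 0, 0],
    [0, 0, 0, 250, 0],
    [0, 0, 0, 0, 0],
]
box = mask_bounding_box(example)
```

With real masks, the same logic is usually written with array operations after loading the image with a library such as OpenCV or Pillow.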

Annotated Videos

The dataset contains a total of 373 videos containing different findings and landmarks. This corresponds to approximately 11.62 hours of video and 1,059,519 video frames that can be converted to images if needed. Each video has been manually assessed by a medical professional working in the field of gastroenterology, resulting in a total of 171 annotated findings.
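As a quick sanity check on these figures, the totals imply an average of roughly 25 frames per second and a little under two minutes per video. The short calculation below uses only the numbers quoted above; the averages are derived here and are not stated by the dataset authors, so individual videos may differ.

```python
TOTAL_HOURS = 11.62        # approximate total duration quoted above
TOTAL_FRAMES = 1_059_519   # total number of video frames
NUM_VIDEOS = 373           # total number of videos

total_seconds = TOTAL_HOURS * 3600
average_fps = TOTAL_FRAMES / total_seconds
average_minutes_per_video = TOTAL_HOURS * 60 / NUM_VIDEOS

print(f"average frame rate: {average_fps:.2f} fps")
print(f"average video length: {average_minutes_per_video:.2f} min")
```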

License

The license for the Hyper-Kvasir dataset is Creative Commons Attribution 4.0 International (CC BY 4.0).

More information can be found here.

Acknowledgments

We would like to extend our sincere gratitude to:

Isabel Jiménez-Velasco
Manuel J. Marín-Jiménez
Rafael Muñoz-Salinas

for their invaluable contributions and insights that greatly benefited this project.

Citation

If you use this work, please cite it as follows:

```bibtex
@inproceedings{espantaleon2023caip,
  author    = {Ricardo Espantale{\'{o}}n{-}P{\'{e}}rez and Isabel Jim{\'{e}}nez{-}Velasco and Rafael Mu{\~{n}}oz{-}Salinas and Manuel J. Mar{\'{\i}}n{-}Jim{\'{e}}nez},
  title     = {Empirical Study of Attention-Based Models for Automatic Classification of Gastrointestinal Endoscopy Images},
  booktitle = {Computer Analysis of Images and Patterns - 20th International Conference, {CAIP}},
  series    = {Lecture Notes in Computer Science},
  volume    = {14185},
  pages     = {98--108},
  year      = {2023},
  doi       = {10.1007/978-3-031-44240-7\_10}
}
```

Owner

Login: richardesp
Kind: user

Repositories: 17
Profile: https://github.com/richardesp

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite the following work:"
authors:
  - family-names: Espantaleón-Pérez
    given-names: Ricardo
  - family-names: Jiménez-Velasco
    given-names: Isabel
  - family-names: Muñoz-Salinas
    given-names: Rafael
  - family-names: Marín-Jiménez
    given-names: Manuel J.
title: "Empirical Study of Attention-Based Models for Automatic Classification of Gastrointestinal Endoscopy Images"
type: article
year: 2023
doi: "10.1007/978-3-031-44240-7_10"
conference:
  name: "Computer Analysis of Images and Patterns - 20th International Conference, CAIP"
  series: "Lecture Notes in Computer Science"
  volume: "14185"
pages: "98-108"

GitHub Events

Total

Release event: 1
Watch event: 2
Push event: 2
Create event: 1

Last Year

Release event: 1
Watch event: 2
Push event: 2
Create event: 1

Dependencies

requirements.txt pypi

absl-py =1.4.0=pypi_0
anyio =3.6.2=pyhd8ed1ab_0
appdirs =1.4.4=pyh9f0ad1d_0
appnope =0.1.3=pyhd8ed1ab_0
argon2-cffi =21.3.0=pyhd8ed1ab_0
argon2-cffi-bindings =21.2.0=py310h1a28f6b_0
asttokens =2.2.1=pyhd8ed1ab_0
astunparse =1.6.3=pypi_0
attrs =22.2.0=pyh71513ae_0
babel =2.11.0=py310hca03da5_0
backcall =0.2.0=pyh9f0ad1d_0
backports =1.0=pyhd8ed1ab_3
backports.functools_lru_cache =1.6.4=pyhd8ed1ab_0
bayesian-optimization =1.4.2=pypi_0
beautifulsoup4 =4.11.2=pyha770c72_0
blas =1.0=openblas
bleach =6.0.0=pyhd8ed1ab_0
boto3 =1.26.64=pyhd8ed1ab_0
botocore =1.29.64=pyhd8ed1ab_0
bottleneck =1.3.5=py310h96f19d2_0
brotli =1.0.9=h1a8c8d9_8
brotli-bin =1.0.9=h1a8c8d9_8
brotlipy =0.7.0=py310h1a28f6b_1002
bzip2 =1.0.8=h3422bc3_4
c-ares =1.18.1=h3422bc3_0
ca-certificates =2023.01.10=hca03da5_0
cached-property =1.5.2=hd8ed1ab_1
cached_property =1.5.2=pyha770c72_1
cachetools =5.3.0=pypi_0
cairo =1.16.0=h302bd0f_3
certifi =2022.12.7=py310hca03da5_0
cffi =1.15.1=py310h80987f9_3
charset-normalizer =2.1.1=pyhd8ed1ab_0
click =8.1.3=unix_pyhd8ed1ab_2
cloudpickle =2.2.1=pypi_0
colorama =0.4.6=pyhd8ed1ab_0
comm =0.1.2=pyhd8ed1ab_0
cryptography =38.0.4=py310h834c97f_0
cycler =0.11.0=pyhd8ed1ab_0
debugpy =1.5.1=py310hc377ac9_0
decorator =5.1.1=pyhd8ed1ab_0
defusedxml =0.7.1=pyhd8ed1ab_0
eigen =3.3.7=h525c30c_1
entrypoints =0.4=pyhd8ed1ab_0
executing =1.2.0=pyhd8ed1ab_0
expat =2.4.9=hc377ac9_0
fftw =3.3.9=h1a28f6b_1
flask =2.2.2=pyhd8ed1ab_0
flatbuffers =23.1.21=pypi_0
flit-core =3.8.0=pyhd8ed1ab_0
fontconfig =2.14.1=h6b8db82_1
fonttools =4.25.0=pyhd3eb1b0_0
freetype =2.12.1=h1192e45_0
gast =0.4.0=pypi_0
gettext =0.21.0=h826f4ad_0
giflib =5.2.1=h80987f9_1
glib =2.69.1=h514c7bf_2
google-auth =2.16.0=pypi_0
google-auth-oauthlib =0.4.6=pypi_0
google-pasta =0.2.0=pypi_0
graphite2 =1.3.14=hc377ac9_1
grpcio =1.51.1=pypi_0
gst-plugins-base =1.14.1=hf0a386a_0
gstreamer =1.14.1=he09cfb7_0
gym =0.26.2=pypi_0
gym-notices =0.0.8=pypi_0
h5py =3.7.0=py310h181c318_0
harfbuzz =4.3.0=hb1b0ec1_0
hdf5 =1.12.1=h160e8cb_2
icu =68.1=hc377ac9_0
idna =3.4=pyhd8ed1ab_0
importlib-metadata =6.0.0=pyha770c72_0
importlib_metadata =6.0.0=hd8ed1ab_0
importlib_resources =5.10.2=pyhd8ed1ab_0
ipykernel =6.19.2=py310h33ce5c2_0
ipython =8.9.0=pyhd1c38e8_0
ipython_genutils =0.2.0=py_1
ipywidgets =8.0.4=pyhd8ed1ab_0
itsdangerous =2.1.2=pyhd8ed1ab_0
jedi =0.18.2=pyhd8ed1ab_0
jinja2 =3.1.2=pyhd8ed1ab_1
jmespath =1.0.1=pyhd8ed1ab_0
joblib =1.2.0=pyhd8ed1ab_0
jpeg =9e=he4db4b2_2
json5 =0.9.6=pyhd3eb1b0_0
jsonschema =4.17.3=pyhd8ed1ab_0
jupyter =1.0.0=py310hca03da5_8
jupyter_client =8.0.2=pyhd8ed1ab_0
jupyter_console =6.4.4=pyhd8ed1ab_0
jupyter_core =5.1.1=py310hca03da5_0
jupyter_events =0.6.3=pyhd8ed1ab_0
jupyter_server =1.23.4=py310hca03da5_0
jupyter_server_terminals =0.4.4=pyhd8ed1ab_1
jupyterlab =3.5.3=py310hca03da5_0
jupyterlab_pygments =0.2.2=pyhd8ed1ab_0
jupyterlab_server =2.16.5=py310hca03da5_0
jupyterlab_widgets =3.0.5=pyhd8ed1ab_0
kaggle =1.5.12=pypi_0
keras =2.11.0=pypi_0
keras-preprocessing =1.1.2=pyhd3eb1b0_0
kiwisolver =1.4.4=py310h313beb8_0
krb5 =1.19.4=h8380606_0
lcms2 =2.14=h481adae_1
lerc =3.0=hc377ac9_0
libaec =1.0.6=hb7217d7_1
libblas =3.9.0=16_osxarm64_openblas
libbrotlicommon =1.0.9=h1a8c8d9_8
libbrotlidec =1.0.9=h1a8c8d9_8
libbrotlienc =1.0.9=h1a8c8d9_8
libcblas =3.9.0=16_osxarm64_openblas
libclang =15.0.6.1=pypi_0
libcurl =7.87.0=h0f1d93c_0
libcxx =14.0.6=h2692d47_0
libdeflate =1.8=h1a28f6b_5
libedit =3.1.20221030=h80987f9_0
libev =4.33=h642e427_1
libffi =3.4.2=h3422bc3_5
libgfortran =5.0.0=11_3_0_hd922786_27
libgfortran5 =11.3.0=hdaf2cc0_27
libiconv =1.17=he4db4b2_0
liblapack =3.9.0=16_osxarm64_openblas
libllvm12 =12.0.0=h12f7ac0_4
libnghttp2 =1.46.0=h95c9599_0
libopenblas =0.3.21=openmp_hc731615_3
libpng =1.6.37=hb8d0fd4_0
libpq =12.9=h65cfe13_3
libsodium =1.0.18=h27ca646_1
libssh2 =1.10.0=hf27765b_0
libtiff =4.5.0=h313beb8_1
libwebp =1.2.4=h68602c7_0
libwebp-base =1.2.4=h57fd34a_0
libxcb =1.13=h9b22ae9_1004
libxml2 =2.9.14=h8c5e841_0
libxslt =1.1.35=h9833966_0
llvm-openmp =15.0.7=h7cfbb63_0
lxml =4.9.1=py310h2fae87d_0
lz4-c =1.9.4=h313beb8_0
markdown =3.4.1=pypi_0
markupsafe =2.1.1=py310h1a28f6b_0
matplotlib =3.5.2=py310hca03da5_0
matplotlib-base =3.5.2=py310hc377ac9_0
matplotlib-inline =0.1.6=pyhd8ed1ab_0
mistune =2.0.4=pyhd8ed1ab_0
munkres =1.1.4=pyh9f0ad1d_0
nbclassic =0.5.1=pyhd8ed1ab_0
nbclient =0.7.2=pyhd8ed1ab_0
nbconvert =7.2.9=pyhd8ed1ab_0
nbconvert-core =7.2.9=pyhd8ed1ab_0
nbconvert-pandoc =7.2.9=pyhd8ed1ab_0
nbformat =5.7.3=pyhd8ed1ab_0
ncurses =6.3=h07bb92c_1
nest-asyncio =1.5.6=pyhd8ed1ab_0
notebook =6.5.2=pyha770c72_1
notebook-shim =0.2.2=pyhd8ed1ab_0
nspr =4.33=hc377ac9_0
nss =3.74=h142855e_0
numexpr =2.8.4=py310hecc3335_0
numpy =1.23.5=py310hb93e574_0
numpy-base =1.23.5=py310haf87e8b_0
oauthlib =3.2.2=pypi_0
opencv =4.6.0=py310he2359d5_2
openssl =1.1.1s=h1a28f6b_0
opt-einsum =3.3.0=pypi_0
packaging =23.0=pyhd8ed1ab_0
pandas =1.5.2=py310h46d7db6_0
pandas-datareader =0.10.0=pyh6c4a22f_0
pandoc =2.19.2=hce30654_1
pandocfilters =1.5.0=pyhd8ed1ab_0
parso =0.8.3=pyhd8ed1ab_0
pcre =8.45=hc377ac9_0
pexpect =4.8.0=pyh1a96a4e_2
pickleshare =0.7.5=py_1003
pillow =9.3.0=py310h313beb8_2
pip =23.0=pyhd8ed1ab_0
pixman =0.40.0=h1a28f6b_0
pkgutil-resolve-name =1.3.10=pyhd8ed1ab_0
platformdirs =2.6.2=pyhd8ed1ab_0
ply =3.11=py310hca03da5_0
pooch =1.6.0=pyhd8ed1ab_0
prometheus_client =0.16.0=pyhd8ed1ab_0
prompt-toolkit =3.0.36=pyha770c72_0
prompt_toolkit =3.0.36=hd8ed1ab_0
protobuf =3.19.6=pypi_0
psutil =5.9.0=py310h1a28f6b_0
pthread-stubs =0.4=h27ca646_1001
ptyprocess =0.7.0=pyhd3deb0d_0
pure_eval =0.2.2=pyhd8ed1ab_0
pyasn1 =0.4.8=pypi_0
pyasn1-modules =0.2.8=pypi_0
pycparser =2.21=pyhd8ed1ab_0
pygments =2.14.0=pyhd8ed1ab_0
pyopenssl =23.0.0=pyhd8ed1ab_0
pyparsing =3.0.9=pyhd8ed1ab_0
pyqt =5.15.7=py310hc377ac9_0
pyqt5-sip =12.11.0=pypi_0
pyrsistent =0.18.0=py310h1a28f6b_0
pysocks =1.7.1=pyha2e5f31_6
python =3.10.9=hc0d8a6c_0
python-dateutil =2.8.2=pyhd8ed1ab_0
python-fastjsonschema =2.16.2=pyhd8ed1ab_0
python-json-logger =2.0.4=pyhd8ed1ab_0
python-slugify =8.0.0=pypi_0
pytz =2022.7.1=pyhd8ed1ab_0
pyyaml =6.0=py310h80987f9_1
pyzmq =23.2.0=py310hc377ac9_0
qt-main =5.15.2=ha2d02b5_7
qt-webengine =5.15.9=h2903aaf_4
qtconsole =5.4.0=py310hca03da5_0
qtpy =2.2.0=py310hca03da5_0
qtwebkit =5.212=h0f11f3c_4
readline =8.1.2=h46ed386_0
requests =2.28.2=pyhd8ed1ab_0
requests-oauthlib =1.3.1=pypi_0
rfc3339-validator =0.1.4=pyhd8ed1ab_0
rfc3986-validator =0.1.1=pyh9f0ad1d_0
rsa =4.9=pypi_0
s3transfer =0.6.0=pyhd8ed1ab_0
scikit-learn =1.2.0=py310h313beb8_1
scipy =1.9.3=py310h20cbe94_0
send2trash =1.8.0=pyhd8ed1ab_0
setuptools =67.1.0=pyhd8ed1ab_0
sip =6.6.2=py310hc377ac9_0
six =1.16.0=pyh6c4a22f_0
sniffio =1.3.0=pyhd8ed1ab_0
soupsieve =2.3.2.post1=pyhd8ed1ab_0
sqlite =3.40.1=h7a7dc30_0
stack_data =0.6.2=pyhd8ed1ab_0
tensorboard =2.11.2=pypi_0
tensorboard-data-server =0.6.1=pypi_0
tensorboard-plugin-wit =1.8.1=pypi_0
tensorflow-addons =0.19.0=pypi_0
tensorflow-estimator =2.11.0=pypi_0
tensorflow-macos =2.11.0=pypi_0
tensorflow-metal =0.7.0=pypi_0
termcolor =2.2.0=pypi_0
terminado =0.17.1=pyhd1c38e8_0
text-unidecode =1.3=pypi_0
threadpoolctl =3.1.0=pyh8a188c0_0
tinycss2 =1.2.1=pyhd8ed1ab_0
tk =8.6.12=hb8d0fd4_0
toml =0.10.2=pyhd3eb1b0_0
tomli =2.0.1=py310hca03da5_0
tornado =6.2=py310h1a28f6b_0
tqdm =4.64.1=pyhd8ed1ab_0
traitlets =5.9.0=pyhd8ed1ab_0
typeguard =2.13.3=pypi_0
typing-extensions =4.4.0=hd8ed1ab_0
typing_extensions =4.4.0=pyha770c72_0
tzdata =2022g=h191b570_0
urllib3 =1.26.14=pyhd8ed1ab_0
wcwidth =0.2.6=pyhd8ed1ab_0
webencodings =0.5.1=py_1
websocket-client =1.5.1=pyhd8ed1ab_0
werkzeug =2.2.2=pyhd8ed1ab_0
wheel =0.38.4=pyhd8ed1ab_0
widgetsnbextension =4.0.5=pyhd8ed1ab_0
wrapt =1.14.1=pypi_0
xorg-libxau =1.0.9=h27ca646_0
xorg-libxdmcp =1.1.3=h27ca646_0
xz =5.2.10=h80987f9_1
yaml =0.2.5=h3422bc3_2
zeromq =4.3.4=hbdafb3b_1
zipp =3.12.1=pyhd8ed1ab_0
zlib =1.2.13=h5a0b063_0
zstd =1.5.2=h8574219_0
