gcvit

Tensorflow 2.0 Implementation of GCViT: Global Context Vision Transformer

https://github.com/awsaf49/gcvit-tf

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.8%) to scientific vocabulary

Keywords

attention cnn computer-vision image-classification image-recognition imagenet self-attention transformer

Keywords from Contributors

hack
Last synced: 6 months ago

Repository

Tensorflow 2.0 Implementation of GCViT: Global Context Vision Transformer

Basic Info
  • Host: GitHub
  • Owner: awsaf49
  • License: mit
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage:
  • Size: 27.6 MB
Statistics
  • Stars: 26
  • Watchers: 5
  • Forks: 6
  • Open Issues: 1
  • Releases: 7
Topics
attention cnn computer-vision image-classification image-recognition imagenet self-attention transformer
Created over 3 years ago · Last pushed about 2 years ago
Metadata Files
Readme License Citation

README.md

GCViT: Global Context Vision Transformer

Tensorflow 2.0 Implementation of GCViT

This library implements GCViT in TensorFlow 2.0. The model is written by subclassing tf.keras.Model, which gives the code a PyTorch-like flavor.

Update

Paper Implementation & Explanation

I have explained the GCViT paper in a Kaggle notebook, "GCViT: Global Context Vision Transformer", which also implements the model from scratch. The notebook explains each part of the model in detail, with intuition.

Do check it out, especially if you want to learn more about GCViT or implement it yourself. This notebook won the Kaggle ML Research Award 2022.

Model

  • Architecture:

  • Local Vs Global Attention:

Result

The official codebase had an issue that was fixed on 12 August 2022. Here are the results of the ported weights on the ImageNetV2-Test data:

| Model        | Acc@1 | Acc@5 | #Params |
|--------------|-------|-------|---------|
| GCViT-XXTiny | 0.663 | 0.873 | 12M     |
| GCViT-XTiny  | 0.685 | 0.885 | 20M     |
| GCViT-Tiny   | 0.708 | 0.899 | 28M     |
| GCViT-Small  | 0.720 | 0.901 | 51M     |
| GCViT-Base   | 0.731 | 0.907 | 90M     |
| GCViT-Large  | 0.734 | 0.913 | 202M    |
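To make the accuracy/size trade-off in the table easier to compare, the figures can be placed in a small Python structure. This is only an illustrative sketch: the `RESULTS` dictionary and the `top1_gain` helper are hypothetical names, not part of gcvit, and the numbers are copied from the table above.

```python
# (Acc@1, Acc@5, parameters in millions), copied from the results table.
RESULTS = {
    "GCViT-XXTiny": (0.663, 0.873, 12),
    "GCViT-XTiny": (0.685, 0.885, 20),
    "GCViT-Tiny": (0.708, 0.899, 28),
    "GCViT-Small": (0.720, 0.901, 51),
    "GCViT-Base": (0.731, 0.907, 90),
    "GCViT-Large": (0.734, 0.913, 202),
}


def top1_gain(small, large):
    """Return (absolute Acc@1 gain, parameter ratio) between two variants."""
    acc_small, _, params_small = RESULTS[small]
    acc_large, _, params_large = RESULTS[large]
    return round(acc_large - acc_small, 3), round(params_large / params_small, 1)


gain, ratio = top1_gain("GCViT-XXTiny", "GCViT-Large")
print(f"Acc@1 gain: {gain}, parameter ratio: {ratio}x")
```

On these numbers, moving from the smallest to the largest variant adds about 7 points of Acc@1 for roughly 17 times the parameters.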

Installation

```bash
pip install -U gcvit
# or
pip install -U git+https://github.com/awsaf49/gcvit-tf
```

Usage

Load the model with the following code:

```py
from gcvit import GCViTTiny
model = GCViTTiny(pretrain=True)
```

For any input size other than 224x224:

```py
from gcvit import GCViTTiny
model = GCViTTiny(input_shape=(512,512,3), pretrain=True, resize_query=True)
```

Simple code to check the model's prediction:

```py
import tensorflow as tf
from skimage.data import chelsea
img = tf.keras.applications.imagenet_utils.preprocess_input(chelsea(), mode='torch') # Chelsea the cat
img = tf.image.resize(img, (224, 224))[None,] # resize & create batch
pred = model(img).numpy()
print(tf.keras.applications.imagenet_utils.decode_predictions(pred)[0])
```

Prediction:

```py
[('n02124075', 'Egyptian_cat', 0.9194835),
 ('n02123045', 'tabby', 0.009686623),
 ('n02123159', 'tiger_cat', 0.0061576385),
 ('n02127052', 'lynx', 0.0011503297),
 ('n02883205', 'bow_tie', 0.00042479983)]
```

For feature extraction:

```py
model = GCViTTiny(pretrain=True)  # when pretrain=True, num_classes must be 1000
model.reset_classifier(num_classes=0, head_act=None)
feature = model(img)
print(feature.shape)
```

Feature:

```py
(None, 512)
```

For the feature map:

```py
model = GCViTTiny(pretrain=True)  # when pretrain=True, num_classes must be 1000
feature = model.forward_features(img)
print(feature.shape)
```

Feature map:

```py
(None, 7, 7, 512)
```
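The `mode='torch'` argument in the prediction example selects the preprocessing convention used by torchvision-style ImageNet models: pixel values are scaled to [0, 1] and then standardized per channel with the ImageNet mean and standard deviation. As a rough pure-Python sketch of that arithmetic for a single RGB pixel (the function name is hypothetical; in practice `preprocess_input` applies the same steps to whole arrays):

```python
# ImageNet channel statistics applied by Keras' preprocess_input(mode='torch').
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel_torch_mode(rgb):
    """Scale a 0-255 RGB pixel to [0, 1], then standardize each channel."""
    return tuple(
        (value / 255.0 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


# A white pixel maps to positive values, a black pixel to negative ones.
print(normalize_pixel_torch_mode((255, 255, 255)))
print(normalize_pixel_torch_mode((0, 0, 0)))
```

Feeding the model raw 0-255 pixels, or pixels preprocessed with a different mode, would not match the statistics the ported weights expect, which is why the example calls `preprocess_input` before resizing.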

Kaggle Models

These pre-trained models can also be loaded using Kaggle Models. Setting from_kaggle=True makes the model load its weights from Kaggle Models instead of downloading them, so it can be used in Kaggle without internet access.

```py
from gcvit import GCViTTiny
model = GCViTTiny(pretrain=True, from_kaggle=True)
```
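When the same script runs both inside and outside Kaggle, the `from_kaggle` flag could be chosen at runtime. The sketch below is an assumption-laden illustration, not part of gcvit: the helper name is hypothetical, and it assumes Kaggle notebooks set the `KAGGLE_KERNEL_RUN_TYPE` environment variable, which should be verified in your own environment.

```python
import os


def gcvit_load_kwargs(environ=None):
    """Build keyword arguments for a GCViT constructor (hypothetical helper).

    Assumption: Kaggle notebooks define KAGGLE_KERNEL_RUN_TYPE. If that does
    not hold in your environment, set from_kaggle explicitly instead.
    """
    environ = os.environ if environ is None else environ
    on_kaggle = "KAGGLE_KERNEL_RUN_TYPE" in environ
    return {"pretrain": True, "from_kaggle": on_kaggle}


# Usage (requires gcvit to be installed):
# from gcvit import GCViTTiny
# model = GCViTTiny(**gcvit_load_kwargs())
print(gcvit_load_kwargs({"KAGGLE_KERNEL_RUN_TYPE": "Interactive"}))
```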

Live-Demo

  • A live demo of image classification and Grad-CAM with ImageNet weights is available, powered by Space and Gradio. Here's an example:

Example

For a working training example, check out the notebooks on Google Colab and Kaggle.

Here is the Grad-CAM result after training on the Flower Classification Dataset:

To Do

  • [ ] Convert it to multi-backend Keras 3.0
  • [ ] Segmentation Pipeline
  • [x] Support for Kaggle Models
  • [x] Remove tensorflow_addons
  • [x] New updated weights have been added.
  • [x] Working training example in Colab & Kaggle.
  • [x] GradCAM showcase.
  • [x] Gradio Demo.
  • [x] Build model with tf.keras.Model.
  • [x] Port weights from official repo.
  • [x] Support for TPU.

Acknowledgement

Citation

```bibtex
@article{hatamizadeh2022global,
  title={Global Context Vision Transformers},
  author={Hatamizadeh, Ali and Yin, Hongxu and Kautz, Jan and Molchanov, Pavlo},
  journal={arXiv preprint arXiv:2206.09959},
  year={2022}
}
```

Owner

  • Name: Awsaf
  • Login: awsaf49
  • Kind: user
  • Location: Dhaka, Bangladesh
  • Company: @wandb

Kaggle Grandmaster @Kaggle | Dev Expert @wandb | Student, Dept of EEE, BUET

GitHub Events

Total
  • Watch event: 2
  • Fork event: 1
Last Year
  • Watch event: 2
  • Fork event: 1

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 212
  • Total Committers: 4
  • Avg Commits per committer: 53.0
  • Development Distribution Score (DDS): 0.175
Past Year
  • Commits: 14
  • Committers: 1
  • Avg Commits per committer: 14.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Awsaf a****9@g****m 175
André Pedersen a****4@g****m 33
ImgBotApp I****p@g****m 3
Paul Mooney m****p@g****m 1
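
This page does not define the Development Distribution Score. The figures above are consistent with one common definition, 1 minus the share of commits made by the top committer, and the short sketch below reproduces them under that assumption (the function name is hypothetical):

```python
def development_distribution_score(commits_per_committer):
    """Assumed definition: 1 - (top committer's commits / total commits)."""
    total = sum(commits_per_committer)
    if total == 0:
        return 0.0
    return 1 - max(commits_per_committer) / total


all_time = [175, 33, 3, 1]  # commit counts from the Top Committers list
past_year = [14]            # one committer, 14 commits

print(sum(all_time) / len(all_time))                         # average commits per committer
print(round(development_distribution_score(all_time), 3))    # all-time DDS
print(round(development_distribution_score(past_year), 3))   # past-year DDS
```

Under this definition, a score near 0 means one person wrote almost all commits, which matches the past-year value of 0.0 with a single committer.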
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 7
  • Total pull requests: 14
  • Average time to close issues: 11 days
  • Average time to close pull requests: 5 days
  • Total issue authors: 4
  • Total pull request authors: 5
  • Average comments per issue: 3.43
  • Average comments per pull request: 1.5
  • Merged pull requests: 11
  • Bot issues: 0
  • Bot pull requests: 4
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • andreped (4)
  • fakyras (1)
  • awsaf49 (1)
  • pablojrios (1)
Pull Request Authors
  • andreped (7)
  • imgbot[bot] (3)
  • dependabot[bot] (1)
  • awsaf49 (1)
  • paultimothymooney (1)
Top Labels
Issue Labels
enhancement (2) dependencies (1) question (1) bug (1)
Pull Request Labels
enhancement (4) documentation (1) bug (1) dependencies (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 71 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 17
  • Total maintainers: 1
pypi.org: gcvit

Tensorflow 2.0 Implementation of GCViT: Global Context Vision Transformer. https://github.com/awsaf49/gcvit-tf

  • Versions: 17
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 71 Last month
Rankings
Dependent packages count: 6.6%
Stargazers count: 14.9%
Forks count: 15.7%
Average: 17.0%
Dependent repos count: 30.6%
Maintainers (1)
Last synced: 6 months ago

Dependencies

requirements.txt pypi
  • gradio ==3.1.0
  • matplotlib *
  • numpy *
  • tensorflow ==2.4.1
  • tensorflow_addons ==0.14.0
setup.py pypi
  • for *
.github/workflows/publish_to_pypi.yml actions
  • actions/checkout master composite
  • actions/setup-python v1 composite
  • pypa/gh-action-pypi-publish master composite