swin-transformer

Shifted Windows Vision Transformer

https://github.com/abhik-biswas/swin-transformer

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.3%) to scientific vocabulary
Last synced: 6 months ago

Repository

Shifted Windows Vision Transformer

Basic Info
  • Host: GitHub
  • Owner: Abhik-Biswas
  • License: apache-2.0
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 5.93 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 2 years ago · Last pushed almost 2 years ago
Metadata Files
Readme License Citation

README.md


A review of SWIN Transformers

SWIN Transformer Model

This README provides a concise overview of the SWIN Transformer model, instructions on how to create and compile a model using TensorFlow, and access to weights obtained during experiments.

Creating and Compiling the SWIN Transformer Model

To create and compile a SWIN Transformer model with TensorFlow, use the code snippet below. If you are training with a distribution strategy, make sure the model is built inside `strategy.scope()`. Before running the code, install the required libraries (including TensorFlow) and place the `swin_transformer.py` file in the same directory as the notebook.

```python
import tensorflow as tf
from swintransformer import SwinTransformer
from tensorflow.keras.applications.imagenet_utils import preprocess_input

with strategy.scope():
    # Normalize raw pixels the way the pre-trained (PyTorch-derived) weights expect
    img_adjust_layer = tf.keras.layers.Lambda(
        lambda data: preprocess_input(tf.cast(data, tf.float32), mode="torch"),
        input_shape=[*IMAGE_SIZE, 3])
    pretrained_model = SwinTransformer('swin_large_224', num_classes=len(CLASSES),
                                       include_top=False, pretrained=True,
                                       use_tpu=False)

    model = tf.keras.Sequential([
        img_adjust_layer,
        pretrained_model,
        tf.keras.layers.Dense(len(CLASSES), activation='softmax')
    ])

    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5, epsilon=1e-8),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )
model.summary()
```
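The snippet above assumes that `strategy`, `IMAGE_SIZE`, and `CLASSES` are already defined (the notebook sets these up). A minimal sketch of that setup might look like the following; the class list here is purely illustrative:

```python
import tensorflow as tf

# Use a TPU if one is attached (e.g. on Kaggle); otherwise fall back to the
# default (single-device) strategy.
try:
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
    tf.config.experimental_connect_to_cluster(tpu)
    tf.tpu.experimental.initialize_tpu_system(tpu)
    strategy = tf.distribute.TPUStrategy(tpu)
except ValueError:
    strategy = tf.distribute.get_strategy()

IMAGE_SIZE = [224, 224]  # resolution expected by the swin_large_224 weights
# Illustrative label set; the flower dataset defines the real CLASSES list.
CLASSES = ['daisy', 'dandelion', 'rose', 'sunflower', 'tulip']
```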

Access to Model Weights

The weights obtained from the experiments can be accessed through the following Google Drive link: SWIN Transformer Pre-trained Weights. Download the weights as needed and use them to initialize your SWIN Transformer model for specific tasks or further experimentation. Ensure compatibility with the model architecture to achieve optimal performance.
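The Drive link and weight filenames are not reproduced here. As a hedged sketch of the save/load round trip, a model with the same architecture can reload saved weights via Keras' `save_weights`/`load_weights`; a small stand-in model and a hypothetical path are used below in place of the real Swin model and download location:

```python
import tensorflow as tf

# Stand-in model to demonstrate the round trip; the project would use the
# compiled Swin Transformer model from the earlier snippet instead.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation='softmax'),
])

WEIGHTS_PATH = '/tmp/swin_demo.weights.h5'  # hypothetical download location
model.save_weights(WEIGHTS_PATH)

# A freshly built model with the same architecture can reload the weights;
# the architectures must match for loading to succeed.
clone = tf.keras.models.clone_model(model)
clone.load_weights(WEIGHTS_PATH)
```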

Access the Dataset

Data that gave the best results

```python
from kaggle_datasets import KaggleDatasets

GCS_DS_PATH = KaggleDatasets().get_gcs_path("flower-classification")
IMAGE_SIZE = [224, 224]
EPOCHS = 15
BATCH_SIZE = 16 * strategy.num_replicas_in_sync

GCS_PATH_SELECT = {  # available image sizes
    192: GCS_DS_PATH + '/tfrecords-jpeg-192x192',
    224: GCS_DS_PATH + '/tfrecords-jpeg-224x224',
    331: GCS_DS_PATH + '/tfrecords-jpeg-331x331',
    512: GCS_DS_PATH + '/tfrecords-jpeg-512x512'
}
GCS_PATH = GCS_PATH_SELECT[IMAGE_SIZE[0]]

TRAINING_FILENAMES = tf.io.gfile.glob(GCS_PATH + '/train/*.tfrec')
VALIDATION_FILENAMES = tf.io.gfile.glob(GCS_PATH + '/val/*.tfrec')
```

Other datasets

This snippet only fetches the dataset; the remaining functions and data pre-processing steps are defined in the notebook. Below is a snapshot of the data used in this project, which gave the best results.
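The TFRecord parsing itself lives in the notebook. A sketch of that step, under the assumed feature names `'image'` (an encoded JPEG) and `'class'` (an integer label) commonly used by the Kaggle flower TFRecords, could look like:

```python
import tensorflow as tf

IMAGE_SIZE = [224, 224]

def parse_example(serialized):
    # Assumed schema: a JPEG byte string plus an int64 class id.
    features = {
        'image': tf.io.FixedLenFeature([], tf.string),
        'class': tf.io.FixedLenFeature([], tf.int64),
    }
    example = tf.io.parse_single_example(serialized, features)
    image = tf.image.decode_jpeg(example['image'], channels=3)
    image = tf.reshape(image, [*IMAGE_SIZE, 3])
    label = tf.cast(example['class'], tf.int32)
    return image, label

def make_dataset(filenames, batch_size):
    # Read, decode, batch, and prefetch the records.
    return (tf.data.TFRecordDataset(filenames, num_parallel_reads=tf.data.AUTOTUNE)
            .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
            .batch(batch_size)
            .prefetch(tf.data.AUTOTUNE))
```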

[Figure: sample images from the Flower dataset]

Other datasets we experimented with are the Tiny-ImageNet dataset and CIFAR-100. Since we were loading pre-trained weights, we had to upscale these images to 224 × 224, which led to poor performance.
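The upscaling mentioned above can be sketched with `tf.image.resize`; this is a minimal illustration under assumed shapes, not the notebook's exact pipeline:

```python
import tensorflow as tf

# CIFAR-100 images are 32x32 (Tiny-ImageNet: 64x64), while the pre-trained
# Swin weights expect 224x224, so each image must be upsampled substantially —
# one likely source of the poor performance noted above.
def upscale(image, label):
    image = tf.image.resize(tf.cast(image, tf.float32), [224, 224],
                            method='bilinear')
    return image, label

# Stand-in for one CIFAR-100 image.
small = tf.zeros([32, 32, 3], dtype=tf.uint8)
big, _ = upscale(small, 0)
```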

Credits

  • Project inspiration: This project was inspired by the paper Swin Transformer: Hierarchical Vision Transformer using Shifted Windows.
  • Code implementation: The implementation used in this project is credited to this implementation. We followed the exact structure of that repository so the pre-trained weights would load correctly and could serve as a starting point for training on the "Flower" dataset. This adaptation ensured compatibility with the pre-existing weights and helped us reach training convergence on the limited resources available to us.

File Directory Structure

```
./
|-- figures
|   |-- aaa3gcpworkexamplescreenshot1.png
|   |-- aaa3gcpworkexamplescreenshot2.png
|   |-- aaa3gcpworkexamplescreenshot3.png
|-- swin_transformer.py
|-- swinnotebook.ipynb
|-- README.md
|-- E4040.2023Fall.aaa3.report.ar4634.ab5640.ap4478.pdf
|-- TinyImagenet.ipynb
```

Owner

  • Name: Abhik Biswas
  • Login: Abhik-Biswas
  • Kind: user

Data Science@IITM | Mathematics @SXCCAL | ML | DL | AI

Citation (citation.cff)

cff-version: 1.2.0
message: "If you use this software in your work, please cite it using the following metadata."
title: "A review of the Shifted Windows (SWIN) Vision Transformer"
version: 1.0.0
date-released: 2024-04-02
authors:
  - family-names: "Biswas"
    given-names: "Abhik"
  - family-names: "Raghavan"
    given-names: "Aarya"
  - family-names: "Praveen Kumar"
    given-names: "Abhilash"
