swin-transformer
Shifted Windows Vision Transformer
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file — Found CITATION.cff file
- ✓ codemeta.json file — Found codemeta.json file
- ✓ .zenodo.json file — Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links — Links to: arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity — Low similarity (10.3%) to scientific vocabulary
Repository
Shifted Windows Vision Transformer
Basic Info
- Host: GitHub
- Owner: Abhik-Biswas
- License: apache-2.0
- Language: Jupyter Notebook
- Default Branch: main
- Size: 5.93 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
A review of SWIN Transformers
SWIN Transformer Model
This README provides a concise overview of the SWIN Transformer model, instructions on how to create and compile a model using TensorFlow, and access to weights obtained during experiments.
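For readers new to the architecture, Swin's central idea is to compute self-attention only within local windows, and to alternate regular windows with cyclically shifted ones so that information can flow across window boundaries. The following NumPy sketch is illustrative only (it is not part of this repository's code) and shows just the partitioning and shifting, not the attention itself:

```python
import numpy as np

def window_partition(x, win):
    """Split an (H, W, C) feature map into non-overlapping (win, win, C) windows."""
    H, W, C = x.shape
    x = x.reshape(H // win, win, W // win, win, C)
    # Reorder so each window's pixels are contiguous: -> (num_windows, win, win, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, win, win, C)

# Toy feature map: 8x8 spatial grid, 1 channel, each pixel uniquely numbered
feat = np.arange(64, dtype=np.float32).reshape(8, 8, 1)

# Regular windows: attention runs independently inside each 4x4 window
windows = window_partition(feat, win=4)
print(windows.shape)  # (4, 4, 4, 1)

# Shifted windows: cyclically shift by win//2 before partitioning, so the
# next block's windows straddle the previous block's window boundaries
shifted = np.roll(feat, shift=(-2, -2), axis=(0, 1))
shifted_windows = window_partition(shifted, win=4)
print(shifted_windows.shape)  # (4, 4, 4, 1)
```

In the real model the shift is undone after attention, and an attention mask hides the pixels that wrap around the image edge; the sketch above only shows where the windows fall.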
Creating and Compiling the SWIN Transformer Model
To create and compile a SWIN Transformer model using TensorFlow, follow the code snippet below. Before running it, install the required libraries (including TensorFlow) and place the swin_transformer.py file in the same directory as the notebook. If you are using distributed training, make sure the model is built and compiled inside strategy.scope().
```python
import tensorflow as tf
from swin_transformer import SwinTransformer
from tensorflow.keras.applications.imagenet_utils import preprocess_input

with strategy.scope():
    img_adjust_layer = tf.keras.layers.Lambda(
        lambda data: preprocess_input(tf.cast(data, tf.float32), mode="torch"),
        input_shape=[*IMAGE_SIZE, 3])
    pretrained_model = SwinTransformer('swin_large_224', num_classes=len(CLASSES),
                                       include_top=False, pretrained=True, use_tpu=False)

    model = tf.keras.Sequential([
        img_adjust_layer,
        pretrained_model,
        tf.keras.layers.Dense(len(CLASSES), activation='softmax')
    ])

    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5, epsilon=1e-8),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )

model.summary()
```
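The compile step uses sparse_categorical_crossentropy, which takes integer class ids directly rather than one-hot vectors. As a small illustration of the quantity this loss computes (a NumPy re-derivation, not the Keras implementation):

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, y_pred):
    """Mean negative log-probability of the correct class.

    y_true: integer class ids, shape (N,)
    y_pred: predicted class probabilities, shape (N, num_classes)
    """
    # Pick each sample's predicted probability for its true class
    p_correct = y_pred[np.arange(len(y_true)), y_true]
    return float(np.mean(-np.log(p_correct)))

# Two samples, three classes; labels stay as plain integers (no one-hot needed)
y_true = np.array([0, 2])
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.1, 0.8]])

loss = sparse_categorical_crossentropy(y_true, y_pred)
print(round(loss, 4))  # mean of -log(0.7) and -log(0.8) ≈ 0.2899
```

This is why the labels in the dataset pipeline can remain integer-encoded, which also saves memory for the 104-plus flower classes.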
Access to Model Weights
The weights obtained from the experiments can be accessed through the following Google Drive link: SWIN Transformer Pre-trained Weights. Download the weights as needed and use them to initialize your SWIN Transformer model for specific tasks or further experimentation. Ensure compatibility with the model architecture to achieve optimal performance.
Access the Dataset
Data that gave the best results
```python
from kaggle_datasets import KaggleDatasets

GCS_DS_PATH = KaggleDatasets().get_gcs_path("flower-classification")
IMAGE_SIZE = [224, 224]
EPOCHS = 15
BATCH_SIZE = 16 * strategy.num_replicas_in_sync

GCS_PATH_SELECT = {  # available image sizes
    192: GCS_DS_PATH + '/tfrecords-jpeg-192x192',
    224: GCS_DS_PATH + '/tfrecords-jpeg-224x224',
    331: GCS_DS_PATH + '/tfrecords-jpeg-331x331',
    512: GCS_DS_PATH + '/tfrecords-jpeg-512x512'
}
GCS_PATH = GCS_PATH_SELECT[IMAGE_SIZE[0]]

TRAINING_FILENAMES = tf.io.gfile.glob(GCS_PATH + '/train/*.tfrec')
VALIDATION_FILENAMES = tf.io.gfile.glob(GCS_PATH + '/val/*.tfrec')
```
Other datasets
The snippet above only fetches the dataset; the remaining functions and data pre-processing steps are defined in the notebook. This is the data that we used in our project, and it gave us the best results.
Other datasets that we tried to work with are the Tiny-ImageNet dataset and CIFAR-100. Since we were loading pre-trained weights, we had to upscale these images to 224 x 224, which led to poor performance.
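As a rough illustration of why that upscaling hurts (hypothetical code, not from the notebook): nearest-neighbor resizing a 64x64 Tiny-ImageNet image to 224x224 only repeats each source pixel about 3.5 times in each direction, so the network receives a much larger image with no new detail in it.

```python
import numpy as np

def nearest_neighbor_resize(img, out_h, out_w):
    """Upscale an (H, W) image by repeating the nearest source pixel."""
    h, w = img.shape
    rows = (np.arange(out_h) * h) // out_h  # source row for each output row
    cols = (np.arange(out_w) * w) // out_w  # source column for each output column
    return img[rows[:, None], cols]

# Toy "image": 64x64 grid where every pixel has a unique value
img = np.arange(64 * 64).reshape(64, 64)
up = nearest_neighbor_resize(img, 224, 224)

print(up.shape)            # (224, 224)
print(len(np.unique(up)))  # still only 64*64 = 4096 distinct values
```

Bilinear or bicubic interpolation (as TensorFlow's resizing uses by default) smooths between pixels but likewise cannot recover detail that was never captured, which matches the poor results we observed on the upscaled datasets.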
Credits
- Project Inspiration: This project was inspired by the paper Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
- Code implementation: The implementation used in this project is credited to this implementation. We followed the exact structure of that repository so that the pre-trained weights would load correctly, and used them as a starting point to train our model on the "Flower" dataset. This adaptation ensured compatibility with the pre-existing weights and helped us reach convergence on the limited resources available to us.
File Directory Structure
```
./
|-- figures
|   |-- aaa3_gcp_work_example_screenshot_1.png
|   |-- aaa3_gcp_work_example_screenshot_2.png
|   |-- aaa3_gcp_work_example_screenshot_3.png
|-- swin_transformer.py
|-- swin_notebook.ipynb
|-- README.md
|-- E4040.2023Fall.aaa3.report.ar4634.ab5640.ap4478.pdf
|-- TinyImagenet.ipynb
```
Owner
- Name: Abhik Biswas
- Login: Abhik-Biswas
- Kind: user
- Repositories: 2
- Profile: https://github.com/Abhik-Biswas
Data Science@IITM | Mathematics @SXCCAL | ML | DL | AI
Citation (citation.cff)
cff-version: 1.2.0
message: "If you use this software in your work, please cite it using the following metadata."
title: "A review of the Shifted Windows (SWIN) Vision Transformer"
version: 1.0.0
date-released: 2024-04-02
authors:
- family-names: "Biswas"
given-names: "Abhik"
- family-names: "Raghavan"
given-names: "Aarya"
- family-names: "Praveen Kumar"
given-names: "Abhilash"