Science Score: 54.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file found
- ✓ codemeta.json file found
- ✓ .zenodo.json file found
- ○ DOI references
- ✓ Academic publication links: arxiv.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.8%) to scientific vocabulary
Keywords
Repository
A modular PyTorch library for vision transformer models
Basic Info
- Host: GitHub
- Owner: SforAiDl
- License: MIT
- Language: Python
- Default Branch: main
- Homepage: https://vformer.readthedocs.io/
- Size: 178 KB
Statistics
- Stars: 162
- Watchers: 5
- Forks: 22
- Open Issues: 15
- Releases: 3
Topics
Metadata Files
README.md
VFormer
A modular PyTorch library for vision transformer models
Library Features
- Contains implementations of prominent ViT architectures, broken down into modular components such as the encoder, attention mechanism, and decoder
- Makes it easy to develop custom models by composing components of different architectures
- Contains utilities for visualizing attention maps of models using techniques such as gradient rollout
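The attention-map utilities mentioned above build on rollout-style techniques. As a framework-agnostic illustration (a sketch of plain attention rollout, not VFormer's actual implementation; gradient rollout additionally weights the maps by gradients), the idea is to propagate head-averaged attention matrices through the layers, adding the identity to account for residual connections:

```python
import numpy as np

def attention_rollout(attentions):
    """Propagate attention through layers (attention rollout, Abnar & Zuidema, 2020).

    attentions: list of per-layer attention maps, each of shape
    (num_heads, num_tokens, num_tokens), with rows summing to 1.
    """
    num_tokens = attentions[0].shape[-1]
    rollout = np.eye(num_tokens)
    for attn in attentions:
        attn = attn.mean(axis=0)                         # average over heads
        attn = attn + np.eye(num_tokens)                 # account for residual connection
        attn = attn / attn.sum(axis=-1, keepdims=True)   # re-normalize rows
        rollout = attn @ rollout                         # compose with earlier layers
    return rollout

# Toy example: 3 layers, 4 heads, 5 tokens with random row-stochastic attention
rng = np.random.default_rng(0)
layers = [rng.random((4, 5, 5)) for _ in range(3)]
layers = [a / a.sum(axis=-1, keepdims=True) for a in layers]
rollout = attention_rollout(layers)
print(rollout.shape)  # (5, 5); each row still sums to 1
```

Since a product of row-stochastic matrices is row-stochastic, the resulting map can still be read as a distribution of attention over input tokens.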
Installation
From source (recommended)
```shell
git clone https://github.com/SforAiDl/vformer.git
cd vformer/
python setup.py install
```
From PyPI
```shell
pip install vformer
```
Models supported
- [x] Vanilla ViT
- [x] Swin Transformer
- [x] Pyramid Vision Transformer
- [x] CrossViT
- [x] Compact Vision Transformer
- [x] Compact Convolutional Transformer
- [x] Visformer
- [x] Vision Transformers for Dense Prediction
- [x] CvT
- [x] ConViT
- [x] ViViT
- [x] Perceiver IO
- [x] Memory Efficient Attention
Example usage
To instantiate and use a Swin Transformer model -
```python
import torch

from vformer.models.classification import SwinTransformer

image = torch.randn(1, 3, 224, 224)  # Example data
model = SwinTransformer(
    img_size=224,
    patch_size=4,
    in_channels=3,
    n_classes=10,
    embed_dim=96,
    depths=[2, 2, 6, 2],
    num_heads=[3, 6, 12, 24],
    window_size=7,
    drop_rate=0.2,
)
logits = model(image)
```
VFormer has a modular design and allows for easy experimentation using blocks/modules of different architectures. For example, if desired, you can use just the encoder or the windowed attention layer of the Swin Transformer model.
```python
from vformer.attention import WindowAttention
window_attn = WindowAttention(
    dim=128,
    window_size=7,
    num_heads=2,
    **kwargs,
)
```
```python
from vformer.encoder import SwinEncoder
swin_encoder = SwinEncoder(
    dim=128,
    input_resolution=(224, 224),
    depth=2,
    num_heads=2,
    window_size=7,
    **kwargs,
)
```
Please refer to our documentation to learn more.
References
- vit-pytorch
- Swin-Transformer
- PVT
- vit-explain
- CrossViT
- Compact-Transformers
- Visformer
- DPT
- CvT
- convit
- ViViT-pytorch
- perceiver-pytorch
- memory-efficient-attention
<!--
Citations
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
bibtex
@article{dosovitskiy2020vit,
title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},
journal={ICLR},
year={2021}
}
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
bibtex
@article{liu2021Swin,
title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
author={Liu, Ze and Lin, Yutong and Cao, Yue and Hu, Han and Wei, Yixuan and Zhang, Zheng and Lin, Stephen and Guo, Baining},
journal={arXiv preprint arXiv:2103.14030},
year={2021}
}
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
bibtex
@misc{wang2021pyramid,
title={Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions},
author={Wenhai Wang and Enze Xie and Xiang Li and Deng-Ping Fan and Kaitao Song and Ding Liang and Tong Lu and Ping Luo and Ling Shao},
year={2021},
eprint={2102.12122},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
bibtex
@inproceedings{chen2021crossvit,
title={{CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification}},
author={Chun-Fu (Richard) Chen and Quanfu Fan and Rameswar Panda},
booktitle={International Conference on Computer Vision (ICCV)},
year={2021}
}
Escaping the Big Data Paradigm with Compact Transformers
bibtex
@article{hassani2021escaping,
title = {Escaping the Big Data Paradigm with Compact Transformers},
author = {Ali Hassani and Steven Walton and Nikhil Shah and Abulikemu Abuduweili and Jiachen Li and Humphrey Shi},
year = 2021,
url = {https://arxiv.org/abs/2104.05704},
eprint = {2104.05704},
archiveprefix = {arXiv},
primaryclass = {cs.CV}
}
Visformer: The Vision-friendly Transformer
bibtex
@misc{chen2021visformer,
title={Visformer: The Vision-friendly Transformer},
author={Zhengsu Chen and Lingxi Xie and Jianwei Niu and Xuefeng Liu and Longhui Wei and Qi Tian},
year={2021},
eprint={2104.12533},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Vision Transformers for Dense Prediction
bibtex
@misc{ranftl2021vision,
title={Vision Transformers for Dense Prediction},
author={René Ranftl and Alexey Bochkovskiy and Vladlen Koltun},
year={2021},
eprint={2103.13413},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
-->
Owner
- Name: Society for Artificial Intelligence and Deep Learning
- Login: SforAiDl
- Kind: organization
- Location: In a galaxy far far away
- Website: www.saidl.in
- Twitter: SforAiDL
- Repositories: 20
- Profile: https://github.com/SforAiDl
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you find this software useful, please cite it as below."
authors:
  - family-names: "Deo"
    given-names: "Abhijit"
  - family-names: "Agrawal"
    given-names: "Aditya"
  - family-names: "Shah"
    given-names: "Neelay"
  - family-names: "Li"
    given-names: "Alvin"
title: "VFormer: A modular PyTorch library for vision transformers"
date-released: 2022-01-11
url: "https://github.com/SforAiDl/vformer"
license: MIT
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Committers
Last synced: about 3 years ago
All Time
- Total Commits: 89
- Total Committers: 6
- Avg Commits per committer: 14.833
- Development Distribution Score (DDS): 0.438
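As a sanity check, the average and the Development Distribution Score above can be reproduced from the per-committer counts in the Top Committers table; DDS is conventionally defined as 1 minus the top committer's share of commits:

```python
# Per-committer commit counts from the Top Committers table
commits = {
    "Neelay Shah": 50,
    "Abhijit Deo": 23,
    "Aditya Agrawal": 9,
    "alvanli": 4,
    "Hrithik Nambiar": 2,
    "Rishav Mukherji": 1,
}

total = sum(commits.values())            # 89 commits
avg = total / len(commits)               # 14.833 commits per committer
dds = 1 - max(commits.values()) / total  # 1 - 50/89 = 0.438

print(total, round(avg, 3), round(dds, 3))  # 89 14.833 0.438
```

All three figures match the report, so the DDS here does follow the top-committer-share formula.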
Top Committers
| Name | Email | Commits |
|---|---|---|
| Neelay Shah | s****9@g****m | 50 |
| Abhijit Deo | 7****g@u****m | 23 |
| Aditya Agrawal | 7****2@u****m | 9 |
| alvanli | 5****i@u****m | 4 |
| Hrithik Nambiar | h****2@g****m | 2 |
| Rishav Mukherji | 7****o@u****m | 1 |
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 33
- Total pull requests: 68
- Average time to close issues: about 2 months
- Average time to close pull requests: 9 days
- Total issue authors: 5
- Total pull request authors: 5
- Average comments per issue: 1.61
- Average comments per pull request: 1.5
- Merged pull requests: 49
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- NeelayS (21)
- abhi-glitchhg (7)
- alvanli (3)
- Claire874 (1)
- aditya-agrawal-30502 (1)
Pull Request Authors
- abhi-glitchhg (36)
- NeelayS (12)
- aditya-agrawal-30502 (11)
- Amapocho (5)
- alvanli (4)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads: 14 last month (pypi)
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 4
- Total maintainers: 3
pypi.org: vformer
A modular PyTorch library for vision transformer models
- Homepage: https://github.com/SforAiDl/vformer
- Documentation: https://vformer.readthedocs.io/
- License: MIT
- Latest release: 0.1.3 (published over 3 years ago)
Rankings
Maintainers (3)
Dependencies
- Jinja2 ==3.0.1
- MarkupSafe ==2.0.1
- Pillow *
- PyYAML ==5.4.1
- arrow ==1.1.1
- attrs ==21.2.0
- backports.entry-points-selectable ==1.1.0
- binaryornot ==0.4.4
- certifi ==2021.5.30
- cfgv ==3.3.1
- chardet ==4.0.0
- charset-normalizer ==2.0.4
- click ==8.0.1
- cookiecutter ==1.7.3
- distlib ==0.3.2
- einops ==0.3.2
- filelock ==3.0.12
- identify ==2.2.13
- idna ==3.2
- iniconfig ==1.1.1
- jinja2-time ==0.2.0
- nodeenv ==1.6.0
- olefile *
- packaging ==21.0
- platformdirs ==2.3.0
- pluggy ==1.0.0
- poyo ==0.5.0
- pre-commit ==2.15.0
- py ==1.10.0
- pyparsing ==2.4.7
- pytest >=6.2.5
- python-dateutil ==2.8.2
- python-slugify ==5.0.2
- requests ==2.26.0
- six ==1.16.0
- text-unidecode ==1.3
- toml ==0.10.2
- torch >=1.10.0
- torchvision >=0.11.0
- typing-extensions *
- urllib3 ==1.26.6
- virtualenv ==20.7.2
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- codecov/codecov-action v1 composite
- actions/checkout v2 composite
- conda-incubator/setup-miniconda v2 composite
- actions/checkout v3 composite
- actions/setup-python v2 composite
- actions/upload-artifact v3.1.0 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite
- bzip2 1.0.8.*
- ca-certificates 2022.4.26.*
- libffi 3.4.2.*
- openssl 1.1.1p.*
- pip 21.2.4.*
- setuptools 61.2.0.*
- sqlite 3.38.5.*
- tk 8.6.12.*
- tzdata 2022a.*
- wheel 0.37.1.*
- xz 5.2.5.*
- zlib 1.2.12.*