Science Score: 54.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file found
- ✓ codemeta.json file found
- ✓ .zenodo.json file found
- ○ DOI references
- ✓ Academic publication links: arxiv.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.8%) to scientific vocabulary
Keywords
Repository
A modular PyTorch library for vision transformer models
Basic Info
- Host: GitHub
- Owner: SforAiDl
- License: MIT
- Language: Python
- Default Branch: main
- Homepage: https://vformer.readthedocs.io/
- Size: 178 KB
Statistics
- Stars: 162
- Watchers: 5
- Forks: 22
- Open Issues: 15
- Releases: 3
Topics
Metadata Files
README.md
VFormer
A modular PyTorch library for vision transformer models
Library Features
- Contains implementations of prominent ViT architectures, broken down into modular components such as the encoder, attention mechanism, and decoder
- Makes it easy to develop custom models by composing components of different architectures
- Contains utilities for visualizing attention maps of models using techniques such as gradient rollout
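The attention-map utilities mentioned above build on rollout-style techniques. As a framework-agnostic illustration (a sketch of plain attention rollout, not VFormer's actual implementation; gradient rollout additionally weights the maps by gradients), the idea is to propagate head-averaged attention matrices through the layers, adding the identity to account for residual connections:

```python
import numpy as np

def attention_rollout(attentions):
    """Propagate attention through layers (attention rollout, Abnar & Zuidema, 2020).

    attentions: list of per-layer attention maps, each of shape
    (num_heads, num_tokens, num_tokens), with rows summing to 1.
    """
    num_tokens = attentions[0].shape[-1]
    rollout = np.eye(num_tokens)
    for attn in attentions:
        attn = attn.mean(axis=0)                         # average over heads
        attn = attn + np.eye(num_tokens)                 # account for residual connection
        attn = attn / attn.sum(axis=-1, keepdims=True)   # re-normalize rows
        rollout = attn @ rollout                         # compose with earlier layers
    return rollout

# Toy example: 3 layers, 4 heads, 5 tokens with random row-stochastic attention
rng = np.random.default_rng(0)
layers = [rng.random((4, 5, 5)) for _ in range(3)]
layers = [a / a.sum(axis=-1, keepdims=True) for a in layers]
rollout = attention_rollout(layers)
print(rollout.shape)  # (5, 5); each row still sums to 1
```

Since a product of row-stochastic matrices is row-stochastic, the resulting map can still be read as a distribution of attention over input tokens.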
Installation
From source (recommended)
```shell
git clone https://github.com/SforAiDl/vformer.git
cd vformer/
python setup.py install
```
From PyPI
```shell
pip install vformer
```
Models supported
- [x] Vanilla ViT
- [x] Swin Transformer
- [x] Pyramid Vision Transformer
- [x] CrossViT
- [x] Compact Vision Transformer
- [x] Compact Convolutional Transformer
- [x] Visformer
- [x] Vision Transformers for Dense Prediction
- [x] CvT
- [x] ConViT
- [x] ViViT
- [x] Perceiver IO
- [x] Memory Efficient Attention
Example usage
To instantiate and use a Swin Transformer model -
```python
import torch

from vformer.models.classification import SwinTransformer

image = torch.randn(1, 3, 224, 224)  # Example data
model = SwinTransformer(
    img_size=224,
    patch_size=4,
    in_channels=3,
    n_classes=10,
    embed_dim=96,
    depths=[2, 2, 6, 2],
    num_heads=[3, 6, 12, 24],
    window_size=7,
    drop_rate=0.2,
)
logits = model(image)
```
VFormer has a modular design and allows for easy experimentation using blocks/modules of different architectures. For example, if desired, you can use just the encoder or the windowed attention layer of the Swin Transformer model.
```python
from vformer.attention import WindowAttention
window_attn = WindowAttention(
    dim=128,
    window_size=7,
    num_heads=2,
    **kwargs,
)
```
```python
from vformer.encoder import SwinEncoder
swin_encoder = SwinEncoder(
    dim=128,
    input_resolution=(224, 224),
    depth=2,
    num_heads=2,
    window_size=7,
    **kwargs,
)
```
Please refer to our documentation to learn more.
References
- vit-pytorch
- Swin-Transformer
- PVT
- vit-explain
- CrossViT
- Compact-Transformers
- Visformer
- DPT
- CvT
- convit
- ViViT-pytorch
- perceiver-pytorch
- memory-efficient-attention
<!--
Citations
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
bibtex
@article{dosovitskiy2020vit,
title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},
journal={ICLR},
year={2021}
}
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
bibtex
@article{liu2021Swin,
title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
author={Liu, Ze and Lin, Yutong and Cao, Yue and Hu, Han and Wei, Yixuan and Zhang, Zheng and Lin, Stephen and Guo, Baining},
journal={arXiv preprint arXiv:2103.14030},
year={2021}
}
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
bibtex
@misc{wang2021pyramid,
title={Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions},
author={Wenhai Wang and Enze Xie and Xiang Li and Deng-Ping Fan and Kaitao Song and Ding Liang and Tong Lu and Ping Luo and Ling Shao},
year={2021},
eprint={2102.12122},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
bibtex
@inproceedings{chen2021crossvit,
title={{CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification}},
author={Chun-Fu (Richard) Chen and Quanfu Fan and Rameswar Panda},
booktitle={International Conference on Computer Vision (ICCV)},
year={2021}
}
Escaping the Big Data Paradigm with Compact Transformers
bibtex
@article{hassani2021escaping,
title = {Escaping the Big Data Paradigm with Compact Transformers},
author = {Ali Hassani and Steven Walton and Nikhil Shah and Abulikemu Abuduweili and Jiachen Li and Humphrey Shi},
year = 2021,
url = {https://arxiv.org/abs/2104.05704},
eprint = {2104.05704},
archiveprefix = {arXiv},
primaryclass = {cs.CV}
}
Visformer: The Vision-friendly Transformer
bibtex
@misc{chen2021visformer,
title={Visformer: The Vision-friendly Transformer},
author={Zhengsu Chen and Lingxi Xie and Jianwei Niu and Xuefeng Liu and Longhui Wei and Qi Tian},
year={2021},
eprint={2104.12533},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Vision Transformers for Dense Prediction
bibtex
@misc{ranftl2021vision,
title={Vision Transformers for Dense Prediction},
author={René Ranftl and Alexey Bochkovskiy and Vladlen Koltun},
year={2021},
eprint={2103.13413},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
-->
Owner
- Name: Society for Artificial Intelligence and Deep Learning
- Login: SforAiDl
- Kind: organization
- Location: In a galaxy far far away
- Website: www.saidl.in
- Twitter: SforAiDL
- Repositories: 20
- Profile: https://github.com/SforAiDl
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you find this software useful, please cite it as below."
authors:
  - family-names: "Deo"
    given-names: "Abhijit"
  - family-names: "Agrawal"
    given-names: "Aditya"
  - family-names: "Shah"
    given-names: "Neelay"
  - family-names: "Li"
    given-names: "Alvin"
title: "VFormer: A modular PyTorch library for vision transformers"
date-released: 2022-01-11
url: "https://github.com/SforAiDl/vformer"
license: MIT
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Committers
Last synced: about 3 years ago
All Time
- Total Commits: 89
- Total Committers: 6
- Avg Commits per committer: 14.833
- Development Distribution Score (DDS): 0.438
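As a sanity check, the average and the Development Distribution Score above can be reproduced from the per-committer counts in the Top Committers table; DDS is conventionally defined as 1 minus the top committer's share of commits:

```python
# Per-committer commit counts from the Top Committers table
commits = {
    "Neelay Shah": 50,
    "Abhijit Deo": 23,
    "Aditya Agrawal": 9,
    "alvanli": 4,
    "Hrithik Nambiar": 2,
    "Rishav Mukherji": 1,
}

total = sum(commits.values())            # 89 commits
avg = total / len(commits)               # 14.833 commits per committer
dds = 1 - max(commits.values()) / total  # 1 - 50/89 = 0.438

print(total, round(avg, 3), round(dds, 3))  # 89 14.833 0.438
```

All three figures match the report, so the DDS here does follow the top-committer-share formula.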
Top Committers
| Name | Email | Commits |
|---|---|---|
| Neelay Shah | s****9@g****m | 50 |
| Abhijit Deo | 7****g@u****m | 23 |
| Aditya Agrawal | 7****2@u****m | 9 |
| alvanli | 5****i@u****m | 4 |
| Hrithik Nambiar | h****2@g****m | 2 |
| Rishav Mukherji | 7****o@u****m | 1 |
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 33
- Total pull requests: 68
- Average time to close issues: about 2 months
- Average time to close pull requests: 9 days
- Total issue authors: 5
- Total pull request authors: 5
- Average comments per issue: 1.61
- Average comments per pull request: 1.5
- Merged pull requests: 49
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- NeelayS (21)
- abhi-glitchhg (7)
- alvanli (3)
- Claire874 (1)
- aditya-agrawal-30502 (1)
Pull Request Authors
- abhi-glitchhg (36)
- NeelayS (12)
- aditya-agrawal-30502 (11)
- Amapocho (5)
- alvanli (4)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads: 14 last month (pypi)
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 4
- Total maintainers: 3
pypi.org: vformer
A modular PyTorch library for vision transformer models
- Homepage: https://github.com/SforAiDl/vformer
- Documentation: https://vformer.readthedocs.io/
- License: MIT
- Latest release: 0.1.3 (published over 3 years ago)
Rankings
Maintainers (3)
Dependencies
- Jinja2 ==3.0.1
- MarkupSafe ==2.0.1
- Pillow *
- PyYAML ==5.4.1
- arrow ==1.1.1
- attrs ==21.2.0
- backports.entry-points-selectable ==1.1.0
- binaryornot ==0.4.4
- certifi ==2021.5.30
- cfgv ==3.3.1
- chardet ==4.0.0
- charset-normalizer ==2.0.4
- click ==8.0.1
- cookiecutter ==1.7.3
- distlib ==0.3.2
- einops ==0.3.2
- filelock ==3.0.12
- identify ==2.2.13
- idna ==3.2
- iniconfig ==1.1.1
- jinja2-time ==0.2.0
- nodeenv ==1.6.0
- olefile *
- packaging ==21.0
- platformdirs ==2.3.0
- pluggy ==1.0.0
- poyo ==0.5.0
- pre-commit ==2.15.0
- py ==1.10.0
- pyparsing ==2.4.7
- pytest >=6.2.5
- python-dateutil ==2.8.2
- python-slugify ==5.0.2
- requests ==2.26.0
- six ==1.16.0
- text-unidecode ==1.3
- toml ==0.10.2
- torch >=1.10.0
- torchvision >=0.11.0
- typing-extensions *
- urllib3 ==1.26.6
- virtualenv ==20.7.2
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- codecov/codecov-action v1 composite
- actions/checkout v2 composite
- conda-incubator/setup-miniconda v2 composite
- actions/checkout v3 composite
- actions/setup-python v2 composite
- actions/upload-artifact v3.1.0 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite
- bzip2 1.0.8.*
- ca-certificates 2022.4.26.*
- libffi 3.4.2.*
- openssl 1.1.1p.*
- pip 21.2.4.*
- setuptools 61.2.0.*
- sqlite 3.38.5.*
- tk 8.6.12.*
- tzdata 2022a.*
- wheel 0.37.1.*
- xz 5.2.5.*
- zlib 1.2.12.*