vformer

A modular PyTorch library for vision transformer models

https://github.com/sforaidl/vformer

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.8%) to scientific vocabulary

Keywords

pytorch vision-transformer
Last synced: 7 months ago

Repository

A modular PyTorch library for vision transformer models

Basic Info
Statistics
  • Stars: 162
  • Watchers: 5
  • Forks: 22
  • Open Issues: 15
  • Releases: 3
Topics
pytorch vision-transformer
Created over 4 years ago · Last pushed over 2 years ago
Metadata Files
Readme Contributing License Citation Authors

README.md

VFormer

A modular PyTorch library for vision transformer models

[![Tests](https://github.com/SforAiDl/vformer/actions/workflows/package-test.yml/badge.svg)](https://github.com/SforAiDl/vformer/actions/workflows/package-test.yml) [![Docs](https://readthedocs.org/projects/vformer/badge/?version=latest)](https://vformer.readthedocs.io/en/latest/?badge=latest) [![codecov](https://codecov.io/gh/SforAiDl/vformer/branch/main/graph/badge.svg?token=5QKCZ67CM2)](https://codecov.io/gh/SforAiDl/vformer) [![Downloads](https://pepy.tech/badge/vformer)](https://pepy.tech/project/vformer) **[Documentation](https://vformer.readthedocs.io/en/latest/)**

Library Features

  • Contains implementations of prominent ViT architectures broken down into modular components like encoder, attention mechanism, and decoder
  • Makes it easy to develop custom models by composing components of different architectures
  • Contains utilities for visualizing attention maps of models using techniques such as gradient rollout
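
The attention-map visualization mentioned above is based on attention rollout (Abnar & Zuidema, 2020). The following is a minimal, self-contained sketch of the rollout idea in plain PyTorch, not vformer's actual API: identity matrices are added to each layer's attention map to account for residual connections, rows are renormalized, and the maps are multiplied across layers.

```python
import torch

def attention_rollout(attn_maps):
    """Combine per-layer attention maps into a single token-to-token map.

    attn_maps: list of (tokens, tokens) row-stochastic attention matrices,
    one per transformer layer, ordered from first to last layer.
    """
    num_tokens = attn_maps[0].size(-1)
    result = torch.eye(num_tokens)
    for attn in attn_maps:
        # Add identity for the residual connection, then renormalize rows
        attn = attn + torch.eye(num_tokens)
        attn = attn / attn.sum(dim=-1, keepdim=True)
        result = attn @ result
    return result
```

Because each layer's map stays row-stochastic after renormalization, the rolled-out map is also row-stochastic, so each row can be read as a distribution over input tokens.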

Installation

From source (recommended)

```shell

git clone https://github.com/SforAiDl/vformer.git
cd vformer/
python setup.py install

```

From PyPI

```shell

pip install vformer

```

Models supported

Example usage

To instantiate and use a Swin Transformer model:

```python

import torch

from vformer.models.classification import SwinTransformer

image = torch.randn(1, 3, 224, 224)  # Example data

model = SwinTransformer(
    img_size=224,
    patch_size=4,
    in_channels=3,
    n_classes=10,
    embed_dim=96,
    depths=[2, 2, 6, 2],
    num_heads=[3, 6, 12, 24],
    window_size=7,
    drop_rate=0.2,
)

logits = model(image)

```
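
The model returns raw logits of shape `(1, n_classes)`. These can be turned into class probabilities with a softmax; a minimal sketch using a stand-in tensor in place of the model output:

```python
import torch

logits = torch.randn(1, 10)          # stand-in for the (1, n_classes) model output
probs = torch.softmax(logits, dim=-1)  # probabilities along the class dimension
pred_class = probs.argmax(dim=-1)      # index of the most likely class
```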

VFormer has a modular design and allows for easy experimentation using blocks/modules of different architectures. For example, if desired, you can use just the encoder or the windowed attention layer of the Swin Transformer model.

```python

from vformer.attention import WindowAttention

window_attn = WindowAttention(
    dim=128,
    window_size=7,
    num_heads=2,
    **kwargs,
)

```

```python

from vformer.encoder import SwinEncoder

swin_encoder = SwinEncoder(
    dim=128,
    input_resolution=(224, 224),
    depth=2,
    num_heads=2,
    window_size=7,
    **kwargs,
)

```
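
Windowed attention computes self-attention within local windows rather than across all tokens, which is what makes Swin's cost linear in image size. A minimal, self-contained sketch of the window-partitioning idea in plain PyTorch (not vformer's internal API):

```python
import torch

def window_partition(x, window_size):
    """Split a (B, H, W, C) feature map into non-overlapping windows.

    Returns a tensor of shape (B * num_windows, window_size, window_size, C).
    Assumes H and W are divisible by window_size.
    """
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    windows = x.permute(0, 1, 3, 2, 4, 5).contiguous()
    return windows.view(-1, window_size, window_size, C)

# A 56x56 feature map with window_size=7 yields 8*8 = 64 windows
x = torch.randn(1, 56, 56, 96)
windows = window_partition(x, 7)  # shape: (64, 7, 7, 96)
```

Attention is then computed independently inside each 7x7 window, so its cost grows with the window area instead of the full token count.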

Please refer to our documentation to learn more.


References

Citations

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

```bibtex
@article{dosovitskiy2020vit,
  title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
  author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},
  journal={ICLR},
  year={2021}
}
```

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

```bibtex
@article{liu2021Swin,
  title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
  author={Liu, Ze and Lin, Yutong and Cao, Yue and Hu, Han and Wei, Yixuan and Zhang, Zheng and Lin, Stephen and Guo, Baining},
  journal={arXiv preprint arXiv:2103.14030},
  year={2021}
}
```

Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions

```bibtex
@misc{wang2021pyramid,
  title={Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions},
  author={Wenhai Wang and Enze Xie and Xiang Li and Deng-Ping Fan and Kaitao Song and Ding Liang and Tong Lu and Ping Luo and Ling Shao},
  year={2021},
  eprint={2102.12122},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```

CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification

```bibtex
@inproceedings{chen2021crossvit,
  title={{CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification}},
  author={Chun-Fu (Richard) Chen and Quanfu Fan and Rameswar Panda},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2021}
}
```

Escaping the Big Data Paradigm with Compact Transformers

```bibtex
@article{hassani2021escaping,
  title={Escaping the Big Data Paradigm with Compact Transformers},
  author={Ali Hassani and Steven Walton and Nikhil Shah and Abulikemu Abuduweili and Jiachen Li and Humphrey Shi},
  year={2021},
  url={https://arxiv.org/abs/2104.05704},
  eprint={2104.05704},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```

Visformer: The Vision-friendly Transformer

```bibtex
@misc{chen2021visformer,
  title={Visformer: The Vision-friendly Transformer},
  author={Zhengsu Chen and Lingxi Xie and Jianwei Niu and Xuefeng Liu and Longhui Wei and Qi Tian},
  year={2021},
  eprint={2104.12533},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```

Vision Transformers for Dense Prediction

```bibtex
@misc{ranftl2021vision,
  title={Vision Transformers for Dense Prediction},
  author={René Ranftl and Alexey Bochkovskiy and Vladlen Koltun},
  year={2021},
  eprint={2103.13413},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```


Owner

  • Name: Society for Artificial Intelligence and Deep Learning
  • Login: SforAiDl
  • Kind: organization
  • Location: In a galaxy far far away

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you find this software useful, please cite it as below."
authors:
- family-names: "Deo"
  given-names: "Abhijit"
- family-names: "Agrawal"
  given-names: "Aditya"
- family-names: "Shah"
  given-names: "Neelay"
- family-names: "Li"
  given-names: "Alvin"
title: "VFormer: A modular PyTorch library for vision transformers"
date-released: 2022-01-11
url: "https://github.com/SforAiDl/vformer"
license: MIT

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers

Last synced: about 3 years ago

All Time
  • Total Commits: 89
  • Total Committers: 6
  • Avg Commits per committer: 14.833
  • Development Distribution Score (DDS): 0.438
Top Committers
Name Email Commits
Neelay Shah s****9@g****m 50
Abhijit Deo 7****g@u****m 23
Aditya Agrawal 7****2@u****m 9
alvanli 5****i@u****m 4
Hrithik Nambiar h****2@g****m 2
Rishav Mukherji 7****o@u****m 1

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 33
  • Total pull requests: 68
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 9 days
  • Total issue authors: 5
  • Total pull request authors: 5
  • Average comments per issue: 1.61
  • Average comments per pull request: 1.5
  • Merged pull requests: 49
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • NeelayS (21)
  • abhi-glitchhg (7)
  • alvanli (3)
  • Claire874 (1)
  • aditya-agrawal-30502 (1)
Pull Request Authors
  • abhi-glitchhg (36)
  • NeelayS (12)
  • aditya-agrawal-30502 (11)
  • Amapocho (5)
  • alvanli (4)
Top Labels
Issue Labels
Paper implementation (18) enhancement (8) documentation (4) good first issue (4) contributions welcome (1) low priority (1)
Pull Request Labels
documentation (7) Paper implementation (6) enhancement (2)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 14 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 4
  • Total maintainers: 3
pypi.org: vformer

A modular PyTorch library for vision transformer models

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 14 Last month
Rankings
Stargazers count: 5.7%
Forks count: 8.1%
Dependent packages count: 10.1%
Average: 15.0%
Dependent repos count: 21.6%
Downloads: 29.8%
Maintainers (3)
Last synced: 7 months ago

Dependencies

requirements.txt pypi
  • Jinja2 ==3.0.1
  • MarkupSafe ==2.0.1
  • Pillow *
  • PyYAML ==5.4.1
  • arrow ==1.1.1
  • attrs ==21.2.0
  • backports.entry-points-selectable ==1.1.0
  • binaryornot ==0.4.4
  • certifi ==2021.5.30
  • cfgv ==3.3.1
  • chardet ==4.0.0
  • charset-normalizer ==2.0.4
  • click ==8.0.1
  • cookiecutter ==1.7.3
  • distlib ==0.3.2
  • einops ==0.3.2
  • filelock ==3.0.12
  • identify ==2.2.13
  • idna ==3.2
  • iniconfig ==1.1.1
  • jinja2-time ==0.2.0
  • nodeenv ==1.6.0
  • olefile *
  • packaging ==21.0
  • platformdirs ==2.3.0
  • pluggy ==1.0.0
  • poyo ==0.5.0
  • pre-commit ==2.15.0
  • py ==1.10.0
  • pyparsing ==2.4.7
  • pytest >=6.2.5
  • python-dateutil ==2.8.2
  • python-slugify ==5.0.2
  • requests ==2.26.0
  • six ==1.16.0
  • text-unidecode ==1.3
  • toml ==0.10.2
  • torch >=1.10.0
  • torchvision >=0.11.0
  • typing-extensions *
  • urllib3 ==1.26.6
  • virtualenv ==20.7.2
.github/workflows/codecov.yml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • codecov/codecov-action v1 composite
.github/workflows/conda_test.yml actions
  • actions/checkout v2 composite
  • conda-incubator/setup-miniconda v2 composite
.github/workflows/documentation.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v2 composite
  • actions/upload-artifact v3.1.0 composite
.github/workflows/linting.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/package-test.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/publish.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite
environment.yml conda
  • bzip2 1.0.8.*
  • ca-certificates 2022.4.26.*
  • libffi 3.4.2.*
  • openssl 1.1.1p.*
  • pip 21.2.4.*
  • setuptools 61.2.0.*
  • sqlite 3.38.5.*
  • tk 8.6.12.*
  • tzdata 2022a.*
  • wheel 0.37.1.*
  • xz 5.2.5.*
  • zlib 1.2.12.*
setup.py pypi