gpvit

[ICLR 2023 Spotlight] GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation

https://github.com/chenhongyiyang/gpvit

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file (found)
  • .zenodo.json file (found)
  • DOI references
  • Academic publication links (links to arxiv.org)
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity (low similarity: 11.4%)

Keywords

computer-vision image-classification instance-segmentation object-detection semantic-segmentation vision-transformer visual-recognition
Last synced: 6 months ago

Repository

[ICLR 2023 Spotlight] GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation

Basic Info
  • Host: GitHub
  • Owner: ChenhongyiYang
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Size: 6.16 MB
Statistics
  • Stars: 100
  • Watchers: 4
  • Forks: 3
  • Open Issues: 2
  • Releases: 1
Topics
computer-vision image-classification instance-segmentation object-detection semantic-segmentation vision-transformer visual-recognition
Created about 3 years ago · Last pushed over 2 years ago
Metadata Files
Readme Contributing License Citation

README.md

GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation

This repository contains the official PyTorch implementation of GPViT, a high-resolution non-hierarchical vision transformer architecture designed for high-performing visual recognition, introduced in our paper:

GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation, Chenhongyi Yang, *Jiarui Xu, *Shalini De Mello, Elliot J. Crowley, Xiaolong Wang, ICLR 2023

Usage

Environment Setup

Our codebase is built upon the MM-series toolkits: classification is based on MMClassification, object detection on MMDetection, and semantic segmentation on MMSegmentation. Users can follow the official documentation of those toolkits to set up their environments. We also provide a sample setup script below:

```shell
conda create -n gpvit python=3.7 -y
source activate gpvit
pip install torch==1.7.1+cu101 torchvision==0.8.2+cu101 -f https://download.pytorch.org/whl/torch_stable.html
pip install -U openmim
mim install mmcv-full==1.4.8
pip install timm
pip install lmdb  # for ImageNet experiments
pip install -v -e .
cd downstream/mmdetection  # set up object detection and instance segmentation
pip install -v -e .
cd ../mmsegmentation  # set up semantic segmentation
pip install -v -e .
```
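
After installation, a quick import check can confirm that the pinned packages resolve correctly; a minimal sketch (nothing here is specific to GPViT):

```python
# Sanity-check the core dependencies installed by the setup script above;
# the printed versions should match the pinned ones (torch 1.7.1, mmcv-full 1.4.8).
import torch
import torchvision
import mmcv
import timm

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("torchvision:", torchvision.__version__)
print("mmcv-full:", mmcv.__version__)
print("timm:", timm.__version__)
```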

Data Preparation

Please follow MMClassification, MMDetection and MMSegmentation to set up the ImageNet, COCO and ADE20K datasets. For the ImageNet experiments, we convert the dataset to LMDB format to accelerate training and testing. For example, you can convert your own dataset by running:

```shell
python tools/dataset_tools/create_lmdb_dataset.py \
    --train-img-dir data/imagenet/train \
    --train-out data/imagenet/imagenet_lmdb/train \
    --val-img-dir data/imagenet/val \
    --val-out data/imagenet/imagenet_lmdb/val
```

After setting up, the dataset file structure should be as follows:

```
GPViT
|-- data
|   |-- imagenet
|   |   |-- imagenet_lmdb
|   |   |   |-- train
|   |   |   |   |-- data.mdb
|   |   |   |   |__ lock.mdb
|   |   |   |-- val
|   |   |   |   |-- data.mdb
|   |   |   |   |__ lock.mdb
|   |   |-- meta
|   |   |   |__ ...
|-- downstream
|   |-- mmsegmentation
|   |   |-- data
|   |   |   |-- ade
|   |   |   |   |-- ADEChallengeData2016
|   |   |   |   |   |-- annotations
|   |   |   |   |   |   |__ ...
|   |   |   |   |   |-- images
|   |   |   |   |   |   |__ ...
|   |   |   |   |   |-- objectInfo150.txt
|   |   |   |   |   |__ sceneCategories.txt
|   |   |__ ...
|   |-- mmdetection
|   |   |-- data
|   |   |   |-- coco
|   |   |   |   |-- train2017
|   |   |   |   |   |-- ...
|   |   |   |   |-- val2017
|   |   |   |   |   |-- ...
|   |   |   |   |-- annotations
|   |   |   |   |   |-- instances_train2017.json
|   |   |   |   |   |-- instances_val2017.json
|   |   |   |   |   |__ ...
|   |   |__ ...
|__ ...
```
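
To check a conversion before launching training, the generated LMDB can be opened read-only with the `lmdb` package installed above; a minimal sketch (the path assumes the output layout of the conversion command, and the exact key layout inside the database depends on the conversion script):

```python
# Open the converted ImageNet LMDB read-only and report how many entries it
# holds; purely a sanity check, not part of the training pipeline.
import lmdb

env = lmdb.open("data/imagenet/imagenet_lmdb/train",
                readonly=True, lock=False, readahead=False)
with env.begin() as txn:
    print("stored entries:", txn.stat()["entries"])
env.close()
```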

ImageNet Classification

Training GPViT

```shell
# Example: Training GPViT-L1 model
zsh tool/dist_train.sh configs/gpvit/gpvit_l1.py 16
```

Testing GPViT

```shell
# Example: Testing GPViT-L1 model
zsh tool/dist_test.sh configs/gpvit/gpvit_l1.py work_dirs/gpvit_l1/epoch_300.pth 16 --metrics accuracy
```
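
Outside the distributed test script, MMClassification also exposes a high-level Python API for single-image inference. A minimal sketch, assuming the config and checkpoint paths from the commands above (`demo.jpg` is a placeholder image):

```python
# Single-image classification via MMClassification's high-level API;
# paths follow the training/testing commands above, demo.jpg is a placeholder.
from mmcls.apis import inference_model, init_model

model = init_model("configs/gpvit/gpvit_l1.py",
                   "work_dirs/gpvit_l1/epoch_300.pth",
                   device="cuda:0")
result = inference_model(model, "demo.jpg")
print(result)  # dict with pred_label, pred_score and pred_class
```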

COCO Object Detection and Instance Segmentation

Run `cd downstream/mmdetection` first.

Training GPViT based Mask R-CNN

```shell
# Example: Training GPViT-L1 models with 1x and 3x+MS schedules
zsh tools/dist_train.sh configs/gpvit/mask_rcnn/gpvit_l1_maskrcnn_1x.py 16
zsh tools/dist_train.sh configs/gpvit/mask_rcnn/gpvit_l1_maskrcnn_3x.py 16
```

Training GPViT based RetinaNet

```shell
# Example: Training GPViT-L1 models with 1x and 3x+MS schedules
zsh tools/dist_train.sh configs/gpvit/retinanet/gpvit_l1_retinanet_1x.py 16
zsh tools/dist_train.sh configs/gpvit/retinanet/gpvit_l1_retinanet_3x.py 16
```

Testing GPViT based Mask R-CNN

```shell
# Example: Testing GPViT-L1 Mask R-CNN 1x model
zsh tools/dist_test.sh configs/gpvit/mask_rcnn/gpvit_l1_maskrcnn_1x.py work_dirs/gpvit_l1_maskrcnn_1x/epoch_12.pth 16 --eval bbox segm
```

Testing GPViT based RetinaNet

```shell
# Example: Testing GPViT-L1 RetinaNet 1x model
zsh tools/dist_test.sh configs/gpvit/retinanet/gpvit_l1_retinanet_1x.py work_dirs/gpvit_l1_retinanet_1x/epoch_12.pth 16 --eval bbox
```
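
For quick qualitative checks, MMDetection's high-level API can run a trained detector on a single image. A minimal sketch, assuming it is run from downstream/mmdetection with the Mask R-CNN checkpoint trained above (`demo.jpg` is a placeholder image):

```python
# Single-image detection via MMDetection's high-level API; paths follow the
# Mask R-CNN commands above, demo.jpg is a placeholder image.
from mmdet.apis import inference_detector, init_detector

model = init_detector("configs/gpvit/mask_rcnn/gpvit_l1_maskrcnn_1x.py",
                      "work_dirs/gpvit_l1_maskrcnn_1x/epoch_12.pth",
                      device="cuda:0")
result = inference_detector(model, "demo.jpg")
# For Mask R-CNN the result is a (bbox_results, segm_results) pair with one
# entry per COCO class; for RetinaNet it is the bbox list alone.
```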

ADE20K Semantic Segmentation

Run `cd downstream/mmsegmentation` first.

Training GPViT based semantic segmentation models

```shell
# Example: Training GPViT-L1 based SegFormer and UperNet models
zsh tools/dist_train.sh configs/gpvit/gpvit_l1_segformer.py 16
zsh tools/dist_train.sh configs/gpvit/gpvit_l1_upernet.py 16
```

Testing GPViT based semantic segmentation models

```shell
# Example: Testing GPViT-L1 based SegFormer and UperNet models
zsh tools/dist_test.sh configs/gpvit/gpvit_l1_segformer.py work_dirs/gpvit_l1_segformer/iter_160000.pth 16 --eval mIoU
zsh tools/dist_test.sh configs/gpvit/gpvit_l1_upernet.py work_dirs/gpvit_l1_upernet/iter_160000.pth 16 --eval mIoU
```
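
Likewise, MMSegmentation provides a high-level API for single-image inference. A minimal sketch, assuming it is run from downstream/mmsegmentation with the SegFormer checkpoint trained above (`demo.jpg` is a placeholder image):

```python
# Single-image semantic segmentation via MMSegmentation's high-level API;
# paths follow the commands above, demo.jpg is a placeholder image.
from mmseg.apis import inference_segmentor, init_segmentor

model = init_segmentor("configs/gpvit/gpvit_l1_segformer.py",
                       "work_dirs/gpvit_l1_segformer/iter_160000.pth",
                       device="cuda:0")
result = inference_segmentor(model, "demo.jpg")  # list with one H x W label map
```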

Benchmark Results

ImageNet-1k Classification

| Model | #Params (M) | Top-1 Acc | Top-5 Acc | Config | Model |
|:--------:|:-----------:|:---------:|:---------:|:------:|:-----:|
| GPViT-L1 | 9.3 | 80.5 | 95.4 | config | model |
| GPViT-L2 | 23.8 | 83.4 | 96.6 | config | model |
| GPViT-L3 | 36.2 | 84.1 | 96.9 | config | model |
| GPViT-L4 | 75.4 | 84.3 | 96.9 | config | model |

COCO Mask R-CNN 1x Schedule

| Model | #Params (M) | AP Box | AP Mask | Config | Model |
|:--------:|:-----------:|:------:|:-------:|:------:|:-----:|
| GPViT-L1 | 33 | 48.1 | 42.7 | config | model |
| GPViT-L2 | 50 | 49.9 | 43.9 | config | model |
| GPViT-L3 | 64 | 50.4 | 44.4 | config | model |
| GPViT-L4 | 109 | 51.0 | 45.0 | config | model |

COCO Mask R-CNN 3x+MS Schedule

| Model | #Params (M) | AP Box | AP Mask | Config | Model |
|:--------:|:-----------:|:------:|:-------:|:------:|:-----:|
| GPViT-L1 | 33 | 50.2 | 44.3 | config | model |
| GPViT-L2 | 50 | 51.4 | 45.1 | config | model |
| GPViT-L3 | 64 | 51.6 | 45.2 | config | model |
| GPViT-L4 | 109 | 52.1 | 45.7 | config | model |

COCO RetinaNet 1x Schedule

| Model | #Params (M) | AP Box | Config | Model |
|:--------:|:-----------:|:------:|:------:|:-----:|
| GPViT-L1 | 21 | 45.8 | config | model |
| GPViT-L2 | 37 | 48.0 | config | model |
| GPViT-L3 | 52 | 48.3 | config | model |
| GPViT-L4 | 96 | 48.7 | config | model |

COCO RetinaNet 3x+MS Schedule

| Model | #Params (M) | AP Box | Config | Model |
|:--------:|:-----------:|:------:|:------:|:-----:|
| GPViT-L1 | 21 | 48.1 | config | model |
| GPViT-L2 | 37 | 49.0 | config | model |
| GPViT-L3 | 52 | 49.4 | config | model |
| GPViT-L4 | 96 | 49.8 | config | model |

ADE20K UperNet

| Model | #Params (M) | mIoU | Config | Model |
|:--------:|:-----------:|:----:|:------:|:-----:|
| GPViT-L1 | 37 | 49.1 | config | model |
| GPViT-L2 | 53 | 50.2 | config | model |
| GPViT-L3 | 66 | 51.7 | config | model |
| GPViT-L4 | 107 | 52.5 | config | model |

ADE20K SegFormer

| Model | #Params (M) | mIoU | Config | Model |
|:--------:|:-----------:|:----:|:------:|:-----:|
| GPViT-L1 | 9 | 46.9 | config | model |
| GPViT-L2 | 24 | 49.2 | config | model |
| GPViT-L3 | 36 | 50.8 | config | model |
| GPViT-L4 | 76 | 51.3 | config | model |

Citation

```
@inproceedings{yang2023gpvit,
  title     = {{GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation}},
  author    = {Chenhongyi Yang and Jiarui Xu and Shalini De Mello and Elliot J. Crowley and Xiaolong Wang},
  booktitle = {ICLR},
  year      = {2023},
}
```

Owner

  • Name: Chenhongyi Yang
  • Login: ChenhongyiYang
  • Kind: user
  • Location: Zurich, Switzerland
  • Company: Meta

Research Scientist at Meta Reality Labs

GitHub Events

Total
  • Watch event: 5
Last Year
  • Watch event: 5

Packages

  • Total packages: 2
  • Total downloads: unknown
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 2
proxy.golang.org: github.com/chenhongyiyang/gpvit
  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
  • Dependent packages count: 6.5%
  • Average: 6.7%
  • Dependent repos count: 7.0%
Last synced: 6 months ago
proxy.golang.org: github.com/ChenhongyiYang/GPViT
  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
  • Dependent packages count: 6.5%
  • Average: 6.7%
  • Dependent repos count: 7.0%
Last synced: 6 months ago

Dependencies

downstream/mmdetection/requirements/albu.txt pypi
  • albumentations >=0.3.2
downstream/mmdetection/requirements/build.txt pypi
  • cython *
  • numpy *
downstream/mmdetection/requirements/docs.txt pypi
  • docutils ==0.16.0
  • myst-parser *
  • sphinx ==4.0.2
  • sphinx-copybutton *
  • sphinx_markdown_tables *
  • sphinx_rtd_theme ==0.5.2
downstream/mmdetection/requirements/mminstall.txt pypi
  • mmcv-full >=1.3.17
downstream/mmdetection/requirements/optional.txt pypi
  • cityscapesscripts *
  • imagecorruptions *
  • scipy *
  • sklearn *
  • timm *
downstream/mmdetection/requirements/readthedocs.txt pypi
  • mmcv *
  • torch *
  • torchvision *
downstream/mmdetection/requirements/runtime.txt pypi
  • matplotlib *
  • numpy *
  • pycocotools *
  • six *
  • terminaltables *
downstream/mmdetection/requirements/tests.txt pypi
  • asynctest * test
  • codecov * test
  • flake8 * test
  • interrogate * test
  • isort ==4.3.21 test
  • kwarray * test
  • onnx ==1.7.0 test
  • onnxruntime >=1.8.0 test
  • protobuf <=3.20.1 test
  • pytest * test
  • ubelt * test
  • xdoctest >=0.10.0 test
  • yapf * test
downstream/mmsegmentation/requirements/docs.txt pypi
  • docutils ==0.16.0
  • myst-parser *
  • sphinx ==4.0.2
  • sphinx_copybutton *
  • sphinx_markdown_tables *
downstream/mmsegmentation/requirements/mminstall.txt pypi
  • mmcls >=0.20.1
  • mmcv-full >=1.4.4,<=1.5.0
downstream/mmsegmentation/requirements/optional.txt pypi
  • cityscapesscripts *
downstream/mmsegmentation/requirements/readthedocs.txt pypi
  • mmcv *
  • prettytable *
  • torch *
  • torchvision *
downstream/mmsegmentation/requirements/runtime.txt pypi
  • matplotlib *
  • mmcls >=0.20.1
  • numpy *
  • packaging *
  • prettytable *
downstream/mmsegmentation/requirements/tests.txt pypi
  • codecov * test
  • flake8 * test
  • interrogate * test
  • pytest * test
  • xdoctest >=0.10.0 test
  • yapf * test
requirements/docs.txt pypi
  • docutils ==0.17.1
  • myst-parser *
  • pytorch_sphinx_theme *
  • sphinx ==4.5.0
  • sphinx-copybutton *
  • sphinx_markdown_tables *
requirements/mminstall.txt pypi
  • einops >=0.6.0
  • mmcv-full >=1.4.2,<1.9.0
requirements/optional.txt pypi
  • albumentations >=0.3.2
  • colorama *
  • requests *
  • rich *
  • scipy *
requirements/readthedocs.txt pypi
  • mmcv >=1.4.2
  • torch *
  • torchvision *
requirements/runtime.txt pypi
  • einops >=0.6.0
  • matplotlib >=3.1.0
  • numpy *
  • packaging *
requirements/tests.txt pypi
  • codecov * test
  • flake8 * test
  • interrogate * test
  • isort ==4.3.21 test
  • mmdet * test
  • pytest * test
  • xdoctest >=0.10.0 test
  • yapf * test
downstream/mmdetection/requirements.txt pypi
downstream/mmdetection/setup.py pypi
downstream/mmsegmentation/requirements.txt pypi
downstream/mmsegmentation/setup.py pypi
requirements.txt pypi
setup.py pypi