https://github.com/alphonsg/swin-transformer-object-detection

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Object Detection and Instance Segmentation.

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links (links to: arxiv.org)
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (low similarity, 4.0%, to scientific vocabulary)


Repository


Basic Info

Host: GitHub
Owner: AlphonsG
License: apache-2.0
Language: Python
Default Branch: master
Homepage: https://arxiv.org/abs/2103.14030
Size: 19.9 MB

Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Releases: 0

Fork of SwinTransformer/Swin-Transformer-Object-Detection

Created almost 5 years ago · Last pushed about 4 years ago

Metadata Files

Readme, Contributing, License, Code of conduct

Swin Transformer for Object Detection

This repo contains the supported code and configuration files to reproduce object detection results of Swin Transformer. It is based on mmdetection.

Updates

05/11/2021 Models for MoBY are released

04/12/2021 Initial commits

Results and Models

Mask R-CNN

| Backbone | Pretrain | Lr Schd | box mAP | mask mAP | #params | FLOPs | config | log | model |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Swin-T | ImageNet-1K | 1x | 43.7 | 39.8 | 48M | 267G | config | github/baidu | github/baidu |
| Swin-T | ImageNet-1K | 3x | 46.0 | 41.6 | 48M | 267G | config | github/baidu | github/baidu |
| Swin-S | ImageNet-1K | 3x | 48.5 | 43.3 | 69M | 359G | config | github/baidu | github/baidu |

Cascade Mask R-CNN

| Backbone | Pretrain | Lr Schd | box mAP | mask mAP | #params | FLOPs | config | log | model |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Swin-T | ImageNet-1K | 1x | 48.1 | 41.7 | 86M | 745G | config | github/baidu | github/baidu |
| Swin-T | ImageNet-1K | 3x | 50.4 | 43.7 | 86M | 745G | config | github/baidu | github/baidu |
| Swin-S | ImageNet-1K | 3x | 51.9 | 45.0 | 107M | 838G | config | github/baidu | github/baidu |
| Swin-B | ImageNet-1K | 3x | 51.9 | 45.0 | 145M | 982G | config | github/baidu | github/baidu |
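
To make the trade-off between the two detectors concrete, the following sketch compares the Swin-T 3x rows of the Mask R-CNN and Cascade Mask R-CNN tables. The numbers are copied from those tables; the dictionary layout and the `delta` helper are illustrative names, not part of the repository.

```python
# Swin-T, ImageNet-1K pre-training, 3x schedule; values copied from the tables.
mask_rcnn = {"box_mAP": 46.0, "mask_mAP": 41.6, "params_M": 48, "flops_G": 267}
cascade_mask_rcnn = {"box_mAP": 50.4, "mask_mAP": 43.7, "params_M": 86, "flops_G": 745}


def delta(baseline, candidate):
    """Return candidate minus baseline for every metric, rounded to 0.1."""
    return {key: round(candidate[key] - baseline[key], 1) for key in baseline}


diff = delta(mask_rcnn, cascade_mask_rcnn)
print(diff)
```

On these figures, Cascade Mask R-CNN gains 4.4 box mAP and 2.1 mask mAP over Mask R-CNN, at the cost of 38M more parameters and 478G more FLOPs.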

RepPoints V2

| Backbone | Pretrain | Lr Schd | box mAP | mask mAP | #params | FLOPs |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Swin-T | ImageNet-1K | 3x | 50.0 | - | 45M | 283G |

Mask RepPoints V2

| Backbone | Pretrain | Lr Schd | box mAP | mask mAP | #params | FLOPs |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Swin-T | ImageNet-1K | 3x | 50.3 | 43.6 | 47M | 292G |

Notes:

Pre-trained models can be downloaded from Swin Transformer for ImageNet Classification.
Access code for baidu is swin.

Results of MoBY with Swin Transformer

Mask R-CNN

Cascade Mask R-CNN

Notes:

The drop path rate needs to be tuned for best performance.
MoBY pre-trained models can be downloaded from MoBY with Swin Transformer.

Usage

Installation

Please refer to get_started.md for installation and dataset preparation.

Inference

```
# single-gpu testing
python tools/test.py <CONFIG_FILE> <DET_CHECKPOINT_FILE> --eval bbox segm

# multi-gpu testing
tools/dist_test.sh <CONFIG_FILE> <DET_CHECKPOINT_FILE> <GPU_NUM> --eval bbox segm
```
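
When evaluation is launched from a script rather than typed by hand, the two commands can be assembled programmatically. The sketch below assumes the mmdetection argument order (config, checkpoint, then the GPU count for the distributed launcher); `build_test_command` and the file names are hypothetical and used only for illustration.

```python
import shlex


def build_test_command(config, checkpoint, gpus=1, metrics=("bbox", "segm")):
    """Assemble the evaluation command as an argument list.

    Assumes the mmdetection argument order: config, checkpoint, then (for the
    distributed launcher) the number of GPUs.
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    if gpus == 1:
        cmd = ["python", "tools/test.py", config, checkpoint]
    else:
        cmd = ["tools/dist_test.sh", config, checkpoint, str(gpus)]
    return cmd + ["--eval", *metrics]


# Hypothetical paths, for illustration only.
cmd = build_test_command("my_config.py", "my_checkpoint.pth", gpus=8)
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with unusual file names.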

Training

To train a detector with pre-trained models, run:

```
# single-gpu training
python tools/train.py <CONFIG_FILE> --cfg-options model.pretrained=<PRETRAIN_MODEL> [model.backbone.use_checkpoint=True] [other optional arguments]

# multi-gpu training
tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> --cfg-options model.pretrained=<PRETRAIN_MODEL> [model.backbone.use_checkpoint=True] [other optional arguments]
```

For example, to train a Cascade Mask R-CNN model with a `Swin-T` backbone and 8 GPUs, run:

```
tools/dist_train.sh configs/swin/cascade_mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_giou_4conv1f_adamw_3x_coco.py 8 --cfg-options model.pretrained=<PRETRAIN_MODEL>
```

Note: use_checkpoint is used to save GPU memory. Please refer to this page for more details.
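
The `--cfg-options` flag takes dotted `key=value` pairs and merges them into the nested configuration. The following is a minimal sketch of that mapping, assuming a plain nested dict in place of the real config object. `apply_override` is a hypothetical helper and the checkpoint name is made up; the actual merging logic in mmcv handles more cases than shown here.

```python
def apply_override(cfg, dotted_key, value):
    """Set cfg[a][b][c] = value for dotted_key 'a.b.c', creating dicts as needed."""
    node = cfg
    *parents, leaf = dotted_key.split(".")
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value
    return cfg


# Toy config; the real Swin configs contain many more fields.
cfg = {"model": {"pretrained": None, "backbone": {"type": "SwinTransformer"}}}
apply_override(cfg, "model.pretrained", "swin_tiny.pth")  # hypothetical checkpoint name
apply_override(cfg, "model.backbone.use_checkpoint", True)
print(cfg["model"]["backbone"])
```

In this picture, `model.backbone.use_checkpoint=True` simply adds one key to the backbone settings, which is why it can be toggled from the command line without editing the config file.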

Apex (optional):

We use apex for mixed precision training by default. To install apex, run:

```
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
```

If you would like to disable apex, change the runner type to `EpochBasedRunner` and comment out the following code block in the configuration files:

```
# do not use mmdet version fp16
fp16 = None
optimizer_config = dict(
    type="DistOptimizerHook",
    update_interval=1,
    grad_clip=None,
    coalesce=True,
    bucket_size_mb=-1,
    use_fp16=True,
)
```
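
Commenting out that block can be pictured as removing overrides so that inherited defaults apply again. The sketch below models a config as a base dict plus an override dict. The base values, the `EpochBasedRunnerAmp` runner name, and the epoch count are assumptions made for illustration and should be checked against the actual configuration files; `resolve` is a hypothetical helper.

```python
base = {
    # Assumed defaults inherited from a base schedule (illustrative values).
    "optimizer_config": {"grad_clip": None},
    "runner": {"type": "EpochBasedRunner", "max_epochs": 36},
}

apex_overrides = {
    # Assumed apex-specific settings, modeled on the block quoted above.
    "runner": {"type": "EpochBasedRunnerAmp", "max_epochs": 36},
    "fp16": None,
    "optimizer_config": {
        "type": "DistOptimizerHook",
        "update_interval": 1,
        "grad_clip": None,
        "coalesce": True,
        "bucket_size_mb": -1,
        "use_fp16": True,
    },
}


def resolve(base_cfg, overrides, use_apex=True):
    """Return the effective config; top-level replacement only, no deep merge."""
    effective = dict(base_cfg)
    if use_apex:
        effective.update(overrides)
    return effective


with_apex = resolve(base, apex_overrides, use_apex=True)
without_apex = resolve(base, apex_overrides, use_apex=False)
print(without_apex["runner"]["type"])
```

Real config inheritance merges nested dicts rather than replacing top-level keys, so this only illustrates which settings are affected, not how the merge is performed.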

Citing Swin Transformer

@article{liu2021Swin,
  title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
  author={Liu, Ze and Lin, Yutong and Cao, Yue and Hu, Han and Wei, Yixuan and Zhang, Zheng and Lin, Stephen and Guo, Baining},
  journal={arXiv preprint arXiv:2103.14030},
  year={2021}
}

Dependencies

.github/workflows/build.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite
codecov/codecov-action v1.0.10 composite

.github/workflows/build_pat.yml actions

actions/checkout v2 composite

.github/workflows/deploy.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite

docker/Dockerfile docker

pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build

docker/serve/Dockerfile docker

${BASE_IMAGE} latest build

requirements/build.txt pypi

cython *
numpy *

requirements/docs.txt pypi

recommonmark *
sphinx *
sphinx_markdown_tables *
sphinx_rtd_theme *

requirements/optional.txt pypi

albumentations >=0.3.2
cityscapesscripts *
imagecorruptions *
mmlvis *
scipy *
sklearn *

requirements/readthedocs.txt pypi

mmcv *
torch *
torchvision *

requirements/runtime.txt pypi

matplotlib *
mmpycocotools *
numpy *
six *
terminaltables *
timm *

requirements/tests.txt pypi

asynctest * test
codecov * test
flake8 * test
interrogate * test
isort ==4.3.21 test
kwarray * test
onnx ==1.7.0 test
onnxruntime ==1.5.1 test
pytest * test
ubelt * test
xdoctest >=0.10.0 test
yapf * test

requirements.txt pypi

setup.py pypi
