topformer
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation, CVPR2022
Science Score: 64.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ✓ Academic publication links: arxiv.org, scholar.google
- ✓ Committers with academic emails: 2 of 6 committers (33.3%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.6%) to scientific vocabulary
Basic Info
Statistics
- Stars: 395
- Watchers: 7
- Forks: 42
- Open Issues: 26
- Releases: 0
Metadata Files
README.md
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation
Paper Links: TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation (CVPR 2022)
by Wenqiang Zhang*, Zilong Huang*, Guozhong Luo, Tao Chen, Xinggang Wang†, Wenyu Liu†, Gang Yu, Chunhua Shen.
(*) equal contribution, (†) corresponding author.
Introduction
Although vision transformers (ViTs) have achieved great success in computer vision, their heavy computational cost makes them unsuitable for dense prediction tasks such as semantic segmentation on mobile devices. In this paper, we present a mobile-friendly architecture named Token Pyramid Vision Transformer (TopFormer). The proposed TopFormer takes tokens from various scales as input to produce scale-aware semantic features, which are then injected into the corresponding tokens to augment the representation. Experimental results demonstrate that our method significantly outperforms CNN- and ViT-based networks across several semantic segmentation datasets and achieves a good trade-off between accuracy and latency.
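The token-pyramid and semantics-injection flow described above can be sketched roughly as follows. This is a framework-agnostic NumPy illustration, not the authors' implementation: the average-pooled pyramid and the global-mean "semantics extractor" are simplifying assumptions standing in for the paper's transformer-based Semantics Extractor and Semantics Injection Module.

```python
import numpy as np

def token_pyramid(feat, scales=(1, 2, 4)):
    """Average-pool a feature map of shape (H, W, C) into tokens at several scales."""
    H, W, C = feat.shape
    tokens = []
    for s in scales:
        h, w = H // s, W // s
        pooled = feat[:h * s, :w * s].reshape(h, s, w, s, C).mean(axis=(1, 3))
        tokens.append(pooled.reshape(-1, C))  # (h*w, C) tokens for this scale
    return tokens

def inject_semantics(tokens):
    """Pool tokens from all scales into one global semantic vector and add it
    back to every token -- a crude stand-in for semantics injection."""
    semantics = np.concatenate(tokens, axis=0).mean(axis=0)  # shape (C,)
    return [t + semantics for t in tokens]

feat = np.random.rand(8, 8, 16).astype(np.float32)
toks = token_pyramid(feat)       # multi-scale tokens: (64, 16), (16, 16), (4, 16)
out = inject_semantics(toks)     # same shapes, augmented with global semantics
```

In the actual model the injected semantics are scale-aware and produced by a transformer over the concatenated tokens; the sketch only shows the shape bookkeeping.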
Latency is measured on a single Qualcomm Snapdragon 865 with input size 512×512×3; only one ARM CPU core is used for speed testing. * indicates an input size of 448×448×3.
Updates
- 04/23/2022: The TopFormer backbone has been integrated into PaddleViT; check out the 3rd-party implementation in the Paddle framework here!
Requirements
- pytorch 1.5+
- mmcv-full==1.3.14
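Before training, it can help to confirm that the pinned requirements are actually installed. A minimal sketch using Python's standard `importlib.metadata` (Python 3.8+); the `check` helper is hypothetical, not part of this repo:

```python
from importlib.metadata import version, PackageNotFoundError

def check(pkg, exact=None):
    """Report whether a (optionally pinned) requirement is satisfied."""
    try:
        v = version(pkg)
    except PackageNotFoundError:
        return f"{pkg}: not installed"
    if exact is not None and v != exact:
        return f"{pkg}: {v} installed (expected {exact})"
    return f"{pkg}: {v} OK"

print(check("torch"))                      # requirement: pytorch 1.5+
print(check("mmcv-full", exact="1.3.14"))  # requirement: mmcv-full==1.3.14
```

mmcv-full must be built against your exact PyTorch/CUDA combination, so a version check alone does not guarantee a working install.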
Main results
The classification models pretrained on ImageNet can be downloaded from Baidu Drive/Google Drive.
ADE20K
Model | Params (M) | FLOPs (G) | mIoU (ss) | Link
--- | :---: | :---: | :---: | :---:
TopFormer-T_448x448_2x8_160k | 1.4 | 0.5 | 32.5 | Baidu Drive, Google Drive
TopFormer-T_448x448_4x8_160k | 1.4 | 0.5 | 33.4 | Baidu Drive, Google Drive
TopFormer-T_512x512_2x8_160k | 1.4 | 0.6 | 33.6 | Baidu Drive, Google Drive
TopFormer-T_512x512_4x8_160k | 1.4 | 0.6 | 34.6 | Baidu Drive, Google Drive
TopFormer-S_512x512_2x8_160k | 3.1 | 1.2 | 36.5 | Baidu Drive, Google Drive
TopFormer-S_512x512_4x8_160k | 3.1 | 1.2 | 37.0 | Baidu Drive, Google Drive
TopFormer-B_512x512_2x8_160k | 5.1 | 1.8 | 38.3 | Baidu Drive, Google Drive
TopFormer-B_512x512_4x8_160k | 5.1 | 1.8 | 39.2 | Baidu Drive, Google Drive
- ss indicates single-scale testing.
- The extraction password for the Baidu Drive links is topf
Usage
Please see MMSegmentation for dataset preparation.
For training, run:
sh tools/dist_train.sh local_configs/topformer/<config-file> <num-of-gpus-to-use> --work-dir /path/to/save/checkpoint
To evaluate, run:
sh tools/dist_test.sh local_configs/topformer/<config-file> <checkpoint-path> <num-of-gpus-to-use>
To test inference speed on a mobile device, please refer to tnn_runtime.
Acknowledgement
The implementation is based on MMSegmentation.
Citation
If you find our work helpful to your experiments, please cite:
@inproceedings{zhang2022topformer,
  title     = {TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation},
  author    = {Zhang, Wenqiang and Huang, Zilong and Luo, Guozhong and Chen, Tao and Wang, Xinggang and Liu, Wenyu and Yu, Gang and Shen, Chunhua},
  booktitle = {Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)},
  year      = {2022}
}
Owner
- Name: HUST Vision Lab
- Login: hustvl
- Kind: organization
- Location: Wuhan, China
- Repositories: 78
- Profile: https://github.com/hustvl
HUST Vision Lab, School of EIC, HUST. Lab lead: @xinggangw
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMSegmentation Contributors"
title: "OpenMMLab Semantic Segmentation Toolbox and Benchmark"
date-released: 2020-07-10
url: "https://github.com/open-mmlab/mmsegmentation"
license: Apache-2.0
GitHub Events
Total
- Issues event: 1
- Watch event: 17
Last Year
- Issues event: 1
- Watch event: 17
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| mulinmeng | u****4@h****n | 25 |
| mulinmeng | w****g@h****n | 6 |
| Zilong Huang | 8****2@q****m | 4 |
| wqiangzhang | w****g@t****m | 3 |
| topformer-anonymous | t****s@o****m | 1 |
| pinto0309 | r****2@y****p | 1 |
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 40
- Total pull requests: 3
- Average time to close issues: 12 days
- Average time to close pull requests: about 1 hour
- Total issue authors: 31
- Total pull request authors: 3
- Average comments per issue: 1.33
- Average comments per pull request: 0.33
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- wmkai (3)
- ke-dev (3)
- vozhuo (3)
- tensorflowt (2)
- CoderDoubleTen (2)
- yuchenlichuck (1)
- nralka2007 (1)
- Gabriel819 (1)
- HPU-gm (1)
- xperzy (1)
- Liang-Shang (1)
- itisianlee (1)
- bixiaopeng0 (1)
- javadmozaffari (1)
- xuemanshanzhong (1)
Pull Request Authors
- PINTO0309 (1)
- LRY89757 (1)
- TrellixVulnTeam (1)
Dependencies
- docutils ==0.16.0
- myst-parser *
- sphinx ==4.0.2
- sphinx_copybutton *
- sphinx_markdown_tables *
- mmcv-full >=1.3.1,<=1.4.0
- cityscapesscripts *
- mmcv *
- prettytable *
- torch *
- torchvision *
- matplotlib *
- numpy *
- packaging *
- codecov * test
- flake8 * test
- interrogate * test
- isort ==4.3.21 test
- pytest * test
- xdoctest >=0.10.0 test
- yapf * test
- pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build