https://github.com/amazon-science/tubelet-transformer

This is an official implementation of TubeR: Tubelet Transformer for Video Action Detection

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.1%) to scientific vocabulary

Keywords

action-detection ava jhmdb transformer tubelet-transformer tuber ucf

Last synced: 11 months ago

Repository

Basic Info

Host: GitHub
Owner: amazon-science
License: apache-2.0
Language: Python
Default Branch: main
Homepage: https://openaccess.thecvf.com/content/CVPR2022/supplemental/Zhao_TubeR_Tubelet_Transformer_CVPR_2022_supplemental.pdf
Size: 9.42 MB

Statistics

Stars: 79
Watchers: 1
Forks: 20
Open Issues: 15
Releases: 0

Created about 4 years ago · Last pushed over 3 years ago

Metadata Files

Readme Contributing License Code of conduct

TubeR: Tubelet Transformer for Video Action Detection

This repo contains the supporting code to reproduce the spatio-temporal action detection results of TubeR: Tubelet Transformer for Video Action Detection.

Updates

08/08/2022 Initial commits

Results and Models

AVA 2.1 Dataset

| Backbone | Pretrain | #view | mAP | FLOPs | config | model |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| CSN-50 | Kinetics-400 | 1 view | 27.2 | 78G | config | S3 |
| CSN-50 (with long-term context) | Kinetics-400 | 1 view | 28.8 | 78G | config | Coming soon |
| CSN-152 | Kinetics-400+IG65M | 1 view | 29.7 | 120G | config | S3 |
| CSN-152 (with long-term context) | Kinetics-400+IG65M | 1 view | 31.7 | 120G | config | Coming soon |

AVA 2.2 Dataset

| Backbone | Pretrain | #view | mAP | FLOPs | config | model |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| CSN-152 | Kinetics-400+IG65M | 1 view | 31.1 | 120G | config | S3 |
| CSN-152 (with long-term context) | Kinetics-400+IG65M | 1 view | 33.4 | 120G | config | Coming soon |

JHMDB Dataset

| Backbone | #view | mAP@0.2 | mAP@0.5 | config | model |
| :---: | :---: | :---: | :---: | :---: | :---: |
| CSN-152 | 1 view | 87.4 | 82.3 | config | S3 |

Usage

The project is built on GluonCV-torch. Please refer to the GluonCV-torch tutorial for details.

Dependency

The project has been tested with:
- Torch 1.12 + CUDA 11.3
- timm==0.4.5
- tensorboardX
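To compare a local environment against the versions listed above, a small check along the following lines may help. This is a minimal sketch, not part of the repository: the `TESTED` table and the `check_dependencies` helper are hypothetical names introduced here, the version comparison is a simple prefix match, and the CUDA version is not checked.

```python
from importlib import metadata

# Versions the README reports as tested; tensorboardX has no pinned version.
# This table is an illustration, not a file shipped with the repository.
TESTED = {"torch": "1.12", "timm": "0.4.5", "tensorboardX": None}


def check_dependencies(tested=TESTED, lookup=metadata.version):
    """Return {package: (installed_version_or_None, matches_tested)}."""
    report = {}
    for name, wanted in tested.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None
        # A package matches when it is installed and, if a version is listed,
        # the installed version string starts with that version.
        ok = installed is not None and (wanted is None or installed.startswith(wanted))
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_dependencies().items():
        status = "ok" if ok else "check"
        print(f"{name}: installed={installed} [{status}]")
```

A mismatch here does not necessarily mean the code will fail; it only flags a difference from the tested setup.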

Dataset

Please download asset.zip and unzip it into ./datasets.

- [AVA] Please refer to DATASET.md for AVA dataset downloading and pre-processing.
- [JHMDB] Please refer to JHMDB for the JHMDB dataset and to the Dataset Section for the UCF dataset. You can also refer to ACT-Detector to prepare the two datasets.
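The asset.zip unpacking step mentioned above can also be scripted. The following is a minimal sketch using only the Python standard library; the `unpack_assets` helper is a hypothetical name, and it assumes asset.zip has already been downloaded to the working directory.

```python
import zipfile
from pathlib import Path


def unpack_assets(archive="asset.zip", target="./datasets"):
    """Extract `archive` into `target` and return the sorted member names."""
    target_dir = Path(target)
    # Create ./datasets (and any parents) if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
        return sorted(zf.namelist())
```

Calling `unpack_assets()` with no arguments uses the file name and target directory given in this section.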

Inference

To run inference, first modify the config file:
- Set the correct WORLD_SIZE, GPU_WORLD_SIZE, DIST_URL, WOLRD_URLS based on your experiment setup.
- Set LABEL_PATH, ANNO_PATH, DATA_PATH to your local directories accordingly.
- Download the pre-trained model and set PRETRAINED_PATH to the model path.
- Make sure LOAD and LOAD_FC are set to True.

Then run:
```
# run testing
python3 evaltuberava.py <CONFIGFILE>

# for example, to evaluate on AVA 2.1, run:
python3 evaltuberava.py configuration/TubeRCSN152AVA21.yaml
```
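Before launching evaluation, the inference settings listed in this section can be sanity-checked programmatically. The sketch below assumes the YAML config has already been loaded into a nested Python dict (for instance with a YAML parser such as PyYAML, which is an assumption and not a listed dependency). Because the nesting of the keys is not described here, the hypothetical `find_key` helper searches for each key name at any depth; `inference_config_problems` is likewise an illustrative name.

```python
def find_key(node, key):
    """Depth-first search for `key` in nested dicts/lists; returns (found, value)."""
    if isinstance(node, dict):
        if key in node:
            return True, node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return False, None
    for child in children:
        found, value = find_key(child, key)
        if found:
            return True, value
    return False, None


# Key names are taken from the README; WOLRD_URLS is spelled as it appears there.
REQUIRED_KEYS = ["WORLD_SIZE", "GPU_WORLD_SIZE", "DIST_URL", "WOLRD_URLS",
                 "LABEL_PATH", "ANNO_PATH", "DATA_PATH", "PRETRAINED_PATH"]


def inference_config_problems(cfg):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for key in REQUIRED_KEYS:
        found, value = find_key(cfg, key)
        if not found or value in (None, "", []):
            problems.append(f"{key} is missing or empty")
    for flag in ("LOAD", "LOAD_FC"):
        found, value = find_key(cfg, flag)
        if not (found and value is True):
            problems.append(f"{flag} must be True for inference")
    return problems
```

The check only confirms that the keys are present and the two flags are enabled; it does not verify that the paths exist or that the distributed settings are reachable.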

Training

To train TubeR from scratch, first modify the config file:
- Set the correct WORLD_SIZE, GPU_WORLD_SIZE, DIST_URL, WOLRD_URLS based on your experiment setup.
- Set LABEL_PATH, ANNO_PATH, DATA_PATH to your local directories accordingly.
- Download the pre-trained feature backbone and transformer weights, and set PRETRAIN_BACKBONE_DIR (CSN50, CSN152) and PRETRAIN_TRANSFORMER_DIR (DETR) accordingly.
- Make sure LOAD and LOAD_FC are set to False.

Then run:
```
# run training from scratch
python3 traintuber.py <CONFIGFILE>

# for example, to train on AVA 2.1 from scratch, run:
python3 traintuberava.py configuration/TubeRCSN152AVA21.yaml
```
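Since inference expects LOAD and LOAD_FC to be True while training from scratch expects them to be False, one way to avoid editing files by hand is to derive a training config from an inference config. This is a minimal sketch under the assumption that the config is available as a nested Python dict; `set_flags` and `as_training_config` are hypothetical helpers, and the key layout in the usage example is invented for illustration.

```python
import copy


def set_flags(node, flags):
    """Set every occurrence of the given keys in nested dicts/lists, in place.

    Returns the set of flag names that were actually found.
    """
    seen = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key in flags:
                node[key] = flags[key]
                seen.add(key)
            else:
                seen |= set_flags(value, flags)
    elif isinstance(node, list):
        for item in node:
            seen |= set_flags(item, flags)
    return seen


def as_training_config(cfg):
    """Return a deep copy of `cfg` with LOAD and LOAD_FC switched to False."""
    flags = {"LOAD": False, "LOAD_FC": False}
    train_cfg = copy.deepcopy(cfg)
    missing = set(flags) - set_flags(train_cfg, flags)
    if missing:
        raise KeyError(f"flags not found in config: {sorted(missing)}")
    return train_cfg


if __name__ == "__main__":
    # Hypothetical layout; the real nesting may differ.
    inference_cfg = {"CONFIG": {"MODEL": {"LOAD": True, "LOAD_FC": True}}}
    print(as_training_config(inference_cfg))
```

The original dict is left untouched, so the same loaded config can still be used for evaluation.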

TODO

- [ ] Add tutorial and pre-trained weights for TubeR with long-term memory
- [ ] Add weights for UCF24

Citing TubeR

@inproceedings{zhao2022tuber,
  title={TubeR: Tubelet transformer for video action detection},
  author={Zhao, Jiaojiao and Zhang, Yanyi and Li, Xinyu and Chen, Hao and Shuai, Bing and Xu, Mingze and Liu, Chunhui and Kundu, Kaustav and Xiong, Yuanjun and Modolo, Davide and others},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={13598--13607},
  year={2022}
}

Owner

Name: Amazon Science
Login: amazon-science
Kind: organization

Website: https://amazon.science
Twitter: AmazonScience
Repositories: 80
Profile: https://github.com/amazon-science

GitHub Events

Total

Watch event: 11
Issue comment event: 3
Fork event: 2

Last Year

Watch event: 11
Issue comment event: 3
Fork event: 2

Issues and Pull Requests

Last synced: over 1 year ago

All Time

Total issues: 38
Total pull requests: 7
Average time to close issues: 23 days
Average time to close pull requests: about 10 hours
Total issue authors: 16
Total pull request authors: 2
Average comments per issue: 1.39
Average comments per pull request: 0.43
Merged pull requests: 3
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

AlexeyG (3)
huang-chenhai (2)
wenzhengzeng (2)
lemonheadboy (1)
jinsingsangsung (1)
DanLuoNEU (1)
quangtn266 (1)
furqanabid412 (1)
hongminglin08 (1)
Tsunehiko (1)
sqiangcao99 (1)
DCBXZ66 (1)
ykyk000 (1)
sibonjia (1)
yassouali (1)
