https://github.com/amazon-science/tubelet-transformer

This is an official implementation of TubeR: Tubelet Transformer for Video Action Detection

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.1%) to scientific vocabulary

Keywords

action-detection ava jhmdb transformer tubelet-transformer tuber ucf

Repository

Statistics
  • Stars: 79
  • Watchers: 1
  • Forks: 20
  • Open Issues: 15
  • Releases: 0
Created over 3 years ago · Last pushed almost 3 years ago
Metadata Files
Readme Contributing License Code of conduct

README.md

TubeR: Tubelet Transformer for Video Action Detection

This repo contains the supporting code to reproduce the spatio-temporal action detection results of TubeR: Tubelet Transformer for Video Action Detection.

Updates

08/08/2022 Initial commits

Results and Models

AVA 2.1 Dataset

| Backbone | Pretrain | #view | mAP | FLOPs | config | model |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| CSN-50 | Kinetics-400 | 1 view | 27.2 | 78G | config | S3 |
| CSN-50 (with long-term context) | Kinetics-400 | 1 view | 28.8 | 78G | config | Coming soon |
| CSN-152 | Kinetics-400+IG65M | 1 view | 29.7 | 120G | config | S3 |
| CSN-152 (with long-term context) | Kinetics-400+IG65M | 1 view | 31.7 | 120G | config | Coming soon |

AVA 2.2 Dataset

| Backbone | Pretrain | #view | mAP | FLOPs | config | model |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| CSN-152 | Kinetics-400+IG65M | 1 view | 31.1 | 120G | config | S3 |
| CSN-152 (with long-term context) | Kinetics-400+IG65M | 1 view | 33.4 | 120G | config | Coming soon |

JHMDB Dataset

| Backbone | #view | mAP@0.2 | mAP@0.5 | config | model |
| :---: | :---: | :---: | :---: | :---: | :---: |
| CSN-152 | 1 view | 87.4 | 82.3 | config | S3 |

Usage

The project is developed based on GluonCV-torch. Please refer to the tutorial for details.

Dependency

The project has been tested with:
- Torch 1.12 + CUDA 11.3
- timm==0.4.5
- tensorboardX
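A quick way to compare a local environment against the versions listed above is to query the installed package metadata. The following is a minimal sketch, not part of the repository: the `TESTED` table and the `check_environment` helper are hypothetical names, and the version prefixes simply mirror the dependency list. CUDA is not a Python package, so it is not covered here.

```python
from importlib import metadata

# Versions from the dependency list above; tensorboardX has no pinned version.
TESTED = {"torch": "1.12", "timm": "0.4.5", "tensorboardX": None}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_environment(tested=TESTED, lookup=installed_version):
    """Map each package to (found_version, matches_tested_version)."""
    report = {}
    for package, wanted in tested.items():
        found = lookup(package)
        # A missing package never matches; an unpinned one matches any version.
        ok = found is not None and (wanted is None or found.startswith(wanted))
        report[package] = (found, ok)
    return report


if __name__ == "__main__":
    for package, (found, ok) in check_environment().items():
        status = "ok" if ok else "differs"
        print(f"{package}: {found or 'not installed'} ({status})")
```

A mismatch reported here does not necessarily mean the code will fail; it only flags a setup that differs from the tested one.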

Dataset

Please download asset.zip and unzip it into ./datasets.
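The unzip step can also be done from Python with the standard library. This is a minimal sketch, assuming `asset.zip` has already been downloaded to the current directory; the `extract_assets` helper is a hypothetical name, not a script shipped with the repo.

```python
import zipfile
from pathlib import Path


def extract_assets(archive="asset.zip", target="./datasets"):
    """Extract every member of `archive` into `target` and return their names."""
    target_dir = Path(target)
    target_dir.mkdir(parents=True, exist_ok=True)  # create ./datasets if missing
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
        return zf.namelist()
```

Calling `extract_assets()` from the repository root would place the archive contents under `./datasets`, matching the location named above.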

- [AVA] Please refer to DATASET.md for AVA dataset downloading and pre-processing.
- [JHMDB] Please refer to JHMDB for the JHMDB dataset and to Dataset Section for the UCF dataset. You can also refer to ACT-Detector to prepare the two datasets.

Inference

To run inference, first modify the config file:
- Set the correct WORLD_SIZE, GPU_WORLD_SIZE, DIST_URL, WOLRD_URLS based on your experiment setup.
- Set LABEL_PATH, ANNO_PATH, DATA_PATH to your local directories accordingly.
- Download the pre-trained model and set PRETRAINED_PATH to the model path.
- Make sure LOAD and LOAD_FC are set to True.

Then run:
```
# run testing
python3 evaltuberava.py <CONFIGFILE>

# for example, to evaluate on AVA 2.1, run:
python3 evaltuberava.py configuration/TubeRCSN152AVA21.yaml
```
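
The inference checklist above can be turned into a small pre-flight check before launching evaluation. The sketch below is an illustration under stated assumptions, not repository code: it operates on a config already loaded into a nested dict (for example with a YAML parser), it does not assume where each key sits in the file and so searches recursively, and it treats every path value as a literal filesystem path, which may not hold if a value is a template. `find_key` and `check_inference_config` are hypothetical helper names.

```python
import os


def find_key(config, key):
    """Depth-first search for `key` in a nested dict; return its value or None."""
    if isinstance(config, dict):
        if key in config:
            return config[key]
        for value in config.values():
            found = find_key(value, key)
            if found is not None:
                return found
    return None


def check_inference_config(config, path_exists=os.path.exists):
    """Return a list of problems that would break the inference steps above."""
    problems = []
    # Inference expects both loading flags to be switched on.
    for flag in ("LOAD", "LOAD_FC"):
        if find_key(config, flag) is not True:
            problems.append(f"{flag} should be True for inference")
    # Every path named in the checklist must point at something on disk.
    for key in ("LABEL_PATH", "ANNO_PATH", "DATA_PATH", "PRETRAINED_PATH"):
        value = find_key(config, key)
        if not value:
            problems.append(f"{key} is not set")
        elif not path_exists(value):
            problems.append(f"{key} does not exist: {value}")
    return problems
```

An empty list means the flags and paths look consistent with the checklist; the distributed settings (WORLD_SIZE, GPU_WORLD_SIZE, DIST_URL, WOLRD_URLS) depend on the cluster and are deliberately left unchecked.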

Training

To train TubeR from scratch, first modify the config file:
- Set the correct WORLD_SIZE, GPU_WORLD_SIZE, DIST_URL, WOLRD_URLS based on your experiment setup.
- Set LABEL_PATH, ANNO_PATH, DATA_PATH to your local directories accordingly.
- Download the pre-trained feature backbone and transformer weights and set PRETRAIN_BACKBONE_DIR (CSN50, CSN152) and PRETRAIN_TRANSFORMER_DIR (DETR) accordingly.
- Make sure LOAD and LOAD_FC are set to False.

Then run:
```
# run training from scratch
python3 traintuber.py <CONFIGFILE>

# for example, to train ava from scratch, run:
python3 traintuberava.py configuration/TubeRCSN152AVA21.yaml
```
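
Training uses the opposite flag values from inference, so it can help to derive a training config from an existing one instead of editing it by hand. The following is a minimal sketch under the same assumption of a config loaded into a nested dict whose key layout is not known in advance; `set_key` and `to_training_config` are hypothetical helpers, and the weight paths passed in are placeholders for wherever the downloaded files live.

```python
import copy


def set_key(config, key, value):
    """Set every occurrence of `key` in a nested dict in place; return the count."""
    changed = 0
    if isinstance(config, dict):
        for name, child in config.items():
            if name == key:
                config[name] = value
                changed += 1
            else:
                changed += set_key(child, key, value)
    return changed


def to_training_config(config, backbone_dir, transformer_dir):
    """Return a copy of `config` with the training settings listed above applied."""
    train_cfg = copy.deepcopy(config)  # leave the caller's config untouched
    overrides = {
        "LOAD": False,
        "LOAD_FC": False,
        "PRETRAIN_BACKBONE_DIR": backbone_dir,
        "PRETRAIN_TRANSFORMER_DIR": transformer_dir,
    }
    # Collect keys that were not present anywhere so typos fail loudly.
    missing = [k for k, v in overrides.items() if set_key(train_cfg, k, v) == 0]
    if missing:
        raise KeyError(f"keys not found in config: {missing}")
    return train_cfg
```

Raising on a missing key is a deliberate choice here: silently adding a new key at the top level could leave the real setting unchanged.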

TODO

- [ ] Add tutorial and pre-trained weights for TubeR with long-term memory
- [ ] Add weights for UCF24

Citing TubeR

@inproceedings{zhao2022tuber,
  title={TubeR: Tubelet transformer for video action detection},
  author={Zhao, Jiaojiao and Zhang, Yanyi and Li, Xinyu and Chen, Hao and Shuai, Bing and Xu, Mingze and Liu, Chunhui and Kundu, Kaustav and Xiong, Yuanjun and Modolo, Davide and others},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={13598--13607},
  year={2022}
}

Owner

  • Name: Amazon Science
  • Login: amazon-science
  • Kind: organization

GitHub Events

Total
  • Watch event: 11
  • Issue comment event: 3
  • Fork event: 2
Last Year
  • Watch event: 11
  • Issue comment event: 3
  • Fork event: 2

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 38
  • Total pull requests: 7
  • Average time to close issues: 23 days
  • Average time to close pull requests: about 10 hours
  • Total issue authors: 16
  • Total pull request authors: 2
  • Average comments per issue: 1.39
  • Average comments per pull request: 0.43
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • AlexeyG (3)
  • huang-chenhai (2)
  • wenzhengzeng (2)
  • lemonheadboy (1)
  • jinsingsangsung (1)
  • DanLuoNEU (1)
  • quangtn266 (1)
  • furqanabid412 (1)
  • hongminglin08 (1)
  • Tsunehiko (1)
  • sqiangcao99 (1)
  • DCBXZ66 (1)
  • ykyk000 (1)
  • sibonjia (1)
  • yassouali (1)
Pull Request Authors
  • coocoo90 (4)
  • salmank255 (1)