https://github.com/cliangyu/mmaction2

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark

Last synced: 9 months ago

Repository

Basic Info

Host: GitHub
Owner: cliangyu
License: apache-2.0
Language: Python
Default Branch: master
Homepage: https://mmaction2.readthedocs.io
Size: 63.8 MB

Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Releases: 0

Fork of open-mmlab/mmaction2

Created about 4 years ago · Last pushed almost 4 years ago


[![Documentation](https://readthedocs.org/projects/mmaction2/badge/?version=latest)](https://mmaction2.readthedocs.io/en/latest/)
[![actions](https://github.com/open-mmlab/mmaction2/workflows/build/badge.svg)](https://github.com/open-mmlab/mmaction2/actions)
[![codecov](https://codecov.io/gh/open-mmlab/mmaction2/branch/master/graph/badge.svg)](https://codecov.io/gh/open-mmlab/mmaction2)
[![PyPI](https://img.shields.io/pypi/v/mmaction2)](https://pypi.org/project/mmaction2/)
[![LICENSE](https://img.shields.io/github/license/open-mmlab/mmaction2.svg)](https://github.com/open-mmlab/mmaction2/blob/master/LICENSE)
[![Average time to resolve an issue](https://isitmaintained.com/badge/resolution/open-mmlab/mmaction2.svg)](https://github.com/open-mmlab/mmaction2/issues)
[![Percentage of issues still open](https://isitmaintained.com/badge/open/open-mmlab/mmaction2.svg)](https://github.com/open-mmlab/mmaction2/issues)

[Documentation](https://mmaction2.readthedocs.io/en/latest/) |
[Installation](https://mmaction2.readthedocs.io/en/latest/install.html) |
[Model Zoo](https://mmaction2.readthedocs.io/en/latest/modelzoo.html) |
[Update News](https://mmaction2.readthedocs.io/en/latest/changelog.html) |
[Ongoing Projects](https://github.com/open-mmlab/mmaction2/projects) |
[Reporting Issues](https://github.com/open-mmlab/mmaction2/issues/new/choose)



English | [简体中文](/README_zh-CN.md)

## Introduction

MMAction2 is an open-source toolbox for video understanding based on PyTorch.
It is a part of the [OpenMMLab](http://openmmlab.org/) project.

The master branch works with **PyTorch 1.5+**.

[Figure: Action Recognition Results on Kinetics-400]

[Figure: Skeleton-based Action Recognition Results on NTU-RGB+D-120]

[Figure: Skeleton-based Spatio-Temporal Action Detection and Action Recognition Results on Kinetics-400]

[Figure: Spatio-Temporal Action Detection Results on AVA-2.1]


## Major Features

- **Modular design**: We decompose a video understanding framework into different components. One can easily construct a customized video understanding framework by combining different modules.

- **Support four major video understanding tasks**: MMAction2 implements various algorithms for multiple video understanding tasks, including action recognition, action localization, spatio-temporal action detection, and skeleton-based action recognition. We support **27** different algorithms and **20** different datasets for the four major tasks.

- **Well tested and documented**: We provide detailed documentation and API reference, as well as unit tests.
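
The modular design described above is commonly realized with a registry: each component class is registered under a name, and a config dict selects and parameterizes it. The following is a minimal, self-contained sketch of that idea only. It is not MMAction2's or MMCV's actual implementation, and `Registry`, `ToyBackbone`, `ToyHead`, and `ToyRecognizer` are hypothetical names introduced for illustration.

```python
from typing import Any, Dict


class Registry:
    """Maps a string name to a class so components can be built from config dicts."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._modules: Dict[str, type] = {}

    def register(self, cls: type) -> type:
        # Used as a decorator; the class name becomes the lookup key.
        if cls.__name__ in self._modules:
            raise KeyError(f"{cls.__name__} is already registered in {self.name}")
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg: Dict[str, Any]) -> Any:
        # Copy so the caller's config is not mutated, then pop the "type" key.
        args = dict(cfg)
        cls = self._modules[args.pop("type")]
        return cls(**args)


# Hypothetical registries, one per kind of component.
BACKBONES = Registry("backbone")
HEADS = Registry("head")


@BACKBONES.register
class ToyBackbone:
    def __init__(self, depth: int) -> None:
        self.depth = depth


@HEADS.register
class ToyHead:
    def __init__(self, num_classes: int) -> None:
        self.num_classes = num_classes


class ToyRecognizer:
    """Composes a backbone and a head, each chosen by the config."""

    def __init__(self, backbone: Dict[str, Any], head: Dict[str, Any]) -> None:
        self.backbone = BACKBONES.build(backbone)
        self.head = HEADS.build(head)


# Swapping a module only requires changing the "type" entry in the config.
model_cfg = dict(
    backbone=dict(type="ToyBackbone", depth=50),
    head=dict(type="ToyHead", num_classes=400),
)
model = ToyRecognizer(**model_cfg)
print(type(model.backbone).__name__, model.backbone.depth, model.head.num_classes)
```

Keeping the component choice in plain data means a new combination of modules needs a new config rather than new glue code.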

## What's New

- (2022-03-04) We support **Multigrid** on Kinetics400, achieving 76.07% Top-1 accuracy and faster training.
- (2021-11-24) We support **2s-AGCN** on NTU60 XSub, achieving 86.06% Top-1 accuracy on the joint stream and 86.89% Top-1 accuracy on the bone stream.
- (2021-10-29) We provide a demo for skeleton-based and RGB-based spatio-temporal detection and action recognition (demo/demo_video_structuralize.py).
- (2021-10-26) We train and test **ST-GCN** on NTU60 with 3D keypoint annotations, achieving 84.61% Top-1 accuracy (higher than the 81.5% reported in the [paper](https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewPaper/17135)).
- (2021-10-25) We provide a script (tools/data/skeleton/gen_ntu_rgbd_raw.py) to convert the NTU60 and NTU120 3D raw skeleton data to our format.
- (2021-10-25) We provide a [guide](https://github.com/open-mmlab/mmaction2/blob/master/configs/skeleton/posec3d/custom_dataset_training.md) on how to train PoseC3D with custom datasets. [bit-scientist](https://github.com/bit-scientist) authored this PR!
- (2021-10-16) We support **PoseC3D** on UCF101 and HMDB51, achieving 87.0% and 69.3% Top-1 accuracy, respectively, with 2D skeletons only. Pre-extracted 2D skeletons are also available.
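
The results above are reported as Top-1 accuracy: the fraction of test samples whose highest-scoring class matches the ground-truth label. As a minimal sketch of how that metric is computed, the pure-Python function below generalizes it to top-k. The function name `top_k_accuracy` and the toy scores are assumptions for illustration, not the toolbox's own evaluation code.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]], labels: Sequence[int], k: int = 1
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")
    hits = 0
    for sample_scores, label in zip(scores, labels):
        # Class indices sorted by descending score; keep the first k.
        ranked = sorted(
            range(len(sample_scores)), key=lambda c: sample_scores[c], reverse=True
        )
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy example: 4 samples, 3 classes (made-up numbers).
scores = [
    [0.1, 0.7, 0.2],  # highest score: class 1
    [0.5, 0.3, 0.2],  # highest score: class 0
    [0.2, 0.3, 0.5],  # highest score: class 2
    [0.6, 0.3, 0.1],  # highest score: class 0
]
labels = [1, 0, 1, 2]
print(top_k_accuracy(scores, labels, k=1))  # 0.5
print(top_k_accuracy(scores, labels, k=2))  # 0.75
```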

**Release**: v0.24.0 was released on 05/05/2022. Please refer to [changelog.md](docs/changelog.md) for details and release history.

## Installation

MMAction2 depends on [PyTorch](https://pytorch.org/), [MMCV](https://github.com/open-mmlab/mmcv), [MMDetection](https://github.com/open-mmlab/mmdetection) (optional), and [MMPose](https://github.com/open-mmlab/mmpose) (optional).
Below are quick steps for installation.
Please refer to [install.md](docs/install.md) for more detailed instructions.

```shell
conda create -n open-mmlab python=3.8 pytorch=1.10 cudatoolkit=11.3 torchvision -c pytorch -y
conda activate open-mmlab
pip3 install openmim
mim install mmcv-full
mim install mmdet  # optional
mim install mmpose  # optional
git clone https://github.com/open-mmlab/mmaction2.git
cd mmaction2
pip3 install -e .
```
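
After running the steps above, it can be useful to confirm which of the packages are visible to the active Python environment. The snippet below is a small sketch that uses only the standard library, so it runs even when a package is missing. The helper name `check_dependencies` is hypothetical; the module names are the usual import names of the packages installed above.

```python
import importlib.util
from typing import Dict, Iterable


def check_dependencies(module_names: Iterable[str]) -> Dict[str, bool]:
    """Report, for each top-level module name, whether it can be found on the import path."""
    return {name: importlib.util.find_spec(name) is not None for name in module_names}


# Import names of the packages installed by the steps above.
status = check_dependencies(["torch", "mmcv", "mmdet", "mmpose", "mmaction"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

Since `mmdet` and `mmpose` are optional, a "missing" result for them is only a problem if you plan to use the features that depend on them.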

## Get Started

Please see [getting_started.md](docs/getting_started.md) for the basic usage of MMAction2.
There are also tutorials:

- [learn about configs](docs/tutorials/1_config.md)
- [finetuning models](docs/tutorials/2_finetune.md)
- [adding new dataset](docs/tutorials/3_new_dataset.md)
- [designing data pipeline](docs/tutorials/4_data_pipeline.md)
- [adding new modules](docs/tutorials/5_new_modules.md)
- [exporting model to onnx](docs/tutorials/6_export_model.md)
- [customizing runtime settings](docs/tutorials/7_customize_runtime.md)

A Colab tutorial is also provided. You may preview the notebook [here](demo/mmaction2_tutorial.ipynb) or directly [run](https://colab.research.google.com/github/open-mmlab/mmaction2/blob/master/demo/mmaction2_tutorial.ipynb) on Colab.

## Supported Methods


| Task | Methods |
| --- | --- |
| Action Recognition | C3D (CVPR'2014), TSN (ECCV'2016), I3D (CVPR'2017), I3D Non-Local (CVPR'2018), R(2+1)D (CVPR'2018), TRN (ECCV'2018), TSM (ICCV'2019), TSM Non-Local (ICCV'2019), SlowOnly (ICCV'2019), SlowFast (ICCV'2019), CSN (ICCV'2019), TIN (AAAI'2020), TPN (CVPR'2020), X3D (CVPR'2020), OmniSource (ECCV'2020), MultiModality: Audio (ArXiv'2020), TANet (ArXiv'2020), TimeSformer (ICML'2021) |
| Action Localization | SSN (ICCV'2017), BSN (ECCV'2018), BMN (ICCV'2019) |
| Spatio-Temporal Action Detection | ACRN (ECCV'2018), SlowOnly+Fast R-CNN (ICCV'2019), SlowFast+Fast R-CNN (ICCV'2019), LFB (CVPR'2019) |
| Skeleton-based Action Recognition | ST-GCN (AAAI'2018), 2s-AGCN (CVPR'2019), PoseC3D (ArXiv'2021) |


Results and models are available in the *README.md* of each method's config directory.
A summary can be found on the [**model zoo**](https://mmaction2.readthedocs.io/en/latest/recognition_models.html) page.

We will keep up with the latest progress of the community and support more popular algorithms and frameworks.
If you have any feature requests, please feel free to leave a comment in [Issues](https://github.com/open-mmlab/mmaction2/issues/19).

## Supported Datasets


| Task | Datasets |
| --- | --- |
| Action Recognition | HMDB51 (ICCV'2011), UCF101 (CRCV-IR-12-01), ActivityNet (CVPR'2015), Kinetics-[400/600/700] (CVPR'2017), SthV1 (ICCV'2017), SthV2 (ICCV'2017), Diving48 (ECCV'2018), Jester (ICCV'2019), Moments in Time (TPAMI'2019), Multi-Moments in Time (ArXiv'2019), HVU (ECCV'2020), OmniSource (ECCV'2020), FineGYM (CVPR'2020) |
| Action Localization | THUMOS14 (THUMOS Challenge 2014), ActivityNet (CVPR'2015) |
| Spatio-Temporal Action Detection | UCF101-24* (CRCV-IR-12-01), JHMDB* (ICCV'2015), AVA (CVPR'2018) |
| Skeleton-based Action Recognition | PoseC3D-FineGYM (ArXiv'2021), PoseC3D-NTURGB+D (ArXiv'2021), PoseC3D-UCF101 (ArXiv'2021), PoseC3D-HMDB51 (ArXiv'2021) |


Datasets marked with * are not fully supported yet, but related dataset preparation steps are provided. A summary can be found on the [**Supported Datasets**](https://mmaction2.readthedocs.io/en/latest/supported_datasets.html) page.

## Benchmark

To demonstrate the efficacy and efficiency of our framework, we compare MMAction2 with some other popular frameworks and official releases in terms of speed. Details can be found in [benchmark](docs/benchmark.md).

## Data Preparation

Please refer to [data_preparation.md](docs/data_preparation.md) for a general overview of data preparation.
The supported datasets are listed in [supported_datasets.md](docs/supported_datasets.md).

## FAQ

Please refer to [FAQ](docs/faq.md) for frequently asked questions.

## Projects built on MMAction2

Many research works and projects have been built on MMAction2 by users from the community, such as:

- Video Swin Transformer. [\[paper\]](https://arxiv.org/abs/2106.13230)[\[github\]](https://github.com/SwinTransformer/Video-Swin-Transformer)
- Evidential Deep Learning for Open Set Action Recognition, ICCV 2021 **Oral**. [\[paper\]](https://arxiv.org/abs/2107.10161)[\[github\]](https://github.com/Cogito2012/DEAR)
- Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective, ICCV 2021 **Oral**. [\[paper\]](https://arxiv.org/abs/2103.17263)[\[github\]](https://github.com/xvjiarui/VFS)

See [projects.md](docs/projects.md) for all related projects.

## Contributing

We appreciate all contributions to improve MMAction2. Please refer to [CONTRIBUTING.md](https://github.com/open-mmlab/mmcv/blob/master/CONTRIBUTING.md) in MMCV for the contributing guidelines.

## Acknowledgement

MMAction2 is an open-source project with contributions from researchers and engineers at various colleges and companies.
We appreciate all the contributors who implement their methods or add new features, and the users who give valuable feedback.
We hope that the toolbox and benchmark can serve the growing research community by providing a flexible toolkit to reimplement existing methods and develop new models.

## Citation

If you find this project useful in your research, please consider citing:

```BibTeX
@misc{2020mmaction2,
    title={OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark},
    author={MMAction2 Contributors},
    howpublished = {\url{https://github.com/open-mmlab/mmaction2}},
    year={2020}
}
```

## License

This project is released under the [Apache 2.0 license](LICENSE).

## Projects in OpenMMLab

- [MIM](https://github.com/open-mmlab/mim): MIM installs OpenMMLab packages.
- [MMClassification](https://github.com/open-mmlab/mmclassification): OpenMMLab image classification toolbox and benchmark.
- [MMDetection](https://github.com/open-mmlab/mmdetection): OpenMMLab detection toolbox and benchmark.
- [MMDetection3D](https://github.com/open-mmlab/mmdetection3d): OpenMMLab's next-generation platform for general 3D object detection.
- [MMRotate](https://github.com/open-mmlab/mmrotate): OpenMMLab rotated object detection toolbox and benchmark.
- [MMSegmentation](https://github.com/open-mmlab/mmsegmentation): OpenMMLab semantic segmentation toolbox and benchmark.
- [MMOCR](https://github.com/open-mmlab/mmocr): OpenMMLab text detection, recognition, and understanding toolbox.
- [MMPose](https://github.com/open-mmlab/mmpose): OpenMMLab pose estimation toolbox and benchmark.
- [MMHuman3D](https://github.com/open-mmlab/mmhuman3d): OpenMMLab 3D human parametric model toolbox and benchmark.
- [MMSelfSup](https://github.com/open-mmlab/mmselfsup): OpenMMLab self-supervised learning toolbox and benchmark.
- [MMRazor](https://github.com/open-mmlab/mmrazor): OpenMMLab model compression toolbox and benchmark.
- [MMFewShot](https://github.com/open-mmlab/mmfewshot): OpenMMLab fewshot learning toolbox and benchmark.
- [MMAction2](https://github.com/open-mmlab/mmaction2): OpenMMLab's next-generation action understanding toolbox and benchmark.
- [MMTracking](https://github.com/open-mmlab/mmtracking): OpenMMLab video perception toolbox and benchmark.
- [MMFlow](https://github.com/open-mmlab/mmflow): OpenMMLab optical flow toolbox and benchmark.
- [MMEditing](https://github.com/open-mmlab/mmediting): OpenMMLab image and video editing toolbox.
- [MMGeneration](https://github.com/open-mmlab/mmgeneration): OpenMMLab image and video generative models toolbox.
- [MMDeploy](https://github.com/open-mmlab/mmdeploy): OpenMMLab model deployment framework.

Owner

Name: Liangyu Chen
Login: cliangyu
Kind: user
Location: Singapore
Company: Nanyang Technological University

Website: cliangyu.com
Twitter: cliangyu_
Repositories: 1
Profile: https://github.com/cliangyu


Science Score: 10.0%
