vfe.pytorch

Video Feature Enhancement with PyTorch

https://github.com/guanxiongsun/vfe.pytorch

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (9.6%) to scientific vocabulary

Last synced: 9 months ago

Repository

Video Feature Enhancement with PyTorch

Basic Info

Host: GitHub
Owner: guanxiongsun
License: apache-2.0
Language: Python
Default Branch: main
Size: 7.09 MB

Statistics

Stars: 28
Watchers: 1
Forks: 3
Open Issues: 3
Releases: 0

Created over 2 years ago · Last pushed over 1 year ago

Metadata Files

Readme License Citation

Video Feature Enhancement with PyTorch

This repo contains the code for the following papers: MAMBA, STPN, TDViT, and EOVOD.

Additionally, we provide archive files of two widely used datasets, ImageNetVID and GOT-10K. The official links to these datasets are no longer accessible or have been deleted. We hope these resources can help future research.

Progress

[x] MAMBA
[x] STPN
[ ] TDViT
[ ] EOVOD

Main Results

| Model | Backbone | AP50 | AP (fast) | AP (med) | AP (slow) | Link |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| FasterRCNN | ResNet-101 | 76.7 | 52.3 | 74.1 | 84.9 | model, reference |
| SELSA | ResNet-101 | 81.5 | -- | -- | -- | model, reference |
| MEGA | ResNet-101 | 82.9 | 62.7 | 81.6 | 89.4 | model, reference |
| MAMBA | ResNet-101 | 83.8 | 65.3 | 83.8 | 89.5 | config, model, paper |
| STPN | Swin-T | 85.2 | 64.1 | 84.1 | 91.4 | config, model, paper |

Installation

The code has been tested with the following environment:

Tested environments:

python 3.8
pytorch 1.10.1
cuda 11.3
mmcv-full 1.3.17

Option 1: Step-by-step installation

```bash
conda create --name vfe -y python=3.8
conda activate vfe

# install PyTorch with cuda support
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge

# install mmcv-full 1.3.17
pip install mmcv-full==1.3.17 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10/index.html

# install other requirements
pip install -r requirements.txt

# install mmpycocotools
pip install mmpycocotools
```

See here for the MMCV versions compatible with different PyTorch and CUDA versions.
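After installation, a quick check that the installed packages match the tested versions can save debugging time later. The following is a minimal sketch, not part of the repo: the helper name `check_versions` is hypothetical, and the distribution names `torch` and `mmcv-full` are assumptions about how the packages are registered in your environment.

```python
from importlib import metadata

# Versions listed under "Tested environments" above. The distribution names
# ("torch", "mmcv-full") are assumptions about how the packages are registered.
TESTED_VERSIONS = {"torch": "1.10.1", "mmcv-full": "1.3.17"}


def check_versions(expected):
    """Return {package: (installed_or_None, expected, matches)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Local build tags such as "+cu113" are ignored when comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, wanted, matches)
    return report


if __name__ == "__main__":
    for name, (installed, wanted, ok) in check_versions(TESTED_VERSIONS).items():
        status = "ok" if ok else "differs"
        print(f"{name}: installed={installed} tested={wanted} -> {status}")
```

A mismatch does not necessarily mean the code will fail; it only shows where your environment differs from the tested one.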

Data preparation

Download ImageNetVID (Video Object Detection) Dataset

The original links to the ImageNetVID dataset are either broken or unavailable. Here, we provide a new link to download the files for the future reference of the community. Please download the ILSVRC2015 DET and ILSVRC2015 VID datasets from this LINK.

After that, we recommend symlinking the dataset root to data/. The directory structure should be as follows:

./data/ILSVRC/
./data/ILSVRC/Annotations/DET
./data/ILSVRC/Annotations/VID
./data/ILSVRC/Data/DET
./data/ILSVRC/Data/VID
./data/ILSVRC/ImageSets

Note: The list (.txt) files under the ImageSets folder can be obtained from here.
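Before converting annotations, it may help to confirm that the layout above is in place. The sketch below only checks that the listed directories exist (symlinks are followed); the helper name `missing_dirs` is hypothetical and not part of the repo.

```python
from pathlib import Path

# Sub-directories listed in the structure above, relative to ./data/ILSVRC.
EXPECTED_SUBDIRS = [
    "Annotations/DET",
    "Annotations/VID",
    "Data/DET",
    "Data/VID",
    "ImageSets",
]


def missing_dirs(root):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs("./data/ILSVRC")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```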

Convert Annotations

We use CocoVID to maintain datasets.

Option 1: Download and uncompress the JSON files generated by us from here.

Option 2: Use the following commands to generate the annotation files:

```bash
# ImageNet DET
python ./tools/convert_datasets/ilsvrc/imagenet2coco_det.py -i ./data/ILSVRC -o ./data/ILSVRC/annotations

# ImageNet VID
python ./tools/convert_datasets/ilsvrc/imagenet2coco_vid.py -i ./data/ILSVRC -o ./data/ILSVRC/annotations
```
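Whichever option you use, a quick look at a resulting annotation file can confirm that it is non-empty. The following sketch assumes the file is a COCO-style JSON whose top-level lists (for CocoVID, typically `videos`, `images`, `annotations`, and `categories`) hold the entries. The helper name `summarize_annotations` is hypothetical, and the output file names are not listed here, so pass the path to whichever file was generated or downloaded.

```python
import json


def summarize_annotations(json_path):
    """Count the entries under each top-level list of a COCO-style JSON file."""
    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    # Keys such as "videos", "images", "annotations" and "categories" are what a
    # CocoVID-style file is assumed to contain; any other list-valued key is
    # counted as well, and non-list values are skipped.
    return {key: len(value) for key, value in data.items() if isinstance(value, list)}
```

Calling `summarize_annotations(path)` returns a dictionary of counts per key. An empty `annotations` list would suggest that the conversion did not find the expected input files.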

Usage

Inference

This section will show how to test existing models on supported datasets. The following testing environments are supported:

single GPU
single node multiple GPU

During testing, different tasks share the same API and we only support samples_per_gpu = 1.

You can use the following commands for testing:

```shell
# single-gpu testing
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

# multi-gpu testing
./tools/dist_test.sh ${CONFIG_FILE} ${GPU_NUM} [--checkpoint ${CHECKPOINT_FILE}] [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
```

Optional arguments:

CHECKPOINT_FILE: Filename of the checkpoint. You do not need to set it for some MOT methods, whose checkpoints are specified in the config instead.
RESULT_FILE: Filename of the output results in pickle format. If not specified, the results will not be saved to a file.
EVAL_METRICS: Items to be evaluated on the results. Allowed values depend on the dataset, e.g., bbox is available for ImageNet VID, track is available for LaSOT, and both bbox and track are suitable for MOT17.
--cfg-options: If specified, the given key-value pairs will be merged into the config file.
--eval-options: If specified, the given key-value pairs will be passed as kwargs to the dataset.evaluate() function. It is only used for evaluation.
--format-only: If specified, the results will be formatted to the official format.
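To illustrate what merging key-value pairs into a config means, here is a simplified sketch using plain dictionaries. It is not the implementation used by the repo or by mmcv: the helpers `parse_pair` and `merge_options` and the sample config values are hypothetical, and serve only to show how a dotted key such as `data.workers_per_gpu` addresses a nested entry.

```python
import ast
import copy


def parse_pair(pair):
    """Split "a.b.c=value" into a dotted key and a Python value."""
    key, _, raw = pair.partition("=")
    try:
        value = ast.literal_eval(raw)  # numbers, lists, True/False, quoted strings
    except (ValueError, SyntaxError):
        value = raw  # fall back to the plain string
    return key, value


def merge_options(cfg, pairs):
    """Merge dotted key=value pairs into a nested dict, returning a new dict."""
    merged = copy.deepcopy(cfg)
    for pair in pairs:
        key, value = parse_pair(pair)
        *parents, leaf = key.split(".")
        node = merged
        for part in parents:
            node = node.setdefault(part, {})  # create missing levels on the way
        node[leaf] = value
    return merged


# Hypothetical config values, used only to show the merge.
cfg = {"data": {"samples_per_gpu": 1, "workers_per_gpu": 2}}
merged = merge_options(cfg, ["data.workers_per_gpu=4", "work_dir=work_dirs/demo"])
print(merged)
```

The override replaces only the addressed entry, so `samples_per_gpu` keeps its value while `workers_per_gpu` changes and `work_dir` is added at the top level.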

Examples of testing VID model

Assume that you have already downloaded the checkpoints to the directory work_dirs/.

Test MAMBA on ImageNet VID, and evaluate the bbox mAP.

```shell
python tools/test.py configs/vid/mamba/mamba_r101_dc5_6x.py \
    --checkpoint work_dirs/mamba_r101_dc5_6x/epoch_6_model.pth \
    --out results.pkl \
    --eval bbox
```

Test MAMBA with 8 GPUs on ImageNet VID, and evaluate the bbox mAP.

```shell
./tools/dist_test.sh configs/vid/mamba/mamba_r101_dc5_6x.py 8 \
    --checkpoint work_dirs/mamba_r101_dc5_6x/epoch_6_model.pth \
    --out results.pkl \
    --eval bbox
```
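Both examples write the results to results.pkl in pickle format. The structure of the saved object is not described here, so the sketch below makes no assumption about it and only reports the top-level type and, for a dictionary, its keys. The helper name `describe_results` is hypothetical. Only unpickle files you produced yourself or otherwise trust, since loading a pickle can execute arbitrary code.

```python
import pickle


def describe_results(path):
    """Load a pickle file and return a short description of its top-level object."""
    with open(path, "rb") as f:
        results = pickle.load(f)
    if isinstance(results, dict):
        # Map each key to the type name of its value.
        return {key: type(value).__name__ for key, value in results.items()}
    if isinstance(results, (list, tuple)):
        return f"{type(results).__name__} with {len(results)} items"
    return type(results).__name__
```

For example, `describe_results("results.pkl")` gives a first overview of the file before you write any analysis code against it.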

Training

Training on a single GPU

```shell
python tools/train.py ${CONFIG_FILE} [optional arguments]
```

During training, log files and checkpoints will be saved to the working directory, which is specified by work_dir in the config file or via CLI argument --work-dir.
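After a run, you will often want the most recent checkpoint from the working directory. The following sketch picks the file with the highest epoch number. The naming pattern is an assumption modelled on the `epoch_6_model.pth` file used in the testing examples, and the helper name `latest_checkpoint` is hypothetical.

```python
import re
from pathlib import Path

# Assumed naming pattern, modelled on the "epoch_6_model.pth" file used in the
# testing examples; adjust it if your checkpoints are named differently.
EPOCH_PATTERN = re.compile(r"epoch_(\d+).*\.pth$")


def latest_checkpoint(work_dir):
    """Return the checkpoint with the highest epoch number, or None if none exist."""
    candidates = []
    for path in Path(work_dir).glob("*.pth"):
        match = EPOCH_PATTERN.match(path.name)
        if match:
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    # Compare epochs numerically so that epoch 10 ranks above epoch 2.
    return max(candidates, key=lambda item: item[0])[1]
```

Epoch numbers are compared as integers because a plain string sort would place `epoch_10` before `epoch_2`.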

Training on multiple GPUs

We provide tools/dist_train.sh to launch training on multiple GPUs. The basic usage is as follows.

```shell
bash ./tools/dist_train.sh \
    ${CONFIG_FILE} \
    ${GPU_NUM} \
    [optional arguments]
```

Examples of training VID model

Train MAMBA on ImageNet VID and ImageNet DET with a single GPU, then evaluate the bbox mAP at the last epoch.

```shell
python tools/train.py configs/vid/mamba/mamba_r101_dc5_6x.py
```

Train MAMBA on ImageNet VID and ImageNet DET with 8 GPUs, then evaluate the bbox mAP at the last epoch.

```shell
./tools/dist_train.sh configs/vid/mamba/mamba_r101_dc5_6x.py 8
```

Reference

The codebase is built on two popular open-source PyTorch repos: mmdetection and mmtracking.

Owner

Name: 孙冠雄
Login: guanxiongsun
Kind: user
Company: Queen's University Belfast

Website: guanxiongsun.github.io
Repositories: 2
Profile: https://github.com/guanxiongsun

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMDetection Contributors"
title: "OpenMMLab Detection Toolbox and Benchmark"
date-released: 2018-08-22
url: "https://github.com/open-mmlab/mmdetection"
license: Apache-2.0

GitHub Events

Total

Issues event: 2
Watch event: 6
Issue comment event: 2
Fork event: 1
Create event: 1

Last Year

Issues event: 2
Watch event: 6
Issue comment event: 2
Fork event: 1
Create event: 1

Dependencies

docker/Dockerfile docker

pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build

docker/serve/Dockerfile docker

pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build

requirements/build.txt pypi

cython ==0.29.36
numpy <1.24.0

requirements/docs.txt pypi

docutils ==0.16.0
recommonmark *
sphinx ==4.0.2
sphinx-copybutton *
sphinx_markdown_tables *
sphinx_rtd_theme ==0.5.2

requirements/mminstall.txt pypi

mmcv-full >=1.3.17

requirements/optional.txt pypi

cityscapesscripts *
imagecorruptions *
scipy *
sklearn *

requirements/readthedocs.txt pypi

mmcv *
torch *
torchvision *

requirements/runtime.txt pypi

matplotlib *
mmpycocotools *
numpy *
six *
terminaltables *

requirements/tests.txt pypi

asynctest * test
codecov * test
flake8 * test
interrogate * test
isort ==4.3.21 test
kwarray * test
onnx ==1.7.0 test
onnxruntime >=1.8.0 test
pytest * test
ubelt * test
xdoctest >=0.10.0 test
yapf * test

requirements.txt pypi

asynctest *
cityscapesscripts *
codecov *
cython ==0.29.36
flake8 *
imagecorruptions *
interrogate *
isort ==4.3.21
kwarray *
matplotlib *
numpy <1.24.0
onnx ==1.7.0
onnxruntime >=1.8.0
pytest *
scikit-learn *
scipy *
six *
terminaltables *
ubelt *
xdoctest >=0.10.0
yapf ==0.40.1

setup.py pypi
