zeroi2v

[ECCV 2024] ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video

https://github.com/mcg-nju/zeroi2v

Science Score: 62.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
✓ Institutional organization owner: Organization mcg-nju has institutional domain (mcg.nju.edu.cn)
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (8.7%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: MCG-NJU
License: apache-2.0
Language: Python
Default Branch: main
Size: 1.74 MB

Statistics

Stars: 21
Watchers: 1
Forks: 1
Open Issues: 0
Releases: 0

Created almost 2 years ago · Last pushed almost 2 years ago

Metadata Files

Readme License Citation

ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video (ECCV 2024)

This repo is the official implementation of "ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video" (ECCV 2024).

If you're interested in our work, check out our new video adaptation benchmark!

TODO

[x] Release source code
[x] Pretrained model weights

Introduction

In this paper, we present a zero-cost adaptation paradigm (ZeroI2V) for transferring image transformers to video recognition tasks, i.e., the adapted models incur zero extra cost during inference.

Models

To reparameterize the weights, refer to tools/weight_reparam.py.

Kinetics 400

| Backbone | Pretrain | GFLOPs | Param (M) | New Param (M) | acc@1 | Views | Config | Checkpoint (before reparam) |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ViT-B/16 | CLIP | 422 | 86 | 0 | 83.0 | 8x1x3 | config | checkpoint |
| ViT-L/14 | CLIP | 1946 | 304 | 0 | 86.3 | 8x1x3 | config | checkpoint |
| ViT-L/14 | CLIP | 7783 | 304 | 0 | 87.2 | 32x1x3 | config | checkpoint |

Something Something V2

| Backbone | Pretrain | GFLOPs | Param (M) | New Param (M) | acc@1 | Views | Config | Checkpoint (before reparam) |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| ViT-L/14 | CLIP | 7783 | 304 | 0 | 72.2 | 32x3x1 | config | checkpoint |

Installation

```bash
pip install -U openmim
mim install mmengine 'mmcv>=2.0.0rc1'
mim install "mmdet>=3.0.0rc5"
mim install "mmpose>=1.0.0rc0"
git clone https://github.com/leexinhao/ZeroI2V.git
cd ZeroI2V
pip install -v -e .
# install CLIP
pip install git+https://github.com/openai/CLIP.git
```

Our project is based on MMAction2. Please refer to install.md for more detailed instructions.

Data Preparation

All the datasets (K400, SSv2, UCF101 and HMDB51) used in this work are supported in MMAction2.

Training

The training configs for the different experiments are provided in configs/recognition/. To run an experiment, use the following command, where PATH/TO/CONFIG is the training config you want to use. The default training setting is 8 GPUs with a batch size of 64.

```shell
bash tools/dist_train.sh <PATH/TO/CONFIG> <NUM_GPU>
```

We also provide a training script in run_exp.sh. You can simply change the training config to train different models.

Evaluation

Evaluation runs automatically after training. To evaluate a model only, use the following command:

```shell
bash tools/dist_test.sh <PATH/TO/CONFIG> <CHECKPOINT_FILE> <NUM_GPU> --eval top_k_accuracy
```
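For readers unfamiliar with the metric named in the evaluation command, here is a minimal NumPy sketch of top-k accuracy in its standard sense: the fraction of samples whose true class is among the k highest-scoring predictions. The function name, the toy scores, and the labels are illustrative assumptions; this is not the MMAction2 implementation.

```python
import numpy as np


def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores.

    Illustrative helper (hypothetical, not taken from MMAction2).
    scores: (num_samples, num_classes), labels: (num_samples,)
    """
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    # Indices of the k largest scores per row; order inside the top k is irrelevant.
    top_k = np.argpartition(-scores, k - 1, axis=1)[:, :k]
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())


# Toy class scores for 4 clips and 3 classes (made-up numbers).
scores = np.array([
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
    [0.6, 0.1, 0.3],
])
labels = np.array([1, 1, 2, 2])

print(top_k_accuracy(scores, labels, k=1))  # 0.5
print(top_k_accuracy(scores, labels, k=2))  # 1.0
```

The acc@1 column in the tables above corresponds to the k=1 case.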

Reparameterize the linear adapter

Please refer to tools/weight_reparam.py.
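To illustrate why a linear adapter can add zero inference cost, the following NumPy sketch merges a residual adapter with no activation into the frozen linear layer that follows it. The adapter form (`x + x @ A_down @ A_up`), its placement before the layer, and all names and shapes are assumptions made for illustration; the actual procedure is the one in tools/weight_reparam.py.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, bottleneck = 8, 2  # illustrative sizes, not the model's real dimensions

# Frozen pre-trained linear layer: y = x @ W0 + b0
W0 = rng.standard_normal((dim, dim))
b0 = rng.standard_normal(dim)

# Hypothetical linear adapter with a residual connection and no activation:
# adapter(x) = x + x @ A_down @ A_up
A_down = rng.standard_normal((dim, bottleneck)) * 0.1
A_up = rng.standard_normal((bottleneck, dim)) * 0.1


def forward_with_adapter(x):
    """Training-time path: adapter first, then the frozen layer."""
    h = x + x @ A_down @ A_up
    return h @ W0 + b0


# Because every step is linear, the adapter folds into the layer's weight:
# (x (I + A_down A_up)) W0 + b0 = x W_merged + b0
W_merged = (np.eye(dim) + A_down @ A_up) @ W0


def forward_merged(x):
    """Inference-time path: a single linear layer, same shape as the original."""
    return x @ W_merged + b0


x = rng.standard_normal((4, dim))
assert np.allclose(forward_with_adapter(x), forward_merged(x))
print("merged weight shape:", W_merged.shape)
```

The merged weight has the same shape as the original layer, which is consistent with the "New Param (M)" column reading 0 after reparameterization. The merge would not hold if the adapter contained a non-linearity.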

Test speed and throughput

Please refer to tools/test_speed.py and tools/test_throughput.py.
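As a rough idea of what a throughput measurement involves, the sketch below times a generic callable after a few warm-up runs and reports samples per second. The function `measure_throughput`, the `dummy_step` stand-in, and the default iteration counts are hypothetical; this is not the logic of tools/test_throughput.py.

```python
import time


def measure_throughput(step, batch_size, warmup=3, iters=10):
    """Return samples per second for a callable that processes one batch.

    Hypothetical helper for illustration only.
    """
    # Warm-up runs are excluded from timing so one-off setup costs do not skew the result.
    for _ in range(warmup):
        step()
    start = time.perf_counter()
    for _ in range(iters):
        step()
    elapsed = time.perf_counter() - start
    return batch_size * iters / elapsed


def dummy_step():
    """Stand-in for a model forward pass on one batch."""
    sum(i * i for i in range(10_000))


print(f"{measure_throughput(dummy_step, batch_size=8):.1f} samples/s")
```

When timing a real model on a GPU, the device should be synchronized (for example with `torch.cuda.synchronize()`) before reading the clock, since CUDA kernels are launched asynchronously.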

If you find our work useful in your research, please cite:

@article{li2023zeroi2v,
  title={ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video},
  author={Li, Xinhao and Zhu, Yuhan and Wang, Limin},
  journal={arXiv preprint arXiv:2310.01324},
  year={2023}
}

Owner

Name: Multimedia Computing Group, Nanjing University
Login: MCG-NJU
Kind: organization
Location: Nanjing

Website: mcg.nju.edu.cn
Repositories: 52
Profile: https://github.com/MCG-NJU

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMAction2 Contributors"
title: "OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark"
date-released: 2020-07-21
url: "https://github.com/open-mmlab/mmaction2"
license: Apache-2.0

GitHub Events

Total

Watch event: 6
Fork event: 1

Last Year

Watch event: 6
Fork event: 1

Dependencies

requirements/build.txt pypi

Pillow *
decord >=0.4.1
einops *
matplotlib *
numpy *
opencv-contrib-python *
scipy *
torch >=1.3

requirements/docs.txt pypi

docutils ==0.16.0
einops *
myst-parser *
opencv-python *
scipy *
sphinx ==4.0.2
sphinx_copybutton *
sphinx_markdown_tables *
sphinx_rtd_theme ==0.5.2

requirements/mminstall.txt pypi

mmcv >=2.0.0rc0,<2.1.0
mmengine >=0.5.0,<1.0.0

requirements/optional.txt pypi

PyTurboJPEG *
av >=9.0
future *
fvcore *
imgaug *
librosa *
lmdb *
moviepy *
packaging *
pims *
soundfile *
tensorboard *
wandb *

requirements/readthedocs.txt pypi

mmcv *
titlecase *
torch *
torchvision *

requirements/tests.txt pypi

coverage * test
flake8 * test
interrogate * test
isort ==4.3.21 test
parameterized * test
pytest * test
pytest-runner * test
xdoctest >=0.10.0 test
yapf * test

requirements.txt pypi

setup.py pypi

tools/data/activitynet/environment.yml pypi

decorator ==4.4.2
intel-openmp ==2019.0
joblib ==0.15.1
mkl ==2019.0
numpy ==1.18.4
olefile ==0.46
pandas ==1.0.3
python-dateutil ==2.8.1
pytz ==2020.1
six ==1.14.0
youtube-dl *

tools/data/gym/environment.yml pypi

decorator ==4.4.2
intel-openmp ==2019.0
joblib ==0.15.1
mkl ==2019.0
numpy ==1.18.4
olefile ==0.46
pandas ==1.0.3
python-dateutil ==2.8.1
pytz ==2020.1
six ==1.14.0
youtube-dl *

tools/data/hvu/environment.yml pypi

decorator ==4.4.2
intel-openmp ==2019.0
joblib ==0.15.1
mkl ==2019.0
numpy ==1.18.4
olefile ==0.46
pandas ==1.0.3
python-dateutil ==2.8.1
pytz ==2020.1
six ==1.14.0
youtube-dl *

tools/data/kinetics/environment.yml pypi

decorator ==4.4.2
intel-openmp ==2019.0
joblib ==0.15.1
mkl ==2019.0
numpy ==1.18.4
olefile ==0.46
pandas ==1.0.3
python-dateutil ==2.8.1
pytz ==2020.1
six ==1.14.0
youtube-dl *
