https://github.com/cyrilzakka/mae3d

Masked Auto-Encoding for Large Scale Pretraining of Video Data

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.8%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: cyrilzakka
  • License: other
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 1.01 MB
Statistics
  • Stars: 18
  • Watchers: 3
  • Forks: 0
  • Open Issues: 2
  • Releases: 0
Created about 4 years ago · Last pushed about 3 years ago
Metadata Files
  • Readme
  • License

README.md

Masked Autoencoders As Spatiotemporal Learners

3D MAE Concept

This is an unofficial PyTorch/GPU implementation of Masked Autoencoders As Spatiotemporal Learners:

  @Article{STMaskedAutoencoders2022,
    author  = {Feichtenhofer, Christoph and Fan, Haoqi and Li, Yanghao and He, Kaiming},
    journal = {arXiv:2205.09113},
    title   = {Masked Autoencoders As Spatiotemporal Learners},
    year    = {2022},
  }

Getting Started

This repository runs on PyTorch 1.11 and above. To get started, clone the repository and install the required dependencies:

  $ git clone https://github.com/cyrilzakka/MAE3D
  $ cd MAE3D
  $ pip install -r requirements.txt

Optionally, install wandb for training visualization:

  $ pip install wandb

Pretraining

Dataset Preparation

To perform large-scale pre-training, organize your data as follows:

  dataset
  │
  ├───ledger.csv
  └───train
      ├───video_1
      │   ├───img_00001.jpg
      │   .
      │   └───img_03117.jpg
      ├───video_2
      │   ├───img_00001.jpg
      │   .
      │   └───img_02744.jpg
      └───video_3
          ├───img_00001.jpg
          .
          └───img_00323.jpg

The accompanying ledger.csv contains one row per video, listing the video_name, start_frame, end_frame and class/pseudoclass:

  video_1 1 3117 1
  video_2 1 2744 0
  video_3 1 323 0
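The ledger can be generated from the directory layout above rather than written by hand. The following is a minimal sketch, not part of the repository: the helper names `build_ledger_rows` and `write_ledger` are hypothetical, rows are assumed to be space-separated as in the example, frames are assumed to be named `img_<number>.jpg`, and every video receives the same placeholder class (`default_label`), which you would replace with your own labels or pseudoclasses.

```python
import re
from pathlib import Path

# Assumed frame naming scheme, taken from the example layout: img_00001.jpg
FRAME_PATTERN = re.compile(r"img_(\d+)\.jpg$")


def build_ledger_rows(train_dir, default_label=0):
    """Return one 'video_name start_frame end_frame label' row per video folder."""
    rows = []
    video_dirs = sorted(p for p in Path(train_dir).iterdir() if p.is_dir())
    for video_dir in video_dirs:
        # Collect the numeric part of every file that looks like a frame.
        frame_ids = sorted(
            int(m.group(1))
            for f in video_dir.iterdir()
            if (m := FRAME_PATTERN.match(f.name))
        )
        if not frame_ids:
            continue  # skip folders that contain no frames
        # default_label is a placeholder class, not a real annotation.
        rows.append(
            f"{video_dir.name} {frame_ids[0]} {frame_ids[-1]} {default_label}"
        )
    return rows


def write_ledger(dataset_root, default_label=0):
    """Write dataset_root/ledger.csv from dataset_root/train and return the rows."""
    root = Path(dataset_root)
    rows = build_ledger_rows(root / "train", default_label)
    (root / "ledger.csv").write_text("\n".join(rows) + "\n")
    return rows
```

Calling `write_ledger("dataset")` would then write `dataset/ledger.csv`. The sketch only records the lowest and highest frame number it finds, so it does not detect missing frames in between.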

Dataloader

Fast and efficient loading of video data for training is done using the VideoFrameDataset library:

  dataset_train = VideoFrameDataset(root_path: str, annotationfile_path: str, num_segments: int, frames_per_segment: int, transform: None, test_mode: bool)

Each video is split into num_segments equal segments. Within each segment a random start index is sampled, and frames_per_segment consecutive frames are loaded starting from that index.
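To make the sampling scheme concrete, here is a small standalone sketch of the behaviour described above. It is an illustration only, not the library's actual code: the function name `sample_frame_indices` and its `rng` parameter are hypothetical, and leftover frames at the end of a video (when the length is not divisible by `num_segments`) are simply ignored.

```python
import random


def sample_frame_indices(start_frame, end_frame, num_segments,
                         frames_per_segment, rng=random):
    """Pick frames_per_segment consecutive frame numbers from each segment."""
    num_frames = end_frame - start_frame + 1
    segment_length = num_frames // num_segments
    if segment_length < frames_per_segment:
        raise ValueError("video is too short for the requested sampling")

    indices = []
    for segment in range(num_segments):
        segment_start = start_frame + segment * segment_length
        # Largest offset that still leaves room for a full run of frames.
        max_offset = segment_length - frames_per_segment
        first = segment_start + rng.randint(0, max_offset)
        indices.append(list(range(first, first + frames_per_segment)))
    return indices


# Example: video_1 from the ledger (frames 1..3117), 4 segments of 16 frames.
print(sample_frame_indices(1, 3117, num_segments=4, frames_per_segment=16))
```

Passing a seeded `random.Random` instance as `rng` makes the sampled indices reproducible.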

Training

To train with the default --model vit_large_patch16 for --epochs 400 and a --batch_size 8 at an --input_size 224, run:

  $ CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 main_pretrain.py

More training options and parameters can be viewed and modified in main_pretrain.py.
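As a quick sanity check on these defaults, the arithmetic below estimates the number of samples processed per optimizer step and the number of spatial patches per frame. It is a sketch under two assumptions that the README does not state: that --batch_size is a per-GPU value (check main_pretrain.py to confirm), and that the "patch16" in the model name denotes 16×16-pixel spatial patches. The function names are hypothetical, and the temporal patch size is not covered.

```python
def effective_batch_size(batch_size_per_gpu, num_gpus):
    """Samples per step, assuming --batch_size is a per-GPU value."""
    return batch_size_per_gpu * num_gpus


def patches_per_frame(input_size, patch_size):
    """Number of non-overlapping square patches in one square frame."""
    if input_size % patch_size != 0:
        raise ValueError("input_size must be divisible by patch_size")
    side = input_size // patch_size
    return side * side


print(effective_batch_size(8, 4))   # 8 per GPU on 4 GPUs -> 32
print(patches_per_frame(224, 16))   # (224 / 16) ** 2 -> 196
```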

Visualization

A visualization of MAE-3D can be found in the included interactive notebook.

License

This project is under the CC-BY-NC 4.0 license. See LICENSE for details.

Owner

  • Name: Cyril Zakka, MD
  • Login: cyrilzakka
  • Kind: user
  • Location: Palo Alto, California
  • Company: @hiesingerlab

Medical Doctor, Postdoctoral Fellow at Stanford Medicine.

GitHub Events

Total
  • Watch event: 3
Last Year
  • Watch event: 3

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 4
  • Total pull requests: 1
  • Average time to close issues: 5 days
  • Average time to close pull requests: N/A
  • Total issue authors: 2
  • Total pull request authors: 1
  • Average comments per issue: 2.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • likemby (3)
  • klinic (1)
Pull Request Authors
  • akashc1 (1)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

requirements.txt pypi
  • einops *
  • pandas *
  • timm *
  • wandb *