stb-vmm

STB-VMM: Swin Transformer Based Video Motion Magnification (official repository)

https://github.com/rlado/stb-vmm

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, sciencedirect.com
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.1%) to scientific vocabulary

Keywords

computer-vision deep-learning motion-magnification transformers video-amplification video-magnification
Last synced: 6 months ago

Repository

STB-VMM: Swin Transformer Based Video Motion Magnification (official repository)

Basic Info
  • Host: GitHub
  • Owner: RLado
  • License: MIT
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 44.9 KB
Statistics
  • Stars: 41
  • Watchers: 3
  • Forks: 6
  • Open Issues: 0
  • Releases: 1
Topics
computer-vision deep-learning motion-magnification transformers video-amplification video-magnification
Created almost 4 years ago · Last pushed over 2 years ago
Metadata Files
Readme License Citation

README.md

STB-VMM: Swin Transformer Based Video Motion Magnification

Ricard Lado Roigé, Marco A. Pérez

IQS School of Engineering, Universitat Ramon Llull


This repository contains the official implementation of the STB-VMM: Swin Transformer Based Video Motion Magnification paper in PyTorch.

The goal of video motion magnification techniques is to magnify small motions in a video to reveal previously invisible or unseen movement. Their uses extend from bio-medical applications and deepfake detection to structural modal analysis and predictive maintenance. However, discerning small motion from noise is a complex task, especially when attempting to magnify very subtle, often sub-pixel movement. As a result, motion magnification techniques generally suffer from noisy and blurry outputs. This work presents a new state-of-the-art model based on the Swin Transformer, which offers better tolerance to noisy inputs as well as higher-quality outputs that exhibit less noise, blurriness, and artifacts than the prior art. Improvements in output image quality will enable more precise measurements for any application reliant on magnified video sequences, and may enable further development of video motion magnification techniques in new technical fields.
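To build intuition for what "magnifying small motions" means, here is a toy illustration of the simplest linear (Eulerian) amplification idea: per-pixel deviations from a reference frame are scaled up so sub-perceptual changes become visible. This is a sketch of the general concept only, not the STB-VMM model, which is a learned Swin-Transformer-based method that is far more robust to noise.

```python
import numpy as np

def magnify_motion(frames, alpha):
    """Toy linear (Eulerian) motion magnification.

    Amplifies each frame's per-pixel deviation from the first frame
    by a factor alpha. Real methods (phase-based or learning-based,
    like STB-VMM) are far more sophisticated, but the goal is the
    same: tiny temporal changes become visible.
    """
    frames = np.asarray(frames, dtype=np.float64)
    reference = frames[0]
    # Amplify the (small) difference of each frame to the reference.
    return reference + alpha * (frames - reference)

# A 1-pixel "video": intensity drifts by a sub-perceptual 0.001 per frame.
video = [[[0.5]], [[0.501]], [[0.502]]]
magnified = magnify_motion(video, alpha=20)
# Each 0.001 step is amplified into a clearly visible 0.02 step.
```

Naive amplification like this also magnifies noise by the same factor alpha, which is exactly the shortcoming the paragraph above describes and that STB-VMM addresses.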

Architecture Overview


Install dependencies

```bash
pip install -r requirements.txt
```

FFmpeg is required to run the magnify_video script.


Testing

To test STB-VMM just run the script named magnify_video.sh with the appropriate arguments.

For example:

```bash
bash magnify_video.sh -mag 20 -i ../demo_video/baby.mp4 -m ckpt/ckpt_e49.pth.tar -o STB-VMM_demo_x20_static -s ../demo_video/ -f 30
```

Note: To magnify any video, a pre-trained checkpoint is required.

Note 2: If you are running Windows, an alternative PowerShell script is provided.
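If you are scripting many runs, the invocation above can be assembled programmatically. The flag names (-mag, -i, -m, -o, -s, -f) come from the example in this README; the wrapper function itself is a hypothetical convenience, not part of the repository.

```python
import subprocess

def build_magnify_cmd(mag, video, ckpt, out, save_dir, fps):
    """Build the magnify_video.sh command line from the README's example.

    This helper is an assumption for illustration; only the flag names
    are taken from the repository's documented example invocation.
    """
    return ["bash", "magnify_video.sh",
            "-mag", str(mag), "-i", video, "-m", ckpt,
            "-o", out, "-s", save_dir, "-f", str(fps)]

cmd = build_magnify_cmd(20, "../demo_video/baby.mp4",
                        "ckpt/ckpt_e49.pth.tar",
                        "STB-VMM_demo_x20_static", "../demo_video/", 30)
# subprocess.run(cmd, check=True)  # uncomment to actually run the script
```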


Training

To train the STB-VMM model use train.py with the appropriate arguments. The training dataset can be downloaded from here.

For example:

```bash
python3 train.py -d ../data/train -n 100000 -j 32 -b 5 -lr 0.00001 --epochs 50 #--resume ckpt/ckpt_e01.pth.tar
```


Demo

https://user-images.githubusercontent.com/25719985/194240973-8d93968f-283b-4802-aacb-5e32175e16f3.mp4

More at http://doi.org/10.17632/76s26nrcpv.2


Citation

```bibtex
@article{LADOROIGE2023110493,
  title = {STB-VMM: Swin Transformer based Video Motion Magnification},
  journal = {Knowledge-Based Systems},
  pages = {110493},
  year = {2023},
  issn = {0950-7051},
  doi = {https://doi.org/10.1016/j.knosys.2023.110493},
  url = {https://www.sciencedirect.com/science/article/pii/S0950705123002435},
  author = {Ricard Lado-Roigé and Marco A. Pérez},
  keywords = {Computer vision, Deep learning, Swin Transformer, Motion magnification, Image quality assessment},
  abstract = {The goal of video motion magnification techniques is to magnify small motions in a video to reveal previously invisible or unseen movement. Its uses extend from bio-medical applications and deepfake detection to structural modal analysis and predictive maintenance. However, discerning small motion from noise is a complex task, especially when attempting to magnify very subtle, often sub-pixel movement. As a result, motion magnification techniques generally suffer from noisy and blurry outputs. This work presents a new state-of-the-art model based on the Swin Transformer, which offers better tolerance to noisy inputs as well as higher-quality outputs that exhibit less noise, blurriness, and artifacts than prior-art. Improvements in output image quality will enable more precise measurements for any application reliant on magnified video sequences, and may enable further development of video motion magnification techniques in new technical fields.}
}
```


Acknowledgements

This implementation borrows from the awesome works of:
  • Learning-based Video Motion Magnification
  • Motion Magnification PyTorch
  • Pytorch Image Models
  • SwinIR

Owner

  • Name: Ricard Lado
  • Login: RLado
  • Kind: user
  • Location: Barcelona
  • Company: Universitat Ramon Llull (IQS School of Engineering)

PhD Candidate. I love Linux and all things open source.

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'STB-VMM: Swin Transformer Based Video Motion Magnification'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Ricard
    family-names: Lado-Roigé
    email: ricardlador@iqs.edu
    affiliation: >-
      IQS School of Engineering, Universitat Ramon Llull,
      Via Augusta 390, 08017 Barcelona, Spain
    orcid: 'https://orcid.org/0000-0002-6421-7351'
  - given-names: Marco A.
    family-names: Pérez
    orcid: 'https://orcid.org/0000-0003-4140-1823'
    affiliation: >-
      IQS School of Engineering, Universitat Ramon Llull,
      Via Augusta 390, 08017 Barcelona, Spain
identifiers:
  - type: doi
    value: 10.1016/j.knosys.2023.110493
    description: >-
      STB-VMM: Swin Transformer Based Video Motion
      Magnification
repository-code: 'https://github.com/RLado/STB-VMM'
abstract: >-
  The goal of video motion magnification techniques is to
  magnify small motions in a video to reveal previously
  invisible or unseen movement. Its uses extend from
  bio-medical applications and deep fake detection to
  structural modal analysis and predictive maintenance.
  However, discerning small motion from noise is a complex
  task, especially when attempting to magnify very subtle
  often sub-pixel movement. As a result, motion
  magnification techniques generally suffer from noisy and
  blurry outputs. This work presents a new state-of-the-art
  model based on the Swin Transformer, which offers better
  tolerance to noisy inputs as well as higher-quality
  outputs that exhibit less noise, blurriness and artifacts
  than prior-art. Improvements in output image quality will
  enable more precise measurements for any application
  reliant on magnified video sequences, and may enable
  further development of video motion magnification
  techniques in new technical fields. 
keywords:
  - Computer vision
  - Deep Learning
  - Swin Transformer
  - Motion Magnification
  - Image Quality Assessment
license: MIT
version: v1.0.0
date-released: '2022-07-12'

GitHub Events

Total
  • Watch event: 7
  • Fork event: 1
Last Year
  • Watch event: 7
  • Fork event: 1