stb-vmm

STB-VMM: Swin Transformer Based Video Motion Magnification (official repository)

https://github.com/rlado/stb-vmm

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, sciencedirect.com
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.1%) to scientific vocabulary

Keywords

computer-vision deep-learning motion-magnification transformers video-amplification video-magnification
Last synced: 6 months ago

Repository

STB-VMM: Swin Transformer Based Video Motion Magnification (official repository)

Basic Info
  • Host: GitHub
  • Owner: RLado
  • License: MIT
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 44.9 KB
Statistics
  • Stars: 41
  • Watchers: 3
  • Forks: 6
  • Open Issues: 0
  • Releases: 1
Topics
computer-vision deep-learning motion-magnification transformers video-amplification video-magnification
Created almost 4 years ago · Last pushed over 2 years ago
Metadata Files
Readme License Citation

README.md

STB-VMM: Swin Transformer Based Video Motion Magnification

Ricard Lado Roigé, Marco A. Pérez

IQS School of Engineering, Universitat Ramon Llull


This repository contains the official implementation of the STB-VMM: Swin Transformer Based Video Motion Magnification paper in PyTorch.

The goal of video motion magnification techniques is to magnify small motions in a video to reveal previously invisible or unseen movement. Their uses extend from bio-medical applications and deepfake detection to structural modal analysis and predictive maintenance. However, discerning small motion from noise is a complex task, especially when attempting to magnify very subtle, often sub-pixel movement. As a result, motion magnification techniques generally suffer from noisy and blurry outputs. This work presents a new state-of-the-art model based on the Swin Transformer, which offers better tolerance to noisy inputs as well as higher-quality outputs that exhibit less noise, blurriness, and artifacts than the prior art. Improvements in output image quality will enable more precise measurements for any application reliant on magnified video sequences, and may enable further development of video motion magnification techniques in new technical fields.
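To build intuition for what "magnifying small motions" means, here is a toy illustration of the simplest linear (Eulerian) amplification idea: per-pixel deviations from a reference frame are scaled up so sub-perceptual changes become visible. This is a sketch of the general concept only, not the STB-VMM model, which is a learned Swin-Transformer-based method that is far more robust to noise.

```python
import numpy as np

def magnify_motion(frames, alpha):
    """Toy linear (Eulerian) motion magnification.

    Amplifies each frame's per-pixel deviation from the first frame
    by a factor alpha. Real methods (phase-based or learning-based,
    like STB-VMM) are far more sophisticated, but the goal is the
    same: tiny temporal changes become visible.
    """
    frames = np.asarray(frames, dtype=np.float64)
    reference = frames[0]
    # Amplify the (small) difference of each frame to the reference.
    return reference + alpha * (frames - reference)

# A 1-pixel "video": intensity drifts by a sub-perceptual 0.001 per frame.
video = [[[0.5]], [[0.501]], [[0.502]]]
magnified = magnify_motion(video, alpha=20)
# Each 0.001 step is amplified into a clearly visible 0.02 step.
```

Naive amplification like this also magnifies noise by the same factor alpha, which is exactly the shortcoming the paragraph above describes and that STB-VMM addresses.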

Architecture Overview


Install dependencies

```bash
pip install -r requirements.txt
```

FFmpeg is required to run the magnify_video script.


Testing

To test STB-VMM just run the script named magnify_video.sh with the appropriate arguments.

For example:

```bash
bash magnify_video.sh -mag 20 -i ../demo_video/baby.mp4 -m ckpt/ckpt_e49.pth.tar -o STB-VMM_demo_x20_static -s ../demo_video/ -f 30
```

Note: To magnify any video, a pre-trained checkpoint is required.

Note 2: If you are running Windows, an alternative PowerShell script is provided.
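If you are scripting many runs, the invocation above can be assembled programmatically. The flag names (-mag, -i, -m, -o, -s, -f) come from the example in this README; the wrapper function itself is a hypothetical convenience, not part of the repository.

```python
import subprocess

def build_magnify_cmd(mag, video, ckpt, out, save_dir, fps):
    """Build the magnify_video.sh command line from the README's example.

    This helper is an assumption for illustration; only the flag names
    are taken from the repository's documented example invocation.
    """
    return ["bash", "magnify_video.sh",
            "-mag", str(mag), "-i", video, "-m", ckpt,
            "-o", out, "-s", save_dir, "-f", str(fps)]

cmd = build_magnify_cmd(20, "../demo_video/baby.mp4",
                        "ckpt/ckpt_e49.pth.tar",
                        "STB-VMM_demo_x20_static", "../demo_video/", 30)
# subprocess.run(cmd, check=True)  # uncomment to actually run the script
```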


Training

To train the STB-VMM model use train.py with the appropriate arguments. The training dataset can be downloaded from here.

For example:

```bash
python3 train.py -d ../data/train -n 100000 -j 32 -b 5 -lr 0.00001 --epochs 50 #--resume ckpt/ckpt_e01.pth.tar
```


Demo

https://user-images.githubusercontent.com/25719985/194240973-8d93968f-283b-4802-aacb-5e32175e16f3.mp4

More at http://doi.org/10.17632/76s26nrcpv.2


Citation

```bibtex
@article{LADOROIGE2023110493,
  title = {STB-VMM: Swin Transformer based Video Motion Magnification},
  journal = {Knowledge-Based Systems},
  pages = {110493},
  year = {2023},
  issn = {0950-7051},
  doi = {https://doi.org/10.1016/j.knosys.2023.110493},
  url = {https://www.sciencedirect.com/science/article/pii/S0950705123002435},
  author = {Ricard Lado-Roigé and Marco A. Pérez},
  keywords = {Computer vision, Deep learning, Swin Transformer, Motion magnification, Image quality assessment},
  abstract = {The goal of video motion magnification techniques is to magnify small motions in a video to reveal previously invisible or unseen movement. Its uses extend from bio-medical applications and deepfake detection to structural modal analysis and predictive maintenance. However, discerning small motion from noise is a complex task, especially when attempting to magnify very subtle, often sub-pixel movement. As a result, motion magnification techniques generally suffer from noisy and blurry outputs. This work presents a new state-of-the-art model based on the Swin Transformer, which offers better tolerance to noisy inputs as well as higher-quality outputs that exhibit less noise, blurriness, and artifacts than prior-art. Improvements in output image quality will enable more precise measurements for any application reliant on magnified video sequences, and may enable further development of video motion magnification techniques in new technical fields.}
}
```


Acknowledgements

This implementation borrows from the awesome works of:
  • Learning-based Video Motion Magnification
  • Motion Magnification PyTorch
  • Pytorch Image Models
  • SwinIR

Owner

  • Name: Ricard Lado
  • Login: RLado
  • Kind: user
  • Location: Barcelona
  • Company: Universitat Ramon Llull (IQS School of Engineering)

PhD Candidate. I love Linux and all things open source.

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'STB-VMM: Swin Transformer Based Video Motion Magnification'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Ricard
    family-names: Lado-Roigé
    email: ricardlador@iqs.edu
    affiliation: >-
      IQS School of Engineering, Universitat Ramon Llull,
      Via Augusta 390, 08017 Barcelona, Spain
    orcid: 'https://orcid.org/0000-0002-6421-7351'
  - given-names: Marco A.
    family-names: Pérez
    orcid: 'https://orcid.org/0000-0003-4140-1823'
    affiliation: >-
      IQS School of Engineering, Universitat Ramon Llull,
      Via Augusta 390, 08017 Barcelona, Spain
identifiers:
  - type: doi
    value: 10.1016/j.knosys.2023.110493
    description: >-
      STB-VMM: Swin Transformer Based Video Motion
      Magnification
repository-code: 'https://github.com/RLado/STB-VMM'
abstract: >-
  The goal of video motion magnification techniques is to
  magnify small motions in a video to reveal previously
  invisible or unseen movement. Its uses extend from
  bio-medical applications and deep fake detection to
  structural modal analysis and predictive maintenance.
  However, discerning small motion from noise is a complex
  task, especially when attempting to magnify very subtle
  often sub-pixel movement. As a result, motion
  magnification techniques generally suffer from noisy and
  blurry outputs. This work presents a new state-of-the-art
  model based on the Swin Transformer, which offers better
  tolerance to noisy inputs as well as higher-quality
  outputs that exhibit less noise, blurriness and artifacts
  than prior-art. Improvements in output image quality will
  enable more precise measurements for any application
  reliant on magnified video sequences, and may enable
  further development of video motion magnification
  techniques in new technical fields. 
keywords:
  - Computer vision
  - Deep Learning
  - Swin Transformer
  - Motion Magnification
  - Image Quality Assessment
license: MIT
version: v1.0.0
date-released: '2022-07-12'

GitHub Events

Total
  • Watch event: 7
  • Fork event: 1
Last Year
  • Watch event: 7
  • Fork event: 1