stb-vmm
STB-VMM: Swin Transformer Based Video Motion Magnification (official repository)
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 4 DOI reference(s) in README
- ✓ Academic publication links: links to arxiv.org, sciencedirect.com
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.1%) to scientific vocabulary
Keywords
Basic Info
Statistics
- Stars: 41
- Watchers: 3
- Forks: 6
- Open Issues: 0
- Releases: 1
Topics
Metadata Files
README.md
STB-VMM: Swin Transformer Based Video Motion Magnification
Ricard Lado Roigé, Marco A. Pérez
IQS School of Engineering, Universitat Ramon Llull
This repository contains the official implementation of the STB-VMM: Swin Transformer Based Video Motion Magnification paper in PyTorch.
The goal of video motion magnification techniques is to magnify small motions in a video to reveal previously invisible or unseen movement. Their uses extend from bio-medical applications and deepfake detection to structural modal analysis and predictive maintenance. However, discerning small motion from noise is a complex task, especially when attempting to magnify very subtle, often sub-pixel movement. As a result, motion magnification techniques generally suffer from noisy and blurry outputs. This work presents a new state-of-the-art model based on the Swin Transformer, which offers better tolerance to noisy inputs as well as higher-quality outputs that exhibit less noise, blurriness, and artifacts than prior art. Improvements in output image quality will enable more precise measurements for any application reliant on magnified video sequences, and may enable further development of video motion magnification techniques in new technical fields.
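As a toy illustration of the underlying idea (linear amplification of inter-frame differences, not the STB-VMM model itself, which learns this mapping with a Swin Transformer), magnification can be sketched in a few lines of NumPy:

```python
import numpy as np

# Toy sketch of the motion-magnification principle, NOT the STB-VMM model:
# amplify the change between two frames by a magnification factor alpha.
def magnify(frame_a, frame_b, alpha=20.0):
    """Linearly amplify the difference from frame_a to frame_b."""
    a = frame_a.astype(np.float64)
    b = frame_b.astype(np.float64)
    return a + alpha * (b - a)

# A 1-intensity-unit change becomes a 20-unit change at alpha=20.
a = np.array([[10.0, 10.0]])
b = np.array([[10.0, 11.0]])
print(magnify(a, b))  # [[10. 30.]]
```

Real sub-pixel motion amplifies noise by the same factor alpha, which is exactly why learned models like STB-VMM that separate motion from noise produce cleaner outputs.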

Install dependencies
```bash
pip install -r requirements.txt
```
❗ FFmpeg is required to run the magnify_video script.
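A quick way to confirm the FFmpeg dependency is available before running the script (a generic shell check, not part of the repository):

```shell
# Check that ffmpeg is on the PATH before running magnify_video.sh.
if command -v ffmpeg >/dev/null 2>&1; then
    echo "ffmpeg found: $(command -v ffmpeg)"
else
    echo "ffmpeg not found: install it with your package manager (e.g. apt, brew)"
fi
```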
Testing
To test STB-VMM, run the magnify_video.sh script with the appropriate arguments.
For example:
```bash
bash magnify_video.sh -mag 20 -i ../demo_video/baby.mp4 -m ckpt/ckpt_e49.pth.tar -o STB-VMM_demo_x20_static -s ../demo_video/ -f 30
```
Note: To magnify any video, a pre-trained checkpoint is required.
Note 2: If you are running Windows, an alternative PowerShell script is provided.
Training
To train the STB-VMM model use train.py with the appropriate arguments. The training dataset can be downloaded from here.
For example:
```bash
python3 train.py -d ../data/train -n 100000 -j 32 -b 5 -lr 0.00001 --epochs 50 #--resume ckpt/ckpt_e01.pth.tar
```
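The commented-out --resume flag restarts training from a saved checkpoint. The save/reload pattern behind such flags can be sketched generically as follows (a minimal illustration with hypothetical helper names, not the repository's actual train.py code, which stores PyTorch state dicts):

```python
import os
import pickle
import tempfile

# Illustrative checkpoint helpers (hypothetical, not train.py's code).
def save_checkpoint(state, path):
    """Serialize training state (epoch counter, weights, ...) to disk."""
    with open(path, "wb") as f:
        pickle.dump(state, f)

def load_checkpoint(path):
    """Restore a previously saved training state to resume from."""
    with open(path, "rb") as f:
        return pickle.load(f)

ckpt_path = os.path.join(tempfile.mkdtemp(), "ckpt_e01.pth.tar")
save_checkpoint({"epoch": 1, "weights": [0.1, 0.2]}, ckpt_path)

state = load_checkpoint(ckpt_path)  # what a --resume flag would reload
print(state["epoch"])  # 1
```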
Demo
https://user-images.githubusercontent.com/25719985/194240973-8d93968f-283b-4802-aacb-5e32175e16f3.mp4
More at http://doi.org/10.17632/76s26nrcpv.2
Citation
```bibtex
@article{LADOROIGE2023110493,
  title = {STB-VMM: Swin Transformer based Video Motion Magnification},
  journal = {Knowledge-Based Systems},
  pages = {110493},
  year = {2023},
  issn = {0950-7051},
  doi = {https://doi.org/10.1016/j.knosys.2023.110493},
  url = {https://www.sciencedirect.com/science/article/pii/S0950705123002435},
  author = {Ricard Lado-Roigé and Marco A. Pérez},
  keywords = {Computer vision, Deep learning, Swin Transformer, Motion magnification, Image quality assessment},
  abstract = {The goal of video motion magnification techniques is to magnify small motions in a video to reveal previously invisible or unseen movement. Its uses extend from bio-medical applications and deepfake detection to structural modal analysis and predictive maintenance. However, discerning small motion from noise is a complex task, especially when attempting to magnify very subtle, often sub-pixel movement. As a result, motion magnification techniques generally suffer from noisy and blurry outputs. This work presents a new state-of-the-art model based on the Swin Transformer, which offers better tolerance to noisy inputs as well as higher-quality outputs that exhibit less noise, blurriness, and artifacts than prior-art. Improvements in output image quality will enable more precise measurements for any application reliant on magnified video sequences, and may enable further development of video motion magnification techniques in new technical fields.}
}
```
Acknowledgements
This implementation borrows from the awesome works of:
- Learning-based Video Motion Magnification
- Motion Magnification PyTorch
- Pytorch Image Models
- SwinIR
Owner
- Name: Ricard Lado
- Login: RLado
- Kind: user
- Location: Barcelona
- Company: Universitat Ramon Llull (IQS School of Engineering)
- Website: lado.one
- Repositories: 24
- Profile: https://github.com/RLado
PhD Candidate. I love Linux and all things open source.
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
title: 'STB-VMM: Swin Transformer Based Video Motion Magnification'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Ricard
    family-names: Lado-Roigé
    email: ricardlador@iqs.edu
    affiliation: >-
      IQS School of Engineering, Universitat Ramon Llull,
      Via Augusta 390, 08017 Barcelona, Spain
    orcid: 'https://orcid.org/0000-0002-6421-7351'
  - given-names: Marco A.
    family-names: Pérez
    orcid: 'https://orcid.org/0000-0003-4140-1823'
    affiliation: >-
      IQS School of Engineering, Universitat Ramon Llull,
      Via Augusta 390, 08017 Barcelona, Spain
identifiers:
  - type: doi
    value: 10.1016/j.knosys.2023.110493
    description: >-
      STB-VMM: Swin Transformer Based Video Motion
      Magnification
repository-code: 'https://github.com/RLado/STB-VMM'
abstract: >-
  The goal of video motion magnification techniques is to
  magnify small motions in a video to reveal previously
  invisible or unseen movement. Its uses extend from
  bio-medical applications and deep fake detection to
  structural modal analysis and predictive maintenance.
  However, discerning small motion from noise is a complex
  task, especially when attempting to magnify very subtle
  often sub-pixel movement. As a result, motion
  magnification techniques generally suffer from noisy and
  blurry outputs. This work presents a new state-of-the-art
  model based on the Swin Transformer, which offers better
  tolerance to noisy inputs as well as higher-quality
  outputs that exhibit less noise, blurriness and artifacts
  than prior-art. Improvements in output image quality will
  enable more precise measurements for any application
  reliant on magnified video sequences, and may enable
  further development of video motion magnification
  techniques in new technical fields.
keywords:
  - Computer vision
  - Deep Learning
  - Swin Transformer
  - Motion Magnification
  - Image Quality Assessment
license: MIT
version: v1.0.0
date-released: '2022-07-12'
```
GitHub Events
Total
- Watch event: 7
- Fork event: 1
Last Year
- Watch event: 7
- Fork event: 1