MMA
Official code for "Activating Wider Areas in Image Super-Resolution"
Science Score: 54.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.3%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: ArsenalCheng
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 1.64 MB
Statistics
- Stars: 13
- Watchers: 1
- Forks: 3
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
Activating Wider Areas in Image Super-Resolution
Cheng Cheng, Hang Wang, Hongbin Sun
🔥🔥🔥 News
- 2023-07-16: This repo is released.
[arXiv](https://arxiv.org/abs/2403.08330)
Abstract: The prevalence of convolutional neural networks (CNNs) and vision transformers (ViTs) has markedly revolutionized the area of single-image super-resolution (SISR). To further boost SR performance, several techniques, such as residual learning and attention mechanisms, have been introduced, whose gains can be largely attributed to a wider activated area, that is, the input pixels that strongly influence the SR results. However, the possibility of further improving SR performance through another versatile vision backbone remains an unresolved challenge. To address this issue, in this paper, we unleash the representation potential of the modern state space model, i.e., Vision Mamba (Vim), in the context of SISR. Specifically, we present three recipes for better utilization of Vim-based models: 1) Integration into a MetaFormer-style block; 2) Pre-training on a larger and broader dataset; 3) Employing a complementary attention mechanism. Building on these recipes, we introduce MMA, a network capable of finding the most relevant and representative input pixels to reconstruct the corresponding high-resolution images. Comprehensive experimental analysis reveals that MMA not only achieves competitive or even superior performance compared to state-of-the-art SISR methods but also maintains relatively low memory and computational overheads (e.g., a +0.5 dB PSNR gain on the Manga109 dataset with 19.8 M parameters at scale 2). Furthermore, MMA proves its versatility in lightweight SR applications. Through this work, we aim to illuminate the potential applications of state space models in the broader realm of image processing rather than SISR alone, encouraging further exploration in this innovative direction.
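Recipe 1 above places the token mixer inside a MetaFormer-style block: residual token mixing over normalized features, followed by a residual channel MLP. A minimal NumPy sketch of that generic pattern is shown below; the identity token mixer, shapes, and tiny MLP here are illustrative placeholders, not the paper's released architecture (MMA's actual mixer is a Vim layer):

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize over the channel (last) dimension.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def metaformer_block(x, token_mixer, mlp):
    # MetaFormer-style block: residual token mixing, then residual channel MLP.
    x = x + token_mixer(layer_norm(x))
    x = x + mlp(layer_norm(x))
    return x

# Toy usage: identity mixer (placeholder for a Vim layer) on random tokens.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 32))        # 16 tokens, 32 channels
w1 = rng.standard_normal((32, 64)) * 0.02     # hypothetical MLP weights
w2 = rng.standard_normal((64, 32)) * 0.02
mlp = lambda h: np.maximum(h @ w1, 0) @ w2    # two-layer ReLU MLP
out = metaformer_block(tokens, token_mixer=lambda h: h, mlp=mlp)
```

The design point is that the block is agnostic to the mixer: swapping self-attention for a state space layer leaves the surrounding structure unchanged.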


TODO
- Update lightweight results
Dependencies
- Python 3.10
- PyTorch 2.1.1
- NVIDIA GPU + CUDA
Clone the GitHub repo and go to the default directory 'MMA'.
```shell
git clone https://github.com/ArsenalCheng/MMA.git
cd MMA
conda create -n MMA python=3.10
conda activate MMA
pip install -r requirements.txt
python setup.py develop
```
Contents
Datasets
Used training and testing sets can be downloaded as follows:
| Training Set | Testing Set |
| :--- | :---: |
| DIV2K (800 training images, 100 validation images) + Flickr2K (2650 images) [complete training dataset DF2K: Google Drive] | Set5 + Set14 + BSD100 + Urban100 + Manga109 [complete testing dataset: Google Drive] |
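After downloading, a layout like the following under `datasets/` is what BasicSR-style configs typically expect (the exact sub-folder names are an assumption; check the `dataroot` paths in the YAML option files):

```
datasets/
├── DF2K/        # training: DIV2K + Flickr2K pairs
├── Set5/
├── Set14/
├── BSD100/
├── Urban100/
└── Manga109/
```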
Models
| Method | Scale | Dataset | PSNR (dB) | SSIM | Model Zoo |
| :--- | :---: | :---: | :---: | :---: | :---: |
| MMA | 2 | Urban100 | 34.13 | 0.9446 | Google Drive |
| MMA | 3 | Urban100 | 29.93 | 0.8829 | Google Drive |
| MMA | 4 | Urban100 | 27.64 | 0.8272 | Google Drive |
Training
Download the training (DF2K, already processed) and testing (Set5, Set14, BSD100, Urban100, Manga109, already processed) datasets and place them in datasets/. Run the following scripts. The training configuration is in options/train/.
```shell
# MMA-x2, input=64x64, 8 GPUs
torchrun --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/MMA/train_MMA_x2_pretrain.yml --launcher pytorch
# Then set "pretrain_network_g" in options/train/MMA/train_MMA_x2_finetune.yml to the best checkpoint from pre-training.
torchrun --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/MMA/train_MMA_x2_finetune.yml --launcher pytorch

# MMA-x3, input=64x64, 8 GPUs
torchrun --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/MMA/train_MMA_x3_pretrain.yml --launcher pytorch
# Then set "pretrain_network_g" in options/train/MMA/train_MMA_x3_finetune.yml to the best checkpoint from pre-training.
torchrun --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/MMA/train_MMA_x3_finetune.yml --launcher pytorch

# MMA-x4, input=64x64, 8 GPUs
torchrun --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/MMA/train_MMA_x4_pretrain.yml --launcher pytorch
# Then set "pretrain_network_g" in options/train/MMA/train_MMA_x4_finetune.yml to the best checkpoint from pre-training.
torchrun --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/MMA/train_MMA_x4_finetune.yml --launcher pytorch
```
- The training experiment is in experiments/.
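The fine-tuning step above requires pointing the option file at the pre-training checkpoint. In BasicSR-style YAML options this lives under the `path` section; the checkpoint filename below is a hypothetical example, so substitute the best iteration from your own pre-training run:

```yaml
# options/train/MMA/train_MMA_x2_finetune.yml (excerpt, illustrative)
path:
  # Hypothetical path: replace with your best pre-training checkpoint.
  pretrain_network_g: experiments/train_MMA_x2_pretrain/models/net_g_latest.pth
  strict_load_g: true
```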
Testing
Test images with HR
- Download the pre-trained models and place them in experiments/pretrained_models/. We provide pre-trained models for image SR: MMA (x2, x3, x4).
- Download the testing datasets (Set5, Set14, BSD100, Urban100, Manga109) and place them in datasets/. Run the following scripts. The testing configuration is in options/test/ (e.g., test_MMA_x2.yml).
```shell
# MMA, reproduces results in Table 1 of the main paper
python basicsr/test.py -opt options/test/test_MMA_x2.yml
python basicsr/test.py -opt options/test/test_MMA_x3.yml
python basicsr/test.py -opt options/test/test_MMA_x4.yml
```
- The output is in results/.
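The PSNR numbers reported in the Models table can be sanity-checked against the saved outputs. Below is a minimal sketch of the PSNR computation; note that the official numbers come from BasicSR's metric code, which additionally crops a scale-sized border and typically evaluates on the Y channel, so this plain full-image version is a simplification:

```python
import numpy as np

def psnr(img1, img2, data_range=255.0):
    """Peak signal-to-noise ratio between two images of the same shape."""
    mse = np.mean((img1.astype(np.float64) - img2.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

# Toy check: maximal per-pixel error between uint8 images gives 0 dB.
a = np.zeros((8, 8, 3), dtype=np.uint8)
b = np.full((8, 8, 3), 255, dtype=np.uint8)
print(psnr(a, b))  # → 0.0
```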
Citation
If you find the code helpful in your research or work, please cite the following paper.
```
@misc{cheng2024activating,
  title={Activating Wider Areas in Image Super-Resolution},
  author={Cheng Cheng and Hang Wang and Hongbin Sun},
  year={2024},
  eprint={2403.08330},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
Acknowledgements
This code is built on BasicSR.
Owner
- Name: ChengCheng
- Login: ArsenalCheng
- Kind: user
- Repositories: 1
- Profile: https://github.com/ArsenalCheng
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this project, please cite it as below."
title: "BasicSR: Open Source Image and Video Restoration Toolbox"
version: 1.3.5
date-released: 2022-02-16
url: "https://github.com/XPixelGroup/BasicSR"
license: Apache-2.0
authors:
- family-names: Wang
given-names: Xintao
- family-names: Xie
given-names: Liangbin
- family-names: Yu
given-names: Ke
- family-names: Chan
given-names: Kelvin C.K.
- family-names: Loy
given-names: Chen Change
- family-names: Dong
given-names: Chao
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Dependencies
- Pillow *
- addict *
- future *
- lmdb *
- numpy >=1.17
- opencv-python *
- pyyaml *
- requests *
- scikit-image *
- scipy *
- tb-nightly *
- torch ==2.1.1
- torchvision *
- tqdm *
- yapf *