MMA
Official code for "Activating Wider Areas in Image Super-Resolution"
Science Score: 54.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.3%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: ArsenalCheng
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 1.64 MB
Statistics
- Stars: 13
- Watchers: 1
- Forks: 3
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
Activating Wider Areas in Image Super-Resolution
Cheng Cheng, Hang Wang, Hongbin Sun
🔥🔥🔥 News
- 2023-07-16: This repo is released.
[arXiv](https://arxiv.org/abs/2403.08330)
Abstract: The prevalence of convolutional neural networks (CNNs) and vision transformers (ViTs) has markedly revolutionized the area of single-image super-resolution (SISR). To further boost SR performance, several techniques, such as residual learning and attention mechanisms, have been introduced, whose gains can be largely attributed to a wider activated area, that is, the input pixels that strongly influence the SR results. However, the possibility of further improving SR performance through another versatile vision backbone remains an unresolved challenge. To address this issue, in this paper, we unleash the representation potential of the modern state space model, i.e., Vision Mamba (Vim), in the context of SISR. Specifically, we present three recipes for better utilization of Vim-based models: 1) Integration into a MetaFormer-style block; 2) Pre-training on a larger and broader dataset; 3) Employing a complementary attention mechanism. Building on these recipes, we introduce MMA, a network capable of finding the most relevant and representative input pixels to reconstruct the corresponding high-resolution images. Comprehensive experimental analysis reveals that MMA not only achieves competitive or even superior performance compared to state-of-the-art SISR methods but also maintains relatively low memory and computational overheads (e.g., a +0.5 dB PSNR gain on the Manga109 dataset with 19.8 M parameters at scale 2). Furthermore, MMA proves its versatility in lightweight SR applications. Through this work, we aim to illuminate the potential applications of state space models in the broader realm of image processing rather than SISR alone, encouraging further exploration in this innovative direction.
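Recipe 1 above places the token mixer inside a MetaFormer-style block: residual token mixing over normalized features, followed by a residual channel MLP. A minimal NumPy sketch of that generic pattern is shown below; the identity token mixer, shapes, and tiny MLP here are illustrative placeholders, not the paper's released architecture (MMA's actual mixer is a Vim layer):

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize over the channel (last) dimension.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def metaformer_block(x, token_mixer, mlp):
    # MetaFormer-style block: residual token mixing, then residual channel MLP.
    x = x + token_mixer(layer_norm(x))
    x = x + mlp(layer_norm(x))
    return x

# Toy usage: identity mixer (placeholder for a Vim layer) on random tokens.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 32))        # 16 tokens, 32 channels
w1 = rng.standard_normal((32, 64)) * 0.02     # hypothetical MLP weights
w2 = rng.standard_normal((64, 32)) * 0.02
mlp = lambda h: np.maximum(h @ w1, 0) @ w2    # two-layer ReLU MLP
out = metaformer_block(tokens, token_mixer=lambda h: h, mlp=mlp)
```

The design point is that the block is agnostic to the mixer: swapping self-attention for a state space layer leaves the surrounding structure unchanged.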


TODO
- Update lightweight results
Dependencies
- Python 3.10
- PyTorch 2.1.1
- NVIDIA GPU + CUDA
Clone the GitHub repo and go to the default directory 'MMA'.
```shell
git clone https://github.com/ArsenalCheng/MMA.git
cd MMA
conda create -n MMA python=3.10
conda activate MMA
pip install -r requirements.txt
python setup.py develop
```
Contents
Datasets
Used training and testing sets can be downloaded as follows:
| Training Set | Testing Set |
| :--- | :---: |
| DIV2K (800 training images, 100 validation images) + Flickr2K (2650 images) [complete training dataset DF2K: Google Drive] | Set5 + Set14 + BSD100 + Urban100 + Manga109 [complete testing dataset: Google Drive] |
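After downloading, a layout like the following under `datasets/` is what BasicSR-style configs typically expect (the exact sub-folder names are an assumption; check the `dataroot` paths in the YAML option files):

```
datasets/
├── DF2K/        # training: DIV2K + Flickr2K pairs
├── Set5/
├── Set14/
├── BSD100/
├── Urban100/
└── Manga109/
```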
Models
| Method | Scale | Dataset | PSNR (dB) | SSIM | Model Zoo |
| :--- | :---: | :---: | :---: | :---: | :---: |
| MMA | 2 | Urban100 | 34.13 | 0.9446 | Google Drive |
| MMA | 3 | Urban100 | 29.93 | 0.8829 | Google Drive |
| MMA | 4 | Urban100 | 27.64 | 0.8272 | Google Drive |
Training
Download the training (DF2K, already processed) and testing (Set5, Set14, BSD100, Urban100, Manga109, already processed) datasets and place them in datasets/. Run the following scripts. The training configuration is in options/train/.
```shell
# MMA-x2, input=64x64, 8 GPUs
torchrun --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/MMA/train_MMA_x2_pretrain.yml --launcher pytorch
# Then set "pretrain_network_g" in options/train/MMA/train_MMA_x2_finetune.yml to the best checkpoint from pre-training.
torchrun --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/MMA/train_MMA_x2_finetune.yml --launcher pytorch

# MMA-x3, input=64x64, 8 GPUs
torchrun --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/MMA/train_MMA_x3_pretrain.yml --launcher pytorch
# Then set "pretrain_network_g" in options/train/MMA/train_MMA_x3_finetune.yml to the best checkpoint from pre-training.
torchrun --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/MMA/train_MMA_x3_finetune.yml --launcher pytorch

# MMA-x4, input=64x64, 8 GPUs
torchrun --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/MMA/train_MMA_x4_pretrain.yml --launcher pytorch
# Then set "pretrain_network_g" in options/train/MMA/train_MMA_x4_finetune.yml to the best checkpoint from pre-training.
torchrun --nproc_per_node=8 --master_port=4321 basicsr/train.py -opt options/train/MMA/train_MMA_x4_finetune.yml --launcher pytorch
```
- The training experiment is in experiments/.
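The fine-tuning step above requires pointing the option file at the pre-training checkpoint. In BasicSR-style YAML options this lives under the `path` section; the checkpoint filename below is a hypothetical example, so substitute the best iteration from your own pre-training run:

```yaml
# options/train/MMA/train_MMA_x2_finetune.yml (excerpt, illustrative)
path:
  # Hypothetical path: replace with your best pre-training checkpoint.
  pretrain_network_g: experiments/train_MMA_x2_pretrain/models/net_g_latest.pth
  strict_load_g: true
```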
Testing
Test images with HR
- Download the pre-trained models and place them in experiments/pretrained_models/. We provide pre-trained models for image SR: MMA (x2, x3, x4).
- Download the testing datasets (Set5, Set14, BSD100, Urban100, Manga109) and place them in datasets/. Run the following scripts. The testing configuration is in options/test/ (e.g., test_MMA_x2.yml).
```shell
# MMA, reproduces results in Table 1 of the main paper
python basicsr/test.py -opt options/test/test_MMA_x2.yml
python basicsr/test.py -opt options/test/test_MMA_x3.yml
python basicsr/test.py -opt options/test/test_MMA_x4.yml
```
- The output is in results/.
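The PSNR numbers reported in the Models table can be sanity-checked against the saved outputs. Below is a minimal sketch of the PSNR computation; note that the official numbers come from BasicSR's metric code, which additionally crops a scale-sized border and typically evaluates on the Y channel, so this plain full-image version is a simplification:

```python
import numpy as np

def psnr(img1, img2, data_range=255.0):
    """Peak signal-to-noise ratio between two images of the same shape."""
    mse = np.mean((img1.astype(np.float64) - img2.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

# Toy check: maximal per-pixel error between uint8 images gives 0 dB.
a = np.zeros((8, 8, 3), dtype=np.uint8)
b = np.full((8, 8, 3), 255, dtype=np.uint8)
print(psnr(a, b))  # → 0.0
```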
Citation
If you find the code helpful in your research or work, please cite the following paper.
```
@misc{cheng2024activating,
  title={Activating Wider Areas in Image Super-Resolution},
  author={Cheng Cheng and Hang Wang and Hongbin Sun},
  year={2024},
  eprint={2403.08330},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
Acknowledgements
This code is built on BasicSR.
Owner
- Name: ChengCheng
- Login: ArsenalCheng
- Kind: user
- Repositories: 1
- Profile: https://github.com/ArsenalCheng
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this project, please cite it as below."
title: "BasicSR: Open Source Image and Video Restoration Toolbox"
version: 1.3.5
date-released: 2022-02-16
url: "https://github.com/XPixelGroup/BasicSR"
license: Apache-2.0
authors:
- family-names: Wang
given-names: Xintao
- family-names: Xie
given-names: Liangbin
- family-names: Yu
given-names: Ke
- family-names: Chan
given-names: Kelvin C.K.
- family-names: Loy
given-names: Chen Change
- family-names: Dong
given-names: Chao
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Dependencies
- Pillow *
- addict *
- future *
- lmdb *
- numpy >=1.17
- opencv-python *
- pyyaml *
- requests *
- scikit-image *
- scipy *
- tb-nightly *
- torch ==2.1.1
- torchvision *
- tqdm *
- yapf *