plainmamba
[BMVC 2024] PlainMamba: Improving Non-hierarchical Mamba in Visual Recognition
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.3%) to scientific vocabulary
Repository
Basic Info
Statistics
- Stars: 75
- Watchers: 4
- Forks: 8
- Open Issues: 3
- Releases: 0
Metadata Files
README.md
PlainMamba: Improving Non-hierarchical Mamba in Visual Recognition
This repository contains the official PyTorch implementation of our paper, PlainMamba: Improving Non-hierarchical Mamba in Visual Recognition.
Usage
Environment Setup
Our classification codebase is built upon the MMClassification toolkit (old version).
```shell
conda create -n plainmamba python=3.10 -y
source activate plainmamba
pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 -f https://download.pytorch.org/whl/torch_stable.html --no-cache
conda install -c conda-forge cudatoolkit-dev # Optional, only needed when facing cuda errors
pip install -U openmim
mim install mmcv-full
pip install mamba-ssm
pip install mlflow fvcore timm lmdb

# Run each of the following steps from the same starting directory.
cd plain_mamba
pip install -e .

cd downstream/mmdetection # set up object detection and instance segmentation
pip install -e .

cd downstream/mmsegmentation # set up semantic segmentation
pip install -e .
```
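After installation, it can help to confirm that the packages from the steps above are importable before launching a job. The following is a minimal sketch, not part of the repository: the helper name `find_missing` and the module list are assumptions, with import names inferred from the `pip`/`mim` commands (for example, `mamba-ssm` is imported as `mamba_ssm` and `mmcv-full` as `mmcv`).

```python
"""Report which packages from the setup steps cannot be imported (sketch)."""
from importlib.util import find_spec

# Assumed import names for the packages installed in the setup commands.
REQUIRED_MODULES = [
    "torch", "torchvision", "mmcv", "mamba_ssm",
    "mlflow", "fvcore", "timm", "lmdb",
]


def find_missing(modules):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that each module can be located; it does not verify versions or that the CUDA build matches your driver.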
Data Preparation
For the ImageNet experiments, we convert the dataset to LMDB format for efficient data loading. You can convert the dataset by running:
```shell
python tools/dataset_tools/create_lmdb_dataset.py \
    --train-img-dir data/imagenet/train \
    --train-out data/imagenet/imagenet_lmdb/train \
    --val-img-dir data/imagenet/val \
    --val-out data/imagenet/imagenet_lmdb/val
```
You will also need to download the ImageNet metadata from Link.
For downstream tasks, please follow the MMDetection and MMSegmentation instructions to set up your datasets.
After setup, the dataset file structure should look like this:
```
PlainMamba
|-- ...
|-- data
|   |__ imagenet
|       |-- imagenet_lmdb
|       |   |-- train
|       |   |   |-- data.mdb
|       |   |   |__ lock.mdb
|       |   |-- val
|       |   |   |-- data.mdb
|       |   |   |__ lock.mdb
|       |__ meta
|           |__ ...
|__ downstream
    |-- mmsegmentation
    |   |-- ...
    |   |__ data
    |       |__ ade
    |           |__ ADEChallengeData2016
    |               |-- annotations
    |               |   |__ ...
    |               |-- images
    |               |   |__ ...
    |               |-- objectInfo150.txt
    |               |__ sceneCategories.txt
    |
    |__ mmdetection
        |-- ...
        |__ data
            |__ coco
                |-- train2017
                |   |__ ...
                |-- val2017
                |   |__ ...
                |__ annotations
                    |-- instances_train2017.json
                    |-- instances_val2017.json
                    |__ ...
```
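The layout above can be checked with a short script before starting a long training run. This is a minimal sketch under stated assumptions: `missing_paths` and `EXPECTED_PATHS` are hypothetical names, the ImageNet paths follow the output directories of the conversion command, and the COCO annotation files are assumed to use the standard names `instances_train2017.json` and `instances_val2017.json`.

```python
from pathlib import Path

# Paths relative to the repository root, following the tree above.
EXPECTED_PATHS = [
    "data/imagenet/imagenet_lmdb/train/data.mdb",
    "data/imagenet/imagenet_lmdb/val/data.mdb",
    "data/imagenet/meta",
    "downstream/mmsegmentation/data/ade/ADEChallengeData2016/annotations",
    "downstream/mmsegmentation/data/ade/ADEChallengeData2016/images",
    "downstream/mmdetection/data/coco/train2017",
    "downstream/mmdetection/data/coco/val2017",
    "downstream/mmdetection/data/coco/annotations/instances_train2017.json",
    "downstream/mmdetection/data/coco/annotations/instances_val2017.json",
]


def missing_paths(root, expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    for path in missing_paths("."):
        print("missing:", path)
```

Run it from the repository root; an empty output means every listed file or directory was found. It checks existence only, not the contents of the datasets.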
ImageNet Classification
Training PlainMamba
```shell
# Example: Training PlainMamba-L1 model
zsh tool/dist_train.sh plain_mamba_configs/plain_mamba_l1_in1k_300e.py 8
```
Testing PlainMamba
```shell
# Example: Testing PlainMamba-L1 model
zsh tool/dist_test.sh plain_mamba_configs/plain_mamba_l1_in1k_300e.py work_dirs/plain_mamba_l1_in1k_300e/epoch_300.pth 8 --metrics accuracy
```
COCO Object Detection and Instance Segmentation
Run `cd downstream/mmdetection` first.
Training Mask R-CNN using PlainMamba-Adapter
```shell
# Example: Training PlainMamba-Adapter-L1 Mask R-CNN with 1x schedule
zsh tools/dist_train.sh plain_mamba_det_configs/maskrcnn/l1_maskrcnn_1x.py 8
```
Training RetinaNet using PlainMamba-Adapter
```shell
# Example: Training PlainMamba-Adapter-L1 RetinaNet with 1x schedule
zsh tools/dist_train.sh plain_mamba_det_configs/retinanet/l1_retinanet_1x.py 8
```
Testing Mask R-CNN
```shell
# Example: Testing PlainMamba-Adapter-L1 Mask R-CNN 1x model
zsh tools/dist_test.sh plain_mamba_det_configs/maskrcnn/l1_maskrcnn_1x.py work_dirs/l1_maskrcnn_1x/epoch_12.pth 8 --eval bbox segm
```
Testing RetinaNet
```shell
# Example: Testing PlainMamba-Adapter-L1 RetinaNet 1x model
zsh tools/dist_test.sh plain_mamba_det_configs/retinanet/l1_retinanet_1x.py work_dirs/l1_retinanet_1x/epoch_12.pth 8 --eval bbox
```
ADE20K Semantic Segmentation
Run `cd downstream/mmsegmentation` first.
Training UperNet using PlainMamba
```shell
# Example: Training PlainMamba-L1 based UperNet
zsh tools/dist_train.sh plain_mamba_seg_configs/l1_upernet.py 8
```
Testing UperNet
```shell
# Example: Testing PlainMamba-L1 based UperNet
zsh tools/dist_test.sh plain_mamba_seg_configs/l1_upernet.py work_dirs/l1_upernet/iter_160000.pth 8 --eval mIoU
```
Benchmark results
ImageNet-1k Classification
| Model | #Params (M) | Top-1 Acc | Top-5 Acc | Config | Checkpoint |
|:-------------:|:-----------:|:---------:|:---------:|:------:|:----------:|
| PlainMamba-L1 | 7.3 | 77.9 | 94.0 | Link | Link |
| PlainMamba-L2 | 25.7 | 81.6 | 95.6 | Link | Link |
| PlainMamba-L3 | 50.5 | 82.3 | 95.9 | Link | Link |
COCO Mask R-CNN 1x Schedule
| Model | #Params (M) | AP Box | AP Mask | Config | Checkpoint |
|:---------------------:|:-----------:|:------:|:-------:|:------:|:----------:|
| PlainMamba-Adapter-L1 | 31 | 44.1 | 39.1 | Link | Link |
| PlainMamba-Adapter-L2 | 53 | 46.0 | 40.6 | Link | Link |
| PlainMamba-Adapter-L3 | 79 | 46.8 | 41.2 | Link | Link |
COCO RetinaNet 1x Schedule
| Model | #Params (M) | AP Box | Config | Checkpoint |
|:---------------------:|:-----------:|:------:|:------:|:----------:|
| PlainMamba-Adapter-L1 | 19 | 41.7 | Link | Link |
| PlainMamba-Adapter-L2 | 40 | 43.9 | Link | Link |
| PlainMamba-Adapter-L3 | 67 | 44.8 | Link | Link |
ADE20K UperNet
| Model | #Params (M) | mIoU | Config | Checkpoint |
|:-------------:|:-----------:|:----:|:------:|:----------:|
| PlainMamba-L1 | 35 | 44.1 | Link | Link |
| PlainMamba-L2 | 55 | 46.8 | Link | Link |
| PlainMamba-L3 | 81 | 49.1 | Link | Link |
Citation
@inproceedings{Yang_2024_BMVC,
author = {Chenhongyi Yang and Zehui Chen and Miguel Espinosa and Linus Ericsson and Zhenyu Wang and Jiaming Liu and Elliot J. Crowley},
title = {PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition},
booktitle = {35th British Machine Vision Conference 2024, {BMVC} 2024, Glasgow, UK, November 25-28, 2024},
publisher = {BMVA},
year = {2024}
}
Owner
- Name: Chenhongyi Yang
- Login: ChenhongyiYang
- Kind: user
- Location: Zurich, Switzerland
- Company: Meta
- Website: chenhongyiyang.com
- Repositories: 4
- Profile: https://github.com/ChenhongyiYang
Research Scientist at Meta Reality Labs
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "OpenMMLab's Image Classification Toolbox and Benchmark"
authors:
  - name: "MMClassification Contributors"
version: 0.15.0
date-released: 2020-07-09
repository-code: "https://github.com/open-mmlab/mmclassification"
license: Apache-2.0
GitHub Events
Total
- Watch event: 14
- Push event: 1
Last Year
- Watch event: 14
- Push event: 1
Dependencies
- albumentations >=0.3.2
- cython *
- numpy *
- docutils ==0.16.0
- myst-parser *
- sphinx ==4.0.2
- sphinx-copybutton *
- sphinx_markdown_tables *
- sphinx_rtd_theme ==0.5.2
- mmcv-full >=1.3.17
- cityscapesscripts *
- imagecorruptions *
- scipy *
- sklearn *
- timm *
- mmcv *
- torch *
- torchvision *
- matplotlib *
- numpy *
- pycocotools *
- six *
- terminaltables *
- asynctest * test
- codecov * test
- flake8 * test
- interrogate * test
- isort ==4.3.21 test
- kwarray * test
- onnx ==1.7.0 test
- onnxruntime >=1.8.0 test
- protobuf <=3.20.1 test
- pytest * test
- ubelt * test
- xdoctest >=0.10.0 test
- yapf * test
- docutils ==0.16.0
- myst-parser *
- sphinx ==4.0.2
- sphinx_copybutton *
- sphinx_markdown_tables *
- mmcls >=0.20.1
- mmcv-full >=1.4.4,<=1.5.0
- cityscapesscripts *
- mmcv *
- prettytable *
- torch *
- torchvision *
- matplotlib *
- mmcls >=0.20.1
- numpy *
- packaging *
- prettytable *
- codecov * test
- flake8 * test
- interrogate * test
- pytest * test
- xdoctest >=0.10.0 test
- yapf * test
- docutils ==0.17.1
- myst-parser *
- pytorch_sphinx_theme *
- sphinx ==4.5.0
- sphinx-copybutton *
- sphinx_markdown_tables *
- einops >=0.6.0
- mmcv-full >=1.4.2,<1.9.0
- albumentations >=0.3.2
- colorama *
- requests *
- rich *
- scipy *
- mmcv >=1.4.2
- torch *
- torchvision *
- einops >=0.6.0
- matplotlib >=3.1.0
- numpy *
- packaging *
- codecov * test
- flake8 * test
- interrogate * test
- isort ==4.3.21 test
- mmdet * test
- pytest * test
- xdoctest >=0.10.0 test
- yapf * test