https://github.com/924973292/demo
【AAAI2025】DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification
Science Score: 23.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found codemeta.json)
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to arxiv.org, scholar.google)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity, 8.1%, to scientific vocabulary)
Keywords
Repository
【AAAI2025】DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification
Basic Info
Statistics
- Stars: 41
- Watchers: 1
- Forks: 2
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification
Yuhao Wang · Yang Liu · Aihua Zheng · Pingping Zhang*
DeMo is an advanced multi-modal object Re-Identification (ReID) framework designed to tackle dynamic imaging quality variations across modalities. By employing decoupled features and a novel Attention-Triggered Mixture of Experts (ATMoE), DeMo dynamically balances modality-specific and modality-shared information, enabling robust performance even under missing modality conditions. The framework sets new benchmarks for multi-modal and missing-modality object ReID.
News
- We released the DeMo codebase and paper! 🚀
- Great news! Our paper has been accepted to AAAI 2025! 🎉
Table of Contents
Introduction
Multi-modal object ReID combines the strengths of different modalities (e.g., RGB, NIR, TIR) to achieve robust identification across challenging scenarios. DeMo introduces a decoupled approach using a Mixture of Experts (MoE) to preserve modality uniqueness and enhance feature diversity. This is achieved through three components:
1. Patch-Integrated Feature Extractor (PIFE): captures multi-granular representations.
2. Hierarchical Decoupling Module (HDM): separates modality-specific and modality-shared features.
3. Attention-Triggered Mixture of Experts (ATMoE): dynamically adjusts feature importance with adaptive attention-guided weights.
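The gating idea behind a mixture of experts can be illustrated with a minimal NumPy sketch. This is an illustrative assumption, not the paper's implementation: linear experts and a linear gate stand in for DeMo's actual modules, and the attention-guided weights of ATMoE are approximated here by a softmax over gating logits.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over the last axis
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def moe_mix(x, expert_weights, gate_weight):
    """Mix expert outputs with gating scores (toy stand-in for ATMoE).

    x             : (batch, dim) fused multi-modal features
    expert_weights: list of (dim, dim) matrices, one linear expert each
    gate_weight   : (dim, n_experts) matrix producing gating logits
    """
    gates = softmax(x @ gate_weight)                          # (batch, n_experts)
    outs = np.stack([x @ w for w in expert_weights], axis=1)  # (batch, n_experts, dim)
    return (gates[..., None] * outs).sum(axis=1)              # (batch, dim)
```

Because the gates sum to 1 per sample, each output is a convex combination of expert outputs; in DeMo the combination weights are instead derived from attention over the decoupled features.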
Contributions
- Introduced a decoupled feature-based MoE framework, DeMo, addressing dynamic quality changes in multi-modal imaging.
- Developed the Hierarchical Decoupling Module (HDM) for enhanced feature diversity and Attention-Triggered Mixture of Experts (ATMoE) for context-aware weighting.
- Achieved state-of-the-art performance on RGBNT201, RGBNT100, and MSVR310 benchmarks under both full and missing-modality settings.
Results
Multi-Modal Object ReID
Multi-Modal Person ReID [RGBNT201]
Multi-Modal Vehicle ReID [RGBNT100 & MSVR310]
Missing-Modality Object ReID
Missing-Modality Performance [RGBNT201]
Missing-Modality Performance [RGBNT100]
Ablation Studies [RGBNT201]
Visualizations
Feature Distribution (t-SNE)
Decoupled Features
Rank-list Visualization
Reproduction
Datasets
- RGBNT201: Google Drive
- RGBNT100: Baidu Pan (Code: rjin)
- MSVR310: Google Drive
Pretrained Models
Configuration
- RGBNT201: `configs/RGBNT201/DeMo.yml`
- RGBNT100: `configs/RGBNT100/DeMo.yml`
- MSVR310: `configs/MSVR310/DeMo.yml`
Training
```bash
conda create -n DeMo python=3.8.12 -y
conda activate DeMo
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
cd (your_path)
pip install -r requirements.txt
python train_net.py --config_file configs/RGBNT201/DeMo.yml
```
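By analogy with the RGBNT201 command, the other benchmarks can presumably be trained by swapping in their configuration files from the Configuration section (not shown in the original instructions, so treat these invocations as an assumption):

```shell
python train_net.py --config_file configs/RGBNT100/DeMo.yml
python train_net.py --config_file configs/MSVR310/DeMo.yml
```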
Notes
- This repository is based on MambaPro. Prompt and adapter tuning on the CLIP backbone are reserved (the corresponding hyperparameters are set to `False`), allowing users to explore them independently.
- The code provides multi-modal Grad-CAM visualization, multi-modal ranking-list generation, and t-SNE visualization tools to facilitate further research.
- The hyperparameter configuration is designed for compatibility with devices that have less than 24 GB of GPU memory.
- Thank you for your attention and interest!
Star History
Citation
If you find DeMo helpful in your research, please consider citing:
```bibtex
@inproceedings{wang2025DeMo,
  title={DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification},
  author={Wang, Yuhao and Liu, Yang and Zheng, Aihua and Zhang, Pingping},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2025}
}
```
Owner
- Name: Yuhao Wang
- Login: 924973292
- Kind: user
- Location: Dalian
- Company: Dalian University of Technology
- Repositories: 7
- Profile: https://github.com/924973292
Life as tiny as a mustard seed, yet the heart holds Mount Sumeru.
GitHub Events
Total
- Issues event: 12
- Watch event: 56
- Issue comment event: 25
- Push event: 22
- Fork event: 1
- Create event: 2
Last Year
- Issues event: 12
- Watch event: 56
- Issue comment event: 25
- Push event: 22
- Fork event: 1
- Create event: 2
Dependencies
- Cython ==3.0.9
- Jinja2 ==3.1.2
- Markdown ==3.6
- MarkupSafe ==2.1.3
- PyYAML ==6.0.1
- Werkzeug ==3.0.1
- absl-py ==2.1.0
- certifi ==2022.12.7
- cffi ==1.16.0
- chardet ==5.2.0
- charset-normalizer ==2.1.1
- cloudpickle ==3.0.0
- colorama ==0.4.6
- contourpy ==1.2.0
- cycler ==0.12.1
- easydict ==1.13
- einops ==0.7.0
- exceptiongroup ==1.2.0
- filelock ==3.9.0
- fonttools ==4.49.0
- fsspec ==2024.9.0
- ftfy ==6.2.3
- fvcore ==0.1.5.post20221221
- grpcio ==1.62.1
- h5py ==3.10.0
- huggingface-hub ==0.21.4
- idna ==3.4
- iniconfig ==2.0.0
- iopath ==0.1.10
- joblib ==1.4.2
- jpeg4py ==0.1.4
- jsonpatch ==1.33
- jsonpointer ==2.4
- kiwisolver ==1.4.5
- lmdb ==1.4.1
- matplotlib ==3.8.3
- mpmath ==1.3.0
- networkx ==3.2.1
- ninja ==1.11.1.1
- numpy ==1.26.3
- opencv-python ==4.9.0.80
- packaging ==24.0
- pandas ==2.2.1
- pillow ==10.2.0
- pluggy ==1.4.0
- portalocker ==2.8.2
- protobuf ==4.25.3
- pycocotools ==2.0.7
- pycparser ==2.21
- pyparsing ==3.1.2
- pytest ==8.1.1
- python-dateutil ==2.9.0.post0
- pytz ==2024.1
- regex ==2023.12.25
- requests ==2.28.1
- safetensors ==0.4.2
- scikit-learn ==1.5.1
- scipy ==1.12.0
- seaborn ==0.13.2
- six ==1.16.0
- submitit ==1.5.1
- sympy ==1.12
- tabulate ==0.9.0
- tb-nightly ==2.17.0a20240320
- tensorboard-data-server ==0.7.2
- tensorboardX ==2.6.2.2
- termcolor ==2.4.0
- threadpoolctl ==3.5.0
- tikzplotlib ==0.10.1
- timm ==0.4.12
- tokenizers ==0.15.2
- tomli ==2.0.1
- torch ==2.1.1
- torchaudio ==2.1.1
- torchvision ==0.16.1
- tornado ==6.4
- tqdm ==4.66.2
- transformers ==4.38.2
- triton ==2.1.0
- typing_extensions ==4.8.0
- tzdata ==2024.1
- urllib3 ==1.26.13
- visdom ==0.2.4
- wcwidth ==0.2.13
- webcolors ==1.13
- websocket-client ==1.7.0
- yacs ==0.1.8