
【AAAI2025】DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification

https://github.com/924973292/demo

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file: found
  • .zenodo.json file
  • DOI references
  • Academic publication links: arxiv.org, scholar.google
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity: low similarity (8.1%) to scientific vocabulary

Keywords

decoupling, fusion, mixture-of-experts, msvr310, multi-modal, reid, rgbnt100, rgbnt201
Last synced: 5 months ago

Repository

【AAAI2025】DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification

Basic Info
  • Host: GitHub
  • Owner: 924973292
  • License: MIT
  • Language: Python
  • Default Branch: master
  • Size: 17 MB
Statistics
  • Stars: 41
  • Watchers: 1
  • Forks: 2
  • Open Issues: 0
  • Releases: 0
Topics
decoupling, fusion, mixture-of-experts, msvr310, multi-modal, reid, rgbnt100, rgbnt201
Created about 1 year ago · Last pushed 12 months ago
Metadata Files
Readme License

README.md

DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification


Yuhao Wang · Yang Liu · Aihua Zheng · Pingping Zhang*

AAAI 2025 Paper

RGBNT201 Results

DeMo is an advanced multi-modal object Re-Identification (ReID) framework designed to tackle dynamic imaging quality variations across modalities. By employing decoupled features and a novel Attention-Triggered Mixture of Experts (ATMoE), DeMo dynamically balances modality-specific and modality-shared information, enabling robust performance even under missing modality conditions. The framework sets new benchmarks for multi-modal and missing-modality object ReID.

News

  • We released the DeMo codebase and paper! 🚀
  • Great news! Our paper has been accepted to AAAI 2025! 🎉

Introduction

Multi-modal object ReID combines the strengths of different modalities (e.g., RGB, NIR, TIR) to achieve robust identification in challenging scenarios. DeMo introduces a decoupled approach using a Mixture of Experts (MoE) to preserve modality uniqueness and enhance feature diversity. This is achieved through three modules (a minimal sketch of the gating idea follows the list):

1. Patch-Integrated Feature Extractor (PIFE): captures multi-granular representations.
2. Hierarchical Decoupling Module (HDM): separates modality-specific and modality-shared features.
3. Attention-Triggered Mixture of Experts (ATMoE): dynamically adjusts feature importance with adaptive attention-guided weights.
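To make the ATMoE idea concrete, here is a minimal PyTorch sketch of attention-guided expert gating. The layer shapes, the number of experts, and the use of the modality-shared feature as the gating context are illustrative assumptions, not the actual DeMo implementation.

```python
# Minimal sketch of attention-triggered expert gating (illustrative only;
# sizes and gating design are assumptions, not taken from the DeMo code).
import torch
import torch.nn as nn

class ATMoESketch(nn.Module):
    def __init__(self, dim=512, num_experts=4):
        super().__init__()
        # One small MLP expert per decoupled feature stream.
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(num_experts)
        ])
        # The modality-shared feature produces per-expert gating scores.
        self.gate = nn.Linear(dim, num_experts)

    def forward(self, shared, specifics):
        # shared: (B, dim) modality-shared feature used as gating context.
        # specifics: list of num_experts tensors (B, dim), one per stream.
        weights = torch.softmax(self.gate(shared), dim=-1)       # (B, E)
        outs = torch.stack(
            [e(x) for e, x in zip(self.experts, specifics)], dim=1
        )                                                        # (B, E, dim)
        return (weights.unsqueeze(-1) * outs).sum(dim=1)         # (B, dim)

fused = ATMoESketch()(torch.randn(2, 512), [torch.randn(2, 512)] * 4)
```

The key point is that expert weights are computed per sample from attention-style scores rather than being fixed, which is what lets the mixture adapt when a modality degrades or is missing.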


Contributions

  • Introduced a decoupled feature-based MoE framework, DeMo, addressing dynamic quality changes in multi-modal imaging.
  • Developed the Hierarchical Decoupling Module (HDM) for enhanced feature diversity and Attention-Triggered Mixture of Experts (ATMoE) for context-aware weighting.
  • Achieved state-of-the-art performance on RGBNT201, RGBNT100, and MSVR310 benchmarks under both full and missing-modality settings.

Results

Multi-Modal Object ReID

Multi-Modal Person ReID [RGBNT201]

RGBNT201 Results

Multi-Modal Vehicle ReID [RGBNT100 & MSVR310]

RGBNT100 Results

Missing-Modality Object ReID

Missing-Modality Performance [RGBNT201]

RGBNT201 Missing-Modality

Missing-Modality Performance [RGBNT100]

RGBNT100 Missing-Modality

Ablation Studies [RGBNT201]

RGBNT201 Ablation


Visualizations

Feature Distribution (t-SNE)

t-SNE

Decoupled Features

Decoupled Features

Rank-list Visualization

Rank-list


Reproduction

Datasets

Pretrained Models

Configuration

  • RGBNT201: configs/RGBNT201/DeMo.yml
  • RGBNT100: configs/RGBNT100/DeMo.yml
  • MSVR310: configs/MSVR310/DeMo.yml
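These YAML files are consumed through yacs (pinned in requirements.txt). Below is a rough sketch of loading one of them for inspection; the permissive `CfgNode(new_allowed=True)` root is an assumption, since the repository's actual default-config module is not shown here and most likely merges the YAML into predefined defaults instead.

```python
# Minimal sketch: load a DeMo YAML with yacs for inspection.
from yacs.config import CfgNode as CN

cfg = CN(new_allowed=True)                        # accept any keys from the YAML (assumption)
cfg.merge_from_file("configs/RGBNT201/DeMo.yml")  # path from the list above
cfg.freeze()                                      # lock the config before use
print(cfg)
```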

Training

```bash
conda create -n DeMo python=3.8.12 -y
conda activate DeMo
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117
cd (your_path)
pip install -r requirements.txt
python train_net.py --config_file configs/RGBNT201/DeMo.yml
```
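The same entry point trains the vehicle benchmarks; only the config path changes, using the files listed under Configuration above:

```bash
# Train on the vehicle datasets with their respective configs.
python train_net.py --config_file configs/RGBNT100/DeMo.yml
python train_net.py --config_file configs/MSVR310/DeMo.yml
```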

Notes

  • This repository builds on MambaPro. Prompt and adapter tuning on the CLIP backbone are kept in the code but disabled by default (the corresponding hyperparameters are set to False), so users can explore them independently.
  • The code provides multi-modal Grad-CAM visualization, multi-modal ranking-list generation, and t-SNE visualization tools to facilitate further research; a generic t-SNE sketch follows these notes.
  • The default hyperparameter configuration is chosen so that training runs on devices with less than 24 GB of GPU memory.
  • Thank you for your attention and interest!
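As a rough illustration of the kind of t-SNE visualization mentioned above, the following sketch projects embeddings to 2-D with scikit-learn (pinned in requirements.txt). The features and identity labels here are synthetic stand-ins; swap in real DeMo embeddings and labels.

```python
# Generic t-SNE plot over per-sample embeddings (synthetic stand-in data;
# replace `feats`/`labels` with features extracted by DeMo).
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
feats = rng.normal(size=(300, 512))      # 300 samples, 512-dim embeddings
labels = rng.integers(0, 10, size=300)   # hypothetical identity labels

xy = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(feats)
plt.scatter(xy[:, 0], xy[:, 1], c=labels, cmap="tab10", s=8)
plt.title("t-SNE of embeddings (synthetic stand-in data)")
plt.savefig("tsne_demo.png", dpi=200)
```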

Star History

Star History Chart


Citation

If you find DeMo helpful in your research, please consider citing:

```bibtex
@inproceedings{wang2025DeMo,
  title={DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification},
  author={Wang, Yuhao and Liu, Yang and Zheng, Aihua and Zhang, Pingping},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2025}
}
```

Owner

  • Name: Yuhao Wang
  • Login: 924973292
  • Kind: user
  • Location: Dalian
  • Company: Dalian University of Technology

Born as small as a mustard seed, yet holding Mount Sumeru in the heart.

GitHub Events

Total
  • Issues event: 12
  • Watch event: 56
  • Issue comment event: 25
  • Push event: 22
  • Fork event: 1
  • Create event: 2
Last Year
  • Issues event: 12
  • Watch event: 56
  • Issue comment event: 25
  • Push event: 22
  • Fork event: 1
  • Create event: 2

Dependencies

requirements.txt (PyPI)
  • Cython ==3.0.9
  • Jinja2 ==3.1.2
  • Markdown ==3.6
  • MarkupSafe ==2.1.3
  • PyYAML ==6.0.1
  • Werkzeug ==3.0.1
  • absl-py ==2.1.0
  • certifi ==2022.12.7
  • cffi ==1.16.0
  • chardet ==5.2.0
  • charset-normalizer ==2.1.1
  • cloudpickle ==3.0.0
  • colorama ==0.4.6
  • contourpy ==1.2.0
  • cycler ==0.12.1
  • easydict ==1.13
  • einops ==0.7.0
  • exceptiongroup ==1.2.0
  • filelock ==3.9.0
  • fonttools ==4.49.0
  • fsspec ==2024.9.0
  • ftfy ==6.2.3
  • fvcore ==0.1.5.post20221221
  • grpcio ==1.62.1
  • h5py ==3.10.0
  • huggingface-hub ==0.21.4
  • idna ==3.4
  • iniconfig ==2.0.0
  • iopath ==0.1.10
  • joblib ==1.4.2
  • jpeg4py ==0.1.4
  • jsonpatch ==1.33
  • jsonpointer ==2.4
  • kiwisolver ==1.4.5
  • lmdb ==1.4.1
  • matplotlib ==3.8.3
  • mpmath ==1.3.0
  • networkx ==3.2.1
  • ninja ==1.11.1.1
  • numpy ==1.26.3
  • opencv-python ==4.9.0.80
  • packaging ==24.0
  • pandas ==2.2.1
  • pillow ==10.2.0
  • pluggy ==1.4.0
  • portalocker ==2.8.2
  • protobuf ==4.25.3
  • pycocotools ==2.0.7
  • pycparser ==2.21
  • pyparsing ==3.1.2
  • pytest ==8.1.1
  • python-dateutil ==2.9.0.post0
  • pytz ==2024.1
  • regex ==2023.12.25
  • requests ==2.28.1
  • safetensors ==0.4.2
  • scikit-learn ==1.5.1
  • scipy ==1.12.0
  • seaborn ==0.13.2
  • six ==1.16.0
  • submitit ==1.5.1
  • sympy ==1.12
  • tabulate ==0.9.0
  • tb-nightly ==2.17.0a20240320
  • tensorboard-data-server ==0.7.2
  • tensorboardX ==2.6.2.2
  • termcolor ==2.4.0
  • threadpoolctl ==3.5.0
  • tikzplotlib ==0.10.1
  • timm ==0.4.12
  • tokenizers ==0.15.2
  • tomli ==2.0.1
  • torch ==2.1.1
  • torchaudio ==2.1.1
  • torchvision ==0.16.1
  • tornado ==6.4
  • tqdm ==4.66.2
  • transformers ==4.38.2
  • triton ==2.1.0
  • typing_extensions ==4.8.0
  • tzdata ==2024.1
  • urllib3 ==1.26.13
  • visdom ==0.2.4
  • wcwidth ==0.2.13
  • webcolors ==1.13
  • websocket-client ==1.7.0
  • yacs ==0.1.8