https://github.com/924973292/top-reid

【AAAI2024】TOP-ReID: Multi-spectral Object Re-Identification with Token Permutation


Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org, scholar.google
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.8%) to scientific vocabulary

Keywords

aaai24 missing-modal-retrieval msvr310 multi-modal multi-modal-retrieval object-reid person-reid rgbnt100 rgbnt201 vehicle-reid
Last synced: 5 months ago · JSON representation

Repository

【AAAI2024】TOP-ReID: Multi-spectral Object Re-Identification with Token Permutation

Basic Info
  • Host: GitHub
  • Owner: 924973292
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 12.4 MB
Statistics
  • Stars: 49
  • Watchers: 1
  • Forks: 2
  • Open Issues: 0
  • Releases: 0
Topics
aaai24 missing-modal-retrieval msvr310 multi-modal multi-modal-retrieval object-reid person-reid rgbnt100 rgbnt201 vehicle-reid
Created about 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme License

README.md

TOP-ReID: Multi-spectral Object Re-Identification with Token Permutation


Yuhao Wang · Xuehu Liu · Pingping Zhang* · Hu Lu · Zhengzheng Tu · Huchuan Lu

AAAI 2024 Paper

TOP-ReID

TOP-ReID is a powerful multi-spectral object re-identification (ReID) framework designed to retrieve specific objects by leveraging complementary information from different image spectra. It overcomes the limitations of traditional single-spectral ReID in complex visual environments by reducing the distribution gap among spectra and cyclically aggregating their features. In addition, TOP-ReID achieves strong performance on multi-spectral and missing-spectral object ReID and holds great potential in cross-spectral settings.

News

Exciting news! Our paper has been accepted by AAAI 2024! 🎉 Paper

Introduction

Multi-spectral object ReID is crucial in scenarios where objects are captured through different image spectra, such as RGB, near-infrared, and thermal imaging. TOP-ReID tackles the challenges posed by the distribution gap among these spectra and enhances feature representations by utilizing all tokens of Transformers.

Contributions

  • We propose a novel feature learning framework named TOP-ReID for multi-spectral object ReID. To the best of our knowledge, TOP-ReID is the first work to utilize all the tokens of vision Transformers to improve multi-spectral object ReID.
  • We propose a Token Permutation Module (TPM) and a Complementary Reconstruction Module (CRM) to facilitate multi-spectral feature alignment and handle spectral-missing problems effectively.
  • We perform comprehensive experiments on three multi-spectral object ReID benchmarks, i.e., RGBNT201, RGBNT100 and MSVR310. The results fully verify the effectiveness of the proposed methods.
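To make the token-permutation idea concrete, here is a minimal, hedged sketch of cyclic token exchange among three spectra. It is illustrative only: it uses plain scaled dot-product attention with no learned projections, and the function names and shapes are our assumptions, not the repo's implementation of TPM.

```python
# Hypothetical sketch of cyclic token permutation: each spectrum's tokens
# attend to the full token sequence of the next spectrum in a cycle
# (RGB -> NIR -> TIR -> RGB). Not the authors' code.
import numpy as np

def attend(query_tokens, key_value_tokens):
    """Plain scaled dot-product attention (no learned projections)."""
    d = query_tokens.shape[-1]
    scores = query_tokens @ key_value_tokens.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ key_value_tokens

def token_permutation(rgb, nir, tir):
    """Cyclically align: each spectrum queries the next spectrum's tokens."""
    return attend(rgb, nir), attend(nir, tir), attend(tir, rgb)

rng = np.random.default_rng(0)
rgb, nir, tir = (rng.standard_normal((5, 8)) for _ in range(3))  # 5 tokens, dim 8
out = token_permutation(rgb, nir, tir)
print([o.shape for o in out])  # three aligned (5, 8) token sets
```

The point of using all tokens (not only the class token) is that every patch token of one spectrum can borrow complementary detail from every patch token of another.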

Results

Multi-spectral Object ReID

Multi-spectral Person ReID [RGBNT201]

Multi-spectral Person ReID

Multi-spectral Vehicle ReID [RGBNT100, MSVR310]

Multi-spectral Vehicle ReID

Missing-spectral Object ReID

Missing-spectral Person ReID [RGBNT201]

Missing-spectral Person ReID

Missing-spectral Vehicle ReID [RGBNT100]

Missing-spectral Vehicle ReID

Performance comparison with different modules [RGBNT201, RGBNT100]

Performance comparison with different modules Performance comparison with different modules

Performance comparison of different backbones [RGBNT201]

Performance comparison of different backbones

Visualizations

t-SNE [RGBNT201]

t-SNE

Grad-CAM [RGBNT201, RGBNT100]

Grad-CAM

Please check the paper for detailed information: Paper

Reproduction

Datasets

RGBNT201 link: https://drive.google.com/drive/folders/1EscBadX-wMAT56It5lXY-S3-b5nK1wH
RGBNT100 link: https://pan.baidu.com/s/1xqqh7N4Lctm3RcUdskG0Ug (code: rjin)
MSVR310 link: https://drive.google.com/file/d/1IxI-fGiluPOIes6YjDHeTEuVYhFdYwD/view?usp=drive_link

Pretrained

ViT-B link: https://pan.baidu.com/s/1YE-24vSo5pvwHOF-y4sfA
DeiT-S link: https://pan.baidu.com/s/1YE-24vSo5pvwHOF-y4sfA
T2T-ViT-24 link: https://pan.baidu.com/s/1YE-24vSo5pv_wHOF-y4sfA (code: vmfm)

Configs

RGBNT201 file: TOP-ReID/configs/RGBNT201/TOP-ReID.yml
RGBNT100 file: TOP-ReID/configs/RGBNT100/TOP-ReID.yml
MSVR310 file: TOP-ReID/configs/MSVR310/TOP-ReID.yml

Bash

```bash
#!/bin/bash
source activate (your env)
cd ../(your path)
pip install -r requirements.txt
python train_net.py --config_file ../RGBNT201/TOP-ReID.yml
```

Training Example

To help users reproduce our results, we provide training examples. Note that the results may vary slightly from those reported in the paper.

Our model shows significant improvements on the RGBNT201 dataset. This is partly due to the dataset itself and partly due to our choice of learning rate. To align with the learning-rate settings in TransReID, we initially set the learning rate to 0.008. However, we found this task to be sensitive to the learning rate: when it is too low, the model's performance fluctuates significantly. For better performance and a more competitive model, we therefore adopted a uniform, more suitable learning rate, ultimately selecting 0.009 as the standardized experimental setting. On the smaller MSVR310 dataset, we follow the dataset authors' recommendation and train for more epochs to improve performance.

Below are examples of training TOP-ReID on RGBNT201 and RGBNT100.
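Since this task is sensitive to the learning rate, the schedule matters as much as the base value. As a hedged illustration, here is a TransReID-style linear warmup to the 0.009 base rate mentioned above; the `warmup_epochs` and `warmup_factor` values are illustrative assumptions, not necessarily the repo's settings.

```python
# Sketch of a linear warmup schedule ramping up to a base learning rate.
# base_lr=0.009 matches the setting discussed above; the other parameters
# are illustrative assumptions.
def warmup_lr(epoch, base_lr=0.009, warmup_epochs=10, warmup_factor=0.01):
    """Ramp linearly from base_lr * warmup_factor to base_lr, then hold."""
    if epoch < warmup_epochs:
        alpha = epoch / warmup_epochs
        return base_lr * (warmup_factor * (1 - alpha) + alpha)
    return base_lr

for e in (0, 5, 10):
    print(e, warmup_lr(e))
```

Warming up avoids large early updates, which is one common way to tame the instability described above when the base rate itself cannot be raised.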

RGBNT201:

train.txt

RGBNT100:

train.txt

Tips

If your machine's GPU memory is insufficient, consider reducing the batch size; be aware that this may affect the results. Moreover, based on our experimental findings, using a single Transformer as the backbone network produces comparable results (mAP: 71.7%, Rank-1: 76.7%) on RGBNT201. To reduce GPU memory usage, you can therefore use only one Transformer backbone to process data from all three modalities. This modification requires adjusting the model initialization definition.
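The memory-saving variant above amounts to weight sharing across modalities. The following is an illustrative sketch (not the repo's code) contrasting one shared encoder with one encoder per spectrum; `TinyEncoder` is a hypothetical stand-in for the Transformer backbone.

```python
# Illustrative sketch of the memory-saving tip above: one shared encoder
# reused for all three spectra vs. one encoder per spectrum (~3x parameters).
# TinyEncoder is a hypothetical stand-in, not the actual backbone.
import numpy as np

class TinyEncoder:
    """Stand-in for a Transformer backbone: a single linear projection."""
    def __init__(self, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.weight = rng.standard_normal((dim, dim))

    def __call__(self, x):
        return x @ self.weight

def forward_shared(images, encoder):
    """One set of weights reused for RGB, NIR, and TIR inputs."""
    return [encoder(x) for x in images]

def forward_separate(images, encoders):
    """One encoder per spectrum: roughly triple the parameter count."""
    return [enc(x) for enc, x in zip(encoders, images)]

dim = 8
images = [np.ones((4, dim)) for _ in range(3)]  # three spectra, 4 tokens each
shared = forward_shared(images, TinyEncoder(dim))
separate = forward_separate(images, [TinyEncoder(dim, seed=s) for s in range(3)])
```

In the shared variant only one backbone is instantiated, so activations and weights for the other two branches never occupy GPU memory; the trade-off is that all spectra are forced through the same feature extractor.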

Star History

Star History Chart

Citation

If you find TOP-ReID useful in your research, please consider citing:

```bibtex
@inproceedings{wang2024top,
  title={TOP-ReID: Multi-spectral Object Re-Identification with Token Permutation},
  author={Wang, Yuhao and Liu, Xuehu and Zhang, Pingping and Lu, Hu and Tu, Zhengzheng and Lu, Huchuan},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={38},
  number={6},
  pages={5758--5766},
  year={2024}
}
```

Owner

  • Name: Yuhao Wang
  • Login: 924973292
  • Kind: user
  • Location: Dalian
  • Company: Dalian University of Technology

Born as small as a mustard seed, yet holding Mount Sumeru in the heart. (生如芥子,心藏须弥)

GitHub Events

Total
  • Watch event: 14
  • Issue comment event: 2
  • Push event: 1
  • Fork event: 2
Last Year
  • Watch event: 14
  • Issue comment event: 2
  • Push event: 1
  • Fork event: 2

Issues and Pull Requests

Last synced: almost 2 years ago

All Time
  • Total issues: 5
  • Total pull requests: 0
  • Average time to close issues: 9 days
  • Average time to close pull requests: N/A
  • Total issue authors: 5
  • Total pull request authors: 0
  • Average comments per issue: 5.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 5
  • Pull requests: 0
  • Average time to close issues: 9 days
  • Average time to close pull requests: N/A
  • Issue authors: 5
  • Pull request authors: 0
  • Average comments per issue: 5.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • YPQ-XJTU (1)
  • Researcher-DL (1)
  • 1125178969 (1)
  • starsky68 (1)
  • ThomaswellY (1)
  • Betricy (1)
  • zxcdsa45687 (1)
  • chenscottus (1)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Dependencies

requirements.txt pypi
  • Cython ==0.29.35
  • Markdown ==3.4.3
  • MarkupSafe ==2.1.3
  • MultiScaleDeformableAttention ==1.0
  • PyWavelets ==1.3.0
  • PyYAML ==6.0
  • Shapely ==1.7.1
  • Werkzeug ==2.2.3
  • absl-py ==1.4.0
  • antlr4-python3-runtime ==4.9.3
  • appdirs ==1.4.4
  • black ==21.4b2
  • cachetools ==5.3.1
  • certifi ==2023.5.7
  • cffi ==1.15.1
  • charset-normalizer ==3.1.0
  • click ==8.1.3
  • cloudpickle ==2.2.1
  • cycler ==0.11.0
  • easydict ==1.10
  • einops ==0.6.1
  • filelock ==3.12.0
  • fonttools ==4.38.0
  • fsspec ==2023.1.0
  • future ==0.18.3
  • fvcore ==0.1.5.post20221221
  • google-auth ==2.19.1
  • google-auth-oauthlib ==0.4.6
  • grpcio ==1.54.2
  • huggingface-hub ==0.15.1
  • hydra-core ==1.3.2
  • idna ==3.4
  • imageio ==2.30.0
  • importlib-metadata ==6.6.0
  • importlib-resources ==5.12.0
  • iopath ==0.1.9
  • jpeg4py ==0.1.4
  • jsonpatch ==1.32
  • jsonpointer ==2.3
  • kiwisolver ==1.4.4
  • matplotlib ==3.5.3
  • motmetrics ==1.4.0
  • mypy-extensions ==1.0.0
  • networkx ==2.6.3
  • oauthlib ==3.2.2
  • omegaconf ==2.3.0
  • opencv-python ==4.7.0.72
  • packaging ==23.1
  • pandas ==1.3.5
  • pathspec ==0.11.1
  • portalocker ==2.7.0
  • prettytable ==3.7.0
  • protobuf ==3.20.3
  • ptflops ==0.7
  • pyasn1 ==0.5.0
  • pyasn1-modules ==0.3.0
  • pycocotools ==2.0.7
  • pycparser ==2.21
  • pydot ==1.4.2
  • pyparsing ==3.0.9
  • python-dateutil ==2.8.2
  • pytz ==2023.3
  • regex ==2023.5.5
  • requests ==2.31.0
  • requests-oauthlib ==1.3.1
  • rsa ==4.9
  • safetensors ==0.3.1
  • scikit-image ==0.19.3
  • scipy ==1.7.3
  • six ==1.16.0
  • tabulate ==0.9.0
  • tensorboard ==2.11.2
  • tensorboard-data-server ==0.6.1
  • tensorboard-plugin-wit ==1.8.1
  • termcolor ==2.3.0
  • tifffile ==2021.11.2
  • tikzplotlib ==0.10.1
  • timm ==0.9.2
  • tokenizers ==0.13.3
  • toml ==0.10.2
  • torch ==1.10.0
  • torchaudio ==0.10.0
  • torchvision ==0.11.0
  • tornado ==6.2
  • tqdm ==4.65.0
  • transformers ==4.29.2
  • typed-ast ==1.5.4
  • urllib3 ==1.26.16
  • visdom ==0.2.4
  • wcwidth ==0.2.6
  • webcolors ==1.13
  • websocket-client ==1.5.2
  • xmltodict ==0.13.0
  • yacs ==0.1.8
  • zipp ==3.15.0