https://github.com/bytedance/r2former

Official repository for R2Former: Unified Retrieval and Reranking Transformer for Place Recognition

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: found
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (12.0%) to scientific vocabulary

Keywords

research

Last synced: 10 months ago

Repository


Basic Info

Host: GitHub
Owner: bytedance
License: apache-2.0
Language: Jupyter Notebook
Default Branch: main
Homepage: https://openaccess.thecvf.com/content/CVPR2023/html/Zhu_R2Former_Unified_Retrieval_and_Reranking_Transformer_for_Place_Recognition_CVPR_2023_paper.html
Size: 2.66 MB

Statistics

Stars: 90
Watchers: 2
Forks: 4
Open Issues: 1
Releases: 0

Topics

research

Created about 3 years ago · Last pushed about 3 years ago

Metadata Files

Readme, License

$R^{2}$Former: Unified $R$etrieval and $R$eranking Transformer for Place Recognition

This is the official repository for the CVPR 2023 (Highlight) paper: $R^{2}$Former: Unified $R$etrieval and $R$eranking Transformer for Place Recognition.

@inproceedings{zhu2023r2former,
  title={R2former: Unified retrieval and reranking transformer for place recognition},
  author={Zhu, Sijie and Yang, Linjie and Chen, Chen and Shah, Mubarak and Shen, Xiaohui and Wang, Heng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={19370--19380},
  year={2023}
}

Overview

In this paper, we propose a unified place recognition framework that handles both retrieval and reranking with a novel transformer model, named R2Former. The proposed reranking module takes feature correlation, attention value, and xy coordinates into account, and learns to determine whether the image pair is from the same location. The whole pipeline is end-to-end trainable and the reranking module alone can also be adopted on other CNN or transformer backbones as a generic component.

The global retrieval part is implemented based on VG Benchmark, and we add a ViT-based backbone with variable input resolution.
For MSLS training, we use the official Mapillary code base. The other datasets are prepared using datasets_vg.
We add improved full-dataset mining and fix a bug in the VG Benchmark dataloader.
We add the reranking evaluation in "test.py" and the R2Former reranking module in "./model/".
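
As a rough illustration of the retrieve-then-rerank flow described above, here is a minimal pure-Python sketch. It is not the repository's code: the function names, the cosine-similarity retrieval, and the `pair_score` callable (a stand-in for the learned reranking module) are all assumptions made for illustration, and Recall@N is shown only as a metric commonly used in place recognition.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_desc, db_descs, k):
    """Stage 1: rank database indices by global-descriptor similarity, keep the top k."""
    order = sorted(range(len(db_descs)),
                   key=lambda i: cosine(query_desc, db_descs[i]),
                   reverse=True)
    return order[:k]

def rerank(candidates, pair_score):
    """Stage 2: reorder the shortlist by a pairwise score (higher = more likely the same place).

    pair_score is a hypothetical callable standing in for the learned reranking module.
    """
    return sorted(candidates, key=pair_score, reverse=True)

def recall_at_n(predictions, positives, n):
    """Fraction of queries with at least one true positive among the first n predictions."""
    hits = sum(1 for preds, pos in zip(predictions, positives)
               if any(p in pos for p in preds[:n]))
    return hits / len(predictions)

# Toy data: three database descriptors, one query, and made-up pairwise scores.
db = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
query = [1.0, 0.05]
shortlist = retrieve(query, db, k=2)
toy_scores = {0: 0.2, 1: 0.9}  # hypothetical reranker outputs
reranked = rerank(shortlist, toy_scores.get)
```

In this toy case the true match (index 1) sits second in the global shortlist and moves to the front after reranking, which is the effect a reranking stage is meant to have.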

Setup

Download the MSLS dataset from Mapillary and unzip the files. Install the python environment using

pip3 install -r requirements.txt
pip3 install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
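
To confirm that the pinned versions were actually installed, a small check along the following lines may help. This is a sketch rather than part of the repository: `PINS`, `satisfies`, and `find_mismatches` are hypothetical names, and the comparison ignores a local suffix such as `+cu113` when the pin itself has none (as with `torchaudio==0.12.1`).

```python
from importlib import metadata

# Pins copied from the install commands above.
PINS = {
    "torch": "1.12.1+cu113",
    "torchvision": "0.13.1+cu113",
    "torchaudio": "0.12.1",
}

def installed_version(name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None

def satisfies(found, expected):
    """True if `found` matches `expected`, ignoring a local '+suffix' the pin does not name."""
    if found is None:
        return False
    if "+" not in expected:
        found = found.split("+", 1)[0]
    return found == expected

def find_mismatches(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if not satisfies(found, expected):
            mismatches[name] = (expected, found)
    return mismatches
```

Calling `find_mismatches(PINS)` in the target environment returns an empty dictionary when all three packages match their pins.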

Test with Pre-trained Models

Download the pretrained model "CVPR23DeitSRerank.pth" from R2Former. Modify the path of the MSLS dataset. Run the test script:

bash test.sh

Note that loading the MSLS dataset is very slow the first time, because the loader scans all the images and generates positive candidates for each query. The results are saved, so subsequent loads are fast.
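
The compute-once-then-cache behaviour described above can be pictured with the following sketch. It is an assumption-laden illustration, not the repository's loader: the function names, the JSON cache file, the planar metre coordinates, and the 25 m radius are all hypothetical choices.

```python
import json
import math
import os

def find_positives(query_xy, db_xy, radius_m=25.0):
    """For each query, list the database indices within radius_m metres (brute force)."""
    return [[j for j, (dx, dy) in enumerate(db_xy)
             if math.hypot(qx - dx, qy - dy) <= radius_m]
            for qx, qy in query_xy]

def load_or_build_positives(cache_path, query_xy, db_xy, radius_m=25.0):
    """Reuse cached positives when the file exists; otherwise compute them and save."""
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            return json.load(f)
    positives = find_positives(query_xy, db_xy, radius_m)
    with open(cache_path, "w") as f:
        json.dump(positives, f)
    return positives
```

The first call pays the full scan cost and writes the cache; later calls with the same `cache_path` only read the file. A cache like this has to be deleted by hand if the dataset or the radius changes, since the sketch does not check for staleness.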

Training - Separately

Firstly, download the pretrained global retrieval model from mslsv2deits.pth or train the global retrieval model using:

bash train_global_retrieval.sh

You may need to change the dataset directory in the command. Place the trained global retrieval model in the main directory.

Download the pre-computed mining results from mslsv2deithardfinal.npy, which are generated using the global retrieval model (see "precomputemining.py"). Train the reranking module using:

bash train_reranking.sh

The finetuning code is included; uncomment the last command to finetune on Pitts30K.
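
For intuition about what pre-computing mining results with a global retrieval model can look like, here is a minimal sketch of hard-negative mining. It does not reproduce "precomputemining.py": the function name, the dot-product scoring (which assumes L2-normalised descriptors), and the output layout are assumptions for illustration.

```python
def mine_hard_negatives(query_descs, db_descs, positives, num_negatives=5):
    """For each query, return the highest-scoring database indices that are NOT positives.

    Assumes descriptors are L2-normalised, so the dot product equals cosine similarity.
    """
    mined = []
    for q, pos in zip(query_descs, positives):
        pos = set(pos)
        # Score every database item against the query.
        scores = [(sum(a * b for a, b in zip(q, d)), j)
                  for j, d in enumerate(db_descs)]
        # Sort by descending similarity; ties keep their original index order.
        ranked = [j for _, j in sorted(scores, key=lambda s: -s[0])]
        # Hard negatives: items that look similar but are not true matches.
        mined.append([j for j in ranked if j not in pos][:num_negatives])
    return mined
```

Running this once over the whole training set and saving the result avoids repeating the full-database ranking during reranker training, at the cost of the negatives reflecting a fixed retrieval model.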

Training - End-to-end

Run the script:

bash train_end_to_end.sh

Acknowledgements

Parts of this repo are inspired by the following great repositories:
- VG Benchmark
- Mapillary
- datasets_vg
- DeiT
- Patch-NetVLAD
- NetVLAD's original code

Owner

Name: Bytedance Inc.
Login: bytedance
Kind: organization
Location: Singapore

Website: https://opensource.bytedance.com
Twitter: ByteDanceOSS
Repositories: 255
Profile: https://github.com/bytedance

GitHub Events

Total

Issues event: 13
Watch event: 17
Issue comment event: 8
Fork event: 2

Last Year

Issues event: 13
Watch event: 17
Issue comment event: 8
Fork event: 2

Issues and Pull Requests

Last synced: about 1 year ago

All Time

Total issues: 24
Total pull requests: 0
Average time to close issues: 2 months
Average time to close pull requests: N/A
Total issue authors: 15
Total pull request authors: 0
Average comments per issue: 3.79
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 6
Pull requests: 0
Average time to close issues: 4 months
Average time to close pull requests: N/A
Issue authors: 4
Pull request authors: 0
Average comments per issue: 1.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

LKELN (3)
xjh19971 (3)
molu-ggg (3)
jinxxo-j (2)
yw-lorna (2)
LastEgg (1)
kaiyi98 (1)
Graysonggg (1)
anguoyuan (1)
azizyemen1 (1)
minhducquach (1)
noahzn (1)
Anuradha-Uggi (1)
Lavau (1)
jiangcheng0128 (1)
