https://github.com/cv516buaa/ov-vg

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links
  Links to: arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity
  Low similarity (9.9%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: cv516Buaa
Default Branch: main
Size: 14.6 MB

Statistics

Stars: 22
Watchers: 1
Forks: 0
Open Issues: 5
Releases: 0

Created almost 3 years ago · Last pushed about 2 years ago

Metadata Files

Readme

OV-VG: A Benchmark for Open-Vocabulary Visual Grounding

Chunlei Wang · Wenquan Feng · Guangliang Cheng · Xiangtai Li · Shuchang Lyu · Binghao Liu · Lijiang Chen · Qi Zhao

Highlight!!!!

OV-VG: Open-vocabulary Visual Grounding

Abstract

Open-vocabulary learning has emerged as a cutting-edge research area, particularly in light of the widespread adoption of vision-based foundational models. Its primary objective is to comprehend novel concepts that are not encompassed within a predefined vocabulary. One key facet of this endeavor is Visual Grounding (VG), which entails locating a specific region within an image based on a corresponding language description. While current foundational models excel at various visual language tasks, there's a noticeable absence of models specifically tailored for open-vocabulary visual grounding (OV-VG). This research endeavor introduces novel and challenging OV tasks, namely Open-Vocabulary Visual Grounding (OV-VG) and Open-Vocabulary Phrase Localization (OV-PL). The overarching aim is to establish connections between language descriptions and the localization of novel objects. To facilitate this, we have curated a comprehensive annotated benchmark, encompassing 7,272 OV-VG images (comprising 10,000 instances) and 1,000 OV-PL images. In our pursuit of addressing these challenges, we delved into various baseline methodologies rooted in existing open-vocabulary object detection (OV-D), VG, and phrase localization (PL) frameworks. Surprisingly, we discovered that state-of-the-art (SOTA) methods often falter in diverse scenarios. Consequently, we developed a novel framework that integrates two critical components: Text-Image Query Selection (TIQS) and Language-Guided Feature Attention (LGFA). These modules are designed to bolster the recognition of novel categories and enhance the alignment between visual and linguistic information. Extensive experiments demonstrate the efficacy of our proposed framework, which consistently attains SOTA performance across the OV-VG task. Additionally, ablation studies provide further evidence of the effectiveness of our innovative models.
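The abstract describes visual grounding as locating an image region from a language description, but it does not say how a predicted region is scored. A metric commonly used for grounding tasks is accuracy at an intersection-over-union (IoU) threshold of 0.5. The sketch below illustrates that metric only; the corner box format `(x1, y1, x2, y2)`, the 0.5 threshold, and the function names `box_iou` and `grounding_accuracy` are assumptions for illustration, not taken from this repository.

```python
from typing import Sequence

# Assumed box format: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Sequence[float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def grounding_accuracy(
    preds: Sequence[Box], targets: Sequence[Box], threshold: float = 0.5
) -> float:
    """Fraction of predictions whose IoU with the paired target meets the threshold."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not preds:
        return 0.0
    hits = sum(box_iou(p, t) >= threshold for p, t in zip(preds, targets))
    return hits / len(preds)
```

Each prediction is paired with exactly one ground-truth box, which matches the one-region-per-description setting described above; a benchmark with several valid regions per description would need a different matching rule.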

TODO

[x] Release demo
[x] Release checkpoints
[x] Release dataset
[ ] Release training and inference code

Install

$ git clone https://github.com/cv516Buaa/OV-VG
$ cd OV-VG
$ pip install -r requirements.txt
$ cd demo
$ python demo.py
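If the demo fails on import, it can help to check which packages from `requirements.txt` are actually installed. The following is a minimal sketch using only the Python standard library; the helper names `requirement_name` and `missing_requirements` are hypothetical and not part of this repository, and the parser is deliberately simple (it skips option lines such as `-r other.txt` and does not handle URL-style requirements).

```python
import re
from importlib import metadata
from typing import Iterable, List, Optional


def requirement_name(line: str) -> Optional[str]:
    """Extract the distribution name from one requirements line, or None."""
    # Drop trailing comments and surrounding whitespace.
    line = line.split("#", 1)[0].strip()
    # Skip blank lines and pip option lines such as "-r other.txt".
    if not line or line.startswith("-"):
        return None
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    return match.group(0) if match else None


def missing_requirements(lines: Iterable[str]) -> List[str]:
    """Return the names of listed packages that are not installed."""
    missing = []
    for line in lines:
        name = requirement_name(line)
        if name is None:
            continue
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

Run from the repository root, `missing_requirements(open("requirements.txt"))` would list any packages that still need to be installed. Version constraints are not checked here, only whether each package is present.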

Checkpoints

OV-VG: Baidu Drive (pw: ovvg) | Google Drive

Dataset

OV-VG: Baidu Drive (pw: ovvg) | Google Drive
OV-PL: Baidu Drive (pw: ovvg) | Google Drive

Visualization

teaser

Citation

https://arxiv.org/abs/2310.14374

If you have any questions, please email wcl_buaa@buaa.edu.cn.

If you find this code useful, please cite:

@article{wang2023ov,
  title={OV-VG: A Benchmark for Open-Vocabulary Visual Grounding},
  author={Wang, Chunlei and Feng, Wenquan and Li, Xiangtai and Cheng, Guangliang and Lyu, Shuchang and Liu, Binghao and Chen, Lijiang and Zhao, Qi},
  journal={arXiv preprint arXiv:2310.14374},
  year={2023}
}

Owner

Name: cv516Buaa
Login: cv516Buaa
Kind: user
Location: Beijing, China
Company: Beihang University

Repositories: 2
Profile: https://github.com/cv516Buaa

Pattern Recognition and Artificial Intelligence Group. Prof. Qi Zhao & Lijiang Chen; Dr. Shuchang Lyu & Binghao Liu & Chunlei Wang
