https://github.com/cv516buaa/udl

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (7.0%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: cv516Buaa
Language: Python
Default Branch: main
Size: 8.64 MB

Statistics

Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created almost 2 years ago · Last pushed almost 2 years ago

Metadata Files

Readme

UDL: Open Vocabulary Object Detection with LLM-based Unified Descriptive Language

Chunlei Wang · Wenquan Feng · Binghao Liu · Meng Li · Lijiang Chen · Qi Zhao

Highlight!!!!

UDL: Open Vocabulary Object Detection with LLM-based Unified Descriptive Language

Abstract

With the rapid development of vision-language approaches, a growing number of researchers are focusing on the open vocabulary learning paradigm. These methods support a variety of vision-language tasks by aligning region features with language embeddings, which improves recognition of novel categories. However, existing methods neglect the diversity of input language across tasks, which makes it difficult for the model to understand textual context and leaves it heavily reliant on category names to detect objects. Capturing fine-grained features in images and text descriptions is also a challenge. To address these issues, we propose Open Vocabulary Object Detection with LLM-based Unified Descriptive Language (UDL), which uses Hierarchical Gated Cross Attention (HGCA) and Pixel-level Visual Language Attention (PVLA) for more comprehensive contextual understanding and better visual-language alignment. On the OmniLabel object detection benchmark, under the zero-shot detection setting, our approach handles open vocabulary object detection better and achieves new SOTA results. Ablation studies and visualization experiments demonstrate the effectiveness of the proposed components. Code will be made publicly available at https://github.com/cv516Buaa/UDL.
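
The general idea of aligning region features with language embeddings can be illustrated with a small sketch. This is not the UDL implementation (the training and inference code has not been released); it is a generic, minimal example of open-vocabulary classification by cosine similarity. The function names `cosine` and `classify_regions` and the toy 3-dimensional embeddings are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify_regions(region_feats, text_embeds):
    """Give each region the label whose text embedding is most similar.

    region_feats: list of feature vectors, one per detected region.
    text_embeds: dict mapping a label (or description) to its embedding.
    Both are assumed to live in the same embedding space.
    """
    results = []
    for feat in region_feats:
        scores = {label: cosine(feat, emb) for label, emb in text_embeds.items()}
        best = max(scores, key=scores.get)
        results.append((best, scores[best]))
    return results


# Toy embeddings (made up): labels can be added at query time, which is
# what allows categories that were never seen during training.
text_embeds = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.0, 1.0, 0.0],
    "zebra": [0.0, 0.0, 1.0],
}
region_feats = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.95]]

predictions = classify_regions(region_feats, text_embeds)
for label, score in predictions:
    print(f"{label}: {score:.3f}")
```

In a real detector the vectors would come from a visual backbone and a text encoder rather than being written by hand, and the label set could be replaced by free-form descriptions.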

TODO

[x] Release demo
[x] Release checkpoints
[ ] Release training and inference codes

Checkpoints

UDL: Baidu Drive (pw: udll) | Google Drive

Citation

If you have any questions, please email wcl_buaa@buaa.edu.cn.

Owner

Name: cv516Buaa
Login: cv516Buaa
Kind: user
Location: Beijing, China
Company: Beihang University

Repositories: 2
Profile: https://github.com/cv516Buaa

Pattern Recognition and Artificial Intelligence Group. Prof. Qi Zhao & Lijiang Chen; Dr. Shuchang Lyu & Binghao Liu & Chunlei Wang
