https://github.com/cv516buaa/udl
Repository
Basic Info
- Host: GitHub
- Owner: cv516Buaa
- Language: Python
- Default Branch: main
- Size: 8.64 MB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
UDL: Open Vocabulary Object Detection with LLM-based Unified Descriptive Language
Chunlei Wang · Wenquan Feng · Binghao Liu · Meng Li · Lijiang Chen · Qi Zhao

Abstract
With the rapid development of vision-language approaches, a growing number of researchers focus on the open vocabulary learning paradigm. These methods support a variety of vision-language tasks, aligning region features with language embeddings to improve recognition of novel categories. However, existing methods neglect the diversity of input language across tasks, which makes it difficult for the model to understand textual context and forces it to rely heavily on category names to detect objects. Capturing fine-grained features in images and text descriptions is also a challenge. To address these issues, we propose Open Vocabulary Object Detection with LLM-based Unified Descriptive Language (UDL), which uses Hierarchical Gated Cross Attention (HGCA) and Pixel-level Visual Language Attention (PVLA) for more comprehensive contextual understanding and better visual-language alignment. On the OmniLabel object detection benchmark, under the zero-shot detection setting, our approach handles open vocabulary object detection better and achieves new SOTA results. Ablation studies and visualization experiments demonstrate the effectiveness of the proposed components. Code will be made publicly available at https://github.com/cv516Buaa/UDL.
TODO
- [x] Release demo
- [x] Release checkpoints
- [ ] Release training and inference codes
Checkpoints
UDL: Baidu Drive (pw: udll) | Google Drive

Citation
If you have any questions, please contact me by email at wcl_buaa@buaa.edu.cn.
Owner
- Name: cv516Buaa
- Login: cv516Buaa
- Kind: user
- Location: Beijing, China
- Company: Beihang University
- Repositories: 2
- Profile: https://github.com/cv516Buaa
Pattern Recognition and Artificial Intelligence Group. Prof. Qi Zhao & Lijiang Chen; Dr. Shuchang Lyu & Binghao Liu & Chunlei Wang