Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (8.9%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: tengerye
- Language: Python
- Default Branch: master
- Size: 203 MB
Statistics
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Fine-Grained Open-Vocabulary Object Detection with Fine-Grained Prompts: Task, Dataset and Benchmark
Official codebase and dataset for the ICRA 2025 paper (Oral) "Fine-Grained Open-Vocabulary Object Detection with Fine-Grained Prompts: Task, Dataset and Benchmark" [Paper] | [Project Page]
Overview
3F-OVD introduces a new benchmark for fine-grained open-vocabulary object detection (OVD), designed to evaluate detectors under realistic, challenging, and scalable conditions. We highlight the limitations of existing evaluation protocols and propose:
- A novel evaluation task that extends fine-grained detection to an open-vocabulary setting with class-level captions.
- A large-scale NEU-171K dataset spanning two domains: vehicles and retail products.
- A simple yet effective post-processing method that boosts the performance of open-vocabulary detectors by reducing false positives.
Dataset: NEU-171K
The NEU-171K dataset includes:
- 145,825 images, 676,471 bounding boxes, and 719 fine-grained classes.
- Two domains: NEU-171K-C and NEU-171K-RP.
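To put the headline numbers in perspective, the short sketch below derives average annotation densities from the three totals quoted above. The function name `dataset_density` is hypothetical and not part of this repository, and the results are dataset-wide means only; the README does not give a per-domain breakdown, so none is assumed here.

```python
# Totals quoted in the dataset description above.
NUM_IMAGES = 145_825
NUM_BOXES = 676_471
NUM_CLASSES = 719


def dataset_density(num_images: int, num_boxes: int, num_classes: int) -> dict:
    """Return simple averages derived from the dataset totals.

    Hypothetical helper for illustration; these are plain means and say
    nothing about how boxes are distributed across images or classes.
    """
    return {
        "boxes_per_image": num_boxes / num_images,
        "boxes_per_class": num_boxes / num_classes,
    }


if __name__ == "__main__":
    stats = dataset_density(NUM_IMAGES, NUM_BOXES, NUM_CLASSES)
    for name, value in stats.items():
        print(f"{name}: {value:.2f}")
```

On average this works out to between four and five boxes per image and several hundred boxes per class, though fine-grained datasets are often long-tailed, so individual classes may differ substantially from the mean.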
NEU-171K-C
NEU-171K-C contains cars in real-world traffic scenes.

NEU-171K-RP
NEU-171K-RP contains retail products captured in controlled warehouse settings.

You can access the dataset from:
More details on dataset structure and statistics are in datasets/README.md.
Benchmarking & Codebase
This repository includes:
```
- datasets/
  - README.md          # Dataset description and download instructions
- src/
  - supervised/        # Training & evaluation of traditional detectors (Section V-B)
  - open_vocabulary/   # Evaluation of open-vocabulary detectors (Section V-C)
    - cora/
    - detic/
    - gdino/
    - vild/
  - post_process/      # Our custom post-processing for reducing false positives (Section V-D)
```
Supported Baselines
- Supervised: Co-DETR, Faster R-CNN, FCOS, PAA, etc.
- Open-Vocabulary: ViLD, Detic, Grounding DINO
Run Evaluation
Instructions for running each baseline and applying the post-processing trick are included in the respective subfolders under src/.
Benchmarks
| Method | Trick | NEU-171K-C | NEU-171K-RP |
|--------|-------|--------------------|---------------------|
| GDino | w/o | 1.2e-03 | 7.4e-04 |
| GDino | w | 1.3e-03 (+8.3%) | 7.6e-04 (+2.6%) |
| Detic | w/o | 6.3e-04 | 2.0e-02 |
| Detic | w | 6.6e-04 (+4.7%) | 2.2e-02 (+10.0%) |
| Vild | w/o | 3.3e-04 | 7.5e-03 |
| Vild | w | 3.8e-04 (+15.2%) | 10.6e-03 (+41.3%) |
Post-processing improves accuracy by reducing false-positive bounding boxes generated from caption tokens.
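The percentages in parentheses are relative gains over the run without the trick. As a sanity check, the sketch below recomputes them from the rounded scores in the benchmark table. The names `RESULTS`, `relative_gain`, and `gains` are hypothetical and not part of this repository, and the README does not name the metric, so the code treats the values as opaque scores. Because the inputs are already rounded, a recomputed gain can differ from the reported one in the last digit (for example, GDino on NEU-171K-RP comes out near 2.7% rather than 2.6%).

```python
# (method, domain) -> (score without the trick, score with the trick),
# copied from the benchmark table above.
RESULTS = {
    ("GDino", "NEU-171K-C"): (1.2e-03, 1.3e-03),
    ("GDino", "NEU-171K-RP"): (7.4e-04, 7.6e-04),
    ("Detic", "NEU-171K-C"): (6.3e-04, 6.6e-04),
    ("Detic", "NEU-171K-RP"): (2.0e-02, 2.2e-02),
    ("Vild", "NEU-171K-C"): (3.3e-04, 3.8e-04),
    ("Vild", "NEU-171K-RP"): (7.5e-03, 10.6e-03),
}


def relative_gain(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


def gains(results: dict) -> dict:
    """Map each (method, domain) pair to its relative gain in percent."""
    return {key: relative_gain(b, a) for key, (b, a) in results.items()}


if __name__ == "__main__":
    for (method, domain), gain in gains(RESULTS).items():
        print(f"{method:6s} {domain:12s} {gain:+.1f}%")
```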
Citation
If you use this work, please cite:
```bibtex
@article{liu2025fine,
  title={Fine-Grained Open-Vocabulary Object Detection with Fine-Grained Prompts: Task, Dataset and Benchmark},
  author={Liu, Ying and Hua, Yijing and Chai, Haojiang and Wang, Yanbo and Ye, TengQi},
  journal={arXiv preprint arXiv:2503.14862},
  year={2025}
}
```
Owner
- Name: TengQi Ye
- Login: tengerye
- Kind: user
- Location: Shanghai
- Company: Bytedance
- Repositories: 1
- Profile: https://github.com/tengerye
Obtained a Ph.D. from Dublin City University. Research interest: machine learning. Works at ByteDance Inc. as a computer vision engineer.
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use our dataset or codes, please cite it as below."
authors:
- family-names: Liu
given-names: Ying
- family-names: Hua
given-names: Yijing
- family-names: Chai
given-names: Haojiang
- family-names: Wang
given-names: Yanbo
- family-names: Ye
given-names: TengQi
- family-names: "Bot"
given-names: "Hew"
orcid: "https://orcid.org/0000-0002-3501-8599"
title: "Fine-Grained Open-Vocabulary Object Detection with Fine-Grained Prompts: Task, Dataset and Benchmark"
version: 1.0.0
doi: 10.13140/RG.2.2.14172.71049
date-released: 2025-05-22
url: "https://github.com/tengerye/3FOVD"
type: article-journal
year: '2025'
conference: {}
publisher: {}
GitHub Events
Total
- Watch event: 1
- Delete event: 4
- Issue comment event: 3
- Push event: 96
- Pull request event: 25
- Fork event: 1
- Create event: 8
Last Year
- Watch event: 1
- Delete event: 4
- Issue comment event: 3
- Push event: 96
- Pull request event: 25
- Fork event: 1
- Create event: 8
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 11
- Average time to close issues: N/A
- Average time to close pull requests: 2 days
- Total issue authors: 0
- Total pull request authors: 3
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 7
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 11
- Average time to close issues: N/A
- Average time to close pull requests: 2 days
- Issue authors: 0
- Pull request authors: 3
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 7
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- chaishare (8)
- tengerye (2)
- ZCRC-EXCE (2)