3fovd

https://github.com/tengerye/3fovd

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (8.9%) to scientific vocabulary

Last synced: 11 months ago

Repository

Basic Info

Host: GitHub
Owner: tengerye
Language: Python
Default Branch: master
Size: 203 MB

Statistics

Stars: 0
Watchers: 2
Forks: 1
Open Issues: 0
Releases: 0

Created almost 2 years ago · Last pushed about 1 year ago

Metadata Files

Readme Citation

Fine-Grained Open-Vocabulary Object Detection with Fine-Grained Prompts: Task, Dataset and Benchmark

Official codebase and dataset for the ICRA 2025 paper (Oral) "Fine-Grained Open-Vocabulary Object Detection with Fine-Grained Prompts: Task, Dataset and Benchmark" [Paper] | [Project Page]

Overview

3F-OVD introduces a new benchmark for fine-grained open-vocabulary object detection (OVD), designed to evaluate detectors under realistic, challenging, and scalable conditions. We highlight the limitations of existing evaluation protocols and propose:

- A novel evaluation task that extends fine-grained detection to an open-vocabulary setting with class-level captions.
- A large-scale NEU-171K dataset spanning two domains: vehicles and retail products.
- A simple yet effective post-processing method that boosts the performance of open-vocabulary detectors by reducing false positives.

Dataset: NEU-171K

The NEU-171K dataset includes:

- 145,825 images, 676,471 bounding boxes, and 719 fine-grained classes.
- Two domains: NEU-171K-C and NEU-171K-RP.
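To get a feel for the annotation density implied by these totals, the averages can be derived directly from the three counts above. This is a small arithmetic sketch. The constant names are illustrative, and the per-class figures are only rough averages, since the per-class distribution is not given here.

```python
# Totals quoted in the dataset summary above.
NUM_IMAGES = 145_825
NUM_BOXES = 676_471
NUM_CLASSES = 719

# Average number of annotated boxes per image.
boxes_per_image = NUM_BOXES / NUM_IMAGES

# Rough averages per class (the real distribution may be far from uniform).
boxes_per_class = NUM_BOXES / NUM_CLASSES
images_per_class = NUM_IMAGES / NUM_CLASSES

print(f"boxes per image:  {boxes_per_image:.2f}")
print(f"boxes per class:  {boxes_per_class:.0f}")
print(f"images per class: {images_per_class:.0f}")
```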

NEU-171K-C

NEU-171K-C contains cars in real-world traffic scenes.

NEU-171K-RP

NEU-171K-RP contains retail products captured in controlled warehouse settings.

You can access the dataset from:

More details on dataset structure and statistics are in datasets/README.md.

Benchmarking & Codebase

This repository includes:

```
- datasets/
  - README.md          # Dataset description and download instructions
- src/
  - supervised/        # Training & evaluation of traditional detectors (Section V-B)
  - open_vocabulary/   # Evaluation of open-vocabulary detectors (Section V-C)
    - cora/
    - detic/
    - gdino/
    - vild/
  - post_process/      # Our custom post-processing for reducing false positives (Section V-D)
```
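After cloning, it can be handy to confirm that a checkout matches the layout listed above before running any baseline. The following is a minimal sketch using only the standard library. The helper name `missing_dirs` and the `EXPECTED_DIRS` list are illustrative and not part of the repository. The paths are taken from the tree above.

```python
from pathlib import Path

# Directories named in the layout above, relative to the repository root.
EXPECTED_DIRS = [
    "datasets",
    "src/supervised",
    "src/open_vocabulary/cora",
    "src/open_vocabulary/detic",
    "src/open_vocabulary/gdino",
    "src/open_vocabulary/vild",
    "src/post_process",
]


def missing_dirs(repo_root):
    """Return the expected directories that are absent under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Layout matches the expected structure.")
```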

Supported Baselines

Supervised: Co-DETR, Faster R-CNN, FCOS, PAA, etc.
Open-Vocabulary: ViLD, Detic, Grounding DINO

Run Evaluation

Instructions for running each baseline and applying the post-processing trick are included in the respective subfolders under src/.

Benchmarks

| Method | Trick | NEU-171K-C | NEU-171K-RP |
|--------|-------|------------|-------------|
| GDino | w/o | 1.2e-03 | 7.4e-04 |
| GDino | w | 1.3e-03 (+8.3%) | 7.6e-04 (+2.6%) |
| Detic | w/o | 6.3e-04 | 2.0e-02 |
| Detic | w | 6.6e-04 (+4.7%) | 2.2e-02 (+10.0%) |
| ViLD | w/o | 3.3e-04 | 7.5e-03 |
| ViLD | w | 3.8e-04 (+15.2%) | 10.6e-03 (+41.3%) |
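The percentages in parentheses read as relative gains of the "w" row over the matching "w/o" row. The sketch below recomputes them from the rounded scores shown in the table. The function name `relative_gain` and the `results` dictionary are illustrative. Because the inputs are already rounded to two significant figures, two entries (GDino on NEU-171K-RP and Detic on NEU-171K-C) come out 0.1 point higher than reported, presumably because the reported gains were computed from unrounded scores.

```python
def relative_gain(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# (score without trick, score with trick), copied from the table.
results = {
    ("GDino", "NEU-171K-C"): (1.2e-03, 1.3e-03),
    ("GDino", "NEU-171K-RP"): (7.4e-04, 7.6e-04),
    ("Detic", "NEU-171K-C"): (6.3e-04, 6.6e-04),
    ("Detic", "NEU-171K-RP"): (2.0e-02, 2.2e-02),
    ("ViLD", "NEU-171K-C"): (3.3e-04, 3.8e-04),
    ("ViLD", "NEU-171K-RP"): (7.5e-03, 10.6e-03),
}

for (method, domain), (without, with_trick) in results.items():
    gain = relative_gain(without, with_trick)
    print(f"{method:6s} {domain:12s} {gain:+.1f}%")
```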

Post-processing improves accuracy by reducing false-positive bounding boxes generated from caption tokens.
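The actual implementation lives in src/post_process/ and is not reproduced here. As a rough illustration of the general idea only, the sketch below assumes that each detection records which prompt token it matched most strongly, and that the token span covering the class name (as opposed to the descriptive caption) is known. Under those assumptions, boxes grounded on caption tokens are dropped. The `Detection` structure, the `token_idx` field, and `filter_by_name_tokens` are all hypothetical and should not be read as the authors' method.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2)
    score: float
    token_idx: int  # hypothetical: index of the best-matching prompt token


def filter_by_name_tokens(
    detections: List[Detection],
    name_token_span: Tuple[int, int],
    min_score: float = 0.0,
) -> List[Detection]:
    """Keep detections grounded on class-name tokens (half-open span)."""
    start, end = name_token_span
    return [
        d
        for d in detections
        if start <= d.token_idx < end and d.score >= min_score
    ]


# Example: tokens 0-1 form the class name, tokens 2+ are caption words.
dets = [
    Detection((10, 10, 50, 50), 0.60, token_idx=0),  # matched the name
    Detection((20, 30, 40, 80), 0.55, token_idx=4),  # matched a caption word
    Detection((60, 15, 90, 70), 0.20, token_idx=1),  # matched the name
]
kept = filter_by_name_tokens(dets, name_token_span=(0, 2))
```

In this toy example the second box is discarded because its strongest match is a caption token rather than part of the class name.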

Citation

If you use this work, please cite:

```bibtex
@article{liu2025fine,
  title={Fine-Grained Open-Vocabulary Object Detection with Fine-Grained Prompts: Task, Dataset and Benchmark},
  author={Liu, Ying and Hua, Yijing and Chai, Haojiang and Wang, Yanbo and Ye, TengQi},
  journal={arXiv preprint arXiv:2503.14862},
  year={2025}
}
```

Owner

Name: TengQi Ye
Login: tengerye
Kind: user
Location: Shanghai
Company: Bytedance

Repositories: 1
Profile: https://github.com/tengerye

Obtained a Ph.D. from Dublin City University. Research interest: machine learning. Works at ByteDance Inc. as a computer vision engineer.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use our dataset or codes, please cite it as below."
authors:
  - family-names: Liu
    given-names: Ying
  - family-names: Hua
    given-names: Yijing
  - family-names: Chai
    given-names: Haojiang
  - family-names: Wang
    given-names: Yanbo
  - family-names: Ye
    given-names: TengQi
  - family-names: "Bot"
    given-names: "Hew"
    orcid: "https://orcid.org/0000-0002-3501-8599"
title: "Fine-Grained Open-Vocabulary Object Detection with Fine-Grained Prompts: Task, Dataset and Benchmark"
version: 1.0.0
doi: 10.13140/RG.2.2.14172.71049
date-released: 2025-05-22
url: "https://github.com/tengerye/3FOVD"
type: article-journal
year: '2025'
conference: {}
publisher: {}

GitHub Events

Total

Watch event: 1
Delete event: 4
Issue comment event: 3
Push event: 96
Pull request event: 25
Fork event: 1
Create event: 8

Last Year

Watch event: 1
Delete event: 4
Issue comment event: 3
Push event: 96
Pull request event: 25
Fork event: 1
Create event: 8

Issues and Pull Requests

Last synced: 11 months ago

All Time

Total issues: 0
Total pull requests: 11
Average time to close issues: N/A
Average time to close pull requests: 2 days
Total issue authors: 0
Total pull request authors: 3
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 7
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 11
Average time to close issues: N/A
Average time to close pull requests: 2 days
Issue authors: 0
Pull request authors: 3
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 7
Bot issues: 0
Bot pull requests: 0
