Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.9%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: tengerye
  • Language: Python
  • Default Branch: master
  • Size: 203 MB
Statistics
  • Stars: 0
  • Watchers: 2
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed 9 months ago
Metadata Files
Readme Citation

README.md

Fine-Grained Open-Vocabulary Object Detection with Fine-Grained Prompts: Task, Dataset and Benchmark

Official codebase and dataset for the ICRA 2025 paper (Oral) "Fine-Grained Open-Vocabulary Object Detection with Fine-Grained Prompts: Task, Dataset and Benchmark" [Paper] | [Project Page]


Overview

3F-OVD introduces a new benchmark for fine-grained open-vocabulary object detection (OVD), designed to evaluate detectors under realistic, challenging, and scalable conditions. We highlight the limitations of existing evaluation protocols and propose:

  • A novel evaluation task that extends fine-grained detection to an open-vocabulary setting with class-level captions.
  • A large-scale NEU-171K dataset spanning two domains: vehicles and retail products.
  • A simple yet effective post-processing method that boosts the performance of open-vocabulary detectors by reducing false positives.

Dataset: NEU-171K

The NEU-171K dataset includes:

  • 145,825 images, 676,471 bounding boxes, 719 fine-grained classes.
  • Two domains: NEU-171K-C and NEU-171K-RP.
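For a sense of annotation density, the headline counts can be turned into per-image and per-class averages. This is plain arithmetic on the figures quoted above, pooled over both domains. It is not a statistic reported by the authors, and the variable names are illustrative.

```python
# Headline counts quoted in the README (both domains combined).
num_images = 145_825
num_boxes = 676_471
num_classes = 719

# Simple averages; per-domain values may differ from these pooled numbers.
boxes_per_image = num_boxes / num_images
boxes_per_class = num_boxes / num_classes

print(f"boxes per image: {boxes_per_image:.2f}")
print(f"boxes per class: {boxes_per_class:.1f}")
```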

NEU-171K-C

NEU-171K-C contains cars in real-world traffic scenes.

NEU-171K-RP

NEU-171K-RP contains retail products captured in controlled warehouse settings.

You can access the dataset from:

More details on dataset structure and statistics are in datasets/README.md.


Benchmarking & Codebase

This repository includes:

```
- datasets/
  - README.md        # Dataset description and download instructions
- src/
  - supervised/      # Training & evaluation of traditional detectors (Section V-B)
  - open_vocabulary/ # Evaluation of open-vocabulary detectors (Section V-C)
    - cora/
    - detic/
    - gdino/
    - vild/
  - post_process/    # Our custom post-processing for reducing false positives (Section V-D)
```

Supported Baselines

  • Supervised: Co-DETR, Faster R-CNN, FCOS, PAA, etc.
  • Open-Vocabulary: ViLD, Detic, Grounding DINO

Run Evaluation

Instructions for running each baseline and applying the post-processing trick are included in the respective subfolders under src/.


Benchmarks

| Method | Trick | NEU-171K-C | NEU-171K-RP |
|--------|-------|------------|-------------|
| GDino | w/o | 1.2e-03 | 7.4e-04 |
| GDino | w | 1.3e-03 (+8.3%) | 7.6e-04 (+2.6%) |
| Detic | w/o | 6.3e-04 | 2.0e-02 |
| Detic | w | 6.6e-04 (+4.7%) | 2.2e-02 (+10.0%) |
| Vild | w/o | 3.3e-04 | 7.5e-03 |
| Vild | w | 3.8e-04 (+15.2%) | 10.6e-03 (+41.3%) |
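The percentages in parentheses read as relative gains over the same detector without the post-processing trick. A minimal sketch of that arithmetic follows, assuming the gain is computed as (with − without) / without. The helper `relative_gain` is illustrative and not part of the repository. Because the table reports rounded scores, values recomputed this way can differ from the printed percentages in the last digit.

```python
def relative_gain(without: float, with_trick: float) -> float:
    """Relative improvement in percent, assuming (with - without) / without."""
    if without <= 0:
        raise ValueError("baseline score must be positive")
    return (with_trick - without) / without * 100.0

# Two entries copied from the table above: (method, domain) -> (w/o, w).
scores = {
    ("GDino", "NEU-171K-C"): (1.2e-03, 1.3e-03),
    ("Vild", "NEU-171K-RP"): (7.5e-03, 10.6e-03),
}

gains = {key: round(relative_gain(*pair), 1) for key, pair in scores.items()}
for (method, domain), gain in gains.items():
    print(f"{method} on {domain}: +{gain}%")
```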

Post-processing improves accuracy by reducing false-positive bounding boxes generated from caption tokens.
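The repository's actual procedure lives in src/post_process/ and is not reproduced here. As a rough illustration of the idea only, the sketch below assumes a hypothetical detector output in which each box records the caption token it was matched to. It keeps only boxes grounded on class-name tokens, not on descriptive tokens elsewhere in the caption. The `Detection` type, its fields, and `filter_by_class_tokens` are invented for this example and should not be read as the authors' method.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    box: tuple[float, float, float, float]  # (x1, y1, x2, y2)
    score: float
    token: str  # caption token the box was matched to (hypothetical field)

def filter_by_class_tokens(detections, class_tokens, min_score=0.0):
    """Keep boxes matched to a class-name token with score >= min_score."""
    allowed = {t.lower() for t in class_tokens}
    return [d for d in detections
            if d.token.lower() in allowed and d.score >= min_score]

# Toy predictions for a caption that mentions both the object and its parts.
dets = [
    Detection((10, 20, 110, 220), 0.62, "sedan"),
    Detection((15, 30, 60, 80), 0.55, "headlights"),  # box on an attribute token
    Detection((300, 40, 420, 260), 0.12, "sedan"),    # low-confidence box
]
kept = filter_by_class_tokens(dets, class_tokens=["sedan"], min_score=0.3)
print([d.box for d in kept])
```

In this toy input, the box matched to "headlights" is dropped because it is not a class-name token, and the second "sedan" box is dropped by the score threshold.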


Citation

If you use this work, please cite:

```bibtex
@article{liu2025fine,
  title={Fine-Grained Open-Vocabulary Object Detection with Fine-Grained Prompts: Task, Dataset and Benchmark},
  author={Liu, Ying and Hua, Yijing and Chai, Haojiang and Wang, Yanbo and Ye, TengQi},
  journal={arXiv preprint arXiv:2503.14862},
  year={2025}
}
```

Owner

  • Name: TengQi Ye
  • Login: tengerye
  • Kind: user
  • Location: Shanghai
  • Company: Bytedance

Obtained a Ph.D. from Dublin City University. Research interest: machine learning. Works at ByteDance Inc. as a computer vision engineer.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use our dataset or codes, please cite it as below."
authors:
  - family-names: Liu
    given-names: Ying
  - family-names: Hua
    given-names: Yijing
  - family-names: Chai
    given-names: Haojiang
  - family-names: Wang
    given-names: Yanbo
  - family-names: Ye
    given-names: TengQi
  - family-names: "Bot"
    given-names: "Hew"
    orcid: "https://orcid.org/0000-0002-3501-8599"
title: "Fine-Grained Open-Vocabulary Object Detection with Fine-Grained Prompts: Task, Dataset and Benchmark"
version: 1.0.0
doi: 10.13140/RG.2.2.14172.71049
date-released: 2025-05-22
url: "https://github.com/tengerye/3FOVD"
type: article-journal
year: '2025'
conference: {}
publisher: {}

GitHub Events

Total
  • Watch event: 1
  • Delete event: 4
  • Issue comment event: 3
  • Push event: 96
  • Pull request event: 25
  • Fork event: 1
  • Create event: 8
Last Year
  • Watch event: 1
  • Delete event: 4
  • Issue comment event: 3
  • Push event: 96
  • Pull request event: 25
  • Fork event: 1
  • Create event: 8

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 11
  • Average time to close issues: N/A
  • Average time to close pull requests: 2 days
  • Total issue authors: 0
  • Total pull request authors: 3
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 7
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 11
  • Average time to close issues: N/A
  • Average time to close pull requests: 2 days
  • Issue authors: 0
  • Pull request authors: 3
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 7
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • chaishare (8)
  • tengerye (2)
  • ZCRC-EXCE (2)