promptdet
PromptDet: Towards Open-vocabulary Detection using Uncurated Images, ECCV2022
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.4%) to scientific vocabulary
Keywords
Repository
PromptDet: Towards Open-vocabulary Detection using Uncurated Images, ECCV2022
Basic Info
Statistics
- Stars: 166
- Watchers: 2
- Forks: 7
- Open Issues: 11
- Releases: 0
Topics
Metadata Files
README.md
PromptDet: Towards Open-vocabulary Detection using Uncurated Images (ECCV 2022)
Introduction
The goal of this work is to establish a scalable pipeline for expanding an object detector towards novel/unseen categories, using zero manual annotations. To achieve that, we make the following four contributions: (i) in pursuit of generalisation, we propose a two-stage open-vocabulary object detector, where the class-agnostic object proposals are classified with a text encoder from a pre-trained visual-language model; (ii) to pair the visual latent space (of the RPN box proposals) with that of the pre-trained text encoder, we propose the idea of regional prompt learning to align the textual embedding space with regional visual object features; (iii) to scale up the learning procedure towards detecting a wider spectrum of objects, we exploit available online resources via a novel self-training framework, which allows the proposed detector to be trained on a large corpus of noisy, uncurated web images; and (iv) to evaluate our proposed detector, termed PromptDet, we conduct extensive experiments on the challenging LVIS and MS-COCO datasets. PromptDet shows superior performance over existing approaches with fewer additional training images and zero manual annotations whatsoever.
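To make contribution (i) concrete, the sketch below illustrates the general recipe behind such an open-vocabulary classification head: frozen category text embeddings replace the usual learned classifier weights, and region features are scored against them by cosine similarity. The tensors, shapes and temperature value are placeholders for illustration, not the exact head used in this repository.

```python
import torch
import torch.nn.functional as F

# Placeholder shapes: N region proposals with D-dim visual features, and
# C category text embeddings produced by a frozen CLIP-style text encoder.
region_features = torch.randn(100, 512)       # e.g. RoI features from the detector
category_embeddings = torch.randn(1203, 512)  # e.g. one embedding per LVIS category

# L2-normalise both sides so that the dot product is a cosine similarity.
v = F.normalize(region_features, dim=-1)
t = F.normalize(category_embeddings, dim=-1)

# Classification logits: scaled cosine similarity between each proposal and
# every category embedding; a novel category only needs a text embedding.
temperature = 0.01                             # placeholder value
logits = v @ t.t() / temperature               # shape (100, 1203)
scores = logits.softmax(dim=-1)
```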
Training framework
[Figure: overview of the PromptDet training framework]
Updates
- July 20, 2022: add the code for LAION-novel and self-training
- March 28, 2022: initial release
Prerequisites
MMDetection version 2.16.0.
Please see get_started.md for installation and the basic usage of MMDetection.
Regional Prompt Learning (RPL)
We learn the prompt vectors in an off-line manner using RPL. For your convenience, we also provide the learned prompt vectors and the category embeddings.
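As a rough illustration of what the off-line prompt learning optimises, the sketch below follows the CoOp-style recipe of learnable context vectors shared across categories; `text_encoder` and `name_token_embeds` are hypothetical stand-ins for the frozen CLIP text pathway, and the exact prompt layout and loss used by PromptDet may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RegionalPromptLearnerSketch(nn.Module):
    """Hypothetical sketch: a small set of learnable context vectors is shared
    across categories and concatenated with the frozen token embeddings of each
    category name before being passed through a frozen text encoder."""

    def __init__(self, text_encoder, name_token_embeds, num_ctx=8, dim=512):
        super().__init__()
        self.text_encoder = text_encoder                               # frozen CLIP-style text encoder (assumed callable)
        self.register_buffer('name_token_embeds', name_token_embeds)   # (C, L, dim), precomputed and frozen
        self.ctx = nn.Parameter(torch.randn(num_ctx, dim) * 0.02)      # the only trainable parameters

    def forward(self):
        num_classes = self.name_token_embeds.size(0)
        ctx = self.ctx.unsqueeze(0).expand(num_classes, -1, -1)        # (C, num_ctx, dim)
        prompts = torch.cat([ctx, self.name_token_embeds], dim=1)      # learnable prefix + class-name tokens
        return self.text_encoder(prompts)                              # (C, dim) category embeddings

def alignment_loss(region_feats, labels, category_embeds, tau=0.01):
    """Pull region features of base-class boxes towards their category embedding."""
    v = F.normalize(region_feats, dim=-1)
    t = F.normalize(category_embeds, dim=-1)
    return F.cross_entropy(v @ t.t() / tau, labels)
```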
LAION-novel dataset
The LAION-novel dataset based on the learned category embeddings can be generated with the PromptDet tools as follows:

```bash
# stage-I: install the dependencies, download the laion400m 64GB image.index and metadata.hdf5 (https://the-eye.eu/public/AI/cah/), and then retrieve the LAION image urls
pip install faiss-cpu==1.7.2 img2dataset==1.12.0 fire==0.4.0 h5py==3.6.0
python tools/promptdet/retrieval_laion_image.py --indice-folder [laion400m-64GB-index] --metadata [metadata.hdf5] --text-features promptdet_resources/lvis_category_embeddings.pt --output-folder data/laion_lvis/images --num-images 500

# stage-II: download the LAION images
python tools/promptdet/download_laion_image.py --output-folder data/laion_lvis/images --num-thread 10

# stage-III: convert the LAION images to mmdetection format
python tools/promptdet/laion_dataset_converter.py --data-path data/laion_lvis/images --out-file data/laion_lvis/laion_train.json --topK 300
```

For your convenience, we also provide the image urls of our LAION-novel dataset.
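Conceptually, stage-I above is a nearest-neighbour search over pre-computed LAION image features: the learned category embeddings are used as queries against the downloaded faiss index, and the top-scoring images per category are kept. The sketch below illustrates that idea with the faiss Python API; the embedding file layout, the index path and the mapping from ids to urls in metadata.hdf5 are assumptions rather than the exact behaviour of retrieval_laion_image.py.

```python
import faiss                      # pip install faiss-cpu
import torch
import torch.nn.functional as F

# Assumed: the provided category embeddings load as a (C, D) float tensor.
category_embeddings = torch.load('promptdet_resources/lvis_category_embeddings.pt')
queries = F.normalize(category_embeddings.float(), dim=-1).numpy()

# Assumed path to the downloaded LAION-400M image index.
index = faiss.read_index('laion400m-64GB-index/image.index')

k = 500                                    # matches --num-images above
scores, ids = index.search(queries, k)     # (C, k) nearest LAION image ids per category

# The returned ids would then be mapped to image urls via metadata.hdf5
# (the exact key layout depends on the LAION metadata release).
```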
Inference
```bash
# assume that you are under the root directory of this project,
# and you have activated your virtual environment if needed,
# and the LVIS v1.0 dataset is in 'data/lvis_v1'.
./tools/dist_test.sh configs/promptdet/promptdet_r50_fpn_sample1e-3_mstrain_1x_lvis_v1_self_train.py work_dirs/promptdet_r50_fpn_sample1e-3_mstrain_1x_lvis_v1_self_train.pth 4 --eval bbox segm
```
Train
```bash
# download 'lvis_v1_train_seen.json' to 'data/lvis_v1/annotations'.

# train the detector without self-training
./tools/dist_train.sh configs/promptdet/promptdet_r50_fpn_sample1e-3_mstrain_1x_lvis_v1.py 4

# train the detector with self-training
./tools/dist_train.sh configs/promptdet/promptdet_r50_fpn_sample1e-3_mstrain_1x_lvis_v1_self_train.py 4
```
[0] Annotation file of the base categories: lvis_v1_train_seen.json.
[1] Note that we provide an EpochPromptDetRunner to fetch the data from multiple datasets alternately.
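Regarding note [1], the sketch below shows one way two training sources (the base-category LVIS annotations and the converted LAION-novel images) could be declared side by side in an MMDetection-style config. The dataset types, paths and omitted pipelines are illustrative assumptions; the actual configuration lives in the files under configs/promptdet/.

```python
# Hypothetical MMDetection-style config fragment (pipelines omitted for brevity);
# see configs/promptdet/ for the real dataset and runner definitions.
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=[
        dict(
            type='LVISV1Dataset',                       # base categories with box/mask labels
            ann_file='data/lvis_v1/annotations/lvis_v1_train_seen.json',
            img_prefix='data/lvis_v1/'),
        dict(
            type='CocoDataset',                         # placeholder type for the LAION-novel images
            ann_file='data/laion_lvis/laion_train.json',
            img_prefix='data/laion_lvis/images/'),
    ])

# The custom EpochPromptDetRunner mentioned above alternates batches between
# such sources during training; max_epochs here is illustrative.
runner = dict(type='EpochPromptDetRunner', max_epochs=12)
```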
Models
For your convenience, we provide the following trained models (PromptDet) with mask AP.
| Model | RPL | Self-training | Epochs | Scale Jitter | Input Size | AP_novel | AP |
|---|---|---|---|---|---|---|---|
[0] All results are obtained with a single model and without any test-time augmentation such as multi-scale testing or flipping.
[1] Refer to the config files in configs/promptdet/ for more details.
Acknowledgement
Thanks to the MMDetection team for the wonderful open-source project!
Citation
If you find PromptDet useful in your research, please consider citing:
@inproceedings{feng2022promptdet,
  title={PromptDet: Towards Open-vocabulary Detection using Uncurated Images},
  author={Feng, Chengjian and Zhong, Yujie and Jie, Zequn and Chu, Xiangxiang and Ren, Haibing and Wei, Xiaolin and Xie, Weidi and Ma, Lin},
  booktitle={Proceedings of the European Conference on Computer Vision},
  year={2022}
}
Owner
- Name: Chengjian Feng
- Login: fcjian
- Kind: user
- Repositories: 7
- Profile: https://github.com/fcjian
GitHub Events
Total
- Watch event: 6
Last Year
- Watch event: 6
Committers
Last synced: over 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| fengchengjian | f****n@m****m | 17 |
| hadoop-vacv | h****v@s****t | 10 |
| fengchengjian | f****n@o****m | 3 |
| hadoop-vacv | h****v@s****t | 3 |
| hadoop-vacv | h****v@s****t | 3 |
| lijinlong11 | l****1@m****m | 3 |
| hadoop-vacv | h****v@s****t | 1 |
| fcjian | 8****n | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: about 1 year ago
All Time
- Total issues: 16
- Total pull requests: 0
- Average time to close issues: about 2 months
- Average time to close pull requests: N/A
- Total issue authors: 14
- Total pull request authors: 0
- Average comments per issue: 1.06
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- yyyyyyfs (2)
- hanoonaR (2)
- Kyfafyd (1)
- lsx66 (1)
- jihwanp (1)
- Feobi1999 (1)
- XinZhangRadar (1)
- YasminZhang (1)
- krisandchris (1)
- vansin (1)
- liujiaheng (1)
- eternaldolphin (1)
- LeonG7 (1)
- RuoyuChen10 (1)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- cython *
- numpy *
- docutils ==0.16.0
- recommonmark *
- sphinx ==4.0.2
- sphinx_markdown_tables *
- sphinx_rtd_theme ==0.5.2
- mmcv-full >=1.3.8
- cityscapesscripts *
- imagecorruptions *
- scipy *
- sklearn *
- mmcv *
- torch *
- torchvision *
- matplotlib *
- numpy *
- pycocotools *
- pycocotools-windows *
- six *
- terminaltables *
- asynctest * test
- codecov * test
- flake8 * test
- interrogate * test
- isort ==4.3.21 test
- kwarray * test
- mmtrack * test
- onnx ==1.7.0 test
- onnxruntime >=1.8.0 test
- pytest * test
- ubelt * test
- xdoctest >=0.10.0 test
- yapf * test
- pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build
- pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build