cpa-enhancer

This is the official repository of the paper: CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations

https://github.com/zyw-stu/cpa-enhancer

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.8%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: zyw-stu
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 20.3 MB
Statistics
  • Stars: 47
  • Watchers: 2
  • Forks: 2
  • Open Issues: 11
  • Releases: 0
Created over 2 years ago · Last pushed about 2 years ago
Metadata Files
Readme License Citation

README.md

CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations

📰 arXiv Preprint: arXiv 2403.11220

✅ Updates

  • March 24th, 2024: We have released CPA-Seg, which applies CPA-Enhancer to segmentation tasks.

🚀 Overview

Figure: Overview of the proposed CPA-Enhancer.

Figure: Our proposed content-driven prompt block (CPB).

Abstract: Object detection methods under known single degradations have been extensively investigated. However, existing approaches require prior knowledge of the degradation type and train a separate model for each, limiting their practical applications in unpredictable environments. To address this challenge, we propose a chain-of-thought (CoT) prompted adaptive enhancer, CPA-Enhancer, for object detection under unknown degradations. Specifically, CPA-Enhancer progressively adapts its enhancement strategy under the step-by-step guidance of CoT prompts that encode degradation-related information. To the best of our knowledge, this is the first work that exploits CoT prompting for object detection tasks. Overall, CPA-Enhancer is a plug-and-play enhancement model that can be integrated into any generic detector to achieve substantial gains on degraded images, without prior knowledge of the degradation type. Experimental results demonstrate that CPA-Enhancer not only sets a new state of the art for object detection but also boosts the performance of other downstream vision tasks under multiple unknown degradations.

🛠️ Installation

  • Step0. Download and install Miniconda from the official website.
  • Step1. Create a conda environment and activate it.

```shell
conda create --name openmmlab python=3.8 -y
conda activate openmmlab
```

  • Step2. Install PyTorch.

```shell
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
```

  • Step3. Install MMEngine and MMCV using MIM.

```shell
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
```

  • Step4. Install other related packages.

```shell
cd CPA_Enhancer
pip install -r ./cpa/requirements.txt
```
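After installation, it can help to confirm that the key packages resolved to the intended versions. The following is a minimal sketch, not part of the repository: the helper name `check_versions` and the `EXPECTED` table are illustrative, with pins copied from the install commands above and from `cpa/requirements.txt`.

```python
from importlib import metadata

# Pins taken from the install commands above and from cpa/requirements.txt.
EXPECTED = {
    "torch": "1.11.0",
    "torchvision": "0.12.0",
    "mmcv": "2.1.0",
    "mmengine": "0.10.1",
}


def check_versions(expected):
    """Return {package: (installed_or_None, expected, matches)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Local build tags such as "+cu113" are ignored when comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, wanted, matches)
    return report


if __name__ == "__main__":
    for name, (installed, wanted, ok) in check_versions(EXPECTED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:12s} installed={installed} expected={wanted} [{status}]")
```

A mismatch is not necessarily an error (the MIM command above only requires `mmcv>=2.0.0`), but it is a useful first thing to look at if imports fail later.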

📁 Data Preparation

Synthetic Datasets

  • Step1. Download the PASCAL VOC trainval and test data.

```shell
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
```

  • Step2. Construct VnA-T (5 categories, 8111 images) / VnB-T (10 categories, 12334 images) from VOCtrainval_06-Nov-2007.tar and VOCtrainval_11-May-2012.tar. Construct VnA (5 categories, 2734 images) / VnB (10 categories, 3760 images) from VOCtest_06-Nov-2007.tar.

We also provide a list of the image names included in each dataset in cpa/dataSyn/datalist.

```python
# 5 classes
target_classes = ['person', 'car', 'bus', 'bicycle', 'motorbike']

# 10 classes
target_classes = ['bicycle', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'dog', 'motorbike', 'person']
```
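To illustrate how a class list like the one above can be used to select images, here is a small sketch that reads a VOC annotation and reports which annotated objects fall in the target classes. The function names and the selection rule (keep an image if it contains at least one target-class object) are assumptions for illustration. The image lists in cpa/dataSyn/datalist remain the authoritative definition of each subset.

```python
import xml.etree.ElementTree as ET

TARGET_CLASSES_5 = ['person', 'car', 'bus', 'bicycle', 'motorbike']


def objects_in_target(xml_text, target_classes):
    """Return the names of annotated objects that belong to target_classes."""
    root = ET.fromstring(xml_text)
    names = [obj.findtext("name", default="").strip() for obj in root.iter("object")]
    return [name for name in names if name in target_classes]


def keep_image(xml_text, target_classes):
    """Assumed rule: keep an image if it has at least one target-class object."""
    return len(objects_in_target(xml_text, target_classes)) > 0


if __name__ == "__main__":
    # Tiny hand-written annotation, not taken from the dataset.
    sample = (
        "<annotation>"
        "<object><name>dog</name></object>"
        "<object><name>car</name></object>"
        "</annotation>"
    )
    print(objects_in_target(sample, TARGET_CLASSES_5))  # -> ['car']
    print(keep_image(sample, TARGET_CLASSES_5))         # -> True
```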

Make sure the directory follows this basic VOC structure.

```shell
data_vocnorm (data_vocnorm_10)    # path\to\vocnorm
├── train    # VnA-T (VnB-T)
|   ├── Annotations
|   |   └── xxx.xml
|   |   ...
|   ├── ImageSets
|   |   └── Main
|   |       └── train_voc.txt    # you can find it in cpa\dataSyn\datalist
|   └── JPEGImages
|       └── xxx.jpg
|       ...
├── test    # VnA (VnB)
|   ├── Annotations
|   |   └── xxx.xml
|   |   ...
|   ├── ImageSets
|   |   └── Main
|   |       └── test_voc.txt    # you can find it in cpa\dataSyn\datalist
|   └── JPEGImages
|       └── xxx.jpg
|       ...
```
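Before moving on, the layout above can be checked programmatically. The sketch below is a hypothetical helper (`check_voc_split` is not part of the repository) and assumes the list file holds one image id per line without a file extension, as is usual for VOC `ImageSets/Main` lists.

```python
from pathlib import Path


def check_voc_split(split_dir, list_name):
    """Return a list of problems found in one split (an empty list means OK)."""
    split_dir = Path(split_dir)
    list_file = split_dir / "ImageSets" / "Main" / list_name
    if not list_file.is_file():
        return [f"missing list file: {list_file}"]

    problems = []
    for image_id in list_file.read_text().split():
        if not (split_dir / "Annotations" / f"{image_id}.xml").is_file():
            problems.append(f"missing annotation for {image_id}")
        if not (split_dir / "JPEGImages" / f"{image_id}.jpg").is_file():
            problems.append(f"missing image for {image_id}")
    return problems


if __name__ == "__main__":
    # Placeholder path; point it at your own train or test split.
    for problem in check_voc_split("path/to/data_vocnorm/train", "train_voc.txt"):
        print(problem)
```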

  • Step3. Synthesize the degraded datasets from VnA and VnB by running the following commands, then restructure them into VOC format.

```shell
# Modify the paths in the code to match your actual paths.

# all-in-one setting
python cpa/dataSyn/datamakefog.py         # VF/VF-T
python cpa/dataSyn/datamakelowlight.py    # VD/VD-T/VDB
python cpa/dataSyn/datamakesnow.py        # VS/VS-T
python cpa/dataSyn/datamakerain.py        # VR/VR-T

# one-by-one setting
python cpa/dataSyn/datamakefoghybrid.py         # VF-HT
python cpa/dataSyn/datamakelowlighthybrid.py    # VD-HT
```
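The synthesis scripts themselves are not reproduced here. As a rough illustration of what a fog-synthesis step can look like, the sketch below applies the standard atmospheric scattering model I = J * t + A * (1 - t) with transmission t = exp(-beta * d). The function name `add_fog`, the centre-based depth proxy, and the default `beta` and `airlight` values are assumptions made for illustration and are not taken from cpa/dataSyn.

```python
import numpy as np


def add_fog(image, beta=2.0, airlight=0.5):
    """Apply I = J * t + A * (1 - t) with t = exp(-beta * d).

    image: float array with values in [0, 1] and shape (H, W, 3).
    d is a hypothetical depth proxy that is 1 at the image centre and
    0 at the farthest corner, so the centre receives the densest fog.
    """
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.sqrt((ys - h / 2) ** 2 + (xs - w / 2) ** 2)
    depth = 1.0 - dist / dist.max()
    t = np.exp(-beta * depth)[..., None]  # transmission map, shape (H, W, 1)
    return image * t + airlight * (1.0 - t)


if __name__ == "__main__":
    clean = np.random.default_rng(0).random((64, 64, 3))
    foggy = add_fog(clean, beta=2.0, airlight=0.5)
    print(foggy.shape, float(foggy.min()), float(foggy.max()))
```

Because the result is a convex combination of the image and the airlight value, the output stays within [0, 1] whenever both inputs do. Setting `beta=0` returns the image unchanged.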

Real-world Datasets

  • Step1. Download the ExDark and RTTS datasets.
  • Step2. Restructure the RTTS dataset (4322 images) into VOC format, ensuring that the directory conforms to this basic structure.

```shell
RTTS    # path\to\RTTS
├── Annotations
|   └── xxx.xml
|   ...
├── ImageSets
|   └── Main
|       └── test_rtts.txt
└── JPEGImages
    └── xxx.jpg
    ...
```

  • Step3. Similarly, restructure the ExDarkA dataset (5 categories, 1283 images) and the ExDarkB dataset (10 categories, 2563 images) into VOC format.

```shell
exdark_5 (exdark_10)    # path\to\ExDarkA (ExDarkB)
├── Annotations
|   └── xxx.xml
|   ...
├── ImageSets
|   └── Main
|       └── test_exdark_5.txt (test_exdark_10.txt)    # you can find it in cpa\dataSyn\datalist
└── JPEGImages
    └── xxx.jpg
    ...
```
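When a list file such as `test_rtts.txt` has to be created by hand, it can be generated from the files already in place. This is a minimal sketch under stated assumptions: `write_image_set` is a hypothetical helper, the list holds one image id per line, and images use the `.jpg` extension shown in the trees above (pass a different `image_ext` if yours differ).

```python
from pathlib import Path


def write_image_set(dataset_root, list_name, image_ext=".jpg"):
    """Write ImageSets/Main/<list_name> with one image id per line.

    Only ids that have both an XML annotation and an image file are listed.
    Returns the sorted list of ids that were written.
    """
    root = Path(dataset_root)
    ids = sorted(
        xml_path.stem
        for xml_path in (root / "Annotations").glob("*.xml")
        if (root / "JPEGImages" / f"{xml_path.stem}{image_ext}").is_file()
    )
    out_dir = root / "ImageSets" / "Main"
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / list_name).write_text("\n".join(ids) + "\n")
    return ids


if __name__ == "__main__":
    # Placeholder path; point it at your restructured RTTS directory.
    rtts_root = Path("path/to/RTTS")
    if rtts_root.is_dir():
        print(len(write_image_set(rtts_root, "test_rtts.txt")), "ids written")
```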

🎯 Usage

📍 All-in-One Setting

  • Step 1. Modify the METAINFO in mmdet/datasets/voc.py

```python
METAINFO = {
    'classes': ('person', 'car', 'bus', 'bicycle', 'motorbike'),  # 5 classes
    'palette': [(106, 0, 228), (119, 11, 32), (165, 42, 42), (0, 0, 192), (197, 226, 255)]
}
```

  • Step 2. Modify the voc_classes in mmdet/evaluation/functional/class_names.py

```python
def voc_classes() -> list:
    return [
        'person', 'car', 'bus', 'bicycle', 'motorbike'  # 5 classes
    ]
```

  • Step 3. Modify the num_classes in configs/yolo/cpa_config.py

```python
bbox_head=dict(
    type='YOLOV3Head',
    num_classes=5,  # 5 classes
    ...
)
```

  • Step 4. Recompile the code.

```shell
cd CPA_Enhancer
pip install -v -e .
```

  • Step 5. Modify the data_root, ann_file, and data_prefix in configs/yolo/cpa_config.py to match the actual paths of the datasets you use.
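For orientation, the fragment below sketches how these three fields typically appear in an MMDetection 3.x VOC-style dataloader entry. It is an assumption about the general shape only: the actual structure and key names in `cpa_config.py` may differ, and every path here is a placeholder matching the directory tree described under Data Preparation.

```python
# Placeholder root; replace with the location of your own data_vocnorm.
data_root = 'path/to/data_vocnorm/'

train_dataloader = dict(
    dataset=dict(
        type='VOCDataset',
        data_root=data_root,
        # Image-id list, given relative to data_root.
        ann_file='train/ImageSets/Main/train_voc.txt',
        # Sub-directory that holds Annotations/ and JPEGImages/.
        data_prefix=dict(sub_data_root='train/'),
    ))
```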

The pretrained models and training/testing logs can be found in checkpoint.zip

🔹 Train

```shell
# Train our model from scratch.
python tools/train.py configs/yolo/cpa_config.py
```

🔹 Test

```shell
# You can download our pretrained model for testing.
python tools/test.py configs/yolo/cpa_config.py path/to/checkpoint/xx.pth
```

🔹 Demo

```shell
# You can download our pretrained model for inference.
# --inputs:  path to your input images or directory
# --out-dir: output directory
python demo/cpademo.py \
    --inputs ../cpa/testimage \
    --model ../configs/yolo/cpa_config.py \
    --weights path/to/checkpoint/xx.pth \
    --out-dir ../cpa/output
```

📍 One-by-One Setting

For foggy conditions (five categories), the overall process is the same as above (Steps 1-5).

For low-light conditions (ten categories), you only need to change the following places (Steps 1-3).

  • Step 1. Modify the METAINFO in mmdet/datasets/voc.py

```python
# 10 classes
METAINFO = {
    'classes': ('bicycle', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'dog', 'motorbike', 'person'),
    'palette': [(106, 0, 228), (119, 11, 32), (165, 42, 42), (0, 0, 192), (197, 226, 255),
                (0, 60, 100), (0, 0, 142), (255, 77, 255), (153, 69, 1), (120, 166, 157)]
}
```

  • Step 2. Modify the voc_classes in mmdet/evaluation/functional/class_names.py

```python
def voc_classes() -> list:
    return [
        'bicycle', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'dog', 'motorbike', 'person'  # 10 classes
    ]
```

  • Step 3. Modify the num_classes in configs/yolo/cpa_config.py

```python
bbox_head=dict(
    type='YOLOV3Head',
    num_classes=10,  # 10 classes
    ...
)
```
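Since the class list, the palette, and `num_classes` are edited in three different files, it is easy for them to drift apart when switching between the 5-class and 10-class settings. The following is a small consistency check, offered as a sketch; `check_class_config` is a hypothetical helper and not part of the repository.

```python
def check_class_config(classes, palette, num_classes):
    """Raise ValueError if the three edited settings disagree."""
    if len(set(classes)) != len(classes):
        raise ValueError("duplicate class names")
    if len(classes) != num_classes:
        raise ValueError(f"{len(classes)} classes but num_classes={num_classes}")
    if len(palette) != len(classes):
        raise ValueError(f"{len(palette)} palette colours for {len(classes)} classes")
    return True


# Values copied from the 10-class setting described above.
CLASSES_10 = ('bicycle', 'boat', 'bottle', 'bus', 'car',
              'cat', 'chair', 'dog', 'motorbike', 'person')
PALETTE_10 = [(106, 0, 228), (119, 11, 32), (165, 42, 42), (0, 0, 192), (197, 226, 255),
              (0, 60, 100), (0, 0, 142), (255, 77, 255), (153, 69, 1), (120, 166, 157)]

check_class_config(CLASSES_10, PALETTE_10, num_classes=10)
```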

📊 Results

Quantitative results

Figure: Quantitative comparisons under the all-in-one setting.

Figure: Comparisons in the one-by-one setting under the foggy degradation (left) and low-light degradation (right).

Visual Results

Figure: Visual comparisons of CPA-Enhancer under the all-in-one setting.

💐 Acknowledgments

Special thanks to the creators of MMDetection, upon which this code is built, for their valuable work in advancing object detection research.

🔗 Citation

If you use this codebase, or CPA-Enhancer inspires your work, we would greatly appreciate it if you could star the repository and cite it using the following BibTeX entry.

```
@misc{zhang2024cpaenhancer,
  title={CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations},
  author={Yuwei Zhang and Yan Wu and Yanming Liu and Xinyue Peng},
  year={2024},
  eprint={2403.11220},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```

Owner

  • Name: zyw-stu
  • Login: zyw-stu
  • Kind: user
  • Location: BeiJing

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMDetection Contributors"
title: "OpenMMLab Detection Toolbox and Benchmark"
date-released: 2018-08-22
url: "https://github.com/open-mmlab/mmdetection"
license: Apache-2.0

GitHub Events

Total
  • Issues event: 10
  • Watch event: 7
  • Issue comment event: 8
  • Fork event: 1
Last Year
  • Issues event: 10
  • Watch event: 7
  • Issue comment event: 8
  • Fork event: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 2
  • Total pull requests: 0
  • Average time to close issues: 5 months
  • Average time to close pull requests: N/A
  • Total issue authors: 2
  • Total pull request authors: 0
  • Average comments per issue: 0.5
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: 5 months
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 0.5
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • Faker-Lost (2)
  • wuyuyuyuaaa (2)
  • Lyzx123 (2)
  • roemin1999 (1)
  • 13185742215 (1)
  • lovemuyao (1)
  • sunday1112 (1)
  • mrwrui (1)
  • wangdalu4399 (1)
  • ducnt1210 (1)

Dependencies

docker/Dockerfile docker
  • pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build
docker/serve/Dockerfile docker
  • pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build
docker/serve_cn/Dockerfile docker
  • pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build
cpa/requirements.txt pypi
  • Pillow ==10.0.1
  • PyYAML ==6.0.1
  • Requests ==2.31.0
  • Shapely ==2.0.3
  • addict ==2.4.0
  • albumentations ==1.4.1
  • boto3 ==1.34.62
  • botocore ==1.34.62
  • cityscapesscripts ==2.2.2
  • clip ==0.2.0
  • einops ==0.7.0
  • fairscale ==0.4.13
  • ffmpegcv ==0.3.11
  • gradio ==4.21.0
  • imagecorruptions ==1.1.2
  • imageio ==2.31.4
  • instaboostfast ==0.1.2
  • label_studio_ml ==1.0.9
  • label_studio_tools ==0.0.3
  • lap ==0.4.0
  • matplotlib ==3.7.4
  • memory_profiler ==0.61.0
  • mmcv ==2.1.0
  • mmengine ==0.10.1
  • mmpretrain ==1.2.0
  • model_archiver ==1.0.3
  • model_index ==0.1.11
  • motmetrics ==1.4.0
  • nltk ==3.8.1
  • numpy ==1.24.3
  • opencv_python ==4.8.1.78
  • openpyxl ==3.1.2
  • pandas ==2.0.3
  • parameterized ==0.9.0
  • prettytable ==3.10.0
  • psutil ==5.9.0
  • pycocoevalcap ==1.2
  • pycocotools ==2.0.7
  • pytest ==8.1.1
  • pytorch_sphinx_theme ==0.0.19
  • rich ==13.7.1
  • roboflow ==1.1.23
  • sahi ==0.11.15
  • scikit_image ==0.19.3
  • scikit_learn ==1.3.2
  • scipy ==1.10.1
  • seaborn ==0.13.2
  • setuptools ==60.2.0
  • six ==1.16.0
  • terminaltables ==3.1.10
  • thop ==0.1.1.post2209072238
  • torch ==1.11.0
  • torchvision ==0.12.0
  • tqdm ==4.65.2
  • transformers ==4.38.2
  • ts ==0.5.1
  • wandb ==0.16.1
  • xlrd ==2.0.1
  • xlutils ==2.0.0
requirements/albu.txt pypi
  • albumentations >=0.3.2
requirements/build.txt pypi
  • cython *
  • numpy *
requirements/docs.txt pypi
  • docutils ==0.16.0
  • myst-parser *
  • sphinx ==4.0.2
  • sphinx-copybutton *
  • sphinx_markdown_tables *
  • sphinx_rtd_theme ==0.5.2
  • urllib3 <2.0.0
requirements/mminstall.txt pypi
  • mmcv >=2.0.0rc4,<2.2.0
  • mmengine >=0.7.1,<1.0.0
requirements/multimodal.txt pypi
  • fairscale *
  • nltk *
  • pycocoevalcap *
  • transformers *
requirements/optional.txt pypi
  • cityscapesscripts *
  • fairscale *
  • imagecorruptions *
  • scikit-learn *
requirements/readthedocs.txt pypi
  • mmcv >=2.0.0rc4,<2.2.0
  • mmengine >=0.7.1,<1.0.0
  • scipy *
  • torch *
  • torchvision *
  • urllib3 <2.0.0
requirements/runtime.txt pypi
  • matplotlib *
  • numpy *
  • pycocotools *
  • scipy *
  • shapely *
  • six *
  • terminaltables *
  • tqdm *
requirements/tests.txt pypi
  • asynctest * test
  • cityscapesscripts * test
  • codecov * test
  • flake8 * test
  • imagecorruptions * test
  • instaboostfast * test
  • interrogate * test
  • isort ==4.3.21 test
  • kwarray * test
  • memory_profiler * test
  • nltk * test
  • onnx ==1.7.0 test
  • onnxruntime >=1.8.0 test
  • parameterized * test
  • prettytable * test
  • protobuf <=3.20.1 test
  • psutil * test
  • pytest * test
  • transformers * test
  • ubelt * test
  • xdoctest >=0.10.0 test
  • yapf * test
requirements/tracking.txt pypi
  • mmpretrain *
  • motmetrics *
  • numpy <1.24.0
  • scikit-learn *
  • seaborn *
requirements.txt pypi
setup.py pypi