cpa-enhancer

This is the official repository of the paper: CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations

https://github.com/zyw-stu/cpa-enhancer

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓
CITATION.cff file
Found CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
○
DOI references
✓
Academic publication links
Links to: arxiv.org
○
Academic email domains
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (12.8%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: zyw-stu
License: apache-2.0
Language: Python
Default Branch: main
Homepage:
Size: 20.3 MB

Statistics

Stars: 47
Watchers: 2
Forks: 2
Open Issues: 11
Releases: 0

Created over 2 years ago · Last pushed about 2 years ago

Metadata Files

Readme License Citation

CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations

📰 ArXiv Preprint: Arxiv 2403.11220

✅ Updates

March 24, 2024: We have released CPA-Seg, which applies CPA-Enhancer to segmentation tasks.

🚀 Overview

Figure: Overview of the proposed CPA-Enhancer.

Figure: Our proposed content-driven prompt block (CPB).

Abstract: Object detection methods under known single degradations have been extensively investigated. However, existing approaches require prior knowledge of the degradation type and train a separate model for each, limiting their practical applications in unpredictable environments. To address this challenge, we propose a chain-of-thought (CoT) prompted adaptive enhancer, CPA-Enhancer, for object detection under unknown degradations. Specifically, CPA-Enhancer progressively adapts its enhancement strategy under the step-by-step guidance of CoT prompts that encode degradation-related information. To the best of our knowledge, this is the first work that exploits CoT prompting for object detection tasks. Overall, CPA-Enhancer is a plug-and-play enhancement model that can be integrated into any generic detector to achieve substantial gains on degraded images, without prior knowledge of the degradation type. Experimental results demonstrate that CPA-Enhancer not only sets the new state of the art for object detection but also boosts the performance of other downstream vision tasks under multiple unknown degradations.

🛠️ Installation

Step0. Download and install Miniconda from the official website.
Step1. Create a conda environment and activate it.

```shell
conda create --name openmmlab python=3.8 -y
conda activate openmmlab
```

Step2. Install PyTorch following official instructions, e.g.

```shell
conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
```

Step3. Install MMEngine and MMCV using MIM.

```shell
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
```

Step4. Install other related packages.

```shell
cd CPA_Enhancer
pip install -r ./cpa/requirements.txt
```

📁 Data Preparation

Synthetic Datasets

Step1. Download VOC PASCAL trainval and test data

```shell
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
```

Step2. Construct VnA (containing 5 categories, with a total of 8111 images) / VnB (containing 10 categories, with a total of 12334 images) from VOCtrainval_06-Nov-2007.tar and VOCtrainval_11-May-2012.tar; construct VnA-T (containing 5 categories, with a total of 2734 images) / VnB-T (containing 10 categories, with a total of 3760 images) from VOCtest_06-Nov-2007.tar.

We also provide a list of image names included in each dataset, which you can find in the cpa/dataSyn/datalist.

```python
# 5 classes
target_classes = ['person', 'car', 'bus', 'bicycle', 'motorbike']

# 10 classes
target_classes = ['bicycle', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'dog', 'motorbike', 'person']
```

Make sure the directory follows this basic VOC structure.
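Step2 does not spell out how the subsets are selected. One plausible approach, sketched below, is to keep every image whose VOC annotation contains at least one object from `target_classes`. The helper name `contains_target_class` and the sample XML are illustrative assumptions; the image lists in cpa/dataSyn/datalist remain the authoritative reference.

```python
import xml.etree.ElementTree as ET

# 5-class setting, as listed above.
target_classes = ['person', 'car', 'bus', 'bicycle', 'motorbike']


def contains_target_class(xml_text, classes):
    """Return True if a VOC annotation has at least one object in `classes`."""
    root = ET.fromstring(xml_text.strip())
    names = {obj.findtext('name') for obj in root.iter('object')}
    return bool(names & set(classes))


# Hypothetical, trimmed-down VOC annotation used only for illustration.
sample_xml = """
<annotation>
  <filename>000001.jpg</filename>
  <object><name>dog</name></object>
  <object><name>person</name></object>
</annotation>
"""

keep = contains_target_class(sample_xml, target_classes)
```

Here `keep` is true because the annotation contains a `person` object, even though `dog` is not among the five target classes.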

Step3. Synthesize degraded datasets from VnA and VnB by executing the following commands, and restructure them into VOC format.

```shell
# Modify the paths in the code to match your actual paths.

# all-in-one setting
python cpa/dataSyn/datamakefog.py       # VF/VF-T
python cpa/dataSyn/datamakelowlight.py  # VD/VD-T/VDB
python cpa/dataSyn/datamakesnow.py      # VS/VS-T
python cpa/dataSyn/datamakerain.py      # VR/VR-T

# one-by-one setting
python cpa/dataSyn/datamakefoghybrid.py       # VF-HT
python cpa/dataSyn/datamakelowlighthybrid.py  # VD-HT
```
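The synthesis scripts themselves are not reproduced here. For intuition only, a widely used way to simulate fog is the atmospheric scattering model, I = J·t + A·(1 − t) with transmission t = exp(−β·d). Whether the repository's fog script uses exactly this model is an assumption; the function name `add_fog` and its default parameters are hypothetical.

```python
import math


def add_fog(pixel, depth, beta=0.05, airlight=0.5):
    """Blend a clean intensity toward the airlight according to scene depth.

    Implements I = J * t + A * (1 - t) with t = exp(-beta * depth).
    `pixel` is a clean intensity in [0, 1]; `beta` and `airlight` are
    illustrative defaults, not values taken from the repository.
    """
    transmission = math.exp(-beta * depth)
    return pixel * transmission + airlight * (1.0 - transmission)


near = add_fog(0.2, depth=0.0)    # no fog at zero depth
far = add_fog(0.2, depth=100.0)   # distant pixels approach the airlight
```

At zero depth the pixel is unchanged, and as depth grows the result moves toward the airlight value.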

Real-world Datasets

Step1. Download Exdark and RTTS datasets.
Step2. Restructure the RTTS dataset (4322 images) into VOC format, ensuring that the directory conforms to this basic structure.

```shell
RTTS                      # path\to\RTTS
├── Annotations
|   └──xxx.xml
|      ...
└── ImageSets
|   └──Main
|      └──test_rtts.txt
└── JPEGImages
    └──xxx.jpg
       ...
```

Step3. Similarly, restructure the ExdarkA dataset (containing 5 categories, with a total of 1283 images) and the ExdarkB dataset (containing 10 categories, with a total of 2563 images) into VOC format.

```shell
exdark_5 (exdark_10)      # path\to\ExDarkA (ExDarkB)
├── Annotations
|   └──xxx.xml
|      ...
└── ImageSets
|   └──Main
|      └──test_exdark_5.txt (test_exdark_10.txt)  # you can find it in cpa\dataSyn\datalist
└── JPEGImages
    └──xxx.jpg
       ...
```
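In VOC format, a file under `ImageSets/Main` lists one image ID (the file name without its extension) per line. As a hedged sketch, the helper below builds such a list from the `Annotations` folder; `write_image_set` is a hypothetical name that does not come from the repository, and for ExDark the lists shipped in cpa\dataSyn\datalist should take precedence.

```python
from pathlib import Path


def write_image_set(dataset_root, list_name):
    """Write ImageSets/Main/<list_name> with one image ID per annotation file."""
    root = Path(dataset_root)
    # An image ID is the annotation file name without its .xml extension.
    image_ids = sorted(p.stem for p in (root / 'Annotations').glob('*.xml'))
    out_dir = root / 'ImageSets' / 'Main'
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / list_name
    out_file.write_text('\n'.join(image_ids) + '\n')
    return out_file
```

For example, `write_image_set('path/to/RTTS', 'test_rtts.txt')` would create the list file for the RTTS layout, assuming the annotations are already in place.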

🎯 Usage

📍 All-in-One Setting

Step 1. Modify the METAINFO in mmdet/datasets/voc.py

```python
METAINFO = {
    'classes': ('person', 'car', 'bus', 'bicycle', 'motorbike'),  # 5 classes
    'palette': [(106, 0, 228), (119, 11, 32), (165, 42, 42), (0, 0, 192), (197, 226, 255)]
}
```

Step 2. Modify the voc_classes in mmdet/evaluation/functional/class_names.py

```python
def voc_classes() -> list:
    return [
        'person', 'car', 'bus', 'bicycle', 'motorbike'  # 5 classes
    ]
```

Step 3. Modify the num_classes in configs/yolo/cpa_config.py

```python
bbox_head=dict(
    type='YOLOV3Head',
    num_classes=5,  # 5 classes
    ...
)
```

Step 4. Recompile the code.

```shell
cd CPA_Enhancer
pip install -v -e .
```

Step 5. Modify the data_root, ann_file, and data_prefix in configs/yolo/cpa_config.py to match the actual paths of the datasets you use.
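For orientation, the fragment below shows where these three fields typically sit in an MMDetection 3.x VOC-style dataloader config. The dataset root and file names are placeholders, and the actual layout of cpa_config.py may differ, so treat this as a sketch rather than the file's real contents.

```python
# Placeholder root; replace with the actual location of your dataset.
data_root = 'path/to/VOCdevkit/'

test_dataloader = dict(
    dataset=dict(
        type='VOCDataset',
        data_root=data_root,
        # Image list, relative to data_root (placeholder file name).
        ann_file='VOC2007/ImageSets/Main/test.txt',
        # Sub-directory that holds Annotations/ and JPEGImages/.
        data_prefix=dict(sub_data_root='VOC2007/'),
    )
)
```

The training and validation dataloaders usually carry the same three fields, so each of them needs the same adjustment.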

The pretrained models and training/testing logs can be found in checkpoint.zip
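Because Steps 1 to 3 edit three separate files, a mismatch between them is easy to introduce. The check below is a minimal sketch and not part of the repository: it copies the 5-class values from the steps above into local variables and verifies that they agree. `check_class_setup` is a hypothetical helper, and requiring one palette color per class is a sanity check assumed here rather than a documented requirement.

```python
def check_class_setup(metainfo, class_names, num_classes):
    """Raise ValueError if the class settings from Steps 1-3 disagree."""
    classes = list(metainfo['classes'])
    if classes != list(class_names):
        raise ValueError('METAINFO classes differ from voc_classes()')
    if len(metainfo['palette']) != len(classes):
        raise ValueError('palette should have one color per class')
    if num_classes != len(classes):
        raise ValueError('num_classes does not match the class list')
    return True


# Values copied from the 5-class setting in Steps 1-3.
METAINFO = {
    'classes': ('person', 'car', 'bus', 'bicycle', 'motorbike'),
    'palette': [(106, 0, 228), (119, 11, 32), (165, 42, 42),
                (0, 0, 192), (197, 226, 255)],
}
voc_class_names = ['person', 'car', 'bus', 'bicycle', 'motorbike']

ok = check_class_setup(METAINFO, voc_class_names, num_classes=5)
```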

🔹 Train

```shell
# Train our model from scratch.
python tools/train.py configs/yolo/cpa_config.py
```

🔹 Test

```shell
# You can download our pretrained model for testing.
python tools/test.py configs/yolo/cpa_config.py path/to/checkpoint/xx.pth
```

🔹 Demo

```shell
# You can download our pretrained model for inference.
# --inputs:  path to your input images or directory
# --out-dir: output directory
python demo/cpademo.py \
    --inputs ../cpa/testimage \
    --model ../configs/yolo/cpa_config.py \
    --weights path/to/checkpoint/xx.pth \
    --out-dir ../cpa/output
```

📍 One-by-One Setting

For the foggy conditions (containing five categories), the overall process is the same as above (Step1-5).

For the low-light conditions (containing ten categories), you only need to modify a few places, as follows (Step1-3).

Step 1. Modify the METAINFO in mmdet/datasets/voc.py

```python
# 10 classes
METAINFO = {
    'classes': ('bicycle', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'dog', 'motorbike', 'person'),
    'palette': [(106, 0, 228), (119, 11, 32), (165, 42, 42), (0, 0, 192), (197, 226, 255),
                (0, 60, 100), (0, 0, 142), (255, 77, 255), (153, 69, 1), (120, 166, 157)]
}
```

Step 2. Modify the voc_classes in mmdet/evaluation/functional/class_names.py

```python
def voc_classes() -> list:
    return [
        'bicycle', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'dog', 'motorbike', 'person'  # 10 classes
    ]
```

Step 3. Modify the num_classes in configs/yolo/cpa_config.py

```python
bbox_head=dict(
    type='YOLOV3Head',
    num_classes=10,  # 10 classes
    ...
)
```
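Note that the two settings order the classes differently, so the integer label assigned to a class changes between them. The sketch below builds the name-to-index mapping for each setting. It is illustrative only (`label_map` is a hypothetical helper) and relies on the usual MMDetection convention that label indices follow the order of the class tuple.

```python
CLASSES_5 = ('person', 'car', 'bus', 'bicycle', 'motorbike')
CLASSES_10 = ('bicycle', 'boat', 'bottle', 'bus', 'car',
              'cat', 'chair', 'dog', 'motorbike', 'person')


def label_map(classes):
    """Map each class name to its integer label (its position in the tuple)."""
    return {name: index for index, name in enumerate(classes)}


labels_5 = label_map(CLASSES_5)
labels_10 = label_map(CLASSES_10)

print(labels_5['person'], labels_10['person'])  # prints: 0 9
```

One practical consequence is that a checkpoint trained under one setting should be evaluated with the class list of that same setting.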

📊 Results

Quantitative results

Figure: Quantitative comparisons under the all-in-one setting.

Figure: Comparisons in the one-by-one setting under the foggy degradation (left) and low-light degradation (right).

Visual Results

Figure: Visual comparisons of CPA-Enhancer under the all-in-one setting.

💐 Acknowledgments

Special thanks to the creators of mmdetection upon which this code is built, for their valuable work in advancing object detection research.

🔗 Citation

If you use this codebase, or CPA-Enhancer inspires your work, we would greatly appreciate it if you could star the repository and cite it using the following BibTeX entry.

```bibtex
@misc{zhang2024cpaenhancer,
      title={CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations},
      author={Yuwei Zhang and Yan Wu and Yanming Liu and Xinyue Peng},
      year={2024},
      eprint={2403.11220},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```

Owner

Name: zyw-stu
Login: zyw-stu
Kind: user
Location: BeiJing

Repositories: 1
Profile: https://github.com/zyw-stu

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMDetection Contributors"
title: "OpenMMLab Detection Toolbox and Benchmark"
date-released: 2018-08-22
url: "https://github.com/open-mmlab/mmdetection"
license: Apache-2.0

GitHub Events

Total

Issues event: 10
Watch event: 7
Issue comment event: 8
Fork event: 1

Last Year

Issues event: 10
Watch event: 7
Issue comment event: 8
Fork event: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 2
Total pull requests: 0
Average time to close issues: 5 months
Average time to close pull requests: N/A
Total issue authors: 2
Total pull request authors: 0
Average comments per issue: 0.5
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 2
Pull requests: 0
Average time to close issues: 5 months
Average time to close pull requests: N/A
Issue authors: 2
Pull request authors: 0
Average comments per issue: 0.5
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

Faker-Lost (2)
wuyuyuyuaaa (2)
Lyzx123 (2)
roemin1999 (1)
13185742215 (1)
lovemuyao (1)
sunday1112 (1)
mrwrui (1)
wangdalu4399 (1)
ducnt1210 (1)

Pull Request Authors

Top Labels

Issue Labels

Pull Request Labels

Dependencies

docker/Dockerfile docker

pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build

docker/serve/Dockerfile docker

pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build

docker/serve_cn/Dockerfile docker

pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build

cpa/requirements.txt pypi

Pillow ==10.0.1
PyYAML ==6.0.1
Requests ==2.31.0
Shapely ==2.0.3
addict ==2.4.0
albumentations ==1.4.1
boto3 ==1.34.62
botocore ==1.34.62
cityscapesscripts ==2.2.2
clip ==0.2.0
einops ==0.7.0
fairscale ==0.4.13
ffmpegcv ==0.3.11
gradio ==4.21.0
imagecorruptions ==1.1.2
imageio ==2.31.4
instaboostfast ==0.1.2
label_studio_ml ==1.0.9
label_studio_tools ==0.0.3
lap ==0.4.0
matplotlib ==3.7.4
memory_profiler ==0.61.0
mmcv ==2.1.0
mmengine ==0.10.1
mmpretrain ==1.2.0
model_archiver ==1.0.3
model_index ==0.1.11
motmetrics ==1.4.0
nltk ==3.8.1
numpy ==1.24.3
opencv_python ==4.8.1.78
openpyxl ==3.1.2
pandas ==2.0.3
parameterized ==0.9.0
prettytable ==3.10.0
psutil ==5.9.0
pycocoevalcap ==1.2
pycocotools ==2.0.7
pytest ==8.1.1
pytorch_sphinx_theme ==0.0.19
rich ==13.7.1
roboflow ==1.1.23
sahi ==0.11.15
scikit_image ==0.19.3
scikit_learn ==1.3.2
scipy ==1.10.1
seaborn ==0.13.2
setuptools ==60.2.0
six ==1.16.0
terminaltables ==3.1.10
thop ==0.1.1.post2209072238
torch ==1.11.0
torchvision ==0.12.0
tqdm ==4.65.2
transformers ==4.38.2
ts ==0.5.1
wandb ==0.16.1
xlrd ==2.0.1
xlutils ==2.0.0

requirements/albu.txt pypi

albumentations >=0.3.2

requirements/build.txt pypi

cython *
numpy *

requirements/docs.txt pypi

docutils ==0.16.0
myst-parser *
sphinx ==4.0.2
sphinx-copybutton *
sphinx_markdown_tables *
sphinx_rtd_theme ==0.5.2
urllib3 <2.0.0

requirements/mminstall.txt pypi

mmcv >=2.0.0rc4,<2.2.0
mmengine >=0.7.1,<1.0.0

requirements/multimodal.txt pypi

fairscale *
nltk *
pycocoevalcap *
transformers *

requirements/optional.txt pypi

cityscapesscripts *
fairscale *
imagecorruptions *
scikit-learn *

requirements/readthedocs.txt pypi

mmcv >=2.0.0rc4,<2.2.0
mmengine >=0.7.1,<1.0.0
scipy *
torch *
torchvision *
urllib3 <2.0.0

requirements/runtime.txt pypi

matplotlib *
numpy *
pycocotools *
scipy *
shapely *
six *
terminaltables *
tqdm *

requirements/tests.txt pypi

asynctest * test
cityscapesscripts * test
codecov * test
flake8 * test
imagecorruptions * test
instaboostfast * test
interrogate * test
isort ==4.3.21 test
kwarray * test
memory_profiler * test
nltk * test
onnx ==1.7.0 test
onnxruntime >=1.8.0 test
parameterized * test
prettytable * test
protobuf <=3.20.1 test
psutil * test
pytest * test
transformers * test
ubelt * test
xdoctest >=0.10.0 test
yapf * test

requirements/tracking.txt pypi

mmpretrain *
motmetrics *
numpy <1.24.0
scikit-learn *
seaborn *

requirements.txt pypi

setup.py pypi
