carb

Official PyTorch implementation of "Weakly Supervised Semantic Segmentation for Driving Scenes" (AAAI 2024)

https://github.com/k0u-id/carb

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: found
✓ codemeta.json file: found
✓ .zenodo.json file: found
○ DOI references
✓ Academic publication links: arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (10.5%) to scientific vocabulary

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: k0u-id
License: apache-2.0
Language: Python
Default Branch: main
Homepage:
Size: 1.15 MB

Statistics

Stars: 18
Watchers: 4
Forks: 0
Open Issues: 0
Releases: 0

Created about 2 years ago · Last pushed 12 months ago

Metadata Files

Readme License Citation

Weakly Supervised Semantic Segmentation for Driving Scenes (AAAI 2024)

Official PyTorch implementation of "Weakly Supervised Semantic Segmentation for Driving Scenes"

Weakly Supervised Semantic Segmentation for Driving Scenes
Dongseob Kim*¹, Seungho Lee*¹, Junsuk Choe², Hyunjung Shim³
¹ Yonsei University, ² Sogang University, ³ Korea Advanced Institute of Science & Technology
* indicates equal contribution.

Abstract: State-of-the-art techniques in weakly-supervised semantic segmentation (WSSS) using image-level labels exhibit severe performance degradation on driving scene datasets such as Cityscapes. To address this challenge, we develop a new WSSS framework tailored to driving scene datasets. Based on extensive analysis of dataset characteristics, we employ Contrastive Language-Image Pre-training (CLIP) as our baseline to obtain pseudo-masks. However, CLIP introduces two key challenges: (1) pseudo-masks from CLIP lack in representing small object classes, and (2) these masks contain notable noise. We propose solutions for each issue as follows. (1) We devise Global-Local View Training that seamlessly incorporates small-scale patches during model training, thereby enhancing the model's capability to handle small-sized yet critical objects in driving scenes (e.g., traffic light). (2) We introduce Consistency-Aware Region Balancing (CARB), a novel technique that discerns reliable and noisy regions through evaluating the consistency between CLIP masks and segmentation predictions. It prioritizes reliable pixels over noisy pixels via adaptive loss weighting. Notably, the proposed method achieves 51.8% mIoU on the Cityscapes test dataset, showcasing its potential as a strong WSSS baseline on driving scene datasets. Experimental results on CamVid and WildDash2 demonstrate the effectiveness of our method across diverse datasets, even with small-scale datasets or visually challenging conditions.
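To give a feel for the consistency idea described in the abstract, here is a minimal plain-Python sketch. It assumes the simplest possible rule: a pixel counts as reliable when the CLIP pseudo-label and the segmentation prediction agree, and as noisy otherwise. The function names, the fixed weights `1.0` and `0.25`, and the `ignore_index` value are illustrative assumptions; the paper's weighting is adaptive and this is not the released implementation.

```python
def consistency_weights(clip_mask, pred_mask,
                        reliable_w=1.0, noisy_w=0.25, ignore_index=255):
    """Per-pixel loss weights from agreement between two 2-D label maps.

    Assumption: agreement => reliable pixel, disagreement => noisy pixel.
    """
    weights = []
    for clip_row, pred_row in zip(clip_mask, pred_mask):
        row = []
        for c, p in zip(clip_row, pred_row):
            if c == ignore_index:
                row.append(0.0)          # unlabeled pixels contribute nothing
            elif c == p:
                row.append(reliable_w)   # both sources agree
            else:
                row.append(noisy_w)      # sources disagree: down-weight
        weights.append(row)
    return weights


def weighted_mean_loss(pixel_losses, weights):
    """Average per-pixel losses, normalised by the total weight."""
    total = sum(l * w
                for loss_row, w_row in zip(pixel_losses, weights)
                for l, w in zip(loss_row, w_row))
    norm = sum(w for w_row in weights for w in w_row)
    return total / norm if norm > 0 else 0.0
```

With `clip_mask = [[0, 1], [1, 255]]` and `pred_mask = [[0, 1], [0, 1]]`, the two agreeing pixels keep full weight, the disagreeing pixel is down-weighted, and the ignored pixel drops out of the loss entirely.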

Updates

21 Mar, 2024: Initial upload

Installation

Step 0. Install PyTorch and Torchvision following the official instructions, e.g.,

```shell
pip install torch torchvision
# FYI, we're using torch==1.9.1 and torchvision==0.10.1
# We used the docker image pytorch:1.9.1-cuda11.1-cudnn8-devel
```

Step 1. Install MMCV.

```shell
pip install mmcv-full
# FYI, we're using mmcv-full==1.4.0
```

Step 2. Install CLIP.

```shell
pip install ftfy regex tqdm
pip install git+https://github.com/openai/CLIP.git
```

Step 3. Install CARB.

```shell
git clone https://github.com/k0u-id/CARB.git
cd CARB
pip install -v -e .
# "-v" means verbose, or more output
# "-e" means installing a project in editable mode,
# thus any local modifications made to the code will take effect without reinstallation.
```

Step 4. (Optional) If errors occur, you may also need the following.

```shell
sudo apt-get install -y libgl1-mesa-glx libglib2.0-0
sudo apt-get install libmagickwand-dev
pip install yapf==0.40.1
pip install git+https://github.com/lucasb-eyer/pydensecrf.git
```
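After installing, it can help to compare what is in your environment with the versions reported above. The following is a small sketch using only the standard library (`importlib.metadata`, Python 3.8+). The helper name `compare_versions` and the `REPORTED` table are illustrative and not part of this repository; the versions are a reference point taken from the notes above, not a hard requirement.

```python
from importlib import metadata

# Versions the installation notes report using.
REPORTED = {"torch": "1.9.1", "torchvision": "0.10.1", "mmcv-full": "1.4.0"}


def compare_versions(reported, lookup=metadata.version):
    """Return {package: (reported, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, want in reported.items():
        try:
            have = lookup(name)
        except metadata.PackageNotFoundError:
            have = None
        # Local build tags such as "1.9.1+cu111" still count as a match.
        if have is None or have.split("+")[0] != want:
            mismatches[name] = (want, have)
    return mismatches
```

Calling `compare_versions(REPORTED)` returns an empty dict when everything matches; otherwise each entry shows the reported version next to the installed one (or `None` if the package is missing).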

Dataset Preparation & Pretrained Checkpoint

In our paper, we experiment with Cityscapes, CamVid, and WildDash2.

Example directory hierarchy

    CARB
    |--- data
    |    |--- cityscapes
    |    |    |--- leftImg8bit
    |    |    |--- gtFine
    |    |--- camvid11
    |    |    |--- img
    |    |    |--- mask
    |    |--- wilddash2
    |    |    |--- img
    |    |    |--- mask
    |--- work_dirs
    |    |--- output_dirs (config_name)
    |    |    ...
    |    ...
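Before training, you may want to confirm that your `data` folder matches the example hierarchy. The sketch below checks only the sub-directories named in that example; the helper name `missing_dirs` and the `EXPECTED` table are illustrative and not part of this repository.

```python
from pathlib import Path

# Sub-directories copied from the example hierarchy.
EXPECTED = {
    "cityscapes": ("leftImg8bit", "gtFine"),
    "camvid11": ("img", "mask"),
    "wilddash2": ("img", "mask"),
}


def missing_dirs(data_root, datasets=EXPECTED):
    """List expected 'dataset/subdir' entries that are absent under data_root."""
    root = Path(data_root)
    return [
        f"{name}/{sub}"
        for name, subs in datasets.items()
        for sub in subs
        if not (root / name / sub).is_dir()
    ]
```

For example, `missing_dirs("data")` run from the `CARB` folder returns an empty list when all three datasets are in place. If you only use one dataset, pass a smaller table as `datasets`.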

Dataset
- Cityscapes
- CamVid
- WildDash2

Pretrained Checkpoint
- Cityscapes
- CamVid
- WildDash2

Training CARB

CARB trains a segmentation model with either a single path or a dual path. Single-path training requires fixed masks (pseudo-masks) prepared in advance.

Step 0. Download and convert the CLIP models, e.g.,

```shell
python tools/maskcliputils/convertclip_weights.py --model ViT16
# Other options for model: RN50, RN101, RN50x4, RN50x16, RN50x64, ViT32, ViT16, ViT14
```

Step 1. Prepare the text embeddings of the target dataset, e.g.,

```shell
python tools/maskcliputils/promptengineering.py --model ViT16 --class-set city_carb
# Other options for model: RN50, RN101, RN50x4, RN50x16, ViT32, ViT16
# Other options for class-set: camvid, wilddash2
# Default option is ViT16, city_carb
```
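Note that the two preparation steps list slightly different model options: RN50x64 and ViT14 appear only for weight conversion. A small sketch of a sanity check for a chosen pair is shown below. The option sets are copied from the comments in the two steps; the helper `check_choice` is hypothetical, and the actual scripts may accept other values.

```python
# Option names copied from the comments in the two preparation steps.
CONVERT_MODELS = {"RN50", "RN101", "RN50x4", "RN50x16", "RN50x64",
                  "ViT32", "ViT16", "ViT14"}
PROMPT_MODELS = {"RN50", "RN101", "RN50x4", "RN50x16", "ViT32", "ViT16"}
CLASS_SETS = {"city_carb", "camvid", "wilddash2"}


def check_choice(model="ViT16", class_set="city_carb"):
    """Raise ValueError unless the pair is listed for both preparation steps."""
    usable = CONVERT_MODELS & PROMPT_MODELS
    if model not in usable:
        raise ValueError(
            f"{model!r} is not listed for both steps; choose from {sorted(usable)}")
    if class_set not in CLASS_SETS:
        raise ValueError(
            f"{class_set!r} is not a listed class set; choose from {sorted(CLASS_SETS)}")
    return model, class_set
```

Running `check_choice("ViT16", "camvid")` passes, while `check_choice("RN50x64")` raises, because that model is listed for conversion but not for the text-embedding step.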

Train. Here, we give an example for multiple GPUs on a single machine.

```shell
# Please see this file for the details of execution.
# You can change the detailed configuration by editing the config files (e.g., CARB/configs/carb/cityscapescarbdual.py)
bash tools/train.sh
```

Inference CARB

```shell
# Please see this file for the details of execution.
bash tools/test.sh
```
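Results in the paper are reported as mIoU. As a reminder of how that metric is defined, here is a minimal plain-Python sketch for flat label sequences. It is not the evaluation code used by this repository. The function name, the `ignore_index` default, and the choice to skip classes that never occur are assumptions, and predictions are assumed to lie in `[0, num_classes)`.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean intersection-over-union across classes present in pred or target."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue                # unlabeled pixels are not scored
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[t] += 1           # missed pixel of the true class
            union[p] += 1           # false positive for the predicted class
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0
```

For a dataset-level score, the per-class intersections and unions would normally be accumulated over all images before dividing, rather than averaging per-image values.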

Acknowledgement

This code borrows heavily from MaskCLIP and mmsegmentation. Thanks to Chong Zhou.

Citation

If you use CARB or this code base in your work, please cite

    @inproceedings{kim2024weakly,
      title={Weakly Supervised Semantic Segmentation for Driving Scenes},
      author={Kim, Dongseob and Lee, Seungho and Choe, Junsuk and Shim, Hyunjung},
      booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
      volume={38},
      number={3},
      pages={2741--2749},
      year={2024}
    }

Owner

Name: Dongseob Kim
Login: k0u-id
Kind: user
Location: Seoul, Korea
Company: CVML

Repositories: 1
Profile: https://github.com/k0u-id

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMSegmentation Contributors"
title: "OpenMMLab Semantic Segmentation Toolbox and Benchmark"
date-released: 2020-07-10
url: "https://github.com/open-mmlab/mmsegmentation"
license: Apache-2.0

GitHub Events

Total

Issues event: 1
Watch event: 2
Push event: 1

Last Year

Issues event: 1
Watch event: 2
Push event: 1

Dependencies

requirements/docs.txt pypi

docutils ==0.16.0
myst-parser *
sphinx ==4.0.2
sphinx_copybutton *
sphinx_markdown_tables *

requirements/mminstall.txt pypi

mmcv-full >=1.3.1,<=1.4.0

requirements/optional.txt pypi

cityscapesscripts *

requirements/readthedocs.txt pypi

mmcv *
prettytable *
torch *
torchvision *

requirements/runtime.txt pypi

Wand *
matplotlib *
numpy *
packaging *
prettytable *
scikit-image *

requirements/tests.txt pypi

codecov * test
flake8 * test
interrogate * test
isort ==4.3.21 test
pytest * test
xdoctest >=0.10.0 test
yapf * test

requirements.txt pypi

setup.py pypi
