maskclip

Reproduction and experiments of MaskCLIP(+)

https://github.com/haeun1107/maskclip

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.5%) to scientific vocabulary
Last synced: 6 months ago

Repository

Reproduction and experiments of MaskCLIP(+)

Basic Info
  • Host: GitHub
  • Owner: haeun1107
  • License: apache-2.0
  • Language: Python
  • Default Branch: master
  • Size: 437 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 7 months ago · Last pushed 7 months ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

Extract Free Dense Labels from CLIP [Project Page]

This is the code for our paper: Extract Free Dense Labels from CLIP.

This repo is a fork of mmsegmentation, so installation and data preparation are largely the same.

Installation

Step 0. Install PyTorch and Torchvision following official instructions, e.g.,

```shell
pip install torch torchvision

# FYI, we're using torch==1.9.1 and torchvision==0.10.1
```

Step 1. Install MMCV using MIM.

```shell
pip install -U openmim
mim install mmcv-full
```

Step 2. Install CLIP.

```shell
pip install ftfy regex tqdm
pip install git+https://github.com/openai/CLIP.git
```

Step 3. Install MaskCLIP.

```shell
git clone https://github.com/chongzhou96/MaskCLIP.git
cd MaskCLIP
pip install -v -e .

# "-v" means verbose, i.e. more output.
# "-e" means installing the project in editable mode, so any local
# modifications to the code take effect without reinstallation.
```

Dataset Preparation

Please refer to dataset_prepare.md. In our paper, we experiment with Pascal VOC, Pascal Context, and COCO Stuff 164k.

[HAEUN] Dataset Preparation

🔗 The Pascal Context dataset was downloaded from the official source:
http://host.robots.ox.ac.uk/pascal/VOC/voc2010/index.html
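
For reference, mmsegmentation-style repos expect datasets under a `data/` directory at the repo root. A plausible layout for Pascal Context, sketched from mmsegmentation's dataset_prepare.md conventions (not verified against this fork; check dataset_prepare.md for the authoritative structure):

```
MaskCLIP
├── data
│   └── VOCdevkit
│       └── VOC2010
│           ├── JPEGImages
│           ├── SegmentationClassContext
│           ├── ImageSets
│           │   └── SegmentationContext
│           └── trainval_merged.json
```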

MaskCLIP

MaskCLIP doesn't require any training. We only need to (1) download and convert the CLIP model and (2) prepare the text embeddings of the objects of interest.
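
The two preparation steps below can also be scripted. As a sketch, here is a dry-run loop over a few backbones; the script path assumes the repo's `tools/maskclip_utils/` layout, and the leading `echo` is kept so the loop only prints the commands:

```shell
# Dry-run: print the conversion command for a few CLIP backbones.
# Drop the leading 'echo' to actually run them.
for model in RN50 RN101 ViT16; do
  echo python tools/maskclip_utils/convert_clip_weights.py --model "$model" --backbone
done
```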

Step 0. Download and convert the CLIP models, e.g.,

```shell
mkdir -p pretrain
python tools/maskclip_utils/convert_clip_weights.py --model ViT16 --backbone

# Other options for --model: RN50, RN101, RN50x4, RN50x16, RN50x64, ViT32, ViT16, ViT14
```

Step 1. Prepare the text embeddings of the objects of interest, e.g.,

```shell
python tools/maskclip_utils/prompt_engineering.py --model ViT16 --class-set context

# Other options for --model: RN50, RN101, RN50x4, RN50x16, ViT32, ViT16
# Other options for --class-set: voc, context, stuff
```

Actually, we've played around with many more interesting target classes (see prompt_engineering.py).

Step 2. Get quantitative results (mIoU):

```shell
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --eval mIoU

# e.g.,
python tools/test.py configs/maskclip/maskclip_vit16_520x520_pascal_context_59.py pretrain/ViT16_clip_backbone.pth --eval mIoU
```

Step 3. (optional) Get qualitative results:

```shell
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --show-dir ${OUTPUT_DIR}

# e.g.,
python tools/test.py configs/maskclip/maskclip_vit16_520x520_pascal_context_59.py pretrain/ViT16_clip_backbone.pth --show-dir output/
```
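
Putting the steps together, a minimal end-to-end sketch for MaskCLIP on Pascal Context might look like the following. It is a dry run (`echo` only prints the commands), and the config and checkpoint names simply restate the examples above:

```shell
# End-to-end dry run: weight conversion, text embeddings, evaluation, visualization.
MODEL=ViT16
CONFIG=configs/maskclip/maskclip_vit16_520x520_pascal_context_59.py
CKPT=pretrain/${MODEL}_clip_backbone.pth

echo python tools/maskclip_utils/convert_clip_weights.py --model "$MODEL" --backbone
echo python tools/maskclip_utils/prompt_engineering.py --model "$MODEL" --class-set context
echo python tools/test.py "$CONFIG" "$CKPT" --eval mIoU
echo python tools/test.py "$CONFIG" "$CKPT" --show-dir output/
```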

MaskCLIP+

MaskCLIP+ trains another segmentation model with pseudo labels extracted from MaskCLIP.

Step 0. Download and convert the CLIP models, e.g.,

```shell
mkdir -p pretrain
python tools/maskclip_utils/convert_clip_weights.py --model ViT16

# Other options for --model: RN50, RN101, RN50x4, RN50x16, RN50x64, ViT32, ViT16, ViT14
```

Step 1. Prepare the text embeddings of the target dataset, e.g.,

```shell
python tools/maskclip_utils/prompt_engineering.py --model ViT16 --class-set context

# Other options for --model: RN50, RN101, RN50x4, RN50x16, ViT32, ViT16
# Other options for --class-set: voc, context, stuff
```

Train. Depending on your setup (single/multiple GPU(s), multiple machines), the training script can differ. Here, we give an example of multiple GPUs on a single machine. For more information, please refer to train.md.

```shell
sh tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM}

# e.g.,
sh tools/dist_train.sh configs/maskclip_plus/zero_shot/maskclip_plus_r50_deeplabv3plus_r101-d8_480x480_40k_pascal_context.py 4
```
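
As a sketch, the launch command can be chosen from the GPU count. This is a dry run (`echo` only prints the commands); `tools/train.py` is mmsegmentation's standard single-GPU entry point and is assumed to exist in this fork:

```shell
# Pick the training launcher based on GPU count (dry run).
GPU_NUM=4
CONFIG_FILE=${CONFIG_FILE:-path/to/config.py}

if [ "$GPU_NUM" -gt 1 ]; then
  echo sh tools/dist_train.sh "$CONFIG_FILE" "$GPU_NUM"
else
  echo python tools/train.py "$CONFIG_FILE"
fi
```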

Inference. See step 2 and step 3 under the MaskCLIP section. (We will release the trained models soon.)

Citation

If you use MaskCLIP or this code base in your work, please cite:

```
@InProceedings{zhou2022maskclip,
  author    = {Zhou, Chong and Loy, Chen Change and Dai, Bo},
  title     = {Extract Free Dense Labels from CLIP},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year      = {2022}
}
```

Contact

For questions about our paper or code, please contact Chong Zhou.

Owner

  • Login: haeun1107
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - name: "MMSegmentation Contributors"
title: "OpenMMLab Semantic Segmentation Toolbox and Benchmark"
date-released: 2020-07-10
url: "https://github.com/open-mmlab/mmsegmentation"
license: Apache-2.0

GitHub Events

Total
  • Push event: 11
  • Create event: 1
Last Year
  • Push event: 11
  • Create event: 1

Dependencies

docker/Dockerfile docker
  • pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build
docker/serve/Dockerfile docker
  • pytorch/pytorch ${PYTORCH}-cuda${CUDA}-cudnn${CUDNN}-devel build
requirements/docs.txt pypi
  • docutils ==0.16.0
  • myst-parser *
  • sphinx ==4.0.2
  • sphinx_copybutton *
  • sphinx_markdown_tables *
requirements/mminstall.txt pypi
  • mmcv-full >=1.3.1,<=1.4.0
requirements/optional.txt pypi
  • cityscapesscripts *
requirements/readthedocs.txt pypi
  • mmcv *
  • prettytable *
  • torch *
  • torchvision *
requirements/runtime.txt pypi
  • Wand *
  • matplotlib *
  • numpy *
  • packaging *
  • prettytable *
  • scikit-image *
requirements/tests.txt pypi
  • codecov * test
  • flake8 * test
  • interrogate * test
  • isort ==4.3.21 test
  • pytest * test
  • xdoctest >=0.10.0 test
  • yapf * test
requirements.txt pypi
setup.py pypi