138-open-vocabulary-semantic-segmentation-with-mask-adapted-clip

https://github.com/szu-advtech-2023/138-open-vocabulary-semantic-segmentation-with-mask-adapted-clip


Repository

Basic Info
  • Host: GitHub
  • Owner: SZU-AdvTech-2023
  • Language: Python
  • Default Branch: main
  • Size: 35.9 MB

## Installation

### Requirements

- Linux with Python ≥ 3.6
- PyTorch ≥ 1.8 and a [torchvision](https://github.com/pytorch/vision/) build that matches the PyTorch installation. Install them together from [pytorch.org](https://pytorch.org/) to ensure this. Also check that the PyTorch version matches the one required by Detectron2.
- Detectron2: follow [Detectron2 installation instructions](https://detectron2.readthedocs.io/tutorials/install.html).
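
As a quick sanity check on the minimum versions listed above, the following is a minimal sketch of a version comparison. The helper names `parse_version` and `meets_minimum` are hypothetical and not part of this repository. The parser only handles plain dotted versions with an optional local suffix such as `+cu113`.

```python
import re


def parse_version(text):
    """Turn '1.10.1+cu113' into (1, 10, 1); the '+...' suffix is ignored."""
    numbers = []
    for part in text.split("+")[0].split("."):
        match = re.match(r"\d+", part)
        if match is None:
            break  # stop at the first component that has no leading digits
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_minimum(installed, minimum):
    """Return True if the installed version string is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)
```

After PyTorch is installed, `meets_minimum(torch.__version__, "1.8")` reports whether the PyTorch requirement is satisfied, and `meets_minimum(platform.python_version(), "3.6")` does the same for Python.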

### Usage

Install required packages.

```
conda create --name ovseg python=3.8
conda activate ovseg
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install -r requirements.txt
```



Install `detectron2==0.6` following the [instructions](https://detectron2.readthedocs.io/en/latest/tutorials/install.html):

```
python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
```



Then install the modified CLIP package:

```
cd third_party/CLIP
python -m pip install -Ue .
```



## Prepare Datasets

This document is a modification/extension of the [MaskFormer](https://github.com/facebookresearch/MaskFormer/blob/main/datasets/README.md) dataset guide and follows the [Detectron2 format](https://detectron2.readthedocs.io/en/latest/tutorials/datasets.html).

A dataset can be used by accessing [DatasetCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.DatasetCatalog) for its data, or [MetadataCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.MetadataCatalog) for its metadata (class names, etc.). This document explains how to set up the builtin datasets so they can be used by the above APIs. [Use Custom Datasets](https://detectron2.readthedocs.io/tutorials/datasets.html) gives a deeper dive into how to use `DatasetCatalog` and `MetadataCatalog`, and how to add new datasets to them.

OVSeg has builtin support for a few datasets. The datasets are assumed to exist in a directory specified by the environment variable `DETECTRON2_DATASETS`. Under this directory, detectron2 will look for datasets in the structure described below, if needed.

```
$DETECTRON2_DATASETS/
  coco/                 # COCOStuff-171
  ADEChallengeData2016/ # ADE20K-150
  ADE20K_2021_17_01/    # ADE20K-847
  VOCdevkit/
    VOC2012/            # PASCALVOC-20
    VOC2010/            # PASCALContext-59, PASCALContext-459
```



You can set the location for builtin datasets by `export DETECTRON2_DATASETS=/path/to/datasets`. If left unset, the default is `./datasets` relative to your current working directory.
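
The lookup rule described above can be sketched in a few lines of Python. This is an illustration of the documented behaviour (environment variable first, `./datasets` otherwise), not code taken from this repository. The function name `dataset_root` and its `environ` parameter are hypothetical.

```python
import os
from pathlib import Path


def dataset_root(environ=None):
    """Return $DETECTRON2_DATASETS if it is set, else the relative path 'datasets'."""
    # `environ` defaults to the real process environment; passing a dict
    # makes the function easy to try out without touching os.environ.
    environ = os.environ if environ is None else environ
    return Path(environ.get("DETECTRON2_DATASETS", "datasets"))
```

Because the fallback is a relative path, the result depends on the directory the training or evaluation script is launched from.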

Unless otherwise noted, our model is trained on COCOStuff-171 and evaluated on ADE20K-150, ADE20K-847, PASCALVOC-20, PASCALContext-59 and PASCALContext-459.

| dataset        | split     | # images | # categories |
| -------------- | --------- | -------- | ------------ |
| COCO Stuff     | train2017 | 118K     | 171          |
| ADE20K         | val       | 2K       | 150/847      |
| Pascal VOC     | val       | 1.5K     | 20           |
| Pascal Context | val       | 5K       | 59/459       |

### Expected dataset structure for [COCO Stuff](https://github.com/nightrome/cocostuff):

```
coco/
  train2017/ # http://images.cocodataset.org/zips/train2017.zip
  annotations/ # http://images.cocodataset.org/annotations/annotations_trainval2017.zip
  stuffthingmaps/
    stuffthingmaps_trainval2017.zip # http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/stuffthingmaps_trainval2017.zip
    train2017/
  # below are generated
  stuffthingmaps_detectron2/ 
    train2017/
```



The directory `stuffthingmaps_detectron2` is generated by running `python datasets/prepare_coco_stuff_sem_seg.py`.

### Expected dataset structure for [ADE20k Scene Parsing (ADE20K-150)](http://sceneparsing.csail.mit.edu/):

```
ADEChallengeData2016/
  annotations/
  images/
  objectInfo150.txt
  # below are generated
  annotations_detectron2/
```



The directory `annotations_detectron2` is generated by running `python datasets/prepare_ade20k_sem_seg.py`.

### Expected dataset structure for [ADE20k-Full (ADE20K-847)](https://github.com/CSAILVision/ADE20K#download):

```
ADE20K_2021_17_01/
  images/
  index_ade20k.pkl
  objects.txt
  # below are generated
  images_detectron2/
  annotations_detectron2/
```



The directories `images_detectron2` and `annotations_detectron2` are generated by running `python datasets/prepare_ade20k_full_sem_seg.py`.

### Expected dataset structure for [Pascal VOC 2012 (PASCALVOC-20)](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#devkit):

```
VOCdevkit/VOC2012/
  Annotations/
  ImageSets/
  JPEGImages/
  SegmentationClass/
  SegmentationObject/
  SegmentationClassAug/ # https://github.com/kazuto1011/deeplab-pytorch/blob/master/data/datasets/voc12/README.md
  # below are generated
  images_detectron2/
  annotations_detectron2/
```



Start from the tar file `VOCtrainval_11-May-2012.tar`.

We use the SBD-augmented training data as `SegmentationClassAug`, following [Deeplab](https://github.com/kazuto1011/deeplab-pytorch/blob/master/data/datasets/voc12/README.md).

The directories `images_detectron2` and `annotations_detectron2` are generated by running `python datasets/prepare_voc_sem_seg.py`.

### Expected dataset structure for [Pascal Context](https://www.cs.stanford.edu/~roozbeh/pascal-context/):

```
VOCdevkit/VOC2010/
  Annotations/
  ImageSets/
  JPEGImages/
  SegmentationClass/
  SegmentationObject/
  # below are from https://www.cs.stanford.edu/~roozbeh/pascal-context/trainval.tar.gz
  trainval/
  labels.txt
  59_labels.txt # https://www.cs.stanford.edu/~roozbeh/pascal-context/59_labels.txt
  pascalcontext_val.txt # https://drive.google.com/file/d/1BCbiOKtLvozjVnlTJX51koIveUZHCcUh/view?usp=sharing
  # below are generated
  annotations_detectron2/
    pc459_val
    pc59_val
```



Start from the tar file `VOCtrainval_03-May-2010.tar`. You may want to download the 5K validation set [here](https://drive.google.com/file/d/1BCbiOKtLvozjVnlTJX51koIveUZHCcUh/view?usp=sharing).

The directory `annotations_detectron2` is generated by running `python datasets/prepare_pascal_context.py`.
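
Once the preparation scripts have been run, it can help to confirm that every generated path from the layouts above exists. The sketch below is one way to do that. The mapping `GENERATED_PATHS` and the function `missing_generated_paths` are hypothetical names, and the paths are copied from the directory trees in this document rather than from the scripts themselves.

```python
from pathlib import Path

# Generated paths from the layouts above, relative to the dataset root.
GENERATED_PATHS = {
    "COCOStuff-171": ["coco/stuffthingmaps_detectron2/train2017"],
    "ADE20K-150": ["ADEChallengeData2016/annotations_detectron2"],
    "ADE20K-847": [
        "ADE20K_2021_17_01/images_detectron2",
        "ADE20K_2021_17_01/annotations_detectron2",
    ],
    "PASCALVOC-20": [
        "VOCdevkit/VOC2012/images_detectron2",
        "VOCdevkit/VOC2012/annotations_detectron2",
    ],
    "PASCALContext-59": ["VOCdevkit/VOC2010/annotations_detectron2/pc59_val"],
    "PASCALContext-459": ["VOCdevkit/VOC2010/annotations_detectron2/pc459_val"],
}


def missing_generated_paths(root):
    """Map each dataset name to the generated paths not found under `root`."""
    root = Path(root)
    missing = {}
    for name, rel_paths in GENERATED_PATHS.items():
        absent = [p for p in rel_paths if not (root / p).exists()]
        if absent:
            missing[name] = absent
    return missing
```

An empty result means all generated paths are in place. Otherwise the keys show which dataset's preparation script still needs to be run.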



## Pretrained Weights

- CLIP: https://huggingface.co/spaces/facebook/ov-seg/resolve/main/ovseg_clip_l_9a1909.pth?download=true
- Swin-Base: https://huggingface.co/spaces/facebook/ov-seg/resolve/main/ovseg_swinbase_vitL14_ft_mpt.pth?download=true
- SAM: https://huggingface.co/spaces/facebook/ov-seg/resolve/main/sam_vit_l_0b3195.pth?download=true
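
The three checkpoints listed above could also be fetched with a short script instead of downloading them by hand. The following is a minimal sketch using only the standard library. The names `CHECKPOINT_URLS`, `checkpoint_filename` and `download_checkpoints` are hypothetical, and the `target_dir` default is an assumption, since this document does not state where `app.py` expects the files.

```python
import urllib.request
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse

# URLs copied from the list above.
CHECKPOINT_URLS = {
    "CLIP": "https://huggingface.co/spaces/facebook/ov-seg/resolve/main/ovseg_clip_l_9a1909.pth?download=true",
    "Swin-Base": "https://huggingface.co/spaces/facebook/ov-seg/resolve/main/ovseg_swinbase_vitL14_ft_mpt.pth?download=true",
    "SAM": "https://huggingface.co/spaces/facebook/ov-seg/resolve/main/sam_vit_l_0b3195.pth?download=true",
}


def checkpoint_filename(url):
    """Extract the file name from a URL, ignoring the query string."""
    return PurePosixPath(urlparse(url).path).name


def download_checkpoints(target_dir="."):
    """Download each checkpoint into target_dir (assumed location), skipping existing files."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for name, url in CHECKPOINT_URLS.items():
        destination = target / checkpoint_filename(url)
        if destination.exists():
            continue
        print(f"Downloading {name} checkpoint to {destination}")
        urllib.request.urlretrieve(url, str(destination))
```

Calling `download_checkpoints()` saves the files under their original names, for example `ovseg_clip_l_9a1909.pth`, so any path settings in the demo code may need to be adjusted to match.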

ov-seg-sam



## Demo

```powershell
python app.py
```


Citation (citation.txt)

@inproceedings{REPO138,
    author = "Liang, Feng and Wu, Bichen and Dai, Xiaoliang and Li, Kunpeng and Zhao, Yinan and Zhang, Hang and Zhang, Peizhao and Vajda, Peter and Marculescu, Diana",
    booktitle = "Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition",
    pages = "7061--7070",
    title = "{Open-Vocabulary Semantic Segmentation with Mask-Adapted CLIP}",
    year = "2023"
}
