
[ACM MM 2023] QA-CLIMS: Question-Answer Cross Language Image Matching for Weakly Supervised Semantic Segmentation

https://github.com/cvi-szu/qa-clims

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, scholar.google, acm.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.6%) to scientific vocabulary

Keywords

semantic-segmentation weakly-supervised-learning weakly-supervised-segmentation
Last synced: 5 months ago

Repository

[ACM MM 2023] QA-CLIMS: Question-Answer Cross Language Image Matching for Weakly Supervised Semantic Segmentation

Basic Info
  • Host: GitHub
  • Owner: CVI-SZU
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 9.21 MB
Statistics
  • Stars: 12
  • Watchers: 3
  • Forks: 0
  • Open Issues: 1
  • Releases: 0
Topics
semantic-segmentation weakly-supervised-learning weakly-supervised-segmentation
Created about 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme License

README.md

[MM'23] QA-CLIMS

This is the official PyTorch implementation of our paper:

QA-CLIMS: Question-Answer Cross Language Image Matching for Weakly Supervised Semantic Segmentation
Songhe Deng, Wei Zhuo, Jinheng Xie, Linlin Shen
Computer Vision Institute, Shenzhen University
ACM International Conference on Multimedia, 2023
[Paper] [arXiv]

Environment

  • Python 3.7
  • PyTorch 1.7.1
  • torchvision 0.8.2

```shell
pip install -r requirements.txt
```
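
If anything misbehaves later, it can help to confirm the installed packages match the versions listed above. A minimal, optional check (not part of the original instructions):

```python
import torch
import torchvision

# Optional sanity check that the environment matches the versions above.
print(torch.__version__)          # expected: 1.7.1
print(torchvision.__version__)    # expected: 0.8.2
print(torch.cuda.is_available())  # True if a CUDA build and a GPU are available
```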

PASCAL VOC2012

You can find the following files here.

| File | Filename |
|:---------------------------|:---------------------------------------------------------------------|
| FG & BG VQA results | voc_vqa_fg_blip.npy<br>voc_vqa_bg_blip.npy |
| FG & BG VQA text features | voc_vqa_fg_blip_ViT-L-14_cache.npy<br>voc_vqa_bg_blip_ViT-L-14_cache.npy |
| pre-trained baseline model | res50_cam.pth |
| QA-CLIMS model | res50_qa_clims.pth |

1. Prepare VQA result features

You can download the VQA text features voc_vqa_fg_blip_ViT-L-14_cache.npy and voc_vqa_bg_blip_ViT-L-14_cache.npy above and put them in vqa/.

Alternatively, you can generate them yourself. To generate the VQA results, please follow [third_party/README](third_party/README.md#BLIP). After that, run the following command to generate the VQA text features:

```shell
python gen_text_feats_cache.py voc \
    --vqa_fg_file vqa/voc_vqa_fg_blip.npy \
    --vqa_fg_cache_file vqa/voc_vqa_fg_blip_ViT-L-14_cache.npy \
    --vqa_bg_file vqa/voc_vqa_bg_blip.npy \
    --vqa_bg_cache_file vqa/voc_vqa_bg_blip_ViT-L-14_cache.npy \
    --clip ViT-L/14
```
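
Once the cache files are downloaded or generated, you can optionally inspect them with NumPy. This is only a sketch: it assumes the caches are ordinary .npy files, and allow_pickle=True is an assumption in case they store Python objects rather than plain arrays.

```python
import numpy as np

# Optional inspection of the VQA text feature caches in vqa/.
fg_cache = np.load("vqa/voc_vqa_fg_blip_ViT-L-14_cache.npy", allow_pickle=True)
bg_cache = np.load("vqa/voc_vqa_bg_blip_ViT-L-14_cache.npy", allow_pickle=True)
print(type(fg_cache), getattr(fg_cache, "shape", None))
print(type(bg_cache), getattr(bg_cache, "shape", None))
```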

2. Train QA-CLIMS and generate initial CAMs

Please download the pre-trained baseline model res50_cam.pth above and put it at cam-baseline-voc12/res50_cam.pth.

```shell
bash run_voc12_qa_clims.sh
```
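
If training fails to start, it is worth checking that the baseline checkpoint is in the expected location and loads cleanly. A minimal sketch, assuming the file holds a standard PyTorch state dict:

```python
import os
import torch

# Optional pre-flight check before running run_voc12_qa_clims.sh.
ckpt_path = "cam-baseline-voc12/res50_cam.pth"
assert os.path.isfile(ckpt_path), f"missing checkpoint: {ckpt_path}"

# Assumption: the checkpoint is a state dict (parameter name -> tensor).
state = torch.load(ckpt_path, map_location="cpu")
print(type(state).__name__, len(state) if hasattr(state, "__len__") else "")
```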

3. Train IRNet and generate pseudo semantic masks

```shell
bash run_voc12_sem_seg.sh
```
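
Once pseudo masks are generated, their quality can be spot-checked against the VOC ground truth, for example with chainercv (already listed in requirements.txt). The output directory and image IDs below are placeholders, not paths defined by this repository; point them at wherever run_voc12_sem_seg.sh writes its results.

```python
import numpy as np
from PIL import Image
from chainercv.evaluations import eval_semantic_segmentation

pred_dir = "result/sem_seg"                        # placeholder output directory
gt_dir = "VOCdevkit/VOC2012/SegmentationClass"     # VOC ground-truth masks
names = ["2007_000032"]                            # placeholder image IDs

preds, gts = [], []
for name in names:
    pred = np.array(Image.open(f"{pred_dir}/{name}.png"), dtype=np.int32)
    gt = np.array(Image.open(f"{gt_dir}/{name}.png"), dtype=np.int32)
    gt[gt == 255] = -1    # chainercv treats -1 as the ignore label
    preds.append(pred)
    gts.append(gt)

print(eval_semantic_segmentation(preds, gts)["miou"])
```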

4. Train DeepLab using pseudo semantic masks

Please follow deeplab-pytorch or CLIMS.

MS COCO2014

You can find the following files here.

| File | Filename |
|:---------------------------|:-----------------------------------------------------------------------|
| FG & BG VQA results | coco_vqa_fg_blip.npy<br>coco_vqa_bg_blip.npy |
| FG & BG VQA text features | coco_vqa_fg_blip_ViT-L-14_cache.npy<br>coco_vqa_bg_blip_ViT-L-14_cache.npy |
| pre-trained baseline model | res50_cam.pth |
| QA-CLIMS model | res50_qa_clims.pth |

Please place the downloaded coco_vqa_fg_blip_ViT-L-14_cache.npy and coco_vqa_bg_blip_ViT-L-14_cache.npy in vqa/, and res50_cam.pth in cam-baseline-coco14/.
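Before launching the scripts, a quick optional check that the files are where the commands expect them (paths taken from the instructions above):

```python
import os

# Optional check of the expected COCO file layout described above.
expected = [
    "vqa/coco_vqa_fg_blip_ViT-L-14_cache.npy",
    "vqa/coco_vqa_bg_blip_ViT-L-14_cache.npy",
    "cam-baseline-coco14/res50_cam.pth",
]
for path in expected:
    print(path, "OK" if os.path.isfile(path) else "MISSING")
```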

Then, run the following commands:

```shell
bash run_coco14_qa_clims.sh
bash run_coco14_sem_seg.sh
```

Citation

If you find this code useful for your research, please consider citing our paper:

@inproceedings{deng2023qa-clims,
  title={QA-CLIMS: Question-Answer Cross Language Image Matching for Weakly Supervised Semantic Segmentation},
  author={Deng, Songhe and Zhuo, Wei and Xie, Jinheng and Shen, Linlin},
  booktitle={Proceedings of the 31st ACM International Conference on Multimedia},
  pages={5572--5583},
  year={2023}
}


This repository is heavily based on CLIMS and IRNet; thanks for their great work!

Owner

  • Name: Computer Vision Institute, SZU
  • Login: CVI-SZU
  • Kind: organization
  • Location: Shenzhen University, Shenzhen, China

Computer Vision Institute, Shenzhen University

GitHub Events

Total
  • Issues event: 7
  • Watch event: 1
  • Issue comment event: 7
Last Year
  • Issues event: 7
  • Watch event: 1
  • Issue comment event: 7

Committers

Last synced: almost 2 years ago

All Time
  • Total Commits: 2
  • Total Committers: 1
  • Avg Commits per committer: 2.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 2
  • Committers: 1
  • Avg Commits per committer: 2.0
  • Development Distribution Score (DDS): 0.0
Top Committers
  • SongHe (d****8@o****m): 2 commits

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 5
  • Total pull requests: 0
  • Average time to close issues: 19 days
  • Average time to close pull requests: N/A
  • Total issue authors: 2
  • Total pull request authors: 0
  • Average comments per issue: 5.2
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 5
  • Pull requests: 0
  • Average time to close issues: 19 days
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 5.2
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • HYTHYThythyt (4)
  • ineedugirl (3)
  • xixiaos (1)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Dependencies

requirements.txt pypi
  • chainercv *
  • cmapy *
  • cython *
  • imageio *
  • matplotlib *
  • nltk *
  • numpy *
  • opencv-python *
  • pydensecrf *
  • timm *
  • torch *
  • torchvision *
  • transformers *
third_party/BLIP/requirements.txt pypi
  • fairscale ==0.4.4
  • pycocoevalcap *
  • timm ==0.4.12
  • transformers ==4.15.0
third_party/CLIP/requirements.txt pypi
  • ftfy *
  • regex *
  • torch *
  • torchvision *
  • tqdm *