grad-cam
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
Science Score: 46.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org, ieee.org)
- ✓ Committers with academic emails (2 of 42 committers, 4.8%, from academic institutions)
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity, 13.1%, to scientific vocabulary)
Repository
Basic Info
- Host: GitHub
- Owner: jacobgil
- License: mit
- Language: Python
- Default Branch: master
- Homepage: https://jacobgil.github.io/pytorch-gradcam-book
- Size: 134 MB
Statistics
- Stars: 11,641
- Watchers: 44
- Forks: 1,636
- Open Issues: 159
- Releases: 0
Metadata Files
README.md
Advanced AI explainability for PyTorch
```
pip install grad-cam
```
Documentation with advanced tutorials: https://jacobgil.github.io/pytorch-gradcam-book
This is a package with state of the art methods for Explainable AI for computer vision. This can be used for diagnosing model predictions, either in production or while developing models. The aim is also to serve as a benchmark of algorithms and metrics for research of new explainability methods.
⭐ Comprehensive collection of Pixel Attribution methods for Computer Vision.
⭐ Tested on many Common CNN Networks and Vision Transformers.
⭐ Advanced use cases: Works with Classification, Object Detection, Semantic Segmentation, Embedding-similarity and more.
⭐ Includes smoothing methods to make the CAMs look nice.
⭐ High performance: full support for batches of images in all methods.
⭐ Includes metrics for checking if you can trust the explanations, and tuning them for best performance.

| Method | What it does |
|---------------------|-----------------------------------------------------------------------------------------------------------------------------|
| GradCAM | Weight the 2D activations by the average gradient |
| HiResCAM | Like GradCAM but element-wise multiply the activations with the gradients; provably guaranteed faithfulness for certain models |
| GradCAMElementWise | Like GradCAM but element-wise multiply the activations with the gradients then apply a ReLU operation before summing |
| GradCAM++ | Like GradCAM but uses second order gradients |
| XGradCAM | Like GradCAM but scale the gradients by the normalized activations |
| AblationCAM | Zero out activations and measure how the output drops (this repository includes a fast batched implementation) |
| ScoreCAM | Perturb the image by the scaled activations and measure how the output drops |
| EigenCAM | Takes the first principal component of the 2D activations (no class discrimination, but seems to give great results) |
| EigenGradCAM | Like EigenCAM but with class discrimination: first principal component of Activations*Grad. Looks like GradCAM, but cleaner |
| LayerCAM | Spatially weight the activations by positive gradients. Works better especially in lower layers |
| FullGrad | Computes the gradients of the biases from all over the network, and then sums them |
| Deep Feature Factorizations | Non Negative Matrix Factorization on the 2D activations |
| KPCA-CAM | Like EigenCAM but with Kernel PCA instead of PCA |
| FEM | A gradient free method that binarizes activations by an activation > mean + k * std rule. |
| ShapleyCAM | Weight the activations using the gradient and Hessian-vector product.|
| FinerCAM | Improves fine-grained classification by comparing similar classes, suppressing shared features and highlighting discriminative details. |
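To make the first row of the table concrete, here is the core GradCAM computation sketched in NumPy on synthetic activations and gradients (shapes and data are illustrative only, not taken from the library):

```python
import numpy as np

# Illustrative GradCAM core: weight each 2D activation channel by its
# spatially averaged gradient, sum over channels, then ReLU.
rng = np.random.default_rng(0)
activations = rng.standard_normal((512, 7, 7))  # channels x rows x cols
gradients = rng.standard_normal((512, 7, 7))    # d(score)/d(activations)

weights = gradients.mean(axis=(1, 2))                     # one weight per channel
cam = (weights[:, None, None] * activations).sum(axis=0)  # weighted sum -> 2D map
cam = np.maximum(cam, 0)                                  # ReLU keeps positive evidence
print(cam.shape)  # (7, 7)
```

The other gradient-based methods in the table differ mainly in how `weights` is computed or how the activations and gradients are combined.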
Visual Examples
[Images] What makes the network think the image label is 'pug, pug-dog'; what makes it think 'tabby, tabby cat'; and Grad-CAM combined with Guided Backpropagation for the 'pug, pug-dog' class.
Object Detection and Semantic Segmentation
[Images] Object detection and semantic segmentation CAM examples.
[Image] 3D medical semantic segmentation example.
Explaining similarity to other images / embeddings

Deep Feature Factorization

CLIP
[Images] Explaining the text prompt "a dog" and the text prompt "a cat".
Classification
Resnet50:
[Images] Dog and Cat inputs with GradCAM, AblationCAM and ScoreCAM visualizations.
Vision Transformer (DeiT Tiny):
[Images] Dog and Cat inputs with GradCAM, AblationCAM and ScoreCAM visualizations.
Swin Transformer (Tiny, window: 7, patch: 4, input size: 224):
[Images] Dog and Cat inputs with GradCAM, AblationCAM and ScoreCAM visualizations.
Metrics and Evaluation for XAI

Usage examples
```python
from pytorch_grad_cam import GradCAM, HiResCAM, ScoreCAM, GradCAMPlusPlus, AblationCAM, XGradCAM, EigenCAM, FullGrad
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget
from pytorch_grad_cam.utils.image import show_cam_on_image
from torchvision.models import resnet50

model = resnet50(pretrained=True)
target_layers = [model.layer4[-1]]
input_tensor = # Create an input tensor image for your model..
# Note: input_tensor can be a batch tensor with several images!

# We have to specify the target we want to generate the CAM for.
targets = [ClassifierOutputTarget(281)]

# Construct the CAM object once, and then re-use it on many images.
with GradCAM(model=model, target_layers=target_layers) as cam:
    # You can also pass aug_smooth=True and eigen_smooth=True, to apply smoothing.
    grayscale_cam = cam(input_tensor=input_tensor, targets=targets)
    # In this example grayscale_cam has only one image in the batch:
    grayscale_cam = grayscale_cam[0, :]
    visualization = show_cam_on_image(rgb_img, grayscale_cam, use_rgb=True)
    # You can also get the model outputs without having to redo inference
    model_outputs = cam.outputs
```
cam.py has a more detailed usage example.
Choosing the layer(s) to extract activations from
You need to choose the target layer to compute the CAM for. Some common choices are:
- FasterRCNN: model.backbone
- Resnet18 and 50: model.layer4[-1]
- VGG, densenet161 and mobilenet: model.features[-1]
- mnasnet1_0: model.layers[-1]
- ViT: model.blocks[-1].norm1
- SwinT: model.layers[-1].blocks[-1].norm1
If you pass a list with several layers, the CAM will be averaged across them. This can be useful if you're not sure which layer will perform best.
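As a rough sketch of that averaging (illustrative only; the library's exact per-image scaling details may differ, and the function name here is made up): each per-layer CAM is rescaled to [0, 1] before taking the mean, so no single layer dominates.

```python
import numpy as np

def aggregate_layers(per_layer_cams, eps=1e-7):
    """Combine CAMs computed from several target layers into one map:
    rescale each layer's CAM to [0, 1], then average across layers."""
    scaled = []
    for cam in per_layer_cams:
        cam = np.maximum(cam, 0)               # drop negative evidence
        scaled.append(cam / (cam.max() + eps)) # normalize this layer's map
    return np.mean(scaled, axis=0)

cam_a = np.array([[0.0, 2.0], [4.0, 0.0]])
cam_b = np.array([[1.0, 0.0], [0.0, 1.0]])
print(aggregate_layers([cam_a, cam_b]))
```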
Adapting for new architectures and tasks
Methods like GradCAM were designed for, and originally mostly applied to, CNN classification models. However, you can also use this package with new architectures like Vision Transformers, and with non-classification tasks like object detection or semantic segmentation.
To adapt to non-standard cases, there are two concepts:
- The reshape transform: how do we convert the activations to represent spatial images?
- The model targets: what exactly should the explainability method try to explain?
The reshape_transform argument
In a CNN, the intermediate activations are a multi-channel image with dimensions channels x rows x cols, and the various explainability methods work with these to produce a new image.
In the case of another architecture, like the Vision Transformer, the shape might be different, e.g. (rows x cols + 1) x channels, or something else. The reshape transform converts the activations back into a multi-channel image, for example by removing the class token in a Vision Transformer. For examples, check here
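As an illustration of the idea (sketched here with NumPy arrays for a self-contained example; in real usage the transform receives torch tensors, and the patch grid size depends on your model): a reshape transform for a ViT with a 14x14 patch grid drops the class token and moves channels first.

```python
import numpy as np

def reshape_transform(tensor, height=14, width=14):
    """Turn ViT token activations of shape (batch, 1 + H*W, channels)
    into a CNN-style map of shape (batch, channels, H, W)."""
    spatial = tensor[:, 1:, :]  # drop the class token at position 0
    result = spatial.reshape(tensor.shape[0], height, width, tensor.shape[2])
    return result.transpose(0, 3, 1, 2)  # bring channels to the front

tokens = np.zeros((1, 1 + 14 * 14, 192))  # e.g. a DeiT-Tiny block output
print(reshape_transform(tokens).shape)  # (1, 192, 14, 14)
```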
The model_target argument
The model target is just a callable that takes the model output and filters it down to the specific scalar output we want to explain.
For classification tasks, the model target will typically be the output from a specific category.
The targets parameter passed to the CAM method can then use ClassifierOutputTarget:
```python
targets = [ClassifierOutputTarget(281)]
```
However for more advanced cases, you might want a different behaviour. Check here for more examples.
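The pattern itself is simple. Below is a minimal sketch of a classifier-style target (not the library's exact source), plus a hypothetical custom target that explains the sum of two class scores; both are just callables on the model output.

```python
import numpy as np

class ClassifierOutputTarget:
    """Sketch of the pattern: pick one class score out of the model output,
    so the CAM explains that single scalar."""
    def __init__(self, category):
        self.category = category

    def __call__(self, model_output):
        # model_output: (num_classes,) or (batch, num_classes)
        if model_output.ndim == 1:
            return model_output[self.category]
        return model_output[:, self.category]

class SumOfClassesTarget:
    """Hypothetical custom target: explain the sum of two class scores."""
    def __init__(self, a, b):
        self.a, self.b = a, b

    def __call__(self, model_output):
        return model_output[:, self.a] + model_output[:, self.b]

logits = np.array([[0.1, 2.5, -0.3]])
print(float(ClassifierOutputTarget(1)(logits)[0]))  # 2.5
print(float(SumOfClassesTarget(0, 2)(logits)[0]))
```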
Tutorials
Here you can find detailed examples of how to use this for various custom use cases like object detection:
These point to the new documentation jupyter-book for fast rendering. The jupyter notebooks themselves can be found under the tutorials folder in the git repository.
Notebook tutorial: XAI Recipes for the HuggingFace 🤗 Image Classification Models
Notebook tutorial: Deep Feature Factorizations for better model explainability
Notebook tutorial: Class Activation Maps for Object Detection with Faster-RCNN
Notebook tutorial: Class Activation Maps for Semantic Segmentation
Notebook tutorial: Adapting pixel attribution methods for embedding outputs from models
Notebook tutorial: May the best explanation win. CAM Metrics and Tuning
Guided backpropagation
```python
from pytorch_grad_cam import GuidedBackpropReLUModel
from pytorch_grad_cam.utils.image import (
    show_cam_on_image, deprocess_image, preprocess_image
)

gb_model = GuidedBackpropReLUModel(model=model, device=model.device())
gb = gb_model(input_tensor, target_category=None)

cam_mask = cv2.merge([grayscale_cam, grayscale_cam, grayscale_cam])
cam_gb = deprocess_image(cam_mask * gb)
result = deprocess_image(gb)
```
Metrics and evaluating the explanations
```python
from pytorch_grad_cam.utils.model_targets import ClassifierOutputSoftmaxTarget
from pytorch_grad_cam.metrics.cam_mult_image import CamMultImageConfidenceChange

# Create the metric target, often the confidence drop in a score of some category
metric_target = ClassifierOutputSoftmaxTarget(281)
scores, batch_visualizations = CamMultImageConfidenceChange()(
    input_tensor, inverse_cams, targets, model, return_visualization=True)
visualization = deprocess_image(batch_visualizations[0, :])

# State of the art metric: Remove and Debias
from pytorch_grad_cam.metrics.road import ROADMostRelevantFirst, ROADLeastRelevantFirst
cam_metric = ROADMostRelevantFirst(percentile=75)
scores, perturbation_visualizations = cam_metric(
    input_tensor, grayscale_cams, targets, model, return_visualization=True)

# You can also average across different percentiles, and combine
# (LeastRelevantFirst - MostRelevantFirst) / 2
from pytorch_grad_cam.metrics.road import ROADMostRelevantFirstAverage, ROADLeastRelevantFirstAverage, ROADCombined
cam_metric = ROADCombined(percentiles=[20, 40, 60, 80])
scores = cam_metric(input_tensor, grayscale_cams, targets, model)
```
Smoothing to get nice looking CAMs
To reduce noise in the CAMs, and make it fit better on the objects, two smoothing methods are supported:
aug_smooth=True
Test-time augmentation: increases the run time 6x.
Applies a combination of horizontal flips and multiplying the image intensities by [1.0, 1.1, 0.9].
This has the effect of better centering the CAM around the objects.
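A simplified sketch of the averaging behind aug_smooth (illustrative NumPy only; the real implementation differs in detail, but the six augmentations below match the 6x run time):

```python
import numpy as np

def aug_smooth_sketch(compute_cam, image):
    """Average CAMs over horizontal flips and intensity multipliers,
    un-flipping each flipped CAM before averaging."""
    cams = []
    for flip in (False, True):
        for scale in (1.0, 1.1, 0.9):
            img = image[:, ::-1] if flip else image
            cam = compute_cam(np.clip(img * scale, 0, 1))
            cams.append(cam[:, ::-1] if flip else cam)  # undo the flip
    return np.mean(cams, axis=0)

# Toy check with an identity "CAM": the intensity scalings average out.
image = np.full((4, 4), 0.5)
print(aug_smooth_sketch(lambda x: x, image))
```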
eigen_smooth=True
First principal component of activations*weights
This has the effect of removing a lot of noise.
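A rough NumPy sketch of the eigen_smooth idea, under the assumption that what is kept is the projection of the flattened activations onto their first principal component (names here are illustrative, not the library's API):

```python
import numpy as np

def eigen_smooth_sketch(activations):
    """Project channel activations onto their first principal component.
    A sketch only: the library's version also rescales the result and
    handles batches of images."""
    channels, h, w = activations.shape
    flat = activations.reshape(channels, h * w).T  # (pixels, channels)
    flat = flat - flat.mean(axis=0)                # center before SVD
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    cam = (flat @ vt[0]).reshape(h, w)             # 1st-component projection
    if cam.sum() < 0:                              # eigenvector sign is arbitrary
        cam = -cam
    return cam

rng = np.random.default_rng(0)
acts = rng.standard_normal((8, 7, 7))
print(eigen_smooth_sketch(acts).shape)  # (7, 7)
```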
[Images] AblationCAM without smoothing, with aug smooth, with eigen smooth, and with aug+eigen smooth.
Running the example script:
```
python cam.py --image-path <path_to_image> --method <method> --output-dir <output_dir_path>
```
To use a specific device, like cpu, cuda, cuda:0, mps or hpu:
```
python cam.py --image-path <path_to_image> --device cuda --output-dir <output_dir_path>
```
You can choose between:
GradCAM, HiResCAM, ScoreCAM, GradCAMPlusPlus, AblationCAM, XGradCAM, LayerCAM, FullGrad, EigenCAM, ShapleyCAM, and FinerCAM.
Some methods like ScoreCAM and AblationCAM require a large number of forward passes, and have a batched implementation.
You can control the batch size with `cam.batch_size`.
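To illustrate why this matters: these methods generate many perturbed inputs internally, and the batch size controls how many go through the model per forward pass. A toy sketch of the grouping (pure Python, not library code):

```python
def batches(items, batch_size):
    """Split a list of perturbed inputs into forward-pass-sized groups."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# e.g. 512 ablated activation maps with batch size 32 -> 16 forward passes
perturbed = list(range(512))
print(sum(1 for _ in batches(perturbed, 32)))  # 16
```

A larger batch size means fewer, bigger forward passes, trading memory for speed.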
Citation
If you use this for research, please cite. Here is an example BibTeX entry:
```
@misc{jacobgilpytorchcam,
  title={PyTorch library for CAM methods},
  author={Jacob Gildenblat and contributors},
  year={2021},
  publisher={GitHub},
  howpublished={\url{https://github.com/jacobgil/pytorch-grad-cam}},
}
```
References
https://arxiv.org/abs/1610.02391
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra
https://arxiv.org/abs/2011.08891
Use HiResCAM instead of Grad-CAM for faithful explanations of convolutional neural networks
Rachel L. Draelos, Lawrence Carin
https://arxiv.org/abs/1710.11063
Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks
Aditya Chattopadhyay, Anirban Sarkar, Prantik Howlader, Vineeth N Balasubramanian
https://arxiv.org/abs/1910.01279
Score-CAM: Score-Weighted Visual Explanations for Convolutional Neural Networks
Haofan Wang, Zifan Wang, Mengnan Du, Fan Yang, Zijian Zhang, Sirui Ding, Piotr Mardziel, Xia Hu
https://ieeexplore.ieee.org/abstract/document/9093360/
Ablation-CAM: Visual Explanations for Deep Convolutional Network via Gradient-free Localization
Saurabh Desai and Harish G Ramaswamy. In WACV, pages 972–980, 2020
https://arxiv.org/abs/2008.02312
Axiom-based Grad-CAM: Towards Accurate Visualization and Explanation of CNNs
Ruigang Fu, Qingyong Hu, Xiaohu Dong, Yulan Guo, Yinghui Gao, Biao Li
https://arxiv.org/abs/2008.00299
Eigen-CAM: Class Activation Map using Principal Components
Mohammed Bany Muhammad, Mohammed Yeasin
http://mftp.mmcheng.net/Papers/21TIP_LayerCAM.pdf
LayerCAM: Exploring Hierarchical Class Activation Maps for Localization
Peng-Tao Jiang; Chang-Bin Zhang; Qibin Hou; Ming-Ming Cheng; Yunchao Wei
https://arxiv.org/abs/1905.00780
Full-Gradient Representation for Neural Network Visualization
Suraj Srinivas, Francois Fleuret
https://arxiv.org/abs/1806.10206
Deep Feature Factorization For Concept Discovery
Edo Collins, Radhakrishna Achanta, Sabine Süsstrunk
https://arxiv.org/abs/2410.00267
KPCA-CAM: Visual Explainability of Deep Computer Vision Models using Kernel PCA
Sachin Karmani, Thanushon Sivakaran, Gaurav Prasad, Mehmet Ali, Wenbo Yang, Sheyang Tang
https://hal.science/hal-02963298/document
Features Understanding in 3D CNNs for Actions Recognition in Video
Kazi Ahmed Asif Fuad, Pierre-Etienne Martin, Romain Giot, Romain Bourqui, Jenny Benois-Pineau, Akka Zemmar
https://arxiv.org/abs/2501.06261
CAMs as Shapley Value-based Explainers
Huaiguang Cai
https://arxiv.org/pdf/2501.11309
Finer-CAM: Spotting the Difference Reveals Finer Details for Visual Explanation
Ziheng Zhang*, Jianyang Gu*, Arpita Chowdhury, Zheda Mai, David Carlyn, Tanya Berger-Wolf, Yu Su, Wei-Lun Chao
Owner
- Name: Jacob Gildenblat
- Login: jacobgil
- Kind: user
- Location: Israel
- Website: jacobgil.github.io
- Twitter: jacobgildenblat
- Repositories: 16
- Profile: https://github.com/jacobgil
Playing with tensors.
GitHub Events
Total
- Issues event: 27
- Watch event: 1,565
- Issue comment event: 71
- Push event: 7
- Pull request review event: 5
- Pull request event: 11
- Fork event: 116
Last Year
- Issues event: 27
- Watch event: 1,565
- Issue comment event: 71
- Push event: 7
- Pull request review event: 5
- Pull request event: 11
- Fork event: 116
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Jacob Gildenblat | j****t@g****m | 169 |
| jdecid | j****d@g****m | 7 |
| Oliver | o****9@g****m | 5 |
| Ming Lu | l****6@g****m | 3 |
| LucaButera | 2****a | 2 |
| Rachel Draelos, MD, PhD | r****s@g****m | 2 |
| Ziheng Zhang | z****7@o****u | 2 |
| jackyjinjing | h****3@1****m | 2 |
| Fan Jingbo | f****1@g****m | 2 |
| Justas Birgiolas | J****B | 1 |
| Junjie | 6****z | 1 |
| Garima Jain | g****9@g****m | 1 |
| Daniel De León | 1****3 | 1 |
| Christophe Foyer | c****r@g****m | 1 |
| Chris Hammill | c****l@g****m | 1 |
| ChiLin Chiou | c****u@g****m | 1 |
| Aray Karjauv | k****y@g****m | 1 |
| Anthony Dave | 4****i | 1 |
| Ambesh Shekhar | 3****a | 1 |
| Akon-Fiber | 5****r | 1 |
| Akash A Desai | 6****8 | 1 |
| priyavrat-misra | c****m@p****e | 1 |
| dependabot[bot] | 4****] | 1 |
| cai2-huaiguang | c****3@m****n | 1 |
| Zhou T | 1****w | 1 |
| Zachary Mostowsky | 3****y | 1 |
| Yuta Fukasawa | y****8@g****m | 1 |
| Yonghye Kwon | d****e@g****m | 1 |
| Ujjwal Sharma | m****a@g****m | 1 |
| Shreyas | s****a@g****m | 1 |
| and 12 more... | ||
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 210
- Total pull requests: 48
- Average time to close issues: about 1 month
- Average time to close pull requests: 7 months
- Total issue authors: 194
- Total pull request authors: 40
- Average comments per issue: 2.26
- Average comments per pull request: 1.44
- Merged pull requests: 23
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 33
- Pull requests: 15
- Average time to close issues: about 2 months
- Average time to close pull requests: 7 days
- Issue authors: 32
- Pull request authors: 9
- Average comments per issue: 0.58
- Average comments per pull request: 1.67
- Merged pull requests: 9
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- sammlapp (4)
- MoH-assan (3)
- MaxPolak97 (3)
- C-C-Y (2)
- marios1861 (2)
- RMobina (2)
- PietroManganelliConforti (2)
- lunaryan (2)
- hxngu (2)
- jooseuk (2)
- vggls (2)
- iremalti (2)
- Yanll2021 (1)
- johnwalking (1)
- manuelGue (1)
Pull Request Authors
- jackyjinjing (6)
- Link7808 (4)
- ValMystletainn (4)
- TekayaNidham (2)
- EdgeObserver (2)
- ShoufaChen (2)
- anthonyweidai (2)
- daniel-de-leon-user293 (2)
- TrungKhoaLe (2)
- lgov (2)
- Christophe-Foyer (2)
- kumar-selvakumaran (2)
- ashishpatel26 (2)
- sgsangodkar (2)
- hoel-bagard (2)
Packages
- Total packages: 2
- Total downloads: pypi: 35,467 last month
- Total docker downloads: 148
- Total dependent packages: 12 (may contain duplicates)
- Total dependent repositories: 80 (may contain duplicates)
- Total versions: 38
- Total maintainers: 2
pypi.org: grad-cam
Many Class Activation Map methods implemented in Pytorch for classification, segmentation, object detection and more
- Homepage: https://github.com/jacobgil/pytorch-grad-cam
- Documentation: https://grad-cam.readthedocs.io/
- License: MIT License
- Latest release: 1.5.5 (published 11 months ago)
conda-forge.org: grad-cam
- Homepage: https://github.com/jacobgil/pytorch-grad-cam
- License: MIT
- Latest release: 1.4.0 (published over 3 years ago)
Dependencies
- Pillow *
- numpy *
- opencv-python *
- torch >=1.7.1
- torchvision >=0.8.2
- tqdm *
- ttach *
- actions/checkout v2 composite
- actions/setup-python v2 composite