semanticlens
Mechanistic understanding and validation of large AI models with SemanticLens
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in README
- ✓ Academic publication links: Links to nature.com
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.0%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: jim-berend
- License: bsd-3-clause
- Language: Python
- Default Branch: main
- Homepage: https://jim-berend.github.io/semanticlens/
- Size: 26.5 MB
Statistics
- Stars: 20
- Watchers: 3
- Forks: 0
- Open Issues: 0
- Releases: 5
Metadata Files
README.md
An open-source PyTorch library for interpreting and validating large vision models.
Read the paper now as part of Nature Machine Intelligence (Open Access).
SemanticLens is a universal framework for explaining and validating large vision models. While deep learning models are powerful, their internal workings are often a "black box," making them difficult to trust and debug. SemanticLens addresses this by mapping the internal components of a model (like neurons or filters) into the rich, semantic space of a foundation model (e.g., CLIP or SigLIP).
This allows you to "translate" what the model is doing into a human-understandable format, enabling you to search, analyze, and audit its internal representations.
How It Works
Overview of the SemanticLens framework as introduced in our research paper.
The core workflow of SemanticLens involves three main steps:
1) Collect: For each component in a model M, we identify the data samples that cause the highest activation (the "concept examples").
We provide a suite of ComponentVisualizers that implement different strategies, from simple activation maximization to relevance-maximization and attribution-based cropping.
2) Embed: These examples are then fed into a foundation model (like CLIP), which creates a meaningful vector representation for each component. SemanticLens includes built-in support for OpenCLIP and can be easily extended with other foundation models (see base.py).
3) Analyze: These vector representations enable powerful analyses. The Lens class is the main interface for this, orchestrating the preprocessing, caching, and evaluation needed to search and audit your model using its new semantic embeddings.
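The first two steps can be illustrated with a toy sketch in plain Python. It does not use the SemanticLens API: `top_k_examples`, `mean_embedding`, and `toy_embed` are hypothetical helpers, and `toy_embed` is only a stand-in for a real foundation model such as CLIP. The sketch assumes activations are available as a samples-by-components table.

```python
import random


def top_k_examples(activations, k):
    """Return, per component, the indices of the k samples with the highest activation.

    activations[i][j] is the activation of component j on sample i.
    """
    n_components = len(activations[0])
    concept_examples = {}
    for j in range(n_components):
        ranked = sorted(
            range(len(activations)),
            key=lambda i: activations[i][j],
            reverse=True,
        )
        concept_examples[j] = ranked[:k]
    return concept_examples


def mean_embedding(vectors):
    """Average a list of equal-length vectors element-wise."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def toy_embed(sample_index, dim=4):
    """Hypothetical stand-in for a foundation model: one fixed vector per sample."""
    rng = random.Random(sample_index)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


# Toy activations: 5 samples (rows) x 2 components (columns).
activations = [
    [0.1, 0.9],
    [0.8, 0.2],
    [0.7, 0.1],
    [0.0, 0.6],
    [0.3, 0.3],
]

# Step 1 (Collect): the two most activating samples per component.
concept_examples = top_k_examples(activations, k=2)

# Step 2 (Embed): embed each example, then average into one vector per component.
component_embeddings = {
    j: mean_embedding([toy_embed(i) for i in indices])
    for j, indices in concept_examples.items()
}

print(concept_examples)  # {0: [1, 2], 1: [0, 3]}
```

In a real setup the activations would come from forward passes through the model under inspection, and the embedding function would be the image encoder of the foundation model.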
Installation
You can install SemanticLens directly from PyPI:
```bash
pip install semanticlens
```
To install the latest version from this repository:
```bash
pip install git+https://github.com/jim-berend/semanticlens.git
```
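To confirm that the installation succeeded, one option is to query the installed distribution metadata from Python. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not part of SemanticLens.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("semanticlens"))
```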
Quickstart
Example usage:

```python
import semanticlens as sl

...  # dataset and model setup

# Initialization
cv = sl.component_visualization.ActivationComponentVisualizer(
    model,
    dataset_model,
    dataset_fm,
    layer_names=layer_names,
    device=device,
    cache_dir=cache_dir,
)
fm = sl.foundation_models.OpenClip(url="RN50", pretrained="openai", device=device)
lens = sl.Lens(fm, device=device)

# Semantic Embedding
concept_db = lens.compute_concept_db(cv, batch_size=128, num_workers=8)
aggregated_cpt_db = {k: v.mean(1) for k, v in concept_db.items()}

# Analysis
polysemanticity_scores = lens.eval_polysemanticity(concept_db)
search_results = lens.text_probing(["cats", "dogs"], aggregated_cpt_db)
```
- Full quickstart guide: quickstart.ipynb
- Package documentation: docs
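The text-probing step in the quickstart can be pictured as a similarity search over the aggregated component embeddings. The sketch below is an illustration under assumptions, not the library's implementation: it assumes cosine similarity as the scoring function, and the component names, vectors, and helper functions are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_components(query_embedding, component_embeddings):
    """Sort components by similarity to the query, most similar first."""
    scores = {
        name: cosine_similarity(query_embedding, embedding)
        for name, embedding in component_embeddings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical aggregated embeddings, one vector per model component.
component_embeddings = {
    "layer4.neuron_3": [1.0, 0.0, 0.0],
    "layer4.neuron_7": [0.0, 1.0, 0.0],
    "layer4.neuron_9": [0.6, 0.8, 0.0],
}

# Hypothetical text embedding of a query such as "cats".
query = [1.0, 0.0, 0.0]

ranking = rank_components(query, component_embeddings)
print([name for name, _ in ranking])
# ['layer4.neuron_3', 'layer4.neuron_9', 'layer4.neuron_7']
```

With a real foundation model, the query vector would come from its text encoder, so the same ranking logic would surface the components whose concept examples best match a text prompt.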
Contributing
We welcome contributions to SemanticLens! Whether you're fixing a bug, adding a new feature, or improving the documentation, your help is appreciated.
If you'd like to contribute, please follow these steps:
1. Fork the repository on GitHub.
2. Create a new branch for your feature or bug fix (git checkout -b feature/your-feature-name).
3. Make your changes and commit them with a clear message.
4. Open a pull request to the main branch of the original repository.
For bug reports or feature requests, please use the GitHub Issues section. Before starting work on a major change, it's a good idea to open an issue first to discuss your plan.
License
SemanticLens is released under the BSD 3-Clause license.
Citation
@article{dreyer_mechanistic_2025,
title = {Mechanistic understanding and validation of large {AI} models with {SemanticLens}},
copyright = {2025 The Author(s)},
issn = {2522-5839},
url = {https://www.nature.com/articles/s42256-025-01084-w},
doi = {10.1038/s42256-025-01084-w},
language = {en},
urldate = {2025-08-18},
journal = {Nature Machine Intelligence},
author = {Dreyer, Maximilian and Berend, Jim and Labarta, Tobias and Vielhaben, Johanna and Wiegand, Thomas and Lapuschkin, Sebastian and Samek, Wojciech},
month = aug,
year = {2025},
note = {Publisher: Nature Publishing Group},
keywords = {Computer science, Information technology},
pages = {1--14},
}
Owner
- Name: Jim
- Login: jim-berend
- Kind: user
- Repositories: 1
- Profile: https://github.com/jim-berend
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Berend"
    given-names: "Jim"
  - family-names: "Dreyer"
    given-names: "Maximilian"
title: "SemanticLens Software"
version: 0.1.0
doi: 10.5281/zenodo.15233580
date-released: 2025-04-16
url: "https://github.com/jim-berend/semanticlens"
GitHub Events
Total
- Release event: 4
- Watch event: 19
- Push event: 17
- Public event: 1
- Fork event: 1
- Create event: 5
Last Year
- Release event: 4
- Watch event: 19
- Push event: 17
- Public event: 1
- Fork event: 1
- Create event: 5
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Packages
- Total packages: 1
- Total downloads:
  - pypi: 26 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 3
- Total maintainers: 1
pypi.org: semanticlens
A package for mechanistic understanding and validation of large AI models with SemanticLens
- Documentation: https://semanticlens.readthedocs.io/
- License: bsd-3-clause
- Latest release: 0.1.2 (published 6 months ago)
Dependencies
- actions/checkout v4 composite
- actions/download-artifact v4 composite
- actions/setup-python v5 composite
- actions/upload-artifact v4 composite
- pypa/gh-action-pypi-publish release/v1 composite
- einops >=0.8.0
- open-clip-torch >=2.30.0
- scikit-learn >=1.6.1
- timm >=1.0.13
- torch >=2.5.1
- transformers >=4.48.0
- zennit-crp >=0.6.0
- 105 dependencies
- actions/checkout v1 composite
- actions/setup-python v2 composite