tuned-lens

Tools for understanding how transformer predictions are built layer-by-layer

https://github.com/alignmentresearch/tuned-lens

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (13.4%) to scientific vocabulary

Keywords

machine-learning pytorch transformers

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: AlignmentResearch
License: MIT
Language: Python
Default Branch: main
Homepage: https://tuned-lens.readthedocs.io/en/latest/
Size: 5.98 MB

Statistics

Stars: 512
Watchers: 6
Forks: 57
Open Issues: 15
Releases: 5

Topics

machine-learning pytorch transformers

Created over 3 years ago · Last pushed 11 months ago

Metadata Files

Readme, License, Citation

Tuned Lens 🔎

Tools for understanding how transformer predictions are built layer-by-layer.

This package provides a simple interface for training and evaluating tuned lenses. A tuned lens allows us to peek at the iterative computations a transformer uses to compute the next token.

What is a Lens?

[Figure: A diagram showing how a translator within the lens allows you to skip intermediate layers.]

A lens into a transformer with n layers allows you to replace the last m layers of the model with an affine transformation (we call these affine translators). Each affine translator is trained to minimize the KL divergence between its prediction and the final output distribution of the original model. This means that after training, the tuned lens allows you to skip over these last few layers and see the best prediction that can be made from the model's intermediate representations, i.e., the residual stream, at layer n - m.

The reason we need to train an affine translator is that the representations may be rotated, shifted, or stretched from layer to layer. This training differentiates this method from simpler approaches that unembed the residual stream of the network directly using the unembedding matrix, i.e., the logit lens. We explain this process and its applications in the paper Eliciting Latent Predictions from Transformers with the Tuned Lens.
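The contrast between the logit lens and an affine translator can be illustrated with a toy example. The sketch below is not the library's implementation: the 2-dimensional residual stream, the 3-token vocabulary, the `UNEMBED` matrix, and the hand-picked rotation and shift are all hypothetical, and layer normalization and the actual training loop are omitted. It only shows why unembedding a rotated and shifted intermediate state directly can disagree with the final output, while a suitable affine map drives the KL divergence to zero.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical unembedding matrix: 3 vocabulary tokens, 2 hidden dimensions.
UNEMBED = [[2.0, 0.0], [0.0, 2.0], [-1.0, -1.0]]


def logit_lens(hidden):
    """Unembed the intermediate state directly, with no translator."""
    return softmax(matvec(UNEMBED, hidden))


def tuned_lens(hidden, weight, bias):
    """Apply an affine translator first, then unembed."""
    translated = [t + b for t, b in zip(matvec(weight, hidden), bias)]
    return softmax(matvec(UNEMBED, translated))


# Assumed toy setup: the final-layer state is the intermediate state
# rotated by 90 degrees and shifted by (0.5, 0.5).
ROTATION = [[0.0, -1.0], [1.0, 0.0]]
SHIFT = [0.5, 0.5]
intermediate = [-1.0, -1.0]
final_hidden = [1.5, -0.5]  # equals ROTATION @ intermediate + SHIFT

target = softmax(matvec(UNEMBED, final_hidden))  # the model's final output

kl_logit = kl_divergence(target, logit_lens(intermediate))
kl_tuned = kl_divergence(target, tuned_lens(intermediate, ROTATION, SHIFT))

print(f"logit lens KL: {kl_logit:.3f}")
print(f"tuned lens KL: {kl_tuned:.3f}")
```

In this toy case the translator is chosen by hand to undo the rotation and shift exactly; in practice the translator's weights would be learned by minimizing the same KL divergence over many inputs.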

Acknowledgments

Originally conceived by Igor Ostrovsky and Stella Biderman at EleutherAI, this library was built as a collaboration between FAR and EleutherAI researchers.

Install Instructions

Installing from PyPI

First, you will need to install the basic prerequisites into a virtual environment:

* Python 3.9+
* PyTorch 1.13.0+

Then, you can install the package using pip:

pip install tuned-lens
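Before installing, it can help to confirm that the environment meets the prerequisites listed above. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `check_prerequisites` are hypothetical and are not part of tuned-lens. It reads the installed PyTorch version from package metadata rather than importing `torch`.

```python
import re
import sys
from importlib import metadata


def parse_version(text):
    """Turn a version string such as '2.1.0+cu118' into a tuple like (2, 1, 0)."""
    numbers = []
    for piece in text.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)  # keep only the leading digits
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:  # pad short versions such as '1.13'
        numbers.append(0)
    return tuple(numbers)


def check_prerequisites(min_python=(3, 9), min_torch=(1, 13, 0)):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append(f"Python {min_python[0]}.{min_python[1]}+ is required")
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        problems.append("PyTorch is not installed")
    else:
        if parse_version(torch_version) < min_torch:
            problems.append(f"PyTorch {torch_version} is older than 1.13.0")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Problem:", problem)
```

An empty result only indicates that the version requirements stated above are met; it does not check the other dependencies of the package.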

Installing the container

If you prefer to run the training scripts from within a container, you can use the provided Docker container.

docker pull ghcr.io/alignmentresearch/tuned-lens:latest
docker run --rm ghcr.io/alignmentresearch/tuned-lens:latest tuned-lens --help

Contributing

Make sure to install the dev dependencies and the pre-commit hooks:

$ git clone https://github.com/AlignmentResearch/tuned-lens.git
$ cd tuned-lens
$ pip install -e ".[dev]"
$ pre-commit install

Citation

If you find this library useful, please cite it as:

@article{belrose2023eliciting,
  title={Eliciting Latent Predictions from Transformers with the Tuned Lens},
  author={Belrose, Nora and Furman, Zach and Smith, Logan and Halawi, Danny and McKinney, Lev and Ostrovsky, Igor and Biderman, Stella and Steinhardt, Jacob},
  journal={to appear},
  year={2023}
}

Warning: This package has not reached 1.0. Expect the public interface to change regularly and without major version bumps.

Owner

Name: FAR AI
Login: AlignmentResearch
Kind: organization
Email: hello@far.ai

Website: https://far.ai
Repositories: 16
Profile: https://github.com/AlignmentResearch

FAR AI is an alignment research non-profit working to ensure AI systems are trustworthy and beneficial to society.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Belrose"
  given-names: "Nora"
- family-names: "Furman"
  given-names: "Zach"
- family-names: "Smith"
  given-names: "Logan"
- family-names: "Halawi"
  given-names: "Danny"
- family-names: "McKinney"
  given-names: "Lev"
- family-names: "Ostrovsky"
  given-names: "Igor"
- family-names: "Biderman"
  given-names: "Stella"
- family-names: "Steinhardt"
  given-names: "Jacob"
title: "Eliciting Latent Predictions from Transformers with the Tuned Lens"
version: 0.1.0
date-released: "2023-03-06"
url: "https://github.com/AlignmentResearch/tuned-lens"
preferred-citation:
  type: article
  authors:
  - family-names: "Belrose"
    given-names: "Nora"
  - family-names: "Furman"
    given-names: "Zach"
  - family-names: "Smith"
    given-names: "Logan"
  - family-names: "Halawi"
    given-names: "Danny"
  - family-names: "McKinney"
    given-names: "Lev"
  - family-names: "Ostrovsky"
    given-names: "Igor"
  - family-names: "Biderman"
    given-names: "Stella"
  - family-names: "Steinhardt"
    given-names: "Jacob"
  journal: "to appear"
  title: "Eliciting Latent Predictions from Transformers with the Tuned Lens"
  year: 2023

GitHub Events

Total

Issues event: 7
Watch event: 98
Issue comment event: 8
Push event: 1
Pull request review event: 1
Pull request event: 2
Fork event: 13

Last Year

Issues event: 7
Watch event: 98
Issue comment event: 8
Push event: 1
Pull request review event: 1
Pull request event: 2
Fork event: 13

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 3
Total pull requests: 1
Average time to close issues: N/A
Average time to close pull requests: 18 days
Total issue authors: 3
Total pull request authors: 1
Average comments per issue: 0.0
Average comments per pull request: 2.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 3
Pull requests: 1
Average time to close issues: N/A
Average time to close pull requests: 18 days
Issue authors: 3
Pull request authors: 1
Average comments per issue: 0.0
Average comments per pull request: 2.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

levmckinney (3)
themaverick (1)
NielsRogge (1)
Windy3f3f3f3f (1)
Nicole-Nobili (1)
jbloomAus (1)
sunwookim1214 (1)

Pull Request Authors

levmckinney (5)
Nicole-Nobili (1)
joshbarua (1)
norabelrose (1)

Top Labels

Issue Labels

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- pypi 368 last-month
Total docker downloads: 54

Total dependent packages: 0
Total dependent repositories: 1
Total versions: 8
Total maintainers: 1

pypi.org: tuned-lens

Tools for understanding how transformer predictions are built layer-by-layer

Documentation: https://tuned-lens.readthedocs.io/
License: MIT License
Latest release: 0.2.0 (published almost 3 years ago)

Versions: 8
Dependent Packages: 0
Dependent Repositories: 1
Downloads: 368 Last month
Docker Downloads: 54

Rankings

Docker downloads count: 4.7%

Dependent packages count: 10.0%

Average: 12.4%

Downloads: 13.3%

Dependent repos count: 21.7%

Maintainers (1)

levmckinney

Last synced: 10 months ago

Dependencies

.github/workflows/pre-merge.yml actions

actions/checkout v3 composite
actions/checkout v2 composite
actions/setup-python v3 composite
codecov/codecov-action v3 composite
docker/build-push-action v4 composite
docker/setup-buildx-action v2 composite
pre-commit/action v3.0.0 composite

.github/workflows/pre-release.yml actions

actions/checkout v2 composite
actions/checkout master composite
actions/setup-python v2 composite
actions/setup-python v3 composite
docker/build-push-action v4 composite
docker/setup-buildx-action v2 composite
pypa/gh-action-pypi-publish release/v1 composite
softprops/action-gh-release v1 composite

.github/workflows/publish.yml actions

actions/checkout v3 composite
actions/checkout master composite
actions/setup-python v3 composite
docker/build-push-action v4 composite
docker/login-action v2 composite
docker/metadata-action v4 composite
docker/setup-buildx-action v2 composite
pypa/gh-action-pypi-publish release/v1 composite

Dockerfile docker

base latest build
nvidia/cuda 11.8.0-devel-ubuntu22.04 build

pyproject.toml pypi

accelerate *
datasets *
flatten-dict >=0.4.1
huggingface_hub >=0.16.4
plotly >=5.13.1
simple-parsing >=0.1.4
torch >=1.13.0
torchdata >=0.6.0
transformers >=4.28.1
wandb >=0.15.0
