perceiver-io
A PyTorch implementation of Perceiver, Perceiver IO and Perceiver AR with PyTorch Lightning scripts for distributed training
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file (found)
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ✓ Academic publication links (arxiv.org)
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.9%) to scientific vocabulary
Repository
Basic Info
Statistics
- Stars: 486
- Watchers: 9
- Forks: 43
- Open Issues: 2
- Releases: 0
Metadata Files
README.md
Perceiver, Perceiver IO and Perceiver AR
This repository is a PyTorch implementation of Perceiver, Perceiver IO and Perceiver AR, with PyTorch Lightning interfaces for model training and Hugging Face 🤗 interfaces for inference.
- Perceiver: General Perception with Iterative Attention (paper, video)
- Perceiver IO: A General Architecture for Structured Inputs & Outputs (paper, blog post)
- General-purpose, long-context autoregressive modeling with Perceiver AR (paper, blog post)
Overview
At the core of the perceiver-io library are backend models: lightweight PyTorch implementations of Perceiver,
Perceiver IO and Perceiver AR. They can be wrapped into PyTorch Lightning modules for training
(Lightning interface) and 🤗 modules for inference (Hugging Face interface). See
library design for details.
The command line interface for training is implemented with Lightning CLI.
Training datasets are 🤗 datasets wrapped into PyTorch Lightning data modules.
For NLP tasks, perceiver-io supports all 🤗 fast tokenizers
and the 🤗 Perceiver UTF-8 bytes tokenizer.
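The efficiency idea shared by all three model families can be illustrated with a minimal, self-contained sketch (purely illustrative, not the perceiver-io API; the class and parameter names below are made up): a small learned latent array cross-attends to a potentially very long input, so attention cost grows linearly with input length rather than quadratically.

```python
import torch
import torch.nn as nn


class MiniPerceiverEncoder(nn.Module):
    """Illustrative sketch of the Perceiver latent bottleneck (not the
    perceiver-io API): a learned latent array cross-attends to the input,
    so cost is O(num_latents * input_len) instead of O(input_len ** 2)."""

    def __init__(self, num_latents=32, num_latent_channels=64, num_input_channels=16):
        super().__init__()
        # learned latent array, shared across all inputs
        self.latents = nn.Parameter(torch.randn(num_latents, num_latent_channels))
        # latents are queries; keys/values are projected from the input
        self.cross_attn = nn.MultiheadAttention(
            embed_dim=num_latent_channels,
            num_heads=4,
            kdim=num_input_channels,
            vdim=num_input_channels,
            batch_first=True,
        )

    def forward(self, x):
        # x: (batch, input_len, num_input_channels)
        q = self.latents.unsqueeze(0).expand(x.shape[0], -1, -1)
        out, _ = self.cross_attn(q, x, x)
        return out  # (batch, num_latents, num_latent_channels)


enc = MiniPerceiverEncoder()
x = torch.randn(2, 1000, 16)  # batch of 2 sequences, 1000 input vectors each
print(enc(x).shape)           # torch.Size([2, 32, 64])
```

The output always has `num_latents` positions regardless of input length, which is what lets the real models scale to long inputs such as pixels or raw bytes.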
Documentation
- Installation
- Getting started
- Library design
- Pretrained models
- Training examples
- Inference examples
- Model construction
- Building blocks
Installation
Via pip
```shell
pip install perceiver-io[text,vision,audio]
```
From sources
Installation from sources requires Miniconda and Poetry (1.2.0 or higher).
Create and activate the perceiver-io conda environment:
```shell
conda env create -f environment.yml
conda activate perceiver-io
```
Install main and test dependencies, including all extras:
```shell
# Without dependencies required for examples
poetry install --all-extras
```

If you want to run the examples locally, additionally use `--with examples`:

```shell
poetry install --all-extras --with examples
```
Docker image
```shell
docker pull ghcr.io/krasserm/perceiver-io:latest
```
See Docker image for details.
Getting started
Inference
Optical flow
Compute the optical flow between consecutive frames of an input video and write the rendered results to an output video:
```python
from urllib.request import urlretrieve
from transformers import pipeline

from perceiver.data.vision import video_utils
from perceiver.model.vision import optical_flow  # register auto-classes and pipeline

urlretrieve(
    url="https://martin-krasser.com/perceiver/flow/sintel_clip_cave_dragon_fight.mp4",
    filename="sintel_clip_cave_dragon_fight.mp4",
)

# Create optical flow pipeline
optical_flow_pipeline = pipeline("optical-flow", model="krasserm/perceiver-io-optical-flow", device="cuda:0")

# Load consecutive video frame pairs
frame_pairs = video_utils.read_video_frame_pairs("sintel_clip_cave_dragon_fight.mp4")

# Create and render optical flow for all frame pairs
optical_flows = optical_flow_pipeline(frame_pairs, render=True, device="cuda:0")

# Create video with rendered optical flows
video_utils.write_video("sintel_clip_cave_dragon_fight_output.mp4", optical_flows, fps=24)
```
Here is a side-by-side comparison of the input and output video:
Symbolic audio generation
Create audio sequences by generating symbolic (MIDI) audio data and converting the generated audio symbols into WAV output using fluidsynth (Note: fluidsynth must be installed in order for the following example to work):
```python
from transformers import pipeline
from pretty_midi import PrettyMIDI

from perceiver.model.audio import symbolic  # auto-class registration

repo_id = "krasserm/perceiver-ar-sam-giant-midi"

prompt = PrettyMIDI("prompt.mid")
audio_generator = pipeline("symbolic-audio-generation", model=repo_id)

output = audio_generator(prompt, max_new_tokens=64, num_latents=1, do_sample=True, top_p=0.95, temperature=1.0, render=True)

with open("generated_audio.wav", "wb") as f:
    f.write(output["generated_audio_wav"])
```
Examples of generated audio sequences are available on the 🤗 hub.
See inference examples for more examples.
Training
Train a small Perceiver IO image classifier (907K parameters) on MNIST from the command line. The classifier attends to individual pixels of input images, using repeated cross-attention. See the image classification training example for more details.
```shell
python -m perceiver.scripts.vision.image_classifier fit \
  --model.num_latents=32 \
  --model.num_latent_channels=128 \
  --model.encoder.num_frequency_bands=32 \
  --model.encoder.num_cross_attention_layers=2 \
  --model.encoder.num_self_attention_blocks=3 \
  --model.encoder.num_self_attention_layers_per_block=3 \
  --model.encoder.first_self_attention_block_shared=false \
  --model.encoder.dropout=0.1 \
  --model.encoder.init_scale=0.1 \
  --model.decoder.num_output_query_channels=128 \
  --model.decoder.dropout=0.1 \
  --model.decoder.init_scale=0.1 \
  --data=MNISTDataModule \
  --data.batch_size=64 \
  --optimizer=AdamW \
  --optimizer.lr=1e-3 \
  --lr_scheduler.warmup_steps=500 \
  --trainer.accelerator=gpu \
  --trainer.devices=1 \
  --trainer.max_epochs=30 \
  --trainer.logger=TensorBoardLogger \
  --trainer.logger.save_dir=logs \
  --trainer.logger.name=logs
```
Model construction describes how to implement model-specific command line interfaces
with the Lightning CLI. Training checkpoints are written to the logs/img_clf/version_0/checkpoints directory. Assuming
a checkpoint with filename epoch=025-val_loss=0.065.ckpt exists, it can be converted to a perceiver-io 🤗 model with
```python
from perceiver.model.vision.image_classifier import convert_mnist_classifier_checkpoint

convert_mnist_classifier_checkpoint(
    save_dir="example/mnist-classifier",
    ckpt_url="logs/img_clf/version_0/checkpoints/epoch=025-val_loss=0.065.ckpt",
)
```
so that it can be used in a 🤗 image classification pipeline
```python
from datasets import load_dataset
from transformers import pipeline

mnist_dataset = load_dataset("mnist", split="test")[:9]

images = mnist_dataset["image"]
labels = mnist_dataset["label"]

classifier = pipeline("image-classification", model="example/mnist-classifier")
predictions = [pred[0]["label"] for pred in classifier(images)]

print(f"Labels: {labels}")
print(f"Predictions: {predictions}")
```

```
Labels: [7, 2, 1, 0, 4, 1, 4, 9, 5]
Predictions: [7, 2, 1, 0, 4, 1, 4, 9, 5]
```
or loaded directly:
```python
import torch
from transformers import AutoModelForImageClassification, AutoImageProcessor

model = AutoModelForImageClassification.from_pretrained("example/mnist-classifier")
processor = AutoImageProcessor.from_pretrained("example/mnist-classifier")

inputs = processor(images, return_tensors="pt")

with torch.no_grad():
    # use perceiver-io Hugging Face model
    output_1 = model(**inputs).logits

with torch.no_grad():
    # or use perceiver-io backend model directly
    output_2 = model.backend_model(inputs.pixel_values)

print(f"Predictions: {output_1.argmax(dim=-1).numpy().tolist()}")
print(f"Predictions: {output_2.argmax(dim=-1).numpy().tolist()}")
```

```
Predictions: [7, 2, 1, 0, 4, 1, 4, 9, 5]
Predictions: [7, 2, 1, 0, 4, 1, 4, 9, 5]
```
See training examples for more examples.
Articles
Articles referencing this repository:
- Training compute-optimal Perceiver AR language models
- A gentle introduction to Rotary Position Embedding
Other implementations
Owner
- Name: Martin Krasser
- Login: krasserm
- Kind: user
- Location: Vienna, Austria
- Website: https://martin-krasser.com/
- Repositories: 29
- Profile: https://github.com/krasserm
Freelance machine learning engineer.
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "A PyTorch implementation of Perceiver, Perceiver IO and Perceiver AR with PyTorch Lightning scripts for distributed training."
authors:
- family-names: "Krasser"
given-names: "Martin"
- family-names: "Stumpf"
given-names: "Christoph"
version: 0.11.1
doi: 10.5281/zenodo.10451410
date-released: 2024-01-02
url: "https://github.com/krasserm/perceiver-io"
license: "Apache-2.0"
GitHub Events
Total
- Issues event: 1
- Watch event: 50
- Issue comment event: 2
- Fork event: 3
Last Year
- Issues event: 1
- Watch event: 50
- Issue comment event: 2
- Fork event: 3
Committers
Last synced: 10 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Martin Krasser | k****m@g****m | 159 |
| Jirka | j****c@s****z | 21 |
| Christoph Stumpf | s****h@g****m | 6 |
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 32
- Total pull requests: 27
- Average time to close issues: 16 days
- Average time to close pull requests: 3 days
- Total issue authors: 15
- Total pull request authors: 4
- Average comments per issue: 1.03
- Average comments per pull request: 1.07
- Merged pull requests: 23
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 2.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- krasserm (18)
- jiveshkalra (1)
- Colezwhy (1)
- 0scarJ1ang (1)
- awarebayes (1)
- je-santos (1)
- wscffaa (1)
- martin-minovski (1)
- ajv012 (1)
- kl2005ad (1)
- zhangyuygss (1)
- Norooa (1)
- batrlatom (1)
- ArianKhorasani (1)
Pull Request Authors
- krasserm (12)
- cstub (7)
- Borda (6)
- mattsta (1)
Packages
- Total packages: 1
- Total downloads: 328 last month (pypi)
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 17
- Total maintainers: 2
pypi.org: perceiver-io
Perceiver IO
- Homepage: https://github.com/krasserm/perceiver-io
- Documentation: https://perceiver-io.readthedocs.io/
- License: Apache-2.0
- Latest release: 0.11.0 (published over 2 years ago)
Dependencies
- atomicwrites 1.4.0 develop
- cfgv 3.3.1 develop
- coverage 6.4 develop
- distlib 0.3.4 develop
- identify 2.5.1 develop
- iniconfig 1.1.1 develop
- invoke 1.7.1 develop
- nodeenv 1.6.0 develop
- platformdirs 2.5.2 develop
- pluggy 1.0.0 develop
- pre-commit 2.19.0 develop
- py 1.11.0 develop
- pytest 7.1.2 develop
- pytest-cov 3.0.0 develop
- toml 0.10.2 develop
- tomli 2.0.1 develop
- virtualenv 20.14.1 develop
- absl-py 1.0.0
- aiohttp 3.8.1
- aiosignal 1.2.0
- async-timeout 4.0.2
- asynctest 0.13.0
- attrs 21.4.0
- cachetools 5.1.0
- certifi 2022.5.18.1
- charset-normalizer 2.0.12
- colorama 0.4.4
- datasets 2.2.2
- dill 0.3.4
- docstring-parser 0.14.1
- einops 0.4.1
- fairscale 0.4.6
- filelock 3.7.0
- frozenlist 1.3.0
- fsspec 2022.5.0
- google-auth 2.6.6
- google-auth-oauthlib 0.4.6
- grpcio 1.46.3
- huggingface-hub 0.7.0
- idna 3.3
- importlib-metadata 4.11.4
- jsonargparse 4.7.3
- lightning-bolts 0.5.0
- markdown 3.3.7
- multidict 6.0.2
- multiprocess 0.70.12.2
- numpy 1.21.1
- oauthlib 3.2.0
- packaging 21.3
- pandas 1.1.5
- pillow 9.1.1
- protobuf 3.20.1
- pyarrow 8.0.0
- pyasn1 0.4.8
- pyasn1-modules 0.2.8
- pydeprecate 0.3.2
- pyparsing 3.0.9
- python-dateutil 2.8.2
- pytorch-lightning 1.6.3
- pytorch-ranger 0.1.1
- pytz 2022.1
- pyyaml 6.0
- regex 2022.4.24
- requests 2.27.1
- requests-oauthlib 1.3.1
- responses 0.18.0
- rsa 4.8
- six 1.16.0
- tensorboard 2.9.0
- tensorboard-data-server 0.6.1
- tensorboard-plugin-wit 1.8.1
- tokenizers 0.12.1
- torch 1.11.0
- torch-optimizer 0.3.0
- torchmetrics 0.8.2
- torchvision 0.12.0
- tqdm 4.64.0
- transformers 4.19.2
- typing-extensions 4.2.0
- urllib3 1.26.9
- werkzeug 2.1.2
- xxhash 3.0.0
- yarl 1.7.2
- zipp 3.8.0
- invoke ^1.6.0 develop
- pre-commit ^2.17.0 develop
- pytest ^7.0.1 develop
- pytest-cov ^3.0.0 develop
- datasets 2.2.*
- einops 0.4.*
- fairscale 0.4.*
- jsonargparse 4.7.*
- lightning-bolts 0.5.*
- python ^3.7
- pytorch-lightning 1.6.*
- tokenizers 0.12.*
- torch 1.11.*
- torch-optimizer 0.3.*
- torchmetrics 0.8.*
- torchvision 0.12.*
- transformers 4.19.*
- abatilo/actions-poetry v2.0.0 composite
- actions/checkout v3 composite
- actions/checkout master composite
- actions/setup-python v2 composite
- abatilo/actions-poetry v2.0.0 composite
- actions/cache v2 composite
- actions/checkout v3 composite
- actions/setup-python v2 composite
- pytorch/pytorch 1.12.1-cuda11.3-cudnn8-runtime build
- actions/checkout v3 composite
- docker/build-push-action v4 composite
- docker/login-action v2 composite


