perceiver-io

A PyTorch implementation of Perceiver, Perceiver IO and Perceiver AR with PyTorch Lightning scripts for distributed training

https://github.com/krasserm/perceiver-io

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.9%) to scientific vocabulary

Keywords

deep-learning machine-learning perceiver perceiver-ar perceiver-io pytorch pytorch-lightning
Last synced: 6 months ago

Repository

A PyTorch implementation of Perceiver, Perceiver IO and Perceiver AR with PyTorch Lightning scripts for distributed training

Basic Info
  • Host: GitHub
  • Owner: krasserm
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 22.2 MB
Statistics
  • Stars: 486
  • Watchers: 9
  • Forks: 43
  • Open Issues: 2
  • Releases: 0
Topics
deep-learning machine-learning perceiver perceiver-ar perceiver-io pytorch pytorch-lightning
Created over 4 years ago · Last pushed about 2 years ago
Metadata Files
Readme License Citation

README.md

Perceiver, Perceiver IO and Perceiver AR

This repository is a PyTorch implementation of Perceiver, Perceiver IO and Perceiver AR, with PyTorch Lightning interfaces for model training and Hugging Face 🤗 interfaces for inference.

  • Perceiver: General Perception with Iterative Attention (paper, video)
  • Perceiver IO: A General Architecture for Structured Inputs & Outputs (paper, blog post)
  • General-purpose, long-context autoregressive modeling with Perceiver AR (paper, blog post)

Overview

At the core of the perceiver-io library are backend models: lightweight PyTorch implementations of Perceiver, Perceiver IO and Perceiver AR. They can be wrapped into PyTorch Lightning modules for training (Lightning interface) and into 🤗 modules for inference (Hugging Face interface). See library design for details.

library-design
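The backend/wrapper split described above can be illustrated with a plain-Python sketch. All class and method names below are purely illustrative stand-ins, not the actual perceiver-io API:

```python
# Illustrative sketch of the wrapper pattern: one backend model, two thin
# wrappers for training and inference. Names are hypothetical.

class BackendModel:
    """Stands in for a lightweight PyTorch backend model (e.g. Perceiver IO)."""
    def forward(self, x):
        return [v * 2 for v in x]  # placeholder computation

class LightningWrapper:
    """Training-side wrapper, analogous to a PyTorch Lightning module."""
    def __init__(self, backend):
        self.backend = backend
    def training_step(self, batch):
        return sum(self.backend.forward(batch))  # placeholder loss

class HuggingFaceWrapper:
    """Inference-side wrapper, analogous to a Hugging Face model class."""
    def __init__(self, backend):
        self.backend = backend
    def __call__(self, inputs):
        return self.backend.forward(inputs)

backend = BackendModel()
trainer_module = LightningWrapper(backend)   # used by the Lightning interface
hf_model = HuggingFaceWrapper(backend)       # used by the Hugging Face interface

print(trainer_module.training_step([1, 2, 3]))  # 12
print(hf_model([1, 2, 3]))                      # [2, 4, 6]
```

Both wrappers share the same backend instance, so weights trained through the Lightning interface are directly usable through the inference interface.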

The command line interface for training is implemented with Lightning CLI. Training datasets are 🤗 datasets wrapped into PyTorch Lightning data modules. For NLP tasks, perceiver-io supports all 🤗 fast tokenizers and the 🤗 Perceiver UTF-8 bytes tokenizer.
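The data-module pattern mentioned above can be sketched in plain Python. This is illustrative only; the real data modules wrap 🤗 datasets and PyTorch DataLoaders, and `TinyDataModule` is a hypothetical name:

```python
# Minimal sketch of a Lightning-style data module over a toy dataset.

class TinyDataModule:
    """Mimics the LightningDataModule interface with a plain list dataset."""
    def __init__(self, batch_size=2):
        self.batch_size = batch_size
        self.dataset = None

    def setup(self, stage=None):
        # stands in for downloading/tokenizing a 🤗 dataset
        self.dataset = list(range(6))

    def train_dataloader(self):
        # yields fixed-size batches, like a torch DataLoader would
        for i in range(0, len(self.dataset), self.batch_size):
            yield self.dataset[i:i + self.batch_size]

dm = TinyDataModule()
dm.setup()
batches = list(dm.train_dataloader())
print(batches)  # [[0, 1], [2, 3], [4, 5]]
```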

Documentation

Installation

Via pip

```shell
pip install perceiver-io[text,vision,audio]
```

From sources

Installation from sources requires a Miniconda and a Poetry (1.2.0 or higher) installation.

Create and activate the perceiver-io conda environment:

```shell
conda env create -f environment.yml
conda activate perceiver-io
```

Install main and test dependencies, including all extras:

```shell
# Without dependencies required for examples
poetry install --all-extras
```

If you want to run the examples locally, additionally use --with examples:

```shell
poetry install --all-extras --with examples
```

Docker image

```shell
docker pull ghcr.io/krasserm/perceiver-io:latest
```

See Docker image for details.

Getting started

Inference

Optical flow

Compute the optical flow between consecutive frames of an input video and write the rendered results to an output video:

```python
from urllib.request import urlretrieve
from transformers import pipeline

from perceiver.data.vision import video_utils
from perceiver.model.vision import optical_flow  # register auto-classes and pipeline

urlretrieve(
    url="https://martin-krasser.com/perceiver/flow/sintel_clip_cave_dragon_fight.mp4",
    filename="sintel_clip_cave_dragon_fight.mp4",
)

# Create optical flow pipeline
optical_flow_pipeline = pipeline("optical-flow", model="krasserm/perceiver-io-optical-flow", device="cuda:0")

# load consecutive video frame pairs
frame_pairs = video_utils.read_video_frame_pairs("sintel_clip_cave_dragon_fight.mp4")

# create and render optical flow for all frame pairs
optical_flows = optical_flow_pipeline(frame_pairs, render=True, device="cuda:0")

# create video with rendered optical flows
video_utils.write_video("sintel_clip_cave_dragon_fight_output.mp4", optical_flows, fps=24)
```

Here is a side-by-side comparison of the input and output video:

optical-flow-sbs

Symbolic audio generation

Create audio sequences by generating symbolic (MIDI) audio data and rendering the generated audio symbols to WAV output with fluidsynth (note: fluidsynth must be installed for the following example to work):

```python
from transformers import pipeline
from pretty_midi import PrettyMIDI
from perceiver.model.audio import symbolic  # auto-class registration

repo_id = "krasserm/perceiver-ar-sam-giant-midi"

prompt = PrettyMIDI("prompt.mid")
audio_generator = pipeline("symbolic-audio-generation", model=repo_id)

output = audio_generator(prompt, max_new_tokens=64, num_latents=1, do_sample=True, top_p=0.95, temperature=1.0, render=True)

with open("generated_audio.wav", "wb") as f:
    f.write(output["generated_audio_wav"])
```

Examples of generated audio sequences are available on the 🤗 hub.

See inference examples for more examples.

Training

Train a small Perceiver IO image classifier (907K parameters) on MNIST from the command line. The classifier repeatedly cross-attends to the individual pixels of input images. See the image classification training example for more details.

```shell
python -m perceiver.scripts.vision.image_classifier fit \
  --model.num_latents=32 \
  --model.num_latent_channels=128 \
  --model.encoder.num_frequency_bands=32 \
  --model.encoder.num_cross_attention_layers=2 \
  --model.encoder.num_self_attention_blocks=3 \
  --model.encoder.num_self_attention_layers_per_block=3 \
  --model.encoder.first_self_attention_block_shared=false \
  --model.encoder.dropout=0.1 \
  --model.encoder.init_scale=0.1 \
  --model.decoder.num_output_query_channels=128 \
  --model.decoder.dropout=0.1 \
  --model.decoder.init_scale=0.1 \
  --data=MNISTDataModule \
  --data.batch_size=64 \
  --optimizer=AdamW \
  --optimizer.lr=1e-3 \
  --lr_scheduler.warmup_steps=500 \
  --trainer.accelerator=gpu \
  --trainer.devices=1 \
  --trainer.max_epochs=30 \
  --trainer.logger=TensorBoardLogger \
  --trainer.logger.save_dir=logs \
  --trainer.logger.name=logs
```
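The latent cross-attention used by this classifier can be sketched with NumPy. This is a shapes-only illustration of why compute scales with the number of latents rather than the number of pixels; real Perceiver layers add learned projections, multiple heads, frequency encodings and residual connections:

```python
import numpy as np

# A small latent array cross-attends to encoded MNIST pixels:
# 32 latents, 128 channels, 28x28 = 784 input elements (one per pixel).
rng = np.random.default_rng(0)

num_latents, channels = 32, 128
num_pixels = 28 * 28

latents = rng.standard_normal((num_latents, channels))  # queries
inputs = rng.standard_normal((num_pixels, channels))    # keys/values (encoded pixels)

def cross_attention(q, kv):
    scores = q @ kv.T / np.sqrt(q.shape[-1])            # (32, 784) attention scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over pixels
    return weights @ kv                                 # (32, 128) updated latents

# "Repeated" cross-attention: the latent array re-attends to the same inputs
# (num_cross_attention_layers=2 in the command above).
for _ in range(2):
    latents = cross_attention(latents, inputs)

print(latents.shape)  # (32, 128)
```

The attention matrix is num_latents x num_pixels rather than num_pixels x num_pixels, which is what makes attending to raw pixels tractable.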

Model construction describes how to implement model-specific command line interfaces with the Lightning CLI. Training checkpoints are written to the logs/img_clf/version_0/checkpoints directory. Assuming a checkpoint with filename epoch=025-val_loss=0.065.ckpt exists, it can be converted to a perceiver-io 🤗 model with

```python
from perceiver.model.vision.image_classifier import convert_mnist_classifier_checkpoint

convert_mnist_classifier_checkpoint(
    save_dir="example/mnist-classifier",
    ckpt_url="logs/img_clf/version_0/checkpoints/epoch=025-val_loss=0.065.ckpt",
)
```

so that it can be used in a 🤗 image classification pipeline

```python
from datasets import load_dataset
from transformers import pipeline

mnist_dataset = load_dataset("mnist", split="test")[:9]

images = mnist_dataset["image"]
labels = mnist_dataset["label"]

classifier = pipeline("image-classification", model="example/mnist-classifier")
predictions = [pred[0]["label"] for pred in classifier(images)]

print(f"Labels: {labels}")            # Labels: [7, 2, 1, 0, 4, 1, 4, 9, 5]
print(f"Predictions: {predictions}")  # Predictions: [7, 2, 1, 0, 4, 1, 4, 9, 5]
```

or loaded directly:

```python
import torch
from transformers import AutoModelForImageClassification, AutoImageProcessor

model = AutoModelForImageClassification.from_pretrained("example/mnist-classifier")
processor = AutoImageProcessor.from_pretrained("example/mnist-classifier")

inputs = processor(images, return_tensors="pt")

with torch.no_grad():
    # use perceiver-io Hugging Face model
    output_1 = model(**inputs).logits

with torch.no_grad():
    # or use perceiver-io backend model directly
    output_2 = model.backend_model(inputs.pixel_values)

print(f"Predictions: {output_1.argmax(dim=-1).numpy().tolist()}")  # [7, 2, 1, 0, 4, 1, 4, 9, 5]
print(f"Predictions: {output_2.argmax(dim=-1).numpy().tolist()}")  # [7, 2, 1, 0, 4, 1, 4, 9, 5]
```

See training examples for more examples.

Articles

Articles referencing this repository:

Other implementations

Owner

  • Name: Martin Krasser
  • Login: krasserm
  • Kind: user
  • Location: Vienna, Austria

Freelance machine learning engineer.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "A PyTorch implementation of Perceiver, Perceiver IO and Perceiver AR with PyTorch Lightning scripts for distributed training."
authors:
  - family-names: "Krasser"
    given-names: "Martin"
  - family-names: "Stumpf"
    given-names: "Christoph"
version: 0.11.1
doi: 10.5281/zenodo.10451410
date-released: 2024-01-02
url: "https://github.com/krasserm/perceiver-io"
license: "Apache-2.0"

GitHub Events

Total
  • Issues event: 1
  • Watch event: 50
  • Issue comment event: 2
  • Fork event: 3
Last Year
  • Issues event: 1
  • Watch event: 50
  • Issue comment event: 2
  • Fork event: 3

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 186
  • Total Committers: 3
  • Avg Commits per committer: 62.0
  • Development Distribution Score (DDS): 0.145
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Martin Krasser k****m@g****m 159
Jirka j****c@s****z 21
Christoph Stumpf s****h@g****m 6
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 32
  • Total pull requests: 27
  • Average time to close issues: 16 days
  • Average time to close pull requests: 3 days
  • Total issue authors: 15
  • Total pull request authors: 4
  • Average comments per issue: 1.03
  • Average comments per pull request: 1.07
  • Merged pull requests: 23
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 2.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • krasserm (18)
  • jiveshkalra (1)
  • Colezwhy (1)
  • 0scarJ1ang (1)
  • awarebayes (1)
  • je-santos (1)
  • wscffaa (1)
  • martin-minovski (1)
  • ajv012 (1)
  • kl2005ad (1)
  • zhangyuygss (1)
  • Norooa (1)
  • batrlatom (1)
  • ArianKhorasani (1)
Pull Request Authors
  • krasserm (12)
  • cstub (7)
  • Borda (6)
  • mattsta (1)
Top Labels
Issue Labels
bug (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 328 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 17
  • Total maintainers: 2
pypi.org: perceiver-io

Perceiver IO

  • Versions: 17
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 328 Last month
Rankings
Stargazers count: 3.4%
Forks count: 6.6%
Dependent packages count: 10.1%
Average: 10.5%
Downloads: 11.0%
Dependent repos count: 21.6%
Maintainers (2)
Last synced: 6 months ago

Dependencies

poetry.lock pypi
  • atomicwrites 1.4.0 develop
  • cfgv 3.3.1 develop
  • coverage 6.4 develop
  • distlib 0.3.4 develop
  • identify 2.5.1 develop
  • iniconfig 1.1.1 develop
  • invoke 1.7.1 develop
  • nodeenv 1.6.0 develop
  • platformdirs 2.5.2 develop
  • pluggy 1.0.0 develop
  • pre-commit 2.19.0 develop
  • py 1.11.0 develop
  • pytest 7.1.2 develop
  • pytest-cov 3.0.0 develop
  • toml 0.10.2 develop
  • tomli 2.0.1 develop
  • virtualenv 20.14.1 develop
  • absl-py 1.0.0
  • aiohttp 3.8.1
  • aiosignal 1.2.0
  • async-timeout 4.0.2
  • asynctest 0.13.0
  • attrs 21.4.0
  • cachetools 5.1.0
  • certifi 2022.5.18.1
  • charset-normalizer 2.0.12
  • colorama 0.4.4
  • datasets 2.2.2
  • dill 0.3.4
  • docstring-parser 0.14.1
  • einops 0.4.1
  • fairscale 0.4.6
  • filelock 3.7.0
  • frozenlist 1.3.0
  • fsspec 2022.5.0
  • google-auth 2.6.6
  • google-auth-oauthlib 0.4.6
  • grpcio 1.46.3
  • huggingface-hub 0.7.0
  • idna 3.3
  • importlib-metadata 4.11.4
  • jsonargparse 4.7.3
  • lightning-bolts 0.5.0
  • markdown 3.3.7
  • multidict 6.0.2
  • multiprocess 0.70.12.2
  • numpy 1.21.1
  • oauthlib 3.2.0
  • packaging 21.3
  • pandas 1.1.5
  • pillow 9.1.1
  • protobuf 3.20.1
  • pyarrow 8.0.0
  • pyasn1 0.4.8
  • pyasn1-modules 0.2.8
  • pydeprecate 0.3.2
  • pyparsing 3.0.9
  • python-dateutil 2.8.2
  • pytorch-lightning 1.6.3
  • pytorch-ranger 0.1.1
  • pytz 2022.1
  • pyyaml 6.0
  • regex 2022.4.24
  • requests 2.27.1
  • requests-oauthlib 1.3.1
  • responses 0.18.0
  • rsa 4.8
  • six 1.16.0
  • tensorboard 2.9.0
  • tensorboard-data-server 0.6.1
  • tensorboard-plugin-wit 1.8.1
  • tokenizers 0.12.1
  • torch 1.11.0
  • torch-optimizer 0.3.0
  • torchmetrics 0.8.2
  • torchvision 0.12.0
  • tqdm 4.64.0
  • transformers 4.19.2
  • typing-extensions 4.2.0
  • urllib3 1.26.9
  • werkzeug 2.1.2
  • xxhash 3.0.0
  • yarl 1.7.2
  • zipp 3.8.0
pyproject.toml pypi
  • invoke ^1.6.0 develop
  • pre-commit ^2.17.0 develop
  • pytest ^7.0.1 develop
  • pytest-cov ^3.0.0 develop
  • datasets 2.2.*
  • einops 0.4.*
  • fairscale 0.4.*
  • jsonargparse 4.7.*
  • lightning-bolts 0.5.*
  • python ^3.7
  • pytorch-lightning 1.6.*
  • tokenizers 0.12.*
  • torch 1.11.*
  • torch-optimizer 0.3.*
  • torchmetrics 0.8.*
  • torchvision 0.12.*
  • transformers 4.19.*
.github/workflows/ci_install-pkg.yml actions
  • abatilo/actions-poetry v2.0.0 composite
  • actions/checkout v3 composite
  • actions/checkout master composite
  • actions/setup-python v2 composite
.github/workflows/ci_testing.yml actions
  • abatilo/actions-poetry v2.0.0 composite
  • actions/cache v2 composite
  • actions/checkout v3 composite
  • actions/setup-python v2 composite
Dockerfile docker
  • pytorch/pytorch 1.12.1-cuda11.3-cudnn8-runtime build
.github/workflows/docker.yml actions
  • actions/checkout v3 composite
  • docker/build-push-action v4 composite
  • docker/login-action v2 composite
environment.yml pypi