https://github.com/beegass/state-spaces
Sequence Modeling with Structured State Spaces
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.8%) to scientific vocabulary
Last synced: 9 months ago
Repository
Sequence Modeling with Structured State Spaces
Basic Info
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of state-spaces/s4
Created over 3 years ago · Last pushed over 3 years ago
https://github.com/BeeGass/state-spaces/blob/main/
# Structured State Spaces for Sequence Modeling

This repository provides implementations and experiments for the following papers.

## S4D

> **On the Parameterization and Initialization of Diagonal State Space Models**\
> Albert Gu, Ankit Gupta, Karan Goel, Christopher Ré\
> Paper: https://arxiv.org/abs/2206.11893

Other variants including [DSS](https://github.com/ag1988/dss) and [GSS](https://arxiv.org/abs/2206.13947) are also supported. DSS is the predecessor to S4D and is also available in its own [fork](https://github.com/ag1988/dss).

## HTTYH

> **How to Train Your HiPPO: State Spaces with Generalized Orthogonal Basis Projections**\
> Albert Gu\*, Isys Johnson\*, Aman Timalsina, Atri Rudra, Christopher Ré\
> Paper: https://arxiv.org/abs/2206.12037

## SaShiMi (ICML 2022 - Long Talk)

> **It's Raw! Audio Generation with State-Space Models**\
> Karan Goel, Albert Gu, Chris Donahue, Christopher Ré\
> Paper: https://arxiv.org/abs/2202.09729

## S4 (ICLR 2022 - Outstanding Paper HM)

> **Efficiently Modeling Long Sequences with Structured State Spaces**\
> Albert Gu, Karan Goel, Christopher Ré\
> Paper: https://arxiv.org/abs/2111.00396

## LSSL (NeurIPS 2021)

> **Combining Recurrent, Convolutional, and Continuous-time Models with the Linear State Space Layer**\
> Albert Gu, Isys Johnson, Karan Goel, Khaled Saab, Tri Dao, Atri Rudra, Christopher Ré\
> Paper: https://arxiv.org/abs/2110.13985

## HiPPO (NeurIPS 2020 - Spotlight)

> **HiPPO: Recurrent Memory with Optimal Polynomial Projections**\
> Albert Gu\*, Tri Dao\*, Stefano Ermon, Atri Rudra, Christopher Ré\
> Paper: https://arxiv.org/abs/2008.07669

## Table of Contents

Setting up the environment and porting S4 to external codebases:
- [Setup](#setup)
- [Getting Started with S4](#getting-started-with-s4)

Reproducing experiments from the papers:
- [Experiments](#experiments)
- [SaShiMi](sashimi/)

Using this repository for training models:
- [Training](#training)
- [Generation](#generation)
- [Repository Structure](#overall-repository-structure)
- [READMEs](#readmes)
- [Citation](#citation)

### Changelog

See [CHANGELOG.md](CHANGELOG.md)

### Roadmap

- More documentation for training from scratch using this repository
- Compilation of S4 resources and implementations
- pip package

## Setup

### Requirements

This repository requires Python 3.8+ and PyTorch 1.10+. Other packages are listed in [requirements.txt](./requirements.txt).

### Cauchy Kernel

A core operation of S4 is the "Cauchy kernel" described in the [paper](https://arxiv.org/abs/2111.00396). The operation itself is simple; a naive implementation is the function `cauchy_naive` in the [standalone S4 file](src/models/s4/s4.py). However, as the paper describes, the naive version has suboptimal memory usage, which currently requires a custom kernel to overcome in PyTorch.

Two more efficient methods are supported. The code will automatically detect if either of these is installed and call the appropriate kernel.

#### Custom CUDA Kernel

This version is faster but requires manual compilation for each machine environment. Run `python setup.py install` from the directory `extensions/cauchy/`.

#### Pykeops

This version is provided by the [pykeops library](https://www.kernel-operations.io/keops/python/installation.html). Installation usually works out of the box with `pip install pykeops cmake`; both packages are also listed in the requirements file.

## Getting Started with S4

### S4 Module

Self-contained files for the S4 layer and variants can be found in [src/models/s4/](./src/models/s4/), which includes instructions for calling the module.

See [notebooks/](notebooks/) for visualizations explaining some concepts behind HiPPO and S4.

### Example Train Script (External Usage)

[example.py](example.py) is a self-contained training script for MNIST and CIFAR that imports the standalone S4 file.
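As a reference point when porting S4 to another codebase, the Cauchy kernel described under Setup computes, for each evaluation point $z_i$, the sum $\sum_j v_j / (z_i - w_j)$. The following is a minimal dependency-free sketch of that arithmetic, not the repository's `cauchy_naive`; the function name `cauchy_naive_sketch` and the argument names `v`, `z`, `w` are assumptions made for illustration.

```python
def cauchy_naive_sketch(v, z, w):
    """Evaluate sum_j v[j] / (z_i - w[j]) for every z_i in z.

    v, w: equal-length sequences of complex numbers (weights and poles).
    z: sequence of complex evaluation points, none equal to a pole.
    Illustrative helper only; this is not the function shipped in the repository.
    """
    if len(v) != len(w):
        raise ValueError("v and w must have the same length")
    # One output per evaluation point, each a sum over all (weight, pole) pairs.
    return [sum(vj / (zi - wj) for vj, wj in zip(v, w)) for zi in z]


# Two poles on the negative real axis, evaluated at two points.
v = [1 + 0j, 2 + 0j]
w = [-1 + 0j, -2 + 0j]
z = [0j, 1j]
out = cauchy_naive_sketch(v, z, w)
print(out)  # first entry: 1/(0+1) + 2/(0+2) = 2
```

The sketch loops over every pair of points in plain Python, so it only shows what the faster kernels compute and is not meant for training.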
With the default settings, `python example.py` reaches 88% accuracy on sequential CIFAR with a very simple S4D model of 200k parameters. This script can be used as an example for using S4 in external repositories.

### Training with this Repository (Internal Usage)

This repository aims to provide a very flexible framework for training sequence models. Many models and datasets are supported.

Basic usage is `python -m train`, or equivalently
```
python -m train pipeline=mnist model=s4
```
which trains an S4 model on the Permuted MNIST dataset. This should reach around 90% accuracy after 1 epoch, which takes 1-3 minutes depending on the GPU.

More examples of using this repository can be found in [Experiments](#experiments) and [Training](#training).

### Optimizer Hyperparameters

One important feature of this codebase is support for parameters that require different optimizer hyperparameters. In particular, the SSM kernel is sensitive to the $(A, B)$ (and sometimes $\Delta$) parameters, so the learning rate on these parameters is sometimes lowered and the weight decay is always set to $0$.

See the method `register` in the model (e.g. [s4d.py](src/models/s4/s4d.py)) and the function `setup_optimizer` in the training script (e.g. [example.py](example.py)) for examples of how to implement this in external repos.

### HiPPO/S4 Visualizations

Figures from the HTTYH and S4D papers can be visualized from [notebooks/](notebooks/). These include [animations](notebooks/hippo_function_approximation.ipynb) of HiPPO and S4 that were used in various S4 talks. The animation code can also be found in a [.py file](src/models/hippo/visualizations.py) instead of a notebook.

## Experiments

Instructions for reproducing experiments from the papers can be found in [experiments.md](experiments.md).

### Data

Basic datasets are auto-downloaded, including MNIST, CIFAR, and Speech Commands. All logic for creating and loading datasets is in the [src/dataloaders](./src/dataloaders/) directory.
The README inside this subdirectory documents how to download and organize other datasets.

### Models

Models are defined in [src/models](src/models). See the README in this subdirectory for an overview.

## Training

The core training infrastructure of this repository is based on [PyTorch-Lightning](https://pytorch-lightning.readthedocs.io/en/latest/) with a configuration scheme based on [Hydra](https://hydra.cc/docs/intro/).

The main entrypoint is `train.py` and configs are found in `configs/`.

### Configs and Hyperparameters

Pre-defined configs for many end-to-end experiments are provided (see [experiments.md](experiments.md)).

Configs can also be easily modified through the command line. An example experiment is
```
python -m train pipeline=mnist dataset.permute=True model=s4 model.n_layers=3 model.d_model=128 model.norm=batch model.prenorm=True wandb=null
```
This uses the Permuted MNIST task with an S4 model with a specified number of layers, backbone dimension, and normalization type.

See [configs/README.md](configs/) for more detailed documentation about the configs.

#### Hydra

It is recommended to read the [Hydra documentation](https://hydra.cc/docs/intro/) to fully understand the configuration framework. For help launching specific experiments, please file an issue.

### Resuming

Each experiment will be logged to its own directory (generated by Hydra) of the form `./outputs//
Owner
- Name: Bryan
- Login: BeeGass
- Kind: user
- Location: Cambridge, MA
- Company: @USArmyResearchLab
- Website: onlygass.dev
- Twitter: BeeAGass
- Repositories: 14
- Profile: https://github.com/BeeGass
Research Engineer interested in SSMs