multi-variate-parallel-transformer

Repository for paper "A foundation model with multi-variate parallel attention to generate neuronal activity"

https://github.com/ibm/multi-variate-parallel-transformer

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.0%) to scientific vocabulary
Last synced: 6 months ago

Repository

Repository for paper "A foundation model with multi-variate parallel attention to generate neuronal activity"

Basic Info
  • Host: GitHub
  • Owner: IBM
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 59.6 KB
Statistics
  • Stars: 0
  • Watchers: 3
  • Forks: 0
  • Open Issues: 2
  • Releases: 0
Created about 1 year ago · Last pushed 8 months ago
Metadata Files
  • Readme
  • License
  • Citation

README.md

MVPFormer: A foundation model with multi-variate parallel attention to generate neuronal activity


MVPFormer is a foundation model trained and tested on almost 10,000 hours of iEEG recordings. It can do next-state prediction and, with the addition of classification heads, can also detect seizures.

If your GPU has compute capability 8.0 or higher, i.e., it is Ampere or later, MVPFormer will automatically use the optimised Flash-MVPA; otherwise, it falls back to the slower and more memory-hungry PyTorch MVPA implementation. Inference without Flash-MVPA is supported, but training without Flash-MVPA is not recommended.
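The capability check above can be sketched as a small helper. This is a hypothetical illustration, not code from the repository: the function name is invented, and only the threshold (compute capability 8.0, i.e., Ampere) comes from the text.

```python
# Hypothetical helper (not part of the repository) illustrating the
# compute-capability rule described above: Flash-MVPA requires an
# Ampere-or-later GPU, i.e., compute capability >= 8.0.
def supports_flash_mvpa(major: int, minor: int) -> bool:
    """Return True if a GPU with this compute capability can run Flash-MVPA."""
    return (major, minor) >= (8, 0)

# On a CUDA machine, the capability tuple could be obtained with
# torch.cuda.get_device_capability(), e.g.:
#   major, minor = torch.cuda.get_device_capability()
print(supports_flash_mvpa(8, 0))  # Ampere (e.g. A100): True
print(supports_flash_mvpa(7, 5))  # Turing (e.g. T4): False
```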

Prepare the environment

To prepare the environment for running MVPFormer, you need a mix of pip packages and compilation from source.

Pip

The requirements.txt file is provided in the repository. Simply install all requirements with pip install -r requirements.txt.

DeepSpeed

You have to compile DeepSpeed manually to activate some necessary extensions. The procedure can vary based on your software and hardware stack; here we report our reference installation steps.

```bash
DS_BUILD_FUSED_ADAM=1 DS_BUILD_FUSED_LAMB=1 pip install --no-cache-dir deepspeed --global-option="build_ext" --global-option="-j8"
```

Inference with MVPFormer

We use PyTorch Lightning to distribute reproducible configuration files for our experiments. The example testing configuration file can be found in the configs folder. You can start testing with:

```bash
python main.py test --config configs/mvpformer_classification.yaml --model.init_args.base_model '<base_checkpoint_path>' --model.init_args.head_model '<head_checkpoint_path>' --data.init_args.folder '<dataset_path>' --data.init_args.test_patients ['<dataset_subject>']
```
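For orientation, the command-line overrides above correspond to fields in the Lightning config. The fragment below is a hypothetical sketch inferred only from those override names; the authoritative version is configs/mvpformer_classification.yaml in the repository, and the paths and subject name are placeholders.

```yaml
# Hypothetical sketch of the fields overridden on the command line above;
# see configs/mvpformer_classification.yaml for the real configuration.
model:
  init_args:
    base_model: /path/to/base_checkpoint.ckpt
    head_model: /path/to/head_checkpoint.ckpt
data:
  init_args:
    folder: /path/to/dataset
    test_patients: [subject_01]
```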

Training MVPFormer

We use PyTorch Lightning to distribute reproducible configuration files for our experiments. The example configuration file can be found in the configs folder. You can start training with:

```bash
python main.py fit --config configs/mvpformer_classification.yaml --model.init_args.base_model '<base_checkpoint_path>' --model.init_args.head_model '<head_checkpoint_path>' --data.init_args.folder '<dataset_path>' --data.init_args.train_patients ['<dataset_subject>']
```

The example parameters are equivalent to those we used to train MVPFormer, except for the hardware setup, such as the number of GPUs and the number of CPU workers.

Dataset

The SWEC iEEG dataset can be found at this repository hosted by the Hospital of Bern.

Checkpoints

The checkpoints can be downloaded from this location. The checkpoints whose names contain base are the base models with only generative pre-training; the swec models are the classification heads.

Disclaimer

This software may only be used for research. For other applications any liability is denied. In particular, the software must not be used for diagnostic purposes.

Citation

```bibtex
@article{carzaniga2025foundation,
  title={A foundation model with multi-variate parallel attention to generate neuronal activity},
  author={Carzaniga, Francesco and Hersche, Michael and Sebastian, Abu and Schindler, Kaspar and Rahimi, Abbas},
  journal={arXiv preprint arXiv:2506.20354},
  year={2025}
}
```

License

If you would like to see the detailed LICENSE, click here.

```text

Copyright IBM Corp. 2024 - 2025

SPDX-License-Identifier: Apache-2.0

```

Owner

  • Name: International Business Machines
  • Login: IBM
  • Kind: organization
  • Email: awesome@ibm.com
  • Location: United States of America

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Carzaniga"
  given-names: "Francesco"
  orcid: "https://orcid.org/0009-0001-2727-6248"
- family-names: "Hersche"
  given-names: "Michael"
  orcid: "https://orcid.org/0000-0003-3065-7639"
- family-names: "Sebastian"
  given-names: "Abu"
  orcid: "https://orcid.org/0000-0001-5603-5243"
- family-names: "Schindler"
  given-names: "Kaspar"
  orcid: "https://orcid.org/0000-0002-2387-7767"
- family-names: "Rahimi"
  given-names: "Abbas"
  orcid: "https://orcid.org/0000-0003-3141-4970"
title: "A foundation model with multi-variate parallel attention to generate neuronal activity"
url: "https://github.com/IBM/multi-variate-parallel-transformer"
preferred-citation:
  type: article
  authors:
  - family-names: "Carzaniga"
    given-names: "Francesco"
    orcid: "https://orcid.org/0009-0001-2727-6248"
  - family-names: "Hersche"
    given-names: "Michael"
    orcid: "https://orcid.org/0000-0003-3065-7639"
  - family-names: "Sebastian"
    given-names: "Abu"
    orcid: "https://orcid.org/0000-0001-5603-5243"
  - family-names: "Schindler"
    given-names: "Kaspar"
    orcid: "https://orcid.org/0000-0002-2387-7767"
  - family-names: "Rahimi"
    given-names: "Abbas"
    orcid: "https://orcid.org/0000-0003-3141-4970"
  journal: "arXiv preprint arXiv:2506.20354"
  title: "A foundation model with multi-variate parallel attention to generate neuronal activity"
  year: 2025

GitHub Events

Total
  • Watch event: 5
  • Issue comment event: 1
  • Push event: 2
  • Fork event: 1
Last Year
  • Watch event: 5
  • Issue comment event: 1
  • Push event: 2
  • Fork event: 1