High-performance neural population dynamics modeling enabled by scalable computational infrastructure

High-performance neural population dynamics modeling enabled by scalable computational infrastructure - Published in JOSS (2023)

https://github.com/tnel-ucsd/autolfads-deploy

Science Score: 100.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 7 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
  • Institutional organization owner
    Organization tnel-ucsd has institutional domain (tnel.ucsd.edu)
  • JOSS paper metadata
    Published in Journal of Open Source Software

Scientific Fields

Artificial Intelligence and Machine Learning (Computer Science) - 40% confidence
Last synced: 4 months ago

Repository

Deployment strategies for AutoLFADS

Basic Info
  • Host: GitHub
  • Owner: TNEL-UCSD
  • License: other
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 15.1 MB
Statistics
  • Stars: 14
  • Watchers: 4
  • Forks: 2
  • Open Issues: 0
  • Releases: 2
Created over 3 years ago · Last pushed almost 2 years ago
Metadata Files
Readme · Contributing · License · Code of conduct · Citation

README.md

Scaling AutoLFADS


Introduction

This repository provides a set of solutions for running AutoLFADS in a wider variety of compute environments. This enables more users to take better advantage of the hardware available to them to perform computationally demanding hyperparameter sweeps.

We provide three options for different cluster configurations and encourage the user to select the one that best suits their needs:

  • Local Compute: users directly leverage a container image that bundles all of the AutoLFADS software dependencies and provides an entrypoint directly to the LFADS package. Interaction with this workflow is via YAML model configuration files and command line arguments.
  • Unmanaged Compute (Ray): users configure a Ray cluster and interact with the workflow by updating YAML model configurations, updating hyperparameter sweep scripts, and then running experiment code.
  • Managed Compute (KubeFlow): users interact with a KubeFlow service by providing an experiment specification that includes model configuration and hyperparameter sweep specifications, either as a YAML file or through a code-less UI-based workflow.

The solution matrix below provides a rough guide for identifying a suitable workflow:

|                       | Local Container | Ray       | KubeFlow      |
|-----------------------|-----------------|-----------|---------------|
| Number of Users       | 1               | 1-3       | >1            |
| Number of Jobs        | 1               | >1        | >1            |
| Preferred Interaction | CLI             | CLI       | CLI / UI      |
| Infrastructure        | Local           | Unmanaged | Managed/Cloud |
| Cost                  | $               | $ - $$    | $ - $$$       |

Details describing the AutoLFADS solutions and evaluation against the Neural Latents Benchmark datasets can be found in our paper.

Installation & Usage

Follow the appropriate guide below to run AutoLFADS on your target platform. We recommend copying the following files to your team's source control and modifying them as necessary to organize and execute custom experiments:

  • Model configuration file (e.g. examples/lorenz/data/config.yaml)
  • KubeFlow configuration file (e.g. examples/lorenz/kubeflow_job.yaml) or Ray run script (e.g. examples/lorenz/ray_run.py)
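For example, seeding a team experiment directory from the shipped Lorenz example might look like the following sketch; my-experiments/lorenz-sweep is an illustrative destination, while the source paths are the example files referenced above:

```bash
# Sketch: copy the shipped Lorenz example files into a project-specific experiment directory.
# "my-experiments/lorenz-sweep" is an illustrative name; adjust to your own layout.
mkdir -p my-experiments/lorenz-sweep
cp examples/lorenz/data/config.yaml  my-experiments/lorenz-sweep/config.yaml        # model configuration
cp examples/lorenz/kubeflow_job.yaml my-experiments/lorenz-sweep/kubeflow_job.yaml  # KubeFlow workflow
cp examples/lorenz/ray_run.py        my-experiments/lorenz-sweep/ray_run.py         # or: Ray workflow
```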

Container

Running LFADS in a container provides isolation from your host operating system and instead relies on a system-installed container runtime. This workflow is suitable for evaluating algorithm operation on small datasets or exploring specific model parameter changes. It is suitable for use on shared compute environments and other platforms where there is limited system package isolation.

Prerequisites: Container runtime (e.g. Docker - Linux / Mac / Windows, Podman - Linux / Mac / Windows, containerd - Linux / Windows) and the NVIDIA Container Toolkit (GPU only).
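Before proceeding, a quick sanity check of the prerequisites can save debugging time later. A minimal sketch using Docker; adapt the commands for other runtimes, and skip the GPU checks for CPU-only operation:

```bash
# Sketch: verify the container runtime and (optionally) GPU support are in place.
docker --version                 # container runtime is installed and on PATH
docker info | grep -i runtime    # available runtimes; "nvidia" should appear if the toolkit is configured (GPU only)
nvidia-smi                       # host NVIDIA driver is working (GPU only)
```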

Instructions are provided in Docker syntax, but can be easily adapted for other container runtimes.

  1. Specify latest for CPU operation and latest-gpu for GPU-compatible operation

     ```bash
     TAG=latest
     ```

  2. (OPTIONAL) Pull the docker image to your local machine. This step ensures you have the latest version of the image.

     ```bash
     docker pull ucsdtnel/autolfads:$TAG
     ```

  3. Browse to a directory that has access to your data and LFADS configuration file

     ```bash
     # The general structure should be as follows (names can be changed, just update the paths in the run parameters)
     # <my-data-directory>
     #   data/
     #     <data files>
     #     config.yaml (LFADS model parameter file)
     #   output/
     #     <location for generated outputs>
     cd <my-data-directory>
     ```

  4. Run LFADS (bash scripts provided in examples for convenience)

     ```bash
     # Docker flags
     #   --rm                   removes container resources on exit
     #   --runtime              specifies a non-default container runtime
     #   --gpus                 specifies which gpus to provide to the container
     #   -it                    start the container with interactive input and TTY
     #   -v <host>:<container>  mount a path from host to container
     #   $(pwd)                 expands the terminal working directory so you don't need to type a fully qualified path
     #
     # AutoLFADS overrides
     #   --data         location inside container with data
     #   --checkpoint   location inside container that maps to a host location to store model outputs
     #   --config-file  location inside container that contains training configuration
     #   KEY VALUE      command line overrides for training configuration

     # For CPU
     docker run --rm -it -v $(pwd):/share ucsdtnel/autolfads:$TAG \
       --data /share/data \
       --checkpoint /share/container_output \
       --config-file /share/data/config.yaml

     # For GPU (Note: the $TAG value should have a -gpu suffix)
     docker run --rm --runtime=nvidia --gpus='"device=0"' -it -v $(pwd):/share ucsdtnel/autolfads:$TAG \
       --data /share/data \
       --checkpoint /share/container_output \
       --config-file /share/data/config.yaml
     ```
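To avoid retyping these flags for each run, the command above can be wrapped in a small helper script. The sketch below is not part of the repository: run_autolfads.sh, the cpu/gpu argument, and the assumption that it is run from <my-data-directory> are illustrative, while the docker flags and AutoLFADS overrides are the ones documented above.

```bash
#!/usr/bin/env bash
# run_autolfads.sh -- hypothetical wrapper around the documented docker run command.
# Usage: ./run_autolfads.sh [cpu|gpu]   (run from <my-data-directory>)
set -eo pipefail

MODE="${1:-cpu}"

if [ "$MODE" = "gpu" ]; then
  TAG=latest-gpu
  GPU_FLAGS=(--runtime=nvidia --gpus='"device=0"')   # documented GPU flags
else
  TAG=latest
  GPU_FLAGS=()
fi

docker run --rm -it "${GPU_FLAGS[@]}" \
  -v "$(pwd)":/share \
  ucsdtnel/autolfads:$TAG \
  --data /share/data \
  --checkpoint /share/container_output \
  --config-file /share/data/config.yaml
```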

Ray

Running AutoLFADS using Ray enables scaling your processing jobs to many worker nodes in an ad-hoc cluster that you specify. This workflow is suitable for running on unmanaged or loosely managed compute resources (e.g. lab compute machines) where you have direct SSH access to the instances. It is also possible to use this workflow with VM-based cloud environments, as noted here.

Prerequisites: Conda

AutoLFADS Installation

  1. Clone the latest version of autolfads-tf2

     ```bash
     git clone git@github.com:snel-repo/autolfads-tf2.git
     ```

  2. Change the working directory to the newly cloned repository

     ```bash
     cd autolfads-tf2
     ```

  3. Create a new conda environment

     ```bash
     conda create --name autolfads-tf2 python=3.7
     ```

  4. Activate the environment

     ```bash
     conda activate autolfads-tf2
     ```

  5. Install GPU-specific packages

     ```bash
     conda install -c conda-forge cudatoolkit=10.0
     conda install -c conda-forge cudnn=7.6
     ```

  6. Install LFADS

     ```bash
     python3 -m pip install -e lfads-tf2
     ```

  7. Install the LFADS Ray Tune component

     ```bash
     python3 -m pip install -e tune-tf2
     ```

  8. Modify ray/ray_cluster_template.yaml with the appropriate information. Note, you will need to fill in values for all <...> stubs.
  9. Modify ray/run_pbt.py with the desired hyperparameter exploration configuration
  10. Set the SINGLE_MACHINE variable in ray/run_pbt.py to False
  11. Run AutoLFADS

     ```bash
     python3 ray/run_pbt.py
     ```
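For convenience, the installation steps above can be collected into a single script. The following is a sketch that simply replays the documented commands; the only additions are --yes flags and the standard conda shell hook needed to activate an environment from inside a script.

```bash
#!/usr/bin/env bash
# Sketch: consolidated AutoLFADS (Ray) environment setup, replaying the steps above.
set -euo pipefail

git clone git@github.com:snel-repo/autolfads-tf2.git
cd autolfads-tf2

# Create and activate the conda environment (Python 3.7, per the steps above).
conda create --yes --name autolfads-tf2 python=3.7
# 'conda activate' inside a script requires sourcing conda's shell hook first.
source "$(conda info --base)/etc/profile.d/conda.sh"
conda activate autolfads-tf2

# GPU-specific packages (skip on CPU-only machines).
conda install --yes -c conda-forge cudatoolkit=10.0
conda install --yes -c conda-forge cudnn=7.6

# Install LFADS and the Ray Tune component in editable mode.
python3 -m pip install -e lfads-tf2
python3 -m pip install -e tune-tf2
```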

KubeFlow

Running AutoLFADS using KubeFlow enables scaling your experiments across an entire cluster. This workflow allows for isolated multi-user utilization and is ideal for running on managed infrastructure (e.g. University, public or private cloud) or on service-oriented clusters (i.e. no direct access to compute instances). It leverages industry standard tooling and enables scalable compute workflows beyond AutoLFADS for groups looking to adopt a framework for scalable machine learning.

If you are using a cloud provider, KubeFlow provides a series of tutorials to get you set up with a completely configured install. We currently require a feature that was introduced in Katib 0.14. The installation below provides a pathway for installing KubeFlow on a vanilla Kubernetes cluster, integrating the noted changes.

Prerequisites: Kubernetes cluster access and Ansible (installed locally; only needed when deploying KubeFlow)

  1. Install Istio if your cluster does not yet have it

     ```bash
     ansible-playbook istio.yml --extra-vars "run_option=install"
     ```

  2. Install the NFS Storage Controller (if you need an RWX storage driver)

     ```bash
     ansible-playbook nfs_storage_class.yml --extra-vars "run_option=install"
     ```

  3. Install KubeFlow

     ```bash
     ansible-playbook kubeflow.yml --extra-vars "run_option=install"
     ```

  4. Use examples/lorenz/kubeflow_job.yaml as a template to specify a new job with the desired hyperparameter exploration configuration and AutoLFADS configuration. Refer to the dataset README for details on how to acquire and prepare the data.
  5. Run AutoLFADS

     ```bash
     kubectl create -f kubeflow_job.yaml
     ```

  6. (Optional) Start or monitor the job using the KubeFlow UI

     ```bash
     # Start a tunnel between your computer and the Kubernetes network if you did not add an ingress entry
     kubectl port-forward svc/istio-ingressgateway -n istio-system --address 0.0.0.0 8080:80
     # Browse to http://localhost:8080
     ```

  7. Results can be downloaded from the KubeFlow Volumes UI or directly from the data mount location.
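As an alternative to the UI, a submitted job can also be monitored from the command line. A minimal sketch, assuming kubeflow_job.yaml creates a Katib Experiment and that my-namespace is your KubeFlow profile namespace (both names, and the placeholders in angle brackets, are illustrative):

```bash
# Sketch: command-line monitoring, assuming the job is a Katib Experiment in your profile namespace.
kubectl get experiments -n my-namespace                          # list Katib experiments and their status
kubectl describe experiment <experiment-name> -n my-namespace    # trial counts, current best, conditions
kubectl get pods -n my-namespace                                 # worker pods launched for individual trials
kubectl logs <trial-pod-name> -n my-namespace                    # training logs from a specific trial
```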

Contributing

Find a bug? Built a new integration for AutoLFADS on your framework of choice? We'd love to hear about it and work with you to integrate your solution into this repository! Drop us an Issue or PR and we'd be happy to collaborate.

Citing

If you found this work helpful, please cite the following works:

@article{keshtkaran2021large,
  title     = {A large-scale neural network training framework for generalized estimation of single-trial population dynamics},
  author    = {Keshtkaran, Mohammad Reza and Sedler, Andrew R and Chowdhury, Raeed H and Tandon, Raghav and Basrai, Diya and Nguyen, Sarah L and Sohn, Hansem and Jazayeri, Mehrdad and Miller, Lee E and Pandarinath, Chethan},
  journal   = {BioRxiv},
  year      = {2021},
  publisher = {Cold Spring Harbor Laboratory}
}

@article{Patel2023,
  doi       = {10.21105/joss.05023},
  url       = {https://doi.org/10.21105/joss.05023},
  year      = {2023},
  publisher = {The Open Journal},
  volume    = {8},
  number    = {83},
  pages     = {5023},
  author    = {Aashish N. Patel and Andrew R. Sedler and Jingya Huang and Chethan Pandarinath and Vikash Gilja},
  title     = {High-performance neural population dynamics modeling enabled by scalable computational infrastructure},
  journal   = {Journal of Open Source Software}
}

Owner

  • Name: TNEL UCSD
  • Login: TNEL-UCSD
  • Kind: organization
  • Location: San Diego, California

Translational Neuroengineering Lab @ University of California, San Diego

JOSS Publication

High-performance neural population dynamics modeling enabled by scalable computational infrastructure
Published
March 21, 2023
Volume 8, Issue 83, Page 5023
Authors
Aashish N. Patel
Department of Electrical and Computer Engineering, University of California San Diego, United States of America, Institute for Neural Computation, University of California San Diego, United States of America
Andrew R. Sedler
Center for Machine Learning, Georgia Institute of Technology, United States of America, Department of Biomedical Engineering, Georgia Institute of Technology, United States of America
Jingya Huang
Department of Electrical and Computer Engineering, University of California San Diego, United States of America
Chethan Pandarinath
Center for Machine Learning, Georgia Institute of Technology, United States of America, Department of Biomedical Engineering, Georgia Institute of Technology, United States of America, Department of Neurosurgery, Emory University, United States of America, These authors contributed equally
Vikash Gilja
Department of Electrical and Computer Engineering, University of California San Diego, United States of America, These authors contributed equally
Editor
Elizabeth DuPre
Tags
autolfads kubeflow ray neuroscience

Citation (CITATION.cff)

cff-version: 1.2.0
title: >-
  Deployment strategies for scaling AutoLFADS to
  model neural population dynamics
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Aashish
    family-names: Patel
    affiliation: University of California San Diego
  - given-names: Andrew
    family-names: Sedler
    affiliation: Georgia Institute of Technology
  - given-names: Jingya
    family-names: Huang
    affiliation: University of California San Diego
  - given-names: Chethan
    family-names: Pandarinath
    affiliation: Georgia Institute of Technology
  - given-names: Vikash
    family-names: Gilja
    affiliation: University of California San Diego
identifiers:
  - type: doi
    value: 10.5281/zenodo.6786931

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 30
  • Total Committers: 3
  • Avg Commits per committer: 10.0
  • Development Distribution Score (DDS): 0.067
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
a9p 5****p 28
SophiaUCSD16 j****d@g****m 1
Andrew Sedler 3****9 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 7
  • Total pull requests: 19
  • Average time to close issues: 25 days
  • Average time to close pull requests: 5 days
  • Total issue authors: 2
  • Total pull request authors: 4
  • Average comments per issue: 0.14
  • Average comments per pull request: 0.37
  • Merged pull requests: 18
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • richford (4)
  • tachukao (3)
Pull Request Authors
  • a9p (17)
  • SophiaUCSD16 (1)
  • compwizk (1)
  • arsedler9 (1)

Dependencies

kubeflow/roles/kubeflow/files/kubeflow/manifests/apps/kfp-tekton/upstream/base/installs/multi-user/pipelines-profile-controller/requirements-dev.txt pypi
  • pytest * development
  • pytest-lazy-fixture * development
  • requests * development
kubeflow/roles/kubeflow/files/kubeflow/manifests/apps/pipeline/upstream/base/installs/multi-user/pipelines-profile-controller/requirements-dev.txt pypi
  • pytest * development
  • pytest-lazy-fixture * development
  • requests * development
.github/workflows/model-test.yml actions
  • actions/checkout v3 composite