High-performance neural population dynamics modeling enabled by scalable computational infrastructure

High-performance neural population dynamics modeling enabled by scalable computational infrastructure - Published in JOSS (2023)

https://github.com/tnel-ucsd/autolfads-deploy

Science Score: 100.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 7 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
  • Institutional organization owner
    Organization tnel-ucsd has institutional domain (tnel.ucsd.edu)
  • JOSS paper metadata
    Published in Journal of Open Source Software

Scientific Fields

Artificial Intelligence and Machine Learning (Computer Science) - 40% confidence
Last synced: 4 months ago

Repository

Deployment strategies for AutoLFADS

Basic Info
  • Host: GitHub
  • Owner: TNEL-UCSD
  • License: other
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 15.1 MB
Statistics
  • Stars: 14
  • Watchers: 4
  • Forks: 2
  • Open Issues: 0
  • Releases: 2
Created over 3 years ago · Last pushed almost 2 years ago
Metadata Files
Readme · Contributing · License · Code of conduct · Citation

README.md

Scaling AutoLFADS


Introduction

This repository provides a set of solutions for running AutoLFADS in a wider variety of compute environments. This enables more users to take better advantage of the hardware available to them to perform computationally demanding hyperparameter sweeps.

We provide three options for different cluster configurations and encourage the user to select the one that best suits their needs:

  • Local Compute: users directly leverage a container image that bundles all of the AutoLFADS software dependencies and provides an entrypoint directly to the LFADS package. Interaction with this workflow is via YAML model configuration files and command line arguments.
  • Unmanaged Compute (Ray): users configure a Ray cluster and interact with the workflow by updating YAML model configurations, updating hyperparameter sweep scripts, and then running experiment code.
  • Managed Compute (KubeFlow): users interact with a KubeFlow service by providing an experiment specification that includes model configuration and hyperparameter sweep specifications, either as a YAML file or through a code-less UI-based workflow.

The solution matrix below provides a rough guide for identifying a suitable workflow:

|                       | Local Container | Ray       | KubeFlow      |
|-----------------------|-----------------|-----------|---------------|
| Number of Users       | 1               | 1-3       | >1            |
| Number of Jobs        | 1               | >1        | >1            |
| Preferred Interaction | CLI             | CLI       | CLI / UI      |
| Infrastructure        | Local           | Unmanaged | Managed/Cloud |
| Cost                  | $               | $ - $$    | $ - $$$       |

Details describing the AutoLFADS solutions and evaluation against the Neural Latents Benchmark datasets can be found in our paper.

Installation & Usage

Follow the appropriate guide below to run AutoLFADS on your target platform. We recommend copying the following files to your team's source control and modifying them as necessary to organize and execute custom experiments:

  • Model configuration file (e.g. examples/lorenz/data/config.yaml)
  • KubeFlow configuration file (e.g. examples/lorenz/kubeflow_job.yaml) or Ray run script (e.g. examples/lorenz/ray_run.py)
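For example, seeding a team experiment directory from the shipped Lorenz example might look like the following sketch; my-experiments/lorenz-sweep is an illustrative destination, while the source paths are the example files referenced above:

```bash
# Sketch: copy the shipped Lorenz example files into a project-specific experiment directory.
# "my-experiments/lorenz-sweep" is an illustrative name; adjust to your own layout.
mkdir -p my-experiments/lorenz-sweep
cp examples/lorenz/data/config.yaml  my-experiments/lorenz-sweep/config.yaml        # model configuration
cp examples/lorenz/kubeflow_job.yaml my-experiments/lorenz-sweep/kubeflow_job.yaml  # KubeFlow workflow
cp examples/lorenz/ray_run.py        my-experiments/lorenz-sweep/ray_run.py         # or: Ray workflow
```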

Container

Running LFADS in a container provides isolation from your host operating system and instead relies on a system-installed container runtime. This workflow is suitable for evaluating algorithm operation on small datasets or exploring specific model parameter changes. It is suitable for use on shared compute environments and other platforms where there is limited system package isolation.

Prerequisites: Container runtime (e.g. Docker - Linux / Mac / Windows, Podman - Linux / Mac / Windows, containerd - Linux / Windows) and the NVIDIA Container Toolkit (GPU only).
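Before proceeding, a quick sanity check of the prerequisites can save debugging time later. A minimal sketch using Docker; adapt the commands for other runtimes, and skip the GPU checks for CPU-only operation:

```bash
# Sketch: verify the container runtime and (optionally) GPU support are in place.
docker --version                 # container runtime is installed and on PATH
docker info | grep -i runtime    # available runtimes; "nvidia" should appear if the toolkit is configured (GPU only)
nvidia-smi                       # host NVIDIA driver is working (GPU only)
```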

Instructions are provided in Docker syntax, but can be easily adapted for other container runtimes.

  1. Specify latest for CPU operation and latest-gpu for GPU-compatible operation

     ```bash
     TAG=latest
     ```

  2. (OPTIONAL) Pull the docker image to your local machine. This step ensures you have the latest version of the image.

     ```bash
     docker pull ucsdtnel/autolfads:$TAG
     ```

  3. Browse to a directory that has access to your data and LFADS configuration file

     ```bash
     # The general structure should be as follows (names can be changed, just update the paths in the run parameters)
     # <my-data-directory>
     #   data/
     #     <data files>
     #     config.yaml (LFADS model parameter file)
     #   output/
     #     <location for generated outputs>
     cd <my-data-directory>
     ```

  4. Run LFADS (bash scripts provided in examples for convenience)

     ```bash
     # Docker flags
     #   --rm                   removes container resources on exit
     #   --runtime              specifies a non-default container runtime
     #   --gpus                 specifies which gpus to provide to the container
     #   -it                    start the container with interactive input and TTY
     #   -v <host>:<container>  mount a path from host to container
     #   $(pwd)                 expands the terminal working directory so you don't need to type a fully qualified path
     #
     # AutoLFADS overrides
     #   --data         location inside container with data
     #   --checkpoint   location inside container that maps to a host location to store model outputs
     #   --config-file  location inside container that contains training configuration
     #   KEY VALUE      command line overrides for training configuration

     # For CPU
     docker run --rm -it -v $(pwd):/share ucsdtnel/autolfads:$TAG \
       --data /share/data \
       --checkpoint /share/container_output \
       --config-file /share/data/config.yaml

     # For GPU (Note: the $TAG value should have a -gpu suffix)
     docker run --rm --runtime=nvidia --gpus='"device=0"' -it -v $(pwd):/share ucsdtnel/autolfads:$TAG \
       --data /share/data \
       --checkpoint /share/container_output \
       --config-file /share/data/config.yaml
     ```
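To avoid retyping these flags for each run, the command above can be wrapped in a small helper script. The sketch below is not part of the repository: run_autolfads.sh, the cpu/gpu argument, and the assumption that it is run from <my-data-directory> are illustrative, while the docker flags and AutoLFADS overrides are the ones documented above.

```bash
#!/usr/bin/env bash
# run_autolfads.sh -- hypothetical wrapper around the documented docker run command.
# Usage: ./run_autolfads.sh [cpu|gpu]   (run from <my-data-directory>)
set -eo pipefail

MODE="${1:-cpu}"

if [ "$MODE" = "gpu" ]; then
  TAG=latest-gpu
  GPU_FLAGS=(--runtime=nvidia --gpus='"device=0"')   # documented GPU flags
else
  TAG=latest
  GPU_FLAGS=()
fi

docker run --rm -it "${GPU_FLAGS[@]}" \
  -v "$(pwd)":/share \
  ucsdtnel/autolfads:$TAG \
  --data /share/data \
  --checkpoint /share/container_output \
  --config-file /share/data/config.yaml
```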

Ray

Running AutoLFADS using Ray enables scaling your processing jobs to many worker nodes in an ad-hoc cluster that you specify. This workflow is suitable for running on unmanaged or loosely managed compute resources (e.g. lab compute machines) where you have direct SSH access to the instances. It is also possible to use this workflow with VM-based cloud environments, as noted here.

Prerequisites: Conda

AutoLFADS Installation

  1. Clone the latest version of autolfads-tf2

     ```bash
     git clone git@github.com:snel-repo/autolfads-tf2.git
     ```

  2. Change the working directory to the newly cloned repository

     ```bash
     cd autolfads-tf2
     ```

  3. Create a new conda environment

     ```bash
     conda create --name autolfads-tf2 python=3.7
     ```

  4. Activate the environment

     ```bash
     conda activate autolfads-tf2
     ```

  5. Install GPU-specific packages

     ```bash
     conda install -c conda-forge cudatoolkit=10.0
     conda install -c conda-forge cudnn=7.6
     ```

  6. Install LFADS

     ```bash
     python3 -m pip install -e lfads-tf2
     ```

  7. Install the LFADS Ray Tune component

     ```bash
     python3 -m pip install -e tune-tf2
     ```

  8. Modify ray/ray_cluster_template.yaml with the appropriate information. Note, you will need to fill in values for all <...> stubs.
  9. Modify ray/run_pbt.py with the desired hyperparameter exploration configuration
  10. Set the SINGLE_MACHINE variable in ray/run_pbt.py to False
  11. Run AutoLFADS

     ```bash
     python3 ray/run_pbt.py
     ```
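For convenience, the installation steps above can be collected into a single script. The following is a sketch that simply replays the documented commands; the only additions are --yes flags and the standard conda shell hook needed to activate an environment from inside a script.

```bash
#!/usr/bin/env bash
# Sketch: consolidated AutoLFADS (Ray) environment setup, replaying the steps above.
set -euo pipefail

git clone git@github.com:snel-repo/autolfads-tf2.git
cd autolfads-tf2

# Create and activate the conda environment (Python 3.7, per the steps above).
conda create --yes --name autolfads-tf2 python=3.7
# 'conda activate' inside a script requires sourcing conda's shell hook first.
source "$(conda info --base)/etc/profile.d/conda.sh"
conda activate autolfads-tf2

# GPU-specific packages (skip on CPU-only machines).
conda install --yes -c conda-forge cudatoolkit=10.0
conda install --yes -c conda-forge cudnn=7.6

# Install LFADS and the Ray Tune component in editable mode.
python3 -m pip install -e lfads-tf2
python3 -m pip install -e tune-tf2
```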

KubeFlow

Running AutoLFADS using KubeFlow enables scaling your experiments across an entire cluster. This workflow allows for isolated multi-user utilization and is ideal for running on managed infrastructure (e.g. University, public or private cloud) or on service-oriented clusters (i.e. no direct access to compute instances). It leverages industry standard tooling and enables scalable compute workflows beyond AutoLFADS for groups looking to adopt a framework for scalable machine learning.

If you are using a cloud provider, KubeFlow provides a series of tutorials to get you set up with a completely configured install. We currently require a feature that was introduced in Katib 0.14. The installation below provides a pathway for installing KubeFlow on a vanilla Kubernetes cluster, integrating the noted changes.

Prerequisites: Kubernetes cluster access and Ansible (installed locally; only needed when deploying KubeFlow)

  1. Install Istio if your cluster does not yet have it

     ```bash
     ansible-playbook istio.yml --extra-vars "run_option=install"
     ```

  2. Install the NFS Storage Controller (if you need an RWX storage driver)

     ```bash
     ansible-playbook nfs_storage_class.yml --extra-vars "run_option=install"
     ```

  3. Install KubeFlow

     ```bash
     ansible-playbook kubeflow.yml --extra-vars "run_option=install"
     ```

  4. Use examples/lorenz/kubeflow_job.yaml as a template to specify a new job with the desired hyperparameter exploration configuration and AutoLFADS configuration. Refer to the dataset README for details on how to acquire and prepare the data.
  5. Run AutoLFADS

     ```bash
     kubectl create -f kubeflow_job.yaml
     ```

  6. (Optional) Start or monitor the job using the KubeFlow UI

     ```bash
     # Start a tunnel between your computer and the Kubernetes network if you did not add an ingress entry
     kubectl port-forward svc/istio-ingressgateway -n istio-system --address 0.0.0.0 8080:80
     # Browse to http://localhost:8080
     ```

  7. Results can be downloaded from the KubeFlow Volumes UI or directly from the data mount location.
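As an alternative to the UI, a submitted job can also be monitored from the command line. A minimal sketch, assuming kubeflow_job.yaml creates a Katib Experiment and that my-namespace is your KubeFlow profile namespace (both names, and the placeholders in angle brackets, are illustrative):

```bash
# Sketch: command-line monitoring, assuming the job is a Katib Experiment in your profile namespace.
kubectl get experiments -n my-namespace                          # list Katib experiments and their status
kubectl describe experiment <experiment-name> -n my-namespace    # trial counts, current best, conditions
kubectl get pods -n my-namespace                                 # worker pods launched for individual trials
kubectl logs <trial-pod-name> -n my-namespace                    # training logs from a specific trial
```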

Contributing

Find a bug? Built a new integration for AutoLFADS on your framework of choice? We'd love to hear about it and work with you to integrate your solution into this repository! Drop us an Issue or PR and we'd be happy to collaborate.

Citing

If you found this work helpful, please cite the following works:

@article{keshtkaran2021large,
  title     = {A large-scale neural network training framework for generalized estimation of single-trial population dynamics},
  author    = {Keshtkaran, Mohammad Reza and Sedler, Andrew R and Chowdhury, Raeed H and Tandon, Raghav and Basrai, Diya and Nguyen, Sarah L and Sohn, Hansem and Jazayeri, Mehrdad and Miller, Lee E and Pandarinath, Chethan},
  journal   = {BioRxiv},
  year      = {2021},
  publisher = {Cold Spring Harbor Laboratory}
}

@article{Patel2023,
  doi       = {10.21105/joss.05023},
  url       = {https://doi.org/10.21105/joss.05023},
  year      = {2023},
  publisher = {The Open Journal},
  volume    = {8},
  number    = {83},
  pages     = {5023},
  author    = {Aashish N. Patel and Andrew R. Sedler and Jingya Huang and Chethan Pandarinath and Vikash Gilja},
  title     = {High-performance neural population dynamics modeling enabled by scalable computational infrastructure},
  journal   = {Journal of Open Source Software}
}

Owner

  • Name: TNEL UCSD
  • Login: TNEL-UCSD
  • Kind: organization
  • Location: San Diego, California

Translational Neuroengineering Lab @ University of California, San Diego

JOSS Publication

High-performance neural population dynamics modeling enabled by scalable computational infrastructure
Published
March 21, 2023
Volume 8, Issue 83, Page 5023
Authors
Aashish N. Patel
Department of Electrical and Computer Engineering, University of California San Diego, United States of America, Institute for Neural Computation, University of California San Diego, United States of America
Andrew R. Sedler
Center for Machine Learning, Georgia Institute of Technology, United States of America, Department of Biomedical Engineering, Georgia Institute of Technology, United States of America
Jingya Huang
Department of Electrical and Computer Engineering, University of California San Diego, United States of America
Chethan Pandarinath
Center for Machine Learning, Georgia Institute of Technology, United States of America, Department of Biomedical Engineering, Georgia Institute of Technology, United States of America, Department of Neurosurgery, Emory University, United States of America, These authors contributed equally
Vikash Gilja
Department of Electrical and Computer Engineering, University of California San Diego, United States of America, These authors contributed equally
Editor
Elizabeth DuPre
Tags
autolfads kubeflow ray neuroscience

Citation (CITATION.cff)

cff-version: 1.2.0
title: >-
  Deployment strategies for scaling AutoLFADS to
  model neural population dynamics
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Aashish
    family-names: Patel
    affiliation: University of California San Diego
  - given-names: Andrew
    family-names: Sedler
    affiliation: Georgia Institute of Technology
  - given-names: Jingya
    family-names: Huang
    affiliation: University of California San Diego
  - given-names: Chethan
    family-names: Pandarinath
    affiliation: Georgia Institute of Technology
  - given-names: Vikash
    family-names: Gilja
    affiliation: University of California San Diego
identifiers:
  - type: doi
    value: 10.5281/zenodo.6786931

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 30
  • Total Committers: 3
  • Avg Commits per committer: 10.0
  • Development Distribution Score (DDS): 0.067
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
a9p 5****p 28
SophiaUCSD16 j****d@g****m 1
Andrew Sedler 3****9 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 7
  • Total pull requests: 19
  • Average time to close issues: 25 days
  • Average time to close pull requests: 5 days
  • Total issue authors: 2
  • Total pull request authors: 4
  • Average comments per issue: 0.14
  • Average comments per pull request: 0.37
  • Merged pull requests: 18
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • richford (4)
  • tachukao (3)
Pull Request Authors
  • a9p (17)
  • SophiaUCSD16 (1)
  • compwizk (1)
  • arsedler9 (1)

Dependencies

kubeflow/roles/kubeflow/files/kubeflow/manifests/apps/kfp-tekton/upstream/base/installs/multi-user/pipelines-profile-controller/requirements-dev.txt pypi
  • pytest * development
  • pytest-lazy-fixture * development
  • requests * development
kubeflow/roles/kubeflow/files/kubeflow/manifests/apps/pipeline/upstream/base/installs/multi-user/pipelines-profile-controller/requirements-dev.txt pypi
  • pytest * development
  • pytest-lazy-fixture * development
  • requests * development
.github/workflows/model-test.yml actions
  • actions/checkout v3 composite