pytensor-federated

Distributed differentiable graph computation using PyTensor

https://github.com/michaelosthege/pytensor-federated

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 3 committers (33.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.7%) to scientific vocabulary

Keywords

federated-learning graph grpc pymc pytensor

Keywords from Contributors

optimizing-compiler mesh interpretability sequences projection interactive optim hacking network-simulation
Last synced: 6 months ago

Repository

Distributed differentiable graph computation using PyTensor

Basic Info
  • Host: GitHub
  • Owner: michaelosthege
  • License: agpl-3.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 153 KB
Statistics
  • Stars: 5
  • Watchers: 4
  • Forks: 0
  • Open Issues: 1
  • Releases: 3
Topics
federated-learning graph grpc pymc pytensor
Created over 3 years ago · Last pushed 6 months ago
Metadata Files
Readme License Citation

README.md


pytensor-federated

This package implements federated computing with PyTensor.

Using pytensor-federated, differentiable cost functions can be computed on federated nodes. Inputs and outputs are transmitted in binary via a bidirectional gRPC stream.

A client-side LogpGradOp is provided to conveniently embed federated compute operations in PyTensor graphs such as a PyMC model.

The example code fits a simple Bayesian linear regression to data that is "private" to the federated compute process.

Run each command in its own terminal:

    python demo_node.py

    python demo_model.py

Architecture

pytensor-federated is designed to be a very generalizable framework for federated computing with gRPC, but it comes with implementations for PyTensor, and specifically for use cases of Bayesian inference. This is reflected in the actual implementation, where the most basic gRPC service implementation -- the ArraysToArraysService -- is wrapped by a few implementation flavors, specifically for common use cases in Bayesian inference.

At the core, everything is built around an ArraysToArrays gRPC service, which takes any number of (NumPy) arrays as parameters, and returns any number of (NumPy) arrays as outputs. The arrays can have arbitrary dtype or shape, as long as the buffer interface is supported (meaning dtype=object doesn't work, but datetime dtypes are ok).
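To illustrate why plain-data dtypes work while dtype=object does not, here is a minimal sketch of turning an array into raw bytes plus metadata and back. The function names and the dict layout are hypothetical, chosen for illustration only. They are not the package's actual protobuf messages or wire format.

```python
import numpy as np


def encode_array(arr: np.ndarray) -> dict:
    """Flatten an array into raw bytes plus the metadata needed to rebuild it.

    Hypothetical helper for illustration; not part of pytensor-federated.
    """
    arr = np.asarray(arr)
    if arr.dtype.hasobject:
        # Object arrays store pointers to Python objects, not plain data,
        # so their raw memory is meaningless on another machine.
        raise ValueError("dtype=object arrays cannot be sent as raw bytes")
    return {
        "data": arr.tobytes(),       # C-ordered copy of the raw buffer
        "dtype": arr.dtype.str,      # e.g. "<f8" or "<M8[D]"
        "shape": tuple(arr.shape),
    }


def decode_array(msg: dict) -> np.ndarray:
    """Rebuild an array from the bytes, dtype string and shape."""
    flat = np.frombuffer(msg["data"], dtype=np.dtype(msg["dtype"]))
    # frombuffer returns a read-only view; copy to get a writable array.
    return flat.reshape(msg["shape"]).copy()
```

A round trip such as `decode_array(encode_array(x))` should reproduce `x` for numeric dtypes of any shape, whereas an object array is rejected up front.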

This ArraysToArraysService can be used to wrap arbitrary model functions, which makes it possible to run model simulations and MCMC/optimization on different machines. The protobuf files that specify the data types and gRPC interface can be compiled to other programming languages, such that the model implementation could be written in C++, while MCMC/optimization runs in Python.

For the Bayesian inference or optimization use case, it helps to first understand the inputs and outputs of the underlying computation graph. For example, parameter estimation with a differential equation model requires:

* observations to which the model should be fitted
* timepoints at which there were observations
* parameters (including initial states) theta, some of which are to be estimated

From timepoints and parameters theta, the model predicts trajectories. Together with observations, these predictions are fed into some kind of likelihood function, which produces a scalar log-likelihood as the output.

Different sub-graphs of this example could be wrapped by an ArraysToArraysService:

* [theta,] -> [log-likelihood,]
* [timepoints, theta] -> [trajectories,]
* [timepoints, observations, theta] -> [log-likelihood,]
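The three sub-graphs can be sketched as plain arrays-in, arrays-out functions. The example below assumes a toy exponential decay model (whose differential equation has a closed-form solution) and a normal likelihood with a fixed sigma. All function names and the model itself are hypothetical stand-ins, not code from this package.

```python
import numpy as np


def predict_trajectories(timepoints, theta):
    """[timepoints, theta] -> [trajectories,]"""
    y0, k = theta  # initial state and decay rate
    return [y0 * np.exp(-k * timepoints)]


def log_likelihood(timepoints, observations, theta, sigma=0.1):
    """[timepoints, observations, theta] -> [log-likelihood,]"""
    (trajectory,) = predict_trajectories(timepoints, theta)
    resid = observations - trajectory
    # Normal log-density summed over all observations.
    ll = -0.5 * np.sum((resid / sigma) ** 2 + np.log(2 * np.pi * sigma**2))
    return [np.asarray(ll)]


def make_private_logp(timepoints, observations):
    """Build a [theta,] -> [log-likelihood,] function.

    The data is captured in a closure, so it never has to leave the node.
    """

    def logp(theta):
        return log_likelihood(timepoints, observations, theta)

    return logp
```

Which variant to expose is a design choice: the first keeps both timepoints and observations private to the node, while the other two let the client supply more of the inputs.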

If the entire model is differentiable, one can even return gradients. For example, with a linear model: [slope, intercept] -> [LL, dLL_dslope, dLL_dintercept].
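As a minimal sketch of the linear-model case, the function below returns the log-likelihood together with its analytic gradients. It assumes a normal likelihood with known sigma. The function name and the way the data `x`, `y` are passed in are illustrative assumptions, not the package's API.

```python
import numpy as np


def linear_model_logp_grad(slope, intercept, x, y, sigma=1.0):
    """[slope, intercept] -> [LL, dLL_dslope, dLL_dintercept]

    x and y stand for the data held by the compute node.
    """
    mu = slope * x + intercept
    resid = y - mu
    # Normal log-likelihood summed over all data points.
    ll = -0.5 * np.sum((resid / sigma) ** 2) - len(x) * np.log(
        sigma * np.sqrt(2 * np.pi)
    )
    # Analytic derivatives of LL with respect to each parameter.
    dll_dslope = np.sum(resid * x) / sigma**2
    dll_dintercept = np.sum(resid) / sigma**2
    return [np.asarray(ll), np.asarray(dll_dslope), np.asarray(dll_dintercept)]
```

Returning the gradients alongside the value means a gradient-based sampler or optimizer on the client needs only one round trip per evaluation.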

The role of PyTensor here is purely technical: PyTensor is a graph computation framework that implements auto-differentiation. Wrapping the ArraysToArraysServiceClient in PyTensor Ops simply makes it easier to build more sophisticated compute graphs. PyTensor is also the computation backend for PyMC, which is the most popular framework for Bayesian inference in Python.

Installation & Contributing

    conda env create -f environment.yml

Additional dependencies are needed to compile the protobufs:

    conda install -c conda-forge libprotobuf-static
    pip install --pre betterproto[compiler]

    python protobufs/generate.py

Set up pre-commit for automated code style enforcement:

    pip install pre-commit
    pre-commit install

Owner

  • Name: Michael Osthege
  • Login: michaelosthege
  • Kind: user
  • Location: Germany
  • Company: Forschungszentrum Jülich GmbH

PhD student in bioprocess and laboratory automation, PyMC developer

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: pytensor-federated
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Michael
    family-names: Osthege
    email: michael.osthege@outlook.com
    orcid: 'https://orcid.org/0000-0002-2734-7624'
    affiliation: Forschungszentrum Jülich GmbH
repository-code: 'https://github.com/michaelosthege/pytensor-federated'
abstract: >-
  PyTensor-Federated is a package that extends PyTensor to
  perform differentiable graph computation across multiple
  machines.
keywords:
  - federated computing
  - pymc
  - pytensor
license: AGPL-3.0
commit: 09f3f1057998e038afb54ba0f6130a0c3b859172
version: 1.0.0
date-released: '2023-09-13'

GitHub Events

Total
  • Watch event: 1
  • Delete event: 8
  • Issue comment event: 3
  • Push event: 9
  • Pull request review event: 8
  • Pull request event: 18
  • Create event: 8
Last Year
  • Watch event: 1
  • Delete event: 8
  • Issue comment event: 3
  • Push event: 9
  • Pull request review event: 8
  • Pull request event: 18
  • Create event: 8

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 99
  • Total Committers: 3
  • Avg Commits per committer: 33.0
  • Development Distribution Score (DDS): 0.253
Past Year
  • Commits: 20
  • Committers: 2
  • Avg Commits per committer: 10.0
  • Development Distribution Score (DDS): 0.3
Top Committers
Name Email Commits
Michael Osthege m****e@o****m 74
dependabot[bot] 4****] 18
Michael Osthege m****e@f****e 7

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 19
  • Total pull requests: 58
  • Average time to close issues: about 1 month
  • Average time to close pull requests: 3 days
  • Total issue authors: 1
  • Total pull request authors: 2
  • Average comments per issue: 0.11
  • Average comments per pull request: 0.31
  • Merged pull requests: 55
  • Bot issues: 0
  • Bot pull requests: 32
Past Year
  • Issues: 0
  • Pull requests: 16
  • Average time to close issues: N/A
  • Average time to close pull requests: about 16 hours
  • Issue authors: 0
  • Pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.25
  • Merged pull requests: 14
  • Bot issues: 0
  • Bot pull requests: 14
Top Authors
Issue Authors
  • michaelosthege (19)
Pull Request Authors
  • dependabot[bot] (47)
  • michaelosthege (27)
Top Labels
Issue Labels
enhancement (9) good first issue (4) bug (3) question (1) help wanted (1)
Pull Request Labels
dependencies (47) github_actions (45) enhancement (10) bug (6) python (2) documentation (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 234 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 3
  • Total maintainers: 1
pypi.org: pytensor-federated

This package helps to reduce the amount of boilerplate code when creating Airflow DAGs from Python callables.

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 234 Last month
Rankings
Dependent packages count: 7.4%
Average: 38.2%
Dependent repos count: 68.9%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/pre-commit.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v4 composite
  • pre-commit/action v3.0.0 composite
.github/workflows/release.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v4 composite
.github/workflows/test.yml actions
  • actions/cache v3 composite
  • actions/checkout v4 composite
  • codecov/codecov-action v3.1.4 composite
  • conda-incubator/setup-miniconda v2 composite
pyproject.toml pypi
requirements.txt pypi
  • betterproto ==2.0.0b5
  • black *
  • isort *
  • nest-asyncio *
  • numpy *
  • psutil *
setup.py pypi
environment.yml pypi
  • betterproto ==2.0.0b6
  • pymc ==5.10.0