ExpFamilyPCA.jl

ExpFamilyPCA.jl: A Julia Package for Exponential Family Principal Component Analysis - Published in JOSS (2025)

https://github.com/sisl/expfamilypca.jl

Science Score: 100.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
  • Institutional organization owner
    Organization sisl has institutional domain (sisl.stanford.edu)
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

compression compression-algorithm denoising dimensionality-reduction epca exponential-family interpretability julia machine-learning pca principal-component-analysis reinforcement-learning signal-processing

Keywords from Contributors

pde meshing raytracer ode

Scientific Fields

Medicine Life Sciences - 40% confidence
Last synced: 4 months ago

Repository

A Julia package for exponential family principal component analysis (EPCA).

Basic Info
Statistics
  • Stars: 10
  • Watchers: 4
  • Forks: 2
  • Open Issues: 1
  • Releases: 8
Topics
compression compression-algorithm denoising dimensionality-reduction epca exponential-family interpretability julia machine-learning pca principal-component-analysis reinforcement-learning signal-processing
Created almost 2 years ago · Last pushed 4 months ago
Metadata Files
Readme License Citation

README.md

ExpFamilyPCA.jl


ExpFamilyPCA.jl is a Julia package for exponential family principal component analysis (EPCA), a versatile generalization of PCA designed to handle non-Gaussian data, enabling dimensionality reduction and data analysis across a wide variety of distributions (e.g., binary, count, and compositional data). It is designed for applications in machine learning (belief compression, text analysis), signal processing (denoising), and data science (sample debiasing, clustering, dimensionality reduction), but can be applied to other fields with diverse data types.

  • Website: https://sisl.github.io/ExpFamilyPCA.jl/dev/
  • Math: https://sisl.github.io/ExpFamilyPCA.jl/dev/math/intro/
  • API Documentation: https://sisl.github.io/ExpFamilyPCA.jl/dev/api/

Features

  • Implements exponential family PCA (EPCA)
  • Supports multiple exponential family distributions
  • Flexible constructors for custom distributions
  • Fast symbolic differentiation and optimization
  • Numerically stable scientific computation

Installation

To install the package, use the Julia package manager. In the Julia REPL, type:

```julia
using Pkg; Pkg.add("ExpFamilyPCA")
```

Supported Distributions

The following distributions are supported:

| Distribution              | Description                                  |
| ------------------------- | -------------------------------------------- |
| `BernoulliEPCA`           | For binary data                              |
| `BinomialEPCA`            | For count data with a fixed number of trials |
| `ContinuousBernoulliEPCA` | For probabilities between 0 and 1            |
| `GammaEPCA`               | For positive continuous data                 |
| `GaussianEPCA`            | Standard PCA for real-valued data            |
| `NegativeBinomialEPCA`    | For over-dispersed count data                |
| `ParetoEPCA`              | For heavy-tailed distributions               |
| `PoissonEPCA`             | For count and discrete distribution data     |
| `WeibullEPCA`             | For life data and survival analysis          |

Quickstart

Each EPCA object supports the following methods:

  • `fit!`: Trains the model and returns compressed training data.
  • `compress`: Compresses new input data.
  • `decompress`: Reconstructs original data from the compressed representation.

Example:

```julia
X = sample_from_poisson(n1, indim)
Y = sample_from_poisson(n2, indim)
epca = PoissonEPCA(indim, outdim)

X_compressed = fit!(epca, X)
Y_compressed = compress(epca, Y)
Y_reconstructed = decompress(epca, Y_compressed)
```

The sample_from_poisson function is a placeholder for generating random Poisson-distributed data. It is not implemented in the code snippet to maintain clarity and focus on the core functionality of the example. If you wish to implement it, you can use the Distributions.jl package. For instance, you could define it as:

```julia
using Distributions

# Draw an n × dim matrix of i.i.d. Poisson samples (default rate λ = 1).
function sample_from_poisson(n::Int, dim::Int)
    d = Poisson()
    return rand(d, n, dim)
end
```
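Putting the two snippets together, a complete run might look like the sketch below. The sizes `n1`, `n2`, `indim`, `outdim`, the seed, and the rate `λ = 3.0` are illustrative assumptions, not values from the package documentation. The shape comments reflect the assumption that rows are samples and columns are features.

```julia
using Distributions   # provides Poisson
using ExpFamilyPCA    # provides PoissonEPCA, fit!, compress, decompress
using Random

Random.seed!(0)

# Illustrative sizes (assumptions for this sketch).
n1, n2 = 100, 20        # training and held-out sample counts
indim, outdim = 10, 3   # original and compressed dimensions

# Rows are samples, columns are features; the rate λ is an arbitrary choice.
sample_from_poisson(n, dim; λ = 3.0) = rand(Poisson(λ), n, dim)

X = sample_from_poisson(n1, indim)
Y = sample_from_poisson(n2, indim)

epca = PoissonEPCA(indim, outdim)
X_compressed = fit!(epca, X)                       # fit and compress training data
Y_compressed = compress(epca, Y)                   # compress unseen data
Y_reconstructed = decompress(epca, Y_compressed)   # map back to the data space

# Assumed shapes: (n, indim) is compressed to (n, outdim) and decompressed back.
@show size(X_compressed) size(Y_compressed) size(Y_reconstructed)
```

Comparing `Y_reconstructed` against `Y` (for example, with a mean absolute error) gives a rough sense of how much information `outdim` components retain.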

Custom Distributions

When working with custom distributions, certain specifications are often more convenient and computationally efficient than others. For example, inducing the gamma EPCA objective from the log-partition $G(\theta) = -\log(-\theta)$ and its derivative $g(\theta) = -1/\theta$ is much simpler than implementing the full Itakura-Saito distance:

$$ D(P(\omega), \hat{P}(\omega)) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \Bigg[ \frac{P(\omega)}{\hat{P}(\omega)} - \log \frac{P(\omega)}{\hat{P}(\omega)} - 1 \Bigg] d\omega. $$

In ExpFamilyPCA.jl, we would write:

```julia
G(θ) = -log(-θ)
g(θ) = -1 / θ
gamma_epca = EPCA(indim, outdim, G, g, Val((:G, :g)); options = NegativeDomain())
```
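To see why the log-partition is enough, recall that the convex conjugate of $G$ generates a Bregman divergence. For the gamma family, that divergence should reduce pointwise to the Itakura-Saito integrand $p/q - \log(p/q) - 1$. The sketch below checks this numerically in plain Julia. The helpers `g_inv`, `F`, `f`, `bregman`, and `itakura_saito` are hypothetical names introduced for this check only and are not part of the package API.

```julia
# Log-partition of the gamma family and its derivative (as given above).
G(θ) = -log(-θ)   # defined for θ < 0
g(θ) = -1 / θ     # mean parameter μ = g(θ) > 0

# Hypothetical helpers for this check; they are not package functions.
g_inv(μ) = -1 / μ                    # inverse link: θ = g⁻¹(μ)
F(μ) = g_inv(μ) * μ - G(g_inv(μ))    # convex conjugate of G, equal to -1 - log(μ)
f(μ) = g_inv(μ)                      # F′(μ) coincides with the inverse link

# Bregman divergence generated by F.
bregman(p, q) = F(p) - F(q) - f(q) * (p - q)

# Pointwise Itakura-Saito integrand.
itakura_saito(p, q) = p / q - log(p / q) - 1

# The two expressions agree at a few arbitrary positive points.
for (p, q) in [(0.5, 2.0), (1.0, 1.0), (3.0, 0.7)]
    @assert isapprox(bregman(p, q), itakura_saito(p, q); atol = 1e-12)
end
```

Supplying only `G` and `g` therefore carries the same information as the closed-form distance, which is why the constructor above needs nothing more.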

A lengthier discussion of the EPCA constructors and math is provided in the documentation.

Contributing

Contributions are welcome! If you want to contribute, please fork the repository, create a new branch, and submit a pull request. Before contributing, please make sure to update tests as appropriate.

Citing

If ExpFamilyPCA.jl is useful in your research and you would like to acknowledge it, please cite this paper:

```bib
@article{bhamidipaty2025expfamilypca,
  title = {{ExpFamilyPCA}.jl: A {J}ulia Package for Exponential Family Principal Component Analysis},
  author = {Bhamidipaty, Logan Mondal and Kochenderfer, Mykel J. and Hastie, Trevor},
  journal = {Journal of Open Source Software},
  volume = {10},
  number = {105},
  pages = {7403},
  year = {2025},
  month = {jan},
  doi = {10.21105/joss.07403},
  url = {https://joss.theoj.org/papers/10.21105/joss.07403}
}
```

Owner

  • Name: Stanford Intelligent Systems Laboratory
  • Login: sisl
  • Kind: organization
  • Location: Stanford, CA

JOSS Publication

ExpFamilyPCA.jl: A Julia Package for Exponential Family Principal Component Analysis
Published
January 14, 2025
Volume 10, Issue 105, Page 7403
Authors
Logan Mondal Bhamidipaty ORCID
Stanford University
Mykel J. Kochenderfer ORCID
Stanford University
Trevor Hastie ORCID
Stanford University
Editor
Oskar Laverny ORCID
Tags
compression dimensionality reduction PCA exponential family EPCA open-source POMDP MDP sequential decision making RL

Citation (CITATION.cff)

cff-version: "1.2.0"
authors:
- family-names: Bhamidipaty
  given-names: Logan Mondal
  orcid: "https://orcid.org/0009-0001-3978-9462"
- family-names: Kochenderfer
  given-names: Mykel J.
  orcid: "https://orcid.org/0000-0002-7238-9663"
- family-names: Hastie
  given-names: Trevor
  orcid: "https://orcid.org/0000-0002-0164-3142"
doi: 10.5281/zenodo.14624991
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Bhamidipaty
    given-names: Logan Mondal
    orcid: "https://orcid.org/0009-0001-3978-9462"
  - family-names: Kochenderfer
    given-names: Mykel J.
    orcid: "https://orcid.org/0000-0002-7238-9663"
  - family-names: Hastie
    given-names: Trevor
    orcid: "https://orcid.org/0000-0002-0164-3142"
  date-published: 2025-01-14
  doi: 10.21105/joss.07403
  issn: 2475-9066
  issue: 105
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 7403
  title: "ExpFamilyPCA.jl: A Julia Package for Exponential Family
    Principal Component Analysis"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.07403"
  volume: 10
title: "ExpFamilyPCA.jl: A Julia Package for Exponential Family
  Principal Component Analysis"

GitHub Events

Total
  • Create event: 10
  • Commit comment event: 7
  • Release event: 6
  • Issues event: 14
  • Watch event: 7
  • Delete event: 9
  • Issue comment event: 57
  • Push event: 110
  • Pull request event: 9
  • Fork event: 2
Last Year
  • Create event: 10
  • Commit comment event: 7
  • Release event: 6
  • Issues event: 14
  • Watch event: 7
  • Delete event: 9
  • Issue comment event: 57
  • Push event: 110
  • Pull request event: 9
  • Fork event: 2

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 243
  • Total Committers: 4
  • Avg Commits per committer: 60.75
  • Development Distribution Score (DDS): 0.029
Past Year
  • Commits: 195
  • Committers: 3
  • Avg Commits per committer: 65.0
  • Development Distribution Score (DDS): 0.026
Top Committers
Name Email Commits
Logan Bhamidipaty l****0@g****m 236
Guillaume Dalle 2****e 4
dependabot[bot] 4****] 2
CompatHelper Julia c****y@j****g 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 5
  • Total pull requests: 25
  • Average time to close issues: 25 days
  • Average time to close pull requests: 7 days
  • Total issue authors: 3
  • Total pull request authors: 4
  • Average comments per issue: 8.8
  • Average comments per pull request: 0.32
  • Merged pull requests: 11
  • Bot issues: 0
  • Bot pull requests: 18
Past Year
  • Issues: 5
  • Pull requests: 16
  • Average time to close issues: 25 days
  • Average time to close pull requests: 2 days
  • Issue authors: 3
  • Pull request authors: 4
  • Average comments per issue: 8.8
  • Average comments per pull request: 0.44
  • Merged pull requests: 10
  • Bot issues: 0
  • Bot pull requests: 9
Top Authors
Issue Authors
  • gdalle (4)
  • FlyingWorkshop (2)
  • JuliaTagBot (1)
Pull Request Authors
  • github-actions[bot] (24)
  • dependabot[bot] (7)
  • FlyingWorkshop (4)
  • gdalle (3)
Top Labels
Issue Labels
Pull Request Labels
dependencies (7)

Packages

  • Total packages: 1
  • Total downloads:
    • julia 2 total
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 6
juliahub.com: ExpFamilyPCA

A Julia package for exponential family principal component analysis (EPCA).

  • Versions: 6
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 2 Total
Rankings
Dependent repos count: 3.2%
Downloads: 3.8%
Average: 7.8%
Dependent packages count: 16.3%
Last synced: 4 months ago

Dependencies

.github/workflows/CI.yml actions
  • actions/checkout v4 composite
  • julia-actions/cache v1 composite
  • julia-actions/julia-buildpkg v1 composite
  • julia-actions/julia-runtest v1 composite
  • julia-actions/setup-julia v1 composite
.github/workflows/CompatHelper.yml actions
.github/workflows/TagBot.yml actions
  • JuliaRegistries/TagBot v1 composite
.github/workflows/draft-pdf.yml actions
  • actions/checkout v4 composite
  • actions/upload-artifact v1 composite
  • openjournals/openjournals-draft-action master composite