ExpFamilyPCA.jl
ExpFamilyPCA.jl: A Julia Package for Exponential Family Principal Component Analysis - Published in JOSS (2025)
Science Score: 100.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 4 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: Links to joss.theoj.org
- ○ Committers with academic emails
- ✓ Institutional organization owner: Organization sisl has institutional domain (sisl.stanford.edu)
- ✓ JOSS paper metadata: Published in Journal of Open Source Software
Repository
A Julia package for exponential family principal component analysis (EPCA).
Basic Info
- Host: GitHub
- Owner: sisl
- License: MIT
- Language: Julia
- Default Branch: main
- Homepage: https://sisl.github.io/ExpFamilyPCA.jl/
- Size: 17.8 MB
Statistics
- Stars: 10
- Watchers: 4
- Forks: 2
- Open Issues: 1
- Releases: 8
Metadata Files
README.md
ExpFamilyPCA.jl
ExpFamilyPCA.jl is a Julia package for exponential family principal component analysis (EPCA), a versatile generalization of PCA designed to handle non-Gaussian data, enabling dimensionality reduction and data analysis across a wide variety of distributions (e.g., binary, count, and compositional data). It is designed for applications in machine learning (belief compression, text analysis), signal processing (denoising), and data science (sample debiasing, clustering, dimensionality reduction), but can be applied to other fields with diverse data types.
- Website: https://sisl.github.io/ExpFamilyPCA.jl/dev/
- Math: https://sisl.github.io/ExpFamilyPCA.jl/dev/math/intro/
- API Documentation: https://sisl.github.io/ExpFamilyPCA.jl/dev/api/
Features
- Implements exponential family PCA (EPCA)
- Supports multiple exponential family distributions
- Flexible constructors for custom distributions
- Fast symbolic differentiation and optimization
- Numerically stable scientific computation
Installation
To install the package, use the Julia package manager. In the Julia REPL, type:
```julia
using Pkg; Pkg.add("ExpFamilyPCA")
```
Supported Distributions
The following distributions are supported:
| Distribution | Description |
| ----------------------------- | ------------------------------------------------ |
| BernoulliEPCA | For binary data |
| BinomialEPCA | For count data with a fixed number of trials |
| ContinuousBernoulliEPCA | For probabilities between 0 and 1 |
| GammaEPCA | For positive continuous data |
| GaussianEPCA | Standard PCA for real-valued data |
| NegativeBinomialEPCA | For over-dispersed count data |
| ParetoEPCA | For heavy-tailed distributions |
| PoissonEPCA | For count and discrete distribution data |
| WeibullEPCA | For life data and survival analysis |
Quickstart
Each EPCA object supports the following methods:
- fit!: Trains the model and returns compressed training data.
- compress: Compresses new input data.
- decompress: Reconstructs original data from the compressed representation.
Example:
```julia
X = sample_from_poisson(n1, indim)
Y = sample_from_poisson(n2, indim)

epca = PoissonEPCA(indim, outdim)

X_compressed = fit!(epca, X)
Y_compressed = compress(epca, Y)
Y_reconstructed = decompress(epca, Y_compressed)
```
The sample_from_poisson function is a placeholder for generating random Poisson-distributed data. It is left out of the snippet above to keep the focus on the EPCA calls. One way to define it is with the Distributions.jl package:
```julia
using Distributions

function sample_from_poisson(n::Int, dim::Int)
    d = Poisson()
    rand(d, n, dim)
end
```
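To judge how well `decompress` recovers count data, one option is the Bregman divergence associated with the Poisson distribution, also known as the generalized KL divergence. The sketch below is not part of ExpFamilyPCA.jl: `poisson_divergence` is a hypothetical helper, and it assumes the reconstruction is a matrix of nonnegative values with the same shape as the input.

```julia
# Hypothetical helper (not provided by ExpFamilyPCA.jl): mean generalized KL
# divergence between observed counts X and a nonnegative reconstruction Xhat.
function poisson_divergence(X::AbstractMatrix, Xhat::AbstractMatrix)
    size(X) == size(Xhat) || throw(DimensionMismatch("X and Xhat must have the same size"))
    total = 0.0
    for (x, μ) in zip(X, Xhat)
        # x * log(x / μ) is taken to be 0 when x == 0 (its limit as x → 0)
        term = x > 0 ? x * log(x / μ) : 0.0
        total += term - x + μ
    end
    return total / length(X)
end

# Small hand-written example, independent of any fitted model
X_demo = [0 1 2; 3 4 5]
poisson_divergence(X_demo, Float64.(X_demo))   # 0.0 for a perfect reconstruction
poisson_divergence(X_demo, fill(2.5, 2, 3))    # positive for a constant guess
```

Lower values indicate a closer reconstruction. A squared-error metric would also work, but it weights large counts more heavily than a Poisson model would.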
Custom Distributions
When working with custom distributions, certain specifications are often more convenient and computationally efficient than others. For example, inducing the gamma EPCA objective from the log-partition $G(\theta) = -\log(-\theta)$ and its derivative $g(\theta) = -1/\theta$ is much simpler than implementing the full Itakura-Saito distance:
$$ D(P(\omega), \hat{P}(\omega)) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \Bigg[ \frac{P(\omega)}{\hat{P}(\omega)} - \log \frac{P(\omega)}{\hat{P}(\omega)} - 1 \Bigg] d\omega. $$
In ExpFamilyPCA.jl, we would write:
```julia
G(θ) = -log(-θ)
g(θ) = -1 / θ
gamma_epca = EPCA(indim, outdim, G, g, Val((:G, :g)); options = NegativeDomain())
```
A lengthier discussion of the EPCA constructors and math is provided in the documentation.
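As a sanity check on why the log-partition specification is enough, the following sketch verifies numerically that the Bregman divergence induced by the convex conjugate of $G(\theta) = -\log(-\theta)$ matches the pointwise Itakura-Saito integrand $p/q - \log(p/q) - 1$. It uses plain Julia and does not call ExpFamilyPCA.jl. The names `g_inv`, `F`, `bregman`, and `itakura_saito` are illustrative and are not part of the package API.

```julia
# Log-partition and its derivative, as in the gamma example (valid for θ < 0)
G(θ) = -log(-θ)
g(θ) = -1 / θ

# Inverse of g: maps a mean parameter μ > 0 back to θ = -1/μ
g_inv(μ) = -1 / μ

# Convex conjugate of G at μ: F(μ) = θμ - G(θ) with θ = g_inv(μ),
# which simplifies to -1 - log(μ)
F(μ) = g_inv(μ) * μ - G(g_inv(μ))

# Bregman divergence induced by F; the derivative of F is g_inv
bregman(p, q) = F(p) - F(q) - g_inv(q) * (p - q)

# Pointwise Itakura-Saito divergence (the integrand of the spectral distance)
itakura_saito(p, q) = p / q - log(p / q) - 1

# Compare the two on a small grid of positive values
grid = [0.1, 0.5, 1.0, 2.0, 10.0]
max_gap = maximum(abs(bregman(p, q) - itakura_saito(p, q)) for p in grid, q in grid)
```

On this grid `max_gap` should sit at floating-point rounding level, which illustrates that the two specifications describe the same objective.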
Contributing
Contributions are welcome! If you want to contribute, please fork the repository, create a new branch, and submit a pull request. Before contributing, please make sure to update tests as appropriate.
Citing
If ExpFamilyPCA.jl is useful in your research and you would like to acknowledge it, please cite this paper:
```bib
@article{bhamidipaty2025expfamilypca,
  title = {{ExpFamilyPCA}.jl: A {J}ulia Package for Exponential Family Principal Component Analysis},
  author = {Bhamidipaty, Logan Mondal and Kochenderfer, Mykel J. and Hastie, Trevor},
  journal = {Journal of Open Source Software},
  volume = {10},
  number = {105},
  pages = {7403},
  year = {2025},
  month = {jan},
  doi = {10.21105/joss.07403},
  url = {https://joss.theoj.org/papers/10.21105/joss.07403}
}
```
Owner
- Name: Stanford Intelligent Systems Laboratory
- Login: sisl
- Kind: organization
- Location: Stanford, CA
- Website: sisl.stanford.edu
- Repositories: 236
- Profile: https://github.com/sisl
JOSS Publication
ExpFamilyPCA.jl: A Julia Package for Exponential Family Principal Component Analysis
Tags
compression, dimensionality reduction, PCA, exponential family, EPCA, open-source, POMDP, MDP, sequential decision making, RL
Citation (CITATION.cff)
cff-version: "1.2.0"
authors:
- family-names: Bhamidipaty
  given-names: Logan Mondal
  orcid: "https://orcid.org/0009-0001-3978-9462"
- family-names: Kochenderfer
  given-names: Mykel J.
  orcid: "https://orcid.org/0000-0002-7238-9663"
- family-names: Hastie
  given-names: Trevor
  orcid: "https://orcid.org/0000-0002-0164-3142"
doi: 10.5281/zenodo.14624991
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Bhamidipaty
    given-names: Logan Mondal
    orcid: "https://orcid.org/0009-0001-3978-9462"
  - family-names: Kochenderfer
    given-names: Mykel J.
    orcid: "https://orcid.org/0000-0002-7238-9663"
  - family-names: Hastie
    given-names: Trevor
    orcid: "https://orcid.org/0000-0002-0164-3142"
  date-published: 2025-01-14
  doi: 10.21105/joss.07403
  issn: 2475-9066
  issue: 105
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 7403
  title: "ExpFamilyPCA.jl: A Julia Package for Exponential Family
    Principal Component Analysis"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.07403"
  volume: 10
title: "ExpFamilyPCA.jl: A Julia Package for Exponential Family
  Principal Component Analysis"
GitHub Events
Total
- Create event: 10
- Commit comment event: 7
- Release event: 6
- Issues event: 14
- Watch event: 7
- Delete event: 9
- Issue comment event: 57
- Push event: 110
- Pull request event: 9
- Fork event: 2
Last Year
- Create event: 10
- Commit comment event: 7
- Release event: 6
- Issues event: 14
- Watch event: 7
- Delete event: 9
- Issue comment event: 57
- Push event: 110
- Pull request event: 9
- Fork event: 2
Committers
Last synced: 5 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Logan Bhamidipaty | l****0@g****m | 236 |
| Guillaume Dalle | 2****e | 4 |
| dependabot[bot] | 4****] | 2 |
| CompatHelper Julia | c****y@j****g | 1 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 5
- Total pull requests: 25
- Average time to close issues: 25 days
- Average time to close pull requests: 7 days
- Total issue authors: 3
- Total pull request authors: 4
- Average comments per issue: 8.8
- Average comments per pull request: 0.32
- Merged pull requests: 11
- Bot issues: 0
- Bot pull requests: 18
Past Year
- Issues: 5
- Pull requests: 16
- Average time to close issues: 25 days
- Average time to close pull requests: 2 days
- Issue authors: 3
- Pull request authors: 4
- Average comments per issue: 8.8
- Average comments per pull request: 0.44
- Merged pull requests: 10
- Bot issues: 0
- Bot pull requests: 9
Top Authors
Issue Authors
- gdalle (4)
- FlyingWorkshop (2)
- JuliaTagBot (1)
Pull Request Authors
- github-actions[bot] (24)
- dependabot[bot] (7)
- FlyingWorkshop (4)
- gdalle (3)
Packages
- Total packages: 1
- Total downloads: 2 (julia)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 6
juliahub.com: ExpFamilyPCA
A Julia package for exponential family principal component analysis (EPCA).
- Homepage: https://sisl.github.io/ExpFamilyPCA.jl/
- Documentation: https://docs.juliahub.com/General/ExpFamilyPCA/stable/
- License: MIT
- Latest release: 2.0.3 (published 12 months ago)
Dependencies
- actions/checkout v4 composite
- julia-actions/cache v1 composite
- julia-actions/julia-buildpkg v1 composite
- julia-actions/julia-runtest v1 composite
- julia-actions/setup-julia v1 composite
- JuliaRegistries/TagBot v1 composite
- actions/checkout v4 composite
- actions/upload-artifact v1 composite
- openjournals/openjournals-draft-action master composite
