pgmpy

Python library for building, learning, and reasoning with causal models.

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
✓ Committers with academic emails: 7 of 151 committers (4.6%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (13.6%) to scientific vocabulary

Keywords

bayesian-networks causal-discovery causal-effect causal-identification causal-inference causal-models causal-prediction causal-validation graphical-models mixed-data probabilistic-inference python simulation synthetic-data

Keywords from Contributors

machine-learning-library nearest-neighbor-search transformer operating-system meg interactive eeg optim pdes neuroimaging

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: pgmpy
License: mit
Language: Python
Default Branch: dev
Homepage: https://pgmpy.org/
Size: 13.4 MB

Statistics

Stars: 3,027
Watchers: 75
Forks: 865
Open Issues: 373
Releases: 17

Created over 12 years ago · Last pushed 6 months ago

Metadata Files

Readme Changelog Contributing Funding License Code of conduct Citation

README.md

pgmpy is a Python library for causal and probabilistic modeling using graphical models. It provides a uniform API for building, learning, and analyzing models such as Bayesian Networks, Dynamic Bayesian Networks, Directed Acyclic Graphs (DAGs), and Structural Equation Models (SEMs). By integrating tools from both probabilistic inference and causal inference, pgmpy lets users move between predictive and interventional analyses within the same model.

Key Features

| Feature | Description |
|--------|-------------|
| Causal Discovery / Structure Learning | Learn the model structure from data, with optional integration of expert knowledge. |
| Causal Validation | Assess how compatible the causal structure is with the data. |
| Parameter Learning | Estimate model parameters (e.g., conditional probability distributions) from observed data. |
| Probabilistic Inference | Compute posterior distributions conditioned on observed evidence. |
| Causal Inference | Compute interventional and counterfactual distributions using do-calculus. |
| Simulations | Generate synthetic data under specified evidence or interventions. |
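
The difference between the Probabilistic Inference and Causal Inference rows can be illustrated without pgmpy at all. The sketch below is plain Python on a hypothetical three-node binary network (`Z -> X`, `Z -> Y`, `X -> Y`); the probability tables and function names are made up for illustration and are not part of the pgmpy API. It computes an observational query by enumeration and an interventional query by truncated factorization.

```python
from itertools import product

# Hypothetical binary network: Z -> X, Z -> Y, X -> Y (Z confounds X and Y).
P_Z = {0: 0.5, 1: 0.5}
P_X_GIVEN_Z = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.2, 1: 0.8}}  # P_X_GIVEN_Z[z][x]
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9}  # P(Y=1 | x, z)


def joint(z, x, y):
    """P(Z=z, X=x, Y=y) from the chain-rule factorization of the network."""
    p_y1 = P_Y1_GIVEN_XZ[(x, z)]
    return P_Z[z] * P_X_GIVEN_Z[z][x] * (p_y1 if y == 1 else 1.0 - p_y1)


def observational(x):
    """P(Y=1 | X=x): condition on the evidence and renormalize."""
    numerator = sum(joint(z, x, 1) for z in (0, 1))
    denominator = sum(joint(z, x, y) for z, y in product((0, 1), repeat=2))
    return numerator / denominator


def interventional(x):
    """P(Y=1 | do(X=x)): drop the factor for X (truncated factorization)."""
    return sum(P_Z[z] * P_Y1_GIVEN_XZ[(x, z)] for z in (0, 1))


print(round(observational(1), 3), round(interventional(1), 3))
```

With these made-up tables, conditioning on `X=1` gives about 0.80 while intervening with `do(X=1)` gives 0.65. The gap comes entirely from the confounder `Z`, which is the distinction the two feature rows describe.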

Resources and Links

Example Notebooks: Examples
Tutorial Notebooks: Tutorials
Blog Posts: Medium
Documentation: Website
Bug Reports and Feature Requests: GitHub Issues
Questions: discord · Stack Overflow

Quickstart

Installation

pgmpy is available on both PyPI and conda-forge. To install from PyPI, use:

```bash
pip install pgmpy
```

To install from conda-forge, use:

```bash
conda install conda-forge::pgmpy
```
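
After installing, one way to confirm which version ended up in the active environment is to query the package metadata from the standard library. This is a minimal sketch; the helper name `installed_version` is hypothetical and not something pgmpy provides.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "pgmpy"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print(installed_version() or "pgmpy is not installed in this environment")
```

Reading the metadata avoids importing the library itself, so the check stays fast even when heavy dependencies such as torch are present.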

Examples

Discrete Data

```python
from pgmpy.utils import get_example_model

# Load a Discrete Bayesian Network and simulate data.
discrete_bn = get_example_model("alarm")
alarm_df = discrete_bn.simulate(n_samples=100)

# Learn a network from simulated data.
from pgmpy.estimators import PC

dag = PC(data=alarm_df).estimate(ci_test="chi_square", return_type="dag")

# Learn the parameters from the data.
dag_fitted = dag.fit(alarm_df)
dag_fitted.get_cpds()

# Drop a column and predict using the learned model.
evidence_df = alarm_df.drop(columns=["FIO2"])
pred_FIO2 = dag_fitted.predict(evidence_df)
```
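
The PC call above relies on a chi-square independence test. To give a feel for what such a test measures, here is a standalone sketch of the unconditional Pearson chi-square test on a 2x2 contingency table. It is not pgmpy's implementation (which also handles conditioning sets and multi-valued variables); the function name and the counts are hypothetical.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 contingency table.

    Returns (statistic, p_value). With 1 degree of freedom the chi-square
    survival function reduces to erfc(sqrt(statistic / 2)).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: rows are values of A, columns are values of B.
dependent = [[40, 10], [10, 40]]
independent = [[25, 25], [25, 25]]

print(chi_square_2x2(dependent))
print(chi_square_2x2(independent))
```

A small p-value (as for `dependent`) is evidence against independence, so a constraint-based learner would keep the edge between the two variables; a large p-value (as for `independent`) would let it remove the edge.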

Linear Gaussian Data

```python
from pgmpy.utils import get_example_model

# Load an example Gaussian Bayesian Network and simulate data.
gaussian_bn = get_example_model("ecoli70")
ecoli_df = gaussian_bn.simulate(n_samples=100)

# Learn the network from simulated data.
from pgmpy.estimators import PC

dag = PC(data=ecoli_df).estimate(ci_test="pearsonr", return_type="dag")

# Learn the parameters from the data.
from pgmpy.models import LinearGaussianBayesianNetwork

gaussian_bn = LinearGaussianBayesianNetwork(dag.edges())
dag_fitted = gaussian_bn.fit(ecoli_df)
dag_fitted.get_cpds()

# Drop a column and predict using the learned model.
evidence_df = ecoli_df.drop(columns=["ftsJ"])
pred_ftsJ = dag_fitted.predict(evidence_df)
```
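
For continuous data the example switches the independence test to a correlation-based one. The sketch below shows the idea for the unconditional case only: compute the Pearson correlation and turn it into a two-sided p-value with the Fisher z-transform. This is an illustrative approximation, not the computation pgmpy performs for `ci_test="pearsonr"`; all names and the synthetic data are assumptions.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fisher_z_p_value(r, n):
    """Two-sided p-value for H0: correlation = 0, via the Fisher z-transform."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))


# Synthetic linear-Gaussian pair: y depends on x plus noise.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(200)]
ys = [0.8 * x + rng.gauss(0.0, 0.5) for x in xs]

r = pearson_r(xs, ys)
print(r, fisher_z_p_value(r, len(xs)))
```

The Fisher transform requires `|r| < 1`, so perfectly collinear inputs would need special handling in anything beyond a sketch.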

Mixture Data with Arbitrary Relationships

```python
import pyro.distributions as dist

from pgmpy.factors.hybrid import FunctionalCPD
from pgmpy.models import FunctionalBayesianNetwork

# Create a Bayesian Network with a mixture of discrete and continuous variables.
func_bn = FunctionalBayesianNetwork(
    [
        ("x1", "w"), ("x2", "w"), ("x1", "y"),
        ("x2", "y"), ("w", "y"), ("y", "z"),
        ("w", "z"), ("y", "c"), ("w", "c"),
    ]
)

# Define the Functional CPDs for each node and add them to the model.
cpd_x1 = FunctionalCPD("x1", fn=lambda _: dist.Normal(0.0, 1.0))
cpd_x2 = FunctionalCPD("x2", fn=lambda _: dist.Normal(0.5, 1.2))

# Continuous mediator: w = 0.7 * x1 - 0.3 * x2 + ε
cpd_w = FunctionalCPD(
    "w",
    fn=lambda parents: dist.Normal(0.7 * parents["x1"] - 0.3 * parents["x2"], 0.5),
    parents=["x1", "x2"],
)

# Bernoulli target with logistic link:
# y ~ Bernoulli(sigmoid(-0.7 + 1.5 * x1 + 0.8 * x2 + 1.2 * w))
cpd_y = FunctionalCPD(
    "y",
    fn=lambda parents: dist.Bernoulli(
        logits=(-0.7 + 1.5 * parents["x1"] + 0.8 * parents["x2"] + 1.2 * parents["w"])
    ),
    parents=["x1", "x2", "w"],
)

# Downstream Bernoulli influenced by y and w
cpd_z = FunctionalCPD(
    "z",
    fn=lambda parents: dist.Bernoulli(
        logits=(-1.2 + 0.8 * parents["y"] + 0.2 * parents["w"])
    ),
    parents=["y", "w"],
)

# Continuous outcome depending on y and w: c = 0.2 + 0.5 * y + 0.3 * w + ε
cpd_c = FunctionalCPD(
    "c",
    fn=lambda parents: dist.Normal(0.2 + 0.5 * parents["y"] + 0.3 * parents["w"], 0.7),
    parents=["y", "w"],
)

func_bn.add_cpds(cpd_x1, cpd_x2, cpd_w, cpd_y, cpd_z, cpd_c)
func_bn.check_model()

# Simulate data from the model
df_func = func_bn.simulate(n_samples=1000, seed=123)

# For learning and inference in Functional Bayesian Networks, please refer to the example notebook:
# https://github.com/pgmpy/pgmpy/blob/dev/examples/FunctionalBayesianNetwork_Tutorial.ipynb
```
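
Conceptually, simulating from a network like the one above amounts to ancestral sampling: visit the nodes in topological order and draw each one given its already-sampled parents. The following standard-library sketch mirrors the same structural equations without pyro or pgmpy. It is an illustration of the idea, not how `simulate` is implemented, and it will not reproduce pgmpy's random draws; `sample_row` and `sigmoid` are hypothetical helpers.

```python
import math
import random


def sigmoid(t):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


def sample_row(rng):
    """Draw one joint sample by visiting nodes in topological order."""
    x1 = rng.gauss(0.0, 1.0)
    x2 = rng.gauss(0.5, 1.2)
    w = rng.gauss(0.7 * x1 - 0.3 * x2, 0.5)
    y = int(rng.random() < sigmoid(-0.7 + 1.5 * x1 + 0.8 * x2 + 1.2 * w))
    z = int(rng.random() < sigmoid(-1.2 + 0.8 * y + 0.2 * w))
    c = rng.gauss(0.2 + 0.5 * y + 0.3 * w, 0.7)
    return {"x1": x1, "x2": x2, "w": w, "y": y, "z": z, "c": c}


rng = random.Random(123)
rows = [sample_row(rng) for _ in range(1000)]
print(rows[0])
```

Each line of `sample_row` corresponds to one `FunctionalCPD`: the distribution's parameters are a function of the parent values, which is what lets the model mix discrete and continuous variables freely.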

Contributing

We welcome all contributions to pgmpy, not just code. Please refer to our contributing guide for more details. We also offer mentorship for new contributors and maintain a list of potential mentored projects. If you are interested in contributing to pgmpy, please join our Discord server and introduce yourself. We will be happy to help you get started.

Owner

Name: pgmpy
Login: pgmpy
Kind: organization
Email: pgmpy@googlegroups.com

Website: http://pgmpy.org/
Repositories: 7
Profile: https://github.com/pgmpy

Python library for Probabilistic Graphical Models

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Ankan"
  given-names: "Ankur"
- family-names: "Textor"
  given-names: "Johannes"
title: "pgmpy: A Python Toolkit for Bayesian Networks"
version: 0.1.26
url: "https://github.com/pgmpy/pgmpy"
preferred-citation:
  type: article
  authors:
  - family-names: "Ankan"
    given-names: "Ankur"
  - family-names: "Textor"
    given-names: "Johannes"
  journal: "Journal Of Machine Learning Research"
  start: 1
  end: 8
  title: "pgmpy: A Python Toolkit for Bayesian Networks"
  issue: 25
  year: 2024

Committers

Last synced: 9 months ago

All Time

Total Commits: 2,613
Total Committers: 151
Avg Commits per committer: 17.305
Development Distribution Score (DDS): 0.479

Past Year

Commits: 210
Committers: 27
Avg Commits per committer: 7.778
Development Distribution Score (DDS): 0.41

Top Committers

Name	Email	Commits
Ankur Ankan	a**n@g**m	1,361
Abinash Panda	a**0@i**n	278
Yashu Seth	y**3@g**m	195
Utkarsh Gupta	u**0@g**m	102
palashahuja	a**2@g**m	67
Vivek Jain	v**r@g**m	58
Pratyaksh Sharma	p**h@m**m	43
chrisittner	m**l@c**e	36
Jihye Sofia Seo	9****o	26
kislayabhi	a**y@g**m	23
finn42	f**e@g**m	22
navin	n**2@g**m	22
joncrall	e**c@g**m	18
snigam3112	s**2@g**m	17
Nimish-4	9****4	15
lohani2280	l**1@g**m	14
Raghav Gupta	r**6@g**m	14
Anavil Tripathi	a**i@g**m	13
jp111	j**2@g**m	12
Zhongpeng Lin	z**n@m**m	11
Ashwini Chaudhary	m**h@g**m	10
nehasoni	n**0@i**n	10
Kshitij Saraogi	K**i@g**m	10
Abinash Panda	m**a@g**m	9
Nuntea	7****7	8
Harish Kashyap	h**p@g**m	7
cs15mtech11007@iith.ac.in	k**h@s**n	7
loudly-soft	h**t@y**m	6
Justin Tervala	T**n@b**m	5
Sitesh Ranjan	s**l@g**m	5
and 121 more...
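
The all-time summary figures can be reproduced from the numbers listed above. The sketch assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is an assumption here, although it does reproduce the listed value.

```python
# Figures taken from the "All Time" summary and the top row of the committer table.
total_commits = 2613
total_committers = 151
top_committer_commits = 1361  # Ankur Ankan

avg_commits = total_commits / total_committers
# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - top_committer_commits / total_commits

print(round(avg_commits, 3))  # 17.305
print(round(dds, 3))  # 0.479
```

Under this definition a score near 0 means one person wrote almost everything, while a score near 1 means commits are spread across many contributors.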

Committer Domains (Top 20 + Academic)

itbhu.ac.in: 2
microsoft.com: 2
me.com: 1
chrisittner.de: 1
suiit.ac.in: 1
bah.com: 1
anchitja.in: 1
iitk.ac.in: 1
indigobio.com: 1
mail.huji.ac.il: 1
gcos.ai: 1
chriskamphuis.com: 1
veldt.jp: 1
dynatrace.com: 1
ucsd.edu: 1
qq.com: 1
nokia.com: 1
quantpost.com: 1
abinash-inspiron-n4010.(none): 1
github.com: 1
stud.uni-saarland.de: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 347
Total pull requests: 743
Average time to close issues: over 1 year
Average time to close pull requests: 29 days
Total issue authors: 156
Total pull request authors: 114
Average comments per issue: 2.4
Average comments per pull request: 1.77
Merged pull requests: 445
Bot issues: 0
Bot pull requests: 4

Past Year

Issues: 159
Pull requests: 577
Average time to close issues: 21 days
Average time to close pull requests: 5 days
Issue authors: 49
Pull request authors: 89
Average comments per issue: 1.49
Average comments per pull request: 2.01
Merged pull requests: 307
Bot issues: 0
Bot pull requests: 4

Top Authors

Issue Authors

ankurankan (121)
jihyeseo (9)
Nimish-4 (8)
zzzrbx (7)
CamiloMartinezM (6)
arocapedro (4)
John-Almardeny (4)
fkiraly (3)
hjyoon93 (3)
Rajaram1604 (3)
xy-whu (3)
vsocrates (3)
riffhi (3)
tillwenke (3)
lemonmachine (2)

Pull Request Authors

ankurankan (255)
jihyeseo (67)
Nimish-4 (42)
Nuna7 (24)
Susmita331 (21)
arnavk23 (13)
nitishmalang (12)
hillhack (12)
DarshanCode2005 (11)
arocapedro (10)
Vanshitaaa20 (10)
DevAnuragT (10)
kollisaisiddartha (10)
mdrazak2001 (10)
Spinachboul (8)

Top Labels

Issue Labels

Good First Issue (51)
Bug (36)
Enhancement (19)
category: Learning (18)
category: Inference (13)
High Priority (11)
New Feature (9)
category: Base Model (9)
category: Tests (6)
category: Documentation (6)
category: Causal Inference (2)
category: Simulations (2)
Entrance (1)
Performance (1)

Pull Request Labels

dependencies (4)
python (4)
Bug (1)
High Priority (1)

Packages

Total packages: 4
Total downloads (PyPI): 157,423 last month
Total Docker downloads: 2,195

Total dependent packages: 29
(may contain duplicates)
Total dependent repositories: 131
(may contain duplicates)
Total versions: 55
Total maintainers: 2

pypi.org: pgmpy

A library for Probabilistic Graphical Models

Homepage: https://github.com/pgmpy/pgmpy
Documentation: https://pgmpy.readthedocs.io/
License: MIT
Latest release: 1.0.0
published 11 months ago

Versions: 27
Dependent Packages: 28
Dependent Repositories: 127
Downloads: 157,323 Last month
Docker Downloads: 2,195

Rankings

Dependent packages count: 0.5%

Downloads: 1.3%

Dependent repos count: 1.3%

Average: 1.3%

Stargazers count: 1.4%

Docker downloads count: 1.7%

Forks count: 1.8%

Maintainers (1)

Ankur.Ankan

Last synced: 6 months ago

proxy.golang.org: github.com/pgmpy/pgmpy

Documentation: https://pkg.go.dev/github.com/pgmpy/pgmpy#section-documentation
License: mit
Latest release: v1.0.0
published 11 months ago

Versions: 20
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent packages count: 5.6%

Average: 5.8%

Dependent repos count: 6.0%

Last synced: 6 months ago

conda-forge.org: pgmpy

Homepage: https://github.com/pgmpy/pgmpy
License: MIT
Latest release: 0.1.19
published over 3 years ago

Versions: 1
Dependent Packages: 1
Dependent Repositories: 4

Rankings

Forks count: 6.2%

Stargazers count: 8.4%

Average: 14.9%

Dependent repos count: 16.0%

Dependent packages count: 28.9%

Last synced: 6 months ago

pypi.org: pgmpy-no-torch

A library for Probabilistic Graphical Models

Homepage: https://github.com/pgmpy/pgmpy
Documentation: https://pgmpy-no-torch.readthedocs.io/
License: MIT
Latest release: 1.0.4
published 10 months ago

Versions: 7
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 100 Last month

Rankings

Dependent packages count: 9.3%

Average: 30.9%

Dependent repos count: 52.4%

Maintainers (1)

tauferp

Last synced: 6 months ago

Dependencies

.github/workflows/ci.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite

requirements/optional.txt pypi

daft *

requirements/runtime.txt pypi

joblib *
networkx *
numpy *
opt_einsum *
pandas *
pyparsing *
scikit-learn *
scipy *
statsmodels *
torch *
tqdm *

requirements/tests.txt pypi

black * test
codecov >=2.0.15 test
coverage >=4.3.4 test
mock * test
pytest >=3.3.1 test
pytest-cov * test
xdoctest >=0.11.0 test

requirements.txt pypi

setup.py pypi