pgmpy

Python library for building, learning, and reasoning with causal models.

https://github.com/pgmpy/pgmpy

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    7 of 151 committers (4.6%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.6%) to scientific vocabulary

Keywords

bayesian-networks causal-discovery causal-effect causal-identification causal-inference causal-models causal-prediction causal-validation graphical-models mixed-data probabilistic-inference python simulation synthetic-data

Keywords from Contributors

machine-learning-library nearest-neighbor-search transformer operating-system meg interactive eeg optim pdes neuroimaging
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: pgmpy
  • License: MIT
  • Language: Python
  • Default Branch: dev
  • Homepage: https://pgmpy.org/
  • Size: 13.4 MB
Statistics
  • Stars: 3,027
  • Watchers: 75
  • Forks: 865
  • Open Issues: 373
  • Releases: 17
Created over 12 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing Funding License Code of conduct Citation

README.md

pgmpy is a Python library for causal and probabilistic modeling using graphical models. It provides a uniform API for building, learning, and analyzing models such as Bayesian Networks, Dynamic Bayesian Networks, Directed Acyclic Graphs (DAGs), and Structural Equation Models (SEMs). By integrating tools from both probabilistic inference and causal inference, pgmpy lets users move between predictive and interventional analyses within the same model.
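The difference between a predictive (conditioning) query and an interventional (do) query can be seen on a tiny confounded network. The following is a minimal sketch in plain Python, not pgmpy code: the three-node structure Z -> X, Z -> Y, X -> Y and all probability values are hypothetical, chosen only for illustration. Conditioning on X reweights the confounder Z, while intervening on X leaves the distribution of Z untouched (the truncated factorization).

```python
from itertools import product

# Hypothetical CPDs for a binary network Z -> X, Z -> Y, X -> Y.
P_Z = {0: 0.5, 1: 0.5}                      # P(Z=z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}             # P(X=1 | Z=z)
P_Y1_GIVEN_XZ = {                           # P(Y=1 | X=x, Z=z), keyed by (x, z)
    (0, 0): 0.1, (0, 1): 0.5,
    (1, 0): 0.3, (1, 1): 0.7,
}


def bernoulli(p_one, value):
    """Probability of `value` (0 or 1) under a Bernoulli with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def joint(z, x, y):
    """Joint probability P(Z=z, X=x, Y=y) from the chain-rule factorization."""
    return (
        P_Z[z]
        * bernoulli(P_X1_GIVEN_Z[z], x)
        * bernoulli(P_Y1_GIVEN_XZ[(x, z)], y)
    )


def p_y1_given_x(x):
    """Observational query P(Y=1 | X=x), computed by enumeration."""
    numerator = sum(joint(z, x, 1) for z in (0, 1))
    denominator = sum(joint(z, x, y) for z, y in product((0, 1), repeat=2))
    return numerator / denominator


def p_y1_do_x(x):
    """Interventional query P(Y=1 | do(X=x)): drop P(X | Z), keep P(Z)."""
    return sum(P_Z[z] * P_Y1_GIVEN_XZ[(x, z)] for z in (0, 1))


observational = p_y1_given_x(1)
interventional = p_y1_do_x(1)
print(round(observational, 3), round(interventional, 3))  # prints: 0.62 0.5
```

The two numbers differ because observing X=1 makes Z=1 more likely, and Z also raises Y. Setting X=1 by intervention carries no such information about Z.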

Key Features

| Feature | Description |
|---------|-------------|
| Causal Discovery / Structure Learning | Learn the model structure from data, with optional integration of expert knowledge. |
| Causal Validation | Assess how compatible the causal structure is with the data. |
| Parameter Learning | Estimate model parameters (e.g., conditional probability distributions) from observed data. |
| Probabilistic Inference | Compute posterior distributions conditioned on observed evidence. |
| Causal Inference | Compute interventional and counterfactual distributions using do-calculus. |
| Simulations | Generate synthetic data under specified evidence or interventions. |
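Structure learning and causal validation both rest on statistical tests of (conditional) independence between variables. As a rough illustration of that building block, here is a minimal sketch of a Pearson chi-square test of independence on a 2x2 contingency table, written in plain Python. It is not pgmpy's implementation. The helper name `chi_square_2x2` and the count tables are hypothetical, and the p-value formula used is specific to one degree of freedom.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: strongly associated variables vs. perfectly balanced ones.
dependent_stat, dependent_p = chi_square_2x2([[30, 10], [10, 30]])
independent_stat, independent_p = chi_square_2x2([[20, 20], [20, 20]])

print(dependent_stat, dependent_p < 0.05)      # large statistic, independence rejected
print(independent_stat, independent_p)         # zero statistic, no evidence of dependence
```

A constraint-based learner would typically remove an edge when such a test fails to reject independence, and keep it otherwise.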

Resources and Links

Quickstart

Installation

pgmpy is available on both PyPI and anaconda. To install from PyPI, use:

```bash
pip install pgmpy
```

To install from conda-forge, use:

```bash
conda install conda-forge::pgmpy
```
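After installing, it can be useful to confirm which version ended up in the active environment. The sketch below uses only the standard library's `importlib.metadata`. The helper name `installed_version` is hypothetical, and the snippet assumes the distribution is registered under the name `pgmpy`, as on PyPI.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("pgmpy")
if version is None:
    print("pgmpy is not installed in this environment")
else:
    print(f"pgmpy {version} is installed")
```

Because the lookup reads package metadata rather than importing the library, it runs quickly and does not fail if an optional dependency is missing.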

Examples

Discrete Data

```python
from pgmpy.utils import get_example_model

# Load a Discrete Bayesian Network and simulate data.
discrete_bn = get_example_model("alarm")
alarm_df = discrete_bn.simulate(n_samples=100)

# Learn a network from simulated data.
from pgmpy.estimators import PC

dag = PC(data=alarm_df).estimate(ci_test="chi_square", return_type="dag")

# Learn the parameters from the data.
dag_fitted = dag.fit(alarm_df)
dag_fitted.get_cpds()

# Drop a column and predict using the learned model.
evidence_df = alarm_df.drop(columns=["FIO2"])
pred_FIO2 = dag_fitted.predict(evidence_df)
```
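To make the parameter-learning step less opaque, here is a minimal sketch of maximum-likelihood estimation of a conditional probability table for one child with a single parent, using normalized counts. It is plain Python and does not reproduce pgmpy's estimators. The helper `mle_cpd` and the toy records are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical observations as (parent value, child value) pairs.
records = [
    ("rain", "wet"), ("rain", "wet"), ("rain", "dry"),
    ("sun", "dry"), ("sun", "dry"), ("sun", "dry"), ("sun", "wet"),
]


def mle_cpd(pairs):
    """Estimate P(child | parent) as relative frequencies within each parent value."""
    counts = defaultdict(Counter)
    for parent, child in pairs:
        counts[parent][child] += 1

    return {
        parent: {
            child: n / sum(child_counts.values())
            for child, n in child_counts.items()
        }
        for parent, child_counts in counts.items()
    }


cpd = mle_cpd(records)
for parent, distribution in cpd.items():
    print(parent, distribution)
```

With little data, pure counting assigns zero probability to unseen combinations. Bayesian estimators with a prior are a common remedy for that.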

Linear Gaussian Data

```python
from pgmpy.utils import get_example_model

# Load an example Gaussian Bayesian Network and simulate data.
gaussian_bn = get_example_model("ecoli70")
ecoli_df = gaussian_bn.simulate(n_samples=100)

# Learn the network from simulated data.
from pgmpy.estimators import PC

dag = PC(data=ecoli_df).estimate(ci_test="pearsonr", return_type="dag")

# Learn the parameters from the data.
from pgmpy.models import LinearGaussianBayesianNetwork

gaussian_bn = LinearGaussianBayesianNetwork(dag.edges())
dag_fitted = gaussian_bn.fit(ecoli_df)
dag_fitted.get_cpds()

# Drop a column and predict using the learned model.
evidence_df = ecoli_df.drop(columns=["ftsJ"])
pred_ftsJ = dag_fitted.predict(evidence_df)
```
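In a linear Gaussian network each node is a linear function of its parents plus Gaussian noise, so fitting one node amounts to a least-squares regression. The following minimal sketch shows this for a single hypothetical edge x -> y using only the standard library. The ground-truth coefficients, the sample size, and the helper `fit_linear_gaussian` are assumptions made for illustration, and the code is not pgmpy's fitting routine.

```python
import random
import statistics

random.seed(0)

# Hypothetical ground truth for one edge x -> y: y = 1.0 + 2.0 * x + noise(std=0.5).
xs = [random.gauss(0.0, 1.0) for _ in range(2000)]
ys = [1.0 + 2.0 * x + random.gauss(0.0, 0.5) for x in xs]


def fit_linear_gaussian(xs, ys):
    """Least-squares intercept and slope, plus the residual standard deviation."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return intercept, slope, statistics.pstdev(residuals)


intercept, slope, noise_std = fit_linear_gaussian(xs, ys)

# Predicting a missing y from an observed parent value uses the conditional mean.
predicted_y = intercept + slope * 1.5
print(intercept, slope, noise_std, predicted_y)
```

The fitted triple (intercept, slope, noise standard deviation) plays the role of the node's conditional distribution, and prediction for a dropped column reduces to evaluating the conditional mean.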

Mixture Data with Arbitrary Relationships

```python
import pyro.distributions as dist

from pgmpy.factors.hybrid import FunctionalCPD
from pgmpy.models import FunctionalBayesianNetwork

# Create a Bayesian Network with a mixture of discrete and continuous variables.
func_bn = FunctionalBayesianNetwork(
    [
        ("x1", "w"),
        ("x2", "w"),
        ("x1", "y"),
        ("x2", "y"),
        ("w", "y"),
        ("y", "z"),
        ("w", "z"),
        ("y", "c"),
        ("w", "c"),
    ]
)

# Define the Functional CPDs for each node and add them to the model.
cpd_x1 = FunctionalCPD("x1", fn=lambda _: dist.Normal(0.0, 1.0))
cpd_x2 = FunctionalCPD("x2", fn=lambda _: dist.Normal(0.5, 1.2))

# Continuous mediator: w = 0.7 * x1 - 0.3 * x2 + ε
cpd_w = FunctionalCPD(
    "w",
    fn=lambda parents: dist.Normal(0.7 * parents["x1"] - 0.3 * parents["x2"], 0.5),
    parents=["x1", "x2"],
)

# Bernoulli target with logistic link:
# y ~ Bernoulli(sigmoid(-0.7 + 1.5 * x1 + 0.8 * x2 + 1.2 * w))
cpd_y = FunctionalCPD(
    "y",
    fn=lambda parents: dist.Bernoulli(
        logits=(-0.7 + 1.5 * parents["x1"] + 0.8 * parents["x2"] + 1.2 * parents["w"])
    ),
    parents=["x1", "x2", "w"],
)

# Downstream Bernoulli influenced by y and w.
cpd_z = FunctionalCPD(
    "z",
    fn=lambda parents: dist.Bernoulli(
        logits=(-1.2 + 0.8 * parents["y"] + 0.2 * parents["w"])
    ),
    parents=["y", "w"],
)

# Continuous outcome depending on y and w: c = 0.2 + 0.5 * y + 0.3 * w + ε
cpd_c = FunctionalCPD(
    "c",
    fn=lambda parents: dist.Normal(0.2 + 0.5 * parents["y"] + 0.3 * parents["w"], 0.7),
    parents=["y", "w"],
)

func_bn.add_cpds(cpd_x1, cpd_x2, cpd_w, cpd_y, cpd_z, cpd_c)
func_bn.check_model()

# Simulate data from the model.
df_func = func_bn.simulate(n_samples=1000, seed=123)

# For learning and inference in Functional Bayesian Networks, please refer to the example notebook:
# https://github.com/pgmpy/pgmpy/blob/dev/examples/FunctionalBayesianNetwork_Tutorial.ipynb
```
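Simulating from a network like the one above can be understood as ancestral sampling: visit the nodes in topological order and draw each one from its conditional distribution given the already-sampled parents. The sketch below reproduces the same six equations with the standard library's `random` module instead of pyro. It illustrates the idea only and is not how pgmpy implements `simulate`. The helper names `sigmoid` and `sample_once` are hypothetical.

```python
import math
import random


def sigmoid(t):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


def sample_once(rng):
    """Draw one joint sample in topological order: x1, x2, w, y, z, c."""
    x1 = rng.gauss(0.0, 1.0)
    x2 = rng.gauss(0.5, 1.2)
    w = rng.gauss(0.7 * x1 - 0.3 * x2, 0.5)
    y = int(rng.random() < sigmoid(-0.7 + 1.5 * x1 + 0.8 * x2 + 1.2 * w))
    z = int(rng.random() < sigmoid(-1.2 + 0.8 * y + 0.2 * w))
    c = rng.gauss(0.2 + 0.5 * y + 0.3 * w, 0.7)
    return {"x1": x1, "x2": x2, "w": w, "y": y, "z": z, "c": c}


rng = random.Random(123)
samples = [sample_once(rng) for _ in range(1000)]

# Sanity check: E[w] = 0.7 * E[x1] - 0.3 * E[x2] = -0.15 under these equations.
mean_w = sum(s["w"] for s in samples) / len(samples)
print(mean_w)
```

An intervention such as do(w = 1.0) would be simulated by replacing the line that samples `w` with the constant, while leaving every other equation unchanged.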

Contributing

We welcome all contributions to pgmpy, not just code. Please refer to our contributing guide for more details. We also offer mentorship for new contributors and maintain a list of potential mentored projects. If you are interested in contributing to pgmpy, please join our Discord server and introduce yourself. We will be happy to help you get started.

Owner

  • Name: pgmpy
  • Login: pgmpy
  • Kind: organization
  • Email: pgmpy@googlegroups.com

Python library for Probabilistic Graphical Models

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Ankan"
  given-names: "Ankur"
- family-names: "Textor"
  given-names: "Johannes"
title: "pgmpy: A Python Toolkit for Bayesian Networks"
version: 0.1.26
url: "https://github.com/pgmpy/pgmpy"
preferred-citation:
  type: article
  authors:
  - family-names: "Ankan"
    given-names: "Ankur"
  - family-names: "Textor"
    given-names: "Johannes"
  journal: "Journal Of Machine Learning Research"
  start: 1
  end: 8
  title: "pgmpy: A Python Toolkit for Bayesian Networks"
  issue: 25
  year: 2024

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 2,613
  • Total Committers: 151
  • Avg Commits per committer: 17.305
  • Development Distribution Score (DDS): 0.479
Past Year
  • Commits: 210
  • Committers: 27
  • Avg Commits per committer: 7.778
  • Development Distribution Score (DDS): 0.41
Top Committers
| Name | Email | Commits |
|------|-------|---------|
| Ankur Ankan | a****n@g****m | 1,361 |
| Abinash Panda | a****0@i****n | 278 |
| Yashu Seth | y****3@g****m | 195 |
| Utkarsh Gupta | u****0@g****m | 102 |
| palashahuja | a****2@g****m | 67 |
| Vivek Jain | v****r@g****m | 58 |
| Pratyaksh Sharma | p****h@m****m | 43 |
| chrisittner | m****l@c****e | 36 |
| Jihye Sofia Seo | 9****o | 26 |
| kislayabhi | a****y@g****m | 23 |
| finn42 | f****e@g****m | 22 |
| navin | n****2@g****m | 22 |
| joncrall | e****c@g****m | 18 |
| snigam3112 | s****2@g****m | 17 |
| Nimish-4 | 9****4 | 15 |
| lohani2280 | l****1@g****m | 14 |
| Raghav Gupta | r****6@g****m | 14 |
| Anavil Tripathi | a****i@g****m | 13 |
| jp111 | j****2@g****m | 12 |
| Zhongpeng Lin | z****n@m****m | 11 |
| Ashwini Chaudhary | m****h@g****m | 10 |
| nehasoni | n****0@i****n | 10 |
| Kshitij Saraogi | K****i@g****m | 10 |
| Abinash Panda | m****a@g****m | 9 |
| Nuntea | 7****7 | 8 |
| Harish Kashyap | h****p@g****m | 7 |
| cs15mtech11007@iith.ac.in | k****h@s****n | 7 |
| loudly-soft | h****t@y****m | 6 |
| Justin Tervala | T****n@b****m | 5 |
| Sitesh Ranjan | s****l@g****m | 5 |

and 121 more...
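The all-time committer statistics can be checked with a few lines of arithmetic. The sketch below assumes that the Development Distribution Score is defined as one minus the top committer's share of all commits. That definition is not stated on this page, but it reproduces the reported values. The function name is hypothetical.

```python
def development_distribution_score(top_committer_commits, total_commits):
    """Assumed definition: 1 - (commits by the top committer / total commits)."""
    return 1.0 - top_committer_commits / total_commits


total_commits = 2613          # "Total Commits" (all time)
total_committers = 151        # "Total Committers" (all time)
top_committer_commits = 1361  # first row of the Top Committers list

avg_commits = total_commits / total_committers
dds = development_distribution_score(top_committer_commits, total_commits)

print(round(avg_commits, 3), round(dds, 3))  # prints: 17.305 0.479
```

A score near 0 means one person wrote nearly all commits, and a score near 1 means the work is spread across many committers. At 0.479, slightly more than half of all commits come from the top committer.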

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 347
  • Total pull requests: 743
  • Average time to close issues: over 1 year
  • Average time to close pull requests: 29 days
  • Total issue authors: 156
  • Total pull request authors: 114
  • Average comments per issue: 2.4
  • Average comments per pull request: 1.77
  • Merged pull requests: 445
  • Bot issues: 0
  • Bot pull requests: 4
Past Year
  • Issues: 159
  • Pull requests: 577
  • Average time to close issues: 21 days
  • Average time to close pull requests: 5 days
  • Issue authors: 49
  • Pull request authors: 89
  • Average comments per issue: 1.49
  • Average comments per pull request: 2.01
  • Merged pull requests: 307
  • Bot issues: 0
  • Bot pull requests: 4
Top Authors
Issue Authors
  • ankurankan (121)
  • jihyeseo (9)
  • Nimish-4 (8)
  • zzzrbx (7)
  • CamiloMartinezM (6)
  • arocapedro (4)
  • John-Almardeny (4)
  • fkiraly (3)
  • hjyoon93 (3)
  • Rajaram1604 (3)
  • xy-whu (3)
  • vsocrates (3)
  • riffhi (3)
  • tillwenke (3)
  • lemonmachine (2)
Pull Request Authors
  • ankurankan (255)
  • jihyeseo (67)
  • Nimish-4 (42)
  • Nuna7 (24)
  • Susmita331 (21)
  • arnavk23 (13)
  • nitishmalang (12)
  • hillhack (12)
  • DarshanCode2005 (11)
  • arocapedro (10)
  • Vanshitaaa20 (10)
  • DevAnuragT (10)
  • kollisaisiddartha (10)
  • mdrazak2001 (10)
  • Spinachboul (8)
Top Labels
Issue Labels
  • Good First Issue (51)
  • Bug (36)
  • Enhancement (19)
  • category: Learning (18)
  • category: Inference (13)
  • High Priority (11)
  • New Feature (9)
  • category: Base Model (9)
  • category: Tests (6)
  • category: Documentation (6)
  • category: Causal Inference (2)
  • category: Simulations (2)
  • Entrance (1)
  • Performance (1)
Pull Request Labels
  • dependencies (4)
  • python (4)
  • Bug (1)
  • High Priority (1)

Packages

  • Total packages: 4
  • Total downloads:
    • pypi 157,423 last-month
  • Total docker downloads: 2,195
  • Total dependent packages: 29
    (may contain duplicates)
  • Total dependent repositories: 131
    (may contain duplicates)
  • Total versions: 55
  • Total maintainers: 2
pypi.org: pgmpy

A library for Probabilistic Graphical Models

  • Versions: 27
  • Dependent Packages: 28
  • Dependent Repositories: 127
  • Downloads: 157,323 Last month
  • Docker Downloads: 2,195
Rankings
Dependent packages count: 0.5%
Downloads: 1.3%
Dependent repos count: 1.3%
Average: 1.3%
Stargazers count: 1.4%
Docker downloads count: 1.7%
Forks count: 1.8%
Maintainers (1)
Last synced: 6 months ago
proxy.golang.org: github.com/pgmpy/pgmpy
  • Versions: 20
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.6%
Average: 5.8%
Dependent repos count: 6.0%
Last synced: 6 months ago
conda-forge.org: pgmpy
  • Versions: 1
  • Dependent Packages: 1
  • Dependent Repositories: 4
Rankings
Forks count: 6.2%
Stargazers count: 8.4%
Average: 14.9%
Dependent repos count: 16.0%
Dependent packages count: 28.9%
Last synced: 6 months ago
pypi.org: pgmpy-no-torch

A library for Probabilistic Graphical Models

  • Versions: 7
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 100 Last month
Rankings
Dependent packages count: 9.3%
Average: 30.9%
Dependent repos count: 52.4%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/ci.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
requirements/optional.txt pypi
  • daft *
requirements/runtime.txt pypi
  • joblib *
  • networkx *
  • numpy *
  • opt_einsum *
  • pandas *
  • pyparsing *
  • scikit-learn *
  • scipy *
  • statsmodels *
  • torch *
  • tqdm *
requirements/tests.txt pypi
  • black * test
  • codecov >=2.0.15 test
  • coverage >=4.3.4 test
  • mock * test
  • pytest >=3.3.1 test
  • pytest-cov * test
  • xdoctest >=0.11.0 test
requirements.txt pypi
setup.py pypi