dodiscover

[Experimental] Global causal discovery algorithms

https://github.com/py-why/dodiscover

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.7%) to scientific vocabulary

Keywords

causal-inference causality graphs python structure-learning
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 106
  • Watchers: 7
  • Forks: 17
  • Open Issues: 59
  • Releases: 0
Created over 3 years ago · Last pushed 7 months ago
Metadata Files
Readme Contributing License Citation Governance

README.md

DoDiscover

DoDiscover is a Python library for causal discovery (causal structure learning). If you do not have a causal graph for your modeling problem, you can use DoDiscover to learn causal structure (e.g., in the form of a graph) from your data.

What makes dodiscover different from other causal discovery libraries?

Why do we need another causal discovery library? Here are some design goals that differentiate DoDiscover from other causal discovery libraries.

Ease of use

An analyst should be able to get a causal discovery workflow working quickly without intimate knowledge of causal discovery algorithms. DoDiscover prioritizes the workflow over the algorithms and provides default arguments to algorithm parameters.

Democratizing deep causal discovery

Many cutting-edge causal discovery algorithms rely on deep learning frameworks. However, deep learning-based causal discovery often requires obscure boilerplate code, complex configuration, and management of large artifacts such as embeddings. DoDiscover seeks to create abstractions that address these challenges and make deep causal discovery more broadly accessible. Current algorithms are a work-in-progress. We will begin by providing a robust API for the fundamental discovery algorithms.

Easy interface for articulating causal assumptions

Domain experts bring a large amount of domain knowledge to a problem. That domain knowledge can establish causal assumptions that can constrain causal discovery. Causal discovery (indeed, all causal inferences) requires causal assumptions.

However, a newly developed causal discovery algorithm has a greater research impact when it can do more with fewer assumptions. This "do more with less" orientation tends to deemphasize assumptions in the user interfaces of many causal discovery libraries.

DoDiscover prioritizes the interface for causal assumptions. Further, DoDiscover seeks to help the user feel confident with their assumptions by emphasizing testing assumptions, making inferences under uncertainty, and robustness to model misspecification.
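
As a rough illustration of how stated assumptions can constrain discovery, the sketch below starts from the complete undirected graph over a set of variables, fixes the edges an expert asserts, and drops the edges an expert rules out. The helper `candidate_skeleton`, its parameter names, and the example variables are hypothetical and are not part of the DoDiscover API.

```python
from itertools import combinations


def candidate_skeleton(variables, included_edges=(), excluded_edges=()):
    """Split possible undirected edges into fixed edges and edges left to test.

    Hypothetical helper for illustration only; not part of the DoDiscover API.
    """
    included = {frozenset(edge) for edge in included_edges}
    excluded = {frozenset(edge) for edge in excluded_edges}
    if included & excluded:
        raise ValueError("An edge cannot be both included and excluded.")

    # Start from the complete undirected graph over the variables.
    all_edges = {frozenset(pair) for pair in combinations(variables, 2)}

    # Only edges with no prior assumption still need to be tested against data.
    to_test = all_edges - included - excluded
    return {"fixed": included, "to_test": to_test}


# Made-up variables and assumptions, purely for demonstration.
variables = ["smoking", "tar", "cancer", "age"]
result = candidate_skeleton(
    variables,
    included_edges=[("smoking", "tar")],
    excluded_edges=[("age", "tar")],
)
print(len(result["to_test"]), "edges left to test")
```

Each assumption removes one edge from the set a data-driven algorithm would otherwise have to evaluate, which is why an explicit interface for assumptions can matter in practice.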

Unite causal discovery and causal representation learning

Our goal is to provide developers and researchers with guide rails for causal discovery that don't require deep knowledge of individual causal discovery algorithms.

What is the difference between dodiscover and other pywhy packages?

The goal of dodiscover is to flatten the on-ramp to causal discovery algorithms. DoWhy provides a consistent API for various causal tasks that typically require a graph structure. Similarly, DoDiscover aims to provide a cohesive and user-friendly API to apply causal discovery algorithms for inferring a causal graph from data.

causal-learn is an extensive collection of causal discovery algorithms. It continues to host new cutting-edge algorithms in causal discovery. However, these algorithms do not have a unified API. Further, the historic focus of causal-learn has been on increasing the capabilities of discovery algorithms. In contrast, dodiscover's focus is on the discovery API and usability.

When possible, dodiscover prefers to provide an API wrapper to discovery algorithms in causal-learn and other libraries. If you plan to implement an algorithm from scratch, please consider contributing it to causal-learn first and then contributing a wrapper to dodiscover.

In the future we plan on trying to integrate the two libraries.

What is the relationship with pywhy-graphs and pywhy-stats?

pywhy-graphs is the home of graph data structures and graph algorithms in PyWhy.

pywhy-stats serves as a repository for implementations of (un)conditional independence tests, which can be utilized in various tasks, such as causal discovery.

Documentation

See the development version documentation.

Or see the stable version documentation.

Installation

Installation is best done via pip or conda. Developers can also install from source using pip. See the installation page for full details.

Dependencies

Minimally, dodiscover requires:

* Python (>=3.10)
* numpy
* scipy
* networkx
* pandas

We have removed support for Python 3.8 as we depend explicitly on networkx, which has deprecated Python 3.8 support.

For explicit graph functionality for representing various causal graphs, such as ADMGs or CPDAGs, you will also need:

* pywhy-graphs

For explicitly representing causal graphs, we recommend using the pywhy-graphs package. If you have a graph library that adheres to the graph protocols we require, you can in principle use its graphs instead.
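
The idea of a graph protocol can be sketched with Python's structural typing: any class that provides the expected methods is accepted, with no inheritance required. The protocol `GraphLike`, its three methods, and the `DictGraph` class below are hypothetical and chosen only for illustration; the protocols dodiscover actually requires are defined in its own source and documentation.

```python
from collections.abc import Hashable, Iterable
from typing import Protocol, runtime_checkable


@runtime_checkable
class GraphLike(Protocol):
    """Hypothetical minimal graph protocol, for illustration only."""

    def nodes(self) -> Iterable[Hashable]: ...

    def add_edge(self, u: Hashable, v: Hashable) -> None: ...

    def remove_edge(self, u: Hashable, v: Hashable) -> None: ...


class DictGraph:
    """A tiny adjacency-dict graph that never inherits from GraphLike."""

    def __init__(self):
        self._adj = {}

    def nodes(self):
        return self._adj.keys()

    def add_edge(self, u, v):
        self._adj.setdefault(u, set()).add(v)
        self._adj.setdefault(v, set())

    def remove_edge(self, u, v):
        self._adj[u].discard(v)


def count_nodes(graph: GraphLike) -> int:
    # Works with any object that structurally matches the protocol.
    return len(list(graph.nodes()))


g = DictGraph()
g.add_edge("x", "y")
print(isinstance(g, GraphLike), count_nodes(g))
```

Note that `isinstance` against a runtime-checkable protocol only verifies that the methods exist, not that their signatures or behavior match.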

User Installation

If you already have a working installation of numpy, scipy and networkx, the easiest way to install dodiscover is using pip:

# doesn't work until we make an official release :p
pip install -U dodiscover

To install the package from GitHub, clone the repository and then cd into the directory. You can then use pip to install in editable mode:

pip install -e .

# for extra functionality for documentation, building, style checking and unit-testing
# (quote the extras and omit spaces so the shell passes them to pip as one argument)
pip install ".[doc,build,style,test]"
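
After installing, one way to confirm that the minimal dependencies listed above are importable is a small check like the following. The function name `check_environment` is made up for this sketch, and the script only tests whether each package can be found and whether the interpreter meets the stated Python (>=3.10) requirement; it does not verify package versions.

```python
import sys
from importlib.util import find_spec

# Minimal dependencies as listed in the Dependencies section.
MINIMAL_DEPENDENCIES = ("numpy", "scipy", "networkx", "pandas")


def check_environment(packages=MINIMAL_DEPENDENCIES, min_python=(3, 10)):
    """Report whether the interpreter and each package meet the basic requirements."""
    report = {"python": sys.version_info >= min_python}
    for name in packages:
        # find_spec returns None when a top-level package cannot be located.
        report[name] = find_spec(name) is not None
    return report


report = check_environment()
for name, ok in report.items():
    print(f"{name}: {'ok' if ok else 'MISSING'}")
```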

Owner

  • Name: PyWhy
  • Login: py-why
  • Kind: organization

Citation (CITATION.cff)

# YAML 1.2
---
# Metadata for citation of this software according to the CFF format (https://citation-file-format.github.io/)
cff-version: 1.2.0
title: 'Dodiscover: Causal discovery algorithms in Python.'
abstract: 'Dodiscover is a Python library that leverages a simple API for performing causal discovery.'
authors:
    - given-names: Adam
      family-names: Li
      affiliation: 'Department of Computer Science, Columbia University, New York, NY, USA'
      orcid: 'https://orcid.org/0000-0001-8421-365X'
    - given-names: Jaron
      family-names: Lee
      affiliation: 'Johns Hopkins University'
      email: 'jaron2005@gmail.com'
    - given-names: Francesco  
      family-names: Montagna
      affiliation: 'Università di Genova'
      email: 'francesco.montagna997@gmail.com'
    - given-names: Chris
      family-names: Trevino
      affiliation: 'Microsoft'
      email: 'chtrevin@microsoft.com'
    - given-names: Robert
      family-names: Ness
      affiliation: 'Microsoft'
      email: 'robertness@microsoft.com'
type: software
repository-code: 'https://github.com/py-why/dodiscover'
license: MIT
keywords:
  - causality
  - pywhy
  - graphs
  - causal discovery
  - causal inference
...

GitHub Events

Total
  • Watch event: 23
  • Delete event: 1
  • Issue comment event: 1
  • Push event: 15
  • Pull request event: 2
  • Pull request review event: 1
  • Fork event: 1
  • Create event: 2
Last Year
  • Watch event: 23
  • Delete event: 1
  • Issue comment event: 1
  • Push event: 15
  • Pull request event: 2
  • Pull request review event: 1
  • Fork event: 1
  • Create event: 2

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 78
  • Total pull requests: 86
  • Average time to close issues: 3 months
  • Average time to close pull requests: about 1 month
  • Total issue authors: 9
  • Total pull request authors: 12
  • Average comments per issue: 1.06
  • Average comments per pull request: 2.26
  • Merged pull requests: 63
  • Bot issues: 0
  • Bot pull requests: 17
Past Year
  • Issues: 2
  • Pull requests: 8
  • Average time to close issues: 20 days
  • Average time to close pull requests: 18 days
  • Issue authors: 2
  • Pull request authors: 4
  • Average comments per issue: 1.0
  • Average comments per pull request: 1.38
  • Merged pull requests: 6
  • Bot issues: 0
  • Bot pull requests: 5
Top Authors
Issue Authors
  • adam2392 (35)
  • robertness (32)
  • petergtz (2)
  • Jaswanth-007 (1)
  • jaron-lee (1)
  • Wapiti08 (1)
  • mcharrak (1)
  • lww28 (1)
  • ebridge2 (1)
Pull Request Authors
  • adam2392 (48)
  • dependabot[bot] (17)
  • pre-commit-ci[bot] (8)
  • robertness (7)
  • darthtrevino (4)
  • jaron-lee (3)
  • nparent1 (2)
  • petergtz (2)
  • HarshaSatyavardhan (2)
  • emrekiciman (1)
  • eeulig (1)
  • francescomontagna (1)
Top Labels
Issue Labels
good first issue (9) conditional-independence (8) help wanted (7) constraint-algorithm (7) score-algorithm (4) documentation (4) bug (3) representation learning (2) enhancement (2) UI (2) epic (1)
Pull Request Labels
dependencies (18) No Changelog Needed (15) quick-review (1)

Dependencies

poetry.lock pypi
  • astroid 2.11.7 develop
  • atomicwrites 1.4.1 develop
  • attrs 22.1.0 develop
  • bandit 1.7.4 develop
  • black 22.6.0 develop
  • certifi 2022.6.15 develop
  • charset-normalizer 2.1.0 develop
  • click 8.1.3 develop
  • colorama 0.4.5 develop
  • coverage 6.4.3 develop
  • dill 0.3.5.1 develop
  • docstring-parser 0.7.3 develop
  • falcon 2.0.0 develop
  • flake8 5.0.4 develop
  • ghp-import 2.1.0 develop
  • gitdb 4.0.9 develop
  • gitpython 3.1.27 develop
  • hug 2.6.1 develop
  • idna 3.3 develop
  • importlib-metadata 4.12.0 develop
  • iniconfig 1.1.1 develop
  • isort 5.10.1 develop
  • jinja2 3.1.2 develop
  • lazy-object-proxy 1.7.1 develop
  • livereload 2.6.3 develop
  • mako 1.2.1 develop
  • markdown 3.4.1 develop
  • markupsafe 2.1.1 develop
  • mccabe 0.7.0 develop
  • mergedeep 1.3.4 develop
  • mkdocs 1.2.4 develop
  • mkdocs-material 7.3.0 develop
  • mkdocs-material-extensions 1.0.3 develop
  • mypy 0.971 develop
  • mypy-extensions 0.4.3 develop
  • packaging 21.3 develop
  • pastel 0.2.1 develop
  • pathspec 0.9.0 develop
  • pbr 5.10.0 develop
  • pdocs 1.1.1 develop
  • platformdirs 2.5.2 develop
  • pluggy 1.0.0 develop
  • poethepoet 0.16.0 develop
  • portray 1.7.0 develop
  • py 1.11.0 develop
  • pycodestyle 2.9.1 develop
  • pyflakes 2.5.0 develop
  • pygments 2.12.0 develop
  • pylint 2.14.5 develop
  • pymdown-extensions 7.1 develop
  • pyparsing 3.0.9 develop
  • pytest 7.1.2 develop
  • pytest-cov 3.0.0 develop
  • python-dateutil 2.8.2 develop
  • pyyaml 6.0 develop
  • pyyaml-env-tag 0.1 develop
  • requests 2.28.1 develop
  • semversioner 1.1.0 develop
  • six 1.16.0 develop
  • smmap 5.0.0 develop
  • stevedore 4.0.0 develop
  • toml 0.10.2 develop
  • tomli 2.0.1 develop
  • tomlkit 0.11.3 develop
  • tornado 6.2 develop
  • typing-extensions 4.3.0 develop
  • urllib3 1.26.11 develop
  • watchdog 2.1.9 develop
  • wrapt 1.14.1 develop
  • yaspin 0.15.0 develop
  • zipp 3.8.1 develop
  • beartype 0.10.4
pyproject.toml pypi
  • bandit ^1.7.4 develop
  • black ^22.6.0 develop
  • flake8 ^5.0.4 develop
  • isort ^5.10.1 develop
  • mypy ^0.971 develop
  • poethepoet ^0.16.0 develop
  • portray ^1.7.0 develop
  • pylint ^2.14.5 develop
  • pytest ^7.1.2 develop
  • pytest-cov ^3.0.0 develop
  • semversioner ^1.1.0 develop
  • beartype ^0.10.4
  • python ^3.8
.github/workflows/circle_artifacts.yml actions
  • larsoner/circleci-artifacts-redirector-action master composite
.github/workflows/main.yml actions
  • abatilo/actions-poetry v2.3.0 composite
  • actions/checkout v3 composite
  • actions/download-artifact v3 composite
  • actions/setup-python v4 composite
  • actions/upload-artifact v3 composite
  • codecov/codecov-action v3 composite
  • softprops/action-gh-release v1 composite
.github/workflows/pr_checks.yml actions
  • actions/checkout v3 composite