mpcsl
A Modular Pipeline for Causal Structure Learning called MPCSL.
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 1 of 15 committers (6.7%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (15.0%) to scientific vocabulary
Repository
Statistics
- Stars: 8
- Watchers: 5
- Forks: 1
- Open Issues: 21
- Releases: 0
Metadata Files
README.md
MPCSL: A Modular Pipeline for Causal Structure Learning
This repository contains the backend of MPCSL, a Modular Pipeline for Causal Structure Learning, built at the chair for Enterprise Platform and Integration Concepts at the Hasso Plattner Institute. The pipeline currently includes the following features, all of which are accessible via a REST API:
- Store datasets that are ready for causal structure learning in our backend
- Set up causal structure learning experiments for different causal structure learning algorithms in R, Python and CUDA with different hyperparameter settings and dataset choices
- Run the experiments as jobs directly in our backend
- Manage all currently running jobs on the backend
- Deliver the results and meta information of past experiments
- Show distributions and perform interventions (currently limited to specific cases)
- Compare different experiment results using quality metrics, such as type I or type II error, or graph edit distance
- Extend the pipeline with new algorithms in their own execution environments
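To make the quality metrics in the list above more concrete, here is a minimal sketch of how two graphs could be compared on their undirected skeletons. It is not taken from the MPCSL code base: the function names, the result keys, and the toy graphs are all hypothetical, and the edge edit distance shown here only counts edge insertions and deletions on a fixed node set, which is a simplification of a general graph edit distance.

```python
from typing import Dict, Iterable, Sequence, Set, Tuple

Edge = Tuple[str, str]


def skeleton(edges: Iterable[Edge]) -> Set[frozenset]:
    """Drop edge directions: each edge becomes an unordered pair of nodes."""
    return {frozenset(edge) for edge in edges}


def compare_skeletons(
    nodes: Sequence[str],
    true_edges: Iterable[Edge],
    learned_edges: Iterable[Edge],
) -> Dict[str, float]:
    """Compare a learned graph with a ground-truth graph (hypothetical helper)."""
    true_skel = skeleton(true_edges)
    learned_skel = skeleton(learned_edges)

    # Number of unordered node pairs, i.e. the maximum number of skeleton edges.
    possible = len(nodes) * (len(nodes) - 1) // 2
    absent = possible - len(true_skel)

    false_pos = len(learned_skel - true_skel)  # edges learned but not true
    false_neg = len(true_skel - learned_skel)  # true edges that were missed

    return {
        # Share of truly absent edges that were wrongly reported.
        "type_1_error": false_pos / absent if absent else 0.0,
        # Share of truly present edges that were not found.
        "type_2_error": false_neg / len(true_skel) if true_skel else 0.0,
        # Edge insertions plus deletions needed to match the true skeleton.
        "edge_edit_distance": false_pos + false_neg,
    }


# Toy example with made-up graphs over four nodes.
result = compare_skeletons(
    nodes=["A", "B", "C", "D"],
    true_edges=[("A", "B"), ("B", "C"), ("C", "D")],
    learned_edges=[("B", "A"), ("B", "C"), ("A", "D")],
)
print(result)
```

In the toy example the learned graph reports one edge that does not exist (A-D) and misses one that does (C-D), so both error rates are 1/3 and the edge edit distance is 2. Edge orientation is deliberately ignored here; comparing orientations would need additional rules.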
The following image shows the holistic architecture as an FMC diagram:

Setup
Requirements
As the user interface files are stored in a different, currently private repository, you have to clone the repo together with its submodules using:
git clone --recurse-submodules git@github.com:hpi-epic/mpcsl.git
Getting Started
- `minikube start`
- `garden deploy`
- `garden run task seed-db`
- Go to the address printed by `minikube ip` in a browser
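The getting-started steps can also be scripted. The following is a minimal sketch that runs the same commands in order and returns the cluster address. The function names `run_setup` and `run_in_shell` are hypothetical helpers and not part of MPCSL. The commands themselves are the ones listed in this section.

```python
import subprocess
from typing import Callable, List, Sequence

# The setup commands from the Getting Started section, in order.
SETUP_COMMANDS: List[List[str]] = [
    ["minikube", "start"],
    ["garden", "deploy"],
    ["garden", "run", "task", "seed-db"],
]


def run_in_shell(command: Sequence[str]) -> str:
    """Run one command, fail loudly on a non-zero exit code, return stdout."""
    completed = subprocess.run(
        list(command), check=True, capture_output=True, text=True
    )
    return completed.stdout


def run_setup(run: Callable[[Sequence[str]], str] = run_in_shell) -> str:
    """Run all setup commands, then return the address reported by minikube.

    The `run` parameter is injectable so the sequence can be checked
    without a local minikube or garden installation.
    """
    for command in SETUP_COMMANDS:
        run(command)
    return run(["minikube", "ip"]).strip()


# Usage on a machine with minikube and garden installed (not executed here):
#     address = run_setup()
#     print(f"Open {address} in a browser")
```

Passing the command runner as a parameter keeps the sketch testable: a stand-in function can record the commands instead of executing them.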
Setup Algorithms
Running `garden run task db-setup-algorithms` loads the algorithms into the database.
Seeding Example Dataset/Experiment
Running `garden run task seed-db` loads an example dataset into the database.
The example dataset is generated from an EARTHQUAKE Bayesian network on this page.
Endpoint Documentation
Swagger documentation of our REST endpoints is available at /swagger/index.html, given default host and port settings.
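As a small illustration, the documentation URL can be assembled from the cluster address. This is only a sketch: the helper name `swagger_url`, the `http` scheme, and the example address are assumptions. Only the path /swagger/index.html comes from this README.

```python
from typing import Optional
from urllib.parse import urlunsplit

SWAGGER_PATH = "/swagger/index.html"  # path stated in this README


def swagger_url(host: str, port: Optional[int] = None, scheme: str = "http") -> str:
    """Build the Swagger UI URL for a given host (hypothetical helper).

    The scheme defaults to http as an assumption; omit `port` to rely on
    the default port of the deployment.
    """
    netloc = host if port is None else f"{host}:{port}"
    return urlunsplit((scheme, netloc, SWAGGER_PATH, "", ""))


# Example with a made-up address, such as one `minikube ip` might print.
print(swagger_url("192.168.49.2"))
```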
Maintainers
Contact: firstname.lastname@hpi.de
Owner
- Name: Enterprise Platform & Integration Concepts Research Group
- Login: hpi-epic
- Kind: organization
- Location: Potsdam, Germany
- Website: https://epic.hpi.de
- Repositories: 31
- Profile: https://github.com/hpi-epic
Citation (CITATION.cff)
# YAML 1.2
---
abstract: |
  "The examination of causal structures is crucial for data scientists in a variety of machine learning application scenarios.
  In recent years, the corresponding interest in methods of causal structure learning has led to a wide spectrum of independent implementations, each having specific accuracy characteristics and introducing implementation-specific overhead in the runtime.
  Hence, considering a selection of algorithms or different implementations in different programming languages utilizing different hardware setups becomes a tedious manual task with high setup costs.
  Consequently, a tool that enables to plug in existing methods from different libraries into a single system to compare and evaluate the results is substantial support for data scientists in their research efforts.
  In this work, we propose an architectural blueprint of a pipeline for causal structure learning and outline our reference implementation MPCSL that addresses the requirements towards platform independence and modularity while ensuring the comparability and reproducibility of experiments.
  Moreover, we demonstrate the capabilities of MPCSL within a case study, where we evaluate existing implementations of the well-known PC-Algorithm concerning their runtime performance characteristics."
authors:
  -
    affiliation: "Hasso Plattner Institute, University of Potsdam"
    family-names: Huegle
    given-names: Johannes
  -
    affiliation: "Hasso Plattner Institute, University of Potsdam"
    family-names: Hagedorn
    given-names: Christopher
  -
    affiliation: "Hasso Plattner Institute, University of Potsdam"
    family-names: Perscheid
    given-names: Michael
  -
    affiliation: "Hasso Plattner Institute, University of Potsdam"
    family-names: Plattner
    given-names: Hasso
cff-version: "1.1.0"
doi: "10.1145/3447548.3467082"
message: "If you use this software, please cite the paper."
title: "MPCSL"
references:
  - type: conference-paper
    authors:
      - affiliation: "Hasso Plattner Institute, University of Potsdam"
        family-names: Huegle
        given-names: Johannes
      - affiliation: "Hasso Plattner Institute, University of Potsdam"
        family-names: Hagedorn
        given-names: Christopher
      - affiliation: "Hasso Plattner Institute, University of Potsdam"
        family-names: Perscheid
        given-names: Michael
      - affiliation: "Hasso Plattner Institute, University of Potsdam"
        family-names: Plattner
        given-names: Hasso
    title: "MPCSL - A Modular Pipeline for Causal Structure Learning"
    year: 2021
    collection-title: "Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '21)"
    conference:
      name: ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '21)
    doi: "10.1145/3447548.3467082"
...
Committers
Last synced: about 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| danthe | me@d****m | 40 |
| Jonathan Schneider | j****r@s****e | 35 |
| Christopher Hagedorn | c****9@g****m | 31 |
| Alexander Kastius | a****s@q****e | 29 |
| Jonas Umland | j****d@s****e | 16 |
| milanpro | m****l@g****m | 15 |
| MariusDanner | m****s@d****e | 15 |
| Tobias Nack | t****3@g****m | 12 |
| boehmchen | 4****n | 6 |
| mschroederi | c****e@m****e | 5 |
| danthe | d****6@g****e | 5 |
| Victor Künstler | v****r@o****m | 3 |
| constantin-lange | c****e@s****e | 3 |
| Johannes Huegle | j****e@h****e | 2 |
| Theresa Zobel | t****l@s****e | 2 |
Issues and Pull Requests
Last synced: about 1 year ago
All Time
- Total issues: 46
- Total pull requests: 154
- Average time to close issues: 3 months
- Average time to close pull requests: 13 days
- Total issue authors: 7
- Total pull request authors: 10
- Average comments per issue: 0.35
- Average comments per pull request: 0.45
- Merged pull requests: 141
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- mschroederi (17)
- ChristopherSchmidt89 (14)
- jonasumland (6)
- boehmchen (3)
- jonaschn (3)
- constantin-lange (2)
- Dencrash (1)
Pull Request Authors
- danthe96 (31)
- jonaschn (29)
- Raandom (26)
- ChristopherSchmidt89 (21)
- jonasumland (19)
- Dencrash (11)
- mschroederi (7)
- boehmchen (7)
- constantin-lange (2)
- theresazobel (1)
Dependencies
- actions/checkout v2 composite
- actions/checkout v2 composite
- actions/checkout v2 composite
- nvidia/cuda 10.1-devel-ubuntu18.04 build
- python 3.7 build
- chris89/mpci_r latest build
- ccmi *
- cython >=0.26
- networkx *
- numpy *
- pandas *
- requests *
- scikit-learn *
- scipy *
- tigramite *
- manm_cs ==0.1.2
- requests ==2.25.1
- Babel ==2.9.0
- Click ==7.0
- Faker ==1.0.0
- Flask ==1.0.2
- Flask-Migrate ==2.3.1
- Flask-RESTful ==0.3.6
- Flask-SQLAlchemy ==2.4.4
- Flask-SocketIO ==4.2.1
- Jinja2 ==2.10
- Mako ==1.0.7
- MarkupSafe ==1.1.0
- Pillow ==8.1.0
- PyYAML ==5.3.1
- Pygments ==2.7.4
- SQLAlchemy ==1.3.20
- Sphinx ==2.0.1
- Werkzeug ==0.14.1
- absl-py ==0.11.0
- aiohttp ==3.6.2
- alabaster ==0.7.12
- alembic ==1.0.6
- aniso8601 ==4.0.1
- appnope ==0.1.2
- async-timeout ==3.0.1
- atomicwrites ==1.2.1
- attrs ==18.2.0
- backcall ==0.2.0
- bidict ==0.21.2
- cachetools ==4.2.0
- causaldag ==0.1a162
- certifi ==2018.11.29
- chardet ==3.0.4
- codecov ==2.0.15
- conditional-independence ==0.1a5
- coverage ==5.3
- cycler ==0.10.0
- dataclasses ==0.6
- decorator ==4.3.2
- dnspython ==2.0.0
- docutils ==0.16
- eventlet ==0.25.1
- factory-boy ==2.11.1
- flake8 ==3.8.4
- flask-restful-swagger-2 ==0.35
- frozendict ==1.2
- future ==0.18.2
- google-auth ==1.23.0
- graphical-model-learning ==0.1a7
- graphical-models ==0.1a5
- greenlet ==0.4.17
- idna ==2.7
- ijson ==2.3
- imagesize ==1.2.0
- importlib-metadata ==3.1.1
- ipdb ==0.13.4
- ipython ==7.20.0
- ipython-genutils ==0.2.0
- itsdangerous ==1.1.0
- jedi ==0.18.0
- joblib ==1.0.0
- kiwisolver ==1.3.1
- kubernetes ==10.0.1
- marshmallow ==2.16.3
- marshmallow-sqlalchemy ==0.15.0
- matplotlib ==3.3.4
- mccabe ==0.6.1
- monotonic ==1.5
- more-itertools ==4.3.0
- multidict ==4.7.6
- netrd ==0.2.2
- networkx ==2.5
- numexpr ==2.7.2
- numpy ==1.16.0
- numpydoc ==1.1.0
- oauthlib ==3.1.0
- ortools ==8.1.8487
- packaging ==20.9
- pandas ==1.1.4
- parso ==0.8.1
- pexpect ==4.8.0
- pickleshare ==0.7.5
- pluggy ==0.8.0
- progressbar2 ==3.53.1
- prompt-toolkit ==3.0.16
- protobuf ==3.14.0
- psutil ==5.4.8
- psycopg2 ==2.8.6
- psycopg2-binary ==2.8.6
- ptyprocess ==0.7.0
- py ==1.7.0
- pyasn1 ==0.4.8
- pyasn1-modules ==0.2.8
- pycodestyle ==2.6.0
- pyflakes ==2.2.0
- pygam ==0.8.0
- pyhdb ==0.3.4
- pyparsing ==2.4.7
- pytest ==3.10.1
- pytest-cov ==2.6.1
- pytest-ordering ==0.6
- python-dateutil ==2.7.5
- python-editor ==1.0.3
- python-engineio ==4.0.0
- python-socketio ==5.0.1
- python-utils ==2.5.6
- pytz ==2018.7
- requests ==2.20.1
- requests-oauthlib ==1.3.0
- rsa ==4.6
- scikit-learn ==0.24.1
- scipy ==1.5.4
- six ==1.11.0
- snowballstemmer ==2.1.0
- sphinx-rtd-theme ==0.5.1
- sphinxcontrib-applehelp ==1.0.2
- sphinxcontrib-devhelp ==1.0.2
- sphinxcontrib-htmlhelp ==1.0.3
- sphinxcontrib-jsmath ==1.0.1
- sphinxcontrib-qthelp ==1.0.3
- sphinxcontrib-serializinghtml ==1.1.4
- sqlalchemy-hana ==0.3.0
- text-unidecode ==1.2
- threadpoolctl ==2.1.0
- tqdm ==4.57.0
- traitlets ==5.0.5
- typing ==3.7.4.3
- typing-extensions ==3.7.4.3
- urllib3 ==1.24.2
- wcwidth ==0.2.5
- websocket-client ==0.55.0
- yarl ==1.6.3
- zipp ==3.4.0