ml4prom
The implementation of the paper: "Explainable Artificial Intelligence for Improved Modeling of Processes".
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.6%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: rizavelioglu
- License: MIT
- Language: Jupyter Notebook
- Default Branch: main
- Homepage: https://arxiv.org/abs/2212.00695
- Size: 2.06 MB
Statistics
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
ML4ProM
Please follow the notebooks to reproduce the results:
- ./notebooks/1_EDA.ipynb downloads the datasets and performs Exploratory Data Analysis on each one to understand it better,
- ./notebooks/2_training.ipynb executes the training scripts and presents the results,
- ./notebooks/3_post-train.ipynb presents feature importances for each dataset and each ML model.
How to train models and output results?
Inside the project directory (../ml4prom/), run the following to list the available arguments:
python -m src.models.train_model -h
which returns:
--debug DEBUG When True, plots ROC-Curve & Confusion Matrix
--seq_encoding SEQ_ENCODING
Possible encodings; 'one-hot' & 'n-gram' where n is an integer
--unique_traces UNIQUE_TRACES
when True, duplicate traces(trace variants) are removed from dataset
--remove_biased_feats REMOVE_BIASED_FEATS
when True, the biased features are removed from dataset, e.g. patient is dead in COVID dataset
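To make the --seq_encoding and --unique_traces options more concrete, here is a minimal sketch of what such preprocessing could look like. It is not the repository's code: the toy traces, the function names (unique_traces, ngrams, encode), and the interpretation of 'one-hot' as a presence/absence vector over activities are all assumptions made for illustration.

```python
from collections import Counter
from itertools import chain

# Hypothetical toy traces: each trace is a sequence of event (activity) names.
traces = [
    ["register", "triage", "discharge"],
    ["register", "triage", "icu", "discharge"],
    ["register", "triage", "discharge"],  # duplicate trace variant
]


def unique_traces(traces):
    """Drop duplicate trace variants, keeping first occurrences in order."""
    seen = set()
    result = []
    for trace in traces:
        key = tuple(trace)
        if key not in seen:
            seen.add(key)
            result.append(list(trace))
    return result


def ngrams(trace, n):
    """Return all contiguous n-grams of a trace as tuples."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def encode(traces, n=1, binary=False):
    """Encode traces as fixed-length vectors over the n-gram vocabulary.

    With n=1 and binary=True each entry marks whether an activity occurs
    (an assumed reading of 'one-hot'); otherwise entries count n-grams.
    """
    vocab = sorted(set(chain.from_iterable(ngrams(t, n) for t in traces)))
    index = {gram: i for i, gram in enumerate(vocab)}
    vectors = []
    for trace in traces:
        counts = Counter(ngrams(trace, n))
        row = [0] * len(vocab)
        for gram, count in counts.items():
            row[index[gram]] = 1 if binary else count
        vectors.append(row)
    return vocab, vectors


deduped = unique_traces(traces)
one_hot_vocab, one_hot = encode(deduped, n=1, binary=True)
bigram_vocab, bigrams = encode(deduped, n=2)
```

In this sketch the 2-gram encoding retains some ordering information (for example, whether "icu" directly follows "triage"), which the presence/absence vector discards.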
The command below performs several steps:
- loads all datasets
- applies preprocessing, e.g. removes biased features and duplicate traces
- encodes traces (sequences of events)
- trains ML models with StratifiedKFold cross-validation
- writes a .csv file with the accuracy scores to ./reports/
python -m src.models.train_model --seq_encoding one-hot --remove_biased_feats --unique_traces
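The cross-validation and reporting steps can be sketched roughly as follows. This is a standard-library illustration only, not the repository's training script: stratified_folds is a hand-rolled stand-in for scikit-learn's StratifiedKFold, the majority-class baseline replaces the actual ML models, and the labels and CSV column names are hypothetical.

```python
import csv
import io
from collections import Counter, defaultdict


def stratified_folds(labels, n_splits):
    """Assign sample indices to folds so that class proportions are roughly
    preserved in every fold (round-robin within each class)."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_splits)]
    for label in sorted(by_class):
        for position, idx in enumerate(by_class[label]):
            folds[position % n_splits].append(idx)
    return folds


def majority_baseline_accuracy(labels, train_idx, test_idx):
    """Placeholder 'model': always predict the most common training label."""
    majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    hits = sum(labels[i] == majority for i in test_idx)
    return hits / len(test_idx)


# Hypothetical labels for ten traces (6 negative, 4 positive).
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
folds = stratified_folds(labels, n_splits=2)

rows = []
for fold_id, test_idx in enumerate(folds):
    # Train on every fold except the one held out for testing.
    train_idx = [i for f in folds if f is not test_idx for i in f]
    acc = majority_baseline_accuracy(labels, train_idx, test_idx)
    rows.append({"fold": fold_id, "model": "majority_baseline", "accuracy": acc})

# Write the scores as CSV text; a real run would write to a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["fold", "model", "accuracy"])
writer.writeheader()
writer.writerows(rows)
report_csv = buffer.getvalue()
```

Stratification matters when classes are imbalanced: every fold here keeps the same 3:2 class ratio, so per-fold accuracy scores are comparable.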
Citation:
@inproceedings{velioglu2022explainable,
title={Explainable Artificial Intelligence for Improved Modeling of Processes},
author={Velioglu, Riza and G{\"o}pfert, Jan Philip and Artelt, Andr{\'e} and Hammer, Barbara},
booktitle={International Conference on Intelligent Data Engineering and Automated Learning},
pages={313--325},
year={2022},
organization={Springer}
}
Owner
- Name: Riza Velioglu
- Login: rizavelioglu
- Kind: user
- Location: Bielefeld, Germany
- Company: Bielefeld University
- Website: https://rizavelioglu.github.io/
- Twitter: rizavelioglu
- Repositories: 4
- Profile: https://github.com/rizavelioglu
Ph.D. candidate in Machine Learning | Co-founder & CRO @recommendy
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this repository, please consider citing it."
title: "ML4ProM"
abstract: "This repository implements methods presented in the paper: Explainable Artificial Intelligence for Improved Modeling of Processes."
authors:
- given-names: Riza
family-names: Velioglu
orcid: https://orcid.org/0000-0002-2160-4976
preferred-citation:
type: article
title: "Explainable Artificial Intelligence for Improved Modeling of Processes"
doi: 10.1007/978-3-031-21753-1_31
url: https://link.springer.com/chapter/10.1007/978-3-031-21753-1_31
journal: "Intelligent Data Engineering and Automated Learning – IDEAL 2022"
authors:
- family-names: "Velioglu"
given-names: "Riza"
- family-names: "Göpfert"
given-names: "Jan Philip"
- family-names: "Artelt"
given-names: "André"
- family-names: "Hammer"
given-names: "Barbara"
year: 2022
keywords:
- machinelearning
- explainableartificialintelligence
- processmining
repository-code: "https://github.com/rizavelioglu/ml4prom"
GitHub Events
Total
- Push event: 1
Last Year
- Push event: 1
Dependencies
- joblib >=1.2.0
- matplotlib >=3.4.3
- nltk *
- numpy >=1.20.3
- pandas *
- pm4py *
- scikit-learn *
- seaborn ==0.11.2
- tensorflow >=2.8.0
- termcolor ==1.1.0
- tqdm ==4.62.3
- xgboost ==1.5.0