ml4prom

The implementation of the paper: "Explainable Artificial Intelligence for Improved Modeling of Processes".

https://github.com/rizavelioglu/ml4prom

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.6%) to scientific vocabulary

Keywords

machine-learning processmining transformers xai
Last synced: 6 months ago

Repository

Statistics
  • Stars: 3
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 4 years ago · Last pushed 11 months ago
Metadata Files
Readme License Citation

README.md

ML4ProM

Check out the paper on: arXiv

Please follow the notebooks to reproduce the results:

- ./notebooks/1_EDA.ipynb downloads the datasets and performs Exploratory Data Analysis on each one to understand it better,
- ./notebooks/2_training.ipynb executes the training scripts and presents the results,
- ./notebooks/3_post-train.ipynb presents feature importances for each dataset and each ML model.

How to train models and output results?

Inside the project directory (../ml4prom/), run the following to list the available arguments:

    python -m src.models.train_model -h

which returns:

    --debug DEBUG
        When True, plots ROC-Curve & Confusion Matrix
    --seq_encoding SEQ_ENCODING
        Possible encodings; 'one-hot' & 'n-gram' where n is an integer
    --unique_traces UNIQUE_TRACES
        when True, duplicate traces(trace variants) are removed from dataset
    --remove_biased_feats REMOVE_BIASED_FEATS
        when True, the biased features are removed from dataset, e.g. patient is dead in COVID dataset
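
The two `--seq_encoding` options can be illustrated with a small, standard-library-only sketch. It assumes one plausible reading: 'one-hot' marks which activities occur in a trace, and 'n-gram' counts each run of n consecutive activities. The function names (`build_vocab`, `encode`) and the toy traces are hypothetical, and the repository's own encoders may differ in detail.

```python
from typing import Dict, List, Sequence, Tuple

Trace = Sequence[str]
Gram = Tuple[str, ...]


def build_vocab(traces: Sequence[Trace], n: int) -> Dict[Gram, int]:
    """Map every n-gram seen in the traces to a column index (n=1: single activities)."""
    grams = {tuple(t[i:i + n]) for t in traces for i in range(len(t) - n + 1)}
    return {g: idx for idx, g in enumerate(sorted(grams))}


def encode(trace: Trace, vocab: Dict[Gram, int], n: int, binary: bool) -> List[int]:
    """Encode one trace as a fixed-length vector over the vocabulary.

    binary=True  -> presence/absence (the assumed 'one-hot' reading when n=1)
    binary=False -> occurrence counts (the assumed 'n-gram' reading)
    """
    vec = [0] * len(vocab)
    for i in range(len(trace) - n + 1):
        gram = tuple(trace[i:i + n])
        if gram in vocab:  # n-grams unseen at fit time are ignored
            col = vocab[gram]
            vec[col] = 1 if binary else vec[col] + 1
    return vec


# Toy traces (hypothetical activity names, not taken from the paper's datasets).
traces = [
    ["register", "triage", "discharge"],
    ["register", "triage", "triage", "admit"],
]

vocab_1 = build_vocab(traces, n=1)
one_hot = [encode(t, vocab_1, n=1, binary=True) for t in traces]

vocab_2 = build_vocab(traces, n=2)
bigrams = [encode(t, vocab_2, n=2, binary=False) for t in traces]
```

Both encodings yield vectors of equal length for every trace, which is what lets traces of different lengths be fed to standard classifiers.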

The following command runs the full pipeline:

- loads all datasets
- applies preprocessing, e.g. removes biased features, removes duplicate traces, etc.
- encodes traces (sequences of events)
- trains ML models with StratifiedKFold cross-validation
- writes a .csv file with the accuracy scores to ./reports/

    python -m src.models.train_model --seq_encoding one-hot --remove_biased_feats --unique_traces
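
The cross-validation step can be sketched with scikit-learn's `StratifiedKFold`, which keeps the class ratio roughly equal across folds. This is a minimal illustration rather than the repository's training script: the `cross_validate` helper, the synthetic data, and the choice of `LogisticRegression` as a stand-in classifier are all assumptions.

```python
from typing import List

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold


def cross_validate(X: np.ndarray, y: np.ndarray, n_splits: int = 5, seed: int = 0) -> List[float]:
    """Return one accuracy score per stratified fold."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in skf.split(X, y):
        # Stand-in model (assumption); a fresh instance is fitted on every fold.
        model = LogisticRegression(max_iter=1000)
        model.fit(X[train_idx], y[train_idx])
        scores.append(float(model.score(X[test_idx], y[test_idx])))
    return scores


# Synthetic, deterministic stand-in for encoded traces: 40 samples, 3 binary features.
X = np.array([[i % 2, (i // 2) % 2, (i // 4) % 2] for i in range(40)], dtype=float)
y = X[:, 0].astype(int)  # balanced toy label: 20 samples per class

scores = cross_validate(X, y, n_splits=5)
print(f"mean accuracy over {len(scores)} folds: {np.mean(scores):.3f}")
```

Stratification matters most when classes are imbalanced, since a plain K-fold split could leave a fold with very few examples of the minority class.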


Citation:

@inproceedings{velioglu2022explainable,
  title={Explainable Artificial Intelligence for Improved Modeling of Processes},
  author={Velioglu, Riza and G{\"o}pfert, Jan Philip and Artelt, Andr{\'e} and Hammer, Barbara},
  booktitle={International Conference on Intelligent Data Engineering and Automated Learning},
  pages={313--325},
  year={2022},
  organization={Springer}
}

Owner

  • Name: Riza Velioglu
  • Login: rizavelioglu
  • Kind: user
  • Location: Bielefeld, Germany
  • Company: Bielefeld University

Ph.D. candidate in Machine Learning | Co-founder & CRO @recommendy

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this repository, please consider citing it."
title: "ML4ProM"
abstract: "This repository implements methods presented in the paper: Explainable Artificial Intelligence for Improved Modeling of Processes."
authors:
- given-names: Riza
  family-names: Velioglu
  orcid: https://orcid.org/0000-0002-2160-4976
preferred-citation:
  type: article
  title: "Explainable Artificial Intelligence for Improved Modeling of Processes"
  doi: 10.1007/978-3-031-21753-1_31
  url: https://link.springer.com/chapter/10.1007/978-3-031-21753-1_31
  journal: "Intelligent Data Engineering and Automated Learning – IDEAL 2022"
  authors:
  - family-names: "Velioglu"
    given-names: "Riza"
  - family-names: "Göpfert"
    given-names: "Jan Philip"
  - family-names: "Artelt"
    given-names: "André"
  - family-names: "Hammer"
    given-names: "Barbara"
  year: 2022
keywords:
- machinelearning
- explainableartificialintelligence
- processmining
repository-code: "https://github.com/rizavelioglu/ml4prom"

GitHub Events

Total
  • Push event: 1
Last Year
  • Push event: 1

Dependencies

requirements.txt pypi
  • joblib >=1.2.0
  • matplotlib >=3.4.3
  • nltk *
  • numpy >=1.20.3
  • pandas *
  • pm4py *
  • scikit-learn *
  • seaborn ==0.11.2
  • tensorflow >=2.8.0
  • termcolor ==1.1.0
  • tqdm ==4.62.3
  • xgboost ==1.5.0