dpim

Source Code of the DPIM (Differential-Private-Inductive-Miner)

https://github.com/schulze-m/dpim

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.9%) to scientific vocabulary

Keywords

privacy-preserving-data-mining

Repository

Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
privacy-preserving-data-mining
Created almost 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme License Citation

README.md

Differentially Private Inductive Miner (DPIM)

Before executing the DPIM, make sure the requirements are installed. If they are not, install them by running:

    python3 -m pip install -r requirements.txt

The DPIM offers two modes of execution:

1. Differentially private: the DPIM is executed with differential privacy. The user can specify the epsilon ($\epsilon$) value as well as the lower and upper bounds, and the DPIM runs with those values. To avoid errors, the lowest admissible lower bound is the total number of unique activities $(\#unique\ activities)$ and the highest admissible upper bound is $(\#unique\ activities)^2 - 1$.
2. Non-differentially private: the DPIM is executed using $\epsilon \rightarrow \infty$. Lower and upper bounds are not needed, because only permutations that occur in at least one trace are considered.
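The bound limits described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the DPIM repository: the function names `default_bounds` and `check_bounds` are hypothetical, and traces are assumed to be plain sequences of activity labels rather than a parsed XES log.

```python
from typing import Iterable, Sequence, Tuple


def default_bounds(traces: Iterable[Sequence[str]]) -> Tuple[int, int]:
    """Return (lowest lower bound, highest upper bound) for a log.

    Hypothetical helper: n is the number of unique activities,
    the lower limit is n and the upper limit is n^2 - 1.
    """
    activities = {activity for trace in traces for activity in trace}
    n = len(activities)
    return n, n * n - 1


def check_bounds(lower: int, upper: int, n_activities: int) -> None:
    """Raise ValueError if the bounds fall outside the limits stated above."""
    lowest, highest = n_activities, n_activities ** 2 - 1
    if lower < lowest:
        raise ValueError(f"lower bound {lower} is below {lowest}")
    if upper > highest:
        raise ValueError(f"upper bound {upper} is above {highest}")
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")


# Toy log with three unique activities: a, b, c.
toy_log = [["a", "b", "c"], ["a", "c"], ["b", "a"]]
bounds = default_bounds(toy_log)
check_bounds(bounds[0], bounds[1], n_activities=3)
```

For a log with six unique activities the same rule gives a lower limit of 6 and an upper limit of 35, which matches the values used in the example command later in this README.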

Of these two modes, the differentially private mode is the default. To execute the DPIM in non-differentially private mode, specify the --no-dp flag.

To test the DPIM on a specific event log, run the following command:

    python3 main.py <eventlog> --epsilon <epsilon> --lower <lower_bound> --upper <upper_bound>

where:
- <eventlog> is the path to the event log. This is a required argument.
- <epsilon> is the epsilon value.
- <lower_bound> is the lower bound.
- <upper_bound> is the upper bound.

Or, for non-differentially private mode, run the following command:

    python3 main.py <eventlog> --no-dp

All synthetic event logs and the URLs to the BPI Challenges are in the event_logs directory.

Info: If no flag is given, or a specific flag is omitted, the DPIM asks the user to input the missing values.

Arguments

The following arguments are available for the DPIM:
- eventlog: The path to the event log. This is a required argument.
- -e, --epsilon: The $\epsilon$ value for differential privacy. The default value is 1.0.
- -l, --lower: The lower bound for the number of permutations. The default value is the $\#unique\ activities$ in the event log.
- -u, --upper: The upper bound for the number of permutations. The default value is $(\#unique\ activities)^2 - 1$.
- -t, --threshold: The threshold used by the rejection sampler to accept the generated PST. The default value is 0.95.
- --no-dp: The flag to run the DPIM in non-differentially private mode, $\epsilon \rightarrow \infty$.
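As a rough illustration of how this command-line interface could be declared, the sketch below mirrors the listed arguments with Python's standard `argparse` module. It is an assumption about the shape of the interface, not the repository's actual `main.py`: the helper name `build_parser` is hypothetical, and the data-dependent defaults for the bounds are left as `None` here because they can only be computed once the event log has been read.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the arguments documented above."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Differentially Private Inductive Miner (sketch)",
    )
    parser.add_argument("eventlog", help="path to the event log (required)")
    parser.add_argument("-e", "--epsilon", type=float, default=1.0,
                        help="epsilon value for differential privacy")
    # None means: derive the bound from the number of unique activities.
    parser.add_argument("-l", "--lower", type=int, default=None,
                        help="lower bound for the number of permutations")
    parser.add_argument("-u", "--upper", type=int, default=None,
                        help="upper bound for the number of permutations")
    parser.add_argument("-t", "--threshold", type=float, default=0.95,
                        help="acceptance threshold of the rejection sampler")
    parser.add_argument("--no-dp", action="store_true",
                        help="run in non-differentially private mode")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch
# can be run anywhere; the log path is only a placeholder string.
args = build_parser().parse_args(
    ["TF_5.xes", "-e", "1.0", "-l", "6", "-u", "35", "-t", "0.9"]
)
```

After parsing, `args.no_dp` holds the state of the `--no-dp` flag, since `argparse` converts the hyphen in a long option name to an underscore.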

Example

To test the DPIM on the TF_5 event log with $\epsilon = 1.0$, the user can run the following command:

    python3 main.py event_logs\synthetic_EventLogs\TF_5.xes -e 1.0 -l 6 -u 35 -t 0.9

Bounds used

The following tables show the lower and upper bounds used for the BPI Challenge datasets (all links can be found at BPI) and the synthetic logs.

BPI Challenges

| Event Log | Lower Bound | Upper Bound |
|:---:|:---:|:---:|
| BPIChallenge2011 | 4280 | 4310 |
| BPIChallenge2012 | 120 | 150 |
| BPIChallenge2013closedproblems | 5 | 20 |
| BPIChallenge2013incidents | 5 | 20 |
| BPIChallenge2013openproblems | 5 | 15 |
| BPIChallenge20151 | 4805 | 4835 |
| BPIChallenge20152 | 4885 | 4915 |
| BPIChallenge20153 | 5020 | 5050 |
| BPIChallenge20154 | 3650 | 3680 |
| BPIChallenge20155 | 4960 | 4990 |
| BPIChallenge2017 | 175 | 205 |
| BPIChallenge2018 | 605 | 635 |
| BPIChallenge2019 | 525 | 555 |
| DomesticDeclarations2020 | 30 | 60 |
| InternationalDeclarations2020 | 195 | 225 |
| PermitLog2020 | 555 | 585 |
| PrepaidTravelCost2020 | 160 | 190 |
| RequestForPayment2020 | 40 | 70 |
| Sepsis Cases-Event Log | 120 | 150 |

Synthetic Logs

| Event Log | Lower Bound | Upper Bound |
|:---:|:---:|:---:|
| TF04 | 6 | 20 |
| TF05 | 6 | 35 |
| TF06 | 6 | 35 |
| TF07 | 3 | 8 |
| TF08 | 3 | 8 |
| TF09 | 6 | 35 |
| TF10 | 6 | 35 |
| TF11 | 5 | 24 |
| TF12 | 5 | 24 |
| TF13 | 5 | 24 |
| TF14 | 30 | 60 |
| TF15 | 3 | 8 |
| TF16 | 6 | 35 |
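To reuse the synthetic-log bounds listed above without retyping them, one option is a small lookup that assembles the documented command line. The sketch below is illustrative only: `SYNTHETIC_BOUNDS` and `dpim_command` are hypothetical names, the log path is passed in by the caller because the file naming inside the event_logs directory is not assumed here, and only flags documented in this README are used.

```python
from typing import Dict, List, Tuple

# (lower bound, upper bound) per synthetic log, copied from the table above.
SYNTHETIC_BOUNDS: Dict[str, Tuple[int, int]] = {
    "TF04": (6, 20), "TF05": (6, 35), "TF06": (6, 35),
    "TF07": (3, 8), "TF08": (3, 8), "TF09": (6, 35),
    "TF10": (6, 35), "TF11": (5, 24), "TF12": (5, 24),
    "TF13": (5, 24), "TF14": (30, 60), "TF15": (3, 8),
    "TF16": (6, 35),
}


def dpim_command(log_path: str, log_name: str, epsilon: float = 1.0) -> List[str]:
    """Build the argument list for a differentially private run.

    Hypothetical helper; raises KeyError for logs missing from the table.
    """
    lower, upper = SYNTHETIC_BOUNDS[log_name]
    return [
        "python3", "main.py", log_path,
        "--epsilon", str(epsilon),
        "--lower", str(lower),
        "--upper", str(upper),
    ]


# The resulting list can be handed to subprocess.run(command, check=True).
command = dpim_command("path/to/log.xes", "TF05")
```

Returning a list rather than a single string avoids shell quoting problems when the log path contains spaces or backslashes.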

Cite

@inproceedings{Schulze_2024,
  author={Schulze, Max and Zisgen, Yorck and Kirschte, Moritz and Mohammadi, Esfandiar and Koschmider, Agnes},
  title={Differentially Private Inductive Miner},
  booktitle={2024 6th International Conference on Process Mining (ICPM)},
  DOI={10.1109/icpm63005.2024.10680684},
  publisher={IEEE},
  year={2024},
  pages={89–96}
}

Owner

  • Name: Schulze
  • Login: Schulze-M
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Schulze"
  given-names: "Max"
- family-names: "Zisgen"
  given-names: "Yorck"
- family-names: "Kirschte"
  given-names: "Moritz"
- family-names: "Mohammadi"
  given-names: "Esfandiar"
- family-names: "Koschmider"
  given-names: "Agnes"
title: "Differentially Private Inductive Miner"
doi: 10.1109/ICPM63005.2024.10680684
date-released: 2024-10-04
url: "https://github.com/Schulze-M/Moe-Miner.git"
preferred-citation:
  type: conference-paper
  authors: 
    - family-names: "Schulze"
      given-names: "Max"
    - family-names: "Zisgen"
      given-names: "Yorck"
    - family-names: "Kirschte"
      given-names: "Moritz"
    - family-names: "Mohammadi"
      given-names: "Esfandiar"
    - family-names: "Koschmider"
      given-names: "Agnes"
  title: "Differentially Private Inductive Miner"
  collection-title: "2024 6th International Conference on Process Mining (ICPM)"
  doi: 10.1109/ICPM63005.2024.10680684
  year: 2024
  start: 89
  end: 96

GitHub Events

Total
  • Watch event: 3
  • Push event: 5
Last Year
  • Watch event: 3
  • Push event: 5

Dependencies

requirements.txt pypi
  • numpy *
  • pm4py ==2.2.31