dpim
Source Code of the DPIM (Differential-Private-Inductive-Miner)
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (8.9%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: Schulze-M
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://arxiv.org/abs/2407.04595
- Size: 7.32 MB
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Differentially Private Inductive Miner (DPIM)
Before running the DPIM, make sure the requirements are installed. If they are not, install them with the following command:
python3 -m pip install -r requirements.txt
The DPIM offers two modes of execution:
1. Differentially private: In this mode, the DPIM runs with differential privacy. The user can specify the epsilon ($\epsilon$) value as well as the lower and upper bounds, and the DPIM then runs with these values. To avoid errors, the lower bound must be at least the number of unique activities, $\#unique\ activities$, and the upper bound must be at most $(\#unique\ activities)^2 - 1$.
2. Non-differentially private: In this mode, the DPIM runs with $\epsilon \rightarrow \infty$. No lower and upper bounds are needed, because only permutations that occur in at least one trace are considered.
Of these two modes, the differentially private mode is the default. To run the DPIM in non-differentially private mode, specify the --no-dp flag.
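The bound limits described above can be sketched in a few lines of Python. This is a minimal illustration, not the DPIM's own code: the function names `default_bounds` and `check_bounds` are hypothetical, traces are assumed to be plain lists of activity labels, and the requirement that the lower bound does not exceed the upper bound is an assumption.

```python
from typing import Iterable, Sequence, Tuple


def default_bounds(traces: Iterable[Sequence[str]]) -> Tuple[int, int]:
    """Return (lower, upper) = (n, n**2 - 1) for n unique activities."""
    # Collect every distinct activity label that appears in any trace.
    activities = {activity for trace in traces for activity in trace}
    n = len(activities)
    return n, n * n - 1


def check_bounds(lower: int, upper: int, n_activities: int) -> None:
    """Raise ValueError if the bounds leave the range the README describes."""
    lowest = n_activities
    highest = n_activities ** 2 - 1
    if lower < lowest:
        raise ValueError(f"lower bound {lower} is below {lowest}")
    if upper > highest:
        raise ValueError(f"upper bound {upper} is above {highest}")
    if lower > upper:
        # Assumption: the README does not state this, but an empty range
        # would leave nothing to choose from.
        raise ValueError("lower bound exceeds upper bound")


if __name__ == "__main__":
    toy_log = [["a", "b", "c"], ["a", "c", "b"], ["a", "b"]]
    print(default_bounds(toy_log))
```

For a log with six unique activities, the same arithmetic gives a lowest lower bound of 6 and a highest upper bound of 35.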
To test the DPIM on a specific event log, you can run the following command:
python3 main.py <eventlog> --epsilon <epsilon> --lower <lower_bound> --upper <upper_bound>
where:
- <eventlog> is the path to the event log. This is a required argument.
- <epsilon> is the epsilon value.
- <lower_bound> is the lower bound.
- <upper_bound> is the upper bound.
Or, for non-differentially private mode, run the following command:
python3 main.py <eventlog> --no-dp
All synthetic event logs and the URLs to the BPI Challenges are in the event_logs directory.
Info: If no flags are given, or a required flag is omitted, the DPIM asks the user to enter the missing values.
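A fallback of this kind, where a missing value is requested interactively, could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper `value_or_prompt` and its parameters are hypothetical and are not taken from the DPIM source, and retrying on invalid input is a design choice made here.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def value_or_prompt(
    value: Optional[T],
    label: str,
    cast: Callable[[str], T],
    ask: Callable[[str], str] = input,
) -> T:
    """Return `value` if it was supplied, otherwise ask until `cast` succeeds."""
    if value is not None:
        return value
    while True:
        raw = ask(f"Please enter {label}: ")
        try:
            return cast(raw)
        except ValueError:
            # Assumption: invalid input is reported and the prompt repeats.
            print(f"Invalid {label}: {raw!r}")
```

Passing the prompt function as `ask` keeps the helper easy to test, since a stub can stand in for `input`.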
Arguments
The following arguments are available for the DPIM:
- eventlog The path to the event log. This is a required argument.
- -e, --epsilon The $\epsilon$ value for differential privacy. The default value is 1.0.
- -l, --lower The lower bound for the number of permutations. The default value is the $\#unique\ activities$ in the event log.
- -u, --upper The upper bound for the number of permutations. The default value is the $(\#unique\ activities)^2 -1$.
- -t, --threshold The threshold used by the rejection sampler to accept the generated PST. The default value is 0.95.
- --no-dp The flag to run the DPIM in non-differentially private mode, $\epsilon \rightarrow \infty$.
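For readers who want to see how such an interface maps onto Python's standard `argparse` module, here is a minimal sketch that mirrors the arguments listed above. It is an assumption about how a parser with these flags could be written, not a copy of the DPIM's `main.py`; the function name `build_parser` is hypothetical, and `None` is used here to mean "derive the bound from the event log".

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags documented in the README."""
    parser = argparse.ArgumentParser(
        description="Differentially Private Inductive Miner (sketch)"
    )
    parser.add_argument("eventlog", help="path to the event log")
    parser.add_argument("-e", "--epsilon", type=float, default=1.0,
                        help="epsilon value for differential privacy")
    # None stands for "use the default derived from the log" (assumption).
    parser.add_argument("-l", "--lower", type=int, default=None,
                        help="lower bound for the number of permutations")
    parser.add_argument("-u", "--upper", type=int, default=None,
                        help="upper bound for the number of permutations")
    parser.add_argument("-t", "--threshold", type=float, default=0.95,
                        help="acceptance threshold of the rejection sampler")
    parser.add_argument("--no-dp", action="store_true",
                        help="run in non-differentially private mode")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

`argparse` stores `--no-dp` under the attribute name `no_dp`, so the mode can be checked with `args.no_dp`.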
Example
To test the DPIM on the TF_5 event log with $\epsilon = 1.0$, a lower bound of 6, an upper bound of 35, and a threshold of 0.9, run the following command:
python3 main.py event_logs/synthetic_EventLogs/TF_5.xes -e 1.0 -l 6 -u 35 -t 0.9
Bounds used
The following tables show the lower and upper bounds used for the BPI Challenge datasets (all links can be found at BPI) and the synthetic logs.
BPI Challenges

| Event Log | Lower Bound | Upper Bound |
|---|---|---|
| BPIChallenge2011 | 4280 | 4310 |
| BPIChallenge2012 | 120 | 150 |
| BPIChallenge2013closedproblems | 5 | 20 |
| BPIChallenge2013incidents | 5 | 20 |
| BPIChallenge2013openproblems | 5 | 15 |
| BPIChallenge20151 | 4805 | 4835 |
| BPIChallenge20152 | 4885 | 4915 |
| BPIChallenge20153 | 5020 | 5050 |
| BPIChallenge20154 | 3650 | 3680 |
| BPIChallenge20155 | 4960 | 4990 |
| BPIChallenge2017 | 175 | 205 |
| BPIChallenge2018 | 605 | 635 |
| BPIChallenge2019 | 525 | 555 |
| DomesticDeclarations2020 | 30 | 60 |
| InternationalDeclarations2020 | 195 | 225 |
| PermitLog2020 | 555 | 585 |
| PrepaidTravelCost2020 | 160 | 190 |
| RequestForPayment2020 | 40 | 70 |
| Sepsis Cases-Event Log | 120 | 150 |

Synthetic Logs

| Event Log | Lower Bound | Upper Bound |
|---|---|---|
| TF04 | 6 | 20 |
| TF05 | 6 | 35 |
| TF06 | 6 | 35 |
| TF07 | 3 | 8 |
| TF08 | 3 | 8 |
| TF09 | 6 | 35 |
| TF10 | 6 | 35 |
| TF11 | 5 | 24 |
| TF12 | 5 | 24 |
| TF13 | 5 | 24 |
| TF14 | 30 | 60 |
| TF15 | 3 | 8 |
| TF16 | 6 | 35 |
Cite
@inproceedings{Schulze_2024,
  author={Schulze, Max and Zisgen, Yorck and Kirschte, Moritz and Mohammadi, Esfandiar and Koschmider, Agnes},
  title={Differentially Private Inductive Miner},
  booktitle={2024 6th International Conference on Process Mining (ICPM)},
  DOI={10.1109/icpm63005.2024.10680684},
  publisher={IEEE},
  year={2024},
  pages={89--96}
}
Owner
- Name: Schulze
- Login: Schulze-M
- Kind: user
- Repositories: 1
- Profile: https://github.com/Schulze-M
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Schulze"
    given-names: "Max"
  - family-names: "Zisgen"
    given-names: "Yorck"
  - family-names: "Kirschte"
    given-names: "Moritz"
  - family-names: "Mohammadi"
    given-names: "Esfandiar"
  - family-names: "Koschmider"
    given-names: "Agnes"
title: "Differentially Private Inductive Miner"
doi: 10.1109/ICPM63005.2024.10680684
date-released: 2024-10-04
url: "https://github.com/Schulze-M/Moe-Miner.git"
preferred-citation:
  type: conference-paper
  authors:
    - family-names: "Schulze"
      given-names: "Max"
    - family-names: "Zisgen"
      given-names: "Yorck"
    - family-names: "Kirschte"
      given-names: "Moritz"
    - family-names: "Mohammadi"
      given-names: "Esfandiar"
    - family-names: "Koschmider"
      given-names: "Agnes"
  title: "Differentially Private Inductive Miner"
  collection-title: "2024 6th International Conference on Process Mining (ICPM)"
  doi: 10.1109/ICPM63005.2024.10680684
  year: 2024
  start: 89
  end: 96
GitHub Events
Total
- Watch event: 3
- Push event: 5
Last Year
- Watch event: 3
- Push event: 5
Dependencies
- numpy *
- pm4py ==2.2.31