https://github.com/bptlab/mimic-log-extraction
A CLI tool for extracting event logs out of MIMIC Databases.
Repository
Statistics
- Stars: 10
- Watchers: 2
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
mimic-log-extraction
A CLI tool for extracting event logs out of MIMIC databases. This branch targets MIMIC-IV 1.0. If you use MIMIC-IV 2.0 or 2.2, please pull from the respective branch: https://github.com/bptlab/mimic-log-extraction/tree/mimic-2.0 (MIMIC-IV 2.0) or https://github.com/bptlab/mimic-log-extraction/blob/mimic-2.2/ (MIMIC-IV 2.2).
- Requires Python 3.8.10 (newer versions might work as well).
- Using a Python virtual environment is recommended.
The official Python documentation provides a good overview of how to create virtual environments. We recommend placing the environment either in this directory or one level above.
usage
```
usage: extract_log.py [-h] [--dbname DBNAME] [--dbhost DBHOST] [--dbuser DBUSER] [--dbpw DBPW]
                      [--subjectids SUBJECTIDS] [--hadmids HADMIDS] [--icd ICD]
                      [--icdversion ICDVERSION] [--icdsequencenumber ICDSEQUENCENUMBER]
                      [--drg DRG] [--drgtype DRGTYPE] [--age AGE] [--type TYPE] [--tables TABLES]
                      [--tablesactivities TABLESACTIVITIES] [--tablestimestamps TABLESTIMESTAMPS]
                      [--notion NOTION] [--caseattributelist CASEATTRIBUTELIST] [--config CONFIG]
                      [--saveintermediate] [--ignoreintermediate]

optional arguments:
  -h, --help            show this help message and exit
  --dbname DBNAME       Database Name
  --dbhost DBHOST       Database Host
  --dbuser DBUSER       Database User
  --dbpw DBPW           Database Password
  --subjectids SUBJECTIDS
                        Subject IDs of cohort
  --hadmids HADMIDS     Hospital Admission IDs of cohort
  --icd ICD             ICD code(s) of cohort
  --icdcodesintersection
                        Optional argument, if one wants to filter for disease
                        combinations, such that patients have to have an icd
                        code from icdcodes and from icdcodesintersection
  --icdversion ICDVERSION
                        ICD version
  --icdsequencenumber ICDSEQUENCENUMBER
                        Ranking threshold of diagnosis
  --drg DRG             DRG code(s) of cohort
  --drgtype DRGTYPE     DRG type (HCFA, APR)
  --age AGE             Patient Age of cohort
  --type TYPE           Event Type
  --tables TABLES       Low level tables
  --tablesactivities TABLESACTIVITIES
                        Activity Columns for Low level tables
  --tablestimestamps TABLESTIMESTAMPS
                        Timestamp Columns for Low level tables
  --notion NOTION       Case Notion
  --caseattributelist CASEATTRIBUTELIST
                        Case Attributes
  --config CONFIG       Config file for providing all options via file
  --saveintermediate    Store intermediate extraction results as csv. For
                        debugging purposes.
  --ignoreintermediate  Explicitly disable storing of intermediate results.
  --csvlog              Store resulting log as a .csv file instead of as an
                        .xes event log
```
Call the tool via
```bash
python3 -m extract_log <...>
```
passing the required parameters.
If you installed the tool by cloning this repository, you should instead execute
```bash
python3 ./extract_log.py <...>
```
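When the extraction is started from another Python script, the flags from the help text above can be assembled programmatically. The following is a minimal sketch and not part of the tool: `build_command` is a hypothetical helper, and treating `--saveintermediate`, `--ignoreintermediate`, and `--csvlog` as value-less switches is an assumption inferred from the help text, which lists them without a value.

```python
import shlex
from typing import Dict, List, Union

# Flags that the help text lists without a value; assumed to be plain switches.
SWITCHES = {"saveintermediate", "ignoreintermediate", "csvlog"}


def build_command(options: Dict[str, Union[str, bool]],
                  script: str = "./extract_log.py") -> List[str]:
    """Turn an options mapping into an argv list for extract_log.py.

    Hypothetical helper for illustration. Keys are flag names without
    the leading dashes, e.g. {"dbname": "mimic"}.
    """
    argv = ["python3", script]
    for name, value in options.items():
        if name in SWITCHES:
            if value:  # switches are only emitted when truthy
                argv.append(f"--{name}")
        else:
            argv.extend([f"--{name}", str(value)])
    return argv


command = build_command({
    "dbname": "mimic",
    "dbhost": "127.0.0.1",
    "dbuser": "some_db_user",
    "csvlog": True,
})
# Print a copy-pasteable shell line (shlex.join needs Python 3.8+).
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run` from the repository root. Passing an argv list rather than a single string avoids shell quoting problems with values that contain spaces.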
config file
To provide parameters via a .yml config file, pass the path to that file with the --config flag.
Settings from the config file override any setting provided via prompt or input flag, so be careful. Refer to the example_config.yml file for how to provide options.
- icd_codes, drg_codes, and additional_event_attributes must be explicitly set to [] if you do not want to be prompted for them during extraction.
- include_medications only needs to be set for POE event logs to avoid the prompt.
- When case_attributes is set to [], the default attributes for the case notion are used. If the key is not provided, no case attributes are added. To be prompted for them during execution, set prompt_case_attributes to true.
```yaml
db:
  name: mimic
  host: 127.0.0.1
  user: some_db_user
  pw: some_db_password
save_intermediate: True # True, False
csv_log: False # True, defaults to False
cohort:
  subject_ids: # Omitting does not consider subject_ids
    - some subject_ids
    - ...
  hadm_ids: # Omitting does not consider hadm_ids
    - some hadm_ids
    - ...
  icd_codes: # could also be [] to avoid ICD filtering. Omitting makes the tool prompt for input.
    - some ICD code
    - ...
  icd_codes_intersection: # optional argument, if one wants to filter for disease combinations, such that patients have to have an icd code from icd_codes and from icd_codes_intersection
    - some ICD code
    - ...
  icd_version: 10 # 9, 10, 0
  icd_seq_num: 1
  drg_codes: [] # could also contain keys to filter for DRG codes. Omitting makes the tool prompt for input.
  drg_ontology: APR # APR, HCFA
  age: # could also be [] to avoid age range filtering. Omitting makes the tool prompt for input.
    - 0:25
    - 50:90
event_type: admission # admission, transfer, poe
include_medications: False # False, True. Only needed if POE event_type
case_notion: hospital admission # subject, hospital admission
case_attributes: [] # could also be None. [] uses default case attributes for case notion.
prompt_case_attributes: False # False, True. Setting True forces case attributes to be determined if not provided
low_level_tables: # only if event type OTHER
  - pharmacy
  - labevents
low_level_activities:
  - medication
  - label
low_level_timestamps:
  - starttime
  - charttime
additional_event_attributes: # Can be set to []. Omitting makes the tool prompt for input
  -
    start_column: a
    end_column: b
    time_column: c
    table_to_aggregate: d
    column_to_aggregate: f
    aggregation_method: g
    filter_column: h # can be omitted
    filter_values:
      - one
      - other
  -
    start_column: a
    end_column: b
    time_column: c
    table_to_aggregate: d
    column_to_aggregate: f
    aggregation_method: g
    filter_column: h # can be omitted
```
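Before starting an unattended run, it can help to check which omitted keys would still cause an interactive prompt. The sketch below only encodes the rules stated in this section and in the comments of the example config. It is not taken from the tool's source, so the actual behaviour may differ. `keys_that_prompt` is a hypothetical helper, the plain dictionary stands in for the parsed config file, and the nesting of `icd_codes`, `drg_codes`, and `age` under `cohort` is an assumption about the file layout.

```python
from typing import Any, Dict, List


def keys_that_prompt(config: Dict[str, Any]) -> List[str]:
    """List config keys whose absence would trigger an interactive prompt.

    Hypothetical helper: encodes only the rules described in this README.
    """
    missing: List[str] = []
    cohort = config.get("cohort") or {}

    # Cohort filters: a present key (even set to []) means no prompt.
    for key in ("icd_codes", "drg_codes", "age"):
        if key not in cohort:
            missing.append(f"cohort.{key}")

    if "additional_event_attributes" not in config:
        missing.append("additional_event_attributes")

    # include_medications only matters for POE event logs.
    is_poe = str(config.get("event_type", "")).lower() == "poe"
    if is_poe and "include_medications" not in config:
        missing.append("include_medications")

    # Case attributes are only prompted for when explicitly requested.
    if "case_attributes" not in config and config.get("prompt_case_attributes"):
        missing.append("case_attributes")

    return missing


example = {
    "cohort": {"icd_codes": [], "drg_codes": []},
    "event_type": "poe",
    "additional_event_attributes": [],
}
print(keys_that_prompt(example))  # prints ['cohort.age', 'include_medications']
```

An empty result suggests that, under these assumptions, the config can be used without further input.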
installation
Simply run the pip installation command to install the extraction tool:
```bash
pip install git+https://github.com/bptlab/mimic-log-extraction/
```
Alternatively, clone this repo and execute
```bash
pip install -e .
```
For development and testing, all dev dependencies can be installed using
```bash
pip install -e .[dev]
```
If you're using zsh, escape the square brackets: pip install -e .\[dev\]
development
After installing all required dev dependencies, make sure to regularly call
```bash
pylint extract_log.py extractor --rcfile .pylintrc
mypy --config-file mypy.ini .
```
to keep the code linted and type-checked.
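Both checks can also be wrapped in one small script so that neither is forgotten. This is a sketch under assumptions, not something shipped with the repository: `run_checks` and `CHECKS` are hypothetical names, and the command lines are copied from the section above. The runner is injectable so the logic can be exercised without pylint or mypy installed.

```python
import subprocess
from typing import Callable, List, Sequence

# The two commands from the development section, as argv lists.
CHECKS: List[List[str]] = [
    ["pylint", "extract_log.py", "extractor", "--rcfile", ".pylintrc"],
    ["mypy", "--config-file", "mypy.ini", "."],
]


def _run(argv: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(argv), check=False).returncode


def run_checks(runner: Callable[[Sequence[str]], int] = _run) -> int:
    """Run every check, even after a failure.

    Returns 0 if all checks passed, otherwise the first non-zero exit code.
    """
    first_failure = 0
    for argv in CHECKS:
        print("$", " ".join(argv))
        code = runner(argv)
        if code != 0 and first_failure == 0:
            first_failure = code
    return first_failure
```

Calling `sys.exit(run_checks())` from the repository root would then run both tools and propagate a failure to the shell.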
Owner
- Name: Business Process Technology
- Login: bptlab
- Kind: organization
- Location: Potsdam, Germany
- Website: https://bpt.hpi.uni-potsdam.de
- Repositories: 37
- Profile: https://github.com/bptlab
Business Process Technology @ Hasso Plattner Institute, University of Potsdam
GitHub Events
Total
- Watch event: 3
Last Year
- Watch event: 3
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Finn K | f****f | 88 |
| jcremerius | j****s@s****e | 78 |
| Jonas Cremerius | j****s@h****e | 5 |
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Dependencies
- actions/checkout v3 composite
- actions/setup-python v3 composite