https://github.com/bptlab/mimic-log-extraction

A CLI tool for extracting event logs out of MIMIC Databases.

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.0%) to scientific vocabulary

Keywords

event-log mimic-iv process-mining
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: bptlab
  • Language: Python
  • Default Branch: main
  • Size: 3.86 MB
Statistics
  • Stars: 10
  • Watchers: 2
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created over 3 years ago · Last pushed almost 3 years ago
Metadata Files
Readme

README.md

mimic-log-extraction

A CLI tool for extracting event logs out of MIMIC databases. This branch targets MIMIC-IV 1.0. If you use MIMIC-IV 2.0 or 2.2, please pull from the respective branch instead: https://github.com/bptlab/mimic-log-extraction/tree/mimic-2.0 (MIMIC-IV 2.0) or https://github.com/bptlab/mimic-log-extraction/blob/mimic-2.2/ (MIMIC-IV 2.2).

  • Requires Python 3.8.10 (newer versions might work as well)
  • Using a Python virtual environment is recommended

The official Python documentation provides a good overview of how to create virtual environments. We recommend placing the environment either in this directory or one level above.

usage

```
usage: extract_log.py [-h] [--dbname DBNAME] [--dbhost DBHOST]
                      [--dbuser DBUSER] [--dbpw DBPW]
                      [--subjectids SUBJECTIDS] [--hadmids HADMIDS]
                      [--icd ICD] [--icdversion ICDVERSION]
                      [--icdsequencenumber ICDSEQUENCENUMBER] [--drg DRG]
                      [--drgtype DRGTYPE] [--age AGE] [--type TYPE]
                      [--tables TABLES] [--tablesactivities TABLESACTIVITIES]
                      [--tablestimestamps TABLESTIMESTAMPS] [--notion NOTION]
                      [--caseattributelist CASEATTRIBUTELIST]
                      [--config CONFIG] [--saveintermediate]
                      [--ignoreintermediate]

optional arguments:
  -h, --help            show this help message and exit
  --dbname DBNAME       Database Name
  --dbhost DBHOST       Database Host
  --dbuser DBUSER       Database User
  --dbpw DBPW           Database Password
  --subjectids SUBJECTIDS
                        Subject IDs of cohort
  --hadmids HADMIDS     Hospital Admission IDs of cohort
  --icd ICD             ICD code(s) of cohort
  --icdcodesintersection
                        Optional argument, if one wants to filter for disease
                        combinations, such that patients have to have an icd
                        code from icdcodes and from icdcodesintersection
  --icdversion ICDVERSION
                        ICD version
  --icdsequencenumber ICDSEQUENCENUMBER
                        Ranking threshold of diagnosis
  --drg DRG             DRG code(s) of cohort
  --drgtype DRGTYPE     DRG type (HCFA, APR)
  --age AGE             Patient Age of cohort
  --type TYPE           Event Type
  --tables TABLES       Low level tables
  --tablesactivities TABLESACTIVITIES
                        Activity Columns for Low level tables
  --tablestimestamps TABLESTIMESTAMPS
                        Timestamp Columns for Low level tables
  --notion NOTION       Case Notion
  --caseattributelist CASEATTRIBUTELIST
                        Case Attributes
  --config CONFIG       Config file for providing all options via file
  --saveintermediate    Store intermediate extraction results as csv. For
                        debugging purposes.
  --ignoreintermediate  Explicitly disable storing of intermediate results.
  --csvlog              Store resulting log as a .csv file instead of as an
                        .xes event log
```

Call the tool via

```bash
python3 -m extract_log <...>
```

passing the required parameters.

If you installed the tool by cloning this repository, execute the script directly instead:

```bash
python3 ./extract_log.py <...>
```
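When the extraction is driven from another Python script, the command line can be assembled programmatically. The following is a minimal sketch and not part of the tool: `build_command` and the two flag sets are hypothetical helpers, with flag names copied from the help text above. `--icdcodesintersection` is left out because the help text does not show whether it takes a value, and the default script path assumes the cloned-repository layout.

```python
import sys
from typing import Iterable, List, Mapping

# Flags that take a value, copied from the usage text of extract_log.py.
VALUE_FLAGS = frozenset({
    "dbname", "dbhost", "dbuser", "dbpw", "subjectids", "hadmids", "icd",
    "icdversion", "icdsequencenumber", "drg", "drgtype", "age", "type",
    "tables", "tablesactivities", "tablestimestamps", "notion",
    "caseattributelist", "config",
})
# Flags that act as on/off switches.
SWITCH_FLAGS = frozenset({"saveintermediate", "ignoreintermediate", "csvlog"})


def build_command(options: Mapping[str, object],
                  switches: Iterable[str] = (),
                  script: str = "./extract_log.py") -> List[str]:
    """Return an argv list for the CLI; raise ValueError on unknown flags.

    Hypothetical helper: it only assembles arguments and does not check
    whether the values themselves are accepted by the tool.
    """
    command = [sys.executable, script]
    for name, value in options.items():
        if name not in VALUE_FLAGS:
            raise ValueError(f"unknown value flag: --{name}")
        command.extend([f"--{name}", str(value)])
    for name in switches:
        if name not in SWITCH_FLAGS:
            raise ValueError(f"unknown switch: --{name}")
        command.append(f"--{name}")
    return command


if __name__ == "__main__":
    # Illustrative call: read all options from a config file, write a CSV log.
    print(build_command({"config": "example_config.yml"}, ["csvlog"]))
```

The resulting list can be handed to `subprocess.run(command, check=True)`. Passing a list rather than a single shell string avoids quoting problems with values such as the database password.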

config file

To provide parameters via a .yml config file, pass the path to that file with the --config flag. Settings from the config file override any setting provided via prompt or input flag, so be careful. Refer to the example_config.yml file for how to provide options. The config keys icd_codes, drg_codes, and additional_event_attributes must be explicitly set to [] if you do not want to be prompted for them during extraction. include_medications only needs to be set for POE event logs to avoid the prompt. When case_attributes is set to [], the default attributes for the chosen case notion are used. If the key is not provided, no case attributes are added. To be prompted for case attributes during execution, set prompt_case_attributes to true.
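The prompting rules above can be checked against an already parsed config before starting a long extraction. The sketch below is an illustration of those rules as stated in this README, not the tool's own logic: `find_key` and `expected_prompts` are hypothetical helpers, the config is assumed to be a plain nested dict (for example the result of loading the YAML file), and keys are searched at any nesting depth so that the sketch does not depend on the exact file layout.

```python
from typing import Any, List

_MISSING = object()  # sentinel: distinguishes "key absent" from "key set to None"


def find_key(config: Any, key: str) -> Any:
    """Depth-first search for `key` in nested dicts; _MISSING if absent."""
    if isinstance(config, dict):
        if key in config:
            return config[key]
        for value in config.values():
            found = find_key(value, key)
            if found is not _MISSING:
                return found
    return _MISSING


def expected_prompts(config: dict) -> List[str]:
    """List the options the README says would be prompted for."""
    prompts = []
    # These keys must be present (an empty list is enough) to avoid a prompt.
    for key in ("icd_codes", "drg_codes", "additional_event_attributes"):
        if find_key(config, key) is _MISSING:
            prompts.append(key)
    # include_medications only matters for POE event logs.
    event_type = find_key(config, "event_type")
    if isinstance(event_type, str) and event_type.lower() == "poe":
        if find_key(config, "include_medications") is _MISSING:
            prompts.append("include_medications")
    # Case attributes are only prompted for when explicitly requested
    # (assumption: an existing case_attributes key takes precedence).
    if (find_key(config, "case_attributes") is _MISSING
            and find_key(config, "prompt_case_attributes") is True):
        prompts.append("case_attributes")
    return prompts
```

An empty result suggests the run should proceed without interaction, which is useful for batch jobs. How the tool treats a key that is present but empty (YAML null) is not covered here.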

```yaml
db:
  name: mimic
  host: 127.0.0.1
  user: some_db_user
  pw: some_db_password
save_intermediate: True # True, False
csv_log: False # True, defaults to False
cohort:
  subject_ids: # Omitting does not consider subject_ids
    - some subject_ids
    - ...
  hadm_ids: # Omitting does not consider hadm_ids
    - some hadm_ids
    - ...
  icd_codes: # could also be [] to avoid ICD filtering. Omitting makes the tool prompt for input.
    - some ICD code
    - ...
  icd_codes_intersection: # optional argument, if one wants to filter for disease combinations, such that patients have to have an icd code from icd_codes and from icd_codes_intersection
    - some ICD code
    - ...
  icd_version: 10 # 9, 10, 0
  icd_seq_num: 1
  drg_codes: [] # could also contain keys to filter for DRG codes. Omitting makes the tool prompt for input.
  drg_ontology: APR # APR, HCFA
  age: # could also be [] to avoid age range filtering. Omitting makes the tool prompt for input.
    - 0:25
    - 50:90
event_type: admission # admission, transfer, poe
include_medications: False # False, True. Only needed if POE event_type
case_notion: hospital admission # subject, hospital admission
case_attributes: [] # could also be None. [] uses default case attributes for case notion.
prompt_case_attributes: False # False, True. Setting True forces case attributes to be determined if not provided
low_level_tables: # only if event type OTHER
  - pharmacy
  - labevents
low_level_activities:
  - medication
  - label
low_level_timestamps:
  - starttime
  - charttime
additional_event_attributes: # Can be set to []. Omitting makes the tool prompt for input
  - start_column: a
    end_column: b
    time_column: c
    table_to_aggregate: d
    column_to_aggregate: f
    aggregation_method: g
    filter_column: h # can be omitted
    filter_values:
      - one
      - other
  - start_column: a
    end_column: b
    time_column: c
    table_to_aggregate: d
    column_to_aggregate: f
    aggregation_method: g
    filter_column: h # can be omitted
```
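To make the cohort options more concrete, here is a small sketch of how the age entries and the two ICD lists could be interpreted. It is an illustration under stated assumptions, not the extraction code: all function names are hypothetical, age entries are assumed to be "min:max" strings with inclusive bounds, ICD codes are compared by exact match, and diagnoses are assumed to be grouped per hospital admission.

```python
from typing import Dict, Iterable, List, Set, Tuple


def parse_age_ranges(ranges: Iterable[str]) -> List[Tuple[int, int]]:
    """Turn entries such as "0:25" into (0, 25) tuples."""
    parsed = []
    for entry in ranges:
        low, high = str(entry).split(":")
        parsed.append((int(low), int(high)))
    return parsed


def age_matches(age: int, ranges: List[Tuple[int, int]]) -> bool:
    """True if age lies in any range; an empty list disables the filter."""
    if not ranges:
        return True
    # Assumption: both bounds are inclusive.
    return any(low <= age <= high for low, high in ranges)


def select_by_icd(diagnoses: Dict[int, Set[str]],
                  icd_codes: Iterable[str],
                  icd_codes_intersection: Iterable[str] = ()) -> List[int]:
    """Keep admissions that have a code from each non-empty list."""
    first = set(icd_codes)
    second = set(icd_codes_intersection)
    selected = []
    for hadm_id, codes in diagnoses.items():
        if first and not (codes & first):
            continue  # no code from icd_codes
        if second and not (codes & second):
            continue  # no code from icd_codes_intersection
        selected.append(hadm_id)
    return selected
```

With made-up data, `select_by_icd({1: {"I50", "E11"}, 2: {"I50"}}, ["I50"], ["E11"])` keeps only admission 1, because admission 2 has no code from the second list.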

installation

Simply run the pip installation command to install the extraction tool:

```bash
pip install git+https://github.com/bptlab/mimic-log-extraction/
```

Alternatively, clone this repo and execute

```bash
pip install -e .
```

For development and testing, all dev dependencies can be installed using

```bash
pip install -e .[dev]
```

If you're using zsh, escape the square brackets: pip install -e .\[dev\]

development

After installing all required dev dependencies, regularly run

```bash
pylint extract_log.py extractor --rcfile .pylintrc
mypy --config-file mypy.ini .
```

to keep the code linted and type-checked.

Owner

  • Name: Business Process Technology
  • Login: bptlab
  • Kind: organization
  • Location: Potsdam, Germany

Business Process Technology @ Hasso Plattner Institute, University of Potsdam

GitHub Events

Total
  • Watch event: 3
Last Year
  • Watch event: 3

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 171
  • Total Committers: 3
  • Avg Commits per committer: 57.0
  • Development Distribution Score (DDS): 0.485
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Finn K f****f 88
jcremerius j****s@s****e 78
Jonas Cremerius j****s@h****e 5

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Dependencies

.github/workflows/mypy.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
.github/workflows/pylint.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
setup.py pypi