omop-meds
An ETL pipeline for transforming OMOP datasets into the MEDS format using the MEDS-Transforms library.
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (12.8%) to scientific vocabulary
Repository
Basic Info
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 5
- Releases: 10
Metadata Files
README.md
MEDS OMOP ETL with MEDS-Transforms
An ETL pipeline for transforming OMOP datasets into the MEDS format using the MEDS-Transforms library. It was inspired by the first OMOP MEDS ETL, available at https://github.com/Medical-Event-Data-Standard/meds_etl; thanks to its developers. OMOP 5.3 and 5.4 datasets are currently supported.
```bash
pip install OMOP_MEDS
OMOP_MEDS root_output_dir=$ROOT_OUTPUT_DIR
```
To try with the MIMIC-IV OMOP demo dataset, you can run:
```bash
OMOP_MEDS root_output_dir=/path/to/your/output do_download=True ++do_demo=True
```
Example config for an OMOP dataset:
```yaml
dataset_name: MIMIC_IV_OMOP
raw_dataset_version: 1.0
omop_version: 5.3

urls:
  dataset:
    - https://physionet.org/content/mimic-iv-demo-omop/0.9/
    - url: EXAMPLE_CONTROLLED_URL
      username: ${oc.env:DATASET_DOWNLOAD_USERNAME}
      password: ${oc.env:DATASET_DOWNLOAD_PASSWORD}
  demo:
    - https://physionet.org/content/mimic-iv-demo-omop/0.9/
  common:
    - EXAMPLE_SHARED_URL # Often used for shared metadata files
```
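The `${oc.env:...}` entries in the config are OmegaConf environment interpolations, so the credentials for a controlled URL are read from environment variables at run time. A minimal pre-flight check could look like the sketch below. It assumes the variable names are `DATASET_DOWNLOAD_USERNAME` and `DATASET_DOWNLOAD_PASSWORD`; the helper `missing_credentials` is hypothetical and not part of OMOP_MEDS.

```python
import os
from typing import Mapping, Sequence

# Assumed names of the variables referenced by the config's oc.env interpolations.
CREDENTIAL_VARS = ("DATASET_DOWNLOAD_USERNAME", "DATASET_DOWNLOAD_PASSWORD")


def missing_credentials(
    env: Mapping[str, str] = os.environ,
    names: Sequence[str] = CREDENTIAL_VARS,
) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Set these variables before downloading:", ", ".join(missing))
    else:
        print("Download credentials found in the environment.")
```

Running a check like this before starting a download turns a missing credential into a readable message instead of a configuration error partway through the run.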
Pre-MEDS settings
The following settings can be used to configure the pre-MEDS steps.
```bash
OMOP_MEDS \
  root_output_dir=/sc/arion/projects/hpims-hpi/projects/foundation_models_ehr/cohorts/meds_debug/small_demo \
  raw_input_dir=/sc/arion/projects/hpims-hpi/projects/foundation_models_ehr/cohorts/full_omop \
  do_download=False ++do_overwrite=True ++limit_subjects=50
```
- `root_output_dir`: Set the root output directory.
- `raw_input_dir`: Path to the raw input directory.
- `do_download`: Set to `False` to skip downloading the dataset.
- `++do_overwrite`: Set to `True` to overwrite existing files.
- `++limit_subjects`: Limit the number of subjects to process.
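When the ETL is launched from a script rather than typed by hand, the settings above can be assembled programmatically. The following is a rough sketch that uses only the overrides shown in this README; `build_command` is a hypothetical helper, and the paths in the usage section are placeholders.

```python
import shlex
import subprocess
from typing import Optional


def build_command(
    root_output_dir: str,
    raw_input_dir: Optional[str] = None,
    do_download: Optional[bool] = None,
    do_overwrite: Optional[bool] = None,
    limit_subjects: Optional[int] = None,
) -> list[str]:
    """Assemble the OMOP_MEDS argument list from the settings listed above."""
    cmd = ["OMOP_MEDS", f"root_output_dir={root_output_dir}"]
    if raw_input_dir is not None:
        cmd.append(f"raw_input_dir={raw_input_dir}")
    if do_download is not None:
        cmd.append(f"do_download={do_download}")
    # The README passes these two with Hydra's "++" (add-or-override) prefix.
    if do_overwrite is not None:
        cmd.append(f"++do_overwrite={do_overwrite}")
    if limit_subjects is not None:
        cmd.append(f"++limit_subjects={limit_subjects}")
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "/path/to/your/output",
        raw_input_dir="/path/to/raw/omop",  # placeholder path
        do_download=False,
        do_overwrite=True,
        limit_subjects=50,
    )
    print(shlex.join(cmd))
    # Uncomment to launch the ETL (requires `pip install OMOP_MEDS`):
    # subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting issues when a path contains spaces.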
MEDS-transforms settings
To convert a large dataset, you can parallelize the MEDS-Transforms stage, which is the step of the pipeline that takes the longest.
For local parallelization, install the hydra-joblib-launcher package:
```bash
pip install hydra-joblib-launcher --upgrade
```
Then set the number of workers as an environment variable:
```bash
export N_WORKERS=16
```
You can also set the number of subjects per shard to balance the parallelization overhead against the number of subjects in your dataset:
```bash
export N_SUBJECTS_PER_SHARD=1000
```
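To make the trade-off concrete, the sketch below estimates how many shards a given `N_SUBJECTS_PER_SHARD` yields and prepares an environment that carries both variables to a child process. It assumes that each shard is filled up to the configured size, so the count is a ceiling division. The helper names and the example subject count are illustrative and not part of OMOP_MEDS or MEDS-Transforms.

```python
import math
import os


def n_shards(n_subjects: int, n_subjects_per_shard: int) -> int:
    """Estimate the shard count, assuming each shard holds up to the given size."""
    if n_subjects < 0 or n_subjects_per_shard <= 0:
        raise ValueError("n_subjects must be >= 0 and n_subjects_per_shard > 0")
    return math.ceil(n_subjects / n_subjects_per_shard)


def parallel_env(n_workers: int, n_subjects_per_shard: int) -> dict[str, str]:
    """Copy the current environment and add the two variables from the README."""
    env = dict(os.environ)
    env["N_WORKERS"] = str(n_workers)
    env["N_SUBJECTS_PER_SHARD"] = str(n_subjects_per_shard)
    return env


if __name__ == "__main__":
    subjects = 100_000  # hypothetical dataset size
    shards = n_shards(subjects, 1000)
    print(f"{subjects} subjects at 1000 per shard -> about {shards} shards")
    env = parallel_env(n_workers=16, n_subjects_per_shard=1000)
    # Pass `env=env` to subprocess.run(...) when launching OMOP_MEDS.
```

If the estimated shard count is lower than the number of workers, some workers would likely sit idle, so a smaller shard size may be worth trying on small datasets.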
Citation
If you use this software, please cite it using the citation link on GitHub.
Owner
- Name: Robin van de Water
- Login: rvandewater
- Kind: user
- Location: Berlin
- Company: Hasso Plattner Institute
- Website: https://www.rpvandewater.com/
- Repositories: 1
- Profile: https://github.com/rvandewater
PhD student in Medical Event Prediction at Hasso Plattner Institute in collaboration with the Charité hospital (Berlin)
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "OMOP_MEDS ETL"
doi: "10.5281/zenodo.15132444"
authors:
  - family-names: "van de Water"
    given-names: "Robin Philippus"
    orcid: "https://orcid.org/0000-0002-2895-4872"
date-released: "2025-02-19"
url: "https://github.com/rvandewater/OMOP_MEDS"
repository-code: "https://github.com/rvandewater/OMOP_MEDS"
license: "MIT"
```
GitHub Events
Total
- Create event: 11
- Issues event: 18
- Release event: 7
- Watch event: 1
- Issue comment event: 29
- Public event: 1
- Push event: 77
- Pull request event: 9
Last Year
- Create event: 11
- Issues event: 18
- Release event: 7
- Watch event: 1
- Issue comment event: 29
- Public event: 1
- Push event: 77
- Pull request event: 9
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 11
- Total pull requests: 6
- Average time to close issues: 2 days
- Average time to close pull requests: less than a minute
- Total issue authors: 2
- Total pull request authors: 1
- Average comments per issue: 3.18
- Average comments per pull request: 0.0
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 11
- Pull requests: 6
- Average time to close issues: 2 days
- Average time to close pull requests: less than a minute
- Issue authors: 2
- Pull request authors: 1
- Average comments per issue: 3.18
- Average comments per pull request: 0.0
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- bschilder (9)
- rvandewater (2)
Pull Request Authors
- rvandewater (12)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads: 64 last month (PyPI)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 9
- Total maintainers: 1
pypi.org: omop-meds
An ETL to convert OMOP data to the MEDS format.
- Homepage: https://github.com/rvandewater/OMOP_MEDS
- Documentation: https://omop-meds.readthedocs.io/
- License: MIT License
- Latest release: 0.0.10 (published 11 months ago)
Rankings
Maintainers (1)
Dependencies
- actions/checkout v4 composite
- actions/setup-python v5 composite
- pre-commit/action v3.0.1 composite
- actions/checkout v4 composite
- actions/setup-python v5 composite
- pre-commit/action v3.0.1 composite
- trilom/file-changes-action v1.2.4 composite
- actions/checkout v4 composite
- actions/download-artifact v4 composite
- actions/setup-python v5 composite
- actions/upload-artifact v4 composite
- pypa/gh-action-pypi-publish release/v1 composite
- sigstore/gh-action-sigstore-python v3.0.0 composite
- actions/checkout v4 composite
- actions/setup-python v5 composite
- codecov/codecov-action v4.0.1 composite
- codecov/test-results-action v1 composite
- beautifulsoup4 *
- hydra-core *
- loguru *
- meds-transforms >=0.1
- polars *
- requests *