mip-dmp
Python tool with Graphical User Interface to map datasets to a specific Common Data Elements (CDEs) metadata schema of a federation of the Medical Informatics Platform (MIP).
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 5 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.0%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: HBPMedical
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://hbpmedical.github.io/mip-dmp/
- Size: 7.6 MB
Statistics
- Stars: 1
- Watchers: 9
- Forks: 0
- Open Issues: 2
- Releases: 7
Metadata Files
README.md
MIP Dataset Mapper (mip_dmp)
Python tool with Graphical User Interface to map datasets to a specific Common Data Elements (CDEs) metadata schema of a federation of the Medical Informatics Platform (MIP). It is developed to support members of a MIP Federation in the task of mapping their dataset to the CDEs schema of this federation. This project is distributed under the Apache 2.0 open-source license (See LICENSE for more details).
How to install?
For the user
- Create your installation directory, go to this directory, and create a new virtual Python 3.9 environment:
bash
$ mkdir -p "/installation/directory"
$ cd "/installation/directory"
$ virtualenv venv -p python3.9
- Activate the environment and install the package, at a specific version, directly from GitHub with Pip:
bash
$ source ./venv/bin/activate
(venv)$ pip install -r https://raw.githubusercontent.com/HBPMedical/mip-dmp/main/requirements.txt
(venv)$ pip install git+https://github.com/HBPMedical/mip-dmp.git@0.0.5
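A quick way to confirm that the installation succeeded is to check that the two console scripts described in the Usage section are on the PATH of the active environment. The following is a minimal sketch using only the Python standard library; the helper name `find_mip_dmp_scripts` is hypothetical and not part of mip_dmp.

```python
import shutil
import sys

# Console scripts named in the Usage section of this README.
SCRIPTS = ("mip_dataset_mapper_ui", "mip_dataset_mapper")


def find_mip_dmp_scripts():
    """Return a dict mapping each script name to its path, or None if not found."""
    return {name: shutil.which(name) for name in SCRIPTS}


if __name__ == "__main__":
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}")
    for name, path in find_mip_dmp_scripts().items():
        print(f"{name}: {path or 'NOT FOUND (is the virtual environment activated?)'}")
```

If a script is reported as not found, the most likely cause is that the virtual environment is not activated in the current shell.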
For the developer
- Clone the Git repository in your preferred directory:
bash
$ cd "/preferred/directory"
$ git clone git@github.com:HBPMedical/mip-dmp.git
- Go to the cloned repository and create a new virtual Python 3.9 environment:
bash
$ cd mip-dmp
$ virtualenv venv -p python3.9
- Activate the environment and install the package with Pip:
bash
$ source ./venv/bin/activate
(venv)$ pip install -r requirements.txt
(venv)$ pip install -e .
Usage
mip_dataset_mapper_ui
You can use the installed mip_dataset_mapper_ui script to start the MIP Dataset Mapper UI application.
Usage
In a terminal, you can launch it with the following command:
$ mip_dataset_mapper_ui
This displays the main window of the MIP Dataset Mapper UI application, which consists of four main components arranged in a grid layout, as shown in the screenshot below.

Mapping a dataset consists of the following steps:
- Load an input dataset in .csv format (top left)
- Load a CDEs schema in .xlsx format (bottom left)
- Edit the columns / CDEs mapping table (top right)
- Configure the output directory / filename and create the output CSV dataset mapped to the CDEs schema (bottom right)
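Conceptually, the result of these steps is an output CSV whose columns are named after CDE codes. The sketch below illustrates that idea with the Python standard library only. It is not the mip_dmp implementation: the flat `{source column: CDE code}` mapping, the column names, and the function `apply_mapping` are assumptions made for illustration, and the mapping file produced by the UI may have a different structure.

```python
import csv
import io


def apply_mapping(source_csv, mapping):
    """Rename mapped columns to their CDE code and drop unmapped columns."""
    reader = csv.DictReader(io.StringIO(source_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(mapping.values()), lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Keep only the mapped columns, keyed by their CDE code.
        writer.writerow({cde: row[col] for col, cde in mapping.items()})
    return out.getvalue()


# Hypothetical input dataset and mapping, for illustration only.
source = "patient,age_years,site\nP01,71,A\nP02,64,B\n"
mapping = {"patient": "subjectcode", "age_years": "age_value"}

print(apply_mapping(source, mapping))
```

In this toy example the unmapped `site` column is dropped, and the two mapped columns appear in the output under their hypothetical CDE codes.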
mip_dataset_mapper
You can use the installed mip_dataset_mapper script to start the command-line interface of the MIP Dataset Mapper.
Usage
```output
usage: mip_dataset_mapper [-h] --source_dataset SOURCE_DATASET --mapping_file MAPPING_FILE --cdes_file CDES_FILE --target_dataset TARGET_DATASET

Map a source dataset to a target dataset given a mapping file in JSON format generated by the MIP Dataset Mapper UI application (mip_dataset_mapper_ui).

optional arguments:
  -h, --help            show this help message and exit
  --source_dataset SOURCE_DATASET
                        Source dataset file in CSV format.
  --mapping_file MAPPING_FILE
                        Source Dataset Columns / Common data elements (CDEs) mapping file in JSON format. The mapping file can be generated by the MIP Dataset Mapper UI application.
  --cdes_file CDES_FILE
                        Common data elements (CDEs) metadata schema file in EXCEL format.
  --target_dataset TARGET_DATASET
                        Path to the target / output dataset file in CSV format.
```
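For batch processing, the command-line script can also be driven from Python. The sketch below only assembles the argument list. It assumes the option names are spelled with underscores (`--source_dataset`, `--mapping_file`, `--cdes_file`, `--target_dataset`), and the helper `build_mapper_command` and all file names are hypothetical.

```python
import subprocess


def build_mapper_command(source_dataset, mapping_file, cdes_file, target_dataset):
    """Assemble the argument list for the mip_dataset_mapper command-line script."""
    return [
        "mip_dataset_mapper",
        "--source_dataset", str(source_dataset),
        "--mapping_file", str(mapping_file),
        "--cdes_file", str(cdes_file),
        "--target_dataset", str(target_dataset),
    ]


# Hypothetical file names, for illustration only.
cmd = build_mapper_command("dataset.csv", "mapping.json", "cdes.xlsx", "dataset_mapped.csv")
print(" ".join(cmd))

# Uncomment to run the tool (requires mip_dmp to be installed and the files to exist):
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.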
How to cite?
If you are using the MIP Dataset Mapper (mip_dmp) in your work, please acknowledge this software with the following entry:
Tourbier, Sebastien, Schaffhauser, Birgit, & Ryvlin, Philippe. (2023). HBPMedical/mip-dmp: v0.0.7 (0.0.7). Zenodo. https://doi.org/10.5281/zenodo.8056371
Funding
This project received funding from the European Union's H2020 Framework Programme for Research and Innovation under the Specific Grant Agreement No. 945539 (Human Brain Project SGA3, as part of the Medical Informatics Platform (MIP)).
Contributors ✨
Thanks goes to these wonderful people (emoji key):
- Sébastien Tourbier 🐛 💻 🎨 📖 💡 🤔 🚇 🚧 🧑‍🏫 👀 ⚠️
- BSchaffhauser 💵 🔍
This project follows the all-contributors specification. Contributions of any kind welcome!
Owner
- Name: HBP Medical Informatics Platform
- Login: HBPMedical
- Kind: organization
- Email: support@ebrains.eu
- Location: CHUV, Lausanne, Switzerland
- Website: https://ebrains.eu/service/medical-informatics-platform
- Repositories: 147
- Profile: https://github.com/HBPMedical
Dependencies
- actions/checkout v3 composite
- actions/setup-python v4 composite
- peaceiris/actions-gh-pages v3 composite
- actions/checkout v3 composite
- actions/setup-python v3 composite
- commonmark ==0.9.1
- docutils ==0.18.1
- future *
- m2r2 ==0.3.2
- mock ==5.0.1
- numpy *
- pydot >=1.2.3
- recommonmark ==0.7.1
- scipy *
- sphinx >=6.1.3
- sphinx-argparse ==0.4.0
- sphinx_rtd_theme ==1.2.0
- sphinxcontrib-apidoc ==0.3.0
- sphinxemoji ==0.2.0
- PySide2 ==5.15.2.1 development
- chars2vec master development
- fuzzywuzzy ==0.18.0 development
- gensim ==4.3.1 development
- matplotlib ==3.7.1 development
- openpyxl ==3.1.2 development
- pandas ==2.0.0 development
- python-Levenshtein ==0.20.9 development
- scikit-learn ==1.2.2 development
- scipy ==1.9.1 development
- seaborn ==0.12.2 development
- tensorflow ==2.12.0 development
- PySide2 ==5.15.2.1
- chars2vec master
- fuzzywuzzy ==0.18.0
- gensim ==4.3.1
- matplotlib ==3.7.1
- openpyxl ==3.1.2
- pandas ==2.0.0
- python-Levenshtein ==0.20.9
- scikit-learn ==1.2.2
- scipy ==1.9.1
- seaborn ==0.12.2
- tensorflow ==2.12.0