https://github.com/asreview/paper-guidelines-kifms
Scripts to run simulations of systematic reviews with ASReview for 14 datasets openly published on the Dutch database for medical guidelines.
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: zenodo.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity, 11.0%, to scientific vocabulary)
Keywords
asreview
medical
medical-guidelines
python
systematic-reviews
systematic-reviews-datasets
utrecht-university
Last synced: 6 months ago
Basic Info
Statistics
- Stars: 1
- Watchers: 3
- Forks: 1
- Open Issues: 0
- Releases: 0
Created over 4 years ago
· Last pushed over 4 years ago
https://github.com/asreview/paper-guidelines-KIFMS/blob/main/
# Scripts for paper on "Towards up-to-date medical guidelines"

[](https://zenodo.org/badge/latestdoi/379924501)

The purpose of this study was to evaluate the performance and feasibility of active learning to support the selection of relevant publications within the context of medical guideline development. This repository contains scripts to run and analyze simulations for 14 datasets openly published on the Dutch database for [medical guidelines](https://www.richtlijnendatabase.nl). The results are published in the paper "Artificial intelligence supports literature screening in medical guideline development: towards up-to-date medical guidelines".

## Installation

The scripts in this repository require Python 3.6+. Install the dependencies from the command line with:

```
pip install -r requirements.txt
```

## Datasets

The raw data can be obtained via the Open Science Framework ([OSF](https://osf.io/vt3n4/)) and contains 14 published guidelines from the [Dutch Medical Guideline Database](https://richtlijnendatabase.nl/). The following files should be obtained from OSF and put in a folder `raw_data`:

```
Distal_radius_fractures_approach.csv
Distal_radius_fractures_closed_reduction.csv
Hallux_valgus_prognostic.csv
Head_and_neck_cancer_bone.csv
Head_and_neck_cancer_imaging.csv
Obstetric_emergency_training.csv
Post_intensive_care_treatment.csv
Pregnancy_medication.csv
Shoulder_replacement_diagnostic.csv
Shoulder_replacement_surgery.csv
Shoulderdystocia_positioning.csv
Shoulderdystocia_recurrence.csv
Total_knee_replacement.csv
Vascular_access.csv
```

Each dataset contains the columns

```
title
abstract
```

and three columns with labeling decisions:

```
noisy_inclusion
expert_inclusion
fulltext_inclusion
```

Each dataset in *raw_data* is split on its three labeling columns by executing `job_splitfiles.sh`, yielding 42 datasets in total. The results are stored in the subfolder *data*.
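The split performed by `job_splitfiles.sh` can be sketched in plain Python as follows. This is a hypothetical reimplementation for illustration only, not the repository's actual script; the function name `split_dataset` is invented here, while the column names come from the file layout above.

```python
import csv
from pathlib import Path

# The three labeling columns named in the dataset description.
LABEL_COLUMNS = ["noisy_inclusion", "expert_inclusion", "fulltext_inclusion"]


def split_dataset(raw_path, out_dir):
    """Split one raw dataset into three files, one per labeling column.

    Each output keeps the title and abstract and exactly one label column,
    so 14 raw datasets yield 14 x 3 = 42 split datasets.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with open(raw_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    for label in LABEL_COLUMNS:
        fields = ["title", "abstract", label]
        out_path = out_dir / f"{Path(raw_path).stem}_{label}.csv"
        with open(out_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=fields)
            writer.writeheader()
            for row in rows:
                writer.writerow({k: row[k] for k in fields})
```

Running this over every file in `raw_data` would reproduce the 42-dataset layout described above, under the stated assumptions about the shell script's behavior.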
## Descriptive dataset statistics

To create descriptive statistics for each dataset, run:

```
sh generate_dataset_characteristics.sh
```

The results are stored in `output/simulation/[NAME_DATASET]/descriptives/*.json`. They are merged into one table (*csv* and *excel*) by running `python scripts/merge_descriptives.py` and stored in `output/table/data_descriptives.*`.

## Create wordclouds

To create wordclouds for each dataset, run:

```
sh wordcloud_jobs.sh
```

The results are stored in `output/simulation/[NAME_DATASET]/descriptives/wordcloud`. Three versions of the wordcloud are available, each based on the title/abstract words of:

- the entire set of records;
- the relevant records only;
- the irrelevant records only.

## Simulation

The simulation was conducted for each dataset with a number of runs equal to the number of relevant records in the dataset, each relevant record serving as a prior inclusion alongside 10 randomly chosen irrelevant records. In each run, and for every dataset, the same 10 irrelevant records were used. To extract information about the records used as prior knowledge, run `python scripts/get_prior_knowledge.py`; the result is stored in `output/tables`.

To obtain the results of the simulation, run:

```
sh run_simulation.sh
```

The results are stored in `output/simulation`. The dataset characteristics are obtained with `python scripts/merge_descriptives.py` and stored in `output/tables`. The per-run metrics resulting from the simulation study can be obtained with `python scripts/merge_metrics.py` and are stored in `output/tables`. The raw `h5` files are 28.4 GB and are available on request; see the contact details. However, it is straightforward to reproduce the results by running the simulation again with ASReview v0.16. Seed values are set in `run_simulation.sh`.
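The prior-knowledge scheme described above can be illustrated with a small sketch. The helper name `build_runs` and the label encoding are assumptions for illustration; the actual sampling is handled by `run_simulation.sh` and ASReview itself.

```python
import random


def build_runs(labels, n_irrelevant=10, seed=42):
    """Pair each relevant record with one fixed, seeded sample of irrelevant records.

    labels: list of 0/1 labels indexed by record id.
    Returns one (prior_inclusion, prior_exclusions) tuple per relevant record,
    so the number of runs equals the number of relevant records, as in the paper.
    """
    relevant = [i for i, y in enumerate(labels) if y == 1]
    irrelevant = [i for i, y in enumerate(labels) if y == 0]
    rng = random.Random(seed)
    # The same 10 irrelevant records are reused in every run of a dataset.
    shared_exclusions = rng.sample(irrelevant, n_irrelevant)
    return [(inc, shared_exclusions) for inc in relevant]
```

Each returned tuple corresponds to one simulation run: one relevant prior record plus the shared set of irrelevant priors, which keeps runs comparable within a dataset.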
## Analyses

The Jupyter notebook [analyses/analyses_guidelines_KIFMS.ipynb](analyses/analyses_guidelines_KIFMS.ipynb) contains a detailed, step-by-step analysis of the simulations performed in this project. For more information about the analysis, read the [README](analyses).

## Licence

The content in this repository is published under the MIT license.

## Contact

For any questions or remarks, please send an email to asreview@uu.nl.
Owner
- Name: ASReview
- Login: asreview
- Kind: organization
- Email: asreview@uu.nl
- Location: Utrecht University
- Website: www.asreview.ai
- Repositories: 32
- Profile: https://github.com/asreview
ASReview - Active learning for Systematic Reviews