https://github.com/asreview/paper-megameta-hyperparameter-training
Hyperparameter-training for the Mega-Meta project
https://github.com/asreview/paper-megameta-hyperparameter-training
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
○codemeta.json file
-
○.zenodo.json file
-
○DOI references
-
✓Academic publication links
Links to: nature.com, zenodo.org -
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (11.9%) to scientific vocabulary
Keywords
Repository
Hyperparameter-training for the Mega-Meta project
Basic Info
- Host: GitHub
- Owner: asreview
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://www.asreview.ai
- Size: 36.1 KB
Statistics
- Stars: 1
- Watchers: 5
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
Hyperparameter-training for the Mega-Meta project
The repository is part of the so-called, Mega-Meta study on reviewing factors contributing to substance use, anxiety, and depressive disorders. The study protocol has been pre-registered at Prospero. The procedure for obtaining the search terms, the exact search query, and selecting key papers by expert consensus can be found on the Open Science Framework.
The screening was conducted in the software ASReview (Van de Schoot et al., 2020 using the protocol as described in Hofstee et al. (2021). The server installation is described in Melnikov (2021), and the post-processing is described by van de Brand et al., 2021. The data can be found on DANS [LINK NEEDED].
This repository stores the scripts and plugins that were used for the creation of the three final project files. These final project files were created using a classifier based on a convolutional neural network, optimized using Optuna.
Content
The Plugins folder contains the 2 used ASReview plugins.
asreview-cnn-hpo was
used to find the optimal settings for each dataset, and
asreview-model-cnn-17-layer
was used to implement these settings.
The scripts folder contains a Jupyter Notebook that can be run in Google
Colab. This script installs the plugins
found in the plugins folder and then creates an ASReview lab instance that can
be used to employ these plugins, find the optimal hyperparameters, and create
the final project files.
Step-by-step quickguide
- Create a project file from one of the input files.
- Select the HPO-CNN optimizer as classifier for the created project file.
- Train the project file to recieve the optimal CNN hyper parameters. The optimal parameters appear in the console used to start ASReview.
- Plug these parameters into the CNN model (not the HPO-CNN model).
- Train the CNN model.
- Start screening records with the optimized CNN model.
Step-by-step guide
This guide details how the CNN was optimized and trained.
The process started with 3 different excel files containing ASReview output.
In Google Colab, upload and run the
cnn_training_script_for_use_in_google_colab.ipynbfile in found in the folder calledscripts. This file is used to optimize the CNNs for the later processing.
- A custom-made version of the HPO-CNN was used to determine the optimal
hyperparameters for each of the project files. These hyperparameters were
then applied to the CNN classifier model.
- To use this custom-made version of the HPO-CNN, upload the folder
containing the HPO-CNN version to Colab. A quick way of doing this is
by zipping-up the folder and uploading this zip file.
- The notebook contains code for unzipping, and then installing the
plugin automatically.
- The notebook uses NGROK to access the ASReview frontend. A personal NGROK
token is needed for the `NGROK_AUTH_TOKEN` variable. A link with
instructions on where to get such a token can be found in the notebook.
```python
!pip install pyngrok --quiet
from pyngrok import ngrok
# Terminate open tunnels if exist
ngrok.kill()
# Setting the authtoken (optional)
# Get your authtoken from https://dashboard.ngrok.com/auth
NGROK_AUTH_TOKEN = "fill token here"
ngrok.set_auth_token(NGROK_AUTH_TOKEN)
```
- After running the notebook, there will be a link produced by NGROK. This
link will open the ASReview frontend.
```python
ngrok.connect(port="80", proto="http")
<NgrokTunnel: "http://d3c5-35-197-26-146.ngrok.io" -> "http://localhost:80">
```
- In this front end, create a new project file with one of the Excel files.
- When the training of the new project file is finished (after a long time),
the still processing colab cell running ASReview will print the optimal
parameters in the output.
```
Hpo trail: 77/80
Hpo trail: 77/80
Hpo trail: 77/80
Hpo trail: 77/80
FOUND HYPERPARAMETERS: {'nlayers': 3, 'nfilters': 209}
```
- These parameters are used to optimize the CNN used for the final project files.
- For the training of these project files, the parameters that resulted as
an output of the HPO are used for the CNN. The output is set by filling in
the results into the `nlayers` and `nfilters` variables in the
`asreview-plugin-model-cnn-17-layer\asreviewcontrib\models\cnn.py` file.
```python
def _create_dense_nn_model(_size):
def model_wrapper():
backend.clear_session()
nfilters = 209
model = Sequential()
```
- Install this newly created CNN. Using this new optimized CNN, create the projects files for screening.
Funding
This project is funded by a grant from the Centre for Urban Mental Health, University of Amsterdam, The Netherlands
Licence
The content in this repository is published under the MIT license.
Contact
For any questions or remarks, please send an email to the ASReview-team.
Owner
- Name: ASReview
- Login: asreview
- Kind: organization
- Email: asreview@uu.nl
- Location: Utrecht University
- Website: www.asreview.ai
- Repositories: 32
- Profile: https://github.com/asreview
ASReview - Active learning for Systematic Reviews
GitHub Events
Total
Last Year
Dependencies
- asreview >=0.13
- sklearn *
- asreview >=0.13
- scipy *
- sklearn *
- tensorflow *