Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 3 DOI reference(s) in README
- ✓ Academic publication links: links to zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (15.2%) to scientific vocabulary
Repository
Software for EEG analysis
Basic Info
Statistics
- Stars: 3
- Watchers: 0
- Forks: 0
- Open Issues: 18
- Releases: 8
Metadata Files
README.md
eegyolk
This library contains functions, scripts and notebooks for machine learning related to EEGs (electroencephalograms). The notebooks include an updated version of a project on deep learning for age prediction using EEG data, as well as new ongoing work from students at the University of Utrecht.
Notebooks
There are several groups of notebooks related to this repository. Notebooks related to the thesis of Nadine Prins can be found in the folder dyslexiapredictionnadine. Notebooks related to the thesis of Floris Pauwels can be found in the folder florisfiles. Notebooks related to general reproduction of the work of earlier projects can be found in reproducibility_experiments.
Documentation
There have been several versions of eegyolk, and documentation for all of them, including the latest stable version, is available on readthedocs.
Configuration file
The config_template.py file should be renamed to config.py; the paths of the file locations can be stored there. The ROOT folder can be the ROOT folder of this repository as well.
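As a sketch, a minimal config.py could simply hold those paths as module-level variables. The variable names below are illustrative assumptions; check the project's config_template.py for the actual names it expects:

```python
# Illustrative config.py contents (hypothetical variable names;
# the real config_template.py defines the authoritative ones).
ROOT = "/path/to/eegyolk"        # may be the root of this repository
DATA = "/path/to/raw_data"       # raw EEG recordings
METADATA = "/path/to/metadata"   # e.g. age-information files
```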
The Data folder contains the following folder/files:
Program files
The main program in this repository contains functions, for example DataGenerators.
Data sets
Some of the data sets of this project are publicly available, and some are not, as they contain privacy-sensitive information.
Original published data from the DDP (Dutch Dyslexia Program) is used as demo data wherever possible. This data can be obtained from: https://easy.dans.knaw.nl/ui/datasets/id/easy-dataset:112935/
Collection of newer data, acquired to detect dyslexia with a different protocol, ended in 2022. This data is not yet public; however, there are many public EEG datasets to which the functions in this library can be applied.
NLeSC employees can download some additional data from surfdrive. Contact Candace Makeda Moore (c.moore@esciencecenter.nl) to discuss additional access to data.
Getting started
How to get the notebooks running? Assuming the raw data set and metadata are available:
- Install all required Python packages using conda and the environment-march-update2.yaml file. Run the following line on your machine: conda env create -f environment.yml, then switch to this environment by running: conda activate envyolk.
- Update the config_template.py file and rename it to config.py.
- (being rebuilt) Use the preprocessing notebooks to process the raw data into usable data for either the ML or (reduced) DL models (separate notebooks).
- (being rebuilt) The 'model training' notebooks can be used to train and save models.
- (being rebuilt) The 'model validation' notebooks can be used to assess the performance of the models.
Testing
Testing uses synthetic data. Running the tests requires you either to run them inside a container or to extract the data from our Docker image with synthetic data. The Docker image is drcandacemakedamoore/eegyolk-test-data:latest. Alternatively, you can reconfigure and rename your own valid bdf files and metadata as configured and named in tests/test.py, and local testing should work.
Finally, you can contact Dr. Moore c.moore@esciencecenter.nl for synthetic test data and/or with any questions on testing.
Installing
This has only been tested on Linux so far.
python -m venv .venv
. .venv/bin/activate
./setup.py install
Configuring
In order to preprocess the data and train the models, the code needs to be able to locate the raw data and the metadata; for training it also needs the preprocessed data to be available.
There are several ways to specify the location of the following directories:
- root: Special directory. The rest of the directory layout can be derived from its location.
- data: The location of raw CNT data files. This is the directory containing 11mnd mmn and similar files.
- metadata: The location of metadata files. This is the directory that contains the ages directory, which, in turn, contains files like ages_11mnths.txt.
- preprocessed: The directory that will be used by the preprocessing code to output CSVs and h5 files. This directory will be used by the model training code to read the training data.
- models: The directory to output trained models to.
You can store this information persistently in several locations:
- In the same directory where you run the script (or the notebook), e.g. ./config.json.
- In the home directory, e.g. ~/.eegyolk/config.json.
- In a global directory, e.g. /etc/eegyolk/config.json.
This file can have this or similar contents:
{
"root": "/mnt/data",
"metadata": "/mnt/data/meta",
"preprocessed": "/mnt/data/processed"
}
The file is read as follows: if the file specifies the root directory, then the missing entries are assumed to be relative to the root. You don't need to specify the root entry if you specify all other entries.
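The lookup rule just described can be sketched in a few lines. This is an illustration of the documented behaviour, not eegyolk's actual implementation:

```python
import json
import os

def load_layout(path):
    """Load a directory-layout config; entries missing from the
    file are derived from "root", as described above."""
    with open(path) as f:
        cfg = json.load(f)
    root = cfg.get("root")
    for key in ("data", "metadata", "preprocessed", "models"):
        if key not in cfg:
            if root is None:
                raise ValueError(f"'{key}' is missing and no 'root' was given")
            # Missing entries are assumed to be relative to root.
            cfg[key] = os.path.join(root, key)
    return cfg
```

With the example file above, the missing data and models entries would resolve to /mnt/data/data and /mnt/data/models.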
Command-Line Interface
You can preprocess the data and train the models using the command-line interface.
Below are some examples of how to do that:
This will pre-process the first ten CNT files in the
/mnt/data/original-cnts directory.
python -m eegyolk acquire \
--input /mnt/data/original-cnts \
--metadata /mnt/data/metadata \
--output /mnt/data/preprocessed \
--limit 10
This will train a model using the dummy algorithm. In the case of the dummy algorithm, both best_fit and fit do the same thing.
python -m eegyolk ml \
--input /mnt/data/preprocessed \
--output /mnt/data/trained_models \
--size 100 \
dummy best_fit
Similarly, for neural networks training and assessment:
python -m eegyolk nn \
--input /mnt/data/preprocessed \
--output /mnt/data/trained_models \
--epochs 100 \
train_model 1
It's possible to load the configuration (used for the directory layout) from an alternative file:
python -m eegyolk --config /another/config.json ml \
--input /mnt/data/preprocessed \
--output /mnt/data/trained_models \
--size 100 \
dummy best_fit
All long options have short aliases.
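The specific short aliases aren't listed here. As an illustrative sketch of the long/short option pattern using Python's argparse, with -i/-o/-l as assumed aliases rather than eegyolk's documented flags:

```python
import argparse

# Hypothetical parser mirroring the options shown above;
# the short aliases (-i, -o, -l) are illustrative assumptions.
parser = argparse.ArgumentParser(prog="eegyolk")
parser.add_argument("-i", "--input", required=True)
parser.add_argument("-o", "--output", required=True)
parser.add_argument("-l", "--limit", type=int)

# Long and short spellings parse identically:
args = parser.parse_args(
    ["--input", "/mnt/data/original-cnts",
     "-o", "/mnt/data/preprocessed",
     "-l", "10"]
)
```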
GitHub Events
Total
Last Year
Committers
Last synced: about 3 years ago
All Time
- Total Commits: 580
- Total Committers: 11
- Avg Commits per committer: 52.727
- Development Distribution Score (DDS): 0.69
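Assuming the Development Distribution Score is defined as 1 minus the top committer's share of commits (a common definition for this metric), the reported numbers are self-consistent with the Top Committers table:

```python
# Commit counts taken from the Top Committers table.
commits = [180, 133, 94, 61, 61, 14, 12, 12, 9, 3, 1]

total = sum(commits)             # total commits across committers
avg = total / len(commits)       # average commits per committer
dds = 1 - max(commits) / total   # assumed DDS formula: 1 - top share

print(total, round(avg, 3), round(dds, 2))  # → 580 52.727 0.69
```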
Top Committers
| Name | Email | Commits |
|---|---|---|
| drcandacemakedamoore | d****a@g****m | 180 |
| F | f****s@l****l | 133 |
| Candace Makeda Moore, MD | 3****e@u****m | 94 |
| NadineUU | n****s@s****l | 61 |
| FlorisP | F****s@l****l | 61 |
| NadineUU | n****s@s****l | 14 |
| Floris | | 12 |
| wvxvw | o****n@g****m | 12 |
| Floris | u****n | 9 |
| NadineUU | 8****U@u****m | 3 |
| FlorisP | F****s@l****l | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 25
- Total pull requests: 26
- Average time to close issues: 7 days
- Average time to close pull requests: 1 day
- Total issue authors: 3
- Total pull request authors: 2
- Average comments per issue: 0.24
- Average comments per pull request: 0.08
- Merged pull requests: 25
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- FlorisP (12)
- NadineUU (10)
- drcandacemakedamoore (3)
Pull Request Authors
- FlorisP (18)
- wvxvw (8)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- actions/checkout v3 composite
- addnab/docker-run-action v3 composite
- s-weigand/setup-conda v1 composite
- actions/checkout v2 composite
- actions/download-artifact v2 composite
- actions/setup-python v1 composite
- actions/upload-artifact v2 composite
- marvinpinto/action-automatic-releases latest composite
- s-weigand/setup-conda v1.0.6 composite
- python 3.8 build
- absl-py ==0.11.0
- astunparse ==1.6.3
- cachetools ==4.1.1
- eegyolk ==0.0.5
- flatbuffers ==1.12
- gast ==0.3.3
- google-auth ==1.23.0
- google-auth-oauthlib ==0.4.2
- google-pasta ==0.2.0
- grpcio ==1.32.0
- ipython *
- keras-preprocessing ==1.1.2
- markdown ==3.3.3
- mne-features ==0.2
- oauthlib ==3.1.0
- opt-einsum ==3.3.0
- protobuf ==3.14.0
- pyasn1 ==0.4.8
- pyasn1-modules ==0.2.8
- pyqt5-sip ==4.19.18
- pyqtwebengine ==5.12.1
- requests-oauthlib ==1.3.0
- researchpy ==0.3.5
- rsa ==4.6
- sklearn ==0.0
- sklearn-rvm ==0.1.1
- tensorboard ==2.4.1
- tensorboard-plugin-wit ==1.7.0
- tensorflow ==2.9.1
- tensorflow-addons ==0.11.2
- tensorflow-estimator ==2.4.0
- termcolor ==1.1.0
- tf-estimator-nightly ==2.5.0.dev2020123101
- typeguard ==2.10.0
- typing-extensions ==3.7.4.3
- varname *
- werkzeug ==1.0.1
- wrapt ==1.12.1
- IPython *
- h5py *
- matplotlib *
- mne *
- mne-features ==0.1
- numpy *
- pandas *
- pyxdf *
- scikit-learn *
- scipy *
- sklearn-rvm *
- tensorflow *
- textdistance *