eeg_manypipes_arc

This project contains all code to reproduce the analyses of the EEG manypipes project.

https://github.com/nomisciri/eeg_manypipes_arc

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.5%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: NomisCiri
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Size: 1.29 MB
Statistics
  • Stars: 1
  • Watchers: 2
  • Forks: 1
  • Open Issues: 0
  • Releases: 2
Created almost 4 years ago · Last pushed about 2 years ago

README.md

eeg_manypipes_arc -- EEGManyPipelines Analysis Code

This project contains all code to reproduce the analyses of the EEG manypipes project.

It is hosted on GitHub: https://github.com/NomisCiri/eeg_manypipes_arc

The archived code can be found on Zenodo: https://doi.org/10.5281/zenodo.6549049

This is a contribution by Stefan Appelhoff, Simon Ciranka, and Casper Kerrén from the Center for Adaptive Rationality (ARC).

Original documentation provided by the organizers can be found in the organizer_documentation directory.

The sourcedata and derivatives of this project are stored on GIN: https://gin.g-node.org/sappelhoff/eeg_manypipes_arc

UPDATE 2023-01-23 -- AT REQUEST OF THE EEG MANY PIPELINES STEERING COMMITTEE, WE TURNED THE ACCESS TO THE DATA REPOSITORY TO PRIVATE.

The report for the analysis is in REPORT.txt.

Installation

To run the code, you need to install the required dependencies first. We recommend that you follow these steps (assumed to be run from the root of this repository):

  1. Download Miniconda for your system: https://docs.conda.io/en/latest/miniconda.html (this will provide you with the conda command)
  2. Use conda to install mamba: conda install mamba -n base -c conda-forge (for more information, see: https://github.com/mamba-org/mamba; NOTE: We recommend that you install mamba in your base environment.)
  3. Use the environment.yml file in this repository to create the emp ("EEGManyPipelines") environment: mamba env create -f environment.yml
  4. Activate the environment as usual with conda activate emp
  5. After the first activation, run the following to activate pre-commit hooks: pre-commit install
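
After activating the emp environment, a quick way to confirm that the core analysis packages resolved is to check whether they can be found by Python. The sketch below is not part of the repository. The function name `missing_modules` is hypothetical, and the import names (`mne`, `autoreject`, `pyprep`, ...) are assumed to match the packages listed in environment.yml.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Assumed import names for some of the packages pinned in environment.yml.
CORE_MODULES = ["mne", "autoreject", "pyprep", "numpy", "pandas", "scipy"]

# Example use, run inside the activated environment:
#     print(missing_modules(CORE_MODULES))
```

An empty list suggests the environment was created as expected; any names that are printed point to packages that did not install.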

Obtaining the data

We recommend that you make use of the data hosted on GIN via Datalad.

If you followed the installation steps above, you should almost have a working installation of Datalad in your environment. The last step that is (probably) missing is installing git-annex.

Depending on your operating system, do it as follows:

  • Ubuntu: mamba install -c conda-forge git-annex
  • macOS: brew install git-annex (use Homebrew)
  • Windows: choco install git-annex (use Chocolatey)

Use the following steps to download the data:

  1. clone: datalad clone https://gin.g-node.org/sappelhoff/eeg_manypipes_arc
  2. go to root of cloned dataset: cd eeg_manypipes_arc
  3. get a specific piece of the data: datalad get sourcedata/eeg_eeglab/EMP01.set
  4. ... or get all data: datalad get *

Note that if you do not get all the data (step 4. above), the data that you did not get is not actually present on your system. There is merely a symbolic link to a remote location (GIN). Furthermore, the entire EEG data (even after get) is "read only"; if you need to edit or overwrite the files (not recommended), you can run datalad unlock *.
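
To see whether a given file has actually been fetched, one can test whether the path resolves to real content or is only a dangling symbolic link. The following is a minimal sketch, assuming the dataset uses symbolic links as described above (on Windows or in unlocked datasets, git-annex may use small pointer files instead, which this check does not detect). The function name `annex_status` is hypothetical and not part of Datalad.

```python
from pathlib import Path


def annex_status(path):
    """Classify a path inside a cloned dataset.

    Returns "present" if the content is available locally,
    "placeholder" if only a dangling symbolic link exists,
    and "missing" if there is nothing at the path.
    """
    p = Path(path)
    if p.exists():  # follows symbolic links, so the content is really there
        return "present"
    if p.is_symlink():  # the link exists, but its target was never fetched
        return "placeholder"
    return "missing"


# Example use, run from the root of the cloned dataset:
#     print(annex_status("sourcedata/eeg_eeglab/EMP01.set"))
```

A "placeholder" result means that datalad get still needs to be run for that file.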

Continuous integration

Under .github/workflows/run_analysis.yml we have specified a test workflow that may be helpful for you to inspect.

Running the code

Before running the code on your system you must:

  1. Obtain the data (see above)
  2. Edit config.py to include the path to your data (see FPATH_DS variable)
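
As an illustration of step 2, the path setting could look roughly like the sketch below. Only the variable name FPATH_DS comes from config.py. The example location and the helper `check_dataset_path` are hypothetical and shown only to make a wrong path fail early with a readable message.

```python
from pathlib import Path

# Hypothetical location: replace with the directory you cloned the dataset into.
FPATH_DS = Path.home() / "eeg_manypipes_arc"


def check_dataset_path(fpath_ds):
    """Return the dataset path as a Path, or raise if it is not a directory."""
    fpath_ds = Path(fpath_ds).expanduser()
    if not fpath_ds.is_dir():
        raise FileNotFoundError(
            f"Dataset directory not found: {fpath_ds}. "
            "Edit FPATH_DS to point to your copy of the data."
        )
    return fpath_ds
```

Calling `check_dataset_path(FPATH_DS)` before running any analysis script is one way to catch a misconfigured path up front.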

Description of files

  • files unrelated to analysis
    • LICENSE, detailing how our work is licensed
    • README.md, the information that you currently read
    • setup.cfg, a file to configure different software tools to work well with each other (black, flake8, ...)
    • CITATION.cff, metadata on how to cite this code
    • .gitignore, which files not to track in the version control system
    • environment.yml, needed to install software dependencies (see also "Installation" above)
    • .pre-commit-config.yaml, configuration for "pre-commit hooks" that ease software development
    • organizer_documentation/*.pdf, the original documentation provided by the EEG Many Pipelines project organizers
    • .github/workflows/run_analysis.yml, a continuous integration workflow definition for GitHub Actions

All other files are related to the analysis.

  • REPORT.txt, containing four short paragraphs on the analysis of the four hypotheses
  • report_sheets/EEGManyPipelines_results_h*.xlsx, Excel files with information about the analysis. Note h* stands for hypotheses 1, 2a, 3a, ..., 4b.
  • config.py, definitions of stable variables that are reused throughout other scripts; for example file paths
  • utils.py, definitions of functions that are reused throughout other scripts

The Python scripts that do the heavy lifting have names that are prefixed with two integers: 00, 01, 02, ... This prefix indicates the order in which to run the scripts.
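
The naming convention makes it possible to derive the run order from the file names alone. The sketch below is not part of the repository: the function `ordered_scripts` is hypothetical, and it assumes that a lettered variant such as 06b runs directly after its unlettered counterpart.

```python
import re

# Two digits, an optional letter (as in 06b), an underscore, then the rest.
PREFIX = re.compile(r"^(\d{2})([a-z]?)_.+\.py$")


def ordered_scripts(names, skip_optional=True):
    """Sort prefixed script names into run order.

    Files without a numeric prefix (config.py, utils.py) are ignored.
    With skip_optional=True, scripts prefixed with 00 are left out.
    """
    selected = []
    for name in names:
        match = PREFIX.match(name)
        if match is None:
            continue
        number, letter = int(match.group(1)), match.group(2)
        if skip_optional and number == 0:
            continue
        selected.append(((number, letter), name))
    return [name for _, name in sorted(selected)]
```

For example, passing the names of all files in the repository root would return 01_find_bads.py first and 10_test_h4.py last.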

The scripts prefixed with 00 are optional to run.

  • 00_find_bad_subjs.py, to find subjects to exclude from analysis based on behavioral performance (see BAD_SUBJS variable in config.py)
  • 00_inspect_raws.py, to interactively inspect raw EEG data
  • 00_prepare_handin.py, only to prepare all files for handing in for the EEGManyPipelines submission

The preprocessing scripts are those from 01 to 06. These operate on single subjects.

  • 01_find_bads.py, finding bad channels using pyprep
  • 02_mark_bad_segments.py, marking bad temporal segments using MNE-Python automatic methods
  • 03_run_ica.py, running ICA, excluding previously found bad channels and segments
  • 04_inspect_ica.py, finding and excluding bad ICA components
  • 05_make_epochs.py, epoching the data
  • 06_run_autoreject.py, interpolating channels
  • 06b_check_autoreject.py, providing a summary of interpolated channels

Note that these scripts can be easily run from the command line and that you can specify certain arguments there (see the scripts for more detail). This allows running several subjects from the command line like below:

    for i in {1..33}
    do
        python -u 01_find_bads.py \
            --sub=$i \
            --overwrite=True
    done
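
On systems without a Bash-compatible shell, the same loop can be driven from Python. This is a minimal sketch and not part of the repository: the helper names `build_command` and `run_for_subjects` are hypothetical, and only the --sub and --overwrite arguments shown in the loop above are assumed to exist.

```python
import subprocess
import sys


def build_command(script, sub, overwrite=True):
    """Build the argument list for one subject, mirroring the shell loop."""
    return [
        sys.executable,  # the Python interpreter of the active environment
        "-u",
        script,
        f"--sub={sub}",
        f"--overwrite={overwrite}",
    ]


def run_for_subjects(script, subjects=range(1, 34), overwrite=True):
    """Run a script once per subject and stop at the first failure."""
    for sub in subjects:
        subprocess.run(build_command(script, sub, overwrite), check=True)


# Example use, run from the root of the repository:
#     run_for_subjects("01_find_bads.py")
```

Using `check=True` stops the loop as soon as one subject fails, which differs from the shell loop above, where later subjects would still be processed.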

Finally, there is one script for testing each of the four hypotheses.

  • 07_test_h1.py, for hypothesis 1
  • 08_test_h2.py, for hypothesis 2
  • 09_test_h3.py, for hypothesis 3
  • 10_test_h4.py, for hypothesis 4

All outputs of these analyses are stored on GIN:

https://gin.g-node.org/sappelhoff/eeg_manypipes_arc

Owner

  • Name: Simon Ciranka
  • Login: NomisCiri
  • Kind: user

Citation (CITATION.cff)

# YAML 1.2
#
# To do:
# - add "preferred-citation" once the EEGManyLabs report is published
---
# Metadata for citation of this software according to the CFF format (https://citation-file-format.github.io/)
cff-version: 1.2.0
title: eeg_manypipes_arc -- EEGManyPipelines Analysis Code
authors:
  - given-names: Stefan
    family-names: Appelhoff
    affiliation: Max Planck Institute for Human Development
    orcid: 'https://orcid.org/0000-0001-8002-0877'
  - given-names: Casper
    family-names: Kerrén
    affiliation: Max Planck Institute for Human Development
  - given-names: Simon
    family-names: Ciranka
    affiliation: Max Planck Institute for Human Development
    orcid: 'https://orcid.org/0000-0002-2067-9781'
type: software
repository-code: 'https://github.com/NomisCiri/eeg_manypipes_arc/'
license: MIT
identifiers:
  - description: "Code archive on Zenodo"
    type: doi
    value: 10.5281/zenodo.6549049
keywords:
  - analysis
  - neuroscience
  - psychology
  - eeg
  - electroencephalography
  - EEGManyPipelines
message: >-
  Please cite this software using the metadata
  in the CITATION.cff file.
...

Dependencies

.github/workflows/run_analysis.yml actions
  • actions/cache v3 composite
  • actions/checkout v3 composite
  • conda-incubator/setup-miniconda v2 composite
  • crazy-max/ghaction-chocolatey v1 composite
environment.yml conda
  • autoreject <0.4
  • black <22.4
  • click <8.2
  • flake8 <4.1
  • flake8-docstrings <1.7
  • h5io <0.2
  • h5py <3.7
  • ipykernel <6.13
  • ipython <8.3
  • isort <5.11
  • matplotlib <3.6
  • mne-base <1.1
  • numpy <1.23
  • pandas <1.5
  • pip
  • pre-commit <2.19
  • psutil <5.10
  • pymatreader <0.1
  • pyprep <0.5
  • python <3.10
  • scipy <1.9
  • seaborn <0.12
  • statsmodels <0.14