https://github.com/arcascope/pisces

Pipeline for Sleep Classification and Evaluations

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.5%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: Arcascope
  • License: apache-2.0
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 32.7 MB
Statistics
  • Stars: 0
  • Watchers: 2
  • Forks: 3
  • Open Issues: 5
  • Releases: 1
Created about 3 years ago · Last pushed about 1 year ago
Metadata Files
Readme License

README.md

pisces

This package provides a framework and examples for running machine learning experiments in sleep classification. Pisces offers automated data set and subject/feature discovery based on a light folder structure, loading CSVs into pandas DataFrame objects. A number of tools are also provided for plotting, scoring, and debugging sleep research pipelines.

Installation

Start by making a Python or conda environment with Python 3.11. For example, you can create a conda environment called pisces by running:

    conda create -n pisces python=3.11
    conda activate pisces

In the same terminal (so that your new conda environment is active), navigate to the directory where you’d like to clone the package. Then run the following commands to clone it and install it with pip in editable mode (-e .):

    git clone https://github.com/Arcascope/pisces.git
    cd pisces
    pip install -e .

Common issues

You may end up with a version of Keras incompatible with the marshalled data in pisces/cached_models. In that case, run pisces_setup in a terminal; pisces_setup is in your path as long as a Python environment with pisces installed is active.
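Before running pisces_setup, it can help to confirm which versions are actually installed. The following is a minimal diagnostic sketch, not part of Pisces: check_pin is a hypothetical helper, and the expected versions are the pins listed in requirements.txt (keras ==3.2.1, tensorflow ==2.16.1). Whether matching those pins is sufficient for the cached models to load is an assumption.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pin(package: str, expected: str) -> str:
    """Compare the installed version of `package` with an expected pin.

    Hypothetical helper for illustration; Pisces does not ship it.
    """
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package}: not installed (expected {expected})"
    if installed == expected:
        return f"{package}: {installed} matches the pin"
    return f"{package}: {installed} differs from pinned {expected}"


# Pins taken from the requirements.txt listing for this repository.
print(check_pin("keras", "3.2.1"))
print(check_pin("tensorflow", "2.16.1"))
```

If a version differs from its pin, that is a hint (not proof) that the cached models may need to be regenerated with pisces_setup.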

Usage

The pipeline is intended to be flexible and easy to extend with new models, datasets, and evaluation metrics. In version 2.0, we have streamlined the library to prioritize nimbleness and easy debugging.

The examples/NHRC folder shows how to use pisces alongside machine learning frameworks such as sklearn and tensorflow.

Data Sets

Pisces automatically discovers data sets that match a simple, flexible format inside a given directory. The analysis in examples/NHRC/src finds data contained in the data folder of the Pisces repository. The code is simple:

```python
from pisces.data_sets import DataSetObject

# Discover every data set folder under ../data.
sets = DataSetObject.find_data_sets("../data")
# Data sets are keyed by their folder names.
walch = sets['walch_et_al']
hybrid = sets['hybrid_motion']
```

Now we have two DataSetObjects, walch and hybrid, that can be queried for their subjects and features. They were discovered because they are folders inside data that have a compatible structure.

Specifically, each contains at least one subdirectory matching the glob expression cleaned_*. Every subdirectory that matches this pattern is considered a feature. Based on the example below, Pisces discovers that hybrid_motion and walch_et_al both have psg, accelerometer, and activity features, in addition to any other matching folders they may have that are not listed here.

The data directory looks like:

    data
    ├── walch_et_al
    │   ├── cleaned_accelerometer
    │   │   ├── 46343_cleaned_motion.out
    │   │   ├── 759667_cleaned_motion.out
    │   │   ├── ...
    │   ├── cleaned_activity
    │   │   ├── 46343_cleaned_counts.out
    │   │   ├── 759667_cleaned_counts.out
    │   │   ├── ...
    │   ├── cleaned_psg
    │   │   ├── 46343_cleaned_psg.out
    │   │   ├── 759667_cleaned_psg.out
    │   │   ├── ...
    ├── hybrid_motion
    │   ├── cleaned_accelerometer
    │   │   ├── 46343.csv
    │   │   ├── 759667.csv
    │   │   ├── ...
    │   ├── cleaned_activity
    │   │   ├── 46343.csv
    │   │   ├── 759667.csv
    │   │   ├── ...
    │   ├── cleaned_psg
    │   │   ├── 46343_labeled_sleep.txt
    │   │   ├── 759667_labeled_sleep.txt
    │   │   ├── ...

Key takeaways for data set discovery:

  1. The data set is discovered based on the presence of a subdirectory matching the glob expression cleaned_*.
  2. Every subdirectory that matches this pattern is considered a feature; these features are named after the part matching *.
  3. Subjects are computed per feature, based on the variable and constant parts of the filenames within each feature directory. Put simply, because the walch_et_al accelerometer folder contains the files 46343_cleaned_motion.out and 759667_cleaned_motion.out, which have _cleaned_motion.out in common, Pisces identifies 46343 and 759667 as subject IDs that have accelerometer feature data for walch_et_al.
    1. It is no problem if some subjects are missing a certain feature. When you request a feature for a subject that exists in the data set but has no file for that feature, the result is None for that subject.
    2. The naming scheme can vary greatly between features. However, the subject ID MUST be the prefix of the filenames. For example, 46343_cleaned_motion.out and 46343_labeled_sleep.txt are both for the same subject, 46343. If instead we named those final_46343_cleaned_motion.out and 46343_labeled_sleep.txt, then the subject’s data would be split across two subjects, 46343 and final_46343.
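The takeaways above can be sketched in plain Python. This is an illustration of the described behavior only, not the Pisces implementation: common_suffix and discover_features are hypothetical helpers, and treating the shared filename suffix as the constant part is an assumption about how subject IDs are derived.

```python
import os
from pathlib import Path


def common_suffix(names):
    """Return the longest suffix shared by all filenames."""
    # os.path.commonprefix compares character by character, so applying it
    # to the reversed names yields the (reversed) common suffix.
    return os.path.commonprefix([n[::-1] for n in names])[::-1]


def discover_features(data_set_dir):
    """Map feature name -> sorted subject IDs for each cleaned_* subdirectory.

    Hypothetical sketch: the feature name is the part matching *, and a
    subject ID is the filename with the shared suffix removed.
    """
    features = {}
    for feature_dir in sorted(Path(data_set_dir).glob("cleaned_*")):
        if not feature_dir.is_dir():
            continue
        names = [p.name for p in feature_dir.iterdir() if p.is_file()]
        # With a single file there is nothing to compare against, so this
        # sketch keeps the whole filename as the ID.
        suffix = common_suffix(names) if len(names) > 1 else ""
        ids = sorted(n[: len(n) - len(suffix)] for n in names)
        features[feature_dir.name[len("cleaned_"):]] = ids
    return features
```

One limitation of this naive approach: if every subject ID happened to end in the same digits, those digits would be absorbed into the shared suffix. The sketch also shows why the ID must be the filename prefix: a name like final_46343_cleaned_motion.out would yield the ID final_46343.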

Advanced features of data set discovery:

  1. There is no a priori rule about which features in a data set provide the labels and which are model inputs. This allows you to call the label feature whatever you want, or use a mixture of features (psg + …) as labels for complex models supporting rich outputs.
  2. You can have other folders inside data set directories that do NOT match cleaned_*, and these are totally ignored. This allows you to store other data, like raw data or metadata, in the same directory as the cleaned data.
  3. You can have other folders whose sub-structure does not match the subject/feature structure, and these are totally ignored.
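These rules can also be sketched in a few lines. Again, this is only an illustration under stated assumptions, not the Pisces API: discover_data_sets and split_features are hypothetical names. The first skips any folder without a cleaned_* subdirectory; the second shows that the split between inputs and labels is purely the caller's choice.

```python
from pathlib import Path


def discover_data_sets(root):
    """Map data set name -> feature names found under `root`.

    Hypothetical sketch: a folder counts as a data set only if it has at
    least one cleaned_* subdirectory; everything else is ignored.
    """
    data_sets = {}
    for candidate in sorted(Path(root).iterdir()):
        if not candidate.is_dir():
            continue
        features = sorted(
            d.name[len("cleaned_"):]
            for d in candidate.glob("cleaned_*")
            if d.is_dir()
        )
        if features:
            data_sets[candidate.name] = features
    return data_sets


def split_features(features, label_features):
    """Partition feature names into (inputs, labels) as chosen by the caller."""
    labels = [f for f in features if f in label_features]
    inputs = [f for f in features if f not in label_features]
    return inputs, labels
```

For example, calling split_features(["accelerometer", "activity", "psg"], {"psg"}) treats psg as the label and the other two features as model inputs; passing a different set changes the roles without touching the data on disk.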

Owner

  • Name: Arcascope Inc
  • Login: Arcascope
  • Kind: organization

GPS for your body's clock

GitHub Events

Total
  • Create event: 3
  • Release event: 1
  • Issues event: 1
  • Delete event: 1
  • Issue comment event: 2
  • Push event: 18
  • Pull request review event: 5
  • Pull request review comment event: 5
  • Pull request event: 10
Last Year
  • Create event: 3
  • Release event: 1
  • Issues event: 1
  • Delete event: 1
  • Issue comment event: 2
  • Push event: 18
  • Pull request review event: 5
  • Pull request review comment event: 5
  • Pull request event: 10

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 0
  • Total pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: about 1 hour
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: about 1 hour
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ericcanton (6)
  • ftavella (2)
Pull Request Authors
  • ftavella (9)
  • ericcanton (9)
  • ojwalch (1)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

requirements.txt pypi
  • Babel ==2.14.0
  • Jinja2 ==3.1.3
  • Markdown ==3.6
  • MarkupSafe ==2.1.5
  • PyYAML ==6.0.1
  • Pygments ==2.17.2
  • QtPy ==2.4.1
  • Send2Trash ==1.8.3
  • Werkzeug ==3.0.2
  • absl-py ==1.4.0
  • agcounts ==0.2.3
  • anyio ==4.3.0
  • appnope ==0.1.4
  • argon2-cffi ==23.1.0
  • argon2-cffi-bindings ==21.2.0
  • arrow ==1.3.0
  • asttokens ==2.4.1
  • astunparse ==1.6.3
  • async-lru ==2.0.4
  • attrs ==23.2.0
  • beautifulsoup4 ==4.12.3
  • bleach ==6.1.0
  • certifi ==2024.2.2
  • cffi ==1.16.0
  • charset-normalizer ==3.3.2
  • click ==8.1.7
  • comm ==0.2.2
  • contourpy ==1.2.1
  • cycler ==0.12.1
  • debugpy ==1.8.1
  • decorator ==5.1.1
  • defusedxml ==0.7.1
  • dm-tree ==0.1.8
  • etils ==1.8.0
  • execnb ==0.1.5
  • executing ==2.0.1
  • fastcore ==1.5.29
  • fastjsonschema ==2.19.1
  • flatbuffers ==24.3.25
  • fonttools ==4.51.0
  • fqdn ==1.5.1
  • fsspec ==2024.3.1
  • gast ==0.5.4
  • ghapi ==1.0.5
  • google-pasta ==0.2.0
  • googleapis-common-protos ==1.63.0
  • grpcio ==1.62.1
  • h11 ==0.14.0
  • h5py ==3.11.0
  • httpcore ==1.0.5
  • httpx ==0.27.0
  • idna ==3.7
  • importlib_resources ==6.4.0
  • ipykernel ==6.29.4
  • ipython ==8.23.0
  • ipywidgets ==8.0.4
  • isoduration ==20.11.0
  • jedi ==0.19.1
  • joblib ==1.4.0
  • json5 ==0.9.25
  • jsonpointer ==2.4
  • jsonschema ==4.21.1
  • jsonschema-specifications ==2023.12.1
  • jupyter ==1.0.0
  • jupyter-console ==6.6.3
  • jupyter-events ==0.10.0
  • jupyter-lsp ==2.2.5
  • jupyter_client ==8.6.1
  • jupyter_core ==5.7.2
  • jupyter_server ==2.14.0
  • jupyter_server_terminals ==0.5.3
  • jupyterlab ==4.1.6
  • jupyterlab_pygments ==0.3.0
  • jupyterlab_server ==2.26.0
  • jupyterlab_widgets ==3.0.10
  • kagglehub ==0.2.2
  • keras ==3.2.1
  • keras-core ==0.1.7
  • keras-cv ==0.8.2
  • kiwisolver ==1.4.5
  • lazy_loader ==0.4
  • libclang ==18.1.1
  • markdown-it-py ==3.0.0
  • matplotlib ==3.8.4
  • matplotlib-inline ==0.1.7
  • mdurl ==0.1.2
  • mistune ==3.0.2
  • ml-dtypes ==0.3.2
  • mne ==1.6.1
  • namex ==0.0.8
  • nbclient ==0.10.0
  • nbconvert ==7.16.3
  • nbdev ==2.3.13
  • nbformat ==5.10.4
  • nest-asyncio ==1.6.0
  • notebook ==7.1.2
  • notebook_shim ==0.2.4
  • numpy ==1.26.4
  • opt-einsum ==3.3.0
  • optree ==0.11.0
  • overrides ==7.7.0
  • packaging ==24.0
  • pandas ==2.2.2
  • pandocfilters ==1.5.1
  • parso ==0.8.4
  • pexpect ==4.9.0
  • pillow ==10.3.0
  • platformdirs ==4.2.0
  • polars ==0.20.19
  • pooch ==1.8.1
  • prometheus_client ==0.20.0
  • promise ==2.3
  • prompt-toolkit ==3.0.43
  • protobuf ==3.20.3
  • psutil ==5.9.8
  • ptyprocess ==0.7.0
  • pure-eval ==0.2.2
  • pycparser ==2.22
  • pyparsing ==3.1.2
  • python-dateutil ==2.9.0.post0
  • python-json-logger ==2.0.7
  • pytz ==2024.1
  • pyzmq ==26.0.0
  • qtconsole ==5.5.1
  • referencing ==0.34.0
  • regex ==2023.12.25
  • requests ==2.31.0
  • rfc3339-validator ==0.1.4
  • rfc3986-validator ==0.1.1
  • rich ==13.7.1
  • rpds-py ==0.18.0
  • scikit-learn ==1.4.2
  • scipy ==1.13.0
  • six ==1.16.0
  • sniffio ==1.3.1
  • soupsieve ==2.5
  • stack-data ==0.6.3
  • tensorboard ==2.16.2
  • tensorboard-data-server ==0.7.2
  • tensorflow ==2.16.1
  • tensorflow-datasets ==4.9.4
  • tensorflow-io-gcs-filesystem ==0.36.0
  • tensorflow-metadata ==1.14.0
  • termcolor ==2.4.0
  • terminado ==0.18.1
  • threadpoolctl ==3.4.0
  • tinycss2 ==1.2.1
  • toml ==0.10.2
  • tornado ==6.4
  • tqdm ==4.66.2
  • traitlets ==5.14.2
  • types-python-dateutil ==2.9.0.20240316
  • typing_extensions ==4.11.0
  • tzdata ==2024.1
  • uri-template ==1.3.0
  • urllib3 ==2.2.1
  • watchdog ==4.0.0
  • wcwidth ==0.2.13
  • webcolors ==1.13
  • webencodings ==0.5.1
  • websocket-client ==1.7.0
  • widgetsnbextension ==4.0.10
  • wrapt ==1.16.0
  • zipp ==3.18.1
.github/workflows/deploy.yaml actions
  • fastai/workflows/quarto-ghp master composite
.github/workflows/test.yaml actions
  • fastai/workflows/nbdev-ci master composite
setup.py pypi