iv-q1-detect-sleep-states
23rd place solution of Team Epoch on the Detect Sleep States competition hosted on Kaggle.
https://github.com/teamepochgithub/iv-q1-detect-sleep-states
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references: not found
- ○ Academic publication links: not found
- ○ Academic email domains: not found
- ○ Institutional organization owner: not found
- ○ JOSS paper metadata: not found
- ○ Scientific vocabulary similarity: low similarity (1.4%) to scientific vocabulary
Keywords
Repository
23rd place solution of Team Epoch on the Detect Sleep States competition hosted on Kaggle.
Basic Info
- Host: GitHub
- Owner: TeamEpochGithub
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://storage.googleapis.com/kaggle-forum-message-attachments/2554059/20038/CMI___Technical_Report.pdf
- Size: 416 MB
Statistics
- Stars: 5
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
CMI - Detect Sleep States | Place 23/1877 
This repository contains the code for our solution to the Child Mind Institute - Detect Sleep States competition, in which we placed 23rd out of 1877 teams.
Running main
Main should be run with the current working directory set to the directory main.
Config
The config.json file is used to set the paths of the data folders (where to read the raw data and where to save our processed data), to enable or disable logging to Weights & Biases, and to choose between using ensembles, hyperparameter optimization, and cross validation. Below is an example config we used.
JSON
{
  "name": "config",
  "is_kaggle": false,
  "log_to_wandb": true,
  "pred_with_cpu": false,
  "train_series_path": "data/raw/train_series.parquet",
  "train_events_path": "data/raw/train_events.csv",
  "test_series_path": "data/raw/test_series.parquet",
  "fe_loc_in": "data/processed",
  "processed_loc_out": "data/processed",
  "processed_loc_in": "data/raw",
  "model_store_loc": "tm",
  "model_config_loc": "model_configs",
  "ensemble": {
    "models": ["spectrogram-cnn-gru.json"],
    "weights": [1],
    "comb_method": "confidence_average",
    "pred_only": false
  },
  "cv": {
    "splitter": "group_k_fold",
    "scoring": ["score_full", "score_clean"],
    "splitter_params": {
      "n_splits": 5
    }
  },
  "train_for_submission": false,
  "scoring": true,
  "visualize_preds": {
    "n": 0,
    "browser_plot": false,
    "save": true
  }
}
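A config like the one above can be loaded with a few lines of Python. This is a minimal sketch; `load_config` and its sanity check are illustrative assumptions, not the repository's actual loader.

```python
import json


def load_config(path: str = "config.json") -> dict:
    """Load the run configuration from a JSON file.

    Hypothetical helper for illustration: the real entry point may
    parse and validate the config differently.
    """
    with open(path) as f:
        config = json.load(f)
    # The config selects a run mode: "ensemble" for (ensembles of) model
    # training, or "hpo" for hyperparameter optimization.
    if "ensemble" in config and "hpo" in config:
        raise ValueError("Use either 'ensemble' or 'hpo', not both")
    return config
```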
Logging and Kaggle
To log our experiment results we use Weights & Biases. However, since logging does not make sense when running inference on Kaggle, we have is_kaggle and log_to_wandb as optional arguments in the config. To be able to run in a CPU notebook on Kaggle while still using our GPUs locally, the config also has the pred_with_cpu argument (torch.device() would already fall back to the CPU on Kaggle if the GPU is disabled for the notebook, so this is redundant, but it is used).
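The device selection described above can be sketched as follows. `choose_device` is a hypothetical helper, not the repository's actual code, and the torch call is left as a comment to keep the sketch dependency-free.

```python
def choose_device(pred_with_cpu: bool, cuda_available: bool) -> str:
    """Pick the inference device name.

    pred_with_cpu forces CPU inference (e.g. for a Kaggle CPU notebook)
    even when a local GPU would be available.
    """
    if pred_with_cpu or not cuda_available:
        return "cpu"
    return "cuda"


# In real code this would feed into e.g.:
# device = torch.device(choose_device(config["pred_with_cpu"],
#                                     torch.cuda.is_available()))
```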
Model training
To train models, set the config to ensemble and, under models, give a list of all the model configs you would like to use along with their weights. By giving multiple model configs, which are located in the model_config_loc folder (specified in the config), together with their weights, you can build ensembles and give each model a different weight in the confidence-averaging step.
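The confidence-averaging step can be illustrated with a small sketch. The function name and data layout are assumptions for illustration and do not reflect the repository's actual implementation.

```python
def confidence_average(model_confidences, weights):
    """Combine per-model confidence series into one weighted average.

    model_confidences: one list of confidence values per model, all the
    same length. weights: one weight per model, as in the "weights"
    field of the ensemble config.
    """
    total = sum(weights)
    n = len(model_confidences[0])
    return [
        sum(w * conf[i] for conf, w in zip(model_confidences, weights)) / total
        for i in range(n)
    ]
```

With weights [3, 1], the first model's confidences count three times as much as the second's in the combined output.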
To do cross validation, see the example config below.
JSON
{
  "name": "config",
  "is_kaggle": false,
  "log_to_wandb": true,
  "pred_with_cpu": false,
  "train_series_path": "data/raw/train_series.parquet",
  "train_events_path": "data/raw/train_events.csv",
  "test_series_path": "data/raw/test_series.parquet",
  "fe_loc_in": "data/processed",
  "processed_loc_out": "data/processed",
  "processed_loc_in": "data/raw",
  "model_store_loc": "tm",
  "model_config_loc": "model_configs",
  "hpo": "spectrogram-cnn-gru.json",
  "cv": {
    "splitter": "group_k_fold",
    "scoring": ["score_full", "score_clean"],
    "splitter_params": {
      "n_splits": 5
    }
  },
  "train_for_submission": false,
  "scoring": true,
  "visualize_preds": {
    "n": 0,
    "browser_plot": false,
    "save": true
  }
}
When the config is set up this way, cross validation is done using the parameters given under cv. The splitter, the number of folds, and which scores to calculate are all arguments (the clean score refers to the score on data without NaNs). This config is also used when doing HPO with Weights & Biases.
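A group-based splitter such as group_k_fold keeps all windows of one series in the same fold, so a model is never validated on a series it trained on. A minimal pure-Python illustration follows; in practice this is what scikit-learn's GroupKFold provides, and the round-robin group assignment here is a simplification.

```python
def group_k_fold(groups, n_splits):
    """Yield (train_idx, val_idx) pairs such that no group appears in
    both the train and the validation indices of a fold.

    groups: one group label (e.g. series id) per sample.
    """
    unique = sorted(set(groups))
    for fold in range(n_splits):
        val_groups = set(unique[fold::n_splits])  # round-robin assignment
        val_idx = [i for i, g in enumerate(groups) if g in val_groups]
        train_idx = [i for i, g in enumerate(groups) if g not in val_groups]
        yield train_idx, val_idx
```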
When train_for_submission is set to true, the model is also trained on the complete train set. The visualize_preds arguments are used to generate plots with Plotly, or to simply save a JPEG of our models' predictions for each series compared to the real events.
To see how the preprocessing and feature engineering steps are chosen, along with how hyperparameters are set for each model, please refer to src/configs/readme.md and the individual model configs in the model_configs folder. Each model config lists all preprocessing and feature engineering steps, along with the model hyperparameters and the type of model, in the formats defined in src/configs/readme.md. (There might be some methods that are not mentioned in the readme.)
The training happens by passing all our data to the trainer class located in src/models/trainers.
Model architectures
In our code the model classes (classes that have methods like train and pred) and model architectures are separated. To see our model architectures please refer to src/models/architectures and to see the model classes please refer to src/models.
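This separation can be sketched as a thin model class wrapping an architecture. The class and method bodies below are illustrative stand-ins, not the repository's actual interfaces.

```python
class GRUArchitecture:
    """Stand-in for a network definition in src/models/architectures."""

    def forward(self, x):
        # Placeholder computation; a real architecture would run the network.
        return [v * 0.5 for v in x]


class Model:
    """Stand-in for a model class in src/models with train/pred methods."""

    def __init__(self, architecture):
        self.architecture = architecture

    def train(self, data, labels):
        pass  # the optimizer loop would live here

    def pred(self, data):
        return self.architecture.forward(data)
```

Keeping the training/prediction logic in the model class lets different architectures be swapped in without touching the training loop.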
Contributors
This repository was created by Team Epoch IV, based in the Dream Hall of the Delft University of Technology.
Read more about this competition here.
Owner
- Name: Team Epoch
- Login: TeamEpochGithub
- Kind: organization
- Email: info@teamepoch.net
- Website: https://www.teamepoch.net/
- Repositories: 1
- Profile: https://github.com/TeamEpochGithub
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Lim"
    given-names: "Jeffrey"
    affiliation: "TU Delft Dream Team Epoch"
    email: "Jeffrey-Lim@outlook.com"
  - family-names: "Witting"
    given-names: "Emiel"
    affiliation: "TU Delft Dream Team Epoch"
    email: "emiel.witting@gmail.com"
  - family-names: "Heer"
    name-particle: "de"
    given-names: "Hugo"
    affiliation: "TU Delft Dream Team Epoch"
    email: "hugodeheer1234@gmail.com"
  - family-names: "Kopar"
    given-names: "Cahit Tolga"
    affiliation: "TU Delft Dream Team Epoch"
    email: "cahittolgakopar@gmail.com"
  - family-names: "Selm"
    name-particle: "van"
    given-names: "Jasper"
    affiliation: "TU Delft Dream Team Epoch"
    email: "jmvanselm@gmail.com"
title: "Detect Sleep States"
version: 1.0.0
date-released: 2023-12-06
url: "https://github.com/TeamEpochGithub/iv-q1-detect-sleep-states"
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Dependencies
- Babel ==2.12.1
- Jinja2 ==3.1.2
- MarkupSafe ==2.1.3
- Pillow ==10.0.0
- PyYAML ==6.0.1
- Pygments ==2.16.1
- QtPy ==2.4.0
- Send2Trash ==1.8.2
- anyio ==4.0.0
- argon2-cffi ==23.1.0
- argon2-cffi-bindings ==21.2.0
- arrow ==1.2.3
- astroid ==2.15.6
- asttokens ==2.4.0
- async-lru ==2.0.4
- attrs ==23.1.0
- backcall ==0.2.0
- beautifulsoup4 ==4.12.2
- bleach ==6.0.0
- certifi ==2023.7.22
- cffi ==1.15.1
- charset-normalizer ==3.2.0
- colorama ==0.4.6
- coloredlogs ==15.0.1
- comm ==0.1.4
- contourpy ==1.1.0
- cramjam ==2.7.0
- cycler ==0.11.0
- debugpy ==1.6.7.post1
- decorator ==5.1.1
- defusedxml ==0.7.1
- dill ==0.3.7
- exceptiongroup ==1.1.3
- executing ==1.2.0
- fastjsonschema ==2.18.0
- fastparquet ==2023.8.0
- flake8 ==6.1.0
- flake8-for-pycharm ==0.4.1
- fonttools ==4.42.1
- fqdn ==1.5.1
- fsspec ==2023.9.0
- greenlet ==2.0.2
- idna ==3.4
- ipykernel ==6.25.2
- ipython ==8.15.0
- ipython-genutils ==0.2.0
- ipywidgets ==8.1.0
- isoduration ==20.11.0
- isort ==5.12.0
- jedi ==0.19.0
- joblib *
- json5 ==0.9.14
- jsonpointer ==2.4
- jsonschema ==4.19.0
- jsonschema-specifications ==2023.7.1
- jupyter ==1.0.0
- jupyter-console ==6.6.3
- jupyter-events ==0.7.0
- jupyter-lsp ==2.2.0
- jupyter_client ==8.3.1
- jupyter_core ==5.3.1
- jupyter_server ==2.7.3
- jupyter_server_terminals ==0.4.4
- jupyterlab ==4.0.5
- jupyterlab-pygments ==0.2.2
- jupyterlab-widgets ==3.0.8
- jupyterlab_server ==2.24.0
- kaggle *
- kaleido ==0.2.1
- kiwisolver ==1.4.5
- lazy-object-proxy ==1.9.0
- matplotlib ==3.7.2
- matplotlib-inline ==0.1.6
- mccabe ==0.7.0
- mistune ==3.0.1
- nbclient ==0.8.0
- nbconvert ==7.8.0
- nbformat ==5.9.2
- nest-asyncio ==1.5.7
- notebook ==7.0.3
- notebook_shim ==0.2.3
- numpy ==1.25.2
- overrides ==7.4.0
- packaging ==23.1
- pandas ==2.0.3
- pandocfilters ==1.5.0
- parso ==0.8.3
- pickleshare ==0.7.5
- platformdirs ==3.10.0
- playwright ==1.37.0
- plotly ==5.16.1
- plotly_resampler ==0.9.1
- polars ==0.19.3
- pre-commit *
- prometheus-client ==0.17.1
- prompt-toolkit ==3.0.39
- psutil ==5.9.5
- pure-eval ==0.2.2
- pyarrow ==13.0.0
- pycodestyle ==2.11.0
- pycparser ==2.21
- pyee ==9.0.4
- pyflakes ==3.1.0
- pylint ==2.17.5
- pyparsing ==3.0.9
- python-dateutil ==2.8.2
- python-json-logger ==2.0.7
- pytz ==2023.3.post1
- pyzmq ==25.1.1
- qtconsole ==5.4.4
- referencing ==0.30.2
- requests ==2.31.0
- rfc3339-validator ==0.1.4
- rfc3986-validator ==0.1.1
- rpds-py ==0.10.2
- ruptures ==1.1.8
- scikit-learn ==1.3.1
- scipy ==1.11.2
- seaborn ==0.12.2
- segmentation-models-pytorch ==0.3.3
- six ==1.16.0
- sniffio ==1.3.0
- soupsieve ==2.5
- stack-data ==0.6.2
- tenacity ==8.2.3
- terminado ==0.17.1
- timm *
- tinycss2 ==1.2.1
- tomli ==2.0.1
- tomlkit ==0.12.1
- torch ==2.0.1
- torchaudio ==2.0.2
- torchsummary ==1.5.1
- torchvision ==0.15.2
- tornado ==6.3.3
- tqdm ==4.66.1
- traitlets ==5.9.0
- typing_extensions ==4.7.1
- tzdata ==2023.3
- uri-template ==1.3.0
- urllib3 ==2.0.4
- wandb *
- wcwidth ==0.2.6
- webcolors ==1.13
- webencodings ==0.5.1
- websocket-client ==1.6.2
- widgetsnbextension ==4.0.8
- wrapt ==1.15.0
