iv-q1-detect-sleep-states
23rd place solution of Team Epoch on the Detect Sleep States competition hosted on Kaggle.
https://github.com/teamepochgithub/iv-q1-detect-sleep-states
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references: not found
- ○ Academic publication links: not found
- ○ Academic email domains: not found
- ○ Institutional organization owner: not found
- ○ JOSS paper metadata: not found
- ○ Scientific vocabulary similarity: low similarity (1.4%) to scientific vocabulary
Keywords
Repository
23rd place solution of Team Epoch on the Detect Sleep States competition hosted on Kaggle.
Basic Info
- Host: GitHub
- Owner: TeamEpochGithub
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://storage.googleapis.com/kaggle-forum-message-attachments/2554059/20038/CMI___Technical_Report.pdf
- Size: 416 MB
Statistics
- Stars: 5
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
CMI - Detect Sleep States | Place 23/1877 
This repository contains the code for our solution to the Child Mind Institute - Detect Sleep States competition, in which we placed 23rd out of 1877 teams.
Running main
Main should be run with the current working directory set to the directory main.
Config
The config.json file is used to set the paths of the data folders (where to read the raw data and where to save our processed data), to enable or disable logging to Weights & Biases, and to choose between using ensembles, hyperparameter optimization, and cross validation. Below is an example config we used.
JSON
{
  "name": "config",
  "is_kaggle": false,
  "log_to_wandb": true,
  "pred_with_cpu": false,
  "train_series_path": "data/raw/train_series.parquet",
  "train_events_path": "data/raw/train_events.csv",
  "test_series_path": "data/raw/test_series.parquet",
  "fe_loc_in": "data/processed",
  "processed_loc_out": "data/processed",
  "processed_loc_in": "data/raw",
  "model_store_loc": "tm",
  "model_config_loc": "model_configs",
  "ensemble": {
    "models": ["spectrogram-cnn-gru.json"],
    "weights": [1],
    "comb_method": "confidence_average",
    "pred_only": false
  },
  "cv": {
    "splitter": "group_k_fold",
    "scoring": ["score_full", "score_clean"],
    "splitter_params": {
      "n_splits": 5
    }
  },
  "train_for_submission": false,
  "scoring": true,
  "visualize_preds": {
    "n": 0,
    "browser_plot": false,
    "save": true
  }
}
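A config like the one above can be loaded with a few lines of Python. This is a minimal sketch; `load_config` and its sanity check are illustrative assumptions, not the repository's actual loader.

```python
import json


def load_config(path: str = "config.json") -> dict:
    """Load the run configuration from a JSON file.

    Hypothetical helper for illustration: the real entry point may
    parse and validate the config differently.
    """
    with open(path) as f:
        config = json.load(f)
    # The config selects a run mode: "ensemble" for (ensembles of) model
    # training, or "hpo" for hyperparameter optimization.
    if "ensemble" in config and "hpo" in config:
        raise ValueError("Use either 'ensemble' or 'hpo', not both")
    return config
```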
Logging and Kaggle
To log our experiment results we use Weights & Biases. However, since logging does not make sense when running inference on Kaggle, we have is_kaggle and log_to_wandb as optional arguments in the config. To be able to run in a CPU notebook on Kaggle while still using our GPUs locally, the config also has the pred_with_cpu argument (torch.device() would already fall back to the CPU on Kaggle if the GPU is disabled for the notebook, so this is redundant, but it is used).
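The device selection described above can be sketched as follows. `choose_device` is a hypothetical helper, not the repository's actual code, and the torch call is left as a comment to keep the sketch dependency-free.

```python
def choose_device(pred_with_cpu: bool, cuda_available: bool) -> str:
    """Pick the inference device name.

    pred_with_cpu forces CPU inference (e.g. for a Kaggle CPU notebook)
    even when a local GPU would be available.
    """
    if pred_with_cpu or not cuda_available:
        return "cpu"
    return "cuda"


# In real code this would feed into e.g.:
# device = torch.device(choose_device(config["pred_with_cpu"],
#                                     torch.cuda.is_available()))
```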
Model training
To train models, set the config to ensemble and, under models, give a list of all the model configs you would like to use along with their weights. By giving multiple model configs, which are located in the model_config_loc folder (specified in the config), together with their weights, you can build ensembles and give each model a different weight in the confidence-averaging step.
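The confidence-averaging step can be illustrated with a small sketch. The function name and data layout are assumptions for illustration and do not reflect the repository's actual implementation.

```python
def confidence_average(model_confidences, weights):
    """Combine per-model confidence series into one weighted average.

    model_confidences: one list of confidence values per model, all the
    same length. weights: one weight per model, as in the "weights"
    field of the ensemble config.
    """
    total = sum(weights)
    n = len(model_confidences[0])
    return [
        sum(w * conf[i] for conf, w in zip(model_confidences, weights)) / total
        for i in range(n)
    ]
```

With weights [3, 1], the first model's confidences count three times as much as the second's in the combined output.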
To do cross validation, see the example config below.
JSON
{
  "name": "config",
  "is_kaggle": false,
  "log_to_wandb": true,
  "pred_with_cpu": false,
  "train_series_path": "data/raw/train_series.parquet",
  "train_events_path": "data/raw/train_events.csv",
  "test_series_path": "data/raw/test_series.parquet",
  "fe_loc_in": "data/processed",
  "processed_loc_out": "data/processed",
  "processed_loc_in": "data/raw",
  "model_store_loc": "tm",
  "model_config_loc": "model_configs",
  "hpo": "spectrogram-cnn-gru.json",
  "cv": {
    "splitter": "group_k_fold",
    "scoring": ["score_full", "score_clean"],
    "splitter_params": {
      "n_splits": 5
    }
  },
  "train_for_submission": false,
  "scoring": true,
  "visualize_preds": {
    "n": 0,
    "browser_plot": false,
    "save": true
  }
}
When the config is set up this way, cross validation is done using the parameters given under cv. The splitter, the number of folds, and which scores to calculate are all arguments (the clean score refers to the score on data without NaNs). This config is also used when doing HPO with Weights & Biases.
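A group-based splitter such as group_k_fold keeps all windows of one series in the same fold, so a model is never validated on a series it trained on. A minimal pure-Python illustration follows; in practice this is what scikit-learn's GroupKFold provides, and the round-robin group assignment here is a simplification.

```python
def group_k_fold(groups, n_splits):
    """Yield (train_idx, val_idx) pairs such that no group appears in
    both the train and the validation indices of a fold.

    groups: one group label (e.g. series id) per sample.
    """
    unique = sorted(set(groups))
    for fold in range(n_splits):
        val_groups = set(unique[fold::n_splits])  # round-robin assignment
        val_idx = [i for i, g in enumerate(groups) if g in val_groups]
        train_idx = [i for i, g in enumerate(groups) if g not in val_groups]
        yield train_idx, val_idx
```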
When train_for_submission is set to true, the model is also trained on the complete train set. The visualize_preds arguments are used to generate plots with Plotly, or to simply save a JPEG of our models' predictions for each series compared to the real events.
To see how the preprocessing and feature engineering steps are chosen, along with how hyperparameters are set for each model, please refer to src/configs/readme.md and the individual model configs in the model_configs folder. Each model config lists all preprocessing and feature engineering steps, along with the model hyperparameters and the type of model, in the formats defined in src/configs/readme.md. (There might be some methods that are not mentioned in the readme.)
The training happens by passing all our data to the trainer class located in src/models/trainers.
Model architectures
In our code the model classes (classes that have methods like train and pred) and model architectures are separated. To see our model architectures please refer to src/models/architectures and to see the model classes please refer to src/models.
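This separation can be sketched as a thin model class wrapping an architecture. The class and method bodies below are illustrative stand-ins, not the repository's actual interfaces.

```python
class GRUArchitecture:
    """Stand-in for a network definition in src/models/architectures."""

    def forward(self, x):
        # Placeholder computation; a real architecture would run the network.
        return [v * 0.5 for v in x]


class Model:
    """Stand-in for a model class in src/models with train/pred methods."""

    def __init__(self, architecture):
        self.architecture = architecture

    def train(self, data, labels):
        pass  # the optimizer loop would live here

    def pred(self, data):
        return self.architecture.forward(data)
```

Keeping the training/prediction logic in the model class lets different architectures be swapped in without touching the training loop.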
Contributors
This repository was created by Team Epoch IV, based in the Dream Hall of the Delft University of Technology.
Read more about this competition here.
Owner
- Name: Team Epoch
- Login: TeamEpochGithub
- Kind: organization
- Email: info@teamepoch.net
- Website: https://www.teamepoch.net/
- Repositories: 1
- Profile: https://github.com/TeamEpochGithub
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Lim"
    given-names: "Jeffrey"
    affiliation: "TU Delft Dream Team Epoch"
    email: "Jeffrey-Lim@outlook.com"
  - family-names: "Witting"
    given-names: "Emiel"
    affiliation: "TU Delft Dream Team Epoch"
    email: "emiel.witting@gmail.com"
  - family-names: "Heer"
    name-particle: "de"
    given-names: "Hugo"
    affiliation: "TU Delft Dream Team Epoch"
    email: "hugodeheer1234@gmail.com"
  - family-names: "Kopar"
    given-names: "Cahit Tolga"
    affiliation: "TU Delft Dream Team Epoch"
    email: "cahittolgakopar@gmail.com"
  - family-names: "Selm"
    name-particle: "van"
    given-names: "Jasper"
    affiliation: "TU Delft Dream Team Epoch"
    email: "jmvanselm@gmail.com"
title: "Detect Sleep States"
version: 1.0.0
date-released: 2023-12-06
url: "https://github.com/TeamEpochGithub/iv-q1-detect-sleep-states"
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Dependencies
- Babel ==2.12.1
- Jinja2 ==3.1.2
- MarkupSafe ==2.1.3
- Pillow ==10.0.0
- PyYAML ==6.0.1
- Pygments ==2.16.1
- QtPy ==2.4.0
- Send2Trash ==1.8.2
- anyio ==4.0.0
- argon2-cffi ==23.1.0
- argon2-cffi-bindings ==21.2.0
- arrow ==1.2.3
- astroid ==2.15.6
- asttokens ==2.4.0
- async-lru ==2.0.4
- attrs ==23.1.0
- backcall ==0.2.0
- beautifulsoup4 ==4.12.2
- bleach ==6.0.0
- certifi ==2023.7.22
- cffi ==1.15.1
- charset-normalizer ==3.2.0
- colorama ==0.4.6
- coloredlogs ==15.0.1
- comm ==0.1.4
- contourpy ==1.1.0
- cramjam ==2.7.0
- cycler ==0.11.0
- debugpy ==1.6.7.post1
- decorator ==5.1.1
- defusedxml ==0.7.1
- dill ==0.3.7
- exceptiongroup ==1.1.3
- executing ==1.2.0
- fastjsonschema ==2.18.0
- fastparquet ==2023.8.0
- flake8 ==6.1.0
- flake8-for-pycharm ==0.4.1
- fonttools ==4.42.1
- fqdn ==1.5.1
- fsspec ==2023.9.0
- greenlet ==2.0.2
- idna ==3.4
- ipykernel ==6.25.2
- ipython ==8.15.0
- ipython-genutils ==0.2.0
- ipywidgets ==8.1.0
- isoduration ==20.11.0
- isort ==5.12.0
- jedi ==0.19.0
- joblib *
- json5 ==0.9.14
- jsonpointer ==2.4
- jsonschema ==4.19.0
- jsonschema-specifications ==2023.7.1
- jupyter ==1.0.0
- jupyter-console ==6.6.3
- jupyter-events ==0.7.0
- jupyter-lsp ==2.2.0
- jupyter_client ==8.3.1
- jupyter_core ==5.3.1
- jupyter_server ==2.7.3
- jupyter_server_terminals ==0.4.4
- jupyterlab ==4.0.5
- jupyterlab-pygments ==0.2.2
- jupyterlab-widgets ==3.0.8
- jupyterlab_server ==2.24.0
- kaggle *
- kaleido ==0.2.1
- kiwisolver ==1.4.5
- lazy-object-proxy ==1.9.0
- matplotlib ==3.7.2
- matplotlib-inline ==0.1.6
- mccabe ==0.7.0
- mistune ==3.0.1
- nbclient ==0.8.0
- nbconvert ==7.8.0
- nbformat ==5.9.2
- nest-asyncio ==1.5.7
- notebook ==7.0.3
- notebook_shim ==0.2.3
- numpy ==1.25.2
- overrides ==7.4.0
- packaging ==23.1
- pandas ==2.0.3
- pandocfilters ==1.5.0
- parso ==0.8.3
- pickleshare ==0.7.5
- platformdirs ==3.10.0
- playwright ==1.37.0
- plotly ==5.16.1
- plotly_resampler ==0.9.1
- polars ==0.19.3
- pre-commit *
- prometheus-client ==0.17.1
- prompt-toolkit ==3.0.39
- psutil ==5.9.5
- pure-eval ==0.2.2
- pyarrow ==13.0.0
- pycodestyle ==2.11.0
- pycparser ==2.21
- pyee ==9.0.4
- pyflakes ==3.1.0
- pylint ==2.17.5
- pyparsing ==3.0.9
- python-dateutil ==2.8.2
- python-json-logger ==2.0.7
- pytz ==2023.3.post1
- pyzmq ==25.1.1
- qtconsole ==5.4.4
- referencing ==0.30.2
- requests ==2.31.0
- rfc3339-validator ==0.1.4
- rfc3986-validator ==0.1.1
- rpds-py ==0.10.2
- ruptures ==1.1.8
- scikit-learn ==1.3.1
- scipy ==1.11.2
- seaborn ==0.12.2
- segmentation-models-pytorch ==0.3.3
- six ==1.16.0
- sniffio ==1.3.0
- soupsieve ==2.5
- stack-data ==0.6.2
- tenacity ==8.2.3
- terminado ==0.17.1
- timm *
- tinycss2 ==1.2.1
- tomli ==2.0.1
- tomlkit ==0.12.1
- torch ==2.0.1
- torchaudio ==2.0.2
- torchsummary ==1.5.1
- torchvision ==0.15.2
- tornado ==6.3.3
- tqdm ==4.66.1
- traitlets ==5.9.0
- typing_extensions ==4.7.1
- tzdata ==2023.3
- uri-template ==1.3.0
- urllib3 ==2.0.4
- wandb *
- wcwidth ==0.2.6
- webcolors ==1.13
- webencodings ==0.5.1
- websocket-client ==1.6.2
- widgetsnbextension ==4.0.8
- wrapt ==1.15.0
