
In-Vehicle Driver State Analysis Using Image Segmentation

https://github.com/matejfric/in-vehicle-driver-state-analysis

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.2%) to scientific vocabulary

Keywords

cnn
Last synced: 6 months ago

Repository

In-Vehicle Driver State Analysis Using Image Segmentation

Basic Info
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Topics
cnn
Created over 1 year ago · Last pushed 11 months ago
Metadata Files
Readme · License · Citation

README.md

In-Vehicle Driver State Analysis

This repository contains supplementary source code for the diploma thesis In-Vehicle Driver State Analysis Using Image Segmentation.

(Pipeline animation: model results on the Driver Monitoring Dataset [1].)


1. Repository Organization

  • model/ - Library code.
  • notebooks/ - Runnable code, primarily Jupyter notebooks, and Python scripts used to orchestrate notebook execution with varying parameters.
    • notebooks/example/ - Demonstrates an inference pipeline on sample data. Instructions for running the pipeline are provided here.
    • notebooks/datasets/ - Dataset preprocessing.
    • notebooks/eda/ - Exploratory data analysis.
    • notebooks/sam/ - Segment Anything Model 2 (SAM 2) inference (generation of ground truth masks). This module uses a different virtual environment than the rest of the project, as described here.
    • notebooks/semantic_segmentation/ - Semantic segmentation model training, evaluation, and inference.
    • notebooks/tae/ - Training and evaluation of the Temporal Autoencoder (TAE).
    • notebooks/stae/ - Training and evaluation of the Spatio-Temporal Autoencoder (STAE).
    • notebooks/memory_map/ - Exports training data to memory-mapped files (np.memmap) to speed up training and reduce CPU usage, at the cost of increased disk space.
    • notebooks/clip/ - OpenAI CLIP model (this topic is not covered in the thesis).
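The `notebooks/memory_map/` step trades disk space for faster, lower-CPU data loading. A minimal sketch of that idea with plain NumPy (the file name and array shapes are illustrative, not the project's actual layout):

```python
import os
import tempfile

import numpy as np

# Illustrative shapes: a small stack of grayscale frames.
n_frames, h, w = 8, 64, 64
path = os.path.join(tempfile.gettempdir(), "frames.dat")

# Write: create the memory-mapped file, fill it, flush to disk.
mm = np.memmap(path, dtype=np.float32, mode="w+", shape=(n_frames, h, w))
mm[:] = np.random.rand(n_frames, h, w).astype(np.float32)
mm.flush()

# Read: reopen without loading everything into RAM; slicing reads lazily,
# so a DataLoader can pull batches straight from disk via the page cache.
frames = np.memmap(path, dtype=np.float32, mode="r", shape=(n_frames, h, w))
batch = np.asarray(frames[:4])  # materialize only the slice we need
print(batch.shape)  # (4, 64, 64)
```

Because `np.memmap` stores raw bytes, the dtype and shape must be recorded elsewhere (e.g. in a sidecar file) to reopen the array correctly.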

2. Development Environment

  • Ubuntu 24.04
  • CUDA 12.5
  • Python 3.12

```bash
conda env create -f environment.yml -n driver-state-analysis
pip install -e .
pre-commit install
```

3. What can we do with an RGB image?

The following diagram illustrates a (non-exhaustive) list of different tasks that can be performed using an RGB image as input:

```mermaid
mindmap
  root)RGB Image)
    (GT Mask)
      YOLOv8x
      Segment Anything 2
    (Semantic Segmentation)
      EfficientNet
      ResNet
      U-Net
      UNet++
    (Monocular Depth Estimation)
      Image
        MiDaS
        Marigold
        Depth Anything
        Depth Anything 2
      Video
        Video Depth Anything
        DepthCrafter
    (Edge Detection)
      Canny
      Sobel
      Laplacian
    (Pose Estimation)
      Graph NN
      YOLOv11
    LBP
    HOG
    CLIP
      Text
    Skin Segmentation
```

4. References

  1. Ortega, J., Kose, N., Cañas, P., Chao, M. A., Unnervik, A., Nieto, M., Otaegui, O., & Salgado, L. (2020). DMD: A Large-Scale Multi-Modal Driver Monitoring Dataset for Attention and Alertness Analysis. In: A. Bartoli & A. Fusiello (eds), Computer Vision - ECCV 2020 Workshops (pp. 387–405). Springer International Publishing.
  2. Ansel, J., Yang, E., He, H., Gimelshein, N., Jain, A., Voznesensky, M., Bao, B., Bell, P., Berard, D., Burovski, E., Chauhan, G., Chourdia, A., Constable, W., Desmaison, A., DeVito, Z., Ellison, E., Feng, W., Gong, J., Gschwind, M., Hirsh, B., Huang, S., Kalambarkar, K., Kirsch, L., Lazos, M., Lezcano, M., Liang, Y., Liang, J., Lu, Y., Luk, C., Maher, B., Pan, Y., Puhrsch, C., Reso, M., Saroufim, M., Siraichi, M. Y., Suk, H., Suo, M., Tillet, P., Wang, E., Wang, X., Wen, W., Zhang, S., Zhao, X., Zhou, K., Zou, R., Mathews, A., Chanan, G., Wu, P., & Chintala, S. (2024). PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation [Conference paper]. 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2 (ASPLOS '24). https://doi.org/10.1145/3620665.3640366
  3. Iakubovskii, P. (2019). Segmentation Models PyTorch (Version 0.4.0) [Computer software]. GitHub. https://github.com/qubvel/segmentation_models.pytorch
  4. Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Ma, C., Jernite, Y., Plu, J., Xu, C., Le Scao, T., Gugger, S., Drame, M., Lhoest, Q., & Rush, A. M. (2020). Transformers: State-of-the-Art Natural Language Processing [Conference paper]. 38–45. https://www.aclweb.org/anthology/2020.emnlp-demos.6
  5. Ravi, N., Gabeur, V., Hu, Y.-T., Hu, R., Ryali, C., Ma, T., Khedr, H., Rädle, R., Rolland, C., Gustafson, L., Mintun, E., Pan, J., Alwala, K. V., Carion, N., Wu, C.-Y., Girshick, R., Dollár, P., & Feichtenhofer, C. (2024). SAM 2: Segment anything in images and videos (arXiv:2408.00714). arXiv. https://arxiv.org/abs/2408.00714
  6. Yang, L., Kang, B., Huang, Z., Zhao, Z., Xu, X., Feng, J., & Zhao, H. (2024). Depth Anything V2 (arXiv:2406.09414). arXiv. https://arxiv.org/abs/2406.09414
  7. Jocher, G., Qiu, J., & Chaurasia, A. (2023). Ultralytics YOLO (Version 8.0.0) [Computer software]. https://github.com/ultralytics/ultralytics
  8. ONNX Runtime Contributors. (2021). ONNX Runtime (Version 1.21.0) [Computer software]. https://onnxruntime.ai/
  9. ONNX Contributors. (2019). ONNX: Open Neural Network Exchange (Version 1.17.0) [Computer software]. https://onnx.ai/
  10. Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., & Sutskever, I. (2021). Learning Transferable Visual Models From Natural Language Supervision (arXiv:2103.00020). arXiv. https://arxiv.org/abs/2103.00020

Owner

  • Name: Matěj Frič
  • Login: matejfric
  • Kind: user
  • Location: Ostrava
  • Company: VSB-TUO

I am a hardworking university student pursuing a degree in Computation and Applied Mathematics. I am open to various internship opportunities.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Frič
    given-names: Matěj
    orcid: https://orcid.org/0009-0000-9885-1521
title: "Supplementary Material: In-Vehicle Driver State Analysis Using Image Segmentation"
version: 1.0.0
license: MIT
date-released: 2025-04-30
repository-code: "https://github.com/matejfric/in-vehicle-driver-state-analysis"
doi: 10.5281/zenodo.15242002

GitHub Events

Total
  • Release event: 1
  • Public event: 1
  • Push event: 14
  • Create event: 1
Last Year
  • Release event: 1
  • Public event: 1
  • Push event: 14
  • Create event: 1

Dependencies

notebooks/example/Dockerfile docker
  • python 3.12-slim build
environment.yml pypi
  • accelerate ==1.3.0
  • albucore ==0.0.23
  • albumentations ==2.0.5
  • ansicolors ==1.1.8
  • anyio ==4.6.2.post1
  • appdirs ==1.4.4
  • argon2-cffi ==23.1.0
  • argon2-cffi-bindings ==21.2.0
  • arrow ==1.3.0
  • async-lru ==2.0.5
  • babel ==2.17.0
  • backoff ==2.2.1
  • beautifulsoup4 ==4.13.3
  • bleach ==6.2.0
  • boto3 ==1.35.44
  • botocore ==1.35.44
  • build ==1.2.2.post1
  • cachecontrol ==0.14.2
  • cleo ==2.1.0
  • clip ==1.0
  • coloredlogs ==15.0.1
  • crashtest ==0.4.1
  • dacite ==1.6.0
  • dagshub ==0.3.39
  • dagshub-annotation-converter ==0.1.1
  • dataclasses-json ==0.6.7
  • decord ==0.6.0
  • defusedxml ==0.7.1
  • diffusers ==0.32.2
  • dulwich ==0.22.8
  • easydict ==1.13
  • efficientnet-pytorch ==0.7.1
  • einops ==0.8.0
  • fastjsonschema ==2.20.0
  • ffmpeg-python ==0.2.0
  • findpython ==0.6.2
  • flatbuffers ==25.2.10
  • fqdn ==1.5.1
  • ftfy ==6.3.1
  • fusepy ==3.0.1
  • future ==1.0.0
  • gql ==3.5.0
  • h11 ==0.14.0
  • httpcore ==1.0.6
  • httpx ==0.27.2
  • huggingface-hub ==0.26.0
  • humanfriendly ==10.0
  • hypothesis ==6.118.6
  • imageio-ffmpeg ==0.6.0
  • iniconfig ==2.0.0
  • installer ==0.7.0
  • ipywidgets ==8.1.5
  • isoduration ==20.11.0
  • jaraco-classes ==3.4.0
  • jaraco-context ==6.0.1
  • jaraco-functools ==4.1.0
  • jeepney ==0.9.0
  • jmespath ==1.0.1
  • json5 ==0.10.0
  • jsonpointer ==3.0.0
  • jsonschema ==4.23.0
  • jsonschema-specifications ==2024.10.1
  • jupyter ==1.1.1
  • jupyter-console ==6.6.3
  • jupyter-events ==0.12.0
  • jupyter-lsp ==2.2.5
  • jupyter-server ==2.15.0
  • jupyter-server-terminals ==0.5.3
  • jupyterlab ==4.3.6
  • jupyterlab-pygments ==0.3.0
  • jupyterlab-server ==2.27.3
  • jupyterlab-widgets ==3.0.13
  • jupytext ==1.17.0
  • keyring ==25.6.0
  • lxml ==5.3.0
  • markdown-it-py ==3.0.0
  • marshmallow ==3.23.0
  • mdit-py-plugins ==0.4.2
  • mdurl ==0.1.2
  • mistune ==3.1.3
  • ml-dtypes ==0.5.0
  • model ==0.1.0
  • more-itertools ==10.6.0
  • msgpack ==1.1.0
  • munch ==4.0.0
  • mypy-extensions ==1.0.0
  • nbclient ==0.10.0
  • nbconvert ==7.16.6
  • nbformat ==5.10.4
  • notebook ==7.3.3
  • notebook-shim ==0.2.4
  • nvidia-cublas-cu12 ==12.4.5.8
  • nvidia-cuda-cupti-cu12 ==12.4.127
  • nvidia-cuda-nvrtc-cu12 ==12.4.127
  • nvidia-cuda-runtime-cu12 ==12.4.127
  • nvidia-cudnn-cu12 ==9.1.0.70
  • nvidia-cufft-cu12 ==11.2.1.3
  • nvidia-curand-cu12 ==10.3.5.147
  • nvidia-cusolver-cu12 ==11.6.1.9
  • nvidia-cusparse-cu12 ==12.3.1.170
  • nvidia-nccl-cu12 ==2.21.5
  • nvidia-nvjitlink-cu12 ==12.4.127
  • nvidia-nvtx-cu12 ==12.4.127
  • onnx ==1.17.0
  • onnxruntime-gpu ==1.21.0
  • onnxscript ==0.1.0.dev20241110
  • overrides ==7.7.0
  • pandocfilters ==1.5.1
  • papermill ==2.6.0
  • pathvalidate ==3.2.1
  • pbs-installer ==2025.2.12
  • pkginfo ==1.12.1.2
  • pluggy ==1.5.0
  • poetry ==2.1.1
  • poetry-core ==2.1.1
  • pre-commit ==4.1.0
  • pretrainedmodels ==0.7.4
  • pycocotools ==2.0.8
  • pydantic ==2.10.6
  • pydantic-core ==2.27.2
  • pyproject-hooks ==1.2.0
  • pytest ==8.3.3
  • python-graphviz ==0.20.3
  • python-json-logger ==3.3.0
  • rapidfuzz ==3.12.2
  • referencing ==0.35.1
  • regex ==2024.9.11
  • requests-toolbelt ==1.0.0
  • rfc3339-validator ==0.1.4
  • rfc3986-validator ==0.1.1
  • rich ==13.9.2
  • rpds-py ==0.20.0
  • ruff ==0.9.7
  • s3transfer ==0.10.3
  • safetensors ==0.4.5
  • sam2util ==1.1.3
  • seaborn ==0.13.2
  • secretstorage ==3.3.3
  • segmentation-models-pytorch ==0.4.0
  • send2trash ==1.8.3
  • shellingham ==1.5.4
  • simsimd ==6.2.1
  • sniffio ==1.3.1
  • sortedcontainers ==2.4.0
  • soupsieve ==2.6
  • stringzilla ==3.12.1
  • supervision ==0.25.1
  • sympy ==1.13.1
  • tabulate ==0.9.0
  • tenacity ==9.0.0
  • terminado ==0.18.1
  • timm ==0.9.7
  • tinycss2 ==1.4.0
  • tokenizers ==0.20.1
  • tomlkit ==0.13.2
  • torch ==2.5.1
  • torchinfo ==1.8.0
  • torchview ==0.2.6
  • transformers ==4.46.0
  • treelib ==1.7.0
  • trove-classifiers ==2025.3.3.18
  • types-python-dateutil ==2.9.0.20241206
  • typing-extensions ==4.12.2
  • typing-inspect ==0.9.0
  • uri-template ==1.3.0
  • virtualenv ==20.29.2
  • webcolors ==24.11.1
  • webencodings ==0.5.1
  • widgetsnbextension ==4.0.13
  • xformers ==0.0.29.post1
  • zstandard ==0.23.0
model/depth_anything/requirements.txt pypi
  • decord *
  • easydict *
  • einops *
  • imageio *
  • imageio-ffmpeg *
  • matplotlib *
  • numpy *
  • opencv-python *
  • pillow *
  • torch *
  • torchvision *
  • tqdm *
  • xformers *
model/setup.py pypi
notebooks/example/requirements.txt pypi
  • jupyterlab ==4.4.0
  • matplotlib ==3.10.1
  • mlflow ==2.21.3
  • numpy ==2.2.4
  • onnxruntime ==1.21.1
  • opencv-python-headless ==4.11.0.86
  • pandas ==2.2.3
  • pillow ==11.2.1
  • tqdm ==4.67.1
  • transformers ==4.51.3
notebooks/sam/requirements.txt pypi
  • ffmpeg-python *
  • ipykernel *
  • matplotlib *
  • ninja *
  • papermill *
  • sam2util *
  • setuptools *
  • supervision *
  • ultralytics *
  • wheel *
pyproject.toml pypi