doros

Dataset of Road Object States

https://github.com/cmiller2air/doros

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.6%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: cmiller2air
  • License: agpl-3.0
  • Language: Jupyter Notebook
  • Default Branch: master
  • Size: 19.5 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme Contributing License Citation

README.md

DOROS: Dataset of Road Object States

Overview

DOROS (Dataset of Road Object States) is a novel vision dataset designed for advanced scene understanding in urban environments. It features multi-level annotations across three primary categories:

  • Agent: Key traffic entities such as pedestrians, cars, buses, and traffic lights.
  • Location: Logical spatial contexts like lanes, crosswalks, intersections, and parking lots.
  • Action: Dynamic behaviors such as braking, signaling, and stopping, allowing multiple simultaneous actions through multi-hot encoding.
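Multi-hot encoding of the Action category can be illustrated with a short sketch. The three-entry action vocabulary below is a hypothetical subset taken from the examples above; the actual DOROS class list and index order are not given here and may differ.

```python
# Hypothetical action vocabulary (assumption): the real DOROS list may be
# longer and ordered differently.
ACTIONS = ["braking", "signaling", "stopping"]
ACTION_INDEX = {name: i for i, name in enumerate(ACTIONS)}


def encode_actions(active):
    """Return a multi-hot vector with a 1 at the index of each active action."""
    vec = [0] * len(ACTIONS)
    for name in active:
        vec[ACTION_INDEX[name]] = 1  # raises KeyError for unknown action names
    return vec


def decode_actions(vec):
    """Return the action names whose bit is set in a multi-hot vector."""
    return [name for name, bit in zip(ACTIONS, vec) if bit]


# A vehicle that is braking and stopping at the same time sets two bits.
print(encode_actions(["braking", "stopping"]))
# An object with no action gets the all-zero vector.
print(encode_actions([]))
```

Unlike one-hot encoding, more than one position may be set at once, and the all-zero vector is a valid label for an object with no action.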

The dataset addresses gaps in existing resources by providing:

  • Comprehensive multi-level annotations.
  • Diverse environmental conditions (e.g., rain, snow, fog, nighttime).
  • High-quality, assisted-manual annotations using Segment Anything and Grounding-DINO.

The dataset is particularly suitable for training and evaluating models capable of understanding complex urban traffic scenarios.

Key Features

  • Dataset Size: 49,296 high-resolution images (480×1280 RGB), sampled from over 1,000,000 raw frames.
  • Geographical Diversity: Data collected from multiple South Korean cities including Seoul, Daejeon, Hwaseong, Sejong, and Ulsan.
  • Multi-Level Annotations: Each object has exactly one ‘Agent’ and ‘Location’ label, with optional ‘Action’ labels.
  • Class Distribution: Balanced class representation across training and validation splits, as shown in Figure 1 of the paper.
  • Assisted Annotation: Initial annotations generated using Segment Anything and Grounding-DINO, refined by human annotators.
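The multi-level annotation rule (exactly one Agent, exactly one Location, zero or more Actions per object) could be represented as follows. This is only a sketch: the `ObjectLabel` class and its field names are hypothetical and do not reflect the dataset's actual file format.

```python
from dataclasses import dataclass, field
from typing import FrozenSet


@dataclass(frozen=True)
class ObjectLabel:
    """Hypothetical per-object label record (not the DOROS on-disk format)."""

    agent: str      # exactly one Agent label, e.g. "car"
    location: str   # exactly one Location label, e.g. "crosswalk"
    actions: FrozenSet[str] = field(default_factory=frozenset)  # optional

    def __post_init__(self):
        # Enforce the "exactly one" rule for Agent and Location.
        if not self.agent:
            raise ValueError("every object needs an Agent label")
        if not self.location:
            raise ValueError("every object needs a Location label")


# An object with two simultaneous actions.
car = ObjectLabel("car", "lane", frozenset({"braking", "signaling"}))
# An object with no action is still valid.
light = ObjectLabel("traffic light", "intersection")
print(car)
print(light)
```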

Applications

DOROS is ideal for:

  • Object detection and segmentation.
  • Action recognition in traffic scenarios.
  • Context-aware autonomous driving models.

Download

The dataset is available for download:

Download DOROS Dataset

Training Code

The training code leverages the Ultralytics YOLOv8 framework in segmentation mode. You can find detailed instructions and examples in the repository.

Experimental Results

Baseline Training Results

| Model Size | Params | Configuration | mAP-comb(mask) (50 / 50:95) |
|------------|--------|---------------|-----------------------------|
| n | 3.27M | Baseline | 6.09 / 10.44 |
| | | + no flip aug | 7.05 / 12.15 |
| | | + large img | 7.29 / 12.35 |
| s | 11.80M | Baseline | 9.63 / 17.02 |
| | | + no flip aug | 9.89 / 17.61 |
| | | + large img | 10.66 / 18.76 |
| m | 27.26M | Baseline | 11.07 / 19.74 |
| | | + no flip aug | 10.67 / 19.93 |
| | | + large img | 12.22 / 21.81 |
| l | 45.96M | Baseline | 12.24 / 22.11 |
| | | + no flip aug | 13.09 / 23.76 |
| | | + large img | 13.95 / 24.44 |
| x | 71.78M | Baseline | 12.87 / 23.15 |
| | | + no flip aug | 13.90 / 25.32 |
| | | + large img | 14.54 / 25.55 |
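To compare configurations, the per-model change from one row to the next can be computed directly from the table. The sketch below copies the table values verbatim; the dictionary layout and the `deltas` helper are illustrative names, and the two values in each pair are kept in the order the table reports them.

```python
# (first value, second value) per configuration, copied from the table above.
# Row order per model: Baseline, + no flip aug, + large img.
RESULTS = {
    "n": [(6.09, 10.44), (7.05, 12.15), (7.29, 12.35)],
    "s": [(9.63, 17.02), (9.89, 17.61), (10.66, 18.76)],
    "m": [(11.07, 19.74), (10.67, 19.93), (12.22, 21.81)],
    "l": [(12.24, 22.11), (13.09, 23.76), (13.95, 24.44)],
    "x": [(12.87, 23.15), (13.90, 25.32), (14.54, 25.55)],
}


def deltas(rows, start, end):
    """Return (end - start) for both reported values of one model."""
    return tuple(round(e - s, 2) for s, e in zip(rows[start], rows[end]))


for size, rows in RESULTS.items():
    no_flip = deltas(rows, 0, 1)    # Baseline -> + no flip aug
    large = deltas(rows, 1, 2)      # + no flip aug -> + large img
    print(f"{size}: no-flip {no_flip[0]:+.2f} / {no_flip[1]:+.2f}, "
          f"large-img {large[0]:+.2f} / {large[1]:+.2f}")
```

A negative number marks a configuration step that lowered the score for that model size.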

Experimental Setup

  • Hardware: NVIDIA GeForce RTX 4090 GPU, AMD EPYC 9124 16-core CPU.
  • Framework: Ultralytics YOLOv8.
  • Augmentations: Mosaic and random HSV.
  • Optimizer: Stochastic Gradient Descent (SGD).

Key Findings

  • Baseline Configurations:
    • Standard training with default augmentations.
    • Training without flip augmentations improved mAP for most model sizes in the results table; only the m model's first reported value dropped slightly (11.07 to 10.67).
    • Large image sizes improved performance by capturing finer spatial details.
  • Evaluation Metrics: Combined mAP(mask), which evaluates predictions across multi-level annotations.
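The three configurations can be sketched as Ultralytics training arguments. Several details here are assumptions rather than facts from this README: the dataset file name `doros.yaml`, the large image size of 1280, and the reading of "+ large img" as being applied on top of "+ no flip aug". Mosaic and HSV augmentation are left at the Ultralytics defaults, which enable them.

```python
DEFAULT_IMGSZ = 640  # Ultralytics default; the sizes actually used are not stated


def build_train_args(data_yaml, config="baseline", large_imgsz=1280):
    """Build keyword arguments for YOLO.train() for one configuration.

    `large_imgsz` is a hypothetical value chosen for illustration.
    """
    if config not in ("baseline", "no_flip", "large_img"):
        raise ValueError(f"unknown configuration: {config}")
    args = {"data": data_yaml, "optimizer": "SGD", "imgsz": DEFAULT_IMGSZ}
    if config in ("no_flip", "large_img"):
        args["fliplr"] = 0.0  # disable horizontal flip augmentation
    if config == "large_img":
        args["imgsz"] = large_imgsz  # assumed cumulative with no-flip
    return args


def train(model_size="n", config="baseline", data_yaml="doros.yaml"):
    """Train a YOLOv8 segmentation model; requires the ultralytics package."""
    from ultralytics import YOLO  # imported lazily so the helper above runs without it

    model = YOLO(f"yolov8{model_size}-seg.pt")  # pretrained segmentation weights
    return model.train(**build_train_args(data_yaml, config))


print(build_train_args("doros.yaml", "large_img"))
```

Calling `train("s", "no_flip")` would start a run for the s model with flipping disabled, provided `doros.yaml` points at a dataset in the format Ultralytics expects.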

For detailed ablation studies, refer to the full paper.

Acknowledgments

This work builds on tools and resources from the computer vision community, including:

  • Ultralytics YOLOv8 for training code.
  • Segment Anything and Grounding-DINO for assisted annotations.

We thank all contributors and collaborators for their support.

Owner

  • Login: cmiller2air
  • Kind: user

Citation (CITATION.cff)

# This CITATION.cff file was generated with https://bit.ly/cffinit

cff-version: 1.2.0
title: Ultralytics YOLO
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Glenn
    family-names: Jocher
    affiliation: Ultralytics
    orcid: 'https://orcid.org/0000-0001-5950-6979'
  - given-names: Ayush
    family-names: Chaurasia
    affiliation: Ultralytics
    orcid: 'https://orcid.org/0000-0002-7603-6750'
  - family-names: Qiu
    given-names: Jing
    affiliation: Ultralytics
    orcid: 'https://orcid.org/0000-0003-3783-7069'
repository-code: 'https://github.com/ultralytics/ultralytics'
url: 'https://ultralytics.com'
license: AGPL-3.0
version: 8.0.0
date-released: '2023-01-10'

GitHub Events

Total
  • Delete event: 1
  • Push event: 4
  • Create event: 3
Last Year
  • Delete event: 1
  • Push event: 4
  • Create event: 3

Dependencies

pyproject.toml pypi
  • matplotlib >=3.3.0
  • numpy >=1.23.5,<2.0.0
  • opencv-python >=4.6.0
  • pandas >=1.1.4
  • pillow >=7.1.2
  • psutil *
  • py-cpuinfo *
  • pyyaml >=5.3.1
  • requests >=2.23.0
  • scipy >=1.4.1
  • seaborn >=0.11.0
  • torch >=1.8.0
  • torchvision >=0.9.0
  • tqdm >=4.64.0
  • ultralytics-thop >=2.0.0