Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: researchgate.net
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.3%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: bit-bots
  • License: mit
  • Language: Python
  • Default Branch: master
  • Size: 32 MB
Statistics
  • Stars: 17
  • Watchers: 13
  • Forks: 0
  • Open Issues: 2
  • Releases: 0
Created over 5 years ago · Last pushed over 1 year ago

README.md

TORSO-21 Dataset: Typical Objects in RoboCup Soccer 2021

This repository contains the scripts and additional information for the TORSO-21 Dataset, a dataset for the RoboCup Humanoid Soccer domain with images from both the Humanoid League and the Standard Platform League. We provide two image collections. The first consists of images from various real-world locations, recorded by different robots. It includes annotations for the ball, goalposts, robots (including team color and player number), lines, field edge, and three types of line intersections. The second collection is generated in the Webots simulator, which is used for the official RoboCup Virtual Humanoid Soccer Competition. In addition to the labels of the first collection, it provides labels for the complete goal, depth images, 6D poses for all labels, and the camera location in the field of play.

Meta Data

Real World

| # of Images              | 10464 |
|--------------------------|-------|
| # of Balls               | 6081  |
| # of Robots              | 7641  |
| # of Goalposts           | 7888  |
| # of L-Intersections     | 10375 |
| # of T-Intersections     | 8659  |
| # of X-Intersections     | 7268  |
| # of Field Segmentations | 10464 |
| # of Line Segmentations  | 10464 |

| Robot Team Colors        |       |
|--------------------------|-------|
| # of Blue Robots         | 1277  |
| # of Red Robots          | 1917  |
| # of Unknown Robots      | 4447  |

| Robot Player Numbers     |       |
|--------------------------|-------|
| # of Robots w/out #      | 6526  |
| # of Robots with #1      | 229   |
| # of Robots with #2      | 162   |
| # of Robots with #3      | 276   |
| # of Robots with #4      | 309   |
| # of Robots with #5      | 42    |
| # of Robots with #6      | 97    |

Simulation

# of simulated images: 24,000

Example images

Real-World

[Example images from the real-world collection]

With annotations marked

[Example images with annotations marked]

With segmentations of lines and field

[Example images with line and field segmentations]

Simulation

[Example images from the simulation collection]

With annotations, segmentation mask and depth image

[Example images with annotations, segmentation mask, and depth image]

Download Dataset and Labels

Manual Download

The images and annotations can be downloaded here: https://data.bit-bots.de/TORSO-21/

Automated Download

The data can also be downloaded with the following script (use --help for further options):

```shell
./scripts/download_dataset.py --all
```

YOLO Label Format

If you want to train a YOLO, you can use the script provided in this repository to generate the labels.
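As a rough illustration of what such a conversion involves, the sketch below turns one bounding box into a line of the common YOLO text format (`class cx cy w h`, all normalized to the image size). It assumes the annotation `vector` holds two opposite corners in pixels, as in the annotation example in this README. The function name `to_yolo_line` and the class-id mapping are hypothetical and not taken from the repository's script.

```python
# Hypothetical class ids; the script in this repository may use a different mapping.
CLASS_IDS = {"ball": 0, "goalpost": 1, "robot": 2}


def to_yolo_line(class_id, vector, width, height):
    """Convert a two-corner pixel box into a normalized YOLO label line."""
    (x1, y1), (x2, y2) = vector
    # The corners may be given in any order, so sort each axis first.
    left, right = sorted((x1, x2))
    top, bottom = sorted((y1, y2))
    cx = (left + right) / 2 / width    # box center, relative to image width
    cy = (top + bottom) / 2 / height   # box center, relative to image height
    w = (right - left) / width         # box width, relative to image width
    h = (bottom - top) / height        # box height, relative to image height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


line = to_yolo_line(CLASS_IDS["robot"], [[42, 26], [81, 98]], 1920, 1080)
print(line)
```

Annotations that are not in the image have no `vector` and would have to be skipped before calling such a function.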

Structure

The repository structure is as follows:

```
data                      # contains the annotations and images
  reality                 # the images recorded in reality
    train                 # the training set
      annotations.yaml    # the annotations in yaml format
      images/             # a folder containing all the images of the training set
      segmentations/      # a folder containing all the segmentation masks of the training set
    test                  # the test set
      ...                 # it is structured in the same way as the training set
  simulation              # the images recorded in simulation
    train                 # the training set
      annotations.yaml    # the annotations in yaml format
      depth/              # a folder containing all the depth images of the training set
      images/             # a folder containing all the images of the training set
      segmentations/      # a folder containing all the segmentation masks of the training set
    test                  # the test set
      ...                 # it is structured in the same way as the training set
scripts                   # some useful scripts, see below for details
...
```

The annotations are in the following format:

```yaml
images:
  130-16_02_2018__11_16_34_0000_upper.png:
    width: 1920
    height: 1080
    annotations:
      - blurred: true
        color: unknown # possible values {blue, red, unknown}
        concealed: false
        in_image: true
        number: null # possible values {null, 1, 2, 3, 4, 5, 6}
        type: robot
        vector:
          - - 42 # x value
            - 26 # y value
          - - 81
            - 98
        pose: # The pose of the annotated object, only available in simulation
          position:
            x: 0
            y: 0
            z: 0
          orientation:
            x: 0
            y: 0
            z: 0
            w: 0
        motion: standing
      - in_image: false
        type: ball
    metadata: # The keys should be like this but do not need to be present for all images
      fov: 42
      location: "foobay"
      tags: ["natural_light", "telstar18", "do_not_use"]
      imageset_id: 130
      camera_pose: # The pose of the annotated object, only available in simulation
        position:
          x: 0
          y: 0
          z: 0
        orientation:
          x: 0
          y: 0
          z: 0
          w: 0
      Natural light: False
      League: HSL
      ...
```
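To make the nesting concrete, here is a minimal sketch that walks a structure of this shape and counts how often each annotation type is visible. To stay dependency-free it uses a hand-written dictionary with the same layout that a YAML loader such as PyYAML's `yaml.safe_load` would return for the annotations file. The helper name `count_visible` is hypothetical.

```python
from collections import Counter

# Same nesting as the annotation format above, reduced to a few keys.
data = {
    "images": {
        "130-16_02_2018__11_16_34_0000_upper.png": {
            "width": 1920,
            "height": 1080,
            "annotations": [
                {"type": "robot", "in_image": True, "color": "unknown",
                 "number": None, "blurred": True, "concealed": False,
                 "vector": [[42, 26], [81, 98]]},
                {"type": "ball", "in_image": False},
            ],
        },
    },
}


def count_visible(data):
    """Count annotations per type, ignoring those marked as not in the image."""
    counts = Counter()
    for image in data["images"].values():
        for annotation in image.get("annotations", []):
            if annotation.get("in_image"):
                counts[annotation["type"]] += 1
    return counts


counts = count_visible(data)
print(dict(counts))  # {'robot': 1}
```

Note that an entry with `in_image: false` carries no `vector`, so code that reads coordinates should check `in_image` first.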

Scripts

Set up environment

Follow these instructions to set up the dependencies for the scripts used for visualization and creation of the dataset.

  1. Install the package manager Poetry as described here. This prevents dependency conflicts and ensures that the correct versions of the dependencies are installed.

  2. Clone the repository:

    ```shell
    git clone https://github.com/bit-bots/TORSO_21_dataset.git
    ```

  3. Move into the repository and install the dependencies

- without the optional dependencies (these are only needed for dataset creation):

    ```shell
    cd TORSO_21_dataset && poetry install --without=dev --no-root
    ```

- with optional dependencies (for dataset creation):

    ```shell
    cd TORSO_21_dataset && poetry install --no-root
    ```

Usage

To run the tools you need to enter the poetry environment:

```shell
poetry shell
```

Alternatively, you can use poetry run ./scripts/<file> to run the scripts without sourcing.

Visualize annotations

To visualize the annotations, run the following two commands to pickle and show the annotations in the poetry environment:

```shell
./scripts/pickle_annotations.py data/reality/train/annotations.yaml
./scripts/viz_annotations.py data/reality/train/annotations.pkl
```

Statistics and checks

metadata_statistics.py

Generates metadata statistics from an annotations file, i.e. how often which metadata type occurs.

annotation_statistics.py

This script generates statistics about the annotations, i.e. how often each annotation occurs per image. Its first argument is the annotation file to generate statistics for.

sanity_check.py

Sanity-checks the annotations, i.e. checks whether a label type is marked as both in the image and not in the image for the same image, and whether the field boundary is contained.
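The two checks can be pictured with the following minimal sketch, which operates on the annotation list of a single image. It is an illustration under assumptions, not the logic of `sanity_check.py` itself: the function names are hypothetical, and the type name `"field edge"` is assumed from the label list in the introduction.

```python
def find_contradictions(annotations):
    """Return the types that are marked both as in the image and as not in the image."""
    seen = {}
    for annotation in annotations:
        seen.setdefault(annotation["type"], set()).add(bool(annotation["in_image"]))
    return sorted(t for t, flags in seen.items() if flags == {True, False})


def has_field_boundary(annotations, type_name="field edge"):
    """Check that at least one annotation of the (assumed) field boundary type exists."""
    return any(annotation["type"] == type_name for annotation in annotations)


sample = [
    {"type": "ball", "in_image": True},
    {"type": "ball", "in_image": False},   # contradicts the entry above
    {"type": "robot", "in_image": True},
]
print(find_contradictions(sample))  # ['ball']
print(has_field_boundary(sample))   # False
```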

YOLO Evaluation

A simple script that runs a YOLO model against the test dataset and calculates the IoU (intersection over union) metrics.

```shell
./scripts/yolo_eval.py --yolo-path /path/to/yolo_folder --collection data/reality/test
```
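For readers unfamiliar with the metric, the sketch below shows the standard IoU of two axis-aligned boxes given as `(x1, y1, x2, y2)`. It only illustrates the definition; `yolo_eval.py` may compute its metrics differently (for example on masks instead of boxes), and the function name `iou` is chosen here for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap 1, union 7
print(iou((0, 0, 2, 2), (5, 5, 6, 6)))  # 0.0, the boxes are disjoint
```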

Further scripts

To use these scripts, make sure to install all dependencies with poetry install (see Set up environment).

download_and_merge_data.py

This script downloads multiple image sets and annotations from the ImageTagger. The imagesets and the annotation format are defined at the top of the file. Its output is a folder data_raw in the root of this repository that contains all image files. To avoid conflicting names, every filename is prepended with its dataset id. Additionally, a file annotations.yaml is created that contains a dict mapping set ids to their metadata and a dict mapping image names to their labels.
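The renaming step can be pictured as follows. This is a sketch, not the code of `download_and_merge_data.py`: the `-` separator is inferred from the example filename `130-16_02_2018__11_16_34_0000_upper.png` (image set 130) in the annotation format above, and the function name and the second set id are hypothetical.

```python
def prefixed_name(set_id, filename):
    """Prepend the image set id so equal filenames from different sets cannot collide."""
    return f"{set_id}-{filename}"


# Two image sets that both contain a file with the same name.
downloads = [(130, "16_02_2018__11_16_34_0000_upper.png"),
             (131, "16_02_2018__11_16_34_0000_upper.png")]
merged = [prefixed_name(set_id, name) for set_id, name in downloads]
print(merged)
```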

download_from_imagetagger.py

This is just a verbatim copy of the ImageTagger download script. Its API is used by download_and_merge_data.py, so it is not necessary to use this script directly.

annotation_filter.py

This script filters the annotations contained in data_raw/annotations.yaml to only include the images in the data folder and creates a data/annotations.yaml file.

imagetagger_prepare_script.py

This script prepares the files in data for the ImageTagger, i.e. zips the images and converts the annotations to the upload format.

line_label_tool.py

This script can be used to label lines.

convert_pascal_voc.py

This script converts labels from the Pascal VOC XML format to the yaml format as defined above.

add_metadata.py

Creates the file data/annotations_with_metadata.yaml from data/annotations.yaml and data/metadata.csv. annotations.yaml can be downloaded from the ImageTagger, metadata.csv has to be manually created.

fix_segmentations.py

This script was used to resolve an issue with the segmentation images of the reality collection. It fixes incorrect color values of the field class caused by anti-aliasing and an old bug in the line_label_tool.

Variational Autoencoder

The variational autoencoder we used is based on noctrog's conv-vae.

The training code for the autoencoder is located in scripts/vae/.

vae/train.py

This file runs the training of the VAE. More details are available by running vae/train.py -h.

vae/reconstruct.py

This script runs the autoencoder on a given input and shows the reconstruction of the image. More details are available by running vae/reconstruct.py -h.

vae/embeddings.py

This script runs the VAE recursively on all images inside a given folder and saves their latent space representations and reconstruction errors to a file. More details are available by running vae/embeddings.py -h.

vae/distances.py

Plots n'th neighbors in the latent space of a given image. More details are available by running vae/distances.py -h.
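As an illustration of what a neighbor lookup in the latent space means, the sketch below ranks images by their distance to a query image. It assumes the embeddings are available as a mapping from image name to latent vector and that Euclidean distance is used; both are assumptions, as are the function names, and the actual script may use a different file layout or metric.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two latent vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_neighbors(embeddings, query, n=3):
    """Return the names of the n images closest to `query` in latent space."""
    query_vec = embeddings[query]
    distances = [(euclidean(query_vec, vec), name)
                 for name, vec in embeddings.items() if name != query]
    return [name for _, name in sorted(distances)[:n]]


# Toy two-dimensional "latent space"; the real latent vectors are much larger.
embeddings = {"a.png": [0.0, 0.0], "b.png": [1.0, 0.0],
              "c.png": [5.0, 5.0], "d.png": [0.0, 2.0]}
print(nearest_neighbors(embeddings, "a.png", n=2))  # ['b.png', 'd.png']
```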

vae/duplicates.py

Creates a yaml file containing three lists containing:

  • The images that survived the pruning
  • The images that got selected due to the high reconstruction error
  • The images that will be removed from the dataset

More details are available by running vae/duplicates.py -h.
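One way such a three-way split could be produced is sketched below. This is a simplified greedy scheme under stated assumptions, not the algorithm of `vae/duplicates.py`: the thresholds, the processing order, the Euclidean metric, and the dictionary keys are all hypothetical. Images with a high reconstruction error are kept unconditionally, and the remaining images are dropped if they lie too close to an already kept image in latent space.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def prune(embeddings, errors, min_distance, error_threshold):
    """Split images into kept, high-error, and removed lists (greedy sketch)."""
    kept, high_error, removed = [], [], []
    for name in sorted(embeddings):
        if errors[name] > error_threshold:
            # Poorly reconstructed images are likely unusual, so keep them.
            high_error.append(name)
            continue
        too_close = any(euclidean(embeddings[name], embeddings[other]) < min_distance
                        for other in kept)
        (removed if too_close else kept).append(name)
    return {"kept": kept, "high_error": high_error, "removed": removed}


embeddings = {"a": [0.0, 0.0], "b": [0.1, 0.0], "c": [5.0, 5.0], "d": [0.0, 0.05]}
errors = {"a": 0.1, "b": 0.1, "c": 0.1, "d": 0.9}
result = prune(embeddings, errors, min_distance=1.0, error_threshold=0.5)
print(result)
```

In this toy run, `b` is removed as a near-duplicate of `a`, while `d` is retained despite being close to `a` because its reconstruction error is high.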

vae/plot_error.py

Loads an embeddings file and plots the reconstruction errors.

vae/model.py

The PyTorch model definition.

vae/dataset.py

The PyTorch dataset definition.

Architecture

| #  | Layer (type)        | Output Shape   | Param #   |
|----|---------------------|----------------|-----------|
|    | Input               | (3, 128, 112)  | 0         |
| 1  | Conv2d              | (32, 64, 56)   | 896       |
| 2  | BatchNorm2d         | (32, 64, 56)   | 64        |
| 3  | LeakyReLU           | (32, 64, 56)   | 0         |
| 4  | Conv2d              | (64, 32, 28)   | 18,496    |
| 5  | BatchNorm2d         | (64, 32, 28)   | 128       |
| 6  | LeakyReLU           | (64, 32, 28)   | 0         |
| 7  | Conv2d              | (64, 16, 14)   | 36,928    |
| 8  | BatchNorm2d         | (64, 16, 14)   | 128       |
| 9  | LeakyReLU           | (64, 16, 14)   | 0         |
| 10 | Conv2d              | (64, 8, 7)     | 36,928    |
| 11 | BatchNorm2d         | (64, 8, 7)     | 128       |
| 12 | LeakyReLU           | (64, 8, 7)     | 0         |
| 13 | Linear              | (300)          | 1,075,500 |
| 14 | LeakyReLU           | (300)          | 0         |
| 15 | Dropout             | (300)          | 0         |
| 16 | Linear              | (300)          | 1,075,500 |
| 17 | LeakyReLU           | (300)          | 0         |
| 18 | Dropout             | (300)          | 0         |
| 19 | Linear              | (3584)         | 1,078,784 |
| 20 | LeakyReLU           | (3584)         | 0         |
| 21 | Dropout             | (3584)         | 0         |
| 22 | UpsamplingNearest2d | (64, 16, 14)   | 0         |
| 23 | ConvTranspose2d     | (64, 16, 14)   | 36,928    |
| 24 | BatchNorm2d         | (64, 16, 14)   | 128       |
| 25 | LeakyReLU           | (64, 16, 14)   | 0         |
| 26 | UpsamplingNearest2d | (64, 32, 28)   | 0         |
| 27 | ConvTranspose2d     | (64, 32, 28)   | 36,928    |
| 28 | BatchNorm2d         | (64, 32, 28)   | 128       |
| 29 | LeakyReLU           | (64, 32, 28)   | 0         |
| 30 | UpsamplingNearest2d | (64, 64, 56)   | 0         |
| 31 | ConvTranspose2d     | (32, 64, 56)   | 18,464    |
| 32 | BatchNorm2d         | (32, 64, 56)   | 64        |
| 33 | LeakyReLU           | (32, 64, 56)   | 0         |
| 34 | UpsamplingNearest2d | (32, 128, 112) | 0         |
| 35 | ConvTranspose2d     | (3, 128, 112)  | 867       |
| 36 | Sigmoid             | (3, 128, 112)  | 0         |
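The parameter counts in the table can be reproduced with the usual formulas for convolutional, linear, and batch normalization layers. The sketch below assumes 3x3 kernels with bias terms; the kernel size is not stated in this README and is inferred here because it matches the listed counts. The helper names are chosen for illustration.

```python
def conv_params(c_in, c_out, kernel=3):
    """Weights plus biases of a Conv2d or ConvTranspose2d layer (assumed 3x3 kernel)."""
    return kernel * kernel * c_in * c_out + c_out


def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def batchnorm_params(channels):
    """One scale and one shift parameter per channel."""
    return 2 * channels


flat = 64 * 8 * 7  # flattened encoder output: 3584 values

print(conv_params(3, 32))        # 896       (layer 1)
print(conv_params(32, 64))       # 18496     (layer 4)
print(conv_params(64, 64))       # 36928     (layers 7, 10, 23, 27)
print(batchnorm_params(64))      # 128       (layers 5, 8, 11, 24, 28)
print(linear_params(flat, 300))  # 1075500   (layers 13 and 16)
print(linear_params(300, flat))  # 1078784   (layer 19)
print(conv_params(64, 32))       # 18464     (layer 31)
print(conv_params(32, 3))        # 867       (layer 35)
```

The two 3584-to-300 linear layers (13 and 16) both take the flattened encoder output as input, which is consistent with the two heads of a variational autoencoder producing the mean and the variance of the latent distribution.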

Generation of Simulation Data

The code for generating the simulation data can be found here https://github.com/bit-bots/wolfgang_robot/blob/feature/recognition/wolfgang_webots_sim/src/wolfgang_webots_sim/webots_camera_controller.py

Evaluation

Visualization of the position density of the respective annotations in the image space over all images of the real-world collection:

heatmaps

Publication

When you use our dataset or related software, please cite it as follows:

TORSO-21 Dataset: Typical Objects in RoboCup Soccer 2021

Abstract
We present a dataset specifically designed to be used as a benchmark to compare vision systems in the RoboCup Humanoid Soccer domain. The dataset is composed of a collection of images taken in various real-world locations as well as a collection of simulated images. It enables comparing vision approaches with a meaningful and expressive metric. The contributions of this paper consist of providing a comprehensive and annotated dataset, an overview of the recent approaches to vision in RoboCup, methods to generate vision training data in a simulated environment, and an approach to increase the variety of a dataset by automatically selecting a diverse set of images from a larger pool. Additionally, we provide a baseline of YOLOv4 and YOLOv4-tiny on this dataset.

[ResearchGate] [Download]

```bib
@inproceedings{TORSO2021,
  author = {Bestmann, Marc and Engelke, Timon and Fiedler, Niklas and Güldenstein, Jasper and Gutsche, Jan and Hagge, Jonas and Vahl, Florian},
  year = {2021},
  title = {{TORSO-21 Dataset: Typical Objects in RoboCup Soccer 2021}},
  booktitle = {RoboCup 2021: Robot World Cup XXIV}
}
```

NOTE: You can get various citation types in the right sidebar on GitHub "Cite this repository"...

Changelog

July 12, 2021

  • Replacement of the segmentations in the reality collection (using fix_segmentations.py). The update made for the publication had introduced incorrect color values of the field class in the segmentation images.

June 27, 2021

  • Publication

Owner

  • Name: Hamburg Bit-Bots
  • Login: bit-bots
  • Kind: organization
  • Location: Hamburg

Official Github account of Hamburg Bit-Bots

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: "TORSO-21 Dataset: Typical Objects in RoboCup Soccer 2021"
message: "If you use this dataset or its related software, please cite it using the metadata from this file."
type: dataset
license: MIT
date-released: "2021-06-27"
authors:
  - given-names: Marc
    family-names: Bestmann
    email: marc.bestmann@uni-hamburg.de
    affiliation: Universität Hamburg
    orcid: 'https://orcid.org/0000-0002-7857-793X'
  - given-names: Timon
    family-names: Engelke
    affiliation: Universität Hamburg
    email: timon.engelke@uni-hamburg.de
  - given-names: Niklas
    family-names: Fiedler
    email: niklas.fiedler@uni-hamburg.de
    affiliation: Universität Hamburg
    orcid: 'https://orcid.org/0000-0002-0856-5087'
  - given-names: Jasper
    family-names: Güldenstein
    email: jasper.gueldenstein@uni-hamburg.de
    affiliation: Universität Hamburg
    orcid: 'https://orcid.org/0000-0002-5416-5850'
  - given-names: Jan
    family-names: Gutsche
    email: jan.gutsche@uni-hamburg.de
    affiliation: Universität Hamburg
    orcid: 'https://orcid.org/0000-0002-0868-8589'
  - given-names: Jonas
    family-names: Hagge
    email: jonas.hagge@uni-hamburg.de
    affiliation: Universität Hamburg
  - given-names: Florian
    family-names: Vahl
    email: florian.vahl@uni-hamburg.de
    affiliation: Universität Hamburg
identifiers:
  - type: doi
    value: 10.1007/978-3-030-98682-7_6
    description: link.springer.com
repository-code: 'https://github.com/bit-bots/TORSO_21_dataset'
url: 'https://github.com/bit-bots/TORSO_21_dataset'
abstract: >-
  We present a dataset specifically designed to be
  used as a benchmark to compare vision systems in
  the RoboCup Humanoid Soccer domain. The dataset is
  composed of a collection of images taken in various
  real-world locations as well as a collection of
  simulated images. It enables comparing vision
  approaches with a meaningful and expressive metric.
  The contributions of this paper consist of
  providing a comprehensive and annotated dataset, an
  overview of the recent approaches to vision in
  RoboCup, methods to generate vision training data
  in a simulated environment, and an approach to
  increase the variety of a dataset by automatically
  selecting a diverse set of images from a larger
  pool. Additionally, we provide a baseline of YOLOv4
  and YOLOv4-tiny on this dataset.
keywords:
  - Computer vision
  - Vision dataset
  - Deep learning
preferred-citation:
  title: "TORSO-21 Dataset: Typical Objects in RoboCup Soccer 2021"
  type: conference-paper
  authors:
    - given-names: Marc
      family-names: Bestmann
      email: marc.bestmann@uni-hamburg.de
      affiliation: Universität Hamburg
      orcid: 'https://orcid.org/0000-0002-7857-793X'
    - given-names: Timon
      family-names: Engelke
      affiliation: Universität Hamburg
      email: timon.engelke@uni-hamburg.de
    - given-names: Niklas
      family-names: Fiedler
      email: niklas.fiedler@uni-hamburg.de
      affiliation: Universität Hamburg
      orcid: 'https://orcid.org/0000-0002-0856-5087'
    - given-names: Jasper
      family-names: Güldenstein
      email: jasper.gueldenstein@uni-hamburg.de
      affiliation: Universität Hamburg
      orcid: 'https://orcid.org/0000-0002-5416-5850'
    - given-names: Jan
      family-names: Gutsche
      email: jan.gutsche@uni-hamburg.de
      affiliation: Universität Hamburg
      orcid: 'https://orcid.org/0000-0002-0868-8589'
    - given-names: Jonas
      family-names: Hagge
      email: jonas.hagge@uni-hamburg.de
      affiliation: Universität Hamburg
    - given-names: Florian
      family-names: Vahl
      email: florian.vahl@uni-hamburg.de
      affiliation: Universität Hamburg
doi: "10.1007/978-3-030-98682-7_6"

GitHub Events

Total
  • Watch event: 1
  • Push event: 15
  • Create event: 1
Last Year
  • Watch event: 1
  • Push event: 15
  • Create event: 1

Dependencies

poetry.lock pypi
  • cycler 0.10.0 develop
  • dataclasses 0.8 develop
  • joblib 1.0.1 develop
  • kiwisolver 1.3.1 develop
  • matplotlib 3.3.4 develop
  • pandas 1.1.5 develop
  • pillow 8.2.0 develop
  • protobuf 3.17.3 develop
  • pyparsing 2.4.7 develop
  • python-dateutil 2.8.1 develop
  • pytz 2021.1 develop
  • scikit-learn 0.24.2 develop
  • scipy 1.5.4 develop
  • seaborn 0.11.1 develop
  • six 1.16.0 develop
  • sklearn 0.0 develop
  • tensorboardx 2.2 develop
  • threadpoolctl 2.1.0 develop
  • torch 1.8.1 develop
  • torchsummary 1.5.1 develop
  • torchvision 0.9.1 develop
  • typing-extensions 3.10.0.0 develop
  • certifi 2021.5.30
  • chardet 4.0.0
  • idna 2.10
  • numpy 1.19.5
  • opencv-python 4.5.2.54
  • pyyaml 5.4.1
  • requests 2.25.1
  • tqdm 4.61.1
  • urllib3 1.22
pyproject.toml pypi
  • Pillow ^8.1.2 develop
  • matplotlib ^3.3.4 develop
  • scipy 1.5.4 develop
  • seaborn ^0.11.1 develop
  • sklearn ^0.0 develop
  • tensorboardX ^2.1 develop
  • torch ^1.8.1 develop
  • torchsummary ^1.5.1 develop
  • torchvision ^0.9.1 develop
  • PyYAML ^5.4.1
  • numpy 1.19.5
  • opencv-python ^4.5.1
  • python >=3.6.2
  • requests ^2.25.1
  • tqdm ^4.59.0