luderick-seagrass

This dataset comprises annotated footage of Girella tricuspidata in two estuary systems in South East Queensland, Australia. The data are suitable for a range of classification and object detection research in unconstrained underwater environments.

https://github.com/globalwetlands/luderick-seagrass

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.1%) to scientific vocabulary

Keywords

computer-vision dataset deep-learning environmental-monitoring object-detection underwater-images

Repository

Basic Info
Statistics
  • Stars: 12
  • Watchers: 1
  • Forks: 3
  • Open Issues: 1
  • Releases: 0
Created over 5 years ago · Last pushed almost 4 years ago

README.md

Annotated videos of luderick from estuaries in southeast Queensland, Australia

License: CC BY 4.0

Overview

This dataset comprises annotated footage of Girella tricuspidata in two estuary systems in southeast Queensland, Australia. The data are suitable for a range of classification and object detection research in unconstrained underwater environments. Raw video footage was collected with submerged action cameras (Haldex Sports Action Cam HD 1080p) in the Tweed River estuary in southeast Queensland (-28.169438, 153.547594) between February and July 2019. Additional footage was collected from seagrass meadows in a separate estuary system, Tallebudgera Creek (-28.109721, 153.448975). On each sampling day, six cameras were deployed for 1 h over a variety of seagrass patches; the angle and placement of the cameras were varied among deployments to ensure a variety of backgrounds and fish angles. For training, videos were trimmed to contain only footage of luderick (the target species for the study) and split into frames at 5 frames per second.

The full data report is available at https://doi.org/10.3389/fmars.2021.629485
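The 5 frames per second sampling described in the overview can be sketched as follows. This is a minimal illustration and not the authors' extraction tool: the function name `frames_to_keep` is hypothetical, and the native frame rate of the source video (30 fps in the usage line) is an assumption, since the README does not state it.

```python
import math


def frames_to_keep(total_frames, native_fps, target_fps=5.0):
    """Return indices of the frames to keep when downsampling a video.

    native_fps is an assumed property of the source video; the README
    only states the 5 fps target rate.
    """
    if target_fps <= 0 or native_fps < target_fps:
        raise ValueError("need 0 < target_fps <= native_fps")
    step = native_fps / target_fps            # source frames per kept frame
    count = math.ceil(total_frames / step)    # upper bound on kept frames
    indices = (int(i * step) for i in range(count))
    return [idx for idx in indices if idx < total_frames]


# One second of (assumed) 30 fps video keeps every sixth frame.
print(frames_to_keep(30, 30))  # [0, 6, 12, 18, 24]
```

At an assumed 30 fps source rate the step is exactly 6 frames; for non-integer ratios the sketch truncates each index, which keeps the average rate at the target.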

Dataset presentation

This dataset includes 9429 annotations and 4280 images, which can be used for training object detection deep learning models and for other related computer vision tasks. The dataset is organised into three sub-datasets allocated for training, test and novel test purposes.

| Dataset | ID | Raw Videos | Version | Suggested use | Luderick Annotations | Bream Annotations | Total |
| ------------------------------ | --------- | ---------------- | ------- | ------------- | -------------------- | ----------------- | ----- |
| Luderick Seagrass Jack Evans A | Wvo7U_76t | Download (1.3GB) | 8 | training | 6649 | 53 | 6702 |
| Luderick Seagrass Jack Evans B | OmKwIVpe- | Download (1.1GB) | 8 | test | 1632 | 29 | 1661 |
| Luderick Seagrass Tallebudgera | 4bUBoZmvV | Download (576MB) | 6 | novel test | 1023 | 43 | 1066 |
| Total | | | | | 9304 | 125 | 9429 |

Images are included in a ZIP archive, which can be downloaded from either of the following:
* https://download.pangaea.de/dataset/926930/files/Fishautomatedidentificationandcounting.zip
* https://globalwetlands.blob.core.windows.net/globalwetlands-public/datasets/luderick-seagrass/luderick-seagrass.zip

Each annotation includes object instance annotations which consist of the following key fields:

* Labels are provided as a common name: either "luderick" for Girella tricuspidata or "bream" for Acanthopagrus australis.
* Bounding boxes that enclose the species in each frame are provided in "[x, y, width, height]" format, in pixel units.
* Segmentation masks which outline the species as a polygon are provided as a list of pixel coordinates in the format "[x, y, x, y, ...]".
* The corresponding image is provided as an image filename.

All image coordinates (bounding box and segmentation masks) are measured from the top left image corner and are 0-indexed.
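The coordinate conventions above can be illustrated with a short Python sketch. The helper names `bbox_to_corners` and `polygon_to_points` are hypothetical and not part of the dataset tooling; treating `x + width` and `y + height` as the far edges of the box is an assumption about how the extents are meant to be read.

```python
from typing import List, Tuple


def bbox_to_corners(bbox: List[float]) -> Tuple[float, float, float, float]:
    """Convert [x, y, width, height] to (x_min, y_min, x_max, y_max).

    Coordinates are in pixels, measured from the top left image corner.
    """
    x, y, width, height = bbox
    return (x, y, x + width, y + height)


def polygon_to_points(flat: List[float]) -> List[Tuple[float, float]]:
    """Turn a flat [x, y, x, y, ...] list into a list of (x, y) pairs."""
    if len(flat) % 2 != 0:
        raise ValueError("polygon list must hold an even number of values")
    return list(zip(flat[0::2], flat[1::2]))


# Values taken from the COCO JSON example in this README.
print(bbox_to_corners([0, 76, 624, 1003]))           # (0, 76, 624, 1079)
print(polygon_to_points([5, 76, 154, 80, 409, 76]))  # [(5, 76), (154, 80), (409, 76)]
```

Corner coordinates are often more convenient for cropping or drawing, while the `[x, y, width, height]` form is what the annotation files store.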

Annotations are provided in both CSV and COCO JSON formats. COCO JSON is a commonly used data format for integration with object detection frameworks including PyTorch and TensorFlow.

Additional details for each image can be found in dataset_images.csv, including data collection deployment dates, geo-coordinates and habitat type.

COCO JSON

Each annotation in COCO JSON format includes the following fields:

| Key | Description |
| ------------ | ------------------------------------------------------------------------------------ |
| id | INT annotation ID |
| category_id | INT category ID |
| image_id | INT image ID |
| bbox | ARRAY [x, y, width, height] of bounding box in px |
| area | INT area of bounding box in pixels squared |
| segmentation | STR segmentation polygon coordinates in format "[[x, y, x, y, ...]]" |
| iscrowd | INT 0 or 1. A value of 1 indicates the annotation includes more than one individual |
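Because `area` is described as the area of the bounding box, it should equal width times height. The sketch below checks this for the annotation used in this README's COCO JSON example; the function names are hypothetical, and the check assumes that `area` was computed exactly from the `bbox` values.

```python
def bbox_area(bbox):
    """Area in pixels squared of a [x, y, width, height] bounding box."""
    _, _, width, height = bbox
    return width * height


def area_matches(annotation):
    """True when the stored area equals width * height of the bbox."""
    return annotation["area"] == bbox_area(annotation["bbox"])


# Annotation values from the README example: 624 * 1003 = 625872.
example = {"bbox": [0, 76, 624, 1003], "area": 625872}
print(area_matches(example))  # True
```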

Each image in COCO JSON format includes the following fields:

| Key | Description |
| --------- | --------------------- |
| id | INT image ID |
| height | INT image height (px) |
| width | INT image width (px) |
| file_name | STR image filename |

Each category in COCO JSON format includes the following fields:

| Key | Description |
| ---- | --------------------------------------- |
| id | INT category ID |
| name | STR category name (species common name) |

COCO JSON Example

```json
{
  "annotations": [
    {
      "id": 0,
      "image_id": 0,
      "category_id": 1,
      "bbox": [0, 76, 624, 1003],
      "iscrowd": 0,
      "area": 625872,
      "segmentation": [[5, 76, 154, 80, 409, 76, 471, 86, 546, 110]]
    }
  ],
  "images": [
    {
      "file_name": "20190618_1.mov_5fps_000001.jpg",
      "height": 1080,
      "width": 1920,
      "id": 0,
      "license": 1
    }
  ],
  "categories": [{ "name": "luderick", "id": 1 }]
}
```
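A COCO JSON file with this structure can be read with Python's standard `json` module. The sketch below groups bounding boxes by image filename and resolves category IDs to names. It parses an inline copy of the example (with a shortened segmentation) instead of a file on disk, because the name of the annotation file inside the archive is not given here; `boxes_by_file` is a hypothetical helper.

```python
import json
from collections import defaultdict

# Trimmed copy of the README example; the segmentation list is shortened.
COCO_TEXT = """
{
  "annotations": [{"id": 0, "image_id": 0, "category_id": 1,
                   "bbox": [0, 76, 624, 1003], "iscrowd": 0, "area": 625872,
                   "segmentation": [[5, 76, 154, 80, 409, 76]]}],
  "images": [{"file_name": "20190618_1.mov_5fps_000001.jpg",
              "height": 1080, "width": 1920, "id": 0, "license": 1}],
  "categories": [{"name": "luderick", "id": 1}]
}
"""


def boxes_by_file(coco):
    """Map each image filename to a list of (category name, bbox) tuples."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    files = {img["id"]: img["file_name"] for img in coco["images"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        filename = files[ann["image_id"]]
        grouped[filename].append((names[ann["category_id"]], ann["bbox"]))
    return dict(grouped)


coco = json.loads(COCO_TEXT)
for filename, boxes in boxes_by_file(coco).items():
    print(filename, boxes)
```

For a real file, `json.load` on an open file handle would replace `json.loads(COCO_TEXT)`.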

CSV

For each annotation in CSV format, the following columns are provided:

| Column | Description |
| ------------ | -------------------------------------------------------------------- |
| id | INT annotation ID |
| category | STR name of category (luderick/bream) |
| category_id | INT category ID |
| image | STR image file name |
| image_id | INT image ID |
| bbox_x | INT minimum x pixel coordinate of bounding box |
| bbox_y | INT minimum y pixel coordinate of bounding box |
| bbox_w | INT width of bounding box in pixels |
| bbox_h | INT height of bounding box in pixels |
| area | INT area of bounding box in pixels squared |
| segmentation | STR segmentation polygon coordinates in format "[[x, y, x, y, ...]]" |

CSV Example

| id | category | category_id | image | image_id | bbox_x | bbox_y | bbox_w | bbox_h | area | segmentation |
| --- | -------- | ----------- | ------------------------------ | -------- | ------ | ------ | ------ | ------ | ------ | ---------------- |
| 0 | luderick | 1 | 20190618_1.mov_5fps_000001.jpg | 0 | 0 | 76 | 624 | 1003 | 625872 | "[[5, 76, ...]]" |
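The CSV annotations can be read with Python's standard `csv` module. The sketch below makes several assumptions: the column names are written with underscores (`bbox_x`, `bbox_y`, `bbox_w`, `bbox_h`), the segmentation string is valid JSON, and the polygon values in the sample row are illustrative, since the README example elides them. Adjust the names if the header of the real file differs; `read_annotations` is a hypothetical helper.

```python
import csv
import io
import json

# Illustrative row modelled on the README example; polygon values are assumed.
SAMPLE_CSV = (
    "id,category,category_id,image,image_id,"
    "bbox_x,bbox_y,bbox_w,bbox_h,area,segmentation\n"
    "0,luderick,1,20190618_1.mov_5fps_000001.jpg,0,"
    '0,76,624,1003,625872,"[[5, 76, 154, 80, 409, 76]]"\n'
)


def read_annotations(handle):
    """Parse annotation rows into dicts with typed fields."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "id": int(row["id"]),
            "category": row["category"],
            "image": row["image"],
            # Reassemble the bounding box as [x, y, width, height].
            "bbox": [int(row[key]) for key in
                     ("bbox_x", "bbox_y", "bbox_w", "bbox_h")],
            "area": int(row["area"]),
            # The polygon is stored as a string; decode it into nested lists.
            "segmentation": json.loads(row["segmentation"]),
        })
    return rows


annotations = read_annotations(io.StringIO(SAMPLE_CSV))
print(annotations[0]["category"], annotations[0]["bbox"])
```

For a file on disk, pass a handle opened with `open(path, newline="")` in place of the `io.StringIO` object.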

Owner

  • Name: The Global Wetlands Project
  • Login: globalwetlands
  • Kind: organization

International research team working to elevate scientific understanding and deliver tools for coastal conservation.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this data repository, please cite it as below."
authors:
- family-names: "Ditria"
  given-names: "Ellen M"
- family-names: "Connolly"
  given-names: "Rod M"
  orcid: "https://orcid.org/0000-0001-6223-1291"
- family-names: "Jinks"
  given-names: "Eric L"
  orcid: "https://orcid.org/0000-0003-2507-2070"
- family-names: "Lopez-Marcano"
  given-names: "Sebastian"
title: "Annotated video footage for automated identification and counting of fish in unconstrained marine environments"
version: 1.0.0
doi: 10.1594/PANGAEA.926930
date-released: 2021-04-21
url: "https://doi.org/10.1594/PANGAEA.926930"

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1