dataset-fish-detection-low-visibility

A fully annotated baited underwater dataset of poor and fair visibility videos for the development of fish detection models and image pre-processing tools.

https://github.com/slopezmarcano/dataset-fish-detection-low-visibility

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.5%) to scientific vocabulary
Last synced: 10 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: slopezmarcano
  • Default Branch: gh-pages
  • Size: 4.85 MB
Statistics
  • Stars: 8
  • Watchers: 1
  • Forks: 1
  • Open Issues: 1
  • Releases: 2
Created over 5 years ago · Last pushed about 1 year ago
Metadata Files
Readme Citation

README.md

An annotated dataset for automated detection and counting of estuarine fish in poor visibility conditions


Table of Contents

  1. Overview
  2. Uses
  3. Datasets
  4. Species
  5. Dataset Links
  6. Attributions

Overview

Here we provide an open-access, annotated dataset of baited underwater videos recorded in poor and fair visibility, for the development of fish detection models and the benchmarking of image pre-processing tools. We provide the training images and their annotations, and a 12-hour testing dataset with ground-truth MaxN abundance for four target species.


Uses

This dataset can be used:

  1. As a computer vision training dataset to monitor estuarine fish on the eastern coast of Australia.
  2. As a benchmark dataset to test image pre-processing techniques (e.g. colour correction).
  3. As a benchmark dataset to test image post-processing techniques (e.g. fish occlusion filters).
  4. To supplement global fish detection models (e.g. see MegaDetector by Microsoft).
  5. To increase accessibility of underwater computer vision tools for aquatic monitoring and environmental science (e.g. see lilascience).

Datasets

We provide access to two datasets: a training dataset and a testing dataset.

The training dataset is a fully annotated dataset that contains images, annotations and labels of various fish species. The training dataset includes videos recorded from 2017 to 2021 in Moreton Bay, Australia, across poor visibility Secchi depths (2-5 m), using a standard baited underwater video rig with GoPro cameras recording at 1080p.

The testing dataset includes several non-annotated videos from the same location, visibility scenarios and period as the training dataset. The testing dataset can be used to evaluate computer vision fish detection models. The ground truth is a CSV file containing manual maximum abundance (MaxN) counts of each fish species for each video. The maximum number of individuals per video was manually determined by researchers at the Moreton Bay Environmental Education Centre.
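As a rough illustration of how a detection model could be scored against the MaxN ground truth, the sketch below derives MaxN (the largest number of individuals of a species seen in any single frame of a video) from per-frame detections and compares it with manual counts. The tuple layout, the video name and the counts are hypothetical and are not taken from the dataset files.

```python
from collections import Counter, defaultdict


def max_n(detections):
    """Return {(video, species): MaxN} from per-frame detections.

    `detections` is an iterable of (video, frame_index, species) tuples,
    one tuple per detected individual (an assumed detector output format).
    """
    # Count the individuals of each species in every frame of every video.
    per_frame = Counter(detections)
    # MaxN is the largest single-frame count for each (video, species) pair.
    result = defaultdict(int)
    for (video, _frame, species), count in per_frame.items():
        key = (video, species)
        result[key] = max(result[key], count)
    return dict(result)


def abs_errors(predicted, ground_truth):
    """Absolute MaxN error per (video, species); a missing prediction counts as 0."""
    return {
        key: abs(predicted.get(key, 0) - true_n)
        for key, true_n in ground_truth.items()
    }


# Toy values for illustration only.
detections = [
    ("video_01", 0, "Smallmouth Scad"),
    ("video_01", 0, "Smallmouth Scad"),
    ("video_01", 1, "Smallmouth Scad"),
    ("video_01", 1, "Australasian Snapper"),
]
ground_truth = {
    ("video_01", "Smallmouth Scad"): 3,
    ("video_01", "Australasian Snapper"): 1,
}
predicted = max_n(detections)
errors = abs_errors(predicted, ground_truth)
```

In practice the ground-truth dictionary would be built from the provided CSV; its column names are not documented here, so that parsing step is left out.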

The training and testing datasets were collected by the Moreton Bay Environmental Education Centre.

Species

The training dataset contains >65,000 segmentation mask annotations of 19 different estuarine fish species from Moreton Bay, Australia. We targeted four species for studies conducted at the Global Wetlands Project. Therefore, these species have a larger number of annotations. We suggest caution when using annotations of the non-targeted species, as these were variably annotated across the dataset. Please contact Sebastian Lopez-Marcano for more information.

| Species | Num Annotations | Targeted species |
|---|---|---|
| Australasian Snapper | 9,489 | YES |
| Bengal Sergeant | 277 | NO |
| Black-Banded Trevally | 89 | NO |
| Blue Catfish | 2,411 | NO |
| Blue Swimmer Crab | 847 | NO |
| Eastern Striped Grunter | 14,631 | NO |
| Eastern Stripey | 307 | NO |
| Echinoderm | 14 | NO |
| Fanbelly Leatherjacket | 190 | NO |
| Gunthers Wrasse | 603 | NO |
| Mackerel spp | 139 | NO |
| Moses Snapper | 53 | NO |
| Paradise Threadfin Bream | 10,658 | YES |
| Pinkbanded Grubfish | 502 | NO |
| Pomacentrid spp | 27 | NO |
| Remora spp | 41 | NO |
| Smallmouth Scad | 7,067 | YES |
| Smooth Golden Toadfish | 5,014 | YES |
| Yellowfin Bream and Tarwhine | 11,872 | NO |
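Given the caution about non-targeted species, one option is to restrict training to the four targeted species. The sketch below is a minimal example that assumes each annotation record is a dictionary with a `label` key; that key name and the sample records are hypothetical. Labels are normalised before matching because the exact label spelling in the annotation files may differ from the table (for example, names written without spaces).

```python
# The four species marked YES in the species table.
TARGETED = {
    "Australasian Snapper",
    "Paradise Threadfin Bream",
    "Smallmouth Scad",
    "Smooth Golden Toadfish",
}


def normalise(label):
    """Lower-case a label and keep only letters and digits,
    so 'SmallmouthScad' matches 'Smallmouth Scad'."""
    return "".join(ch for ch in label.lower() if ch.isalnum())


_TARGET_KEYS = {normalise(name) for name in TARGETED}


def keep_targeted(annotations, label_key="label"):
    """Return only the annotation records whose label is a targeted species."""
    return [a for a in annotations if normalise(a[label_key]) in _TARGET_KEYS]


# Hypothetical records for illustration.
records = [
    {"label": "SmallmouthScad", "bbox": [5, 5, 20, 10]},
    {"label": "YellowfinBream", "bbox": [40, 12, 60, 30]},
    {"label": "Smooth Golden Toadfish", "bbox": [0, 0, 15, 15]},
]
targeted_records = keep_targeted(records)
```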

Dataset Links

Password: fishdetection

| Dataset | Raw Videos | Raw Images | Version | Num Annotations | Annotations (CSV/JSON) |
|---|---|---|---|---|---|
| Training dataset | NA | Download 555 MB | 7 | 8,696 | Download 19 MB |
| Testing dataset | Download 6.5 GB | NA | 1 | NA | Groundtruth |

Annotations

Each annotation is an object instance annotation consisting of the following key fields: a label, provided as a common name (e.g. YellowfinBream for Acanthopagrus australis); a bounding box that encloses the individual in each frame, provided in "[x,y,width,height]" format, in pixel units; and a segmentation mask that outlines the individual as a polygon, provided as a list of pixel coordinates in the format "[x,y,x,y,...]".

The corresponding image is provided as an image filename. All image coordinates (bounding boxes and segmentation masks) are measured from the top-left image corner and are 0-indexed.
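To make the coordinate conventions concrete, the following sketch converts a "[x,y,width,height]" box into corner coordinates and unpacks a flat "[x,y,x,y,...]" polygon into point pairs, with a shoelace-formula area as a simple sanity check. The function names and sample values are illustrative and are not part of the dataset tooling.

```python
def bbox_to_corners(bbox):
    """Convert [x, y, width, height] to (x_min, y_min, x_max, y_max) in pixels."""
    x, y, width, height = bbox
    return (x, y, x + width, y + height)


def polygon_points(flat):
    """Turn a flat [x, y, x, y, ...] list into a list of (x, y) pairs."""
    if len(flat) % 2 != 0:
        raise ValueError("polygon list must contain an even number of values")
    return list(zip(flat[0::2], flat[1::2]))


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        # Wrap around so the last vertex connects back to the first.
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Illustrative values: a 30x40 box and a 4x3 rectangle drawn as a polygon.
corners = bbox_to_corners([10, 20, 30, 40])
points = polygon_points([0, 0, 4, 0, 4, 3, 0, 3])
area = polygon_area(points)
```

Comparing the polygon area with the bounding-box area is one quick way to spot masks that were loaded with the wrong coordinate order.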

Annotations are provided in both CSV format and COCO JSON format, a data format commonly used for integration with object detection frameworks including PyTorch and TensorFlow. For more information on annotation files in COCO JSON and/or CSV formats go here.
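As a minimal sketch of reading the COCO JSON annotations with the Python standard library, the example below groups annotations by image filename. It assumes the file follows the standard COCO layout (`images`, `annotations` and `categories` lists); the inline sample, including the filename and values, is made up for illustration and is not copied from the dataset.

```python
import json
from collections import defaultdict


def index_coco(coco):
    """Group (label, bbox) pairs by image filename for a COCO-style dict."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    files = {img["id"]: img["file_name"] for img in coco["images"]}
    by_file = defaultdict(list)
    for ann in coco["annotations"]:
        label = names[ann["category_id"]]
        by_file[files[ann["image_id"]]].append((label, ann["bbox"]))
    return dict(by_file)


# Tiny hypothetical sample in the standard COCO layout.
sample = json.loads("""
{
  "images": [{"id": 1, "file_name": "frame_0001.jpg", "width": 1920, "height": 1080}],
  "categories": [{"id": 7, "name": "YellowfinBream"}],
  "annotations": [
    {"id": 100, "image_id": 1, "category_id": 7,
     "bbox": [10, 20, 30, 40],
     "segmentation": [[10, 20, 40, 20, 40, 60, 10, 60]]}
  ]
}
""")
indexed = index_coco(sample)
```

For the downloaded annotation file, `json.loads` on the inline string would be replaced by `json.load` on the opened file.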

Attributions

Please use 'CITATION.cff' to cite this dataset.

We kindly request that the following text be included in an acknowledgements section at the end of your publications:

"We would like to thank the Moreton Bay Environmental Education Centre for freely supplying us with the fish dataset for our research. The fish dataset was supported by an AI for Earth grant from Microsoft."


Owner

  • Name: Sebastian Lopez-M
  • Login: slopezmarcano
  • Kind: user
  • Location: Australia
  • Company: Griffith University

Data Science for AgTech, conservation and education. #FishID

Citation (CITATION.cff)

# YAML 1.2
---
abstract: "An open-access and annotated baited underwater dataset of poor and fair visibility videos for the development of fish detection models and benchmarking of image pre-processing tools. We provide the annotated training annotations and images, and a 12-hour testing dataset with ground-truth MaxN abundance for four target species."
authors: 
  -
    affiliation: "Quantitative Imaging Research Team, Data61, CSIRO and Coastal and Marine Research Centre, Griffith University, Australia "
    family-names: "Lopez-Marcano"
    given-names: Sebastian
    orcid: "https://orcid.org/0000-0002-0814-2906"
  -
    affiliation: "Moreton Bay Environmental Education Centre, Queensland, Australia"
    family-names: Roe
    given-names: Tim
  -
    affiliation: "Coastal and Marine Research Centre, Griffith University, Australia"
    family-names: Kitchingman
    given-names: Michaela
  -
    affiliation: "Coastal and Marine Research Centre, Griffith University, Australia"
    family-names: Jinks
    given-names: Eric
  -
    affiliation: "Coastal and Marine Research Centre, Griffith University, Australia"
    family-names: Connolly
    given-names: Rod
    orcid: "https://orcid.org/0000-0001-6223-1291"
cff-version: "1.1.0"
doi: "10.5281/zenodo.5238512"
keywords: 
  - "computer vision"
  - "object detection"
  - "deep learning"
  - "aquatic monitoring"
  - estuary
message: "If you use this software, please cite it using these metadata."
repository-code: "https://slopezmarcano.github.io/automated-fish-detection-in-low-visibility/"
title: "An annotated dataset for automated detection and counting of estuarine fish in poor visibility conditions"
version: "v1.1"
...

GitHub Events

Total
  • Issues event: 5
  • Watch event: 4
  • Issue comment event: 6
  • Push event: 2
Last Year
  • Issues event: 5
  • Watch event: 4
  • Issue comment event: 6
  • Push event: 2

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 2
  • Total pull requests: 0
  • Average time to close issues: 4 months
  • Average time to close pull requests: N/A
  • Total issue authors: 2
  • Total pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: 4 months
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • junioryearn (2)
  • slopezmarcano (1)
Pull Request Authors
Top Labels
Issue Labels
bug (1)
Pull Request Labels