yeast-in-microstructures-dataset

Official and maintained implementation of the dataset paper "An Instance Segmentation Dataset of Yeast Cells in Microstructures" [EMBC 2023].

https://github.com/christophreich1996/yeast-in-microstructures-dataset

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org, scholar.google
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.7%) to scientific vocabulary

Keywords

cell-segmentation dataset deep-learning embc evaluation evaluation-framework evaluation-metrics instance-segmentation medical-imaging panoptic-segmentation segmentation yeast-dataset
Last synced: 6 months ago

Repository

Official and maintained implementation of the dataset paper "An Instance Segmentation Dataset of Yeast Cells in Microstructures" [EMBC 2023].

Basic Info
Statistics
  • Stars: 14
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Topics
cell-segmentation dataset deep-learning embc evaluation evaluation-framework evaluation-metrics instance-segmentation medical-imaging panoptic-segmentation segmentation yeast-dataset
Created almost 3 years ago · Last pushed almost 2 years ago
Metadata Files
Readme License Citation

README.md

An Instance Segmentation Dataset of Yeast Cells in Microstructures


Christoph Reich , Tim Prangemeier , André O. Françani & Heinz Koeppl

| Project Page | Paper | Download Dataset |


This repository includes the official and maintained PyTorch validation (+ data loading & visualization) code of the Yeast in Microstructures dataset proposed in An Instance Segmentation Dataset of Yeast Cells in Microstructures.

```shell script
wget https://tudatalib.ulb.tu-darmstadt.de/bitstream/handle/tudatalib/3799/yeast_cell_in_microstructures_dataset.zip
```

Update: We have released a high-resolution dataset of our microscopy images with panoptic annotations at ICCVW 2023. Check out our TYC dataset project page!

Abstract

Extracting single-cell information from microscopy data requires accurate instance-wise segmentations. Obtaining pixel-wise segmentations from microscopy imagery remains a challenging task, especially with the added complexity of microstructured environments. This paper presents a novel dataset for segmenting yeast cells in microstructures. We offer pixel-wise instance segmentation labels for both cells and trap microstructures. In total, we release 493 densely annotated microscopy images. To facilitate a unified comparison between novel segmentation algorithms, we propose a standardized evaluation strategy for our dataset. The aim of the dataset and evaluation strategy is to facilitate the development of new cell segmentation approaches.

If you use our dataset or find this research useful in your work, please cite our paper:

```bibtex
@inproceedings{Reich2023,
    title={{An Instance Segmentation Dataset of Yeast Cells in Microstructures}},
    author={Reich, Christoph and Prangemeier, Tim and Fran{\c{c}}ani, Andr{\'e} O and Koeppl, Heinz},
    booktitle={{International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)}},
    year={2023}
}
```

Table of Contents

  1. Installation
  2. Dataformat
  3. Dataset Class
  4. Evaluation
  5. Visualization
  6. Additional Unlabeled Data
  7. Acknowledgements

Installation

The validation, data loading, and visualization code can be installed as a Python package by running:

```shell script
pip install git+https://github.com/ChristophReich1996/Yeast-in-Microstructures-Dataset.git
```

All dependencies are listed in requirements.txt.

Dataformat

The dataset is split into a training, validation, and test set. Please refer to the paper for more information on this.

```
├── test
│   ├── bounding_boxes
│   ├── classes
│   ├── inputs
│   └── instances
├── train
│   ├── bounding_boxes
│   ├── classes
│   ├── inputs
│   └── instances
└── val
    ├── bounding_boxes
    ├── classes
    ├── inputs
    └── instances
```

Every subset (train, val, and test) includes four folders (inputs, instances, classes, bounding_boxes). The inputs folder includes the input images, each with the shape [128, 128]. The instances folder holds the instance maps with a shape of [N, 128, 128] (N is the number of instances). The classes folder holds the semantic class of each instance as a tensor of shape [N]. The bounding_boxes folder offers axis-aligned bounding boxes for each instance with a shape of [N, 4 (x0y0x1y1)]. Every sample of the dataset has a .pt file in each of the four folders. Each .pt file can be loaded directly as a PyTorch tensor with torch.load(...). For details on the data loading, please have a look at the dataset class implementation.
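
As a minimal sketch of how a single raw sample can be inspected without the dataset class, the four .pt files of one sample can be loaded directly with torch.load. The dataset path and sample file name used here are placeholders, not actual names from the dataset:

```python
import torch

# Placeholder path and sample name; substitute a file that exists in your copy of the dataset
root = "/some_path_to_data/train"
sample_name = "0.pt"

# Load the four tensors belonging to one sample
image = torch.load(f"{root}/inputs/{sample_name}")                   # input image, [128, 128]
instances = torch.load(f"{root}/instances/{sample_name}")            # instance maps, [N, 128, 128]
classes = torch.load(f"{root}/classes/{sample_name}")                # semantic class per instance, [N]
bounding_boxes = torch.load(f"{root}/bounding_boxes/{sample_name}")  # boxes, [N, 4 (x0y0x1y1)]

print(image.shape, instances.shape, classes.shape, bounding_boxes.shape)
```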

Dataset Class

This repo includes a PyTorch dataset class implementation of the Yeast in Microstructures dataset, located in the yim_dataset.data module. The dataset class loads the dataset and returns the images, instance maps, bounding boxes, and semantic classes.

```python
import yim_dataset
from torch import Tensor
from torch.utils.data import Dataset

# Init dataset
dataset: Dataset = yim_dataset.data.YIMDataset(path="/some_path_to_data/train", return_absolute_bounding_box=False)

# Get first sample of the dataset
image, instances, bounding_boxes, class_labels = dataset[0]  # type: Tensor, Tensor, Tensor, Tensor

# Show shapes
print(image.shape)  # [1, 256, 256]
print(instances.shape)  # [N, 256, 256]
print(bounding_boxes.shape)  # [N, 4 (xcycwh, relative format)]
print(class_labels)  # [N, C=2 (trap=0 and cell=1)]
```

The dataset class implementation also offers support for custom Kornia data augmentations. You can pass an AugmentationSequential object to the dataset class. The following example utilizes random horizontal and vertical flipping as well as random Gaussian blur augmentations.

```python
import kornia.augmentation
import yim_dataset
from torch.utils.data import Dataset

# Init augmentations
augmentations = kornia.augmentation.AugmentationSequential(
    kornia.augmentation.RandomHorizontalFlip(p=0.5),
    kornia.augmentation.RandomVerticalFlip(p=0.5),
    kornia.augmentation.RandomGaussianBlur(kernel_size=(31, 31), sigma=(9, 9), p=0.5),
    data_keys=["input", "bbox_xyxy", "mask"],
    same_on_batch=False,
)

# Init dataset
dataset: Dataset = yim_dataset.data.YIMDataset(path="/some_path_to_data/train", augmentations=augmentations)
```

Note that it is necessary to pass ["input", "bbox_xyxy", "mask"] as data keys! If a different data key configuration is given, a runtime error is raised.

For wrapping the dataset with the PyTorch DataLoader, please use the custom collate function.

```python
from typing import List

import yim_dataset
from torch import Tensor
from torch.utils.data import Dataset, DataLoader

# Init dataset
dataset: Dataset = yim_dataset.data.YIMDataset(path="/some_path_to_data/train", return_absolute_bounding_box=False)
data_loader = DataLoader(
    dataset=dataset,
    num_workers=2,
    batch_size=2,
    drop_last=True,
    collate_fn=yim_dataset.data.collate_function_yim_dataset,
)

# Get a sample from dataloader
images, instances, bounding_boxes, class_labels = next(
    iter(data_loader))  # type: Tensor, List[Tensor], List[Tensor], List[Tensor]

# Show shapes
print(images.shape)  # [B, 1, 256, 256]
print(instances[0].shape)  # list of tensors, each [N, 256, 256]
print(bounding_boxes[0].shape)  # list of tensors, each [N, 4 (xcycwh, relative format)]
print(class_labels)  # list of tensors, each [N, C=2 (trap=0 and cell=1)]
```

All Dataset Class Parameters

[YIMDataset](yim_dataset/data/dataset.py) parameters:

| Parameter | Default value | Info |
|---|---|---|
| `path: str` | - | Path to the dataset as a string. |
| `augmentations: Optional[AugmentationSequential]` | `None` | Augmentations to be used. If `None`, no augmentation is employed. |
| `normalize: bool` | `True` | If true, images are normalized by the given normalization function. |
| `normalization_function: Callable[[Tensor], Tensor]` | `normalize` (0 mean, unit std.) | Normalization function. |
| `return_absolute_bounding_box: bool` | `False` | If true, bounding boxes are returned in absolute format (else relative). |
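
A custom normalization can be plugged in via the normalization_function parameter. The following is a minimal sketch; the min-max scaling shown here is just an illustrative choice, not the package default (zero-mean, unit-std normalization):

```python
import yim_dataset
from torch import Tensor
from torch.utils.data import Dataset


def min_max_normalize(image: Tensor) -> Tensor:
    # Illustrative alternative to the default zero-mean/unit-std normalization:
    # scale the image values to the range [0, 1]
    return (image - image.min()) / (image.max() - image.min() + 1e-8)


# Init dataset with the custom normalization function
dataset: Dataset = yim_dataset.data.YIMDataset(
    path="/some_path_to_data/train",
    normalize=True,
    normalization_function=min_max_normalize,
)
```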

We provide a full dataset and data loader example in example_eval.py.

If this dataset class implementation is not sufficient for your application, please customize the existing code or open a pull request extending the existing implementation.

Evaluation

We propose to validate segmentation predictions on our dataset by using the Panoptic Quality and the cell class IoU. We implement both metrics as a TorchMetrics metric in the yim_dataset.eval module. Both metrics (PanopticQuality and CellIoU) can be used like all TorchMetrics metrics. The input to both metrics is the prediction, composed of the instance maps (list of tensors) and semantic class prediction (list of tensors), and the label is also composed of instance maps and semantic classes. Note that the instance maps are not allowed to overlap. Additionally, both metrics assume thresholded instance maps and hard semantic classes (no logits).

```python
import yim_dataset
from torchmetrics import Metric

pq: Metric = yim_dataset.eval.PanopticQuality()
cell_iou: Metric = yim_dataset.eval.CellIoU()

for index, (images, instances, bounding_boxes, class_labels) in enumerate(data_loader):
    # Make prediction
    instances_pred, bounding_boxes_pred, class_labels_pred = model(
        images)  # type: List[Tensor], List[Tensor], List[Tensor]
    # Get semantic classes from one-hot vector
    class_labels = [c.argmax(dim=-1) for c in class_labels]
    class_labels_pred = [c.argmax(dim=-1) for c in class_labels_pred]
    # Compute metrics
    pq.update(
        instances_pred=instances_pred,
        classes_pred=class_labels_pred,
        instances_target=instances,
        classes_target=class_labels,
    )
    cell_iou.update(
        instances_pred=instances_pred,
        classes_pred=class_labels_pred,
        instances_target=instances,
        classes_target=class_labels,
    )

# Compute final metric
print(f"Panoptic Quality: {pq.compute().item()}")
print(f"Cell class IoU: {cell_iou.compute().item()}")
```

A full working example is provided in example_eval.py.

Visualization

This implementation (yim_dataset.vis module) also includes various functions for reproducing the plots from the paper. The instance segmentation overlay (image + instance maps + BB + classes), as shown at the top, can be achieved by:

```python
import yim_dataset
from torch import Tensor
from torch.utils.data import Dataset

# Init dataset
dataset: Dataset = yim_dataset.data.YIMDataset(path="/some_path_to_data/train", return_absolute_bounding_box=False)

# Get first sample of the dataset
image, instances, bounding_boxes, class_labels = dataset[0]  # type: Tensor, Tensor, Tensor, Tensor

# Plot
yim_dataset.vis.plot_image_instances_bb_classes(
    image=image,
    instances=instances,
    bounding_boxes=yim_dataset.data.bounding_box_xcycwh_to_x0y0x1y1(bounding_boxes),
    class_labels=class_labels.argmax(dim=1),
    save=False,
    show=True,
    show_class_label=True,
)
```

All plot functions take the parameters show: bool and save: bool. If show=True, the plot is directly visualized by calling plt.show(). If you want to save the plot to a file, set save=True and provide the path and file name (file_path: str).
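
Saving the overlay from the previous example to disk could, for example, look as follows (a minimal sketch based on the parameters described above; the file name is a placeholder):

```python
import yim_dataset

# image, instances, bounding_boxes, and class_labels as loaded in the example above
# Save the instance segmentation overlay to a file instead of showing it
yim_dataset.vis.plot_image_instances_bb_classes(
    image=image,
    instances=instances,
    bounding_boxes=yim_dataset.data.bounding_box_xcycwh_to_x0y0x1y1(bounding_boxes),
    class_labels=class_labels.argmax(dim=1),
    save=True,
    show=False,
    file_path="overlay.png",  # placeholder target path and file name
    show_class_label=True,
)
```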

An example use of all visualization functions is provided in example_vis.py.

Additional Unlabeled Data

Note that additional unlabeled data from the same domain are also available. In the paper Multi-StyleGAN: Towards Image-Based Simulation of Time-Lapse Live-Cell Microscopy, we proposed an unlabeled dataset of ~9k images (sequences) of yeast cells in microstructures. The dataset is available at TUdatalib. Please cite the following paper if you are using the unlabeled images in your research:

```bibtex
@inproceedings{Reich2021,
    title={{Multi-StyleGAN: Towards Image-Based Simulation of Time-Lapse Live-Cell Microscopy}},
    author={Reich, Christoph and Prangemeier, Tim and Wildner, Christian and Koeppl, Heinz},
    booktitle={{International Conference on Medical image computing and computer-assisted intervention (MICCAI)}},
    year={2021},
    organization={Springer}
}
```

Acknowledgements

We thank Christoph Hoog Antink for insightful discussions, Klaus-Dieter Voss for aid with the microfluidics fabrication, Jan Basrawi for contributing to data labeling, and Robert Sauerborn for aid with setting up the project page.

Credit to TorchMetrics (Lightning AI), Kornia, and PyTorch for providing the basis of this implementation.

This work was supported by the Landesoffensive für wissenschaftliche Exzellenz as part of the LOEWE Schwerpunkt CompuGene. H.K. acknowledges the support from the European Research Council (ERC) with the consolidator grant CONSYN (nr. 773196). C.R. acknowledges the support of NEC Laboratories America, Inc.

Owner

  • Name: Christoph Reich
  • Login: ChristophReich1996
  • Kind: user
  • Location: Germany
  • Company: Technical University of Munich

ELLIS Ph.D. Student @ Technical University of Munich, Technische Universität Darmstadt & University of Oxford | Prev. NEC Labs

Citation (CITATION.cff)

cff-version: 1.2.0
message: "Code of the paper: An Instance Segmentation Dataset of Yeast Cells in Microstructures"
authors:
  - family-names: Reich
    given-names: Christoph
  - family-names: Prangemeier
    given-names: Tim
  - family-names: Françani
    given-names: André O.
  - family-names: Koeppl
    given-names: Heinz
title: "An Instance Segmentation Dataset of Yeast Cells in Microstructures"
version: 0.1.0
date-released: 2023-04-17

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 16
  • Total Committers: 3
  • Avg Commits per committer: 5.333
  • Development Distribution Score (DDS): 0.313
Past Year
  • Commits: 16
  • Committers: 3
  • Avg Commits per committer: 5.333
  • Development Distribution Score (DDS): 0.313
Top Committers
| Name | Email | Commits |
|---|---|---|
| Christoph Reich | 3****6 | 11 |
| ChristophReich1996 | c****h@g****t | 4 |
| Tim Prangemeier | 4****r | 1 |
Committer Domains (Top 20 + Academic)
gmx.net: 1

Issues and Pull Requests

Last synced: about 2 years ago

All Time
  • Total issues: 1
  • Total pull requests: 0
  • Average time to close issues: 1 day
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: 1 day
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • luoolu (1)
Pull Request Authors
Top Labels
Issue Labels
question (1)
Pull Request Labels