yeast-in-microstructures-dataset
Official and maintained implementation of the dataset paper "An Instance Segmentation Dataset of Yeast Cells in Microstructures" [EMBC 2023].
https://github.com/christophreich1996/yeast-in-microstructures-dataset
Repository
Basic Info
- Host: GitHub
- Owner: ChristophReich1996
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://christophreich1996.github.io/yeast_in_microstructures_dataset/
- Size: 14.5 MB
Statistics
- Stars: 14
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
An Instance Segmentation Dataset of Yeast Cells in Microstructures
Christoph Reich, Tim Prangemeier, André O. Françani & Heinz Koeppl
| Project Page | Paper | Download Dataset |
This repository includes the official and maintained PyTorch validation (+ data loading & visualization) code of the Yeast in Microstructures dataset proposed in An Instance Segmentation Dataset of Yeast Cells in Microstructures.
wget https://tudatalib.ulb.tu-darmstadt.de/bitstream/handle/tudatalib/3799/yeast_cell_in_microstructures_dataset.zip
Update: We have released a high-resolution dataset of our microscopy images with panoptic annotations at ICCVW 2023. Check out our TYC dataset project page!
Abstract
Extracting single-cell information from microscopy data requires accurate instance-wise segmentations. Obtaining pixel-wise segmentations from microscopy imagery remains a challenging task, especially with the added complexity of microstructured environments. This paper presents a novel dataset for segmenting yeast cells in microstructures. We offer pixel-wise instance segmentation labels for both cells and trap microstructures. In total, we release 493 densely annotated microscopy images. To facilitate a unified comparison between novel segmentation algorithms, we propose a standardized evaluation strategy for our dataset. The aim of the dataset and evaluation strategy is to facilitate the development of new cell segmentation approaches.
If you use our dataset or find this research useful in your work, please cite our paper:
@inproceedings{Reich2023,
    title={{An Instance Segmentation Dataset of Yeast Cells in Microstructures}},
    author={Reich, Christoph and Prangemeier, Tim and Fran{\c{c}}ani, Andr{\'e} O and Koeppl, Heinz},
    booktitle={{International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)}},
    year={2023}
}
Table of Contents
- Installation
- Dataformat
- Dataset Class
- Evaluation
- Visualization
- Additional Unlabeled Data
- Acknowledgements
Installation
The validation, data loading, and visualization code can be installed as a Python package by running:
pip install git+https://github.com/ChristophReich1996/Yeast-in-Microstructures-Dataset.git
All dependencies are listed in requirements.txt.
Dataformat
The dataset is split into a training, validation, and test set. Please refer to the paper for more information on this.
├── test
│   ├── bounding_boxes
│   ├── classes
│   ├── inputs
│   └── instances
├── train
│   ├── bounding_boxes
│   ├── classes
│   ├── inputs
│   └── instances
└── val
    ├── bounding_boxes
    ├── classes
    ├── inputs
    └── instances
Every subset (train, val, and test) includes four folders (inputs, instances, classes, and bounding_boxes).
The inputs folder includes the input images, each with the shape [128, 128]. The instances folder holds the instance
maps of shape [N, 128, 128] (N is the number of instances). The classes folder holds the semantic class information
of each instance as a tensor of shape [N]. The bounding_boxes folder offers axis-aligned bounding boxes for each
instance of shape [N, 4 (x0y0x1y1)]. Every sample of the dataset has a .pt file in each of the four folders.
The .pt file can be loaded directly as a PyTorch Tensor with torch.load(...). For details on the data loading,
please have a look at the dataset class implementation.
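Because every sample is spread over four folders, it can help to check a subset for completeness before loading it. The following is a minimal sketch using only the standard library. It assumes that the four .pt files of one sample share the same file name, which is not stated above; the function name `complete_samples` is hypothetical and not part of yim_dataset.

```python
from pathlib import Path
from typing import List

# Folder names taken from the dataset layout described above
FOLDERS = ("inputs", "instances", "classes", "bounding_boxes")


def complete_samples(subset_dir: str) -> List[str]:
    """Return the sorted .pt file names that are present in all four folders.

    Assumption: the files belonging to one sample share the same file name.
    """
    root = Path(subset_dir)
    names_per_folder = [
        {path.name for path in (root / folder).glob("*.pt")} for folder in FOLDERS
    ]
    return sorted(set.intersection(*names_per_folder))
```

Each returned name could then be loaded per folder with torch.load(...), for example from the inputs folder of the train subset.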
Dataset Class
This repo includes a PyTorch dataset class implementation of the Yeast in Microstructures dataset, located in the
yim_dataset.data module. The dataset class loads the dataset and returns the images, instance maps, bounding boxes,
and semantic classes.
```python
import yim_dataset
from torch import Tensor
from torch.utils.data import Dataset

# Init dataset
dataset: Dataset = yim_dataset.data.YIMDataset(path="/some_path_to_data/train", return_absolute_bounding_box=False)
# Get first sample of the dataset
image, instances, bounding_boxes, class_labels = dataset[0]  # type: Tensor, Tensor, Tensor, Tensor
# Show shapes
print(image.shape)  # [1, 256, 256]
print(instances.shape)  # [N, 256, 256]
print(bounding_boxes.shape)  # [N, 4 (xcycwh, relative format)]
print(class_labels.shape)  # [N, C=2 (trap=0 and cell=1)]
```
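The relative xcycwh bounding-box format shown in the example can also be converted by hand. The following is a minimal sketch in plain Python. It assumes that "relative" means coordinates normalized to [0, 1] by the image width and height, which is not spelled out above; both function names are hypothetical and not part of yim_dataset.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def xcycwh_to_x0y0x1y1(box: Box) -> Box:
    """Convert (center x, center y, width, height) to (x0, y0, x1, y1)."""
    xc, yc, w, h = box
    return (xc - w / 2.0, yc - h / 2.0, xc + w / 2.0, yc + h / 2.0)


def relative_to_absolute(box: Box, height: int, width: int) -> Box:
    """Scale an x0y0x1y1 box from [0, 1] to pixel coordinates.

    Assumption: x values are normalized by the width, y values by the height.
    """
    x0, y0, x1, y1 = box
    return (x0 * width, y0 * height, x1 * width, y1 * height)
```

For a box centered in the image, `xcycwh_to_x0y0x1y1((0.5, 0.5, 0.5, 0.25))` gives the corner format, which `relative_to_absolute` then scales to pixels.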
The dataset class implementation also offers support for custom Kornia data augmentations. You can pass an AugmentationSequential object to the dataset class. The following example utilizes random horizontal and vertical flipping as well as random Gaussian blur augmentations.
```python
import kornia.augmentation
import yim_dataset
from torch.utils.data import Dataset

# Init augmentations
augmentations = kornia.augmentation.AugmentationSequential(
    kornia.augmentation.RandomHorizontalFlip(p=0.5),
    kornia.augmentation.RandomVerticalFlip(p=0.5),
    kornia.augmentation.RandomGaussianBlur(kernel_size=(31, 31), sigma=(9, 9), p=0.5),
    data_keys=["input", "bbox_xyxy", "mask"],
    same_on_batch=False,
)
# Init dataset
dataset: Dataset = yim_dataset.data.YIMDataset(path="/some_path_to_data/train", augmentations=augmentations)
```
Note that it is necessary to pass ["input", "bbox_xyxy", "mask"] as data keys! If a different data key
configuration is given a runtime error is raised.
To wrap the dataset with the PyTorch DataLoader, please use the custom collate function.
```python
from typing import List

import yim_dataset
from torch import Tensor
from torch.utils.data import Dataset, DataLoader

# Init dataset
dataset: Dataset = yim_dataset.data.YIMDataset(path="/some_path_to_data/train", return_absolute_bounding_box=False)
data_loader = DataLoader(
    dataset=dataset,
    num_workers=2,
    batch_size=2,
    drop_last=True,
    collate_fn=yim_dataset.data.collate_function_yim_dataset,
)
# Get a sample from dataloader
images, instances, bounding_boxes, class_labels = next(
    iter(data_loader)
)  # type: Tensor, List[Tensor], List[Tensor], List[Tensor]
# Show shapes (all but the images are lists with one tensor per sample)
print(images.shape)  # [B, 1, 256, 256]
print([i.shape for i in instances])  # list([N, 256, 256])
print([b.shape for b in bounding_boxes])  # list([N, 4 (xcycwh, relative format)])
print([c.shape for c in class_labels])  # list([N, C=2 (trap=0 and cell=1)])
```
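A custom collate function is needed because the number of instances N differs between samples, so the default PyTorch collation cannot stack the instance maps, bounding boxes, and class labels into one tensor. The following is a minimal sketch of the idea, not the implementation shipped in yim_dataset.data; the name `collate_variable_instances` is hypothetical.

```python
from typing import List, Tuple

import torch
from torch import Tensor

Sample = Tuple[Tensor, Tensor, Tensor, Tensor]


def collate_variable_instances(
    batch: List[Sample],
) -> Tuple[Tensor, List[Tensor], List[Tensor], List[Tensor]]:
    """Stack the fixed-size images and keep the per-sample tensors in lists."""
    images, instances, bounding_boxes, class_labels = zip(*batch)
    return (
        torch.stack(images, dim=0),  # [B, 1, H, W]
        list(instances),  # B tensors of shape [N_i, H, W]
        list(bounding_boxes),  # B tensors of shape [N_i, 4]
        list(class_labels),  # B tensors of shape [N_i, C]
    )
```

Only the images share a shape across samples, so only they are stacked; everything that depends on N stays a list with one entry per sample.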
All Dataset Class Parameters
[YIMDataset](yim_dataset/data/dataset.py) parameters:

| Parameter | Default value | Info |
|---|---|---|
| `path: str` | - | Path to dataset as a string. |
| `augmentations: Optional[AugmentationSequential]` | `None` | Augmentations to be used. If `None`, no augmentation is employed. |
| `normalize: bool` | `True` | If true, images are normalized by the given normalization function. |
| `normalization_function: Callable[[Tensor], Tensor]` | `normalize` (0 mean, unit std.) | Normalization function. |
| `return_absolute_bounding_box: bool` | `False` | If true, bounding boxes are returned in absolute format (else relative). |

We provide a full dataset and data loader example in example_eval.py.
If this dataset class implementation is not sufficient for your application, please customize the existing code or open a pull request extending the existing implementation.
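One lightweight customization is a different normalization function, since `normalization_function` accepts any callable that maps a tensor to a tensor. The following sketch illustrates what a zero-mean, unit-standard-deviation normalization might look like. It is an assumption about the general technique, not the `normalize` function shipped with the package, and the name `standardize` is hypothetical.

```python
import torch
from torch import Tensor


def standardize(image: Tensor, eps: float = 1e-8) -> Tensor:
    """Normalize an image to zero mean and unit standard deviation.

    eps avoids a division by zero for constant images.
    """
    image = image.float()
    return (image - image.mean()) / (image.std() + eps)
```

Such a callable could be passed as `normalization_function` when constructing the dataset.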
Evaluation
We propose to validate segmentation predictions on our dataset by using
the Panoptic Quality and the cell class IoU. We implement both metrics as
TorchMetrics metrics in the yim_dataset.eval module. Both metrics (PanopticQuality
and CellIoU) can be used like all TorchMetrics metrics. The input to both metrics is the
prediction, composed of the instance maps (list of tensors) and the semantic class predictions (list of tensors), and the
label, which is also composed of instance maps and semantic classes. Note that the instance maps are not allowed to overlap.
Additionally, both metrics assume thresholded instance maps and hard semantic classes (no logits).
```python
import yim_dataset
from torchmetrics import Metric

pq: Metric = yim_dataset.eval.PanopticQuality()
cell_iou: Metric = yim_dataset.eval.CellIoU()

# `model` and `data_loader` are assumed to be defined beforehand
for index, (images, instances, bounding_boxes, class_labels) in enumerate(data_loader):
    # Make prediction
    instances_pred, bounding_boxes_pred, class_labels_pred = model(
        images
    )  # type: List[Tensor], List[Tensor], List[Tensor]
    # Get semantic classes from one-hot vector
    class_labels = [c.argmax(dim=-1) for c in class_labels]
    class_labels_pred = [c.argmax(dim=-1) for c in class_labels_pred]
    # Compute metrics
    pq.update(
        instances_pred=instances_pred,
        classes_pred=class_labels_pred,
        instances_target=instances,
        classes_target=class_labels,
    )
    cell_iou.update(
        instances_pred=instances_pred,
        classes_pred=class_labels_pred,
        instances_target=instances,
        classes_target=class_labels,
    )
# Compute final metric
print(f"Panoptic Quality: {pq.compute().item()}")
print(f"Cell class IoU: {cell_iou.compute().item()}")
```
A full working example is provided in example_eval.py.
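To make the two metrics concrete, the following plain-Python sketch computes them for a single image, with every instance mask given as a set of (row, column) pixel coordinates. It follows the common definition of Panoptic Quality (segments of the same class match if their IoU exceeds 0.5), but as a simplification it pools all segments into one ratio instead of averaging per class. It is not the TorchMetrics implementation from yim_dataset.eval, and all names are hypothetical.

```python
from typing import List, Set, Tuple

Mask = Set[Tuple[int, int]]  # Set of (row, column) pixel coordinates


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union > 0 else 0.0


def panoptic_quality(pred: List[Mask], pred_classes: List[int],
                     target: List[Mask], target_classes: List[int]) -> float:
    """PQ = (sum of matched IoUs) / (TP + 0.5 * FP + 0.5 * FN)."""
    iou_sum, true_positives = 0.0, 0
    for p_mask, p_class in zip(pred, pred_classes):
        for t_mask, t_class in zip(target, target_classes):
            # With non-overlapping masks, an IoU above 0.5 makes the match unique
            if p_class == t_class and iou(p_mask, t_mask) > 0.5:
                iou_sum += iou(p_mask, t_mask)
                true_positives += 1
                break
    false_positives = len(pred) - true_positives
    false_negatives = len(target) - true_positives
    denominator = true_positives + 0.5 * false_positives + 0.5 * false_negatives
    return iou_sum / denominator if denominator > 0 else 0.0


def cell_class_iou(pred: List[Mask], pred_classes: List[int],
                   target: List[Mask], target_classes: List[int],
                   cell_class: int = 1) -> float:
    """Semantic IoU of the cell class: merge all cell instances, then compare."""
    pred_cells = set().union(*[m for m, c in zip(pred, pred_classes) if c == cell_class])
    target_cells = set().union(*[m for m, c in zip(target, target_classes) if c == cell_class])
    return iou(pred_cells, target_cells)
```

The uniqueness comment in the matching loop is the reason the metrics require non-overlapping instance maps: with overlaps, one target could match several predictions.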
Visualization
This implementation (yim_dataset.vis module) also includes various functions for reproducing the plots from the paper.
The instance segmentation overlay (image + instance maps + BB + classes), as shown at the top, can be achieved by:
```python
import yim_dataset
from torch import Tensor
from torch.utils.data import Dataset

# Init dataset
dataset: Dataset = yim_dataset.data.YIMDataset(path="/some_path_to_data/train", return_absolute_bounding_box=False)
# Get first sample of the dataset
image, instances, bounding_boxes, class_labels = dataset[0]  # type: Tensor, Tensor, Tensor, Tensor
# Plot
yim_dataset.vis.plot_image_instances_bb_classes(
    image=image,
    instances=instances,
    bounding_boxes=yim_dataset.data.bounding_box_xcycwh_to_x0y0x1y1(bounding_boxes),
    class_labels=class_labels.argmax(dim=1),
    save=False,
    show=True,
    show_class_label=True,
)
```
All plot functions take the parameters show: bool and save: bool. If show=True, the plot is displayed directly by
calling plt.show(). If you want to save the plot to a file, set save=True and provide the path and file
name (file_path: str).
An example use of all visualization functions is provided in example_vis.py.
Additional Unlabeled Data
Note that additional unlabeled data from the same domain is also available. In the paper Multi-StyleGAN: Towards Image-Based Simulation of Time-Lapse Live-Cell Microscopy, we proposed an unlabeled dataset of ~9k images (sequences) of yeast cells in microstructures. The dataset is available at TUdatalib. Please cite the following paper if you use the unlabeled images in your research:
@inproceedings{Reich2021,
    title={{Multi-StyleGAN: Towards Image-Based Simulation of Time-Lapse Live-Cell Microscopy}},
    author={Reich, Christoph and Prangemeier, Tim and Wildner, Christian and Koeppl, Heinz},
    booktitle={{International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI)}},
    year={2021},
    organization={Springer}
}
Acknowledgements
We thank Christoph Hoog Antink for insightful discussions, Klaus-Dieter Voss for aid with the microfluidics fabrication, Jan Basrawi for contributing to data labeling, and Robert Sauerborn for aid with setting up the project page.
Credit to TorchMetrics (Lightning AI), Kornia, and PyTorch for providing the basis of this implementation.
This work was supported by the Landesoffensive für wissenschaftliche Exzellenz as part of the LOEWE Schwerpunkt CompuGene. H.K. acknowledges the support from the European Research Council (ERC) with the consolidator grant CONSYN (no. 773196). C.R. acknowledges the support of NEC Laboratories America, Inc.
Owner
- Name: Christoph Reich
- Login: ChristophReich1996
- Kind: user
- Location: Germany
- Company: Technical University of Munich
- Website: christophreich1996.github.io
- Twitter: ChristophR1996
- Repositories: 41
- Profile: https://github.com/ChristophReich1996
ELLIS Ph.D. Student @ Technical University of Munich, Technische Universität Darmstadt & University of Oxford | Prev. NEC Labs
Citation (CITATION.cff)
cff-version: 1.2.0
message: "Code of the paper: An Instance Segmentation Dataset of Yeast Cells in Microstructures"
authors:
  - family-names: Reich
    given-names: Christoph
  - family-names: Prangemeier
    given-names: Tim
  - family-names: Françani
    given-names: André O.
  - family-names: Koeppl
    given-names: Heinz
title: "An Instance Segmentation Dataset of Yeast Cells in Microstructures"
version: 0.1.0
date-released: 2023-04-17
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Committers
Last synced: about 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Christoph Reich | 3****6 | 11 |
| ChristophReich1996 | c****h@g****t | 4 |
| Tim Prangemeier | 4****r | 1 |
Issues and Pull Requests
Last synced: about 2 years ago
All Time
- Total issues: 1
- Total pull requests: 0
- Average time to close issues: 1 day
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 1.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: 1 day
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 1.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- luoolu (1)