COVID-19 Lung Segmentation

COVID-19 Lung Segmentation - Published in JOSS (2021)

https://github.com/riccardobiondi/segmentation

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 11 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: frontiersin.org, mdpi.com, joss.theoj.org, zenodo.org
  • Committers with academic emails
    1 of 3 committers (33.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

covid-19 ct-images lung-regions segmentation

Scientific Fields

Psychology Social Sciences - 40% confidence
Last synced: 4 months ago · JSON representation

Repository

COVID-19 Lung Segmentation

Basic Info
Statistics
  • Stars: 20
  • Watchers: 4
  • Forks: 7
  • Open Issues: 0
  • Releases: 1
Topics
covid-19 ct-images lung-regions segmentation
Created over 5 years ago · Last pushed about 2 years ago
Metadata Files
Readme License Code of conduct Authors

README.md

| Authors | Project | Build Status | License | Code Quality | Coverage |
|:------------:|:-----------:|:-----------------:|:-----------:|:----------------:|:------------:|
| R. Biondi, N. Curti | COVID-19 Lung Segmentation | Windows: Windows CI, Ubuntu: Ubuntu CI | license | Codacy: Codacy Badge, Codebeat: CODEBEAT | codecov |


COVID-19 Lung Segmentation

This package isolates the lung region and identifies ground-glass lesions on chest CT scans of patients affected by COVID-19. The segmentation approach is based on color quantization, performed by K-means clustering. The package provides a series of scripts to isolate the lung regions, pre-process the images, and estimate the K-means centroids and labels of the lung regions.

Overview

COronaVirus Disease (COVID-19) has spread widely all over the world since the beginning of 2020. It is an acute, highly contagious viral infection mainly involving the respiratory system. Chest CT scans of patients affected by this condition have shown peculiar patterns of Ground Glass Opacities (GGO) and Consolidation (CS) related to the severity and the stage of the disease.

In this scenario, the correct and fast identification of these patterns is a fundamental task. Up to now this task has been performed mainly with manual or semi-automatic techniques, which are time-consuming (hours or days) and dependent on the operator's expertise.

This project provides an automatic pipeline for the segmentation of GGO areas on chest CT scans of patients affected by COVID-19. The segmentation is achieved with a color quantization algorithm, based on k-means clustering, which groups voxels by color and texture similarity.
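To make the idea of k-means color quantization concrete, here is a minimal sketch using only NumPy. It is not the package's implementation: it clusters a single intensity value per voxel (the pipeline described above also uses texture), and the function name `kmeans_quantize`, its parameters and the toy values are assumptions made for illustration.

```python
import numpy as np


def kmeans_quantize(values, n_clusters=3, n_iter=20):
    """Cluster intensities with a plain Lloyd's k-means (illustrative only)."""
    values = np.asarray(values, dtype=float).ravel()
    # Deterministic initialisation: spread the centroids over the value range.
    centroids = np.linspace(values.min(), values.max(), n_clusters)
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for every voxel.
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        # Update step: move each centroid to the mean of its assigned voxels.
        for k in range(n_clusters):
            if np.any(labels == k):
                centroids[k] = values[labels == k].mean()
    labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
    return centroids, labels


# Toy "image": three intensity groups standing in for different tissues.
image = np.array([[-900, -880, -910], [-500, -520, -480], [30, 50, 40]])
centroids, labels = kmeans_quantize(image, n_clusters=3)
# Quantized image: every voxel is replaced by the centroid of its cluster.
quantized = centroids[labels].reshape(image.shape)
```

Once the centroids are estimated, labeling a new image only requires the assignment step, which is why a centroid set can be trained once and reused.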

Example of segmentation. Left: original image. Right: original image with identified ground-glass areas.

The pipeline was tested on 15 labeled chest CT scans, manually segmented by expert radiologists. The goodness of the segmentation was estimated using the Dice (0.67 ± 0.12), Sensitivity (0.666 ± 0.15), Specificity (0.9993 ± 0.0005) and Precision (0.75 ± 0.20) scores.

These results make the pipeline suitable as an initialization for more accurate methods.

Contents

COVID-19 Lung Segmentation is composed of scripts and modules:

  • scripts isolate the lung regions, find the centroids for colour quantization and segment the images.
  • modules load and save images from and to different formats and perform operations on image series.

To refer to the script documentation:

| Script | Description |
|:----------:|:---------------:|
| lung_extraction | Extract the lungs from CT scans |
| train | Apply colour quantization to a series of stacks to estimate the centroids to use for segmentation |
| labeling | Segment the input image using pre-estimated centroids or a user-provided set |
| evaluate | Compute metrics to evaluate the prediction against a ground truth |

To refer to the module documentation:

| Module | Description |
|:---------:|:--------------:|
| utils | methods to load, save and preprocess stacks |
| method | methods to filter the image tensor |
| segmentation | useful functions to segment stacks of images and select a ROI |
| metrics | implementation of the evaluation metrics |

For each script listed above, a PowerShell script and a shell script allow its execution on the scans of multiple patients. A snakemake pipeline is also provided.

Prerequisites

Supported Python version: Python version. Python 3.5, 3.6 and 3.7 are also supported but not tested.

First of all, make sure that the right Python version is installed.

This package uses opencv-python, numpy and SimpleITK: see requirements for more information.

The lung extraction is performed using a pre-trained UNet, so please make sure that the lungmask package is installed. For more information about how the network is trained, please refer to https://doi.org/10.1186/s41747-020-00173-2.

:warning: The OpenCV requirement binds the minimum Python version of this project to Python 3.5!

To run the tests you need to install PyTest and Hypothesis. Installation instructions are available at: PyTest, Hypothesis

Installation

Download the project or the latest release:

```bash
git clone https://github.com/RiccardoBiondi/segmentation
```

Now you can install the package using pip:

```bash
pip install segmentation/
```

Testing

Testing routines use the PyTest and Hypothesis packages. Please install these packages to run the tests. To install the package in development mode you also need to add these requirements:

  • pytest >= 3.0.7

  • hypothesis >= 4.13.0

:warning: pytest versions above 6.1.2 are not supported by python 3.5

A full set of tests is provided in the testing directory. You can run the full list of tests with:

```bash
python -m pytest
```

Usage

This module provides scripts to segment a single scan, to automate the segmentation for multiple patients and to train your own centroid set. The following paragraphs show how to use all the features, using as an example the public COVID-19 CT Lung and Infection Segmentation Dataset, published on Zenodo [5].

Download Data

First, we have to download and prepare the data. All the data will be stored and organized in a folder named Examples.

Download data into the Examples folder

using Bash:

```bash
$ mkdir Examples
$ wget https://zenodo.org/record/3757476/files/COVID-19-CT-Seg_20cases.zip -P ./Examples
$ unzip ./Examples/COVID-19-CT-Seg_20cases.zip -d ./Examples/COVID-19-CT
```

Or PowerShell:

```PowerShell
PS > New-Item -Path . -Name "Examples" -ItemType "directory"
PS > Start-BitsTransfer -Source https://zenodo.org/record/3757476/files/COVID-19-CT-Seg_20cases.zip -Destination .\Examples\
PS > Expand-Archive -LiteralPath .\Examples\COVID-19-CT-Seg_20cases.zip -DestinationPath .\Examples\COVID-19-CT -Force
```

Single Scan

Once you have downloaded the data and installed the module, you can start segmenting the images. Input CT scans must be in Hounsfield units (HU); grey-scale images are not allowed. The allowed input formats are the ones supported by SimpleITK. If the input is a DICOM series, pass the path to the directory containing the series files, and make sure that the folder contains only one series. The segmentation is saved as an nrrd file.

To segment a single CT scan, run the following from bash or PowerShell:

```bash
python -m CTLungSeg --input='./Examples/COVID-19-CT/coronacases_003.nii.gz' --output='./Examples/coronacases_003_label.nrrd'
```

Multiple Scans

To segment multiple patients, you have to repeat the segmentation process many times. We have automated this process using bash (for Linux) and PowerShell (for Windows) scripts. We also provide a snakemake pipeline for the whole segmentation procedure in a multi-processing environment. The following paragraphs explain how to organize your data to benefit from this automation.
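The repetition described above can also be sketched in Python. The helper below is only an illustration and not the logic of the provided scripts: it builds one single-scan command per file, reusing the `--input` and `--output` flags of the `CTLungSeg` command from the Single Scan section. The function name `build_commands`, the `.nii.gz` suffix filter and the `_label.nrrd` naming are assumptions.

```python
import sys
import tempfile
from pathlib import Path


def build_commands(input_dir, output_dir, suffix=".nii.gz"):
    """Return one single-scan command (as an argv list) per scan in input_dir."""
    commands = []
    for scan in sorted(Path(input_dir).glob(f"*{suffix}")):
        # coronacases_002.nii.gz -> coronacases_002_label.nrrd (assumed naming)
        label = Path(output_dir) / f"{scan.name[:-len(suffix)]}_label.nrrd"
        commands.append([sys.executable, "-m", "CTLungSeg",
                         f"--input={scan}", f"--output={label}"])
    return commands


# Demonstration on empty placeholder files: nothing is actually segmented.
with tempfile.TemporaryDirectory() as tmp:
    in_dir = Path(tmp) / "INPUT"
    in_dir.mkdir()
    for name in ("coronacases_002.nii.gz", "coronacases_005.nii.gz"):
        (in_dir / name).touch()
    commands = build_commands(in_dir, Path(tmp) / "OUTPUT")
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to launch the segmentation of that scan.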

Script

To run the scripts, you have to organize the data into three folders:

  • input folder: contains all and only the CT scans to segment
  • temporary folder: empty folder, will contain the scans after the lung segmentation
  • output folder: empty folder, will contain the label files
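Before launching the scripts it can help to verify the three-folder layout described above. The following sketch is an assumption-laden helper, not part of the package: the name `check_layout` and the list of accepted file suffixes are hypothetical, and the suffix list should be adapted to the formats you actually use.

```python
import tempfile
from pathlib import Path


def check_layout(input_dir, temp_dir, output_dir,
                 suffixes=(".nii", ".nii.gz", ".nrrd")):
    """Return a list of problems found in the folder layout (empty if none)."""
    problems = []
    input_dir = Path(input_dir)
    entries = sorted(input_dir.iterdir()) if input_dir.is_dir() else []
    if not entries:
        problems.append("input folder is missing or empty")
    for entry in entries:
        # The input folder must contain all and only the scans to segment.
        if not entry.name.endswith(suffixes):
            problems.append(f"unexpected entry in input folder: {entry.name}")
    for label, folder in (("temporary", temp_dir), ("output", output_dir)):
        folder = Path(folder)
        if not folder.is_dir():
            problems.append(f"{label} folder does not exist")
        elif any(folder.iterdir()):
            problems.append(f"{label} folder is not empty")
    return problems


# Demonstration on a throwaway directory tree with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("INPUT", "LUNG", "OUTPUT"):
        (root / name).mkdir()
    (root / "INPUT" / "coronacases_002.nii.gz").touch()
    ok_report = check_layout(root / "INPUT", root / "LUNG", root / "OUTPUT")
    (root / "OUTPUT" / "old_label.nrrd").touch()
    bad_report = check_layout(root / "INPUT", root / "LUNG", root / "OUTPUT")
```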

As an example, we will segment the coronacases_002 and coronacases_005 patients.

From bash:

```bash
$ mkdir ./Examples/INPUT
$ mkdir ./Examples/LUNG
$ mkdir ./Examples/OUTPUT
$ mv ./Examples/COVID-19-CT/coronacases_002.nii.gz ./Examples/COVID-19-CT/coronacases_005.nii.gz ./Examples/INPUT
```

or from PowerShell:

```PowerShell
PS > New-Item -Path "Examples" -Name "INPUT" -ItemType "directory"
PS > New-Item -Path "Examples" -Name "LUNG" -ItemType "directory"
PS > New-Item -Path "Examples" -Name "OUTPUT" -ItemType "directory"
PS > Move-Item -Path "Examples\COVID-19-CT\coronacases_002.nii.gz" -Destination "Examples\INPUT"
PS > Move-Item -Path "Examples\COVID-19-CT\coronacases_005.nii.gz" -Destination "Examples\INPUT"
```

Now you can proceed with the lung segmentation. To do so, run the following script from PowerShell:

```PowerShell
PS > ./lung_extraction.ps1 ./Examples/INPUT ./Examples/LUNG
```

Or its equivalent bash version:

```bash
$ ./lung_extraction.sh ./Examples/INPUT ./Examples/LUNG
```

Once you have successfully isolated the lungs, you are ready to perform the GGO segmentation. Run the labeling script from PowerShell:

```PowerShell
PS > ./labeling.ps1 ./Examples/LUNG ./Examples/OUTPUT
```

Or its corresponding bash version:

```bash
$ ./labeling.sh ./Examples/LUNG ./Examples/OUTPUT
```

Train your own centroid set

It is possible to train your own centroid set instead of using the pre-trained one.

In this case you have to prepare these folders:

  • TRAIN: will contain the scans in the training set
  • TLUNG: will store the scans after lung extraction

We will use coronacases_003 and coronacases_008 as the training set.

From bash:

```bash
$ mkdir ./Examples/TRAIN
$ mkdir ./Examples/TLUNG
$ mv ./Examples/COVID-19-CT/coronacases_003.nii.gz ./Examples/COVID-19-CT/coronacases_008.nii.gz ./Examples/TRAIN
```

or PowerShell:

```PowerShell
PS > New-Item -Path ".\Examples" -Name "TRAIN" -ItemType "directory"
PS > New-Item -Path ".\Examples" -Name "TLUNG" -ItemType "directory"
PS > Move-Item -Path ".\Examples\COVID-19-CT\coronacases_003.nii.gz" -Destination "Examples\TRAIN"
PS > Move-Item -Path ".\Examples\COVID-19-CT\coronacases_008.nii.gz" -Destination "Examples\TRAIN"
```

First of all, you have to perform the lung extraction on the training scans. As before, run:

```bash
$ ./lung_extraction.sh ./Examples/TRAIN/ ./Examples/TLUNG/
```

or its corresponding PowerShell version. Now, to estimate the centroid set, run:

```bash
$ ./train.sh ./Examples/TLUNG/ ./centroid.pkl.npy
```

or its corresponding PowerShell version.

Snakemake

If you have not installed snakemake, you can find the instructions here. To use the snakemake pipeline, you have to create two folders:

  • INPUT: contains all and only the CT scans to segment
  • OUTPUT: empty folder, will contain the segmented scans as nrrd

As before, we will use the coronacases_002 and coronacases_005 patients as examples.

:notes: If you have already run the script version, these folders are ready.

Execute from bash:

```bash
$ mkdir ./Examples/INPUT
$ mkdir ./Examples/OUTPUT
$ mv ./Examples/COVID-19-CT/coronacases_002.nii.gz ./Examples/COVID-19-CT/coronacases_005.nii.gz ./Examples/INPUT
```

or PowerShell:

```PowerShell
PS > New-Item -Path "Examples" -Name "INPUT" -ItemType "directory"
PS > New-Item -Path "Examples" -Name "OUTPUT" -ItemType "directory"
PS > Move-Item -Path ".\Examples\COVID-19-CT\coronacases_002.nii.gz" -Destination "Examples\INPUT"
PS > Move-Item -Path ".\Examples\COVID-19-CT\coronacases_005.nii.gz" -Destination "Examples\INPUT"
```

Now, from command line, execute:

```bash
snakemake --cores 1 --config input_path='./Examples/INPUT/' output_path='./Examples/OUTPUT/'
```

:notes: This command works both for Bash and Powershell

:warning: It will create a folder named LUNG inside the INPUT, which contains the results of the lung extraction step.

Train Your Centroids

As before, you can decide to train your own centroid set. To do so with the snakemake pipeline, you have to prepare three folders:

  • INPUT: will contain all the scans to segment
  • OUTPUT: will contain the segmented scans
  • TRAIN: will contain all the scans of the training set (NOTE: it cannot be the INPUT folder)

:warning: The INPUT and TRAIN folders cannot be the same.

:notes: This will train the centroid set and then perform the segmentation on the scans in the INPUT folder, so the INPUT folder is organized as before.

Now run Snakemake with the following configuration parameters:

```bash
snakemake --cores 1 --config input_path='./Examples/INPUT/' output_path='./Examples/OUTPUT/' train_path='./Examples/TRAIN/' centroid_path='./Examples/centroids.pkl.npy'
```
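The rule that INPUT and TRAIN must differ can be checked before the pipeline is launched. The sketch below assembles the Snakemake command from the configuration keys shown above (`input_path`, `output_path`, `train_path`, `centroid_path`) and refuses identical folders. The helper name `snakemake_command` is hypothetical, and the check is an assumption about how you might guard the call, not something the pipeline is known to do itself.

```python
import os


def snakemake_command(input_path, output_path, train_path=None,
                      centroid_path=None, cores=1):
    """Build the snakemake argv list, rejecting TRAIN == INPUT."""
    if train_path is not None:
        # abspath also normalises trailing slashes and "./" prefixes.
        if os.path.abspath(train_path) == os.path.abspath(input_path):
            raise ValueError("INPUT and TRAIN folders cannot be the same")
    config = [f"input_path={input_path}", f"output_path={output_path}"]
    if train_path is not None:
        config.append(f"train_path={train_path}")
    if centroid_path is not None:
        config.append(f"centroid_path={centroid_path}")
    return ["snakemake", "--cores", str(cores), "--config"] + config


command = snakemake_command(
    "./Examples/INPUT/", "./Examples/OUTPUT/",
    train_path="./Examples/TRAIN/",
    centroid_path="./Examples/centroids.pkl.npy",
)

# The same folder given twice (with or without a trailing slash) is rejected.
try:
    snakemake_command("./Examples/INPUT", "./Examples/OUTPUT/",
                      train_path="./Examples/INPUT/")
    rejected = False
except ValueError:
    rejected = True
```

The resulting list can be handed to `subprocess.run(command, check=True)` from the directory that contains the Snakefile.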

Evaluation

This project also provides a script to evaluate the goodness of the segmentation against the ground truth. The evaluation is carried out with different metrics: Dice Coefficient, Sensitivity, Recall, Precision and Accuracy. To run the evaluation procedure, run the following command from bash or PowerShell:

```bash
python -m CTLungSeg.evaluate --gt='/Path/To/GroundTruth.nii' --pred='/Path/To/Prediction.nii'
```

This will print the achieved results on the command line. To store the results in a comma-separated CSV file, use the following command from bash or PowerShell:

```bash
python -m CTLungSeg.evaluate --gt='/Path/To/GroundTruth.nii' --pred='/Path/To/Prediction.nii' --output='/Path/To/Output.csv'
```

Notice that the ground truth and the prediction must have the same shape. The images will be evaluated as binary images with a background value of 0.
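As an illustration of what such an evaluation computes, the sketch below derives the usual overlap scores from two masks that are binarized against a background of 0 and required to share the same shape. It uses the standard textbook definitions and is not the code of `CTLungSeg.evaluate`; the function name `binary_scores` and the toy arrays are assumptions.

```python
import numpy as np


def binary_scores(gt, pred):
    """Overlap scores between two masks, treating every non-zero voxel as foreground."""
    gt = np.asarray(gt) != 0
    pred = np.asarray(pred) != 0
    if gt.shape != pred.shape:
        raise ValueError("ground truth and prediction must have the same shape")
    tp = np.count_nonzero(gt & pred)     # foreground in both
    tn = np.count_nonzero(~gt & ~pred)   # background in both
    fp = np.count_nonzero(~gt & pred)    # predicted but not in ground truth
    fn = np.count_nonzero(gt & ~pred)    # missed by the prediction

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }


# Toy masks: any non-zero label (here 1 or 2) counts as foreground.
gt = np.array([[0, 1, 1], [0, 2, 0]])
pred = np.array([[0, 1, 0], [1, 1, 0]])
scores = binary_scores(gt, pred)
```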

License

The COVID-19 Lung Segmentation package is licensed under the MIT "Expat" License.

Contribution

Any contribution is more than welcome. Just file an issue or open a pull request and we will check it ASAP!

See here for further information about how to contribute to this project.

References

1- Hofmanninger, J., Prayer, F., Pan, J. et al. Automatic lung segmentation in routine imaging is primarily a data diversity problem, not a methodology problem. Eur Radiol Exp 4, 50 (2020). https://doi.org/10.1186/s41747-020-00173-2.
2- Bradski, G. (2000). The OpenCV Library. Dr. Dobb's Journal of Software Tools.
3- Yaniv, Z., Lowekamp, B.C., Johnson, H.J. et al. SimpleITK Image-Analysis Notebooks: a Collaborative Environment for Education and Reproducible Research. J Digit Imaging 31, 290–303 (2018). https://doi.org/10.1007/s10278-017-0037-8.
4- Lowekamp, B., Chen, D., Ibanez, L., Blezek, D. The Design of SimpleITK. Frontiers in Neuroinformatics 7, 45 (2013). https://www.frontiersin.org/article/10.3389/fninf.2013.00045.
5- Ma Jun, Ge Cheng, Wang Yixin, An Xingle, Gao Jiantao, Yu Ziqi, Zhang Minqing, Liu Xin, Deng Xueyuan, Cao Shucheng, Wei Hao, Mei Sen, Yang Xiaoyu, Nie Ziwei, Li Chen, Tian Lu, Zhu Yuntao, Zhu Qiongjie, Dong Guoqiang, & He Jian. (2020). COVID-19 CT Lung and Infection Segmentation Dataset (Version 1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.3757476.

Authors

See also the list of contributors who participated in this project.

Acknowledgments

The authors acknowledge all the members of the Department of Radiology, IRCCS Azienda Ospedaliero-Universitaria di Bologna and the SIRM foundation, Italian Society of Medical and Interventional Radiology for the support in the development of the project and analysis of the data.

Citation

If you have found COVID-19 Lung Segmentation helpful in your research, please consider citing the original paper

```BibTeX
@article{app11125438,
  author = {Biondi, Riccardo and Curti, Nico and Coppola, Francesca and Giampieri, Enrico and Vara, Giulio and Bartoletti, Michele and Cattabriga, Arrigo and Cocozza, Maria Adriana and Ciccarese, Federica and De Benedittis, Caterina and Cercenelli, Laura and Bortolani, Barbara and Marcelli, Emanuela and Pierotti, Luisa and Strigari, Lidia and Viale, Pierluigi and Golfieri, Rita and Castellani, Gastone},
  title = {Classification Performance for COVID Patient Prognosis from Automatic AI Segmentation—A Single-Center Study},
  journal = {Applied Sciences},
  volume = {11},
  year = {2021},
  number = {12},
  article-number = {5438},
  url = {https://www.mdpi.com/2076-3417/11/12/5438},
  issn = {2076-3417},
  doi = {10.3390/app11125438}
}
```

or just this project

```BibTeX
@misc{COVID-19 Lung Segmentation,
  author = {Biondi, Riccardo and Curti, Nico and Giampieri, Enrico and Castellani, Gastone},
  title = {COVID-19 Lung Segmentation},
  year = {2020},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/RiccardoBiondi/segmentation}},
}
```

Owner

  • Login: RiccardoBiondi
  • Kind: user

PhD student at the University of Bologna. I am currently focusing on medical image segmentation

JOSS Publication

COVID-19 Lung Segmentation
Published
September 30, 2021
Volume 6, Issue 65, Page 3447
Authors
Riccardo Biondi
Department of Experimental, Diagnostic and Specialty Medicine of Bologna University
Nico Curti ORCID
eDIMESLab, Department of Experimental, Diagnostic and Specialty Medicine of Bologna University
Enrico Giampieri ORCID
eDIMESLab, Department of Experimental, Diagnostic and Specialty Medicine of Bologna University
Gastone Castellani ORCID
Department of Experimental, Diagnostic and Specialty Medicine of Bologna University
Editor
Jacob Schreiber ORCID
Tags
radiomics artificial-intelligence machine-learning deep-learning medical-imaging chest-CT python3

GitHub Events

Total
  • Watch event: 2
Last Year
  • Watch event: 2

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 189
  • Total Committers: 3
  • Avg Commits per committer: 63.0
  • Development Distribution Score (DDS): 0.011
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
RiccardoBiondi r****4@s****t 187
Diedre Carmo c****e@o****m 1
Daniel S. Katz d****z@i****g 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 5
  • Total pull requests: 3
  • Average time to close issues: 2 months
  • Average time to close pull requests: 6 days
  • Total issue authors: 2
  • Total pull request authors: 3
  • Average comments per issue: 2.8
  • Average comments per pull request: 0.33
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • accidentul (3)
  • dscarmo (2)
Pull Request Authors
  • codacy-badger (1)
  • dscarmo (1)
  • danielskatz (1)
Top Labels
Issue Labels
question (2) help wanted (1)
Pull Request Labels

Dependencies

docs/requirements.txt pypi
  • IPython *
  • SimpleITK *
  • matplotlib *
  • nbsphinx ==0.8.7
  • numpy >=1.17
  • opencv-python *
  • sphinx ==4.1.2
  • sphinx-rtd-theme *
  • sphinxcontrib-napoleon *
  • sphinxcontrib-programoutput *
  • tdqm *
requirements.txt pypi
  • SimpleITK *
  • numpy >=1.17
  • opencv-python *
  • tdqm *