COVID-19 Lung Segmentation

COVID-19 Lung Segmentation - Published in JOSS (2021)

https://github.com/riccardobiondi/segmentation

Keywords

covid-19 ct-images lung-regions segmentation

Scientific Fields

Psychology Social Sciences - 40% confidence

Last synced: 6 months ago

Repository

COVID-19 Lung Segmentation

Basic Info

Host: GitHub
Owner: RiccardoBiondi
License: other
Language: Python
Default Branch: master
Homepage: https://covid-19-ggo-segmentation.readthedocs.io/en/latest/?badge=latest
Size: 59.1 MB

Statistics

Stars: 20
Watchers: 4
Forks: 7
Open Issues: 0
Releases: 1

Topics

covid-19 ct-images lung-regions segmentation

Created over 5 years ago · Last pushed over 2 years ago

Metadata Files

Readme License Code of conduct Authors

COVID-19 Lung Segmentation

This package isolates the lung region and identifies ground-glass lesions on chest CT scans of patients affected by COVID-19. The segmentation approach is based on color quantization, performed by K-means clustering. The package provides a series of scripts to isolate the lung regions, pre-process the images, and estimate the K-means centroids and labels of the lung regions.

Overview

COronaVirus Disease (COVID-19) has spread widely all over the world since the beginning of 2020. It is an acute, highly contagious viral infection that mainly involves the respiratory system. Chest CT scans of patients affected by this condition have shown peculiar patterns of Ground Glass Opacities (GGO) and Consolidation (CS) related to the severity and the stage of the disease.

In this scenario, the correct and fast identification of these patterns is a fundamental task. Up to now, this task has been performed mainly with manual or semi-automatic techniques, which are time-consuming (hours or days) and dependent on the operator's expertise.

This project provides an automatic pipeline for the segmentation of GGO areas on chest CT scans of patients affected by COVID-19. The segmentation is achieved with a color quantization algorithm, based on k-means clustering, that groups voxels by color and texture similarity.
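The idea of color quantization by k-means can be illustrated with a minimal NumPy sketch. This is not the package's implementation: the function name `kmeans_1d`, the quantile-based initialization, and the synthetic intensities are assumptions made for illustration, and the sketch clusters on intensity alone rather than on the color and texture features mentioned above.

```python
import numpy as np


def kmeans_1d(values, k, n_iter=100):
    """Cluster intensities into k groups with plain Lloyd's k-means."""
    values = np.asarray(values, dtype=float).ravel()
    # Deterministic start (an assumption): spread the centroids over the quantiles.
    centroids = np.quantile(values, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        # Assign every voxel to its nearest centroid.
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        # Move each centroid to the mean of its members (empty clusters stay put).
        updated = centroids.copy()
        for j in range(k):
            members = values[labels == j]
            if members.size:
                updated[j] = members.mean()
        if np.allclose(updated, centroids):
            break
        centroids = updated
    labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
    return centroids, labels


# Synthetic "slice" with three intensity groups (illustrative values only).
rng = np.random.default_rng(0)
image = np.concatenate([
    rng.normal(-850.0, 20.0, 200),
    rng.normal(-500.0, 20.0, 200),
    rng.normal(50.0, 20.0, 200),
]).reshape(20, 30)

centroids, labels = kmeans_1d(image, k=3)
# Quantized image: every voxel is replaced by the value of its centroid.
quantized = centroids[labels].reshape(image.shape)
```

Each voxel in `quantized` takes one of only `k` values, so a lesion class can then be picked out by selecting the corresponding label.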

Example of segmentation. Left: original image. Right: original image with identified ground-glass areas.

The pipeline was tested on 15 labeled chest CT scans, manually segmented by expert radiologists. The quality of the segmentation was estimated using Dice (0.67 ± 0.12), Sensitivity (0.666 ± 0.15), Specificity (0.9993 ± 0.0005) and Precision (0.75 ± 0.20) scores.

These results make the pipeline suitable as an initialization for more accurate methods.

For details on the individual modules, refer to the modules documentation.

For each script described below, a PowerShell script and a shell script allow its execution on the scans of multiple patients. A snakemake pipeline is also provided.

Prerequisites

Supported Python version: . Python 3.5, 3.6 and 3.7 are also supported but not tested.

First, make sure the right Python version is installed.

The scripts use opencv-python, numpy and SimpleITK: see requirements for more information.

The lung extraction is performed with a pre-trained UNet, so make sure the lungmask package is installed. For more information about how the network was trained, refer to https://doi.org/10.1186/s41747-020-00173-2.

:warning: The OpenCV requirement binds the minimum Python version of this project to Python 3.5!

To run the tests you need to install PyTest and Hypothesis. Installation instructions are available at: PyTest, Hypothesis

Installation

Download the project or the latest release:

git clone https://github.com/RiccardoBiondi/segmentation

Now you can install the package using pip:

pip install segmentation/

Testing

Testing routines use the PyTest and Hypothesis packages. Please install these packages to run the tests. To install the package in development mode, you also need these requirements:

pytest >= 3.0.7
hypothesis >= 4.13.0

:warning: pytest versions above 6.1.2 are not supported by python 3.5

A full set of tests is provided in the testing directory. You can run all the tests with:

python -m pytest

Usage

This module provides scripts to segment a single scan, to automate the segmentation for multiple patients, and to train your own centroid set. The following paragraphs show how to use all these features. As an example, we will use the public dataset COVID-19 CT Lung and Infection Segmentation Dataset, published on Zenodo [5].

Download Data

First, we have to download and prepare the data. All the data will be stored and organized in a folder named Examples.

Download data into the Examples folder

using Bash:

$ mkdir Examples
$ wget https://zenodo.org/record/3757476/files/COVID-19-CT-Seg_20cases.zip -P ./Examples
$ unzip ./Examples/COVID-19-CT-Seg_20cases.zip -d ./Examples/COVID-19-CT

Or PowerShell:

```PowerShell
PS > New-Item -Path . -Name "Examples" -ItemType "directory"
PS > Start-BitsTransfer -Source https://zenodo.org/record/3757476/files/COVID-19-CT-Seg_20cases.zip -Destination .\Examples\
PS > Expand-Archive -LiteralPath .\Examples\COVID-19-CT-Seg_20cases.zip -DestinationPath .\Examples\COVID-19-CT -Force
```

Single Scan

Once you have downloaded the data and installed the module, you can start to segment the images. Input CT scans must be in Hounsfield units (HU); grey-scale images are not allowed. The allowed input formats are the ones supported by SimpleITK. If the input is a DICOM series, pass the path to the directory containing the series files, and make sure that the folder contains only one series. The segmentation is saved as an nrrd file.
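If a scan is available only as raw stored values, the standard DICOM linear rescale maps them to HU. The sketch below illustrates that mapping; the helper name `to_hounsfield` and the default slope and intercept are illustrative assumptions, and the real values must come from the scan header (the DICOM Rescale Slope and Rescale Intercept attributes). It is not part of the package.

```python
import numpy as np


def to_hounsfield(stored, slope=1.0, intercept=-1024.0):
    """Apply the linear DICOM rescale: HU = slope * stored + intercept.

    The defaults are example values only; read the actual slope and
    intercept from the header of the scan being converted.
    """
    return np.asarray(stored, dtype=np.float32) * slope + intercept


# With the example defaults, a stored value of 1024 maps to 0 HU (water).
hu = to_hounsfield([0, 1024, 2048])
```

A quick sanity check on a converted chest CT is that air falls near -1000 HU and water near 0 HU.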

To segment a single CT scan, run the following from bash or PowerShell:

python -m CTLungSeg --input='./Examples/COVID-19-CT/coronacases_003.nii.gz' --output='./Examples/coronacases_003_label.nrrd'

Multiple Scans

To segment multiple patients, you have to repeat the segmentation process many times. We have automated this process with bash (for Linux) and PowerShell (for Windows) scripts. We also provide a snakemake pipeline that runs the whole segmentation procedure in a multi-processing environment. The following paragraphs explain how to organize your data to benefit from this automation.

Script

To run the scripts, you have to organize the data into three folders:

input folder: contains all and only the CT scans to segment
temporary folder: empty folder, will contain the scans after the lung segmentation
output folder: empty folder, will contain the label files

As an example, we will segment the coronacases_002 and coronacases_005 patients.

From bash:

$ mkdir ./Examples/INPUT
$ mkdir ./Examples/LUNG
$ mkdir ./Examples/OUTPUT
$ mv ./Examples/COVID-19-CT/coronacases_002.nii.gz ./Examples/COVID-19-CT/coronacases_005.nii.gz ./Examples/INPUT

or from PowerShell:

PS > New-Item -Path "Examples" -Name "INPUT" -ItemType "directory"
PS > New-Item -Path "Examples" -Name "LUNG" -ItemType "directory"
PS > New-Item -Path "Examples" -Name "OUTPUT" -ItemType "directory"
PS > Move-Item -Path "Examples\COVID-19-CT\coronacases_002.nii.gz" -Destination "Examples\INPUT"
PS > Move-Item -Path "Examples\COVID-19-CT\coronacases_005.nii.gz" -Destination "Examples\INPUT"

Now you can proceed with the lung segmentation. To achieve this purpose run from PowerShell the script:

PS > ./lung_extraction.ps1 ./Examples/INPUT ./Examples/LUNG

Or its equivalent bash version:

$ ./lung_extraction.sh ./Examples/INPUT ./Examples/LUNG

Once you have successfully isolated the lungs, you are ready to perform the GGO segmentation. Run the labeling script from PowerShell:

PS > ./labeling.ps1 ./Examples/LUNG ./Examples/OUTPUT

Or its corresponding bash version:

$ ./labeling.sh ./Examples/LUNG ./Examples/OUTPUT
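The same batch idea can be sketched in Python for readers who prefer a single cross-platform driver. This is an illustrative alternative, not one of the provided scripts: it repeats the single-scan command shown earlier (`python -m CTLungSeg --input=... --output=...`) for every file in the input folder instead of running the two-step lung extraction and labeling. The helper names `build_commands` and `run_all`, the `*.nii.gz` pattern, and the `_label.nrrd` suffix are assumptions.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(input_dir, output_dir, pattern="*.nii.gz"):
    """Return one single-scan CTLungSeg command per file matching `pattern`."""
    commands = []
    for scan in sorted(Path(input_dir).glob(pattern)):
        # coronacases_002.nii.gz -> coronacases_002_label.nrrd (naming is an assumption)
        if scan.name.endswith(".nii.gz"):
            name = scan.name[: -len(".nii.gz")]
        else:
            name = scan.stem
        label = Path(output_dir) / f"{name}_label.nrrd"
        commands.append([
            sys.executable, "-m", "CTLungSeg",
            f"--input={scan}", f"--output={label}",
        ])
    return commands


def run_all(input_dir, output_dir):
    """Run the commands one after another, stopping at the first failure."""
    for command in build_commands(input_dir, output_dir):
        subprocess.run(command, check=True)
```

Calling `run_all("./Examples/INPUT", "./Examples/OUTPUT")` would process the two example patients, provided the package is installed in the active Python environment.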

Train your own centroid set

It is possible to train your centroid set instead of using the pre-trained one.

In this case, you have to prepare these folders:

TRAIN: will contain the scans in the training set
TLUNG: will store the scans after lung extraction

We will use coronacases_003 and coronacases_008 as the training set.

From bash:

$ mkdir ./Examples/TRAIN
$ mkdir ./Examples/TLUNG
$ mv ./Examples/COVID-19-CT/coronacases_003.nii.gz ./Examples/COVID-19-CT/coronacases_008.nii.gz ./Examples/TRAIN

or PowerShell:

PS > New-Item -Path ".\Examples" -Name "TRAIN" -ItemType "directory"
PS > New-Item -Path ".\Examples" -Name "TLUNG" -ItemType "directory"
PS > Move-Item -Path ".\Examples\COVID-19-CT\coronacases_003.nii.gz" -Destination "Examples\TRAIN"
PS > Move-Item -Path ".\Examples\COVID-19-CT\coronacases_008.nii.gz" -Destination "Examples\TRAIN"

First, perform the lung extraction on the training scans. As before, run:

$ ./lung_extraction.sh ./Examples/TRAIN/ ./Examples/TLUNG/

or its corresponding PowerShell version. Now, to estimate the centroid set, run:

$ ./train.sh ./Examples/TLUNG/ ./centroid.pkl.npy

or its corresponding PowerShell version.
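Once a centroid set exists, labeling a new scan reduces to a nearest-centroid assignment. The sketch below shows that step on toy feature vectors. The function name `assign_labels`, the two-feature layout, and the numeric values are assumptions for illustration; how the package stores and loads its centroid file is not covered here.

```python
import numpy as np


def assign_labels(voxels, centroids):
    """Label each voxel (one feature row each) with its nearest centroid index."""
    voxels = np.asarray(voxels, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # Euclidean distances, shape (n_voxels, n_centroids).
    distances = np.linalg.norm(voxels[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(distances, axis=1)


# Toy centroid set with two features per voxel (hypothetical values).
toy_centroids = [[0.0, 0.0], [10.0, 10.0]]
toy_voxels = [[1.0, 1.0], [9.0, 8.0], [0.0, 2.0]]
toy_labels = assign_labels(toy_voxels, toy_centroids)
```

Because the centroids are fixed, the same assignment can be applied to any number of scans without re-running the clustering.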

Snakemake

If you have not installed snakemake, you can find the instruction here. To use the snakemake pipeline, you have to create two folders:

INPUT : contains all and only the CT scans to segment
OUTPUT : empty folder, will contain the segmented scans as nrrd.

As before, we will use the coronacases_002 and coronacases_005 patients as examples.

:notes: If you have already run the script version, these folders are ready.

Execute from bash

$ mkdir ./Examples/INPUT
$ mkdir ./Examples/OUTPUT
$ mv ./Examples/COVID-19-CT/coronacases_002.nii.gz ./Examples/COVID-19-CT/coronacases_005.nii.gz ./Examples/INPUT

or PowerShell

```PowerShell
PS > New-Item -Path "Examples" -Name "INPUT" -ItemType "directory"
PS > New-Item -Path "Examples" -Name "OUTPUT" -ItemType "directory"
PS > Move-Item -Path ".\Examples\COVID-19-CT\coronacases_002.nii.gz" -Destination "Examples\INPUT"
PS > Move-Item -Path ".\Examples\COVID-19-CT\coronacases_005.nii.gz" -Destination "Examples\INPUT"
```

Now, from command line, execute:

snakemake --cores 1 --config input_path='./Examples/INPUT/' output_path='./Examples/OUTPUT/'

:notes: This command works both for Bash and Powershell

:warning: This will create a folder named LUNG inside the INPUT folder, which contains the results of the lung extraction step.

Train Your Centroids

As before, you can train your own centroid set. To do so with the snakemake pipeline, you have to prepare three folders:

INPUT: will contain all the scans to segment
OUTPUT: will contain the segmented scans
TRAIN: will contain all the scans of the training set

:warning: The INPUT and TRAIN folders cannot be the same.

:notes: This will train the centroid set and then perform the segmentation on the scans in the INPUT folder, so the INPUT folder is organized as before.

Now run Snakemake with the following configuration parameters:

snakemake --cores 1 --config input_path='./Examples/INPUT/' output_path='./Examples/OUTPUT/' train_path='./Examples/TRAIN/' centroid_path='./Examples/centroids.pkl.npy'
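When the pipeline is launched from another program, it can help to check the folder rule above before calling Snakemake. The sketch below builds the same command line as a list and refuses identical INPUT and TRAIN folders. The helper name `snakemake_command` is hypothetical; the flags and configuration keys are the ones shown in this section.

```python
from pathlib import Path


def snakemake_command(input_path, output_path, train_path=None,
                      centroid_path=None, cores=1):
    """Build the snakemake call, rejecting identical INPUT and TRAIN folders."""
    if train_path is not None:
        # Compare resolved paths so './a' and 'a' count as the same folder.
        if Path(train_path).resolve() == Path(input_path).resolve():
            raise ValueError("INPUT and TRAIN cannot be the same folder")
    config = [f"input_path={input_path}", f"output_path={output_path}"]
    if train_path is not None:
        config.append(f"train_path={train_path}")
    if centroid_path is not None:
        config.append(f"centroid_path={centroid_path}")
    return ["snakemake", "--cores", str(cores), "--config", *config]


command = snakemake_command("./Examples/INPUT/", "./Examples/OUTPUT/")
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the directory that contains the Snakefile.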

Evaluation

This project also provides a script to evaluate the quality of the segmentation against the ground truth. The evaluation uses several metrics: Dice Coefficient, Sensitivity, Recall, Precision and Accuracy. To run the evaluation procedure, run the following command from bash or PowerShell:

python -m CTLungSeg.evaluate --gt='/Path/To/GroundTruth.nii' --pred='/Path/To/Prediction.nii'

This prints the achieved results on the command line. To store the results in a comma-separated (CSV) file, use the following command from bash or PowerShell:

python -m CTLungSeg.evaluate --gt='/Path/To/GroundTruth.nii' --pred='/Path/To/Prediction.nii' --output='/Path/To/Output.csv'

Notice that the ground truth and the prediction must have the same shape. The images are evaluated as binary images with a background value of 0.
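For reference, these scores follow from the confusion counts of two binary masks. The sketch below uses the standard definitions and is not the code of `CTLungSeg.evaluate`; the function name `binary_scores` and the returned dictionary layout are assumptions. Sensitivity and recall are the same quantity, so only one of them is reported here.

```python
import numpy as np


def binary_scores(gt, pred):
    """Compute overlap scores between two masks, treating 0 as background."""
    gt = np.asarray(gt) != 0
    pred = np.asarray(pred) != 0
    if gt.shape != pred.shape:
        raise ValueError("ground truth and prediction must have the same shape")
    tp = np.count_nonzero(gt & pred)    # lesion voxels found
    tn = np.count_nonzero(~gt & ~pred)  # background voxels left alone
    fp = np.count_nonzero(~gt & pred)   # background marked as lesion
    fn = np.count_nonzero(gt & ~pred)   # lesion voxels missed

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
    }


example = binary_scores([[1, 1, 0, 0], [0, 0, 0, 0]],
                        [[1, 0, 1, 0], [0, 0, 0, 0]])
```

On chest CT, background voxels vastly outnumber lesion voxels, which is why specificity and accuracy tend to sit close to 1 even when Dice is moderate.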

License

The COVID-19 Lung Segmentation package is licensed under the MIT "Expat" License.

Contribution

Any contribution is more than welcome. Just file an issue or open a pull request and we will check it ASAP!

See here for further information about how to contribute to this project.

References

1- Hofmanninger, J., Prayer, F., Pan, J. et al. Automatic lung segmentation in routine imaging is primarily a data diversity problem, not a methodology problem. Eur Radiol Exp 4, 50 (2020). https://doi.org/10.1186/s41747-020-00173-2.

2- Bradski, G. (2000). The OpenCV Library. Dr. Dobb's Journal of Software Tools.

3- Yaniv, Z., Lowekamp, B.C., Johnson, H.J. et al. SimpleITK Image-Analysis Notebooks: a Collaborative Environment for Education and Reproducible Research. J Digit Imaging 31, 290–303 (2018). https://doi.org/10.1007/s10278-017-0037-8.

4- Lowekamp Bradley, Chen David, Ibanez Luis, Blezek Daniel. The Design of SimpleITK. Frontiers in Neuroinformatics 7, 45 (2013). https://www.frontiersin.org/article/10.3389/fninf.2013.00045.

5- Ma Jun, Ge Cheng, Wang Yixin, An Xingle, Gao Jiantao, Yu Ziqi, Zhang Minqing, Liu Xin, Deng Xueyuan, Cao Shucheng, Wei Hao, Mei Sen, Yang Xiaoyu, Nie Ziwei, Li Chen, Tian Lu, Zhu Yuntao, Zhu Qiongjie, Dong Guoqiang, & He Jian. (2020). COVID-19 CT Lung and Infection Segmentation Dataset (Version 1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.3757476.

Authors

Riccardo Biondi git
Nico Curti git, unibo
Enrico Giampieri git, unibo
Gastone Castellani unibo

See also the list of contributors who participated in this project.

Acknowledgments

The authors acknowledge all the members of the Department of Radiology, IRCCS Azienda Ospedaliero-Universitaria di Bologna and the SIRM foundation, Italian Society of Medical and Interventional Radiology for the support in the development of the project and analysis of the data.

Citation

If you have found COVID-19 Lung Segmentation helpful in your research, please consider citing the original paper:

@article{app11125438,
  author = {Biondi, Riccardo and Curti, Nico and Coppola, Francesca and Giampieri, Enrico and Vara, Giulio and Bartoletti, Michele and Cattabriga, Arrigo and Cocozza, Maria Adriana and Ciccarese, Federica and De Benedittis, Caterina and Cercenelli, Laura and Bortolani, Barbara and Marcelli, Emanuela and Pierotti, Luisa and Strigari, Lidia and Viale, Pierluigi and Golfieri, Rita and Castellani, Gastone},
  title = {Classification Performance for COVID Patient Prognosis from Automatic AI Segmentation—A Single-Center Study},
  journal = {Applied Sciences},
  volume = {11},
  year = {2021},
  number = {12},
  article-number = {5438},
  url = {https://www.mdpi.com/2076-3417/11/12/5438},
  issn = {2076-3417},
  doi = {10.3390/app11125438}
}

or just this project

```BibTeX
@misc{COVID-19 Lung Segmentation,
  author = {Biondi, Riccardo and Curti, Nico and Giampieri, Enrico and Castellani, Gastone},
  title = {COVID-19 Lung Segmentation},
  year = {2020},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/RiccardoBiondi/segmentation}},
}
```

Owner

Login: RiccardoBiondi
Kind: user

Repositories: 20
Profile: https://github.com/RiccardoBiondi

PhD student at the University of Bologna. I am currently focusing on medical image segmentation

JOSS Publication

COVID-19 Lung Segmentation

Published

September 30, 2021

DOI

10.21105/joss.03447

Volume 6, Issue 65, Page 3447

Authors

Riccardo Biondi
Department of Experimental, Diagnostic and Specialty Medicine of Bologna University

Nico Curti

eDIMESLab, Department of Experimental, Diagnostic and Specialty Medicine of Bologna University

Enrico Giampieri

eDIMESLab, Department of Experimental, Diagnostic and Specialty Medicine of Bologna University

Gastone Castellani

Department of Experimental, Diagnostic and Specialty Medicine of Bologna University

Editor

Jacob Schreiber

GitHub Events

Total

Watch event: 2

Last Year

Watch event: 2

Committers

Last synced: 7 months ago

All Time

Total Commits: 189
Total Committers: 3
Avg Commits per committer: 63.0
Development Distribution Score (DDS): 0.011

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
RiccardoBiondi	r**4@s**t	187
Diedre Carmo	c**e@o**m	1
Daniel S. Katz	d**z@i**g	1

Committer Domains (Top 20 + Academic)

ieee.org: 1 studio.unibo.it: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 5
Total pull requests: 3
Average time to close issues: 2 months
Average time to close pull requests: 6 days
Total issue authors: 2
Total pull request authors: 3
Average comments per issue: 2.8
Average comments per pull request: 0.33
Merged pull requests: 2
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

COVID-19 Lung Segmentation

Science Score: 95.0%
