deep-imcyto
Nextflow pipeline for IMC segmentation tasks.
Science Score: 41.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 9 DOI reference(s) in README
- ✓ Academic publication links: links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: FrancisCrickInstitute
- License: other
- Language: Python
- Default Branch: main
- Size: 2.61 MB
Statistics
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 14
- Releases: 0
Metadata Files
README.md
Introduction
deep-imcyto is a bioinformatics analysis pipeline for segmentation and other principal tasks in imaging mass cytometry (IMC) data analysis. It is an update and extension of nf-core/imcyto, a bioinformatics analysis pipeline developed by van Maldegem et al. for IMC image segmentation and extraction of single cell expression data. deep-imcyto provides highly accurate cell segmentation of IMC images based on a U-net++ deep learning model, as well as facilities for QC and manual review of image processing steps, which can be invaluable during IMC experimental design.
deep-imcyto is implemented in Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It comes with Docker containers, making installation trivial and results highly reproducible.
deep-imcyto is a constituent component of the TRACERx-PHLEX pipeline for highly multiplexed imaging. Other components include TYPEx, for detailed cell phenotyping, and Spatial-PHLEX, for single cell spatial data analysis.
Pipeline summary
deep-imcyto has three modes of operation: QC, Simple segmentation and Multiplexed Consensus Cell Segmentation, summarised in the diagram below.
QC mode
deep-imcyto's QC mode is designed to provide quick access to individual channels in IMC data for quality control and/or review by splitting .mcd files into constituent channel images by imaged ROI. If a particular preprocessing option is selected (e.g. spillover correction, hotpixel removal or the application of a custom set of preprocessing steps specified as a CellProfiler .cppipe file), then this preprocessing will be performed and the result produced as an output of the QC run for manual review.
Segmentation modes
Simple
In `simple` segmentation mode, an approximation of whole cell segmentation is performed in which accurately predicted nuclei are dilated by a user-defined number of pixels.
Multiplexed consensus cell segmentation (MCCS)
In `MCCS` mode, a more accurate whole cell segmentation is performed following the multiplexed consensus cell segmentation principles, using nuclear predictions and progressive masking of specific marker channels (see [LINK TO PAPER] and [LINK TO READTHEDOCS]). MCCS procedures are provided to deep-imcyto as a CellProfiler pipeline, which is then executed in as parallel a way as possible via Nextflow.
Running deep-imcyto on an HPC system running SLURM
- Clone the deep-imcyto repository.
- Download both the deep-imcyto trained nucleus model weights and the example test dataset from our Zenodo repository (https://doi.org/10.5281/zenodo.7573269).
- Unzip these `.zip` archives to appropriate locations (total space required ~1GB).
- Ensure your HPC system has `Nextflow/22.04.0` and `Singularity/3.6.4` installed.
- Set your profile (see below).
- Edit the following script appropriately and run it from a compute node.
This will run deep-imcyto in `simple` segmentation mode.
```bash
#!/bin/bash

## LOAD MODULES
ml purge
ml Nextflow/22.04.0
ml Singularity/3.6.4

# Define a folder on your system for the deep-imcyto software containers to be stored (space required ~10GB):
export NXF_SINGULARITY_CACHEDIR='/path/to/containers/deep-imcyto'

## RUN DEEP-IMCYTO:
# NOTE: supply your profile name after -profile on the last line.
nextflow run ./main.nf\
    --input "/path/to/test/dataset///*.tiff"\
    --outdir '../results/simple'\
    --metadata 'assets/metadata/PHLEXsimplesegmentationmetadatap1.csv'\
    --email youremail@yourinstitute.ac.uk\
    --nuclearweightsdirectory "/path/to/weights/directory"\
    --segmentationworkflow 'simple'\
    --nucleardilationradius 5\
    --preprocessmethod 'hotpixel'\
    --nneighbours 5\
    --singularity_bind_path '/camp'\
    -w '/path/to/work/directory/'\
    -profile
```
To run deep-imcyto in `MCCS` mode, run the following:
```bash
#!/bin/bash

## LOAD MODULES
ml purge
ml Nextflow/22.04.0
ml Singularity/3.6.4

# Define a folder on your system for the deep-imcyto software containers to be stored (space required ~10GB):
export NXF_SINGULARITY_CACHEDIR='/path/to/containers/deep-imcyto'

## RUN DEEP-IMCYTO:
# NOTE: supply your profile name after -profile on the last line.
nextflow run ./main.nf\
    --input "/path/to/test/dataset///.tiff"\
    --outdir '../results/MCCS'\
    --metadata 'assets/metadata/PHLEXsimplesegmentationmetadatap1.csv'\
    --email alastair.magness@crick.ac.uk\
    --nuclearweightsdirectory "/path/to/weights/directory"\
    --segmentationworkflow 'MCCS'\
    --fullstackcppipe './assets/cppipes/fullstackpreprocessing.cppipe'\
    --segmentationcppipe './assets/cppipes/segmentationP1.cppipe'\
    --mccsstackcppipe './assets/cppipes/mccsstackpreprocessing.cppipe'\
    --compensationtiff './assets/spillover/P1imc.tiff'\
    --plugins "./assets/plugins"\
    --singularity_bind_path '/camp'\
    -w '/path/to/work/directory/'\
    -profile
```
The variable `singularity_bind_path` tells deep-imcyto how to bind paths inside and outside the deep-imcyto Docker/Singularity container. If it is not explicitly set, deep-imcyto attempts to use the root of the absolute path to the deep-imcyto repository base directory (i.e. /path in /path/to/deep-imcyto).
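The fallback described above can be illustrated with a short shell sketch. It is only an assumption about the behaviour, based on the description here, and not the pipeline's actual code; the function name `default_bind_root` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper (not part of deep-imcyto): derive the top-level
# directory of an absolute path, e.g. /path from /path/to/deep-imcyto.
default_bind_root() {
    local abs_path="$1"
    # Drop the leading slash so the first path component starts the string.
    local trimmed="${abs_path#/}"
    # Keep only the text before the first remaining slash.
    printf '/%s\n' "${trimmed%%/*}"
}

default_bind_root "/path/to/deep-imcyto"   # prints /path
```

If your data live under a different top-level directory from the repository, a default of this kind would presumably not cover them, which is the case where passing `--singularity_bind_path` explicitly matters.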
See usage docs for all of the available options when running the pipeline.
Container
deep-imcyto runs inside a customised Docker container built on top of the rapids-22.02-cuda11.0-base-ubuntu18.04-py3.8 Docker container for reproducible GPU-accelerated data science. Important prerequisites for RAPIDS are as follows:
- NVIDIA Pascal™ GPU architecture or better
- CUDA 11.2/11.4/11.5 with a compatible NVIDIA driver
- nvidia-container-toolkit
See RAPIDS for more information.
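Before a first run, it may help to confirm that the required command-line tools are visible on the node you are using. The sketch below is an illustration only: `missing_tools` is a hypothetical helper, and the tool names it checks (`nextflow`, `singularity`, `nvidia-smi`) are assumptions about what is on your PATH once the modules are loaded.

```bash
#!/bin/bash
# Hypothetical helper (not part of deep-imcyto): print each named
# command that cannot be found on the current PATH, one per line.
missing_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Assumed tool names; adjust them to match your system.
missing="$(missing_tools nextflow singularity nvidia-smi)"
if [ -n "$missing" ]; then
    # Unquoted on purpose so the names are joined onto one line.
    echo "Not found on PATH:" $missing >&2
else
    echo "All required tools found."
fi
```

Once everything is found, `nextflow -version`, `singularity --version` and `nvidia-smi` report the installed versions and the GPU driver, which can be compared against the requirements listed above.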
Documentation
The nf-core/imcyto pipeline comes with documentation about the pipeline, found in the docs/ directory:
- Installation
- Pipeline configuration
- Running the pipeline
- Output and how to interpret the results
- Troubleshooting
Credits
deep-imcyto is primarily developed by Alastair Magness at The Francis Crick Institute. Other core contributors include Emma Colliver, Mihaela Angelova, and Katey Enfield.
nf-core/imcyto was originally written by The Bioinformatics & Biostatistics Group for use at The Francis Crick Institute, London. It was developed by Harshil Patel and Nourdine Bah in collaboration with Karishma Valand and Febe van Maldegem, among others.
It would not have been possible to develop this pipeline without the guidelines, scripts and plugins provided by the Bodenmiller Lab. Thank you too!
Contributions and Support
If you would like to contribute to this pipeline, please see the contributing guidelines.
For further information or help, don't hesitate to get in touch on Slack (you can join with this invite).
Citation
You can cite the nf-core publication as follows:
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.
ReadCube: Full Access Link
An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.
Owner
- Name: The Francis Crick Institute
- Login: FrancisCrickInstitute
- Kind: organization
- Location: London, UK
- Website: http://crick.ac.uk
- Repositories: 89
- Profile: https://github.com/FrancisCrickInstitute
Citation (CITATIONS.md)
# nf-core/imcyto: Citations

## [nf-core](https://www.ncbi.nlm.nih.gov/pubmed/32055031/)

> Ewels PA, Peltzer A, Fillinger S, Patel H, Alneberg J, Wilm A, Garcia MU, Di Tommaso P, Nahnsen S. The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. 2020 Mar;38(3):276-278. doi: 10.1038/s41587-020-0439-x. PubMed PMID: 32055031. ReadCube: [Full Access Link](https://rdcu.be/b1GjZ).

## [Nextflow](https://www.ncbi.nlm.nih.gov/pubmed/28398311/)

> Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, Notredame C. Nextflow enables reproducible computational workflows. Nat Biotechnol. 2017 Apr 11;35(4):316-319. doi: 10.1038/nbt.3820. PubMed PMID: 28398311.

## Pipeline tools

* [CellProfiler](https://www.ncbi.nlm.nih.gov/pubmed/29969450/)
    > McQuin C, Goodman A, Chernyshev V, Kamentsky L, Cimini BA, Karhohs KW, Doan M, Ding L, Rafelski SM, Thirstrup D, Wiegraebe W, Singh S, Becker T, Caicedo JC, Carpenter AE. CellProfiler 3.0: Next-generation image processing for biology. PLoS Biol. 2018 Jul 3;16(7):e2005970. doi: 10.1371/journal.pbio.2005970. eCollection 2018 Jul. PubMed PMID: 29969450; PubMed Central PMCID: PMC6029841.

* [ilastik](https://www.ncbi.nlm.nih.gov/pubmed/31570887/)
    > Berg S, Kutra D, Kroeger T, Straehle CN, Kausler BX, Haubold C, Schiegg M, Ales J, Beier T, Rudy M, Eren K, Cervantes JI, Xu B, Beuttenmueller F, Wolny A, Zhang C, Koethe U, Hamprecht FA, Kreshuk A. ilastik: interactive machine learning for (bio)image analysis. Nat Methods. 2019 Sep 30. doi: 10.1038/s41592-019-0582-9. [Epub ahead of print] Review. PubMed PMID: 31570887.

* [histoCAT](https://www.ncbi.nlm.nih.gov/pubmed/28783155/)
    > Schapiro D, Jackson HW, Raghuraman S, Fischer JR, Zanotelli VRT, Schulz D, Giesen C, Catena R, Varga Z, Bodenmiller B. histoCAT: analysis of cell phenotypes and interactions in multiplex image cytometry data. Nat Methods. 2017 Sep;14(9):873-876. doi: 10.1038/nmeth.4391. Epub 2017 Aug 7. PubMed PMID: 28783155; PubMed Central PMCID: PMC5617107.

* [imctools](https://github.com/BodenmillerGroup/imctools)

* [Zanotelli & Bodenmiller, Jan 2019](https://github.com/BodenmillerGroup/ImcSegmentationPipeline/blob/development/documentation/imcsegmentationpipeline_documentation.pdf)

* [CellProfiler Bodenmiller custom plugins](https://github.com/BodenmillerGroup/ImcPluginsCP)

## Software packaging/containerisation tools

* [BioContainers](https://www.ncbi.nlm.nih.gov/pubmed/28379341/)
    > da Veiga Leprevost F, Grüning BA, Alves Aflitos S, Röst HL, Uszkoreit J, Barsnes H, Vaudel M, Moreno P, Gatto L, Weber J, Bai M, Jimenez RC, Sachsenberg T, Pfeuffer J, Vera Alvarez R, Griss J, Nesvizhskii AI, Perez-Riverol Y. BioContainers: an open-source and community-driven framework for software standardization. Bioinformatics. 2017 Aug 15;33(16):2580-2582. doi: 10.1093/bioinformatics/btx192. PubMed PMID: 28379341; PubMed Central PMCID: PMC5870671.

* [Singularity](https://www.ncbi.nlm.nih.gov/pubmed/28494014/)
    > Kurtzer GM, Sochat V, Bauer MW. Singularity: Scientific containers for mobility of compute. PLoS One. 2017 May 11;12(5):e0177459. doi: 10.1371/journal.pone.0177459. eCollection 2017. PubMed PMID: 28494014; PubMed Central PMCID: PMC5426675.

* [Docker](https://dl.acm.org/doi/10.5555/2600239.2600241)
GitHub Events
Total
- Watch event: 1
- Create event: 2
Last Year
- Watch event: 1
- Create event: 2


