pathml

Tools for computational pathology

https://github.com/dana-farber-aios/pathml

Science Score: 59.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 8 DOI reference(s) in README
  • Academic publication links
    Links to: scholar.google, nature.com, frontiersin.org
  • Committers with academic emails
    6 of 19 committers (31.6%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.5%) to scientific vocabulary

Keywords

biomedical-image-processing computational-pathology deep-learning digital-pathology fluorescence-microscopy-imaging histopathology image-analysis machine-learning microscopy pathml pathology python pytorch research spatial-transcriptomics

Keywords from Contributors

interactive mesh interpretability profiles pypi sequences generic projection standardization optim
Last synced: 6 months ago

Repository

Tools for computational pathology

Basic Info
  • Host: GitHub
  • Owner: Dana-Farber-AIOS
  • License: gpl-2.0
  • Language: Python
  • Default Branch: master
  • Homepage: https://pathml.org
  • Size: 218 MB
Statistics
  • Stars: 429
  • Watchers: 12
  • Forks: 87
  • Open Issues: 51
  • Releases: 18
Topics
biomedical-image-processing computational-pathology deep-learning digital-pathology fluorescence-microscopy-imaging histopathology image-analysis machine-learning microscopy pathml pathology python pytorch research spatial-transcriptomics
Created over 6 years ago · Last pushed 8 months ago
Metadata Files
Readme Contributing License Citation

README.md

PathML: Tools for computational pathology


PathML's objective is to lower the barrier to entry to digital pathology.

Imaging datasets in cancer research are growing exponentially in both quantity and information density. These massive datasets may enable derivation of insights for cancer research and clinical care, but only if researchers are equipped with the tools to leverage advanced computational analysis approaches such as machine learning and artificial intelligence. In this work, we highlight three themes to guide development of such computational tools: scalability, standardization, and ease of use. We then apply these principles to develop PathML, a general-purpose research toolkit for computational pathology. We describe the design of the PathML framework and demonstrate applications in diverse use cases.

The fastest way to get started?

docker pull pathml/pathml && docker run -it -p 8888:8888 pathml/pathml

Done, what analyses can I write now?

The PathML Digital Pathology Assistant will:

  • write digital pathology analyses for you
  • walk you through the code, step-by-step
  • be your teacher, as you embark on your digital pathology journey

More information [here](./ai-digital-pathology-assistant-v3) and usage examples [here](https://github.com/Dana-Farber-AIOS/pathml/blob/master/examples/talk_to_pathml.ipynb)

Official PathML Documentation

View the official PathML Documentation on readthedocs

Examples! Examples! Examples!

Jump to the gallery of examples below


1. Installation

PathML is an advanced tool for pathology image analysis. Below are simplified instructions to help you install PathML on your system. Whether you're a user or a developer, follow these steps to get started.

1.1 Prerequisites

We recommend using Micromamba for managing your environments. We provide instructions on how to install PathML via Micromamba below. In addition, we also provide instructions on how to install via Miniconda should you have a license.

Installation

If you don't have Miniconda installed, you can download Miniconda here.

Updating Micromamba

Make sure you have the most recent version of Micromamba by using the following command: micromamba update

Updating Conda and Using libmamba (Optional)

If you are using Micromamba, you can skip to the next section.

We recommend that Anaconda/Miniconda users complete the following steps to update Conda and use libmamba to resolve dependency conflicts.

Recent versions of Conda have integrated libmamba, a faster dependency solver. To benefit from this improvement, first ensure your Conda is updated:

conda update -n base conda

Then, to install and set the new libmamba solver, run:

conda install -n base conda-libmamba-solver
conda config --set solver libmamba

Note: these instructions are for Linux. Commands may be different for other platforms.

Platform-Specific External Dependencies

For installation methods 1.2.1 and 1.2.2 below (Micromamba or Anaconda with pip), you will need to install the following platform-specific packages.

  • Linux: Install external dependencies with Apt: sudo apt-get install openslide-tools g++ gcc libblas-dev liblapack-dev

  • MacOS: Install external dependencies with Brew: brew install openslide

  • Windows:

  1. Option A: Install with vcpkg: vcpkg install openslide

  2. Option B: Use pre-built OpenSlide binaries. As an alternative to vcpkg, Windows users can download and use pre-built OpenSlide binaries. This method is recommended if you prefer a quicker setup.

  • Download the OpenSlide Windows binaries from the OpenSlide Downloads page.
  • Extract the archive to your desired location, e.g., C:\OpenSlide\.

1.2 PathML Installation Methods

1.2.1 Install with Micromamba and pip (Recommended for Users)

Create and Activate Micromamba Environment and install openjdk

micromamba create -n pathml 'openjdk<=18.0' -c conda-forge python=3.9
micromamba activate pathml

Install PathML from PyPI

pip install pathml

1.2.2 Install with Anaconda and pip

Create and Activate Conda Environment

conda create --name pathml python=3.9
conda activate pathml

Install OpenJDK

conda install -c conda-forge 'openjdk<=18.0'

Install PathML from PyPI

pip install pathml

1.2.3 Install from Source (Recommended for Developers)

Clone repository

git clone https://github.com/Dana-Farber-AIOS/pathml.git
cd pathml

Create conda environment

  • Linux and Windows:

conda env create -f environment.yml
conda activate pathml

To use GPU acceleration for model training or other tasks, you must install CUDA. The default CUDA version in our environment file is 11.6. To install a different CUDA version, refer to the CUDA instructions in section 1.4.

  • MacOS:

conda env create -f requirements/environment_mac.yml
conda activate pathml

Install PathML from source:

pip install -e .

1.2.4 Use Docker Container

First, download or build the PathML Docker container:


  • Option A: download PathML container from Docker Hub

    docker pull pathml/pathml:latest

    Optionally specify a tag for a particular version, e.g. docker pull pathml/pathml:2.0.2. To view possible tags, please refer to the PathML DockerHub page.

  • Option B: build docker container from source

    git clone https://github.com/Dana-Farber-AIOS/pathml.git
    cd pathml
    docker build -t pathml/pathml .

Then connect to the container: docker run -it -p 8888:8888 pathml/pathml

The above command runs the container, which is configured to spin up a jupyter lab session and expose it on port 8888. The terminal should display a URL to the jupyter lab session starting with http://127.0.0.1:8888/lab?token=<.....>. Navigate to that page and you should connect to the jupyter lab session running on the container with the pathml environment fully configured. If a password is requested, copy the string of characters following the token= in the url.
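If copying the token by hand is error-prone, it can be pulled out of the printed URL programmatically. The following is a minimal sketch using only the Python standard library; the helper name `extract_token` and the example URL are illustrative and not part of PathML.

```python
from urllib.parse import parse_qs, urlparse


def extract_token(url: str) -> str:
    """Return the value of the `token` query parameter, or "" if it is absent."""
    query = parse_qs(urlparse(url).query)
    return query.get("token", [""])[0]


# Placeholder URL for illustration only; use the one printed in your terminal.
example_url = "http://127.0.0.1:8888/lab?token=abc123"
print(extract_token(example_url))
```

The returned string is what you would paste if Jupyter asks for a password or token.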

Note that the docker container requires extra configurations to use with GPU.
Note that these instructions assume that there are no other processes using port 8888.
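To check the port assumption before starting the container, a small standard-library sketch such as the one below may help. The function name `port_is_free` is hypothetical, and the check only tests whether something is currently accepting connections on the local host.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is accepting connections on (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds, i.e. the port is in use.
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    print("port 8888 free:", port_is_free(8888))
```

If the port is busy, one option is to map a different host port to the container, for example docker run -it -p 8889:8888 pathml/pathml, and then open the printed URL with 8889 in place of 8888.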

Please refer to the Docker run documentation for further instructions on accessing the container, e.g. for mounting volumes to access files on a local machine from within the container.

1.2.5 Use Google Colab

To get PathML running in a Colab environment:

import os
!pip install openslide-python
!apt-get install openslide-tools
!apt-get install openjdk-17-jdk-headless -qq > /dev/null
os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-17-openjdk-amd64"
!update-alternatives --set java /usr/lib/jvm/java-17-openjdk-amd64/bin/java
!java -version
!pip install pathml

Thanks to all of our open-source collaborators for helping maintain these installation instructions!
Please open an issue for any bugs or other problems during the installation process.

1.3. Import PathML

After you have installed all necessary dependencies and PathML itself, import it using the following command:

import pathml
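If the import fails, it can help to first confirm which packages are actually installed in the active environment. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is hypothetical and not part of PathML.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "pathml" is the PyPI distribution name; openslide-python is one of its dependencies.
    for name in ("pathml", "openslide-python"):
        found = installed_version(name)
        print(f"{name}: {found if found else 'not installed'}")
```

A result of "not installed" usually means that pip ran in a different environment from the one currently activated.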

For Windows users, insert the following code snippet at the beginning of your Python script or Jupyter notebook before importing PathML. This code sets up the DLL directory for OpenSlide, ensuring that the library is properly loaded:

```python
# The path can also be read from a config file, etc.
OPENSLIDE_PATH = r'c:\path\to\openslide-win64\bin'

import os
if hasattr(os, 'add_dll_directory'):
    # Windows-specific setup
    with os.add_dll_directory(OPENSLIDE_PATH):
        import openslide
else:
    # For other OSes, this step is not needed
    import openslide

# Now you can proceed with using PathML
import pathml
```

This code snippet ensures that the OpenSlide DLLs are correctly found by Python on Windows systems. Replace c:\path\to\openslide-win64\bin with the actual path where you extracted the OpenSlide binaries.

If you encounter any DLL load failures, verify that the OpenSlide bin directory is correctly added to your PATH.
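One quick way to narrow down a DLL load failure is to check that the directory you configured really contains an OpenSlide DLL. The following is a minimal sketch; the helper name `find_openslide_dlls` is hypothetical, and the `libopenslide*.dll` pattern is an assumption, since the exact file name depends on the OpenSlide build you downloaded.

```python
from pathlib import Path
from typing import List


def find_openslide_dlls(bin_dir: str) -> List[str]:
    """List file names in `bin_dir` that look like OpenSlide DLLs."""
    folder = Path(bin_dir)
    if not folder.is_dir():
        return []
    # Assumed naming pattern; adjust it if your OpenSlide build names the DLL differently.
    return sorted(p.name for p in folder.glob("libopenslide*.dll"))


if __name__ == "__main__":
    # Same placeholder path as in the Windows setup snippet; replace with your own.
    print(find_openslide_dlls(r"c:\path\to\openslide-win64\bin"))
```

An empty list suggests that the path points at the wrong folder, for example the extracted archive root rather than its bin subdirectory.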

1.4 CUDA

To use GPU acceleration for model training or other tasks, you must install CUDA. This guide should work, but for the most up-to-date instructions, refer to the official PyTorch installation instructions.

Check the version of CUDA: nvidia-smi

Replace both instances of 'cu116' in requirements/requirements_torch.txt with the CUDA version you see. For example, for CUDA 11.7, 'cu116' becomes 'cu117'.
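The replacement can also be scripted. The sketch below converts a CUDA version string into the tag format used above and substitutes it in a block of text; the function names are hypothetical, and the sample text is made up for illustration rather than copied from requirements/requirements_torch.txt.

```python
import re


def cuda_tag(version: str) -> str:
    """Convert a CUDA version such as "11.7" into a tag such as "cu117"."""
    match = re.fullmatch(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unexpected CUDA version: {version!r}")
    return f"cu{match.group(1)}{match.group(2)}"


def retarget_requirements(text: str, version: str, old_tag: str = "cu116") -> str:
    """Replace every occurrence of `old_tag` in `text` with the tag for `version`."""
    return text.replace(old_tag, cuda_tag(version))


if __name__ == "__main__":
    # Made-up sample lines, not the real file contents.
    sample = "example-index/cu116\nexample-package+cu116\n"
    print(retarget_requirements(sample, "11.7"))
```

To apply it to the real file, read the file contents, pass them through `retarget_requirements`, and write the result back. Not every CUDA version has matching PyTorch builds, so it is worth checking the official PyTorch installation instructions before committing to a tag.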

Then create the environment:

conda env create -f environment.yml
conda activate pathml

After installing PyTorch, optionally verify successful PyTorch installation with CUDA support: python -c "import torch; print(torch.cuda.is_available())"

2. Using with Jupyter (optional)

Jupyter notebooks are a convenient way to work interactively. To use PathML in Jupyter notebooks:

2.1 Set JAVA_HOME environment variable

PathML relies on Java to enable support for reading a wide range of file formats. Before using PathML in Jupyter, you may need to manually set the JAVA_HOME environment variable specifying the path to Java. To do so:

  1. Get the path to Java by running echo $JAVA_HOME in the terminal in your pathml conda environment (outside of Jupyter)
  2. Set that path as the JAVA_HOME environment variable in Jupyter:

     import os
     os.environ["JAVA_HOME"] = "/opt/conda/envs/pathml"  # change path as needed
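A slightly more defensive variant of the second step checks that the path really contains a Java executable before setting the variable. This is a minimal sketch; the helper name `set_java_home` is hypothetical, and the bin/java layout is an assumption about how the Java installation is organized.

```python
import os
from pathlib import Path


def set_java_home(candidate: str) -> bool:
    """Set JAVA_HOME to `candidate` if it contains a Java executable.

    Returns True if JAVA_HOME was set, False otherwise.
    """
    bin_dir = Path(candidate) / "bin"
    # Accept both the POSIX and the Windows name of the executable.
    if (bin_dir / "java").is_file() or (bin_dir / "java.exe").is_file():
        os.environ["JAVA_HOME"] = str(candidate)
        return True
    return False


if __name__ == "__main__":
    # Example path taken from the step above; change it as needed.
    if not set_java_home("/opt/conda/envs/pathml"):
        print("No Java executable found under that path; JAVA_HOME left unchanged.")
```

Run this in the first notebook cell, before importing PathML, so the variable is in place when Java is first needed.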

2.2 Register environment as an IPython kernel

conda activate pathml
conda install ipykernel
python -m ipykernel install --user --name=pathml

This makes the pathml environment available as a kernel in jupyter lab or notebook.

3. Examples

Now that you are all set with PathML installation, let's get started with some analyses you can easily replicate:

  1. [Load 160+ different types of pathology images using PathML](https://github.com/Dana-Farber-AIOS/pathml/blob/master/examples/loading_images_vignette.ipynb)
  2. [H&E Stain Deconvolution and Color Normalization](https://github.com/Dana-Farber-AIOS/pathml/blob/master/examples/stain_normalization.ipynb)
  3. [Brightfield imaging pipeline: load an image, preprocess it on a local cluster, and get it ready for machine learning analyses in PyTorch](https://github.com/Dana-Farber-AIOS/pathml/blob/master/examples/workflow_HE_vignette.ipynb)
  4. [Multiparametric Imaging: Quickstart & single-cell quantification](https://github.com/Dana-Farber-AIOS/pathml/blob/master/examples/multiplex_if.ipynb)
  5. [Multiparametric Imaging: CODEX & nuclei quantization](https://github.com/Dana-Farber-AIOS/pathml/blob/master/examples/codex.ipynb)
  6. [Train HoVer-Net model to perform nucleus detection and classification, using data from PanNuke dataset](https://github.com/Dana-Farber-AIOS/pathml/blob/master/examples/train_hovernet.ipynb)
  7. [Gallery of PathML preprocessing and transformations](https://github.com/Dana-Farber-AIOS/pathml/blob/master/examples/pathml_gallery.ipynb)
  8. [Use the new Graph API to construct cell and tissue graphs from pathology images](https://github.com/Dana-Farber-AIOS/pathml/blob/master/examples/construct_graphs.ipynb)
  9. [Train HACTNet model to perform cancer sub-typing using graphs constructed from the BRACS dataset](https://github.com/Dana-Farber-AIOS/pathml/blob/master/examples/train_hactnet.ipynb)
  10. [Perform reconstruction of tiles obtained from pathology images using Tile Stitching](https://github.com/Dana-Farber-AIOS/pathml/blob/master/examples/tile_stitching.ipynb)
  11. [Create an ONNX model in HaloAI or similar software, export it, and run it at scale using PathML](https://github.com/Dana-Farber-AIOS/pathml/blob/master/examples/InferenceOnnx_tutorial.ipynb)
  12. [Step-by-step process used to analyze the Whole Slide Images (WSIs) of Non-Small Cell Lung Cancer (NSCLC) samples as published in the Journal of Clinical Oncology](https://github.com/Dana-Farber-AIOS/pathml/blob/master/examples/Graph_Analysis_NSCLC.ipynb)
  13. [Talk to the PathML Digital Pathology Assistant](https://github.com/Dana-Farber-AIOS/pathml/blob/master/examples/talk_to_pathml.ipynb)

4. Citing & known uses

If you use PathML please cite:

So far, PathML was referenced in 40+ manuscripts:

5. Users

This is where in the world our most enthusiastic supporters are located:

and this is where they work:

Source: https://ossinsight.io/analyze/Dana-Farber-AIOS/pathml#people

6. Contributing

PathML is an open source project. Consider contributing to benefit the entire community!

There are many ways to contribute to PathML, including:

  • Submitting bug reports
  • Submitting feature requests
  • Writing documentation and examples
  • Fixing bugs
  • Writing code for new features
  • Sharing workflows
  • Sharing trained model parameters
  • Sharing PathML with colleagues, students, etc.

See contributing for more details.

7. License

The GNU GPL v2 version of PathML is made available via Open Source licensing. The user is free to use, modify, and distribute under the terms of the GNU General Public License version 2.

Commercial license options are also available.

8. Contact

Questions? Comments? Suggestions? Get in touch!

innovation@dfci.harvard.edu

Owner

  • Name: Dana-Farber-AIOS
  • Login: Dana-Farber-AIOS
  • Kind: organization

AI Operations and Data Science Services group

GitHub Events

Total
  • Create event: 9
  • Issues event: 1
  • Release event: 2
  • Watch event: 41
  • Delete event: 4
  • Issue comment event: 11
  • Push event: 35
  • Pull request review event: 3
  • Pull request event: 27
  • Fork event: 4
Last Year
  • Create event: 9
  • Issues event: 1
  • Release event: 2
  • Watch event: 41
  • Delete event: 4
  • Issue comment event: 11
  • Push event: 35
  • Pull request review event: 3
  • Pull request event: 27
  • Fork event: 4

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 772
  • Total Committers: 19
  • Avg Commits per committer: 40.632
  • Development Distribution Score (DDS): 0.525
Past Year
  • Commits: 41
  • Committers: 4
  • Avg Commits per committer: 10.25
  • Development Distribution Score (DDS): 0.415
Top Committers
Name Email Commits
Jacob Rosenthal j****l@d****u 367
ryanccarelli r****i@g****m 251
Jacob Rosenthal 5****i 31
sreekarreddydfci 9****i 24
Bryan Gass b****3@l****m 20
Dana-Farber D****r 19
Mohamed Omar m****2@g****m 11
jacobrosenthal j****l@m****u 9
Jacob Rosenthal J****l@d****u 9
Jacob Rosenthal 5****l 8
tddough98 4****8 5
Zhuoran Xu c****v@h****m 4
ella-dfci e****t@d****u 3
Ryan Carelli 5****i 3
David Brundage d****b@g****m 2
Surya Narayanan Hari s****1@s****u 2
dependabot[bot] 4****] 2
Yu-An Chen a****2@g****m 1
jzhang1031 j****1@h****u 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 80
  • Total pull requests: 127
  • Average time to close issues: about 1 month
  • Average time to close pull requests: about 1 month
  • Total issue authors: 31
  • Total pull request authors: 12
  • Average comments per issue: 2.58
  • Average comments per pull request: 0.79
  • Merged pull requests: 93
  • Bot issues: 0
  • Bot pull requests: 7
Past Year
  • Issues: 1
  • Pull requests: 18
  • Average time to close issues: N/A
  • Average time to close pull requests: about 1 month
  • Issue authors: 1
  • Pull request authors: 3
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.78
  • Merged pull requests: 12
  • Bot issues: 0
  • Bot pull requests: 5
Top Authors
Issue Authors
  • surya-narayanan (16)
  • jacob-rosenthal (10)
  • Dana-Farber (9)
  • ryanccarelli (5)
  • afshinmoradi (4)
  • archanabhardwaj (4)
  • sreekarreddydfci (3)
  • luzy05111036 (2)
  • YubinXie (2)
  • jamesMo84 (2)
  • ckv1110 (2)
  • tdenize (1)
  • mdu4003 (1)
  • RYY0722 (1)
  • OmarAshkar (1)
Pull Request Authors
  • sreekarreddydfci (43)
  • VarunUllanat (37)
  • jamesgwen (31)
  • jacob-rosenthal (22)
  • dependabot[bot] (11)
  • tddough98 (5)
  • Dana-Farber (4)
  • ryanccarelli (4)
  • surya-narayanan (2)
  • BeeGass (2)
  • Karenxzr (1)
  • Yu-AnChen (1)
Top Labels
Issue Labels
bug (30) enhancement (26) question (3) documentation (3) dependencies (1)
Pull Request Labels
dependencies (11) python (3) enhancement (1)

Packages

  • Total packages: 4
  • Total downloads:
    • pypi 307 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 1
    (may contain duplicates)
  • Total versions: 50
  • Total maintainers: 4
proxy.golang.org: github.com/Dana-Farber-AIOS/pathml
  • Versions: 15
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 9.1%
Average: 9.6%
Dependent repos count: 10.2%
Last synced: 6 months ago
proxy.golang.org: github.com/dana-farber-aios/pathml
  • Versions: 15
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 9.1%
Average: 9.6%
Dependent repos count: 10.2%
Last synced: 7 months ago
pypi.org: pathml

Tools for computational pathology

  • Versions: 19
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 307 Last month
Rankings
Stargazers count: 3.5%
Forks count: 5.0%
Dependent packages count: 10.1%
Average: 10.1%
Downloads: 10.5%
Dependent repos count: 21.5%
Last synced: 6 months ago
spack.io: py-pathml

An open-source toolkit for computational pathology and machine learning.

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Stargazers count: 13.9%
Forks count: 15.1%
Average: 21.6%
Dependent packages count: 57.3%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/publish-to-docker-hub.yml actions
  • actions/checkout v2 composite
  • docker/build-push-action v2.7.0 composite
  • docker/login-action v1.10.0 composite
  • docker/setup-buildx-action v1.6.0 composite
  • docker/setup-qemu-action v1.2.0 composite
.github/workflows/publish-to-pypi.yml actions
  • actions/checkout master composite
  • actions/setup-python v1 composite
  • pypa/gh-action-pypi-publish master composite
.github/workflows/tests-conda.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • codecov/codecov-action v2 composite
  • conda-incubator/setup-miniconda v2.0.0 composite
Dockerfile docker
  • ubuntu 20.04 build
docs/readthedocs-requirements.txt pypi
  • ipython ==7.31.1
  • nbsphinx ==0.8.8
  • nbsphinx-link ==1.3.0
  • sphinx ==4.3.2
  • sphinx-autoapi ==1.8.4
  • sphinx-copybutton ==0.4.0
  • sphinx-rtd-theme ==1.0.0
environment.yml pypi
  • anndata ==0.7.8
  • deepcell ==0.11.0
  • loguru ==0.5.3
  • opencv-contrib-python ==4.5.3.56
  • openslide-python ==1.1.2
  • python-bioformats ==4.0.0
  • python-javabridge ==4.0.0
  • scanpy ==1.8.2
  • tqdm ==4.62.3
setup.py pypi
  • dask *
  • h5py *
  • matplotlib *
  • numpy >=1.16.4
  • openslide-python *
  • pandas *
  • pip *
  • pydicom *
  • scikit-image *
  • scikit-learn *
  • scipy *
  • statsmodels *