https://github.com/astorfi/pathml
Tools for computational pathology
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 3 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (19.7%) to scientific vocabulary
Repository
Tools for computational pathology
Basic Info
- Host: GitHub
- Owner: astorfi
- License: gpl-2.0
- Default Branch: master
- Homepage: https://pathml.org
- Size: 122 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of Dana-Farber-AIOS/pathml
Created over 4 years ago · Last pushed over 4 years ago
https://github.com/astorfi/pathml/blob/master/
Owner
- Name: Sina Torfi
- Login: astorfi
- Kind: user
- Location: San Jose
- Company: Meta
- Website: https://astorfi.github.io/
- Repositories: 196
- Profile: https://github.com/astorfi
PhD & Developer working on Deep Learning, Computer Vision & NLP
An open-source toolkit for computational pathology and machine learning.
**View [documentation](https://pathml.readthedocs.io/en/latest/)**
:construction: the `dev` branch is under active development, with experimental features, bug fixes, and refactors that may happen at any time!
Stable versions are available as tagged releases on GitHub, or as versioned releases on PyPI.
# Installation
There are several ways to install `PathML`:
1. `pip install` from PyPI
2. Clone repo to local machine and install from source
3. Use the PathML Docker container
Option (1) is recommended for most users. It will install the latest versions of most packages.
Option (2) is a deterministic environment setup, meaning that all package versions are pinned and it will install the
pinned version of a package even if it is not the newest. The automated testing suite is run in this environment. This
is the suggested installation method for users wanting to interface with the Mesmer model for IF workflows, and for
contributors/developers. Option (3) uses the same environment as option (2), but inside a Docker
container.
Options (1) and (2) require that you first install all external dependencies (namely, JDK-8 and system libraries used
by OpenSlide and OpenCV):
* Install external dependencies (Linux) with [Apt](https://ubuntu.com/server/docs/package-management):
````
sudo apt-get update && sudo apt-get install openslide-tools g++ gcc libblas-dev liblapack-dev python3-opencv
````
We recommend using conda for environment management.
Download Miniconda [here](https://docs.conda.io/en/latest/miniconda.html).
*Note: these instructions are for Linux. Commands may be different for other platforms.*
## Installation option 1: pip install
Create conda environment with dependencies:
````
conda create --name pathml python=3.8 numpy=1.19.5 openjdk==8.0.152 -c conda-forge
conda activate pathml
````
Optionally install CUDA (instructions [here](#CUDA)).
Install `PathML` from PyPI:
````
pip install pathml
````
## Installation option 2: clone repo and install from source
Clone repo:
````
git clone https://github.com/Dana-Farber-AIOS/pathml.git
cd pathml
````
Create conda environment:
````
conda env create -f environment.yml
conda activate pathml
````
Optionally install CUDA (instructions [here](#CUDA)).
Install `PathML` from source:
````
pip install -e .
````
## Installation option 3: Docker
First, download or build the PathML Docker container:
- Option A: download PathML container from Docker Hub
````
docker pull pathml/pathml:latest
````
Optionally specify a tag for a particular version, e.g. `docker pull pathml/pathml:2.0.2`. To view possible tags,
please refer to the [PathML DockerHub page](https://hub.docker.com/r/pathml/pathml).
- Option B: build docker container from source
````
git clone https://github.com/Dana-Farber-AIOS/pathml.git
cd pathml
docker build -t pathml/pathml .
````
Then connect to the container:
````
docker run -it -p 8888:8888 pathml/pathml
````
The above command runs the container, which is configured to spin up a jupyter lab session and expose it on port 8888.
The terminal should display a URL to the jupyter lab session starting with `http://127.0.0.1:8888/lab?token=<.....>`.
Navigate to that page and you should connect to the jupyter lab session running on the container with the pathml
environment fully configured. If a password is requested, copy the string of characters following `token=` in the
URL.
Note that the Docker container requires extra configuration to use a GPU.
These instructions also assume that no other process is using port 8888.
Please refer to the `Docker run` [documentation](https://docs.docker.com/engine/reference/run/) for further instructions
on accessing the container, e.g. for mounting volumes to access files on a local machine from within the container.
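The port assumption above can be checked before starting the container. The following is a minimal sketch using only the Python standard library; `port_is_free` is a hypothetical helper name chosen for illustration and is not part of PathML or Docker.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port.

    Hypothetical helper: it tries to bind a throwaway TCP socket and
    treats any OSError (typically "address already in use") as busy.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


# 8888 is the host port used in the `docker run` command above.
print("port 8888 free:", port_is_free(8888))
```

If the port is busy, one option is to publish the container on a different host port, e.g. `docker run -it -p 8889:8888 pathml/pathml`, and replace `8888` with `8889` in the URL opened in the browser.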
## CUDA
To use GPU acceleration for model training or other tasks, you must install CUDA.
This guide should work, but for the most up-to-date instructions, refer to the [official PyTorch installation instructions](https://pytorch.org/get-started/locally/).
Check the version of CUDA:
````
nvidia-smi
````
Install the correct version of `cudatoolkit`:
````
# update this command with your CUDA version number
conda install cudatoolkit=11.0
````
After installing PyTorch, optionally verify successful PyTorch installation with CUDA support:
````
python -c "import torch; print(torch.cuda.is_available())"
````
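For a slightly more detailed check than the one-liner above, the sketch below reports the PyTorch version, the CUDA version PyTorch was built against, and the visible GPU names. It assumes nothing beyond an optional PyTorch installation; `cuda_report` is a hypothetical helper name, not a PathML function.

```python
def cuda_report():
    """Return a dict describing whether PyTorch can see a CUDA device.

    Hypothetical helper: if PyTorch is not installed, the report says so
    instead of raising ImportError.
    """
    try:
        import torch
    except ImportError:
        return {
            "torch_installed": False,
            "torch_version": None,
            "cuda_build": None,
            "cuda_available": False,
            "devices": [],
        }

    available = torch.cuda.is_available()
    devices = []
    if available:
        # One name per visible GPU, in device-index order.
        devices = [
            torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
        ]
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": available,
        "devices": devices,
    }


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```

A `cuda_build` of `None` together with `cuda_available: False` usually points to a CPU-only PyTorch build rather than a driver problem.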
## Troubleshooting installation
Installation can be fragile at times due to external dependencies.
If you are having difficulty installing, try the following:
* Look through the GitHub issues to see if someone else has run into the same problem before
* Ensure that the correct versions of all dependencies are installed
* Make sure to use a fresh conda environment
* Use pip's `--no-cache-dir` to prevent using cached files
* Use deterministic environment specifications such as those used in installation options (2) and (3)
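To help with the dependency-version check in the list above, the following sketch prints the installed version of a few packages using the standard-library `importlib.metadata` module. The package list is an assumption chosen for illustration (adjust it to the packages you care about), and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Illustrative package list (an assumption, not an official PathML checklist).
for name, version in installed_versions(["pathml", "numpy", "torch"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Including this output when opening a GitHub issue makes version mismatches easier to spot.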
# Using with Jupyter
Jupyter notebooks are a convenient way to work interactively. To use `PathML` in Jupyter notebooks:
## Set JAVA_HOME environment variable
PathML relies on Java to enable support for reading a wide range of file formats.
Before using `PathML` in Jupyter, you may need to manually set the `JAVA_HOME` environment variable
specifying the path to Java. To do so:
1. Get the path to Java by running `echo $JAVA_HOME` in the terminal in your pathml conda environment (outside of Jupyter)
2. Set that path as the `JAVA_HOME` environment variable in Jupyter:
````
import os
os.environ["JAVA_HOME"] = "/opt/conda/envs/pathml" # change path as needed
````
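A slightly more defensive variant of the snippet above is sketched here: it leaves an existing `JAVA_HOME` untouched and checks that the candidate path is a directory before setting it. `ensure_java_home` is a hypothetical helper name, and the example path is the same placeholder used above.

```python
import os
from pathlib import Path


def ensure_java_home(candidate):
    """Set JAVA_HOME to `candidate` unless it is already set.

    Hypothetical helper: returns the JAVA_HOME value in effect and raises
    FileNotFoundError if JAVA_HOME is unset and `candidate` is not a directory.
    """
    current = os.environ.get("JAVA_HOME")
    if current:
        return current

    path = Path(candidate)
    if not path.is_dir():
        raise FileNotFoundError(f"{candidate} is not a directory")

    os.environ["JAVA_HOME"] = str(path)
    return str(path)


# Example call in a notebook cell (change the path as needed):
# ensure_java_home("/opt/conda/envs/pathml")
```

Run this in the first notebook cell, before importing `pathml`, so the variable is in place when Java is first needed.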
## Register environment as an IPython kernel
````
conda activate pathml
conda install ipykernel
python -m ipykernel install --user --name=pathml
````
This makes the pathml environment available as a kernel in JupyterLab or Jupyter Notebook.
# Contributing
``PathML`` is an open source project. Consider contributing to benefit the entire community!
There are many ways to contribute to `PathML`, including:
* Submitting bug reports
* Submitting feature requests
* Writing documentation and examples
* Fixing bugs
* Writing code for new features
* Sharing workflows
* Sharing trained model parameters
* Sharing ``PathML`` with colleagues, students, etc.
See [contributing](https://github.com/Dana-Farber-AIOS/pathml/blob/master/CONTRIBUTING.rst) for more details.
# Citing
If you use `PathML` in your work, please cite our paper:
Rosenthal J, Carelli R, Omar M, Brundage D, Halbert E, Nyman J, Hari SN, Van Allen EM, Marchionni L, Umeton R, Loda M.
Building tools for machine learning and artificial intelligence in cancer research: best practices and a case study
with the PathML toolkit for computational pathology. *Molecular Cancer Research*, 2021.
DOI: [10.1158/1541-7786.MCR-21-0665](https://doi.org/10.1158/1541-7786.MCR-21-0665)
# License
The GNU GPL v2 version of PathML is made available via Open Source licensing.
The user is free to use, modify, and distribute under the terms of the GNU General Public License version 2.
Commercial license options are also available.
# Contact
Questions? Comments? Suggestions? Get in touch!
[PathML@dfci.harvard.edu](mailto:PathML@dfci.harvard.edu)