https://github.com/choosehappy/histoqc

HistoQC is an open-source quality control tool for digital pathology slides

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: ncbi.nlm.nih.gov
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.5%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: choosehappy
  • License: bsd-3-clause-clear
  • Language: JavaScript
  • Default Branch: master
  • Size: 19.8 MB
Statistics
  • Stars: 298
  • Watchers: 9
  • Forks: 111
  • Open Issues: 65
  • Releases: 4
Created about 8 years ago · Last pushed 11 months ago
Metadata Files
Readme Changelog Contributing License

Readme.md

HistoQC

HistoQC is an open-source quality control tool for digital pathology slides

Requirements

Tested with Python 3.7 and 3.8. Note: the Dockerfile installs Python 3.8, so take this into account if your goal is reproducibility.

Requires:

  1. openslide

And the following additional Python packages:

  1. openslide-python
  2. matplotlib
  3. numpy
  4. scipy
  5. scikit-image (imported as skimage)
  6. scikit-learn (imported as sklearn)
  7. pytest (optional)

You can likely install the python requirements using something like (note python 3+ requirement):

pip3 install -r requirements.txt

The library versions have been pegged to the current validated ones. Later versions are likely to work but may not allow for cross-site/version reproducibility (typically a bad thing in quality control).
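Because reproducibility depends on the pinned versions, it can help to confirm that an environment actually matches them. The following is a minimal sketch, not part of HistoQC: it compares installed versions against the pins listed in this repository's requirements.txt. The function names are hypothetical, and `importlib.metadata` requires Python 3.8 or later.

```python
from importlib import metadata  # standard library in Python 3.8+

# Pins copied from the requirements.txt listing for this repository.
PINNED = {
    "dill": "0.3.3",
    "matplotlib": "3.3.4",
    "numpy": "1.20.1",
    "openslide-python": "1.1.2",
    "scikit-image": "0.18.1",
    "scikit-learn": "0.24.1",
    "scipy": "1.6.1",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin
    to a (pinned, installed) tuple; installed is None when missing."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


def report(pinned=PINNED):
    """Print one line per package that deviates from its pin."""
    for name, (wanted, found) in sorted(find_mismatches(pinned).items()):
        print(f"{name}: pinned {wanted}, installed {found or 'not installed'}")
```

Calling `report()` in the environment used for a run prints nothing when every pin is satisfied, which makes it easy to record alongside results shared across sites.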

OpenSlide binaries must be installed separately, following the instructions for your operating system.

The most basic docker image can be created with the included (7-line) Dockerfile.

Installation

Using docker

Docker is now the recommended method for installing and running HistoQC. Containerized runtimes like docker are more portable and avoid issues with python environment management, and ensure reproducible application behavior. Docker is available for Windows, MacOS, and Linux.

Note: These instructions assume you have docker engine installed on your system. If you do not have docker installed, please see the docker installation instructions.

  1. Begin by pulling the official HistoQC docker image from docker hub. This repository contains the latest stable version of HistoQC and is guaranteed up-to-date.

    ```bash
    docker pull histotools/histoqc:master
    ```

  2. Next, run the docker image with a few options to mount your data directory and expose the web interface on your host machine.

    ```bash
    docker run -v <local-data-dir>:/data --name <container-name> -p <host-port>:5000 -it histotools/histoqc:master /bin/bash

    # Example:
    docker run -v /local/datadir:/data --name my_container -p 5000:5000 -it histotools/histoqc:master /bin/bash
    ```

  3. A terminal session will open inside the docker container. You can now run HistoQC as you would on a local machine.

  4. If you exit the shell, the container will stop running but no data/configuration will be lost. You can restart the container and resume your work with the following command:

    ```bash
    docker start -i <container-name>

    # Example:
    docker start -i my_container
    ```

Using pip

You can install HistoQC into your system by using

```bash
git clone https://github.com/choosehappy/HistoQC.git
cd HistoQC
python -m pip install --upgrade pip  # (optional) upgrade pip to newest version
pip install -r requirements.txt      # (required) install pinned versions of packages
pip install .                        # (recommended) install HistoQC as a package
```

Note that pip install . will install HistoQC as a python package in your environment. If you do not want to install HistoQC as a package, you will only be able to run HistoQC from the HistoQC directory.

Basic Usage

histoqc CLI

Running the pipeline is now done via a python module:

```
C:\Research\code\HistoQC>python -m histoqc --help
usage: __main__.py [-h] [-o OUTDIR] [-p BASEPATH] [-c CONFIG] [-f] [-b BATCH]
                   [-n NPROCESSES] [--symlink TARGETDIR]
                   input_pattern [input_pattern ...]

positional arguments:
  input_pattern         input filename pattern (try: *.svs or
                        targetpath/*.svs ), or tsv file containing list of
                        files to analyze

optional arguments:
  -h, --help            show this help message and exit
  -o OUTDIR, --outdir OUTDIR
                        outputdir, default ./histoqc_output_YYMMDD-hhmmss
  -p BASEPATH, --basepath BASEPATH
                        base path to add to file names, helps when producing
                        data using existing output file as input
  -c CONFIG, --config CONFIG
                        config file to use
  -f, --force           force overwriting of existing files
  -b BATCH, --batch BATCH
                        break results file into subsets of this size
  -s SEED, --seed SEED  set a seed used to produce a random number in all
                        modules
  -n NPROCESSES, --nprocesses NPROCESSES
                        number of processes to launch
  --symlink TARGETDIR   create symlink to outdir in TARGETDIR
```

Installed or simply git-cloned, a typical command line for running the tool thus looks like:

```bash
python -m histoqc -c v2.1 -n 3 "*.svs"
```

which will use 3 processes to operate on all svs files using the named configuration file config_v2.1.ini from the config directory.

In case of errors, HistoQC can be run with the same output directory and will begin where it left off, identifying completed images by the presence of an existing directory.
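Before re-running, it can be useful to estimate how much work remains. The sketch below mirrors the idea of detecting completed images by an existing directory. It is an illustration, not HistoQC's own logic: the function name is hypothetical, and it ASSUMES each processed slide has a directory inside the output directory named after the slide's file name.

```python
import tempfile
from pathlib import Path


def pending_slides(slide_paths, outdir):
    """Return the slides that have no output directory yet.

    ASSUMPTION (not confirmed by the README): each processed slide gets a
    directory inside `outdir` named after the slide's file name,
    e.g. outdir/slide1.svs/.
    """
    outdir = Path(outdir)
    return [p for p in map(Path, slide_paths) if not (outdir / p.name).is_dir()]


# Demo with a throwaway directory standing in for a real output directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.svs").mkdir()  # pretend a.svs was already processed
    todo = pending_slides(["slides/a.svs", "slides/b.svs"], tmp)

print([p.name for p in todo])  # ['b.svs']
```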

histoqc.config CLI

Supplied configuration files can be viewed and modified like so:

```
C:\Research\code\HistoQC>python -m histoqc.config --help
usage: __main__.py [-h] [--list] [--show NAME]

show example config

optional arguments:
  -h, --help   show this help message and exit
  --list       list available configs
  --show NAME  show named example config
```

Alternatively one can specify their own modified config file using an absolute or relative filename:

```bash
python -m histoqc.config --show light > mylight.ini
python -m histoqc -c ./mylight.ini -n 3 "*.svs"
```

histoqc.ui CLI

HistoQC now has an httpd server that allows for improved result viewing. It can be accessed like so:

```
C:\Research\code\HistoQC>python -m histoqc.ui --help
usage: histoqc.ui [-h] [--port PORT] resultsfilepath

launch server for result viewing in user interface

positional arguments:
  resultsfilepath       Specify the full path to the results file. The user
                        must specify this path.

optional arguments:
  -h, --help            show this help message and exit
  --port PORT, -p PORT  Specify the port [default:5000]
```

After completion of slide processing, view results in your web-browser by running the following command:

```bash
python -m histoqc.ui <resultsfilepath>

# Example:
python -m histoqc.ui ./histoqc_output_YYMMDD-hhmmss/results.tsv
```

Note: The results file is a tab-separated file generated by HistoQC containing the quality control metrics for each slide. HistoQC generates the results file in the output directory specified by the -o flag, or formatted as histoqc_output_YYMMDD-hhmmss by default.
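Since the results file is plain tab-separated text, it can also be inspected without the web interface. The sketch below uses only the standard library. The function name and the sample column names are hypothetical, and the handling of `#` lines is an ASSUMPTION about the file layout (metadata lines start with `#`, and the last such line before the data holds the column names); check it against a real results.tsv before relying on it.

```python
import csv
import io


def read_results(text):
    """Parse results-file text into (header, rows).

    ASSUMPTION: lines starting with '#' are metadata, and the last such
    line before the data holds the tab-separated column names.
    """
    header = None
    data_lines = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#"):
            if not data_lines:  # only '#' lines above the data can be the header
                header = line.lstrip("#").split("\t")
        else:
            data_lines.append(line)
    reader = csv.reader(io.StringIO("\n".join(data_lines)), delimiter="\t")
    rows = [dict(zip(header, fields)) for fields in reader] if header else []
    return header, rows


# Made-up sample; real files contain HistoQC's own columns and metrics.
sample = (
    "#hypothetical_metadata: value\n"
    "#filename\tmetric_a\n"
    "slide1.svs\t0.5\n"
    "slide2.svs\t0.7\n"
)
header, rows = read_results(sample)
print(header)  # ['filename', 'metric_a']
```

For a real run, the text would come from the file itself, for example `read_results(open(path).read())` with `path` pointing at the results file in the output directory.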

You may then navigate to http://<hostname>:5000 in your web browser to view the results.

Configuration modifications

HistoQC's performance is significantly improved if you select an appropriate configuration file as a starting point and modify it to suit your specific use case.

If you would like to see a list of provided config files to start you off, you can type

```bash
python -m histoqc.config --list
```

and then you can select one and write it to file like so for your modification and tuning:

```bash
python -m histoqc.config --show ihc > myconfig_ihc.ini
```
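When tuning a copied config, it helps to keep track of what has diverged from the provided starting point. The following is a minimal sketch using Python's standard `configparser`, ASSUMING the config files are ordinary INI files (as the .ini extension suggests). The function names, section name, and keys in the demo are hypothetical and are not taken from any HistoQC config.

```python
import configparser


def load_ini(text):
    """Parse INI text without interpolating '%' characters in values."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    return parser


def changed_settings(base_text, tuned_text):
    """Return {(section, key): (base_value, tuned_value)} for every setting
    that differs; a value missing on one side is reported as None."""
    base, tuned = load_ini(base_text), load_ini(tuned_text)
    changes = {}
    for section in sorted(set(base.sections()) | set(tuned.sections())):
        old = dict(base[section]) if base.has_section(section) else {}
        new = dict(tuned[section]) if tuned.has_section(section) else {}
        for key in sorted(set(old) | set(new)):
            if old.get(key) != new.get(key):
                changes[(section, key)] = (old.get(key), new.get(key))
    return changes


# Hypothetical section and keys, for illustration only.
base_cfg = "[example_step]\nthreshold: 0.5\nname: demo\n"
tuned_cfg = "[example_step]\nthreshold: 0.7\nname: demo\n"
print(changed_settings(base_cfg, tuned_cfg))
# {('example_step', 'threshold'): ('0.5', '0.7')}
```

In practice the two texts would be the output of `python -m histoqc.config --show ihc` and the contents of the edited myconfig_ihc.ini.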

Advanced Usage

See wiki

Notes

Information from HistoQC users appears below:

  1. With the new Pannoramic 1000 scanner, objective-magnification is given as 20 when a 20x objective lens and a 2x aperture boost are used, i.e. the image magnification is actually 40x. Their own CaseViewer somehow determines that a boost exists and reports 40x when objective-magnification in Slidedat.ini is 20, whereas openslide and bioformats report 20x.

1.1. When a slide is converted to svs by CaseViewer, the MPP entry in the ImageDescription meta-parameter gives the average of the x and y mpp. The two values differ slightly for the new P1000. They can be found in the svs meta-parameters tiff.XResolution and YResolution as inverse values, so they have to be converted, also respecting whether ResolutionUnit is centimeter or inch.
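The conversion described in note 1.1 is simple arithmetic: a TIFF resolution is given in pixels per unit, so microns per pixel is the unit length in microns divided by the resolution (1 inch = 25400 µm, 1 centimeter = 10000 µm). The sketch below illustrates it. The function names and the demo values are made up, and the `tiff.ResolutionUnit` key with the strings "inch" and "centimeter" is an ASSUMPTION about how the properties are exposed.

```python
# Length of each supported resolution unit, in microns.
MICRONS_PER_UNIT = {"inch": 25400.0, "centimeter": 10000.0}


def mpp_from_resolution(resolution, unit):
    """Convert a TIFF resolution (pixels per unit) to microns per pixel."""
    try:
        microns = MICRONS_PER_UNIT[unit]
    except KeyError:
        raise ValueError(f"unsupported ResolutionUnit: {unit!r}") from None
    return microns / float(resolution)


def mpp_xy(properties):
    """Return (mpp_x, mpp_y) from a mapping of slide properties.

    ASSUMPTION: the mapping uses the keys tiff.XResolution,
    tiff.YResolution and tiff.ResolutionUnit.
    """
    unit = properties["tiff.ResolutionUnit"]
    return (
        mpp_from_resolution(properties["tiff.XResolution"], unit),
        mpp_from_resolution(properties["tiff.YResolution"], unit),
    )


# Made-up values, chosen only to make the arithmetic easy to follow.
demo = {
    "tiff.ResolutionUnit": "centimeter",
    "tiff.XResolution": "40000",
    "tiff.YResolution": "50000",
}
print(mpp_xy(demo))  # (0.25, 0.2)
```

Keeping x and y separate, rather than averaging them as the ImageDescription MPP entry does, preserves the small difference between the two axes noted above.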

Citation

If you find this software useful, please drop me a line and/or consider citing it:

"HistoQC: An Open-Source Quality Control Tool for Digital Pathology Slides", Janowczyk A., Zuo R., Gilmore H., Feldman M., Madabhushi A., JCO Clinical Cancer Informatics, 2019

Manuscript available here

“Assessment of a computerized quantitative quality control tool for kidney whole slide image biopsies”, Chen Y., Zee J., Smith A., Jayapandian C., Hodgin J., Howell D., Palmer M., Thomas D., Cassol C., Farris A., Perkinson K., Madabhushi A., Barisoni L., Janowczyk A., Journal of Pathology, 2020

Manuscript available here

Owner

  • Login: choosehappy
  • Kind: user

GitHub Events

Total
  • Issues event: 7
  • Watch event: 39
  • Delete event: 1
  • Issue comment event: 11
  • Push event: 5
  • Pull request event: 6
  • Pull request review comment event: 10
  • Pull request review event: 10
  • Fork event: 8
  • Create event: 2
Last Year
  • Issues event: 7
  • Watch event: 39
  • Delete event: 1
  • Issue comment event: 11
  • Push event: 5
  • Pull request event: 6
  • Pull request review comment event: 10
  • Pull request review event: 10
  • Fork event: 8
  • Create event: 2

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 5
  • Total pull requests: 4
  • Average time to close issues: 3 months
  • Average time to close pull requests: 38 minutes
  • Total issue authors: 5
  • Total pull request authors: 1
  • Average comments per issue: 2.4
  • Average comments per pull request: 0.25
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 4
  • Pull requests: 4
  • Average time to close issues: 1 day
  • Average time to close pull requests: 38 minutes
  • Issue authors: 4
  • Pull request authors: 1
  • Average comments per issue: 0.5
  • Average comments per pull request: 0.25
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • jacksonjacobs1 (8)
  • Himanshunitrr (5)
  • choosehappy (3)
  • CielAl (2)
  • suminwei (2)
  • SaharAlmahfouzNasser (1)
  • YoihenBachu (1)
  • yau-lim (1)
  • ClavijoDiego (1)
  • zhaolei4383 (1)
  • EmanuelSoda (1)
  • koellerMC (1)
  • mgilkey (1)
  • usrsbn (1)
  • csittz (1)
Pull Request Authors
  • jacksonjacobs1 (11)
  • nanli-emory (7)
  • ant0nsc (1)
  • suminwei (1)
  • CielAl (1)
Top Labels
Issue Labels
  • Tracker: Feature (3)
  • Priority: High (2)
  • Priority: Low (1)
Pull Request Labels
  • Tracker: Feature (2)

Dependencies

requirements.txt pypi
  • dill ==0.3.3
  • importlib-resources *
  • matplotlib ==3.3.4
  • numpy ==1.20.1
  • openslide-python ==1.1.2
  • pytest *
  • scikit-image ==0.18.1
  • scikit-learn ==0.24.1
  • scipy ==1.6.1
Dockerfile docker
  • python 3.8 build
  • python 3.8-slim build
pyproject.toml pypi
setup.py pypi
.github/workflows/docker-image.yml actions
  • actions/checkout v4 composite
  • docker/build-push-action 3b5e8027fcad23fda98b2e3ac259d8d67585f671 composite
  • docker/login-action f4ef78c080cd8ba55a85445d5b36e214a81df20a composite
  • docker/metadata-action 9ec57ed1fcdbf14dcef7dfbe97b2010124a938b7 composite