https://github.com/bacpop/mandrake

Mandrake 🌿/👨‍🔬🦆 – Fast visualisation of the population structure of pathogens using Stochastic Cluster Embedding

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.9%) to scientific vocabulary

Keywords

cuda embedding genomics gpu pathogens
Last synced: 5 months ago

Repository

Statistics
  • Stars: 34
  • Watchers: 6
  • Forks: 2
  • Open Issues: 5
  • Releases: 5
Created about 6 years ago · Last pushed over 2 years ago
Metadata Files
Readme License

README.md

mandrake

Fast visualisation of the population structure of pathogens using Stochastic Cluster Embedding.

Paper:

Lees JA, Tonkin-Hill G, Yang Z, Corander J. Mandrake: visualizing microbial population structure by embedding millions of genomes into a low-dimensional representation. Philosophical Transactions of The Royal Society B. 2022;377: 20210237.

https://doi.org/10.1098/rstb.2021.0237

Documentation available at: https://mandrake.readthedocs.io/en/latest/

Installation (briefly)

See https://mandrake.readthedocs.io/en/latest/installation.html for more details.

  1. Install miniconda.
  2. Run `conda create -n mandrake_env mandrake` to install into a clean environment.
  3. Run `conda activate mandrake_env` to use the environment.

Refer to the conda-forge documentation if you want to install a CUDA (GPU) enabled version.

Semi-manual

You will need some dependencies, which you can install through conda using the environment.yml file from this repository:

    conda create -n mandrake_env python
    conda env update -n mandrake_env --file environment.yml
    conda activate mandrake_env

You can then clone this repository, and run:

    python setup.py install

GPU acceleration

You will need the CUDA toolkit installed.

If you have the ability to compile CUDA (e.g. nvcc), you should see the message:

    CUDA found, compiling both GPU and CPU code

Otherwise only the CPU version will be compiled:

    CUDA not found, compiling CPU code only
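A quick way to predict which of the two messages to expect is to check whether nvcc is visible on PATH before building. This is only a sketch: it assumes a visible nvcc is a fair proxy for what the build detects, which the README does not confirm, and `cuda_status` is a name made up for illustration.

```shell
#!/bin/sh
# Report whether a CUDA compiler is visible on PATH.
# Assumption: the build's own CUDA detection may use a different check.
cuda_status() {
    if command -v nvcc >/dev/null 2>&1; then
        echo "nvcc found: expect a GPU and CPU build"
    else
        echo "nvcc not found: expect a CPU-only build"
    fi
}

cuda_status
```

If the check reports that nvcc is missing even though the CUDA toolkit is installed, the toolkit's bin directory is probably not on PATH.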

Usage

After installing, an example command would look like this:

    mandrake --sketches sketchlib.h5 --kNN 500 --cpus 4 --maxIter 1000000

This would use a file sketchlib.h5 created by pp-sketchlib to calculate accessory distances using 500 nearest neighbours.

Output can be found in numerous files prefixed mandrake.embedding*.
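The example command can be wrapped in a small script so the parameters are easy to change between runs. The sketch below only prints the command by default and runs it when `RUN=1` is set; the variable names and the `RUN` switch are illustrative choices, not part of mandrake.

```shell
#!/bin/sh
set -eu

# Values taken from the example command; adjust for your own data.
sketches="sketchlib.h5"   # sketch database created by pp-sketchlib
knn=500                   # number of nearest neighbours
cpus=4
max_iter=1000000

# Store the command in the positional parameters so each argument stays one word.
set -- mandrake --sketches "$sketches" --kNN "$knn" --cpus "$cpus" --maxIter "$max_iter"

cmd_line="$*"
echo "$cmd_line"

# Dry run unless RUN=1 is set in the environment.
if [ "${RUN:-0}" = "1" ]; then
    "$@"
    # List the output files written with the mandrake.embedding prefix.
    ls mandrake.embedding*
fi
```

Running the script with `RUN=1` in the environment executes the command and then lists the output files.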

Other useful arguments include:

  • `--alignment`: use a FASTA alignment to calculate distances
  • `--accessory`: use a presence/absence file (Rtab or similar) to calculate distances
  • `--distances`: use a .npz file from a previous run and skip straight to the embedding step
  • `--labels`: give labels to colour the output by
  • `--perplexity`: change the perplexity of the preprocessing (similar to t-SNE)
  • `--animate`: produce a video of the optimisation
  • `--use-gpu`: use a GPU for the run. Make sure to increase `--n-workers`.

See the documentation for more details.
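As a sketch of how some of these arguments might be combined, the script below builds a command that reuses distances from a previous run and colours the output by label, optionally adding the GPU options. The file names, the `USE_GPU` switch and the `--n-workers` value are placeholders chosen for illustration, and it assumes `--use-gpu` is a switch that takes no value; check the documentation before relying on any of them.

```shell
#!/bin/sh
set -eu

# Hypothetical file names: substitute the files from your own run.
distances="previous_run.npz"
labels="labels.txt"

# Skip the distance calculation and go straight to the embedding step.
set -- mandrake --distances "$distances" --labels "$labels"

# Optionally request the GPU. The worker count is a placeholder, not a recommendation.
if [ "${USE_GPU:-0}" = "1" ]; then
    set -- "$@" --use-gpu --n-workers 1024
fi

# Print the assembled command; copy it into a terminal to execute it.
cmd_line="$*"
echo "$cmd_line"
```

Building the command first and printing it makes it easy to review the arguments before committing to a long run.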

Owner

  • Name: Bacterial population genetics
  • Login: bacpop
  • Kind: organization
  • Email: contact@bacpop.org
  • Location: United Kingdom

Pathogen Informatics and Modelling @ EMBL-EBI / Bacterial Evolutionary Epidemiology Group @ Imperial College London

GitHub Events

Total
  • Create event: 7
  • Issues event: 1
  • Release event: 3
  • Watch event: 2
  • Delete event: 5
  • Issue comment event: 3
  • Push event: 96
  • Pull request review event: 3
  • Pull request event: 9
Last Year
  • Create event: 7
  • Issues event: 1
  • Release event: 3
  • Watch event: 2
  • Delete event: 5
  • Issue comment event: 3
  • Push event: 96
  • Pull request review event: 3
  • Pull request event: 9

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 1
  • Total pull requests: 6
  • Average time to close issues: over 2 years
  • Average time to close pull requests: 4 months
  • Total issue authors: 1
  • Total pull request authors: 2
  • Average comments per issue: 9.0
  • Average comments per pull request: 0.5
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 5
  • Average time to close issues: N/A
  • Average time to close pull requests: 5 days
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.6
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • johnlees (1)
  • ccoulombe (1)
Pull Request Authors
  • absternator (5)
  • johnlees (1)
Top Labels
Issue Labels
code (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads: unknown
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 3
conda-forge.org: mandrake
  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 1
Rankings
Dependent repos count: 24.3%
Average: 37.9%
Dependent packages count: 51.6%
Last synced: 7 months ago

Dependencies

docs/requirements.txt pypi
  • Cython >=0.26.1
requirements.txt pypi
  • Cython >=0.26.1
.github/workflows/python-package-conda.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • mamba-org/provision-with-micromamba main composite
boost/Dockerfile docker
  • ubuntu 20.04 build
docs/setup.py pypi
environment.yml pypi
setup.py pypi