Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 2 DOI reference(s) in README
- ○ Academic publication links
- ✓ Committers with academic emails: 1 of 7 committers (14.3%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (16.5%) to scientific vocabulary
Repository
Gene cluster comparison figure generator
Statistics
- Stars: 606
- Watchers: 15
- Forks: 73
- Open Issues: 46
- Releases: 28
Metadata Files
README.md
clinker
Both cblaster and clinker can now be used without installation on the CAGECAT webserver.
Gene cluster comparison figure generator
What is it?
clinker is a pipeline for easily generating publication-quality gene cluster comparison figures.

Given a set of GenBank files, clinker will automatically extract protein translations, perform global alignments between sequences in each cluster, determine the optimal display order based on cluster similarity, and generate an interactive visualisation (using clustermap.js) that can be extensively tweaked before being exported as an SVG file.
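To make the "global alignment" step concrete, here is a minimal sketch of a global alignment and an identity calculation between two protein translations. It is not clinker's code: clinker uses the aligner built into BioPython, whereas this stand-in is a plain Needleman-Wunsch implementation using only the standard library. The function names, the match/mismatch/gap scores and the identity definition (matching columns divided by alignment length) are all assumptions made for illustration. The 0.3 threshold mirrors the default of the -i/--identity option.

```python
# Illustrative stand-in for the alignment step; NOT clinker's implementation.
# Scoring values and the identity definition below are assumptions.

MIN_IDENTITY = 0.3  # mirrors the documented default of -i/--identity


def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns both gapped sequences."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def identity(aligned_a, aligned_b):
    """Fraction of alignment columns holding the same residue (assumed definition)."""
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return matches / len(aligned_a)


aln_a, aln_b = global_align("MKTAY", "MKAY")
keep_link = identity(aln_a, aln_b) >= MIN_IDENTITY
```

A gene-gene link would only be kept when `keep_link` is true, which is how the identity threshold is described in the help text.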
A note on scope:
clinker was designed primarily as a simple way to visualise groups of homologous biosynthetic gene clusters, which are typically small genomic regions containing few genes (as in the example GIF). It performs pairwise alignments of all genes in all input files using the aligner built into BioPython, then generates an interactive SVG document in the browser. The alignment stage scales very poorly to multiple genomes with many genes, and the resulting visualisation will also be very slow because of the number of SVG elements it contains. If you are looking to align entire genomes, you will likely be better served by tools built for that purpose (e.g. Cactus).
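The scaling problem can be estimated with a short calculation. This sketch assumes that every gene in one input is aligned against every gene in each other input, once per unique pair of inputs, as the help text describes. The function name is hypothetical, and clinker may filter or skip some comparisons, so treat the result as an upper bound.

```python
from itertools import combinations


def count_alignments(genes_per_cluster):
    """Upper bound on pairwise alignments across all unique pairs of inputs."""
    return sum(m * n for m, n in combinations(genes_per_cluster, 2))


# Three small gene clusters versus three bacterial-sized genomes.
small = count_alignments([10, 12, 8])
large = count_alignments([4000, 4000, 4000])
```

Under these assumptions the three small clusters need a few hundred alignments, while the three genomes need tens of millions.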

Installation
clinker can be installed directly through pip:
pip install clinker
By cloning the source code from GitHub:
git clone https://github.com/gamcil/clinker.git
cd clinker
pip install .
Or, through conda:
conda create -n clinker -c conda-forge -c bioconda clinker-py
conda activate clinker
Citation
If you found clinker useful, please cite:
clinker & clustermap.js: Automatic generation of gene cluster comparison figures.
Gilchrist, C.L.M., Chooi, Y.-H., 2020.
Bioinformatics. doi: https://doi.org/10.1093/bioinformatics/btab007
Usage
Running clinker can be as simple as:
clinker clusters/*.gbk
This will read in all GenBank files inside the folder, align them, and print
the alignments to the terminal. To generate the visualisation, use the -p/--plot
argument:
clinker clusters/*.gbk -p <optional: file name to save static HTML>
clinker can also parse GFF3 files:
clinker cluster1.gff3 cluster2.gff3 -p
Note: a corresponding FASTA file of the same name (extensions ".fa", ".fsa", ".fna", ".fasta" or ".faa") must
be found in the same directory as the GFF3, i.e. cluster1.fa and cluster2.fa.
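If you want to check your inputs before running clinker, the naming rule above can be sketched as follows. This is a hypothetical pre-flight helper, not part of clinker: the function name is made up, and the order in which extensions are tried is an assumption (the README lists the extensions but does not say which one wins if several exist).

```python
from pathlib import Path
from typing import Optional

# Extensions listed in the note above; the search order is an assumption.
FASTA_EXTENSIONS = (".fa", ".fsa", ".fna", ".fasta", ".faa")


def find_companion_fasta(gff_path) -> Optional[Path]:
    """Return the FASTA file sharing the GFF3 file's name, or None if absent."""
    gff = Path(gff_path)
    for extension in FASTA_EXTENSIONS:
        candidate = gff.with_suffix(extension)
        if candidate.is_file():
            return candidate
    return None
```

For example, `find_companion_fasta("cluster1.gff3")` returns the path to `cluster1.fa` when that file sits next to the GFF3, and `None` when no matching FASTA is found.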
See -h/--help for more information:
```
usage: clinker [-h] [--version] [-r RANGES [RANGES ...]] [-gf GENE_FUNCTIONS]
               [-na] [-i IDENTITY] [-j JOBS] [-s SESSION] [-ji JSON_INDENT]
               [-f] [-o OUTPUT] [-p [PLOT]] [-dl DELIMITER] [-dc DECIMALS]
               [-hl] [-ha] [-mo MATRIX_OUT] [-ufo]
               [files ...]

clinker: Automatic creation of publication-ready gene cluster comparison figures.

clinker generates gene cluster comparison figures from GenBank files. It
performs pairwise local or global alignments between every sequence in every
unique pair of clusters and generates interactive, to-scale comparison figures
using the clustermap.js library.

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit

Input options:
  files                 Gene cluster GenBank files
  -r RANGES [RANGES ...], --ranges RANGES [RANGES ...]
                        Scaffold extraction ranges. If a range is specified,
                        only features within the range will be extracted from
                        the scaffold. Ranges should be formatted like:
                        scaffold:start-end (e.g. scaffold1:15000-40000)
  -gf GENE_FUNCTIONS, --gene_functions GENE_FUNCTIONS
                        2-column CSV file containing gene functions, used to
                        build gene groups from same function instead of
                        sequence similarity (e.g. GENE_001,PKS-NRPS).

Alignment options:
  -na, --no_align       Do not align clusters
  -i IDENTITY, --identity IDENTITY
                        Minimum alignment sequence identity [default: 0.3]
  -j JOBS, --jobs JOBS  Number of alignments to run in parallel (0 to use the
                        number of CPUs) [default: 0]

Output options:
  -s SESSION, --session SESSION
                        Path to clinker session
  -ji JSON_INDENT, --json_indent JSON_INDENT
                        Number of spaces to indent JSON [default: none]
  -f, --force           Overwrite previous output file
  -o OUTPUT, --output OUTPUT
                        Save alignments to file
  -p [PLOT], --plot [PLOT]
                        Plot cluster alignments using clustermap.js. If a path
                        is given, clinker will generate a portable HTML file
                        at that path. Otherwise, the plot will be served
                        dynamically using Python's HTTP server.
  -dl DELIMITER, --delimiter DELIMITER
                        Character to delimit output by [default: human
                        readable]
  -dc DECIMALS, --decimals DECIMALS
                        Number of decimal places in output [default: 2]
  -hl, --hide_link_headers
                        Hide alignment column headers
  -ha, --hide_aln_headers
                        Hide alignment cluster name headers
  -mo MATRIX_OUT, --matrix_out MATRIX_OUT
                        Save cluster similarity matrix to file

Visualisation options:
  -ufo, --use_file_order
                        Display clusters in order of input files

Example usage
Align clusters, plot results and print scores to screen:
  $ clinker files/*.gbk

Only save gene-gene links when identity is over 50%:
  $ clinker files/*.gbk -i 0.5

Save an alignment session for later:
  $ clinker files/*.gbk -s session.json

Save alignments to file, in comma-delimited format, with 4 decimal places:
  $ clinker files/*.gbk -o alignments.csv -dl "," -dc 4

Generate visualisation:
  $ clinker files/*.gbk -p

Save visualisation as a static HTML document:
  $ clinker files/*.gbk -p plot.html

Cameron Gilchrist, 2020
```
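The `scaffold:start-end` format accepted by -r/--ranges can be illustrated with a small parser. This is a sketch for checking range strings before passing them to clinker, not clinker's own parsing code. The names `ScaffoldRange`, `parse_range` and `contains` are hypothetical, and two behaviours are assumptions: splitting on the last colon (so scaffold names may themselves contain colons), and treating "within the range" as full containment of a feature.

```python
from typing import NamedTuple


class ScaffoldRange(NamedTuple):
    scaffold: str
    start: int
    end: int


def parse_range(text):
    """Parse 'scaffold:start-end', e.g. 'scaffold1:15000-40000'."""
    scaffold, colon, span = text.rpartition(":")  # assumption: split on last ':'
    start_text, dash, end_text = span.partition("-")
    if not (scaffold and colon and dash):
        raise ValueError(f"expected scaffold:start-end, got {text!r}")
    start, end = int(start_text), int(end_text)  # int() raises ValueError on junk
    if start > end:
        raise ValueError(f"range start exceeds end in {text!r}")
    return ScaffoldRange(scaffold, start, end)


def contains(rng, feature_start, feature_end):
    """Assumed meaning of 'within the range': the feature lies fully inside."""
    return rng.start <= feature_start and feature_end <= rng.end


example = parse_range("scaffold1:15000-40000")
```

Here `example` holds the scaffold name and both coordinates as integers, so a feature spanning 16000 to 17000 is inside the range and one spanning 39000 to 41000 is not.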
Defining gene groups by function
By default, clinker automatically assigns a name and colour for each group of homologous genes.
You can instead pre-assign names (i.e. functions) using the -gf/--gene_functions argument, which
takes a 2-column comma-separated file like:
GENE_001,Cytochrome P450
GENE_002,Cytochrome P450
GENE_003,Methyltransferase
GENE_004,Methyltransferase
This will generate two groups: Cytochrome P450 (GENE_001 and GENE_002) and Methyltransferase (GENE_003 and GENE_004). If any other homologous genes are identified, they will automatically be added to these groups.
As of clinker v0.0.28, you can now specify colours for genes defined by the
-gf/--gene_functions argument. To do this, use the -cm/--colour_map argument which
also takes a 2-column CSV file containing the group name and hexadecimal colour code like:
Cytochrome P450,#FF0000
Methyltransferase,#0000FF
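To show how the two files relate, the sketch below reads both 2-column CSV files and combines them into one mapping of group name to member genes and colour. It only illustrates the file formats described above and is not how clinker handles them internally; the function names and the shape of the resulting dictionary are assumptions.

```python
import csv
import io
from collections import defaultdict


def read_two_column_csv(handle):
    """Read (first, second) pairs from a 2-column CSV, skipping blank lines."""
    pairs = []
    for row in csv.reader(handle):
        if not row:
            continue
        if len(row) != 2:
            raise ValueError(f"expected 2 columns, got {row!r}")
        pairs.append((row[0].strip(), row[1].strip()))
    return pairs


def build_groups(function_rows, colour_rows):
    """Group genes by function and attach a colour where one is defined."""
    genes_by_function = defaultdict(list)
    for gene, function in function_rows:
        genes_by_function[function].append(gene)
    colours = dict(colour_rows)
    return {
        name: {"genes": genes, "colour": colours.get(name)}
        for name, genes in genes_by_function.items()
    }


# In-memory stand-ins for the two example files shown above.
functions = io.StringIO(
    "GENE_001,Cytochrome P450\nGENE_002,Cytochrome P450\n"
    "GENE_003,Methyltransferase\nGENE_004,Methyltransferase\n"
)
colour_map = io.StringIO("Cytochrome P450,#FF0000\nMethyltransferase,#0000FF\n")

groups = build_groups(read_two_column_csv(functions), read_two_column_csv(colour_map))
```

A group with no entry in the colour map gets `None` here, which stands in for clinker assigning a colour automatically.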
Owner
- Name: Cameron Gilchrist
- Login: gamcil
- Kind: user
- Location: Perth, Western Australia
- Website: http://www.chooilab.org
- Twitter: clmgilchrist
- Repositories: 5
- Profile: https://github.com/gamcil
Postdoc @ Steinegger Lab, Seoul National University. Ex. Chooi Lab, The University of Western Australia.
Citation (CITATION.cff)
cff-version: 1.1.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Gilchrist
    given-names: Cameron
    orcid: https://orcid.org/0000-0001-7798-427X
  - family-names: Chooi
    given-names: Yit-Heng
    orcid: https://orcid.org/0000-0001-7719-7524
title: "clinker & clustermap.js: automatic generation of gene cluster comparison figures"
version: 0.0.21
doi: 10.1093/bioinformatics/btab007
date-released: 2021-05-28
GitHub Events
Total
- Create event: 1
- Release event: 1
- Issues event: 6
- Watch event: 68
- Issue comment event: 14
- Push event: 1
- Fork event: 6
Last Year
- Create event: 1
- Release event: 1
- Issues event: 6
- Watch event: 68
- Issue comment event: 14
- Push event: 1
- Fork event: 6
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Cameron Gilchrist | c****t@g****m | 166 |
| Martin Larralde | m****e@e****e | 3 |
| Sam Minot | s****m@f****g | 3 |
| lukas-val | 6****l | 1 |
| Robert A. Petit III | r****t@g****m | 1 |
| Brady Johnston | 3****n | 1 |
| Aurora | a****i@o****m | 1 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 94
- Total pull requests: 25
- Average time to close issues: about 1 month
- Average time to close pull requests: 1 day
- Total issue authors: 81
- Total pull request authors: 7
- Average comments per issue: 2.66
- Average comments per pull request: 0.48
- Merged pull requests: 24
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 7
- Pull requests: 1
- Average time to close issues: 14 days
- Average time to close pull requests: 15 days
- Issue authors: 7
- Pull request authors: 1
- Average comments per issue: 2.0
- Average comments per pull request: 1.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- RvV1979 (6)
- marade (3)
- mattbird567 (2)
- davidmadariaga (2)
- alexweisberg (2)
- jelber2 (2)
- JulieDaz (2)
- kforcone (2)
- sean-bam (2)
- koolape (1)
- sminot (1)
- yuezhiTang (1)
- jeep3 (1)
- Korkerino (1)
- cifuj (1)
Pull Request Authors
- gamcil (17)
- althonos (3)
- Aurorabili (2)
- sminot (1)
- lukas-val (1)
- rpetit3 (1)
- BradyAJohnston (1)
Packages
- Total packages: 1
- Total downloads: 555 last month (PyPI)
- Total dependent packages: 1
- Total dependent repositories: 2
- Total versions: 31
- Total maintainers: 1
pypi.org: clinker
- Homepage: https://github.com/gamcil/clinker
- Documentation: https://clinker.readthedocs.io/
- License: MIT License
- Latest release: 0.0.31 (published about 1 year ago)
Dependencies
- actions/checkout v2 composite
- actions/setup-python v2 composite
- ubuntu 20.04 build
- biopython >=1.78
- disjoint-set >=0.7.1
- gffutils *
- numpy >=1.13.3
- scipy >=1.3.3