Science Score: 77.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 6 DOI reference(s) in README
- ✓ Academic publication links: links to acs.org, zenodo.org
- ✓ Committers with academic emails: 1 of 10 committers (10.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (17.7%) to scientific vocabulary
Repository
Find clustered hits from a BLAST search
Basic Info
- Host: GitHub
- Owner: gamcil
- License: mit
- Language: HTML
- Default Branch: master
- Size: 48 MB
Statistics
- Stars: 120
- Watchers: 5
- Forks: 25
- Open Issues: 48
- Releases: 37
Metadata Files
README.md
cblaster
Both cblaster and clinker can now be used without installation on the CAGECAT webserver.
Outline
cblaster is a tool for finding clusters of co-located homologous sequences
in BLAST searches.

Given a collection of protein sequences, cblaster can search sequence databases
remotely (via NCBI BLAST API) or locally (via DIAMOND). Search results are parsed
and filtered based on user thresholds for identity, coverage and e-value. The genomic
coordinates of remaining hits are obtained from the NCBI's Identical Protein
Group (IPG) database (or a local database in local searches). Finally,
cblaster scans for instances of collocation and generates visualisations.
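
The filter-then-cluster workflow described above can be sketched in a few lines of Python. This is a minimal illustration, not cblaster's actual implementation: the `Hit` type, the function names, and the default thresholds (`min_identity`, `min_coverage`, `max_evalue`, `max_gap`, `min_hits`) are all assumptions chosen for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical container for one filtered BLAST hit with coordinates."""
    query: str
    subject: str
    scaffold: str
    identity: float
    coverage: float
    evalue: float
    start: int
    end: int


def filter_hits(hits, min_identity=30.0, min_coverage=50.0, max_evalue=0.01):
    """Keep hits that meet all three user thresholds (values are illustrative)."""
    return [
        h for h in hits
        if h.identity >= min_identity
        and h.coverage >= min_coverage
        and h.evalue <= max_evalue
    ]


def find_clusters(hits, max_gap=20000, min_hits=2):
    """Group hits on the same scaffold whose neighbours lie within max_gap bp."""
    by_scaffold = defaultdict(list)
    for hit in hits:
        by_scaffold[hit.scaffold].append(hit)

    clusters = []
    for scaffold_hits in by_scaffold.values():
        scaffold_hits.sort(key=lambda h: h.start)
        current = [scaffold_hits[0]]
        for hit in scaffold_hits[1:]:
            if hit.start - current[-1].end <= max_gap:
                current.append(hit)
            else:
                if len(current) >= min_hits:
                    clusters.append(current)
                current = [hit]
        if len(current) >= min_hits:
            clusters.append(current)
    return clusters


hits = [
    # First two rows are taken from the example output in this README
    Hit("QBE85641.1", "XP_026607259.1", "NW_020797889.1", 75.56, 99.5918, 0.0, 1717881, 1719409),
    Hit("QBE85642.1", "XP_026607260.1", "NW_020797889.1", 89.916, 100.0, 0.0, 1719650, 1720797),
    # Hypothetical weak hit: removed by the thresholds
    Hit("QBE85643.1", "hypothetical_1", "NW_020797889.1", 20.0, 40.0, 1.0, 1721494, 1722934),
    # Hypothetical distant hit: passes the thresholds but is too far away to cluster
    Hit("QBE85644.1", "hypothetical_2", "NW_020797889.1", 64.8, 98.9, 1e-50, 5000000, 5001000),
]

clusters = find_clusters(filter_hits(hits))
for cluster in clusters:
    print(cluster[0].scaffold, [h.query for h in cluster])
```

Sorting hits by start position within each scaffold lets a single pass decide whether the next hit extends the current cluster or starts a new one.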

Installation
cblaster can be installed via pip:
```bash
$ pip3 install cblaster --user
```
or by cloning the repository and installing:
```bash
$ git clone https://github.com/gamcil/cblaster.git
...
$ cd cblaster/
$ pip3 install .
```
Additionally, we provide executables for Windows and Mac which can be downloaded from here.
Once installed, make sure you configure cblaster with your email address:
```bash
$ cblaster config --email name@domain.com
```
You can find example search files, along with generated output, in the examples folder of the repository.
Dependencies
cblaster is tested on Python 3.6, and its only external Python dependency is
the requests module (used for interaction with NCBI APIs).
If you want to perform local searches, you should have diamond installed and available
on your system $PATH.
cblaster will throw an error if a local search is started but it cannot find
diamond or diamond-aligner (alias when installed via apt) on the system.
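
If you want to check for the executable yourself before starting a local search, a lookup along these lines would work. This is a sketch using only the standard library; the function name `find_diamond` and the error message are assumptions, not part of cblaster's API.

```python
import shutil


def find_diamond(candidates=("diamond", "diamond-aligner")):
    """Return the path of the first candidate executable found on $PATH."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    raise RuntimeError(
        "Could not find any of {} on the system $PATH".format(", ".join(candidates))
    )


try:
    print("Using DIAMOND at:", find_diamond())
except RuntimeError as error:
    print("Local searches unavailable:", error)
```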
Usage
cblaster accepts FASTA files and collections of valid NCBI sequence identifiers
(GIs, accession numbers) as input.
A remote search can be performed as simply as:
```bash
$ cblaster search --query_file query.fasta
```
For example, to remotely search the burnettramic acids gene cluster, bua, against the NCBI's nr database:
```bash
$ cblaster search -qf bua.fasta
[12:14:17] INFO - Starting cblaster in remote mode
[12:14:17] INFO - Launching new search
[12:14:19] INFO - Request Identifier (RID): WHS0UGYJ015
[12:14:19] INFO - Request Time Of Execution (RTOE): 25s
[12:14:44] INFO - Polling NCBI for completion status
[12:14:44] INFO - Checking search status...
[12:15:44] INFO - Checking search status...
[12:16:44] INFO - Checking search status...
[12:16:46] INFO - Search has completed successfully!
[12:16:46] INFO - Retrieving results for search WHS0UGYJ015
[12:16:51] INFO - Parsing results...
[12:16:51] INFO - Found 3944 hits meeting score thresholds
[12:16:51] INFO - Fetching genomic context of hits
[12:17:14] INFO - Searching for clustered hits across 705 organisms
[12:17:14] INFO - Writing summary to

Aspergillus mulundensis DSM 5745
NW_020797889.1
Query       Subject         Identity  Coverage  E-value    Bitscore  Start    End      Strand
QBE85641.1  XP_026607259.1  75.56     99.5918   0          742       1717881  1719409  -
QBE85642.1  XP_026607260.1  89.916    100       0          667       1719650  1720797  +
QBE85643.1  XP_026607261.1  89.532    83.1169   0          832       1721494  1722934  +
QBE85644.1  XP_026607262.1  64.829    98.9218   6.51e-157  455       1723252  1724467  -
QBE85645.1  XP_026607263.1  69.97     100       6.93e-157  449       1725113  1726277  -
QBE85646.1  XP_026607264.1  82.759    96.8447   0          670       1726892  1728302  +
QBE85647.1  XP_026607265.1  72.674    99.2048   0          764       1729735  1731338  +
QBE85648.1  XP_026607266.1  56.098    98.324    4.24e-64   205       1731701  1732402  -
QBE85649.1  XP_026607267.1  79.623    99.8746   0          6573      1732820  1745289  +

...
```
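
The RID and RTOE values in the log above come from the response NCBI returns when a search is submitted. As a rough sketch, they could be pulled out of that response as follows. It assumes the response embeds a `QBlastInfo` comment block with `RID = ...` and `RTOE = ...` lines, as described in the NCBI BLAST URL API documentation; the function name and the sample text are hypothetical, and this is not cblaster's own parsing code.

```python
import re


def parse_qblast_info(response_text):
    """Extract the RID and RTOE (in seconds) from a BLAST submission response."""
    rid = re.search(r"^\s*RID = (\S+)", response_text, re.MULTILINE)
    rtoe = re.search(r"^\s*RTOE = (\d+)", response_text, re.MULTILINE)
    if rid is None or rtoe is None:
        raise ValueError("No QBlastInfo block found in response")
    return rid.group(1), int(rtoe.group(1))


# Hypothetical response fragment, reusing the RID and RTOE from the log above
sample = """<!--QBlastInfoBegin
    RID = WHS0UGYJ015
    RTOE = 25
QBlastInfoEnd
-->"""

rid, rtoe = parse_qblast_info(sample)
print("RID:", rid, "- wait about", rtoe, "seconds before polling")
```

The RTOE is only an estimate, which is why the log shows repeated status checks before the search completes.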
A query sequence absence/presence matrix can be generated using the --binary argument:
| Organism | Scaffold | Start | End | QBE85641.1 | QBE85642.1 | QBE85643.1 | QBE85644.1 | QBE85645.1 | QBE85646.1 | QBE85647.1 | QBE85648.1 | QBE85649.1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Aspergillus mulundensis DSM 5745 | NW_020797889.1 | 1717881 | 1745289 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Aspergillus versicolor CBS 583.65 | KV878126.1 | 3162095 | 3187090 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 |
| Pseudomassariella vexata CBS 129021 | MCFJ01000004.1 | 1606356 | 1628483 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 1 |
| Hypoxylon sp. CO27-5 | KZ112517.1 | 92119 | 112957 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 |
| Hypoxylon sp. EC38 | KZ111255.1 | 514739 | 535366 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 |
| Epicoccum nigrum ICMP 19927 | KZ107839.1 | 2116719 | 2142558 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 |
| Aureobasidium subglaciale EXF-2481 | NW_013566983.1 | 700476 | 718693 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
| Aureobasidium pullulans EXF-6514 | QZBF01000009.1 | 18721 | 34295 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
| Aureobasidium pullulans EXF-5628 | QZBI01000512.1 | 329 | 13401 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 |
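
Each row of the matrix above records, per query sequence, whether a cluster contains a hit to it. A minimal sketch of that idea follows; the `binary_row` helper is hypothetical and is not how cblaster is necessarily implemented.

```python
def binary_row(query_ids, queries_with_hits):
    """Return 1 for each query that has a hit in the cluster, otherwise 0."""
    return [1 if query in queries_with_hits else 0 for query in query_ids]


# The nine query sequences from the example search, in column order
query_ids = ["QBE8564{}.1".format(i) for i in range(1, 10)]

# Queries hit in the Aspergillus versicolor CBS 583.65 cluster (all but QBE85644.1)
versicolor_hits = set(query_ids) - {"QBE85644.1"}

row = binary_row(query_ids, versicolor_hits)
print(row)
```

Summing a row gives the number of query sequences represented in that cluster, which is a simple way to rank clusters by completeness.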
cblaster can also generate fully interactive visualisations of the binary
table. To view an example, click here.
For further usage examples and API documentation, please refer to the documentation.
Citation
If you found this tool useful, please cite:
```text
Cameron L M Gilchrist, Thomas J Booth, Bram van Wersch, Liana van Grieken, Marnix H Medema, Yit-Heng Chooi, cblaster: a remote search tool for rapid identification and visualisation of homologous gene clusters, Bioinformatics Advances, 2021, vbab016, https://doi.org/10.1093/bioadv/vbab016
```
cblaster makes use of the following tools:
```
Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).
Acland, A. et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 42, 7–17 (2014).
```
Owner
- Name: Cameron Gilchrist
- Login: gamcil
- Kind: user
- Location: Perth, Western Australia
- Website: http://www.chooilab.org
- Twitter: clmgilchrist
- Repositories: 5
- Profile: https://github.com/gamcil
Postdoc @ Steinegger Lab, Seoul National University Ex. Chooi Lab, The University of Western Australia
Citation (CITATION.cff)
cff-version: 1.1.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Gilchrist
    given-names: Cameron
    orcid: https://orcid.org/0000-0001-7798-427X
  - family-names: Booth
    given-names: Thomas J
    orcid: https://orcid.org/0000-0002-6134-1488
  - family-names: van Wersch
    given-names: Bram
  - family-names: van Grieken
    given-names: Liana
  - family-names: Medema
    given-names: Marnix H
    orcid: https://orcid.org/0000-0002-2191-2821
  - family-names: Chooi
    given-names: Yit-Heng
    orcid: https://orcid.org/0000-0001-7719-7524
title: "cblaster: a remote search tool for rapid identification and visualisation of homologous gene clusters"
version: 1.3.9
doi: 10.1093/bioadv/vbab016
date-released: 2021-08-05
GitHub Events
Total
- Create event: 1
- Release event: 1
- Issues event: 6
- Watch event: 22
- Issue comment event: 30
- Push event: 2
- Pull request event: 4
- Fork event: 6
Last Year
- Create event: 1
- Release event: 1
- Issues event: 6
- Watch event: 22
- Issue comment event: 30
- Push event: 2
- Pull request event: 4
- Fork event: 6
Committers
Last synced: 10 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Cameron Gilchrist | c****t@g****m | 529 |
| Bram | e****0@g****m | 274 |
| LianaGrieken | l****n@g****m | 63 |
| Matthias van den Belt | m****t@b****l | 12 |
| brymerr921 | b****1@g****m | 6 |
| Mohammad Alanjary | m****y@w****l | 2 |
| Martin Larralde | m****e@e****e | 1 |
| DrBoothTJ | 6****J | 1 |
| Chase Clark | 1****c | 1 |
| Friederike Biermann | f****e@b****e | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 68
- Total pull requests: 58
- Average time to close issues: 2 months
- Average time to close pull requests: 2 days
- Total issue authors: 53
- Total pull request authors: 12
- Average comments per issue: 2.25
- Average comments per pull request: 0.33
- Merged pull requests: 50
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 7
- Pull requests: 7
- Average time to close issues: N/A
- Average time to close pull requests: 13 days
- Issue authors: 7
- Pull request authors: 3
- Average comments per issue: 3.14
- Average comments per pull request: 1.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- yye88 (5)
- ghost (3)
- zuoyang97 (3)
- Tim-Kirkwood (2)
- pksun1 (2)
- galacmr (2)
- JordanVV (2)
- corkdagga (2)
- StefaanVerwimp (2)
- mafeeney (2)
- jjsanchezgil (1)
- lydia1201 (1)
- jeep3 (1)
- Dfvandenberg (1)
- aberaslop (1)
Pull Request Authors
- gamcil (27)
- bramvanwersch (13)
- LianaGrieken (4)
- biobeni (4)
- malanjary-wur (2)
- althonos (2)
- kaileyhh (2)
- LucoDevro (2)
- FriederikeBiermann (2)
- MatthiasvdBelt (1)
- brymerr921 (1)
- DrBoothTJ (1)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 141 last month
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 38
- Total maintainers: 1
pypi.org: cblaster
- Homepage: https://github.com/gamcil/cblaster
- Documentation: https://cblaster.readthedocs.io/
- License: MIT
- Latest release: 1.3.20 (published about 1 year ago)
Rankings
Maintainers (1)
Dependencies
- actions/checkout v2 composite
- actions/setup-python v1 composite
- actions/upload-release-asset v1 composite
- tubone24/update_release v1.0 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- codecov/codecov-action v1 composite
- Biopython *
- PySimpleGUI *
- appdirs *
- biopython *
- clinker >=0.0.15
- defusedxml *
- genomicsqlite *
- gffutils *
- numpy *
- requests *
- scipy *