Recent Releases of metasbt
metasbt - MetaSBT v0.1.5
New features
- unpack can automatically rename an unpacked database through the --database input argument;
- update exposes two new arguments, --uncertainty and --pruning-threshold, to tune profiling performance.
Fixes
- db now correctly downloads the selected database version;
- unpack now trims the archived directory structure down to the database folder, so that unpacking works as expected;
- unpack automatically fixes the paths to the bloom filter sketches once a database is unpacked to a new location, which usually differs from the location of the database at the time it was packed;
- update correctly generates a new database even when there are no new unknown genomes.
- Python
Published by cumbof 8 months ago
metasbt - MetaSBT v0.1.4.post1
New object-oriented implementation of MetaSBT. Clusters are consistent with the definition of Average Nucleotide Identity (ANI). Clusters' boundaries are defined as the minimum and maximum ANI distance between all the genomes under a specific cluster.
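The boundary definition above can be illustrated with a minimal sketch. It assumes the pairwise ANI distances are already available in a plain dictionary; the function name `cluster_boundaries`, the genome labels, and the distance values are hypothetical and are not part of the MetaSBT API.

```python
from itertools import combinations


def cluster_boundaries(distances, genomes):
    """Return the (minimum, maximum) pairwise distance among the genomes of a cluster.

    `distances` maps a frozenset of two genome names to their ANI distance.
    This layout is an assumption made for the example only.
    """
    pairwise = [distances[frozenset(pair)] for pair in combinations(genomes, 2)]
    if not pairwise:
        raise ValueError("a cluster needs at least two genomes to define boundaries")
    return min(pairwise), max(pairwise)


# Hypothetical ANI distances between three genomes of the same cluster
distances = {
    frozenset(("g1", "g2")): 0.02,
    frozenset(("g1", "g3")): 0.04,
    frozenset(("g2", "g3")): 0.03,
}

boundaries = cluster_boundaries(distances, ["g1", "g2", "g3"])
print(boundaries)
```

A new genome whose distance to the cluster falls inside this interval would be compatible with the cluster under this definition; how MetaSBT applies the boundaries during profiling and updating is not described in these notes.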
New features
It provides the following subroutines:
- db: List and retrieve public MetaSBT databases;
- index: Index a set of reference genomes and build the first baseline of a MetaSBT database;
- kraken: Export a MetaSBT database into a custom kraken database;
- pack: Build a compressed tarball with a MetaSBT database and report its sha256;
- profile: Profile an input genome and report the closest cluster at all the seven taxonomic levels and the closest genome in a MetaSBT database;
- sketch: Sketch the input genomes;
- summarize: Summarize the content of a MetaSBT database and report some statistics;
- test: Check for dependencies and run unit tests. This must be used by code maintainers only;
- unpack: Unpack a local MetaSBT tarball database;
- update: Update a MetaSBT database with new metagenome-assembled genomes.
The MetaSBT core provides an interface to the Database and Entry class abstractions.
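As an illustration of what the pack subroutine is described as doing (building a compressed tarball and reporting its sha256), here is a minimal sketch using only the Python standard library. The `pack` function, the folder name `MyDatabase`, and the file `metadata.txt` are hypothetical; the actual MetaSBT implementation and database layout may differ.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def pack(db_dir: Path, tarball: Path) -> str:
    """Compress `db_dir` into a gzipped tarball and return its sha256 hex digest."""
    with tarfile.open(tarball, "w:gz") as tar:
        # arcname keeps only the database folder name, not the absolute path
        tar.add(db_dir, arcname=db_dir.name)

    digest = hashlib.sha256()
    with open(tarball, "rb") as handle:
        # Hash the tarball in 1 MiB chunks to keep memory usage flat
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a throwaway directory standing in for a database
with tempfile.TemporaryDirectory() as tmp:
    db_dir = Path(tmp) / "MyDatabase"
    db_dir.mkdir()
    (db_dir / "metadata.txt").write_text("placeholder\n")

    tarball = Path(tmp) / "MyDatabase.tar.gz"
    checksum = pack(db_dir, tarball)

    with tarfile.open(tarball) as tar:
        members = tar.getnames()

print(checksum)
print(members)
```

Storing the members relative to the database folder means the archive can be extracted anywhere without recreating the packer's directory tree.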
Fixes
None
- Python
Published by cumbof 8 months ago
metasbt - MetaSBT v0.1.3
MetaSBT v0.1.3 brings the following improvements.
New features
- New option --uniform-strand available with the index and update modules for processing the input sequences all on the same strand. Mainly used for viral sequences;
- New option --use-representatives available with the index module to use only three representative genomes at the species level;
- New option --resume available with the index and update modules to resume the index and update processes in case of unexpected errors;
- New expand_fasta.py utility in scripts to expand input fasta files into multiple files, one fasta file for each read. Mainly used for viral sequences;
- New fastcluster.py utility in scripts to compute an average-linkage hierarchical clustering of a set of genomes based on their Mash distances;
- Both the index and update modules now display a warning message in case the configuration file under --resume has been previously generated with a different version of MetaSBT;
- Both the index and update modules now integrate CheckV and EukCC for assessing the quality of viruses and eukaryotes;
- CheckM has been upgraded to CheckM2;
- The cluster() function in utils is now running in parallel;
- The howdesbt bfdistance command for computing the distances between bloom filters is now running in parallel.
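To make the purpose of the expand_fasta.py utility concrete, the following is a minimal sketch of splitting a multi-record fasta file into one file per record. It is not the utility's actual code: the function name `expand_fasta`, the `.fna` extension, and the naming of output files after the record identifier are assumptions, and the sketch expects identifiers that are safe to use as file names.

```python
import tempfile
from pathlib import Path


def expand_fasta(fasta: Path, out_dir: Path) -> list:
    """Write each record of a multi-record fasta file to its own file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    handle = None
    try:
        with open(fasta) as source:
            for line in source:
                if line.startswith(">"):
                    # A header line starts a new record: close the previous file
                    if handle is not None:
                        handle.close()
                    fields = line[1:].split()
                    # Fall back to a positional name when the header has no identifier
                    record_id = fields[0] if fields else f"record_{len(written) + 1}"
                    path = out_dir / f"{record_id}.fna"
                    handle = open(path, "w")
                    written.append(path)
                if handle is not None:
                    handle.write(line)
    finally:
        if handle is not None:
            handle.close()
    return written


# Demonstration on a small two-record file
with tempfile.TemporaryDirectory() as tmp:
    multi = Path(tmp) / "viruses.fna"
    multi.write_text(">seq1 first record\nACGT\nACGT\n>seq2\nTTGA\n")
    files = expand_fasta(multi, Path(tmp) / "expanded")
    names = [path.name for path in files]
    second = files[1].read_text()

print(names)
```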
Fixes
- It now correctly checks for new framework versions when starting a new metasbt instance;
- Fixed genome quality filtering on completeness and contamination during the update;
- Improved docstrings by adopting the numpydoc documentation format.
- Python
Published by cumbof over 1 year ago
metasbt - MetaSBT v0.1.2
First public stable release of MetaSBT.
It is composed of the following modules:
- index: build a MetaSBT database by building a series of Sequence Bloom Trees at different taxonomic levels;
- boundaries: define taxonomy-specific boundaries as the minimum and maximum number of kmers in common between all the genomes under a specific cluster;
- profile: taxonomically profile a genome by querying a MetaSBT database at different taxonomic levels;
- report: build a report table describing the content of a MetaSBT database;
- update: update a MetaSBT database with new genomes;
- tar: pack a MetaSBT database into a ready-to-be-distributed tarball;
- install: install a MetaSBT database tarball locally under a specific location of the file system.
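The boundaries module in this release is described in terms of shared kmers rather than ANI. A minimal sketch of that idea follows, assuming exact sets of plain (non-canonical) kmers computed directly from the sequences; MetaSBT itself works on bloom filters, and the function names, toy sequences, and kmer length below are hypothetical.

```python
from itertools import combinations


def kmers(sequence, k):
    """Return the set of all substrings of length k in `sequence`."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def kmer_boundaries(sequences, k):
    """Return the (minimum, maximum) number of kmers shared by any two sequences."""
    kmer_sets = [kmers(sequence, k) for sequence in sequences]
    shared = [len(a & b) for a, b in combinations(kmer_sets, 2)]
    if not shared:
        raise ValueError("at least two sequences are needed to define boundaries")
    return min(shared), max(shared)


# Three toy sequences standing in for the genomes of one cluster
cluster = ["ACGTAC", "ACGTTT", "CGTACG"]
boundaries = kmer_boundaries(cluster, k=3)
print(boundaries)
```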
The framework also comes with a set of utilities:
- bf_sketch.py: build minimal bloom filter sketches with cluster-specific marker kmers;
- esearch_txid.sh: retrieve GCAs from NCBI GenBank given a specific taxonomic ID;
- get_ncbi_genomes.py: retrieve reference genomes and metagenome-assembled genomes under a specific superkingdom and kingdom from NCBI GenBank;
- howdesbt_index.sh: index genomes with HowDeSBT;
- uniform_inputs.sh: make the file extensions of the input genomes uniform.
- Python
Published by cumbof almost 3 years ago