Releases | Open Source Science

skder - v1.3.3

Introduce a new 'lowmemgreedy' dereplication mode in skder to efficiently handle very large datasets.
Fix issues with automated downloading of genomes from NCBI caused by updated listings, and add a catch for bad files.
Switch cidder's skani-based secondary clustering from using skani triangle to skani dist.

Full Changelog: https://github.com/raufs/skDER/compare/v1.3.2...v1.3.3

- Python
Published by raufs 11 months ago

skder - v1.3.2

Fix: Correct the default value of skDER's --skani-triangle-parameters option, which was outdated. This parameter was recently updated to request a screening parameter of X% by default, where X = ANI threshold - 10%. However, it was still being set to -s 90.0 (the previous default) by mistake; this is now corrected.
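
The arithmetic behind the corrected default can be sketched as follows. This is only an illustration: the function name default_triangle_params is hypothetical and is not taken from the skDER code base, and the clamp at zero is an added assumption.

```python
def default_triangle_params(ani_cutoff: float) -> str:
    """Build an illustrative default '-s' argument string for skani triangle.

    Hypothetical helper: the screening value is the ANI cutoff minus 10
    percentage points, as described in the release note.
    """
    # Clamp at 0 so that a very low cutoff never yields a negative value
    # (the clamp is an assumption, not something the release note states).
    screen = max(ani_cutoff - 10.0, 0.0)
    return f"-s {screen:.1f}"


print(default_triangle_params(99.5))  # -s 89.5
```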

What's Changed

Update to v1.3.2 by @raufs in https://github.com/raufs/skDER/pull/10

New Contributors

@raufs made their first contribution in https://github.com/raufs/skDER/pull/10

Full Changelog: https://github.com/raufs/skDER/compare/v1.3.1...v1.3.2

- Python
Published by raufs about 1 year ago

skder - v1.3.1

Minor fix: Update the argument descriptions for inputs to clarify that genome/proteome files provided to CiDDER must be uncompressed, unlike in skDER, where gzipped files are allowed. Also add a check for this in case users provide compressed inputs.
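
One common way to detect compressed inputs is to look at the first two bytes of a file, which are 0x1f 0x8b for gzip. The sketch below shows that idea only; the function name is hypothetical and the check actually used in CiDDER may differ (for example, it may rely on file extensions).

```python
def is_gzipped(path: str) -> bool:
    """Return True if the file at 'path' starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        # Every gzip stream begins with the two bytes 0x1f 0x8b.
        return handle.read(2) == b"\x1f\x8b"
```

A caller could use this to stop early with a clear error message when a compressed genome or proteome file is passed where an uncompressed one is required.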

Full Changelog: https://github.com/raufs/skDER/compare/v1.3.0...v1.3.1

- Python
Published by raufs about 1 year ago

skder - v1.3.0

Restructure code and introduce new modules to simplify the main programs of skder and cidder.
Incorporate faster way to download genomes from NCBI belonging to a single genus/species of interest in GTDB.
Incorporate the latest GTDB release R226.
Add new option to cidder to select additional representative genomes if X% of non-representative genomes are not contained by an individual representative genome. This is performed as a secondary step after the primary representative selection method.
Change the default ANI cutoff from 99.0% to 99.5% identity in skder to reflect thresholds coinciding with sequence type designations as reported by Rodriguez-R et al. 2023.
Lower the default AF* threshold from 90% to 50% to reflect the perspective shared in the dRep documentation.
Allow proteomes to be provided as inputs for CiDDER.

* = initially wrote ANI by mistake.

Full Changelog: https://github.com/raufs/skDER/compare/v1.2.9...v1.3.0

- Python
Published by raufs about 1 year ago

skder - v1.2.9

Minor updates

Make overwriting the output directory safer and introduce a user prompt.
Strip quotes from string arguments containing spaces in case they are not processed properly (related to https://github.com/raufs/skDER/issues/8).
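
A minimal sketch of the quote-stripping idea, assuming the goal is to remove one pair of matching quotes that wrap an argument value such as "-s 90.0". The function name is hypothetical and this is not the code used in skDER.

```python
def strip_wrapping_quotes(value: str) -> str:
    """Remove one pair of matching single or double quotes around a string."""
    value = value.strip()
    # Only strip when the first and last characters are the same quote type.
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
        return value[1:-1]
    return value


print(strip_wrapping_quotes('"-s 90.0"'))  # -s 90.0
```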

Full Changelog: https://github.com/raufs/skDER/compare/v1.2.8...v1.2.9

- Python
Published by raufs over 1 year ago

skder - v1.2.8

Minor changes: update the granet help function and graph creation so that representatives are listed last. Their nodes are thus drawn on top rather than hidden beneath non-representative genomes in very large graphs.

- Python
Published by raufs over 1 year ago

skder - v1.2.7

Update granet to make it deterministic and also introduce the --random-seed option to allow changing the layout if desired.
Fix indentation issues in help function and slight updates to logging and messages in CiDDER and skDER.
Create CiDDER_Results.txt file in CiDDER to capture the order in which representative genomes are selected.

Full Changelog: https://github.com/raufs/skDER/compare/v1.2.6...v1.2.7

- Python
Published by raufs almost 2 years ago

skder - v1.2.6

Make "greedy" mode the default algorithm in skder
Fix handling of gzipped files by the new N50 calculation method introduced in v1.2.4.
Introduce granet, a program for creating network visualizations of genomes that show where the selected representative genomes fall.
Update help functions of skder and cidder to include citation notice for skani and CD-HIT, respectively.

What's Changed

Updated Docker folder by @Lolli-AK in https://github.com/raufs/skDER/pull/7

Full Changelog: https://github.com/raufs/skDER/compare/v1.2.5...v1.2.6

- Python
Published by raufs almost 2 years ago

skder - v1.2.5

In skDER, change the default value of -p, which controls additional arguments passed to skani triangle, from empty to -s 90.0. This raises the screening parameter from skani's default value of 80.0 to 90.0.
Add support for providing directory paths, as well as files, to the -g/--genomes argument, where the directories contain genome files to (also) include in skder/cidder analyses.
Begin development of Docker-based installation support, including a convenience bash wrapper (still in progress).

What's Changed

Docker files by @Lolli-AK in https://github.com/raufs/skDER/pull/6

New Contributors

@Lolli-AK made their first contribution in https://github.com/raufs/skDER/pull/6

Full Changelog: https://github.com/raufs/skDER/compare/v1.2.4...v1.2.5

- Python
Published by raufs almost 2 years ago

skder - v1.2.4

Introduce mgecut - a program that can use PhiSpy or geNomad to predict MGEs in genomes and filter them out prior to genomic dereplication.
Integrate mgecut usage into skder and cidder.
Switch to simple python implementation of N50 adapted from: https://gist.github.com/dinovski/2bcdcc770d5388c6fcc8a656e5dbe53c instead of using pyfastx.
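
For readers unfamiliar with the statistic, N50 is the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the assembly size. The sketch below is a generic pure-Python version of that definition; it is not the gist linked above nor the code shipped in skDER, and the function names are illustrative.

```python
def contig_lengths(lines):
    """Collect per-record sequence lengths from an iterable of FASTA lines."""
    lengths = []
    current = 0
    seen_header = False
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if seen_header:
                lengths.append(current)
            seen_header = True
            current = 0
        elif seen_header:
            current += len(line)
    if seen_header:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0


print(n50([100, 80, 60, 40, 20]))  # 80
```

Because contig_lengths accepts any iterable of text lines, the same code works on a handle from open() or from gzip.open(path, "rt").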

Full Changelog: https://github.com/raufs/skDER/compare/v1.2.3...v1.2.4

- Python
Published by raufs almost 2 years ago

skder - v1.2.3

Update and further polish arguments (e.g. underscores to dashes).
Add options for secondary clustering for cidder using protein cluster containment* and/or skani (https://github.com/raufs/skDER/issues/5).

Full Changelog: https://github.com/raufs/skDER/compare/v1.2.2...v1.2.3

- Python
Published by raufs almost 2 years ago

skder - v1.2.2

Add memory option for CD-HIT usage in cidder and set default to unlimited.
Cosmetic changes to code / comments / help-function

Full Changelog: https://github.com/raufs/skDER/compare/v1.2.1...v1.2.2

- Python
Published by raufs almost 2 years ago

skder - v1.2.1

Correct for default parameter settings in CiDDER
Add/update expected test results

- Python
Published by raufs almost 2 years ago

skder - v1.2.0

Introduce CiDDER - a CD-HIT based dereplication program to ensure you properly sample the pangenome space of a species/genus.
Introduce new option to skDER to test a bunch of cutoffs for ANI and AF and generate a heatmap on the number of representative genomes that results from different combinations. E.g. useful if you want to limit your analysis to X genomes but don't know what ANI/AF cutoffs to use.

Full Changelog: https://github.com/raufs/skDER/compare/v1.1.1...v1.2.0

- Python
Published by raufs almost 2 years ago

skder - v1.1.1

Add creation of "COMPLETED.txt" at the very end of skDER for incorporation in workflows.
Add option to build indices locally when computing N50s instead of in the directory of the input genome.

- Python
Published by raufs about 2 years ago

skder - v1.1.0

Introduce ability to specify GTDB release and update to using GTDB R220 as default for when users request to auto-download and include all genomes from a particular genus/species.
Remove need for symlinking genomes locally, instead fastx index files are now written in the same folder as the input genomes and deleted afterwards.
Parallelization when computing N50 is done by splitting the genomes across the allocated CPUs, so that at most X files are written at a time, where X is the number of CPUs. This addresses https://github.com/raufs/skDER/issues/4
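
The chunking idea can be sketched as below. The release note does not say how genomes are assigned to CPUs, so the round-robin split and the function name are assumptions made for illustration only.

```python
def split_into_chunks(items, n_chunks):
    """Split 'items' into at most 'n_chunks' lists using round-robin assignment."""
    # Never create more chunks than items, and always create at least one.
    n_chunks = max(1, min(n_chunks, len(items)))
    return [items[i::n_chunks] for i in range(n_chunks)]


print(split_into_chunks(list(range(7)), 3))  # [[0, 3, 6], [1, 4], [2, 5]]
```

If each worker then processes its chunk one genome at a time, no more than one file per worker is open for writing at any moment.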

- Python
Published by raufs about 2 years ago

skder - v1.0.10

Minor change, added new argument to use https://ftp.ncbi.nlm.nih.gov/genomes instead of https://ftp.ncbi.nih.gov/genomes in case there are issues with connecting to the latter. This gets passed to ncbi-genome-download's -u argument.

Full Changelog: https://github.com/raufs/skDER/compare/v1.0.9...v1.0.10

- Python
Published by raufs about 2 years ago

skder - v1.0.9

Support for gzipped files added (https://github.com/raufs/skDER/issues/4)
GTDB/NCBI downloaded genomes are now kept in gzip form
FASTA files ending in *.fas now allowed (https://github.com/raufs/skDER/issues/4)
If local input genomes are provided, default behavior is now to symlink files in the skDER results directory and do indexing for N50 calculation there.
FASTA confirmation is now optional, because it is currently iterative and can take a while if there are a lot of files (it might be parallelized in the future and turned back on as the default).

Full Changelog: https://github.com/raufs/skDER/compare/v1.0.8...v1.0.9

- Python
Published by raufs about 2 years ago

skder - v1.0.8

Fix broken GTDB-based downloading feature.
Polish names for genomic assemblies downloaded based on GTDB species names.

Full Changelog: https://github.com/raufs/skDER/compare/v1.0.7...v1.0.8

- Python
Published by raufs over 2 years ago

skder - v1.0.7

Corrected faulty usage of the -s option in skani triangle and now set it to the default value. This should now result in the more accurate ANI estimates being used for the dereplication methods as intended.
Updated stats and runtime info for running dynamic/greedy approaches on the Wiki.
Added new secondary clustering option, -n which will report the relation/distance of all genomes in the input set to their nearest representative genome.

Full Changelog: https://github.com/raufs/skDER/compare/v1.0.6...v1.0.7

- Python
Published by raufs over 2 years ago

skder - v1.0.6

Mostly just updates to the README & help function.
Added missing library import statements in util.py

Full Changelog: https://github.com/raufs/skDER/compare/v1.0.5...v1.0.6

- Python
Published by raufs over 2 years ago

skder - v1.0.5

update packaging of program + installation guide

- Python
Published by raufs almost 3 years ago

skder - v1.0.4

fix SKDER_PATH now that skder moved to bin/

- Python
Published by raufs almost 3 years ago

skder - v1.0.2

updates for v.1.0.2

KEY: Correct overflow issue in C++ code related to integer multiplication in computing scores for dynamic dereplication approach
Introduce a greedy set cover dereplication approach as an alternate method
Improve code + documentation organization
Add test case
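
For context, a textbook greedy set cover can be sketched as follows: each genome "covers" itself plus the genomes it is similar to, and the genome covering the most still-uncovered genomes is picked repeatedly. This is a generic illustration with made-up genome names; it does not reproduce skDER's scoring (which, per these notes, is implemented in C++), and ties are simply broken alphabetically here.

```python
def greedy_representatives(neighbors):
    """Pick representatives by greedy set cover.

    'neighbors' maps each genome name to the set of genomes considered
    similar to it (for example, pairs passing the ANI and AF cutoffs).
    """
    uncovered = set(neighbors)
    representatives = []
    while uncovered:
        # max() keeps the first maximum, so sorting gives a stable tie-break.
        best = max(
            sorted(neighbors),
            key=lambda g: len((neighbors[g] | {g}) & uncovered),
        )
        representatives.append(best)
        uncovered -= neighbors[best] | {best}
    return representatives


similar = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": {"A"},
    "D": {"E"},
    "E": {"D"},
}
print(greedy_representatives(similar))  # ['A', 'D']
```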

Full Changelog: https://github.com/raufs/skDER/compare/v1.0.1...v1.0.2

- Python
Published by raufs almost 3 years ago

skder - v1.0.1

Create directory with representative genomes in the output directory.
Add a version flag and change the --genomes argument from accepting a directory to accepting multiple paths to genome files.
Update Enterococcus dereplication showcasing.

Full Changelog: https://github.com/raufs/skDER/compare/v1.0...v1.0.1

- Python
Published by raufs almost 3 years ago

skder - v.1.0

First release of skDER.

- Python
Published by raufs almost 3 years ago
