amptk - amptk v1.6.0

The ITS database has grown too large to be hosted as a single file on OSF, so it is now split into two files. All users will need to upgrade to v1.6.0 for the database to be downloaded and used properly.

- Python
Published by nextgenusfs over 2 years ago

amptk - AMPtk v1.5.5

bug fix for amptk unoise3 #96 #99

- Python
Published by nextgenusfs over 3 years ago

amptk - AMPtk v1.5.4

bug fix for amptk database that was stripping species epithet names #81
bug fix for linux systems related to platform, fix was to use distro #91
added action to build docker images and push them to Docker Hub. A fully functional version can be run with the amptk-docker shell script; for usage see https://amptk.readthedocs.io/en/latest/index.html#run-from-docker
Updated fungal ITS database based off of UNITE v8.3, upgrade with amptk install
Added new PR2 database for universal SSU amplicons, https://github.com/pr2database/pr2database, add with amptk install
Re-wrote amptk funguild instead of running the original Guilds.py script as this stopped working at some point

- Python
Published by nextgenusfs over 4 years ago

amptk - AMPtk v1.5.3

typo/bug fixes #84 #85
fix to multiprocessing so that py>3.7 now works #83

- Python
Published by nextgenusfs almost 5 years ago

Bug fix release

fix for merging PE reads with vsearch
fix dereplicate function in database #77
fix for filter if --negatives passed but no mock community barcode
fix for taxonomy where custom databases were not working with the -d flag #80
add -p illumina3 to SRA-submit

- Python
Published by nextgenusfs about 5 years ago

amptk - amptk v1.5.1

to support OSX Catalina dropping 32-bit applications, I've removed usearch9 as a strict dependency. vsearch will be the default for all processing steps, including taxonomy assignment. Along these lines, I've dropped UTAX classifier as "default" in the hybrid taxonomy method.
dropping usearch required the addition of a few new dependencies: mafft, fasttree, and the python package pyfastx (for speed and simplicity).
added support for PacBio CCS reads. Reads can be processed with amptk pacbio and then clustered with amptk pb-dada2
several bug fixes
fix embarrassing typos in v1.5.0 -- do not use v1.5.0, it's broken.
apparently it does not work in python 3.8 -- investigating that for future release/support.
working on ONT (Oxford Nanopore) support as well -- stay tuned for this in near future.

- Python
Published by nextgenusfs over 5 years ago

amptk - amptk v1.4.2

add --pseudopool option to amptk dada2

- Python
Published by nextgenusfs over 6 years ago

amptk - amptk v1.4.1

bug fix for amptk summarize if taxonomy not present #57
add FASTA extraction step to install #59
add chimera detection options to dada2 #60
fix DADA2 for Ion Torrent data: the quality score issue has been fixed in newer versions of dada2, so quality scores can now be used during sample inference.

- Python
Published by nextgenusfs almost 7 years ago

amptk - amptk v1.4.0

fix for amptk summarize --> rewrote script
large update to taxonomy assignment, switched to vsearch for all global alignment steps
update to all taxonomy databases, the downloads are larger but the data should be more robust
update to the COI bold2utax method in docs
update to amptk database and the command line options
bug fixes along the way

- Python
Published by nextgenusfs over 7 years ago

This is a major reorganization of the code so it is "properly packaged" and can be installed with pip and hopefully conda. After v1.2.4 there were apparently changes to conda that would not allow the scripts to build with the previous code organization. I've also fixed a few of the bugs that have shown up more recently, including:
the py2 error in the bold2amptk.py accessory script https://github.com/nextgenusfs/amptk/issues/40
the edlib version bug reported several times after edlib updated how it stores the version info https://github.com/nextgenusfs/amptk/issues/46

- Python
Published by nextgenusfs over 7 years ago

amptk - amptk v1.2.5

bug fix where the edlib version was not properly parsed due to changes upstream; the fix is backwards compatible and hopefully compatible with future versions as well. This was the error in #40 and several other "offline" emails.
update to amptk stats to generate interactive html output for NMDS output, allowing users to identify which samples correspond to which point on the graph. This requires r-plotly, r-htmltools, and r-dt -- however, all of these should be installed through bioconda
bug fixes for amptk SRA-submit
improve error/logging in amptk taxonomy
fix R scripts to be compatible with R>3.4.1 -- the command line arguments are apparently parsed differently there. Also removed the "auto-install" function in the R scripts -- this was meant to be a convenience but didn't always work.
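
A minimal sketch of tolerant version parsing, of the kind the edlib fix in this release calls for. The helper names parse_version and meets_minimum are hypothetical and are not AMPtk functions; the actual fix may work differently.

```python
import re


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple of ints."""
    # Take the first run of dot-separated numbers and ignore any prefix or suffix.
    match = re.search(r"\d+(?:\.\d+)*", str(text))
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, required):
    """True if the installed version is at least the required one."""
    a, b = parse_version(installed), parse_version(required)
    # Pad the shorter tuple with zeros so "1.2" compares equal to "1.2.0".
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b
```

Comparing integer tuples rather than raw strings avoids ordering "1.10.0" before "1.2.0".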

- Python
Published by nextgenusfs over 7 years ago

amptk - amptk v1.2.4

trying to fix tab error that bioconda didn't like, version bump accordingly.

- Python
Published by nextgenusfs about 8 years ago

amptk - amptk v1.2.3

fix auto-detect of base name output files in clustering scripts
bump version
update citation as now published in PeerJ https://peerj.com/articles/4925/

- Python
Published by nextgenusfs about 8 years ago

amptk - amptk v1.2.2

fix tab indentation bug for py3
add minimum length and trimming length to log file and terminal output.

- Python
Published by nextgenusfs about 8 years ago

amptk - amptk v1.2.1

numerous bug fixes: including a fix for --require_primer off in amptk illumina. Several bug fixes in amptk illumina2 and amptk illumina3 pre-processing steps.
thread (processes) control added for clustering steps

- Python
Published by nextgenusfs about 8 years ago

amptk - amptk v1.2.0

for all Illumina pre-processing methods the order of steps has been changed to now 1) search for primers/trim and then 2) merge PE reads. This ensures that all primer/adapter sequences are trimmed from the dataset, whereas previous AMPtk releases merged PE reads first - because of how usearch/vsearch merge PE reads, occasional primer/adapter sequences were slipping through the pre-processing steps. Also, reads with multiple primer hits (i.e. two forward primers) are now discarded. While this makes the pre-processing steps slightly slower, it increases the data quality downstream.
read orientation is tested/fixed on the fly for the amptk illumina2 workflow (barcodes/primers in reads). Some datasets have a 50/50 mix of read orientations.
added a check for "inverted OTUs" for all denoising/clustering steps. This was largely a check to validate that the changes to the Illumina pre-processing steps were working correctly (an unintended result was a small number of OTUs that were on the "crick" strand)
Default mapping file now has a 'RevBarcodeSequence' column. Mapping files used for amptk illumina2 enforce the paired barcode sequences in the mapping file. If barcode fasta files are given, then all combinations of barcodes (5' and 3') are saved.
a few py27/36 bug fixes
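
The "all combinations of barcodes (5' and 3')" behaviour described in this release can be sketched as a Cartesian product. This is only an illustration: the function name barcode_pairs, the "F1_R1" label format, and the example sequences are assumptions, not AMPtk's own naming.

```python
from itertools import product


def barcode_pairs(forward, reverse):
    """Yield (label, fwd_seq, rev_seq) for every 5'/3' barcode combination."""
    for (f_name, f_seq), (r_name, r_seq) in product(forward.items(), reverse.items()):
        # Hypothetical label format joining the two barcode names.
        yield f"{f_name}_{r_name}", f_seq, r_seq


# Made-up barcodes, for illustration only.
forward = {"F1": "ACGTAC", "F2": "TGCATG"}
reverse = {"R1": "GGATCC"}
pairs = list(barcode_pairs(forward, reverse))
```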

- Python
Published by nextgenusfs about 8 years ago

amptk - amptk v1.1.3

update to amptk taxonomy to alert user if sample names are duplicated
many fixes to unify the "base name" output for many scripts
remove colored date/time stamp for all platforms except Mac
many fixes for py2/py3 compatibility
amptk now moved to bin directory - will slightly change install

- Python
Published by nextgenusfs about 8 years ago

amptk - amptk v1.1.2

update compatibility for py2/3

- Python
Published by nextgenusfs about 8 years ago

amptk - amptk v1.1.1

bug fixes for amptk lulu, bug fixes for amptk taxonomy to better deal with 16S 8 level taxonomy
added some minor functions to amptk filter
improvement of amptk illumina3, as well as allowing --barcode_rev_comp to reverse complement barcode sequences, i.e. if indexing was done on the reverse primer. A mapping file is no longer required; you can also pass primers and a barcode FASTA file for demuxing.
clean up repo slightly, trying to get conda recipe working.
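
For readers unfamiliar with what reverse complementing a barcode means (the --barcode_rev_comp option in this release), here is a small sketch. The function name reverse_complement is hypothetical and this is not AMPtk's own code.

```python
# Complement table covering the IUPAC ambiguity codes as well as A/C/G/T.
_COMPLEMENT = str.maketrans(
    "ACGTRYSWKMBDHVNacgtryswkmbdhvn",
    "TGCAYRSWMKVHDBNtgcayrswmkvhdbn",
)


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]
```

A barcode sequenced from the reverse primer side reads as the reverse complement of the sequence listed in the mapping file, which is why the lookup has to be flipped in that case.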

- Python
Published by nextgenusfs about 8 years ago

amptk - amptk v1.1.0

bug fix for amptk filter and the subtract feature https://github.com/nextgenusfs/amptk/issues/31
fix menu option https://github.com/nextgenusfs/amptk/issues/33
added LULU module for OTU curation, see more here: https://www.nature.com/articles/s41467-017-01312-x
LULU usage: amptk cluster --> amptk filter --> amptk lulu --> amptk taxonomy --> amptk stats
added indicator species analysis to amptk stats
several bug fixes for amptk stats
added option to drop specific OTUs prior to running amptk stats

- Python
Published by nextgenusfs over 8 years ago

amptk - amptk v1.0.3

update the ITS database to use newest version of UNITE database version 01.12.2017.

- Python
Published by nextgenusfs over 8 years ago

amptk - amptk v1.0.2

bug fix for amptk show where a non-gzipped input file was removed after being passed
added gzip support for amptk sample
bug fix for amptk filter where the script would die if the OTUs and OTU table did not overlap 100%; added a sanity check
changed --col_order in amptk filter to be a space separated list (was previously comma separated with no space).

- Python
Published by nextgenusfs over 8 years ago

amptk - amptk v1.0.1

minor bug fixes for amptk database
update to amptk filter to only label potential chimera OTUs if --calculate all is passed, i.e. you are using a synthetic mock that won't be in the rest of your samples

- Python
Published by nextgenusfs over 8 years ago

amptk - amptk v1.0.0

now requires edlib v1.2.1 -> thanks to Edlib developer (Martin Šošić) for finding the bug that was preventing full usage of edlib in amptk. primer and barcode searches now very fast.
added a new phyloseq module, amptk stats which will run some preliminary community ecology stats on your BIOM output file. requires R and phyloseq
added support for UNOISE3 via the amptk unoise3 command, note you will need to have USEARCHv10 for this script to work.
updated global alignment taxonomy search to better deal with multiple hits in the reference database
updated ITS reference database as well as COI database.
finally wrote some more comprehensive documents, located at http://amptk.readthedocs.io
several minor bug fixes

- Python
Published by nextgenusfs over 8 years ago

amptk - amptk v0.10.3

bug fixes for amptk database. Rewrote the dereplication function and added --lca or last common ancestor function for building the UTAX databases.
bug fixes for amptk SRA-submit and update to edlib for searching.
bug fix for amptk filter where samples passed to --drop are not used for index-bleed calculation
bug fix for pre-processing and --mult_samples argument
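
The idea behind a last common ancestor (--lca) can be sketched as the longest shared prefix of several lineages. The function name last_common_ancestor, the "k:...,p:..." rank strings, and the way the lineages are supplied are assumptions for illustration; how amptk database applies LCA internally is not shown here.

```python
def last_common_ancestor(lineages):
    """Return the longest shared prefix of several lineages (lists of ranks)."""
    if not lineages:
        return []
    shared = []
    # zip stops at the shortest lineage; walk ranks from kingdom downward.
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return shared


# Two hypothetical lineages that disagree only at the species level.
hits = [
    "k:Fungi,p:Ascomycota,c:Sordariomycetes,g:Fusarium,s:Fusarium oxysporum".split(","),
    "k:Fungi,p:Ascomycota,c:Sordariomycetes,g:Fusarium,s:Fusarium solani".split(","),
]
lca = ",".join(last_common_ancestor(hits))
```

Here the two records agree down to genus, so the LCA label stops at g:Fusarium.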

- Python
Published by nextgenusfs over 8 years ago

amptk - amptk v0.10.2

bug fix for -t, --threshold option of amptk select and amptk remove. Bug was introduced during last update when support for compressed files was introduced
AMPtk now supports degenerate nucleotide primer matching, thanks to the very fast edlib v1.2 library. You will need to have at least edlib v1.2.0, can upgrade with pip, i.e. pip install -U edlib. The scripts will check your edlib version during runtime and let you know if you need to upgrade.
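
To illustrate what degenerate nucleotide primer matching means, here is a brute-force, ungapped sketch in pure Python. It does not use edlib and is not how AMPtk performs the search; the function names and the max_mismatch parameter are hypothetical.

```python
# IUPAC code -> set of bases the code stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatches(primer, window):
    """Count positions where the read base is not allowed by the primer code."""
    return sum(base not in IUPAC.get(code, "") for code, base in zip(primer, window))


def find_primer(primer, read, max_mismatch=2):
    """Return the first start index where the primer matches, or -1 if none does."""
    for start in range(len(read) - len(primer) + 1):
        if mismatches(primer, read[start:start + len(primer)]) <= max_mismatch:
            return start
    return -1
```

With degenerate support, an R in the primer matches either A or G in the read at no cost, so it no longer uses up the mismatch allowance.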

- Python
Published by nextgenusfs almost 9 years ago

amptk - amptk v0.10.1

bug fix for amptk filter where some mock sequences were not being annotated correctly
upgrade amptk SRA-submit to use edlib alignment
fix menu in several places

- Python
Published by nextgenusfs almost 9 years ago

amptk - amptk v0.10.0

The major update is the switch to the edlib library for alignment, which gives a dramatic increase in speed. The downside is that edlib does not currently support degenerate nucleotides (hoping to get this fixed soon). You can increase --primer_mismatch to allow for degenerate matches, but keep in mind that any degenerate nucleotide is currently counted as a mismatch. v0.9.3 still supports degenerate nucleotides, although its alignment is much less accurate and 10X slower.
edlib alignment now supports barcode_mismatches as well without a loss in speed.
update to MergePE function, which allows the user to select either vsearch or usearch for merging paired-end fastq files, controlled via --merge_method. Update to phiX filtering to split files if >3GB to avoid a memory problem in 32-bit USEARCH.
add amptk illumina3 method for pre-processing, this will demultiplex Illumina PE files along with index read files
support for gzipped input files; demuxed files are now written as fq.gz by default, which saves space during processing.
updated docker container and install instructions

- Python
Published by nextgenusfs almost 9 years ago

amptk - amptk v0.9.3

remove bedtools as dependency for converting BAM -> FASTQ. Now AMPtk will first try to use samtools if it exists, then bedtools if it exists, and will default to pybam native python parser to convert. Pybam is 10X slower than samtools, but is written in python thus no extra dependencies needed.
added threshold filtering to amptk remove and amptk select, so you could remove all samples with fewer than 5000 reads by running: amptk remove -i input.demux.fq -t 5000 -o output.demux.fq
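
The converter fallback order described in this release (samtools, then bedtools, then pybam) can be sketched with shutil.which. The function name pick_bam_converter and the injectable which argument are hypothetical and only illustrate the selection logic, not AMPtk's implementation.

```python
import shutil


def pick_bam_converter(which=shutil.which):
    """Return the name of the preferred available BAM -> FASTQ converter."""
    for tool in ("samtools", "bedtools"):
        # shutil.which returns the executable path, or None if not on PATH.
        if which(tool):
            return tool
    # Neither external tool was found: fall back to the pure-Python parser.
    return "pybam"
```

Passing which as an argument makes the choice easy to exercise without either tool installed.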

- Python
Published by nextgenusfs about 9 years ago

amptk - amptk v0.9.2

bug fix for amptk dada2 denoising where reads were getting ignored if they contained any ambiguous nucleotides. The filter for ambiguous nucleotides is still applied prior to DADA2. Note that terminal N's from padding are properly removed; only internal ambiguous nucleotides are not allowed in DADA2
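
The distinction between terminal padding and internal ambiguous bases can be sketched as follows. This assumes padding is a run of N at the 3' end of the read; the function names are hypothetical and this is not the AMPtk code.

```python
def strip_padding(seq, pad="N"):
    """Remove terminal padding characters from the 3' end of a read."""
    return seq.rstrip(pad)


def has_internal_ambiguity(seq):
    """True if any base other than A/C/G/T remains after padding is removed."""
    trimmed = strip_padding(seq.upper())
    return any(base not in "ACGT" for base in trimmed)
```

A padded read such as ACGTNNNN passes this check, while ACNGTNN would still be filtered out.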

- Python
Published by nextgenusfs about 9 years ago

amptk - amptk v0.9.1

Add phix filtering for Illumina data. As part of the PE merging function in amptk illumina and amptk illumina2, scripts will also now run phix removal.
Workaround for DADA2 error where samples that have only 1 read after filtering trigger a derep$quals matrix error. amptk dada2 now has a -m, --min_reads option to drop samples that have fewer than -m reads. By default this is set to 10; in practice it should probably be much higher, but the default is enough to avoid the above error.
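
The effect of a minimum-reads cutoff such as --min_reads can be sketched as a per-sample count followed by a filter. The function name samples_to_keep is hypothetical, and the input is assumed to be one sample label per read; how AMPtk extracts those labels from the demuxed file is not shown.

```python
from collections import Counter


def samples_to_keep(read_labels, min_reads=10):
    """Return the set of sample names with at least min_reads reads."""
    counts = Counter(read_labels)
    return {sample for sample, n in counts.items() if n >= min_reads}


# Made-up read labels: sample B has a single read and is dropped.
labels = ["A"] * 12 + ["B"] * 1 + ["C"] * 10
kept = samples_to_keep(labels, min_reads=10)
```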

- Python
Published by nextgenusfs about 9 years ago

amptk - amptk v0.9.0

added better support for amptk SRA-submit
added ability to normalize heat map
added amptk SRA which can be used to process reads downloaded from the SRA, where they are in a single FASTQ file, i.e. from ION or 454 data that has been demultiplexed into samples and then submitted.
created Dockerfile for using amptk with the scipy-notebook jupyter notebook server.

- Python
Published by nextgenusfs about 9 years ago