Recent Releases of parsnp

parsnp - v2.1.4

  • Use mauve format for partitioning mode and add VCF header for bcftools compatibility (#169 #177)
  • Fix logged sequence lengths #173
  • Use python3 in shebang line instead of python

- C++
Published by bkille 11 months ago

parsnp - v2.1.3

Parsnp now outputs accurate CHROM and POS in the VCF output for multi-fasta references (they were previously being treated as a single concatenated sequence in terms of coordinates).
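
To illustrate the coordinate change, here is a minimal sketch (not Parsnp's actual code) of mapping a 1-based position on a concatenated reference back to a per-contig CHROM and POS. The function name, contig names, and lengths are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def concat_to_contig(pos, contigs):
    """Map a 1-based position on the concatenated reference to (CHROM, POS).

    `contigs` is an ordered list of (name, length) pairs (hypothetical input).
    """
    names = [name for name, _ in contigs]
    # Cumulative end offsets, e.g. lengths [100, 50] -> [100, 150].
    ends = list(accumulate(length for _, length in contigs))
    if not 1 <= pos <= ends[-1]:
        raise ValueError(f"position {pos} is outside the reference")
    # Index of the first contig whose end offset is >= pos.
    idx = bisect_right(ends, pos - 1)
    start_offset = ends[idx - 1] if idx > 0 else 0
    return names[idx], pos - start_offset
```

With two contigs of length 100 and 50, position 101 on the concatenated sequence corresponds to position 1 on the second contig.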

- C++
Published by bkille about 1 year ago

parsnp - v2.1.2

  • Parsnp XMFA output now follows the Mauve format and has the appropriate header #FormatVersion Mauve so that HarvestTools parses it correctly. Fixes #169.
  • Adds warning for multifasta reference VCF coordinates.
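
A quick way to check whether an XMFA file carries the header mentioned above is sketched below. The helper name is hypothetical, and the check only looks at the first line of the file contents.

```python
def has_mauve_header(xmfa_text):
    """Return True if the first line declares the Mauve format version."""
    first_line = xmfa_text.splitlines()[0] if xmfa_text else ""
    return first_line.startswith("#FormatVersion Mauve")
```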

- C++
Published by bkille over 1 year ago

parsnp - v2.1.1

This release corrects the start location of reverse complement alignment records in the MAF output and also adds the ##maf version=1 program=parsnp header to the MAF output. The MAF output is now compliant with the UCSC format specification.

For more details, please see #167.
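
For context, the UCSC MAF specification defines the start field as zero-based and, for records on the "-" strand, relative to the reverse-complemented source sequence. The sketch below shows that conversion under those rules. It is an illustration with hypothetical names, not the code used in Parsnp.

```python
def maf_start(fwd_start, aligned_size, src_size, strand):
    """Return the MAF start field for an aligned interval.

    `fwd_start` is the 0-based start on the forward strand, `aligned_size`
    the number of non-gap bases, and `src_size` the full source length.
    """
    if strand == "+":
        return fwd_start
    if strand == "-":
        # Count from the start of the reverse-complemented sequence.
        return src_size - (fwd_start + aligned_size)
    raise ValueError(f"unknown strand {strand!r}")
```

For example, a 5-base interval starting at forward position 10 in a 100-base source has a start of 85 when reported on the "-" strand.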

- C++
Published by bkille over 1 year ago

parsnp - v2.1.0

  • The recombination filter has been fixed for partition workflows. Recombination filtering can now be enabled via --recomb-filter as well as the original -x, --xtrafast flags. Fixes #157.
  • The --no-recruit flag has been fixed and replaced by --skip-ani-filter. Fixes #164.
  • "Recruitment" in general has been renamed to "Filtering" to be consistent with other methods.
  • Crashes gracefully when a partition's genomes are too divergent. Fixes #165.
  • Inputs with duplicate filenames are no longer allowed, e.g. /path1/genome.fna and /path2/genome.fna cannot both be provided. This is because the .xmfa and .maf output files use the file stem as the "sample" name. Fixes #166.
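
Users who want to catch such name clashes before running Parsnp could pre-check their inputs along these lines. This is a sketch with a hypothetical helper name. It compares file stems, as the note above suggests, and Parsnp's own check may differ in detail.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_duplicate_stems(paths):
    """Group input paths by file stem and return stems used more than once."""
    by_stem = defaultdict(list)
    for path in paths:
        by_stem[PurePosixPath(path).stem].append(path)
    return {stem: group for stem, group in by_stem.items() if len(group) > 1}
```

An empty result means every input would receive a distinct sample name under this assumption.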

- C++
Published by bkille over 1 year ago

parsnp - Parsnp v2.0.6

  • Previously, the extension module was imported even if the user didn't need it. This required the pyspoa module and was causing issues. Now, users can run parsnp without pyspoa by using the --no-partition flag.
  • Removes log redirects for harvest which cause crashes
  • Uses regex strings for the parsers #153
  • Adds back the gingr support for phipack results

- C++
Published by bkille over 1 year ago

parsnp - Release v2.0.5

  • Parsnp now outputs a MAF file in addition to the typical XMFA file (this can be turned off with --no-maf).
  • Parsnp now uses the partition mode by default when there are more than 100 input sequences.

- C++
Published by bkille almost 2 years ago

parsnp - Release v2.0.4

  • fasttree output is now moved to the correct spot for downstream processing (fixes #147)
  • --min-ref-cov arg is now an inclusive lower bound instead of exclusive
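
The difference between an inclusive and an exclusive bound only matters at the threshold itself. A minimal sketch of the inclusive comparison follows. The function name is hypothetical, and the default of 90 is taken from the v2.0.0 notes.

```python
def passes_min_ref_cov(ref_coverage_pct, min_ref_cov=90.0):
    """Keep a genome whose reference coverage meets or exceeds the threshold."""
    return ref_coverage_pct >= min_ref_cov
```

Under this rule a genome covering exactly 90% of the reference is kept, whereas an exclusive bound would have dropped it.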

- C++
Published by bkille about 2 years ago

parsnp - Release v2.0.3

  • Replaces internal commands which were incompatible with macOS.

- C++
Published by bkille about 2 years ago

parsnp - v2.0.2

  • The interval for each xmfa record is now inclusive, following the xmfa format. This only affects the internal coordinate representation used by gingr. The s[seq_idx]:p[position] coordinates remain unchanged.
  • tqdm logging has been redirected to the Parsnp logger
  • The extension module now also uses SPOA. However, it is still experimental.
  • Extension is much more conservative than before, but still can produce less conserved alignments

- C++
Published by bkille about 2 years ago

parsnp - Parsnp v2.0.1

  • Output is now deterministic, even in partition mode
  • time is now only prepended to commands if it exists on the system
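
One common way to make such a prefix conditional is to look the tool up on PATH first. The sketch below uses the standard library's shutil.which. The helper name is hypothetical, and how Parsnp itself performs the check is not stated here.

```python
import shutil


def with_optional_time(command, which=shutil.which):
    """Prefix `command` (a list of arguments) with `time` when it is on PATH."""
    time_path = which("time")
    return [time_path, *command] if time_path else list(command)
```

The `which` parameter exists only so the lookup can be replaced in tests.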

- C++
Published by bkille about 2 years ago

parsnp - Parsnp v2.0.0

What's Changed

  • You can now pass a newline-separated file of paths to query sequences to -d in addition to directories and command-line lists.
  • Adds --partition flag, which splits recruited genomes into partitions of size --partition-size (default 100).
  • Adds --no-recruit option which skips the recruitment step, but still drops genomes if their size differs substantially from the reference.
  • Fixes multiple bugs in the output:
    • Output can now be parsed by BioPython's AlignIO module with the "mauve" format
    • LCBs no longer overlap
    • Ambiguous base pairs and small contigs no longer lead to incorrect coordinates
    • VCF now contains the correct reference allele.
  • FastANI now guarantees at least 100 segments (unless it requires a fragment length < 500)
  • Adds --min-ref-cov option (default 90), which when used with --use-ani, removes query genomes that do not cover at least 90% of the reference.
  • Output folder has been reorganized to separate logs and config files from the main output.
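
As a rough picture of what partitioning by size means, the sketch below splits a list of genomes into consecutive chunks of at most 100. The function name is hypothetical, and how Parsnp actually assigns genomes to partitions (ordering, balancing) is not described in these notes.

```python
def partition(genomes, partition_size=100):
    """Split `genomes` into consecutive chunks of at most `partition_size`."""
    if partition_size < 1:
        raise ValueError("partition_size must be at least 1")
    return [
        genomes[i:i + partition_size]
        for i in range(0, len(genomes), partition_size)
    ]
```

With 250 inputs and the default size, this yields three partitions of 100, 100, and 50 genomes.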

- C++
Published by bkille over 2 years ago

parsnp - Release v1.7.4

  • abpoa and muscle have been replaced by mafft for inter-LCB alignment.

- C++
Published by bkille over 3 years ago

parsnp - Release v1.7.3

The --extend-lcbs parameter now performs a gapped alignment using either muscle (>50nt) or abpoa (<=50nt) on inter-LCB regions. Alignments are trimmed back based on an ANI parameter that can also be provided by the user (e.g., the alignment is cut off when it falls below 95% ANI). This parameter still only works with single-contig genomes.

Also added better SeqIO validation. Throws out any genomes that can't be parsed.

- C++
Published by bkille almost 4 years ago

parsnp - Release v1.7.2

  • Fix bug where Harvesttools is provided both a gbk and fasta reference.

- C++
Published by bkille about 4 years ago

parsnp - v1.7.1

  • Mismatch=-4 and gap=-2 for extending LCBs
  • Fixed bug that created extra gaps in LCB extension
  • Fixed bug that resulted in FastTree being used in cases when it wasn't requested
  • Fixed bug that duplicated reference in alignment output
  • Reference sequence is now first in extended xmfa output

- C++
Published by bkille about 4 years ago

parsnp - Release v1.7.0

  • Added option --extend-lcbs which uses a naive alignment to extend the boundaries of clusters.

- C++
Published by bkille about 4 years ago

parsnp - Release v1.6.2

  • No longer deletes the core SNPs file, parsnp.snps.mblocks. Users can now use this file as input to their own custom phylogenetic analysis.

- C++
Published by bkille about 4 years ago

parsnp - Release v1.6.1

  • Fixes an issue where a list was compared to an integer.

- C++
Published by bkille about 4 years ago

parsnp - Release v1.6.0

  • Add --skip-phylogeny option to allow users to skip phylogeny reconstruction
  • Add --validate-input option to allow users to validate files provided by -d with Biopython parsing

- C++
Published by bkille about 4 years ago

parsnp - Release v1.5.6

  • Add option to generate .vcf output via the --vcf flag
  • Remove tmp/ directory in output folder

- C++
Published by bkille almost 5 years ago

parsnp - Release v1.5.4

  • Fixes issue for multireference fastas mentioned in #87

- C++
Published by bkille over 5 years ago

parsnp - v1.5.3

  • Fixed memory limit in Muscle library so that multiprocessing doesn't crash
  • Changed parsnp compilation to -O2 to fix WSL2 bug
  • Added warning for trying to use too many threads

- C++
Published by bkille over 5 years ago

parsnp - Release v1.5.2

  • Fixed fallback to FastTree on RAxML failure
  • Made --curate include all genomes regardless of length

- C++
Published by bkille over 5 years ago

parsnp - Release v1.5.1

Bugfixes for Parsnp v1.5.0:

  • Mash and FastANI recruitment now work regardless of reference type
  • Phiprofile fixes for Python 3
  • Logging of input errors is now cleaner

- C++
Published by bkille almost 6 years ago

parsnp - Release v1.5.0

  • Ported to Python 3, no longer supports Python 2
  • Added pythonic logging and argument parsing
  • Merged build instructions for macOS and Unix
  • Switched from FastTree to RAxML
  • All dependency tools must be on the system path (no longer defaults to bin/)
  • More robust error checking (skips bad files instead of crashing, more warnings + debug mode)
  • Fixed .gbk parsing
  • More convenient input: sequences can now be supplied via shell glob patterns, e.g. what used to be parsnp -d ref/ -r ! can now be written as parsnp -d ref/*.fna. This allows for heterogeneous input directories and data spread across locations.

- C++
Published by bkille almost 6 years ago

parsnp - Test release of v1.5

- C++
Published by bkille almost 6 years ago

parsnp - v1.2

Bug fixes

  • Important fix to FastTree2 binary to enable use of double precision branch lengths (instead of minimum length of 0.0001); FastTree2 issue fully detailed in the following blog post: http://darlinglab.org/blog/2015/03/23/not-so-fast-fasttree.html.

- C++
Published by treangen about 11 years ago

parsnp - v1.1.1

Fix to fully disable/deprecate -R parameter

- C++
Published by treangen about 11 years ago

parsnp - v1.1

Features

  • Added support to output unaligned regions in XMFA format (--unaligned). This output format contains a list of all unaligned sequence due to: (i) unrelated sequence, (ii) failure to extend past match (MUM) boundaries/sensitivity limitations, and (iii) subset relationships. This output file is further described in the documentation: http://harvest.readthedocs.org/en/latest/.
  • Updated MUMi parameter to allow for setting a maximum MUMi value to enable bypassing the distribution cutoff

Bug fixes

  • Fixed minor bug when parsing multiple GenBank files for harvesttools annotation import
  • Updated branch lengths via harvest-tools
  • Fixed bug that produced VCF output without repeat filtering, even if enabled
  • Fixed issue where empty fasta file was causing index errors
  • Fixed exit code to be 0 on -h (Issue #4)
  • Updated docs to indicate CXXFLAGS='-fopenmp' must be provided to configure for openmp support
  • Added support for harvesttools v1.2 (see: https://github.com/marbl/harvest-tools/releases/tag/v1.2)

Changes

  • Added -V, --version parameter
  • Allow columns with Ns to be marked as pass in VCF
  • Allow any fasta file present in -d, regardless of extension, to be used
  • Disallow genome file names containing spaces or the following special characters: . , [ ] { } ( ) ! ; " ' * ? < > |
  • Default repeat filtering disabled, removing NUCmer dependency. Repeat filtering still possible via bed format support and harvest-tools; see harvesttools documentation for further details.

- C++
Published by treangen about 11 years ago

parsnp - v1.0

NEWS

07/21/2014: First official release

DOCS

For parsnp user guide please see:

http://harvest.readthedocs.org/en/latest/content/parsnp.html

MD5 checksums

08c19b4d12e8199b2ce098550b4100e6 parsnp-OSX64-v1.0.tar.gz
f82b6b9dae456fe9263ee6214b2633af parsnp-Linux64-v1.0.tar.gz

- C++
Published by treangen over 11 years ago