Recent Releases of parsnp
parsnp - v2.1.2
- Parsnp XMFA output now follows the Mauve format and has the appropriate header #FormatVersion Mauve so that HarvestTools parses it correctly. Fixes #169.
- Adds warning for multifasta reference VCF coordinates.
- C++
Published by bkille over 1 year ago
parsnp - v2.1.1
This release corrects the start location of reverse complement alignment records in the MAF output and also adds the ##maf version=1 program=parsnp header to the MAF output. The MAF output is now compliant with the UCSC format specification.
For more details, please see #167.
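The UCSC MAF specification counts the start of a "-" strand record from the beginning of the reverse-complemented source sequence. The sketch below shows how such a start maps back to forward-strand coordinates. The function name is hypothetical and this is not Parsnp's own code, only an illustration of the coordinate convention.

```python
def maf_minus_start_to_forward(start: int, size: int, src_size: int) -> int:
    """Return the 0-based forward-strand start of a '-' strand MAF record.

    A '-' strand start is measured on the reverse complement of the source,
    so the block covers forward-strand positions
    [src_size - start - size, src_size - start).
    """
    if start < 0 or size < 0 or start + size > src_size:
        raise ValueError("record does not fit inside the source sequence")
    return src_size - start - size


# A 10 bp block at offset 5 on the reverse complement of a 100 bp sequence.
forward_start = maf_minus_start_to_forward(start=5, size=10, src_size=100)
print(forward_start)
```

Applying the same formula to a forward-strand start gives the corresponding "-" strand start, since the mapping is its own inverse.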
- C++
Published by bkille over 1 year ago
parsnp - v2.1.0
- The recombination filter has been fixed for partition workflows. Recombination filtering can now be enabled via --recomb-filter as well as the original -x, --xtrafast flags. Fixes #157.
- The --no-recruit flag has been fixed and replaced by --skip-ani-filter. Fixes #164.
- "Recruitment" in general has been renamed to "Filtering" to be consistent with other methods.
- Crashes gracefully when a partition's genomes are too divergent. Fixes #165.
- Inputs with duplicate filenames are no longer allowed, i.e. /path1/genome.fna and /path2/genome.fna cannot both be provided. This is because the .xmfa and .maf output files use the file stem as the "sample" name. Fixes #166.
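To check an input list for this kind of collision before running, something like the following sketch could be used. It assumes the "file stem" matches what Python's pathlib.Path.stem returns; the helper name is hypothetical and not part of Parsnp.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicate_stems(paths):
    """Group input paths by file stem and return the stems used more than once."""
    by_stem = defaultdict(list)
    for p in paths:
        by_stem[Path(p).stem].append(str(p))
    return {stem: group for stem, group in by_stem.items() if len(group) > 1}


inputs = ["/path1/genome.fna", "/path2/genome.fna", "/path2/other.fna"]
duplicates = find_duplicate_stems(inputs)
for stem, group in duplicates.items():
    print(f"sample name '{stem}' is shared by: {', '.join(group)}")
```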
- C++
Published by bkille over 1 year ago
parsnp - Parsnp v2.0.6
- Previously we were importing the extension module even if the user didn't need it. This required the pyspoa module and was causing issues. Now, users can run parsnp without pyspoa by using the --no-partition flag.
- Removes log redirects for harvest which cause crashes
- Uses regex strings for the parsers #153
- Adds back the gingr support for phipack results
- C++
Published by bkille over 1 year ago
parsnp - Release v2.0.5
- Parsnp now outputs a MAF file in addition to the typical XMFA file (this can be turned off with --no-maf).
- Parsnp now uses the partition mode by default when there are more than 100 input sequences.
- C++
Published by bkille almost 2 years ago
parsnp - Release v2.0.4
- fasttree output is now moved to the correct spot for downstream processing (fixes #147)
- --min-ref-cov arg is now an inclusive lower bound instead of exclusive
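The practical difference between an inclusive and an exclusive lower bound only shows up for genomes that sit exactly on the threshold. A minimal sketch, with hypothetical function names and an illustrative threshold of 90 (this is not Parsnp's filtering code):

```python
def passes_inclusive(coverage_pct: float, min_ref_cov: float = 90.0) -> bool:
    """Inclusive lower bound: a genome exactly at the threshold is kept."""
    return coverage_pct >= min_ref_cov


def passes_exclusive(coverage_pct: float, min_ref_cov: float = 90.0) -> bool:
    """Exclusive lower bound: a genome exactly at the threshold is dropped."""
    return coverage_pct > min_ref_cov


for cov in (89.9, 90.0, 90.1):
    print(cov, passes_exclusive(cov), passes_inclusive(cov))
```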
- C++
Published by bkille about 2 years ago
parsnp - Release v2.0.3
- Replaces internal commands which were incompatible with macOS.
- C++
Published by bkille about 2 years ago
parsnp - v2.0.2
- The interval for each xmfa record is now inclusive, following the xmfa format. This only affects the internal coordinate representation, used by gingr. The s[seq_idx]:p[position] coordinates remain unchanged.
- tqdm logging has been redirected to the Parsnp logger
- The extension module now also uses SPOA. However, it is still experimental.
- Extension is much more conservative than before, but still can produce less conserved alignments
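As an illustration of what an inclusive interval means for coordinate arithmetic, the sketch below converts a half-open interval to an inclusive one. The release note does not say what the earlier representation was, so the half-open starting point is an assumption made purely for the example, and the helper names are hypothetical.

```python
from typing import Tuple


def half_open_to_inclusive(start: int, end: int) -> Tuple[int, int]:
    """Convert a half-open interval [start, end) to inclusive [start, end - 1]."""
    if end <= start:
        raise ValueError("half-open interval must contain at least one position")
    return start, end - 1


def inclusive_length(start: int, end: int) -> int:
    """Number of positions covered by an inclusive interval [start, end]."""
    return end - start + 1


first, last = half_open_to_inclusive(10, 20)
print(first, last, inclusive_length(first, last))
```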
- C++
Published by bkille about 2 years ago
parsnp - Parsnp v2.0.1
- Output is now deterministic, even in partition mode
- time is now only prepended to commands if it exists on the system
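One common way to make a wrapper prepend time only when it is available is to look the executable up on PATH first. The sketch below uses the standard-library shutil.which for that; the helper name is hypothetical and the actual check in Parsnp may differ.

```python
import shutil


def maybe_prepend_time(command):
    """Return the command with 'time' prepended only if it is found on PATH."""
    if shutil.which("time") is not None:
        return ["time", *command]
    return list(command)


wrapped = maybe_prepend_time(["echo", "hello"])
print(wrapped)
```

Note that shutil.which only finds a standalone executable such as /usr/bin/time; a shell builtin named time is not visible to it.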
- C++
Published by bkille about 2 years ago
parsnp - Parsnp v2.0.0
What's Changed
- You can now pass a newline-separated file of paths to query sequences to -d in addition to directories and command-line lists.
- Adds --partition flag, which splits recruited genomes into partitions of size --partition-size (default 100).
- Adds --no-recruit option which skips the recruitment step, but still drops genomes if their size differs substantially from the reference.
- Fixes multiple bugs in the output:
  - Output can now be parsed by BioPython's AlignIO module with the "mauve" format
  - LCBs no longer overlap
  - Ambiguous base pairs and small contigs no longer lead to incorrect coordinates
  - VCF now contains the correct reference allele.
- FastANI now guarantees at least 100 segments (unless it requires a fragment length < 500)
- Adds --min-ref-cov option (default 90), which when used with --use-ani, removes query genomes that do not cover at least 90% of the reference.
- Output folder has been reorganized to separate logs and config files from the main output.
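To give a feel for what splitting genomes into partitions of a fixed size looks like, here is a minimal sketch that chunks a list in order. How Parsnp actually assigns genomes to partitions is not described in these notes, so the sequential chunking and the function name are assumptions for illustration only.

```python
def split_into_partitions(genomes, partition_size=100):
    """Split a list of genome paths into consecutive chunks of at most partition_size."""
    if partition_size < 1:
        raise ValueError("partition_size must be at least 1")
    return [
        genomes[i:i + partition_size]
        for i in range(0, len(genomes), partition_size)
    ]


# Hypothetical input: 250 genome file names.
genomes = [f"genome_{i}.fna" for i in range(250)]
partitions = split_into_partitions(genomes)
print([len(p) for p in partitions])
```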
- C++
Published by bkille over 2 years ago
parsnp - Release v1.7.4
- abpoa and muscle have been replaced by mafft for inter-LCB alignment.
- C++
Published by bkille over 3 years ago
parsnp - Release v1.7.3
The --extend-lcbs parameter now performs a gapped alignment using either muscle (>50nt) or abpoa (<=50nt) on inter-LCB regions. Alignments are trimmed back based on an ANI parameter that can also be provided by the user (e.g., the alignment is cut off where it falls below 95% ANI). This parameter still only works with single-contig genomes.
Also added better SeqIO validation. Throws out any genomes that can't be parsed.
- C++
Published by bkille almost 4 years ago
parsnp - Release v1.7.2
- Fix bug where Harvesttools is provided both a gbk and fasta reference.
- C++
Published by bkille about 4 years ago
parsnp - v1.7.1
- Mismatch=-4 and gap=-2 for extending LCBs
- Fixed bug that created extra gaps in LCB extension
- Fixed bug that resulted in FastTree being used in cases when it wasn't requested
- Fixed bug that duplicated reference in alignment output
- Reference sequence is now first in extended xmfa output
- C++
Published by bkille about 4 years ago
parsnp - Release v1.7.0
- Added option --extend-lcbs which uses a naive alignment to extend the boundaries of clusters.
- C++
Published by bkille about 4 years ago
parsnp - Release v1.6.2
- No longer deletes the core SNPs file, parsnp.snps.mblocks. Users can now use that file as input to run their own custom phylogenetic analysis.
- C++
Published by bkille about 4 years ago
parsnp - Release v1.6.1
- Fixes an issue where a list was compared to an integer.
- C++
Published by bkille about 4 years ago
parsnp - Release v1.6.0
- Add --skip-phylogeny option to allow users to skip phylogeny reconstruction
- Add --validate-input option to allow users to validate files provided by -d with Biopython parsing
- C++
Published by bkille about 4 years ago
parsnp - Release v1.5.6
- Add option to generate .vcf output via the --vcf flag
- Remove tmp/ directory in output folder
- C++
Published by bkille almost 5 years ago
parsnp - Release v1.5.4
- Fixes issue for multireference fastas mentioned in #87
- C++
Published by bkille over 5 years ago
parsnp - Release v1.5.2
- Fixed fallback to FastTree on RAxML failure
- Made --curate include all genomes regardless of length
- C++
Published by bkille over 5 years ago
parsnp - Release v1.5.1
Bugfixes for Parsnp v1.5.0:
- Mash and FastANI recruitment now work regardless of reference type
- Phiprofile fixes for Python 3
- Logging of input errors is now cleaner
- C++
Published by bkille almost 6 years ago
parsnp - Release v1.5.0
- Ported to Python 3, no longer supports Python 2
- Added pythonic logging and argument parsing
- Merged build instructions for macOS and Unix
- Switched from FastTree to RAxML
- All dependency tools must be on the system path (no longer defaults to bin/)
- More robust error checking (skips bad files instead of crashing, more warnings + debug mode)
- Fixed .gbk parsing
- Convenient input now allows sequences to be supplied via regex, i.e. what used to be parsnp -d ref/ -r ! can now be written as parsnp -d ref/*.fna which allows for heterogeneous input directories and spread out data.
- C++
Published by bkille almost 6 years ago
parsnp - v1.2
Bug fixes
- Important fix to FastTree2 binary to enable use of double precision branch lengths (instead of minimum length of 0.0001); FastTree2 issue fully detailed in the following blog post: http://darlinglab.org/blog/2015/03/23/not-so-fast-fasttree.html.
- C++
Published by treangen about 11 years ago
parsnp - v1.1
Features
- Added support to output unaligned regions in XMFA format (--unaligned). This output format contains a list of all unaligned sequence due to: (i) unrelated sequence, (ii) failure to extend past match (MUM) boundaries/sensitivity limitations, and (iii) subset relationships. This output file is further described in the documentation --> http://harvest.readthedocs.org/en/latest/.
- Updated MUMi parameter to allow for setting a maximum MUMi value to enable bypassing the distribution cutoff
Bug fixes
- Fixed minor bug when parsing multiple GenBank files for harvesttools annotation import
- Updated branch lengths via harvest-tools
- Fixed bug that produced VCF output without repeat filtering, even if enabled
- Fixed issue where empty fasta file was causing index errors
- Fixed exit code to be 0 on -h (Issue #4)
- Updated docs to indicate CXXFLAGS='-fopenmp' must be provided to configure for openmp support
- Added support for harvesttools v1.2 (see: https://github.com/marbl/harvest-tools/releases/tag/v1.2)
Changes
- Added -V, --version parameter
- Allow columns with Ns to be marked as pass in VCF
- Allow any fasta file present in -d, regardless of extension, to be used
- Disallow genome file names containing spaces or the following special characters: . , [ ] { } ( ) ! ; " ' * ? < > |
- Default repeat filtering disabled, removing NUCmer dependency. Repeat filtering still possible via bed format support and harvest-tools; see harvesttools documentation for further details.
- C++
Published by treangen about 11 years ago
parsnp - v1.0
NEWS
07/21/2014: First official release
DOCS
For parsnp user guide please see:
http://harvest.readthedocs.org/en/latest/content/parsnp.html
MD5 checksums
08c19b4d12e8199b2ce098550b4100e6 parsnp-OSX64-v1.0.tar.gz
f82b6b9dae456fe9263ee6214b2633af parsnp-Linux64-v1.0.tar.gz
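A downloaded archive can be checked against the checksums above with a few lines of standard-library Python. This is a minimal sketch: the helper names are hypothetical, and the local path passed to verify_download is whatever location the archive was saved to.

```python
import hashlib

# Checksums as published in the v1.0 release notes.
EXPECTED_MD5 = {
    "parsnp-OSX64-v1.0.tar.gz": "08c19b4d12e8199b2ce098550b4100e6",
    "parsnp-Linux64-v1.0.tar.gz": "f82b6b9dae456fe9263ee6214b2633af",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, archive_name):
    """Return True if the file at path matches the published checksum."""
    return md5_of_file(path) == EXPECTED_MD5[archive_name]
```

For example, verify_download("parsnp-Linux64-v1.0.tar.gz", "parsnp-Linux64-v1.0.tar.gz") would return True only for an intact copy of the Linux archive in the current directory.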
- C++
Published by treangen over 11 years ago