Recent Releases of abyss
abyss - 2.3.4
Release version 2.3.4
General:
- Dropped support for gcc5 due to lack of support for more recent C++ features.
abyss-rresolver-short: * Reduced memory consumption. * Better calculation of read size proportions in the input dataset. * Increased the maximum allowed read size.
- C++
Published by vlad0x00 about 4 years ago
abyss -
- Added the RResolver component to the ABySS pipeline. This release updates the 2.2.4 ABySS version and is used as a reference for the RResolver manuscript.
- C++
Published by vlad0x00 almost 5 years ago
abyss - 2.2.5
Release version 2.2.5
General: * Resolve various compilation errors in newer versions of clang * Use ntHash's 0th hash as the default hash instead of the 1st hash * Added optional RResolver module, not currently part of the ABySS assembly pipeline
abyss-rresolver-short: * Improves genome assemblies at unitig stage by using a sliding window at read size level across repeats to determine which paths are correct * For further information: consult http://www.birollab.ca/assets/posts/195NikolicVladimirHiTSeqISMB2020.pdf
misc: * Extract all tags in a SAM file
- C++
Published by jwcodee over 5 years ago
abyss - 2.2.4
- Release version 2.2.4
General: * Refactor deprecated functions in clang-8
Sealer: * Remove unsupported -D option from help page
abyss-bloom: * Add counting Bloom Filter instruction to help page
abyss-bloom-dbg: * Report coverage information of unitigs
- C++
Published by jwcodee about 6 years ago
abyss -
- Release version 1.0.11.
- Assemble colour-space reads. Read identifiers must be named with the suffixes F3 and R3.
- Read files in qseq format. Thanks to Tony Raymond (tgr).
- Prevent misassemblies mediated by tandem segmental duplications. A sequence XRRY, where R is a repeat sequence, could have been misassembled as XRY. (tgr)
abyss-pe: * Integrate with Sun Grid Engine (SGE). A parallel, paired-end assembly can be run with a single qsub command. The parameters lib, np and k default to the qsub environment variables JOB_NAME (qsub -N), NSLOTS (qsub -pe) and SGE_TASK_ID (qsub -t) respectively. * The .pair file, the largest intermediate file, is now gzipped.
ABYSS-P: * Bug fix. At k=19, k-mers would be distributed to even-numbered processes only.
KAligner: * Multithreaded. The -j, --threads option specifies the number of threads to use. The order in which the alignments are output will vary from run to run, but the alignments are deterministic and will not vary. Each thread reads and aligns one file, so the reads must be in more than one file to use this feature. (tgr)
- C++
Published by jwcodee about 6 years ago
abyss -
- Release version 2.2.3
- Revert memory consumption of Bloom filters to pre 2.2.0 behaviour. ABySS will now share the specified memory among all Bloom filters instead of just the counting Bloom filter.
- Fix gcc-9 compilation warnings
- C++
Published by jwcodee over 6 years ago
abyss -
- Release version 2.2.2
- Fix abyss-overlap for 32-bit systems
- C++
Published by jwcodee over 6 years ago
abyss -
- Release version 2.2.0
- Construct a counting bloom filter instead of a cascading bloom filter.
abyss-bloom: * Add 'counting' as valid argument to '-t' option to build a counting bloom filter
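As a rough illustration of what a counting Bloom filter stores, the sketch below keeps a small saturating counter per slot instead of a single bit. It is a toy model only: the class name, the slot and hash counts, and the use of salted BLAKE2 hashes are assumptions made for the example, not the abyss-bloom implementation (which is built on ntHash).

```python
import hashlib

MAX_COUNT = 255  # assumption: 8-bit saturating counters


class CountingBloomFilter:
    """Toy counting Bloom filter (hypothetical helper, not ABySS code)."""

    def __init__(self, num_slots, num_hashes):
        self.num_slots = num_slots
        self.num_hashes = num_hashes
        self.counters = bytearray(num_slots)  # one counter per slot, all zero

    def _slots(self, kmer):
        # Derive one slot index per hash function by salting a standard hash.
        for seed in range(self.num_hashes):
            salt = seed.to_bytes(8, "little")
            digest = hashlib.blake2b(kmer.encode(), salt=salt).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_slots

    def add(self, kmer):
        # Increment every slot for this k-mer, stopping at MAX_COUNT.
        for slot in self._slots(kmer):
            if self.counters[slot] < MAX_COUNT:
                self.counters[slot] += 1

    def count(self, kmer):
        # The minimum over the slots never underestimates the number of
        # insertions (until counters saturate); collisions can inflate it.
        return min(self.counters[slot] for slot in self._slots(kmer))
```

A query such as `cbf.count("ACGTA")` then returns an approximate multiplicity rather than a yes/no membership answer, which is what makes coverage-style thresholds possible.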
- C++
Published by jwcodee over 6 years ago
abyss - 2.1.4
This release provides major improvements to Bloom filter assembly contiguity
and correctness. Bloom filter assemblies now have equivalent scaffold contiguity
and better correctness than MPI assemblies of the same data, while still
requiring less than 1/10th of the memory. On human, Bloom filter assembly times
are still a few hours longer than MPI assemblies (e.g. 17 hours vs. 13 hours,
using 48 threads).
abyss-pe:
* Change default value of m from 50 => 0, which has the effect
of disallowing sequence overlaps < k-1 bp. QUAST tests on E. coli /
C. elegans / H. sapiens showed that both contiguity and
correctness were improved by allowing only overlaps of k-1 bp
between sequence ends.
- C++
Published by benvvalk over 7 years ago
abyss - 2.1.3
This release fixes a SAM-formatting bug that broke the ABySS-LR pipeline (Tigmint/ARCS).
abyss-bloom:
* Added graph command for visualizing neighbourhoods of the
Bloom filter de Bruijn graph (produces GraphViz)
abyss-fixmate-ssq:
* Fixed missing tab in SAM output which broke ABySS linked reads
pipeline (Tigmint/ARCS)
- C++
Published by benvvalk over 7 years ago
abyss - 2.1.2
This release improves scaffold N50 on human by ~10%, due to implementation of a new --median option for DistanceEst (thanks to @lcoombe!). This release also adds a new --max-cost option for konnector and abyss-sealer that curbs indeterminately long running times, particularly at low k values.
abyss-pe:
* Use the new DistanceEst --median option as the
default for the scaffolding stage
Dockerfile: * Fix OpenMPI setup
DistanceEst:
* Added --median option
konnector:
* Added --max-cost option to bound running time
sealer:
* Added --max-cost option to bound running time
- C++
Published by benvvalk over 7 years ago
abyss - 2.1.1
This release provides bug fixes and modest improvements to
Bloom filter assembly contiguity/correctness. Parallelization
of Sealer has also been improved, thanks to contributions by
@vlad0x00.
abyss-bloom-dbg:
* upgrade to most recent version of ntHash to reduce
some assembly/hashing artifacts. On a human assembly, this
reduced QUAST major misassemblies by 5% and increased
scaffold contiguity by 10%
* kc parameter now also applies to MPI assemblies (see below)
abyss-fac:
* change N20 and N80 to N25 and N75, respectively
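For readers unfamiliar with the Nxx family of contiguity statistics, here is a minimal sketch of how N25, N50 and N75 can be computed from a list of sequence lengths. The function name `nxx` is hypothetical, and abyss-fac may apply length filters or tie-breaking rules that this example does not model.

```python
def nxx(lengths, fraction):
    """Return the length L such that sequences of length >= L
    together cover at least `fraction` of the total length."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until the target is reached.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    raise ValueError("lengths must be non-empty")


contig_lengths = [100, 80, 60, 40, 20]
n25 = nxx(contig_lengths, 0.25)
n50 = nxx(contig_lengths, 0.50)
n75 = nxx(contig_lengths, 0.75)
```

With these five lengths (total 300), the thresholds are 75, 150 and 225 bases, so the three statistics fall on the first, second and third longest sequences respectively.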
ABYSS-P:
* add --kc option, which implements a hard minimum k-mer
multiplicity cutoff
abyss-pe:
* fix zsh: no such option: pipefail error with
old versions of zsh (fallback to bash instead)
* adding time=1 now times all assembly commands
abyss-sealer:
* parallelize gap sealing with OpenMP (thanks to
@vlad0x00!)
* add --gap-file option (thanks to @vlad0x00!)
DistanceEst:
* add support for GFA output
- C++
Published by benvvalk over 7 years ago
abyss - 2.1.0
This release adds support for misassembly correction and scaffolding
using linked reads, using Tigmint and ARCS. (Tigmint and
ARCS must be installed separately.) In addition, simultaneous optimization
of s (seed length) and n (min supporting read pairs / Chromium barcodes)
is now supported during scaffolding.
abyss-longseqdist: * Fix hang on input SAM containing no alignments with MAPQ > 0
abyss-pe:
* New lr parameter. Provide linked reads (i.e. 10x Genomics
Chromium reads) via this parameter to perform misassembly
correction and scaffolding using Chromium barcode information.
Requires Tigmint and ARCS tools to be installed in addition
to ABySS.
* Fix bug where j (threads) was not being correctly passed
to bgzip/pigz
* Fix bug where zsh time/memory profiling was not being used,
even when zsh was available
abyss-scaffold:
* Simultaneous optimization of n and s using line search
or grid search [default]
SimpleGraph:
* add options -s and -n to filter paired-end paths by
seed length and edge weight, respectively
- C++
Published by benvvalk almost 8 years ago
abyss - 2.0.3
This minor release provides bug fixes and improved reliability for both MPI assemblies and Bloom filter assemblies on large datasets. In addition, many usability improvements have been made to the abyss-samtobreak program for misassembly assessment.
overall:
* Many compiler fixes for GCC >= 6, Boost >= 1.64
* Read and write GFA 2 assembly graphs with abyss-pe graph=gfa2
* Support reading CRAM via samtools
abyss-bloom:
* New abyss-bloom build -t rolling-hash option, to
pre-build input Bloom filters for abyss-bloom-dbg
* Fix incorrect output of abyss-bloom kmers -r
(thanks to @notestaff!)
abyss-bloom-dbg:
* New -i option to read Bloom filter files built by
abyss-bloom build -t rolling-hash
* Improved error branch trimming (reduces number of
small output sequences)
* Fix intermittent segfaults caused by non-null-terminated
strings
abyss-map:
* Append BX tag to SAM output (Chromium 10x Genomics data)
ABYSS-P:
* Increase default number of sparsehash buckets from
200,000,000 => 1,000,000,000
* Benefit: Allows larger datasets to be assembled without
time-consuming sparsehash resize operations (e.g. H. sapiens)
* Caveat: Increases minimum memory requirement per
CPU core from 89 MB to 358 MB
abyss-pe:
* Parallelize gzip with pigz, if available
* Report time/memory for each program with zsh, if available
* Fix: use N instead of n for scaffold stage,
when set by user
abyss-samtobreak:
* New --alignment-length (-a) option to exclude alignments
shorter than a given length
* New --contig-length (-l) option to exclude contigs
shorter than a given length
* New --genome-size (-G) option, for contiguity metrics
that depend on the reference genome size
* New --mapq (-q) option for minimum MAPQ score
* New --patch-gaps (-g) option to join alignments
separated by small gaps
* New TSV output format with additional contiguity
stats (e.g. L50, NG50)
* Fix handling of hard-clipped alignments
abyss-todot:
* New --add-complements option
abyss-tofastq:
* New --bx option to copy BX tag from SAM/BAM
to FASTQ header comment (Chromium 10x Genomics
data)
- C++
Published by benvvalk almost 8 years ago
abyss - 2.0.1
Summary
This release resolves some licensing issues that were pointed out in 2.0.0. As of 2.0.1, ABySS is now available under a standard GPL-3 license, and the libraries included under lib/rolling-hash and lib/bloomfilter are now also licensed under GPL-3. For alternative licensing terms, please contact Patrick Rebstein (prebstein at bccancer.bc.ca).
- C++
Published by benvvalk over 9 years ago
abyss - 2.0.0
Summary
This release introduces a new Bloom filter assembly mode that enables large genome assemblies with minimal memory (e.g. 34 GB for H. sapiens with 76X coverage bfc-corrected reads). Bloom filter assemblies are currently less contiguous than the default (MPI) assembly mode but are still of high quality (e.g. 3.5 Mbp vs. 4.8 Mbp scaffold NG50 for H. sapiens). Bloom filter assembly mode is enabled by adding three 'abyss-pe' parameters (B = Bloom filter size, H = number of Bloom filter hash functions, kc = k-mer coverage threshold). See 'README.md' for an example.
This release also updates several 'abyss-pe' parameter defaults to be more suitable for large genome assemblies with recent Illumina data. In addition, ABySS 2.0.0 includes minor usability improvements for 'abyss-sealer' and removes an unnecessary build dependency on sqlite3.
ChangeLog
2016-08-30 Ben Vandervalk benv@bcgsc.ca - Release version 2.0.0 - New Bloom filter mode for assembly => assemble large genomes with minimal memory (e.g. 34G for H. sapiens) - Update param defaults for modern Illumina data - Make sqlite3 an optional dependency
abyss-bloom: - New 'compare' command for bitwise comparison of Bloom filters (thanks to @bschiffthaler!) - New 'kmers' command for printing k-mers that match a Bloom filter (thanks to @bschiffthaler!)
abyss-bloom-dbg: - New preunitig assembler that uses Bloom filter - Add 'B' param (Bloom filter size) to 'abyss-pe' command to enable Bloom filter mode - See README.md and '--help' for further instructions
abyss-fatoagp: - Mask scaftigs shorter than 50bp with 'N's (short scaftigs were causing problems with NCBI submission)
abyss-pe: - Update default parameter values for modern Illumina data - Change 'l=k' => 'l=40' - Change 's=200' => 's=1000' - Change 'S=s' => 'S=1000-10000' (do a param sweep of 'S') - Use 'DistanceEst --mean' for scaffolding stage, instead of the default '--mle'
abyss-sealer: - New '--max-gap-length' ('-G') option to replace unintuitive '--max-frag'; use of '--max-frag' is now deprecated - Require user to explicitly specify Bloom filter size (e.g. '-b40G') - Report false positive rate (FPR) when building/loading Bloom filters - Don't require input FASTQ files when using pre-built Bloom filter files
konnector: - Fix bug causing output read 2 file to be empty - New percent sequence identity options ('-x' and '-X') - New '--alt-paths-mode' option to output alternate connecting paths between read pairs
README.md: - Fix documentation of ABYSS and abyss-pe parameters (thanks to @nsoranzo!)
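Because Bloom filter size and hash count both affect the false positive rate (FPR) that abyss-sealer now reports, it can help to see the standard textbook estimates. The sketch below uses the usual approximations; the function names are hypothetical, sizes are expressed in bits, and the exact formula ABySS uses for its report is not stated in these notes.

```python
import math


def expected_fpr(num_bits, num_hashes, num_items):
    """Textbook FPR estimate for a Bloom filter of `num_bits` bits after
    inserting `num_items` distinct elements with `num_hashes` hash functions."""
    fill = 1.0 - math.exp(-num_hashes * num_items / num_bits)
    return fill ** num_hashes


def observed_fpr(set_bits, num_bits, num_hashes):
    """FPR estimate from the measured occupancy of an already built filter."""
    return (set_bits / num_bits) ** num_hashes
```

Both estimates rise as the filter fills up, which is why an undersized filter shows up as a high reported FPR.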
- C++
Published by benvvalk over 9 years ago
abyss - 1.9.0
Summary
This release introduces a new paired de Bruijn graph mode for assembly. In paired de Bruijn graph mode, ordinary k-mers are replaced by k-mer pairs, where each k-mer pair is separated by a fixed-size gap. The primary advantage of paired de Bruijn graph mode is that the span of a k-mer pair can be arbitrarily wide without consuming additional memory, and thus provides improved scalability for assemblies of long sequencing reads.
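To make the idea of a k-mer pair concrete, the following sketch enumerates pairs from a sequence, where `span` plays the role of the pair span and `kmer_size` the size of each individual k-mer. The function name is hypothetical, and the requirement that the two k-mers do not overlap is a simplifying assumption of this example rather than a documented ABySS constraint.

```python
def kmer_pairs(sequence, span, kmer_size):
    """Yield (left, right) k-mer pairs; each pair covers `span` bases
    from the start of the left k-mer to the end of the right k-mer."""
    if kmer_size <= 0 or 2 * kmer_size > span:
        # Assumption for this sketch: the two k-mers must not overlap.
        raise ValueError("need 0 < 2 * kmer_size <= span")
    for start in range(len(sequence) - span + 1):
        window = sequence[start:start + span]
        # Only the two ends of the window are kept; the gap is discarded.
        yield window[:kmer_size], window[-kmer_size:]


pairs = list(kmer_pairs("AACCGGTT", span=6, kmer_size=2))
```

Only the two flanking k-mers are stored for each window, so widening `span` changes the gap between them without enlarging the stored key.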
This release also introduces a new tool called Sealer for closing scaffold gaps, new Konnector functionality for producing long pseudo-reads, and support for the DIDA (Distributed Indexing Dispatched Alignment) parallel alignment framework.
ChangeLog
2015-05-28 Ben Vandervalk benv@bcgsc.ca - Release version 1.9.0 - New paired de Bruijn graph mode for assembly. - First official release of Sealer, a tool for closing scaffold gaps by navigating a Bloom filter de Bruijn graph. - New outward extension feature for Konnector to generate long pseudo-reads. - Support for the DIDA (Distributed Indexing Dispatched Alignment) framework, for computing sequence alignments in parallel across multiple machines. - Unit tests can now be run easily with 'make check', without external dependencies.
abyss-bloom:
- abyss-bloom 'build' command now supports -j option for
multi-threaded Bloom filter construction.
abyss-map:
- New --protein option for mapping protein sequences.
abyss-pe:
- New paired de Bruijn graph mode for assembly. Enable by
setting k to the k-mer pair span and K to size of an
individual k-mer in a k-mer pair. See README.md for further
details.
- New aligner=dida option for using the DIDA parallel alignment
framework. See the DIDA section of the abyss-pe man page
for usage details.
- New graph=gfa option to use the GFA (Graphical
Fragment Assembly) format for intermediate graph files.
abyss-sealer: - New tool for closing scaffold gaps by navigating a Bloom filter de Bruijn graph - See Sealer/README.md or abyss-sealer man page for details and examples.
konnector:
- New --extend option for extending merged and unmerged
reads outwards in the de Bruijn graph.
- C++
Published by benvvalk almost 11 years ago
abyss - 1.5.2
Summary
In this release we introduce Konnector, a fast and memory-efficient tool to fill the gap between paired-end reads. Konnector determines the intervening sequence by building a Bloom filter de Bruijn graph and searching for paths between paired-end reads within the graph. A companion tool called abyss-bloom is also provided which can be used to construct reusable bloom filter files for input to Konnector; otherwise, Konnector will build an in-memory Bloom filter for one-time use. In addition to Konnector, we have fixed bugs related to compiling with GCC 4.8+ and parsing BWA output SAM files.
ChangeLog
2014-07-09 Anthony Raymond traymond@bcgsc.ca - Release version 1.5.2 - First official release of Konnector and abyss-bloom. - More GCC 4.8+ fixes! Modified Boost install instructions. - Fixed rare bug when parsing output of BWA.
ABYSS: - New option, --mask-cov, use kmers with lowercased bases, but don't count them towards multiplicity.
abyss-bloom: - Construct reusable Bloom filter files for use with Konnector. - Perform boolean operations on two or more bloom filters. Currently supports union and intersection operations.
abyss-fixmate:
- Check for boost 1.43+ when using unordered_map::quick_erase.
- New option, --all, to report all alignments.
- Set mate unmapped flag for mateless reads.
abyss-longseqdist:
- Fixed error: invalid CIGAR when reading BWA output.
configure: - Include mpi and boost libraries as system libraries. Silences warnings (treated as errors) when compiling with GCC 4.8+.
konnector: - Merge read pairs into a single sequence (pseudoread) by building a Bloom filter de Bruijn graph and searching for paths between the paired end reads. Input reads may be FASTA/FASTQ/SAM/BAM. The input files must be sorted by read name and may not contain orphan reads.
- C++
Published by traymond over 11 years ago
abyss - 1.5.1
Summary
In this release we fix a compatibility issue with Trans-ABySS 1.5.0 where the output of abyss-filtergraph is not strand-specific. We also include a portability fix for the Fujitsu C Compiler (FCC).
ChangeLog
2014-05-07 Anthony Raymond traymond@bcgsc.ca
- Release version 1.5.1
- Fix an issue with strand-specific RNA-Seq assembly when running
abyss-filtergraph --assemble --SS.
- Portability fixes for Fujitsu C Compiler (FCC).
abyss-filtergraph:
- Assemble contigs in forward orientation with --assemble --SS
abyss-pe: - Fix some cases where abyss-pe uses incorrect executables
ABYSS-P: - Portability fix with FCC
- C++
Published by traymond almost 12 years ago
abyss - 1.5.0
Summary
In this release we have added full strand specific RNA-Seq support such that output contigs are correctly oriented with respect to the original transcripts sequenced. Also, there are new parameters to abyss-pe, xtip and Q, that are used to improve assembly in high coverage regions like highly expressed transcripts. Setting xtip=1 will more aggressively remove certain tips. The Q parameter will prevent low quality bases from being used in the assembly. The version has been bumped to 1.5.0 to signify compatibility with Trans-ABySS 1.5.0.
ChangeLog
2014-01-15 Anthony Raymond traymond@bcgsc.ca - Release version 1.5.0 - Assemble strand-specific RNA-Seq libraries into strand-specific contigs. - New parameters, Q and xtip. Improves assembly in high-coverage regions by removing recurrent read errors. - Portability fixes for Fujitsu C Compiler.
abyss-pe:
- New parameter, Q, to mask low quality bases to N.
- New parameter, xtip=1, to remove 2-in 0-out tips.
- New parameter, ss=1, to perform strand-specific assembly
using ssRNA-Seq libraries.
- New command, scaftigs. Breaks scaffold sequences at 'N's and
produces a scaftigs.fa file.
- Include long-scaffs.fa in FAC statistics if long parameter
used.
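The scaftigs idea above, breaking scaffold sequences at runs of 'N', can be sketched in a few lines. This is an illustration only: the function name and the `min_gap` parameter are hypothetical, and the actual abyss-pe command may use a different gap threshold or handle sequence naming in ways not shown here.

```python
import re


def scaftigs(scaffold, min_gap=1):
    """Split a scaffold at runs of at least `min_gap` N characters
    and return the non-empty pieces in order."""
    pieces = re.split(r"[Nn]{%d,}" % min_gap, scaffold)
    # Leading or trailing gaps produce empty strings; drop them.
    return [piece for piece in pieces if piece]


example = scaftigs("ACGTNNNNGGCCNAT")
```

Raising `min_gap` keeps short runs of ambiguous bases inside a scaftig instead of splitting on every single 'N'.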
abyss-fixmate: - Performance improvement for GCC-4.6 and older.
DistanceEst: - Report an estimation of duplicate fragments from read pairs mapping to different contigs.
abyss-fixmate: - Report number of fragments removed as noise and outliers.
ABYSS/ABYSS-P: - New option, --SS, to support strand-specific assembly.
abyss-layout: - New option, --SS, to support strand-specific assembly.
abyss-map: - New option, --SS, to support strand-specific assembly.
abyss-overlap: - New option, --SS, to support strand-specific assembly.
abyss-PathOverlap: - New option, --SS, to support strand-specific assembly.
abyss-scaffold: - New option, --SS, to support strand-specific assembly. - Don't prune xtips when scaffolding.
AdjList: - New option, --SS, to support strand-specific assembly.
Overlap: - New option, --SS, to support strand-specific assembly.
PopBubbles: - New option, --SS, to support strand-specific assembly.
- C++
Published by traymond almost 12 years ago
abyss - 1.3.7
Summary
Scaffolds can now be rescaffolded using long sequences such as RNA-Seq assemblies produced from Trans-ABySS. Added support for gcc 4.8+ and Mac OS X 10.9 Mavericks with clang. Finally, we've licensed ABySS under GPL for non-commercial purposes. Please read the LICENSE file for more details.
ChangeLog
2013-11-20 Anthony Raymond traymond@bcgsc.ca - Release version 1.3.7 - Use long sequences to rescaffold scaffolds. May be run by adding libraries to the 'long' parameter. When scaffolding with RNA-Seq contigs from a Trans-ABySS assembly, the genic contiguity is greatly improved. - Added support for gcc 4.8+ and Mac OS X 10.9 Mavericks with clang. - Licensed as GPL for non-commercial purposes.
abyss-fac: - Added e-size to contiguity statistics as described in the GAGE paper.
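The E-size from the GAGE paper is commonly written as the sum of squared sequence lengths divided by the genome size. A minimal sketch follows; the function name is hypothetical, and defaulting the genome size to the total assembly length is an assumption of this example, not a statement about how abyss-fac chooses it.

```python
def e_size(lengths, genome_size=None):
    """E-size: sum of squared sequence lengths divided by the genome size.

    If no genome size is given, the total assembly length is used
    (an assumption of this sketch).
    """
    if genome_size is None:
        genome_size = sum(lengths)
    if genome_size <= 0:
        raise ValueError("genome size must be positive")
    return sum(length * length for length in lengths) / genome_size
```

Intuitively, this is the expected length of the sequence containing a randomly chosen base of the genome, so a few long sequences score higher than many short ones of the same total length.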
abyss-filtergraph: - Bug fix. '--assemble' will not fail an assertion. - New option, --max-length, used to remove contigs over the specified threshold. - Trim 2-in 0-out tips when removing tips.
abyss-map: - Bug fix. Correctly set mapq=0 for reads that multi map.
abyss-longseqdist: - New program. Generate distance estimates between all contigs a single read maps to.
abyss-mergepairs: - Report number of reads chastity filtered.
abyss-overlap: - Bug fix. Handle ambiguity codes.
abyss-pe:
- Support BWA-MEM with assembly. Run using parameter
'aligner=bwamem'.
- Added another scaffolding stage using long sequences. May be
run by adding libraries to the 'long' parameter.
ABYSS-P: - Bug fix. Do not use awk to merge fasta files.
abyss-samtobreak: - Building bug fix. Check that ghc modules are installed.
UnitTest: - The Google C++ testing framework has been added to ABySS.
- C++
Published by traymond about 12 years ago
abyss - Release version 1.3.6
2013-07-15 Anthony Raymond traymond@bcgsc.ca
```
* Release version 1.3.6
* Improved documentation for GitHub devs.
* ABYSS-P performance improvement.
* Various portability and bug fixes.

abyss-mergepairs:
* Fix program name.

abyss-fac:
* New option --exp-size to give the expected genome size needed for NG50 calculation.
* New option --count-ambig include ambiguities in calculations.

ABYSS/ABYSS-P:
* Performance improvement. Runtime reduced by ~20%.
* Fix support for MPICH.

abyss-map:
* No longer require POPCNT instruction.
* New option --order to force output order the same as input.

abyss-filtergraph:
* New option --remove to remove specified contigs from graph.

PopBubbles:
* Bug fix. Setting branches > 2 will now work.

abyss-fixmate:
* Improved error when first and second read IDs do not match.
* New option --cov to compute and store the physical coverage in a Wiggle file.

AdjIO:
* Bug fix for non-GCC compilers.
```
- C++
Published by traymond over 12 years ago