Recent Releases of genesis
genesis - genesis v0.34.0
Notable changes
- sequence
- Add Fai Input Stream
- Add experimental Kmer Color Gamut (color set) functions
- Add experimental approaches for kmer colors from Taxonomy
- taxonomy
- Add Taxonomy to Tree function with node associations
- Add experimental functions for Taxonomy partitioning
- Improve Accession Lookup Reader speed and fault tolerance
- utils
- strings
- Add Lightweight String class
- Add and refactor string split function overloads
- bits
- Add Bitvector set functions
- Add bit trailing/leading count functions
- threading
- Add Serial Task Queue
- Add throttled parallel for loop helpers
- misc
- Add Resource Logger helper class
- Add Matrix serialization
Build
- Upgrade to CMake >= 3.8
- Move bit and bitvector files
Bug fixes
- Fix missing header include in test
- Fix Csv Input Iterator range based for loop
- Several smaller bug fixes and compiler issues
- C++
Published by lczech 11 months ago
genesis - genesis v0.33.0
Notable changes
- sequence
- Add k-mer classes and functions
- Extraction from sequences and strings
- Scanning microvariant neighborhood
- Minimal canonical encoding functionality
- Helper functions for canonical k-mers
- Add setting for Fasta Reader to not validate labels
- Allow empty sequences in Fastx Input View Stream
- taxonomy
- Change Taxon ID to uint instead of string
- Refactor Taxonomy for simplicity and speed
- Refactor NCBI taxonomy reader to improve speed and memory
- Add Accession Lookup table class
- Add Taxon Kmer Data class
- Add Taxonomy kmer grouping functions
- Add Taxonomy Json Reader and Writer
- utils
- Bitvector
- Refactor Bitvector to use more free functions
- Add functions offering different lengths policies
- Add find first and last bit set functions
- Add Jaccard similarity functions
- Add Bitvector serialization functionality
- Add hierarchical agglomerative clustering functionality
- Add Bitpacked Vector class
- Add Concurrent Vector Guard class
- Add Shannon entropy function
- Extend median and quartiles functions to other numerical types
- Refactor serialization functionality for simplicity
- Add Thread Local Cache helper class
- Add exception handler callback to Thread Pool
- Refine Thread Pool timed wait functions
- CMake and build
- Fix compatibility issues with C++17 and later
- Add auto-detection of C++ standard to CMake
- Activate automatic AVX detection in CMake
- Deprecate MacOS 12 in GitHub Actions CI
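The Jaccard similarity functions for Bitvector listed above compute the ratio of bits set in both operands to bits set in either. A minimal sketch of that idea on plain 64-bit words follows; the names `pop_count` and `jaccard_similarity` are hypothetical and are not the genesis API, and the sketch assumes both inputs have the same length rather than offering the length policies mentioned in the notes.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Count the set bits in one 64-bit word (portable, no compiler intrinsics).
inline std::size_t pop_count( std::uint64_t word )
{
    return std::bitset<64>( word ).count();
}

// Jaccard similarity |A and B| / |A or B| of two bit sets stored as 64-bit words.
// Assumption of this sketch: both sets have the same number of words.
inline double jaccard_similarity(
    std::vector<std::uint64_t> const& lhs,
    std::vector<std::uint64_t> const& rhs
) {
    assert( lhs.size() == rhs.size() );
    std::size_t in_both = 0;
    std::size_t in_any  = 0;
    for( std::size_t i = 0; i < lhs.size(); ++i ) {
        in_both += pop_count( lhs[i] & rhs[i] );
        in_any  += pop_count( lhs[i] | rhs[i] );
    }

    // Convention chosen here: two empty sets have similarity 0.
    if( in_any == 0 ) {
        return 0.0;
    }
    return static_cast<double>( in_both ) / static_cast<double>( in_any );
}
```

Working on whole words keeps the loop short and lets the compiler use hardware population count instructions where available.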
Bug fixes
- Fix several issues with different compilers in CI
- Fix exception handling bug in Thread Pool
- Fix numerical epsilon for SVG color bar
- Fix nan lengths in Jplace Writer
- Fix taxonomy data check functions
- Fix taxonomy preorder iterator reference bugs
- Fix edge case in Newick Reader
Published by lczech over 1 year ago
genesis - genesis v0.32.0
Notable Changes
- population
- Add per-sample mask tags and functions
- Add mask for provided loci to window averaging function
- Add Genome Locus Set invert function
- Add Genome Window View chromosome lengths
- Exclude missing data from available loci window avg function
- tree
- Add tree drawing wrapper functions with node and edge shapes
- Refine tree postorder iterator for speed
- sequence
- Refactor and rename Fasta and Fastq Iterators to Streams
- Speed up Fasta and Fastq reading and offer string view reading
- utils
- Speed up Thread Pool by using Concurrent Queue
- Add several convenience functions to Thread Pool and threading
- Add hardware feature detection and current resource usage functions
- Add Sequential Output Buffer class
- Add stdin input source and stdout stderr output targets
- Refactor Input Stream get line functions to use AVX2
- Outsource int parsing from input stream
- bugfixes
- Fix reference base lower case comparison issue in population
- Fix VCF with non-SNP AD field entries and deletions
Published by lczech almost 2 years ago
genesis - genesis v0.31.1
This is mainly a release to fix the window averaging approach for FST, which was statistically nonsensical in the last release by accident.
Notable Changes
- Redesign FST window averaging implementation
- Refine diversity denominator for low read depths
- Add Window Stream begin and end callbacks
Published by lczech almost 2 years ago
genesis - genesis v0.31.0
This release is a major clean-up of the population classes and functions. In particular: (1) "Iterator" classes have been renamed to the more appropriate "Stream", and (2) the Variant filtering approach has been completely redesigned to use tags instead of fully removing positions from the stream, allowing us to properly compute per-window averages of statistics.
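The difference between removing filtered positions and tagging them can be illustrated with a small sketch. All types and names below (`FilterTag`, `Position`, `WindowSummary`, `summarize_window`) are hypothetical and do not mirror the genesis classes; the point is only that tagged positions stay in the stream, so a window can report how many positions passed, how many were filtered and why, and then pick a suitable denominator for its average.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical filter status attached to each position instead of dropping it.
enum class FilterTag { kPassed, kMissing, kLowDepth };

// Hypothetical per-position record; `value` is only used if the position passed.
struct Position {
    std::size_t pos;
    double      value;
    FilterTag   tag;
};

// Per-window counts by tag, plus the sum of the statistic over passing positions.
struct WindowSummary {
    std::size_t total     = 0;
    std::size_t passed    = 0;
    std::size_t missing   = 0;
    std::size_t low_depth = 0;
    double      sum       = 0.0;

    // Average over positions that passed all filters.
    double average_over_passed() const
    {
        return passed > 0 ? sum / static_cast<double>( passed ) : 0.0;
    }

    // Average over every position in the window, filtered or not.
    double average_over_window() const
    {
        return total > 0 ? sum / static_cast<double>( total ) : 0.0;
    }
};

inline WindowSummary summarize_window( std::vector<Position> const& window )
{
    WindowSummary result;
    for( auto const& p : window ) {
        ++result.total;
        switch( p.tag ) {
            case FilterTag::kPassed:
                ++result.passed;
                result.sum += p.value;
                break;
            case FilterTag::kMissing:
                ++result.missing;
                break;
            case FilterTag::kLowDepth:
                ++result.low_depth;
                break;
        }
    }
    assert( result.passed + result.missing + result.low_depth == result.total );
    return result;
}
```

If filtered positions were removed before reaching the window, only `passed` would be known, and neither the per-tag counts nor the window-wide denominator could be computed.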
Notable Changes
- General
- Rename all "Visitor" instances to "Observer" to follow the pattern
- Refactor observers to have on-enter and on-leave functionality
- population
- Rename Base Counts class to Sample Counts
- Rename all Variant and Window "Iterator" classes to "Stream"
- Rename Sliding Entries Window Stream to Queue Window Stream
- Rename usage of "coverage" to "read depth" in function names
- Major refactor of Variant and Sample Counts filter to use tagging filters
- Add filter categories and summaries, to simplify user output
- Refactor file formats and streams to use tagging filters
- Refactor statistics computations to use tagging filters
- Add proper window averaging support for statistics using tagging filters
- Refactor Queue Window Stream to use tagging filters
- Outsource Genome Stream from Chromosome Stream
- Add Position Window Stream
- Add Variant Gapless Input Stream
- Add Variant Input Stream that merges sample groups
- Add Diversity Processor helper class
- Add re-scaling and re-sampling functions for Sample Counts
- utils
- Add Kendall's Tau correlation functions
- Add multinomial and multivariate hypergeometric distribution functions
- Add betas and intercept coefficients estimation functions for GLM
- Refactor Thread Pool using Proactive Future for nested tasks
- Add thread-safe random engines
- Refine guess thread number functions
- Add auto waiting to parallel for loop functions
- Use global thread pool in gzip block compression
- Disable the local build of htslib if HTSLIB_DIR is provided
Bug fixes
- Fix end of iteration bug in Lambda Iterator
- Fix cmake clang htslib incompatibility
- Fix virtual override destructors
- Fix Matrix output stream operator for char types
- Fix htslib lib64 issue lczech/grenedalf#12
- Add regression interaction test for lczech/gappa#29
Published by lczech almost 2 years ago
genesis - genesis v0.30.0
Notable Changes
- population
- Add improved Tajima D empirical pool size estimators
- Add cathedral plot functions with efficient algorithm
- Add support for sample name header row in Sync Reader
- Improve Fst Pool Calculator classes
- Allow multiallelic SNPs in pool VCF and Karlsson Fst
- Refine automatic sample naming for formats without names
- Refine sample filter and numerical filter functions
- Improve input order and chromosome length check functionality
- utils
- Add Matrix inplace transpose function
- Add advanced compensated summation algorithms
- Add begin and end callbacks to Lambda Iterator
- Refine text join functions
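Compensated summation, as mentioned in the utils list above, reduces the rounding error that accumulates when adding many floating point numbers of very different magnitude. The notes do not say which algorithms were added, so the following is only a sketch of one well-known variant (Neumaier's improvement of Kahan summation); the function name `neumaier_sum` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sum the values while tracking the low-order bits lost in each addition.
inline double neumaier_sum( std::vector<double> const& values )
{
    double sum          = 0.0;
    double compensation = 0.0;
    for( double v : values ) {
        double const t = sum + v;
        if( std::abs( sum ) >= std::abs( v )) {
            // Low-order digits of v were lost in the addition.
            compensation += ( sum - t ) + v;
        } else {
            // Low-order digits of sum were lost in the addition.
            compensation += ( v - t ) + sum;
        }
        sum = t;
    }

    // Apply the accumulated correction once at the end.
    return sum + compensation;
}
```

For the input `1.0, 1e100, 1.0, -1e100`, a plain loop returns 0.0, while this variant recovers 2.0. Note that aggressive floating point optimizations such as `-ffast-math` may remove the compensation terms.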
Bug fixes
- Fix Matrix output operator for char types
- Fix missing return statements in Lambda Iterator and Base Window Iterator
- Fix htslib check in Variant Input Iterator test case
- Fix Dataframe and Matrix string to double conversions
- Fix generic convert function
- Fix thread collision for cache in Reference Genome class
- Fix backslash escape bug
Published by lczech over 2 years ago
genesis - genesis v0.29.0
Notable changes
- population
- Add Reference Genome based ref and alt handling
- Add Chromosome/Genome Iterator classes
- Add Window, WindowView, and Iterator abstractions and helpers
- Add diversity pool calculator, refactor diversity functions
- Refine diversity measures, for speed and robustness for large coverages
- Refactor FST pool functions into classes, for streaming
- Refactor Variant filters and transformations
- Add Kapun-style missing data entries to Sync Reader
- sequence
- Add Reference Genome class and functions
- Add Sequence Dict class and functions
- utils
- Refine binomial functions for larger values, increase speed
- Add visitor functions to Lambda Iterator
- Add ranged pop count function to Bitvector
- Move exceptions back to utils namespace
- build
- Export the cmake include targets so that genesis can be used as a subproject
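Binomial coefficients overflow integer types quickly, which is the usual motivation for refining binomial functions for larger values as noted in the utils list above. One common approach, shown here purely as an assumption and not as the genesis implementation, is to work in log space via the log-gamma function. The names `log_binomial_coefficient` and `binomial_distribution` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Natural logarithm of (n choose k), computed via log-gamma to avoid overflow.
inline double log_binomial_coefficient( std::size_t n, std::size_t k )
{
    assert( k <= n );
    return std::lgamma( static_cast<double>( n ) + 1.0 )
         - std::lgamma( static_cast<double>( k ) + 1.0 )
         - std::lgamma( static_cast<double>( n - k ) + 1.0 );
}

// Probability of exactly k successes in n trials with success probability p.
inline double binomial_distribution( std::size_t k, std::size_t n, double p )
{
    assert( k <= n );
    assert( p >= 0.0 && p <= 1.0 );

    // Handle the boundaries explicitly, as log(0) is not finite.
    if( p == 0.0 ) {
        return k == 0 ? 1.0 : 0.0;
    }
    if( p == 1.0 ) {
        return k == n ? 1.0 : 0.0;
    }

    double const log_pmf = log_binomial_coefficient( n, k )
        + static_cast<double>( k ) * std::log( p )
        + static_cast<double>( n - k ) * std::log1p( -p );
    return std::exp( log_pmf );
}
```

The trade-off is a small loss of precision for small inputs in exchange for finite results at sizes where an integer-based computation would overflow.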
Bug fixes
- Fix MRU Cache copy constructor
- Fix thread pool nested deadlocks and seg faults
- Fix date time sprintf function and gcc macro test
- Fix string split default argument overload
Published by lczech about 3 years ago
genesis - genesis v0.28.1
Notable changes
- population
- Add Sliding Entries Window Iterator
- Add user-provided column names to Frequency Table Reader
- utils
- Compute proper bounding boxes for SVG Path objects
- Compute proper SVG bounding boxes with transformations
- Add pie chart SVG helper function
- Add cache stats to MRU Cache
- build
- Update htslib version to fix autoconf issues
- Add LTO/IPO support with CMake build
- Add GitHub Actions CI
- Deactivate OpenMP on MacOS by default, too much trouble
- Change CMakeLists to use an object library to speed up compilation
Bug fixes
- Fix various minor compiler warnings found due to CI
- Fix clang issue with std::tm initialization
- Proper linking against OpenMP for tests and apps
- Remove deprecated std dependency
- Fix Base Window Iterator categories
Published by lczech over 3 years ago
genesis - genesis v0.28.0
Notable changes
- Add generic Frequency Table Input Iterator
- Add generic Genome Region Reader
- Add Genome Locus Set for fast position queries
- Add whole chromosome coverage functionality to Genome Region List
- Add Genome Region Window Iterator
- Add Map/Bim Reader
- Make Fst functions more lenient for small pool sizes
- Add global thread pool for eliminating core oversubscription
Bug fixes
- Fix memory leak in Base Window Iterator
- Fix sliding window iterator for empty input
Published by lczech over 3 years ago
genesis - genesis v0.27.0
Notable Changes
- Add SAM/BAM/CRAM Input Iterator, with RG read group splitting and filtering, and SAM flags filters
- Refactor Variant Input Iterators for ease of use
- Add Variant Input Iterator for Parallel Input
- Refactor Genome Region List to use Interval Tree, and add surrounding functionality
- Rename and refactor Kofler and Karlsson F_ST pool functions for clarity
- Add our unbiased F_ST estimators for pool sequencing data
- Refactor and refine diversity measure settings
- Refactor Window Iterator
- Non-virtual iterator interface
- Base class abstraction for SlidingWindowIterator
- Deprecate SlidingWindowGenerator, use SlidingWindowIterator instead
- Deprecate Vcf Window Generator function
- Add BED Reader
- Add Genome Region List reader for GFF
- Speed improvements and async block buffering for Lambda Iterator
- Refine CMake setup for htslib, improve autotools compatibility
Published by lczech almost 4 years ago
genesis - genesis v0.26.1
Notable Changes
- This is mostly a version bump because the version.hpp file did not get updated properly with genesis v0.26.0 due to the new year, but also:
- Add pendant length filters for placements
Published by lczech over 4 years ago
genesis - genesis v0.26.0
Notable Changes
Population:
- Add Genome Locus class and comparison operators
- Add Variant Parallel Input Iterator
- Add Variant Input Iterator
- Refine Pileup Reader to allow parsing Variants directly
- Change Window Iterator start position to 1
- Switch to always using local htslib
- Disable htslib libcurl requirement
Tree:
- Fix EMD computation with zero branch lengths
- Improve tree diameter function for speed and memory efficiency
- Add Simple Newick Reader and Writer
Utils:
- Add Interval Tree implementation
- Add bare Optional class
- Refactor Lambda Iterator
- Refine Options guess number of threads
- Relax binomial coefficient behaviour to allow for larger numbers
Published by lczech over 4 years ago
genesis - genesis v0.25.0
This is a long overdue release that adds a lot of new features, and in particular introduces support for population genetics data, methods, and file formats, and adds an (optional) dependency on htslib.
Important Changes
- Add (optional) support for VCF files by wrapping htslib (which now is an optional dependency)
- Add reading support for (m)pileup, GFF/GTF, and PoPoolation2 sync files
- Add tools to work with allelic variants (SNPs), genome regions, and sliding windows
- Add pool-sequencing variants of population genetic statistics, such as heterozygosity, Theta Watterson, Theta Pi, Tajima's D, and variants of F_ST, by re-implementing methods of PoPoolation and PoPoolation2
Notable Changes
- Add filtering and transforming iterator classes
- Add harmonic mean functions
- Add binomial distribution function and binomial coefficient (n choose k) functions
- Add base64 encoding and decoding functions
- Add natural sorting function
- Add simple pure function cache class
- Add svg image embedding and rendering options
- Add simple thread pool class
- Add GzipBlockOStream class
- Refactor gzip input stream to work on concatenated gzip streams
- Adapt BmpWriter and SequencePrinter to new OutputTarget classes
- Fix placement sample PqueryName filtering functions
- Fix tree bipartition find subtree function
- Fix bug in tuple hash function
- Fix undefined behaviour in GzipStream destructor
- Bug fixes and speed improvements
Published by lczech almost 5 years ago
genesis - genesis v0.24.0
Notable Changes
- Change writer classes to use new output targets functionality
- Add Fastq handling: reading, writing, iterating
- Add phred quality score handling
- Add gzip output support and gzip streams
- Add date/time conversion functions
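Phred quality scores, as handled by the Fastq functionality listed above, relate a score Q to an error probability p via Q = -10 log10(p). The sketch below shows that relation and the decoding of a quality character. It assumes the common Sanger encoding with an ASCII offset of 33; the function names are hypothetical and not taken from genesis, which may support further encodings.

```cpp
#include <cassert>
#include <cmath>

// Decode one Fastq quality character, assuming the Sanger encoding (offset 33).
inline int decode_phred_sanger( char c )
{
    assert( c >= 33 && c <= 126 );
    return static_cast<int>( c ) - 33;
}

// Convert a phred score to the probability that the base call is wrong.
inline double phred_to_error_probability( int phred )
{
    return std::pow( 10.0, -static_cast<double>( phred ) / 10.0 );
}

// Convert an error probability to the nearest integer phred score.
inline int error_probability_to_phred( double probability )
{
    assert( probability > 0.0 && probability <= 1.0 );
    return static_cast<int>( std::round( -10.0 * std::log10( probability )));
}
```

For example, the character `I` decodes to a score of 40 under this encoding, which corresponds to an error probability of 1 in 10,000.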
Further Changes
- Add a lot of refinements to existing functionality
- Improve support and fix issues for build platforms and compilers
- Refine error messages for file handling
- Several speedups, improvements, and bug fixes
Published by lczech almost 6 years ago
genesis - genesis v0.23.0
This is the release that accompanies the publication of our application note
Genesis and Gappa: processing, analyzing and visualizing phylogenetic (placement) data. Lucas Czech, Pierre Barbera, and Alexandros Stamatakis. Bioinformatics, 2020. https://doi.org/10.1093/bioinformatics/btaa070
which is now the official reference to cite when using genesis and gappa.
Notable Changes
- Add many more tutorials for all parts of genesis.
- Reduce memory footprint of Squash Clustering.
- Add Robinson-Foulds (RF) distance functions.
- Several bugfixes and refinements.
Published by lczech over 6 years ago
genesis - genesis v0.22.1
Notable Changes
- Refine input stream reading.
- Refine phylip reading.
Published by lczech almost 7 years ago
genesis - genesis v0.22.0
Notable Changes
- Add support for old versions 1 and 2 of the jplace standard.
- Add minimal tree function for easier tree experimentation.
- Fix some issues occurring on Mac and with other compilers.
- Fix issues found by clang sanitizer and similar tools.
- Fix some bugs, refine some code, speed up some functions.
Published by lczech about 7 years ago
genesis - genesis v0.21.0
This is a long overdue release that adds a lot of new features, and refactors existing ones. It breaks compatibility with previous releases.
Important Changes
- Refactor design and usage of input reading functions/classes.
- Rename DefaultTree to CommonTree.
- Use dereferencing iterator for Tree members.
- Refactor several smaller classes and functions for usability.
Notable Changes
- Add transparent gzip and zlib support for input reading.
- Add adaptation of Phylogenetic Isometric Log-Ratio (PhILR) Transform to phylogenetic placements.
- Add Placement-Factorization, an adaptation of Phylofactorization to phylogenetic placements.
- Add Generalized Linear Models (GLM).
- Add Multi-Dimensional Scaling (MDS).
- Add Taxonomy to Tree functions.
- Add Tree/Sample rerooting and subtree deletion functions.
- Add Subtree class, add iterator support for subtrees.
Further Changes
- Add heat tree / heat map visualization.
- Add several statistics functions.
- Add a lot (!) of smaller auxiliary functions and features.
- Fix several bugs, implement several speedups and improvements.
Published by lczech about 7 years ago
genesis - genesis v0.20.0
Notable Changes
- Add Sequence abundances.
- Rename and change several Tree functions and classes.
- Refine hashing functions and classes.
- Add several Taxonomy functions.
- Add and change k-means functions.
- Add Matrix Writer class.
- Many bugfixes and refinements.
Published by lczech over 7 years ago
genesis - genesis v0.19.0
Notable Changes
- Refactor Node Histogram Distance.
- Refactor bipartition functions.
- Add more Color and Svg functions.
- Add and improve Tree drawing functions.
- Add Matrix iterators.
- Add DataFrame and MruCache classes.
- Add hash and sequence signature functions.
- Add many convenience and helper functions.
- Refactor some functions and classes.
- Speedups in many functions.
- Some bugfixes.
Published by lczech about 8 years ago
genesis - genesis v0.18.1
Notable Changes
- Fix regex issue with gcc5.
- Fix numerical issue in geodesy test case.
- Fix missing header include.
Published by lczech over 8 years ago
genesis - genesis v0.18.0
Notable Changes
- Add Color Palette class and more Color features and lists.
- Simplify consensus sequence calculations, add Cavener's method.
- Add some geodesy functions.
- Add Matrix Reader class.
- Speedups in Jplace Reader and Node Distance Matrix.
Published by lczech over 8 years ago
genesis - genesis v0.17.0
Notable Changes
- Add per-module headers.
- Better CMake support for using Genesis as a library.
- Refactor Node Histogram Distance to also use negative axis.
- Add some helper functions and speedups.
- Change default out of range behaviour of Histogram.
Published by lczech over 8 years ago
genesis - genesis v0.16.0
Notable Changes
- Better CMake support for using Genesis as a library.
- Add Squash Clustering.
- Add statistics functions (correlation coefficients, ranking, etc).
- Improvements in Earth Mover's Distance and EdgePCA calculations.
- Improvements in Tree drawing and Matrix functions.
- Various bugfixes and speedups.
Published by lczech almost 9 years ago
genesis - genesis v0.15.0
Notable Changes
- Add better multiple Newick trees reading support.
- Speedup and refinements in Node Histogram Distance and EdgePCA calculation.
- Add color palettes.
- Add Genesis version name.
Published by lczech almost 9 years ago
genesis - genesis v0.14.0
Notable Changes
- Add Placement Histograms demo.
- Add K-means clustering, including K-means++ and empty cluster treatment.
- Add Attribute Tree, which supports string maps for Nodes and Edges.
- Add simple k-mer counting functions.
- Improved EMD and EdgePCA calculations: Faster and more flexible.
- Tutorials for Basic Tree usage and for Sequences.
Published by lczech about 9 years ago
genesis - genesis v0.13.0
This is an overdue release that contains some major incompatibilities with previous versions.
Important Changes
- Move all library files to lib/genesis subdirectory, to avoid include conflicts.
- All binaries are now compiled into the bin directory and its subdirectories.
Notable Changes
- Add functions to add Nodes to a Tree.
- Add Tree reroot and ladderize function.
- Add labelled tree function and demo.
- Add EdgePCA function, PCA function and Matrix statistics functions.
- Add Bitmap, better Svg and better Json support.
- Add functions for Taxonomies and Sequences, like pruning by entropy and consensus sequences.
- Refine some functions to work with Sequences, as well as Phylip and Fasta files.
- Refactor Tree reading and writing. More extensible now, more options available.
- Refactor the interface of Sample.
- Major speedups in Earth Mover's Distance implementation.
Further Changes
- Add global option for allowing to overwrite files.
- All readers now use the new scanners. The old lexer class is gone.
- Add unity build by default for speedup and better optimization.
- Better support of Pthreads and OpenMP.
- Preparations for reactivation of Python bindings with Pybind11.
- Automatic download of GTest and Pybind11, if needed.
- Add some documentation for existing code.
Published by lczech about 9 years ago
genesis - genesis v0.12.1
Notable Changes
- Fix in Tree scaling for "Compare Jplace Files" demo.
- Add scale_all_branch_lengths() function for Sample.
- Allow small tolerance in sum of LWRs for validating a Sample.
Published by lczech over 9 years ago
genesis - genesis v0.12.0
Notable Changes
- Add color gradient information to the "Visualize Placement" demo.
- Add circular tree drawing functions.
- Some small fixes and improvements.
Published by lczech over 9 years ago
genesis - genesis v0.11.1
Notable Changes
- Fix bug in "Compare Jplace Files" demo program.
- Add some minor functions.
Published by lczech over 9 years ago
genesis - genesis v0.11.0
Notable Changes
- Add SHA1 calculation, e.g., for Sequence relabelling.
- Add Expected Distance between Placement Locations (EDPL) functions.
Published by lczech over 9 years ago
genesis - genesis v0.10.0
Notable Changes
- Add "Compare Jplace Files" demo.
Published by lczech over 9 years ago
genesis - genesis v0.9.0
This release contains some major incompatibilities with previous versions.
Notable Changes
- Refactor Tree template class into normal class. Thus, data on Nodes and Edges is now dynamic instead of static.
- Add functions and classes to work with Sequences, e.g., consensus sequences, filtering, sanitizing.
- Add functionality and algorithms for working with Taxonomies, e.g., iterators, pruning by entropy, arbitrary data on Taxa.
- Add helper classes and functions that will be important later: SVG, RMQ, TwobitVector and more.
- Stall Python support. It's too much work to maintain the bindings while working on the library. Will continue once the API is stable.
- Bugfixes and better C++14 compatibility, plus many internal changes for speedup and improved maintainability.
Published by lczech over 9 years ago
genesis - genesis v0.8.0
Notable Changes
- Add many Taxonomy functions.
- Add Taxonomy Reader for CSV files with Taxscriptors.
- Add functions for Trees and distances.
- Add some more functions for Sequences and fasta files.
- Some function renaming for more consistency.
- Update extract clade placement demo to use CSV file instead of clade directory.
- Many bugfixes.
Published by lczech almost 10 years ago
genesis - genesis v0.7.0
Notable Changes
- Add Taxonomy class and functions.
- Add CsvReader class.
- Move Reader and Writer classes from io to formats.
- Turn Log Warnings into exceptions where reasonable.
- Reading formats works with all new line chars now (including Mac and Win).
- Fix issue with C++14 compiler and make_unique implementation.
Published by lczech almost 10 years ago
genesis - genesis v0.6.0
Notable Changes
- Finish first working version of the Placement Simulator.
- Speedup for writing Jplace files.
- Fix some issues with g++ and Boost Python.
Published by lczech about 10 years ago
genesis - genesis v0.5.1
Notable Changes
- Bugfix in Visualize Placement demo.
Published by lczech about 10 years ago
genesis - genesis v0.5.0
Notable Changes
- Add example to "Extract Clade Placements" demo.
- Rename Tree distance functions; add some of their missing implementations.
Published by lczech about 10 years ago
genesis - genesis v0.4.0
Notable Changes
- Add normalizing step to "Extract Clade Placements" demo.
- Add Tree::convert_from() function for trees with different template parameters.
- Add new, more flexible EMD implementation, that also takes multiplicities into account.
- Make Sample::add_pquery() function work with Pqueries from different Samples.
- Add some safety measures, documentation, etc. for some classes.
- Fix some minor bugs.
Published by lczech about 10 years ago
genesis - genesis v0.3.0
This release contains a major refactoring of the API. One big step towards a clean, consistent interface.
The API is not finished yet. There might be breaking changes in the future.
Notable Changes
- Major API cleanup and changes towards consistency.
- Major refactoring of many class interfaces.
- Outsource many class functions to free functions.
- Move all code into distinct module namespaces.
- Update Python bindings.
- Add manual with first tutorials and demos.
- Make build process ready for release.
- Add continuous integration.
Published by lczech about 10 years ago
genesis - genesis v0.2.0
(Re-)Release that changes the license from GNU GPL v2 to v3.
This is not an API-breaking change, but a license-breaking one. As GPL v2 and v3 are incompatible, we increase the version number to make that change visible.
Published by lczech about 10 years ago
genesis - genesis v0.1.1
This is a minor release that prepares a change of the license.
Notable Changes
- Restructure the doc directory.
- Restructure the tools directory.
- Move file functions to namespace genesis::utils.
Published by lczech about 10 years ago
genesis - genesis v0.1.0
Initial release that starts version counting. We use semantic versioning.
For now, the public API should not be considered stable. Anything may change at any time. We use major versions 0.x.y in this initial development phase. The minor version x will be increased whenever backwards compatibility is broken.
We will transition to major version 1.0.0 once the API is considered stable. From then on, the major version will increase for incompatibilities, the minor version for new features, and the patch version for bug fixes.
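The versioning scheme described above can be sketched as a small compatibility check. The `Version` type and the functions `parse_version` and `is_compatible` are hypothetical helpers for illustration, not part of genesis. The sketch also assumes that within the 0.x.y phase a higher patch version y stays compatible, which the text above implies but does not state.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical version triple.
struct Version {
    int major;
    int minor;
    int patch;
};

// Parse "x.y.z", with an optional leading 'v' as used in the release names.
inline Version parse_version( std::string const& text )
{
    Version v = { 0, 0, 0 };
    char const* s = text.c_str();
    if( *s == 'v' ) {
        ++s;
    }
    int const n = std::sscanf( s, "%d.%d.%d", &v.major, &v.minor, &v.patch );
    assert( n == 3 );
    (void) n;
    return v;
}

// Can code written against `built` be expected to work with `available`?
inline bool is_compatible( Version const& built, Version const& available )
{
    if( built.major != available.major ) {
        return false;
    }
    if( built.major == 0 ) {
        // Initial development phase: a minor version bump may break compatibility.
        return built.minor == available.minor && available.patch >= built.patch;
    }

    // Stable phase: newer minor and patch versions only add features and fixes.
    if( available.minor != built.minor ) {
        return available.minor > built.minor;
    }
    return available.patch >= built.patch;
}
```

Under these assumptions, code written against v0.31.0 is expected to work with v0.31.1, but not necessarily with v0.32.0.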
Published by lczech about 10 years ago