Recent Releases of https://github.com/hosseinmoein/dataframe

https://github.com/hosseinmoein/dataframe - August-2025

Please consider sponsoring DataFrame, especially if you are using it in Production Capacity. It is the strongest form of appreciation

- Improved documentation by creating split windows
- Added cheat-sheet to docs
- Fixed some edge case bug in parallel sort
- Implemented detect_and_fill()
- Implemented detect_and_change()
- Implemented KolmoSmirnovTestVisitor visitor
- Implemented MannWhitneyUTestVisitor visitor
- Implemented mask()
- Added many more functionalities to the internal Matrix class
- Implemented fast_ica()
- Fixed the inconsistency in writing/reading DateTime columns to/from files
- Added ability to read only selected columns from files
- Implemented MutualInfoVisitor visitor
- Added ability to specify a delimiter when writing/reading to/from csv files
- Implemented AndersonDarlingTestVisitor visitor
- Implemented ShapiroWilkTestVisitor visitor
- Implemented CramerVonMisesTestVisitor visitor
- Implemented unpivot()
- Implemented pivot()

- C++
Published by hosseinmoein 7 months ago

https://github.com/hosseinmoein/dataframe - April-2025

- Enhanced documentation
- Implemented SpectralClusteringVisitor visitor
- Enhanced ThreadPool parallel_loop()
- Implemented get_[data|view]_by_spectral()
- Implemented determinant() + a bunch of other stuff in Matrix
- Implemented canon_corr()
- Implemented MC_station_dist()
- Implemented SeasonalPeriodVisitor visitor
- Improved performance in reading files of different types
- Changed read() signature to take a struct for its parameters -- backward incompatible change
- Changed write() signature to take a struct for its parameters -- backward incompatible change
- Implemented ability to read() csv2 files with user provided schema
- Implemented knn()
- Implemented DynamicTimeWarpVisitor visitor
- Implemented AnomalyDetectByFFTVisitor visitor
- Changed the interface of HampelFilterVisitor -- backward incompatible change
- Implemented remove_data_by_fft()
- Implemented AnomalyDetectByIQRVisitor visitor
- Implemented AnomalyDetectByZScoreVisitor visitor
- Implemented remove_data_by_iqr()
- Implemented remove_data_by_zscore()
- Implemented AnomalyDetectByLOFVisitor visitor
- Ported to GCC14 compiler and fixed many edge-case bugs

- C++
Published by hosseinmoein 11 months ago

https://github.com/hosseinmoein/dataframe - Jan-2025

- Improved documentation and code quality
- Fixed a bug in assign()
- Implemented get_[data|view]_by_kmeans()
- Changed interface and optimized code in AffinityPropVisitor (backward incompatible change)
- Implemented get_[data|view]_by_affin()
- Added option to HampelFilterVisitor to populate indices of the datapoints affected
- Implemented remove_data_by_hampel()
- Implemented MeanShiftVisitor visitor
- Implemented get_[data|view]_by_dbscan()
- Implemented get_[data|view]_by_mshift()
- Improved performance in remove_duplicates()
- Added FixedSizeString as one of the types that can be read/written from/to files
- Added a stable_algo option to covariance and ... visitors to use a numerically stable algo instead of regular algo
- Implemented a Matrix class to be used for internal calculations and analysis results
- Implemented CrossCorrVisitor visitor
- Optimized the implementation of AutoCorrVisitor
- Implemented PartialAutoCorrVisitor visitor
- Added max_lag parameter to AutoCorrVisitor
- Implemented make_stationary()
- Implemented StationaryCheckVisitor visitor
- Implemented covariance_matrix()
- Implemented pca_by_eigen()
- Implemented compact_svd()

- C++
Published by hosseinmoein about 1 year ago

https://github.com/hosseinmoein/dataframe - Oct-2024

- Improved documentation both visually and content-wise
- Changed NLargestVisitor to take N as constructor parameter instead of template parameter (backward incompatible change)
- Implemented get_top_n_[data|view]()
- Implemented get_bottom_n_[data|view]()
- Implemented get_above_quantile_[data|view]()
- Implemented get_below_quantile_[data|view]()
- Added period parameter to ReturnVisitor visitor
- Implemented starts_with()
- Implemented ends_with()
- Implemented CumCountVisitor visitor
- Implemented in_between()
- Implemented peaks()
- Implemented valleys()
- Made reading/writing large files faster
- Implemented apply()
- Made replace() faster with better algorithm
- Implemented truncate()
- Implemented a version of load_column() with functor generating data
- Implemented explode()
- Implemented reading/writing std::pair columns from/to files
- Added more sanity checks
- Implemented difference()
- Implemented get_[data|view]_at_times()
- Implemented get_[data|view]_before_times()
- Implemented get_[data|view]_after_times()
- Implemented get_[data|view]_on_days()
- Implemented get_[data|view]_in_months()
- Implemented get_[data|view]_on_days_in_month()
- Implemented get_[data|view]_between_times()
- Implemented remove_top_n_data()
- Implemented remove_bottom_n_data()
- Implemented remove_above_quantile_data()
- Implemented remove_below_quantile_data()
- Implemented remove_data_by_stdev()
- Implemented get_[data|view]_by_stdev()

- C++
Published by hosseinmoein over 1 year ago

https://github.com/hosseinmoein/dataframe - July-2024

- Enhanced documentation and code clean ups
- Converted DateTime doc to html
- Implemented PeaksAndValleysVisitor visitor
- Implemented EhlersHighPassFilterVisitor visitor
- Implemented EhlersBandPassFilterVisitor visitor
- Implemented reading/writing in binary format
- Implemented reading binary data format in chunks
- Implemented serialize() and deserialize()
- Implemented reading/writing containers in binary format
- Added optional time-zone to strings parsed by DateTime constructor
- Implemented PowerFitVisitor visitor
- Implemented QuadraticFitVisitor visitor
- Implemented fill_policy::lagrange_interpolate
- Implemented correlation_type::kendall_tau
- Implemented change_freq()
- Implemented duplication_mask()

- C++
Published by hosseinmoein over 1 year ago

https://github.com/hosseinmoein/dataframe - May-2024

- Significantly enhanced documentation both content-wise and visually
- Fixed a few edge-case bugs, including an edge-case in reading CSV2 format files
- Factored out and cleaned code
- Implemented inversion_count()
- Implemented get_[data|view]_by_like()
- Implemented remove_data_by_like()
- Added char and uchar type to types read/written from/to files
- Added ability to read/write columns of containers from/to files
- remove_column() now requires a template parameter. It actually frees up the memory space now
- Implemented clear()
- Implemented swap()
- Now using some of the std::ranges algorithms
- Added scalar arithmetic DF operators
- Added magnitude calculations to DotProdVisitor visitor
- Added Euclidean distance calculations to DotProdVisitor visitor
- Added Manhattan distance calculations to DotProdVisitor visitor
- Implemented VectorSimilarityVisitor visitor
- Replaced asserts in algos with exceptions and added a compile-time option for it (HMDF_SANITY_EXCEPTIONS)
- Partially reengineered views so now you can use most of the API from views
- Added sentinels to vector views iterators

- C++
Published by hosseinmoein almost 2 years ago

https://github.com/hosseinmoein/dataframe - Feb-2024

Multithreading was completely redesigned by using a versatile thread-pool. Almost every API has a multithreaded version that kicks in for large datasets. This justifies increasing the major version number.

- Added a thread-pool. All Async calls now use the thread-pool.
- Sort now uses parallel sort for large datasets.
- Added multithreading to almost all algorithms.
- Enhanced docs and hello world.

- C++
Published by hosseinmoein about 2 years ago

https://github.com/hosseinmoein/dataframe - Dec-2023

This release requires C++23 or higher.

- Added more content to documentation.
- Made reading/writing files more streamlined and efficient.
- Fixed a bug in Median and Kth_element visitors related to handling nans.
- Added ability to read/write String Vectors, Double Sets, and String Sets as column elements in CSV2 format.
- Added seed option to all algorithms that use random numbers.
- Implemented PriceVolumeTrendVisitor visitor.
- Implemented QuantQualEstimationVisitor visitor.
- Fixed RSIVisitor visitor result to be the same size as its input.
- Implemented get_str_col_stats().
- Added get_euclidean_norm() to QuadraticMeanVisitor visitor.
- Added different normalization-types to NormalizeVisitor visitor.
- Added more benchmarking comparing with Pandas and Polars.
- Made sorting much faster by using ranges and zip.

- C++
Published by hosseinmoein about 2 years ago

https://github.com/hosseinmoein/dataframe - Oct-2023

- Added more content to documentation and Hello World example.
- Fixed a bug in join that missed multiple matches in some edge cases.
- Fixed a bug/edge case in Covariance calculation.
- Fixed a bug in reading JSON files.
- Utilized meta-programming in several parts of the codebase, especially visitors.
- Added a whole lot of C++ concepts throughout the code.
- Fixed many const-correctness issues throughout the code.
- Added a mechanism for a lot of visitors to be used in groupby and bucketize.
- Now in csv2 format you can read/write columns of vector, map, and unordered map types.
- Enhanced DateTime ISO format parsing.

- C++
Published by hosseinmoein over 2 years ago

https://github.com/hosseinmoein/dataframe - June-2023

- Added more content to documentation and README
- Cleaned up and streamlined codebase (using a lot of typedef’s, using STL algorithms)
- Fixed minor bugs (got rid of template exports, ChandeKrollStopVisitor{}, got rid of boundary issues)
- Added more operators to DateTime class
- Implemented InertiaVisitor{} visitor
- Implemented SymmTriangleMovingMeanVisitor{} visitor
- Implemented RelativeVigorIndexVisitor{} visitor
- Implemented ElderRayIndexVisitor{} visitor
- Implemented ChopIndexVisitor{} visitor
- Implemented DetrendPriceOsciVisitor{} visitor
- Implemented RectifiedLinearUnitVisitor{} visitor
- Implemented AccelerationBandsVisitor{} visitor
- Implemented PriceDistanceVisitor{} visitor
- Implemented EldersThermometerVisitor{} visitor
- Implemented ProbabilityDistVisitor{} visitor
- Changed ReLU Rectifier visitor to a generalized RectifyVisitor{} with many options
- Implemented PolicyLearningLossVisitor{} visitor
- Implemented load_result_as_column() that runs and loads the result in one shot
- Implemented LossFunctionVisitor{} visitor
- Implemented EldersForceIndexVisitor{} visitor
- Implemented EaseOfMovementVisitor{} visitor
- Added [[likely]] to branches and now requiring C++20
- Implemented SeqLock{} synchronization

- C++
Published by hosseinmoein over 2 years ago

https://github.com/hosseinmoein/dataframe - April-2023

Implemented ability to have custom memory-boundary allocation to take advantage of SIMD instructions. This breaks backward-compatibility especially for views. Also, this justifies increasing the major version number.

- Bug fixes and code cleanup
- Enhanced hello_world.cc and docs
- Made multi-threading faster by streamlining locks
- Made file I/O faster and more efficient
- Implemented get_data_by_sel() for 11, 12, and 13 columns
- Implemented CubicSplineFit visitor
- Implemented ImpurityVisitor visitor
- Added ignore_index option to sort functions to make index sorting optional
- Implemented ExponentiallyWeightedVarVisitor visitor
- Removed ExponentialRollAdopter
- Implemented ExponentiallyWeightedCovVisitor visitor
- Implemented ExponentiallyWeightedCorrVisitor visitor
- Implemented ability to read files in chunks
- Added arithmetic operators to DateTime
- Changed KMeans to always calculate clusters -- interface change
- Changed all std::is_arithmetic to supports_arithmetic
- Implemented FixedAutoCorrVisitor visitor
- Added option to SharpeRatioVisitor to also calculate Sortino Ratio
- Implemented RVIVisitor visitor
- Fixed moving averages to work with nan values in the beginning
- Added options to sort to sort based on absolute values
- Implemented LinregMovingMeanVisitor visitor

- C++
Published by hosseinmoein almost 3 years ago

https://github.com/hosseinmoein/dataframe - Jan-2023

- Bug fixes, including get_[data|view]_by_rand(), docs
- Documentation and Hello World enhancements
- More conformance with ISO C++
- Enhanced DateTime ISO parsing
- Implemented T3MovingMeanVisitor
- Implemented append_row()
- Added max_recs parameter to write() + fixed compiler debug warnings
- Implemented const views
- Implemented load_result_as_column()
- Implemented get_indicators() and from_indicators()
- Implemented TreynorRatioVisitor
- Implemented ExponentialFitVisitor
- Implemented LinearFitVisitor

- C++
Published by hosseinmoein about 3 years ago

https://github.com/hosseinmoein/dataframe - July-2022

- Enhanced documentation + added more to hello_world.cc
- Fixed compiling issue related to injecting clock_gettime() system call into code
- Code clean ups and performance enhancements
- Fixed a bug in KthValueVisitor that affected MedianVisitor and QuantileVisitor
- Implemented BalanceOfPowerVisitor visitor
- Implemented NonZeroRangeVisitor visitor
- Implemented ChandeKrollStopVisitor visitor
- Added average option to TrueRange
- Implemented VotexVisitor visitor
- Implemented KeltnerChannelsVisitor visitor
- Added normalize option to TrueRange + now using exponential moving avg in TrueRange
- Implemented TrixVisitor visitor
- Implemented PrettyGoodOsciVisitor visitor
- Implemented ZeroLagMovingMeanVisitor visitor
- Implemented StableMeanVisitor visitor
- Added double operator to DateTime
- Implemented describe()

- C++
Published by hosseinmoein over 3 years ago

https://github.com/hosseinmoein/dataframe - Feb-2022

- Enhanced documentation
- Fixed a few bugs; visitors macros, get_[data|view]_by_loc(), get_view_by_idx()
- Fixed all views not to be const, since you can change things through views
- CMake compiling was redone to make it easier for Windows
- Windows macros were redesigned to make compiling easier
- Turned extra warning flag on and fixed all compiler warnings
- Made hello_world.cc more comprehensive
- Implemented get_[data|view]()
- Replaced std::array, in all interfaces, with std::vector
- Implemented get_[data|view]_by_sel() for up to 5 columns
- Implemented to_string() and from_string()
- Implemented Coppock Curve Visitor
- Implemented Bias Visitor

- C++
Published by hosseinmoein about 4 years ago

https://github.com/hosseinmoein/dataframe - Oct-2021

- Fixed bugs, including in multithreading, groupby + Removed all warnings
- Improved documentation
- Improved/tightened multithreading + SpinLock is now recursive
- Implemented RateOfChangeVisitor
- Implemented AccumDistVisitor
- Added single_act_visit() for 5 columns
- Implemented ChaikinMoneyFlowVisitor
- Implemented VertHorizFilterVisitor
- Implemented OnBalanceVolumeVisitor
- Implemented TrueRangeVisitor
- Changed the ReturnVisitor logic to start with a NaN and have the same length as input
- Implemented DecayVisitor
- Implemented HodgesTompkinsVolVisitor
- Implemented ParkinsonVolVisitor
- Implemented concat_view()
- Added hello_world.cc
- Implemented get_row() that always includes all columns

- C++
Published by hosseinmoein over 4 years ago

https://github.com/hosseinmoein/dataframe - July-2021

- Fixed minor bugs (MACD, MassIndex, PercentPriceOSCI) and streamlined code
- Improved documentation
- Implemented gen_sym_triangle() -- generate symmetric triangular numbers
- Implemented RSXVisitor -- "noise free" version of RSI, with no added lag
- Implemented TTMTrendVisitor -- Trade To Market trend indicator
- Implemented ParabolicSARVisitorVisitor -- Parabolic Stop And Reverse (PSAR)
- Implemented EBSineWaveVisitor -- Even Better Sine Wave (EBSW) indicator
- Implemented EhlerSuperSmootherVisitor -- Ehler's Super Smoother Filter (SSF) indicator
- Implemented VarIdxDynAvgVisitor -- Variable Index Dynamic Average (VIDYA) indicator
- Implemented AbsVisitor -- Absolute value visitor
- Implemented PivotPointSRVisitor -- Pivot Points, Supports and Resistances indicators
- Added mean_type + rearranged mean calculations
- Implemented ExponentiallyWeightedMeanVisitor -- Exponentially weighted moving average
- Added abbreviated type aliases for visitors with long names
- Implemented AvgDirMovIdxVisitor -- Average Directional Movement Index (ADX)
- Added more performance tests
- Implemented fill_missing() with another DataFrame
- Implemented HoltWinterChannelVisitor -- Holt-Winter Channel (HWC) indicator
- Implemented HeikinAshiCndlVisitor -- Heikin Ashi candle
- Implemented FastFourierTransVisitor -- Fast Fourier transform and its inverse
- Implemented CenterOfGravityVisitor -- Also called Stochastic Oscillator
- Implemented ArnaudLegouxMAVisitor -- Arnaud Legoux Moving Average

- C++
Published by hosseinmoein over 4 years ago

https://github.com/hosseinmoein/dataframe - May-2021

- Bug fixes, including in DateTime, is_equal(), write()
- Rearranged code to make it easier to compile as DLL
- Significantly improved docs both in terms of format and content
- Implemented Percent Price Oscillator visitor
- Added consolidated() for 4 and 5 columns
- Added another version of shift() to return a copy of a single shifted column
- Implemented Ulcer Index visitor
- Completely redesigned groupby() interface and made it significantly more versatile
- Completely redesigned bucketize() interface and made it significantly more versatile
- Implemented Count visitor
- Added ISO date format to DateTime
- Implemented ability to read()/write() DateTime as strings into streams
- Fixed a few move semantics to improve memory efficiency

- C++
Published by hosseinmoein almost 5 years ago

https://github.com/hosseinmoein/dataframe - March-2021

- Improved documentation
- Minor bug fixes including a bug in MassIndexVisitor
- Implemented CCIVisitor (Commodity Channel Index)
- Implemented gen_triangular_nums()
- Rearranged code
- Implemented EntropyVisitor (information entropy)
- Implemented GarmanKlassVolVisitor (Garman Klass volatility)
- Implemented YangZhangVolVisitor (Yang Zhang volatility)
- Added columns_only flag to read()/write()
- Added non-zero flag option to DiffVisitor
- Implemented KamaVisitor (Kaufman's Adaptive Moving Average)
- Implemented FisherTransVisitor (Fisher transform)
- Implemented SlopeVisitor
- Implemented UltimateOSCIVisitor (Ultimate Oscillator)

- C++
Published by hosseinmoein almost 5 years ago

https://github.com/hosseinmoein/dataframe - Jan-2021

- Fixed bugs, notably:
  - Fixed a bug in MACDVisitor calculation
  - Fixed a bug in vector reverse iterators
  - Fixed DataFrame destructor to work properly in multithreading environments
- Streamlined and simplified code
- Enhanced documentation
- Implemented gen_even_space_nums()
- Generalized read/write to take either file name or stream
- Made columns have a deterministic order. Now you can access columns either by name or index
- With column order, implemented left/right rotating and shifting
- Implemented remove_duplicates() for a single column
- Implemented TTestVisitor
- Implemented MassIndexVisitor
- Implemented WeightedMeanVisitor
- Added in_reverse to visit() methods
- Implemented QuadraticMeanVisitor
- Implemented HullRollingMeanVisitor
- Implemented RollingMidValueVisitor
- Implemented DrawdownVisitor
- Added single_act_visit() for 3 and 4 columns
- Implemented WilliamPrcRVisitor
- Added repeat_count to ExponentialRollAdopter, so we can have multiple smoothing in one call
- Added repeat_count to ExpoSmootherVisitor

- C++
Published by hosseinmoein about 5 years ago

https://github.com/hosseinmoein/dataframe - Nov-2020

- Added weights and residual calculations to PolyFitVisitor
- Implemented LogFitVisitor
- Implemented ExpoSmootherVisitor (exponential smoothing)
- Implemented HWExpoSmootherVisitor (Holt-Winters double exponential smoothing)
- Implemented consolidate()
- Specialized std::hash for DateTime
- Implemented [Max|Min]SubArrayVisitor (sub-intervals with max/min sums)
- Replaced all Max/Min’s with Extremums visitors and typedef’ed Max/Min
- Implemented LowessVisitor (Locally Weighted Scatterplot Smoothing)
- Implemented StepRollAdopter
- Implemented DecomposeVisitor (STL time-series decomposer) with additive and multiplicative options
- Enhanced documentation
- Fixed bugs and compile issues

- C++
Published by hosseinmoein over 5 years ago

https://github.com/hosseinmoein/dataframe - Oct-2020

- Improved single-act-visitor interface to be more flexible
- Bug fixes
- Enhanced documentation
- Implemented gen_log_space_nums() to generate logarithmically-spaced numbers
- Implemented remove_duplicates()
- Enhanced group-by functionality and made it more generalized
- Implemented io_format::csv2 to read/write files in Pandas csv format
- Implemented empty() and shapeless()
- Implemented Box-Cox visitor
- Implemented Normalize and Standardize visitors
- Implemented Hampel filter visitor
- Implemented Polynomial Fit visitor
- Implemented Hurst Exponent visitor

- C++
Published by hosseinmoein over 5 years ago

https://github.com/hosseinmoein/dataframe - August-2020

- Enhanced documentation
- Fixed all VSC++ warnings for 64bit compilation (we don’t compile 32bit anymore)
- Implemented RankVisitor
- Improved random number generation
- Implemented SigmoidVisitor
- Implemented combine() method
- Added nodiscard to some methods
- Implemented RSIVisitor (Relative Strength Index)

- C++
Published by hosseinmoein over 5 years ago

https://github.com/hosseinmoein/dataframe - June-2020

- Enhanced DateTime object
- Implemented get_columns_info method
- Implemented CategoryVisitor visitor
- Implemented FactorizeVisitor visitor
- Implemented pattern_match method
- Changed shrink_to_fit() to optimize for power of 2 cache line aliasing misses
- Implemented ClipVisitor visitor
- Significantly enhanced documentation both in terms of content and format
- Fixed a bug in DateTime object related to time zones
- Implemented SharpeRatioVisitor visitor

- C++
Published by hosseinmoein over 5 years ago

https://github.com/hosseinmoein/dataframe - April-2020

- Added visit_async() and single_act_visit_async() methods
- Enhanced documentation
- Fixed many Codacy complaints
- Fixed a glitch in DateTime related to nanoseconds
- Added Conan package support, thanks to @yssource
- Added get retype_column() method
- Added load_align_column() method
- Brought the whole codebase to 100% compliance with C++ standards (using _GLIBCXX_DEBUG)

- C++
Published by hosseinmoein almost 6 years ago

https://github.com/hosseinmoein/dataframe - March-2020

- Added midpoint to fill_policy
- Added quantile visitor
- Added VWAP (Volume Weighted Average Price) visitor
- Added VWBAS (Volume Weighted Bid-Ask Spread) visitor
- Added concat() and self_concat() methods
- Added get_reindexed() and get_reindexed_view() methods
- Made all get methods (i.e. views) const
- Switched to HTML docs

- C++
Published by hosseinmoein almost 6 years ago

https://github.com/hosseinmoein/dataframe - Feb-2020

- Code reorganization
- Added DataFrame method get_memory_usage()
- Divided visitor source file into Stats, ML, and financial source files
- Added exponential-moving-stats adopter for visitors
- Added Geometric-Mean visitor
- Added Harmonic-Mean visitor
- Added Double-Cross-Over visitor
- Added Bollinger-Band visitor
- Added Moving-Average-Convergence/Divergence visitor
- Added Expanding-Roll-Adopter for visitors
- Added support for Conan file compiling
- Improved multi-threading + added multi-threading to more algos
- Added MADVisitor -- 4 different Mean-Absolute-Deviation visitor logic
- Added Standard-Error-of-the-Mean visitor

- C++
Published by hosseinmoein about 6 years ago

https://github.com/hosseinmoein/dataframe - Jan-2020

- Improved testing
- Enhanced documentation
- Improved DataFrame performance
- Added z-score visitor
- Added protection for multithreading
- Improved visitors performance
- Added k-means visitor
- Brought iterators up to C++17 standards
- Added affinity-propagation visitor
- Improved sorting performance
- Added multi-column sorting + ascending vs. descending
- Added join by column and improved join by index
- Factoring out a lot of code, especially on sort and join
- Improved Mode calculation not to copy data

- C++
Published by hosseinmoein about 6 years ago

https://github.com/hosseinmoein/dataframe - Sept-2019

- Code restructures
- Documentation enhancements
- shape()
- shuffle()
- shrink_to_fit()
- Roll adapters for visitor algorithms
- Better NaN value handling
- Slicing by random selection
- Random number generators
- JSON file format read/write
- Performance/scalability comparison with Pandas was added to README
- Enhanced get_[data|view]_by_loc(), allowing distinct location
- Enhanced get_[data|view]_by_idx(), allowing distinct indices

- C++
Published by hosseinmoein over 6 years ago

https://github.com/hosseinmoein/dataframe - July-2019

- MMap stuff
- Ptr views
- Fixed CMake
- More Visitors
- Index generation
- Slicing and removing by boolean selection
- Other improvements and cleanups

- C++
Published by hosseinmoein over 6 years ago

https://github.com/hosseinmoein/dataframe - June-2019

Added more analytics + more utility + bug fixes

- C++
Published by hosseinmoein over 6 years ago

https://github.com/hosseinmoein/dataframe - May-2019

Added multithreading + more

- C++
Published by hosseinmoein almost 7 years ago

https://github.com/hosseinmoein/dataframe - April-2019 (2)

Added drop missing columns

- C++
Published by hosseinmoein almost 7 years ago

https://github.com/hosseinmoein/dataframe - April-2019

First release

- C++
Published by hosseinmoein almost 7 years ago