Recent Releases of https://github.com/hosseinmoein/dataframe
https://github.com/hosseinmoein/dataframe - August-2025
Improved documentation by creating split windows
Added cheat-sheet to docs
Fixed an edge-case bug in parallel sort
Implemented detect_and_fill()
Implemented detect_and_change()
Implemented KolmoSmirnovTestVisitor visitor
Implemented MannWhitneyUTestVisitor visitor
Implemented mask()
Added many more functionalities to the internal Matrix class
Implemented fast_ica()
Fixed the inconsistency in writing/reading DateTime columns to/from files
Added ability to read only selected columns from files
Implemented MutualInfoVisitor visitor
Added ability to specify a delimiter when writing/reading to/from csv files
Implemented AndersonDarlingTestVisitor visitor
Implemented ShapiroWilkTestVisitor visitor
Implemented CramerVonMisesTestVisitor visitor
Implemented unpivot()
Implemented pivot()
- C++
Published by hosseinmoein 7 months ago
https://github.com/hosseinmoein/dataframe - April-2025
Enhanced documentation
Implemented SpectralClusteringVisitor visitor
Enhanced ThreadPool parallel_loop()
Implemented get_[data|view]_by_spectral()
Implemented determinant() + a bunch of other stuff in Matrix
Implemented canon_corr()
Implemented MC_station_dist()
Implemented SeasonalPeriodVisitor visitor
Improved performance in reading files of different types
Changed read() signature to take a struct for its parameters -- backward incompatible change
Changed write() signature to take a struct for its parameters -- backward incompatible change
Implemented ability to read() csv2 files with user provided schema
Implemented knn()
Implemented DynamicTimeWarpVisitor visitor
Implemented AnomalyDetectByFFTVisitor visitor
Changed the interface of HampelFilterVisitor -- backward incompatible change
Implemented remove_data_by_fft()
Implemented AnomalyDetectByIQRVisitor visitor
Implemented AnomalyDetectByZScoreVisitor visitor
Implemented remove_data_by_iqr()
Implemented remove_data_by_zscore()
Implemented AnomalyDetectByLOFVisitor visitor
Ported to GCC14 compiler and fixed many edge-case bugs
- C++
Published by hosseinmoein 11 months ago
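The April-2025 notes above list z-score based anomaly detection (AnomalyDetectByZScoreVisitor, remove_data_by_zscore()) without describing the idea. As a rough illustration only, the general technique can be sketched in plain standard C++. The helper name `zscore_outliers`, its signature, and the use of the population standard deviation are assumptions made for this sketch; it does not reproduce the library's interface or implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; not part of the DataFrame API.
// Returns the indices of values whose absolute z-score exceeds `threshold`.
std::vector<std::size_t>
zscore_outliers(const std::vector<double> &data, double threshold)
{
    assert(threshold > 0.0);

    std::vector<std::size_t> flagged;
    if (data.size() < 2)
        return flagged;  // Too few points to define a spread

    const double n = static_cast<double>(data.size());

    double sum = 0.0;
    for (const double v : data)
        sum += v;
    const double mean = sum / n;

    double sq_dev = 0.0;
    for (const double v : data)
        sq_dev += (v - mean) * (v - mean);

    // Assumption: population standard deviation (divide by n, not n - 1)
    const double stdev = std::sqrt(sq_dev / n);

    if (stdev == 0.0)
        return flagged;  // All values identical, nothing to flag

    for (std::size_t i = 0; i < data.size(); ++i)
        if (std::fabs((data[i] - mean) / stdev) > threshold)
            flagged.push_back(i);
    return flagged;
}
```

A caller would pass a column's values and a threshold (2 or 3 are common choices) and then drop or inspect the returned indices.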
https://github.com/hosseinmoein/dataframe - Jan-2025
Improved documentation and code quality
Fixed a bug in assign()
Implemented get_[data|view]_by_kmeans()
Changed interface and optimized code in AffinityPropVisitor (backward incompatible change)
Implemented get_[data|view]_by_affin()
Added option to HampelFilterVisitor to populate the indices of the affected data points
Implemented remove_data_by_hampel()
Implemented MeanShiftVisitor visitor
Implemented get_[data|view]_by_dbscan()
Implemented get_[data|view]_by_mshift()
Improved performance in remove_duplicates()
Added FixedSizeString as one of the types that can be read/written from/to files
Added a stable_algo option to covariance and ... visitors to use a numerically stable algo instead of regular algo
Implemented a Matrix class to be used for internal calculations and analysis results
Implemented CrossCorrVisitor visitor
Optimized the implementation of AutoCorrVisitor
Implemented PartialAutoCorrVisitor visitor
Added max_lag parameter to AutoCorrVisitor
Implemented make_stationary()
Implemented StationaryCheckVisitor visitor
Implemented covariance_matrix()
Implemented pca_by_eigen()
Implemented compact_svd()
- C++
Published by hosseinmoein about 1 year ago
https://github.com/hosseinmoein/dataframe - Oct-2024
Improved documentation both visually and content-wise
Changed NLargestVisitor to take N as constructor parameter instead of template parameter (backward incompatible change)
Implemented get_top_n_[data|view]()
Implemented get_bottom_n_[data|view]()
Implemented get_above_quantile_[data|view]()
Implemented get_below_quantile_[data|view]()
Added period parameter to ReturnVisitor visitor
Implemented starts_with()
Implemented ends_with()
Implemented CumCountVisitor visitor
Implemented in_between()
Implemented peaks()
Implemented valleys()
Made reading/writing large files faster
Implemented apply()
Made replace() faster with better algorithm
Implemented truncate()
Implemented a version of load_column() with functor generating data
Implemented explode()
Implemented reading/writing std::pair columns from/to files
Added more sanity checks
Implemented difference()
Implemented get_[data|view]_at_times()
Implemented get_[data|view]_before_times()
Implemented get_[data|view]_after_times()
Implemented get_[data|view]_on_days()
Implemented get_[data|view]_in_months()
Implemented get_[data|view]_on_days_in_month()
Implemented get_[data|view]_between_times()
Implemented remove_top_n_data()
Implemented remove_bottom_n_data()
Implemented remove_above_quantile_data()
Implemented remove_below_quantile_data()
Implemented remove_data_by_stdev()
Implemented get_[data|view]_by_stdev()
- C++
Published by hosseinmoein over 1 year ago
https://github.com/hosseinmoein/dataframe - July-2024
Enhanced documentation and code clean ups
Converted DateTime doc to html
Implemented PeaksAndValleysVisitor visitor
Implemented EhlersHighPassFilterVisitor visitor
Implemented EhlersBandPassFilterVisitor visitor
Implemented reading/writing in binary format
Implemented reading binary data format in chunks
Implemented serialize() and deserialize()
Implemented reading/writing containers in binary format
Added optional time-zone to strings parsed by DateTime constructor
Implemented PowerFitVisitor visitor
Implemented QuadraticFitVisitor visitor
Implemented fill_policy::lagrange_interpolate
Implemented correlation_type::kendall_tau
Implemented change_freq()
Implemented duplication_mask()
- C++
Published by hosseinmoein over 1 year ago
https://github.com/hosseinmoein/dataframe - May-2024
Significantly enhanced documentation both content-wise and visually
Fixed a few edge-case bugs, including an edge-case in reading CSV2 format files
Factored out and cleaned code
Implemented inversion_count()
Implemented get_[data|view]_by_like()
Implemented remove_data_by_like()
Added char and uchar type to types read/written from/to files
Added ability to read/write columns of containers from/to files
remove_column() now requires a template parameter. It actually frees up the memory space now
Implemented clear()
Implemented swap()
Now using some of the std::ranges algorithms
Added scalar arithmetic DF operators
Added magnitude calculations to DotProdVisitor visitor
Added Euclidean distance calculations to DotProdVisitor visitor
Added Manhattan distance calculations to DotProdVisitor visitor
Implemented VectorSimilarityVisitor visitor
Replaced asserts in algos with exceptions and added a compile-time option for it (HMDF_SANITY_EXCEPTIONS)
Partially reengineered views so now you can use most of the API from views
Added sentinels to vector views iterators
- C++
Published by hosseinmoein almost 2 years ago
https://github.com/hosseinmoein/dataframe - Feb-2024
Multithreading was completely redesigned by using a versatile thread-pool. Almost every API has a multithreaded version that kicks in for large datasets. This justifies increasing the major version number
Added a thread-pool. All Async calls now use the thread-pool
Sort now uses parallel sort for large datasets
Added multithreading to almost all algorithms
Enhanced docs and hello world
- C++
Published by hosseinmoein about 2 years ago
https://github.com/hosseinmoein/dataframe - Dec-2023
This release requires C++23 or higher
Added more content to documentation
Made reading/writing files more streamlined and efficient
Fixed a bug in Median and Kth_element visitors related to handling nans
Added ability to read/write String Vectors, Double Sets, and String Sets as column elements in CSV2 format
Added seed option to all algorithms that use random numbers
Implemented PriceVolumeTrendVisitor visitor
Implemented QuantQualEstimationVisitor visitor
Fixed RSIVisitor visitor result to be the same size as its input
Implemented get_str_col_stats()
Added get_euclidean_norm() to QuadraticMeanVisitor visitor
Added different normalization-types to NormalizeVisitor visitor
Added more benchmarking comparing with Pandas and Polars
Made sorting much faster by using ranges and zip
- C++
Published by hosseinmoein about 2 years ago
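The Dec-2023 notes above mention a seed option for algorithms that use random numbers. The point of such an option is reproducibility: the same seed yields the same pseudo-random sequence on a given standard-library implementation. A minimal sketch using only the C++ standard library follows; `gen_uniform` is a hypothetical name chosen for this example and is not the library's generator interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper for illustration only; not part of the DataFrame API.
// Generates n uniformly distributed doubles in [0, 1) from an explicit seed.
std::vector<double> gen_uniform(std::size_t n, std::uint64_t seed)
{
    std::mt19937_64 engine(seed);  // Engine state is fully determined by the seed
    std::uniform_real_distribution<double> dist(0.0, 1.0);

    std::vector<double> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        out.push_back(dist(engine));
    return out;
}
```

Calling `gen_uniform(5, 42)` twice in the same program returns identical vectors, which is what makes seeded runs repeatable in tests and benchmarks.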
https://github.com/hosseinmoein/dataframe - Oct-2023
Added more content to documentation and Hello World example
Fixed a bug in join that missed multiple matches in some edge cases
Fixed a bug/edge case in Covariance calculation
Fixed a bug in reading JSON files
Utilized meta-programming in several parts of the codebase, especially visitors
Added a whole lot of C++ concepts throughout the code
Fixed many const-correctness issues throughout the code
Added a mechanism for a lot of visitors to be used in groupby and bucketize
Now in csv2 format you can read/write columns of vector, map, and unordered map types
Enhanced DateTime ISO format parsing
- C++
Published by hosseinmoein over 2 years ago
https://github.com/hosseinmoein/dataframe - June-2023
Added more content to documentation and README
Cleaned up and streamlined codebase (using a lot of typedef’s, using STL algorithms)
Fixed minor bugs (got rid of template exports, ChandeKrollStopVisitor{}, got rid of boundary issues)
Added more operators to DateTime class
Implemented InertiaVisitor{} visitor
Implemented SymmTriangleMovingMeanVisitor{} visitor
Implemented RelativeVigorIndexVisitor{} visitor
Implemented ElderRayIndexVisitor{} visitor
Implemented ChopIndexVisitor{} visitor
Implemented DetrendPriceOsciVisitor{} visitor
Implemented RectifiedLinearUnitVisitor{} visitor
Implemented AccelerationBandsVisitor{} visitor
Implemented PriceDistanceVisitor{} visitor
Implemented EldersThermometerVisitor{} visitor
Implemented ProbabilityDistVisitor{} visitor
Changed ReLU Rectifier visitor to a generalized RectifyVisitor{} with many options
Implemented PolicyLearningLossVisitor{} visitor
Implemented load_result_as_column() that runs and loads the result in one-shot
Implemented LossFunctionVisitor{} visitor
Implemented EldersForceIndexVisitor{} visitor
Implemented EaseOfMovementVisitor{} visitor
Added [[likely]] to branches and now requiring C++20
Implemented SeqLock{} synchronization
- C++
Published by hosseinmoein over 2 years ago
https://github.com/hosseinmoein/dataframe - April-2023
Implemented ability to have custom memory-boundary allocation to take advantage of SIMD instructions. This breaks backward-compatibility especially for views. Also, this justifies increasing the major version number
Bug fixes and code cleanup
Enhanced hello_world.cc and docs
Made multi-threading faster by streamlining locks
Made file I/O faster and more efficient
Implemented get_data_by_sel() for 11, 12, and 13 columns
Implemented CubicSplineFit visitor
Implemented ImpurityVisitor visitor
Added ignore_index option to sort functions to make index sorting optional
Implemented ExponentiallyWeightedVarVisitor visitor
Removed ExponentialRollAdopter
Implemented ExponentiallyWeightedCovVisitor visitor
Implemented ExponentiallyWeightedCorrVisitor visitor
Implemented ability to read files in chunks
Added arithmetic operators to DateTime
Changed KMeans to always calculate clusters -- interface change
Changed all std::is_arithmetic to supports_arithmetic
Implemented FixedAutoCorrVisitor visitor
Added option to SharpeRatioVisitor to also calculate Sortino Ratio
Implemented RVIVisitor visitor
Fixed moving averages to work with nan values in the beginning
Added options to sort to sort based on absolute values
Implemented LinregMovingMeanVisitor visitor
- C++
Published by hosseinmoein almost 3 years ago
https://github.com/hosseinmoein/dataframe - Jan-2023
Please consider sponsoring DataFrame, especially if you are using it in a production system. It is the strongest form of support
Bug fixes, including get_[data|view]_by_rand(), docs
Documentation and Hello World enhancements
More conformance with ISO C++
Enhanced DateTime ISO parsing
Implemented T3MovingMeanVisitor
Implemented append_row()
Added max_recs parameter to write() + fixed compiler debug warnings
Implemented const views
Implemented load_result_as_column()
Implemented get_indicators() and from_indicators()
Implemented TreynorRatioVisitor
Implemented ExponentialFitVisitor
Implemented LinearFitVisitor
- C++
Published by hosseinmoein about 3 years ago
https://github.com/hosseinmoein/dataframe - July-2022
Enhanced documentation + added more to hello_world.cc
Fixed compiling issue related to injecting clock_gettime() system call into code
Code clean ups and performance enhancements
Fixed a bug in KthValueVisitor that affected MedianVisitor and QuantileVisitor
Implemented BalanceOfPowerVisitor visitor
Implemented NonZeroRangeVisitor visitor
Implemented ChandeKrollStopVisitor visitor
Added average option to TrueRange
Implemented VotexVisitor visitor
Implemented KeltnerChannelsVisitor visitor
Added normalize option to TrueRange + now using exponential moving avg in TrueRange
Implemented TrixVisitor visitor
Implemented PrettyGoodOsciVisitor visitor
Implemented ZeroLagMovingMeanVisitor visitor
Implemented StableMeanVisitor visitor
Added double operator to DateTime
Implemented describe()
- C++
Published by hosseinmoein over 3 years ago
https://github.com/hosseinmoein/dataframe - Feb-2022
Enhanced documentation
Fixed a few bugs; visitors macros, get_[data|view]_by_loc(), get_view_by_idx()
Fixed all views not to be const, since you can change things through views
CMake compiling was redone to make it easier for Windows
Windows macros were redesigned to make compiling easier
Turned extra warning flag on and fixed all compiler warnings
Made hello_world.cc more comprehensive
Implemented get_[data|view]()
Replaced std::array, in all interfaces, with std::vector
Implemented get_[data|view]_by_sel() for up to 5 columns
Implemented to_string() and from_string()
Implemented Coppock Curve Visitor
Implemented Bias Visitor
- C++
Published by hosseinmoein about 4 years ago
https://github.com/hosseinmoein/dataframe - Oct-2021
Fixed bugs, including in multithreading, groupby + Removed all warnings
Improved documentation
Improved/tightened multithreading + SpinLock is now recursive
Implemented RateOfChangeVisitor
Implemented AccumDistVisitor
Added single_act_visit() for 5 columns
Implemented ChaikinMoneyFlowVisitor
Implemented VertHorizFilterVisitor
Implemented OnBalanceVolumeVisitor
Implemented TrueRangeVisitor
Changed the ReturnVisitor logic to start with a NaN and have the same length as input
Implemented DecayVisitor
Implemented HodgesTompkinsVolVisitor
Implemented ParkinsonVolVisitor
Implemented concat_view()
Added hello_world.cc
Implemented get_row() that always includes all columns
- C++
Published by hosseinmoein over 4 years ago
https://github.com/hosseinmoein/dataframe - July-2021
Fixed minor bugs (MACD, MassIndex, PercentPriceOSCI) and streamlined code
Improved documentation
Implemented gen_sym_triangle() -- generate symmetric triangular numbers
Implemented RSXVisitor -- "noise free" version of RSI, with no added lag
Implemented TTMTrendVisitor -- Trade To Market trend indicator
Implemented ParabolicSARVisitorVisitor -- Parabolic Stop And Reverse (PSAR)
Implemented EBSineWaveVisitor -- Even Better Sine Wave (EBSW) indicator
Implemented EhlerSuperSmootherVisitor -- Ehler's Super Smoother Filter (SSF) indicator
Implemented VarIdxDynAvgVisitor -- Variable Index Dynamic Average (VIDYA) indicator
Implemented AbsVisitor -- Absolute value visitor
Implemented PivotPointSRVisitor -- Pivot Points, Supports and Resistances indicators
Added mean_type + rearranged mean calculations
Implemented ExponentiallyWeightedMeanVisitor -- Exponentially weighted moving average
Added abbreviated type aliases for visitors with long names
Implemented AvgDirMovIdxVisitor -- Average Directional Movement Index (ADX)
Added more performance tests
Implemented fill_missing() with another DataFrame
Implemented HoltWinterChannelVisitor -- Holt-Winter Channel (HWC) indicator
Implemented HeikinAshiCndlVisitor -- Heikin Ashi candle
Implemented FastFourierTransVisitor -- Fast Fourier transform and its inverse
Implemented CenterOfGravityVisitor -- Also called Stochastic Oscillator
Implemented ArnaudLegouxMAVisitor -- Arnaud Legoux Moving Average
- C++
Published by hosseinmoein over 4 years ago
https://github.com/hosseinmoein/dataframe - May-2021
Bug fixes, including in DateTime, is_equal(), write()
Rearranged code to make it easier to compile as DLL
Significantly improved docs both in terms of format and content
Implemented Percent Price Oscillator visitor
Added consolidated() for 4 and 5 columns
Added another version of shift() to return a copy of a single shifted column
Implemented Ulcer Index visitor
Completely redesigned groupby() interface and made it significantly more versatile
Completely redesigned bucketize() interface and made it significantly more versatile
Implemented Count visitor
Added ISO date format to DateTime
Implemented ability to read()/write() DateTime as strings into streams
Fixed a few move semantics to improve memory efficiency
- C++
Published by hosseinmoein almost 5 years ago
https://github.com/hosseinmoein/dataframe - March-2021
Improved documentation
Minor bug fixes including a bug in MassIndexVisitor
Implemented CCIVisitor (Commodity Channel Index)
Implemented gen_triangular_nums()
Rearranged code
Implemented EntropyVisitor (information entropy)
Implemented GarmanKlassVolVisitor (Garman Klass volatility)
Implemented YangZhangVolVisitor (Yang Zhang volatility)
Added columns_only flag to read()/write()
Added non-zero flag option to DiffVisitor
Implemented KamaVisitor (Kaufman's Adaptive Moving Average)
Implemented FisherTransVisitor (Fisher transform)
Implemented SlopeVisitor
Implemented UltimateOSCIVisitor (Ultimate Oscillator)
- C++
Published by hosseinmoein almost 5 years ago
https://github.com/hosseinmoein/dataframe - Jan-2021
Fixed bugs, notably:
- Fixed a bug in MACDVisitor calculation
- Fixed a bug in vector reverse iterators
- Fixed DataFrame destructor to work properly in multithreading environments
Streamlined and simplified code
Enhanced documentation
Implemented gen_even_space_nums()
Generalized read/write to take either file name or stream
Made columns have deterministic order. Now you can access columns either by name or index
With column order, implemented left/right rotating and shifting
Implemented remove_duplicates() for a single column
Implemented TTestVisitor
Implemented MassIndexVisitor
Implemented WeightedMeanVisitor
Added in_reverse to visit() methods
Implemented QuadraticMeanVisitor
Implemented HullRollingMeanVisitor
Implemented RollingMidValueVisitor
Implemented DrawdownVisitor
Added single_act_visit() for 3 and 4 columns
Implemented WilliamPrcRVisitor
Added repeat_count to ExponentialRollAdopter, so we can have multiple smoothing in one call
Added repeat_count to ExpoSmootherVisitor
- C++
Published by hosseinmoein about 5 years ago
https://github.com/hosseinmoein/dataframe - Nov-2020
Added weights and residual calculations to PolyFitVisitor
Implemented LogFitVisitor
Implemented ExpoSmootherVisitor (exponential smoothing)
Implemented HWExpoSmootherVisitor (Holt-Winters double exponential smoothing)
Implemented consolidate()
Specialized std::hash for DateTime
Implemented [Max|Min]SubArrayVisitor (sub-intervals with max/min sums)
Replaced all Max/Min’s with Extremums visitors and typedef’ed Max/Min
Implemented LowessVisitor (Locally Weighted Scatterplot Smoothing)
Implemented StepRollAdopter
Implemented DecomposeVisitor (STL time-series decomposer) with additive and multiplicative options
Enhanced documentation
Fixed bugs and compile issues
- C++
Published by hosseinmoein over 5 years ago
https://github.com/hosseinmoein/dataframe - Oct-2020
Improved single-act-visitor interface to be more flexible
Bug fixes
Enhanced documentation
Implemented gen_log_space_nums() to generate logarithmically-spaced numbers
Implemented remove_duplicates()
Enhanced group-by functionality and made it more generalized
Implemented io_format::csv2 to read/write files in Pandas csv format
Implemented empty() and shapeless()
Implemented Box-Cox visitor
Implemented Normalize and Standardize visitors
Implemented Hampel filter visitor
Implemented Polynomial Fit visitor
Implemented Hurst Exponent visitor
- C++
Published by hosseinmoein over 5 years ago
https://github.com/hosseinmoein/dataframe - August-2020
Enhanced documentation
Fixed all VSC++ warnings for 64bit compilation (we don’t compile 32bit anymore)
Implemented RankVisitor
Improved random number generation
Implemented SigmoidVisitor
Implemented combine() method
Added nodiscard to some methods
Implemented RSIVisitor (Relative Strength Index)
- C++
Published by hosseinmoein over 5 years ago
https://github.com/hosseinmoein/dataframe - June-2020
Enhanced DateTime object
Implemented get_columns_info method
Implemented CategoryVisitor visitor
Implemented FactorizeVisitor visitor
Implemented pattern_match method
Changed shrink_to_fit() to optimize for power of 2 cache line aliasing misses
Implemented ClipVisitor visitor
Significantly enhanced documentation both in terms of content and format
Fixed a bug in DateTime object related to time zones
Implemented SharpeRatioVisitor visitor
- C++
Published by hosseinmoein over 5 years ago
https://github.com/hosseinmoein/dataframe - April-2020
Added visit_async() and single_act_visit_async() methods
Enhanced documentation
Fixed many Codacy complaints
Fixed a glitch in DateTime related to nanoseconds
Added Conan package support, thanks to @yssource
Added get retypecolumn() method
Added load_align_column() method
Brought the whole codebase to 100% compliance with C++ standards (using _GLIBCXX_DEBUG)
- C++
Published by hosseinmoein almost 6 years ago
https://github.com/hosseinmoein/dataframe - March-2020
Added midpoint to fill_policy
Added quantile visitor
Added VWAP (Volume Weighted Average Price) visitor
Added VWBAS (Volume Weighted Bid-Ask Spread) visitor
Added concat() and self_concat() methods
Added get_reindexed() and get_reindexed_view() methods
Made all get methods (i.e. views) const
Switched to HTML docs
- C++
Published by hosseinmoein almost 6 years ago
https://github.com/hosseinmoein/dataframe - Feb-2020
Code reorganization
Added DataFrame method get_memory_usage()
Divided visitor source file into Stats, ML, and financial source files
Added exponential-moving-stats adopter for visitors
Added Geometric-Mean visitor
Added Harmonic-Mean visitor
Added Double-Cross-Over visitor
Added Bollinger-Band visitor
Added Moving-Average-Convergence/Divergence visitor
Added Expanding-Roll-Adopter for visitors
Added support for Conan file compiling
Improved multi-threading + added multi-threading to more algos
Added MADVisitor -- 4 different Mean-Absolute-Deviation visitor logic
Added Standard-Error-of-the-Mean visitor
- C++
Published by hosseinmoein about 6 years ago
https://github.com/hosseinmoein/dataframe - Jan-2020
Improved testing
Enhanced documentation
Improved DataFrame performance
Added z-score visitor
Added protection for multithreading
Improved visitors performance
Added k-means visitor
Brought iterators up to C++17 standards
Added affinity-propagation visitor
Improved sorting performance
Added multi-column sorting + ascending vs. descending
Added join by column and improved join by index
Factoring out a lot of code, especially on sort and join
Improved Mode calculation not to copy data
- C++
Published by hosseinmoein about 6 years ago
https://github.com/hosseinmoein/dataframe - Sept-2019
Code restructures
Documentation enhancements
shape()
shuffle()
shrink_to_fit()
Roll adapters for visitor algorithms
Better NaN value handling
Slicing by random selection
Random number generators
JSON files format read/write
Performance/scalability comparison with Pandas was added to README
Enhanced get_[data|view]_by_loc(), allowing distinct location
Enhanced get_[data|view]_by_idx(), allowing distinct indices
- C++
Published by hosseinmoein over 6 years ago
https://github.com/hosseinmoein/dataframe - July-2019
MMap stuff
Ptr views
Fixed CMake
More Visitors
Index generation
Slicing and removing by boolean selection
Other improvements and cleanups
- C++
Published by hosseinmoein over 6 years ago
https://github.com/hosseinmoein/dataframe - June-2019
Added more analytics + more utility + bug fixes
- C++
Published by hosseinmoein over 6 years ago
https://github.com/hosseinmoein/dataframe - May-2019
Added multithreading + more
- C++
Published by hosseinmoein almost 7 years ago
https://github.com/hosseinmoein/dataframe - April-2019 (2)
Added drop missing columns
- C++
Published by hosseinmoein almost 7 years ago
https://github.com/hosseinmoein/dataframe - April-2019
First release
- C++
Published by hosseinmoein almost 7 years ago