Recent Releases of diive

diive - v0.89.0

v0.89.0 | 23 Jul 2025

Version 0.89.0 introduces a new GridAggregator class for 2D data aggregation with support for quantile, equal-width, and custom binning methods, along with comprehensive documentation improvements and major dependency updates including shapiq integration for enhanced analysis capabilities.

See the notebook for example usage.

Added

  • New GridAggregator class for 2D grid data aggregation (diive/pkgs/analyses/gridaggregator.py)
    • Supports quantile, equal-width, and custom binning methods
    • Flexible aggregation functions
    • Comprehensive input validation and error handling
    • Added unit tests covering core functionality
    • Added example notebook:
      • notebooks/Examples/GridAggregator.ipynb - demonstrates 2D data aggregation and binning
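
As a rough illustration of what 2D grid aggregation with quantile or equal-width binning involves, here is a minimal sketch in plain pandas. It does not use the GridAggregator class: the function grid_aggregate and all of its parameters are hypothetical names chosen for this example, and the actual class may work differently.

```python
import numpy as np
import pandas as pd


def grid_aggregate(df, x, y, z, n_bins=5, binning="quantile", aggfunc="mean"):
    """Aggregate z in a 2D grid spanned by bins of x and y (illustrative only)."""
    if binning == "quantile":
        # Quantile bins: each bin holds roughly the same number of records
        xbins = pd.qcut(df[x], q=n_bins, duplicates="drop")
        ybins = pd.qcut(df[y], q=n_bins, duplicates="drop")
    elif binning == "equal_width":
        # Equal-width bins: each bin spans the same value range
        xbins = pd.cut(df[x], bins=n_bins)
        ybins = pd.cut(df[y], bins=n_bins)
    else:
        raise ValueError(f"Unknown binning method: {binning}")
    # One row per y-bin, one column per x-bin
    return df.groupby([ybins, xbins], observed=False)[z].agg(aggfunc).unstack()


# Synthetic example data (not real measurements)
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "TA": rng.normal(10, 8, 1000),     # air temperature
    "VPD": rng.gamma(2.0, 3.0, 1000),  # vapor pressure deficit
    "NEE": rng.normal(-2, 5, 1000),    # variable to aggregate
})
grid = grid_aggregate(df, x="TA", y="VPD", z="NEE", n_bins=4, binning="quantile")
```

The result is a 4 x 4 matrix of mean NEE, with VPD classes as rows and TA classes as columns. Custom binning would follow the same pattern by passing explicit bin edges to pd.cut.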

Enhanced

  • Improved documentation across modules
    • Added detailed docstrings for methods and classes
    • Updated example notebooks for better clarity
    • Streamlined notebook structure in Overview

Dependencies

  • Updated multiple Python dependencies to their latest versions
  • Added new dependencies:
    • shapiq (>=1.3.1,<2.0.0)
    • galois
    • networkx
    • sparse-transform

Unittests

  • Added unittests for dv.heatmap_xyz
  • 66/66 unittests ran successfully

What's Changed

  • Indev by @holukas in https://github.com/holukas/diive/pull/327
  • Indev by @holukas in https://github.com/holukas/diive/pull/332

Full Changelog: https://github.com/holukas/diive/compare/v0.87.1...v0.89.0

- Python
Published by holukas 7 months ago

diive - v0.88.0

v0.88.0 | 18 Jul 2025

plotHeatmapYearMonthMaxTA_diive_v0.88.0

Heatmaps can now be plotted in horizontal orientation by setting the parameter ax_orientation='horizontal'. This example plot shows the monthly maximum air temperature.

Changes

Heatmap updates

  • There are several improvements for heatmap visualizations:
    • More consistent heatmap creation: The .heatmapdatetime(), .heatmapyearmonth() and .heatmapxyz() functions now offer a more unified experience for generating heatmaps.
    • Flexible orientation: heatmaps can now be displayed vertically or horizontally using the new parameter ax_orientation.
    • The rank plot introduced in the previous version can now be created using the parameter ranks=True when using .heatmapyearmonth().

For reference: .heatmapdatetime() is an alias for the class diive.core.plotting.heatmap_datetime.HeatmapDateTime, .heatmapyearmonth() for diive.core.plotting.heatmap_datetime.HeatmapYearMonth, and .heatmapxyz() for diive.core.plotting.heatmap_xyz.HeatmapXYZ. All of these classes use diive.core.plotting.heatmap_base.HeatmapBase or diive.core.plotting.heatmap_base.HeatmapBaseXYZ as the base class for their core functionality.

Notebooks

  • Updated notebook for QuantileGridAggregator (formerly CalculateZaggregatesInQuantileClassesOfXY)
  • Updated notebook for HeatmapDateTime
  • Updated notebook for HeatmapYearMonth

Unittests

  • Updated test case for tests.test_analyses.TestAnalyses.test_quantilegridaggregator
  • 56/56 unittests ran successfully

What's Changed

  • Indev by @holukas in https://github.com/holukas/diive/pull/315

Full Changelog: https://github.com/holukas/diive/compare/v0.87.0...v0.88.0

- Python
Published by holukas 7 months ago

diive - v0.87.1

v0.87.1 | 12 Jun 2025

New features

  • Added new function .set_exact_values_to_missing() to set specific values in a time series to missing values ( diive.pkgs.corrections.setto_missing.set_exact_values_to_missing)
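
To illustrate the idea behind setting exact values to missing, here is a minimal pandas sketch. It is not the diive implementation: the helper set_values_to_missing and its parameters are hypothetical, and the signature of .set_exact_values_to_missing() may differ.

```python
import pandas as pd


def set_values_to_missing(series, values):
    """Return a copy of series where records equal to any of `values` are NaN."""
    return series.mask(series.isin(values))


ts = pd.Series(
    [12.1, -9999.0, 13.4, 0.0, 14.2],
    index=pd.date_range("2025-06-01", periods=5, freq="30min"),
    name="TA",
)
# Treat the placeholder -9999 and exact zeros as missing
cleaned = set_values_to_missing(ts, values=[-9999.0, 0.0])
```

Only records that match one of the listed values exactly are replaced, all other records are left untouched.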

Additions

  • Added parameters when plotting diel cycles:
    • Added parameter show_xticklabels for showing x-ticklabels
    • Added parameter show_xlabel for showing x-label
    • Added parameter show_legend for showing legend
    • (diive.core.plotting.dielcycle.DielCycle.plot)
  • Similarly, added more params for plotting cumulatives (diive.core.plotting.cumulative.Cumulative)

Changes

  • In .quickplot(), other rows now use the same scaling for x-axis as the plot in the first row ( diive.core.plotting.plotfuncs.quickplot)
  • Scaling of the y-axis is now slightly extended (by 5%) when plotting cumulatives ( diive.core.plotting.cumulative.Cumulative)

Notebooks

  • Updated StepwiseMeteoScreeningFromDatabase.ipynb, added new correction .set_exact_values_to_missing()

Unittests

  • Added test case for .set_exact_values_to_missing() (tests.test_corrections.TestCorrections.test_settomissing)
  • 56/56 unittests ran successfully

What's Changed

  • Indev by @holukas in https://github.com/holukas/diive/pull/315

Full Changelog: https://github.com/holukas/diive/compare/v0.87.0...v0.87.1

- Python
Published by holukas 8 months ago

diive - v0.87.0

v0.87.0 | 17 May 2025

Heatmap rank plot

diive can now create heatmap rank plots.

plotHeatmapYearMonthRank_diive_v0.87.0.png

Example heatmap rank plot for air temperatures. This heatmap displays the rank of average monthly air temperatures compared across different years. For instance, May 2022 had the highest average temperature among all Mays on record (rank 1), as did October 2022 for Octobers. Conversely, January 2019 recorded the lowest average temperature for January within the 26-year period shown.

Heatmap rank plots display the relative ranking of monthly aggregated values across multiple years. Essentially, it shows how each month's overall value compares to the same month in other years. By default, the plot ranks the monthly mean (average) of the selected variable.

Other aggregation methods commonly used in the pandas library are possible, such as median, min, max and std, among others.

Basic example:

    import diive as dv
    hm = dv.heatmapyearmonth_ranks(series=series)  # Initialize instance, series is a pandas Series
    hm.plot()  # Generate basic plot

See the notebook here for more examples: notebooks/Plotting/HeatmapYearMonthRank.ipynb
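
The ranking behind such a plot can be sketched with plain pandas. This is only an illustration of the concept on synthetic data, not the code used by diive: it builds a year-by-month matrix of monthly means and then ranks each month across years, with rank 1 assigned to the highest value as in the air temperature example above.

```python
import numpy as np
import pandas as pd

# Synthetic daily series covering four full years (not real measurements)
idx = pd.date_range("2020-01-01", "2023-12-31", freq="D")
rng = np.random.default_rng(0)
series = pd.Series(rng.normal(10, 5, len(idx)), index=idx, name="TA")

# Matrix of monthly means: one row per year, one column per month
matrix = series.groupby([series.index.year, series.index.month]).mean().unstack()
matrix.index.name, matrix.columns.name = "year", "month"

# Rank each month across years, rank 1 = highest monthly mean of that month
ranks = matrix.rank(axis=0, ascending=False)
```

Replacing .mean() with .median(), .min(), .max() or .std() gives the ranks for other aggregation methods.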

New features

  • Added new class .heatmapyearmonth_ranks() to plot monthly ranks of an aggregated value across years ( diive.core.plotting.heatmap_datetime.HeatmapYearMonthRanks)
  • Added new function .resample_to_monthly_agg_matrix() to calculate a matrix of monthly aggregates across years ( diive.core.times.resampling.resample_to_monthly_agg_matrix)
  • Added new function .transform_yearmonth_matrix_to_longform() to convert monthly aggregation matrix to long-form time series (diive.core.dfun.frames.transform_yearmonth_matrix_to_longform)
  • Added new function to calculate ET (evapotranspiration in mm h-1) from LE (latent heat flux in W m-2). ( diive.pkgs.createvar.conversions.et_from_le)
  • Added new function to calculate latent heat of vaporization. Originally needed for calculating ET from LE. ( diive.pkgs.createvar.conversions.latent_heat_of_vaporization)
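
The conversion from LE to ET can be sketched as follows. This is an illustration under stated assumptions, not the diive code: the function names are hypothetical, and the latent heat of vaporization uses a common linear approximation as a function of air temperature, which may differ from the formulation used in diive.pkgs.createvar.conversions.

```python
def latent_heat_of_vaporization_sketch(ta_degc):
    """Latent heat of vaporization (J kg-1) from air temperature (deg C).

    Assumption: common linear approximation (2.501 - 0.00237 * TA) * 1e6.
    """
    return (2.501 - 0.00237 * ta_degc) * 1e6


def et_from_le_sketch(le_wm2, ta_degc):
    """Convert latent heat flux LE (W m-2) to evapotranspiration (mm h-1)."""
    lam = latent_heat_of_vaporization_sketch(ta_degc)  # J kg-1
    et_kg_m2_s = le_wm2 / lam  # kg m-2 s-1; 1 kg of water per m2 equals 1 mm
    return et_kg_m2_s * 3600   # seconds per hour, result in mm h-1


et = et_from_le_sketch(le_wm2=200.0, ta_degc=20.0)  # roughly 0.29 mm h-1
```

The unit conversion relies on W m-2 being J s-1 m-2, so dividing by J kg-1 yields a water mass flux in kg m-2 s-1.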

Additions

  • Heatmap plotting:
    • Heatmaps can now show the z-value for each rectangle in the plot, using the parameters show_values and show_values_n_dec_places. This makes more sense for data that are plotted month vs. year than for e.g. half-hourly data.
    • Simplified API to call heatmap plots: after import diive as dv, the heatmaps can now be called via dv.heatmapyearmonth() and dv.heatmapdatetime().
  • SortingBinsMethod:
    • The counts per bin are now also part of the bin stats
    • Sometimes the required number of bins cannot be generated. In this case, the stats for the respective bin are skipped and the bin is missing from the output (.calcbins)
    • All parameters were renamed to better reflect what is going on
    • (diive.pkgs.analyses.decoupling.SortingBinsMethod)
    • Added agg parameter to define aggregation method used in binning the data
    • Renamed and reworked conversion parameter, now allows conversion to z-scores in addition to percentiles
  • Added new filetype FLUXNET-FULLSET-HR-CSV-60MIN for reading FLUXNET files with 60MIN time resolution

Notebooks

  • Added new notebook for calculating a monthly aggregation matrix (notebooks/Resampling/ResamplingMonthlyMatrix.ipynb)
  • Updated notebook HeatmapDateTime
  • Updated notebook HeatmapYearMonth
  • Changed name of notebook ridgeline to camel-case RidgeLine

Unittests

  • Added test case for .et_from_le() (tests.test_createvar.TestCreateVar.test_conversion_et_from_le)
  • Added test case for .resample_to_monthly_agg_matrix(), this test also includes the transformation to long-form time series using .transform_yearmonth_matrix_to_longform() ( tests.test_resampling.TestResampling.test_resample_to_monthly_agg_matrix)
  • 55/55 unittests ran successfully

Environment

  • diive is now using Python version 3.11 upwards
  • Updated environment, poetry pyproject.toml file now has the currently used structure

What's Changed

  • Et from le by @holukas in https://github.com/holukas/diive/pull/306
  • Heatmap rank plot by @holukas in https://github.com/holukas/diive/pull/307

Full Changelog: https://github.com/holukas/diive/compare/v0.86.0...v0.87.0

- Python
Published by holukas 9 months ago

diive - v0.86.0

v0.86.0 | 20 Mar 2025

New features

Ridgeline plot

diive can now create ridgeline plots.

plotRidgeLinePlot_diive_v0.86.0.png

The ridgeline plot visualizes the distribution of a quantitative variable by stacking overlapping density plots, creating a "ridged" landscape. I think this is quite pleasing to look at. With the implementation in diive, it facilitates the comparison of distributional shapes and changes of time series data across weeks, months and years. Ridgeline plots are quite space-efficient and hopefully visually intuitive for revealing patterns and trends in data.

This is also the first function that uses a simplified API. After importing diive, the plot can simply be accessed via .ridgeline(). This is a shortcut to access the class RidgeLinePlot that is otherwise deeply buried in the code here: diive.core.plotting.ridgeline.RidgeLinePlot. In the future, other classes and functions will also be accessible via similar shortforms.

Basic example:

    import diive as dv
    rp = dv.ridgeline(series=series)  # Initialize instance, series is a pandas Series
    rp.plot()  # Generate basic plot

See the notebook here for more examples: notebooks/Plotting/ridgeline.ipynb

Additions

  • Additions to the flux processing chain:
    • Added two methods to get details about training and testing when using machine-learning models in the flux processing chain: .report_traintest_model_scores() and .report_traintest_details()
    • Added parameter setflag_timeperiod to set the flag for the SSITC to another value during certain time periods, for example when a time period needs stricter filtering (e.g. due to issues with the sonic anemometer). In this case the parameter can be used to set all values where flag=1 (medium quality data) to flag=2 (bad data).
      • Example from docstring: Set flag 1 to value 2 between '2022-05-01' and '2023-09-30', and between '2024-04-02' and '2024-04-19' (dates inclusive): setflag_timeperiod={2: [ [1, '2022-05-01', '2023-09-30'], [1, '2024-04-02', '2024-04-19'] ]} (diive.pkgs.qaqc.eddyproflags.flag_ssitc_eddypro_test)
    • Added params to export some gap-filling results (e.g. model scores) to csv files (e.g., .report_gapfilling_model_scores(outpath=...))
    • (diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain)
  • Added check if time series has a name when plotting heatmaps. If time series does not have a name, it is automatically assigned the name data. Implemented in class HeatmapBase that is used by all heatmap plotters. ( diive.core.plotting.heatmap_base.HeatmapBase)
  • Added new filetype for 60MIN EddyPro output (diive/configs/filetypes/EDDYPRO-FLUXNET-CSV-60MIN.yml)

Notebooks

  • Added notebook for ridgeline plot (notebooks/Plotting/ridgeline.ipynb)

Bugfixes

  • Fixed bug where the flux processing chain would crash when a variable with the same name as one of the automatically generated variables was already present in the input data. For example, the potential radiation SW_IN_POT is generated when the flux processing chain starts and then it is added also to the input data. If the input data already has a variable with the same name, the processing chain would crash. Now, the automatically generated SW_IN_POT is given priority, which means the variable in the input data is overwritten. ( diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain)

Environment

  • Updated packages

Unittests

  • 53/53 unittests ran successfully

What's Changed

  • Ridgeline plot by @holukas in https://github.com/holukas/diive/pull/291

Full Changelog: https://github.com/holukas/diive/compare/v0.85.7...v0.86.0

- Python
Published by holukas 11 months ago

diive - v0.85.7

v0.85.7 | 26 Feb 2025

New features

  • Added class for formatting meteo data for upload to FLUXNET (diive.pkgs.formats.meteo.FormatMeteoForFluxnetUpload)

Notebooks

  • Added new notebook notebooks/Formats/FormatMeteoForFluxnetUpload.ipynb

What's Changed

  • Indev by @holukas in https://github.com/holukas/diive/pull/282

Full Changelog: https://github.com/holukas/diive/compare/v0.85.6...v0.85.7

- Python
Published by holukas 12 months ago

diive - v0.85.6

v0.85.6 | 25 Feb 2025

New features

  • Added class to format meteo data as input file for EddyPro flux calcs ( diive.pkgs.formats.meteo.FormatMeteoForEddyProFluxProcessing)

Changes

  • Updated formatting for FLUXNET upload (diive.pkgs.formats.fluxnet.FormatEddyProFluxnetFileForUpload)
  • HeatmapYearMonth plot now shows every year on y-axis (diive.core.plotting.heatmap_datetime.HeatmapYearMonth)
  • Improved check for excluded columns when creating lagged variants ( diive.pkgs.createvar.laggedvariants.lagged_variants)
  • More text output when reducing features (diive.core.ml.common.MlRegressorGapFillingBase.reduce_features)
  • Fixed colorwheel running out of colors when plotting feature ranks ( diive.pkgs.gapfilling.longterm.LongTermGapFillingBase.showplot_feature_ranks_per_year)
  • Less text output when filling storage term ( diive.pkgs.fluxprocessingchain.level31_storagecorrection.FluxStorageCorrectionSinglePointEddyPro._gapfill_storage_term)
  • Smaller fixes

Notebooks

  • Added new notebook notebooks/Formats/FormatMeteoForEddyProFluxProcessing.ipynb
  • Updated notebook notebooks/Formats/FormatEddyProFluxnetFileForUpload.ipynb

What's Changed

  • Indev by @holukas in https://github.com/holukas/diive/pull/281

Full Changelog: https://github.com/holukas/diive/compare/v0.85.5...v0.85.6

- Python
Published by holukas 12 months ago

diive - v0.85.5

v0.85.5 | 3 Feb 2025

Updates to MDS gap-filling

The community-standard MDS gap-filling method for eddy covariance ecosystem fluxes (e.g., CO2 flux) is now integrated into the FluxProcessingChain. MDS is used during gap-filling in flux Level-4.1.

The diive implementation of the MDS gap-filling method adheres to the descriptions in Reichstein et al. (2005) and Vekuri et al. (2023), similar to the standard gap-filling procedures used by FLUXNET, ICOS, ReddyProc, and other similar platforms. This method fills gaps by substituting missing flux values with average flux values observed under comparable meteorological conditions.

Background: different flux levels

  • The class FluxProcessingChain in diive follows the flux processing steps as shown in the Flux Processing Chain outlined by Swiss FluxNet.
  • The flux processing chain uses different levels for different steps in the chain:
    • Level-0: preliminary flux calculations, e.g. during the year, using EddyPro
    • Level-1: final flux calculations, e.g. for complete year, using EddyPro
    • Level-2: quality flag expansion (flagging)
    • Level-3.1: storage correction (using one point measurement only, from profile not included by default)
    • Level-3.2: outlier removal (flagging)
    • Level-3.3: USTAR filtering (constant threshold, must be known, detection process not included by default) (flagging)
    • Following Level 3.3, a comprehensive quality flag (QCF) is generated by combining individual quality flags. Prior to subsequent processing steps, low-quality data (flag=2) is removed. Medium-quality data (flag=1) can be retained if necessary, while the highest quality data (flag=0) is always kept.
    • Level-4.1: gap-filling (MDS, long-term random forest)
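
The filtering step after Level-3.3 can be sketched as follows. The sketch assumes that an overall quality flag QCF with values 0, 1 and 2 already exists; how diive combines the individual flags into the QCF is not reproduced here, and the helper filter_by_qcf is a hypothetical name.

```python
import pandas as pd


def filter_by_qcf(flux, qcf, keep_medium=True):
    """Set flux records to NaN where the overall quality flag is too high."""
    max_allowed = 1 if keep_medium else 0
    return flux.where(qcf <= max_allowed)


idx = pd.date_range("2024-07-01", periods=5, freq="30min")
nee = pd.Series([-3.2, -4.1, 8.9, -2.7, -3.0], index=idx, name="NEE")
qcf = pd.Series([0, 1, 2, 0, 1], index=idx, name="QCF")

nee_medium = filter_by_qcf(nee, qcf, keep_medium=True)  # drops flag=2 only
nee_best = filter_by_qcf(nee, qcf, keep_medium=False)   # keeps flag=0 only
```

Rejected records become gaps, which are then candidates for gap-filling in Level-4.1.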

Changes

  • Changes in FluxMDS:
    • Added parameter avg_min_n_vals in MDS gap-filling
    • Renamed tolerance parameters for MDS gap-filling to *_tol
    • (diive.pkgs.gapfilling.mds.FluxMDS)
  • When reading a parquet file, sanitizing the timestamp is now optional (diive.core.io.files.load_parquet)
  • The function for creating lagged variants is now found in diive.pkgs.createvar.laggedvariants.lagged_variants

Additions

  • Added more text output for fill quality during gap-filling with MDS (diive.pkgs.gapfilling.mds.FluxMDS)
  • Added MDS gap-filling to flux processing chain ( diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain)
  • Allow fitting to unbinned data (diive.pkgs.fits.fitter.BinFitterCP)
  • Added parameter to edit y-label (diive.core.plotting.dielcycle.DielCycle)
  • Added preliminary USTAR filtering for NEE to quick flux processing chain ( diive.pkgs.fluxprocessingchain.fluxprocessingchain.QuickFluxProcessingChain)
  • FileSplitter:
    • Added parameter to directly output splits as parquet files in FileSplitter and FileSplitterMulti. These two classes split longer time series files (e.g., 6 hours) into several smaller splits (e.g., 12 half-hourly files). Usage of parquet speeds up not only the splitting part, but also the process when later re-reading the files for other processing steps.
    • After splitting, missing values in the split files are numpy NAN (diive.core.io.filesplitter.FileSplitter)
  • Added parameter to hide default plot when called. The method defaultplot is used e.g. by outlier detection methods to plot the data after outlier removal, to show flagged vs. unflagged values. ( diive.core.base.flagbase.FlagBase.defaultplot)
  • Added new filetype ETH-SONICREAD-BICO-MOD-CSV-20HZ
  • Added fig property that contains the default plot for outlier removal methods. This is useful when the default plot is needed elsewhere, e.g. saved to a file. At the moment, the parameter showplot must be True for the property to be accessible. (diive.core.base.flagbase.FlagBase)
    • Example for class zScoreRolling:
        zsr = zScoreRolling(..., showplot=True, ...)
        zsr.calc(repeat=True)
        fig = zsr.fig  # Contains the figure instance
        fig.savefig(...)  # Figure can then be saved to a file etc.

Notebooks

  • Added notebook example for creating lagged variants of variables ( notebooks/CalculateVariable/Create_lagged_variants.ipynb)
  • Updated flux processing chain notebook to v9.0: added option for MDS gap-filling, more descriptions
  • Bugfix: import for loading from Path was missing in flux processing chain notebook
  • Updated MDS gap-filling notebook to v1.1, added more descriptions and example for min_n_vals_nt parameter
  • Updated quick flux processing chain notebook

Unittests

  • Added test case tests.test_createvar.TestCreateVar.test_lagged_variants
  • Updated test case tests.test_gapfilling.TestGapFilling.test_fluxmds
  • Updated test case tests.test_fluxprocessingchain.TestFluxProcessingChain.test_fluxprocessingchain
  • 53/53 unittests ran successfully

Bugfixes

  • The setting for features that should not be lagged was not properly implemented ( diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain._get_ml_feature_settings)
  • Fixed bug when plotting (diive.pkgs.outlierdetection.localsd.LocalSD)

What's Changed

  • Indev by @holukas in https://github.com/holukas/diive/pull/278

Full Changelog: https://github.com/holukas/diive/compare/v0.84.2...v0.85.5

- Python
Published by holukas about 1 year ago

diive - v0.84.2

v0.84.2 | 8 Nov 2024

Changes

  • Adjust version number to avoid publishing conflict

Full Changelog: https://github.com/holukas/diive/compare/v0.84.1...v0.84.2

- Python
Published by holukas over 1 year ago

diive - v0.84.1

v0.84.1 | 8 Nov 2024

Bugfixes

  • Removed invalid imports

Tests

  • Added test case for diive imports (tests.test_imports.TestImports.test_imports)
  • 52/52 unittests ran successfully

What's Changed

  • Hotifx imports by @holukas in https://github.com/holukas/diive/pull/236

Full Changelog: https://github.com/holukas/diive/compare/v0.84.0...v0.84.1

- Python
Published by holukas over 1 year ago

diive - v0.84.0

v0.84.0 | 7 Nov 2024

New features

  • New class BinFitterCP for fitting function to binned data, includes confidence interval and prediction interval ( diive.pkgs.fits.fitter.BinFitterCP)

Additions

  • Added small function to detect duplicate entries in lists (diive.core.funcs.funcs.find_duplicates_in_list)
  • Added new filetype (diive/configs/filetypes/ETH-MERCURY-CSV-20HZ.yml)
  • Added new filetype (diive/configs/filetypes/GENERIC-CSV-HEADER-1ROW-TS-END-FULL-NS-20HZ.yml)

Bugfixes

  • Not directly a bug fix, but when reading EddyPro fluxnet files with LoadEddyProOutputFiles (e.g., in the flux processing chain) duplicate columns are now automatically renamed by adding a numbered suffix. For example, if two variables are named CUSTOM_CH4_MEAN in the output file, they are automatically renamed to CUSTOM_CH4_MEAN_1 and CUSTOM_CH4_MEAN_2 (diive.core.dfun.frames.compare_len_header_vs_data)
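
The renaming of duplicate columns described above can be sketched in a few lines of plain Python. The function rename_duplicate_columns is a hypothetical helper written for this example and is not the code used by LoadEddyProOutputFiles.

```python
from collections import Counter


def rename_duplicate_columns(columns):
    """Append a numbered suffix (_1, _2, ...) to every name that occurs more than once."""
    totals = Counter(columns)  # how often each name occurs in total
    seen = Counter()           # how often each name has been seen so far
    renamed = []
    for col in columns:
        if totals[col] > 1:
            seen[col] += 1
            renamed.append(f"{col}_{seen[col]}")
        else:
            renamed.append(col)
    return renamed


header = ["TIMESTAMP_END", "CUSTOM_CH4_MEAN", "FC", "CUSTOM_CH4_MEAN"]
new_header = rename_duplicate_columns(header)
# ['TIMESTAMP_END', 'CUSTOM_CH4_MEAN_1', 'FC', 'CUSTOM_CH4_MEAN_2']
```

Unique column names are left unchanged, so only the conflicting variables receive a suffix.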

Notebooks

  • Added notebook example for BinFitterCP (notebooks/Fits/BinFitterCP.ipynb)
  • Updated flux processing chain notebook to v8.6, import for loading EddyPro fluxnet output files was missing

Tests

  • Added test case for BinFitterCP (tests.test_fits.TestFits.test_binfittercp)
  • 51/51 unittests ran successfully

What's Changed

  • Indev by @holukas in https://github.com/holukas/diive/pull/235

Full Changelog: https://github.com/holukas/diive/compare/v0.83.2...v0.84.0

- Python
Published by holukas over 1 year ago

diive - v0.83.2

v0.83.2 | 25 Oct 2024

From now on, Python version 3.11.10 is used for developing diive (up to now, version 3.9 was used). All unittests were successfully executed with this new Python version. In addition, all notebooks were re-run and all looked good.

JupyterLab is now included in the environment, which makes it easier to quickly install diive (pip install diive) in an environment and directly use its notebooks, without the need to install JupyterLab separately.

Environment

Notebooks

  • All notebooks were re-run and updated using Python version 3.11.10

Tests

  • 50/50 unittests ran successfully with Python version 3.11.10

Changes

  • Adjusted flags check in QCF flag report, the progressive flag must be the same as the previously calculated overall flag (diive.pkgs.qaqc.qcf.FlagQCF.report_qcf_evolution)

What's Changed

  • Indev by @holukas in https://github.com/holukas/diive/pull/234

Full Changelog: https://github.com/holukas/diive/compare/v0.83.1...v0.83.2

- Python
Published by holukas over 1 year ago

diive - v0.83.1

v0.83.1 | 23 Oct 2024

Changes

  • When detecting the frequency from the time delta of records, the inferred frequency is accepted if the most frequent timedelta was found for more than 50% of records (diive.core.times.times.timestamp_infer_freq_from_timedelta)
  • Storage terms are now gap-filled using the rolling median in an expanding time window ( FluxStorageCorrectionSinglePointEddyPro._gapfill_storage_term)
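
The rule for accepting a frequency inferred from the time delta can be sketched with pandas. The function below is a hypothetical simplification: it relates the count of the most frequent timedelta to the number of timedeltas, which is an assumption about how the share of records is calculated.

```python
import pandas as pd


def infer_freq_from_timedelta_sketch(index, min_share=0.5):
    """Return the most frequent timedelta if its share exceeds min_share, else None."""
    deltas = index.to_series().diff().dropna()
    counts = deltas.value_counts()  # sorted by count, most frequent first
    most_frequent = counts.index[0]
    share = counts.iloc[0] / len(deltas)
    return most_frequent if share > min_share else None


# Half-hourly timestamps with two missing records
idx = pd.date_range("2024-01-01", periods=10, freq="30min").delete([3, 4])
freq = infer_freq_from_timedelta_sketch(idx)

# Irregular timestamps: no timedelta dominates
irregular = pd.DatetimeIndex(
    ["2024-01-01 00:00", "2024-01-01 00:10", "2024-01-01 00:30", "2024-01-01 01:00"]
)
no_freq = infer_freq_from_timedelta_sketch(irregular)
```

In the first case the 30-minute timedelta accounts for 6 of 7 differences and is accepted. In the second case each timedelta occurs only once, so no frequency is returned.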

Notebooks

  • Added notebook example for using the flux processing chain for CH4 flux from a subcanopy eddy covariance station ( notebooks/Workbench/CH-DAS_2023_FluxProcessingChain/FluxProcessingChain_NEE_CH-DAS_2023.ipynb)

Bugfixes

  • Fixed info for storage term correction report to account for cases when more storage terms than flux records are available (FluxStorageCorrectionSinglePointEddyPro.report)

Tests

  • 50/50 unittests ran successfully

What's Changed

  • Indev by @holukas in https://github.com/holukas/diive/pull/233

Full Changelog: https://github.com/holukas/diive/compare/v0.83.0...v0.83.1

- Python
Published by holukas over 1 year ago

diive - v0.83.0

v0.83.0 | 4 Oct 2024

MDS gap-filling

Finally it is possible to use the MDS (marginal distribution sampling) gap-filling method in diive. This method is the current default and widely used gap-filling method for eddy covariance ecosystem fluxes. For a detailed description of the method see Reichstein et al. (2005) and Pastorello et al. (2020; full references given below).

The implementation of MDS in diive (FluxMDS) follows the description in Reichstein et al. (2005) and should therefore yield results similar to other implementations of this algorithm. FluxMDS can also easily output model scores, such as r2 and error values.

At the moment it is not yet possible to use FluxMDS in the flux processing chain, but during the preparation of this update the flux processing chain code was already refactored and prepared to include FluxMDS in one of the next updates.

At the moment, FluxMDS is specifically tailored to gap-fill ecosystem fluxes, a more general implementation (e.g., to gap-fill meteorological data) will follow.

New features

  • Added new gap-filling class FluxMDS:
    • MDS stands for marginal distribution sampling. The method uses a time window to first identify meteorological conditions (short-wave incoming radiation, air temperature and VPD) similar to those when the missing data occurred. Gaps are then filled with the mean flux in the time window.
    • FluxMDS cannot yet be used in the flux processing chain, but this will be implemented soon.
    • (diive.pkgs.gapfilling.mds.FluxMDS)
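
The core idea of MDS described above can be sketched as follows. This is a heavily simplified illustration and not the FluxMDS implementation: the function mds_fill_sketch and its parameter names are hypothetical, the tolerances (50 W m-2, 2.5 °C, 5 hPa) and the 7-day window are taken as assumptions from Reichstein et al. (2005), and the full algorithm additionally widens the window and falls back to fewer drivers when no similar conditions are found.

```python
import numpy as np
import pandas as pd


def mds_fill_sketch(df, flux, swin, ta, vpd,
                    swin_tol=50.0, ta_tol=2.5, vpd_tol=5.0,
                    window_days=7, min_n_vals=2):
    """Fill gaps in df[flux] with the mean flux observed under similar conditions."""
    filled = df[flux].copy()
    available = df[df[flux].notna()]
    half_window = pd.Timedelta(days=window_days)
    for ts in df.index[df[flux].isna()]:
        row = df.loc[ts]
        # Records with measured flux inside the time window around the gap
        candidates = available.loc[ts - half_window:ts + half_window]
        # Keep only records with similar meteorological conditions
        similar = candidates[
            ((candidates[swin] - row[swin]).abs() <= swin_tol)
            & ((candidates[ta] - row[ta]).abs() <= ta_tol)
            & ((candidates[vpd] - row[vpd]).abs() <= vpd_tol)
        ]
        if len(similar) >= min_n_vals:
            filled.loc[ts] = similar[flux].mean()
    return filled


idx = pd.date_range("2024-06-01", periods=6, freq="30min")
df = pd.DataFrame({
    "NEE": [-5.0, -6.0, np.nan, -5.5, 2.0, np.nan],
    "SW_IN": [400.0, 420.0, 410.0, 430.0, 0.0, 900.0],
    "TA": [20.0, 20.5, 20.2, 21.0, 12.0, 30.0],
    "VPD": [10.0, 11.0, 10.5, 11.5, 2.0, 30.0],
}, index=idx)
nee_filled = mds_fill_sketch(df, flux="NEE", swin="SW_IN", ta="TA", vpd="VPD")
```

In this toy dataset the first gap is filled with the mean of the three records measured under similar conditions, while the second gap stays empty because no comparable conditions exist in the window.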

Changes

  • Storage correction: By default, values missing in the storage term are now filled with a rolling mean in an expanding time window. Testing showed that the (single point) storage term is missing for 2-3% of the data, which I think is reason enough to make filling these gaps the default option. Previously, it was optional to fill the gaps using random forest. However, results were not great since only the timestamp info was used as model features. Plots generated during Level-3.1 were also updated, now better showing the storage terms (gap-filled and non-gap-filled) and the flag indicating filled values (diive.pkgs.fluxprocessingchain.level31_storagecorrection.FluxStorageCorrectionSinglePointEddyPro)

Notebooks

  • Added notebook example for FluxMDS (notebooks/GapFilling/FluxMDSGapFilling.ipynb)

Tests

  • Added test case for FluxMDS (tests.test_gapfilling.TestGapFilling.test_fluxmds)
  • 50/50 unittests ran successfully

Bugfixes

  • Fixed bug: overall quality flag QCF was not created correctly for the different USTAR scenarios ( diive.core.base.identify.identify_flagcols) (diive.pkgs.qaqc.qcf.FlagQCF)
  • Fixed bug: calculation of QCF flag sums is now strictly done on flag columns. Before, sums were calculated across all columns in the flags dataframe, which resulted in erroneous overall flags after USTAR filtering ( diive.pkgs.qaqc.qcf.FlagQCF._calculate_flagsums)

Environment

References

  • Pastorello, G. et al. (2020). The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data. Scientific Data, 7(1), 225. https://doi.org/10.1038/s41597-020-0534-3
  • Reichstein, M., Falge, E., Baldocchi, D., Papale, D., Aubinet, M., Berbigier, P., Bernhofer, C., Buchmann, N., Gilmanov, T., Granier, A., Grunwald, T., Havrankova, K., Ilvesniemi, H., Janous, D., Knohl, A., Laurila, T., Lohila, A., Loustau, D., Matteucci, G., … Valentini, R. (2005). On the separation of net ecosystem exchange into assimilation and ecosystem respiration: Review and improved algorithm. Global Change Biology, 11(9), 1424–1439. https://doi.org/10.1111/j.1365-2486.2005.001002.x

What's Changed

  • Indev by @holukas in https://github.com/holukas/diive/pull/229

Full Changelog: https://github.com/holukas/diive/compare/v0.82.1...v0.83.0

- Python
Published by holukas over 1 year ago

diive - v0.82.1

v0.82.1 | 22 Sep 2024

Notebooks

  • Added notebook showing an example for LongTermGapFillingRandomForestTS ( notebooks/GapFilling/LongTermRandomForestGapFilling.ipynb)
  • Added notebook example for MeasurementOffset (notebooks/Corrections/MeasurementOffset.ipynb)

Tests

  • Added unittest for LongTermGapFillingRandomForestTS ( tests.test_gapfilling.TestGapFilling.test_gapfilling_longterm_randomforest)
  • Added unittest for WindDirOffset (tests.test_corrections.TestCorrections.test_winddiroffset)
  • Added unittest for DaytimeNighttimeFlag (tests.test_createvar.TestCreateVar.test_daytime_nighttime_flag)
  • Added unittest for calc_vpd_from_ta_rh (tests.test_createvar.TestCreateVar.test_calc_vpd)
  • Added unittest for percentiles101 (tests.test_analyses.TestAnalyses.test_percentiles)
  • Added unittest for GapFinder (tests.test_analyses.TestAnalyses.test_gapfinder)
  • Added unittest for SortingBinsMethod (tests.test_analyses.TestAnalyses.test_sorting_bins_method)
  • Added unittest for daily_correlation (tests.test_analyses.TestAnalyses.test_daily_correlation)
  • Added unittest for QuantileXYAggZ (tests.test_analyses.TestCreateVar.test_quantilexyaggz)
  • 49/49 unittests ran successfully

Bugfixes

  • Fixed bug that caused results from long-term gap-filling to be inconsistent despite using a fixed random state. I found the following: when reducing features across years, the removal of duplicate features from a list of found features created a list where the order of elements changed with each run. This in turn produced slightly different gap-filling results each time the long-term gap-filling was executed. The Python version in which this issue occurred was 3.9.19.
    • Here is a simplified example, where input_list is a list of elements with some duplicate elements:
    • Running output_list = list(set(input_list)) generates output_list where the elements have a different order in each run. The elements were otherwise the same, only their order changed.
    • To keep the order of elements consistent, it was necessary to call output_list.sort().
    • (diive.pkgs.gapfilling.longterm.LongTermGapFillingBase.reduce_features_across_years)
  • Corrected wind direction could be 360°, but will now be 0° ( diive.pkgs.corrections.winddiroffset.WindDirOffset._correct_degrees)
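
The ordering issue from the first bugfix above can be reproduced outside of diive. String hashing in Python is randomized per interpreter process unless PYTHONHASHSEED is fixed, so the iteration order of a set of strings can differ between runs. The sketch below shows two reproducible ways to remove duplicates. The helper names and the feature names are made up for this example.

```python
def unique_sorted(items):
    """Remove duplicates and return the elements in a reproducible (sorted) order."""
    return sorted(set(items))


def unique_in_first_seen_order(items):
    """Remove duplicates while keeping the order of first occurrence."""
    return list(dict.fromkeys(items))


input_list = ["TA", "SW_IN", "VPD", "TA", ".TA-1", "SW_IN"]

by_sorting = unique_sorted(input_list)
by_first_seen = unique_in_first_seen_order(input_list)
```

Sorting corresponds to the fix described above. Using dict.fromkeys is an alternative when the original order of the features should be preserved.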

What's Changed

  • Indev by @holukas in https://github.com/holukas/diive/pull/218

Full Changelog: https://github.com/holukas/diive/compare/v0.82.0...v0.82.1

- Python
Published by holukas over 1 year ago

diive - v0.82.0

v0.82.0 | 19 Sep 2024

Long-term gap-filling

It is now possible to gap-fill multi-year datasets using the class LongTermGapFillingRandomForestTS. In this approach, data from neighboring years are pooled together before training the random forest model for gap-filling a specific year. This is especially useful for long-term, multi-year datasets where environmental conditions and drivers might change over years and decades.

Why random forest? Because it performed well and to me it looks like the first choice for gap-filling ecosystem fluxes, at least at the moment.

Long-term gap-filling using random forest is now also built into the flux processing chain (Level-4.1). This makes it possible to quickly gap-fill the different USTAR scenarios and to create some useful plots (I hope). See the flux processing chain notebook to see what this looks like.

In a future update it will be possible to either directly switch to XGBoost for gap-filling, or to use it (and other machine-learning models) in combination with random forest in the flux processing chain.

Example

Here is an example for a dataset containing CO2 flux (NEE) measurements from 2005 to 2023:

  • for gap-filling the year 2005, the model is trained on data from 2005, 2006 and 2007 (2005 has no previous year)
  • for gap-filling the year 2006, the model is trained on data from 2005, 2006 and 2007 (same model as for 2005)
  • for gap-filling the year 2007, the model is trained on data from 2006, 2007 and 2008
  • ...
  • for gap-filling the year 2012, the model is trained on data from 2011, 2012 and 2013
  • for gap-filling the year 2013, the model is trained on data from 2012, 2013 and 2014
  • for gap-filling the year 2014, the model is trained on data from 2013, 2014 and 2015
  • ...
  • for gap-filling the year 2021, the model is trained on data from 2020, 2021 and 2022
  • for gap-filling the year 2022, the model is trained on data from 2021, 2022 and 2023 (same model as for 2023)
  • for gap-filling the year 2023, the model is trained on data from 2021, 2022 and 2023 (2023 has no next year)
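The pooling scheme in the list above can be sketched in a few lines. This is only an illustration of the windowing logic, assuming a three-year window and at least three years of data; the helper name training_years is hypothetical and is not part of diive.

```python
def training_years(year, first_year, last_year):
    """Return the three pooled years used to train the model for `year`.

    Hypothetical helper, not diive API. Assumes the dataset spans
    at least three years.
    """
    # Centre the window on the target year, then shift it inward
    # when the target year is the first or last year of the dataset.
    start = min(max(year - 1, first_year), last_year - 2)
    return [start, start + 1, start + 2]


for target in (2005, 2006, 2007, 2013, 2022, 2023):
    print(target, training_years(target, first_year=2005, last_year=2023))
```

The first and last years reuse the window of their neighbor, which matches the "same model as for ..." remarks in the list.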

New features

  • Added new method for long-term (multiple years) gap-filling using random forest to flux processing chain ( diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain.level41_gapfilling_longterm)
  • Added new class for long-term (multiple years) gap-filling using random forest ( diive.pkgs.gapfilling.longterm.LongTermGapFillingRandomForestTS)
  • Added class for plotting cumulative sums across all data, for multiple columns ( diive.core.plotting.cumulative.Cumulative)
  • Added class to detect a constant offset between two measurements ( diive.pkgs.corrections.measurementoffset.MeasurementOffset)

Changes

  • Creating lagged variants creates gaps, which leads to incomplete features in machine learning models. Now, gaps are filled using simple forward and backward filling, limited to the number of values defined in lag. For example, if variable TA is lagged by -2 values, this creates two missing values for this variant at the start of the time series, which are then gap-filled using a simple backward fill with limit=2. (diive.core.dfun.frames.lagged_variants)
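A minimal pandas sketch of the idea, not the actual lagged_variants implementation: the series is shifted by two records, which leaves two gaps at the start, and these are then closed with a limited backward fill. The variable names and example values are made up for illustration.

```python
import pandas as pd

ta = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0], name="TA")
lag = 2

# Each record now holds the value from two records earlier,
# which leaves two missing values at the start of the series.
lagged = ta.shift(lag)

# Backward fill, limited to the number of lagged records.
filled = lagged.bfill(limit=lag)

print(lagged.tolist())
print(filled.tolist())
```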

Notebooks

  • Updated flux processing chain notebook to include long-term gap-filling using random forest ( notebooks/FluxProcessingChain/FluxProcessingChain.ipynb)
  • Added new notebook for plotting cumulative sums across all data, for multiple columns ( notebooks/Plotting/Cumulative.ipynb)

Tests

  • Unittest for flux processing chain now includes many more methods ( tests.test_fluxprocessingchain.TestFluxProcessingChain.test_fluxprocessingchain)
  • 39/39 unittests ran successfully

Bugfixes

  • Fixed deprecation warning in (diive.core.ml.common.prediction_scores_regr)

What's Changed

  • Indev by @holukas in https://github.com/holukas/diive/pull/215

Full Changelog: https://github.com/holukas/diive/compare/v0.81.0...v0.82.0

- Python
Published by holukas over 1 year ago

diive - v0.81.0

v0.81.0 | 11 Sep 2024

Expanding Flux Processing Capabilities

This update brings advancements for post-processing eddy covariance data in the context of the FluxProcessingChain. The goal is to offer a complete chain for post-processing ecosystem flux data, specifically designed to work seamlessly with the standardized _fluxnet output file from the widely-used EddyPro software.

Now, diive offers the option for USTAR filtering based on known constant thresholds across the entire dataset (similar to the CUT scenarios in FLUXNET data). While seasonal (DJF, MAM, JJA, SON) thresholds are calculated internally, applying them on a seasonal basis or using variable thresholds per year (like FLUXNET's VUT scenarios) isn't yet implemented.

With this update, the FluxProcessingChain class can handle various data processing steps:

  • Level-2: Quality flag expansion
  • Level-3.1: Storage correction
  • Level-3.2: Outlier removal
  • Level-3.3: (new) USTAR filtering (with constant thresholds for now)
  • (upcoming) Level-4.1: long-term gap-filling using random forest and XGBoost
  • For info about the different flux levels see Swiss FluxNet flux processing chain

New features

  • Added class to apply multiple known constant USTAR (friction velocity) thresholds, creating flags that indicate time periods characterized by low turbulence for multiple USTAR scenarios. The constant thresholds must be known beforehand, e.g., from an earlier USTAR detection run, or from results from FLUXNET ( diive.pkgs.flux.ustarthreshold.FlagMultipleConstantUstarThresholds)
  • Added class to apply one single known constant USTAR threshold ( diive.pkgs.flux.ustarthreshold.FlagSingleConstantUstarThreshold)
  • Added FlagMultipleConstantUstarThresholds to the flux processing chain ( diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain.level33_constant_ustar)
  • Added USTAR detection algorithm based on Papale et al., 2006 (diive.pkgs.flux.ustarthreshold.UstarDetectionMPT)
  • Added function to analyze high-quality ecosystem fluxes that helps in understanding the range of highest-quality data (diive.pkgs.flux.hqflux.analyze_highest_quality_flux)
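To illustrate what applying several known constant USTAR thresholds means, here is a rough pandas sketch. It does not use the diive classes; the scenario names, threshold values, flag column names and the 0/1 flag coding are all assumptions made for this example.

```python
import pandas as pd

# Made-up friction velocity values (m s-1)
ustar = pd.Series([0.05, 0.12, 0.30, 0.08, 0.45], name="USTAR")

# Hypothetical scenarios with thresholds known beforehand,
# e.g. from an earlier USTAR detection run
thresholds = {"CUT_16": 0.07, "CUT_50": 0.10, "CUT_84": 0.15}

# One flag column per scenario: 1 = low turbulence, 0 = OK
flags = pd.DataFrame({
    f"FLAG_USTAR_{name}": (ustar < threshold).astype(int)
    for name, threshold in thresholds.items()
})

print(flags)
```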

Additions

  • LocalSD outlier detection can now use a constant SD:
    • Added parameter to use standard deviation across all data (constant) instead of the rolling SD to calculate the upper and lower limits that define outliers in the median rolling window ( diive.pkgs.outlierdetection.localsd.LocalSD)
    • Added to step-wise outlier detection ( diive.pkgs.outlierdetection.stepwiseoutlierdetection.StepwiseOutlierDetection.flag_outliers_localsd_test)
    • Added to meteoscreening from database ( diive.pkgs.qaqc.meteoscreening.StepwiseMeteoScreeningDb.flag_outliers_localsd_test)
    • Added to flux processing chain ( diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain.level32_flag_outliers_localsd_test)
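The difference between the rolling SD and the constant SD can be sketched as follows. This is a simplified illustration and not the LocalSD class itself; the function name, parameter names and window handling are assumptions.

```python
import pandas as pd


def localsd_flag(series, n_sd=2.0, winsize=9, constant_sd=False):
    """Flag values outside rolling median +/- n_sd * SD (hypothetical helper)."""
    rolling = series.rolling(winsize, center=True, min_periods=3)
    rmedian = rolling.median()
    # Either one SD across all data, or the SD within the rolling window
    sd = series.std() if constant_sd else rolling.std()
    upper = rmedian + n_sd * sd
    lower = rmedian - n_sd * sd
    return (series > upper) | (series < lower)


values = [1.0, 1.1, 0.9] * 7
values[10] = 10.0  # artificial spike
series = pd.Series(values)

flag_roll = localsd_flag(series, constant_sd=False)
flag_const = localsd_flag(series, constant_sd=True)
print(flag_roll[flag_roll].index.tolist(), flag_const[flag_const].index.tolist())
```

With the constant SD the limits have the same width everywhere, whereas the rolling SD widens the limits around noisy periods.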

Changes

  • Replaced .plot_date() from the Matplotlib library with .plot() due to deprecation

Notebooks

  • Added notebook for plotting cumulative sums per year (notebooks/Plotting/CumulativesPerYear.ipynb)
  • Added notebook for removing outliers based on the z-score in rolling time window ( notebooks/OutlierDetection/zScoreRolling.ipynb)

Bugfixes

  • Fixed bug when saving a pandas Series to parquet (diive.core.io.files.save_parquet)
  • Fixed bug when plotting doy_mean_cumulative: no longer crashes when years defined in parameter excl_years_from_reference are not in dataset (diive.core.times.times.doy_mean_cumulative)
  • Fixed deprecation warning when plotting in bokeh (interactive plots)

Tests

  • Added unittest for LocalSD using constant SD ( tests.test_outlierdetection.TestOutlierDetection.test_localsd_with_constantsd)
  • Added unittest for rolling z-score outlier removal ( tests.test_outlierdetection.TestOutlierDetection.test_zscore_rolling)
  • Improved check if figure and axis were created in (tests.test_plots.TestPlots.test_histogram)
  • 39/39 unittests ran successfully

Environment

  • Added new package scikit-optimize
  • Added new package category_encoders

What's Changed

  • Indev by @holukas in https://github.com/holukas/diive/pull/205

Full Changelog: https://github.com/holukas/diive/compare/v0.80.0...v0.81.0

- Python
Published by holukas over 1 year ago

diive - v0.80.0

v0.80.0 | 28 Aug 2024

Additions

  • Added outlier tests to step-wise meteoscreening from database: Hampel, HampelDaytimeNighttime and TrimLow ( diive.pkgs.qaqc.meteoscreening.StepwiseMeteoScreeningDb)
  • Added parameter to control whether or not to output the middle timestamp when loading parquet files with load_parquet(). By default, output_middle_timestamp=True. (diive.core.io.files.load_parquet)

Environment

  • Re-created environment and created new lock file
  • Currently using Python 3.9.19

Notebooks

  • Added new notebook for creating a flag that indicates missing values (notebooks/OutlierDetection/MissingValues.ipynb)
  • Updated notebook for meteoscreening from database ( notebooks/MeteoScreening/StepwiseMeteoScreeningFromDatabase.ipynb)
  • Updated notebook for loading and saving parquet files (notebooks/Formats/LoadSaveParquetFile.ipynb)

Tests

  • Added unittest for flagging missing values (tests.test_outlierdetection.TestOutlierDetection.test_missing_values)
  • 37/37 unittests ran successfully

Bugfixes

  • Fixed links in README, needed absolute links to notebooks
  • Fixed issue with return list in (diive.pkgs.analyses.histogram.Histogram.peakbins)

What's Changed

  • Meteoscreening updates by @holukas in https://github.com/holukas/diive/pull/184

Full Changelog: https://github.com/holukas/diive/compare/v0.79.1...v0.80.0

- Python
Published by holukas over 1 year ago

diive - v0.79.1

v0.79.1 | 26 Aug 2024

Additions

  • Added new function to apply quality flags to certain time periods only (diive.pkgs.qaqc.flags.restrict_application)
  • Added option to restrict the application of the angle-of-attack flag to certain time periods ( diive.pkgs.fluxprocessingchain.level2_qualityflags.FluxQualityFlagsEddyPro.angle_of_attack_test)

Changes

  • Test options in FluxProcessingChain are now always passed as dict. This has the advantage that, in addition to running the test by setting the dict key apply to True, various other test settings can be passed, for example the new parameter application dates for the angle-of-attack flag. ( diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain)

Tests

  • Added unittest for Flux Processing Chain up to Level-2 ( tests.test_fluxprocessingchain.TestFluxProcessingChain.test_fluxprocessingchain_level2)
  • 36/36 unittests ran successfully

What's Changed

  • Time periods ec flags by @holukas in https://github.com/holukas/diive/pull/179

Full Changelog: https://github.com/holukas/diive/compare/v0.79.0...v0.79.1

- Python
Published by holukas over 1 year ago

diive - v0.79.0

v0.79.0 | 22 Aug 2024

This version introduces a histogram plot that has the option to display z-score as vertical lines superimposed on the distribution, which helps in assessing z-score settings used by some outlier removal functions.

Histogram plot of half-hourly air temperature measurements at the ICOS Class 1 ecosystem station Davos between 2013 and 2022, displayed in 20 equally-spaced bins. The dashed vertical lines show the z-score and the corresponding value calculated based on the time series. The bin with most counts is highlighted orange.

New features

  • Added new class HistogramPlot for plotting histograms, based on the Matplotlib implementation (diive.core.plotting.histogram.HistogramPlot)
  • Added function to calculate the value for a specific z-score, e.g., based on a time series it calculates the value where z-score = 3 etc. (diive.core.funcs.funcs.val_from_zscore)
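The value that belongs to a given z-score follows from the z-score definition z = (x - mean) / SD, i.e. x = mean + z * SD. The sketch below shows that calculation with the standard library; the function name is hypothetical, and whether diive uses the sample or the population standard deviation is not stated here, so the sample SD is an assumption.

```python
import statistics


def value_at_zscore(values, zscore):
    """Return the value at which the given z-score is reached (hypothetical helper)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, an assumption for this sketch
    return mean + zscore * sd


print(value_at_zscore([1.0, 2.0, 3.0], zscore=3))
```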

Additions

  • Added histogram plots to FlagBase, histograms are now shown for all outlier methods (diive.core.base.flagbase.FlagBase.defaultplot)
  • Added daytime/nighttime histogram plots to (diive.pkgs.outlierdetection.hampel.HampelDaytimeNighttime)
  • Added daytime/nighttime histogram plots to (diive.pkgs.outlierdetection.zscore.zScoreDaytimeNighttime)
  • Added daytime/nighttime histogram plots to (diive.pkgs.outlierdetection.lof.LocalOutlierFactorDaytimeNighttime)
  • Added daytime/nighttime histogram plots to ( diive.pkgs.outlierdetection.absolutelimits.AbsoluteLimitsDaytimeNighttime)
  • Added option to calculate the z-score with sign instead of absolute (diive.core.funcs.funcs.zscore)

Changes

  • Improved daytime/nighttime outlier plot used by various outlier removal classes ( diive.core.base.flagbase.FlagBase.plot_outlier_daytime_nighttime)

Notebooks

  • Added notebook for plotting histograms (notebooks/Plotting/Histogram.ipynb)
  • Added notebook for manual removal of data points (notebooks/OutlierDetection/ManualRemoval.ipynb)
  • Added notebook for outlier detection using local outlier factor, separately during daytime and nighttime ( notebooks/OutlierDetection/LocalOutlierFactorDaytimeNighttime.ipynb)
  • Updated notebook (notebooks/OutlierDetection/HampelDaytimeNighttime.ipynb)
  • Updated notebook (notebooks/OutlierDetection/AbsoluteLimitsDaytimeNighttime.ipynb)
  • Updated notebook (notebooks/OutlierDetection/zScoreDaytimeNighttime.ipynb)
  • Updated notebook (notebooks/OutlierDetection/LocalOutlierFactorAllData.ipynb)

Tests

  • Added unittest for plotting histograms (tests.test_plots.TestPlots.test_histogram)
  • Added unittest for calculating histograms (without plotting) (tests.test_analyses.TestCreateVar.test_histogram)

What's Changed

  • v0.79.0 by @holukas in https://github.com/holukas/diive/pull/176

Full Changelog: https://github.com/holukas/diive/compare/v0.78.1.1...v0.79.0

- Python
Published by holukas over 1 year ago

diive - v0.78.1.1

v0.78.1.1 | 19 Aug 2024

Additions

  • Added CITATIONS file

Full Changelog: https://github.com/holukas/diive/compare/v0.78.1...v0.78.1.1

- Python
Published by holukas over 1 year ago

diive - v0.78.1

v0.78.1 | 19 Aug 2024

Changes

  • Added option to set different n_sigma for daytime and nighttime data in HampelDaytimeNighttime (diive.pkgs.outlierdetection.hampel.HampelDaytimeNighttime)
  • Updated flag_outliers_hampel_dtnt_test in step-wise outlier detection
  • Updated level32_flag_outliers_hampel_dtnt_test in flux processing chain

Notebooks

  • Updated notebook HampelDaytimeNighttime
  • Updated notebook FluxProcessingChain

Tests

  • Updated unittest test_hampel_filter_daytime_nighttime

What's Changed

  • v0.78.1 by @holukas in https://github.com/holukas/diive/pull/168

Full Changelog: https://github.com/holukas/diive/compare/v0.78.0...v0.78.1

- Python
Published by holukas over 1 year ago

diive - v0.78.0

v0.78.0 | 18 Aug 2024

New features

  • Added new class for outlier removal, based on the rolling z-score. It can also be used in step-wise outlier detection and during meteoscreening from the database. (diive.pkgs.outlierdetection.zscore.zScoreRolling, diive.pkgs.outlierdetection.stepwiseoutlierdetection.StepwiseOutlierDetection, diive.pkgs.qaqc.meteoscreening.StepwiseMeteoScreeningDb).
  • Added Hampel filter for outlier removal (diive.pkgs.outlierdetection.hampel.Hampel)
  • Added Hampel filter (separate daytime, nighttime) for outlier removal (diive.pkgs.outlierdetection.hampel.HampelDaytimeNighttime)
  • Added function to plot daytime and nighttime outliers during outlier tests (diive.core.plotting.outlier_dtnt.outlier_daytime_nighttime)
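For readers unfamiliar with the Hampel filter, the following is a generic sketch of the technique: a value is flagged when it deviates from the rolling median by more than n_sigma times the scaled median absolute deviation (MAD) of its window. It is not the diive implementation; the function name, parameter names and defaults are assumptions.

```python
import numpy as np
import pandas as pd


def _mad(window_values):
    # Median absolute deviation around the window median
    med = np.median(window_values)
    return np.median(np.abs(window_values - med))


def hampel_flag(series, window=7, n_sigma=3.0):
    """Return True where a value is a Hampel outlier (hypothetical helper)."""
    k = 1.4826  # scales the MAD to the SD for normally distributed data
    rolling = series.rolling(window, center=True, min_periods=1)
    rmedian = rolling.median()
    mad = rolling.apply(_mad, raw=True)
    return (series - rmedian).abs() > n_sigma * k * mad


series = pd.Series([1.0, 1.2, 0.9, 1.1, 1.0, 8.0, 1.1, 0.9, 1.2, 1.0, 1.1])
flags = hampel_flag(series)
print(flags[flags].index.tolist())
```

As with the other outlier methods described in these notes, the result is a flag and the data themselves are left unchanged.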

Changes

  • Flux processing chain:
    • Several changes to the flux processing chain to make sure it can also work with data files not directly output by EddyPro. The class FluxProcessingChain can now handle files that have a different format than the two EddyPro output files EDDYPRO-FLUXNET-CSV-30MIN and EDDYPRO-FULL-OUTPUT-CSV-30MIN. See the following notes.
    • Removed option to process EddyPro _full_output_ files, since it is an older format and its variables do not follow FLUXNET conventions.
    • Removed keyword filetype in class FluxProcessingChain. It is now assumed that the variable names follow the FLUXNET convention. Variables used in FLUXNET are listed here (diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain)
    • When detecting the base variable from which a flux variable was calculated, the variables defined for filetype EDDYPRO-FLUXNET-CSV-30MIN are now assumed by default. (diive.pkgs.flux.common.detect_basevar)
    • Renamed function that detects the base variable that was used to calculate the respective flux (diive.pkgs.flux.common.detect_fluxbasevar)
    • Renamed gas in functions related to completeness tests to fluxbasevar to better reflect that the completeness test does not necessarily require a gas (e.g. T_SONIC is used to calculate the completeness for sensible heat flux) (flag_fluxbasevar_completeness_eddypro_test)
  • Removing the radiation offset now uses 0.001 (W m-2) instead of 50 as the threshold value to flag nighttime values for the correction (diive.pkgs.corrections.offsetcorrection.remove_radiation_zero_offset)
  • The database tag for meteo data screened with diive is now meteoscreening_diive (diive.pkgs.qaqc.meteoscreening.StepwiseMeteoScreeningDb.resample)
  • During noise generation, function now uses the absolute values of the min/max of a series to calculate minimum noise and maximum noise (diive.pkgs.createvar.noise.add_impulse_noise)

Notebooks

  • Added new notebook for outlier detection using class zScore (notebooks/OutlierDetection/zScore.ipynb)
  • Added new notebook for outlier detection using class zScoreDaytimeNighttime (notebooks/OutlierDetection/zScoreDaytimeNighttime.ipynb)
  • Added new notebook for outlier removal using trimming (notebooks/OutlierDetection/TrimLow.ipynb)
  • Updated notebook (notebooks/MeteoScreening/StepwiseMeteoScreeningFromDatabase_v7.0.ipynb)
  • When uploading screened meteo data to the database using the notebook StepwiseMeteoScreeningFromDatabase, variables with the same name, measurement and data version as the screened variable(s) are now deleted from the database before the new data are uploaded. Implemented in the Python package dbc-influxdb to avoid duplicates in the database. Such duplicates can occur when one of the tags of an otherwise identical variable changed, e.g., when one of the tags of the originally uploaded data was wrong and needed correction. The database InfluxDB stores a new time series alongside the previous time series when one of the tags is different in an otherwise identical time series.

Tests

  • Added test case for Hampel filter (tests.test_outlierdetection.TestOutlierDetection.test_hampel_filter)
  • Added test case for HampelDaytimeNighttime filter (tests.test_outlierdetection.TestOutlierDetection.test_hampel_filter_daytime_nighttime)
  • Added test case for zScore (tests.test_outlierdetection.TestOutlierDetection.test_zscore)
  • Added test case for TrimLow (tests.test_outlierdetection.TestOutlierDetection.test_trim_low_nt)
  • Added test case for zScoreDaytimeNighttime (tests.test_outlierdetection.TestOutlierDetection.test_zscore_daytime_nighttime)
  • 33/33 unittests ran successfully

Environment

  • Added package sktime, a unified framework for machine learning with time series.

What's Changed

  • v0.78.0 by @holukas in https://github.com/holukas/diive/pull/161

Full Changelog: https://github.com/holukas/diive/compare/v0.77.0...v0.78.0

- Python
Published by holukas over 1 year ago

diive - v0.77.0

v0.77.0 | 11 Jun 2024

Additions

  • Plotting cumulatives with CumulativeYear now also shows the cumulative for the reference, i.e. for the mean over the reference years (diive.core.plotting.cumulative.CumulativeYear)
  • Plotting DielCycle now accepts ylim parameter (diive.core.plotting.dielcycle.DielCycle)
  • Added long-term dataset for local testing purposes (internal only) (diive.configs.exampledata.load_exampledata_parquet_long)
  • Added several classes in preparation for long-term gap-filling for a future update

Changes

  • Several updates and changes to the base class for regressor decision trees (diive.core.ml.common.MlRegressorGapFillingBase):
    • The data are now split into training set and test set at the very start of regressor setup. This test set is used to evaluate models on unseen data. The default split is 80% training and 20% test data.
    • Plotting (scores, importances etc.) is now generally separated from the method where they are calculated.
    • the same random_state is now used for all processing steps
    • refactored code
    • beautified console output
  • When correcting for relative humidity values above 100%, the maximum of the corrected time series is now set to 100, after the (daily) offset was removed (diive.pkgs.corrections.offsetcorrection.remove_relativehumidity_offset)
  • During feature reduction in machine learning regressors, features with permutation importance < 0 are now always removed (diive.core.ml.common.MlRegressorGapFillingBase._remove_rejected_features)
  • Changed default parameters for quick random forest gap-filling (diive.pkgs.gapfilling.randomforest_ts.QuickFillRFTS)
  • I tried to improve the console output (clarity) for several functions and methods

Environment

  • Added package dtreeviz to visualize decision trees

Notebooks

  • Updated notebook (notebooks/GapFilling/RandomForestGapFilling.ipynb)
  • Updated notebook (notebooks/GapFilling/LinearInterpolation.ipynb)
  • Updated notebook (notebooks/GapFilling/XGBoostGapFillingExtensive.ipynb)
  • Updated notebook (notebooks/GapFilling/XGBoostGapFillingMinimal.ipynb)
  • Updated notebook (notebooks/GapFilling/RandomForestParamOptimization.ipynb)
  • Updated notebook (notebooks/GapFilling/QuickRandomForestGapFilling.ipynb)

Tests

  • Updated and fixed test case (tests.test_outlierdetection.TestOutlierDetection.test_zscore_increments)
  • Updated and fixed test case (tests.test_gapfilling.TestGapFilling.test_gapfilling_randomforest)

What's Changed

  • Ml long term gap filling by @holukas in https://github.com/holukas/diive/pull/128

Full Changelog: https://github.com/holukas/diive/compare/v0.76.2...v0.77.0

- Python
Published by holukas over 1 year ago

diive - v0.76.2

v0.76.2 | 23 May 2024

Additions

  • Added function to calculate absolute double differences of a time series, which is the sum of absolute differences between a data record and its preceding and next record. Used in class zScoreIncrements for finding (isolated) outliers that are distant from neighboring records. (diive.core.dfun.stats.double_diff_absolute)
  • Added small function to calculate z-score stats of a time series (diive.core.dfun.stats.sstats_zscore)
  • Added small function to calculate stats for absolute double differences of a time series (diive.core.dfun.stats.sstats_doublediff_abs)

Changes

  • Changed the algorithm for outlier detection when using zScoreIncrements. Data points are now flagged as outliers if the z-scores of three absolute differences (previous record, next record and the sum of both) all exceed a specified threshold. (diive.pkgs.outlierdetection.incremental.zScoreIncrements)
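The flagging rule described above can be sketched with pandas as follows. This is an illustration of the stated rule only, not the zScoreIncrements class; the function names and the default threshold are assumptions.

```python
import pandas as pd


def zscore(s):
    return (s - s.mean()) / s.std()


def flag_increments(series, threshold=3.0):
    """Flag records where all three increment z-scores exceed the threshold."""
    diff_prev = series.diff().abs()      # |x[i] - x[i-1]|
    diff_next = series.diff(-1).abs()    # |x[i] - x[i+1]|
    double_diff = diff_prev + diff_next  # absolute double difference
    exceeds = [zscore(d).abs() > threshold
               for d in (diff_prev, diff_next, double_diff)]
    return exceeds[0] & exceeds[1] & exceeds[2]


values = [1.0, 1.1] * 25
values[20] = 20.0  # isolated artificial spike
flags = flag_increments(pd.Series(values))
print(flags[flags].index.tolist())
```

Requiring all three conditions means that the records directly before and after a spike are not flagged, because only one of their two neighboring differences is large.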

Notebooks

  • Added new notebook for outlier detection using class LocalOutlierFactorAllData (notebooks/OutlierDetection/LocalOutlierFactorAllData.ipynb)

Tests

  • Added new test case for LocalOutlierFactorAllData (tests.test_outlierdetection.TestOutlierDetection.test_lof_alldata)

What's Changed

  • More stats by @holukas in https://github.com/holukas/diive/pull/116

Full Changelog: https://github.com/holukas/diive/compare/v0.76.1...v0.76.2

- Python
Published by holukas over 1 year ago

diive - v0.76.1

v0.76.1 | 17 May 2024

Additions

  • It is now possible to set a fixed random seed when creating impulse noise (diive.pkgs.createvar.noise.add_impulse_noise)

Changes

  • In class zScoreIncrements, outliers are now detected by calculating the sum of the absolute differences between a data point and its respective preceding and next data point. Before, only the non-absolute difference of the preceding data point was considered. The sum of absolute differences is then used to calculate the z-score and in further consequence to flag outliers. (diive.pkgs.outlierdetection.incremental.zScoreIncrements)

Notebooks

  • Added new notebook for outlier detection using class zScoreIncrements (notebooks/OutlierDetection/zScoreIncremental.ipynb)
  • Added new notebook for outlier detection using class LocalSD (notebooks/OutlierDetection/LocalSD.ipynb)

Tests

  • Added new test case for zScoreIncrements (tests.test_outlierdetection.TestOutlierDetection.test_zscore_increments)
  • Added new test case for LocalSD (tests.test_outlierdetection.TestOutlierDetection.test_localsd)

What's Changed

  • Added more notebooks and test cases by @holukas in https://github.com/holukas/diive/pull/108

Full Changelog: https://github.com/holukas/diive/compare/v0.76.0...v0.76.1

- Python
Published by holukas almost 2 years ago

diive - v0.76.0

v0.76.0 | 14 May 2024

Diel cycle plot

The new class DielCycle makes it possible to plot diel cycles per month or across all data for time series data. At the moment, it plots the (monthly) diel cycles as means (+/- standard deviation). It makes use of the time info contained in the datetime timestamp index of the data. All aggregates are calculated by grouping data by time and (optionally) separately for each month. The diel cycles have the same time resolution as the time component of the timestamp index, e.g. hourly.
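The grouping behind a monthly diel cycle can be approximated in plain pandas, as in the following sketch. It does not call DielCycle or diel_cycle; the synthetic hourly series and the variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic hourly data for January and February 2024 (60 days)
idx = pd.date_range("2024-01-01", periods=60 * 24, freq="60min")
ta = pd.Series(np.sin(2 * np.pi * idx.hour.to_numpy() / 24), index=idx, name="TA")

# Group by month and by time of day, then aggregate
diel = ta.groupby([ta.index.month, ta.index.time]).agg(["mean", "std"])

print(diel.head())
```

The result has one row per month and time of day, so its time resolution matches that of the timestamp index.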

New features

  • Added new class DielCycle for plotting diel cycles per month (diive.core.plotting.dielcycle.DielCycle)
  • Added new function diel_cycle for calculating diel cycles per month. This function is also used by the plotting class DielCycle (diive.core.times.resampling.diel_cycle)

Additions

  • Added color scheme that contains 12 colors, one for each month. Not perfect, but better than before. (diive.core.plotting.styles.LightTheme.colors_12_months)

Notebooks

  • Added new notebook for plotting diel cycles (per month) (notebooks/Plotting/DielCycle.ipynb)
  • Added new notebook for calculating diel cycles (per month) (notebooks/Resampling/ResamplingDielCycle.ipynb)

Tests

  • Added test case for new function diel_cycle (tests.test_resampling.TestResampling.test_diel_cycle)

What's Changed

  • Diel cycle plot by @holukas in https://github.com/holukas/diive/pull/107

Full Changelog: https://github.com/holukas/diive/compare/v0.75.0...v0.76.0

- Python
Published by holukas almost 2 years ago

diive - v0.75.0

v0.75.0 | 26 Apr 2024

XGBoost gap-filling

XGBoost can now be used to fill gaps in time series data. In diive, XGBoost is implemented in class XGBoostTS, which adds additional options for easily including e.g. lagged variants of feature variables, timestamp info (DOY, month, ...) and a continuous record number. It also allows direct feature reduction by including a purely random feature (consisting of completely random numbers) and calculating the 'permutation importance'. All features where the permutation importance is lower than for the random feature can then be removed from the dataset, i.e., the list of features, before building the final model.
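The selection step of this feature reduction can be sketched independently of any model. The example assumes that permutation importances have already been calculated; the importance values, the feature names and the name of the random feature are made up, and the helper is not diive API.

```python
def reduce_features(importances, random_feature=".RANDOM"):
    """Keep features whose importance is not lower than that of the random feature."""
    threshold = importances[random_feature]
    return [name for name, importance in importances.items()
            if name != random_feature and importance >= threshold]


# Hypothetical permutation importances, including the purely random feature
importances = {"TA": 0.50, "SW_IN": 0.30, "WEAK_VAR": 0.001, ".RANDOM": 0.01}
print(reduce_features(importances))
```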

XGBoostTS and RandomForestTS both use the same base class MlRegressorGapFillingBase. This base class will also facilitate the implementation of other gap-filling algorithms in the future.

Another fun (for me) addition is the new class TimeSince. It calculates the time since the last occurrence of specific conditions. One example where this class can be useful is the calculation of 'time since last precipitation', expressed as number of records, which can be helpful in identifying dry conditions. More examples: 'time since freezing conditions' based on air temperature; 'time since management' based on management info, e.g. fertilization events. Please see the notebook for some illustrative examples.

Please note that diive is still under development and bugs can be expected.
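A record counter of this kind can be sketched with pandas as shown below. This is not the TimeSince class; the function name, the inclusive range limits and the handling of records before the first occurrence (they simply count up from the start of the series) are assumptions.

```python
import pandas as pd


def time_since(series, lower, upper):
    """Count records since `series` was last inside [lower, upper]."""
    inside = series.between(lower, upper)
    # Every occurrence starts a new group ...
    groups = inside.cumsum()
    # ... and within each group the records outside the range are counted.
    return (~inside).astype(int).groupby(groups).cumsum()


# Made-up precipitation values (mm), "occurrence" = at least 0.1 mm
precip = pd.Series([0.0, 0.0, 1.2, 0.0, 0.0, 0.0, 0.5, 0.0])
since_precip = time_since(precip, lower=0.1, upper=999)
print(since_precip.tolist())
```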

New features

  • Added gap-filling class XGBoostTS for time series data, using XGBoost (diive.pkgs.gapfilling.xgboost_ts.XGBoostTS)
  • Added new class TimeSince: counts number of records (incremental number / counter) since the last time a time series was inside a specified range, useful for e.g. counting the time since last precipitation, since last freezing temperature, etc. (diive.pkgs.createvar.timesince.TimeSince)

Additions

  • Added base class for machine learning regressors, which is basically the code shared between the different methods. At the moment used by RandomForestTS and XGBoostTS. (diive.core.ml.common.MlRegressorGapFillingBase)
  • Added option to change line color directly in TimeSeries plots (diive.core.plotting.timeseries.TimeSeries.plot)

Notebooks

  • Added new notebook for gap-filling using XGBoostTS with minimal settings (notebooks/GapFilling/XGBoostGapFillingMinimal.ipynb)
  • Added new notebook for gap-filling using XGBoostTS with more extensive settings (notebooks/GapFilling/XGBoostGapFillingExtensive.ipynb)
  • Added new notebook for creating TimeSince variables (notebooks/CalculateVariable/TimeSince.ipynb)

Tests

  • Added test case for XGBoost gap-filling (tests.test_gapfilling.TestGapFilling.test_gapfilling_xgboost)
  • Updated test case for random forest gap-filling (tests.test_gapfilling.TestGapFilling.test_gapfilling_randomforest)
  • Harmonized test case for XGBoostTS with test case of RandomForestTS
  • Added test case for TimeSince variable creation (tests.test_createvar.TestCreateVar.test_timesince)

What's Changed

  • Adding xgboost by @holukas in https://github.com/holukas/diive/pull/102

Full Changelog: https://github.com/holukas/diive/compare/v0.74.1...v0.75.0

- Python
Published by holukas almost 2 years ago

diive - v0.74.1

v0.74.1 | 23 Apr 2024

This update adds the first notebooks (and tests) for outlier detection methods. Only two tests are included so far and both tests are relatively simple, but both notebooks already show in principle how outlier removal is handled. An important aspect is that diive single outlier methods do not remove outliers by default, but instead a flag is created that shows where the outliers are located. The flag can then be used to remove the data points. This update also includes the addition of a small function that creates artificial spikes in time series data and is therefore very useful for testing outlier detection methods. More outlier removal notebooks will be added in the future, including a notebook that shows how to combine results from multiple outlier tests into one single overall outlier flag.

New features

  • Added: new function to add impulse noise to time series (diive.pkgs.createvar.noise.impulse)

Notebooks

  • Added: new notebook for outlier detection: absolute limits, separately for daytime and nighttime data (notebooks/OutlierDetection/AbsoluteLimitsDaytimeNighttime.ipynb)
  • Added: new notebook for outlier detection: absolute limits (notebooks/OutlierDetection/AbsoluteLimits.ipynb)

Tests

  • Added: test case for outlier detection: absolute limits, separately for daytime and nighttime data (tests.test_outlierdetection.TestOutlierDetection.test_absolute_limits)
  • Added: test case for outlier detection: absolute limits (tests.test_outlierdetection.TestOutlierDetection.test_absolute_limits)

What's Changed

  • Outlier notebooks by @holukas in https://github.com/holukas/diive/pull/95
  • Update README.md by @inkenbrandt in https://github.com/holukas/diive/pull/86
  • Update pyproject.toml by @inkenbrandt in https://github.com/holukas/diive/pull/85

Full Changelog: https://github.com/holukas/diive/compare/v0.74.0...v0.74.1

- Python
Published by holukas almost 2 years ago

diive - v0.74.0

v0.74.0 | 21 Apr 2024

Additions

  • Added: new function to remove rows that do not have timestamp info (NaT) (diive.core.times.times.remove_rows_nat and diive.core.times.times.TimestampSanitizer)
  • Added: new settings VARNAMES_ROW and VARUNITS_ROW in filetypes YAML files, allows better and more specific configuration when reading data files (diive/configs/filetypes)
  • Added: many (small) example data files for various filetypes, e.g. ETH-RECORD-TOA5-CSVGZ-20HZ
  • Added: new optional check in TimestampSanitizer that compares the detected time resolution of a time series with the nominal (expected) time resolution. Runs automatically when reading files with ReadFileType, in which case the FREQUENCY from the filetype configs is used as the nominal time resolution. (diive.core.times.times.TimestampSanitizer, diive.core.io.filereader.ReadFileType)
  • Added: application of TimestampSanitizer after inserting a timestamp and setting it as index with function insert_timestamp, this makes sure the freq/freqstr info is available for the new timestamp index (diive.core.times.times.insert_timestamp)
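Removing rows without timestamp info can be illustrated with plain pandas. The sketch below is not the TimestampSanitizer code; it assumes that unparseable timestamps are first turned into NaT and that the affected rows are then dropped. The example data are made up.

```python
import pandas as pd

df = pd.DataFrame(
    {"TA": [1.0, 2.0, 3.0]},
    index=["2024-01-01 00:30", "not-a-date", "2024-01-01 01:30"],
)

# Entries that cannot be parsed become NaT instead of raising an error
df.index = pd.to_datetime(df.index, errors="coerce")

# Keep only rows that have valid timestamp info
df = df[df.index.notna()]

print(df)
```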

Notebooks

  • General: Ran all notebook examples to make sure they work with this version of diive
  • Added: new notebook for reading EddyPro fluxnet output file with DataFileReader parameters (notebooks/ReadFiles/Read_single_EddyPro_fluxnet_output_file_with_DataFileReader.ipynb)
  • Added: new notebook for reading EddyPro fluxnet output file with ReadFileType and pre-defined filetype EDDYPRO-FLUXNET-CSV-30MIN (notebooks/ReadFiles/Read_single_EddyPro_fluxnet_output_file_with_ReadFileType.ipynb)
  • Added: new notebook for reading multiple EddyPro fluxnet output files with MultiDataFileReader and pre-defined filetype EDDYPRO-FLUXNET-CSV-30MIN (notebooks/ReadFiles/Read_multiple_EddyPro_fluxnet_output_files_with_MultiDataFileReader.ipynb)

Changes

  • Renamed: function get_len_header to parse_header (diive.core.dfun.frames.parse_header)
  • Renamed: exampledata files (diive/configs/exampledata)
  • Renamed: filetypes YAML files to always include the file extension in the file name (diive/configs/filetypes)
  • Reduced: file size for most example data files

Tests

  • Added: various test cases for loading filetypes (tests/test_loaddata.py)
  • Added: test case for loading and merging multiple files (tests.test_loaddata.TestLoadFiletypes.test_load_exampledata_multiple_EDDYPRO_FLUXNET_CSV_30MIN)
  • Added: test case for reading EddyPro fluxnet output file with DataFileReader parameters (tests.test_loaddata.TestLoadFiletypes.test_load_exampledata_EDDYPRO_FLUXNET_CSV_30MIN_datafilereader_parameters)
  • Added: test case for resampling series to 30MIN time resolution (tests.test_time.TestTime.test_resampling_to_30MIN)
  • Added: test case for inserting timestamp with a different convention (middle, start, end) (tests.test_time.TestTime.test_insert_timestamp)
  • Added: test case for inserting timestamp as index (tests.test_time.TestTime.test_insert_timestamp_as_index)

Bugfixes

  • Fixed: bug in class DetectFrequency when inferred frequency is None (diive.core.times.times.DetectFrequency)
  • Fixed: bug in class DetectFrequency where pd.Timedelta() would crash if the input frequency string did not contain a number. Timedelta does not accept e.g. the frequency string min for minutely time resolution, even though pd.infer_freq() outputs min for data in 1-minute time resolution. Timedelta requires a number, in this case 1min. Results from infer_freq() are now checked for a number and, if none is found, 1 is added at the beginning of the frequency string. (diive.core.times.times.DetectFrequency)
  • Fixed: bug in notebook WindDirectionOffset, related to frequency detection during heatmap plotting
  • Fixed: bug in TimestampSanitizer where the script would crash if the timestamp contained an element that could not be converted to datetime, e.g., when there is a string mixed in with the regular timestamps. Data rows with invalid timestamps are now parsed as NaT by using errors='coerce' in pd.to_datetime(data.index, errors='coerce'). (diive.core.times.times.convert_timestamp_to_datetime and diive.core.times.times.TimestampSanitizer)
  • Fixed: bug when plotting heatmap (diive.core.plotting.heatmap_datetime.HeatmapDateTime)
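
The frequency string fix for DetectFrequency described above can be sketched as follows. This is only an illustration of the idea, not the code used in diive; the function name ensure_numeric_prefix is hypothetical. It uses no pandas so that it runs on its own, and the returned string is what one would then pass on to pd.Timedelta().

```python
def ensure_numeric_prefix(freqstr: str) -> str:
    """Prepend '1' if the frequency string contains no number.

    Hypothetical helper for illustration, not part of diive.
    """
    if any(char.isdigit() for char in freqstr):
        return freqstr
    return f"1{freqstr}"


# 'min' has no number and is changed, '30min' is returned unchanged
print(ensure_numeric_prefix("min"))
print(ensure_numeric_prefix("30min"))
```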

What's Changed

  • Update read csv and notebooks by @holukas in https://github.com/holukas/diive/pull/93
  • Added new and updated test cases by @holukas in https://github.com/holukas/diive/pull/94

Full Changelog: https://github.com/holukas/diive/compare/v0.73.0...v0.74.0

- Python
Published by holukas almost 2 years ago

diive - v0.73.0

v0.73.0 | 17 Apr 2024

New features

  • Added new function trim_frame that trims the start and end of a dataframe based on available records of a variable (diive.core.dfun.frames.trim_frame)
  • Added new option to export borderless heatmaps (diive.core.plotting.heatmap_base.HeatmapBase.export_borderless_heatmap)

Additions

  • Added more info in comments of class WindRotation2D (diive.pkgs.echires.windrotation.WindRotation2D)
  • Added example data for EddyPro fulloutput files (diive.configs.exampledata.load_exampledata_eddypro_fulloutput_CSV_30MIN)
  • Added code in an attempt to harmonize frequency detection from data: in class DetectFrequency the detected frequency strings are now converted from Timedelta (pandas) to offset (pandas) to .freqstr. This will yield the frequency string as seen by (the current version of) pandas. The idea is to harmonize between different representations, e.g. T or min for minutes. Currently it seems that pandas is not consistent with e.g. the representation of minutes, using T in .infer_freq() but min for Timedelta. (diive.core.times.times.DetectFrequency)

Changes

  • Updated class DataFileReader to comply with new pandas kwargs when using .read_csv() (diive.core.io.filereader.DataFileReader._parse_file)
  • Environment: updated pandas to v2.2.2 and pyarrow to v15.0.2
  • Updated date offsets in config filetypes to be compliant with pandas version 2.2+, e.g., 30T was changed to 30min. This seems to work without raising a warning. However, if the frequency is inferred from available data, the resulting frequency string shows e.g. 30T, i.e. still showing T for minutes instead of min. (diive/configs/filetypes)
  • Changed variable names in WindRotation2D to be in line with the variable names given in the paper by Wilczak et al. (2001) https://doi.org/10.1023/A:1018966204465

Removals

  • Removed function timedelta_to_string because this can be done with pandas to_offset().freqstr
  • Removed function generate_freq_str (unused)

Tests

  • Added test case for reading EddyPro fulloutput files (tests.test_loaddata.TestLoadFiletypes.test_load_exampledata_eddypro_fulloutput_CSV_30MIN)
  • Updated test for frequency detection (tests.test_timestamps.TestTime.test_detect_freq)

What's Changed

  • Adding trim frame by @holukas in https://github.com/holukas/diive/pull/81

Full Changelog: https://github.com/holukas/diive/compare/v0.72.1...v0.73.0

- Python
Published by holukas almost 2 years ago

diive - v0.72.1

v0.72.1 | 26 Mar 2024

  • pyproject.toml now uses the inequality syntax >= instead of caret syntax ^ because the version capping is restrictive and prevents compatibility in conda installations. See #74
  • Added badges in README.md
  • Smaller diive logo in README.md

What's Changed

  • Update pyproject.toml by @inkenbrandt in https://github.com/holukas/diive/pull/74
  • Minor updates by @holukas in https://github.com/holukas/diive/pull/77

Full Changelog: https://github.com/holukas/diive/compare/v0.72.0...v0.72.1

- Python
Published by holukas almost 2 years ago

diive - v0.72.0

v0.72.0 | 25 Mar 2024

New feature

  • Added new heatmap plotting class HeatmapYearMonth that plots a variable in year/month classes (diive.core.plotting.heatmap_datetime.HeatmapYearMonth)

Changes

  • Refactored code for class HeatmapDateTime (diive.core.plotting.heatmap_datetime.HeatmapDateTime)
  • Added new base class HeatmapBase for heatmap plots. Currently used by HeatmapYearMonth and HeatmapDateTime (diive.core.plotting.heatmap_base.HeatmapBase)

Notebooks

  • Added new notebook for HeatmapDateTime (notebooks/Plotting/HeatmapDateTime.ipynb)
  • Added new notebook for HeatmapYearMonth (notebooks/Plotting/HeatmapYearMonth.ipynb)

Bugfixes

  • Fixed bug in HeatmapDateTime where the last record of each day was not shown

What's Changed

  • Heatmap plot update by @holukas in https://github.com/holukas/diive/pull/75
  • Heatmap plot update by @holukas in https://github.com/holukas/diive/pull/76

Full Changelog: https://github.com/holukas/diive/compare/v0.71.6...v0.72.0

- Python
Published by holukas almost 2 years ago

diive - v0.71.6

v0.71.6 | 23 Mar 2024

Notebooks

  • Added new notebook for Percentiles (notebooks/Analyses/Percentiles.ipynb)
  • Added new notebook for LinearInterpolation (notebooks/GapFilling/LinearInterpolation.ipynb)
  • Added new notebook for calculating z-aggregates in quantiles (classes) of x and y (notebooks/Analyses/CalculateZaggregatesInQuantileClassesOfXY.ipynb)
  • Updated notebook for DaytimeNighttimeFlag (notebooks/CalculateVariable/DaytimeNighttimeFlag.ipynb)

What's Changed

  • Percentile calculation by @holukas in https://github.com/holukas/diive/pull/73

Full Changelog: https://github.com/holukas/diive/compare/v0.71.5...v0.71.6

- Python
Published by holukas almost 2 years ago

diive - v0.71.5

v0.71.5 | 22 Mar 2024

Changes

  • Updated notebook for SortingBinsMethod (diive.pkgs.analyses.decoupling.SortingBinsMethod)

Plot showing vapor pressure deficit (y) in 10 classes of short-wave incoming radiation (x), separate for 5 classes of air temperature (z). All values shown are medians of the respective variable. The shaded errorbars refer to the interquartile range for the respective class. Plot was generated using the class SortingBinsMethod.

- Python
Published by holukas almost 2 years ago

diive - v0.71.4

v0.71.4 | 20 Mar 2024

Changes

  • Refactored class LongtermAnomaliesYear (diive.core.plotting.bar.LongtermAnomaliesYear)

Notebooks

  • Added new notebook for LongtermAnomaliesYear (notebooks/Plotting/LongTermAnomalies.ipynb)

What's Changed

  • Anomaly plot by @holukas in https://github.com/holukas/diive/pull/72

Full Changelog: https://github.com/holukas/diive/compare/v0.71.3...v0.71.4

- Python
Published by holukas almost 2 years ago

diive - v0.71.3

v0.71.3 | 19 Mar 2024

Changes

  • Refactored class SortingBinsMethod: it allows investigating binned aggregates of a variable z in binned classes of x and y (see plot below). All bins now show medians and interquartile ranges. (diive.pkgs.analyses.decoupling.SortingBinsMethod)

Notebooks

  • Added new notebook for SortingBinsMethod

Bugfixes

  • Added absolute links to example notebooks in README.md

Other

  • From now on, diive is officially published on PyPI

What's Changed

  • V0.71.3 by @holukas in https://github.com/holukas/diive/pull/71

Full Changelog: https://github.com/holukas/diive/compare/v0.71.2...v0.71.3

- Python
Published by holukas almost 2 years ago

diive - v0.71.2

v0.71.2 | 18 Mar 2024

Notebooks

  • Added new notebook for daily_correlation function (notebooks/Analyses/DailyCorrelation.ipynb)
  • Added new notebook for Histogram class (notebooks/Analyses/Histogram.ipynb)

Bugfixes & changes

  • Daily correlations are now returned with daily (1d) timestamp index (diive.pkgs.analyses.correlation.daily_correlation)
  • Updated README
  • Environment: Added ruff to dev dependencies for linting

What's Changed

  • V0.71.2 by @holukas in https://github.com/holukas/diive/pull/70

Full Changelog: https://github.com/holukas/diive/compare/v0.71.1...v0.71.2

- Python
Published by holukas almost 2 years ago

diive - v0.71.1

v0.71.1 | 15 Mar 2024

Bugfixes & changes

  • Fixed: Replaced all references to old filetypes using the underscore to their respective new filetype names, e.g. all occurrences of EDDYPRO_FLUXNET_30MIN were replaced with the new name EDDYPRO-FLUXNET-30MIN.
  • Environment: Python 3.11 is now allowed in pyproject.toml: python = ">=3.9,<3.12"
  • Environment: Removed fitter library from dependencies, was not used.
  • Docs: Testing documentation generation using Sphinx, although it looks very rough at the moment.

What's Changed

  • Update pyproject.toml for compatibility with python 3.11 by @inkenbrandt in https://github.com/holukas/diive/pull/58
  • V0.71.1 by @holukas in https://github.com/holukas/diive/pull/69

New Contributors

  • @inkenbrandt made their first contribution in https://github.com/holukas/diive/pull/58

Full Changelog: https://github.com/holukas/diive/compare/v0.71.0...v0.71.1

- Python
Published by holukas almost 2 years ago

diive - v0.71.0 | High-resolution update

v0.71.0 by @holukas in https://github.com/holukas/diive/pull/66

v0.71.0 | 14 Mar 2024

High-resolution update

This update focuses on the implementation of several classes that work with high-resolution (20 Hz) data.

The main motivation behind these implementations is the upcoming new version of another script, dyco, which will make direct use of these new classes. dyco detects and removes time lags from time series data and can also handle drifting lags, i.e., lags that are not constant over time. This is especially useful for eddy covariance data, where the detection of accurate time lags is of high importance for the calculation of ecosystem fluxes.

Plot showing the covariance between the turbulent departures of vertical wind and CO2 measurements. Maximum (absolute) covariance was found at record -26, which means that the CO2 signal has to be shifted by 26 records in relation to the wind data to obtain the maximum covariance between the two variables. Since the covariance was calculated on 20 Hz data, this corresponds to a time lag of 1.3 seconds between CO2 and wind (20 Hz = measurement every 0.05 seconds, 26 * 0.05 = 1.3), or, to put it another way, the CO2 signal arrived 1.3 seconds later at the sensor than the wind signal. Maximum covariance was calculated using the MaxCovariance class.
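
The lag search described in the caption can be sketched in a few lines. This is a minimal illustration on synthetic data, not the MaxCovariance implementation; the function names covariance and find_lag, the sign convention for the shift and the synthetic signals are all assumptions made for this example. Here, a shift s pairs reference[i] with lagged[i - s], so a signal that arrives 26 records late is aligned by a shift of -26.

```python
import random


def covariance(a, b):
    """Population covariance of two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n


def find_lag(reference, lagged, max_shift):
    """Return (shift, covariance) with the largest absolute covariance.

    Hypothetical helper: a shift s pairs reference[i] with lagged[i - s].
    """
    n = len(reference)
    best_shift, best_cov = 0, 0.0
    for shift in range(-max_shift, max_shift + 1):
        # Keep only the records where both series overlap for this shift
        start, stop = max(0, shift), min(n, n + shift)
        cov = covariance(reference[start:stop],
                         lagged[start - shift:stop - shift])
        if abs(cov) > abs(best_cov):
            best_shift, best_cov = shift, cov
    return best_shift, best_cov


# Synthetic 20 Hz data: the scalar is a copy of the wind signal
# that arrives 26 records later
random.seed(0)
w = [random.gauss(0, 1) for _ in range(2000)]
co2 = [0.0] * 26 + w[:-26]

shift, cov = find_lag(reference=w, lagged=co2, max_shift=50)
lag_seconds = abs(shift) / 20  # 20 Hz = one record every 0.05 seconds
print(shift, lag_seconds)
```

The search uses the absolute covariance so that strongly negative covariances (e.g. CO2 uptake) are found as well as positive ones.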

New features

  • Added new class MaxCovariance to find the maximum covariance between two variables (diive.pkgs.echires.lag.MaxCovariance)
  • Added new class FileDetector to detect expected and unexpected files from a list of files (diive.core.io.filesdetector.FileDetector)
  • Added new class FileSplitter to split file into multiple smaller parts and export them as multiple CSV files. (diive.core.io.filesplitter.FileSplitter)
  • Added new class FileSplitterMulti to split multiple files into multiple smaller parts and save them as CSV or compressed CSV files. (diive.core.io.filesplitter.FileSplitterMulti)
  • Added new function create_timestamp that calculates the timestamp for each record in a dataframe, based on number of records in the file and the file duration. (diive.core.times.times.create_timestamp)
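
The idea behind create_timestamp, deriving a timestamp for each record from the number of records and the file duration, can be sketched as follows. This is not the diive function and does not use its signature; the name record_timestamps is hypothetical, and the sketch assumes that the file start time is known and that each timestamp marks the start of its record.

```python
from datetime import datetime, timedelta


def record_timestamps(file_start, n_records, file_duration):
    """Return one timestamp per record, evenly spaced over the file duration.

    Hypothetical helper for illustration, not part of diive.
    """
    step = file_duration / n_records
    return [file_start + i * step for i in range(n_records)]


# A 30-minute file recorded at 20 Hz contains 36000 records
timestamps = record_timestamps(
    file_start=datetime(2024, 3, 14, 12, 0),
    n_records=36000,
    file_duration=timedelta(minutes=30),
)
print(timestamps[0], timestamps[1], timestamps[-1])
```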

Additions

  • Added new filetype ETH-SONICREAD-BICO-CSVGZ-20HZ. These files contain data that were originally logged by the sonicread script, which has been in use in the ETH Grassland Sciences group since the early 2000s to record eddy covariance data within the Swiss FluxNet. Data were then converted to a regular format using the Python script bico, which also compressed the resulting CSV files to gz files (gzipped).
  • Added new filetype GENERIC-CSV-HEADER-1ROW-TS-MIDDLE-FULL-NS-30MIN, which corresponds to a CSV file with one header row with variable names, a timestamp that describes the middle of the averaging period, whereby the timestamp also includes nanoseconds. Time resolution of the file is 30MIN.

Changes

  • Renamed class TurbFlux to WindRotation2D and updated code a bit, e.g., now it is possible to get rotated values for all three wind components (u', v', w') in addition to the rotated scalar c'. (diive.pkgs.echires.windrotation.WindRotation2D)
  • Renamed filetypes: all filetypes now use the dash instead of an underscore
  • Renamed filetype to ETH-RECORD-DAT-20HZ: this filetype originates from the new eddy covariance real-time logging script rECord (currently not open source)
  • Missing values are now defined for all files as: NA_VALUES: [ -9999, -6999, -999, "nan", "NaN", "NAN", "NA", "inf", "-inf", "-" ]

- Python
Published by holukas almost 2 years ago

diive - v0.70.1

Full Changelog: https://github.com/holukas/diive/compare/v0.70.0...v0.70.1

- Python
Published by holukas almost 2 years ago

diive - v0.70.0

What's Changed

  • v0.70.0 by @holukas in https://github.com/holukas/diive/pull/54

Full Changelog: https://github.com/holukas/diive/compare/v0.69.0...v0.70.0

- Python
Published by holukas almost 2 years ago

diive - v0.69.0

What's Changed

  • Extract binary info by @holukas in https://github.com/holukas/diive/pull/48

Full Changelog: https://github.com/holukas/diive/compare/v0.68.1...v0.69.0

- Python
Published by holukas almost 2 years ago

diive - v0.68.1

What's Changed

  • fixed bugs in flux processing chain by @holukas in https://github.com/holukas/diive/pull/47

Full Changelog: https://github.com/holukas/diive/compare/v0.68.0...v0.68.1

- Python
Published by holukas about 2 years ago

diive - v0.68.0

v0.68.0

- Python
Published by holukas about 2 years ago

diive -

- Python
Published by holukas about 2 years ago

diive - v0.67.0 - Flux processing chain updates

- Python
Published by holukas about 2 years ago

diive - v0.66.0: ScatterXY plot

What's Changed

  • Indev by @holukas in https://github.com/holukas/diive/pull/36
  • Remove sphinx autodocs for now by @holukas in https://github.com/holukas/diive/pull/37
  • Add scatter plot by @holukas in https://github.com/holukas/diive/pull/41

Full Changelog: https://github.com/holukas/diive/compare/v0.64.0...v0.66.0

- Python
Published by holukas over 2 years ago

diive - v0.65.0: Harmonized daytime/nighttime flag calculation

- Python
Published by holukas over 2 years ago

diive -

- Python
Published by holukas over 2 years ago