Recent Releases of diive
diive - v0.89.0
v0.89.0 | 23 Jul 2025
Version 0.89.0 introduces a new GridAggregator class for 2D data aggregation with support for quantile,
equal-width, and custom binning methods, along with comprehensive documentation improvements and major dependency
updates including shapiq integration for enhanced analysis capabilities.
See the notebook for example usage.
Added
- New GridAggregator class for 2D grid data aggregation (diive/pkgs/analyses/gridaggregator.py)
  - Supports quantile, equal-width, and custom binning methods
  - Flexible aggregation functions
  - Comprehensive input validation and error handling
- Added unit tests covering core functionality
- Added example notebook: notebooks/Examples/GridAggregator.ipynb - demonstrates 2D data aggregation and binning
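The idea behind 2D grid aggregation can be sketched with plain pandas. This is only an illustration under stated assumptions, not the GridAggregator API: the function name aggregate_on_grid, its parameters and the column names TA, VPD and NEE are hypothetical. It bins two variables into classes (quantile or equal-width) and aggregates a third variable per grid cell.

```python
import numpy as np
import pandas as pd


def aggregate_on_grid(df, x, y, z, n_bins=2, binning="quantile", aggfunc="mean"):
    """Bin x and y into n_bins classes each and aggregate z per grid cell.

    Hypothetical helper for illustration, it does not mirror the
    GridAggregator signature.
    """
    if binning == "quantile":
        # Quantile bins: each class holds roughly the same number of records
        xbins = pd.qcut(df[x], q=n_bins, labels=False, duplicates="drop")
        ybins = pd.qcut(df[y], q=n_bins, labels=False, duplicates="drop")
    elif binning == "equal_width":
        # Equal-width bins: each class spans the same value range
        xbins = pd.cut(df[x], bins=n_bins, labels=False)
        ybins = pd.cut(df[y], bins=n_bins, labels=False)
    else:
        raise ValueError(f"Unknown binning method: {binning}")
    xbins = xbins.rename("x_bin")
    ybins = ybins.rename("y_bin")
    # Rows are y classes, columns are x classes, cells hold the aggregated z
    return df.groupby([ybins, xbins])[z].agg(aggfunc).unstack()


# Synthetic data on a regular 10 x 10 grid (illustration only)
df = pd.DataFrame({
    "TA": np.tile(np.arange(10.0), 10),
    "VPD": np.repeat(np.arange(10.0), 10),
})
df["NEE"] = df["TA"] + df["VPD"]

grid_quantile = aggregate_on_grid(df, x="TA", y="VPD", z="NEE", n_bins=2)
grid_equal = aggregate_on_grid(df, x="TA", y="VPD", z="NEE", n_bins=2,
                               binning="equal_width")
```

For the actual class, its parameters and custom binning, see the example notebook.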
Enhanced
- Improved documentation across modules
- Added detailed docstrings for methods and classes
- Updated example notebooks for better clarity
- Streamlined notebook structure in Overview
Dependencies
- Updated multiple Python dependencies to their latest versions
- Added new dependencies:
- shapiq (>=1.3.1,<2.0.0)
- galois
- networkx
- sparse-transform
Unittests
- Added unittests for dv.heatmap_xyz
- 66/66 unittests ran successfully
What's Changed
- Indev by @holukas in https://github.com/holukas/diive/pull/327
- Indev by @holukas in https://github.com/holukas/diive/pull/332
Full Changelog: https://github.com/holukas/diive/compare/v0.87.1...v0.89.0
- Python
Published by holukas 7 months ago
diive - v0.88.0
v0.88.0 | 18 Jul 2025
Heatmaps can now be plotted in horizontal orientation by setting the parameter ax_orientation='horizontal'. This
example plot shows the monthly maximum air temperature.
Changes
Heatmap updates
- There are several improvements for heatmap visualizations:
  - More consistent heatmap creation: The .heatmapdatetime(), .heatmapyearmonth() and .heatmapxyz() functions now offer a more unified experience for generating heatmaps.
  - Flexible orientation: heatmaps can now be displayed vertically or horizontally using the new parameter ax_orientation.
  - The rank plot introduced in the previous version can now be created using the parameter ranks=True when using .heatmapyearmonth().
Fyi, .heatmapdatetime() is an alias for the diive.core.plotting.heatmap_datetime.HeatmapDateTime class,
.heatmapyearmonth() is an alias for diive.core.plotting.heatmap_datetime.HeatmapYearMonth, .heatmapxyz() is an
alias for diive.core.plotting.heatmap_xyz.HeatmapXYZ. All of these classes use
diive.core.plotting.heatmap_base.HeatmapBase or diive.core.plotting.heatmap_base.HeatmapBaseXYZ as base class for
their core functionality.
Notebooks
- Updated notebook for QuantileGridAggregator (formerly CalculateZaggregatesInQuantileClassesOfXY)
- Updated notebook for HeatmapDateTime
- Updated notebook for HeatmapYearMonth
Unittests
- Updated test case for tests.test_analyses.TestAnalyses.test_quantilegridaggregator
- 56/56 unittests ran successfully
What's Changed
- Indev by @holukas in https://github.com/holukas/diive/pull/315
Full Changelog: https://github.com/holukas/diive/compare/v0.87.0...v0.88.0
- Python
Published by holukas 7 months ago
diive - v0.87.1
v0.87.1 | 12 Jun 2025
New features
- Added new function .set_exact_values_to_missing() to set specific values in a time series to missing values (diive.pkgs.corrections.setto_missing.set_exact_values_to_missing)
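Setting exact values to missing can be sketched in pandas as follows. This is a minimal illustration of the idea, not the diive implementation: the helper name set_values_to_missing and its signature are hypothetical, and the values -9999 and 0 are example placeholders.

```python
import pandas as pd


def set_values_to_missing(series: pd.Series, values: list) -> pd.Series:
    """Return a copy of series where records equal to any entry in values are NaN."""
    # Series.where keeps records where the condition is True, others become NaN
    return series.where(~series.isin(values))


series = pd.Series([21.5, -9999.0, 22.1, 0.0, 22.4])
cleaned = set_values_to_missing(series, values=[-9999.0, 0.0])
```

Only exact matches are removed, all other records stay unchanged.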
Additions
- Added parameters when plotting diel cycles:
  - Added parameter show_xticklabels for showing x-tick labels
  - Added parameter show_xlabel for showing the x-axis label
  - Added parameter show_legend for showing legend
  - (diive.core.plotting.dielcycle.DielCycle.plot)
- Similarly, added more params for plotting cumulatives (diive.core.plotting.cumulative.Cumulative)
Changes
- In .quickplot(), other rows now use the same scaling for x-axis as the plot in the first row (diive.core.plotting.plotfuncs.quickplot)
- Scaling of the y-axis is now slightly extended (by 5%) when plotting cumulatives (diive.core.plotting.cumulative.Cumulative)
Notebooks
- Updated StepwiseMeteoScreeningFromDatabase.ipynb, added new correction .set_exact_values_to_missing()
Unittests
- Added test case for .set_exact_values_to_missing() (tests.test_corrections.TestCorrections.test_settomissing)
- 56/56 unittests ran successfully
What's Changed
- Indev by @holukas in https://github.com/holukas/diive/pull/315
Full Changelog: https://github.com/holukas/diive/compare/v0.87.0...v0.87.1
- Python
Published by holukas 8 months ago
diive - v0.87.0
v0.87.0 | 17 May 2025
Heatmap rank plot
diive can now create heatmap rank plots.

Example heatmap rank plot for air temperatures. This heatmap displays the rank of average monthly air temperatures compared across different years. For instance, May 2022 had the highest average temperature among all Mays on record (rank 1), as did October 2022 for Octobers. Conversely, January 2019 recorded the lowest average temperature for January within the 26-year period shown.
Heatmap rank plots display the relative ranking of monthly aggregated values across multiple years. Essentially, they show how each month's overall value compares to the same month in other years. By default, the plot ranks the monthly mean (average) of the selected variable.
Other aggregation methods commonly used in the pandas library are possible, such as median, min, max and std,
among others.
Basic example:
import diive as dv
hm = dv.heatmapyearmonth_ranks(series=series) # Initialize instance, series is a pandas Series
hm.plot() # Generate basic plot
See the notebook here for more examples:
notebooks/Plotting/HeatmapYearMonthRank.ipynb
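The ranking behind such a plot can be sketched with plain pandas. This is an illustration under stated assumptions, not the code used by diive: the synthetic series and the variable names matrix and ranks are hypothetical. It builds a year-by-month matrix of monthly means and then ranks each month across years, with rank 1 for the highest value.

```python
import numpy as np
import pandas as pd

# Synthetic daily series covering three full years (illustration only)
index = pd.date_range("2020-01-01", "2022-12-31", freq="D")
series = pd.Series(np.arange(len(index), dtype=float), index=index, name="TA")

# Matrix of monthly aggregates: one row per year, one column per month
matrix = series.groupby([series.index.year, series.index.month]).mean().unstack()
matrix.index.name = "year"
matrix.columns.name = "month"

# Rank each month across years, rank 1 marks the highest monthly mean
ranks = matrix.rank(axis=0, ascending=False)
```

Replacing .mean() with another aggregation such as .median() or .max() ranks a different monthly statistic.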
New features
- Added new class .heatmapyearmonth_ranks() to plot monthly ranks of an aggregated value across years (diive.core.plotting.heatmap_datetime.HeatmapYearMonthRanks)
- Added new function .resample_to_monthly_agg_matrix() to calculate a matrix of monthly aggregates across years (diive.core.times.resampling.resample_to_monthly_agg_matrix)
- Added new function .transform_yearmonth_matrix_to_longform() to convert monthly aggregation matrix to long-form time series (diive.core.dfun.frames.transform_yearmonth_matrix_to_longform)
- Added new function to calculate ET (evapotranspiration in mm h-1) from LE (latent heat flux in W m-2). (diive.pkgs.createvar.conversions.et_from_le)
- Added new function to calculate latent heat of vaporization. Originally needed for calculating ET from LE. (diive.pkgs.createvar.conversions.latent_heat_of_vaporization)
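The conversion from LE to ET can be sketched as follows. This is a minimal example under stated assumptions and may differ from the diive functions: it assumes the commonly used temperature-dependent approximation for the latent heat of vaporization, lambda = (2.501 - 0.00237 * TA) MJ kg-1 with TA in °C, and the function names are hypothetical.

```python
def latent_heat_j_per_kg(ta_degc: float) -> float:
    """Latent heat of vaporization in J kg-1 for air temperature in deg C.

    Assumed approximation: (2.501 - 0.00237 * TA) MJ kg-1.
    """
    return (2.501 - 0.00237 * ta_degc) * 1e6


def le_to_et_mm_per_hour(le_wm2: float, ta_degc: float) -> float:
    """Convert latent heat flux (W m-2) to evapotranspiration (mm h-1)."""
    # W m-2 = J s-1 m-2, dividing by J kg-1 gives kg m-2 s-1 of water
    et_kg_m2_s = le_wm2 / latent_heat_j_per_kg(ta_degc)
    # 1 kg of water spread over 1 m2 is a 1 mm layer, 3600 s per hour
    return et_kg_m2_s * 3600


et = le_to_et_mm_per_hour(le_wm2=200.0, ta_degc=20.0)
```

With these assumptions, an LE of 200 W m-2 at 20 °C corresponds to an ET of roughly 0.29 mm h-1.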
Additions
- Heatmap plotting:
  - Heatmaps can now show the z-value for each rectangle in the plot, using the parameters show_values and show_values_n_dec_places. This makes more sense for data that are plotted month vs. year than for e.g. half-hourly data.
  - Simplified API to call heatmap plots: after import diive as dv, the heatmaps can now be called via dv.heatmapyearmonth() and dv.heatmapdatetime().
- SortingBinsMethod:
  - The counts per bin are now also part of the bin stats
  - Sometimes the required number of bins cannot be generated, in this case the stats for the respective bin are now skipped and the bin is missing from the output (.calcbins)
  - All parameters were renamed to better reflect what is going on
  - Added agg parameter to define aggregation method used in binning the data
  - Renamed and reworked conversion parameter, now allows conversion to z-scores in addition to percentiles
  - (diive.pkgs.analyses.decoupling.SortingBinsMethod)
- Added new filetype FLUXNET-FULLSET-HR-CSV-60MIN for reading FLUXNET files with 60MIN time resolution
Notebooks
- Added new notebook for calculating a monthly aggregation matrix (notebooks/Resampling/ResamplingMonthlyMatrix.ipynb)
- Updated notebook HeatmapDateTime
- Updated notebook HeatmapYearMonth
- Changed name of notebook ridgeline to camel-case RidgeLine
Unittests
- Added test case for .et_from_le() (tests.test_createvar.TestCreateVar.test_conversion_et_from_le)
- Added test case for .resample_to_monthly_agg_matrix(), this test also includes the transformation to long-form time series using .transform_yearmonth_matrix_to_longform() (tests.test_resampling.TestResampling.test_resample_to_monthly_agg_matrix)
- 55/55 unittests ran successfully
Environment
- diive is now using Python version 3.11 upwards
- Updated environment, poetry pyproject.toml file now has the currently used structure
What's Changed
- Et from le by @holukas in https://github.com/holukas/diive/pull/306
- Heatmap rank plot by @holukas in https://github.com/holukas/diive/pull/307
Full Changelog: https://github.com/holukas/diive/compare/v0.86.0...v0.87.0
- Python
Published by holukas 9 months ago
diive - v0.86.0
v0.86.0 | 20 Mar 2025
New features
Ridgeline plot
diive can now create ridgeline plots.

The ridgeline plot visualizes the distribution of a quantitative variable by stacking overlapping density plots,
creating a "ridged" landscape. I think this is quite pleasing to look at. With the implementation in diive, it
facilitates the comparison of distributional shapes and changes of time series data across weeks, months and years.
Ridgeline plots are quite space-efficient and hopefully visually intuitive for revealing patterns and trends in data.
This is also the first function that uses a simplified API. After importing diive, the plot can simply be accessed via
.ridgeline(). This is a shortcut to access the class RidgeLinePlot that is otherwise deeply buried in the code
here: diive.core.plotting.ridgeline.RidgeLinePlot. In the future, other classes and functions will also be
accessible via similar shortforms.
Basic example:
import diive as dv
rp = dv.ridgeline(series=series) # Initialize instance, series is a pandas Series
rp.plot() # Generate basic plot
See the notebook here for more examples:
notebooks/Plotting/ridgeline.ipynb
Additions
- Additions to the flux processing chain:
  - Added two methods to get details about training and testing when using machine-learning models in the flux processing chain: .report_traintest_model_scores() and .report_traintest_details()
  - Added parameter setflag_timeperiod to set the flag for the SSITC to another value during certain time periods, for example when a time period needs stricter filtering (e.g. due to issues with the sonic anemometer). In this case the parameter can be used to set all values where flag=1 (medium quality data) to flag=2 (bad data).
    - Example from docstring: Set flag 1 to value 2 between '2022-05-01' and '2023-09-30', and between '2024-04-02' and '2024-04-19' (dates inclusive): setflag_timeperiod={2: [ [1, '2022-05-01', '2023-09-30'], [1, '2024-04-02', '2024-04-19'] ]} (diive.pkgs.qaqc.eddyproflags.flag_ssitc_eddypro_test)
  - Added params to export some gap-filling results (e.g. model scores) to csv files (e.g., .report_gapfilling_model_scores(outpath=...))
  - (diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain)
- Added check if time series has a name when plotting heatmaps. If time series does not have a name, it is automatically assigned the name data. Implemented in class HeatmapBase that is used by all heatmap plotters. (diive.core.plotting.heatmap_base.HeatmapBase)
- Added new filetype for 60MIN EddyPro output (diive/configs/filetypes/EDDYPRO-FLUXNET-CSV-60MIN.yml)
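How a setting like setflag_timeperiod could act on a flag series can be sketched as follows. This is an illustration of the dictionary structure from the docstring example only, not the diive implementation: the helper apply_setflag_timeperiod and the synthetic flag series are hypothetical.

```python
import pandas as pd


def apply_setflag_timeperiod(flag: pd.Series, setflag_timeperiod: dict) -> pd.Series:
    """Replace flag values inside the given time periods (dates inclusive)."""
    flag = flag.copy()
    for new_value, periods in setflag_timeperiod.items():
        for old_value, start, end in periods:
            # Label-based slicing on a DatetimeIndex includes both end points
            in_period = flag.loc[start:end]
            hits = in_period.index[in_period == old_value]
            flag.loc[hits] = new_value
    return flag


index = pd.date_range("2022-04-29", "2022-05-03", freq="D")
flag = pd.Series([1, 0, 1, 1, 2], index=index)

# Set flag 1 to value 2 between 2022-05-01 and 2022-05-02
adjusted = apply_setflag_timeperiod(flag, {2: [[1, "2022-05-01", "2022-05-02"]]})
```

Records with flag 1 outside the given period keep their value, as do records with other flags inside the period.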
Notebooks
- Added notebook for ridgeline plot (notebooks/Plotting/ridgeline.ipynb)
Bugfixes
- Fixed bug where the flux processing chain would crash when a variable with the same name as one of the automatically generated variables was already present in the input data. For example, the potential radiation SW_IN_POT is generated when the flux processing chain starts and is then added to the input data. If the input data already had a variable with the same name, the processing chain would crash. Now, the automatically generated SW_IN_POT is given priority, which means the variable in the input data is overwritten. (diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain)
Environment
- Updated packages
Unittests
- 53/53 unittests ran successfully
What's Changed
- Ridgeline plot by @holukas in https://github.com/holukas/diive/pull/291
Full Changelog: https://github.com/holukas/diive/compare/v0.85.7...v0.86.0
- Python
Published by holukas 11 months ago
diive - v0.85.7
v0.85.7 | 26 Feb 2025
New features
- Added class for formatting meteo data for upload to FLUXNET (diive.pkgs.formats.meteo.FormatMeteoForFluxnetUpload)
Notebooks
- Added new notebook notebooks/Formats/FormatMeteoForFluxnetUpload.ipynb
What's Changed
- Indev by @holukas in https://github.com/holukas/diive/pull/282
Full Changelog: https://github.com/holukas/diive/compare/v0.85.6...v0.85.7
- Python
Published by holukas 12 months ago
diive - v0.85.6
v0.85.6 | 25 Feb 2025
New features
- Added class to format meteo data as input file for EddyPro flux calcs (diive.pkgs.formats.meteo.FormatMeteoForEddyProFluxProcessing)
Changes
- Updated formatting for FLUXNET upload (diive.pkgs.formats.fluxnet.FormatEddyProFluxnetFileForUpload)
- HeatmapYearMonth plot now shows every year on y-axis (diive.core.plotting.heatmap_datetime.HeatmapYearMonth)
- Improved check for excluded columns when creating lagged variants (diive.pkgs.createvar.laggedvariants.lagged_variants)
- More text output when reducing features (diive.core.ml.common.MlRegressorGapFillingBase.reduce_features)
- Fixed colorwheel running out of colors when plotting feature ranks (diive.pkgs.gapfilling.longterm.LongTermGapFillingBase.showplot_feature_ranks_per_year)
- Less text output when filling storage term (diive.pkgs.fluxprocessingchain.level31_storagecorrection.FluxStorageCorrectionSinglePointEddyPro._gapfill_storage_term)
- Smaller fixes
Notebooks
- Added new notebook notebooks/Formats/FormatMeteoForEddyProFluxProcessing.ipynb
- Updated notebook notebooks/Formats/FormatEddyProFluxnetFileForUpload.ipynb
What's Changed
- Indev by @holukas in https://github.com/holukas/diive/pull/281
Full Changelog: https://github.com/holukas/diive/compare/v0.85.5...v0.85.6
- Python
Published by holukas 12 months ago
diive - v0.85.5
v0.85.5 | 3 Feb 2025
Updates to MDS gap-filling
The community-standard MDS gap-filling method for eddy covariance ecosystem fluxes (e.g., CO2 flux) is now integrated
into the FluxProcessingChain. MDS is used during gap-filling in flux Level-4.1.
- Example notebook using MDS as part of the flux processing chain where it is used together with random forest: Flux Processing Chain
- Example notebook using MDS as stand-alone class FluxMDS: MDS gap-filling of ecosystem fluxes
The diive implementation of the MDS gap-filling method adheres to the descriptions in Reichstein et al. (2005) and
Vekuri et al. (2023), similar to the standard gap-filling procedures used by FLUXNET, ICOS, ReddyProc, and other similar
platforms. This method fills gaps by substituting missing flux values with average flux values observed under comparable
meteorological conditions.
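The core idea of filling a gap with the mean flux under comparable conditions can be sketched as follows. This is a strongly simplified illustration, not the FluxMDS implementation: the function name, its parameters and the column names SW_IN, TA, VPD and NEE are hypothetical, and the tolerances (50 W m-2, 2.5 °C, 5 hPa) and the 7-day window are the values described in Reichstein et al. (2005). The full algorithm additionally widens the time window step by step and falls back to fewer variables when no similar records are found, which is not shown here.

```python
import numpy as np
import pandas as pd


def fill_gap_with_similar_conditions(df, gap_time, flux="NEE", window_days=7,
                                     tolerances=None, min_n_vals=2):
    """Mean flux of records with similar meteo conditions near gap_time."""
    # Assumed tolerances: SW_IN in W m-2, TA in deg C, VPD in hPa
    tolerances = tolerances or {"SW_IN": 50.0, "TA": 2.5, "VPD": 5.0}
    window = pd.Timedelta(days=window_days)
    candidates = df.loc[gap_time - window:gap_time + window]
    # Keep only records with a measured flux
    candidates = candidates[candidates[flux].notna()]
    for var, tol in tolerances.items():
        # A record is similar if it differs by less than the tolerance
        similar = (candidates[var] - df.loc[gap_time, var]).abs() < tol
        candidates = candidates[similar]
    if len(candidates) < min_n_vals:
        return np.nan  # Not enough similar records, gap remains open
    return candidates[flux].mean()


# Synthetic noon records for five days, the third flux value is missing
index = pd.date_range("2024-06-01 12:00", periods=5, freq="D")
df = pd.DataFrame({
    "SW_IN": [600.0, 620.0, 610.0, 200.0, 590.0],
    "TA": [20.0, 21.0, 20.5, 12.0, 19.0],
    "VPD": [10.0, 11.0, 10.5, 3.0, 9.5],
    "NEE": [-10.0, -12.0, np.nan, -2.0, -14.0],
}, index=index)

filled_value = fill_gap_with_similar_conditions(df, gap_time=index[2])
```

In this example the overcast day (SW_IN of 200 W m-2) is excluded, and the gap is filled with the mean of the three remaining records.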

Background: different flux levels
- The class FluxProcessingChain in diive follows the flux processing steps as shown in the Flux Processing Chain outlined by Swiss FluxNet.
- The flux processing chain uses different levels for different steps in the chain:
  - Level-0: preliminary flux calculations, e.g. during the year, using EddyPro
  - Level-1: final flux calculations, e.g. for complete year, using EddyPro
  - Level-2: quality flag expansion (flagging)
  - Level-3.1: storage correction (using one point measurement only, from profile not included by default)
  - Level-3.2: outlier removal (flagging)
  - Level-3.3: USTAR filtering (constant threshold, must be known, detection process not included by default) (flagging)
  - Following Level-3.3, a comprehensive quality flag (QCF) is generated by combining individual quality flags. Prior to subsequent processing steps, low-quality data (flag=2) is removed. Medium-quality data (flag=1) can be retained if necessary, while the highest quality data (flag=0) is always kept.
  - Level-4.1: gap-filling (MDS, long-term random forest)
Changes
- Changes in FluxMDS:
  - Added parameter avg_min_n_vals in MDS gap-filling
  - Renamed tolerance parameters for MDS gap-filling to *_tol
  - (diive.pkgs.gapfilling.mds.FluxMDS)
- When reading a parquet file, sanitizing the timestamp is now optional (diive.core.io.files.load_parquet)
- The function for creating lagged variants is now found in diive.pkgs.createvar.laggedvariants.lagged_variants
Additions
- Added more text output for fill quality during gap-filling with MDS (diive.pkgs.gapfilling.mds.FluxMDS)
- Added MDS gap-filling to flux processing chain (diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain)
- Allow fitting to unbinned data (diive.pkgs.fits.fitter.BinFitterCP)
- Added parameter to edit y-label (diive.core.plotting.dielcycle.DielCycle)
- Added preliminary USTAR filtering for NEE to quick flux processing chain (diive.pkgs.fluxprocessingchain.fluxprocessingchain.QuickFluxProcessingChain)
- FileSplitter:
  - Added parameter to directly output splits as parquet files in FileSplitter and FileSplitterMulti. These two classes split longer time series files (e.g., 6 hours) into several smaller splits (e.g., 12 half-hourly files). Usage of parquet speeds up not only the splitting part, but also the process when later re-reading the files for other processing steps.
  - After splitting, missing values in the split files are numpy NaN (diive.core.io.filesplitter.FileSplitter)
- Added parameter to hide default plot when called. The method defaultplot is used e.g. by outlier detection methods to plot the data after outlier removal, to show flagged vs. unflagged values. (diive.core.base.flagbase.FlagBase.defaultplot)
- Added new filetype ETH-SONICREAD-BICO-MOD-CSV-20HZ
- Added fig property that contains the default plot for outlier removal methods. This is useful when the default plot is needed elsewhere, e.g. saved to a file. At the moment, the parameter showplot must be True for the property to be accessible. (diive.core.base.flagbase.FlagBase)
  - Example for class zScoreRolling:
    zsr = zScoreRolling(..., showplot=True, ...)
    zsr.calc(repeat=True)
    fig = zsr.fig  # Contains the figure instance
    fig.savefig(...)  # Figure can then be saved to a file etc.
Notebooks
- Added notebook example for creating lagged variants of variables (notebooks/CalculateVariable/Create_lagged_variants.ipynb)
- Updated flux processing chain notebook to v9.0: added option for MDS gap-filling, more descriptions
- Bugfix: import for loading from Path was missing in flux processing chain notebook
- Updated MDS gap-filling notebook to v1.1, added more descriptions and example for min_n_vals_nt parameter
- Updated quick flux processing chain notebook
Unittests
- Added test case tests.test_createvar.TestCreateVar.test_lagged_variants
- Updated test case tests.test_gapfilling.TestGapFilling.test_fluxmds
- Updated test case tests.test_fluxprocessingchain.TestFluxProcessingChain.test_fluxprocessingchain
- 53/53 unittests ran successfully
Bugfixes
- The setting for features that should not be lagged was not properly implemented (diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain._get_ml_feature_settings)
- Fixed bug when plotting (diive.pkgs.outlierdetection.localsd.LocalSD)
What's Changed
- Indev by @holukas in https://github.com/holukas/diive/pull/278
Full Changelog: https://github.com/holukas/diive/compare/v0.84.2...v0.85.5
- Python
Published by holukas about 1 year ago
diive - v0.84.1
v0.84.1 | 8 Nov 2024
Bugfixes
- Removed invalid imports
Tests
- Added test case for diive imports (tests.test_imports.TestImports.test_imports)
- 52/52 unittests ran successfully
What's Changed
- Hotifx imports by @holukas in https://github.com/holukas/diive/pull/236
Full Changelog: https://github.com/holukas/diive/compare/v0.84.0...v0.84.1
- Python
Published by holukas over 1 year ago
diive - v0.84.0
v0.84.0 | 7 Nov 2024
New features
- New class BinFitterCP for fitting function to binned data, includes confidence interval and prediction interval (diive.pkgs.fits.fitter.BinFitterCP)

Additions
- Added small function to detect duplicate entries in lists (diive.core.funcs.funcs.find_duplicates_in_list)
- Added new filetype (diive/configs/filetypes/ETH-MERCURY-CSV-20HZ.yml)
- Added new filetype (diive/configs/filetypes/GENERIC-CSV-HEADER-1ROW-TS-END-FULL-NS-20HZ.yml)
Bugfixes
- Not directly a bug fix, but when reading EddyPro fluxnet files with LoadEddyProOutputFiles (e.g., in the flux processing chain) duplicate columns are now automatically renamed by adding a numbered suffix. For example, if two variables are named CUSTOM_CH4_MEAN in the output file, they are automatically renamed to CUSTOM_CH4_MEAN_1 and CUSTOM_CH4_MEAN_2 (diive.core.dfun.frames.compare_len_header_vs_data)
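The renaming of duplicate columns with a numbered suffix can be sketched in plain Python. This is an illustration of the described behaviour, not the diive code: the helper rename_duplicates and the example column list are hypothetical.

```python
from collections import Counter


def rename_duplicates(names: list[str]) -> list[str]:
    """Append _1, _2, ... to every name that occurs more than once."""
    totals = Counter(names)  # How often each name occurs overall
    seen = Counter()         # How often each name was seen so far
    renamed = []
    for name in names:
        if totals[name] > 1:
            seen[name] += 1
            renamed.append(f"{name}_{seen[name]}")
        else:
            renamed.append(name)
    return renamed


columns = ["TIMESTAMP", "CUSTOM_CH4_MEAN", "FC", "CUSTOM_CH4_MEAN"]
renamed = rename_duplicates(columns)
```

Names that occur only once are left unchanged, so only the two CUSTOM_CH4_MEAN columns receive a suffix.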
Notebooks
- Added notebook example for BinFitterCP (notebooks/Fits/BinFitterCP.ipynb)
- Updated flux processing chain notebook to v8.6, import for loading EddyPro fluxnet output files was missing
Tests
- Added test case for BinFitterCP (tests.test_fits.TestFits.test_binfittercp)
- 51/51 unittests ran successfully
What's Changed
- Indev by @holukas in https://github.com/holukas/diive/pull/235
Full Changelog: https://github.com/holukas/diive/compare/v0.83.2...v0.84.0
- Python
Published by holukas over 1 year ago
diive - v0.83.2
v0.83.2 | 25 Oct 2024
From now on Python version 3.11.10 is used for developing diive (up to now, version 3.9 was used). All unittests
were successfully executed with this new Python version. In addition, all notebooks were re-run, all looked good.
JupyterLab is now included in the environment, which makes it
easier to quickly install diive (pip install diive) in an environment and directly use its notebooks, without the
need to install JupyterLab separately.
Environment
- diive will now be developed using Python version 3.11.10
- Added JupyterLab
- Added jupyter bokeh
Notebooks
- All notebooks were re-run and updated using Python version 3.11.10
Tests
- 50/50 unittests ran successfully with Python version 3.11.10
Changes
- Adjusted flags check in QCF flag report, the progressive flag must be the same as the previously calculated overall flag (diive.pkgs.qaqc.qcf.FlagQCF.report_qcf_evolution)
What's Changed
- Indev by @holukas in https://github.com/holukas/diive/pull/234
Full Changelog: https://github.com/holukas/diive/compare/v0.83.1...v0.83.2
- Python
Published by holukas over 1 year ago
diive - v0.83.1
v0.83.1 | 23 Oct 2024
Changes
- When detecting the frequency from the time delta of records, the inferred frequency is accepted if the most frequent timedelta was found for more than 50% of records (diive.core.times.times.timestamp_infer_freq_from_timedelta)
- Storage terms are now gap-filled using the rolling median in an expanding time window (FluxStorageCorrectionSinglePointEddyPro._gapfill_storage_term)
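The frequency check based on the most frequent timedelta can be sketched with pandas. This is a minimal illustration of the 50% rule described above, not the diive function: the helper name, the min_share parameter and the decision to measure the share relative to the number of timedeltas are assumptions.

```python
import pandas as pd


def infer_freq_from_timedelta(index: pd.DatetimeIndex, min_share: float = 0.5):
    """Return the most frequent timedelta if it covers more than min_share of deltas."""
    # Time difference between each record and the previous one
    deltas = pd.Series(index[1:] - index[:-1])
    counts = deltas.value_counts()
    most_frequent = counts.index[0]  # value_counts sorts by count, descending
    share = counts.iloc[0] / len(deltas)
    return most_frequent if share > min_share else None


# Half-hourly timestamps with one missing record
regular = pd.date_range("2024-01-01", periods=10, freq="30min").delete(4)
freq_regular = infer_freq_from_timedelta(regular)

# Irregular timestamps without a dominant timedelta
irregular = pd.DatetimeIndex([
    "2024-01-01 00:00", "2024-01-01 00:10", "2024-01-01 00:30",
    "2024-01-01 01:00", "2024-01-01 02:00",
])
freq_irregular = infer_freq_from_timedelta(irregular)
```

The first index yields a timedelta of 30 minutes despite the missing record, the second yields no frequency.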
Notebooks
- Added notebook example for using the flux processing chain for CH4 flux from a subcanopy eddy covariance station (notebooks/Workbench/CH-DAS_2023_FluxProcessingChain/FluxProcessingChain_NEE_CH-DAS_2023.ipynb)
Bugfixes
- Fixed info for storage term correction report to account for cases when more storage terms than flux records are available (FluxStorageCorrectionSinglePointEddyPro.report)
Tests
- 50/50 unittests ran successfully
What's Changed
- Indev by @holukas in https://github.com/holukas/diive/pull/233
Full Changelog: https://github.com/holukas/diive/compare/v0.83.0...v0.83.1
- Python
Published by holukas over 1 year ago
diive - v0.83.0
v0.83.0 | 4 Oct 2024
MDS gap-filling
Finally it is possible to use the MDS (marginal distribution sampling) gap-filling method in diive. This method is
the current default and widely used gap-filling method for eddy covariance ecosystem fluxes. For a detailed description
of the method see Reichstein et al. (2005) and Pastorello et al. (2020; full references given below).
The implementation of MDS in diive (FluxMDS) follows the description in Reichstein et al. (2005) and should
therefore yield results similar to other implementations of this algorithm. FluxMDS can also easily output model
scores, such as r2 and error values.
At the moment it is not yet possible to use FluxMDS in the flux processing chain, but during the preparation of this
update the flux processing chain code was already refactored and prepared to include FluxMDS in one of the next
updates.
At the moment, FluxMDS is specifically tailored to gap-fill ecosystem fluxes, a more general implementation (e.g., to
gap-fill meteorological data) will follow.
New features
- Added new gap-filling class FluxMDS:
  - MDS stands for marginal distribution sampling. The method uses a time window to first identify meteorological conditions (short-wave incoming radiation, air temperature and VPD) similar to those when the missing data occurred. Gaps are then filled with the mean flux in the time window.
  - FluxMDS cannot yet be used in the flux processing chain, but this will be implemented soon.
  - (diive.pkgs.gapfilling.mds.FluxMDS)
Changes
- Storage correction: By default, values missing in the storage term are now filled with a rolling mean in an expanding time window. Testing showed that the (single point) storage term is missing for 2-3% of the data, which I think is reason enough to make filling these gaps the default option. Previously, it was optional to fill the gaps using random forest, however, results were not great since only the timestamp info was used as model features. Plots generated during Level-3.1 were also updated, now better showing the storage terms (gap-filled and non-gap-filled) and the flag indicating filled values (diive.pkgs.fluxprocessingchain.level31_storagecorrection.FluxStorageCorrectionSinglePointEddyPro)
Notebooks
- Added notebook example for FluxMDS (notebooks/GapFilling/FluxMDSGapFilling.ipynb)
Tests
- Added test case for FluxMDS (tests.test_gapfilling.TestGapFilling.test_fluxmds)
- 50/50 unittests ran successfully
Bugfixes
- Fixed bug: overall quality flag QCF was not created correctly for the different USTAR scenarios (diive.core.base.identify.identify_flagcols) (diive.pkgs.qaqc.qcf.FlagQCF)
- Fixed bug: calculation of QCF flag sums is now strictly done on flag columns. Before, sums were calculated across all columns in the flags dataframe, which resulted in erroneous overall flags after USTAR filtering (diive.pkgs.qaqc.qcf.FlagQCF._calculate_flagsums)
Environment
- Added polars
References
- Pastorello, G. et al. (2020). The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data. Scientific Data, 7(1), 225. https://doi.org/10.1038/s41597-020-0534-3
- Reichstein, M., Falge, E., Baldocchi, D., Papale, D., Aubinet, M., Berbigier, P., Bernhofer, C., Buchmann, N., Gilmanov, T., Granier, A., Grunwald, T., Havrankova, K., Ilvesniemi, H., Janous, D., Knohl, A., Laurila, T., Lohila, A., Loustau, D., Matteucci, G., … Valentini, R. (2005). On the separation of net ecosystem exchange into assimilation and ecosystem respiration: Review and improved algorithm. Global Change Biology, 11(9), 1424–1439. https://doi.org/10.1111/j.1365-2486.2005.001002.x
What's Changed
- Indev by @holukas in https://github.com/holukas/diive/pull/229
Full Changelog: https://github.com/holukas/diive/compare/v0.82.1...v0.83.0
- Python
Published by holukas over 1 year ago
diive - v0.82.1
v0.82.1 | 22 Sep 2024
Notebooks
- Added notebook showing an example for LongTermGapFillingRandomForestTS (notebooks/GapFilling/LongTermRandomForestGapFilling.ipynb)
- Added notebook example for MeasurementOffset (notebooks/Corrections/MeasurementOffset.ipynb)
Tests
- Added unittest for LongTermGapFillingRandomForestTS (tests.test_gapfilling.TestGapFilling.test_gapfilling_longterm_randomforest)
- Added unittest for WindDirOffset (tests.test_corrections.TestCorrections.test_winddiroffset)
- Added unittest for DaytimeNighttimeFlag (tests.test_createvar.TestCreateVar.test_daytime_nighttime_flag)
- Added unittest for calc_vpd_from_ta_rh (tests.test_createvar.TestCreateVar.test_calc_vpd)
- Added unittest for percentiles101 (tests.test_analyses.TestAnalyses.test_percentiles)
- Added unittest for GapFinder (tests.test_analyses.TestAnalyses.test_gapfinder)
- Added unittest for SortingBinsMethod (tests.test_analyses.TestAnalyses.test_sorting_bins_method)
- Added unittest for daily_correlation (tests.test_analyses.TestAnalyses.test_daily_correlation)
- Added unittest for QuantileXYAggZ (tests.test_analyses.TestCreateVar.test_quantilexyaggz)
- 49/49 unittests ran successfully
Bugfixes
- Fixed bug that caused results from long-term gap-filling to be inconsistent despite using a fixed random state. I found the following: when reducing features across years, the removal of duplicate features from a list of found features created a list where the order of elements changed each run. This in turn produced slightly different gap-filling results each time the long-term gap-filling was executed. The Python version where this issue occurred was 3.9.19.
  - Here is a simplified example, where input_list is a list of elements with some duplicate elements:
  - Running output_list = list(set(input_list)) generates output_list where the elements would have a different output order each run. The elements were otherwise the same, only their order changed.
  - To keep the order of elements consistent it was necessary to call output_list.sort().
  - (diive.pkgs.gapfilling.longterm.LongTermGapFillingBase.reduce_features_across_years)
- Corrected wind direction could be 360°, but will now be 0° (diive.pkgs.corrections.winddiroffset.WindDirOffset._correct_degrees)
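The ordering issue with list(set(...)) can be reproduced in a few lines. The feature names below are hypothetical, and the last variant using dict.fromkeys is an alternative not mentioned in the release notes.

```python
input_list = ["TA", "VPD", "SW_IN", "TA", "VPD"]

# Set iteration order for strings depends on string hashing, which is
# randomized per interpreter process unless PYTHONHASHSEED is fixed
unordered = list(set(input_list))

# Sorting gives the same order in every run
output_list = sorted(set(input_list))

# Alternative that keeps the order of first occurrence
first_seen = list(dict.fromkeys(input_list))
```

Both the sorted list and the dict.fromkeys variant are deterministic, so features derived from them are passed to the model in the same order in every run.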
What's Changed
- Indev by @holukas in https://github.com/holukas/diive/pull/218
Full Changelog: https://github.com/holukas/diive/compare/v0.82.0...v0.82.1
- Python
Published by holukas over 1 year ago
diive - v0.82.0
v0.82.0 | 19 Sep 2024
Long-term gap-filling
It is now possible to gap-fill multi-year datasets using the class LongTermGapFillingRandomForestTS. In this approach,
data from neighboring years are pooled together before training the random forest model for gap-filling a specific year.
This is especially useful for long-term, multi-year datasets where environmental conditions and drivers might change
over years and decades.
Why random forest? Because it performed well and to me it looks like the first choice for gap-filling ecosystem fluxes, at least at the moment.
Long-term gap-filling using random forest is now also built into the flux processing chain (Level-4.1). This makes it possible to quickly gap-fill the different USTAR scenarios and to create some useful plots (I hope). See the flux processing chain notebook for what this looks like.
In a future update it will be possible to either directly switch to XGBoost for gap-filling, or to use it (and other
machine-learning models) in combination with random forest in the flux processing chain.
Example
Here is an example for a dataset containing CO2 flux (NEE) measurements from 2005 to 2023:
- for gap-filling the year 2005, the model is trained on data from 2005, 2006 and 2007 (2005 has no previous year)
- for gap-filling the year 2006, the model is trained on data from 2005, 2006 and 2007 (same model as for 2005)
- for gap-filling the year 2007, the model is trained on data from 2006, 2007 and 2008
- ...
- for gap-filling the year 2012, the model is trained on data from 2011, 2012 and 2013
- for gap-filling the year 2013, the model is trained on data from 2012, 2013 and 2014
- for gap-filling the year 2014, the model is trained on data from 2013, 2014 and 2015
- ...
- for gap-filling the year 2021, the model is trained on data from 2020, 2021 and 2022
- for gap-filling the year 2022, the model is trained on data from 2021, 2022 and 2023 (same model as for 2023)
- for gap-filling the year 2023, the model is trained on data from 2021, 2022 and 2023 (2023 has no next year)
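The pooling scheme listed above can be sketched in a few lines. This is only an illustration of the 3-year window logic, not the code used in LongTermGapFillingRandomForestTS; the function name training_years and its parameters are hypothetical, and the sketch assumes the dataset spans at least three years.

```python
def training_years(target_year: int, first_year: int, last_year: int) -> list[int]:
    """Return the three years pooled for training the model of `target_year`."""
    # Center the window on the target year, then shift it inward
    # at the first and last year of the dataset
    start = min(max(target_year - 1, first_year), last_year - 2)
    return [start, start + 1, start + 2]


for year in (2005, 2006, 2014, 2022, 2023):
    print(year, training_years(year, first_year=2005, last_year=2023))
```

With this window logic, the first two years share one model and the last two years share another, which matches the example above.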
New features
- Added new method for long-term (multiple years) gap-filling using random forest to flux processing chain (diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain.level41_gapfilling_longterm)
- Added new class for long-term (multiple years) gap-filling using random forest (diive.pkgs.gapfilling.longterm.LongTermGapFillingRandomForestTS)
- Added class for plotting cumulative sums across all data, for multiple columns (diive.core.plotting.cumulative.Cumulative)
- Added class to detect a constant offset between two measurements (diive.pkgs.corrections.measurementoffset.MeasurementOffset)
Changes
- Creating lagged variants creates gaps, which then lead to incomplete features in machine learning models. Now, gaps
are filled using simple forward and backward filling, limited to the number of values defined in lag. For example,
if variable TA is lagged by -2 values, this creates two missing values for this variant at the start of the time series,
which are then gap-filled using a simple backward fill with limit=2. (diive.core.dfun.frames.lagged_variants)
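The TA example can be reproduced with plain pandas. This sketch does not call diive.core.dfun.frames.lagged_variants; it assumes that a lag of -2 corresponds to shifting the series down by two records, and the values are made up.

```python
import pandas as pd

ta = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0], name="TA")

# Lag by two records: each row now holds the value from two records earlier,
# which leaves two missing values at the start of the series
ta_lagged = ta.shift(2)

# Fill the gaps created by the lag with a backward fill, limited to the lag size
ta_lagged_filled = ta_lagged.bfill(limit=2)

print(ta_lagged_filled.tolist())
```

The limit keeps the fill from spilling into gaps that were already present in the original data.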
Notebooks
- Updated flux processing chain notebook to include long-term gap-filling using random forest (notebooks/FluxProcessingChain/FluxProcessingChain.ipynb)
- Added new notebook for plotting cumulative sums across all data, for multiple columns (notebooks/Plotting/Cumulative.ipynb)
Tests
- Unittest for flux processing chain now includes many more methods (tests.test_fluxprocessingchain.TestFluxProcessingChain.test_fluxprocessingchain)
- 39/39 unittests ran successfully
Bugfixes
- Fixed deprecation warning in (diive.core.ml.common.prediction_scores_regr)
What's Changed
- Indev by @holukas in https://github.com/holukas/diive/pull/215
Full Changelog: https://github.com/holukas/diive/compare/v0.81.0...v0.82.0
- Python
Published by holukas over 1 year ago
diive - v0.81.0
v0.81.0 | 11 Sep 2024
Expanding Flux Processing Capabilities
This update brings advancements for post-processing eddy covariance data in the context of the FluxProcessingChain.
The goal is to offer a complete chain for post-processing ecosystem flux data, specifically designed to work seamlessly
with the standardized _fluxnet output file from the
widely-used EddyPro software.
Now, diive offers the option for USTAR filtering based on known constant thresholds across the entire dataset (similar
to the CUT scenarios in FLUXNET data). While seasonal (DJF, MAM, JJA, SON) thresholds are calculated internally,
applying them on a seasonal basis or using variable thresholds per year (like FLUXNET's VUT scenarios) isn't yet
implemented.
With this update, the FluxProcessingChain class can handle various data processing steps:
- Level-2: Quality flag expansion
- Level-3.1: Storage correction
- Level-3.2: Outlier removal
- Level-3.3: (new) USTAR filtering (with constant thresholds for now)
- (upcoming) Level-4.1: long-term gap-filling using random forest and XGBoost
- For info about the different flux levels see Swiss FluxNet flux processing chain
New features
- Added class to apply multiple known constant USTAR (friction velocity) thresholds, creating flags that indicate time periods characterized by low turbulence for multiple USTAR scenarios. The constant thresholds must be known beforehand, e.g., from an earlier USTAR detection run, or from results from FLUXNET (diive.pkgs.flux.ustarthreshold.FlagMultipleConstantUstarThresholds)
- Added class to apply one single known constant USTAR threshold (diive.pkgs.flux.ustarthreshold.FlagSingleConstantUstarThreshold)
- Added FlagMultipleConstantUstarThresholds to the flux processing chain (diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain.level33_constant_ustar)
- Added USTAR detection algorithm based on Papale et al., 2006 (diive.pkgs.flux.ustarthreshold.UstarDetectionMPT)
- Added function to analyze high-quality ecosystem fluxes that helps in understanding the range of highest-quality data (diive.pkgs.flux.hqflux.analyze_highest_quality_flux)
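To illustrate the idea behind constant USTAR thresholds for multiple scenarios, here is a minimal sketch. It is not the FlagMultipleConstantUstarThresholds class; the function name, the scenario labels, the threshold values and the flag values (0 = OK, 2 = low turbulence) are all assumptions made for this example.

```python
def flag_low_turbulence(ustar, thresholds):
    """Return one flag list per scenario: 2 = USTAR below threshold, 0 = OK."""
    flags = {}
    for scenario, threshold in thresholds.items():
        # The same constant threshold is applied to all records of the dataset
        flags[scenario] = [2 if u < threshold else 0 for u in ustar]
    return flags


# Hypothetical friction velocities (m s-1) and scenario thresholds
ustar = [0.02, 0.06, 0.08, 0.25]
thresholds = {"CUT_16": 0.05, "CUT_50": 0.07, "CUT_84": 0.10}

for scenario, flag in flag_low_turbulence(ustar, thresholds).items():
    print(scenario, flag)
```

Each scenario produces its own flag, so the downstream steps can be run once per scenario.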
Additions
- LocalSD outlier detection can now use a constant SD:
  - Added parameter to use standard deviation across all data (constant) instead of the rolling SD to calculate the upper and lower limits that define outliers in the median rolling window (diive.pkgs.outlierdetection.localsd.LocalSD)
  - Added to step-wise outlier detection (diive.pkgs.outlierdetection.stepwiseoutlierdetection.StepwiseOutlierDetection.flag_outliers_localsd_test)
  - Added to meteoscreening from database (diive.pkgs.qaqc.meteoscreening.StepwiseMeteoScreeningDb.flag_outliers_localsd_test)
  - Added to flux processing chain (diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain.level32_flag_outliers_localsd_test)
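The difference between rolling and constant SD can be sketched as follows. This is a simplified stand-in for LocalSD, not its implementation: the function name, the centered window, the use of the sample standard deviation and all parameter names are assumptions, and the window is expected to span at least three records.

```python
import statistics


def localsd_flags(values, window=5, n_sd=2.0, constant_sd=False):
    """Flag values outside rolling median +/- n_sd * SD (True = outlier)."""
    half = window // 2
    overall_sd = statistics.stdev(values)  # SD across all data
    flags = []
    for i, value in enumerate(values):
        neighbors = values[max(0, i - half): i + half + 1]
        center = statistics.median(neighbors)
        # Either one constant SD for all records, or the SD of the current window
        sd = overall_sd if constant_sd else statistics.stdev(neighbors)
        flags.append(abs(value - center) > n_sd * sd)
    return flags


values = [10.0, 10.2, 9.9, 10.1, 25.0, 10.0, 9.8, 10.1, 10.0]
flags_constant = localsd_flags(values, constant_sd=True)
flags_rolling = localsd_flags(values, constant_sd=False)
print(flags_constant)
print(flags_rolling)
```

With a constant SD the limits have the same width everywhere, whereas the rolling SD widens the limits wherever the window itself contains a spike.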
Changes
- Replaced .plot_date() from the Matplotlib library with .plot() due to deprecation
Notebooks
- Added notebook for plotting cumulative sums per year (notebooks/Plotting/CumulativesPerYear.ipynb)
- Added notebook for removing outliers based on the z-score in rolling time window (notebooks/OutlierDetection/zScoreRolling.ipynb)
Bugfixes
- Fixed bug when saving a pandas Series to parquet (diive.core.io.files.save_parquet)
- Fixed bug when plotting doy_mean_cumulative: no longer crashes when years defined in parameter excl_years_from_reference are not in dataset (diive.core.times.times.doy_mean_cumulative)
- Fixed deprecation warning when plotting in bokeh (interactive plots)
Tests
- Added unittest for LocalSD using constant SD (tests.test_outlierdetection.TestOutlierDetection.test_localsd_with_constantsd)
- Added unittest for rolling z-score outlier removal (tests.test_outlierdetection.TestOutlierDetection.test_zscore_rolling)
- Improved check if figure and axis were created in (tests.test_plots.TestPlots.test_histogram)
- 39/39 unittests ran successfully
Environment
- Added new package scikit-optimize
- Added new package category_encoders
What's Changed
- Indev by @holukas in https://github.com/holukas/diive/pull/205
Full Changelog: https://github.com/holukas/diive/compare/v0.80.0...v0.81.0
- Python
Published by holukas over 1 year ago
diive - v0.80.0
v0.80.0 | 28 Aug 2024
Additions
- Added outlier tests to step-wise meteoscreening from database: Hampel, HampelDaytimeNighttime and TrimLow (diive.pkgs.qaqc.meteoscreening.StepwiseMeteoScreeningDb)
- Added parameter to control whether or not to output the middle timestamp when loading parquet files with load_parquet(). By default, output_middle_timestamp=True. (diive.core.io.files.load_parquet)
Environment
- Re-created environment and created new lock file
- Currently using Python 3.9.19
Notebooks
- Added new notebook for creating a flag that indicates missing values (notebooks/OutlierDetection/MissingValues.ipynb)
- Updated notebook for meteoscreening from database (notebooks/MeteoScreening/StepwiseMeteoScreeningFromDatabase.ipynb)
- Updated notebook for loading and saving parquet files (notebooks/Formats/LoadSaveParquetFile.ipynb)
Tests
- Added unittest for flagging missing values (tests.test_outlierdetection.TestOutlierDetection.test_missing_values)
- 37/37 unittests ran successfully
Bugfixes
- Fixed links in README, needed absolute links to notebooks
- Fixed issue with return list in (diive.pkgs.analyses.histogram.Histogram.peakbins)
What's Changed
- Meteoscreening updates by @holukas in https://github.com/holukas/diive/pull/184
Full Changelog: https://github.com/holukas/diive/compare/v0.79.1...v0.80.0
- Python
Published by holukas over 1 year ago
diive - v0.79.1
v0.79.1 | 26 Aug 2024
Additions
- Added new function to apply quality flags to certain time periods only (diive.pkgs.qaqc.flags.restrict_application)
- Added option to restrict the application of the angle-of-attack flag to certain time periods (diive.pkgs.fluxprocessingchain.level2_qualityflags.FluxQualityFlagsEddyPro.angle_of_attack_test)
Changes
- Test options in FluxProcessingChain are now always passed as dict. This has the advantage that, in addition to running the test by setting the dict key apply to True, various other test settings can be passed, for example the new parameter application dates for the angle-of-attack flag. (diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain)
Tests
- Added unittest for Flux Processing Chain up to Level-2 (tests.test_fluxprocessingchain.TestFluxProcessingChain.test_fluxprocessingchain_level2)
- 36/36 unittests ran successfully
What's Changed
- Time periods ec flags by @holukas in https://github.com/holukas/diive/pull/179
Full Changelog: https://github.com/holukas/diive/compare/v0.79.0...v0.79.1
- Python
Published by holukas over 1 year ago
diive - v0.79.0
v0.79.0 | 22 Aug 2024
This version introduces a histogram plot that has the option to display z-score as vertical lines superimposed on the distribution, which helps in assessing z-score settings used by some outlier removal functions.

Histogram plot of half-hourly air temperature measurements at the ICOS Class 1 ecosystem station Davos between 2013 and 2022, displayed in 20 equally-spaced bins. The dashed vertical lines show the z-score and the corresponding value calculated based on the time series. The bin with most counts is highlighted orange.
New features
- Added new class HistogramPlot for plotting histograms, based on the Matplotlib implementation (diive.core.plotting.histogram.HistogramPlot)
- Added function to calculate the value for a specific z-score, e.g., based on a time series it calculates the value where z-score = 3 etc. (diive.core.funcs.funcs.val_from_zscore)
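The relation between a z-score and the corresponding data value is value = mean + z * SD. The sketch below shows this with the standard library only. It is not val_from_zscore itself; the function name is hypothetical, and using the population standard deviation is an assumption made here to get round numbers.

```python
import statistics


def value_at_zscore(values, zscore):
    """Return the data value that corresponds to `zscore` for the given series."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population SD
    return mean + zscore * sd


values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # mean 5, population SD 2
upper = value_at_zscore(values, 3)
lower = value_at_zscore(values, -3)
print(lower, upper)
```

Drawn as vertical lines on a histogram, such values show directly how much of the distribution a given z-score threshold would cut off.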
Additions
- Added histogram plots to FlagBase, histograms are now shown for all outlier methods (diive.core.base.flagbase.FlagBase.defaultplot)
- Added daytime/nighttime histogram plots to (diive.pkgs.outlierdetection.hampel.HampelDaytimeNighttime)
- Added daytime/nighttime histogram plots to (diive.pkgs.outlierdetection.zscore.zScoreDaytimeNighttime)
- Added daytime/nighttime histogram plots to (diive.pkgs.outlierdetection.lof.LocalOutlierFactorDaytimeNighttime)
- Added daytime/nighttime histogram plots to (diive.pkgs.outlierdetection.absolutelimits.AbsoluteLimitsDaytimeNighttime)
- Added option to calculate the z-score with sign instead of absolute (diive.core.funcs.funcs.zscore)
Changes
- Improved daytime/nighttime outlier plot used by various outlier removal classes (diive.core.base.flagbase.FlagBase.plot_outlier_daytime_nighttime)
Notebooks
- Added notebook for plotting histograms (notebooks/Plotting/Histogram.ipynb)
- Added notebook for manual removal of data points (notebooks/OutlierDetection/ManualRemoval.ipynb)
- Added notebook for outlier detection using local outlier factor, separately during daytime and nighttime (notebooks/OutlierDetection/LocalOutlierFactorDaytimeNighttime.ipynb)
- Updated notebook (notebooks/OutlierDetection/HampelDaytimeNighttime.ipynb)
- Updated notebook (notebooks/OutlierDetection/AbsoluteLimitsDaytimeNighttime.ipynb)
- Updated notebook (notebooks/OutlierDetection/zScoreDaytimeNighttime.ipynb)
- Updated notebook (notebooks/OutlierDetection/LocalOutlierFactorAllData.ipynb)
Tests
- Added unittest for plotting histograms (tests.test_plots.TestPlots.test_histogram)
- Added unittest for calculating histograms (without plotting) (tests.test_analyses.TestCreateVar.test_histogram)
What's Changed
- v0.79.0 by @holukas in https://github.com/holukas/diive/pull/176
Full Changelog: https://github.com/holukas/diive/compare/v0.78.1.1...v0.79.0
- Python
Published by holukas over 1 year ago
diive - v0.78.1
v0.78.1 | 19 Aug 2024
Changes
- Added option to set different n_sigma for daytime and nighttime data in HampelDaytimeNighttime (diive.pkgs.outlierdetection.hampel.HampelDaytimeNighttime)
- Updated flag_outliers_hampel_dtnt_test in step-wise outlier detection
- Updated level32_flag_outliers_hampel_dtnt_test in flux processing chain
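The effect of separate n_sigma values can be sketched with a basic Hampel filter (rolling median plus scaled median absolute deviation). This is a simplified illustration, not the HampelDaytimeNighttime implementation: the window here runs across all records and only the threshold differs between daytime and nighttime, and the function and parameter names are hypothetical.

```python
import statistics

MAD_SCALE = 1.4826  # scales the MAD to the SD of normally distributed data


def hampel_flags(values, is_daytime, n_sigma_dt, n_sigma_nt, window=5):
    """Flag outliers (True) using separate daytime and nighttime thresholds."""
    half = window // 2
    flags = []
    for i, value in enumerate(values):
        neighbors = values[max(0, i - half): i + half + 1]
        median = statistics.median(neighbors)
        mad = statistics.median([abs(v - median) for v in neighbors])
        # Pick the threshold that belongs to the record's time of day
        n_sigma = n_sigma_dt if is_daytime[i] else n_sigma_nt
        flags.append(abs(value - median) > n_sigma * MAD_SCALE * mad)
    return flags


values = [1.0, 1.1, 0.9, 3.0, 1.0, 1.2, 0.8, 1.0, 3.0, 1.1, 0.9, 1.0]
is_daytime = [True] * 6 + [False] * 6

# Lenient during daytime, strict during nighttime
flags = hampel_flags(values, is_daytime, n_sigma_dt=25.0, n_sigma_nt=3.0)
print([i for i, flagged in enumerate(flags) if flagged])
```

The two spikes have the same magnitude, but only the nighttime spike is flagged because the daytime threshold is much more lenient.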
Notebooks
- Updated notebook HampelDaytimeNighttime
- Updated notebook FluxProcessingChain
Tests
- Updated unittest test_hampel_filter_daytime_nighttime
What's Changed
- v0.78.1 by @holukas in https://github.com/holukas/diive/pull/168
Full Changelog: https://github.com/holukas/diive/compare/v0.78.0...v0.78.1
- Python
Published by holukas over 1 year ago
diive - v0.78.0
v0.78.0 | 18 Aug 2024
New features
- Added new class for outlier removal, based on the rolling z-score. It can also be used in step-wise outlier detection and during meteoscreening from the database. (diive.pkgs.outlierdetection.zscore.zScoreRolling, diive.pkgs.outlierdetection.stepwiseoutlierdetection.StepwiseOutlierDetection, diive.pkgs.qaqc.meteoscreening.StepwiseMeteoScreeningDb)
- Added Hampel filter for outlier removal (diive.pkgs.outlierdetection.hampel.Hampel)
- Added Hampel filter (separate daytime, nighttime) for outlier removal (diive.pkgs.outlierdetection.hampel.HampelDaytimeNighttime)
- Added function to plot daytime and nighttime outliers during outlier tests (diive.core.plotting.outlier_dtnt.outlier_daytime_nighttime)
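A rolling z-score compares each record with the mean and standard deviation of its surrounding window instead of the whole series. The following sketch illustrates the principle only; it is not the zScoreRolling class, and the centered window, the sample standard deviation and all names are assumptions.

```python
import statistics


def rolling_zscore_flags(values, window=9, threshold=2.0):
    """Flag records (True) whose z-score within the rolling window exceeds the threshold."""
    half = window // 2
    flags = []
    for i, value in enumerate(values):
        neighbors = values[max(0, i - half): i + half + 1]
        mean = statistics.mean(neighbors)
        sd = statistics.stdev(neighbors)
        # Guard against windows without any variation
        z = abs(value - mean) / sd if sd > 0 else 0.0
        flags.append(z > threshold)
    return flags


values = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 30.0,
          10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1]
flags = rolling_zscore_flags(values)
print([i for i, flagged in enumerate(flags) if flagged])
```

Note that the window size limits the largest possible z-score when the record itself is part of the window, so the threshold and the window size need to be chosen together.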
Changes
- Flux processing chain:
  - Several changes to the flux processing chain to make sure it can also work with data files not directly output by EddyPro. The class FluxProcessingChain can now handle files that have a different format than the two EddyPro output files EDDYPRO-FLUXNET-CSV-30MIN and EDDYPRO-FULL-OUTPUT-CSV-30MIN. See following notes.
  - Removed option to process EddyPro _full_output_ files, since it is an older format and its variables do not follow FLUXNET conventions.
  - Removed keyword filetype in class FluxProcessingChain. It is now assumed that the variable names follow the FLUXNET convention. Variables used in FLUXNET are listed here (diive.pkgs.fluxprocessingchain.fluxprocessingchain.FluxProcessingChain)
  - When detecting the base variable from which a flux variable was calculated, the variables defined for filetype EDDYPRO-FLUXNET-CSV-30MIN are now assumed by default. (diive.pkgs.flux.common.detect_basevar)
  - Renamed function that detects the base variable that was used to calculate the respective flux (diive.pkgs.flux.common.detect_fluxbasevar)
  - Renamed gas in functions related to completeness tests to fluxbasevar to better reflect that the completeness test does not necessarily require a gas (e.g. T_SONIC is used to calculate the completeness for sensible heat flux) (flag_fluxbasevar_completeness_eddypro_test)
- Removing the radiation offset now uses 0.001 (W m-2) instead of 50 as the threshold value to flag nighttime values for the correction (diive.pkgs.corrections.offsetcorrection.remove_radiation_zero_offset)
- The database tag for meteo data screened with diive is now meteoscreening_diive (diive.pkgs.qaqc.meteoscreening.StepwiseMeteoScreeningDb.resample)
- During noise generation, function now uses the absolute values of the min/max of a series to calculate minimum noise and maximum noise (diive.pkgs.createvar.noise.add_impulse_noise)
Notebooks
- Added new notebook for outlier detection using class zScore (notebooks/OutlierDetection/zScore.ipynb)
- Added new notebook for outlier detection using class zScoreDaytimeNighttime (notebooks/OutlierDetection/zScoreDaytimeNighttime.ipynb)
- Added new notebook for outlier removal using trimming (notebooks/OutlierDetection/TrimLow.ipynb)
- Updated notebook (notebooks/MeteoScreening/StepwiseMeteoScreeningFromDatabase_v7.0.ipynb)
- When uploading screened meteo data to the database using the notebook StepwiseMeteoScreeningFromDatabase, variables with the same name, measurement and data version as the screened variable(s) are now deleted from the database before the new data are uploaded. Implemented in the Python package dbc-influxdb to avoid duplicates in the database. Such duplicates can occur when one of the tags of an otherwise identical variable changed, e.g., when one of the tags of the originally uploaded data was wrong and needed correction. The database InfluxDB stores a new time series alongside the previous time series when one of the tags is different in an otherwise identical time series.
Tests
- Added test case for Hampel filter (tests.test_outlierdetection.TestOutlierDetection.test_hampel_filter)
- Added test case for HampelDaytimeNighttime filter (tests.test_outlierdetection.TestOutlierDetection.test_hampel_filter_daytime_nighttime)
- Added test case for zScore (tests.test_outlierdetection.TestOutlierDetection.test_zscore)
- Added test case for TrimLow (tests.test_outlierdetection.TestOutlierDetection.test_trim_low_nt)
- Added test case for zScoreDaytimeNighttime (tests.test_outlierdetection.TestOutlierDetection.test_zscore_daytime_nighttime)
- 33/33 unittests ran successfully
Environment
- Added package sktime, a unified framework for machine learning with time series.
What's Changed
- v0.78.0 by @holukas in https://github.com/holukas/diive/pull/161
Full Changelog: https://github.com/holukas/diive/compare/v0.77.0...v0.78.0
- Python
Published by holukas over 1 year ago
diive - v0.77.0
v0.77.0 | 11 Jun 2024
Additions
- Plotting cumulatives with CumulativeYear now also shows the cumulative for the reference, i.e. for the mean over the reference years (diive.core.plotting.cumulative.CumulativeYear)
- Plotting DielCycle now accepts ylim parameter (diive.core.plotting.dielcycle.DielCycle)
- Added long-term dataset for local testing purposes (internal only) (diive.configs.exampledata.load_exampledata_parquet_long)
- Added several classes in preparation for long-term gap-filling for a future update
Changes
- Several updates and changes to the base class for regressor decision trees (diive.core.ml.common.MlRegressorGapFillingBase):
  - The data are now split into training set and test set at the very start of regressor setup. This test set is used to evaluate models on unseen data. The default split is 80% training and 20% test data.
  - Plotting (scores, importances etc.) is now generally separated from the method where they are calculated.
  - The same random_state is now used for all processing steps
  - Refactored code
  - Beautified console output
- When correcting for relative humidity values above 100%, the maximum of the corrected time series is now set to 100, after the (daily) offset was removed (diive.pkgs.corrections.offsetcorrection.remove_relativehumidity_offset)
- During feature reduction in machine learning regressors, features with permutation importance < 0 are now always removed (diive.core.ml.common.MlRegressorGapFillingBase._remove_rejected_features)
- Changed default parameters for quick random forest gap-filling (diive.pkgs.gapfilling.randomforest_ts.QuickFillRFTS)
- I tried to improve the console output (clarity) for several functions and methods
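The 80/20 split with a shared random_state mentioned for MlRegressorGapFillingBase can be illustrated with the standard library. This is only a sketch of the idea; the function split_train_test is hypothetical and is not how the base class performs the split.

```python
import random


def split_train_test(records, test_size=0.2, random_state=42):
    """Shuffle reproducibly, then split into training and test records."""
    rng = random.Random(random_state)  # same seed, same split
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]


records = list(range(100))
train, test = split_train_test(records)
print(len(train), len(test))
```

Because the test records are set aside before any model is trained, scores calculated on them reflect performance on unseen data.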
Environment
- Added package dtreeviz to visualize decision trees
Notebooks
- Updated notebook (notebooks/GapFilling/RandomForestGapFilling.ipynb)
- Updated notebook (notebooks/GapFilling/LinearInterpolation.ipynb)
- Updated notebook (notebooks/GapFilling/XGBoostGapFillingExtensive.ipynb)
- Updated notebook (notebooks/GapFilling/XGBoostGapFillingMinimal.ipynb)
- Updated notebook (notebooks/GapFilling/RandomForestParamOptimization.ipynb)
- Updated notebook (notebooks/GapFilling/QuickRandomForestGapFilling.ipynb)
Tests
- Updated and fixed test case (tests.test_outlierdetection.TestOutlierDetection.test_zscore_increments)
- Updated and fixed test case (tests.test_gapfilling.TestGapFilling.test_gapfilling_randomforest)
What's Changed
- Ml long term gap filling by @holukas in https://github.com/holukas/diive/pull/128
Full Changelog: https://github.com/holukas/diive/compare/v0.76.2...v0.77.0
- Python
Published by holukas over 1 year ago
diive - v0.76.2
v0.76.2 | 23 May 2024
Additions
- Added function to calculate absolute double differences of a time series, which is the sum of absolute differences between a data record and its preceding and next record. Used in class zScoreIncrements for finding (isolated) outliers that are distant from neighboring records. (diive.core.dfun.stats.double_diff_absolute)
- Added small function to calculate z-score stats of a time series (diive.core.dfun.stats.sstats_zscore)
- Added small function to calculate stats for absolute double differences of a time series (diive.core.dfun.stats.sstats_doublediff_abs)
Changes
- Changed the algorithm for outlier detection when using zScoreIncrements. Data points are now flagged as outliers if the z-scores of three absolute differences (previous record, next record and the sum of both) all exceed a specified threshold. (diive.pkgs.outlierdetection.incremental.zScoreIncrements)
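The three-difference rule can be sketched as follows. This is an illustration of the described logic, not the zScoreIncrements code: the function names are hypothetical, the z-scores use the population standard deviation, and the first and last record are simply left unflagged because they lack one neighbor.

```python
import statistics


def zscores(values):
    """Return the signed z-score of each value."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def increment_outliers(series, threshold=1.5):
    """Flag records whose three absolute-difference z-scores all exceed the threshold."""
    inner = range(1, len(series) - 1)
    diff_prev = [abs(series[i] - series[i - 1]) for i in inner]
    diff_next = [abs(series[i] - series[i + 1]) for i in inner]
    diff_sum = [p + n for p, n in zip(diff_prev, diff_next)]  # absolute double difference
    z_prev, z_next, z_sum = zscores(diff_prev), zscores(diff_next), zscores(diff_sum)
    flags = [False] * len(series)
    for k, i in enumerate(inner):
        flags[i] = z_prev[k] > threshold and z_next[k] > threshold and z_sum[k] > threshold
    return flags


series = [1.0, 1.1, 1.0, 1.2, 1.1, 9.0, 1.0, 1.1, 1.2, 1.0, 1.1, 1.0]
flags = increment_outliers(series)
print([i for i, flagged in enumerate(flags) if flagged])
```

Requiring all three z-scores to exceed the threshold keeps the direct neighbors of a spike from being flagged, since each of them has only one large difference.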
Notebooks
- Added new notebook for outlier detection using class LocalOutlierFactorAllData (notebooks/OutlierDetection/LocalOutlierFactorAllData.ipynb)
Tests
- Added new test case for LocalOutlierFactorAllData (tests.test_outlierdetection.TestOutlierDetection.test_lof_alldata)
What's Changed
- More stats by @holukas in https://github.com/holukas/diive/pull/116
Full Changelog: https://github.com/holukas/diive/compare/v0.76.1...v0.76.2
- Python
Published by holukas over 1 year ago
diive - v0.76.1
v0.76.1 | 17 May 2024
Additions
- It is now possible to set a fixed random seed when creating impulse noise (diive.pkgs.createvar.noise.add_impulse_noise)
Changes
- In class zScoreIncrements, outliers are now detected by calculating the sum of the absolute differences between a data point and its respective preceding and next data point. Before, only the non-absolute difference of the preceding data point was considered. The sum of absolute differences is then used to calculate the z-score and in further consequence to flag outliers. (diive.pkgs.outlierdetection.incremental.zScoreIncrements)
Notebooks
- Added new notebook for outlier detection using class zScoreIncrements (notebooks/OutlierDetection/zScoreIncremental.ipynb)
- Added new notebook for outlier detection using class LocalSD (notebooks/OutlierDetection/LocalSD.ipynb)
Tests
- Added new test case for zScoreIncrements (tests.test_outlierdetection.TestOutlierDetection.test_zscore_increments)
- Added new test case for LocalSD (tests.test_outlierdetection.TestOutlierDetection.test_localsd)
What's Changed
- Added more notebooks and test cases by @holukas in https://github.com/holukas/diive/pull/108
Full Changelog: https://github.com/holukas/diive/compare/v0.76.0...v0.76.1
- Python
Published by holukas almost 2 years ago
diive - v0.76.0
v0.76.0 | 14 May 2024
Diel cycle plot
The new class DielCycle makes it possible to plot diel cycles per month or across all data for time series data. At the
moment, it plots the (monthly) diel cycles as means (+/- standard deviation). It makes use of the time info contained in
the datetime timestamp index of the data. All aggregates are calculated by grouping data by time and (optionally)
separately for each month. The diel cycles have the same time resolution as the time component of the timestamp index, e.g. hourly.
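The grouping behind a diel cycle can be sketched with the standard library. This is not the diel_cycle function or the DielCycle class; the function below is a hypothetical stand-in that groups records by time of day (and optionally month) and returns mean and sample standard deviation per group.

```python
import statistics
from collections import defaultdict
from datetime import datetime, time, timedelta


def diel_cycle_sketch(timestamps, values, by_month=True):
    """Return {(month, time): (mean, sd)}; month is None when not grouped by month."""
    groups = defaultdict(list)
    for ts, value in zip(timestamps, values):
        key = (ts.month if by_month else None, ts.time())
        groups[key].append(value)
    return {
        key: (statistics.mean(vals), statistics.stdev(vals) if len(vals) > 1 else 0.0)
        for key, vals in groups.items()
    }


# Two days of made-up hourly data in May; the second day is 2 units higher
start = datetime(2024, 5, 1)
timestamps = [start + timedelta(hours=i) for i in range(48)]
values = [ts.hour + 2.0 * (ts.day - 1) for ts in timestamps]

cycle = diel_cycle_sketch(timestamps, values)
noon_mean, noon_sd = cycle[(5, time(12, 0))]
print(noon_mean, noon_sd)
```

Since the grouping key is the time component of the timestamp, hourly data yield 24 groups per month and half-hourly data would yield 48.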

New features
- Added new class DielCycle for plotting diel cycles per month (diive.core.plotting.dielcycle.DielCycle)
- Added new function diel_cycle for calculating diel cycles per month. This function is also used by the plotting class DielCycle (diive.core.times.resampling.diel_cycle)
Additions
- Added color scheme that contains 12 colors, one for each month. Not perfect, but better than before. (diive.core.plotting.styles.LightTheme.colors_12_months)
Notebooks
- Added new notebook for plotting diel cycles (per month) (notebooks/Plotting/DielCycle.ipynb)
- Added new notebook for calculating diel cycles (per month) (notebooks/Resampling/ResamplingDielCycle.ipynb)
Tests
- Added test case for new function diel_cycle (tests.test_resampling.TestResampling.test_diel_cycle)
What's Changed
- Diel cycle plot by @holukas in https://github.com/holukas/diive/pull/107
Full Changelog: https://github.com/holukas/diive/compare/v0.75.0...v0.76.0
- Python
Published by holukas almost 2 years ago
diive - v0.75.0
v0.75.0 | 26 Apr 2024
XGBoost gap-filling
XGBoost can now be used to fill gaps in time series data.
In diive, XGBoost is implemented in class XGBoostTS, which adds additional options for easily including e.g.
lagged variants of feature variables, timestamp info (DOY, month, ...) and a continuous record number. It also allows
direct feature reduction by including a purely random feature (consisting of completely random numbers) and calculating
the 'permutation importance'. All features where the permutation importance is lower than for the random feature can
then be removed from the dataset, i.e., the list of features, before building the final model.
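The feature reduction step described above boils down to comparing each feature's permutation importance with that of the random feature. A minimal sketch of that comparison follows; it does not compute permutation importances, the importance values are made up, and the function name and the feature name .RANDOM are hypothetical.

```python
def reduce_features(importances, random_feature=".RANDOM"):
    """Keep features whose permutation importance is not lower than the random feature's."""
    baseline = importances[random_feature]
    return [
        name
        for name, importance in importances.items()
        if name != random_feature and importance >= baseline
    ]


# Made-up permutation importances, including the purely random feature
importances = {"TA": 0.42, "SW_IN": 0.61, "VPD": 0.18, ".RANDOM": 0.02, "DOY": 0.01}
print(reduce_features(importances))
```

A feature that does not beat random numbers carries no usable information for the model, which is why it can be dropped before building the final model.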
XGBoostTS and RandomForestTS both use the same base class MlRegressorGapFillingBase. This base class will also
facilitate the implementation of other gap-filling algorithms in the future.
Another fun (for me) addition is the new class TimeSince. It calculates the time since the last occurrence of
specific conditions. One example where this class can be useful is the calculation of 'time since last precipitation',
expressed as number of records, which can be helpful in identifying dry conditions. More examples: 'time since freezing
conditions' based on air temperature; 'time since management' based on management info, e.g. fertilization events.
Please see the notebook for some illustrative examples.
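The counting logic behind 'time since last precipitation' can be sketched like this. It is not the TimeSince class: the function name, the inclusive limits and the use of None before the first occurrence are assumptions made for this example.

```python
def time_since(values, lower, upper):
    """Count records since `values` was last inside the range [lower, upper]."""
    counts = []
    counter = None  # undefined until the condition occurs for the first time
    for value in values:
        if lower <= value <= upper:
            counter = 0  # condition met: restart counting
        elif counter is not None:
            counter += 1
        counts.append(counter)
    return counts


# Made-up precipitation records (mm); any value of at least 0.1 counts as rain
precip = [0.0, 1.2, 0.0, 0.0, 0.4, 0.0, 0.0, 0.0]
print(time_since(precip, lower=0.1, upper=float("inf")))
```

The counter is expressed in number of records, so its meaning in hours or days depends on the time resolution of the data.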
Please note that diive is still under development and bugs can be expected.
New features
- Added gap-filling class XGBoostTS for time series data, using XGBoost (diive.pkgs.gapfilling.xgboost_ts.XGBoostTS)
- Added new class TimeSince: counts number of records (incremental number / counter) since the last time a time series was inside a specified range, useful for e.g. counting the time since last precipitation, since last freezing temperature, etc. (diive.pkgs.createvar.timesince.TimeSince)
Additions
- Added base class for machine learning regressors, which is basically the code shared between the different methods. At the moment used by RandomForestTS and XGBoostTS. (diive.core.ml.common.MlRegressorGapFillingBase)
- Added option to change line color directly in TimeSeries plots (diive.core.plotting.timeseries.TimeSeries.plot)
Notebooks
- Added new notebook for gap-filling using XGBoostTS with minimal settings (notebooks/GapFilling/XGBoostGapFillingMinimal.ipynb)
- Added new notebook for gap-filling using XGBoostTS with more extensive settings (notebooks/GapFilling/XGBoostGapFillingExtensive.ipynb)
- Added new notebook for creating TimeSince variables (notebooks/CalculateVariable/TimeSince.ipynb)
Tests
- Added test case for XGBoost gap-filling (tests.test_gapfilling.TestGapFilling.test_gapfilling_xgboost)
- Updated test case for random forest gap-filling (tests.test_gapfilling.TestGapFilling.test_gapfilling_randomforest)
- Harmonized test case for XGBoostTS with test case of RandomForestTS
- Added test case for TimeSince variable creation (tests.test_createvar.TestCreateVar.test_timesince)
What's Changed
- Adding xgboost by @holukas in https://github.com/holukas/diive/pull/102
Full Changelog: https://github.com/holukas/diive/compare/v0.74.1...v0.75.0
- Python
Published by holukas almost 2 years ago
diive - v0.74.1
v0.74.1 | 23 Apr 2024
This update adds the first notebooks (and tests) for outlier detection methods. Only two tests are included so far and
both tests are relatively simple, but both notebooks already show in principle how outlier removal is handled. An
important aspect is that diive single outlier methods do not remove outliers by default, but instead a flag is created
that shows where the outliers are located. The flag can then be used to remove the data points.
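The flag-then-remove approach can be illustrated with an absolute limits check. This sketch is not the diive implementation; the function names and the flag values (0 = OK, 2 = outlier) are assumptions made for this example.

```python
def flag_absolute_limits(values, minimum, maximum):
    """Return one flag per record: 0 = within limits, 2 = outside limits."""
    return [0 if minimum <= v <= maximum else 2 for v in values]


def apply_flag(values, flags):
    """Return a copy where flagged records are set to None; the input stays untouched."""
    return [v if f == 0 else None for v, f in zip(values, flags)]


# Made-up air temperatures (degC) with two implausible records
ta = [12.1, 13.0, 99.9, 12.8, -60.0]
flags = flag_absolute_limits(ta, minimum=-20, maximum=40)
ta_filtered = apply_flag(ta, flags)
print(flags)
print(ta_filtered)
```

Keeping the flag separate from the data means the original time series is preserved and flags from several tests can be inspected or combined before anything is removed.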
This update also includes the addition of a small function that creates artificial spikes in time series data and is
therefore very useful for testing outlier detection methods.
More outlier removal notebooks will be added in the future, including a notebook that shows how to combine results from
multiple outlier tests into one single overall outlier flag.
New features
- Added: new function to add impulse noise to time series (diive.pkgs.createvar.noise.impulse)
Notebooks
- Added: new notebook for outlier detection: absolute limits, separately for daytime and nighttime data (notebooks/OutlierDetection/AbsoluteLimitsDaytimeNighttime.ipynb)
- Added: new notebook for outlier detection: absolute limits (notebooks/OutlierDetection/AbsoluteLimits.ipynb)
Tests
- Added: test case for outlier detection: absolute limits, separately for daytime and nighttime data (tests.test_outlierdetection.TestOutlierDetection.test_absolute_limits)
- Added: test case for outlier detection: absolute limits (tests.test_outlierdetection.TestOutlierDetection.test_absolute_limits)
What's Changed
- Outlier notebooks by @holukas in https://github.com/holukas/diive/pull/95
- Update README.md by @inkenbrandt in https://github.com/holukas/diive/pull/86
- Update pyproject.toml by @inkenbrandt in https://github.com/holukas/diive/pull/85
Full Changelog: https://github.com/holukas/diive/compare/v0.74.0...v0.74.1
- Python
Published by holukas almost 2 years ago
diive - v0.74.0
v0.74.0 | 21 Apr 2024
Additions
- Added: new function to remove rows that do not have timestamp info (NaT) (diive.core.times.times.remove_rows_nat and diive.core.times.times.TimestampSanitizer)
- Added: new settings VARNAMES_ROW and VARUNITS_ROW in filetypes YAML files, allows better and more specific configuration when reading data files (diive/configs/filetypes)
- Added: many (small) example data files for various filetypes, e.g. ETH-RECORD-TOA5-CSVGZ-20HZ
- Added: new optional check in TimestampSanitizer that compares the detected time resolution of a time series with the nominal (expected) time resolution. Runs automatically when reading files with ReadFileType, in which case the FREQUENCY from the filetype configs is used as the nominal time resolution. (diive.core.times.times.TimestampSanitizer, diive.core.io.filereader.ReadFileType)
- Added: application of TimestampSanitizer after inserting a timestamp and setting it as index with function insert_timestamp, this makes sure the freq/freqstr info is available for the new timestamp index (diive.core.times.times.insert_timestamp)
Notebooks
- General: Ran all notebook examples to make sure they work with this version of diive
- Added: new notebook for reading EddyPro fluxnet output file with DataFileReader parameters (notebooks/ReadFiles/Read_single_EddyPro_fluxnet_output_file_with_DataFileReader.ipynb)
- Added: new notebook for reading EddyPro fluxnet output file with ReadFileType and pre-defined filetype EDDYPRO-FLUXNET-CSV-30MIN (notebooks/ReadFiles/Read_single_EddyPro_fluxnet_output_file_with_ReadFileType.ipynb)
- Added: new notebook for reading multiple EddyPro fluxnet output files with MultiDataFileReader and pre-defined filetype EDDYPRO-FLUXNET-CSV-30MIN (notebooks/ReadFiles/Read_multiple_EddyPro_fluxnet_output_files_with_MultiDataFileReader.ipynb)
Changes
- Renamed: function get_len_header to parse_header (diive.core.dfun.frames.parse_header)
- Renamed: exampledata files (diive/configs/exampledata)
- Renamed: filetypes YAML files to always include the file extension in the file name (diive/configs/filetypes)
- Reduced: file size for most example data files
Tests
- Added: various test cases for loading filetypes (tests/test_loaddata.py)
- Added: test case for loading and merging multiple files (tests.test_loaddata.TestLoadFiletypes.test_load_exampledata_multiple_EDDYPRO_FLUXNET_CSV_30MIN)
- Added: test case for reading EddyPro fluxnet output file with DataFileReader parameters (tests.test_loaddata.TestLoadFiletypes.test_load_exampledata_EDDYPRO_FLUXNET_CSV_30MIN_datafilereader_parameters)
- Added: test case for resampling series to 30MIN time resolution (tests.test_time.TestTime.test_resampling_to_30MIN)
- Added: test case for inserting timestamp with a different convention (middle, start, end) (tests.test_time.TestTime.test_insert_timestamp)
- Added: test case for inserting timestamp as index (tests.test_time.TestTime.test_insert_timestamp_as_index)
Bugfixes
- Fixed: bug in class `DetectFrequency` when inferred frequency is `None` (`diive.core.times.times.DetectFrequency`)
- Fixed: bug in class `DetectFrequency` where `pd.Timedelta()` would crash if the input frequency does not have a number. `Timedelta` does not accept e.g. the frequency string `min` for minutely time resolution, even though e.g. `pd.infer_freq()` outputs `min` for data in 1-minute time resolution. `Timedelta` requires a number, in this case `1min`. Results from `infer_freq()` are now checked for a number and, if none is found, `1` is added at the beginning of the frequency string. (`diive.core.times.times.DetectFrequency`)
- Fixed: bug in notebook `WindDirectionOffset`, related to frequency detection during heatmap plotting
- Fixed: bug in `TimestampSanitizer` where the script would crash if the timestamp contained an element that could not be converted to datetime, e.g., when there is a string mixed in with the regular timestamps. Data rows with invalid timestamps are now parsed as `NaT` by using `errors='coerce'` in `pd.to_datetime(data.index, errors='coerce')`. (`diive.core.times.times.convert_timestamp_to_datetime` and `diive.core.times.times.TimestampSanitizer`)
- Fixed: bug when plotting heatmap (`diive.core.plotting.heatmap_datetime.HeatmapDateTime`)
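The frequency-string fix described for `DetectFrequency` can be illustrated with a minimal sketch. The helper name `ensure_freq_has_number` is hypothetical and is not taken from the diive code base; it only mirrors the described behaviour (check for a number, otherwise prepend `1`) using the standard library.

```python
import re


def ensure_freq_has_number(freq):
    """Prepend '1' to a frequency string that contains no number.

    Hypothetical helper for illustration only, not diive's implementation.
    """
    if freq is None:
        # No frequency could be inferred; leave the decision to the caller.
        return None
    if re.search(r"\d", freq):
        # The string already carries a number, e.g. '30min'.
        return freq
    return f"1{freq}"


print(ensure_freq_has_number("min"))    # 1min
print(ensure_freq_has_number("30min"))  # 30min
```

A string normalized this way (e.g. `1min`) is of the form that the changelog entry says `Timedelta` requires.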
What's Changed
- Update read csv and notebooks by @holukas in https://github.com/holukas/diive/pull/93
- Added new and updated test cases by @holukas in https://github.com/holukas/diive/pull/94
Full Changelog: https://github.com/holukas/diive/compare/v0.73.0...v0.74.0
- Python
Published by holukas almost 2 years ago
diive - v0.73.0
v0.73.0 | 17 Apr 2024
New features
- Added new function `trim_frame` that trims the start and end of a dataframe based on the available records of a variable (`diive.core.dfun.frames.trim_frame`)
- Added new option to export borderless heatmaps (`diive.core.plotting.heatmap_base.HeatmapBase.export_borderless_heatmap`)
Additions
- Added more info in comments of class `WindRotation2D` (`diive.pkgs.echires.windrotation.WindRotation2D`)
- Added example data for EddyPro fulloutput files (`diive.configs.exampledata.load_exampledata_eddypro_fulloutput_CSV_30MIN`)
- Added code in an attempt to harmonize frequency detection from data: in class `DetectFrequency` the detected frequency strings are now converted from `Timedelta` (pandas) to `offset` (pandas) to `.freqstr`. This will yield the frequency string as seen by (the current version of) pandas. The idea is to harmonize between different representations, e.g. `T` or `min` for minutes. Currently it seems that pandas is not consistent with e.g. the representation of minutes, using `T` in `.infer_freq()` but `min` for `Timedelta` (see here). (`diive.core.times.times.DetectFrequency`)
Changes
- Updated class `DataFileReader` to comply with new `pandas` kwargs when using `.read_csv()` (`diive.core.io.filereader.DataFileReader._parse_file`)
- Environment: updated `pandas` to v2.2.2 and `pyarrow` to v15.0.2
- Updated date offsets in config filetypes to be compliant with `pandas` version 2.2+ (see here and here), e.g., `30T` was changed to `30min`. This seems to work without raising a warning, however, if frequency is inferred from available data, the resulting frequency string shows e.g. `30T`, i.e. still showing `T` for minutes instead of `min`. (`diive/configs/filetypes`)
- Changed variable names in `WindRotation2D` to be in line with the variable names given in the paper by Wilczak et al. (2001) https://doi.org/10.1023/A:1018966204465
Removals
- Removed function `timedelta_to_string` because this can be done with pandas `to_offset().freqstr`
- Removed function `generate_freq_str` (unused)
Tests
- Added test case for reading EddyPro fulloutput files (`tests.test_loaddata.TestLoadFiletypes.test_load_exampledata_eddypro_fulloutput_CSV_30MIN`)
- Updated test for frequency detection (`tests.test_timestamps.TestTime.test_detect_freq`)
What's Changed
- Adding trim frame by @holukas in https://github.com/holukas/diive/pull/81
Full Changelog: https://github.com/holukas/diive/compare/v0.72.1...v0.73.0
- Python
Published by holukas almost 2 years ago
diive - v0.72.1
v0.72.1 | 26 Mar 2024
- `pyproject.toml` now uses the inequality syntax `>=` instead of caret syntax `^` because the version capping is restrictive and prevents compatibility in conda installations. See #74
- Added badges in `README.md`
- Smaller `diive` logo in `README.md`
What's Changed
- Update pyproject.toml by @inkenbrandt in https://github.com/holukas/diive/pull/74
- Minor updates by @holukas in https://github.com/holukas/diive/pull/77
Full Changelog: https://github.com/holukas/diive/compare/v0.72.0...v0.72.1
- Python
Published by holukas almost 2 years ago
diive - v0.72.0
v0.72.0 | 25 Mar 2024
New feature
- Added new heatmap plotting class `HeatmapYearMonth` that plots a variable in year/month classes (`diive.core.plotting.heatmap_datetime.HeatmapYearMonth`)

Changes
- Refactored code for class `HeatmapDateTime` (`diive.core.plotting.heatmap_datetime.HeatmapDateTime`)
- Added new base class `HeatmapBase` for heatmap plots. Currently used by `HeatmapYearMonth` and `HeatmapDateTime` (`diive.core.plotting.heatmap_base.HeatmapBase`)
Notebooks
- Added new notebook for `HeatmapDateTime` (`notebooks/Plotting/HeatmapDateTime.ipynb`)
- Added new notebook for `HeatmapYearMonth` (`notebooks/Plotting/HeatmapYearMonth.ipynb`)
Bugfixes
- Fixed bug in `HeatmapDateTime` where the last record of each day was not shown
What's Changed
- Heatmap plot update by @holukas in https://github.com/holukas/diive/pull/75
- Heatmap plot update by @holukas in https://github.com/holukas/diive/pull/76
Full Changelog: https://github.com/holukas/diive/compare/v0.71.6...v0.72.0
- Python
Published by holukas almost 2 years ago
diive - v0.71.6
v0.71.6 | 23 Mar 2024

Notebooks
- Added new notebook for `Percentiles` (`notebooks/Analyses/Percentiles.ipynb`)
- Added new notebook for `LinearInterpolation` (`notebooks/GapFilling/LinearInterpolation.ipynb`)
- Added new notebook for calculating z-aggregates in quantiles (classes) of x and y (`notebooks/Analyses/CalculateZaggregatesInQuantileClassesOfXY.ipynb`)
- Updated notebook for `DaytimeNighttimeFlag` (`notebooks/CalculateVariable/DaytimeNighttimeFlag.ipynb`)
What's Changed
- Percentile calculation by @holukas in https://github.com/holukas/diive/pull/73
Full Changelog: https://github.com/holukas/diive/compare/v0.71.5...v0.71.6
- Python
Published by holukas almost 2 years ago
diive - v0.71.5
v0.71.5 | 22 Mar 2024
Changes
- Updated notebook for `SortingBinsMethod` (`diive.pkgs.analyses.decoupling.SortingBinsMethod`)
Plot showing vapor pressure deficit (y) in 10 classes of short-wave incoming radiation (x), separate for 5 classes of
air temperature (z). All values shown are medians of the respective variable. The shaded errorbars refer to the
interquartile range for the respective class. Plot was generated using the class SortingBinsMethod.
- Python
Published by holukas almost 2 years ago
diive - v0.71.4
v0.71.4 | 20 Mar 2024
Changes
- Refactored class `LongtermAnomaliesYear` (`diive.core.plotting.bar.LongtermAnomaliesYear`)

Notebooks
- Added new notebook for `LongtermAnomaliesYear` (`notebooks/Plotting/LongTermAnomalies.ipynb`)
What's Changed
- Anomaly plot by @holukas in https://github.com/holukas/diive/pull/72
Full Changelog: https://github.com/holukas/diive/compare/v0.71.3...v0.71.4
- Python
Published by holukas almost 2 years ago
diive - v0.71.3
v0.71.3 | 19 Mar 2024
Changes
- Refactored class `SortingBinsMethod`: it allows investigating binned aggregates of a variable z in binned classes of x and y (see plot below). All bins now show medians and interquartile ranges. (`diive.pkgs.analyses.decoupling.SortingBinsMethod`)
Notebooks
- Added new notebook for `SortingBinsMethod`
Bugfixes
- Added absolute links to example notebooks in `README.md`
Other
- From now on, `diive` is officially published on PyPI
What's Changed
- V0.71.3 by @holukas in https://github.com/holukas/diive/pull/71
Full Changelog: https://github.com/holukas/diive/compare/v0.71.2...v0.71.3
- Python
Published by holukas almost 2 years ago
diive - v0.71.2
v0.71.2 | 18 Mar 2024
Notebooks
- Added new notebook for `daily_correlation` function (`notebooks/Analyses/DailyCorrelation.ipynb`)
- Added new notebook for `Histogram` class (`notebooks/Analyses/Histogram.ipynb`)
Bugfixes & changes
- Daily correlations are now returned with daily (`1d`) timestamp index (`diive.pkgs.analyses.correlation.daily_correlation`)
- Updated README
- Environment: Added ruff to dev dependencies for linting
What's Changed
- V0.71.2 by @holukas in https://github.com/holukas/diive/pull/70
Full Changelog: https://github.com/holukas/diive/compare/v0.71.1...v0.71.2
- Python
Published by holukas almost 2 years ago
diive - v0.71.1
v0.71.1 | 15 Mar 2024
Bugfixes & changes
- Fixed: Replaced all references to old filetypes using the underscore with their respective new filetype names, e.g. all occurrences of `EDDYPRO_FLUXNET_30MIN` were replaced with the new name `EDDYPRO-FLUXNET-30MIN`.
- Environment: Python 3.11 is now allowed in `pyproject.toml`: `python = ">=3.9,<3.12"`
- Environment: Removed `fitter` library from dependencies because it was not used.
- Docs: Testing documentation generation using Sphinx, although it looks very rough at the moment.
What's Changed
- Update pyproject.toml for compatibility with python 3.11 by @inkenbrandt in https://github.com/holukas/diive/pull/58
- V0.71.1 by @holukas in https://github.com/holukas/diive/pull/69
New Contributors
- @inkenbrandt made their first contribution in https://github.com/holukas/diive/pull/58
Full Changelog: https://github.com/holukas/diive/compare/v0.71.0...v0.71.1
- Python
Published by holukas almost 2 years ago
diive - v0.71.0 | High-resolution update
v0.71.0 by @holukas in https://github.com/holukas/diive/pull/66
v0.71.0 | 14 Mar 2024
High-resolution update
This update focuses on the implementation of several classes that work with high-resolution (20 Hz) data.
The main motivation behind these implementations is the upcoming new version of another script, dyco, which will make direct use of these new classes. dyco detects and removes time lags from time series data and can also handle drifting lags, i.e., lags that are not constant over time. This is especially useful for eddy covariance data, where the detection of accurate time lags is of high importance for the calculation of ecosystem fluxes.
Plot showing the covariance between the turbulent departures of vertical wind and CO2 measurements.
Maximum (absolute) covariance was found at record -26, which means that the CO2 signal has to be shifted
by 26 records in relation to the wind data to obtain the maximum covariance between the two variables.
Since the covariance was calculated on 20 Hz data, this corresponds to a time lag of 1.3 seconds
between CO2 and wind (20 Hz = measurement every 0.05 seconds, 26 * 0.05 = 1.3), or, to put it
another way, the CO2 signal arrived 1.3 seconds later at the sensor than the wind signal. Maximum
covariance was calculated using the MaxCovariance class.
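The lag arithmetic in the caption can be reproduced with a small standard-library sketch. This is not the `MaxCovariance` implementation; the function names, the synthetic data, and the sign convention (a negative shift means the scalar signal arrives later than the wind signal) are assumptions made for illustration.

```python
import random


def covariance(a, b):
    """Population covariance of two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n


def lag_of_max_covariance(wind, scalar, max_lag):
    """Return (shift, covariance) for the shift with the largest absolute covariance.

    For a given shift, wind[i] is paired with scalar[i - shift].
    """
    n = len(wind)
    best_shift, best_cov = 0, 0.0
    for shift in range(-max_lag, max_lag + 1):
        # Keep only the overlapping part of both series for this shift.
        if shift < 0:
            w, s = wind[: n + shift], scalar[-shift:]
        else:
            w, s = wind[shift:], scalar[: n - shift]
        cov = covariance(w, s)
        if abs(cov) > abs(best_cov):
            best_shift, best_cov = shift, cov
    return best_shift, best_cov


def records_to_seconds(n_records, sampling_rate_hz):
    """Convert a lag in records to seconds, e.g. 20 Hz = one record every 0.05 s."""
    return n_records / sampling_rate_hz


# Synthetic example: the scalar is the wind signal delayed by 3 records.
rng = random.Random(0)
wind = [rng.gauss(0.0, 1.0) for _ in range(500)]
delay = 3
scalar = [0.0] * delay + wind[:-delay]

shift, cov = lag_of_max_covariance(wind, scalar, max_lag=10)
print(shift, records_to_seconds(abs(shift), 20))
print(records_to_seconds(26, 20))  # 1.3
```

The last line corresponds to the caption: a lag of 26 records at 20 Hz equals 1.3 seconds.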
New features
- Added new class `MaxCovariance` to find the maximum covariance between two variables (`diive.pkgs.echires.lag.MaxCovariance`)
- Added new class `FileDetector` to detect expected and unexpected files from a list of files (`diive.core.io.filesdetector.FileDetector`)
- Added new class `FileSplitter` to split a file into multiple smaller parts and export them as multiple CSV files. (`diive.core.io.filesplitter.FileSplitter`)
- Added new class `FileSplitterMulti` to split multiple files into multiple smaller parts and save them as CSV or compressed CSV files. (`diive.core.io.filesplitter.FileSplitterMulti`)
- Added new function `create_timestamp` that calculates the timestamp for each record in a dataframe, based on number of records in the file and the file duration. (`diive.core.times.times.create_timestamp`)
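The idea behind deriving per-record timestamps from the number of records and the file duration can be sketched as follows. This is not the `create_timestamp` function from diive: the name `create_record_timestamps`, its parameters, and the convention that the first record sits exactly at the file start are assumptions for illustration.

```python
from datetime import datetime, timedelta


def create_record_timestamps(file_start, n_records, file_duration_s):
    """Spread n_records evenly over file_duration_s seconds, starting at file_start.

    Each offset is computed from the record index directly, so rounding
    errors do not accumulate from one record to the next.
    """
    return [
        file_start + timedelta(seconds=i * file_duration_s / n_records)
        for i in range(n_records)
    ]


# A 30-minute file recorded at 20 Hz contains 36000 records, 0.05 s apart.
stamps = create_record_timestamps(datetime(2024, 3, 14, 12, 0), 36000, 1800)
print(stamps[0], stamps[1], stamps[-1])
```

With this convention the last record falls one time step before the end of the file period.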
Additions
- Added new filetype `ETH-SONICREAD-BICO-CSVGZ-20HZ`. These files contain data that were originally logged by the `sonicread` script, which has been in use in the ETH Grassland Sciences group since the early 2000s to record eddy covariance data within the Swiss FluxNet. Data were then converted to a regular format using the Python script bico, which also compressed the resulting CSV files to `gz` files (gzipped).
- Added new filetype `GENERIC-CSV-HEADER-1ROW-TS-MIDDLE-FULL-NS-30MIN`, which corresponds to a CSV file with one header row with variable names and a timestamp that describes the middle of the averaging period, whereby the timestamp also includes nanoseconds. Time resolution of the file is 30MIN.
Changes
- Renamed class `TurbFlux` to `WindRotation2D` and updated code a bit, e.g., now it is possible to get rotated values for all three wind components (`u'`, `v'`, `w'`) in addition to the rotated scalar `c'`. (`diive.pkgs.echires.windrotation.WindRotation2D`)
- Renamed filetypes: all filetypes now use the dash instead of an underscore
- Renamed filetype to `ETH-RECORD-DAT-20HZ`: this filetype originates from the new eddy covariance real-time logging script `rECord` (currently not open source)
- Missing values are now defined for all files as: `NA_VALUES: [ -9999, -6999, -999, "nan", "NaN", "NAN", "NA", "inf", "-inf", "-" ]`
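As a rough illustration of what such a missing-values list means when parsing raw text cells, here is a standard-library sketch. It does not show how diive applies `NA_VALUES`; the set `NA_TOKENS` and the helper `parse_cell` are hypothetical, and the comparison is purely textual (so a cell such as `-9999.0` would not be matched here).

```python
import math

# Tokens treated as missing, written as they would appear in raw text cells.
NA_TOKENS = {"-9999", "-6999", "-999", "nan", "NaN", "NAN", "NA", "inf", "-inf", "-"}


def parse_cell(text):
    """Convert one raw CSV cell to float, mapping missing-value tokens to NaN."""
    token = text.strip()
    if token in NA_TOKENS:
        return math.nan
    return float(token)


row = ["12.5", "-9999", "NA", "3.0", "-"]
values = [parse_cell(cell) for cell in row]
print(values)
```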
- Python
Published by holukas almost 2 years ago
diive - v0.66.0: ScatterXY plot
What's Changed
- Indev by @holukas in https://github.com/holukas/diive/pull/36
- Remove sphinx autodocs for now by @holukas in https://github.com/holukas/diive/pull/37
- Add scatter plot by @holukas in https://github.com/holukas/diive/pull/41
Full Changelog: https://github.com/holukas/diive/compare/v0.64.0...v0.66.0
- Python
Published by holukas over 2 years ago
diive - v0.65.0: Harmonized daytime/nighttime flag calculation
- Python
Published by holukas over 2 years ago