xr_fresh: Automated Time Series Feature Extraction for Remote Sensing & Gridded Data

Published in JOSS (2025)

https://github.com/mmann1123/xr_fresh

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org, zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software
Last synced: 4 months ago

Repository

A faster way to raster time-series feature utopia

Statistics
  • Stars: 8
  • Watchers: 2
  • Forks: 2
  • Open Issues: 7
  • Releases: 14
Created almost 6 years ago · Last pushed 4 months ago
Metadata Files
Readme Contributing License Code of conduct

README.md

xr_fresh


xr_fresh is designed to quickly generate a broad set of temporal features from gridded raster data time series. NOTE: xr_fresh only works on single-band images, not multiband images (e.g. RGB). If you have multiband images, split the bands into individual files first.

The main operations of this package are:

  1. Extract features from raster data (e.g. mean, minimum, maximum, complexity, skewness, etc.)
  2. Interpolate missing values in time series data
  3. Calculate PCA components from raster data

Why use xr_fresh?

  1. It is designed to be fast and efficient, using parallel processing to speed up the feature extraction process.
  2. It can run on a GPU, if available, to further speed up the process.
  3. It radically improves feature generation speed over tabular extracts that use tsfresh as the backend (comparison here).
  4. You really should use Google Earth Engine less!

Time Series Feature Extraction

The package can extract a wide range of features from raster data time series. The features are calculated for each pixel in the raster stack, and the output is a raster file with the same shape as the input stack.
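The per-pixel idea can be illustrated with plain NumPy. This is only a sketch on a synthetic array, not xr_fresh's implementation: the `stack` values are made up, and the feature names in the dictionary are chosen for illustration. Each feature collapses the time axis and leaves one value per pixel.

```python
import numpy as np

# Synthetic stack with shape (time, rows, cols); values are invented for illustration.
rng = np.random.default_rng(0)
stack = rng.normal(loc=300.0, scale=50.0, size=(12, 4, 5))

# Each feature reduces the time axis (axis 0), leaving one value per pixel.
features = {
    "mean": stack.mean(axis=0),
    "minimum": stack.min(axis=0),
    "maximum": stack.max(axis=0),
    "abs_energy": (stack ** 2).sum(axis=0),  # sum of squared values over time
}

for name, layer in features.items():
    print(name, layer.shape)
```

Every layer has the same rows and columns as the input images, which is why each feature can be written out as its own raster.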


Interpolation of Missing Values

The package can also interpolate missing values in raster data time series. The interpolation is done on a pixel-by-pixel basis, and the output is a raster stack with the same shape as the input stack. The interpolation is parallelized with JAX so that it runs faster on a CPU or GPU.

Example of dummy data with missing values (figure: data with missing values)

Example of interpolated data (figure: interpolated data values)
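To make the pixel-by-pixel idea concrete, here is a minimal NumPy sketch of gap filling along the time axis. It assumes linear interpolation and evenly spaced time steps, and it uses NumPy rather than JAX, so it is a stand-in for the concept and not the package's own routine. The helper names `interpolate_pixel` and `interpolate_stack` are hypothetical.

```python
import numpy as np

def interpolate_pixel(series):
    """Linearly fill NaN gaps in one pixel's time series (hypothetical helper)."""
    series = np.asarray(series, dtype=float)
    t = np.arange(series.size)
    valid = ~np.isnan(series)
    if not valid.any():
        return series  # nothing to interpolate from
    # np.interp holds the first/last valid value for gaps at the ends.
    return np.interp(t, t[valid], series[valid])

def interpolate_stack(stack):
    """Apply interpolate_pixel along the time axis of a (time, rows, cols) array."""
    return np.apply_along_axis(interpolate_pixel, 0, stack)

# Tiny stack of shape (3, 1, 2) with the middle time step missing.
stack = np.array([[[1.0, 10.0]], [[np.nan, np.nan]], [[3.0, 30.0]]])
filled = interpolate_stack(stack)
print(filled[:, 0, 0])
```

The output keeps the input's shape, matching the description above: one filled series per pixel.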

Principal Component Analysis (PCA) on Raster Data

The package can also calculate PCA components from stacks of raster data, and can be used to generate new rasters with the PCA components as bands.

Example outputs from PCA components for African precipitation (ppt) data:

(figure: principal component outputs)
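The following NumPy sketch shows one way a raster stack can be turned into component bands. It treats each pixel as a sample and each time step as a variable, then uses an SVD. The synthetic data and variable names are assumptions for illustration, and the package's own PCA (which relies on ray) may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(42)
stack = rng.normal(size=(6, 8, 10))  # synthetic (time, rows, cols) data

n_time, n_rows, n_cols = stack.shape
# Treat each pixel as a sample and each time step as a variable.
X = stack.reshape(n_time, -1).T          # shape (pixels, time)
X_centered = X - X.mean(axis=0)

# SVD of the centered data: the rows of Vt are the principal axes.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

n_components = 3
scores = X_centered @ Vt[:n_components].T  # shape (pixels, components)

# Reshape so that each component becomes one band of an output raster.
pca_bands = scores.T.reshape(n_components, n_rows, n_cols)
explained = (S ** 2) / np.sum(S ** 2)      # share of variance per component

print(pca_bands.shape)
print(explained[:n_components])
```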

Install

To install xr_fresh, you can use pip. However, since xr_fresh includes a C++ extension module, it must be compiled during installation. Here are the steps to install xr_fresh:

Prerequisites

  • Conda or mamba Instructions here
  • C++ compiler (e.g., g++ on Linux, clang on macOS, or MSVC on Windows)

Linux, macOS & Windows Install

```
# add dependency
conda create -n xr_fresh geowombat -c conda-forge
conda activate xr_fresh

# clone repository
cd # to desired location
git clone https://github.com/mmann1123/xr_fresh
cd xr_fresh
pip install -U pip setuptools wheel
pip install .
```

Note: If you run into problems related to `rle`, try running `python setup.py build_ext --inplace` from the `xr_fresh` directory.

To run PCA you must also install ray.

`conda install -c conda-forge "ray-default"`

Note: ray is only in beta for Windows and will not be installed by default. Please read more about the installation here

Example

Simple working example

``` python
import os
from glob import glob
import geowombat as gw
from xr_fresh.feature_calculator_series import *

# get list of evi time series images
# expanduser is needed because os.chdir does not expand "~"
os.chdir(os.path.expanduser("~/xr_fresh/xr_fresh/data"))
files = sorted(glob("tests/data/evi*.tif"))

out_path = "evi_longest_strike_above_mean.tif"

# use rasterio to create a new tif file
with gw.series(files) as src:
    src.apply(
        longest_strike_above_mean(mean=299),
        bands=1,
        num_workers=12,
        outfile=out_path,
    )
```
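To show what a "longest strike above" style feature measures, here is a small pure-Python and NumPy sketch that finds the longest consecutive run of values above a fixed threshold for each pixel. The function names and sample values are hypothetical, and this is not the package's vectorized implementation.

```python
import numpy as np

def longest_run_above(series, threshold):
    """Length of the longest consecutive run of values above threshold."""
    longest = current = 0
    for value in series:
        if value > threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # the run is broken
    return longest

def longest_run_above_stack(stack, threshold):
    """Apply longest_run_above along the time axis of a (time, rows, cols) array."""
    return np.apply_along_axis(longest_run_above, 0, stack, threshold)

series = np.array([100, 350, 400, 310, 200, 500, 120])
print(longest_run_above(series, 299))  # 3 (the run 350, 400, 310)

# Two pixels: the series above and a constant series that never exceeds 299.
stack = np.stack([series, np.full(7, 50)], axis=1).reshape(7, 1, 2)
print(longest_run_above_stack(stack, 299))
```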

Execute across multiple features and parameters

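``` python
import os
import datetime as dt
from xr_fresh.extractors_series import extract_features_series

# get dates from file names to use in doy_of_maximum
# (reuses the `files` list from the previous example)
dates = [dt.datetime.strptime(os.path.basename(f), "evi_%Y%m%d.tif") for f in files]

# create list of desired features; each dict is one parameter set
feature_list = {
    "minimum": [{}],
    "doy_of_maximum": [{"dates": dates}],
    "abs_energy": [{}],
    "mean_abs_change": [{}],
    "variance_larger_than_standard_deviation": [{}],
    "ratio_beyond_r_sigma": [{"r": 1}, {"r": 2}, {"r": 3}],
    "symmetry_looking": [{}],
    "sum_values": [{}],
}

# band_name and temp_dir are not defined in the original example;
# the values below are placeholders to replace with your own
band_name = "evi"
temp_dir = "temp"

# Extract features from the geospatial time series
extract_features_series(files, feature_list, band_name, temp_dir, num_workers=12, nodata=-9999)
```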
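The feature dictionary maps each feature name to a list of parameter sets, so a feature with three parameter sets yields three outputs. The sketch below expands such a dictionary into individual jobs. The helper `expand_feature_list` and its label scheme are hypothetical, written only to illustrate the structure, and do not reflect how xr_fresh names its output files.

```python
def expand_feature_list(feature_list):
    """Yield (feature_name, params, label) for every parameter set (hypothetical helper)."""
    for name, param_sets in feature_list.items():
        for params in param_sets:
            # Build an illustrative label such as "ratio_beyond_r_sigma_r_2".
            suffix = "_".join(f"{k}_{v}" for k, v in sorted(params.items()))
            label = f"{name}_{suffix}" if suffix else name
            yield name, params, label

feature_list = {
    "minimum": [{}],
    "ratio_beyond_r_sigma": [{"r": 1}, {"r": 2}, {"r": 3}],
}

jobs = list(expand_feature_list(feature_list))
for name, params, label in jobs:
    print(label, params)
```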

Working with NetCDF Files

While xr_fresh is designed to work with individual GeoTIFF files, you can work with NetCDF data (e.g., climate model output) using a simple workaround:

  1. Export NetCDF time slices to individual GeoTIFF files - Each time step becomes a separate raster file
  2. Run xr_fresh on the exported files - Process as normal time series
  3. Extract features - Generate temporal statistics across your time series

See the complete NetCDF workflow example, which demonstrates:

  • Loading CESM2 climate model data (PRECT precipitation variable)
  • Exporting 1 year of daily data (~366 timesteps) to individual rasters
  • Extracting 27+ temporal features including extremes, variability, and trends

Note: Some features like longest_strike_above_mean and longest_strike_below_mean are not compatible with JAX tracing and should be excluded when using GPU acceleration. The notebook example shows a curated list of JAX-compatible features.
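One simple way to apply this advice is to filter the feature dictionary before extraction. The sketch below drops the two features named in the note. The helper `drop_jax_incompatible` and the constant `JAX_INCOMPATIBLE` are hypothetical names, and the set lists only the features mentioned above, so it may not be exhaustive.

```python
# Features the note above names as incompatible with JAX tracing.
JAX_INCOMPATIBLE = {"longest_strike_above_mean", "longest_strike_below_mean"}

def drop_jax_incompatible(feature_list, excluded=JAX_INCOMPATIBLE):
    """Return a copy of feature_list without the excluded feature names."""
    return {name: params for name, params in feature_list.items() if name not in excluded}

feature_list = {
    "minimum": [{}],
    "longest_strike_above_mean": [{"mean": 299}],
    "abs_energy": [{}],
}

gpu_feature_list = drop_jax_incompatible(feature_list)
print(sorted(gpu_feature_list))
```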

Documentation

After cloning the repository, open the documentation from your local copy.

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines on:

  • Setting up a development environment
  • Running tests
  • Submitting pull requests
  • Code style and testing requirements
  • Reporting issues

Citation

Please cite this work as: Michael Mann. (2024). mmann1123/xr_fresh: SpeedySeries (0.2.0). Zenodo. https://doi.org/10.5281/zenodo.12701466

Owner

  • Name: Michael Mann
  • Login: mmann1123
  • Kind: user
  • Location: Washington DC
  • Company: The George Washington University

Spatial Modeling | Python | R | Machine Learning

JOSS Publication

xr_fresh: Automated Time Series Feature Extraction for Remote Sensing and Gridded Data
Published
November 03, 2025
Volume 10, Issue 115, Page 9009
Authors
Michael L. Mann
The George Washington University, Department of Geography & Environment, Washington DC 20052
Editor
Monica Bobra
Tags
Feature Extraction Remote Sensing Time Series Machine Learning Crop Classification xarray

GitHub Events

Total
  • Create event: 4
  • Release event: 1
  • Issues event: 7
  • Watch event: 4
  • Delete event: 4
  • Issue comment event: 7
  • Push event: 58
  • Pull request event: 11
  • Fork event: 1
Last Year
  • Create event: 4
  • Release event: 1
  • Issues event: 7
  • Watch event: 4
  • Delete event: 4
  • Issue comment event: 7
  • Push event: 57
  • Pull request event: 11
  • Fork event: 1

Issues and Pull Requests


All Time
  • Total issues: 9
  • Total pull requests: 18
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 7 days
  • Total issue authors: 2
  • Total pull request authors: 2
  • Average comments per issue: 0.33
  • Average comments per pull request: 0.17
  • Merged pull requests: 12
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 3
  • Pull requests: 6
  • Average time to close issues: 1 day
  • Average time to close pull requests: about 1 hour
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.67
  • Average comments per pull request: 0.0
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • mmann1123 (6)
  • chenyangkang (3)
Pull Request Authors
  • mmann1123 (15)
  • Jithendra-k (3)

Dependencies

setup.py pypi
.github/workflows/python-tests.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite