remotePARTS

remotePARTS: Spatiotemporal autoregression analyses for large data sets - Published in JOSS (2025)

https://github.com/morrowcj/remoteparts


Keywords

autocorrelation big-data remote-sensing-in-r statistical-analysis

Scientific Fields

Earth and Environmental Sciences Physical Sciences - 40% confidence
Economics Social Sciences - 40% confidence
Last synced: 4 months ago

Repository

remotePARTS is a set of tools for running partitioned spatiotemporal autoregression analyses on remotely sensed data sets.

Statistics
  • Stars: 22
  • Watchers: 3
  • Forks: 6
  • Open Issues: 12
  • Releases: 2
Created almost 6 years ago · Last pushed 5 months ago

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  echo = TRUE, collapse = TRUE, comment = "#>",
  fig.path = "man/figures/README-", out.width = "100%"
)
```

## remotePARTS


[![CRAN status](https://www.r-pkg.org/badges/version/remotePARTS)](https://CRAN.R-project.org/package=remotePARTS)
[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-green.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![R-CMD-check](https://github.com/morrowcj/remotePARTS/workflows/R-CMD-check/badge.svg)](https://github.com/morrowcj/remotePARTS/actions)
[![status](https://joss.theoj.org/papers/c6a3da6a56aa0fb0e1f8a4f36cab12c2/status.svg)](https://joss.theoj.org/papers/c6a3da6a56aa0fb0e1f8a4f36cab12c2)
[![Codecov test coverage](https://codecov.io/gh/morrowcj/remotePARTS/graph/badge.svg)](https://app.codecov.io/gh/morrowcj/remotePARTS)


*remotePARTS* is a software package for the *R* statistical programming language. 
The package contains tools for analyzing spatiotemporal data, typically obtained 
via remote sensing.

## Description

These tools were created to test map-scale hypotheses about trends in large 
remotely sensed data sets, but they are useful for analyzing trends in any 
spatial data, with or without a temporal component. Statistical tests are 
conducted with the PARTS method for analyzing spatially autocorrelated time 
series (Ives et al., 2021). The method can handle extremely large data sets 
that other spatiotemporal models cannot, while still appropriately accounting 
for autocorrelation structure. It does so by partitioning the data into smaller 
chunks, analyzing each chunk separately, and then combining the separate 
analyses into a single test of the map-scale hypotheses that accounts for 
correlations among chunks.
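
The partition, analyze, and combine idea can be illustrated with a toy example 
in base R. This is only a sketch under simplifying assumptions: the data and 
all object names (`toy`, `part_mat`, `chunk_coefs`) are hypothetical, each chunk 
is fit with ordinary least squares rather than a spatial model, and the 
per-chunk estimates are simply averaged. It does not include the correction for 
correlations among chunks that the PARTS method applies.

```r
set.seed(1)

# toy "map": 2000 pixels whose trend value depends on one covariate
n_pix <- 2000
toy <- data.frame(x = runif(n_pix))
toy$trend <- 0.1 + 0.3 * toy$x + rnorm(n_pix, sd = 0.05)

# 1) partition: shuffle the pixel indices into 4 chunks of 500 (one per column)
part_size <- 500
part_mat <- matrix(sample(n_pix), nrow = part_size)

# 2) analyze each chunk separately (plain OLS here, for illustration only)
chunk_coefs <- apply(part_mat, 2, function(idx) {
  coef(lm(trend ~ x, data = toy[idx, ]))
})

# 3) combine: average the per-chunk coefficient estimates
combined <- rowMeans(chunk_coefs)
combined
```

Because each chunk is much smaller than the full map, the per-chunk fits stay 
cheap even when the full data set is too large to analyze at once.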

## Installation

To install the package and its dependencies from CRAN, use the following R code:

```{r, eval = FALSE}
install.packages("remotePARTS")
```

To install the latest stable version of this package from GitHub, use

```{r, eval = FALSE}
remotes::install_github("morrowcj/remotePARTS")
```

and to test out the newest features and functionality, use 

```{r, eval = FALSE}
remotes::install_github("morrowcj/remotePARTS", ref = "develop")
```

To ensure the vignette is built when installing from GitHub, use 

```{r, eval = FALSE}
remotes::install_github("morrowcj/remotePARTS", build_vignettes = TRUE)
```

Then, upon successful installation, load the package with 
`library(remotePARTS)`. 
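
In scripts that may run on machines without the package, it can help to check 
for it before loading. The following is a minimal sketch; `needs_install()` is 
a hypothetical helper defined here, not part of *remotePARTS*.

```r
# hypothetical helper: TRUE when a package is not available in the library
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

if (needs_install("remotePARTS")) {
  # report instead of installing automatically
  message("remotePARTS is not installed; run install.packages(\"remotePARTS\")")
} else {
  library(remotePARTS)
}
```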

### Dependencies

Since the matrix operations in this package rely on C++ code, as implemented 
via the [RcppEigen package](https://github.com/RcppCore/RcppEigen),
the latest version of [Rtools](https://cran.r-project.org/bin/windows/Rtools/)
is required on Windows and a compiler that supports C++11 is required on other 
systems. 



## Citation

To cite this package in publications, please use:

  Morrow CJ, Ives AR (2025). “remotePARTS: Spatiotemporal autoregression analyses for large data sets.” _Journal of Open Source
  Software_, *10*(109), 7937. doi:10.21105/joss.07937.

## Contribution, bugs, and feature requests

To contribute to this package (for example, to report bugs, suggest new 
features, tests, or behavior, correct typos, or update documentation), please 
submit a [GitHub Issue](https://github.com/morrowcj/remotePARTS/issues). We 
welcome and appreciate any and all feedback.

## Testing

To manually run the package's unit tests from a cloned directory, 
after installing dependencies, use

```{r, eval = FALSE, echo = TRUE}
devtools::test()  # run unit tests in tests/testthat/
devtools::check() # full R CMD check (tests + docs)
```

Test coverage is currently low, especially among functions with 
stochastic outcomes. As the package develops, additional tests
will be added.

## Typical Workflow

A typical *remotePARTS* workflow consists of two broad steps for analyzing
trends in spatiotemporal data sets: 1) time series analysis and 2) spatial 
analysis. For purely spatial problems, step 1 is skipped. We briefly summarize 
these steps and the expected data structure below.

#### Input data

Currently, *remotePARTS* requires that the data are formatted as "flat" files
(i.e., data frames with 1 row per pixel) with x- and y-coordinates. We do not 
describe how to prepare your data here, because other packages are dedicated 
to reading and manipulating spatial data (e.g., see 
`raster::rasterToPoints()`). We recognize that this is a limitation, since flat
files are highly inefficient. Future versions of this package may include
interfaces with raster objects if enough users express interest.
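
As a rough illustration of the flat layout, the sketch below converts a small 
gridded stack into a data frame with 1 row per pixel using only base R. The 
array `ras` and its dimensions are hypothetical, and we assume that row and 
column indices can stand in for the y- and x-coordinates; real data would 
normally carry projected or geographic coordinates instead.

```r
# hypothetical gridded stack: 3 rows x 4 columns x 2 time steps
ras <- array(seq_len(3 * 4 * 2), dim = c(3, 4, 2))

# one row per pixel; y (row index) varies fastest, matching column-major order
flat <- expand.grid(y = seq_len(dim(ras)[1]), x = seq_len(dim(ras)[2]))

# flatten each time slice into its own column (z.1, z.2, ...)
for (k in seq_len(dim(ras)[3])) {
  flat[[paste0("z.", k)]] <- as.vector(ras[, , k])
}

# put the coordinate columns first
flat <- flat[, c("x", "y", "z.1", "z.2")]
head(flat)
```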

To demonstrate the package's basic functionality, we first simulate a 
small spatiotemporal data set for analysis:

```{r, warning=FALSE, message=FALSE}
library(tibble); library(dplyr); library(tidyr); library(viridisLite)
library(ggplot2); library(remotePARTS)

# set a random seed, for reproducibility
set.seed(42) # don't panic

# simulate a spatiotemporal response variable
sim_spatiotemp <- function(
    n, k, n_time = 4, b.0 = 0, b.x = 0.5, b.y = 1, 
    b.xy = 0.1, sd.xy = 0.2, b.t = 0.2, ar = 0.4,  
    sd.t = 0.1
){
  coords = expand_grid(
    x = seq(0, 1, length.out = n), y = seq(0, 1, length.out = k)
  )
  
  time = seq_len(n_time)
  
  tibble::tibble(
    x = coords[[1]], y = coords[[2]],
    z.0 = b.0 + x*b.x + y*b.y + x*y*b.xy,
    eps = rnorm(n = length(x), mean = 0, sd = sd.xy),
    time.effect = list(time * b.t),
  )  |> 
    rowwise() |>
    mutate(
      z.0 = z.0 + eps,
      sp.innov = list(z.0 + time.effect + rnorm(n_time, sd = sd.t)), 
      z = list(arima.sim(list(ar = ar), n_time, innov = sp.innov))
    ) |> 
    unnest_wider(z, names_sep = ".") |> 
    select(-"time.effect", -"sp.innov", -"eps")
}
```

The function defined above generates a data frame for `n` $\times$ `k` pixels. 
The response variable (`z`) depends upon the `x` and `y` coordinates of the map.
The resulting spatial patterns (`z.0`) are used as the random innovations
of an AR(1) time series model to generate the spatiotemporal response 
(`z.1` -- `z.4`). These data are visualized below:

```{r, fig.asp=0.8, fig.width=5, out.width="50%", cache = TRUE}
# build the data
dat <- sim_spatiotemp(n = 100, k = 100)

# extract coordinates
coords <- select(dat, x, y)

# visualize the data
dat |> 
  pivot_longer(cols = z.1:z.4, names_to = "time", values_to = "z") |> 
  mutate(time = as.numeric(gsub("z\\.", "", time))) |> 
  ggplot(aes(x = x, y = y, fill = z)) + 
  facet_wrap(~time, labeller = "label_both") + 
  geom_tile() + 
  scale_fill_viridis_c(option = "magma") 
```
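
To make the AR(1) step of the simulation more concrete, the sketch below writes 
out the recursion for a single hypothetical pixel. It is a simplification: 
the baseline value `z.0 = 1` is made up, and `arima.sim()` additionally 
prepends a burn-in period that this sketch omits, so the values will not match 
the simulated data exactly.

```r
set.seed(42)

n_time <- 4    # number of time steps, as in the simulation's default
ar     <- 0.4  # autoregressive coefficient
b.t    <- 0.2  # per-step time effect
z.0    <- 1.0  # hypothetical baseline value for one pixel

# innovations: baseline + time effect + noise
innov <- z.0 + b.t * seq_len(n_time) + rnorm(n_time, sd = 0.1)

# AR(1) recursion written out explicitly: z[i] = ar * z[i - 1] + innov[i]
z <- numeric(n_time)
z[1] <- innov[1]
for (i in 2:n_time) {
  z[i] <- ar * z[i - 1] + innov[i]
}

# the same recursion via stats::filter()
z_filter <- as.numeric(stats::filter(innov, filter = ar, method = "recursive"))
all.equal(z, z_filter)
```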

#### 1. Time series analysis

With properly structured data, the first step is to conduct a time series 
analysis. This is done with the `fitAR_map` function.

```{r, cache = TRUE, message=FALSE, warning=FALSE}
# fit a pixel-wise autoregression model to the full map
AR_fit <- fitAR_map(Y = dat |> select(z.1:z.4) |> as.matrix(), coords = coords)

# combine results into a data frame
df <- data.frame(
  coords = AR_fit$coords, coefs = AR_fit$coefficients, 
  resids = AR_fit$residuals
)
```

This function returns time series regression coefficients and 
residual estimates for each pixel: 

```{r, fig.asp=0.8, fig.width=4, out.width="40%"}
df |> 
  ggplot(aes(x = coords.x, y = coords.y, fill = coefs.t)) +
  geom_tile() +
  labs(x = "x", y = "y", fill = "t coef") +
  scale_fill_viridis_c(option = "magma")
```

```{r, fig.asp=0.8, fig.width=5, out.width="50%"}
df |> 
  pivot_longer(resids.1:resids.4, names_to = "time", values_to = "resids") |> 
  mutate(time = as.numeric(gsub("resids\\.", "", time))) |> 
  ggplot(aes(x = coords.x, y = coords.y, fill = resids)) +
  facet_wrap(~time, labeller = "label_both") +
  geom_tile() +
  labs(x = "x", y = "y", fill = "resid") +
  scale_fill_viridis_c(option = "magma")
```
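
For intuition about what a pixel-wise time series fit estimates, the sketch 
below fits a regression on time with AR(1) errors to one hypothetical pixel 
using `stats::arima()`. This is an analogue in base R, not necessarily the 
estimator that `fitAR_map` uses internally, and the series, its length, and 
all object names are assumptions made for illustration.

```r
set.seed(1)

# hypothetical single-pixel series: linear trend plus AR(1) errors
n_time   <- 50
time_idx <- seq_len(n_time)
u <- as.numeric(
  stats::filter(rnorm(n_time, sd = 0.1), filter = 0.4, method = "recursive")
)
y <- 1 + 0.2 * time_idx + u

# regression on time with AR(1) errors, fit by maximum likelihood
fit <- arima(y, order = c(1, 0, 0), xreg = cbind(t = time_idx), method = "ML")

coef(fit)[["t"]]                  # estimated time trend for this pixel
head(as.numeric(residuals(fit)))  # residuals, one per time step
```

Repeating a fit like this for every pixel yields a map of trend coefficients 
and a set of residuals per pixel, which is the kind of output plotted above.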

#### 2. Spatial analysis

The second step is to conduct a spatial analysis with `fitGLS_partition`. In 
this case, we'll estimate how the temporal trend differs across the `x` and `y` 
coordinates.  

```{r, cache = TRUE}
# randomly divide the data into partitions
partitions <- sample_partitions(npix = nrow(df), partsize = 1000)

# fit the partitioned GLS (adjust ncores to the cores available on your machine)
part_GLS <- fitGLS_partition(
  formula = coefs.t ~ coords.x + coords.y + coords.x:coords.y, data = df, 
  partmat = partitions, coord.names = c("coords.x", "coords.y"), ncores = 8
)
```

The results provide coefficient estimates that are corrected for spatial and 
temporal autocorrelation: 

```{r}
part_GLS$overall$t.test
```
Note that these are not direct estimates of the parameters 
used to generate the data with `sim_spatiotemp` above. For example, the 
coefficients for `coords.x` and `coords.y` are estimates of 
`b.t` $\times$ `b.x` and `b.t` $\times$ `b.y`.
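
For readers unfamiliar with generalized least squares (GLS), the sketch below 
shows the textbook GLS estimator on one small hypothetical partition. It makes 
strong assumptions that are ours, not the package's: the spatial covariance is 
exponential with a fixed, known range of 0.1, whereas in practice covariance 
parameters must be estimated from the data. All object names are illustrative.

```r
set.seed(1)

# hypothetical partition: 200 pixels at random coordinates
n  <- 200
xy <- cbind(x = runif(n), y = runif(n))

# assumed exponential covariance: correlation decays with distance
D <- as.matrix(dist(xy))
V <- exp(-D / 0.1)

# simulate a spatially correlated response with known coefficients
beta_true <- c(0.1, 0.3, -0.2)
X    <- cbind(1, xy)
resp <- as.vector(X %*% beta_true + 0.05 * t(chol(V)) %*% rnorm(n))

# GLS estimator: beta = (X' V^-1 X)^-1 X' V^-1 y
V_inv_X  <- solve(V, X)
beta_gls <- solve(crossprod(X, V_inv_X), crossprod(V_inv_X, resp))
drop(beta_gls)
```

Solving against `V` rather than inverting it explicitly keeps the computation 
stable, but the cost still grows quickly with the number of pixels, which is 
why fitting many moderately sized partitions is attractive for large maps.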

##### 2a. Purely spatial problem

This method can also be used for purely spatial problems. Here we will use our
original spatial variable (`z.0`):

```{r, cache = TRUE}
# add the spatial variable into the data frame
df$z.0 <- dat$z.0

# fit the partitioned GLS (adjust ncores to the cores available on your machine)
part_GLS1 <- fitGLS_partition(
  formula = z.0 ~ coords.x + coords.y + coords.x:coords.y, data = df, 
  partmat = partitions, coord.names = c("coords.x", "coords.y"), ncores = 8
)
```

```{r}
part_GLS1$overall$t.test
```

In this case, the coefficients *are* direct estimates of the spatial parameters 
(`b.0`, `b.x`, `b.y`, `b.xy`) given to `sim_spatiotemp`.

#### Vignette

For detailed examples of how to use `remotePARTS` and all its options, with 
real data, see the `Alaska` vignette:

```{r, eval = FALSE}
vignette("Alaska")
```

The latest stable version of the vignette is also hosted online at 
https://morrowcj.github.io/remotePARTS/Alaska.html.

## References

Ives, Anthony R., et al. "Statistical inference for trends in spatiotemporal data."
Remote Sensing of Environment 266 (2021): 112678. https://doi.org/10.1016/j.rse.2021.112678 

Owner

  • Name: Clay Morrow
  • Login: morrowcj
  • Kind: user
  • Location: Madison, WI
  • Company: UW-Madison, USDA, @RuminantFarmSystems

I am an ecologist (PhD) and statistician (MS) who studies topics ranging from plant defense and community ecology to genomics and spatial patterns.

JOSS Publication

remotePARTS: Spatiotemporal autoregression analyses for large data sets
  • Published: May 23, 2025
  • Volume 10, Issue 109, Page 7937
  • Authors: Clay J. Morrow (University of Wisconsin - Madison, USA), Anthony R. Ives (University of Wisconsin - Madison, USA)
  • Editor: James Gaboardi
  • Tags: remote sensing, statistics, hypothesis testing, spatiotemporal, autocorrelation

GitHub Events

Total
  • Create event: 7
  • Release event: 1
  • Issues event: 17
  • Watch event: 3
  • Delete event: 4
  • Member event: 1
  • Issue comment event: 19
  • Push event: 47
  • Pull request review event: 3
  • Pull request review comment event: 2
  • Pull request event: 15
Last Year
  • Create event: 7
  • Release event: 1
  • Issues event: 17
  • Watch event: 3
  • Delete event: 4
  • Member event: 1
  • Issue comment event: 19
  • Push event: 47
  • Pull request review event: 3
  • Pull request review comment event: 2
  • Pull request event: 15

Issues and Pull Requests


All Time
  • Total issues: 27
  • Total pull requests: 28
  • Average time to close issues: 6 months
  • Average time to close pull requests: 12 days
  • Total issue authors: 6
  • Total pull request authors: 2
  • Average comments per issue: 2.93
  • Average comments per pull request: 0.39
  • Merged pull requests: 26
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 12
  • Pull requests: 14
  • Average time to close issues: 3 months
  • Average time to close pull requests: 24 days
  • Issue authors: 2
  • Pull request authors: 1
  • Average comments per issue: 0.92
  • Average comments per pull request: 0.79
  • Merged pull requests: 13
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • morrowcj (20)
  • arives (2)
  • kelewinska (2)
  • Rapsodia86 (1)
  • iosefa (1)
  • SeltaZheng (1)
Pull Request Authors
  • morrowcj (27)
  • arives (1)
Top Labels
Issue Labels
enhancement (11) bug (6) documentation (2)
Pull Request Labels
enhancement (3) documentation (1) bug (1)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 213 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 1
  • Total maintainers: 1
cran.r-project.org: remotePARTS

Spatiotemporal Autoregression Analyses for Large Data Sets

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 213 Last month
Rankings
Forks count: 9.6%
Stargazers count: 12.2%
Dependent repos count: 24.0%
Average: 25.3%
Dependent packages count: 28.8%
Downloads: 51.8%
Maintainers (1)

Dependencies

DESCRIPTION cran
  • R >= 4.0 depends
  • CompQuadForm * imports
  • Rcpp >= 1.0.5 imports
  • doParallel * imports
  • foreach * imports
  • geosphere >= 1.5.10 imports
  • iterators * imports
  • parallel * imports
  • stats * imports
  • data.table * suggests
  • devtools * suggests
  • dplyr >= 1.0.0 suggests
  • ggplot2 * suggests
  • knitr * suggests
  • markdown * suggests
  • reshape2 * suggests
  • rmarkdown * suggests
  • sqldf * suggests
.github/workflows/archive/check-standard_OLD.yaml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/upload-artifact v2 composite
  • r-lib/actions/setup-pandoc v1 composite
  • r-lib/actions/setup-r v1 composite
.github/workflows/check-standard.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite