geothinner

An R Package for Efficient Spatial Thinning of Species Occurrences and Point Data

https://github.com/jmestret/geothinner

Science Score: 39.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 10 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.5%) to scientific vocabulary
Last synced: 9 months ago

Repository

Statistics
  • Stars: 10
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 4
Created almost 2 years ago · Last pushed 12 months ago
Metadata Files
Readme Changelog License

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# GeoThinneR - An R Package for Spatial Thinning 


[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/GeoThinneR?color=blue)](https://cran.r-project.org/package=GeoThinneR)
[![R-CMD-check](https://github.com/jmestret/GeoThinneR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/jmestret/GeoThinneR/actions/workflows/R-CMD-check.yaml)
[![codecov](https://codecov.io/gh/jmestret/GeoThinneR/graph/badge.svg?token=LID8Q55SMD)](https://app.codecov.io/gh/jmestret/GeoThinneR)


## Overview

**GeoThinneR** is an R package for efficient spatial thinning of species occurrence records and other geospatial point data. It brings three thinning methods (**distance-based, grid-based, and precision-based thinning**) into a single package, so you do not need to switch between tools. GeoThinneR implements algorithms based on kd-tree structures for nearest-neighbor searches, which improves performance and **scalability for large datasets**. The package also provides **features** that are useful for species distribution modeling (SDM), such as thinning by group (e.g., multiple species), retaining an exact number of points, and prioritizing records based on user-defined variables. Together, these features make GeoThinneR well suited to large-scale occurrence datasets.
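
To make the idea of distance-based thinning concrete, here is a minimal base R sketch of a naive greedy approach. It is an illustration only and not GeoThinneR's implementation: the function names `haversine_km()` and `greedy_thin()` and the toy data are hypothetical, and the pairwise comparison shown here is the kind of brute-force search that kd-tree structures are meant to avoid on large datasets.

```r
# Hypothetical toy data: two nearby points and one distant point.
pts <- data.frame(
  lon = c(-3.701, -3.704, 2.170),
  lat = c(40.417, 40.419, 41.387)
)

# Great-circle (haversine) distance in km from one point to many points.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Naive greedy thinning (assumption: NOT the package's algorithm).
# A point is kept only if it lies at least `thin_dist` km away from
# every point that has already been kept.
greedy_thin <- function(df, thin_dist) {
  keep <- integer(0)
  for (i in seq_len(nrow(df))) {
    d <- haversine_km(df$lon[i], df$lat[i], df$lon[keep], df$lat[keep])
    if (all(d >= thin_dist)) keep <- c(keep, i)
  }
  df[keep, , drop = FALSE]
}

thinned <- greedy_thin(pts, thin_dist = 10)
print(thinned)
```

Each candidate is compared against every retained point, so the cost grows roughly quadratically with the number of records. The result also depends on the order of the rows, which is one reason a dedicated tool is preferable for real analyses.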



GeoThinneR has been developed as an alternative tool for spatial thinning to mitigate the effects of sampling bias in SDM. Various approaches exist to address sampling bias, each suited to different scenarios. Below are some references discussing methods for bias correction and spatial thinning:

- Boria, R. A., Olson, L. E., Goodman, S. M., & Anderson, R. P. (2014). Spatial filtering to reduce sampling bias can improve the performance of ecological niche models. *Ecological Modelling, 275*, 73-77.
- Veloz, S. D. (2009). Spatially autocorrelated sampling falsely inflates measures of accuracy for presence‐only niche models. *Journal of Biogeography, 36*(12), 2290-2299.
- Moudrý, V., Bazzichetto, M., Remelgado, R., Devillers, R., Lenoir, J., Mateo, R. G., Lembrechts, J. J., Sillero, N., Lecours, V., Cord, A. F., Barták, V., Balej, P., Rocchini, D., Torresani, M., Arenas-Castro, S., Man, M., Prajzlerová, D., Gdulová, K., Prošek, J., Marchetto, E., Zarzo-Arias, A., Gábor, L., Leroy, F., Martini, M., Malavasi, M., Cazzolla Gatti, R., Wild, J., & Šímová, P. (2024). Optimising occurrence data in species distribution models: sample size, positional uncertainty, and sampling bias matter. *Ecography, 2024*, e07294.

## Getting started

You can install **GeoThinneR** from CRAN with:

```r
install.packages("GeoThinneR")
```

To install the **development version** from GitHub, use:

```r
# install.packages("devtools")
devtools::install_github("jmestret/GeoThinneR")
```

Using GeoThinneR is simple. The main function, `thin_points()`, applies spatial thinning using a user-specified method and thinning constraint. 

```r
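library(GeoThinneR)

# `data` is a placeholder for your own table of point records
# (with longitude and latitude columns)

# Distance-based thinning (minimum separation of 10 km)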
thin_points(data, method = "distance", thin_dist = 10)

# Grid-based thinning (grid resolution of 0.1 degrees)
thin_points(data, method = "grid", resolution = 0.1)

# Precision-based thinning (rounding coordinates to 1 decimal place)
thin_points(data, method = "precision", precision = 1)
```
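
For intuition about what the grid-based and precision-based options do, the following base R sketch mimics both ideas on a small toy table. It is a conceptual illustration under stated assumptions, not the package's code: `thin_by_precision()`, `thin_by_grid()`, and the `pts` data frame are hypothetical names, and keeping the first record per group is a simplification chosen for brevity.

```r
# Hypothetical toy data: five points with lon/lat columns.
pts <- data.frame(
  lon = c(-3.701, -3.704, -3.640, 2.170, 2.140),
  lat = c(40.417, 40.419, 40.440, 41.387, 41.390)
)

# Precision-based idea: round the coordinates, then keep the first
# record for each unique pair of rounded values.
thin_by_precision <- function(df, digits = 1) {
  key <- paste(round(df$lon, digits), round(df$lat, digits))
  df[!duplicated(key), , drop = FALSE]
}

# Grid-based idea: assign each point to a cell of a regular grid
# (cell size `resolution`, in degrees) and keep the first record per cell.
thin_by_grid <- function(df, resolution = 0.1) {
  cell <- paste(floor(df$lon / resolution), floor(df$lat / resolution))
  df[!duplicated(cell), , drop = FALSE]
}

precision_kept <- thin_by_precision(pts, digits = 1)
grid_kept <- thin_by_grid(pts, resolution = 0.1)

print(precision_kept)
print(grid_kept)
```

The two approaches can disagree even at similar scales: rounding groups points around the nearest rounded value, while a grid groups points by the cell whose lower edge they fall above. In this toy data the last two points share a grid cell but round to different longitudes.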

## Documentation

For detailed documentation, guides, and usage examples, please visit the [official package documentation](https://jmestret.github.io/GeoThinneR/).

## Contributing

We welcome contributions! If you have suggestions for improvements or new features, please open an issue or submit a pull request on our [GitHub repository](https://github.com/jmestret/GeoThinneR).

## How to cite GeoThinneR

The GeoThinneR manuscript is currently in progress. In the meantime, you can cite the [preprint](https://doi.org/10.48550/arXiv.2505.07867) as follows:

> Mestre-Tomás, J. (2025). GeoThinneR: An R Package for Efficient Spatial Thinning of Species Occurrences and Point Data. arXiv preprint arXiv:2505.07867. DOI: https://doi.org/10.48550/arXiv.2505.07867

Owner

  • Name: Jorge Mestre Tomás
  • Login: jmestret
  • Kind: user

Hi, I'm Jorge

GitHub Events

Total
  • Create event: 3
  • Issues event: 3
  • Release event: 2
  • Watch event: 7
  • Delete event: 1
  • Issue comment event: 1
  • Push event: 13
Last Year
  • Create event: 3
  • Issues event: 3
  • Release event: 2
  • Watch event: 7
  • Delete event: 1
  • Issue comment event: 1
  • Push event: 13

Packages

  • Total packages: 1
  • Total downloads: 589 last month (CRAN)
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 3
  • Total maintainers: 1
cran.r-project.org: GeoThinneR

Efficient Spatial Thinning of Species Occurrences

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 589 Last month
Rankings
Dependent packages count: 28.3%
Dependent repos count: 34.9%
Average: 50.0%
Downloads: 86.8%
Maintainers (1)
Last synced: 10 months ago

Dependencies

.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v4 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pkgdown.yaml actions
  • JamesIves/github-pages-deploy-action v4.5.0 composite
  • actions/checkout v4 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml actions
  • actions/checkout v4 composite
  • actions/upload-artifact v4 composite
  • codecov/codecov-action v4 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION cran
  • R >= 4.0.0 depends
  • Rcpp * imports
  • data.table * imports
  • fields * imports
  • matrixStats * imports
  • nabor * imports
  • stats * imports
  • terra * imports
  • ggplot2 * suggests
  • knitr * suggests
  • rmarkdown * suggests
  • sf * suggests
  • testthat >= 3.0.0 suggests
  • tibble * suggests