spatialcluster

spatially-constrained clustering in R

https://github.com/mpadge/spatialcluster

Science Score: 39.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.3%) to scientific vocabulary

Keywords

cluster clustering-algorithm r spatial
Last synced: 4 months ago

Repository

Statistics
  • Stars: 31
  • Watchers: 5
  • Forks: 6
  • Open Issues: 11
  • Releases: 0
Created almost 8 years ago · Last pushed 11 months ago
Metadata Files
Readme Codemeta

README.Rmd

---
output: github_document
---



```{r, echo = FALSE}
knitr::opts_chunk$set (
    collapse = TRUE,
    comment = "#>",
    fig.path = "man/figures/README-"
)
```

[![R build status](https://github.com/mpadge/spatialcluster/workflows/R-CMD-check/badge.svg)](https://github.com/mpadge/spatialcluster/actions?query=workflow%3AR-CMD-check)
[![Project Status: WIP](http://www.repostatus.org/badges/latest/wip.svg)](http://www.repostatus.org/#wip)
[![codecov](https://codecov.io/gh/mpadge/spatialcluster/branch/master/graph/badge.svg)](https://codecov.io/gh/mpadge/spatialcluster)

# spatialcluster

An **R** package for spatially-constrained clustering using either distance or
covariance matrices. "*Spatially-constrained*" means that the data from which
clusters are to be formed also map on to spatial coordinates, and the
constraint is that clusters must be spatially contiguous.
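
To make the contiguity constraint concrete, the following base-R sketch marks which pairs of points count as spatial neighbours and masks every other entry of a relationship matrix. The k-nearest-neighbour rule and the names `k`, `nbr`, and `dmat_nbr` are assumptions chosen for illustration; they are not taken from the package and need not match how it defines neighbours internally.

```r
set.seed (1)
n <- 20
xy <- matrix (runif (2 * n), ncol = 2) # spatial coordinates
dmat <- matrix (runif (n^2), ncol = n) # arbitrary pairwise relationships

# Euclidean distances between all pairs of points
sp_dist <- as.matrix (dist (xy))
diag (sp_dist) <- Inf # a point is not its own neighbour

# Assumed neighbour rule: link each point to its k nearest points
k <- 3
nbr <- matrix (FALSE, n, n)
for (i in seq_len (n)) {
    nbr [i, order (sp_dist [i, ]) [seq_len (k)]] <- TRUE
}
nbr <- nbr | t (nbr) # make the neighbour relation symmetric

# Only relationships between spatial neighbours remain candidates for merging
dmat_nbr <- dmat
dmat_nbr [!nbr] <- NA
```

Under this kind of constraint, two points with a very strong relationship in `dmat` can only end up in the same cluster if a chain of spatial neighbours connects them.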

The package includes both an implementation of the
REDCAP collection of efficient yet approximate algorithms described in [D. Guo's
2008 paper, "Regionalization with dynamically constrained agglomerative
clustering and
partitioning."](https://www.tandfonline.com/doi/abs/10.1080/13658810701674970)
(pdf available
[here](https://pdfs.semanticscholar.org/ead1/7df8aaa1aed0e433b3ae1ec1ec5c7e785b2b.pdf)),
with extension to covariance matrices, and a new technique for computing
clusters using complete data sets. The package is also designed to analyse
matrices of spatial interactions (counts, densities) between sets of origin and
destination points. The spatial structure of interaction matrices can be
analysed statistically to yield both global statistics for the overall spatial
structure, and local statistics for individual clusters.
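
As a rough sketch of what a covariance-based input might look like, the following builds an `n`-by-`n` covariance matrix between locations from simulated repeated observations using base R's `cov()`. The observation matrix `obs` and the sizes used are hypothetical. Unlike distances, larger covariances indicate stronger relationships, so the function documentation should be consulted for how to signal that interpretation.

```r
set.seed (1)
n <- 10 # number of spatial locations
n_obs <- 50 # hypothetical number of repeated observations per location

# Hypothetical observations: one row per observation, one column per location
obs <- matrix (rnorm (n_obs * n), ncol = n)

# n-by-n covariance matrix between locations
cmat <- cov (obs)

dim (cmat)
isSymmetric (cmat)
```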


## Installation

The easiest way to install `spatialcluster` is by enabling the [corresponding
`r-universe`](https://mpadge.r-universe.dev/):

```{r r-univ, eval = FALSE}
options (repos = c (
    mpadge = "https://mpadge.r-universe.dev",
    CRAN = "https://cloud.r-project.org"
))
```

The package can then be installed as usual with:

```{r install, eval = FALSE}
install.packages ("spatialcluster")
```

Alternatively, the package can also be installed using any of the following
options:

```{r gh-installation, eval = FALSE}
# install.packages("remotes")
remotes::install_git ("https://codeberg.org/mpadge/spatialcluster")
remotes::install_git ("https://git.sr.ht/~mpadge/spatialcluster")
remotes::install_bitbucket ("mpadge/spatialcluster")
remotes::install_gitlab ("mpadge/spatialcluster")
remotes::install_github ("mpadge/spatialcluster")
```

## Usage

The two main functions, `scl_redcap()` and `scl_full()`, implement different
algorithms for spatial clustering. The former implements the REDCAP collection
of efficient yet approximate algorithms described in [D. Guo's 2008 paper,
"Regionalization with dynamically constrained agglomerative clustering and
partitioning."](https://www.tandfonline.com/doi/abs/10.1080/13658810701674970)
(pdf available
[here](https://pdfs.semanticscholar.org/ead1/7df8aaa1aed0e433b3ae1ec1ec5c7e785b2b.pdf)),
with extension here to apply clustering to covariance matrices. These
algorithms are computationally efficient yet generate only *approximate*
estimates of underlying clusters. The second function, `scl_full()`, trades
computational efficiency for accuracy, through generating clustering schemes
using all available data.

In short:

- `scl_full()` should always be preferred as long as it returns results within
  a reasonable amount of time
- `scl_redcap()` should be used only where data are too large for `scl_full()`
  to be run in a reasonable time.
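
One way to judge what counts as "reasonable" is to time both functions on a small data set before scaling up. The sketch below is only an assumption about how that might be done: `time_call()` is a hypothetical helper defined here, not part of the package, and the clustering calls are skipped if `spatialcluster` is not installed.

```r
# Hypothetical helper: time one call of a function, returning elapsed seconds
time_call <- function (f, ...) {
    unname (system.time (f (...)) ["elapsed"])
}

set.seed (1)
n <- 100
xy <- matrix (runif (2 * n), ncol = 2)
dmat <- matrix (runif (n^2), ncol = n)

if (requireNamespace ("spatialcluster", quietly = TRUE)) {
    t_full <- time_call (spatialcluster::scl_full, xy, dmat, ncl = 8)
    t_redcap <- time_call (
        spatialcluster::scl_redcap, xy, dmat,
        ncl = 8, linkage = "single"
    )
    print (c (full = t_full, redcap = t_redcap))
}
```

Repeating the timing for a few increasing values of `n` gives a rough indication of how each function scales before committing to a full data set.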

For clustering a group of `n` points, both of these functions require three
main arguments:

1. A rectangular matrix of spatial coordinates of points to be clustered (`n`
    rows; at least 2 columns);
2. An `n`-by-`n` square matrix quantifying relationships between those points;
3. A single value (`ncl`) specifying the desired number of clusters.
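
As a sketch of how these three inputs might be assembled from observed attributes rather than random numbers, the following derives the relationship matrix with base R's `dist()`. The attribute table `attrs` and the names `xy_obs`, `dmat_attr`, and `ncl_obs` are hypothetical and used only for illustration.

```r
set.seed (1)
n <- 50

# 1. Coordinates: n rows, 2 columns
xy_obs <- matrix (runif (2 * n), ncol = 2)

# Hypothetical attributes measured at each point (3 variables per point)
attrs <- matrix (rnorm (3 * n), ncol = 3)

# 2. n-by-n matrix of pairwise dissimilarities in (standardised) attribute space
dmat_attr <- as.matrix (dist (scale (attrs)))

# 3. Desired number of clusters
ncl_obs <- 5

# Basic shape checks before passing the inputs to a clustering function
stopifnot (
    nrow (xy_obs) == n, ncol (xy_obs) >= 2,
    nrow (dmat_attr) == n, ncol (dmat_attr) == n,
    ncl_obs >= 1, ncl_obs <= n
)
```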

The following code demonstrates usage with randomly-generated data:
```{r}
set.seed (1)
n <- 100
xy <- matrix (runif (2 * n), ncol = 2)
dmat <- matrix (runif (n^2), ncol = n)
```

Then load the package and call the function:

```{r full-single, echo = TRUE, eval = TRUE}
library (spatialcluster)
scl <- scl_full (xy, dmat, ncl = 8)
plot (scl)
```

Both functions return a `list` with the following components:

```{r list-components}
names (scl)
```

- `tree` details distances and cluster numbers for all pairwise comparisons
  between objects.
- `merges` details increasing distances at which each pair of objects was
  merged into a single cluster.
- `ord` provides the order of the merges (for `scl_full()` only).
- `nodes` records the spatial coordinates of each point (node) of the input
  data.
- `pars` retains the parameters used to call the clustering function.
- `statistics` returns the clustering statistics, both for individual clusters
  and an overall global statistic for the clustering scheme as a whole.

See the "_Get Started_" vignette for more details.

## A Cautionary Note

The following plot compares the results of applying four different clustering
algorithms to the same data.

```{r cautionary, eval = TRUE, fig.width = 7, fig.height = 7}
library (ggplot2)
library (gridExtra)
scl <- scl_full (xy, dmat, ncl = 8, linkage = "single")
p1 <- plot (scl) + ggtitle ("full-single")
scl <- scl_redcap (xy, dmat, ncl = 8, linkage = "single")
p2 <- plot (scl) + ggtitle ("redcap-single")
scl <- scl_redcap (xy, dmat, ncl = 8, linkage = "average")
p3 <- plot (scl) + ggtitle ("redcap-average")
scl <- scl_redcap (xy, dmat, ncl = 8, linkage = "complete")
p4 <- plot (scl) + ggtitle ("redcap-complete")

grid.arrange (p1, p2, p3, p4, ncol = 2)
```


This example illustrates a danger common to all clustering algorithms: they
cannot fail to produce results, even when the data fed to them are entirely
devoid of information, as in this example. Clustering algorithms should only
be applied to reflect a very specific hypothesis for why data should be
clustered in the first place; spatial clustering algorithms should only be
applied to reflect two very specific hypotheses for (i) why data should be
clustered at all, and (ii) why those clusters should manifest a spatial
pattern.
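
The same point can be checked without any spatial constraint. The sketch below uses ordinary hierarchical clustering from base R (`hclust()` and `cutree()`, not the algorithms in this package) on pure noise, and still obtains exactly the number of clusters requested.

```r
set.seed (1)
n <- 100
noise <- matrix (runif (2 * n), ncol = 2) # pure noise: no true clusters

# Standard (non-spatial) hierarchical clustering from base R
hc <- hclust (dist (noise), method = "single")
cl <- cutree (hc, k = 8)

# Eight clusters are returned even though the data contain none
length (unique (cl))
table (cl)
```

The number of clusters reflects only the value of `k` that was requested, not any structure in the data.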

Owner

  • Name: mark padgham
  • Login: mpadge
  • Kind: user
  • Location: Münster, Germany
  • Company: @rOpenSci

rOpenSci software review lead

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "identifier": "spatialcluster",
  "description": "R port of redcap (Regionalization with dynamically constrained agglomerative clustering and partitioning).",
  "name": "spatialcluster: R port of redcap",
  "codeRepository": "https://github.com/mpadge/spatialcluster",
  "issueTracker": "https://github.com/mpadge/spatialcluster/issues",
  "license": "https://spdx.org/licenses/GPL-3.0",
  "version": "0.2.0.017",
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "name": "R",
    "url": "https://r-project.org"
  },
  "runtimePlatform": "R version 4.4.2 (2024-10-31)",
  "author": [
    {
      "@type": "Person",
      "givenName": "Mark",
      "familyName": "Padgham",
      "email": "mark.padgham@email.com"
    }
  ],
  "maintainer": [
    {
      "@type": "Person",
      "givenName": "Mark",
      "familyName": "Padgham",
      "email": "mark.padgham@email.com"
    }
  ],
  "softwareSuggestions": [
    {
      "@type": "SoftwareApplication",
      "identifier": "dbscan",
      "name": "dbscan",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=dbscan"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "knitr",
      "name": "knitr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=knitr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "rmarkdown",
      "name": "rmarkdown",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=rmarkdown"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "roxygen2",
      "name": "roxygen2",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=roxygen2"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "testthat",
      "name": "testthat",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=testthat"
    }
  ],
  "softwareRequirements": {
    "1": {
      "@type": "SoftwareApplication",
      "identifier": "R",
      "name": "R",
      "version": ">= 4.1.0"
    },
    "2": {
      "@type": "SoftwareApplication",
      "identifier": "alphahull",
      "name": "alphahull",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=alphahull"
    },
    "3": {
      "@type": "SoftwareApplication",
      "identifier": "dplyr",
      "name": "dplyr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=dplyr"
    },
    "4": {
      "@type": "SoftwareApplication",
      "identifier": "ggplot2",
      "name": "ggplot2",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=ggplot2"
    },
    "5": {
      "@type": "SoftwareApplication",
      "identifier": "ggthemes",
      "name": "ggthemes",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=ggthemes"
    },
    "6": {
      "@type": "SoftwareApplication",
      "identifier": "methods",
      "name": "methods"
    },
    "7": {
      "@type": "SoftwareApplication",
      "identifier": "Rcpp",
      "name": "Rcpp",
      "version": ">= 0.12.6",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=Rcpp"
    },
    "8": {
      "@type": "SoftwareApplication",
      "identifier": "tibble",
      "name": "tibble",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=tibble"
    },
    "9": {
      "@type": "SoftwareApplication",
      "identifier": "tripack",
      "name": "tripack",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=tripack"
    },
    "SystemRequirements": {}
  },
  "fileSize": "17667.689KB",
  "readme": "https://github.com/mpadge/spatialcluster/blob/main/README.md",
  "contIntegration": [
    "https://github.com/mpadge/spatialcluster/actions?query=workflow%3AR-CMD-check",
    "https://codecov.io/gh/mpadge/spatialcluster"
  ],
  "developmentStatus": "http://www.repostatus.org/#wip",
  "keywords": [
    "clustering-algorithm",
    "cluster",
    "r",
    "spatial"
  ]
}

GitHub Events

Total
  • Issues event: 2
  • Watch event: 2
  • Push event: 16
Last Year
  • Issues event: 2
  • Watch event: 2
  • Push event: 16

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 322
  • Total Committers: 1
  • Avg Commits per committer: 322.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
mpadge m****m@e****m 322

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 30
  • Total pull requests: 0
  • Average time to close issues: about 1 year
  • Average time to close pull requests: N/A
  • Total issue authors: 4
  • Total pull request authors: 0
  • Average comments per issue: 1.47
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 0
  • Average time to close issues: 29 days
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • mpadge (27)
  • geniusadventurer (1)
  • Nowosad (1)
  • ambarja (1)
Top Labels
Issue Labels
  • future stuff (5)
  • enhancement (4)
  • must do (4)
  • bug (1)

Dependencies

.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pkgdown.yaml actions
  • JamesIves/github-pages-deploy-action 4.1.4 composite
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION cran
  • R >= 3.3.0 depends
  • Rcpp >= 0.12.6 imports
  • alphahull * imports
  • dplyr * imports
  • ggplot2 * imports
  • ggthemes * imports
  • magrittr * imports
  • methods * imports
  • tibble * imports
  • tripack * imports
  • dbscan * suggests
  • knitr * suggests
  • rmarkdown * suggests
  • roxygen2 * suggests
  • testthat * suggests