CoordinateCleaner
Automated flagging of common spatial and temporal errors in biological and palaeontological collection data, for the use in conservation, ecology and palaeontology.
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 4 DOI reference(s) in README
- ✓ Academic publication links: Links to: zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.8%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: ropensci
- Language: HTML
- Default Branch: master
- Homepage: https://docs.ropensci.org/CoordinateCleaner/
- Size: 59.1 MB
Statistics
- Stars: 84
- Watchers: 14
- Forks: 23
- Open Issues: 30
- Releases: 4
Metadata Files
README.md
CoordinateCleaner v3.0
CoordinateCleaner has been updated to version 3.0 on GitHub and on CRAN to adapt to the retirement of sp and raster. The update may not be compatible with analysis pipelines built with version 2.x.
Automated flagging of common spatial and temporal errors in biological and palaeontological collection data, for use in conservation, ecology and palaeontology. Specifically, it includes tests for:
- General coordinate validity
- Country and province centroids
- Capital coordinates
- Coordinates of biodiversity institutions
- Spatial outliers
- Temporal outliers
- Coordinate-country discordance
- Duplicated coordinates per species
- Assignment to the location of the GBIF headquarters
- Urban areas
- Seas
- Plain zeros
- Equal longitude and latitude
- Rounded coordinates
- DDMM to DD.DD coordinate conversion errors
- Large temporal uncertainty (fossils)
- Equal minimum and maximum ages (fossils)
- Spatio-temporal outliers (fossils)
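To give a feel for what the simplest record-level checks in this list look at, here is a minimal base-R sketch of three of them (coordinate validity, plain zeros, equal longitude and latitude). It is an illustration under simplified assumptions, not the package's implementation: the helper `flag_basic` and its output column names are hypothetical, and the convention that `TRUE` means "passes" and `FALSE` means "flagged" is chosen for this sketch. The package's own tests may apply buffers and other details not shown here.

```r
# Hypothetical helper (not part of CoordinateCleaner): flag basic coordinate problems.
# TRUE = record passes the check, FALSE = record is flagged.
flag_basic <- function(x, lon = "decimalLongitude", lat = "decimalLatitude") {
  lo <- x[[lon]]
  la <- x[[lat]]
  stopifnot(is.numeric(lo), is.numeric(la))

  # Records with both coordinates present
  ok <- !is.na(lo) & !is.na(la)

  # Validity: non-missing and within the possible lon/lat ranges
  valid <- ok & abs(lo) <= 180 & abs(la) <= 90

  # Plain zeros: both coordinates exactly 0 (%in% treats NA as "no match")
  not_zero <- !(lo %in% 0 & la %in% 0)

  # Equal longitude and latitude
  not_equal <- !(ok & lo == la)

  out <- data.frame(valid = valid, not_zero = not_zero, not_equal = not_equal)
  # A record is clean only if it passes all three checks
  out$summary <- valid & not_zero & not_equal
  out
}

# Small hand-made example: one clean record and four problematic ones
recs <- data.frame(
  decimalLongitude = c(45.2, 0, 12.5, 200, NA),
  decimalLatitude  = c(-18.9, 0, 12.5, 10, 5)
)
flags <- flag_basic(recs)
print(flags)
```

Only the first record passes all three checks; the others fail on zeros, equal coordinates, an out-of-range longitude and a missing longitude, respectively.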
CoordinateCleaner can be particularly useful for improving data quality when using data from GBIF (e.g. obtained with rgbif) or the Paleobiology Database (e.g. obtained with paleobioDB) for historical biogeography (e.g. with BioGeoBEARS or phytools), automated conservation assessment (e.g. with speciesgeocodeR or conR) or species distribution modelling (e.g. with dismo or sdm). See scrubr and taxize for complementary taxonomic cleaning, or biogeo for correcting spatial coordinate errors.
See News for update information.
Installation
Stable from CRAN
```r
install.packages("CoordinateCleaner")
library(CoordinateCleaner)
```
Developmental from GitHub
```r
devtools::install_github("ropensci/CoordinateCleaner")
library(CoordinateCleaner)
```
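Because version 3.0 may not be compatible with pipelines built with version 2.x, it can help to check which series is installed before running an older script. The following is a minimal sketch using base R only; the helper name `is_v3_or_later` is hypothetical and not part of the package.

```r
# Hypothetical helper: does a version string belong to the 3.x series or later?
is_v3_or_later <- function(version) {
  package_version(version) >= package_version("3.0")
}

# Inspect the installed copy only if the package is actually available
if (requireNamespace("CoordinateCleaner", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("CoordinateCleaner"))
  if (is_v3_or_later(installed)) {
    message("CoordinateCleaner ", installed, " is in the 3.x series or later")
  } else {
    message("CoordinateCleaner ", installed, " predates the 3.0 update")
  }
}
```

If the check reports a pre-3.0 version, scripts written for 2.x are more likely to run unchanged; otherwise they may need adjusting.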
Usage
A simple example:
```r
# Simulate example data
minages <- runif(250, 0, 65)
exmpl <- data.frame(species = sample(letters, size = 250, replace = TRUE),
                    decimalLongitude = runif(250, min = 42, max = 51),
                    decimalLatitude = runif(250, min = -26, max = -11),
                    min_ma = minages,
                    max_ma = minages + runif(250, 0.1, 65),
                    dataset = "clean")
# Run record-level tests
rl <- clean_coordinates(x = exmpl)
summary(rl)
plot(rl)
# Dataset level
dsl <- clean_dataset(exmpl)
# For fossils
fl <- clean_fossils(x = exmpl, taxon = "species",
                    lon = "decimalLongitude", lat = "decimalLatitude")
summary(fl)
# Alternative example using the pipe
library(tidyverse)
cl <- exmpl %>% cc_val() %>% cc_cap() %>% cd_ddmm() %>%
  cf_range(lon = "decimalLongitude", lat = "decimalLatitude", taxon = "species")
```
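Once records carry per-test flags, a common next step is to count how many records each test flagged and to keep only the records that pass every test. The sketch below illustrates that step in base R on a hand-made flag table, so it runs without the package. The column names (`.val`, `.zer`, `.summary`) and the `TRUE` = passed convention are assumptions made for illustration; check them against the package documentation for the version you run.

```r
# Hand-made example: three records with hypothetical per-test flag columns
flagged <- data.frame(
  species = c("a", "b", "c"),
  .val    = c(TRUE, TRUE, FALSE),
  .zer    = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Logical matrix of the test results (TRUE = passed)
test_cols <- c(".val", ".zer")
passed <- as.matrix(flagged[, test_cols])

# A record is clean only if no test flagged it
flagged$.summary <- unname(rowSums(!passed) == 0)

# Number of records flagged by each test
n_flagged <- colSums(!passed)
print(n_flagged)

# Keep the clean records and drop the flag columns
keep_cols <- setdiff(names(flagged), c(test_cols, ".summary"))
clean <- flagged[flagged$.summary, keep_cols, drop = FALSE]
print(clean)
```

Keeping the flag columns until the end, rather than filtering after each test, makes it easy to report which test removed which records.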
Documentation
Pipelines for cleaning data from the Global Biodiversity Information Facility (GBIF) and the Paleobiology Database (PaleobioDB) are available here.
Contributing
See the CONTRIBUTING document.
Citation
Zizka A, Silvestro D, Andermann T, Azevedo J, Duarte Ritter C, Edler D, Farooq H, Herdean A, Ariza M, Scharn R, Svanteson S, Wengstrom N, Zizka V & Antonelli A (2019) CoordinateCleaner: standardized cleaning of occurrence records from biological collection databases. Methods in Ecology and Evolution, 10(5):744-751, doi:10.1111/2041-210X.13152, https://github.com/ropensci/CoordinateCleaner
Owner
- Name: rOpenSci
- Login: ropensci
- Kind: organization
- Email: info@ropensci.org
- Location: Berkeley, CA
- Website: https://ropensci.org/
- Twitter: rOpenSci
- Repositories: 307
- Profile: https://github.com/ropensci
CodeMeta (codemeta.json)
{
"@context": [
"https://doi.org/10.5063/schema/codemeta-2.0",
"http://schema.org"
],
"@type": "SoftwareSourceCode",
"identifier": "CoordinateCleaner",
"description": "Automated flagging of common spatial and temporal errors in biological and paleontological collection data, for the use in conservation, ecology and paleontology. Includes automated tests to easily flag (and exclude) records assigned to country or province centroid, the open ocean, the headquarters of the Global Biodiversity Information Facility, urban areas or the location of biodiversity institutions (museums, zoos, botanical gardens, universities). Furthermore identifies per species outlier coordinates, zero coordinates, identical latitude/longitude and invalid coordinates. Also implements an algorithm to identify data sets with a significant proportion of rounded coordinates. Especially suited for large data sets. The reference for the methodology is: Zizka et al. (2019) <doi:10.1111/2041-210X.13152>.",
"name": "CoordinateCleaner: Automated Cleaning of Occurrence Records from Biological Collections",
"codeRepository": "https://github.com/ropensci/CoordinateCleaner",
"issueTracker": "https://github.com/ropensci/CoordinateCleaner/issues",
"license": "https://spdx.org/licenses/GPL-3.0",
"version": "2.0.20",
"programmingLanguage": {
"@type": "ComputerLanguage",
"name": "R",
"url": "https://r-project.org"
},
"runtimePlatform": "R Under development (unstable) (2021-10-19 r81077)",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"author": [
{
"@type": "Person",
"givenName": "Alexander",
"familyName": "Zizka",
"email": "zizka.alexander@gmail.com"
}
],
"contributor": [
{
"@type": "Person",
"givenName": "Daniele",
"familyName": "Silvestro"
},
{
"@type": "Person",
"givenName": "Tobias",
"familyName": "Andermann"
},
{
"@type": "Person",
"givenName": "Josue",
"familyName": "Azevedo"
},
{
"@type": "Person",
"givenName": "Camila",
"familyName": "Duarte Ritter"
},
{
"@type": "Person",
"givenName": "Daniel",
"familyName": "Edler"
},
{
"@type": "Person",
"givenName": "Harith",
"familyName": "Farooq"
},
{
"@type": "Person",
"givenName": "Andrei",
"familyName": "Herdean"
},
{
"@type": "Person",
"givenName": "Maria",
"familyName": "Ariza"
},
{
"@type": "Person",
"givenName": "Ruud",
"familyName": "Scharn"
},
{
"@type": "Person",
"givenName": "Sten",
"familyName": "Svanteson"
},
{
"@type": "Person",
"givenName": "Niklas",
"familyName": "Wengstrom"
},
{
"@type": "Person",
"givenName": "Vera",
"familyName": "Zizka"
},
{
"@type": "Person",
"givenName": "Alexandre",
"familyName": "Antonelli"
}
],
"maintainer": [
{
"@type": "Person",
"givenName": "Alexander",
"familyName": "Zizka",
"email": "zizka.alexander@gmail.com"
}
],
"softwareSuggestions": [
{
"@type": "SoftwareApplication",
"identifier": "covr",
"name": "covr",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=covr"
},
{
"@type": "SoftwareApplication",
"identifier": "countrycode",
"name": "countrycode",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=countrycode"
},
{
"@type": "SoftwareApplication",
"identifier": "knitr",
"name": "knitr",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=knitr"
},
{
"@type": "SoftwareApplication",
"identifier": "maps",
"name": "maps",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=maps"
},
{
"@type": "SoftwareApplication",
"identifier": "magrittr",
"name": "magrittr",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=magrittr"
},
{
"@type": "SoftwareApplication",
"identifier": "paleobioDB",
"name": "paleobioDB",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=paleobioDB"
},
{
"@type": "SoftwareApplication",
"identifier": "rmarkdown",
"name": "rmarkdown",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=rmarkdown"
},
{
"@type": "SoftwareApplication",
"identifier": "rnaturalearthdata",
"name": "rnaturalearthdata",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=rnaturalearthdata"
},
{
"@type": "SoftwareApplication",
"identifier": "testthat",
"name": "testthat",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=testthat"
},
{
"@type": "SoftwareApplication",
"identifier": "viridis",
"name": "viridis",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=viridis"
}
],
"softwareRequirements": [
{
"@type": "SoftwareApplication",
"identifier": "R",
"name": "R",
"version": ">= 3.5.0"
},
{
"@type": "SoftwareApplication",
"identifier": "dplyr",
"name": "dplyr",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=dplyr"
},
{
"@type": "SoftwareApplication",
"identifier": "geosphere",
"name": "geosphere",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=geosphere"
},
{
"@type": "SoftwareApplication",
"identifier": "ggplot2",
"name": "ggplot2",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=ggplot2"
},
{
"@type": "SoftwareApplication",
"identifier": "graphics",
"name": "graphics"
},
{
"@type": "SoftwareApplication",
"identifier": "grDevices",
"name": "grDevices"
},
{
"@type": "SoftwareApplication",
"identifier": "methods",
"name": "methods"
},
{
"@type": "SoftwareApplication",
"identifier": "raster",
"name": "raster",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=raster"
},
{
"@type": "SoftwareApplication",
"identifier": "rgbif",
"name": "rgbif",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=rgbif"
},
{
"@type": "SoftwareApplication",
"identifier": "rgeos",
"name": "rgeos",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=rgeos"
},
{
"@type": "SoftwareApplication",
"identifier": "rgdal",
"name": "rgdal",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=rgdal"
},
{
"@type": "SoftwareApplication",
"identifier": "rnaturalearth",
"name": "rnaturalearth",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=rnaturalearth"
},
{
"@type": "SoftwareApplication",
"identifier": "stats",
"name": "stats"
},
{
"@type": "SoftwareApplication",
"identifier": "sp",
"name": "sp",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=sp"
},
{
"@type": "SoftwareApplication",
"identifier": "tidyselect",
"name": "tidyselect",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=tidyselect"
},
{
"@type": "SoftwareApplication",
"identifier": "utils",
"name": "utils"
},
{
"@type": "SoftwareApplication",
"identifier": "https://sysreqs.r-hub.io/get/gdal"
},
{
"@type": "SoftwareApplication",
"identifier": "https://sysreqs.r-hub.io/get/libgdal"
}
],
"releaseNotes": "https://github.com/ropensci/CoordinateCleaner/blob/master/NEWS.md",
"readme": "https://github.com/ropensci/CoordinateCleaner/blob/master/README.md",
"fileSize": "130599.254KB",
"relatedLink": "https://ropensci.github.io/CoordinateCleaner/",
"developmentStatus": "https://www.repostatus.org/#active",
"citation": [
{
"@type": "ScholarlyArticle",
"datePublished": "2019",
"author": [
{
"@type": "Person",
"givenName": "Alexander",
"familyName": "Zizka"
},
{
"@type": "Person",
"givenName": "Daniele",
"familyName": "Silvestro"
},
{
"@type": "Person",
"givenName": "Tobias",
"familyName": "Andermann"
},
{
"@type": "Person",
"givenName": "Josue",
"familyName": "Azevedo"
},
{
"@type": "Person",
"givenName": "Camila",
"familyName": "Duarte Ritter"
},
{
"@type": "Person",
"givenName": "Daniel",
"familyName": "Edler"
},
{
"@type": "Person",
"givenName": "Harith",
"familyName": "Farooq"
},
{
"@type": "Person",
"givenName": "Andrei",
"familyName": "Herdean"
},
{
"@type": "Person",
"givenName": "Maria",
"familyName": "Ariza"
},
{
"@type": "Person",
"givenName": "Ruud",
"familyName": "Scharn"
},
{
"@type": "Person",
"givenName": "Sten",
"familyName": "Svanteson"
},
{
"@type": "Person",
"givenName": "Niklas",
"familyName": "Wengstrom"
},
{
"@type": "Person",
"givenName": "Vera",
"familyName": "Zizka"
},
{
"@type": "Person",
"givenName": "Alexandre",
"familyName": "Antonelli"
}
],
"name": "CoordinateCleaner: standardized cleaning of occurrence records from biological collection databases",
"identifier": "10.1111/2041-210X.13152",
"url": "https://github.com/ropensci/CoordinateCleaner",
"pagination": "-7",
"@id": "https://doi.org/10.1111/2041-210X.13152",
"sameAs": "https://doi.org/10.1111/2041-210X.13152",
"isPartOf": {
"@type": "PublicationIssue",
"issueNumber": "10",
"datePublished": "2019",
"isPartOf": {
"@type": [
"PublicationVolume",
"Periodical"
],
"name": "Methods in Ecology and Evolution"
}
}
}
],
"copyrightHolder": {},
"funder": {},
"keywords": [
"r",
"r-package",
"rstats"
],
"review": {
"@type": "Review",
"url": "https://github.com/ropensci/software-review/issues/210",
"provider": "https://ropensci.org"
}
}
GitHub Events
Total
- Issues event: 3
- Watch event: 6
- Issue comment event: 2
- Push event: 4
- Fork event: 1
Last Year
- Issues event: 3
- Watch event: 6
- Issue comment event: 2
- Push event: 4
- Fork event: 1
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| azizka | a****a@b****e | 394 |
| azizka | z****r@g****m | 187 |
| Zizka | a****y@i****e | 51 |
| BrunoVilela | b****a@h****m | 29 |
| Pakillo | f****c@g****m | 28 |
| Irene | j****e@e****m | 20 |
| Maëlle Salmon | m****n@y****e | 6 |
| plantarum | t****r@p****a | 3 |
| Hugo Gruson | B****o | 2 |
| Jeroen Ooms | j****s@g****m | 2 |
| mhesselbarth | m****h@g****m | 2 |
| AMBarbosa | A****a | 1 |
| John Baumgartner | j****s@g****m | 1 |
| Michael Sumner | m****r@g****m | 1 |
| John Waller | f****2@s****n | 1 |
| Shawn Laffan | s****n@g****m | 1 |
| Vince Buffalo | v****A@g****m | 1 |
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 93
- Total pull requests: 15
- Average time to close issues: 4 months
- Average time to close pull requests: 5 months
- Total issue authors: 58
- Total pull request authors: 13
- Average comments per issue: 2.14
- Average comments per pull request: 0.33
- Merged pull requests: 13
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 4
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: about 18 hours
- Issue authors: 3
- Pull request authors: 1
- Average comments per issue: 0.5
- Average comments per pull request: 1.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- jhnwllr (8)
- azizka (6)
- jivelasquezt (5)
- AMBarbosa (4)
- jaum20 (3)
- HMB3 (3)
- maelle (3)
- wcornwell (3)
- rvosa (2)
- rsbivand (2)
- damariszurell (2)
- jpstevenson2018 (2)
- sandro-unibe (2)
- CyanBC (2)
- pepbioalerts (2)
Pull Request Authors
- maelle (2)
- jhnwllr (2)
- plantarum (1)
- shawnlaffan (1)
- Pakillo (1)
- vsbuffalo (1)
- Bisaloo (1)
- isteves (1)
- joelnitta (1)
- mdsumner (1)
- johnbaums (1)
- mhesselbarth (1)
- AMBarbosa (1)
Packages
- Total packages: 1
- Total downloads: 1,651 last month (CRAN)
- Total dependent packages: 5
- Total dependent repositories: 7
- Total versions: 15
- Total maintainers: 1
cran.r-project.org: CoordinateCleaner
Automated Cleaning of Occurrence Records from Biological Collections
- Homepage: https://ropensci.github.io/CoordinateCleaner/
- Documentation: http://cran.r-project.org/web/packages/CoordinateCleaner/CoordinateCleaner.pdf
- License: GPL-3
- Latest release: 3.0.1 (published over 2 years ago)
Dependencies
- R >= 3.5.0 depends
- dplyr * imports
- geosphere * imports
- ggplot2 * imports
- grDevices * imports
- graphics * imports
- methods * imports
- raster * imports
- rgbif * imports
- rgdal * imports
- rgeos * imports
- rnaturalearth * imports
- sp * imports
- stats * imports
- tidyselect * imports
- utils * imports
- countrycode * suggests
- covr * suggests
- knitr * suggests
- magrittr * suggests
- maps * suggests
- paleobioDB * suggests
- rmarkdown * suggests
- rnaturalearthdata * suggests
- testthat * suggests
- viridis * suggests
- actions/checkout v2 composite
- actions/upload-artifact main composite
- r-lib/actions/check-r-package v1 composite
- r-lib/actions/setup-pandoc v1 composite
- r-lib/actions/setup-r v1 composite
- r-lib/actions/setup-r-dependencies v1 composite
- actions/checkout v2 composite
- r-lib/actions/setup-pandoc v1 composite
- r-lib/actions/setup-r v1 composite
- r-lib/actions/setup-r-dependencies v1 composite
