ebvcube

Accessing, visualising and creating EBV netCDF datasets. Download datasets from the EBV Data Portal: https://portal.geobon.org/

https://github.com/ebvcube/ebvcube

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.2%) to scientific vocabulary

Keywords

biodiversity-standards essential-biodiversity-variables netcdf4 r-package
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: EBVcube
  • License: gpl-3.0
  • Language: R
  • Default Branch: main
  • Homepage:
  • Size: 19.9 MB
Statistics
  • Stars: 8
  • Watchers: 1
  • Forks: 2
  • Open Issues: 1
  • Releases: 9
Created over 4 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing License Code of conduct Citation

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%",
  error=TRUE
)
```

# ebvcube package



[![CRAN status](https://www.r-pkg.org/badges/version/ebvcube)](https://cran.r-project.org/package=ebvcube) [![R-CMD-check](https://github.com/EBVCube/ebvcube/actions/workflows/R.yaml/badge.svg?branch=dev)](https://github.com/EBVCube/ebvcube/actions/workflows/R.yaml) [![ebvcube status badge](https://b-cubed-eu.r-universe.dev/ebvcube/badges/version)](https://b-cubed-eu.r-universe.dev/ebvcube) [![name status badge](https://b-cubed-eu.r-universe.dev/badges/:name?color=6CDDB4)](https://b-cubed-eu.r-universe.dev/) [![codecov](https://codecov.io/gh/EBVcube/ebvcube/graph/badge.svg?token=2TVFHRKBNJ)](https://app.codecov.io/gh/EBVcube/ebvcube) [![Static Badge](https://img.shields.io/badge/DOI-10.32614%2FCRAN.package.ebvcube-blue?link=https%3A%2F%2Fcran.r-project.org%2Fweb%2Fpackages%2Febvcube%2Findex.html)](https://cran.r-project.org/package=ebvcube)



This package can be used to easily access the data of the EBV netCDFs which can be downloaded from the [EBV Data Portal](https://portal.geobon.org/). It also provides some basic visualization. Advanced users can build their own netCDFs following the EBV structure.

## 1. Basis

The EBV netCDF structure is designed to hold Essential Biodiversity Variables (EBVs). This concept is further described [here](https://geobon.org/ebvs/what-are-ebvs/). The files are based on the [Network Common Data Form](https://www.unidata.ucar.edu/software/netcdf/) (netCDF). In addition, they follow the [Climate and Forecast Conventions](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html) (CF, version 1.8) and the [Attribute Convention for Data Discovery](https://wiki.esipfed.org/Attribute_Convention_for_Data_Discovery_1-3) (ACDD, version 1.3).

## 2. Data structure

The structure allows several data cubes per netCDF file. Each cube has four dimensions: longitude, latitude, time and entity. The entity dimension can, for example, encompass different species or groups of species, ecosystem types or other categories. Hierarchical netCDF groups allow multiple data cubes to coexist in one file, and all cubes share the same dimensions. The first level of groups holds the scenarios, e.g., the modelling for different Shared Socioeconomic Pathways (SSP) scenarios. The second level holds the metrics, e.g., the percentage of protected area per pixel and its proportional loss over a certain time span per pixel. If scenarios are present, all metrics are repeated per scenario. For an extensive explanation check out the [EBVCube format repository](https://github.com/EBVcube/EBVCube-format).

``` bash
├── scenario_1
│   ├── metric_1
│   │   └── ebv_cube [lon, lat, time, entity]
│   │
│   └── metric_2
│       └── ebv_cube [lon, lat, time, entity]
│
└── scenario_2
    ├── metric_1
    │   └── ebv_cube [lon, lat, time, entity]
    │
    └── metric_2
        └── ebv_cube [lon, lat, time, entity]
```

Keep in mind: every EBV netCDF has at least one metric, but it may or may not have scenarios. The data itself is stored in the resulting 4D datacubes.
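
To make the layout concrete, here is a minimal base R sketch (it does not use ebvcube) that mimics the hierarchy with a nested list and 4D arrays. The group names, the toy dimension sizes and the helper `make_cube` are illustrative assumptions, not part of the package or the file format.

```r
# toy dimension sizes (assumed for illustration only)
n_lon <- 4; n_lat <- 3; n_time <- 2; n_entity <- 3

# hypothetical helper: an empty 4D cube [lon, lat, time, entity]
make_cube <- function() {
  array(NA_real_, dim = c(n_lon, n_lat, n_time, n_entity))
}

# scenarios (first level) contain metrics (second level), each holding one cube
ebv <- list(
  scenario_1 = list(metric_1 = list(ebv_cube = make_cube()),
                    metric_2 = list(ebv_cube = make_cube())),
  scenario_2 = list(metric_1 = list(ebv_cube = make_cube()),
                    metric_2 = list(ebv_cube = make_cube()))
)

# every cube shares the same four dimensions
cube <- ebv$scenario_1$metric_2$ebv_cube
dim(cube)
#> [1] 4 3 2 3

# one 2D slice: all pixels for timestep 1 of entity 2
slice <- cube[, , 1, 2]
dim(slice)
#> [1] 4 3
```

In a real file the same navigation is done with datacube paths rather than list indexing.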

## 2. Installation

You can install the ebvcube package from the following sources:

``` r
#installation of the current version on CRAN
install.packages('ebvcube') 

#installation of the latest development version from GitHub
devtools::install_github('https://github.com/EBVCube/ebvcube/tree/dev')

#installation of the current version on the b3verse (https://b-cubed-eu.r-universe.dev/)
install.packages("ebvcube", repos = c("https://b-cubed-eu.r-universe.dev", "https://cloud.r-project.org", "https://bioc.r-universe.dev"))

#troubleshooting in case the BioConductor packages are missing
#if one of the following packages is not loaded: rhdf5, DelayedArray, HDF5Array
install.packages("BiocManager")
BiocManager::install('rhdf5')
BiocManager::install('DelayedArray')
BiocManager::install('HDF5Array')
```
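
Before reinstalling anything, it can help to check which of these packages are actually missing. A minimal sketch using base R only; the variable names are illustrative.

```r
# Bioconductor packages named in the troubleshooting note above
bioc_pkgs <- c("rhdf5", "DelayedArray", "HDF5Array")

# TRUE for each package whose namespace can be loaded, FALSE otherwise
available <- vapply(bioc_pkgs, requireNamespace, logical(1), quietly = TRUE)

# names of the packages that still need BiocManager::install()
missing_pkgs <- bioc_pkgs[!available]
missing_pkgs
```

An empty result means all three packages are already installed.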

## 3. Working with the package - a quick intro

The example data set used in this README is a spatial subset (African continent) of the [Local bird diversity (cSAR/BES-SIM)](https://portal.geobon.org/ebv-detail?id=1) data set by Ines Martins.

### 3.1 Take a very first look at the file

With the following two functions you get the metadata of a specific EBV netCDF file. First we take a look at some basic metadata of that file. The properties encompass much more information!

```{r example}
library(ebvcube)

#set the path to the file
file <- system.file(file.path("extdata", "martins_comcom_subset.nc"), package="ebvcube")

#read the properties of the file
prop_file <- ebv_properties(file, verbose=FALSE)

#take a look at the general properties of the data set - there are more properties to discover!
prop_file@general[c(1, 2, 4)]
slotNames(prop_file)


```

Now let's get the paths to all possible datacubes. The resulting data.frame includes the paths and also descriptions of the metric and/or scenario and/or entity. The paths basically consist of the nested structure of scenario, metric and the datacube.

```{r}
datacubes <- ebv_datacubepaths(file, verbose=FALSE)
datacubes
```

In the next step we will get the properties of one specific datacube - fyi: the result also holds the general file properties from above.

```{r}
prop_dc <- ebv_properties(file, datacubes[1, 1], verbose=FALSE)
prop_dc@metric
```

### 3.2 Plot the data to get a better impression

To discover the spatial distribution of the data, you can plot a map of a datacube, here the second one from the list above. It has 12 timesteps. Here we look at the first one.

```{r}
#plot the map of the first timestep
dc <- datacubes[2, 1]
ebv_map(file, dc, entity=1, timestep = 1, classes = 9,
        verbose=FALSE, col_rev = TRUE)
```

It's nice to see the spatial distribution, but how does that datacube (non-forest birds) change over time? Let's take a look at the average. The function returns the values, so catch them!

```{r}
#get the averages and plot
averages <- ebv_trend(file, dc, entity=1, verbose=FALSE)
averages
```
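
To make clear what a per-timestep average means, the following base R sketch computes one on a toy array. It is an illustration with invented numbers, not the implementation of `ebv_trend`, and it ignores any weighting the package may apply.

```r
# toy data: 2 x 2 pixels, 3 timesteps (values invented for illustration)
toy <- array(c(1, 2, 3, NA,   # timestep 1
               2, 4, 6, 8,    # timestep 2
               3, 3, 3, 3),   # timestep 3
             dim = c(2, 2, 3))

# mean over all pixels for each timestep, ignoring missing values
toy_means <- apply(toy, 3, mean, na.rm = TRUE)
toy_means
#> [1] 2 5 3
```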

Would you like the same for other indicators? Check out the different options for the 'method' argument.

### 3.3 Read the data from the files to start working

Before you actually load the data it may be nice to get an impression of the value range and other basic measurements.

```{r}
#info for whole dataset
measurements <- ebv_analyse(file, dc, entity=1, verbose=FALSE)
#see the included measurements
names(measurements)
#check out the mean and the number of pixels
measurements$mean
measurements$n

#info for a subset defined by a bounding box
#you can also define the subset by a Shapefile - check it out!
bb <- c(-26, 64, 30, 38)
measurements_bb <- ebv_analyse(file, dc, entity = 1, subset = bb, verbose=FALSE)
#check out the mean of the subset
measurements_bb$mean
measurements_bb$n
```
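
The bounding box above is read here as xmin, xmax, ymin, ymax. That order is an assumption based on the values, so check the help page of `ebv_analyse` before relying on it. Under that assumption, a base R sketch of how such a box selects pixels from a toy global 1-degree grid (coordinates invented for illustration):

```r
bb <- c(-26, 64, 30, 38)  # assumed order: xmin, xmax, ymin, ymax

# toy pixel-centre coordinates of a global 1-degree grid
lon <- seq(-179.5, 179.5, by = 1)
lat <- seq(89.5, -89.5, by = -1)

# pixels whose centre falls inside the box
in_lon <- lon >= bb[1] & lon <= bb[2]
in_lat <- lat >= bb[3] & lat <= bb[4]

# size of the resulting subset (columns, rows)
c(sum(in_lon), sum(in_lat))
#> [1] 90  8
```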

To access the first three timesteps of the data you can use the following:

```{r}
#load whole data as array for three timesteps
data <- ebv_read(file, dc, entity = 1, timestep = 1:3, type = 'a')
dim(data)
```

You can also get a spatial subset of the data by providing a Shapefile.

```{r}
#load subset from shapefile (Cameroon)
shp <- system.file(file.path('extdata', 'cameroon.shp'), package="ebvcube")
data_shp <- ebv_read_shp(file, dc, entity=1, shp = shp, timestep = c(1, 2, 3), verbose=FALSE)
dim(data_shp)
#very quick plot of the resulting raster plus the shapefile
borders <- terra::vect(shp)
ggplot2::ggplot() +
  tidyterra::geom_spatraster(data = data_shp[[1]]) +
  tidyterra::geom_spatvector(data = borders, fill = NA) +
  ggplot2::scale_fill_fermenter(na.value=NA, palette = 'YlGn', direction = 1) +
  ggplot2::theme_classic()
```

Imagine you have a very large dataset but only limited memory. The package can load the data as a DelayedArray, and the ebv_write() function helps you write that data back to disk properly. See the manual for more information.

### 3.4 Take a peek at the creation of an EBV netCDF

#### a. Create an empty EBV netCDF (with metadata)

First, enter all the metadata in the [EBV Data Portal](https://portal.geobon.org) and use the resulting text file (JSON format) to create an empty netCDF that complies with the EBV netCDF structure, i.e., it has the correct structure mapped to your data and holds the metadata. In addition to that JSON file, the function needs a list of all entities the netCDF will encompass (see the help page for detailed information) and geospatial information such as the coordinate reference system.

The example is based on the [Local bird diversity (cSAR/BES-SIM)](https://portal.geobon.org/ebv-detail?id=1).

```{r}
#paths
json <- system.file(file.path('extdata', 'metadata.json'), package="ebvcube")
new_nc <- file.path(system.file(package="ebvcube"), 'extdata', 'test.nc')
entities <- c('forest bird species', 'non-forest bird species', 'all bird species')
#defining the fillvalue - optional
fv <- -3.4e+38
#create the netCDF
ebv_create(jsonpath = json, outputpath = new_nc, entities = entities,
           epsg = 4326, extent = c(-180, 180, -90, 90), resolution = c(1, 1),
           fillvalue = fv, overwrite=TRUE, verbose=FALSE)

#needless to say: check the properties of your newly created file to see if you get what you want
#especially the entity_names from the slot general should be checked to see if your entities (vector or csv file) were read the right way
print(ebv_properties(new_nc, verbose=FALSE)@general$entity_names)

#check out the (still empty) datacubes that are available
dc_new <- ebv_datacubepaths(new_nc, verbose=FALSE)
print(dc_new)
```
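
A quick sanity check before creating the file is to compute the grid size implied by `extent` and `resolution`. This sketch uses base R only and assumes the extent is ordered xmin, xmax, ymin, ymax, as the values suggest; the variable names are illustrative.

```r
extent <- c(-180, 180, -90, 90)  # assumed order: xmin, xmax, ymin, ymax
resolution <- c(1, 1)            # pixel size in x and y (degrees for EPSG:4326)

# number of pixel columns and rows covered by the extent
n_cols <- (extent[2] - extent[1]) / resolution[1]
n_rows <- (extent[4] - extent[3]) / resolution[2]

# a global 1-degree grid has 360 x 180 pixels
c(n_cols, n_rows)
#> [1] 360 180
```

If these numbers do not match the dimensions of the rasters you plan to add, it is worth revisiting the extent or resolution first.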

Hint: You can always take a look at your netCDF in [Panoply](https://www.giss.nasa.gov/tools/panoply/) provided by NASA. That's very helpful to understand the structure.

#### b. Add your data to the EBV netCDF

In the next step you can add your data to the netCDF from GeoTIFF files or in-memory objects (matrix/array). You need to indicate the datacubepath the data belongs to. You can add your data timestep by timestep, in slices or all at once. You can simply add more data to the same datacube by changing the timestep definition.

```{r}
#path to tif with data
root <- system.file(file.path('extdata'), package="ebvcube")
tifs <- c('entity1.tif', 'entity2.tif', 'entity3.tif')
tif_paths <- file.path(root, tifs)

#adding the data
entity <- 1
for (tif in tif_paths){
  ebv_add_data(filepath_nc = new_nc,
               metric = 1,
               entity = entity,
               timestep=1:3,
               data = tif,
               band = 1:3,
               verbose = FALSE)
  entity <- entity + 1
}
```

#### c. Add missing attributes to datacube

Oops! Did you make a mistake and want to change an attribute? No problem. Just use the following function to change it.

```{r}
ebv_attribute(new_nc, attribute_name='units', value='Percentage', levelpath=dc_new[1, 1])
#check the properties one more time - perfect!
print(ebv_properties(new_nc, dc_new[1, 1], verbose=FALSE)@ebv_cube$units)

```

In this case the levelpath corresponds to the datacube path. But you can also alter attributes at the metric or scenario level. See the manual for more info.
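
If datacube paths follow the nested scenario/metric/cube layout described in the data structure section, the parent levels can be derived with base R. A sketch with a hypothetical path; real paths should be taken from `ebv_datacubepaths()`.

```r
# hypothetical datacube path (scenario / metric / cube)
dc_path <- "scenario_1/metric_1/ebv_cube"

metric_path <- dirname(dc_path)        # metric level
scenario_path <- dirname(metric_path)  # scenario level

metric_path
#> [1] "scenario_1/metric_1"
scenario_path
#> [1] "scenario_1"
```

For files without scenarios the path has one level less, so the second `dirname()` call does not apply.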

## 4. Cite package

```{r}
citation('ebvcube')
```

## List of all functions

| Functionality      | Function            | Description                                   |
|:------------------|:------------------|:---------------------------------|
| Basic access       | ebv_datacubepaths   | Get all available data cubes in the netCDF    |
|                    | ebv_properties      | Get all the metadata of the netCDF            |
|                    | ebv_download        | Download EBV netCDFs from the EBV Portal      |
| Data access        | ebv_read            | Read the data                                 |
|                    | ebv_read_bb         | Read a spatial subset given by a bounding box |
|                    | ebv_read_shp        | Read a spatial subset given by a Shapefile    |
|                    | ebv_analyse         | Get basic measurements of the data            |
|                    | ebv_resample        | Resample the pixel size and alignment         |
|                    | ebv_write           | Write manipulated data back to disc           |
| Data visualization | ebv_map             | Plot a map of the specified data slice        |
|                    | ebv_trend           | Plot the temporal trend                       |
| Data creation      | ebv_create          | Create a new EBV netCDF                       |
|                    | ebv_create_taxonomy | Create a new EBV netCDF with taxonomy info    |
|                    | ebv_metadata        | Create the EBV metadata text file (JSON)      |
|                    | ebv_add_data        | Add data to the new netCDF                    |
|                    | ebv_attribute       | Change an attribute value                     |

Owner

  • Name: EBVcube
  • Login: EBVcube
  • Kind: organization

Citation (CITATION.cff)

# --------------------------------------------
# CITATION file created with {cffr} R package
# See also: https://docs.ropensci.org/cffr/
# --------------------------------------------
 
cff-version: 1.2.0
message: 'To cite package "ebvcube" in publications use:'
type: software
license: GPL-3.0-or-later
title: 'ebvcube: Working with netCDF for Essential Biodiversity Variables'
version: 0.5.2
identifiers:
- type: doi
  value: 10.32614/CRAN.package.ebvcube
abstract: 'The concept of Essential Biodiversity Variables (EBV, <https://geobon.org/ebvs/what-are-ebvs/>)
  comes with a data structure based on the Network Common Data Form (netCDF). The
  ''ebvcube'' ''R'' package provides functionality to easily create, access and visualise
  this data. The EBV netCDFs can be downloaded from the EBV Data Portal: Christian
  Langer/ iDiv (2020) <https://portal.geobon.org/>.'
authors:
- family-names: Oceguera Conchas
  given-names: Emmanuel
  email: e.oceguera@idiv.de
  orcid: https://orcid.org/0009-0008-0107-9298
- family-names: Quoss
  given-names: Luise
  orcid: https://orcid.org/0000-0002-9910-1252
- family-names: Fernandez
  given-names: Nestor
  email: nestor.fernandez@idiv.de
  orcid: https://orcid.org/0000-0002-9645-8571
- family-names: Langer
  given-names: Christian
  email: christian.langer@idiv.de
  orcid: https://orcid.org/0000-0003-1446-3527
- family-names: Valdez
  given-names: Jose
  email: jose.valdez@idiv.de
  orcid: https://orcid.org/0000-0003-2690-9952
- family-names: Pereira
  given-names: Henrique Miguel
  email: henrique.pereira@idiv.de
  orcid: https://orcid.org/0000-0003-1043-1675
preferred-citation:
  type: manual
  title: 'ebvcube: Working with netCDF for Essential Biodiversity Variables'
  authors:
  - family-names: Quoss
    given-names: Luise
    orcid: https://orcid.org/0000-0002-9910-1252
  - family-names: Fernandez
    given-names: Nestor
    email: nestor.fernandez@idiv.de
    orcid: https://orcid.org/0000-0002-9645-8571
  - family-names: Langer
    given-names: Christian
    email: christian.langer@idiv.de
    orcid: https://orcid.org/0000-0003-1446-3527
  - family-names: Valdez
    given-names: Jose
    email: jose.valdez@idiv.de
    orcid: https://orcid.org/0000-0003-2690-9952
  - family-names: Pereira
    given-names: Henrique Miguel
    email: henrique.pereira@idiv.de
    orcid: https://orcid.org/0000-0003-1043-1675
  year: '2024'
  notes: R package version 0.5.2
  institution:
    name: German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig
    address: Germany
  url: https://github.com/EBVcube/ebvcube
repository: https://CRAN.R-project.org/package=ebvcube
repository-code: https://github.com/EBVcube/ebvcube
url: https://github.com/EBVCube/ebvcube
date-released: '2025-06-30'
contact:
- family-names: Oceguera Conchas
  given-names: Emmanuel
  email: e.oceguera@idiv.de
  orcid: https://orcid.org/0009-0008-0107-9298
keywords:
- biodiversity-standards
- essential-biodiversity-variables
- netcdf4
- r-package

GitHub Events

Total
  • Create event: 4
  • Issues event: 2
  • Release event: 4
  • Watch event: 4
  • Issue comment event: 1
  • Push event: 39
  • Pull request event: 10
Last Year
  • Create event: 4
  • Issues event: 2
  • Release event: 4
  • Watch event: 4
  • Issue comment event: 1
  • Push event: 39
  • Pull request event: 10

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 1
  • Total pull requests: 12
  • Average time to close issues: 2 days
  • Average time to close pull requests: about 10 hours
  • Total issue authors: 1
  • Total pull request authors: 3
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 10
  • Bot issues: 0
  • Bot pull requests: 2
Past Year
  • Issues: 1
  • Pull requests: 12
  • Average time to close issues: 2 days
  • Average time to close pull requests: about 10 hours
  • Issue authors: 1
  • Pull request authors: 3
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 10
  • Bot issues: 0
  • Bot pull requests: 2
Top Authors
Issue Authors
  • wlangera (1)
Pull Request Authors
  • LuiseQuoss (10)
  • dependabot[bot] (2)
  • E-O-Conchas (1)
Top Labels
Issue Labels
Pull Request Labels
dependencies (2) github_actions (2)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 354 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 14
  • Total maintainers: 1
cran.r-project.org: ebvcube

Working with netCDF for Essential Biodiversity Variables

  • Versions: 14
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 354 Last month
Rankings
Dependent packages count: 29.8%
Dependent repos count: 35.5%
Average: 51.4%
Downloads: 89.0%
Maintainers (1)
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 4.2.0 depends
  • DelayedArray * imports
  • HDF5Array * imports
  • checkmate * imports
  • curl * imports
  • ggplot2 * imports
  • jsonlite * imports
  • memuse * imports
  • methods * imports
  • ncdf4 * imports
  • ncmeta * imports
  • reshape2 * imports
  • rhdf5 * imports
  • stringr * imports
  • terra * imports
  • tidyterra * imports
  • withr * imports
  • knitr * suggests
  • rmarkdown * suggests
  • testthat >= 3.0.0 suggests
.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite