arcpbf
Rust crate and R package for processing Esri Protocol Buffers
Keywords
agol
arcgis
r-spatial
Repository
Basic Info
- Host: GitHub
- Owner: R-ArcGIS
- License: apache-2.0
- Language: Rust
- Default Branch: main
- Homepage: https://r.esri.com/arcpbf/
- Size: 65 MB
Statistics
- Stars: 10
- Watchers: 2
- Forks: 0
- Open Issues: 1
- Releases: 0
Topics
agol
arcgis
r-spatial
Created over 2 years ago
· Last pushed 7 months ago
Metadata Files
Readme
Changelog
License
README.Rmd
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
options(max.print = 100)
```
# arcpbf
`{arcpbf}` is an R package that processes [Esri FeatureCollection Protocol Buffers](https://github.com/Esri/arcgis-pbf/tree/main/proto/FeatureCollection).
It is written in Rust and powered by the [extendr](https://github.com/extendr/extendr) library.
arcpbf has functions for reading protocol buffer (pbf) responses from the ArcGIS
REST API. A pbf response is returned when `f=pbf` is set in a [query](https://developers.arcgis.com/rest/services-reference/enterprise/query-feature-service-layer-.htm).
The package is lightweight (its only imported R dependency is `rlang`) and fast; see the Benchmarking section for a comparison with JSON processing.
Limitation: this package does not currently support Z and M dimensions.
## TL;DR
- `open_pbf()` will read a FeatureCollection `pbf` file into a raw vector
- `read_pbf()` will read a FeatureCollection `pbf` file _and_ process it, with
  post-processing applied by default
- `resp_body_pbf()` and `resps_data_pbf()` process `httr2_response` objects
  with FeatureCollection pbf bodies
- `process_pbf()` will process a raw vector or a list of raw vectors
- `post_process_pbf()` will apply post-processing steps to the results of
  `process_pbf()`
- set `use_sf = TRUE` to return an `sf` object if possible. Applied by
  default in `read_pbf()`, `resp_body_pbf()` and `resps_data_pbf()`.
> ***Developer Note***: Rust must be installed to compile the package. Run the one-line
> installation instructions at https://rustup.rs/. To verify that your Rust installation
> is compatible, run `rextendr::rust_sitrep()`. That's it.
### PBF support
Note that _only_ the FeatureCollection pbf specification is supported by arcpbf.
If you want to process OSM pbf files use [`osmextract::oe_read()`](https://docs.ropensci.org/osmextract/reference/oe_read.html).
Or, if you want to create and read arbitrary protocol buffers directly in R,
use [`RProtoBuf`](https://cran.r-project.org/web/packages/RProtoBuf).
## Basic usage
In most cases, we will be processing a protocol buffer directly from the response to an HTTP
request created with [`{httr2}`](https://httr2.r-lib.org/).
```{r}
library(arcpbf)
# specify the URL to send our request to
url <- "https://services.arcgis.com/P3ePLMYs2RVChkJx/arcgis/rest/services/ACS_Population_by_Race_and_Hispanic_Origin_Boundaries/FeatureServer/2/query?where=1=1&outFields=objectid&resultRecordCount=10&f=pbf&token="
req <- httr2::request(url)
resp <- httr2::req_perform(req)
resp
```
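The query string in the URL above can also be assembled from named parameters, which makes the role of `f=pbf` easier to see. The following is a minimal sketch using `{httr2}` only; the names `layer_url` and `req_built` are ours and are not part of arcpbf.

```r
library(httr2)

# Layer URL taken from the example above, without the "/query" suffix
layer_url <- "https://services.arcgis.com/P3ePLMYs2RVChkJx/arcgis/rest/services/ACS_Population_by_Race_and_Hispanic_Origin_Boundaries/FeatureServer/2"

# Build the same kind of query from named parameters
req_built <- request(layer_url) |>
  req_url_path_append("query") |>
  req_url_query(
    where = "1=1",
    outFields = "objectid",
    resultRecordCount = 10,
    f = "pbf" # ask the service for a protocol buffer body
  )

# Inspect the URL that would be requested
req_built$url
```

A request built this way can be passed to `httr2::req_perform()` in the same way as `req`.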
We can process responses with `resp_body_pbf()`. The arguments `post_process`
and `use_sf` are both `TRUE` by default, so post-processing is applied and an
`sf` object is returned if possible.
```{r}
resp_body_pbf(resp)
```
### Multiple response objects
When running multiple requests in parallel using
`httr2::req_perform_parallel()` the responses are returned as a list of
responses. `resps_data_pbf()` processes the responses in a vectorized
manner.
```{r}
# create a list of requests
reqs <- replicate(5, req, simplify = FALSE)
# perform them in parallel
resps <- httr2::req_perform_parallel(reqs)
# process the responses
resps_data_pbf(resps)
```
### Reading from a file
In some cases you may have a pbf file on disk that you want to process.
Use `read_pbf()` to do so. Again, post-processing steps are applied by default.
```{r}
fp <- system.file("small-points.pbf", package = "arcpbf")
read_pbf(fp)
```
## FeatureCollection Result Types
There are three types of PBF FeatureCollection responses that may be
returned as a result of a [Feature Service Query request](https://developers.arcgis.com/rest/services-reference/enterprise/query-feature-service-layer-.htm).
- **Feature Results**:
  - the default query response type. Contains individual features with their
    attributes and geometries if available.
- **Count Result**:
  - returned when `returnCountOnly=true` in an API request. Returned as a scalar
    integer vector.
- **Object ID Result**:
  - returned when `returnIdsOnly=true`. A `data.frame` containing object IDs
    where the column name is set to the object ID field name of the feature
    service.
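
As a rough illustration of how the three result types are requested, the sketch below builds one request per type against the layer used in "Basic usage". It assumes the service accepts `returnCountOnly` and `returnIdsOnly` together with `f=pbf`; `fetch_pbf()` is a hypothetical convenience wrapper, not an arcpbf function.

```r
library(httr2)

# Query endpoint of the layer used in "Basic usage"
query_url <- "https://services.arcgis.com/P3ePLMYs2RVChkJx/arcgis/rest/services/ACS_Population_by_Race_and_Hispanic_Origin_Boundaries/FeatureServer/2/query"

# Parameters shared by all three requests
base_req <- request(query_url) |>
  req_url_query(where = "1=1", f = "pbf")

# One request per result type
feature_req <- req_url_query(base_req, outFields = "objectid", resultRecordCount = 10)
count_req <- req_url_query(base_req, returnCountOnly = "true")
ids_req <- req_url_query(base_req, returnIdsOnly = "true")

# Hypothetical wrapper: perform a request and parse its pbf body
fetch_pbf <- function(req) {
  arcpbf::resp_body_pbf(req_perform(req))
}
```

Per the list above, `fetch_pbf(count_req)` would be expected to return a scalar integer and `fetch_pbf(ids_req)` a `data.frame` of object IDs.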
### Feature Results
Feature results can either omit geometry entirely, for example in the case of a
Table or when the query parameter `returnGeometry=false`, or include it. When
geometry is omitted entirely, the response is processed as a simple
`data.frame`. However, if the response does contain geometry, the response is a
bit more complex.
Unprocessed feature results with geometries return a named list with 3 elements:
- `attributes`:
  - a `data.frame` of the fields and their values
- `sr`:
  - a named list with elements `wkt`, `wkid`, `latest_wkid`, `vcs_wkid`,
    and `latest_vcs_wkid`. These determine the coordinate reference system of the
    response as well as the vertical coordinate reference system.
- `geometry`:
  - an `sfc` object _**without a computed bounding box or coordinate reference
    system (CRS) set**_.
```{r}
# read an example pbf without post-processing
fc_fp <- system.file("small-points.pbf", package = "arcpbf")
res <- read_pbf(fc_fp, post_process = FALSE)
res
```
When post-processing is applied to a geometry Feature Result, the CRS is set
and the bounding box is computed. This requires the `sf` package to be available.
```{r}
post_process_pbf(res)
```
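To see what post-processing changes, the pieces of the unprocessed result can be inspected individually and compared with the processed object. This is a small sketch that assumes `{sf}` is installed; `use_sf = TRUE` is passed explicitly so that it does not rely on the default of `post_process_pbf()`.

```r
library(arcpbf)

fp <- system.file("small-points.pbf", package = "arcpbf")

# Unprocessed: a named list whose geometry has no CRS or bounding box yet
raw_res <- read_pbf(fp, post_process = FALSE)
names(raw_res)
str(raw_res$sr)

# Post-processed: the CRS and bounding box can now be queried with sf
processed <- post_process_pbf(raw_res, use_sf = TRUE)
sf::st_crs(processed)
sf::st_bbox(processed)
```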
## Lower level functions
The function `open_pbf()` will read a pbf file into a raw vector which can be
passed to `process_pbf()`. In general you will not need this function, but it
is handy for the sake of example.
```{r}
pbf_raw <- open_pbf(fc_fp)
head(pbf_raw, 20)
```
This raw vector can be turned into an R object using `process_pbf()`. The output
_will not_ be post-processed.
```{r}
res <- process_pbf(pbf_raw)
res
```
Post-processing can be applied to the result of `process_pbf()` using
`post_process_pbf()`.
```{r}
post_process_pbf(res)
```
`post_process_pbf()` can also be applied to a list of processed pbf responses.
```{r}
multi_res <- list(res, res, res)
post_process_pbf(multi_res)
```
## Benchmarking
Below is a benchmark that compares processing pbfs to the current approach of processing
raw JSON in arcgislayers and arcgisutils. It recreates the example from the README of arcgislayers.
```{r}
# JSON processing
jsn <- function() {
  json_reqs <- c(
    "https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Counties_Generalized_Boundaries/FeatureServer/0/query?outFields=%2A&where=1%3D1&outSR=%7B%22wkid%22%3A4326%7D&returnGeometry=TRUE&token=&f=json&resultOffset=0",
    "https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Counties_Generalized_Boundaries/FeatureServer/0/query?outFields=%2A&where=1%3D1&outSR=%7B%22wkid%22%3A4326%7D&returnGeometry=TRUE&token=&f=json&resultOffset=2001"
  )
  reqs <- lapply(json_reqs, httr2::request)
  resps <- httr2::req_perform_parallel(reqs) |>
    lapply(function(x) arcgisutils::parse_esri_json(httr2::resp_body_string(x)))
  do.call(rbind.data.frame, resps) |>
    sf::st_as_sf()
}
# protobuf processing
pbf <- function() {
  pbf_reqs <- c(
    "https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Counties_Generalized_Boundaries/FeatureServer/0/query?outFields=%2A&where=1%3D1&outSR=%7B%22wkid%22%3A4326%7D&returnGeometry=TRUE&token=&f=pbf&resultOffset=0",
    "https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Counties_Generalized_Boundaries/FeatureServer/0/query?outFields=%2A&where=1%3D1&outSR=%7B%22wkid%22%3A4326%7D&returnGeometry=TRUE&token=&f=pbf&resultOffset=2001"
  )
  reqs <- lapply(pbf_reqs, httr2::request)
  httr2::req_perform_parallel(reqs) |>
    resps_data_pbf()
}
bench::mark(
  jsn(),
  pbf(),
  check = FALSE,
  relative = TRUE,
  iterations = 5
)
```
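The two URLs in each function above differ only in `resultOffset`. A minimal sketch of generating such paged requests programmatically follows. `page_requests()` is a hypothetical helper, and `n_features = 5000` and `page_size = 2000` are illustrative values rather than properties of the service.

```r
library(httr2)

# Hypothetical helper: one request per page of `page_size` records
page_requests <- function(query_url, n_features, page_size, f = "pbf") {
  # Zero-based offsets: 0, page_size, 2 * page_size, ...
  offsets <- seq(0, n_features - 1, by = page_size)
  lapply(offsets, function(offset) {
    request(query_url) |>
      req_url_query(
        where = "1=1",
        outFields = "*",
        f = f,
        # format() avoids scientific notation for large offsets
        resultOffset = format(offset, scientific = FALSE),
        resultRecordCount = format(page_size, scientific = FALSE)
      )
  })
}

query_url <- "https://services.arcgis.com/P3ePLMYs2RVChkJx/ArcGIS/rest/services/USA_Counties_Generalized_Boundaries/FeatureServer/0/query"

# Illustrative numbers only
reqs <- page_requests(query_url, n_features = 5000, page_size = 2000)
length(reqs)
```

The resulting list can be passed to `httr2::req_perform_parallel()` and then to `resps_data_pbf()`, as in `pbf()` above.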
## Internals
Internally, there is a Rust crate, [`esripbf`](./src/rust/esripbf), which is
a library built with [`prost`](https://github.com/tokio-rs/prost) to handle the [FeatureCollection Protocol Buffer Specification](https://github.com/Esri/arcgis-pbf/tree/main/proto/FeatureCollection).
## Future Notes
As an alternative to the current approach, it may make sense to write to a geoarrow array and convert to `sfc`
using `{wk}`. These are just thoughts.
Owner
- Name: R-ArcGIS
- Login: R-ArcGIS
- Kind: organization
- Website: https://r-arcgis.github.io
- Repositories: 4
- Profile: https://github.com/R-ArcGIS
GitHub Events
Total
- Issues event: 8
- Watch event: 2
- Issue comment event: 12
- Push event: 10
- Pull request event: 6
- Create event: 2
Last Year
- Issues event: 8
- Watch event: 2
- Issue comment event: 12
- Push event: 10
- Pull request event: 6
- Create event: 2
Committers
Last synced: about 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Josiah Parry | j****y@g****m | 55 |
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 9
- Total pull requests: 5
- Average time to close issues: about 1 month
- Average time to close pull requests: 43 minutes
- Total issue authors: 5
- Total pull request authors: 1
- Average comments per issue: 6.33
- Average comments per pull request: 0.0
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 9
- Pull requests: 5
- Average time to close issues: about 1 month
- Average time to close pull requests: 43 minutes
- Issue authors: 5
- Pull request authors: 1
- Average comments per issue: 6.33
- Average comments per pull request: 0.0
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- elipousson (3)
- ryanzomorrodi (2)
- JosiahParry (2)
- JWilliamsonArch (1)
- muschellij2 (1)
Pull Request Authors
- JosiahParry (10)
Top Labels
Issue Labels
bug (2)
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - cran: 672 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 7
- Total maintainers: 1
cran.r-project.org: arcpbf
Process ArcGIS Protocol Buffer FeatureCollections
- Homepage: https://r.esri.com/arcpbf/
- Documentation: http://cran.r-project.org/web/packages/arcpbf/arcpbf.pdf
- License: Apache License (≥ 2)
- Latest release: 0.1.7 (published 11 months ago)
Rankings
Dependent packages count: 28.9%
Dependent repos count: 36.9%
Average: 50.8%
Downloads: 86.6%
Maintainers (1)
Last synced: 6 months ago
Dependencies
.github/workflows/R-CMD-check.yaml
actions
- actions/checkout v3 composite
- r-lib/actions/check-r-package v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pkgdown.yaml
actions
- JamesIves/github-pages-deploy-action v4.4.1 composite
- actions/checkout v3 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
src/rust/Cargo.lock
cargo
- aho-corasick 1.1.2
- anyhow 1.0.75
- approx 0.5.1
- autocfg 1.1.0
- bitflags 1.3.2
- bitflags 2.4.1
- bytes 1.5.0
- cfg-if 1.0.0
- either 1.9.0
- equivalent 1.0.1
- errno 0.3.5
- extendr-api 0.6.0
- extendr-macros 0.6.0
- fastrand 2.0.1
- fixedbitset 0.4.2
- geo-types 0.7.11
- hashbrown 0.14.2
- heck 0.4.1
- home 0.5.5
- indexmap 2.1.0
- itertools 0.11.0
- libR-sys 0.6.0
- libc 0.2.149
- libm 0.2.8
- linux-raw-sys 0.4.10
- log 0.4.20
- memchr 2.6.4
- multimap 0.8.3
- num-traits 0.2.17
- once_cell 1.18.0
- paste 1.0.14
- petgraph 0.6.4
- prettyplease 0.2.15
- proc-macro2 1.0.69
- prost 0.12.1
- prost-build 0.12.1
- prost-derive 0.12.1
- prost-types 0.12.1
- quote 1.0.33
- redox_syscall 0.4.1
- regex 1.10.2
- regex-automata 0.4.3
- regex-syntax 0.8.2
- rustix 0.38.21
- serde 1.0.190
- serde_derive 1.0.190
- syn 2.0.38
- tempfile 3.8.1
- unicode-ident 1.0.12
- which 4.4.2
- windows-sys 0.48.0
- windows-targets 0.48.5
- windows_aarch64_gnullvm 0.48.5
- windows_aarch64_msvc 0.48.5
- windows_i686_gnu 0.48.5
- windows_i686_msvc 0.48.5
- windows_x86_64_gnu 0.48.5
- windows_x86_64_gnullvm 0.48.5
- windows_x86_64_msvc 0.48.5
src/rust/Cargo.toml
cargo
src/rust/arcpbf/Cargo.toml
cargo
src/rust/esripbf/Cargo.toml
cargo
DESCRIPTION
cran
- rlang * imports
- collapse >= 2.0.0 suggests
- data.table * suggests
- dplyr * suggests
- httr2 * suggests
- sf * suggests
.github/workflows/fedora.yaml
actions
- actions/checkout v3 composite
- dtolnay/rust-toolchain 1.67.0 composite
- r-lib/actions/check-r-package v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r-dependencies v2 composite