Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found codemeta.json file)
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.5%) to scientific vocabulary
Keywords
cpp
r
url
url-parser
urlparser
Last synced: 6 months ago
Repository
Fast and simple url parser for R
Basic Info
- Host: GitHub
- Owner: DyfanJones
- License: other
- Language: C++
- Default Branch: main
- Homepage: https://dyfanjones.r-universe.dev/urlparse
- Size: 836 KB
Statistics
- Stars: 7
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Created about 1 year ago · Last pushed 10 months ago
Metadata Files
Readme
Changelog
License
README.Rmd
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# urlparse
[CRAN](https://CRAN.R-project.org/package=urlparse)
[R-CMD-check](https://github.com/DyfanJones/urlparse/actions/workflows/R-CMD-check.yaml)
[Codecov](https://app.codecov.io/gh/DyfanJones/urlparse)
[r-universe](https://dyfanjones.r-universe.dev/urlparse)
Fast and simple URL parser for R. Initially developed for the `paws.common` package.
```{r}
urlparse::url_parse("https://user:pass@host.com:8000/path?query=1#fragment")
```
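
For readers coming from Python, the same kind of decomposition can be sketched with the standard library's `urllib.parse.urlsplit`. This is only an analogy showing which pieces a URL parser extracts; it is not how `urlparse` is implemented, and the key names in the `components` dictionary are chosen for illustration and are not necessarily the names returned by `url_parse`.

```python
from urllib.parse import urlsplit

# Split the README's example URL into its parts using Python's standard library.
parts = urlsplit("https://user:pass@host.com:8000/path?query=1#fragment")

# Illustrative key names (an assumption, not urlparse's output format).
components = {
    "scheme": parts.scheme,
    "user": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name}: {value}")
```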
## Installation
You can install the development version of urlparse like so:
``` r
remotes::install_github("dyfanjones/urlparse")
```
r-universe installation:
```r
install.packages("urlparse", repos = c("https://dyfanjones.r-universe.dev", "https://cloud.r-project.org"))
```
## Example
A basic example: load the package, then percent-encode a string and decode it again:
```{r example}
library(urlparse)
```
```{r encode}
url_encoder("foo = bar + 5")
url_decoder(url_encoder("foo = bar + 5"))
```
Similar to Python's `urllib.parse.quote`, `urlparse::url_encoder` supports a `safe` parameter: additional ASCII characters that should not be encoded.
```{python python_encode_safe}
from urllib.parse import quote
quote("foo = bar + 5", safe = "+")
```
```{r r_encode_safe}
url_encoder("foo = bar + 5", safe = "+")
```
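
To make the effect of `safe` concrete, here is a small side-by-side sketch on the Python side only, using `urllib.parse.quote` and `unquote`. It shows that marking `+` as safe changes the encoded text but not what it decodes to. It says nothing about `url_encoder`'s internals.

```python
from urllib.parse import quote, unquote

raw = "foo = bar + 5"

encoded_default = quote(raw)         # "+" is percent-encoded
encoded_safe = quote(raw, safe="+")  # "+" is left untouched

print(encoded_default)  # foo%20%3D%20bar%20%2B%205
print(encoded_safe)     # foo%20%3D%20bar%20+%205

# Both forms decode back to the original string.
print(unquote(encoded_default) == raw, unquote(encoded_safe) == raw)
```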
Modify a URL by piping it through the `set_*` functions, or by using the standalone `url_modify` function.
```{r url_modify}
url <- "http://example.com"
# Pipe the URL through the individual setters
set_scheme(url, "https") |>
  set_port(1234L) |>
  set_path("foo/bar") |>
  set_query("baz") |>
  set_fragment("quux")

# Or apply every change in a single call
url_modify(url, scheme = "https", port = 1234, path = "foo/bar", query = "baz", fragment = "quux")
```
Note: `url_modify` is faster than piping the `set_*` functions, because each `set_*` call has to parse the URL again before modifying it.
```{r url_mod_bench}
url <- "http://example.com"
bench::mark(
  piping = {
    set_scheme(url, "https") |>
      set_port(1234L) |>
      set_path("foo/bar") |>
      set_query("baz") |>
      set_fragment("quux")
  },
  single_function = url_modify(url, scheme = "https", port = 1234, path = "foo/bar", query = "baz", fragment = "quux")
)
```
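
The reason for the difference can be illustrated with a rough Python sketch that counts how often the URL string is parsed. Everything here is hypothetical: `modify`, `_parse` and `_netloc_with_port` are illustrative helpers built on `urllib.parse`, not part of `urlparse`, and the sketch only models the "parse, change, rebuild" pattern described above.

```python
from urllib.parse import urlsplit, urlunsplit

parse_calls = {"n": 0}  # counts how often a URL string is parsed


def _parse(url):
    parse_calls["n"] += 1
    return urlsplit(url)


def _netloc_with_port(netloc, port):
    # Keep any userinfo and replace the port. IPv6 literals are not handled.
    userinfo, at, hostport = netloc.rpartition("@")
    host = hostport.split(":", 1)[0]
    return f"{userinfo}{at}{host}:{port}"


def modify(url, **changes):
    """Hypothetical helper: parse once, apply all changes, rebuild once."""
    parts = _parse(url)
    port = changes.pop("port", None)
    if port is not None:
        changes["netloc"] = _netloc_with_port(parts.netloc, port)
    return urlunsplit(parts._replace(**changes))


url = "http://example.com"
steps = [{"scheme": "https"}, {"port": 1234}, {"path": "/foo/bar"},
         {"query": "baz"}, {"fragment": "quux"}]

# One call per field: the intermediate URL is re-parsed at every step.
piped = url
for step in steps:
    piped = modify(piped, **step)
piped_parses = parse_calls["n"]

# One call for all fields: the URL is parsed a single time.
parse_calls["n"] = 0
single = modify(url, scheme="https", port=1234, path="/foo/bar",
                query="baz", fragment="quux")
single_parses = parse_calls["n"]

print(piped == single, piped_parses, single_parses)
```

Both routes produce the same URL, but the step-by-step route parses five times and the single call parses once, which is the overhead the benchmark above measures.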
## Benchmark:
```{r, echo = FALSE}
show_relative <- function(bm) {
summary_cols <- c("min", "median", "itr/sec", "mem_alloc", "gc/sec")
bm[summary_cols] <- lapply(bm[summary_cols], function(x) as.numeric(x / min(x)))
return(bm)
}
```
### Parsing URL:
```{r benchmark}
url <- "https://user:pass@host.com:8000/path?query=1#fragment"
(bm <- bench::mark(
  urlparse = urlparse::url_parse(url),
  httr2 = httr2::url_parse(url),
  curl = curl::curl_parse_url(url),
  urltools = urltools::url_parse(url),
  check = FALSE
))
show_relative(bm)
ggplot2::autoplot(bm)
```
Since `urlparse` v0.1.999 you can use the vectorised URL parser `url_parse_v2`:
```{r benchmark_vectorise}
urls <- c(
"https://www.example.com",
"https://www.google.com/maps/place/Pennsylvania+Station/@40.7519848,-74.0015045,14.7z/data=!4m5!3m4!1s0x89c259ae15b2adcb:0x7955420634fd7eba!8m2!3d40.750568!4d-73.993519",
"https://user_1:password_1@example.org:8080/dir/../api?q=1#frag",
"https://user:password@example.com",
"https://www.example.com:8080/search%3D1%2B3",
"https://www.google.co.jp/search?q=\u30c9\u30a4\u30c4",
"https://www.example.com:8080?var1=foo&var2=ba%20r&var3=baz+larry",
"https://user:password@example.com:8080",
"https://user:password@example.com",
"https://user@example.com:8080",
"https://user@example.com"
)
(bm <- bench::mark(
  urlparse = lapply(urls, urlparse::url_parse),
  urlparse_v2 = urlparse::url_parse_v2(urls),
  httr2 = lapply(urls, httr2::url_parse),
  curl = lapply(urls, curl::curl_parse_url),
  urltools = urltools::url_parse(urls),
  check = FALSE
))
show_relative(bm)
ggplot2::autoplot(bm)
```
Note: `url_parse_v2` returns the parsed URLs as a `data.frame`, one row per URL. This is similar to the behaviour of `urltools` and `adaR`:
```{r url_parse_v2}
urlparse::url_parse_v2(urls)
```
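
The difference between a list of per-URL results and a column-oriented table can be sketched in Python. This is an illustration of the output shape only, under the assumption of one column per URL component; `parse_columns` and its column names are hypothetical and do not reproduce the exact columns of `url_parse_v2`.

```python
from urllib.parse import urlsplit

# A few of the URLs from the benchmark above.
urls = [
    "https://www.example.com",
    "https://user_1:password_1@example.org:8080/dir/../api?q=1#frag",
    "https://user@example.com:8080",
]

COLUMNS = ("scheme", "user", "password", "host", "port", "path", "query", "fragment")


def parse_columns(urls):
    """Hypothetical helper: parse many URLs into a dict of equal-length columns."""
    columns = {name: [] for name in COLUMNS}
    for url in urls:
        p = urlsplit(url)
        row = (p.scheme, p.username, p.password, p.hostname,
               p.port, p.path, p.query, p.fragment)
        for name, value in zip(COLUMNS, row):
            columns[name].append(value)
    return columns


table = parse_columns(urls)
for name in COLUMNS:
    print(name, table[name])
```

Each column has one entry per input URL, so the result maps directly onto a data-frame-like structure.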
### Encoding URL:
Note: `urltools` encodes special characters using lower-case hex digits, e.g. "?" -> "%3f" instead of "%3F".
```{r benchmark_encode_small}
string <- "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-._~`!@#$%^&*()=+[{]}\\|;:'\",<>/? "
(bm <- bench::mark(
  urlparse = urlparse::url_encoder(string),
  curl = curl::curl_escape(string),
  urltools = urltools::url_encode(string),
  base = URLencode(string, reserved = TRUE),
  check = FALSE
))
show_relative(bm)
ggplot2::autoplot(bm)
```
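
Because of the hex-case difference noted above, encoders can return strings that differ textually while meaning the same thing (RFC 3986 treats `%3f` and `%3F` as equivalent and recommends upper case). A minimal Python sketch of normalising the case before comparing outputs follows; `upper_hex` is a hypothetical helper, and the lower-case string is written by hand to mimic the style described in the note rather than produced by `urltools`.

```python
import re
from urllib.parse import quote


def upper_hex(encoded):
    """Hypothetical helper: upper-case the hex digits of every percent-escape."""
    return re.sub(r"%[0-9a-fA-F]{2}", lambda m: m.group(0).upper(), encoded)


upper_style = quote("a?b c", safe="")  # Python emits upper-case hex: a%3Fb%20c
lower_style = "a%3fb%20c"              # hand-written lower-case equivalent

print(upper_style == lower_style)             # False: the text differs
print(upper_hex(lower_style) == upper_style)  # True: same after normalising
```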
```{r benchmark_encode_large}
string <- "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-._~`!@#$%^&*()=+[{]}\\|;:'\",<>/? "
url <- paste0(sample(strsplit(string, "")[[1]], 1e4, replace = TRUE), collapse = "")
(bm <- bench::mark(
  urlparse = urlparse::url_encoder(url),
  curl = curl::curl_escape(url),
  urltools = urltools::url_encode(url),
  base = URLencode(url, reserved = TRUE, repeated = TRUE),
  check = FALSE,
  filter_gc = FALSE
))
show_relative(bm)
ggplot2::autoplot(bm)
```
Owner
- Name: Larefly
- Login: DyfanJones
- Kind: user
- Location: United Kingdom
- Repositories: 14
- Profile: https://github.com/DyfanJones
GitHub Events
Total
- Release event: 2
- Watch event: 6
- Push event: 19
- Pull request event: 1
- Create event: 5
Last Year
- Release event: 2
- Watch event: 6
- Push event: 19
- Pull request event: 1
- Create event: 5
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 8 minutes
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 8 minutes
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- DyfanJones (2)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - cran: 176 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 3
- Total maintainers: 1
cran.r-project.org: urlparse
Fast Simple URL Parser
- Homepage: https://github.com/dyfanjones/urlparse
- Documentation: http://cran.r-project.org/web/packages/urlparse/urlparse.pdf
- License: MIT + file LICENSE
- Latest release: 0.2.1 (published 10 months ago)
Rankings
Dependent packages count: 27.4%
Forks count: 29.0%
Dependent repos count: 33.8%
Stargazers count: 36.9%
Average: 42.8%
Downloads: 87.0%
Maintainers (1)
Last synced: 6 months ago
Dependencies
.github/workflows/R-CMD-check.yaml
actions
- actions/checkout v4 composite
- r-lib/actions/check-r-package v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pr-commands.yaml
actions
- actions/checkout v4 composite
- r-lib/actions/pr-fetch v2 composite
- r-lib/actions/pr-push v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml
actions
- actions/checkout v4 composite
- actions/upload-artifact v4 composite
- codecov/codecov-action v4 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION
cran
- Rcpp * imports
- testthat >= 3.0.0 suggests
.github/workflows/rhub.yaml
actions
- r-hub/actions/checkout v1 composite
- r-hub/actions/platform-info v1 composite
- r-hub/actions/run-check v1 composite
- r-hub/actions/setup v1 composite
- r-hub/actions/setup-deps v1 composite
- r-hub/actions/setup-r v1 composite