bigrquery

An interface to Google's BigQuery from R.

https://github.com/r-dbi/bigrquery

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    2 of 45 committers (4.4%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.6%) to scientific vocabulary

Keywords

bigquery database r

Keywords from Contributors

summary-tables rtf latex easy-to-use tidyverse docx rmarkdown pandoc olap embedded-database
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 522
  • Watchers: 42
  • Forks: 185
  • Open Issues: 25
  • Releases: 18
Topics
bigquery database r
Created almost 13 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing License Code of conduct Codeowners Support

README.Rmd

---
output: github_document
---



```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

if (bigrquery:::has_internal_auth()) {
  bigrquery:::bq_auth_internal()
} else {
  knitr::opts_chunk$set(eval = FALSE)
}
```
# bigrquery


[![CRAN Status](https://www.r-pkg.org/badges/version/bigrquery)](https://cran.r-project.org/package=bigrquery)
[![R-CMD-check](https://github.com/r-dbi/bigrquery/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/r-dbi/bigrquery/actions/workflows/R-CMD-check.yaml)

[![Codecov test coverage](https://codecov.io/gh/r-dbi/bigrquery/graph/badge.svg)](https://app.codecov.io/gh/r-dbi/bigrquery)


The bigrquery package makes it easy to work with data stored in 
[Google BigQuery](https://cloud.google.com/bigquery/docs) by allowing you to query BigQuery tables and retrieve metadata about your projects, datasets, tables, and jobs. The bigrquery package provides three levels of abstraction on top of BigQuery:

* The low-level API provides thin wrappers over the underlying REST API. All 
  the low-level functions start with `bq_`, and mostly have the form 
  `bq_noun_verb()`. This level of abstraction is most appropriate if you're
  familiar with the REST API and you want to do something not supported in the 
  higher-level APIs.
  
* The [DBI interface](https://r-dbi.org) wraps the low-level API and
  makes working with BigQuery like working with any other database system.
  This is the most convenient layer if you want to execute SQL queries in 
  BigQuery or upload smaller amounts (i.e. <100 MB) of data.

* The [dplyr interface](https://dbplyr.tidyverse.org/) lets you treat BigQuery 
  tables as if they are in-memory data frames. This is the most convenient 
  layer if you don't want to write SQL, but instead want dbplyr to write it 
  for you.

## Installation

The current bigrquery release can be installed from CRAN: 

```{r eval = FALSE}
install.packages("bigrquery")
```

The development version can be installed from GitHub:

```{r eval = FALSE}
#install.packages("pak")
pak::pak("r-dbi/bigrquery")
```

## Usage

### Low-level API

```{r}
library(bigrquery)
billing <- bq_test_project() # replace this with your project ID 
sql <- "SELECT year, month, day, weight_pounds FROM `publicdata.samples.natality`"

tb <- bq_project_query(billing, sql)
bq_table_download(tb, n_max = 10)
```
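
As a further sketch of the `bq_noun_verb()` naming scheme, the calls below build references to a dataset and a table and then ask BigQuery for metadata about them. The constructors `bq_dataset()` and `bq_table()` only create local references; the remaining calls contact the API, so they need authentication. Output is omitted here because it depends on the current state of the public dataset.

```{r eval = FALSE}
library(bigrquery)

# Local references only; no API request is made by these two calls
ds <- bq_dataset("publicdata", "samples")
tb <- bq_table("publicdata", "samples", "natality")

# noun = dataset
bq_dataset_exists(ds)  # does the dataset exist?
bq_dataset_tables(ds)  # list the tables it contains

# noun = table
bq_table_exists(tb)    # does the table exist?
bq_table_fields(tb)    # column names and types
bq_table_nrow(tb)      # number of rows
```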

### DBI

```{r, warning = FALSE}
library(DBI)

con <- dbConnect(
  bigrquery::bigquery(),
  project = "publicdata",
  dataset = "samples",
  billing = billing
)
con 

dbListTables(con)

dbGetQuery(con, sql, n = 10)
```
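
The DBI layer can also upload small data frames. The sketch below assumes you have a project and dataset that you are allowed to write to; `"my-project-id"` and `"my_dataset"` are placeholders, not real resources, and `mtcars` is used only because it ships with R.

```{r eval = FALSE}
library(DBI)

# Placeholder project and dataset: replace with ones you can write to
con_own <- dbConnect(
  bigrquery::bigquery(),
  project = "my-project-id",
  dataset = "my_dataset",
  billing = "my-project-id"
)

# Upload a small built-in data frame as a new table
dbWriteTable(con_own, "mtcars", mtcars)

# Check that the table exists, then query it
dbExistsTable(con_own, "mtcars")
dbGetQuery(con_own, "SELECT mpg, cyl FROM mtcars LIMIT 5")

# Remove the table again so the example leaves nothing behind
dbRemoveTable(con_own, "mtcars")
```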

### dplyr

```{r, message = FALSE}
library(dplyr)

natality <- tbl(con, "natality")

natality %>%
  select(year, month, day, weight_pounds) %>% 
  head(10) %>%
  collect()
```
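
To see the SQL that dbplyr writes for you, you can call `show_query()` on a lazy table before collecting it. The following is a minimal sketch that assumes `con` is the connection created in the DBI section; the summary itself is only an illustration.

```{r eval = FALSE}
library(dplyr)

natality <- tbl(con, "natality")

# Build a lazy query; nothing is sent to BigQuery for execution yet
mean_weight <- natality %>%
  filter(!is.na(weight_pounds)) %>%
  group_by(month) %>%
  summarise(mean_weight = mean(weight_pounds, na.rm = TRUE)) %>%
  arrange(month)

# Print the generated SQL without running the query
show_query(mean_weight)

# Run the query in BigQuery and download the result as a tibble
collect(mean_weight)
```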

## Important details

### BigQuery account

To use bigrquery, you'll need a BigQuery project. Fortunately, if you just want to play around with the BigQuery API, it's easy to start with Google's free [public data](https://cloud.google.com/bigquery/public-data) and the [BigQuery sandbox](https://cloud.google.com/bigquery/docs/sandbox). This gives you some fun data to play with along with enough free compute (1 TB of queries & 10 GB of storage per month) to learn the ropes. 

To get started, open the [BigQuery console](https://console.cloud.google.com/bigquery/) and create a project. Make a note of the "Project ID": you'll use it as the `billing` project whenever you work with free sample data, and as the `project` when you work with your own data.
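
The difference between `project` and `billing` can be sketched with two DBI connections. The project ID and dataset name below are placeholders for your own values.

```{r eval = FALSE}
library(DBI)

my_project <- "my-project-id"  # placeholder: your own Project ID

# Free sample data: the tables live in "publicdata",
# but queries are billed to your project
con_public <- dbConnect(
  bigrquery::bigquery(),
  project = "publicdata",
  dataset = "samples",
  billing = my_project
)

# Your own data: the same project both holds the tables and pays for queries
con_mine <- dbConnect(
  bigrquery::bigquery(),
  project = my_project,
  dataset = "my_dataset",  # placeholder dataset name
  billing = my_project
)
```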

### Authentication and authorization

When using bigrquery interactively, you'll be prompted to [authorize bigrquery](https://cloud.google.com/bigquery/docs/authorization) in the browser. You'll be asked if you want to cache tokens for reuse in future sessions. For non-interactive usage, prefer a service account token where possible. More places to learn about auth:

  * Help for [`bigrquery::bq_auth()`](https://bigrquery.r-dbi.org/reference/bq_auth.html).
  * [How gargle gets tokens](https://gargle.r-lib.org/articles/how-gargle-gets-tokens.html).
    - bigrquery obtains a token with `gargle::token_fetch()`, which supports
      a variety of token flows. This article provides full details, such as how
      to take advantage of Application Default Credentials or service accounts
      on GCE VMs.
  * [Non-interactive auth](https://gargle.r-lib.org/articles/non-interactive-auth.html). Explains
    how to set up a project when code must run without any user interaction.
  * [How to get your own API credentials](https://gargle.r-lib.org/articles/get-api-credentials.html). Instructions for getting your own OAuth client or service account token.

Note that bigrquery requests permission to modify your data, but it will never do so unless you explicitly request it (e.g. by calling `bq_table_delete()` or `bq_table_upload()`). Our [Privacy policy](https://www.tidyverse.org/google_privacy_policy) provides more info.
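
For non-interactive use, one possible setup is to pass a service account key file to `bq_auth()` through its `path` argument. The helper below is a sketch, not part of bigrquery: the function name `auth_from_env()` and the environment variable `BIGQUERY_SERVICE_ACCOUNT_JSON` are made up for illustration.

```{r eval = FALSE}
# Hypothetical helper: authenticate with a service account key whose
# file path is stored in an environment variable
auth_from_env <- function(var = "BIGQUERY_SERVICE_ACCOUNT_JSON") {
  key_path <- Sys.getenv(var)

  # Fail early with a clear message rather than falling back to a browser prompt
  if (!nzchar(key_path) || !file.exists(key_path)) {
    stop("Set ", var, " to the path of a service account key file.", call. = FALSE)
  }

  bigrquery::bq_auth(path = key_path)
}
```

Calling `auth_from_env()` at the top of a script, before any other bigrquery call, keeps the key path out of the code itself.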

## Useful links

* [SQL reference](https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators)
* [API reference](https://cloud.google.com/bigquery/docs/reference/rest)
* [Query/job console](https://console.cloud.google.com/bigquery/)
* [Billing console](https://console.cloud.google.com/)

## Policies

Please note that the 'bigrquery' project is released with a [Contributor Code of Conduct](https://bigrquery.r-dbi.org/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.

[Privacy policy](https://www.tidyverse.org/google_privacy_policy)

Owner

  • Name: r-dbi
  • Login: r-dbi
  • Kind: organization

R + databases

GitHub Events

Total
  • Create event: 3
  • Release event: 1
  • Issues event: 16
  • Watch event: 12
  • Delete event: 1
  • Issue comment event: 39
  • Push event: 10
  • Pull request review event: 3
  • Pull request review comment event: 3
  • Pull request event: 11
  • Fork event: 6
Last Year
  • Create event: 3
  • Release event: 1
  • Issues event: 16
  • Watch event: 12
  • Delete event: 1
  • Issue comment event: 39
  • Push event: 10
  • Pull request review event: 3
  • Pull request review comment event: 3
  • Pull request event: 11
  • Fork event: 6

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 649
  • Total Committers: 45
  • Avg Commits per committer: 14.422
  • Development Distribution Score (DDS): 0.356
Past Year
  • Commits: 11
  • Committers: 4
  • Avg Commits per committer: 2.75
  • Development Distribution Score (DDS): 0.455
Top Committers
Name Email Commits
hadley h****m@g****m 418
Jenny Bryan j****n@g****m 69
Kirill Müller k****r@i****h 61
Craig Citro c****o@g****m 18
Bulat Yapparov b****v@g****m 9
Brian Weinstein b****2@g****m 7
Alejandro Palacio 6****2 6
Nathan Goulding n****g@p****o 5
Jarod G.R. Meng g****g@g****m 5
Akhmed Umyarov r****d 4
Jeremy Hyrkas h****s@g****m 3
Mara Averick m****k@g****m 3
Maximilian Girlich m****h@m****m 3
Bruno Tremblay b****y@l****m 2
Bruno Tremblay b****o@b****a 2
David Adams d****a@g****m 2
totogo s****y@g****m 2
Michael Chirico m****4@g****m 2
Cass Wilkinson Saldaña c****s@g****m 2
guy.dawson g****n@g****k 1
Valentin Umbach v****h@l****m 1
Adeel Khan a****k@g****u 1
Rasmus Bååth r****h@g****m 1
Nicole Deflaux d****x 1
Mickaël Canouil 8****l 1
Mervin Fansler m****r@g****m 1
Maëlle Salmon m****n@y****e 1
Matthew Pancia m****a 1
Kirill Müller k****r@m****g 1
Ka2wei 5****i 1
and 15 more...

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 135
  • Total pull requests: 112
  • Average time to close issues: over 1 year
  • Average time to close pull requests: 5 months
  • Total issue authors: 77
  • Total pull request authors: 27
  • Average comments per issue: 2.64
  • Average comments per pull request: 1.22
  • Merged pull requests: 79
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 14
  • Pull requests: 16
  • Average time to close issues: about 1 hour
  • Average time to close pull requests: 2 days
  • Issue authors: 9
  • Pull request authors: 5
  • Average comments per issue: 0.29
  • Average comments per pull request: 0.75
  • Merged pull requests: 7
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • hadley (19)
  • abalter (17)
  • jennybc (6)
  • selesnow (4)
  • meztez (4)
  • muschellij2 (4)
  • ablack3 (4)
  • joscani (3)
  • andrewbcooper (2)
  • botan (2)
  • byapparov (2)
  • jkylearmstrong (2)
  • calebwpalmer (2)
  • Kvit (1)
  • 3Mcolab (1)
Pull Request Authors
  • hadley (51)
  • apalacio9502 (12)
  • MichaelChirico (6)
  • meztez (5)
  • jennybc (5)
  • mgirlich (3)
  • byapparov (3)
  • muschellij2 (3)
  • taiyodayo (2)
  • mfansler (2)
  • IoannaNika (2)
  • DavisVaughan (2)
  • atheriel (2)
  • ahmohamed (1)
  • mcanouil (1)
Top Labels
Issue Labels
feature (20) bug (19) reprex (12) DBI :card_file_box: (12) api :spider_web: (9) upkeep (8) dbplyr :wrench: (7) download :arrow_down: (5) documentation (4) auth :key: (1)
Pull Request Labels

Packages

  • Total packages: 3
  • Total downloads:
    • cran 12,471 last-month
  • Total docker downloads: 118,042
  • Total dependent packages: 10
    (may contain duplicates)
  • Total dependent repositories: 35
    (may contain duplicates)
  • Total versions: 44
  • Total maintainers: 1
proxy.golang.org: github.com/r-dbi/bigrquery
  • Versions: 17
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.5%
Average: 5.7%
Dependent repos count: 5.9%
Last synced: 6 months ago
cran.r-project.org: bigrquery

An Interface to Google's 'BigQuery' 'API'

  • Versions: 18
  • Dependent Packages: 10
  • Dependent Repositories: 35
  • Downloads: 12,471 Last month
  • Docker Downloads: 118,042
Rankings
Forks count: 0.3%
Stargazers count: 0.7%
Dependent repos count: 4.4%
Downloads: 4.6%
Dependent packages count: 6.1%
Average: 6.2%
Docker downloads count: 21.0%
Maintainers (1)
Last synced: 6 months ago
conda-forge.org: r-bigrquery
  • Versions: 9
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 34.0%
Average: 42.6%
Dependent packages count: 51.2%
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.3 depends
  • DBI * imports
  • Rcpp * imports
  • assertthat * imports
  • bit64 * imports
  • brio * imports
  • curl * imports
  • gargle >= 1.2.0 imports
  • glue >= 1.3.0 imports
  • httr * imports
  • jsonlite * imports
  • lifecycle * imports
  • methods * imports
  • prettyunits * imports
  • progress * imports
  • rlang >= 0.4.9 imports
  • tibble * imports
  • DBItest * suggests
  • blob * suggests
  • covr * suggests
  • dbplyr >= 2.2.1 suggests
  • dplyr >= 0.7.0 suggests
  • hms * suggests
  • readr * suggests
  • sodium * suggests
  • testthat >= 2.1.0 suggests
  • withr * suggests
  • wk >= 0.3.2 suggests
.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/live-api.yml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/upload-artifact main composite
  • r-lib/actions/setup-pandoc v1 composite
  • r-lib/actions/setup-r v1 composite
.github/workflows/pkgdown.yaml actions
  • JamesIves/github-pages-deploy-action v4.4.1 composite
  • actions/checkout v3 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pr-commands.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/pr-fetch v2 composite
  • r-lib/actions/pr-push v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v3 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite