pointblank

Data quality assessment and metadata reporting for data frames and database tables

https://github.com/rstudio/pointblank

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 18 committers (5.6%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.3%) to scientific vocabulary

Keywords

data-assertions data-checker data-dictionaries data-frames data-inference data-management data-profiler data-quality data-validation data-verification database-tables easy-to-understand reporting-tool schema-validation testing-tools yaml-configuration

Keywords from Contributors

literate-programming pandoc rmarkdown glm lm ropensci summary-statistics unconf unconf17 econometrics
Last synced: 4 months ago

Repository

Data quality assessment and metadata reporting for data frames and database tables

Basic Info
Statistics
  • Stars: 984
  • Watchers: 30
  • Forks: 59
  • Open Issues: 102
  • Releases: 21
Topics
data-assertions data-checker data-dictionaries data-frames data-inference data-management data-profiler data-quality data-validation data-verification database-tables easy-to-understand reporting-tool schema-validation testing-tools yaml-configuration
Created almost 9 years ago · Last pushed 4 months ago
Metadata Files
Readme Changelog Contributing License Code of conduct Citation Codemeta

README.md


[![CRAN status](https://www.r-pkg.org/badges/version/pointblank)](https://CRAN.R-project.org/package=pointblank) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/license/mit/) [![R-CMD-check](https://github.com/rstudio/pointblank/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/rstudio/pointblank/actions/workflows/R-CMD-check.yaml) [![Linting](https://github.com/rstudio/pointblank/actions/workflows/lint.yaml/badge.svg)](https://github.com/rstudio/pointblank/actions/workflows/lint.yaml) [![Codecov test coverage](https://codecov.io/gh/rstudio/pointblank/graph/badge.svg)](https://app.codecov.io/gh/rstudio/pointblank) [![Best Practices](https://bestpractices.coreinfrastructure.org/projects/4310/badge)](https://bestpractices.coreinfrastructure.org/projects/4310) [![The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active) [![Monthly Downloads](https://cranlogs.r-pkg.org/badges/pointblank)](https://CRAN.R-project.org/package=pointblank) [![Total Downloads](https://cranlogs.r-pkg.org/badges/grand-total/pointblank)](https://CRAN.R-project.org/package=pointblank) [![Posit Cloud](https://img.shields.io/badge/Posit%20Cloud-pointblank%20Test%20Drive-blue?style=social&logo=rstudio&logoColor=75AADB)](https://rstudio.cloud/project/3411822) [![Discord](https://img.shields.io/discord/1345877328982446110?color=%237289da&label=Discord)](https://discord.com/invite/YH7CybCNCQ) [![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-v2.1%20adopted-ff69b4.svg)](https://www.contributor-covenant.org/version/2/1/code_of_conduct.html)

With the pointblank package it’s really easy to methodically validate your data whether in the form of data frames or as database tables. On top of the validation toolset, the package gives you the means to provide and keep up-to-date with the information that defines your tables.

For table validation, the agent object works with a large collection of simple (yet powerful!) validation functions. We can enable much more sophisticated validation checks by using custom expressions, segmenting the data, and by selective mutations of the target table. The suite of validation functions ensures that everything just works no matter whether your table is a data frame or a database table.

Sometimes, we want to maintain table information and update it when the table goes through changes. For that, we can use an informant object plus associated functions to help define the metadata entries and present them as a data dictionary. Just as with validation, pointblank offers easy ways to keep the metadata updated so that this important documentation doesn't become stale.


TABLE VALIDATIONS WITH AN AGENT AND DATA QUALITY REPORTING

Data validation can be carried out in the Data Quality Reporting workflow, ultimately resulting in the production of a data quality analysis report. This is most useful in a non-interactive mode where data quality for database tables and on-disk data files must be periodically checked. The pointblank agent is given a collection of validation functions to define validation steps. We can get extracts of data rows that failed validation, set up custom functions that are invoked by exceeding set threshold failure rates, etc. Want to email the report regularly (or, only if certain conditions are met)? Yep, you can do all that.

Here is an example of how to use pointblank to validate a local table with an agent.

``` r
library(pointblank)

# Generate a simple `action_levels` object to
# set the `warn` state if a validation step
# has a single 'fail' test unit
al <- action_levels(warn_at = 1)

# Create a pointblank `agent` object, with the
# tibble as the target table. Use three validation
# functions, then, `interrogate()`. The agent will
# then have some useful intel.
agent <-
  dplyr::tibble(
    a = c(5, 7, 6, 5, NA, 7),
    b = c(6, 1, 0, 6, 0, 7)
  ) %>%
  create_agent(
    label = "A very simple example.",
    actions = al
  ) %>%
  col_vals_between(
    columns = a,
    left = 1, right = 9,
    na_pass = TRUE
  ) %>%
  col_vals_lt(
    columns = c,
    value = 12,
    preconditions = ~ . %>% dplyr::mutate(c = a + b)
  ) %>%
  col_is_numeric(columns = c(a, b)) %>%
  interrogate()
```

The reporting’s pretty sweet. We can get a gt-based report by printing an agent.
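Beyond the printed report, the agent can also be queried programmatically. The following is a minimal sketch, assuming the same two-column tibble and validation steps as above; it uses `all_passed()`, `get_data_extracts()`, and `get_agent_x_list()` to pull out the overall result, the failing rows for one step, and the per-step failure counts.

```r
library(pointblank)

# Rebuild the agent from the example above so this block is self-contained
agent <-
  dplyr::tibble(
    a = c(5, 7, 6, 5, NA, 7),
    b = c(6, 1, 0, 6, 0, 7)
  ) %>%
  create_agent(
    label = "A very simple example.",
    actions = action_levels(warn_at = 1)
  ) %>%
  col_vals_between(columns = a, left = 1, right = 9, na_pass = TRUE) %>%
  col_vals_lt(
    columns = c,
    value = 12,
    preconditions = ~ . %>% dplyr::mutate(c = a + b)
  ) %>%
  col_is_numeric(columns = c(a, b)) %>%
  interrogate()

# TRUE only if every test unit in every validation step passed
all_ok <- all_passed(agent)

# Rows that failed the second step (the `c < 12` check)
failed_rows <- get_data_extracts(agent, i = 2)

# The x-list collects step-level results; `n_failed` holds
# the number of failing test units for each step
x <- get_agent_x_list(agent)
x$n_failed
```

Extracts are only available for row-based validation steps that had failing test units, so it is worth checking `all_passed()` (or the x-list) before asking for them.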

The pointblank package is designed to be both straightforward and powerful. And fast! Local data frames don’t take very long to validate extensively and all validation checks on remote tables are done entirely in-database. So we can add dozens or even hundreds of validation steps without any long waits for reporting.

Should you want to perform validation checks on database or Spark tables, provide a tbl_dbi or tbl_spark object to create_agent(). The pointblank package currently supports PostgreSQL, MySQL, MariaDB, Microsoft SQL Server, Google BigQuery, DuckDB, SQLite, and Spark DataFrames (through the sparklyr package).
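As a rough illustration of the database case, here is a sketch that uses an in-memory SQLite database. The table name `example_tbl` and the choice of validation steps are assumptions made for this example, and it presumes the DBI, RSQLite, and dbplyr packages are installed.

```r
library(pointblank)

# Create a throwaway in-memory SQLite database and write a small table to it
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

DBI::dbWriteTable(
  con,
  "example_tbl",
  data.frame(
    a = c(5, 7, 6, 5, NA, 7),
    b = c(6, 1, 0, 6, 0, 7)
  )
)

# `dplyr::tbl()` gives a lazy `tbl_dbi` reference to the database table
db_tbl <- dplyr::tbl(con, "example_tbl")

# The agent is set up exactly as it would be for a local data frame
agent_db <-
  create_agent(tbl = db_tbl, label = "SQLite example.") %>%
  col_vals_between(columns = a, left = 1, right = 9, na_pass = TRUE) %>%
  col_vals_gte(columns = b, value = 0) %>%
  interrogate()

all_ok_db <- all_passed(agent_db)

DBI::dbDisconnect(con)
```

The validation functions are the same ones used for local tables; only the object handed to `create_agent()` changes.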

Here are some validation reports for the considerably larger intendo::intendo_revenue table.

postgres    mysql    duckdb


VALIDATIONS DIRECTLY ON DATA

The Pipeline Data Validation workflow uses the same collection of validation functions but without need of an agent. This is useful for an ETL process where we want to periodically check data and trigger warnings, raise errors, or write out logs when exceeding specified failure thresholds. It’s a cinch to perform checks on import of the data and at key points during the transformation process, perhaps stopping data flow if things are unacceptable with regard to data quality.

The following example uses the same three validation functions as before but, this time, we use them directly on the data. The validation functions act as a filter, passing data through unless execution is stopped by failing validations beyond the set threshold. In this workflow, by default, an error will occur if there is a single ‘fail’ test unit in any validation step:

``` r
dplyr::tibble(
    a = c(5, 7, 6, 5, NA, 7),
    b = c(6, 1, 0, 6, 0, 7)
  ) %>%
  col_vals_between(
    columns = a,
    left = 1, right = 9,
    na_pass = TRUE
  ) %>%
  col_vals_lt(
    columns = c,
    value = 12,
    preconditions = ~ . %>% dplyr::mutate(c = a + b)
  ) %>%
  col_is_numeric(columns = c(a, b))
```

Error: Exceedance of failed test units where values in `c` should have been < `12`.
The `col_vals_lt()` validation failed beyond the absolute threshold level (1).
* failure level (2) >= failure threshold (1) 

We can downgrade this error to a warning with the warn_on_fail() helper function (assigning it to actions). In this way, the data will always be returned, but warnings will appear.

``` r
# The `warn_on_fail()` function is a nice
# shortcut for `action_levels(warn_at = 1)`;
# it works great in this data checking workflow
# (and the threshold can still be adjusted)
dplyr::tibble(
    a = c(5, 7, 6, 5, NA, 7),
    b = c(6, 1, 0, 6, 0, 7)
  ) %>%
  col_vals_between(
    columns = a,
    left = 1, right = 9,
    na_pass = TRUE,
    actions = warn_on_fail()
  ) %>%
  col_vals_lt(
    columns = c,
    value = 12,
    preconditions = ~ . %>% dplyr::mutate(c = a + b),
    actions = warn_on_fail()
  ) %>%
  col_is_numeric(
    columns = c(a, b),
    actions = warn_on_fail()
  )
```

#> # A tibble: 6 x 2
#>       a     b
#>   <dbl> <dbl>
#> 1     5     6
#> 2     7     1
#> 3     6     0
#> 4     5     6
#> 5    NA     0
#> 6     7     7

Warning message:
Exceedance of failed test units where values in `c` should have been < `12`.
The `col_vals_lt()` validation failed beyond the absolute threshold level (1).
* failure level (2) >= failure threshold (1) 

Should you need more fine-grained thresholds and resultant actions, the action_levels() function can be used to specify multiple failure thresholds and side effects for each failure state. However, with warn_on_fail() and stop_on_fail() (applied by default, with stop_at = 1), you should have good enough options for this validation workflow.
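To make the multi-threshold case concrete, here is a small sketch. The threshold values (warn at 10% failing test units, stop at 50%) are arbitrary choices for illustration, and the warning is captured with base R's `withCallingHandlers()` only so that the example can record that it happened.

```r
library(pointblank)

# Fractional thresholds: warn when at least 10% of test units fail,
# stop (raise an error) when at least 50% fail
al <- action_levels(warn_at = 0.1, stop_at = 0.5)

tbl <-
  dplyr::tibble(
    a = c(5, 7, 6, 5, NA, 7),
    b = c(6, 1, 0, 6, 0, 7)
  )

# Two of six test units fail the `c < 12` check (about 33%),
# which is above the warn threshold but below the stop threshold
warned <- FALSE

result <-
  withCallingHandlers(
    tbl %>%
      col_vals_lt(
        columns = c,
        value = 12,
        preconditions = ~ . %>% dplyr::mutate(c = a + b),
        actions = al
      ),
    warning = function(w) {
      warned <<- TRUE
      invokeRestart("muffleWarning")
    }
  )
```

Because the stop threshold is not reached, the input table is passed through and can continue down the pipeline.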


TABLE INFORMATION

Table information can be synthesized in an information management workflow, giving us a snapshot of a data table we care to collect information on. The pointblank informant is fed a series of info_*() functions to define bits of information about a table. This info text can pertain to individual columns, the table as a whole, and whatever additional information makes sense for your organization. We can even glean little snippets of information (like column stats or sample values) from the target table with info_snippet() and the snip_*() functions and mix them into the data dictionary wherever they're needed.

Here is an example of how to use pointblank to incorporate pieces of info text into an informant object.

``` r
# Create a pointblank `informant` object, with the
# tibble as the target table. Use a few information
# functions and end with `incorporate()`. The informant
# will then show you information about the tibble.
informant <-
  dplyr::tibble(
    a = c(5, 7, 6, 5, NA, 7),
    b = c(6, 1, 0, 6, 0, 7)
  ) %>%
  create_informant(
    label = "A very simple example.",
    tbl_name = "example_tbl"
  ) %>%
  info_tabular(
    description = "This two-column table is nothing all that interesting, but, it's fine for examples on GitHub README pages. Column names are `a` and `b`. ((Cool stuff))"
  ) %>%
  info_columns(
    columns = a,
    info = "This column has an `NA` value. [[Watch out!]]<>"
  ) %>%
  info_columns(
    columns = a,
    info = "Mean value is `{a_mean}`."
  ) %>%
  info_columns(
    columns = b,
    info = "Like column `a`. The lowest value is `{b_lowest}`."
  ) %>%
  info_columns(
    columns = b,
    info = "The highest value is `{b_highest}`."
  ) %>%
  info_snippet(
    snippet_name = "a_mean",
    fn = ~ . %>% .$a %>% mean(na.rm = TRUE) %>% round(2)
  ) %>%
  info_snippet(snippet_name = "b_lowest", fn = snip_lowest("b")) %>%
  info_snippet(snippet_name = "b_highest", fn = snip_highest("b")) %>%
  info_section(
    section_name = "further information",
    `examples and documentation` = "Examples for how to use the `info_*()` functions (and many more) are available at the pointblank site."
  ) %>%
  incorporate()
```

By printing the informant we get the table information report.

Here is a link to a hosted information report for the intendo::intendo_revenue table:

Information Report for intendo::intendo_revenue


TABLE SCANS

We can use the scan_data() function to generate a comprehensive summary of a tabular dataset. This allows us to quickly understand what's in the dataset and it helps us determine if there are any peculiarities within the data. Scanning the dplyr::storms dataset with scan_data(tbl = dplyr::storms) gives us an interactive HTML report. Here are a few such reports, published on RPubs:

Table Scan of dplyr::storms

Table Scan of pointblank::game_revenue
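A table scan can be sketched along these lines. To keep it quick, this example scans a tiny tibble rather than dplyr::storms and limits the report to the overview and variables sections; the output file name is an arbitrary choice, and some sections may need suggested packages such as ggplot2 and ggforce to be installed.

```r
library(pointblank)

small_tbl <-
  dplyr::tibble(
    a = c(5, 7, 6, 5, NA, 7),
    b = c(6, 1, 0, 6, 0, 7)
  )

# "O" = overview and "V" = variables; by default all six
# sections ("OVICMS") are included in the report
scan <- scan_data(tbl = small_tbl, sections = "OV")

# Printing `scan` shows the interactive report; `export_report()`
# writes it to a standalone HTML file instead
export_report(scan, filename = "tbl_scan.html", path = tempdir())
```

For larger tables the full set of sections can take noticeably longer, so trimming `sections` is a reasonable way to get a first look.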

Database tables can be used with scan_data() as well. Here are two examples using (1) the full_region table of the Rfam database (hosted publicly at mysql-rfam-public.ebi.ac.uk) and (2) the assembly table of the Ensembl database (hosted publicly at ensembldb.ensembl.org).

Rfam: full_region

Ensembl: assembly


OVERVIEW OF PACKAGE FUNCTIONS

There are many functions available in pointblank for understanding data quality and creating data documentation. Here is an overview of all of them, grouped by family. For much more information on these, visit the documentation website or take a Test Drive in the Posit Cloud project.


INSTALLATION

Want to try this out? The pointblank package is available on CRAN:

``` r
install.packages("pointblank")
```

You can also install the development version of pointblank from GitHub:

``` r
# install.packages("pak")
pak::pak("rstudio/pointblank")
```

Getting in Touch

If you encounter a bug, have usage questions, or want to share ideas to make this package better, feel free to file an issue.

Wanna talk about data validation in a more relaxed setting? Join our Discord server! This is a great option for asking about the development of pointblank, pitching ideas that may become features, and just sharing your ideas!

Discord Server

Pointblank for Python

There's also a version of pointblank for Python, a project that got off the ground in late 2024 and is gaining traction in the Python community. You can find it at https://github.com/posit-dev/pointblank.


Code of Conduct

Please note that the pointblank project is released with a contributor code of conduct. By participating in this project you agree to abide by its terms.

📄 License

pointblank is licensed under the MIT license. See the LICENSE.md file for more details.

© Posit Software, PBC.

🏛️ Governance

This project is primarily maintained by Rich Iannone. Other authors may occasionally assist with some of these duties.


Owner

  • Name: RStudio
  • Login: rstudio
  • Kind: organization
  • Email: info@rstudio.org
  • Location: Boston, MA

Citation (CITATION.cff)

cff-version: 1.2.0
message: 'If you wish to cite the "pointblank" package use:'
type: software
license: MIT
title: 'pointblank: Data Validation and Organization of Metadata for Local and Remote Tables'
version: 0.12.1
abstract: Validate data in data frames, 'tibble' objects, 'Spark'
    'DataFrames', and database tables. Validation pipelines can be made using
    easily-readable, consecutive validation steps. Upon execution of the
    validation plan, several reporting options are available. User-defined
    thresholds for failure rates allow for the determination of appropriate
    reporting actions. Many other workflows are available including an
    information management workflow, where the aim is to record, collect, and
    generate useful information on data tables.
authors:
- family-names: Iannone
  given-names: Richard
  email: rich@posit.co
  orcid: https://orcid.org/0000-0003-3925-190X
- family-names: Vargas
  given-names: Mauricio
  email: mavargas11@uc.cl
  orcid: https://orcid.org/0000-0003-1017-7574
- family-names: Choe
  given-names: June
  email: jchoe001@gmail.com
  orcid: https://orcid.org/0000-0002-0701-921X
repository: https://doi.org/10.32614/CRAN.package.pointblank
repository-code: https://github.com/rstudio/pointblank
url: https://rstudio.github.io/pointblank/
contact:
- family-names: Iannone
  given-names: Richard
  email: rich@posit.co
  orcid: https://orcid.org/0000-0003-3925-190X

CodeMeta (codemeta.json)

{
  "@context": [
    "https://doi.org/10.5063/schema/codemeta-2.0",
    "http://schema.org"
  ],
  "@type": "SoftwareSourceCode",
  "identifier": "pointblank",
  "description": "Validate data in data frames, 'tibble' objects, and in database\n    tables (e.g., 'PostgreSQL' and 'MySQL'). Validation pipelines can be made\n    using easily-readable, consecutive validation steps. Upon execution of the\n    validation plan, several reporting options are available. User-defined\n    thresholds for failure rates allow for the determination of appropriate\n    reporting actions.",
  "name": "pointblank: Validation of Local and Remote Data Tables",
  "codeRepository": "https://github.com/rstudio/pointblank",
  "issueTracker": "https://github.com/rstudio/pointblank/issues",
  "license": "https://spdx.org/licenses/MIT",
  "version": "0.3.1.9000",
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "name": "R",
    "version": "3.6.3",
    "url": "https://r-project.org"
  },
  "runtimePlatform": "R version 3.6.3 (2020-02-29)",
  "provider": {
    "@id": "https://cran.r-project.org",
    "@type": "Organization",
    "name": "Comprehensive R Archive Network (CRAN)",
    "url": "https://cran.r-project.org"
  },
  "author": [
    {
      "@type": "Person",
      "givenName": "Richard",
      "familyName": "Iannone",
      "email": "rich@posit.co",
      "@id": "https://orcid.org/0000-0003-3925-190X"
    }
  ],
  "contributor": {},
  "copyrightHolder": {},
  "funder": {},
  "maintainer": [
    {
      "@type": "Person",
      "givenName": "Richard",
      "familyName": "Iannone",
      "email": "riannone@me.com",
      "@id": "https://orcid.org/0000-0003-3925-190X"
    }
  ],
  "softwareSuggestions": [
    {
      "@type": "SoftwareApplication",
      "identifier": "covr",
      "name": "covr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=covr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "lubridate",
      "name": "lubridate",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=lubridate"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "RSQLite",
      "name": "RSQLite",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=RSQLite"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "RMariaDB",
      "name": "RMariaDB",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=RMariaDB"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "RPostgres",
      "name": "RPostgres",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=RPostgres"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "readr",
      "name": "readr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=readr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "rmarkdown",
      "name": "rmarkdown",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=rmarkdown"
    }
  ],
  "softwareRequirements": [
    {
      "@type": "SoftwareApplication",
      "identifier": "R",
      "name": "R",
      "version": ">= 3.5.0"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "base64enc",
      "name": "base64enc",
      "version": ">= 0.1-3",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=base64enc"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "blastula",
      "name": "blastula",
      "version": ">= 0.3.1",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=blastula"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "cli",
      "name": "cli",
      "version": ">= 2.0.2",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=cli"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "DBI",
      "name": "DBI",
      "version": ">= 1.1.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=DBI"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "dplyr",
      "name": "dplyr",
      "version": ">= 0.8.5",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=dplyr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "dbplyr",
      "name": "dbplyr",
      "version": ">= 1.4.2",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=dbplyr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "ggforce",
      "name": "ggforce",
      "version": ">= 0.3.1",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=ggforce"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "ggplot2",
      "name": "ggplot2",
      "version": ">= 3.3.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=ggplot2"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "glue",
      "name": "glue",
      "version": ">= 1.3.2",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=glue"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "gt",
      "name": "gt",
      "version": ">= 0.2.0.5",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=gt"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "htmltools",
      "name": "htmltools",
      "version": ">= 0.4.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=htmltools"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "log4r",
      "name": "log4r",
      "version": ">= 0.3.2",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=log4r"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "knitr",
      "name": "knitr",
      "version": ">= 1.28",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=knitr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "magrittr",
      "name": "magrittr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=magrittr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "rlang",
      "name": "rlang",
      "version": ">= 0.4.5",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=rlang"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "testthat",
      "name": "testthat",
      "version": ">= 2.3.2",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=testthat"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "tibble",
      "name": "tibble",
      "version": ">= 3.0.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=tibble"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "tidyselect",
      "name": "tidyselect",
      "version": ">= 1.0.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=tidyselect"
    }
  ],
  "releaseNotes": "https://github.com/rstudio/pointblank/blob/main/NEWS.md",
  "readme": "https://github.com/rstudio/pointblank/blob/main/README.md",
  "fileSize": "6906.977KB",
  "contIntegration": "https://codecov.io/gh/rstudio/pointblank?branch=main",
  "keywords": [
    "r",
    "data-validation",
    "data-quality",
    "failure-thresholds",
    "data-frames",
    "database-tables",
    "easy-to-use"
  ],
  "relatedLink": "https://CRAN.R-project.org/package=pointblank"
}

GitHub Events

Total
  • Create event: 5
  • Release event: 1
  • Issues event: 48
  • Watch event: 107
  • Delete event: 3
  • Issue comment event: 89
  • Push event: 57
  • Pull request review comment event: 21
  • Pull request review event: 51
  • Pull request event: 54
  • Fork event: 3
Last Year
  • Create event: 5
  • Release event: 1
  • Issues event: 48
  • Watch event: 107
  • Delete event: 3
  • Issue comment event: 89
  • Push event: 57
  • Pull request review comment event: 21
  • Pull request review event: 51
  • Pull request event: 54
  • Fork event: 3

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 5,458
  • Total Committers: 18
  • Avg Commits per committer: 303.222
  • Development Distribution Score (DDS): 0.093
Past Year
  • Commits: 200
  • Committers: 7
  • Avg Commits per committer: 28.571
  • Development Distribution Score (DDS): 0.335
Top Committers
Name Email Commits
Richard Iannone r****e@m****m 4,951
June Choe y****e@s****u 389
olivroy o****1@h****m 38
Mauricio Vargas m****s@d****l 24
Lars Dalby l****s@b****k 11
ekothe p****s@g****m 9
Pachá m****1@u****l 8
Hannah Frick h****h@p****o 8
DavZim d****n@h****e 4
Florian Kohrt f****t@a****o 4
Jesse Mostipak j****n@g****m 3
Garrick Aden-Buie g****k@a****m 2
Benjamin b****r@g****m 2
LuisDVA l****d@c****x 1
Daniel Possenriede p****e@g****m 1
Brancen Gregory b****y@g****m 1
Mayeul Kauffmann m****k 1
MikeJohnPage 3****e 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 109
  • Total pull requests: 94
  • Average time to close issues: 4 months
  • Average time to close pull requests: 23 days
  • Total issue authors: 48
  • Total pull request authors: 12
  • Average comments per issue: 1.56
  • Average comments per pull request: 1.27
  • Merged pull requests: 78
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 39
  • Pull requests: 54
  • Average time to close issues: 2 days
  • Average time to close pull requests: 1 day
  • Issue authors: 17
  • Pull request authors: 9
  • Average comments per issue: 1.41
  • Average comments per pull request: 0.93
  • Merged pull requests: 39
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • rich-iannone (20)
  • yjunechoe (13)
  • jl5000 (13)
  • hfrick (6)
  • petrbouchal (4)
  • mayeulk (3)
  • kmasiello (3)
  • JauntyJJS (2)
  • dchiu911 (2)
  • hadley (2)
  • lbm364dl (2)
  • chrisbrownlie (2)
  • maelle (2)
  • joscani (2)
  • fkohrt (2)
Pull Request Authors
  • yjunechoe (78)
  • olivroy (16)
  • rich-iannone (13)
  • pachadotdev (9)
  • fkohrt (2)
  • luisDVA (2)
  • MikeJohnPage (2)
  • dpprdan (2)
  • hfrick (2)
  • maelle (1)
  • brancengregory (1)
  • petrbouchal (1)
Top Labels
Issue Labels
  • Type: ☹︎ Bug (44)
  • Type: ★ Enhancement (44)
  • Effort: [3] High (29)
  • Priority: [3] High (28)
  • Difficulty: [2] Intermediate (22)
  • Difficulty: [3] Advanced (20)
  • Type: ⁇ Question (14)
  • Effort: [2] Medium (13)
  • Priority: [2] Medium (10)
  • Priority: ♨︎ Critical (4)
  • Type: ✎ Docs (2)
  • Help Wanted ㋡ (2)
Pull Request Labels
  • Type: ☹︎ Bug (2)

Packages

  • Total packages: 2
  • Total downloads:
    • cran 6,904 last-month
  • Total docker downloads: 42,767
  • Total dependent packages: 2
    (may contain duplicates)
  • Total dependent repositories: 14
    (may contain duplicates)
  • Total versions: 42
  • Total maintainers: 1
cran.r-project.org: pointblank

Data Validation and Organization of Metadata for Local and Remote Tables

  • Versions: 22
  • Dependent Packages: 2
  • Dependent Repositories: 14
  • Downloads: 6,904 Last month
  • Docker Downloads: 42,767
Rankings
Stargazers count: 0.4%
Docker downloads count: 0.6%
Forks count: 1.5%
Average: 5.2%
Downloads: 7.2%
Dependent repos count: 7.7%
Dependent packages count: 13.7%
Maintainers (1)
Last synced: 4 months ago
proxy.golang.org: github.com/rstudio/pointblank
  • Versions: 20
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.5%
Average: 5.7%
Dependent repos count: 5.9%
Last synced: 4 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.5.0 depends
  • DBI >= 1.1.0 imports
  • base64enc >= 0.1 imports
  • blastula >= 0.3.2 imports
  • cli >= 2.5.0 imports
  • dbplyr >= 2.1.1 imports
  • digest >= 0.6.27 imports
  • dplyr >= 1.0.6 imports
  • fs >= 1.5.0 imports
  • glue >= 1.4.2 imports
  • gt >= 0.6.0 imports
  • htmltools >= 0.5.1.1 imports
  • knitr >= 1.30 imports
  • magrittr * imports
  • rlang >= 0.4.11 imports
  • scales >= 1.1.1 imports
  • testthat >= 2.3.2 imports
  • tibble >= 3.1.2 imports
  • tidyr >= 1.1.3 imports
  • tidyselect >= 1.1.1 imports
  • yaml >= 2.2.1 imports
  • RMySQL * suggests
  • RPostgres * suggests
  • RSQLite * suggests
  • arrow * suggests
  • bigrquery * suggests
  • covr * suggests
  • crayon * suggests
  • data.table * suggests
  • dittodb * suggests
  • duckdb * suggests
  • ggforce * suggests
  • ggplot2 * suggests
  • jsonlite * suggests
  • log4r * suggests
  • lubridate * suggests
  • odbc * suggests
  • readr * suggests
  • rmarkdown * suggests
  • sparklyr * suggests
.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/lint.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pkgdown.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite