safeframe

Tagging, validating, and safeguarding data to help harden data pipelines.

https://github.com/epiverse-trace/safeframe

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (20.8%) to scientific vocabulary

Keywords

r-package

Keywords from Contributors

standards spacy-extension data-profilers hack meshing wavelets interpretability sequences robust spacy
Last synced: 7 months ago

Repository

Tagging, validating, and safeguarding data to help harden data pipelines.

Basic Info
Statistics
  • Stars: 1
  • Watchers: 3
  • Forks: 0
  • Open Issues: 1
  • Releases: 0
Topics
r-package
Created almost 2 years ago · Last pushed 9 months ago
Metadata Files
Readme Changelog License Citation

README.Rmd

---
output: github_document
---







```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# *safeframe*: Generic Data Tagging and Validating



[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/license/mit)
[![R-CMD-check](https://github.com/epiverse-trace/safeframe/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/epiverse-trace/safeframe/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/epiverse-trace/safeframe/branch/main/graph/badge.svg)](https://app.codecov.io/gh/epiverse-trace/safeframe?branch=main)
[![lifecycle-maturing](https://raw.githubusercontent.com/reconverse/reconverse.github.io/master/images/badge-maturing.svg)](https://www.reconverse.org/lifecycle.html#maturing)



**safeframe** provides functions to tag and validate data of any kind. safeframe is an abstraction of [**linelist**](https://github.com/epiverse-trace/linelist), which originally applied these principles to epidemiological linelist data. The original proposal for this package can be found on [the Discussion board](https://github.com/orgs/epiverse-trace/discussions/221).

## Installation

You can install safeframe from CRAN (release) or GitHub (development):

```r
# CRAN
install.packages('safeframe')

# Development
# install.packages("pak")
pak::pak("epiverse-trace/safeframe")
```

## Getting started

```r
library(safeframe)

# Create a safeframe object
x <- make_safeframe(cars, mph = "speed", distance = "dist")

# Validate the tagged data are of a specific type
validate_safeframe(x,
  mph = 'numeric',        # speed should be numeric
  # type() is a helper function of related classes
  distance = type('numeric')    # dist should be numeric, integer
)
```
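To make the idea behind the calls above more concrete, here is a minimal base R sketch of what tagging and type validation mean conceptually. It does not use safeframe and is not how the package is implemented: `tag_columns()` and `check_tagged_types()` are hypothetical helpers written for this illustration, and storing each tag in a column attribute named `"tag"` is an assumption.

```r
# Illustration only: base R, not safeframe's internals.
# tag_columns() and check_tagged_types() are hypothetical helpers.

# Attach a tag name to each selected column as a column-level attribute.
tag_columns <- function(data, ...) {
  tags <- list(...)
  for (tag in names(tags)) {
    column <- tags[[tag]]
    stopifnot(column %in% names(data))
    attr(data[[column]], "tag") <- tag
  }
  data
}

# Check that the column carrying each tag inherits from an allowed class.
check_tagged_types <- function(data, ...) {
  expected <- list(...)
  # Look up the tag (if any) stored on every column
  column_tags <- vapply(
    data,
    function(col) {
      tag <- attr(col, "tag")
      if (is.null(tag)) NA_character_ else tag
    },
    character(1)
  )
  for (tag in names(expected)) {
    column <- names(column_tags)[!is.na(column_tags) & column_tags == tag]
    if (length(column) != 1L) {
      stop("No unique column tagged as '", tag, "'")
    }
    if (!inherits(data[[column]], expected[[tag]])) {
      stop(
        "Column '", column, "' (tag '", tag, "') is not of class: ",
        paste(expected[[tag]], collapse = ", ")
      )
    }
  }
  invisible(data)
}

# Tag two columns of the built-in cars dataset, then validate their types
x <- tag_columns(cars, mph = "speed", distance = "dist")
check_tagged_types(x, mph = "numeric", distance = c("numeric", "integer"))
```

Because the tags travel with the columns, a later step can refer to `mph` without knowing that the underlying column is called `speed`, and a type mismatch stops the pipeline with an error instead of passing silently.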

## Development

### Lifecycle

This package is currently *maturing*, as defined by the [RECON software
lifecycle](https://www.reconverse.org/lifecycle.html):

> Package is functional, documented and tested. Can be used in production with the understanding that the interface may still undergo minor changes. Typically semantic version < 1.0.0.

### Contributions

Contributions are welcome via [pull requests](https://github.com/epiverse-trace/safeframe/pulls). Anything bigger than a typo fix or a small documentation update should be discussed in an issue first. If you want to report a bug or suggest an enhancement, please open an issue. 😊 See also [the general Epiverse-TRACE contribution document](https://github.com/epiverse-trace/.github/blob/main/CONTRIBUTING.md).

#### Common issues

To make it easier for us to evaluate your contribution, please run the following commands before submitting a pull request to ensure your code is consistent with the rest of the package:

```r
styler::style_pkg()
devtools::document()
spelling::update_wordlist(pkg = ".", vignettes = TRUE)
lintr::lint_package()
devtools::test()
devtools::check()
```

This will reduce the time it takes for us to review your contribution. Thank you! 😊

### Related projects

This project is related to other existing projects in R or other languages, but also differs from them in the following aspects:

- [labelled](https://github.com/larmarange/labelled/): A package for tagging data in R, but it is more focused on tagging variables than validating them.
- [linelist](https://github.com/epiverse-trace/linelist): A package for managing and validating linelist data, and the original inspiration for safeframe.
- [struct](https://github.com/cynkra/struct): A package that "provides ways to modify objects more strictly, guaranteeing that we keep the type of the modified element."

### Code of Conduct

Please note that the safeframe project is released with a [Contributor Code of Conduct](https://github.com/epiverse-trace/.github/blob/main/CODE_OF_CONDUCT.md). By contributing to this project, you agree to abide by its terms.

Owner

  • Name: Epiverse-TRACE
  • Login: epiverse-trace
  • Kind: organization

Citation (CITATION.cff)

# --------------------------------------------
# CITATION file created with {cffr} R package
# See also: https://docs.ropensci.org/cffr/
# --------------------------------------------
 
cff-version: 1.2.0
message: 'To cite package "safeframe" in publications use:'
type: software
license: MIT
title: 'safeframe: Generic Data Tagging and Validation Tool'
version: 1.0.0
abstract: Provides tools to help tag and validate data according to user-specified
  rules. The 'safeframe' class adds variable level attributes to 'data.frame' columns.
  Once tagged, these variables can be seamlessly used in downstream analyses, making
  data pipelines clearer, more robust, and more reliable.
authors:
- family-names: Hartgerink
  given-names: Chris
  email: chris@data.org
  orcid: https://orcid.org/0000-0003-1050-6809
repository-code: https://github.com/epiverse-trace/safeframe
url: https://epiverse-trace.github.io/safeframe/
contact:
- family-names: Hartgerink
  given-names: Chris
  email: chris@data.org
  orcid: https://orcid.org/0000-0003-1050-6809
references:
- type: software
  title: 'R: A Language and Environment for Statistical Computing'
  notes: Depends
  url: https://www.R-project.org/
  authors:
  - name: R Core Team
  institution:
    name: R Foundation for Statistical Computing
    address: Vienna, Austria
  year: '2025'
  version: '>= 4.1.0'
- type: software
  title: checkmate
  abstract: 'checkmate: Fast and Versatile Argument Checks'
  notes: Imports
  url: https://mllg.github.io/checkmate/
  repository: https://CRAN.R-project.org/package=checkmate
  authors:
  - family-names: Lang
    given-names: Michel
    email: michellang@gmail.com
    orcid: https://orcid.org/0000-0001-9754-0393
  year: '2025'
  doi: 10.32614/CRAN.package.checkmate
- type: software
  title: lifecycle
  abstract: 'lifecycle: Manage the Life Cycle of your Package Functions'
  notes: Imports
  url: https://lifecycle.r-lib.org/
  repository: https://CRAN.R-project.org/package=lifecycle
  authors:
  - family-names: Henry
    given-names: Lionel
    email: lionel@posit.co
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
    orcid: https://orcid.org/0000-0003-4757-117X
  year: '2025'
  doi: 10.32614/CRAN.package.lifecycle
- type: software
  title: rlang
  abstract: 'rlang: Functions for Base Types and Core R and ''Tidyverse'' Features'
  notes: Imports
  url: https://rlang.r-lib.org
  repository: https://CRAN.R-project.org/package=rlang
  authors:
  - family-names: Henry
    given-names: Lionel
    email: lionel@posit.co
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
  year: '2025'
  doi: 10.32614/CRAN.package.rlang
- type: software
  title: tidyselect
  abstract: 'tidyselect: Select from a Set of Strings'
  notes: Imports
  url: https://tidyselect.r-lib.org
  repository: https://CRAN.R-project.org/package=tidyselect
  authors:
  - family-names: Henry
    given-names: Lionel
    email: lionel@posit.co
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
  year: '2025'
  doi: 10.32614/CRAN.package.tidyselect
- type: software
  title: callr
  abstract: 'callr: Call R from R'
  notes: Suggests
  url: https://callr.r-lib.org
  repository: https://CRAN.R-project.org/package=callr
  authors:
  - family-names: Csárdi
    given-names: Gábor
    email: csardi.gabor@gmail.com
    orcid: https://orcid.org/0000-0001-7098-9676
  - family-names: Chang
    given-names: Winston
  year: '2025'
  doi: 10.32614/CRAN.package.callr
- type: software
  title: dplyr
  abstract: 'dplyr: A Grammar of Data Manipulation'
  notes: Suggests
  url: https://dplyr.tidyverse.org
  repository: https://CRAN.R-project.org/package=dplyr
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
    orcid: https://orcid.org/0000-0003-4757-117X
  - family-names: François
    given-names: Romain
    orcid: https://orcid.org/0000-0002-2444-4226
  - family-names: Henry
    given-names: Lionel
  - family-names: Müller
    given-names: Kirill
    orcid: https://orcid.org/0000-0002-1416-3412
  - family-names: Vaughan
    given-names: Davis
    email: davis@posit.co
    orcid: https://orcid.org/0000-0003-4777-038X
  year: '2025'
  doi: 10.32614/CRAN.package.dplyr
- type: software
  title: knitr
  abstract: 'knitr: A General-Purpose Package for Dynamic Report Generation in R'
  notes: Suggests
  url: https://yihui.org/knitr/
  repository: https://CRAN.R-project.org/package=knitr
  authors:
  - family-names: Xie
    given-names: Yihui
    email: xie@yihui.name
    orcid: https://orcid.org/0000-0003-0645-5666
  year: '2025'
  doi: 10.32614/CRAN.package.knitr
- type: software
  title: magrittr
  abstract: 'magrittr: A Forward-Pipe Operator for R'
  notes: Suggests
  url: https://magrittr.tidyverse.org
  repository: https://CRAN.R-project.org/package=magrittr
  authors:
  - family-names: Bache
    given-names: Stefan Milton
    email: stefan@stefanbache.dk
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2025'
  doi: 10.32614/CRAN.package.magrittr
- type: software
  title: rmarkdown
  abstract: 'rmarkdown: Dynamic Documents for R'
  notes: Suggests
  url: https://pkgs.rstudio.com/rmarkdown/
  repository: https://CRAN.R-project.org/package=rmarkdown
  authors:
  - family-names: Allaire
    given-names: JJ
    email: jj@posit.co
  - family-names: Xie
    given-names: Yihui
    email: xie@yihui.name
    orcid: https://orcid.org/0000-0003-0645-5666
  - family-names: Dervieux
    given-names: Christophe
    email: cderv@posit.co
    orcid: https://orcid.org/0000-0003-4474-2498
  - family-names: McPherson
    given-names: Jonathan
    email: jonathan@posit.co
  - family-names: Luraschi
    given-names: Javier
  - family-names: Ushey
    given-names: Kevin
    email: kevin@posit.co
  - family-names: Atkins
    given-names: Aron
    email: aron@posit.co
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
  - family-names: Cheng
    given-names: Joe
    email: joe@posit.co
  - family-names: Chang
    given-names: Winston
    email: winston@posit.co
  - family-names: Iannone
    given-names: Richard
    email: rich@posit.co
    orcid: https://orcid.org/0000-0003-3925-190X
  year: '2025'
  doi: 10.32614/CRAN.package.rmarkdown
- type: software
  title: spelling
  abstract: 'spelling: Tools for Spell Checking in R'
  notes: Suggests
  url: https://ropensci.r-universe.dev/spelling
  repository: https://CRAN.R-project.org/package=spelling
  authors:
  - family-names: Ooms
    given-names: Jeroen
    email: jeroenooms@gmail.com
    orcid: https://orcid.org/0000-0002-4035-0289
  - family-names: Hester
    given-names: Jim
    email: james.hester@rstudio.com
  year: '2025'
  doi: 10.32614/CRAN.package.spelling
- type: software
  title: testthat
  abstract: 'testthat: Unit Testing for R'
  notes: Suggests
  url: https://testthat.r-lib.org
  repository: https://CRAN.R-project.org/package=testthat
  authors:
  - family-names: Wickham
    given-names: Hadley
    email: hadley@posit.co
  year: '2025'
  doi: 10.32614/CRAN.package.testthat
- type: software
  title: tibble
  abstract: 'tibble: Simple Data Frames'
  notes: Suggests
  url: https://tibble.tidyverse.org/
  repository: https://CRAN.R-project.org/package=tibble
  authors:
  - family-names: Müller
    given-names: Kirill
    email: kirill@cynkra.com
    orcid: https://orcid.org/0000-0002-1416-3412
  - family-names: Wickham
    given-names: Hadley
    email: hadley@rstudio.com
  year: '2025'
  doi: 10.32614/CRAN.package.tibble

GitHub Events

Total
  • Issues event: 16
  • Watch event: 1
  • Delete event: 23
  • Issue comment event: 14
  • Push event: 65
  • Pull request review comment event: 8
  • Pull request review event: 14
  • Pull request event: 45
  • Create event: 21
Last Year
  • Issues event: 16
  • Watch event: 1
  • Delete event: 23
  • Issue comment event: 14
  • Push event: 65
  • Pull request review comment event: 8
  • Pull request review event: 14
  • Pull request event: 45
  • Create event: 21

Committers

Last synced: about 1 year ago

All Time
  • Total Commits: 125
  • Total Committers: 6
  • Avg Commits per committer: 20.833
  • Development Distribution Score (DDS): 0.216
Past Year
  • Commits: 125
  • Committers: 6
  • Avg Commits per committer: 20.833
  • Development Distribution Score (DDS): 0.216
Top Committers
Name                 Email            Commits
Chris Hartgerink     c****s@l****g    98
Hugo Gruson          1****o           9
dependabot[bot]      4****]           8
GitHub Action        a****n@g****m    6
github-actions[bot]  4****]           2
github-actions       g****s@g****m    2

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 22
  • Total pull requests: 79
  • Average time to close issues: 4 months
  • Average time to close pull requests: 15 days
  • Total issue authors: 6
  • Total pull request authors: 4
  • Average comments per issue: 1.32
  • Average comments per pull request: 0.3
  • Merged pull requests: 64
  • Bot issues: 0
  • Bot pull requests: 30
Past Year
  • Issues: 12
  • Pull requests: 54
  • Average time to close issues: 3 months
  • Average time to close pull requests: 15 days
  • Issue authors: 5
  • Pull request authors: 4
  • Average comments per issue: 0.92
  • Average comments per pull request: 0.3
  • Merged pull requests: 43
  • Bot issues: 0
  • Bot pull requests: 23
Top Authors
Issue Authors
  • Bisaloo (5)
  • chartgerink (2)
  • TimTaylor (2)
  • avallecam (1)
  • joshwlambert (1)
Pull Request Authors
  • chartgerink (42)
  • github-actions[bot] (17)
  • dependabot[bot] (13)
  • Bisaloo (7)
Top Labels
Issue Labels
bug (1) upkeep (1)
Pull Request Labels
dependencies (13)

Packages

  • Total packages: 1
  • Total downloads:
    • cran 180 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 1
  • Total maintainers: 1
cran.r-project.org: safeframe

Generic Data Tagging and Validation Tool

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 180 Last month
Rankings
Dependent packages count: 26.2%
Dependent repos count: 32.2%
Average: 48.3%
Downloads: 86.4%
Maintainers (1)
Last synced: 8 months ago