tarchetypes

Archetypes for targets and pipelines

https://github.com/ropensci/tarchetypes

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.0%) to scientific vocabulary

Keywords

data-science high-performance-computing peer-reviewed pipeline r r-package r-targetopia reproducibility rstats targets workflow

Keywords from Contributors

make drake makefile ropensci jags rjags rstats-package carpentries data-carpentry data-wrangling
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 146
  • Watchers: 7
  • Forks: 20
  • Open Issues: 0
  • Releases: 36
Created over 5 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing License Codemeta

README.Rmd

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# tarchetypes 

[![ropensci](https://badges.ropensci.org/401_status.svg)](https://github.com/ropensci/software-review/issues/401)
[![zenodo](https://zenodo.org/badge/282774543.svg)](https://zenodo.org/badge/latestdoi/282774543)
[![R Targetopia](https://img.shields.io/badge/R_Targetopia-member-blue?style=flat&labelColor=gray)](https://wlandau.github.io/targetopia/)
[![CRAN](https://www.r-pkg.org/badges/version/tarchetypes)](https://CRAN.R-project.org/package=tarchetypes)
[![status](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)
[![check](https://github.com/ropensci/tarchetypes/actions/workflows/check.yaml/badge.svg)](https://github.com/ropensci/tarchetypes/actions?query=workflow%3Acheck)
[![codecov](https://codecov.io/gh/ropensci/tarchetypes/branch/main/graph/badge.svg?token=3T5DlLwUVl)](https://app.codecov.io/gh/ropensci/tarchetypes)

The `tarchetypes` R package is a collection of target and pipeline archetypes for the [`targets`](https://github.com/ropensci/targets) package. These archetypes express complicated pipelines with concise syntax, which enhances readability and thus reproducibility. Archetypes are possible because of the flexible metaprogramming capabilities of [`targets`](https://github.com/ropensci/targets). In [`targets`](https://github.com/ropensci/targets), one can define a target as an object outside the central pipeline, and the [`tar_target_raw()`](https://docs.ropensci.org/targets/reference/tar_target_raw.html) function completely avoids non-standard evaluation. That means anyone can write their own niche interfaces for specialized projects. `tarchetypes` aims to include the most common and versatile archetypes and usage patterns.
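As an illustration of the kind of niche interface this makes possible, here is a minimal sketch of a custom target factory built on `tar_target_raw()`. The function `tar_csv()`, the target name `sales`, and the path `data/sales.csv` are hypothetical and are not part of `tarchetypes`.

```r
library(targets)

# Hypothetical archetype (not part of tarchetypes): one call creates
# a file-tracking target and a target that reads the file.
tar_csv <- function(name, path) {
  name_file <- paste0(name, "_file")
  # Build both commands programmatically as language objects,
  # since tar_target_raw() does not quote its arguments for us.
  file_command <- substitute(c(path), env = list(path = path))
  read_command <- substitute(
    utils::read.csv(file),
    env = list(file = as.symbol(name_file))
  )
  list(
    # Upstream target: returns the path so the file itself is tracked.
    tar_target_raw(name_file, file_command, format = "file"),
    # Downstream target: depends on the file target through its symbol.
    tar_target_raw(name, read_command)
  )
}

# Returns a list of two target objects: sales_file and sales.
csv_targets <- tar_csv("sales", "data/sales.csv")
```

A `_targets.R` file could then end with `list(tar_csv("sales", "data/sales.csv"))`, because `targets` accepts nested lists of target objects.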

## Grouped data frames

`tarchetypes` has functions for easy dynamic branching over subsets of data frames:

* `tar_group_by()`: define row groups using `dplyr::group_by()` semantics.
* `tar_group_select()`: define row groups using `tidyselect` semantics.
* `tar_group_count()`: define a given number of row groups.
* `tar_group_size()`: define row groups of a given size.

If you define a target with one of these functions, all downstream dynamic targets will automatically branch over the row groups.

```{r, echo = FALSE}
targets::tar_script({
  produce_data <- function() {
    expand.grid(var1 = c("a", "b"), var2 = c("c", "d"), rep = c(1, 2, 3))
  }
  list(
    tarchetypes::tar_group_by(data, produce_data(), var1, var2),
    tar_target(group, data, pattern = map(data))
  )
})
```

```{r, eval = FALSE}
# _targets.R file:
library(targets)
library(tarchetypes)
produce_data <- function() {
  expand.grid(var1 = c("a", "b"), var2 = c("c", "d"), rep = c(1, 2, 3))
}
list(
  tar_group_by(data, produce_data(), var1, var2),
  tar_target(group, data, pattern = map(data))
)
```

```{r}
# R console:
library(targets)
tar_make()

# First row group:
tar_read(group, branches = 1)

# Second row group:
tar_read(group, branches = 2)
```
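The other grouping functions follow the same pattern. The sketch below assumes a toy 10-row data frame and uses `tar_group_size()` to request row groups of at most 4 rows; the target names `data` and `rows` are illustrative. In a real project, the list would be the last expression of `_targets.R`.

```r
# _targets.R file (sketch):
library(targets)
library(tarchetypes)

pipeline <- list(
  # Split a toy 10-row data frame into row groups of at most 4 rows.
  tar_group_size(data, data.frame(x = seq_len(10)), size = 4),
  # One dynamic branch per row group; each branch counts its rows.
  tar_target(rows, nrow(data), pattern = map(data))
)

# A _targets.R file must end with the list of targets.
pipeline
```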

## Literate programming

Consider the following R Markdown report.

```{r, echo = FALSE, comment = ""}
lines <- c(
  "---",
  "title: report",
  "output: html_document",
  "---",
  "",
  "```{r}",
  "library(targets)",
  "tar_read(dataset)",
  "```"
)
cat(lines, sep = "\n")
```

We want to define a target to render the report. Because the report calls `tar_read(dataset)`, this target needs to depend on `dataset`. Without `tarchetypes`, setting up the pipeline correctly is cumbersome.

```{r, eval = FALSE}
# _targets.R
library(targets)
list(
  tar_target(dataset, data.frame(x = letters)),
  tar_target(
    report, {
      # Explicitly mention the symbol `dataset`.
      list(dataset)
      # Return relative paths to keep the project portable.
      fs::path_rel(
        # Need to return/track all input/output files.
        c( 
          rmarkdown::render(
            input = "report.Rmd",
            # Always run from the project root
            # so the report can find _targets/.
            knit_root_dir = getwd(),
            quiet = TRUE
          ),
          "report.Rmd"
        )
      )
    },
    # Track the input and output files.
    format = "file",
    # Avoid building small reports on HPC.
    deployment = "main"
  )
)
```

With `tarchetypes`, we can simplify the pipeline with the `tar_render()` archetype.

```{r, eval = FALSE}
# _targets.R
library(targets)
library(tarchetypes)
list(
  tar_target(dataset, data.frame(x = letters)),
  tar_render(report, "report.Rmd")
)
```

Above, `tar_render()` scans code chunks for mentions of targets in `tar_load()` and `tar_read()`, and it enforces the dependency relationships it finds. In our case, it reads `report.Rmd` and then forces `report` to depend on `dataset`. That way, `tar_make()` always processes `dataset` before `report`, and it automatically reruns `report.Rmd` whenever `dataset` changes.
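To inspect which dependencies would be detected in a report, `tarchetypes` also exports `tar_knitr_deps()`. The sketch below recreates the example report in a temporary file (the temporary path is an assumption for illustration) and lists the targets it mentions. It requires the `knitr` package.

```r
library(tarchetypes)

# Recreate the example report in a temporary file.
fence <- strrep("`", 3)
report_path <- tempfile(fileext = ".Rmd")
writeLines(
  c(
    "---", "title: report", "output: html_document", "---", "",
    paste0(fence, "{r}"), "library(targets)", "tar_read(dataset)", fence
  ),
  report_path
)

# Names of targets mentioned in tar_load() and tar_read() calls.
deps <- tar_knitr_deps(report_path)
print(deps)
```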

## Alternative pipeline syntax

[`tar_plan()`](https://docs.ropensci.org/tarchetypes/reference/tar_plan.html) is a drop-in replacement for [`drake_plan()`](https://docs.ropensci.org/drake/reference/drake_plan.html) in the [`targets`](https://github.com/ropensci/targets) ecosystem. 
It lets users write targets as name/command pairs without having to call [`tar_target()`](https://docs.ropensci.org/targets/reference/tar_target.html).

```{r, eval = FALSE}
tar_plan(
  tar_file(raw_data_file, "data/raw_data.csv", format = "file"),
  # Simple drake-like syntax:
  raw_data = read_csv(raw_data_file, col_types = cols()),
  data = raw_data %>%
    mutate(Ozone = replace_na(Ozone, mean(Ozone, na.rm = TRUE))),
  hist = create_plot(data),
  fit = biglm(Ozone ~ Wind + Temp, data),
  # Needs tar_render() because it is a target archetype:
  tar_render(report, "report.Rmd")
)
```
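To make the correspondence concrete, here is a minimal sketch with two toy targets (`x` and `y` are illustrative names). The name/command pairs in `tar_plan()` are intended to define the same pipeline as the explicit `tar_target()` calls.

```r
library(targets)
library(tarchetypes)

# drake-like name/command pairs:
plan <- tar_plan(
  x = 1 + 1,
  y = x * 2
)

# The same pipeline written with explicit tar_target() calls:
explicit <- list(
  tar_target(x, 1 + 1),
  tar_target(y, x * 2)
)
```

Either object can serve as the final expression of `_targets.R`.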

## Installation

Type | Source | Command
---|---|---
Release | CRAN | `install.packages("tarchetypes")`
Development | GitHub | `remotes::install_github("ropensci/tarchetypes")`
Development | rOpenSci | `install.packages("tarchetypes", repos = "https://dev.ropensci.org")`

## Documentation

For specific documentation on `tarchetypes`, including the help files of all user-side functions, please visit the [reference website](https://docs.ropensci.org/tarchetypes/). For documentation on [`targets`](https://github.com/ropensci/targets) in general, please visit the [`targets` reference website](https://docs.ropensci.org/targets/). Many of the linked resources use `tarchetypes` functions such as [`tar_render()`](https://docs.ropensci.org/tarchetypes/reference/tar_render.html).

## Help

Please read the [help guide](https://books.ropensci.org/targets/help.html) to learn how best to ask for help using `targets` and `tarchetypes`.

## Code of conduct

Please note that this package is released with a [Contributor Code of Conduct](https://ropensci.org/code-of-conduct/).

## Citation

```{r}
citation("tarchetypes")
```

```{r, echo = FALSE}
unlink("_targets.R")
tar_destroy()
```

Owner

  • Name: rOpenSci
  • Login: ropensci
  • Kind: organization
  • Email: info@ropensci.org
  • Location: Berkeley, CA

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "identifier": "tarchetypes",
  "description": "Function-oriented Make-like declarative pipelines for Statistics and data science are supported in the 'targets' R package. As an extension to 'targets', the 'tarchetypes' package provides convenient user-side functions to make 'targets' easier to use. By establishing reusable archetypes for common kinds of targets and pipelines, these functions help express complicated reproducible pipelines concisely and compactly. The methods in this package were influenced by the 'drake' R package by Will Landau (2018) <doi:10.21105/joss.00550>.",
  "name": "tarchetypes: Archetypes for Targets",
  "relatedLink": [
    "https://docs.ropensci.org/tarchetypes/",
    "https://CRAN.R-project.org/package=tarchetypes"
  ],
  "codeRepository": "https://github.com/ropensci/tarchetypes",
  "issueTracker": "https://github.com/ropensci/tarchetypes/issues",
  "license": "https://spdx.org/licenses/MIT",
  "version": "0.8.0.9001",
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "name": "R",
    "url": "https://r-project.org"
  },
  "runtimePlatform": "R version 4.3.2 (2023-10-31)",
  "provider": {
    "@id": "https://cran.r-project.org",
    "@type": "Organization",
    "name": "Comprehensive R Archive Network (CRAN)",
    "url": "https://cran.r-project.org"
  },
  "author": [
    {
      "@type": "Person",
      "givenName": [
        "William",
        "Michael"
      ],
      "familyName": "Landau",
      "email": "will.landau.oss@gmail.com",
      "@id": "https://orcid.org/0000-0003-1878-3253"
    }
  ],
  "copyrightHolder": [
    {
      "@type": "Organization",
      "name": "Eli Lilly and Company"
    }
  ],
  "maintainer": [
    {
      "@type": "Person",
      "givenName": [
        "William",
        "Michael"
      ],
      "familyName": "Landau",
      "email": "will.landau.oss@gmail.com",
      "@id": "https://orcid.org/0000-0003-1878-3253"
    }
  ],
  "softwareSuggestions": [
    {
      "@type": "SoftwareApplication",
      "identifier": "curl",
      "name": "curl",
      "version": ">= 4.3",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=curl"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "knitr",
      "name": "knitr",
      "version": ">= 1.28",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=knitr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "quarto",
      "name": "quarto",
      "version": ">= 1.4",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=quarto"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "rmarkdown",
      "name": "rmarkdown",
      "version": ">= 2.1",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=rmarkdown"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "testthat",
      "name": "testthat",
      "version": ">= 3.0.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=testthat"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "xml2",
      "name": "xml2",
      "version": ">= 1.3.2",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=xml2"
    }
  ],
  "softwareRequirements": {
    "1": {
      "@type": "SoftwareApplication",
      "identifier": "R",
      "name": "R",
      "version": ">= 3.5.0"
    },
    "2": {
      "@type": "SoftwareApplication",
      "identifier": "dplyr",
      "name": "dplyr",
      "version": ">= 1.0.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=dplyr"
    },
    "3": {
      "@type": "SoftwareApplication",
      "identifier": "fs",
      "name": "fs",
      "version": ">= 1.4.2",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=fs"
    },
    "4": {
      "@type": "SoftwareApplication",
      "identifier": "parallel",
      "name": "parallel"
    },
    "5": {
      "@type": "SoftwareApplication",
      "identifier": "rlang",
      "name": "rlang",
      "version": ">= 0.4.7",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=rlang"
    },
    "6": {
      "@type": "SoftwareApplication",
      "identifier": "secretbase",
      "name": "secretbase",
      "version": ">= 0.4.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=secretbase"
    },
    "7": {
      "@type": "SoftwareApplication",
      "identifier": "targets",
      "name": "targets",
      "version": ">= 1.6.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=targets"
    },
    "8": {
      "@type": "SoftwareApplication",
      "identifier": "tibble",
      "name": "tibble",
      "version": ">= 3.0.1",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=tibble"
    },
    "9": {
      "@type": "SoftwareApplication",
      "identifier": "tidyselect",
      "name": "tidyselect",
      "version": ">= 1.1.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=tidyselect"
    },
    "10": {
      "@type": "SoftwareApplication",
      "identifier": "utils",
      "name": "utils"
    },
    "11": {
      "@type": "SoftwareApplication",
      "identifier": "vctrs",
      "name": "vctrs",
      "version": ">= 0.3.4",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=vctrs"
    },
    "12": {
      "@type": "SoftwareApplication",
      "identifier": "withr",
      "name": "withr",
      "version": ">= 2.1.2",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=withr"
    },
    "SystemRequirements": null
  },
  "fileSize": "1209.095KB",
  "citation": [
    {
      "@type": "SoftwareSourceCode",
      "datePublished": "2021",
      "author": [
        {
          "@type": "Person",
          "givenName": [
            "William",
            "Michael"
          ],
          "familyName": "Landau"
        }
      ],
      "name": "tarchetypes: Archetypes for Targets",
      "description": "{https://docs.ropensci.org/tarchetypes/, https://github.com/ropensci/tarchetypes}"
    }
  ],
  "releaseNotes": "https://github.com/ropensci/tarchetypes/blob/master/NEWS.md",
  "readme": "https://github.com/ropensci/tarchetypes/blob/main/README.md",
  "contIntegration": [
    "https://github.com/ropensci/tarchetypes/actions?query=workflow%3Acheck",
    "https://app.codecov.io/gh/ropensci/tarchetypes",
    "https://github.com/ropensci/tarchetypes/actions?query=workflow%3Alint"
  ],
  "developmentStatus": "https://www.repostatus.org/#active",
  "review": {
    "@type": "Review",
    "url": "https://github.com/ropensci/software-review/issues/401",
    "provider": "https://ropensci.org"
  },
  "keywords": [
    "reproducibility",
    "high-performance-computing",
    "r",
    "data-science",
    "rstats",
    "pipeline",
    "r-package",
    "workflow",
    "targets",
    "r-targetopia",
    "peer-reviewed"
  ]
}

GitHub Events

Total
  • Create event: 5
  • Release event: 5
  • Issues event: 18
  • Watch event: 12
  • Delete event: 3
  • Issue comment event: 25
  • Push event: 43
  • Pull request event: 5
  • Pull request review event: 5
  • Pull request review comment event: 5
  • Fork event: 3
Last Year
  • Create event: 5
  • Release event: 5
  • Issues event: 18
  • Watch event: 12
  • Delete event: 3
  • Issue comment event: 25
  • Push event: 43
  • Pull request event: 5
  • Pull request review event: 5
  • Pull request review comment event: 5
  • Fork event: 3

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 667
  • Total Committers: 9
  • Avg Commits per committer: 74.111
  • Development Distribution Score (DDS): 0.19
Past Year
  • Commits: 101
  • Committers: 6
  • Avg Commits per committer: 16.833
  • Development Distribution Score (DDS): 0.238
Top Committers
Name Email Commits
wlandau w****u@g****m 540
wlandau w****s@g****m 98
mutlusun m****n 22
Bill Denney w****y@h****m 2
rmflight r****9@g****m 1
Noam Ross n****s@g****m 1
Mike Mahoney m****8@g****m 1
Maëlle Salmon m****n@y****e 1
Florian Kohrt f****t@a****o 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 89
  • Total pull requests: 24
  • Average time to close issues: 17 days
  • Average time to close pull requests: 5 days
  • Total issue authors: 39
  • Total pull request authors: 8
  • Average comments per issue: 2.78
  • Average comments per pull request: 1.75
  • Merged pull requests: 22
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 10
  • Pull requests: 7
  • Average time to close issues: 13 days
  • Average time to close pull requests: 5 days
  • Issue authors: 9
  • Pull request authors: 3
  • Average comments per issue: 1.1
  • Average comments per pull request: 1.86
  • Merged pull requests: 7
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • wlandau (39)
  • psychelzh (6)
  • Aariq (3)
  • noamross (2)
  • arcruz0 (2)
  • pat-s (2)
  • yonicd (2)
  • Pierre9344 (1)
  • anthonynorth (1)
  • asadow (1)
  • tjmahr (1)
  • arnold-c (1)
  • petrbouchal (1)
  • NFA (1)
  • bastistician (1)
Pull Request Authors
  • wlandau (12)
  • mutlusun (10)
  • noamross (2)
  • billdenney (2)
  • rmflight (2)
  • fkohrt (1)
  • maelle (1)
  • mikemahoney218 (1)
Top Labels
Issue Labels
type: new feature (52) type: bug (7) type: edge case (6) type: trouble (4) depends: external prerequisite (3) topic: literate programming (2) type: use case (1) depends: another issue (1) status: out of scope (1) status: incompatible (1) topic: reproducibility (1) type: maintenance (1)
Pull Request Labels
type: new feature (3) topic: branching (1) topic: literate programming (1) topic: reproducibility (1)

Packages

  • Total packages: 2
  • Total downloads:
    • cran 2,821 last-month
  • Total docker downloads: 194
  • Total dependent packages: 2
    (may contain duplicates)
  • Total dependent repositories: 36
    (may contain duplicates)
  • Total versions: 47
  • Total maintainers: 1
cran.r-project.org: tarchetypes

Archetypes for Targets

  • Versions: 33
  • Dependent Packages: 2
  • Dependent Repositories: 35
  • Downloads: 2,821 Last month
  • Docker Downloads: 194
Rankings
Stargazers count: 3.7%
Dependent repos count: 4.4%
Forks count: 5.2%
Downloads: 9.3%
Average: 10.7%
Dependent packages count: 18.1%
Docker downloads count: 23.7%
Maintainers (1)
Last synced: 6 months ago
conda-forge.org: r-tarchetypes
  • Versions: 14
  • Dependent Packages: 0
  • Dependent Repositories: 1
Rankings
Dependent repos count: 24.4%
Stargazers count: 34.4%
Average: 38.1%
Forks count: 41.8%
Dependent packages count: 51.6%
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • R >= 3.5.0 depends
  • digest >= 0.6.25 imports
  • dplyr >= 1.0.0 imports
  • fs >= 1.4.2 imports
  • rlang >= 0.4.7 imports
  • targets >= 0.11.0 imports
  • tibble >= 3.0.1 imports
  • tidyselect >= 1.1.0 imports
  • utils * imports
  • vctrs >= 0.3.4 imports
  • withr >= 2.1.2 imports
  • curl >= 4.3 suggests
  • knitr >= 1.28 suggests
  • quarto >= 1.0 suggests
  • rmarkdown >= 2.1 suggests
  • testthat >= 3.0.0 suggests
  • xml2 >= 1.3.2 suggests
.github/workflows/check.yaml actions
  • actions/checkout v3 composite
  • quarto-dev/quarto-actions/setup v2 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/cover.yaml actions
  • actions/checkout v3 composite
  • quarto-dev/quarto-actions/setup v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/lint.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite