Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 2 of 51 committers (3.9%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (19.3%) to scientific vocabulary
Keywords from Contributors
tidy-data
package-creation
tidyverse
data-manipulation
grammar
date-time
network-analysis
odbc
pandoc
reproducibility
Last synced: 10 months ago
Repository
A tidy unified interface to models
Basic Info
- Host: GitHub
- Owner: tidymodels
- License: other
- Language: R
- Default Branch: main
- Homepage: https://parsnip.tidymodels.org
- Size: 31.1 MB
Statistics
- Stars: 626
- Watchers: 26
- Forks: 94
- Open Issues: 94
- Releases: 31
Created over 8 years ago · Last pushed 10 months ago
Metadata Files
Readme
Changelog
Contributing
License
Code of conduct
README.Rmd
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```
# parsnip
## Introduction
The goal of parsnip is to provide a tidy, unified interface to models that can be used to try a range of models without getting bogged down in the syntactical minutiae of the underlying packages.
## Installation
```{r, eval = FALSE}
# The easiest way to get parsnip is to install all of tidymodels:
install.packages("tidymodels")
# Alternatively, install just parsnip:
install.packages("parsnip")
# Or the development version from GitHub:
# install.packages("pak")
pak::pak("tidymodels/parsnip")
```
## Getting started
One challenge with different modeling functions available in R _that do the same thing_ is that they can have different interfaces and arguments. For example, to fit a random forest regression model, we might have:
```{r eval = FALSE}
# From randomForest
rf_1 <- randomForest(
  y ~ .,
  data = dat,
  mtry = 10,
  ntree = 2000,
  importance = TRUE
)

# From ranger
rf_2 <- ranger(
  y ~ .,
  data = dat,
  mtry = 10,
  num.trees = 2000,
  importance = "impurity"
)

# From sparklyr
rf_3 <- ml_random_forest(
  dat,
  intercept = FALSE,
  response = "y",
  features = names(dat)[names(dat) != "y"],
  col.sample.rate = 10,
  num.trees = 2000
)
```
Note that the model syntax can be very different and that the argument names (and formats) are also different. This is a pain if you switch between implementations.
In this example:
* the **type** of model is "random forest",
* the **mode** of the model is "regression" (as opposed to classification, etc), and
* the computational **engine** is the name of the R package.
The goals of parsnip are to:
* Separate the definition of a model from its evaluation.
* Decouple the model specification from the implementation (whether the implementation is in R, spark, or something else). For example, the user would call `rand_forest` instead of `ranger::ranger` or other specific packages.
* Harmonize argument names (e.g. `n.trees`, `ntrees`, `trees`) so that users only need to remember a single name. This will help _across_ model types too so that `trees` will be the same argument across random forest as well as boosting or bagging.
Using the example above, the parsnip approach would be:
```{r}
library(parsnip)

rand_forest(mtry = 10, trees = 2000) |>
  set_engine("ranger", importance = "impurity") |>
  set_mode("regression")
```
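To see how the harmonized argument names map back onto a particular engine, `translate()` can be called on a specification. The following is a minimal sketch, assuming parsnip is installed; the object names `rf_spec` and `rf_translated` are illustrative only. The printed template should show the engine's own argument names in place of parsnip's `trees`.

```r
library(parsnip)

# Illustrative object name; the specification matches the one used above
rf_spec <- rand_forest(mtry = 10, trees = 2000) |>
  set_engine("ranger") |>
  set_mode("regression")

# translate() fills in the engine-specific call template without fitting anything
rf_translated <- translate(rf_spec)

# Printing shows the model fit template with the engine's argument names
print(rf_translated)
```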
The engine can be easily changed. To use Spark, the change is straightforward:
```{r}
rand_forest(mtry = 10, trees = 2000) |>
  set_engine("spark") |>
  set_mode("regression")
```
Either one of these model specifications can be fit in the same way:
```{r}
set.seed(192)

rand_forest(mtry = 10, trees = 2000) |>
  set_engine("ranger", importance = "impurity") |>
  set_mode("regression") |>
  fit(mpg ~ ., data = mtcars)
```
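Because the specification is separate from the fit, it can be stored, fitted later, and then used for prediction through the same `predict()` interface. The sketch below is an illustration rather than part of the original example: it swaps in `linear_reg()` with the `"lm"` engine (an assumption made so that no extra modeling package is needed), and the names `lm_spec`, `lm_fit`, and `preds` are hypothetical.

```r
library(parsnip)

# Define the model once; nothing is estimated at this point
lm_spec <- linear_reg() |>
  set_engine("lm")

# Fit the stored specification to data
lm_fit <- fit(lm_spec, mpg ~ wt + hp, data = mtcars)

# predict() on a parsnip fit takes `new_data` and returns a tibble
preds <- predict(lm_fit, new_data = head(mtcars))
print(preds)
```

Numeric predictions come back in a `.pred` column with one row per row of `new_data`, which keeps downstream code independent of the engine.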
A list of all parsnip models across different CRAN packages can be found at https://www.tidymodels.org/find/parsnip/.
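The engines registered for a given model type can also be listed from within R using `show_engines()`. This is a small sketch, assuming only parsnip is loaded; extension packages may register additional engines, so the exact rows depend on what is attached. The name `rf_engines` is illustrative.

```r
library(parsnip)

# Returns a tibble of engine/mode combinations known for this model type
rf_engines <- show_engines("rand_forest")
print(rf_engines)
```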
## Contributing
This project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.
- For questions and discussions about tidymodels packages, modeling, and machine learning, please [post on Posit Community](https://forum.posit.co/new-topic?category_id=15&tags=tidymodels,question).
- If you think you have encountered a bug, please [submit an issue](https://github.com/tidymodels/parsnip/issues).
- Either way, learn how to create and share a [reprex](https://reprex.tidyverse.org/articles/articles/learn-reprex.html) (a minimal, reproducible example), to clearly communicate about your code.
- Check out further details on [contributing guidelines for tidymodels packages](https://www.tidymodels.org/contribute/) and [how to get help](https://www.tidymodels.org/help/).
Owner
- Name: tidymodels
- Login: tidymodels
- Kind: organization
- Repositories: 59
- Profile: https://github.com/tidymodels
GitHub Events
Total
- Create event: 35
- Release event: 3
- Issues event: 54
- Watch event: 28
- Delete event: 29
- Issue comment event: 130
- Push event: 121
- Pull request review event: 39
- Pull request review comment event: 52
- Pull request event: 49
- Fork event: 7
Last Year
- Create event: 35
- Release event: 3
- Issues event: 54
- Watch event: 28
- Delete event: 29
- Issue comment event: 130
- Push event: 121
- Pull request review event: 39
- Pull request review comment event: 52
- Pull request event: 49
- Fork event: 7
Committers
Last synced: about 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Max Kuhn | m****n@g****m | 855 |
| Julia Silge | j****e@g****m | 183 |
| Emil Hvitfeldt | e****t@g****m | 182 |
| Hannah Frick | h****h@r****m | 167 |
| Simon P. Couch | s****h@g****m | 163 |
| DavisVaughan | d****s@r****m | 46 |
| Patrick Miller | p****r@c****m | 20 |
| Malcolm Barrett | m****t@g****m | 6 |
| Qiushi Yan | q****n@g****m | 6 |
| Mine Çetinkaya-Rundel | c****e@g****m | 5 |
| Rory Nolan | r****n@g****m | 4 |
| Max Kuhn | m****x@i****l | 4 |
| ‘topepo’ | ‘****n@g****’ | 3 |
| Gray | g****o@g****m | 3 |
| Steven Pawley | d****y@g****m | 3 |
| Max Kuhn | m****x@i****t | 2 |
| irkaal | i****v@g****m | 2 |
| artichaud1 | k****i@h****m | 2 |
| Y. Yu | 5****e | 2 |
| Steve Hummel | 4****1 | 2 |
| Omi Johnson | o****n@b****g | 2 |
| Matt Dancho | m****o@g****m | 2 |
| Kyle Scott | k****9@m****u | 2 |
| Byron | b****r@g****m | 2 |
| Tomasz Kalinowski | k****t@g****m | 1 |
| Tiago Maié | t****e@h****m | 1 |
| Tan Ho | 3****3 | 1 |
| StefanBRas | 2****s | 1 |
| Jonathan Marshall | j****l@m****z | 1 |
| Paige Bailey | p****y@m****m | 1 |
| and 21 more... | | |
Committer Domains (Top 20 + Academic)
rstudio.com: 2
civisanalytics.com: 1
gmail.com’: 1
imp.atlanticbb.net: 1
bigelow.org: 1
miami.edu: 1
massey.ac.nz: 1
microsoft.com: 1
ncsu.edu: 1
posteo.de: 1
salim.space: 1
nps.gov: 1
lunenfeld.ca: 1
ciencias.unam.mx: 1
jameshwade.com: 1
pm.me: 1
h2o.ai: 1
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 234
- Total pull requests: 290
- Average time to close issues: 6 months
- Average time to close pull requests: 11 days
- Total issue authors: 76
- Total pull request authors: 16
- Average comments per issue: 1.72
- Average comments per pull request: 1.14
- Merged pull requests: 255
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 59
- Pull requests: 82
- Average time to close issues: 2 days
- Average time to close pull requests: 12 days
- Issue authors: 14
- Pull request authors: 6
- Average comments per issue: 0.36
- Average comments per pull request: 0.87
- Merged pull requests: 71
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- simonpcouch (41)
- hfrick (34)
- EmilHvitfeldt (34)
- topepo (32)
- chillerb (7)
- jxu (4)
- marcelglueck (3)
- juliasilge (3)
- tolliam (2)
- SHo-JANG (2)
- joscani (2)
- Freestyleyang (2)
- exsell-jc (2)
- ZWael (2)
- cb12991 (1)
Pull Request Authors
- simonpcouch (105)
- topepo (63)
- EmilHvitfeldt (57)
- hfrick (37)
- shum461 (5)
- kscott-1 (4)
- dajmcdon (3)
- JamesHWade (2)
- RodDalBen (2)
- RobLBaker (2)
- luisDVA (2)
- bcjaeger (2)
- gaborcsardi (2)
- corybrunson (2)
- gmcmacran (1)
Top Labels
Issue Labels
feature (34)
upkeep (33)
bug (26)
documentation (23)
tidy-dev-day :nerd_face: (16)
discussion (2)
reprex (2)
question (2)
help wanted :heart: (2)
Pull Request Labels
Packages
- Total packages: 2
- Total downloads:
  - cran: 34,670 last month
- Total docker downloads: 33,521,067
- Total dependent packages: 79 (may contain duplicates)
- Total dependent repositories: 185 (may contain duplicates)
- Total versions: 49
- Total maintainers: 1
cran.r-project.org: parsnip
A Common API to Modeling and Analysis Functions
- Homepage: https://github.com/tidymodels/parsnip
- Documentation: http://cran.r-project.org/web/packages/parsnip/parsnip.pdf
- License: MIT + file LICENSE
- Latest release: 1.3.3 (published 10 months ago)
Rankings
Stargazers count: 0.7%
Forks count: 1.0%
Dependent repos count: 1.4%
Dependent packages count: 1.5%
Average: 1.5%
Downloads: 2.2%
Docker downloads count: 2.3%
Maintainers (1)
Last synced: 10 months ago
conda-forge.org: r-parsnip
- Homepage: https://tidymodels.github.io/parsnip, https://github.com/tidymodels/parsnip
- License: MIT
- Latest release: 1.0.3 (published over 3 years ago)
Rankings
Dependent packages count: 5.9%
Average: 17.4%
Stargazers count: 17.5%
Forks count: 21.7%
Dependent repos count: 24.3%
Last synced: 10 months ago