Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
○Academic publication links
-
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (18.6%) to scientific vocabulary
Last synced: 10 months ago
Repository
Basic Info
- Host: GitHub
- Owner: BehavioralDataAnalysis
- License: other
- Language: R
- Default Branch: main
- Size: 164 KB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 3
- Open Issues: 0
- Releases: 0
Created over 3 years ago
· Last pushed over 1 year ago
Metadata Files
Readme
License
Codemeta
README.Rmd
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# BehavioralDataAnalysis
**WORK IN PROGRESS! Please forgive the mess until the package is ready for release on CRAN.**
The goal of BehavioralDataAnalysis is to provide functions that help you analyze behavioral data, i.e., data that represents the behavior of human beings such as customers and employees. In particular, I believe that two aspects of behavioral data are worth emphasizing:
- It follows a normal distribution far less often than we assume. It can be asymmetrical (skewed), fat-tailed, kurtotic, show multiple peaks, and what have you.
- We're generally interested in understanding what causes a behavior so that we can affect it, e.g., increase customer spending or reduce employee churn. This requires the use of experimental or quasi-experimental methods, many of which make our data even less "well-behaved", statistically speaking.
Both of these aspects call for dedicated analytical approaches, which is what this package is about. I describe this "Causal-Behavioral Framework", as I call it, in more detail in my book [Behavioral Data Analysis with R and Python](https://smile.amazon.com/Behavioral-Data-Analysis-Python-Customer-Driven-ebook/dp/B0979QYPWD/) (O'Reilly Media). You can absolutely use this package without reading the book, though, and I've tried to make the documentation self-contained.
Please note that the package is designed to integrate nicely with the Tidyverse, so most functions expect data formatted as a data.frame or a tibble.
## Installation
You can install the development version of BehavioralDataAnalysis from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("BehavioralDataAnalysis/R_package")
```
## Examples
### Bootstrap confidence interval
The function that you're most likely to use is `boot_ci()`, which estimates a bootstrap confidence interval for a function applied to a dataset. While the `boot.ci()` function of the [boot package](https://cran.r-project.org/web/packages/boot/index.html) offers more options and is more powerful, it often requires more memory and computation than my personal laptop can manage, and I find it somewhat cumbersome to use. Definitely check it out if you need a more serious implementation than the one here!
You can pass to `boot_ci()` any function that takes a data frame as its argument and returns a single number. By default, it returns the 90% confidence interval:
```{r}
library(BehavioralDataAnalysis)
my_data <- data.frame(
x = rnorm(100)
)
my_function <- function(df) { return(mean(df$x)) }
CI <- boot_ci(my_data, my_function)
print(CI)
```
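To make the idea concrete, here is a minimal base R sketch of a percentile bootstrap interval. It is not the package's implementation: the helper `percentile_boot_ci()` and its arguments `B` and `conf.level` are hypothetical names chosen for illustration, and the percentile method is an assumption about one reasonable way to build such an interval.

```r
# Hypothetical helper, not part of BehavioralDataAnalysis: a plain percentile
# bootstrap for any function that maps a data frame to a single number.
percentile_boot_ci <- function(df, fct, B = 1000, conf.level = 0.90) {
  n <- nrow(df)
  # Recompute the statistic on B resamples of the rows, drawn with replacement
  boot_stats <- vapply(seq_len(B), function(i) {
    fct(df[sample.int(n, n, replace = TRUE), , drop = FALSE])
  }, numeric(1))
  # Keep the central conf.level share of the bootstrap distribution
  alpha <- 1 - conf.level
  quantile(boot_stats, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

set.seed(123)
sketch_data <- data.frame(x = rnorm(100))
sketch_ci <- percentile_boot_ci(sketch_data, function(df) mean(df$x))
print(sketch_ci)
```

Because the interval is read directly off the resampled statistics, it does not rely on the statistic being normally distributed, which is the point of using the bootstrap for behavioral data.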
However, the most common use case is probably running a regression, so you can also pass the formula for a linear regression directly as the second parameter. For example, let's look at the relationship between mass and height in the `starwars` dataset.
```{r}
data(starwars, package = "dplyr")
CI <- boot_ci(starwars, 'mass~height')
print(CI)
```
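For readers curious about what bootstrapping a regression involves, the sketch below resamples rows, refits `lm()` on each resample, and takes percentiles of one coefficient. It is an illustration under stated assumptions rather than the package's code: `boot_lm_ci()`, its `term` argument, and the use of the built-in `mtcars` data are all choices made for this example.

```r
# Hypothetical helper, not the package's code: resample rows, refit lm(),
# and take percentiles of the chosen coefficient.
boot_lm_ci <- function(df, formula_chr, term, B = 500, conf.level = 0.90) {
  fml <- as.formula(formula_chr)
  n <- nrow(df)
  coefs <- vapply(seq_len(B), function(i) {
    resampled <- df[sample.int(n, n, replace = TRUE), , drop = FALSE]
    unname(coef(lm(fml, data = resampled))[term])
  }, numeric(1))
  alpha <- 1 - conf.level
  quantile(coefs, probs = c(alpha / 2, 1 - alpha / 2),
           names = FALSE, na.rm = TRUE)
}

set.seed(123)
# 90% interval for the slope of fuel efficiency on weight in mtcars
slope_ci <- boot_lm_ci(mtcars, "mpg ~ wt", term = "wt")
print(slope_ci)
```

Resampling whole rows (rather than residuals) keeps each observation's outcome and predictors together, so the sketch makes no assumption about the shape of the error distribution.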
### Matching subjects for experimentation
If you have access to your whole list of subjects ahead of time (as opposed to, e.g., users visiting your website at random), you can pair subjects sharing similar characteristics to ensure that your experimental groups are as balanced as possible. This is also called stratified assignment, and it is what the function `paired_assign()` does. Note, however, that it will make traditional statistics invalid, and you'll have to use the Bootstrap to build intervals around your central estimates.
```{r, message=FALSE, warning=FALSE}
library(dplyr)
library(BehavioralDataAnalysis)
set.seed(1)
# Keep complete cases, drop the list columns, and exclude one character
dat <- starwars %>%
  na.omit() %>%
  dplyr::select(-films, -vehicles, -starships) %>%
  dplyr::filter(!grepl('Dooku', name))
# Assign subjects to experimental groups by pairing similar ones
paired_assigned_dat <- paired_assign(dat, id = 'name')
# Compare the mean height of the resulting groups
summ <- paired_assigned_dat %>%
  group_by(grp) %>%
  summarize(mean_height = mean(height, na.rm = TRUE))
print(summ)
```
As we can see, the mean heights of the two groups are pretty close. With pure randomization, on the other hand, the two values are farther apart:
```{r}
set.seed(1)
rnd_dat <- dat %>%
  mutate(grp = c(rep(0, 14), rep(1, 14))) %>%
  mutate(grp = sample(grp))
rnd_summ <- rnd_dat %>%
  group_by(grp) %>%
  summarize(mean_height = mean(height, na.rm = TRUE))
print(rnd_summ)
```
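The intuition behind paired assignment can be sketched in a few lines of base R. This is a deliberately simplified illustration, not the algorithm used by `paired_assign()`: the helper `pair_assign_sketch()` is hypothetical, it matches on a single numeric covariate by sorting, and it assumes an even number of subjects. A real implementation would typically match on several covariates at once.

```r
# Hypothetical helper, not the package's algorithm: sort subjects on one
# covariate, pair neighbours, and flip a coin within each pair.
pair_assign_sketch <- function(df, covariate) {
  stopifnot(nrow(df) %% 2 == 0)
  ord <- order(df[[covariate]])
  n_pairs <- nrow(df) / 2
  # Each column of the 2 x n_pairs matrix is a random ordering of (0, 1);
  # flattening it column by column yields one 0 and one 1 per pair
  grp_sorted <- as.vector(replicate(n_pairs, sample(c(0, 1))))
  df$grp <- NA_real_
  df$grp[ord] <- grp_sorted
  df
}

set.seed(1)
subjects <- data.frame(
  id = 1:20,
  height = round(rnorm(20, mean = 170, sd = 10))
)
assigned <- pair_assign_sketch(subjects, "height")
# Mean height per group; pairing keeps the two means close to each other
print(tapply(assigned$height, assigned$grp, mean))
```

Since the two members of each pair are neighbours on the covariate, the gap between group means cannot exceed the covariate's range divided by the number of pairs, whereas pure randomization offers no such guarantee.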
Owner
- Login: BehavioralDataAnalysis
- Kind: user
- Repositories: 1
- Profile: https://github.com/BehavioralDataAnalysis
CodeMeta (codemeta.json)
{
"@context": "https://doi.org/10.5063/schema/codemeta-2.0",
"@type": "SoftwareSourceCode",
"identifier": "BehavioralDataAnalysis",
"description": "Based on the book Behavioral Data Analysis With R and Python. It provides robust functions to analyze behavioral data without relying on traditional statistics.",
"name": "BehavioralDataAnalysis: Bootstrap and Sampling Functions For Behavioral Data Analysis",
"codeRepository": "https://github.com/BehavioralDataAnalysis/R_package",
"issueTracker": "https://github.com/BehavioralDataAnalysis/R_package/issues",
"license": "https://spdx.org/licenses/MIT",
"version": "0.1.0",
"programmingLanguage": {
"@type": "ComputerLanguage",
"name": "R",
"url": "https://r-project.org"
},
"runtimePlatform": "R version 4.2.2 (2022-10-31 ucrt)",
"author": [
{
"@type": "Person",
"givenName": "Florent",
"familyName": "Buisson",
"email": "florent.buisson.oreilly@maskedmails.com"
}
],
"maintainer": [
{
"@type": "Person",
"givenName": "Florent",
"familyName": "Buisson",
"email": "florent.buisson.oreilly@maskedmails.com"
}
],
"softwareSuggestions": [
{
"@type": "SoftwareApplication",
"identifier": "testthat",
"name": "testthat",
"version": ">= 3.0.0",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=testthat"
}
],
"softwareRequirements": {
"1": {
"@type": "SoftwareApplication",
"identifier": "doParallel",
"name": "doParallel",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=doParallel"
},
"2": {
"@type": "SoftwareApplication",
"identifier": "dplyr",
"name": "dplyr",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=dplyr"
},
"3": {
"@type": "SoftwareApplication",
"identifier": "foreach",
"name": "foreach",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=foreach"
},
"4": {
"@type": "SoftwareApplication",
"identifier": "magrittr",
"name": "magrittr",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=magrittr"
},
"5": {
"@type": "SoftwareApplication",
"identifier": "methods",
"name": "methods"
},
"6": {
"@type": "SoftwareApplication",
"identifier": "Rcpp",
"name": "Rcpp",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=Rcpp"
},
"7": {
"@type": "SoftwareApplication",
"identifier": "scales",
"name": "scales",
"provider": {
"@id": "https://cran.r-project.org",
"@type": "Organization",
"name": "Comprehensive R Archive Network (CRAN)",
"url": "https://cran.r-project.org"
},
"sameAs": "https://CRAN.R-project.org/package=scales"
},
"8": {
"@type": "SoftwareApplication",
"identifier": "stats",
"name": "stats"
},
"9": {
"@type": "SoftwareApplication",
"identifier": "R",
"name": "R",
"version": ">= 2.10"
},
"SystemRequirements": null
},
"fileSize": "6157.639KB",
"readme": "https://github.com/BehavioralDataAnalysis/R_package/blob/main/README.md",
"contIntegration": [
"https://app.codecov.io/gh/BehavioralDataAnalysis/R_package?branch=main",
"https://github.com/BehavioralDataAnalysis/R_package/actions/workflows/check-standard.yaml"
],
"developmentStatus": "https://www.repostatus.org/#wip"
}
GitHub Events
Total
- Push event: 5
Last Year
- Push event: 5
Dependencies
.github/workflows/check-standard.yaml
actions
- actions/checkout v3 composite
- r-lib/actions/check-r-package v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml
actions
- actions/checkout v3 composite
- actions/upload-artifact v3 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION
cran
- R >= 2.10 depends
- Rcpp * imports
- doParallel * imports
- dplyr * imports
- foreach * imports
- magrittr * imports
- methods * imports
- scales * imports
- stats * imports
- knitr * suggests
- rmarkdown * suggests
- testthat >= 3.0.0 suggests