Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (18.6%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: BehavioralDataAnalysis
  • License: other
  • Language: R
  • Default Branch: main
  • Size: 164 KB
Statistics
  • Stars: 2
  • Watchers: 1
  • Forks: 3
  • Open Issues: 0
  • Releases: 0
Created over 3 years ago · Last pushed over 1 year ago

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# BehavioralDataAnalysis


[![Codecov test coverage](https://codecov.io/gh/BehavioralDataAnalysis/R_package/branch/main/graph/badge.svg)](https://app.codecov.io/gh/BehavioralDataAnalysis/R_package?branch=main)
[![check-standard](https://github.com/BehavioralDataAnalysis/R_package/actions/workflows/check-standard.yaml/badge.svg)](https://github.com/BehavioralDataAnalysis/R_package/actions/workflows/check-standard.yaml)
[![Project Status: WIP – Initial development is in progress, but there has not yet been a stable, usable release suitable for the public.](https://www.repostatus.org/badges/latest/wip.svg)](https://www.repostatus.org/#wip)



**WORK IN PROGRESS! Please forgive the mess until the package is ready for release on CRAN.**


The goal of BehavioralDataAnalysis is to provide functions to help you analyze behavioral data, i.e., data that represents the behavior of human beings such as customers and employees. In particular, I believe that two aspects of behavioral data are worth emphasizing:

- It doesn't follow a normal distribution nearly as often as we assume. It can be asymmetrical (skewed), fat-tailed, kurtotic, or have multiple peaks.
- We're generally interested in understanding what causes a behavior so that we can affect it, e.g., increase customer spending or reduce employee churn. This requires experimental or quasi-experimental methods, many of which make our data even less "well-behaved", statistically speaking.

Both of these aspects call for dedicated analytical approaches, which is what this package is about. I describe this "Causal-Behavioral Framework", as I call it, in more detail in my book [Behavioral Data Analysis with R and Python](https://smile.amazon.com/Behavioral-Data-Analysis-Python-Customer-Driven-ebook/dp/B0979QYPWD/) (O'Reilly Media). That said, you can use this package without reading the book, and I've tried to make the documentation self-contained.

Please note that the package is designed to integrate nicely with the Tidyverse and therefore most functions will expect data formatted as a data.frame or a tibble. 

## Installation

You can install the development version of BehavioralDataAnalysis from [GitHub](https://github.com/) with:

``` r
# install.packages("devtools")
devtools::install_github("BehavioralDataAnalysis/R_package")
```

## Examples

### Bootstrap confidence interval

The function you're most likely to use is `boot_ci()`, which estimates a Bootstrap confidence interval for a function applied to a dataset. The `boot.ci()` function of the [boot package](https://cran.r-project.org/web/packages/boot/index.html) offers more options and is more powerful, but it often requires more memory and computation than my personal laptop can manage, and I find it somewhat cumbersome to use. Definitely check it out if you need a more serious implementation than the one here!

You can pass to `boot_ci()` any function that takes a data frame as its argument and returns a single number. By default, it returns the 90% confidence interval:

```{r}
library(BehavioralDataAnalysis)
my_data <- data.frame(
  x = rnorm(100)
)

my_function <- function(df) { return(mean(df$x)) }

CI <- boot_ci(my_data, my_function)
print(CI)
```
However, the most common use case is probably running a regression, so you can also pass the formula for a linear regression directly as the second argument. For example, let's look at the relationship between mass and height in the `starwars` dataset:

```{r}
data(starwars, package = "dplyr")

CI <- boot_ci(starwars, 'mass~height')
print(CI)
```
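
For intuition about what a Bootstrap interval involves, here is a minimal base-R sketch of a percentile Bootstrap. It is an illustration under stated assumptions, not the code behind `boot_ci()`: the helper name `percentile_boot_ci()` and its arguments `B` and `conf_level` are hypothetical, and the package may use a different resampling scheme or interval type.

```r
# Percentile Bootstrap sketch (illustrative only, NOT the package's implementation).
# `percentile_boot_ci`, `B`, and `conf_level` are hypothetical names.
percentile_boot_ci <- function(df, stat_fun, B = 1000, conf_level = 0.90) {
  n <- nrow(df)
  # Recompute the statistic on B resamples of the rows, drawn with replacement
  boot_stats <- vapply(seq_len(B), function(i) {
    idx <- sample.int(n, size = n, replace = TRUE)
    stat_fun(df[idx, , drop = FALSE])
  }, numeric(1))
  # Keep the central `conf_level` share of the Bootstrap distribution
  alpha <- 1 - conf_level
  quantile(boot_stats, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

set.seed(42)
toy <- data.frame(x = rexp(200))  # skewed, non-normal toy data

# Interval for a simple statistic: the mean of x
ci <- percentile_boot_ci(toy, function(d) mean(d$x))
print(ci)

# Interval for a regression slope, using the built-in mtcars data
slope_fun <- function(d) coef(lm(mpg ~ wt, data = d))[["wt"]]
slope_ci <- percentile_boot_ci(mtcars, slope_fun, B = 500)
print(slope_ci)
```

Because the interval is read directly off the resampled statistics, this approach does not assume that the data or the estimate is normally distributed, which is the motivation given above for relying on the Bootstrap.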

### Matching subjects for experimentation

If you have access to your whole list of subjects ahead of time (as opposed to, e.g., users visiting your website at random), you can pair subjects that share similar characteristics to ensure that your experimental groups are as balanced as possible. This is also called stratified assignment, and it is what the function `paired_assign()` does. Note, however, that it makes traditional statistics invalid, so you'll have to use the Bootstrap to build intervals around your central estimates.

```{r, message=FALSE, warning=FALSE}
library(dplyr)
library(BehavioralDataAnalysis)
# starwars is exported by dplyr, so there is no need to attach() it
set.seed(1)
dat <- starwars %>%
  na.omit() %>%
  dplyr::select(-films, -vehicles, -starships) %>%
  dplyr::filter(!grepl('Dooku', name))

paired_assigned_dat <- paired_assign(dat, id = 'name')
summ <- paired_assigned_dat %>% 
  group_by(grp) %>% 
  summarize(mean_height = mean(height, na.rm = TRUE))
print(summ)

```

As we can see, the mean heights of the two groups are pretty close. With pure randomization, on the other hand, the two values are further apart:

```{r}
set.seed(1)
rnd_dat <- dat %>%
  mutate(grp = c(rep(0, 14), rep(1, 14))) %>%
  mutate(grp = sample(grp))
rnd_summ <- rnd_dat %>% 
  group_by(grp) %>% 
  summarize(mean_height = mean(height, na.rm = TRUE))
print(rnd_summ)

```
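
To make the idea of pairing concrete, here is a minimal base-R sketch that pairs subjects on a single numeric covariate and then randomizes within each pair. It is a simplified illustration, not the algorithm used by `paired_assign()`, which may match on several variables at once. The helper name `pair_on_covariate()` and the `pair_id` column are hypothetical.

```r
# Pairing sketch (illustrative only, NOT the package's implementation).
# `pair_on_covariate` and the `pair_id` column are hypothetical names.
pair_on_covariate <- function(df, covariate) {
  stopifnot(nrow(df) %% 2 == 0)  # this sketch needs an even number of subjects
  ord <- order(df[[covariate]])  # neighbors in sorted order form a pair
  n_pairs <- nrow(df) / 2
  # Within each pair, a coin flip decides which member gets 1 and which gets 0
  first <- sample(c(0, 1), n_pairs, replace = TRUE)
  grp_sorted <- as.vector(rbind(first, 1 - first))
  df$pair_id <- NA_integer_
  df$grp <- NA_real_
  df$pair_id[ord] <- rep(seq_len(n_pairs), each = 2)
  df$grp[ord] <- grp_sorted
  df
}

set.seed(1)
toy <- data.frame(id = 1:20, height = rnorm(20, mean = 170, sd = 10))
assigned <- pair_on_covariate(toy, "height")

# Each pair contributes one subject to each group, so the group means stay close
print(tapply(assigned$height, assigned$grp, mean))
```

Since every pair is split across the two groups, the groups can only differ by the small within-pair gaps on the matching covariate, which is why paired assignment tends to produce better balance than pure randomization.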




Owner

  • Login: BehavioralDataAnalysis
  • Kind: user

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "identifier": "BehavioralDataAnalysis",
  "description": "Based on the book Behavioral Data Analysis With R and Python. It provides robust functions to analyze behavioral data without relying on traditional statistics.",
  "name": "BehavioralDataAnalysis: Bootstrap and Sampling Functions For Behavioral Data Analysis",
  "codeRepository": "https://github.com/BehavioralDataAnalysis/R_package",
  "issueTracker": "https://github.com/BehavioralDataAnalysis/R_package/issues",
  "license": "https://spdx.org/licenses/MIT",
  "version": "0.1.0",
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "name": "R",
    "url": "https://r-project.org"
  },
  "runtimePlatform": "R version 4.2.2 (2022-10-31 ucrt)",
  "author": [
    {
      "@type": "Person",
      "givenName": "Florent",
      "familyName": "Buisson",
      "email": "florent.buisson.oreilly@maskedmails.com"
    }
  ],
  "maintainer": [
    {
      "@type": "Person",
      "givenName": "Florent",
      "familyName": "Buisson",
      "email": "florent.buisson.oreilly@maskedmails.com"
    }
  ],
  "softwareSuggestions": [
    {
      "@type": "SoftwareApplication",
      "identifier": "testthat",
      "name": "testthat",
      "version": ">= 3.0.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=testthat"
    }
  ],
  "softwareRequirements": {
    "1": {
      "@type": "SoftwareApplication",
      "identifier": "doParallel",
      "name": "doParallel",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=doParallel"
    },
    "2": {
      "@type": "SoftwareApplication",
      "identifier": "dplyr",
      "name": "dplyr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=dplyr"
    },
    "3": {
      "@type": "SoftwareApplication",
      "identifier": "foreach",
      "name": "foreach",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=foreach"
    },
    "4": {
      "@type": "SoftwareApplication",
      "identifier": "magrittr",
      "name": "magrittr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=magrittr"
    },
    "5": {
      "@type": "SoftwareApplication",
      "identifier": "methods",
      "name": "methods"
    },
    "6": {
      "@type": "SoftwareApplication",
      "identifier": "Rcpp",
      "name": "Rcpp",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=Rcpp"
    },
    "7": {
      "@type": "SoftwareApplication",
      "identifier": "scales",
      "name": "scales",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=scales"
    },
    "8": {
      "@type": "SoftwareApplication",
      "identifier": "stats",
      "name": "stats"
    },
    "9": {
      "@type": "SoftwareApplication",
      "identifier": "R",
      "name": "R",
      "version": ">= 2.10"
    },
    "SystemRequirements": null
  },
  "fileSize": "6157.639KB",
  "readme": "https://github.com/BehavioralDataAnalysis/R_package/blob/main/README.md",
  "contIntegration": [
    "https://app.codecov.io/gh/BehavioralDataAnalysis/R_package?branch=main",
    "https://github.com/BehavioralDataAnalysis/R_package/actions/workflows/check-standard.yaml"
  ],
  "developmentStatus": "https://www.repostatus.org/#wip"
}

GitHub Events

Total
  • Push event: 5
Last Year
  • Push event: 5

Dependencies

.github/workflows/check-standard.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml actions
  • actions/checkout v3 composite
  • actions/upload-artifact v3 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION cran
  • R >= 2.10 depends
  • Rcpp * imports
  • doParallel * imports
  • dplyr * imports
  • foreach * imports
  • magrittr * imports
  • methods * imports
  • scales * imports
  • stats * imports
  • knitr * suggests
  • rmarkdown * suggests
  • testthat >= 3.0.0 suggests