cofad: An R package and shiny app for contrast analysis - Published in JOSS (2021)
---
title: Cofad User Guide
output:
github_document:
pandoc_args: --webtex
hard_line_breaks: TRUE
bibliography: "clean_library.bib"
csl: apa.csl
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
options(width = 80)
# library(stringr)
#
# clean_bib <- function(input_file, input_bib, output_bib){
# lines <- paste(readLines(input_file), collapse = "")
# entries <- unique(str_match_all(lines,"@([a-zA-Z0-9]+)[,\\. \\?\\!\\]]")[[1]][, 2])
#
# bib <- paste(readLines(input_bib), collapse = "\n")
# bib <- unlist(strsplit(bib, "\n@"))
#
# output <- sapply(entries, grep, bib, value = T)
# output <- paste("@", output, sep = "")
#
# writeLines(unlist(output), output_bib)
# }
#clean_bib("README.Rmd", "library.bib", "clean_library.bib")
#clean_bib("paper/paper.md", "library.bib", "paper/library_paper.bib")
```
# cofad

[](https://github.com/johannes-titz/cofad/actions/workflows/R-CMD-check.yaml)
[](https://CRAN.R-project.org/package=cofad)
[](https://doi.org/10.21105/joss.03822)
To cite cofad in publications, please use the following reference:
Titz J. & Burkhardt M. (2021). cofad: An R package and shiny app for contrast analysis. Journal of Open Source Software, 6(67), 3822, https://doi.org/10.21105/joss.03822
For LaTeX users, a BibTeX entry is provided below:
```
@article{titz2021,
doi = {10.21105/joss.03822},
url = {https://doi.org/10.21105/joss.03822},
year = {2021},
publisher = {The Open Journal},
volume = {6},
number = {67},
pages = {3822},
author = {Johannes Titz and Markus Burkhardt},
title = {cofad: An R package and shiny app for contrast analysis},
journal = {Journal of Open Source Software} }
```
## Introduction
Cofad is an R package for conducting COntrast analysis in FActorial Designs, such as ANOVAs. If contrast analysis were to win an award, it might be for the most underestimated and underused statistical technique. This is unfortunate, because contrast analysis is at least as informative as ANOVA---and often more so. Rather than testing an unspecific omnibus hypothesis like “there are differences somewhere”, contrast analysis allows you to test a precise, numerically specified hypothesis. It also shifts the focus from mere significance testing to the evaluation of effects.
This focus on effects is reflected in two key ways:
1. Contrast analysis offers three distinct effect size measures: $r_\mathrm{effectsize}$, $r_\mathrm{contrast}$, and $r_\mathrm{alerting}$.
2. These effect sizes relate not just to the data, but to the hypothesis being tested---the stronger the effect, the more it supports the hypothesis.
Cofad also makes it possible to compare two competing hypotheses directly (experimentum crucis) by examining the effect sizes associated with each.
Sounds interesting? Then take a look at some introductory literature, such as @furr2004, @rosenthal1985, @rosenthal2000, or---for German-speaking readers---@sedlmeier2018. Contrast analysis is relatively easy to grasp if you're already familiar with ANOVA and correlation.
In this vignette, we assume you have a basic understanding of contrast analysis and are ready to apply it to a specific dataset. We begin by showing how to install cofad and use its graphical user interface. Then, we walk through several example analyses for between-subjects, within-subjects, and mixed designs using R.
## Installation
Cofad has two components: the plain R package and a shiny app that offers an intuitive graphical user interface.
If you just want to use the cofad-app, you do not need to install it. Just go to https://cofad.titz.science and use it there. An example data file is loaded when you append /example to the URL.
If you prefer the command line interface or want to use the cofad-app locally, install it from CRAN:
```{r message=FALSE, eval=F}
install.packages("cofad")
```
Alternatively, you can also install the development version from github (you need the package remotes for this):
```{r echo = T, results = "hide", eval = F}
# install.packages("remotes") # uncomment if you do not have remotes installed
remotes::install_github("johannes-titz/cofad")
```
Now you can load cofad and use it in your R scripts.
You can also run the app:
```{r echo = T, results = "hide", eval = F}
cofad::run_app()
```
## Using cofad
Before we start: your data has to be in long format (also referred to as narrow or tidy)! If you do not know what this means, please check the short description in the Wikipedia article: https://en.wikipedia.org/wiki/Wide_and_narrow_data
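If your data are in wide format, you can reshape them to long format before loading them into cofad. A minimal base-R sketch (the variable names `id`, `t1`, `t2`, and `dv` are made up for illustration):

```{r}
# Wide format: one row per participant, one column per condition
wide <- data.frame(
  id = 1:3,
  t1 = c(5, 7, 6),
  t2 = c(8, 9, 7)
)

# Long format: one row per observation
long <- reshape(
  wide,
  direction = "long",
  varying = c("t1", "t2"),
  v.names = "dv",
  timevar = "time",
  times = c("t1", "t2"),
  idvar = "id"
)
long <- long[order(long$id), ]
long
```

The same reshaping can of course also be done with tidyr's `pivot_longer`.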
### Graphical User Interface
The graphical user interface is self-explanatory. Just load your data and drag the variables to the correct position. At the moment, only .sav (SPSS) and .csv files can be read.
As an example, go to `https://cofad.titz.science/example`, which will load a data set from @rosenthal2000 (Table 5.3). The cognitive ability of nine children belonging to different age groups (between) was measured four times (within).
There are two hypotheses:
1. cognitive ability linearly increases over time (within)
($\lambda_\mathrm{1} = -3, \lambda_\mathrm{2} = -1, \lambda_\mathrm{3} = 1, \lambda_\mathrm{4} = 3$)
2. cognitive ability linearly increases over age groups (between)
($\lambda_\mathrm{Age 8} = -1, \lambda_\mathrm{Age 10} = 0, \lambda_\mathrm{Age 12} = 1$)
Now drag the variables to the correct position and set the lambdas accordingly:
*(Screenshot: variables and lambdas assigned in the cofad-app)*
The result should look like this:
*(Screenshot: cofad-app output for the mixed design)*
A mixed design is ideal for testing out the cofad-app. You can now construct a separate within-model by removing the between variable "age". Then you can construct a separate between-model by removing "time" from within and dragging "age" back into the between panel.
The graphical user interface will suffice for most users, but some will prefer to use the scripting capabilities of R. In the next sections we will look at several script examples for different designs.
### Between-Subjects Designs
Let us first load the package:
```{r setup}
library(cofad)
```
Now we need some data and hypotheses. We can simply take the data from @furr2004, where we have different empathy ratings of students from different majors. This data set is available in the cofad package:
```{r}
data("furr_p4")
furr_p4
```
Furr states three hypotheses:
- Contrast A: Psychology majors have higher empathy scores than Education majors ($\lambda_\mathrm{psych} = 1, \lambda_\mathrm{edu} = -1$).
- Contrast B: Business majors have higher empathy scores than Chemistry majors ($\lambda_\mathrm{bus} = 1, \lambda_\mathrm{chem} = -1$).
- Contrast C: On average, Psychology and Education majors have higher empathy scores than Business and Chemistry majors ($\lambda_\mathrm{psych} = 1, \lambda_\mathrm{edu} = 1, \lambda_\mathrm{bus} = -1, \lambda_\mathrm{chem} = -1$).
These hypotheses are only mean comparisons, but this is a good way to start. Let's use cofad to conduct the contrast analysis:
```{r}
ca <- calc_contrast(dv = empathy, between = major,
lambda_between = c("psychology" = 1, "education" = -1,
"business" = 0, "chemistry" = 0),
data = furr_p4)
ca
```
The print method shows some basic information that can be directly used in a publication. With the summary method some more details are shown:
```{r}
summary(ca)
```
From this table, $r_\mathrm{effectsize}$ is probably the most useful statistic. It is just the correlation between the lambdas and the dependent variable, which can also be calculated by hand:
```{r}
lambdas <- rep(c(1, -1, 0, 0), each = 5)
cor(furr_p4$empathy, lambdas)
```
As you can see, the effect is negative and `cofad` also warns the user that the contrast fits in the opposite direction. This is a big failure for the hypothesis and indicates substantial problems in theorizing.
The other two hypotheses can be tested accordingly:
```{r}
ca <- calc_contrast(dv = empathy, between = major,
lambda_between = c("psychology" = 0, "education" = 0,
"business" = 1, "chemistry" = -1),
data = furr_p4)
ca
ca <- calc_contrast(dv = empathy, between = major,
lambda_between = c("psychology" = 1, "education" = 1,
"business" = -1, "chemistry" = -1),
data = furr_p4)
ca
```
When you compare the numbers to the ones presented in @furr2004, you will find the same result, except that @furr2004 uses t-values and the p-values are halved. This is because in contrast analysis you can always test one-sided. The assumption is that your lambdas covary positively with the mean values, not that they covary either positively or negatively. Thus, you can always halve the p-value from the F-test.
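The relationship between the F-test and the one-sided t-test can be illustrated in a few lines of base R (the F-value and degrees of freedom here are arbitrary, for illustration only):

```{r}
# A contrast F-test has 1 numerator df, so t = sqrt(F). Because the
# contrast hypothesis is directional, the p-value from the F-distribution
# can be halved (when the effect is in the predicted direction).
f_value <- 6.25   # hypothetical F(1, 16)
df2 <- 16

t_value <- sqrt(f_value)
p_two_sided <- pf(f_value, 1, df2, lower.tail = FALSE)
p_one_sided <- p_two_sided / 2

# The same one-sided p comes directly from the t-distribution:
all.equal(p_one_sided, pt(t_value, df2, lower.tail = FALSE))
```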
Now, imagine we have a more fun hypothesis and not just mean differences. From an elaborate theory we could derive that the means should be 73, 61, 51 and 38. We can test this with cofad directly because cofad will center the lambdas (the mean of the lambdas has to be 0):
```{r}
ca <- calc_contrast(dv = empathy, between = major,
lambda_between = c("psychology" = 73, "education" = 61,
"business" = 51, "chemistry" = 38),
data = furr_p4)
ca
```
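That centering the lambdas does not affect the effect size is easy to verify: a correlation is invariant under adding a constant to one of its variables. A quick check in base R (with an arbitrary simulated outcome vector):

```{r}
means <- c(73, 61, 51, 38)
lambdas_raw <- rep(means, each = 5)
lambdas_centered <- lambdas_raw - mean(lambdas_raw)

# Any outcome vector gives the same correlation with raw and centered
# weights, because cor is shift-invariant
set.seed(1)
dv <- rnorm(20)
cor(dv, lambdas_raw)
cor(dv, lambdas_centered)
```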
The manual test gives the same effect size:
```{r}
lambdas <- rep(c(73, 61, 51, 38), each = 5)
cor(furr_p4$empathy, lambdas)
```
Let us now run an analysis for within-subjects designs.
## Within-Subjects Designs
For within designs the calculations are quite different, but cofad takes care of the details. We just have to use the within parameters *within* and *lambda_within* instead of the between equivalents. As an example we use Table 16.5 from @sedlmeier2018. Reading ability was assessed for eight participants under four different conditions. The hypothesis is that you can read best without music, white noise reduces your reading ability and music (independently of type) reduces it even further.
```{r}
data("sedlmeier_p537")
head(sedlmeier_p537)
within <- calc_contrast(dv = reading_test, within = music,
lambda_within = c("without music" = 1.25,
"white noise" = 0.25,
"classic" = -0.75,
"jazz" = -0.75),
id = participant, data = sedlmeier_p537)
summary(within)
within
```
You can see that the significance test is just a $t$-test and the reported effect size refers to a mean comparison ($g$). (The $t$-test is one-tailed because contrast analysis always tests a specific hypothesis.) When conducting the analysis by hand, we can see why:
```{r}
mtr <- matrix(sedlmeier_p537$reading_test, ncol = 4)
lambdas <- c(1.25, 0.25, -0.75, -0.75)
lc1 <- mtr %*% lambdas
t.test(lc1)
```
Only the linear combination of the dependent variable and the contrast weights for each participant is needed. With these values a normal $t$-test against 0 is conducted. While you can do this manually, using cofad is quicker and it also gives you more information, such as the different effect sizes.
## Mixed Designs
A mixed design combines between and within factors. In this case cofad first calculates the linear combination (*L*-Values) for the within factor. This new variable serves as the dependent variable for a between contrast analysis. We will again look at the example presented in @rosenthal2000 (see the section graphical user interface). The cognitive ability of nine children belonging to different age groups (between) was measured four times (within).
There are two hypotheses:
1. cognitive ability linearly increases over time (within)
($\lambda_\mathrm{1} = -3, \lambda_\mathrm{2} = -1, \lambda_\mathrm{3} = 1, \lambda_\mathrm{4} = 3$)
2. cognitive ability linearly increases over age groups (between)
($\lambda_\mathrm{Age 8} = -1, \lambda_\mathrm{Age 10} = 0, \lambda_\mathrm{Age 12} = 1$)
Let's have a look at the data and calculation:
```{r}
data("rosenthal_tbl53")
head(rosenthal_tbl53)
lambda_within <- c("1" = -3, "2" = -1, "3" = 1, "4" = 3)
lambda_between <- c("age8" = -1, "age10" = 0, "age12" = 1)
contr_mx <- calc_contrast(dv = dv,
between = between,
lambda_between = lambda_between,
within = within,
lambda_within = lambda_within,
id = id,
data = rosenthal_tbl53)
contr_mx
```
The results look like a contrast analysis for between-subjects designs. The summary gives some more details: the effect sizes, within-group means, and standard errors of the *L*-values.
```{r}
summary(contr_mx)
```
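The two-step logic for mixed designs can be sketched in base R with a toy data set (the column names and numbers here are illustrative, not the ones in `rosenthal_tbl53`):

```{r}
# Toy mixed data: 6 ids, 2 groups (between), 4 time points (within)
set.seed(42)
d <- expand.grid(id = 1:6, time = 1:4)
d$group <- ifelse(d$id <= 3, "a", "b")
d$dv <- rnorm(nrow(d)) + d$time * ifelse(d$group == "a", 0.2, 0.8)

lambda_within <- c(-3, -1, 1, 3)

# Step 1: one L-value per participant (linear combination over time)
L <- tapply(d$dv * lambda_within[d$time], d$id, sum)

# Step 2: the L-values serve as the dependent variable of a
# between-subjects analysis (sketched here as a simple t-test,
# since two groups with lambdas -1 and 1 reduce to a mean comparison)
group <- ifelse(as.numeric(names(L)) <= 3, "a", "b")
t.test(L ~ group)
```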
## Comparing two hypotheses
With `cofad` you can also compare two competing hypotheses. As an example @sedlmeier2013 use a fictitious data set on problem solving skills of boys:
```{r}
sedlmeier_p525
```
Here, lsg is the number of solved exercises and the groups are KT = no training, JT = boys-specific training, and MT = girls-specific training. Two hypotheses are competing:
- -2, 3, -1 (boys benefit from boys-specific training)
- -2, 1, 1 (boys benefit from training, independently of the type of training)
First, we need to create the difference lambdas:
```{r}
lambda1 <- c(-2, 3, -1)
lambda2 <- c(-2, 1, 1)
lambda <- lambda_diff(lambda1, lambda2, labels = c("KT", "JT", "MT"))
lambda
```
Note that you cannot just subtract the lambdas because their variance can differ, which has an effect on the test. Instead, you need to standardize the lambdas first. `lambda_diff` takes care of this for you.
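One common standardization scheme scales each set of weights to equal sums of squares before subtracting. This sketch is an assumption for illustration, not necessarily the exact scheme `lambda_diff` uses:

```{r}
lambda1 <- c(-2, 3, -1)
lambda2 <- c(-2, 1, 1)

# Scale each set of weights to unit length (equal sums of squares),
# then take the difference; the result is again a centered contrast
standardize <- function(l) l / sqrt(sum(l^2))
lambda_d <- standardize(lambda1) - standardize(lambda2)
lambda_d
sum(lambda_d)  # still (numerically) zero
```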
Now you can run a normal contrast analysis:
```{r}
ca_competing <- calc_contrast(
dv = lsg,
between = between,
lambda_between = round(lambda, 2),
data = sedlmeier_p525
)
summary(ca_competing)
ca_competing
```
Here, we rounded the lambdas so that the result is similar to the one in @sedlmeier2013, who found t = 1.137 and r_effectsize = 0.26. The effect size is the same. For the t-value, we need to take the square root of the F-value, 1.291, which is `r round(sqrt(1.291), 3)`. There is still a slight difference to the original result of 1.137, which is likely due to rounding errors.
The same also works for within-designs. The reading comprehension data from above can serve as an example. Reading ability was assessed for eight participants under four different conditions:
```{r}
sedlmeier_p537
```
There are two hypotheses:
- 1.25, 0.25, -0.75, -0.75: You can read best without music, white noise reduces your reading ability and music (independently of type) reduces it even further.
- 3, -1, -1, -1: Noise of any kind reduces reading ability.
Again, we need to calculate the difference lambdas first:
```{r}
lambda1 <- c(1.25, 0.25, -0.75, -0.75)
lambda2 <- c(3, -1, -1, -1)
lambda <- lambda_diff(lambda2, lambda1,
labels = c("without music", "white noise", "classic",
"jazz"))
lambda
```
Note that we use lambda2 as the first entry into `lambda_diff` because this is how @sedlmeier2013 calculated it (hypothesis2-hypothesis1).
And now the contrast analysis:
```{r}
contr_wi <- calc_contrast(
dv = reading_test,
within = music,
lambda_within = round(lambda, 2),
id = participant,
data = sedlmeier_p537
)
summary(contr_wi)
contr_wi
```
@sedlmeier2013 found a t-value of -3.75 and a g_contrast of -1.33. Again, there is a slight difference for the t-value when compared to our calculation, likely due to rounding errors. Note further that hypothesis 1 fits better because the statistic and the effect are negative.
## Aggregated Data
Sometimes you would like to run a contrast analysis on aggregated data (e.g. when no raw data is available). If you have the means, standard deviations and sample sizes for every condition, you can do this with cofad. For instance, if we take our first example and aggregate it, we can still run the contrast analysis:
```{r message=FALSE}
library(dplyr)
furr_agg <- furr_p4 %>%
group_by(major) %>%
summarize(mean = mean(empathy), sd = sd(empathy), n = n())
lambdas <- c("psychology" = 1, "education" = -1, "business" = 0, "chemistry" = 0)
calc_contrast_aggregated(mean, sd, n, major, lambdas, furr_agg)
```
And the result is indeed the same when compared to the analysis with the raw data:
```{r}
ca <- calc_contrast(dv = empathy, between = major,
lambda_between = c("psychology" = 1, "education" = -1,
"business" = 0, "chemistry" = 0),
data = furr_p4)
ca
```
Note that this will only work for between-subjects designs.
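For between-subjects designs, the contrast test only needs the group means, standard deviations, and sample sizes, which is why the aggregated version works. A hand calculation with standard textbook formulas (the numbers here are arbitrary, for illustration only):

```{r}
m <- c(30, 20, 25, 28)    # group means
s <- c(5, 6, 4, 5)        # group standard deviations
n <- c(10, 10, 10, 10)    # group sizes
lambda <- c(1, -1, 0, 0)  # contrast weights

L <- sum(lambda * m)                    # contrast value
mse <- sum((n - 1) * s^2) / sum(n - 1)  # pooled error variance
se_L <- sqrt(mse * sum(lambda^2 / n))   # standard error of the contrast
t_value <- L / se_L
df <- sum(n) - length(n)
p_one_sided <- pt(t_value, df, lower.tail = FALSE)
c(t = t_value, df = df, p = p_one_sided)
```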
## Testing
The current test coverage for the package stands at 88%. The shiny app contained in `cofad` is tested with shinytest2. Unfortunately, these tests are less robust and fail unpredictably on Windows when run through GitHub Actions. They are therefore skipped on GitHub and run only locally.
## Issues and Support
If you find any bugs, please use the issue tracker at:
https://github.com/johannes-titz/cofad/issues
If you have questions on how to use the package, drop an e-mail at johannes at titz.science or johannes.titz at gmail.com
## Contributing
Comments and feedback of any kind are very welcome! We will thoroughly consider every suggestion on how to improve the code, the documentation, and the presented examples. Even minor things, such as suggestions for better wording or improving grammar in any part of the package, are more than welcome.
If you want to make a pull request, please check that you can still build the package without any errors, warnings, or notes. Overall, simply stick to the R packages book: https://r-pkgs.org/ and follow the code style described here: https://style.tidyverse.org/
## Acknowledgments
We want to thank Thomas Schäfer and Isabell Winkler for testing cofad and giving helpful feedback.
## References