cofad: An R package and shiny app for contrast analysis - Published in JOSS (2021)
---
title: Cofad User Guide
output:
github_document:
pandoc_args: --webtex
hard_line_breaks: TRUE
bibliography: "clean_library.bib"
csl: apa.csl
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
options(width = 80)
# library(stringr)
#
# clean_bib <- function(input_file, input_bib, output_bib){
# lines <- paste(readLines(input_file), collapse = "")
# entries <- unique(str_match_all(lines,"@([a-zA-Z0-9]+)[,\\. \\?\\!\\]]")[[1]][, 2])
#
# bib <- paste(readLines(input_bib), collapse = "\n")
# bib <- unlist(strsplit(bib, "\n@"))
#
# output <- sapply(entries, grep, bib, value = T)
# output <- paste("@", output, sep = "")
#
# writeLines(unlist(output), output_bib)
# }
#clean_bib("README.Rmd", "library.bib", "clean_library.bib")
#clean_bib("paper/paper.md", "library.bib", "paper/library_paper.bib")
```
# cofad

[](https://github.com/johannes-titz/cofad/actions/workflows/R-CMD-check.yaml)
[](https://CRAN.R-project.org/package=cofad)
[](https://doi.org/10.21105/joss.03822)
To cite cofad in publications, please use the following reference:
Titz J. & Burkhardt M. (2021). cofad: An R package and shiny app for contrast analysis. Journal of Open Source Software, 6(67), 3822, https://doi.org/10.21105/joss.03822
For LaTeX users, a BibTeX entry is provided below:
```
@article{titz2021,
doi = {10.21105/joss.03822},
url = {https://doi.org/10.21105/joss.03822},
year = {2021},
publisher = {The Open Journal},
volume = {6},
number = {67},
pages = {3822},
author = {Johannes Titz and Markus Burkhardt},
title = {cofad: An R package and shiny app for contrast analysis},
journal = {Journal of Open Source Software} }
```
## Introduction
Cofad is an R package for conducting COntrast analysis in FActorial Designs, such as ANOVAs. If contrast analysis were to win an award, it might be for the most underestimated and underused statistical technique. This is unfortunate, because contrast analysis is at least as informative as ANOVA---and often more so. Rather than testing an unspecific omnibus hypothesis like “there are differences somewhere”, contrast analysis allows you to test a precise, numerically specified hypothesis. It also shifts the focus from mere significance testing to the evaluation of effects.
This focus on effects is reflected in two key ways:
1. Contrast analysis offers three distinct effect size measures: $r_\mathrm{effectsize}$, $r_\mathrm{contrast}$, and $r_\mathrm{alerting}$.
2. These effect sizes relate not just to the data, but to the hypothesis being tested---the stronger the effect, the more it supports the hypothesis.
Cofad also makes it possible to compare two competing hypotheses directly (experimentum crucis) by examining the effect sizes associated with each.
Sounds interesting? Then take a look at some introductory literature, such as @furr2004, @rosenthal1985, @rosenthal2000, or---for German-speaking readers---@sedlmeier2018. Contrast analysis is relatively easy to grasp if you're already familiar with ANOVA and correlation.
In this vignette, we assume you have a basic understanding of contrast analysis and are ready to apply it to a specific dataset. We begin by showing how to install cofad and use its graphical user interface. Then, we walk through several example analyses for between-subjects, within-subjects, and mixed designs using R.
## Installation
Cofad has two components: the plain R package and a shiny app that offers an intuitive graphical user interface.
If you just want to use the cofad-app, you do not need to install it. Just go to https://cofad.titz.science and use it there. An example data file is loaded when you append /example to the URL.
If you prefer the command line interface or want to use the cofad-app locally, install it from CRAN:
```{r message=FALSE, eval=F}
install.packages("cofad")
```
Alternatively, you can also install the development version from github (you need the package remotes for this):
```{r echo = T, results = "hide", eval = F}
# install.packages("remotes") # uncomment if you do not have remotes installed
remotes::install_github("johannes-titz/cofad")
```
Now you can load cofad and use it in your R scripts.
You can also run the app:
```{r echo = T, results = "hide", eval = F}
cofad::run_app()
```
## Using cofad
Before we start: your data has to be in long format (also referred to as narrow or tidy)! If you do not know what this means, please check the short description in the Wikipedia article: https://en.wikipedia.org/wiki/Wide_and_narrow_data
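If your data are in wide format, you can reshape them to long format before loading them into cofad. A minimal base-R sketch (the variable names `id`, `t1`, `t2`, and `dv` are made up for illustration):

```{r}
# Wide format: one row per participant, one column per condition
wide <- data.frame(
  id = 1:3,
  t1 = c(5, 7, 6),
  t2 = c(8, 9, 7)
)

# Long format: one row per observation
long <- reshape(
  wide,
  direction = "long",
  varying = c("t1", "t2"),
  v.names = "dv",
  timevar = "time",
  times = c("t1", "t2"),
  idvar = "id"
)
long <- long[order(long$id), ]
long
```

The same reshaping can of course also be done with tidyr's `pivot_longer`.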
### Graphical User Interface
The graphical user interface is self-explanatory. Just load your data and drag the variables to the correct position. At the moment, only .sav (SPSS) and .csv files can be read.
As an example, go to `https://cofad.titz.science/example`, which will load a data set from @rosenthal2000 (Table 5.3). The cognitive ability of nine children belonging to different age groups (between) was measured four times (within).
There are two hypotheses:
1. cognitive ability linearly increases over time (within)
($\lambda_\mathrm{1} = -3, \lambda_\mathrm{2} = -1, \lambda_\mathrm{3} = 1, \lambda_\mathrm{4} = 3$)
2. cognitive ability linearly increases over age groups (between)
($\lambda_\mathrm{Age 8} = -1, \lambda_\mathrm{Age 10} = 0, \lambda_\mathrm{Age 12} = 1$)
Now drag the variables to the correct position and set the lambdas accordingly:
*(Screenshot: variables and lambdas assigned in the cofad-app)*
The result should look like this:
*(Screenshot: cofad-app output for the mixed design)*
A mixed design is ideal for testing out the cofad-app. You can now construct a separate within-model by removing the between variable "age". Then you can construct a separate between-model by removing "time" from within and dragging "age" back into the between panel.
The graphical user interface will suffice for most users, but some will prefer to use the scripting capabilities of R. In the next sections we will look at several script examples for different designs.
### Between-Subjects Designs
Let us first load the package:
```{r setup}
library(cofad)
```
Now we need some data and hypotheses. We can simply take the data from @furr2004, where we have different empathy ratings of students from different majors. This data set is available in the cofad package:
```{r}
data("furr_p4")
furr_p4
```
Furr states three hypotheses:
- Contrast A: Psychology majors have higher empathy scores than Education majors ($\lambda_\mathrm{psych} = 1, \lambda_\mathrm{edu} = -1$).
- Contrast B: Business majors have higher empathy scores than Chemistry majors ($\lambda_\mathrm{bus} = 1, \lambda_\mathrm{chem} = -1$).
- Contrast C: On average, Psychology and Education majors have higher empathy scores than Business and Chemistry majors ($\lambda_\mathrm{psych} = 1, \lambda_\mathrm{edu} = 1, \lambda_\mathrm{bus} = -1, \lambda_\mathrm{chem} = -1$).
These hypotheses are only mean comparisons, but this is a good way to start. Let's use cofad to conduct the contrast analysis:
```{r}
ca <- calc_contrast(dv = empathy, between = major,
lambda_between = c("psychology" = 1, "education" = -1,
"business" = 0, "chemistry" = 0),
data = furr_p4)
ca
```
The print method shows some basic information that can be directly used in a publication. With the summary method some more details are shown:
```{r}
summary(ca)
```
From this table, $r_\mathrm{effectsize}$ is probably the most useful statistic. It is just the correlation between the lambdas and the dependent variable, which can also be calculated by hand:
```{r}
lambdas <- rep(c(1, -1, 0, 0), each = 5)
cor(furr_p4$empathy, lambdas)
```
As you can see, the effect is negative and `cofad` also warns the user that the contrast fits in the opposite direction. This is a big failure for the hypothesis and indicates substantial problems in theorizing.
The other two hypotheses can be tested accordingly:
```{r}
ca <- calc_contrast(dv = empathy, between = major,
lambda_between = c("psychology" = 0, "education" = 0,
"business" = 1, "chemistry" = -1),
data = furr_p4)
ca
ca <- calc_contrast(dv = empathy, between = major,
lambda_between = c("psychology" = 1, "education" = 1,
"business" = -1, "chemistry" = -1),
data = furr_p4)
ca
```
When you compare the numbers to the ones presented in @furr2004, you will find the same result, except that @furr2004 uses t-values and the p-values are halved. This is because in contrast analysis you can always test one-sided. The assumption is that your lambdas covary positively with the mean values, not that they covary either positively or negatively. Thus, you can always halve the p-value from the F-test.
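The relationship between the F-test and the one-sided t-test can be illustrated in a few lines of base R (the F-value and degrees of freedom here are arbitrary, for illustration only):

```{r}
# A contrast F-test has 1 numerator df, so t = sqrt(F). Because the
# contrast hypothesis is directional, the p-value from the F-distribution
# can be halved (when the effect is in the predicted direction).
f_value <- 6.25   # hypothetical F(1, 16)
df2 <- 16

t_value <- sqrt(f_value)
p_two_sided <- pf(f_value, 1, df2, lower.tail = FALSE)
p_one_sided <- p_two_sided / 2

# The same one-sided p comes directly from the t-distribution:
all.equal(p_one_sided, pt(t_value, df2, lower.tail = FALSE))
```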
Now, imagine we have a more fun hypothesis and not just mean differences. From an elaborate theory we could derive that the means should be 73, 61, 51 and 38. We can test this with cofad directly because cofad will center the lambdas (the mean of the lambdas has to be 0):
```{r}
ca <- calc_contrast(dv = empathy, between = major,
lambda_between = c("psychology" = 73, "education" = 61,
"business" = 51, "chemistry" = 38),
data = furr_p4)
ca
```
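That centering the lambdas does not affect the effect size is easy to verify: a correlation is invariant under adding a constant to one of its variables. A quick check in base R (with an arbitrary simulated outcome vector):

```{r}
means <- c(73, 61, 51, 38)
lambdas_raw <- rep(means, each = 5)
lambdas_centered <- lambdas_raw - mean(lambdas_raw)

# Any outcome vector gives the same correlation with raw and centered
# weights, because cor is shift-invariant
set.seed(1)
dv <- rnorm(20)
cor(dv, lambdas_raw)
cor(dv, lambdas_centered)
```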
The manual test gives the same effect size:
```{r}
lambdas <- rep(c(73, 61, 51, 38), each = 5)
cor(furr_p4$empathy, lambdas)
```
Let us now run an analysis for within-subjects designs.
## Within-Subjects Designs
For within designs the calculations are quite different, but cofad takes care of the details. We just have to use the within parameters *within* and *lambda_within* instead of the between equivalents. As an example we use Table 16.5 from @sedlmeier2018. Reading ability was assessed for eight participants under four different conditions. The hypothesis is that you can read best without music, white noise reduces your reading ability and music (independently of type) reduces it even further.
```{r}
data("sedlmeier_p537")
head(sedlmeier_p537)
within <- calc_contrast(dv = reading_test, within = music,
lambda_within = c("without music" = 1.25,
"white noise" = 0.25,
"classic" = -0.75,
"jazz" = -0.75),
id = participant, data = sedlmeier_p537)
summary(within)
within
```
You can see that the significance test is just a $t$-test and the reported effect size refers to a mean comparison ($g$). (The $t$-test is one-tailed because contrast analysis always tests a specific hypothesis.) When conducting the analysis by hand, we can see why:
```{r}
mtr <- matrix(sedlmeier_p537$reading_test, ncol = 4)
lambdas <- c(1.25, 0.25, -0.75, -0.75)
lc1 <- mtr %*% lambdas
t.test(lc1)
```
Only the linear combination of the dependent variable and the contrast weights for each participant is needed. With these values a normal $t$-test against 0 is conducted. While you can do this manually, using cofad is quicker and it also gives you more information, such as the different effect sizes.
## Mixed Designs
A mixed design combines between and within factors. In this case cofad first calculates the linear combination (*L*-Values) for the within factor. This new variable serves as the dependent variable for a between contrast analysis. We will again look at the example presented in @rosenthal2000 (see the section graphical user interface). The cognitive ability of nine children belonging to different age groups (between) was measured four times (within).
There are two hypotheses:
1. cognitive ability linearly increases over time (within)
($\lambda_\mathrm{1} = -3, \lambda_\mathrm{2} = -1, \lambda_\mathrm{3} = 1, \lambda_\mathrm{4} = 3$)
2. cognitive ability linearly increases over age groups (between)
($\lambda_\mathrm{Age 8} = -1, \lambda_\mathrm{Age 10} = 0, \lambda_\mathrm{Age 12} = 1$)
Let's have a look at the data and calculation:
```{r}
data("rosenthal_tbl53")
head(rosenthal_tbl53)
lambda_within <- c("1" = -3, "2" = -1, "3" = 1, "4" = 3)
lambda_between <- c("age8" = -1, "age10" = 0, "age12" = 1)
contr_mx <- calc_contrast(dv = dv,
between = between,
lambda_between = lambda_between,
within = within,
lambda_within = lambda_within,
id = id,
data = rosenthal_tbl53)
contr_mx
```
The results look like a contrast analysis for between-subjects designs. The summary gives some more details: the effect sizes, within-group means, and standard errors of the *L*-values.
```{r}
summary(contr_mx)
```
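The two-step logic for mixed designs can be sketched in base R with a toy data set (the column names and numbers here are illustrative, not the ones in `rosenthal_tbl53`):

```{r}
# Toy mixed data: 6 ids, 2 groups (between), 4 time points (within)
set.seed(42)
d <- expand.grid(id = 1:6, time = 1:4)
d$group <- ifelse(d$id <= 3, "a", "b")
d$dv <- rnorm(nrow(d)) + d$time * ifelse(d$group == "a", 0.2, 0.8)

lambda_within <- c(-3, -1, 1, 3)

# Step 1: one L-value per participant (linear combination over time)
L <- tapply(d$dv * lambda_within[d$time], d$id, sum)

# Step 2: the L-values serve as the dependent variable of a
# between-subjects analysis (sketched here as a simple t-test,
# since two groups with lambdas -1 and 1 reduce to a mean comparison)
group <- ifelse(as.numeric(names(L)) <= 3, "a", "b")
t.test(L ~ group)
```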
## Comparing two hypotheses
With `cofad` you can also compare two competing hypotheses. As an example @sedlmeier2013 use a fictitious data set on problem solving skills of boys:
```{r}
sedlmeier_p525
```
Here, lsg is the number of solved exercises and the groups are KT = no training, JT = boys-specific training, and MT = girls-specific training. Two hypotheses are competing:
- -2, 3, -1 (boys benefit from boys-specific training)
- -2, 1, 1 (boys benefit from training, independently of the type of training)
First, we need to create the difference lambdas:
```{r}
lambda1 <- c(-2, 3, -1)
lambda2 <- c(-2, 1, 1)
lambda <- lambda_diff(lambda1, lambda2, labels = c("KT", "JT", "MT"))
lambda
```
Note that you cannot just subtract the lambdas because their variance can differ, which has an effect on the test. Instead, you need to standardize the lambdas first. `lambda_diff` takes care of this for you.
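One common standardization scheme scales each set of weights to equal sums of squares before subtracting. This sketch is an assumption for illustration, not necessarily the exact scheme `lambda_diff` uses:

```{r}
lambda1 <- c(-2, 3, -1)
lambda2 <- c(-2, 1, 1)

# Scale each set of weights to unit length (equal sums of squares),
# then take the difference; the result is again a centered contrast
standardize <- function(l) l / sqrt(sum(l^2))
lambda_d <- standardize(lambda1) - standardize(lambda2)
lambda_d
sum(lambda_d)  # still (numerically) zero
```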
Now you can run a normal contrast analysis:
```{r}
ca_competing <- calc_contrast(
dv = lsg,
between = between,
lambda_between = round(lambda, 2),
data = sedlmeier_p525
)
summary(ca_competing)
ca_competing
```
Here, we rounded the lambdas so that the result is similar to the one in @sedlmeier2013, who found t = 1.137 and r_effectsize = 0.26. The effect size is the same. For the t-value, we need to take the square root of the F-value, 1.291, which is `r round(sqrt(1.291), 3)`. There is still a slight difference to the original result of 1.137, which is likely due to rounding errors.
The same also works for within-designs. The reading comprehension data from above can serve as an example. Reading ability was assessed for eight participants under four different conditions:
```{r}
sedlmeier_p537
```
There are two hypotheses:
- 1.25, 0.25, -0.75, -0.75: You can read best without music, white noise reduces your reading ability and music (independently of type) reduces it even further.
- 3, -1, -1, -1: Noise of any kind reduces reading ability.
Again, we need to calculate the difference lambdas first:
```{r}
lambda1 <- c(1.25, 0.25, -0.75, -0.75)
lambda2 <- c(3, -1, -1, -1)
lambda <- lambda_diff(lambda2, lambda1,
labels = c("without music", "white noise", "classic",
"jazz"))
lambda
```
Note that we use lambda2 as the first entry into `lambda_diff` because this is how @sedlmeier2013 calculated it (hypothesis2-hypothesis1).
And now the contrast analysis:
```{r}
contr_wi <- calc_contrast(
dv = reading_test,
within = music,
lambda_within = round(lambda, 2),
id = participant,
data = sedlmeier_p537
)
summary(contr_wi)
contr_wi
```
@sedlmeier2013 found a t-value of -3.75 and a g_contrast of -1.33. Again, there is a slight difference for the t-value when compared to our calculation, likely due to rounding errors. Note further that hypothesis 1 fits better because the statistic and the effect are negative.
## Aggregated Data
Sometimes you would like to run a contrast analysis on aggregated data (e.g. when no raw data is available). If you have the means, standard deviations and sample sizes for every condition, you can do this with cofad. For instance, if we take our first example and aggregate it, we can still run the contrast analysis:
```{r message=FALSE}
library(dplyr)
furr_agg <- furr_p4 %>%
group_by(major) %>%
summarize(mean = mean(empathy), sd = sd(empathy), n = n())
lambdas <- c("psychology" = 1, "education" = -1, "business" = 0, "chemistry" = 0)
calc_contrast_aggregated(mean, sd, n, major, lambdas, furr_agg)
```
And the result is indeed the same when compared to the analysis with the raw data:
```{r}
ca <- calc_contrast(dv = empathy, between = major,
lambda_between = c("psychology" = 1, "education" = -1,
"business" = 0, "chemistry" = 0),
data = furr_p4)
ca
```
Note that this will only work for between-subjects designs.
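For between-subjects designs, the contrast test only needs the group means, standard deviations, and sample sizes, which is why the aggregated version works. A hand calculation with standard textbook formulas (the numbers here are arbitrary, for illustration only):

```{r}
m <- c(30, 20, 25, 28)    # group means
s <- c(5, 6, 4, 5)        # group standard deviations
n <- c(10, 10, 10, 10)    # group sizes
lambda <- c(1, -1, 0, 0)  # contrast weights

L <- sum(lambda * m)                    # contrast value
mse <- sum((n - 1) * s^2) / sum(n - 1)  # pooled error variance
se_L <- sqrt(mse * sum(lambda^2 / n))   # standard error of the contrast
t_value <- L / se_L
df <- sum(n) - length(n)
p_one_sided <- pt(t_value, df, lower.tail = FALSE)
c(t = t_value, df = df, p = p_one_sided)
```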
## Testing
The current test coverage for the package stands at 88%. The shiny app contained in `cofad` is tested with shinytest2. Unfortunately, these tests are less robust and fail unpredictably on Windows when run through GitHub Actions. They are therefore skipped on GitHub and run only locally.
## Issues and Support
If you find any bugs, please use the issue tracker at:
https://github.com/johannes-titz/cofad/issues
If you have questions on how to use the package, drop an e-mail at johannes at titz.science or johannes.titz at gmail.com
## Contributing
Comments and feedback of any kind are very welcome! We will thoroughly consider every suggestion on how to improve the code, the documentation, and the presented examples. Even minor things, such as suggestions for better wording or improving grammar in any part of the package, are more than welcome.
If you want to make a pull request, please check that you can still build the package without any errors, warnings, or notes. Overall, simply stick to the R packages book: https://r-pkgs.org/ and follow the code style described here: https://style.tidyverse.org/
## Acknowledgments
We want to thank Thomas Schäfer and Isabell Winkler for testing cofad and giving helpful feedback.
## References