factorex

R package factorEx: Design and Analysis for Factorial Experiments

https://github.com/naoki-egami/factorex

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 1 of 2 committers (50.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.9%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: naoki-egami
Language: R
Default Branch: master
Homepage:
Size: 1.92 MB

Statistics

Stars: 8
Watchers: 4
Forks: 2
Open Issues: 0
Releases: 0

Created over 7 years ago · Last pushed over 3 years ago

Metadata Files

Readme

factorEx: Design and Analysis for Factorial Experiments

Description:

The R package factorEx provides design-based and model-based estimators of the population average marginal component effect (pAMCE) in factorial experiments, including conjoint analysis. The package also implements a series of recommendations offered in de la Cuesta, Egami, and Imai (2022, Political Analysis) and Egami and Imai (2019, JASA).

Authors:

References:

de la Cuesta, Egami, and Imai. (2022). Improving the External Validity of Conjoint Analysis: The Essential Role of Profile Distribution. Political Analysis, Vol.30, No.1 (January), pp. 19–45.
Egami and Imai. (2019). Causal Interaction in Factorial Experiments: Application to Conjoint Analysis. Journal of the American Statistical Association, Vol.114, No.526 (June), pp. 529–540.

Installation Instructions

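factorEx was released on CRAN, although the package listing further down this page reports its CRAN status as removed. Where it is available from CRAN, it can be installed using: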

``` r
install.packages("factorEx")
```

You can also install the most recent development version using the devtools package. First you have to install devtools using the following code. Note that you only have to do this once:

``` r
if (!require(devtools)) install.packages("devtools")
```

Then, load devtools and use the function install_github() to install factorEx:

``` r
library(devtools)
install_github("naoki-egami/factorEx", dependencies = TRUE)
```

Examples

- Design-based Confirmatory Analysis
  - Case 1: Use Marginal Distributions for Target Profile Distribution
  - Case 2: Use Combination of Marginal and Partial Joint Distributions for Target Profile Distribution
- Model-based Exploratory Analysis

(1) Design-based Confirmatory Analysis

Here, we use the conjoint experiment that randomized profiles according to the marginal population randomization design.

Case 1: Use Marginal Distributions for Target Profile Distribution

When using marginal distributions, target_dist should be a named list with one element per factor, each named after its factor. Each element should be a numeric vector whose names match the levels of that factor in data.

``` r
# Load the package and data
library(factorEx)
data("OnoBurden")

# Randomization based on the marginal population design
OnoBurden_data_pr <- OnoBurden$OnoBurden_data_pr

# We focus on target profile distributions based on Democratic legislators.
# See de la Cuesta, Egami, and Imai (2022) for details.
target_dist_marginal <- OnoBurden$target_dist_marginal

target_dist_marginal
```

## $gender
##      Male    Female 
## 0.6778243 0.3221757 
## 
## $age
## 36 years old 44 years old 52 years old 60 years old 68 years old 76 years old 
##   0.05020921   0.13807531   0.23012552   0.22594142   0.25104603   0.10460251 
## 
## $family
## Single (never married)      Single (divorced)     Married (no child) 
##             0.07729469             0.03864734             0.12560386 
## Married (two children) 
##             0.75845411 
## 
## $race
##          White       Hispanic Asian American          Black 
##      0.6725664      0.1283186      0.0000000      0.1991150 
## 
## $experience
##      None   4 years   8 years  12 years 
## 0.1966527 0.2259414 0.1548117 0.4225941 
## 
## $party
## Dem Rep 
##   1   0 
## 
## $pos_security
##     Cut military budget Maintain strong defense 
##              0.98557692              0.01442308
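Because target_dist has to line up with the factor levels in data, it can help to validate a hand-built list before estimation. The following is a minimal sketch in base R. The names `check_target_dist`, `toy_data`, and `toy_target` are hypothetical and introduced here only for illustration; they are not part of factorEx.

```r
# Hypothetical helper (not part of factorEx): checks that every element of
# target_dist is named after a column in data, uses the same level names,
# and is a valid probability vector.
check_target_dist <- function(target_dist, data, tol = 1e-6) {
  stopifnot(is.list(target_dist), !is.null(names(target_dist)))
  for (f in names(target_dist)) {
    p <- target_dist[[f]]
    if (!f %in% names(data)) stop("factor not found in data: ", f)
    lev <- levels(factor(data[[f]]))
    if (!setequal(names(p), lev)) stop("level names differ for factor: ", f)
    if (any(p < 0) || abs(sum(p) - 1) > tol) {
      stop("not a probability vector for factor: ", f)
    }
  }
  invisible(TRUE)
}

# Toy data and target distribution (illustrative values only)
toy_data <- data.frame(
  gender = factor(c("Male", "Female", "Male")),
  party  = factor(c("Dem", "Rep", "Dem"))
)
toy_target <- list(
  gender = c(Male = 0.68, Female = 0.32),
  party  = c(Dem = 1, Rep = 0)
)

check_target_dist(toy_target, toy_data)
```

The helper stops with an error naming the offending factor, which makes a mismatch such as a misspelled level name easy to locate.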

We can estimate the pAMCE using design_pAMCE with target_type = "marginal". Use factor_name to specify the factors for which the pAMCE is estimated.

``` r
out_design_mar <- design_pAMCE(
  formula = Y ~ gender + age + family + race + experience + party + pos_security,
  factor_name = c("gender", "age", "experience"),
  data = OnoBurden_data_pr,
  pair_id = OnoBurden_data_pr$pair_id,
  cluster_id = OnoBurden_data_pr$id,
  target_dist = target_dist_marginal,
  target_type = "marginal"
)
summary(out_design_mar)
```

## 
## ----------------
## Population AMCEs:
## ----------------
##  target_dist     factor        level     Estimate  Std. Error p value    
##       target     gender       Female  0.027987587 0.005861738   0.000 ***
##       target        age 44 years old  0.019219282 0.014421828   0.183    
##       target        age 52 years old -0.008792916 0.013765415   0.523    
##       target        age 60 years old -0.006826945 0.013875303   0.623    
##       target        age 68 years old  0.011247969 0.013569292   0.407    
##       target        age 76 years old -0.052741541 0.014775629   0.000 ***
##       target experience     12 years  0.041672460 0.007627281   0.000 ***
##       target experience      4 years  0.046173813 0.008868432   0.000 ***
##       target experience      8 years  0.040752213 0.009313376   0.000 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
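The summary reports estimates, standard errors, and p-values but no confidence intervals. As a rough sketch, assuming a normal approximation (an assumption made here, not something the README states), 95% intervals and two-sided p-values can be computed from the printed numbers. The values below are copied from three rows of the output above.

```r
# Estimates and standard errors copied from the printed summary
est <- c("Female"       =  0.027987587,
         "44 years old" =  0.019219282,
         "76 years old" = -0.052741541)
se  <- c("Female"       = 0.005861738,
         "44 years old" = 0.014421828,
         "76 years old" = 0.014775629)

# Normal-approximation quantities (assumption, see the text above)
z       <- est / se
ci_low  <- est - qnorm(0.975) * se
ci_high <- est + qnorm(0.975) * se
p_value <- 2 * pnorm(-abs(z))

round(cbind(estimate = est, ci_low, ci_high, p_value), 3)
```

Under this assumption the recomputed p-value for "44 years old" comes out close to the printed 0.183, and the interval for Female lies entirely above zero.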

Use plot to visualize the estimated pAMCEs.

``` r
plot(out_design_mar, factor_name = c("gender", "experience"))
```

Case 2: Use Combination of Marginal and Partial Joint Distributions for Target Profile Distribution

Partial joint distributions are useful because they can relax the assumption of no three-way or higher-order interactions (see de la Cuesta, Egami, and Imai (2022)).

When using a combination of marginal and partial joint distributions, target_dist should be a list in which each element is either a numeric vector (for a marginal distribution) or an array/table (for a partial joint distribution). Use the argument partial_joint_name to specify which factors have marginal distributions and which share a partial joint distribution. In the following example, c("gender", "age", "family") has a partial joint distribution over the three factors, race and party each have their own marginal distribution, and c("experience", "pos_security") has a partial joint distribution over the two factors. Each numeric vector or array/table should have the same level names as those in data.

``` r
target_dist_partial <- OnoBurden$target_dist_partial
target_dist_partial
```

## $`gender:age:family`
## , , family = Single (never married)
## 
##         age
## gender   36 years old 44 years old 52 years old 60 years old 68 years old
##   Male    0.004184100  0.004184100  0.004184100  0.004184100  0.008368201
##   Female  0.000000000  0.004184100  0.004184100  0.008368201  0.016736402
##         age
## gender   76 years old
##   Male    0.004184100
##   Female  0.004184100
## 
## , , family = Single (divorced)
## 
##         age
## gender   36 years old 44 years old 52 years old 60 years old 68 years old
##   Male    0.004184100  0.000000000  0.004184100  0.004184100  0.004184100
##   Female  0.000000000  0.004184100  0.004184100  0.004184100  0.000000000
##         age
## gender   76 years old
##   Male    0.000000000
##   Female  0.004184100
## 
## , , family = Married (no child)
## 
##         age
## gender   36 years old 44 years old 52 years old 60 years old 68 years old
##   Male    0.008368201  0.008368201  0.025104603  0.008368201  0.029288703
##   Female  0.000000000  0.000000000  0.004184100  0.008368201  0.012552301
##         age
## gender   76 years old
##   Male    0.000000000
##   Female  0.004184100
## 
## , , family = Married (two children)
## 
##         age
## gender   36 years old 44 years old 52 years old 60 years old 68 years old
##   Male    0.025104603  0.079497908  0.117154812  0.092050209  0.092050209
##   Female  0.004184100  0.020920502  0.050209205  0.062761506  0.033472803
##         age
## gender   76 years old
##   Male    0.041841004
##   Female  0.037656904
## 
## 
## $race
##          White       Hispanic Asian American          Black 
##      0.6725664      0.1283186      0.0000000      0.1991150 
## 
## $party
## Dem Rep 
##   1   0 
## 
## $`experience:pos_security`
##           pos_security
## experience Cut military budget Maintain strong defense
##   None             0.066945607             0.000000000
##   4 years          0.221757322             0.004184100
##   8 years          0.154811715             0.000000000
##   12 years         0.414225941             0.008368201

``` r
partial_joint_name <- list(c("gender", "age", "family"), "race", "party", c("experience", "pos_security"))
```
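To make the expected shapes concrete, here is a minimal base R sketch that builds a partial joint table and a marginal vector from a small population data frame. The data frame `pop` and all values in it are hypothetical, as are the object names `toy_target_partial` and `toy_partial_joint_name`. The element name "experience:pos_security" simply mirrors the printed output above; whether factorEx requires that naming is not stated in the README.

```r
# Hypothetical population of eight profiles (illustrative values only)
pop <- data.frame(
  experience = factor(
    c("None", "4 years", "4 years", "8 years",
      "12 years", "12 years", "12 years", "12 years"),
    levels = c("None", "4 years", "8 years", "12 years")
  ),
  pos_security = factor(
    c("Cut military budget", "Cut military budget", "Maintain strong defense",
      "Cut military budget", "Cut military budget", "Cut military budget",
      "Cut military budget", "Maintain strong defense"),
    levels = c("Cut military budget", "Maintain strong defense")
  ),
  race = factor(c("White", "White", "Black", "Hispanic",
                  "White", "White", "Black", "White"))
)

# Partial joint distribution: a two-way table of proportions
joint <- prop.table(table(pop[, c("experience", "pos_security")]))

# Marginal distribution: a named numeric vector of proportions
race_tab  <- prop.table(table(pop$race))
race_marg <- setNames(as.numeric(race_tab), names(race_tab))

toy_target_partial <- list("experience:pos_security" = joint, race = race_marg)
toy_partial_joint_name <- list(c("experience", "pos_security"), "race")

# Summing the joint table over pos_security recovers the experience marginal
margin.table(joint, 1)
```

Building the joint table with `table()` keeps the level names attached as dimnames, which is the property the text above asks for.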

We can estimate the pAMCE using design_pAMCE with target_type = "partial_joint" and the appropriate partial_joint_name. As before, factor_name specifies the factors for which the pAMCE is estimated.

``` r
out_design_par <- design_pAMCE(
  formula = Y ~ gender + age + family + race + experience + party + pos_security,
  factor_name = c("gender", "age", "race"),
  data = OnoBurden_data_pr,
  pair_id = OnoBurden_data_pr$pair_id,
  cluster_id = OnoBurden_data_pr$id,
  target_dist = target_dist_partial,
  target_type = "partial_joint",
  partial_joint_name = partial_joint_name
)
summary(out_design_par)
```

## 
## ----------------
## Population AMCEs:
## ----------------
##  target_dist factor        level     Estimate  Std. Error p value    
##       target gender       Female  0.024756315 0.006362147   0.000 ***
##       target    age 44 years old  0.024750351 0.015045579   0.100    
##       target    age 52 years old -0.006198274 0.014335803   0.665    
##       target    age 60 years old -0.001011886 0.014397430   0.944    
##       target    age 68 years old  0.016337413 0.014132614   0.248    
##       target    age 76 years old -0.046107728 0.015464360   0.003  **
##       target   race        Black -0.025770076 0.008043842   0.001  **
##       target   race     Hispanic -0.028217748 0.009332710   0.002  **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(2) Model-based Exploratory Analysis

Here, we use the conjoint experiment that randomized profiles according to the uniform distribution and incorporate the target profile distribution in the analysis stage.

``` r
# Randomization based on the uniform distribution
OnoBurden_data <- OnoBurden$OnoBurden_data

# Due to the large sample size, focus on "congressional candidates" for this example
OnoBurden_data_cong <- OnoBurden_data[OnoBurden_data$office == "Congress", ]

out_model <- model_pAMCE(
  formula = Y ~ gender + age + family + race + experience + party + pos_security,
  data = OnoBurden_data_cong,
  reg = TRUE,
  pair_id = OnoBurden_data_cong$pair_id,
  cluster_id = OnoBurden_data_cong$id,
  target_dist = target_dist_marginal,
  target_type = "marginal"
)
summary(out_model, factor_name = c("gender"))
```

## 
## ----------------
## Population AMCEs:
## ----------------
##  target_dist factor  level   Estimate Std. Error p value 
##     target_1 gender Female 0.02485328 0.01783633   0.163 
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

When sample = TRUE, the function also reports the AMCE based on the in-sample profile distributions (sample AMCE), which is the uniform AMCE in this example.

``` r
summary(out_model, factor_name = c("gender"), sample = TRUE)
```

## 
## ----------------
## Population AMCEs:
## ----------------
##  target_dist factor  level     Estimate  Std. Error p value 
##  sample AMCE gender Female -0.002290771 0.008321458   0.783 
##     target_1 gender Female  0.024853283 0.017836332   0.163 
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
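The sample AMCE and the pAMCE can differ because an AMCE averages conditional effects over the distribution of the other factors. The toy calculation below illustrates this with two binary factors. All numbers in `mu` are hypothetical and chosen only so that the effect of gender changes sign across party; this is a sketch of the idea, not the estimator that factorEx implements.

```r
# Hypothetical mean outcomes for each (gender, party) profile
mu <- matrix(
  c(0.50, 0.50,
    0.55, 0.45),
  nrow = 2, byrow = TRUE,
  dimnames = list(gender = c("Male", "Female"), party = c("Dem", "Rep"))
)

# Effect of Female versus Male within each party level
cond_effect <- mu["Female", ] - mu["Male", ]

# Two profile distributions for party: uniform and a target with only Dem
uniform <- c(Dem = 0.5, Rep = 0.5)
target  <- c(Dem = 1,   Rep = 0)

# AMCE of gender = conditional effects weighted by the party distribution
amce_uniform <- sum(cond_effect * uniform[names(cond_effect)])
amce_target  <- sum(cond_effect * target[names(cond_effect)])

round(c(uniform = amce_uniform, target = amce_target), 3)
```

With these made-up numbers the opposite-signed conditional effects cancel under the uniform distribution, while the target distribution keeps only the Dem effect.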

Use plot to visualize the estimated pAMCEs. When diagnose = TRUE, it provides two diagnostic checks: specification tests and a check of the bootstrap distributions.

``` r
plot(out_model, factor_name = c("gender"), diagnose = TRUE)
```

In the model-based analysis, we can also decompose the difference between the pAMCE and the uniform AMCE. Use effect_name to specify which pAMCE to decompose. effect_name has two elements: the first is a factor name and the second is the level name of interest.

``` r
decompose_pAMCE(out_model, effect_name = c("gender", "Female"))
```

##                type       factor      estimate          se      low.95ci
## 1 target_1 - sample          age -4.476601e-03 0.002526321 -9.542598e-03
## 2 target_1 - sample       family -1.028249e-03 0.002956149 -6.693040e-03
## 3 target_1 - sample         race  5.505289e-03 0.007778271 -9.474694e-03
## 4 target_1 - sample   experience  6.965927e-05 0.000791340 -1.264228e-03
## 5 target_1 - sample        party  1.061621e-02 0.007640463 -6.015014e-03
## 6 target_1 - sample pos_security  1.685740e-02 0.008586040 -2.671033e-05
##       high.95ci
## 1 -0.0003348427
## 2  0.0058319109
## 3  0.0219185310
## 4  0.0015099003
## 5  0.0247779725
## 6  0.0315124642

Or use plot_decompose to visualize the decomposition.

``` r
plot_decompose(out_model, effect_name = c("gender", "Female"))
```

Owner

Name: Naoki Egami
Login: naoki-egami
Kind: user
Location: New York
Company: Columbia University

Website: https://naokiegami.com
Repositories: 3
Profile: https://github.com/naoki-egami

Assistant Professor, Columbia

Committers

Last synced: over 3 years ago

All Time

Total Commits: 112
Total Committers: 2
Avg Commits per committer: 56.0
Development Distribution Score (DDS): 0.241

Top Committers

Name	Email	Commits
naoki-egami	n**i@p**u	85
Naoki Egami	n**i@u**m	27

Committer Domains (Top 20 + Academic)

princeton.edu: 1

Packages

Total packages: 1
Total downloads: unknown

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 2

cran.r-project.org: factorEx

Design and Analysis for Factorial Experiments

Homepage: https://github.com/naoki-egami/factorEx
Documentation: http://cran.r-project.org/web/packages/factorEx/factorEx.pdf
License: GPL-2
Status: removed
Latest release: 1.0.1
published about 6 years ago

Versions: 2
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 0

Rankings

Forks count: 17.8%

Stargazers count: 19.8%

Dependent packages count: 29.8%

Dependent repos count: 35.5%

Average: 38.5%

Downloads: 89.7%