mifa

An R package providing multiple imputation of covariance matrices in order to perform factor analysis.

https://github.com/teebusch/mifa

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 7 DOI reference(s) in README
○ Academic publication links
✓ Committers with academic emails: 1 of 3 committers (33.3%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (17.4%) to scientific vocabulary

Keywords

factor-analysis imputation rstats

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: Teebusch
License: other
Language: R
Default Branch: master
Homepage: https://teebusch.github.io/mifa/
Size: 1.6 MB

Statistics

Stars: 2
Watchers: 0
Forks: 0
Open Issues: 5
Releases: 3

Topics

factor-analysis imputation rstats

Created over 6 years ago · Last pushed 9 months ago

Metadata Files

Readme Changelog License

README.Rmd

---
output: github_document
editor_options: 
  markdown: 
    wrap: 80
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse   = TRUE,
  comment    = "#>",
  fig.path   = "man/figures/README-",
  dpi        = 300,
  fig.width  = 10,
  fig.height = 7,
  dev        = "svg"
)

set.seed(1234)

# preload libraries here to avoid messages
library(mifa)
library(psych)
library(ggplot2)
library(tidyr)

ggplot2::theme_set(theme_minimal(base_size = 16))
```

# mifa - multiple imputation for factor analysis




[![Lifecycle: maturing](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html#maturing)
[![CRAN status](https://www.r-pkg.org/badges/version/mifa)](https://CRAN.R-project.org/package=mifa)
[![R-CMD-check](https://github.com/Teebusch/mifa/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/Teebusch/mifa/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/Teebusch/mifa/graph/badge.svg)](https://app.codecov.io/gh/Teebusch/mifa)


`mifa` is an R package that implements multiple imputation of covariance
matrices so that factor analysis can be performed on incomplete data. It works
as follows:

1.  Impute missing values multiple times using *Multivariate Imputation by
    Chained Equations* (MICE) from the [mice](https://amices.org/mice/) package.

2.  Combine the covariance matrices of the imputed data sets into a single
    covariance matrix using Rubin's rules.^[Rubin, D. B. (2004). Multiple
    imputation for nonresponse in surveys. John Wiley & Sons.]

3.  Use the combined covariance matrix for exploratory factor analysis.
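
The first two steps can be sketched in base R. This is only an illustration of
the pooling idea, not the package's implementation: the helper `impute_once()`
is hypothetical and uses a crude random draw from the observed values as a
stand-in for MICE, and the built-in `airquality` data set is an arbitrary
choice. The pooled point estimate under Rubin's rules is the average of the
per-imputation covariance matrices.

```r
# Hypothetical helper: a crude stand-in for MICE that fills each missing
# value with a random draw from the observed values of the same column.
impute_once <- function(df) {
  for (col in names(df)) {
    miss <- is.na(df[[col]])
    if (any(miss)) {
      df[[col]][miss] <- sample(df[[col]][!miss], sum(miss), replace = TRUE)
    }
  }
  df
}

set.seed(1)
dat <- airquality[, c("Ozone", "Solar.R", "Wind", "Temp")]
m   <- 5  # number of imputations

# Step 1: create m completed data sets
completed <- replicate(m, impute_once(dat), simplify = FALSE)

# Step 2: one covariance matrix per completed data set, then average them
cov_list     <- lapply(completed, cov)
cov_combined <- Reduce("+", cov_list) / m

cov_combined
```

Entries that involve only fully observed variables (here `Wind` and `Temp`) are
identical across imputations, so averaging leaves them unchanged. Only the
entries that involve imputed values vary between data sets.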

`mifa` also provides two types of confidence intervals for the variance
explained by different numbers of principal components: Fieller confidence
intervals (parametric) for larger samples^[Fieller, E. C. (1954). Some problems
in interval estimation. Journal of the Royal Statistical Society. Series B
(Methodological), 175-185.] and bootstrapped confidence intervals
(nonparametric) for smaller samples.^[Shao, J. & Sitter, R. R. (1996). Bootstrap
for imputed survey data. Journal of the American Statistical Association,
91(435), 1278-1288. doi:
[10.1080/01621459.1996.10476997](https://dx.doi.org/10.1080/01621459.1996.10476997)]
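
The quantity that these intervals describe can be sketched in base R. This is a
minimal illustration, assuming that "variance explained" means the cumulative
share of the eigenvalues of a covariance matrix. The `mtcars` columns are an
arbitrary example, and the sketch gives point estimates only, with no
confidence intervals.

```r
# Covariance matrix of a few numeric variables (arbitrary example data)
S <- cov(mtcars[, c("mpg", "disp", "hp", "drat", "wt", "qsec")])

# Eigenvalues of the covariance matrix, sorted in decreasing order
ev <- eigen(S, symmetric = TRUE, only.values = TRUE)$values

# Cumulative proportion of variance explained by the first k components
var_explained <- cumsum(ev) / sum(ev)

# Proportion explained by the first 2, 3, and 4 components
round(var_explained[2:4], 3)
```

Because the variables are not standardized here, those with large variances
dominate the first components. Applying `cov2cor()` to `S` first would give the
correlation-based version instead.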

**For more information about the method, see:**

Nassiri, V., Lovik, A., Molenberghs, G., & Verbeke, G. (2018). On using multiple
imputation for exploratory factor analysis of incomplete data. *Behavior
Research Methods*, 50, 501–517. doi: [10.3758/s13428-017-1013-4](https://doi.org/10.3758/s13428-017-1013-4)

*Note:* The paper was accompanied by an implementation in R, and this package
emerged from it. The repository appears to have been abandoned by the authors,
but you can still find it [here](https://github.com/vahidnassiri/mifa).

## Installation

Install from CRAN with:

```{r eval=FALSE}
install.packages("mifa")
```

Or install the development version from [GitHub](https://github.com/teebusch/mifa) with:

``` r
# install.packages("devtools")
devtools::install_github("teebusch/mifa")
```

## Usage

### Example Data

For this example we use the `bfi` data set from the `psych` package. It contains
2,800 subjects' answers to 25 personality self-report items and 3 demographic
variables (gender, education, and age). Each of the 25 personality questions is
meant to tap into one of the "Big 5" personality factors, as indicated by their
names: **O**penness, **C**onscientiousness, **A**greeableness,
**E**xtraversion, **N**euroticism. There are missing responses for most items.
Instead of dropping the incomplete cases from the analysis, we will use `mifa`
to impute them, and then perform a factor analysis on the imputed covariance
matrix.

### Imputing the Covariance Matrix

First, we use `mifa()` to impute the covariance matrix and get an idea of how
many factors we should use. We use the `cov_vars` argument to tell `mifa()` to
use `gender`, `education`, and `age` for the imputations, but to exclude them
from the covariance matrix:

```{r run-mifa, message=FALSE}
library(mifa)
library(psych)

mi <- mifa(
  data      = bfi, 
  cov_vars  = -c(gender, education, age),
  n_pc      = 2:8, 
  ci        = "fieller", 
  print     = FALSE
)

mi
```

### Factor Analysis

It looks like the first 5 principal components explain more than half of the
variance in the responses, so we perform a factor analysis with 5 factors, using
the `fa()` function from the `psych` package. We can get the imputed covariance
matrix of our data from `mi$cov_combined`. From there on, it's business as
usual.

```{r message=FALSE}
fit <- fa(mi$cov_combined, n.obs = nrow(bfi), nfactors = 5)
```
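
By default, `psych::fa()` factors the correlation matrix (its `covar` argument
defaults to `FALSE`), so a covariance matrix such as `mi$cov_combined` is
rescaled to correlations before fitting. The rescaling itself can be sketched
with base R's `cov2cor()`. The matrix `S` below is a made-up toy example and
not output from `mifa()`.

```r
# Toy covariance matrix (made-up numbers, for illustration only)
S <- matrix(
  c(4.0, 1.2, 0.6,
    1.2, 9.0, 1.5,
    0.6, 1.5, 1.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("x1", "x2", "x3"), c("x1", "x2", "x3"))
)

# Rescale to a correlation matrix: each entry is divided by the product
# of the two corresponding standard deviations
R <- cov2cor(S)

# The same result computed by hand
sds <- sqrt(diag(S))
all.equal(R, S / outer(sds, sds))
```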

The factor diagram shows that the five factors correspond nicely to the 5 types
of questions:

```{r fa-diagram, fig.height = 8}
fa.diagram(fit)
```

We can add the factor scores to the original data, in order to explore group
differences. Because we need complete data to calculate factor scores, we first
impute a single data set with mice:

```{r}
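# impute a single complete data set (m = 1) without printing progress
data_imp <- mice::complete(mice::mice(bfi, m = 1, printFlag = FALSE))

# compute factor scores for the 25 personality items (columns 1 to 25)
fct_scores <- data.frame(factor.scores(data_imp[, 1:25], fit)$scores)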

data_imp <- data.frame(
  Gender        = factor(data_imp$gender),
  Extraversion  = fct_scores$MR1,
  Neuroticism   = fct_scores$MR2,
  Conscientious = fct_scores$MR3,
  Openness      = fct_scores$MR4,
  Agreeableness = fct_scores$MR5
)

levels(data_imp$Gender) <- c("Male", "Female")
```

Then we can visualize the group differences:

```{r fa-group-comparison}
library(ggplot2)
library(tidyr)

data_imp2 <- tidyr::pivot_longer(data_imp, cols = -Gender, names_to = "factor")

ggplot(data_imp2) +
  geom_density(aes(value, linetype = Gender)) +
  facet_wrap(~ factor, nrow = 2) +
  theme(legend.position = "inside", legend.position.inside = c(.9, .1))
```

## Further Reading

Owner

Name: Tobias Busch
Login: Teebusch
Kind: user
Location: Oslo | Köln

Website: tobiasbusch.xyz
Twitter: tobilottii
Repositories: 2
Profile: https://github.com/Teebusch

I'm a researcher with a PhD in Biomedical science working on hearing and cochlear implants.

GitHub Events

Total

Issue comment event: 1
Push event: 13
Pull request event: 2
Create event: 2

Last Year

Issue comment event: 1
Push event: 13
Pull request event: 2
Create event: 2

Committers

Last synced: 9 months ago

All Time

Total Commits: 63
Total Committers: 3
Avg Commits per committer: 21.0
Development Distribution Score (DDS): 0.048

Past Year

Commits: 6
Committers: 1
Avg Commits per committer: 6.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Tobias Busch	t**h@g**m	60
Tobias Busch	t**u@u**o	2
unknown	v**i@o**u	1

Committer Domains (Top 20 + Academic)

openanalytics.eu: 1
uio.no: 1

Issues and Pull Requests

Last synced: 7 months ago

All Time

Total issues: 4
Total pull requests: 4
Average time to close issues: N/A
Average time to close pull requests: about 2 hours
Total issue authors: 3
Total pull request authors: 1
Average comments per issue: 0.25
Average comments per pull request: 0.0
Merged pull requests: 4
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 1
Average time to close issues: N/A
Average time to close pull requests: about 8 hours
Issue authors: 0
Pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 1
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

Teebusch (2)
miniqiong (1)
kikabradford (1)

Pull Request Authors

Teebusch (5)

Top Labels

Issue Labels

question (1)

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- cran 148 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 2
Total maintainers: 1

cran.r-project.org: mifa

Multiple Imputation for Exploratory Factor Analysis

Homepage: https://github.com/teebusch/mifa
Documentation: http://cran.r-project.org/web/packages/mifa/mifa.pdf
License: MIT + file LICENSE
Latest release: 0.2.1
published 10 months ago

Versions: 2
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 148 Last month

Rankings

Stargazers count: 28.5%

Forks count: 28.8%

Dependent packages count: 29.8%

Dependent repos count: 35.5%

Average: 40.0%

Downloads: 77.4%

Maintainers (1)

teebusch@gmail.com

Last synced: 6 months ago

Dependencies

DESCRIPTION cran

checkmate * imports
dplyr * imports
mice * imports
stats * imports
covr * suggests
ggplot2 * suggests
knitr * suggests
psych * suggests
rmarkdown * suggests
testthat * suggests
tidyr * suggests

.github/workflows/R-CMD-check.yaml actions

actions/checkout v4 composite
r-lib/actions/check-r-package v2 composite
r-lib/actions/setup-pandoc v2 composite
r-lib/actions/setup-r v2 composite
r-lib/actions/setup-r-dependencies v2 composite

.github/workflows/pkgdown.yaml actions

JamesIves/github-pages-deploy-action v4.5.0 composite
actions/checkout v4 composite
r-lib/actions/setup-pandoc v2 composite
r-lib/actions/setup-r v2 composite
r-lib/actions/setup-r-dependencies v2 composite

.github/workflows/test-coverage.yaml actions

actions/checkout v4 composite
actions/upload-artifact v4 composite
codecov/codecov-action v5 composite
r-lib/actions/setup-r v2 composite
r-lib/actions/setup-r-dependencies v2 composite
