mifa
An R package providing multiple Imputation of covariance matrices in order to perform factor analysis.
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
○CITATION.cff file
-
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
✓DOI references
Found 7 DOI reference(s) in README -
○Academic publication links
-
✓Committers with academic emails
1 of 3 committers (33.3%) from academic institutions -
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (17.4%) to scientific vocabulary
Keywords
factor-analysis
imputation
rstats
Last synced: 6 months ago
·
JSON representation
Repository
An R package providing multiple Imputation of covariance matrices in order to perform factor analysis.
Basic Info
- Host: GitHub
- Owner: Teebusch
- License: other
- Language: R
- Default Branch: master
- Homepage: https://teebusch.github.io/mifa/
- Size: 1.6 MB
Statistics
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 5
- Releases: 3
Topics
factor-analysis
imputation
rstats
Created over 6 years ago
· Last pushed 9 months ago
Metadata Files
Readme
Changelog
License
README.Rmd
---
output: github_document
editor_options:
markdown:
wrap: 80
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
dpi = 300,
fig.width = 10,
fig.height = 7,
dev = "svg"
)
set.seed(1234)
# preload libraries here to avoid messages
library(mifa)
library(psych)
library(ggplot2)
library(tidyr)
ggplot2::theme_set(theme_minimal(base_size = 16))
```
# mifa - multiple imputation for factor analysis
[](https://lifecycle.r-lib.org/articles/stages.html#maturing)
[](https://CRAN.R-project.org/package=mifa)
[](https://github.com/Teebusch/mifa/actions/workflows/R-CMD-check.yaml)
[](https://app.codecov.io/gh/Teebusch/mifa)
`mifa` is an R package that implements multiple imputation of covariance
matrices to allow to perform factor analysis on incomplete data. It works as
follows:
1. Impute missing values multiple times using *Multivariate Imputation with
Chained Equations* (MICE) from the [mice](https://amices.org/mice/) package.
2. Combine the covariance matrices of the imputed data sets into a single
covariance matrix using Rubin's rules^[Rubin D. B. Multiple imputation
for nonresponse in surveys (2004). John Wiley & Sons.]
3. Use the combined covariance matrix for exploratory factor analysis.
`mifa` also provides two types of confidence intervals for the variance
explained by different numbers of principal components:
Fieller confidence intervals (parametric) for larger samples^[Fieller, E.
C. (1954). Some problems in interval estimation. Journal of the Royal
Statistical Society. Series B (Methodological): 175-185.]
and bootstrapped confidence intervals (nonparametric) for smaller
samples.^[Shao, J. & Sitter, R. R. (1996). Bootstrap for imputed survey
data. Journal of the American Statistical Association 91.435 (1996): 1278-1288. doi: [10.1080/01621459.1996.10476997](https://dx.doi.org/10.1080/01621459.1996.10476997)]
**For more information about the method, see:**
Nassiri, V., Lovik, A., Molenberghs, G., Verbeke, G. (2018). On using multiple
imputation for exploratory factor analysis of incomplete data. *Behavior
Research Methods* 50, 501–517. doi: [10.3758/s13428-017-1013-4](https://doi.org/10.3758/s13428-017-1013-4)
*Note:* The paper was accompanied by an implementation in R, and this package
emerged from it. The repository appears to have been abandoned by the authors,
but you can still find it [here](https://github.com/vahidnassiri/mifa).
## Installation
Install from CRAN with:
```{r eval=FALSE}
install.packages("mifa")
```
Or install the development version from [Github](https://github.com/teebusch/mifa) with:
``` r
# install.packages("devtools")
devtools::install_github("teebusch/mifa")
```
## Usage
### Example Data
For this example we use the `bfi` data set from the `psych` package. It contains
2,800 subjects' answers to 25 personality self-report items and 3 demographic
variables (sex, education, and age). Each of the 25 personality questions is
meant to tap into one of the "Big 5" personality factors, as indicated by their
names: **O**penness, **C**onscientiousness, **A**greeableness, ,
**E**xtraversion, **N**euroticism. There are missing responses for most items.
Instead of dropping the incomplete cases from the analysis, we will use `mifa`
to impute them, and then perform a factor analysis on the imputed covariance
matrix.
### Imputing the Covariance Matrix
First, we use `mifa()` to impute the covariance matrix and get an idea how many
factors we should use. We use the `cov_vars` argument to tell `mifa` to use
`gender`, `education`, and `age` for the imputations, but exclude them from the
covariance matrix:
```{r run-mifa, messages=FALSE}
library(mifa)
library(psych)
mi <- mifa(
data = bfi,
cov_vars = -c(gender, education, age),
n_pc = 2:8,
ci = "fieller",
print = FALSE
)
mi
```
### Factor Analysis
It looks like the first 5 principal components explain more than half of the
variance in the responses, so we perform a factor analysis with 5 factors, using
the `fa()` function from the `psych` package. We can get the imputed covariance
matrix of our data from `mi$cov_combined`. From there on, it's business as
usual.
```{r message=FALSE}
fit <- fa(mi$cov_combined, n.obs = nrow(bfi), nfactors = 5)
```
The factor diagram shows that the five factors correspond nicely to the 5 types
of questions:
```{r fa-diagram, fig.height = 8}
fa.diagram(fit)
```
We can add the factor scores to the original data, in order to explore group
differences. Because we need complete data to calculate factor scores, we first
impute a single data set with mice:
```{r}
data_imp <- mice::complete(mice::mice(bfi, 1, print = FALSE))
fct_scores <- data.frame(factor.scores(data_imp[, 1:25], fit)$scores)
data_imp <- data.frame(
Gender = factor(data_imp$gender),
Extraversion = fct_scores$MR1,
Neuroticism = fct_scores$MR2,
Conscientious = fct_scores$MR3,
Openness = fct_scores$MR4,
Agreeableness = fct_scores$MR5
)
levels(data_imp$Gender) <- c("Male", "Female")
```
Then we can visualize the group differences:
```{r fa-group-comparison}
library(ggplot2)
library(tidyr)
data_imp2 <- tidyr::pivot_longer(data_imp, cols = -Gender, names_to = "factor")
ggplot(data_imp2) +
geom_density(aes(value, linetype = Gender)) +
facet_wrap(~ factor, nrow = 2) +
theme(legend.position = "inside", legend.position.inside = c(.9, .1))
```
## Further Reading
Owner
- Name: Tobias Busch
- Login: Teebusch
- Kind: user
- Location: Oslo | Köln
- Website: tobiasbusch.xyz
- Twitter: tobilottii
- Repositories: 2
- Profile: https://github.com/Teebusch
I'm a researcher with a PhD in Biomedical science working on hearing and cochlear implants.
GitHub Events
Total
- Issue comment event: 1
- Push event: 13
- Pull request event: 2
- Create event: 2
Last Year
- Issue comment event: 1
- Push event: 13
- Pull request event: 2
- Create event: 2
Committers
Last synced: 9 months ago
Top Committers
| Name | Commits | |
|---|---|---|
| Tobias Busch | t****h@g****m | 60 |
| Tobias Busch | t****u@u****o | 2 |
| unknown | v****i@o****u | 1 |
Committer Domains (Top 20 + Academic)
openanalytics.eu: 1
uio.no: 1
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 4
- Total pull requests: 4
- Average time to close issues: N/A
- Average time to close pull requests: about 2 hours
- Total issue authors: 3
- Total pull request authors: 1
- Average comments per issue: 0.25
- Average comments per pull request: 0.0
- Merged pull requests: 4
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: about 8 hours
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- Teebusch (2)
- miniqiong (1)
- kikabradford (1)
Pull Request Authors
- Teebusch (5)
Top Labels
Issue Labels
question (1)
Pull Request Labels
Packages
- Total packages: 1
-
Total downloads:
- cran 148 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 2
- Total maintainers: 1
cran.r-project.org: mifa
Multiple Imputation for Exploratory Factor Analysis
- Homepage: https://github.com/teebusch/mifa
- Documentation: http://cran.r-project.org/web/packages/mifa/mifa.pdf
- License: MIT + file LICENSE
-
Latest release: 0.2.1
published 10 months ago
Rankings
Stargazers count: 28.5%
Forks count: 28.8%
Dependent packages count: 29.8%
Dependent repos count: 35.5%
Average: 40.0%
Downloads: 77.4%
Maintainers (1)
Last synced:
6 months ago
Dependencies
DESCRIPTION
cran
- checkmate * imports
- dplyr * imports
- mice * imports
- stats * imports
- covr * suggests
- ggplot2 * suggests
- knitr * suggests
- psych * suggests
- rmarkdown * suggests
- testthat * suggests
- tidyr * suggests
.github/workflows/R-CMD-check.yaml
actions
- actions/checkout v4 composite
- r-lib/actions/check-r-package v2 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pkgdown.yaml
actions
- JamesIves/github-pages-deploy-action v4.5.0 composite
- actions/checkout v4 composite
- r-lib/actions/setup-pandoc v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml
actions
- actions/checkout v4 composite
- actions/upload-artifact v4 composite
- codecov/codecov-action v5 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite