GGMnonreg

GGMnonreg: Non-Regularized Gaussian Graphical Models in R - Published in JOSS (2021)

https://github.com/donaldrwilliams/ggmnonreg

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Scientific Fields

Sociology Social Sciences - 40% confidence
Last synced: 4 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: donaldRwilliams
  • License: gpl-2.0
  • Language: R
  • Default Branch: master
  • Size: 1.1 MB
Statistics
  • Stars: 7
  • Watchers: 1
  • Forks: 6
  • Open Issues: 3
  • Releases: 0
Created over 7 years ago · Last pushed almost 2 years ago
Metadata Files
Readme License

README.Rmd

---
output: github_document
bibliography: inst/REFERENCES.bib
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "75%"
)
```



# GGMnonreg: Non-regularized Gaussian Graphical Models

[![CRAN Version](http://www.r-pkg.org/badges/version/GGMnonreg)](https://cran.r-project.org/package=GGMnonreg)
[![Downloads](https://cranlogs.r-pkg.org/badges/GGMnonreg)](https://cran.r-project.org/package=GGMnonreg)
[![CircleCI build status](https://circleci.com/gh/donaldRwilliams/GGMnonreg.svg?style=shield)](https://circleci.com/gh/donaldRwilliams/GGMnonreg)
[![zenodo](https://zenodo.org/badge/DOI/10.5281/zenodo.5668161.svg)](https://doi.org/10.5281/zenodo.5668161)


The goal of **GGMnonreg** is to estimate non-regularized graphical models. Note that 
the title is a bit of a misnomer, in that Ising and mixed graphical models are also supported.

Graphical modeling is quite common in fields with *wide* data, that is, when there are more 
variables than observations. Accordingly, many regularization-based approaches have been developed for those kinds of data. Regularization has key drawbacks when the goal is inference, 
including, but not limited to, the fact that obtaining a valid measure of parameter uncertainty is very (very) difficult.

More recently, graphical modeling has emerged in psychology [@Epskamp2018ggm], where the data 
is typically long or low-dimensional [*p* < *n*; @williams2019nonregularized; @williams_rethinking]. The primary purpose of  **GGMnonreg** is to provide methods specifically for low-dimensional data 
(e.g., those common to psychopathology networks).

## Supported Models
* Gaussian graphical model. The following data types are supported.
  + Gaussian 
  + Ordinal
  + Binary
* Ising model [@marsman_2018]
* Mixed graphical model

## Additional methods
The following are also included:

* Expected network replicability [@williams2020learning]
* Compare Gaussian graphical models
* Measure of parameter uncertainty [@williams2019nonregularized]
* Edge inclusion "probabilities"
* Network visualization
* Constrained precision matrix (the network, given an assumed graph)
* Predictability (variance explained)

## Installation
To install the latest release version (1.1.0) from CRAN use

```r
install.packages("GGMnonreg")    
```

You can install the development version from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("donaldRwilliams/GGMnonreg")
```

## Ising
An Ising model is fitted with the following:
```{r}
library(GGMnonreg)

# make binary
Y <- ifelse(ptsd[,1:5] == 0, 0, 1)

# fit model
fit <- ising_search(Y, IC = "BIC", 
                    progress = FALSE)

fit
```

Note the same code, more or less, is also used for GGMs and mixed graphical models.

## Predictability
It is common to compute predictability, or variance explained, for each node in the network.
An advantage of **GGMnonreg** is that a measure of uncertainty is also provided.

```{r}
# data
Y <- na.omit(bfi[,1:5])

# fit model
fit <- ggm_inference(Y, boot = FALSE)

# predictability
predictability(fit)
```
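
For intuition, here is a minimal base R sketch of what "variance explained" means for a node in a fully connected network. It uses simulated data (the variables and effect sizes are made up for illustration) and relies on the standard identity that the R-squared from regressing node *j* on all other nodes equals one minus the reciprocal of the *j*-th diagonal element of the inverse correlation matrix. It does not reproduce the internals of `predictability()`, which works from the selected graph and also returns uncertainty.

```r
# simulated data with some dependence (hypothetical, not a package dataset)
set.seed(1)
n <- 500
p <- 4
X <- matrix(rnorm(n * p), n, p)
X[, 2] <- X[, 2] + 0.5 * X[, 1]
X[, 3] <- X[, 3] + 0.4 * X[, 2]

# R^2 for node j is 1 - 1 / Theta[j, j], where Theta is the
# inverse of the correlation matrix
Theta <- solve(cor(X))
r2_precision <- 1 - 1 / diag(Theta)

# the same quantity from regressing each node on all the others
r2_regression <- sapply(seq_len(p), function(j) {
  summary(lm(X[, j] ~ X[, -j]))$r.squared
})

round(cbind(r2_precision, r2_regression), 3)
```

The two columns agree because both describe the saturated model; once edges are set to zero, predictability is computed from the constrained network instead.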


## Parameter Uncertainty
Confidence intervals for each relation are obtained with
```{r}
# data
Y <- na.omit(bfi[,1:5])

# fit model
fit <- ggm_inference(Y, boot = TRUE, 
                     method = "spearman", 
                     B = 100, progress = FALSE)

confint(fit)
```
These can then be plotted with, say, **ggplot2** (left to the user).
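
As a rough illustration of bootstrap intervals for partial correlations, the following base R sketch computes percentile intervals from Spearman correlations on simulated data. The helper `pcor_upper()` and the simulated variables are hypothetical, and the resampling scheme is an assumption made for illustration; it is not claimed to match what `ggm_inference()` does internally.

```r
# simulated data with one true relation between variables 1 and 2
set.seed(1)
n <- 300
Y_sim <- matrix(rnorm(n * 3), n, 3)
Y_sim[, 2] <- Y_sim[, 2] + 0.6 * Y_sim[, 1]

# hypothetical helper: upper-triangular partial correlations
# computed from Spearman correlations
pcor_upper <- function(dat) {
  pc <- -cov2cor(solve(cor(dat, method = "spearman")))
  pc[upper.tri(pc)]
}

# resample rows with replacement and re-estimate each time
B <- 200
boot_est <- replicate(B, pcor_upper(Y_sim[sample(n, replace = TRUE), ]))

# 95% percentile interval per relation (rows: 1--2, 1--3, 2--3)
ci <- t(apply(boot_est, 1, quantile, probs = c(0.025, 0.975)))
cbind(estimate = pcor_upper(Y_sim), ci)
```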

## Edge Inclusion
When mining data, or performing an automatic search, it is difficult to make inference on the
network parameters (e.g., confidence intervals are not easily computed). To summarize data mining,
**GGMnonreg** provides edge inclusion "probabilities" (the proportion of bootstrap samples in 
which each relation was detected).

```{r}
# data
Y <- na.omit(bfi[,1:5])

# fit model
fit <-  eip(Y, method = "spearman", 
            B  = 100, progress = FALSE)

fit
```
Note that in all cases, the provided estimates correspond to the upper-triangular elements
of the network.
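
The idea of an edge inclusion "probability" can be sketched in a few lines of base R. The example below uses simulated data and a Fisher z-test as the detection rule; both the helper `detect_edges()` and the choice of test are assumptions for illustration and may differ from the search used by `eip()`.

```r
# simulated data with one true relation between variables 1 and 2
set.seed(1)
n <- 300
p <- 4
Y_sim <- matrix(rnorm(n * p), n, p)
Y_sim[, 2] <- Y_sim[, 2] + 0.6 * Y_sim[, 1]

# hypothetical detection rule: two-sided Fisher z-test on each
# upper-triangular partial correlation
detect_edges <- function(dat, alpha = 0.05) {
  pc <- -cov2cor(solve(cor(dat)))
  z  <- atanh(pc[upper.tri(pc)])
  se <- 1 / sqrt(nrow(dat) - (ncol(dat) - 2) - 3)
  abs(z / se) > qnorm(1 - alpha / 2)
}

# apply the rule to B bootstrap samples (one column per sample)
B <- 200
detected <- replicate(B, detect_edges(Y_sim[sample(n, replace = TRUE), ]))

# proportion of bootstrap samples in which each relation was detected
eip_sketch <- rowMeans(detected)
round(eip_sketch, 2)
```

The first element corresponds to the relation between variables 1 and 2, in the same upper-triangular order as `upper.tri()`.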

## Expected Network Replicability
**GGMnonreg** allows for computing expected network replicability (ENR), i.e., the number of 
effects that will be detected in any number of replications. This is an analytic solution.

The first step is defining a true network
```{r}
# first make the true network
cors <- cor(GGMnonreg::ptsd)

# inverse
inv <- solve(cors)

# partials
pcors <-  -cov2cor(inv)

# set values to zero
pcors <- ifelse(abs(pcors) < 0.05, 0, pcors)
```

Then obtain ENR
```{r}
fit_enr <- enr(net = pcors, n = 500, replications = 2)

fit_enr
```
Note that this is inherently frequentist. As such, over the long run, 45% of the edges will be replicated on average. We can further infer that, in hypothetical replication attempts, more than half of the edges
will be replicated only 5% of the time.
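
To see where numbers like these can come from, here is a minimal sketch of the arithmetic under stated assumptions: each edge is tested with a two-sided Fisher z-test, replications are independent, and the edge values below are hypothetical. It is meant as intuition only and is not claimed to reproduce the computations in `enr()`.

```r
# hypothetical true partial correlations for the non-zero edges
edges <- c(0.05, 0.10, 0.15, 0.20, 0.30)
n <- 500            # sample size per replication
p <- 20             # number of nodes
alpha <- 0.05
replications <- 2

# power to detect each edge with a two-sided Fisher z-test
se    <- 1 / sqrt(n - (p - 2) - 3)
crit  <- qnorm(1 - alpha / 2)
ncp   <- atanh(edges) / se
power <- pnorm(ncp - crit) + pnorm(-ncp - crit)

# probability that an edge is detected in every replication
p_rep <- power^replications

# expected proportion of edges that replicate
mean(p_rep)

# probability that more than half of the edges replicate, approximated
# by simulating independent Bernoulli draws with unequal probabilities
set.seed(1)
n_sim <- 10000
draws <- matrix(rbinom(n_sim * length(edges), 1, rep(p_rep, each = n_sim)),
                ncol = length(edges))
mean(rowSums(draws) > length(edges) / 2)
```

Small edges contribute little to the expected count because their power is squared when two replications are required.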

ENR can also be plotted
```{r}
plot_enr(fit_enr)
```

### Intuition
Here is the basic idea of ENR
```{r}
# location of edges
index <- which(pcors[upper.tri(diag(20))] != 0)

# convert network into correlation matrix
diag(pcors) <- 1
cors_new <- corpcor::pcor2cor(pcors)

# replicated edges
R <- NA

# increase 1000 to, say, 5,000
for(i in 1:1000){

  # two replications
  Y1 <- MASS::mvrnorm(500, rep(0, 20), cors_new)
  Y2 <- MASS::mvrnorm(500, rep(0, 20), cors_new)

  # estimate network 1
  fit1 <- ggm_inference(Y1, boot = FALSE)

  # estimate network 2
  fit2 <- ggm_inference(Y2, boot = FALSE)

  # number of replicated edges (detected in both networks)
  R[i] <- sum(
    rowSums(
      cbind(fit1$adj[upper.tri(diag(20))][index],
            fit2$adj[upper.tri(diag(20))][index])
    ) == 2)
}
```
Notice that replication of two networks is being assessed over the long run. In other words,
if we draw two random samples, what is the expected replicability?

Compare analytic to simulation
```{r}
# combine simulation and analytic
cbind.data.frame(
  data.frame(simulation = sapply(seq(0, 0.9, 0.1), function(x) {
    mean(R > round(length(index) * x) )
  })),
  data.frame(analytic = round(fit_enr$cdf, 3))
)

# average replicability (simulation)
mean(R / length(index))

# average replicability (analytic)
fit_enr$ave_pwr
```

ENR works with any correlation, assuming there is an estimate of the standard error.

## Network plot
```{r, message=FALSE}
# data
Y <- ptsd

# estimate graph
fit <- ggm_inference(Y, boot = FALSE)

# plot the estimated network
plot(fit, edge_magnify = 5)
```

## Bug Reports, Feature Requests, and Contributing
Bug reports and feature requests can be made by opening an issue on [GitHub](https://github.com/donaldRwilliams/GGMnonreg/issues). To contribute to
the development of **GGMnonreg**, open a pull request from a branch and we can 
discuss the proposed changes there.

## References

Owner

  • Name: Donald R. Williams
  • Login: donaldRwilliams
  • Kind: user

JOSS Publication

GGMnonreg: Non-Regularized Gaussian Graphical Models in R
Published
November 11, 2021
Volume 6, Issue 67, Page 3308
Authors
Donald R. Williams
Department of Psychology, University of California, Davis; NWEA, Portland, USA
Editor
Arfon Smith ORCID
Tags
Graphical models partial correlations Mixed graphical model Ising model

GitHub Events

Total
  • Issues event: 3
  • Watch event: 1
Last Year
  • Issues event: 3
  • Watch event: 1

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 156
  • Total Committers: 2
  • Avg Commits per committer: 78.0
  • Development Distribution Score (DDS): 0.122
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Donald R. Williams b****9@g****m 137
donaldRwilliams y****u@e****m 19

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 3
  • Total pull requests: 7
  • Average time to close issues: over 2 years
  • Average time to close pull requests: 6 days
  • Total issue authors: 3
  • Total pull request authors: 3
  • Average comments per issue: 0.33
  • Average comments per pull request: 0.14
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • KarolineHuth (1)
  • lauraeong (1)
  • guhjy (1)
Pull Request Authors
  • donaldRwilliams (5)
  • AlexChristensen (2)
  • SachaEpskamp (1)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

DESCRIPTION cran
  • R >= 4.0.0 depends
  • GGMncv * imports
  • GGally * imports
  • MASS * imports
  • Matrix * imports
  • Rdpack * imports
  • bestglm * imports
  • corpcor * imports
  • doParallel * imports
  • foreach * imports
  • ggplot2 * imports
  • methods * imports
  • network * imports
  • parallel * imports
  • poibin * imports
  • psych * imports
  • sna * imports
  • stats * imports
  • qgraph * suggests