GGMnonreg

GGMnonreg: Non-Regularized Gaussian Graphical Models in R - Published in JOSS (2021)

https://github.com/donaldrwilliams/ggmnonreg

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

○
CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
✓
DOI references
Found 4 DOI reference(s) in README and JOSS metadata
✓
Academic publication links
Links to: zenodo.org
○
Committers with academic emails
○
Institutional organization owner
✓
JOSS paper metadata
Published in Journal of Open Source Software

Scientific Fields

Sociology Social Sciences - 40% confidence

Last synced: 6 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: donaldRwilliams
License: gpl-2.0
Language: R
Default Branch: master
Size: 1.1 MB

Statistics

Stars: 7
Watchers: 1
Forks: 6
Open Issues: 3
Releases: 0

Created over 7 years ago · Last pushed about 2 years ago

Metadata Files

Readme License

README.Rmd

---
output: github_document
bibliography: inst/REFERENCES.bib
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "75%"
)
```



# GGMnonreg: Non-regularized Gaussian Graphical Models

[![CRAN Version](http://www.r-pkg.org/badges/version/GGMnonreg)](https://cran.r-project.org/package=GGMnonreg)
[![Downloads](https://cranlogs.r-pkg.org/badges/GGMnonreg)](https://cran.r-project.org/package=GGMnonreg)
[![CircleCI build status](https://circleci.com/gh/donaldRwilliams/GGMnonreg.svg?style=shield)](https://circleci.com/gh/donaldRwilliams/GGMnonreg)
[![zenodo](https://zenodo.org/badge/DOI/10.5281/zenodo.5668161.svg)](https://doi.org/10.5281/zenodo.5668161)


The goal of **GGMnonreg** is to estimate non-regularized graphical models. Note that 
the title is a bit of a misnomer, in that Ising and mixed graphical models are also supported.

Graphical modeling is quite common in fields with *wide* data, that is, when there are more 
variables than observations. Accordingly, many regularization-based approaches have been developed for those kinds of data. There are key drawbacks of regularization when the goal is inference, 
including, but not limited to, the fact that obtaining a valid measure of parameter uncertainty is very (very) difficult.

More recently, graphical modeling has emerged in psychology [@Epskamp2018ggm], where the data 
is typically long or low-dimensional [*p* < *n*; @williams2019nonregularized; @williams_rethinking]. The primary purpose of  **GGMnonreg** is to provide methods specifically for low-dimensional data 
(e.g., those common to psychopathology networks).

## Supported Models
* Gaussian graphical model. The following data types are supported.
  + Gaussian 
  + Ordinal
  + Binary
* Ising model [@marsman_2018]
* Mixed graphical model

## Additional methods
The following are also included

* Expected network replicability [@williams2020learning]
* Compare Gaussian graphical models
* Measure of parameter uncertainty [@williams2019nonregularized]
* Edge inclusion "probabilities"
* Network visualization
* Constrained precision matrix (the network, given an assumed graph)
* Predictability (variance explained)

## Installation
To install the latest release version (1.1.0) from CRAN use

```r
install.packages("GGMnonreg")    
```

You can install the development version from [GitHub](https://github.com/) with:
``` r
# install.packages("devtools")
devtools::install_github("donaldRwilliams/GGMnonreg")
```

## Ising
An Ising model is fitted with the following
```{r}
library(GGMnonreg)

# make binary
Y <- ifelse(ptsd[,1:5] == 0, 0, 1)

# fit model
fit <- ising_search(Y, IC = "BIC", 
                    progress = FALSE)

fit
```

Note the same code, more or less, is also used for GGMs and mixed graphical models.

## Predictability
It is common to compute predictability, or variance explained, for each node in the network.
An advantage of **GGMnonreg** is that a measure of uncertainty is also provided.

```{r}
# data
Y <- na.omit(bfi[,1:5])

# fit model
fit <- ggm_inference(Y, boot = FALSE)

# predictability
predictability(fit)
```


## Parameter Uncertainty
Confidence intervals for each relation are obtained with
```{r}
# data
Y <- na.omit(bfi[,1:5])

# fit model
fit <- ggm_inference(Y, boot = TRUE, 
                     method = "spearman", 
                     B = 100, progress = FALSE)

confint(fit)
```
These can then be plotted with, say, **ggplot2** (left to the user).

## Edge Inclusion
When mining data, or performing an automatic search, it is difficult to make inference on the
network parameters (e.g., confidence are not easily computed). To summarize data mining,
**GGMnonreg** provides edge inclusion "probabilities" (proportion bootstrap samples for 
which each relation was detected).

```{r}
# data
Y <- na.omit(bfi[,1:5])

# fit model
fit <-  eip(Y, method = "spearman", 
            B  = 100, progress = FALSE)

fit
```
Note in all cases, the provided estimates correspond to the upper-triangular elements
of the network.

## Expected Network Replicability
**GGMnonreg** allows for computing expected network replicability (ENR), i.e., the number of 
effects that will be detected in any number of replications. This is an analytic solution.

The first step is defining a true network
```{r}
# first make the true network
cors <- cor(GGMnonreg::ptsd)

# inverse
inv <- solve(cors)

# partials
pcors <-  -cov2cor(inv)

# set values to zero
pcors <- ifelse(abs(pcors) < 0.05, 0, pcors)
```

Then obtain ENR
```{r}
fit_enr <- enr(net = pcors, n = 500, replications = 2)

fit_enr
```
Note this is inherently frequentist. As such, over the long run, 45 % of the edges will be replicated on average. Then we can further infer that, in hypothetical replication attempts, more than half of the edges
will be replicated only 5 % of the time.

ENR can also be plotted
```{r}
plot_enr(fit_enr)
```

### Intuition
Here is the basic idea of ENR
```{r}
# location of edges
index <- which(pcors[upper.tri(diag(20))] != 0)

# convert network into correlation matrix
diag(pcors) <- 1
cors_new <- corpcor::pcor2cor(pcors)

# replicated edges
R <- NA

# increase 1000 to, say, 5,000
for(i in 1:1000){

  # two replications
  Y1 <- MASS::mvrnorm(500, rep(0, 20), cors_new)
  Y2 <- MASS::mvrnorm(500, rep(0, 20), cors_new)

  # estimate network 1
  fit1 <- ggm_inference(Y1, boot = FALSE)

  # estimate network 2
  fit2 <- ggm_inference(Y2, boot = FALSE)

  # number of replicated edges (detected in both networks)
  R[i] <- sum(
    rowSums(
      cbind(fit1$adj[upper.tri(diag(20))][index],
            fit2$adj[upper.tri(diag(20))][index])
    ) == 2)
}
```
Notice that replication of two networks is being assessed over the long run. In other words,
if we draw two random samples, what is the expected replicability.

Compare analytic to simulation
```{r}
# combine simulation and analytic
cbind.data.frame(
  data.frame(simulation = sapply(seq(0, 0.9, 0.1), function(x) {
    mean(R > round(length(index) * x) )
  })),
  data.frame(analytic = round(fit_enr$cdf, 3))
)

# average replicability (simulation)
mean(R / length(index))

# average replicability (analytic)
fit_enr$ave_pwr
```

ENR works with any correlation, assuming there is an estimate of the standard error.

## Network plot
```{r, message=FALSE}
# data
Y <- ptsd

# estimate graph
fit <- ggm_inference(Y, boot = FALSE)

# get info for plotting
plot(fit, edge_magnify = 5)
```

## Bug Reports, Feature Requests, and Contributing
Bug reports and feature requests can be made by opening an issue on [Github](https://github.com/donaldRwilliams/GGMnonreg/issues). To contribute towards
the development of **GGMnonreg**, you can start a branch with a pull request and we can 
discuss the proposed changes there.

## References

Owner

Name: Donald R. Williams
Login: donaldRwilliams
Kind: user

Repositories: 10
Profile: https://github.com/donaldRwilliams

JOSS Publication

GGMnonreg: Non-Regularized Gaussian Graphical Models in R

Published

November 11, 2021

DOI

10.21105/joss.03308

Volume 6, Issue 67, Page 3308

Authors

Donald R. Williams
Department of Psychology, University of California, Davis, NWEA, Portland, USA

Editor

Arfon Smith

GitHub Events

Total

Issues event: 3
Watch event: 1

Last Year

Issues event: 3
Watch event: 1

Committers

Last synced: 7 months ago

All Time

Total Commits: 156
Total Committers: 2
Avg Commits per committer: 78.0
Development Distribution Score (DDS): 0.122

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Donald R. Williams	b**9@g**m	137
donaldRwilliams	y**u@e**m	19

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 3
Total pull requests: 7
Average time to close issues: over 2 years
Average time to close pull requests: 6 days
Total issue authors: 3
Total pull request authors: 3
Average comments per issue: 0.33
Average comments per pull request: 0.14
Merged pull requests: 5
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 2
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 2
Pull request authors: 0
Average comments per issue: 0.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

View more stats

Top Authors

Issue Authors

KarolineHuth (1)
lauraeong (1)
guhjy (1)

Pull Request Authors

donaldRwilliams (5)
AlexChristensen (2)
SachaEpskamp (1)

Top Labels

Issue Labels

Pull Request Labels

Dependencies

DESCRIPTION cran

R >= 4.0.0 depends
GGMncv * imports
GGally * imports
MASS * imports
Matrix * imports
Rdpack * imports
bestglm * imports
corpcor * imports
doParallel * imports
foreach * imports
ggplot2 * imports
methods * imports
network * imports
parallel * imports
poibin * imports
psych * imports
sna * imports
stats * imports
qgraph * suggests

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Open Source Science

GGMnonreg

Science Score: 93.0%

Scientific Fields

Repository

Basic Info

Statistics

Metadata Files

README.Rmd

Owner

JOSS Publication

GGMnonreg: Non-Regularized Gaussian Graphical Models in R

Authors

Editor

Tags

GitHub Events

Total

Last Year

Committers

All Time

Past Year

Top Committers

Issues and Pull Requests

All Time

Past Year

Top Authors

Issue Authors

Pull Request Authors

Top Labels

Issue Labels

Pull Request Labels

Dependencies