bionetworks_estimations
Data and scripts for "Testing biological network motif significance with exponential random graph models"
Statistics
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 6
Metadata Files
README.md
Data and scripts for "Testing biological network motif significance with exponential random graph models" (2021) and "New network models facilitate analysis of biological networks" (2023)
Testing biological network motif significance with exponential random graph models
Analysis of the structure of biological networks often uses statistical tests to establish the over-representation of motifs, which are thought to be important building blocks of such networks, related to their biological functions. However, there is disagreement as to the statistical significance of these motifs, and there are potential problems with standard methods for estimating this significance. Exponential random graph models (ERGMs) are a class of statistical model that can overcome some of the shortcomings of commonly used methods for testing the statistical significance of motifs. ERGMs were first introduced into the bioinformatics literature over ten years ago but have had limited application to biological networks, possibly due to the practical difficulty of estimating model parameters. Advances in estimation algorithms now afford analysis of much larger networks in practical time. We illustrate the application of ERGM to both an undirected protein-protein interaction (PPI) network and directed gene regulatory networks. ERGM models indicate over-representation of triangles in the PPI network, and confirm results from previous research as to over-representation of transitive triangles (feed-forward loop) in an E. coli and a yeast regulatory network. We also confirm, using ERGMs, previous research showing that under-representation of the cyclic triangle (feedback loop) can be explained as a consequence of other topological features.
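To make the two directed triangle motifs named above concrete, here is a minimal Python sketch that counts transitive triangles (feed-forward loops) and cyclic triangles (feedback loops) in a small arc list. It is an illustration only and is not taken from the scripts in this repository: the function name `count_triangle_motifs` and the toy network are hypothetical, the counts are non-induced (extra arcs among the three nodes are ignored), and the brute-force loop is suitable only for tiny graphs. Raw counts like these are the statistics an ERGM conditions on; on their own they say nothing about significance.

```python
from itertools import permutations


def count_triangle_motifs(edges):
    """Return (transitive, cyclic) triangle counts for a directed arc list.

    transitive: ordered triples (a, b, c) with arcs a->b, b->c and a->c
                (the feed-forward loop).
    cyclic:     node triples with arcs a->b, b->c and c->a
                (the feedback loop).
    """
    # Drop self-loops and duplicate arcs.
    arcs = {(u, v) for u, v in edges if u != v}
    nodes = {n for arc in arcs for n in arc}

    transitive = 0
    cyclic_rotations = 0
    for a, b, c in permutations(nodes, 3):
        if (a, b) in arcs and (b, c) in arcs:
            if (a, c) in arcs:
                # The roles of a, b and c are distinct, so each
                # feed-forward loop matches exactly one ordered triple.
                transitive += 1
            if (c, a) in arcs:
                # Each cycle is seen once per rotation of (a, b, c).
                cyclic_rotations += 1

    return transitive, cyclic_rotations // 3


if __name__ == "__main__":
    # Hypothetical toy network: one feed-forward loop (X, Y, Z)
    # and one feedback loop (Y -> Z -> W -> Y).
    toy = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W"), ("W", "Y")]
    print(count_triangle_motifs(toy))
```

Dividing the cyclic count by three removes the rotational duplicates; the transitive count needs no such correction because the three node roles differ.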
New network models facilitate analysis of biological networks
Exponential-family random graph models (ERGMs) are a family of network models originating in social network analysis, which have also been applied to biological networks. Advances in estimation algorithms have increased the practical scope of these models to larger networks; however, it is still not always possible to estimate a model without encountering problems of model near-degeneracy, particularly if it is desired to use only simple model parameters, rather than more complex parameters designed to overcome the problem of near-degeneracy. Two new network models related to the ERGM, the Tapered ERGM and the latent order logistic (LOLOG) model, have recently been proposed to overcome this problem. In this work I illustrate the application of the Tapered ERGM and the LOLOG to a set of biological networks, including protein-protein interaction (PPI) networks, gene regulatory networks, and neural networks. I find that the Tapered ERGM and the LOLOG are able to estimate models for networks for which it was not possible to estimate a conventional ERGM, and are able to do so using only simple model parameters. In the case of two neural networks where data on the spatial position of neurons is available, this allows the estimation of models including terms for spatial distance and triangle structures, allowing triangle motif statistical significance to be estimated while accounting for the effect of spatial proximity on connection probability. For some larger networks, however, Tapered ERGM and LOLOG estimation was not possible in practical time, while conventional ERGMs could be estimated only by using the Equilibrium Expectation (EE) algorithm.
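For orientation, the conventional ERGM and the Tapered ERGM can be written in the following commonly used forms. The notation is chosen here for illustration and is not taken from this repository: x is a network, z_A(x) is the count of configuration A (for example, triangles), theta_A is its parameter, and kappa(theta) is the normalising constant. The tapered form is a sketch from the general description of tapering as a quadratic penalty on deviations of the statistics from target values mu_A with tapering strengths tau_A; consult the cited papers and the ergm.tapered documentation for the exact definition.

```latex
% Conventional ERGM
\Pr(X = x) = \frac{1}{\kappa(\theta)}
  \exp\Bigl( \sum_{A} \theta_A \, z_A(x) \Bigr)

% Tapered ERGM (sketch): the same linear term with a quadratic
% penalty that discourages statistics far from \mu_A
\Pr(X = x) \propto
  \exp\Bigl( \sum_{A} \theta_A \, z_A(x)
           - \sum_{A} \tau_A \bigl( z_A(x) - \mu_A \bigr)^{2} \Bigr)
```

In either form, a significantly positive theta_A for a triangle statistic indicates that the motif is over-represented given the other terms in the model, which is how a spatial distance term can be accounted for when testing triangle motifs.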
Software
The data and scripts in this repository were originally imported from the ergmbionetworksdatascripts.tar.gz archive available from https://sites.google.com/site/alexdstivala/home/ergmbionetworks.
The EstimNetDirected software for estimating ERGM parameters for directed networks (which now also handles undirected and bipartite networks) is available from https://sites.google.com/site/alexdstivala/home/estimnetdirected or GitHub: https://github.com/stivalaa/EstimNetDirected.
The older Estimnet software for estimating ERGM parameters for undirected networks is available from https://web.archive.org/web/20221007014617/http://www.estimnet.org/. EstimNetDirected can now, despite its name, also estimate ERGMs for undirected and bipartite networks.
The statnet software collection of R packages is available from CRAN at https://cran.r-project.org/web/packages/statnet/index.html
The latent order logistic model (LOLOG) R package is available from CRAN at https://cran.r-project.org/web/packages/lolog/index.html
The ergm.tapered R package is available via GitHub at https://github.com/statnet/ergm.tapered
References
Stivala, A. (2023). New network models facilitate analysis of biological networks. arXiv preprint arXiv:2312.06047 https://arxiv.org/abs/2312.06047
Stivala, A. & Lomi, A. (2021). Testing biological network motif significance with exponential random graph models. Applied Network Science 6:91. https://doi.org/10.1007/s41109-021-00434-y
Owner
- Name: Alex Stivala
- Login: stivalaa
- Kind: user
- Website: https://sites.google.com/site/alexdstivala/
- Repositories: 3
- Profile: https://github.com/stivalaa
Research fellow, Università della Svizzera italiana (Switzerland)
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Stivala"
  given-names: "Alex"
  orcid: "https://orcid.org/0000-0002-2442-4743"
- family-names: "Lomi"
  given-names: "Alessandro"
title: "Testing biological network motif significance with exponential random graph models"
url: "https://github.com/stivalaa/bionetworks_estimations"
preferred-citation:
  type: article
  authors:
  - family-names: "Stivala"
    given-names: "Alex"
    orcid: "https://orcid.org/0000-0002-2442-4743"
  - family-names: "Lomi"
    given-names: "Alessandro"
  title: "Testing biological network motif significance with exponential random graph models"
  doi: 10.1007/s41109-021-00434-y
  journal: Applied Network Science
  year: 2021
  volume: 6
  start: 91