dag_maxpathhomology
Compute the maximal path homology for unweighted directed acyclic graphs.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
○Academic publication links
-
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (12.8%) to scientific vocabulary
Repository
Compute the maximal path homology for unweighted directed acyclic graphs.
Basic Info
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
MaxPathHomology
Zhu, Z. and Chi, Z. (2024). "Recursive Computation of Path Homology for Stratified Digraphs".
This Python script is an implementation of the above paper that recursively computes the maximal (reduced) path homology of an unweighted stratified graph or unweighted directed acyclic graph (DAG).
Installation
To get started with this project, follow the steps below to install the necessary dependencies:
- Python: Ensure that Python 3.8 or higher is installed. You can check your Python version by running:
bash python --version - Clone the repository:
bash git clone https://github.com/zhengtongzhu/DAG_MaxPathHomology.git cd DAG_MaxPathHomology - Install dependencies:
bash pip install -r requirements.txt Run the project:
bash python recursive_algorithm.pyYou will see the following output from the test example:
python lp: 2, sum_dim: 3, basis: [ [(-b0 + b1)*((-c0 + c1)*(-c3 + d1) + (c0 - c1)*(-c3 + d0)), (-a0 + a1)*(-b2 + b3)*(c2 - c3)], [(-b4 + b5)*(-c4 + c5)*(d2 - d3)] ]
dag_preprocess.py
The graph_preprocess.py module provides functions to compute l(G) and the stratified decomposition $G_*$ for a given unweighted DAG without multiple edges. These methods are described in Section 5.2 of the paper.
Example Usage
Consider a DAG G with the following edgelist:
python
edgelist = [('a0', 'b2'), ('a0', 'b3'), ('a1', 'b2'), ('a1', 'b3'), ('b4', 'd1'),
('a1', 'c1'), ('b0', 'c0'), ('b0', 'c1'), ('b1', 'c0'), ('b1', 'c1'),
('b2', 'c2'), ('b2', 'c3'), ('b3', 'c2'), ('b3', 'c3'), ('b4', 'c4'),
('b4', 'c5'), ('b5', 'c4'), ('b5', 'c5'), ('b0', 'c2'), ('b0', 'c3'),
('b1', 'c2'), ('b1', 'c3'), ('b4', 'c2'), ('b4', 'c3'), ('b5', 'c2'),
('b5', 'c3'), ('c0', 'd0'), ('c0', 'd1'), ('c1', 'd0'), ('c1', 'd1'),
('c4', 'd2'), ('c4', 'd3'), ('c5', 'd2'), ('c5', 'd3'), ('a2', 'b4')]
each tuple (a, b) in this edgelist represents a directed unweighted edge in G from node a to node b. Below is a visualization of G:

The dag_process function first check if the DAG G contains multi-edges or has a loop (based on NetworkX):
python
if len(edgelist) != len(set(edgelist)):
raise ValueError("Error: The graph has duplicate edges.")
G = nx.DiGraph(edgelist)
if not nx.is_directed_acyclic_graph(G):
raise ValueError("Error: The graph must be a DAG.")
then compute the longest path length of G (by daglongestpath_length):
python
lp = nx.dag_longest_path_length(G)
We repeatedly prune the graph using prune and $G*$-algorithm (`lpedgelist) until the graph structure can no longer be simplified. The repetition of thepruneandlpedgelistprovides a set of weakly connected components{Gi}. These{Gi}are stratified graphs, and computing the direct sum of the maximal path homology of allGiis equivalent to computing the maximal path homology ofG, which are guaranteed by Corollary 3.5 and Proposition 3.7 of the paper. For theedgelistabove,10` edges are removed after pruning:
('a2', 'b4'): removed byprune.('b0', 'c2'),('b0', 'c3'),('b1', 'c2'),('b1', 'c3'),('b4', 'c2'),('b4', 'c3'),('b5', 'c2'),('b5', 'c3'),('b4', 'd1'): removed bylp_edgelist.
The original graph G is splitted into two weakly connected components:
```python
G_0 contains the following edges:
[('a0', 'b2'), ('a0', 'b3'), ('a1', 'b2'), ('a1', 'b3'), ('a1', 'c1'),
('b2', 'c2'), ('b2', 'c3'), ('b3', 'c2'), ('b3', 'c3'),
('b0', 'c0'), ('b0', 'c1'), ('b1', 'c0'), ('b1', 'c1'),
('c0', 'd0'), ('c0', 'd1'), ('c1', 'd0'), ('c1', 'd1')]
G_1 contains the following edges: [('b4', 'c4'), ('b4', 'c5'), ('b5', 'c4'), ('b5', 'c5'), ('c4', 'd2'), ('c4', 'd3'), ('c5', 'd2'), ('c5', 'd3')] ``` The visual representation of these two components is shown below:
G_0
G_1
The dag_process returns 5 components of all G_i:
python
subgraph_dict, node_counts, lp, num_graph, graph_list = dag_process(edgelist)
- The $i^{th}$ element of
subgraph_dictis a dictionary representsG_i, where each key is a layer index (start from0) and the value corresponding to the key is the set of nodes in that layer. For example, in theedgelistabove, thesubgraph_dictis:
python
subgraph_dict = [
{0: {'b1', 'a1', 'b0', 'a0'}, 1: {'c1', 'b2', 'b3', 'c0'}, 2: {'c2', 'c3', 'd1', 'd0'}},
{0: {'b5', 'b4'}, 1: {'c5', 'c4'}, 2: {'d3', 'd2'}}
]
c2 is a node in G_0's 2nd layer and c4 is a node in G_1's 1st layer.
- The
i$^{th}$ element ofnode_countsrepresents the number of nodes in different layer ofG_i. From thesubgraph_dictabove we know that
python
node_counts = [
[4, 4, 4],
[2, 2, 2]
]
where [4, 4, 4] means G_0 has 4 nodes in layer 0, 4 nodes in layer 1, and 4 nodes in layer 2.
lpis the longest path length ofGas described above (lp = l(G)). Herelp = 2.num_graphis the cardinality of{G_i}. Herenum_graph = 2.graph_listis a list where each elementG_iis stored as an instance of thenx.DiGraphclass.
recursive_algrithm.py
The recursive_algrithm.py includes the main algorithm max_path_homology to compute the maximal path homology for an unweighted stratified graph. Along with graph_preprocess.py, it also works for unweighted DAGs.
The max_path_homology returns lp, sum_dim and basis.
python
lp, sum_dim, basis = max_path_homology(edgelist, calculate_basis)
Here:
- lp = l(G) is the longest path length of G.
- sum_dim is the sum of the Betti Numbers of the lp-dimentional (maximal) path homologies of all G_i.
- If the input calculate_basis == True, then basis returns a basis of the lp-dimensional (maximal) reduced path homology of all G_i, otherwise it returns None.
If calculate_basis == True, the outputs are:
python
lp = 2
sum_dim = 3
basis = [
[(-b0 + b1)*((-c0 + c1)*(-c3 + d1) + (c0 - c1)*(-c3 + d0)),
(-a0 + a1)*(-b2 + b3)*(c2 - c3)],
[(-b4 + b5)*(-c4 + c5)*(d2 - d3)]
]
general_algorithm.py
The general_algorithm.py is based on an implementation from the following project:
Carranza, D., Doherty, B., Kapulkin, K., Opie, M., Sarazola, M., & Wong, L. Z. (2022). Python script for computing path homology of digraphs (Version 1.0.0) [Computer software]. https://github.com/sheaves/path_homology.
This script implements a general algorithm for computing reduced path homology. If you use this project or its comparative implementation in your work, please also cite the above project to acknowledge their contributions.
Other Modules
maxpph.py: Produces a decreasing persistence path homology plot (Section 6.2 of the paper).
experiment_func.py, stratified_gamma_1_2_3.py and stratified_gamma_4_5.py: Implement additional functions and simulations discussed in Section 6.1 of the paper.
maxph_matrix.py: Contains functions for matrix operations.
Citation
If you find this code useful, please cite it using the following BibTeX entry:
bibtex
@software{Zhu_Computing_the_maximal_2024,
author = {Zhu, Zhengtong and Chi, Zhiyi},
month = dec,
title = {{Computing the maximal path homology of directed acyclic graph}},
url = {https://github.com/zhengtongzhu/DAG_MaxPathHomology},
version = {1.0.0},
year = {2024}
}
Owner
- Name: Zhengtong Zhu
- Login: zhengtongzhu
- Kind: user
- Repositories: 1
- Profile: https://github.com/zhengtongzhu
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you think this software is helpful, please cite it as
follows."
title: "Computing the maximal path homology of directed acyclic graph"
type: software
authors:
- given-names: Zhengtong
family-names: Zhu
- given-names: Zhiyi
family-names: Chi
repository-code: 'https://github.com/zhengtongzhu/DAG_MaxPathHomology'
version: 1.0.0
date-released: '2024-12-07'
GitHub Events
Total
- Public event: 1
- Push event: 27
Last Year
- Public event: 1
- Push event: 27
Dependencies
- matplotlib ==3.8.3
- networkx ==3.2.1
- numpy ==2.1.3
- sympy ==1.12
- tqdm ==4.66.2