unified-graph-transformer

Unified Graph Transformer (UGT) is a novel Graph Transformer model specialised in preserving both local and global graph structures, developed by NS Lab @ CUK on a pure PyTorch backend.

https://github.com/nslab-cuk/unified-graph-transformer

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 6 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.0%) to scientific vocabulary

Keywords

graph graph-neural-networks graph-representation-learning graph-transformer pretrained-graph-model self-supervised-graph-learning structure-preserving-graph-transformer transformers
Last synced: 6 months ago

Repository

Unified Graph Transformer (UGT) is a novel Graph Transformer model specialised in preserving both local and global graph structures, developed by NS Lab @ CUK on a pure PyTorch backend.

Basic Info
Statistics
  • Stars: 25
  • Watchers: 3
  • Forks: 5
  • Open Issues: 0
  • Releases: 1
Topics
graph graph-neural-networks graph-representation-learning graph-transformer pretrained-graph-model self-supervised-graph-learning structure-preserving-graph-transformer transformers
Created over 2 years ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.md

Unified Graph Transformer

Unified Graph Transformer (UGT) is a novel Graph Transformer model specialised in preserving both local and global graph structures, developed by NS Lab, CUK on a pure PyTorch backend. The paper is available in the AAAI Proceedings.



1. Overview

Over the past few years, graph neural networks and graph transformers have been successfully used to analyze graph-structured data, mainly focusing on node classification and link prediction. However, existing studies mostly consider only local connectivity while ignoring long-range connectivity and the roles of nodes. We propose Unified Graph Transformer Networks (UGT), which effectively integrate local and global structural information into fixed-length vector representations. UGT learns local structure by identifying local substructures and aggregating features of the k-hop neighborhood of each node. We construct virtual edges, bridging distant nodes with structural similarity, to capture long-range dependencies. UGT learns unified representations through self-attention, encoding structural distance and the p-step transition probability between node pairs. Furthermore, we propose a self-supervised learning task that learns transition probabilities to fuse local and global structural features, which can then be transferred to other downstream tasks. Experimental results on real-world benchmark datasets over various downstream tasks show that UGT significantly outperforms state-of-the-art baselines. In addition, UGT reaches the expressive power of the third-order Weisfeiler-Lehman test (3-WL) in distinguishing non-isomorphic graph pairs.


[Figure] Graph Transformer Architecture: the overall architecture of Unified Graph Transformer Networks.
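To make the transition-probability ingredient above concrete, here is a minimal PyTorch sketch that computes the 1..k-step random-walk transition probabilities between node pairs from a dense adjacency matrix. This is an illustrative reconstruction under stated assumptions, not the repository's actual implementation.

```
import torch

def transition_probabilities(adj: torch.Tensor, k: int) -> torch.Tensor:
    """Stack the 1..k-step random-walk transition matrices of a graph.

    adj: dense (N, N) adjacency matrix.
    Returns a (k, N, N) tensor where entry [p, i, j] is the probability
    of reaching node j from node i in exactly p + 1 steps.
    """
    deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)  # avoid division by zero for isolated nodes
    p1 = adj / deg                                     # one-step transition matrix D^-1 A
    steps = [p1]
    for _ in range(k - 1):
        steps.append(steps[-1] @ p1)                   # p-step = (p-1)-step @ one-step
    return torch.stack(steps)

# Tiny example: a path graph 0-1-2
adj = torch.tensor([[0., 1., 0.],
                    [1., 0., 1.],
                    [0., 1., 0.]])
probs = transition_probabilities(adj, k=2)
print(probs[1])  # two-step transition probabilities
```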

2. Reproducibility

Datasets and Tasks

The package Node_level_tasks contains the modules required for the node clustering and node classification tasks, the package Graph_classification is for the graph classification task, and the package IsomorphismTesting is for isomorphism testing.

For the node-level tasks, we used eleven publicly available datasets grouped into three domains: air-traffic networks (Brazil, Europe, and USA), webpage networks (Chameleon, Squirrel, Actor, Cornell, Texas, and Wisconsin), and citation networks (Cora and Citeseer). For the graph classification task, we used four publicly available datasets from TUDataset: Enzymes, Proteins, NCI1, and NCI9. For isomorphism testing, we used Graph8c and five Strongly Regular Graph datasets (SR16622, SR251256, SR261034, SR281264, SR401224), which contain 1-WL-equivalent and 3-WL-equivalent graph pairs, respectively. The datasets are downloaded automatically.
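As an illustration of the automatic download, the TUDataset graph-classification benchmarks can be fetched with PyTorch Geometric's built-in loader (shown here with PROTEINS; the repository's own data pipeline may wrap this differently):

```
from torch_geometric.datasets import TUDataset

# Downloads the dataset on first use and caches it under ./data
dataset = TUDataset(root='data', name='PROTEINS')
print(dataset)              # PROTEINS(1113)
print(dataset[0])           # first graph: Data(edge_index=..., x=..., y=...)
print(dataset.num_classes)  # 2
```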

Requirements and Environment Setup

The source code was developed in Python 3.8.8. UGT is built on PyTorch Geometric 2.3.1 and DGL 1.1.0. Please refer to the official websites for installation and setup. All requirements are included in the environment.yml file.

```
# Conda installation

# Install python environment
conda env create -f environment.yml
```
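Once the environment is created and activated, a quick sanity check is to import the core dependencies and confirm the versions named above (the exact pins live in environment.yml):

```
# Sanity check that the core dependencies resolved correctly
import torch
import torch_geometric
import dgl

print(torch.__version__)            # version pinned in environment.yml
print(torch_geometric.__version__)  # expected: 2.3.1
print(dgl.__version__)              # expected: 1.1.0
```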

Hyperparameters

The following options can be passed to exp_10.py (a sketch of a matching argument parser appears after this list):

--dataset: The name of the dataset. For example: --dataset cora

--lr: Learning rate for training the model. For example: --lr 0.001

--epochs: Number of epochs for training the model. For example: --epochs 1000

--layers: Number of layers for training the model. For example: --layers 4

--task: The specific task. For example: --task pre_training

--pre_load: The processing mode. For example: --pre_load 1 to pre-process the graph, --pre_load 0 to train on the processed data

--dims: The dimension of the hidden vectors. For example: --dims 16

--k_transition: The number of transition steps. For example: --k_transition 6

--k_hop: The number of hops for sampling. For example: --k_hop 2

--alpha: Hyperparameter for the transition reconstruction loss. For example: --alpha 0.4

--beta: Hyperparameter for the feature reconstruction loss. For example: --beta 0.6
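For reference, here is a hypothetical argparse sketch matching the options above; flag names and defaults mirror this list, and the actual exp_10.py may differ:

```
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction of the CLI described above; defaults are illustrative.
    parser = argparse.ArgumentParser(description="UGT experiments")
    parser.add_argument("--dataset", type=str, default="cora", help="dataset name")
    parser.add_argument("--lr", type=float, default=0.001, help="learning rate")
    parser.add_argument("--epochs", type=int, default=1000, help="training epochs")
    parser.add_argument("--layers", type=int, default=4, help="number of layers")
    parser.add_argument("--task", type=str, default="pre_training", help="task to run")
    parser.add_argument("--pre_load", type=int, default=1, help="1: pre-process, 0: train")
    parser.add_argument("--dims", type=int, default=16, help="hidden dimension")
    parser.add_argument("--k_transition", type=int, default=6, help="number of transition steps")
    parser.add_argument("--k_hop", type=int, default=2, help="number of hops for sampling")
    parser.add_argument("--alpha", type=float, default=0.4, help="transition-loss weight")
    parser.add_argument("--beta", type=float, default=0.6, help="feature-loss weight")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```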

Node-level Tasks

The source code for node clustering and node classification is included in the Node_level_tasks folder. We run experiments on eleven public benchmark datasets: cora, citeseer, brazil, europe, usa, chameleon, squirrel, film, cornell, texas, and wisconsin. The datasets are downloaded automatically. Note that the node classification and node clustering tasks can be run after the pre-training task. We sample the k-hop neighbourhood and virtual edges in the pre-processing step and store the processed data in the pts and outputs folders. For example, one can first pre-train UGT on the cora dataset and then test on downstream tasks, such as node classification and node clustering:

```
cd Node_level_tasks

# Pre-processing data
python exp_10.py --dataset cora --pre_load 1

# Pre-training
python exp_10.py --dataset cora --task pre_training --pre_load 0

# Node classification task
python exp_10.py --dataset cora --task node_classification --pre_load 0 --lr 0.001 --dims 16 --k_hop 1 --num_layers 2 --k_transition 6 --alpha 0.5 --beta 0.5

# Node clustering task
python exp_10.py --dataset cora --task node_clustering --pre_load 0 --lr 0.001 --dims 16 --k_hop 1 --num_layers 2 --k_transition 6 --alpha 0.5 --beta 0.5
```

Graph-level Classification Task

The source code for the graph-level classification task is included in the Graph_classification folder. We run experiments on four public benchmark datasets from TUDataset: Enzymes, Proteins, NCI1, and NCI9. The datasets are downloaded automatically. For example, one can run UGT on the PROTEINS dataset:

```
cd Graph_classification

# Pre-processing
python exp_10.py --dataset PROTEINS --pre_load 1 --task graph_classification

# Graph classification task
python exp_10.py --dataset PROTEINS --pre_load 0 --task graph_classification
```

Isomorphism Testing

There are six graph datasets: Graph8c and five Strongly Regular Graph datasets (SR16622, SR251256, SR261034, SR281264, SR401224), which contain 1-WL-equivalent and 3-WL-equivalent graph pairs, respectively. For example, one can test the discriminative power of UGT on the SR16622 dataset:

```
cd IsomorphismTesting

python exp_10.py --dataset sr16622 --task iso_test
```
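Conceptually, the isomorphism test embeds every graph in a dataset and counts how many graph pairs receive distinguishable embeddings. Below is a minimal sketch of that comparison, assuming a hypothetical (num_graphs, dim) embedding tensor produced by the model; it is not the repository's evaluation code.

```
import torch

def count_distinguished_pairs(embeddings: torch.Tensor, tol: float = 1e-3) -> int:
    """Count graph pairs whose embeddings differ by more than `tol`.

    embeddings: (num_graphs, dim) tensor, one embedding per graph.
    A pair is 'distinguished' (judged non-isomorphic) if the L2 distance
    between its embeddings exceeds the tolerance.
    """
    dists = torch.cdist(embeddings, embeddings)  # pairwise L2 distances
    mask = torch.triu(torch.ones_like(dists, dtype=torch.bool), diagonal=1)  # pairs i < j
    return int(((dists > tol) & mask).sum())

# Toy usage with random stand-in embeddings for 4 graphs
emb = torch.randn(4, 16)
print(count_distinguished_pairs(emb), "of", 4 * 3 // 2, "pairs distinguished")
```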

3. Reference

:page_with_curl: Paper on AAAI Proceedings: DOI

:page_with_curl: Paper on arXiv: arXiv

:chart_with_upwards_trend: Experimental results on Papers With Code: PwC

:pencil: Blog post on Network Science Lab: Web

4. Citing UGT

Please cite our paper if you find UGT useful in your work:

```
@InProceedings{Hoang_2024,
  author    = {Hoang, Van Thuy and Lee, O-Joun},
  booktitle = {Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI 2024)},
  title     = {Transitivity-Preserving Graph Representation Learning for Bridging Local Connectivity and Role-Based Similarity},
  year      = {2024},
  address   = {Vancouver, Canada},
  editor    = {Michael Wooldridge and Jennifer Dy and Sriraam Natarajan},
  month     = feb,
  number    = {11},
  pages     = {12456--12465},
  publisher = {Association for the Advancement of Artificial Intelligence (AAAI)},
  volume    = {38},
  doi       = {10.1609/aaai.v38i11.29138},
  issn      = {2159-5399},
  url       = {https://doi.org/10.1609/aaai.v38i11.29138},
}

@misc{hoang2023ugt,
  title         = {Transitivity-Preserving Graph Representation Learning for Bridging Local Connectivity and Role-based Similarity},
  author        = {Van Thuy Hoang and O-Joun Lee},
  year          = {2023},
  eprint        = {2308.09517},
  archivePrefix = {arXiv},
  primaryClass  = {cs.LG}
}
```

Please also take a look at our community-aware graph transformer model, CGT, which can mitigate the degree bias problem of the message-passing mechanism.

5. Contributors




Owner

  • Name: NS Lab @ CUK
  • Login: NSLab-CUK
  • Kind: user
  • Location: Bucheon, Rep. of Korea
  • Company: The Catholic University of Korea

Network Science Lab, Dept. of Artificial Intelligence, The Catholic University of Korea

Citation (CITATION.cff)

cff-version: 1.0.0
date-released: 2024-02
message: "If you use this software, please cite it as below."
authors:
- family-names: "Hoang"
  given-names: "Van Thuy"
- family-names: "Lee"
  given-names: "O-Joun"
title: "Transitivity-Preserving Graph Representation Learning for Bridging Local Connectivity and Role-based Similarity"
url: "https://ojs.aaai.org/index.php/AAAI/article/view/29138"
preferred-citation:
  type: conference-paper
  conference:
    name: "Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI 2024)"
  authors:
    - family-names: "Hoang"
      given-names: "Van Thuy"
    - family-names: "Lee"
      given-names: "O-Joun"
  title: "Transitivity-Preserving Graph Representation Learning for Bridging Local Connectivity and Role-based Similarity"
  url: "https://ojs.aaai.org/index.php/AAAI/article/view/29138"
  year: 2024
  publisher: "Association for the Advancement of Artificial Intelligence"
  address: "Vancouver, Canada"

GitHub Events

Total
  • Watch event: 6
  • Push event: 2
  • Fork event: 1
Last Year
  • Watch event: 6
  • Push event: 2
  • Fork event: 1