community-aware-graph-transformer

Community-aware Graph Transformer (CGT) is a novel Graph Transformer model that utilizes community structures to address node degree biases in the message-passing mechanism. It was developed by NS Lab @ CUK on a pure PyTorch backend.

https://github.com/nslab-cuk/community-aware-graph-transformer

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org, ieee.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.4%) to scientific vocabulary

Keywords

degree-bias degree-fairness graph-algorithms graph-neural-networks graph-representation-learning graph-theory graph-transformer graphs node-classification pretrained-graph-model self-supervised-graph-learning transformer
Last synced: 6 months ago

Repository

Community-aware Graph Transformer (CGT) is a novel Graph Transformer model that utilizes community structures to address node degree biases in the message-passing mechanism. It was developed by NS Lab @ CUK on a pure PyTorch backend.

Basic Info
Statistics
  • Stars: 14
  • Watchers: 2
  • Forks: 2
  • Open Issues: 0
  • Releases: 1
Topics
degree-bias degree-fairness graph-algorithms graph-neural-networks graph-representation-learning graph-theory graph-transformer graphs node-classification pretrained-graph-model self-supervised-graph-learning transformer
Created about 2 years ago · Last pushed 6 months ago
Metadata Files
Readme License Citation

README.md

Community-aware Graph Transformer

Community-aware Graph Transformer (CGT) is a novel Graph Transformer model that utilizes community structures to address node degree biases in the message-passing mechanism. It was developed by NS Lab, CUK on a pure PyTorch backend. The paper is available on arXiv.



1. Overview

Recent augmentation-based methods showed that message-passing (MP) neural networks often perform poorly on low-degree nodes, leading to degree biases due to a lack of messages reaching low-degree nodes. Despite their success, most methods use heuristic or uniform random augmentations, which are non-differentiable and may not always generate valuable edges for learning representations. In this paper, we propose Community-aware Graph Transformers, namely CGT, to learn degree-unbiased representations based on learnable augmentations and graph transformers by extracting within-community structures. We first design a learnable graph augmentation to generate more within-community edges connecting low-degree nodes through edge perturbation. Second, we propose an improved self-attention to learn the underlying proximity and the roles of nodes within the community. Third, we propose a self-supervised learning task that learns representations that preserve the global graph structure and regularize the graph augmentations. Extensive experiments on various benchmark datasets showed that CGT outperforms state-of-the-art baselines and significantly mitigates node degree bias.
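To build intuition for the augmentation step described above, the sketch below scores candidate (non-existing) edges so that within-community pairs with low-degree endpoints score highest — the kind of edges the learnable augmentation is designed to generate. This is a hand-written illustrative heuristic only, not the authors' implementation (CGT learns such augmentations end-to-end through edge perturbation):

```python
def degrees(edges, n):
    """Node degrees of an undirected graph given as an edge list."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def candidate_scores(edges, community, n):
    """Score each non-edge: within-community pairs with low-degree
    endpoints get the highest scores; cross-community pairs get zero."""
    existing = {frozenset(e) for e in edges}
    deg = degrees(edges, n)
    scores = {}
    for u in range(n):
        for v in range(u + 1, n):
            if frozenset((u, v)) in existing:
                continue
            same = community[u] == community[v]
            scores[(u, v)] = (1.0 / (1 + deg[u]) + 1.0 / (1 + deg[v])) if same else 0.0
    return scores

# Toy graph: communities {0,1,2} and {3,4,5}; nodes 2 and 5 are low-degree.
edges = [(0, 1), (0, 2), (3, 4), (4, 5), (1, 3)]
community = [0, 0, 0, 1, 1, 1]
scores = candidate_scores(edges, community, 6)
best = max(scores, key=scores.get)  # a within-community edge reaching a low-degree node
```

The top-scoring candidate connects a low-degree node to a same-community neighbor, which is exactly where extra messages are most needed under degree bias.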


Graph Transformer Architecture
The overall architecture of Community-aware Graph Transformer.

2. Reproducibility

Datasets and Tasks

We used six publicly available datasets, grouped into three different domains: citation networks (Cora, Citeseer, and Pubmed), co-purchase networks (Amazon Computers and Photo), and a reference network (WikiCS). The datasets are automatically downloaded via PyTorch Geometric.

Requirements and Environment Setup

The source code was developed in Python 3.8.8. CGT is built using PyTorch Geometric 2.3.1 and DGL 1.1.0. Please refer to the official websites for installation and setup. All the requirements are included in the environment.yml file.
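For reference, an environment.yml for this setup might look roughly like the excerpt below. This is an illustrative sketch, not the repository's actual file; the pinned versions are taken from the dependency list on this page (torch 2.0.1, torch-geometric 2.3.1, torch-scatter 2.1.1, torch-sparse 0.6.17) and the DGL/Python versions stated above:

```yaml
name: cgt
dependencies:
  - python=3.8.8
  - pip
  - pip:
      - torch==2.0.1
      - torch-geometric==2.3.1
      - torch-scatter==2.1.1
      - torch-sparse==0.6.17
      - dgl==1.1.0
```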

```
# Conda installation

# Install the Python environment
conda env create -f environment.yml
```

Hyperparameters

The following options can be passed to exp.py:

--dataset: The name of dataset inputs. For example: --dataset cora

--lr: Learning rate for training the model. For example: --lr 0.001

--epochs: Number of epochs for pre-training the model. For example: --epochs 500

--run_times_fine: Number of epochs for fine-tuning the model. For example: --run_times_fine 500

--layers: Number of layers for model training. For example: --layers 4

--drop: Dropout rate. For example: --drop 0.5

--dims: The dimension of hidden vectors. For example: --dims 64

--k_transition: The number of transition steps. For example: --k_transition 3

--alpha: Hyperparameter for the degree-related score. For example: --alpha 0.1

--beta: Hyperparameter for the adjacency matrix score. For example: --beta 0.95

--alpha_1: Hyperparameter for the transition construction loss. For example: --alpha_1 0.5

--alpha_2: Hyperparameter for the feature construction loss. For example: --alpha_2 0.5

--alpha_3: Hyperparameter for the augmentation loss. For example: --alpha_3 0.5
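For intuition on the --k_transition option: a k-step transition matrix captures k-hop random-walk proximities between nodes. The stdlib-only sketch below computes T^k for a toy graph; it is purely illustrative (CGT's actual use of transition steps is defined in the source code and paper, not here):

```python
def transition_matrix(adj):
    """Row-normalize an adjacency matrix into one-step random-walk probabilities."""
    t = []
    for row in adj:
        s = sum(row)
        t.append([a / s if s else 0.0 for a in row])
    return t

def matmul(a, b):
    """Plain dense matrix product (stdlib only, for illustration)."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def k_step(adj, k):
    """k-step transition probabilities T^k (cf. the --k_transition option)."""
    t = transition_matrix(adj)
    out = t
    for _ in range(k - 1):
        out = matmul(out, t)
    return out

# Path graph 0 - 1 - 2: after two steps, a walk from node 0 is at 0 or 2
# with probability 0.5 each, and a walk from node 1 has returned to 1.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
p2 = k_step(adj, 2)
```

Each row of T^k sums to 1, so the entries can be read directly as k-hop visiting probabilities.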

How to run

The source code contains both pre-training and fine-tuning processes. The following command runs pre-training and then fine-tunes CGT on the Cora dataset for both node classification and clustering tasks.

```
python exp.py --dataset cora
```

3. Reference

:page_with_curl: Paper on IEEE TNSE: DOI 10.1109/TNSE.2025.3563697

:page_with_curl: Paper on arXiv: arXiv:2504.15075 and arXiv:2312.16788

:chart_with_upwards_trend: Experimental results on Papers With Code

:pencil: Blog post on Network Science Lab

4. Citing CGT

Please cite our paper if you find CGT useful in your work:

```
@Article{hoang2025mitigating_TNSE,
  author  = {Van Thuy Hoang and Hyeon-Ju Jeon and O-Joun Lee},
  journal = {IEEE Transactions on Network Science and Engineering},
  title   = {Mitigating Degree Bias in Graph Representation Learning with Learnable Structural Augmentation and Structural Self-Attention},
  year    = {2025},
  issn    = {2327-4697},
  volume  = {12},
  number  = {5},
  pages   = {3656--3670},
  doi     = {10.1109/TNSE.2025.3563697},
}

@misc{hoang2025mitigating,
  title         = {Mitigating Degree Bias in Graph Representation Learning with Learnable Structural Augmentation and Structural Self-Attention},
  author        = {Van Thuy Hoang and Hyeon-Ju Jeon and O-Joun Lee},
  year          = {2025},
  eprint        = {2504.15075},
  archivePrefix = {arXiv},
  primaryClass  = {cs.AI}
}

@misc{hoang2023mitigating,
  title         = {Mitigating Degree Biases in Message Passing Mechanism by Utilizing Community Structures},
  author        = {Van Thuy Hoang and O-Joun Lee},
  year          = {2023},
  eprint        = {2312.16788},
  archivePrefix = {arXiv},
  primaryClass  = {cs.LG}
}
```

Please also take a look at our structure-preserving graph transformer model, UGT, whose expressive power is as high as 3-WL.

5. Contributors




Owner

  • Name: NS Lab @ CUK
  • Login: NSLab-CUK
  • Kind: user
  • Location: Bucheon, Rep. of Korea
  • Company: The Catholic University of Korea

Network Science Lab, Dept. of Artificial Intelligence, The Catholic University of Korea

Citation (CITATION.cff)

cff-version: 1.0.0
date-released: 2025-04
message: "If you use this software, please cite it as below."
authors:
- family-names: "Hoang"
  given-names: "Van Thuy"
- family-names: "Jeon"
  given-names: "Hyeon-Ju"
- family-names: "Lee"
  given-names: "O-Joun"
title: "Mitigating Degree Bias in Graph Representation Learning With Learnable Structural Augmentation and Structural Self-Attention"
url: "https://ieeexplore.ieee.org/document/10974679"
preferred-citation:
  type: article
  journal: "IEEE Transactions on Network Science and Engineering"
  authors:
    - family-names: "Hoang"
      given-names: "Van Thuy"
    - family-names: "Jeon"
      given-names: "Hyeon-Ju"
    - family-names: "Lee"
      given-names: "O-Joun"
  title: "Mitigating Degree Bias in Graph Representation Learning With Learnable Structural Augmentation and Structural Self-Attention"
  url: "https://ieeexplore.ieee.org/document/10974679"
  year: 2025
  publisher: "IEEE"

GitHub Events

Total
  • Watch event: 4
  • Push event: 3
Last Year
  • Watch event: 4
  • Push event: 3

Issues and Pull Requests

Last synced: almost 2 years ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Dependencies

environment.yml pypi
  • asgiref ==3.7.2
  • backcall ==0.2.0
  • chardet ==3.0.4
  • charset-normalizer ==3.1.0
  • cmake ==3.26.4
  • contourpy ==1.0.7
  • cycler ==0.11.0
  • decorator ==4.4.2
  • django ==4.2.2
  • et-xmlfile ==1.1.0
  • fastdtw ==0.3.4
  • fastjsonschema ==2.17.1
  • filelock ==3.12.1
  • fonttools ==4.39.4
  • idna ==2.8
  • image ==1.5.33
  • importlib-resources ==5.12.0
  • intel-openmp ==2023.1.0
  • ipython-genutils ==0.2.0
  • joblib ==1.2.0
  • kiwisolver ==1.4.4
  • lit ==16.0.5.post0
  • littleutils ==0.2.2
  • markupsafe ==2.1.3
  • matplotlib ==3.7.1
  • mistune ==0.8.4
  • mkl ==2019.0
  • mpmath ==1.3.0
  • networkx ==2.5.1
  • numpy ==1.22.4
  • oauthlib ==3.2.2
  • ogb ==1.3.6
  • openpyxl ==3.1.2
  • outdated ==0.2.2
  • packaging ==23.1
  • pandas ==2.0.2
  • pandocfilters ==1.5.0
  • parso ==0.8.3
  • pexpect ==4.8.0
  • pickleshare ==0.7.5
  • pillow ==9.5.0
  • pip ==23.1.2
  • prometheus-client ==0.17.0
  • prompt-toolkit ==2.0.10
  • protobuf ==4.23.2
  • psutil ==5.9.5
  • ptyprocess ==0.7.0
  • pyasn1 ==0.5.0
  • pyg-lib ==0.2.0
  • pygments ==2.15.1
  • pyparsing ==3.0.9
  • pyrsistent ==0.19.3
  • python-dateutil ==2.8.2
  • pytz ==2023.3
  • pyzmq ==25.1.0
  • requests ==2.31.0
  • retrying ==1.3.4
  • scikit-learn ==0.24.1
  • scipy ==1.6.2
  • send2trash ==1.8.2
  • six ==1.15.0
  • sqlparse ==0.4.4
  • sympy ==1.12
  • testpath ==0.6.0
  • threadpoolctl ==3.1.0
  • torch ==2.0.1
  • torch-cluster ==1.6.1
  • torch-geometric ==2.3.1
  • torch-scatter ==2.1.1
  • torch-sparse ==0.6.17
  • torch-spline-conv ==1.2.2
  • torchaudio ==2.0.2
  • torchvision ==0.15.2
  • tornado ==6.3.2
  • tqdm ==4.60.0
  • traitlets ==5.9.0
  • triton ==2.0.0
  • typing-extensions ==4.6.3
  • tzdata ==2023.3
  • urllib3 ==1.25.11
  • wcwidth ==0.2.6
  • webencodings ==0.5.1
  • zipp ==3.15.0