sgc

Official implementation for the paper "Simplifying Graph Convolutional Networks".

https://github.com/tiiiger/sgc

Science Score: 38.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org, scholar.google
  • Committers with academic emails
    1 of 6 committers (16.7%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.9%) to scientific vocabulary

Keywords

graph machine-learning
Last synced: 6 months ago

Repository

Official implementation for the paper "Simplifying Graph Convolutional Networks".

Basic Info
  • Host: GitHub
  • Owner: Tiiiger
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 5.96 MB
Statistics
  • Stars: 843
  • Watchers: 17
  • Forks: 146
  • Open Issues: 2
  • Releases: 0
Topics
graph machine-learning
Created about 7 years ago · Last pushed about 4 years ago
Metadata Files
Readme License Citation

README.md

Simplifying Graph Convolutional Networks

made-with-python License: MIT

Updates

  • As pointed out in #23, there was a subtle bug in our preprocessing code for the Reddit dataset. After fixing this bug, SGC achieves an F1 score of 95.0 (previously 94.9).
  • Practical advice: it is often very helpful to normalize the features to have zero mean and unit standard deviation, which accelerates the convergence of SGC (and many other linear models). For example, we apply this normalization to the Reddit dataset. Please consider doing this when applying SGC to other datasets. For relevant discussions, see Ross et al., 2013 and Li and Zhang, 1998.
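The normalization advice above can be sketched in a few lines of NumPy. This is an illustrative helper (normalize_features is not part of this repo); the key point is that the statistics are computed on the training split only and then applied to every split:

```python
import numpy as np

def normalize_features(train, test, eps=1e-8):
    # Compute mean/std on the training split only, then apply to both splits.
    mean = train.mean(axis=0)
    std = train.std(axis=0) + eps  # eps guards against constant features
    return (train - mean) / std, (test - mean) / std

X_train = np.array([[1.0, 10.0], [3.0, 30.0]])
X_test = np.array([[2.0, 20.0]])
X_train_n, X_test_n = normalize_features(X_train, X_test)
```

After this transform each feature column of the training split has zero mean and (approximately) unit standard deviation, which is the condition that helps linear models like SGC converge quickly.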

Authors:

*: Equal Contribution

Overview

This repo contains an example implementation of the Simple Graph Convolution (SGC) model, described in the ICML2019 paper Simplifying Graph Convolutional Networks.

SGC removes the nonlinearities and collapses the weight matrices of Graph Convolutional Networks (GCNs), and is therefore essentially a linear model (see the illustration in the paper).
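Concretely, a K-layer GCN's f(X) = softmax(S · ReLU(… S X W₁ …) W_K) collapses to softmax(S^K X W), so the propagation S^K X can be precomputed once before training. A minimal PyTorch sketch (illustrative only, not the repo's exact code; S stands for the normalized adjacency with self-loops, and the toy identity matrices are assumptions):

```python
import torch

def precompute(features, adj_norm, degree):
    # Apply K hops of propagation (S^K X) once, before any training.
    for _ in range(degree):
        features = adj_norm @ features
    return features

# Toy example: 3 nodes, 2 features; an identity "adjacency" leaves X unchanged.
X = torch.eye(3)[:, :2]
S = torch.eye(3)
X_prop = precompute(X, S, degree=2)

# The model itself is then just one linear layer (multinomial logistic regression).
classifier = torch.nn.Linear(2, 4)  # 4 classes, chosen arbitrarily for the sketch
logits = classifier(X_prop)
```

Because the propagation happens once up front, each training step is just a linear-layer forward/backward pass, which is where the speedups in the table below come from.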

SGC achieves competitive performance while requiring far less training time. For reference, on a GTX 1080 Ti:

Dataset  | Metric      | Training Time
:-------:|:-----------:|:-------------:
Cora     | Acc: 81.0 % | 0.13s
Citeseer | Acc: 71.9 % | 0.14s
Pubmed   | Acc: 78.9 % | 0.29s
Reddit   | F1: 94.9 %  | 2.7s

This home repo contains the implementation for the citation networks (Cora, Citeseer, and Pubmed) and a social network (Reddit). We have a work-in-progress branch, ablation, which contains additional code for our ablation studies.

If you find this repo useful, please cite:

@InProceedings{pmlr-v97-wu19e,
  title     = {Simplifying Graph Convolutional Networks},
  author    = {Wu, Felix and Souza, Amauri and Zhang, Tianyi and Fifty, Christopher and Yu, Tao and Weinberger, Kilian},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {6861--6871},
  year      = {2019},
  publisher = {PMLR},
}

Other reference implementations

Other reference implementations can be found in the following libraries. Note that the hyperparameters in these examples may differ from ours, so the results may differ from those reported in the paper.

Dependencies

Our implementation works with PyTorch>=1.0.0. Install other dependencies:

$ pip install -r requirements.txt

Data

We provide the citation network datasets under data/, which correspond to the public data splits. Due to space constraints, please download the Reddit dataset from FastGCN and put reddit_adj.npz and reddit.npz under data/.

Usage

Citation Networks: We tune the only hyperparameter, weight decay, with hyperopt and store the resulting hyperparameters under SGC-tuning. See tuning.py for more details on hyperparameter optimization.

$ python citation.py --dataset cora --tuned
$ python citation.py --dataset citeseer --tuned --epochs 150
$ python citation.py --dataset pubmed --tuned
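tuning.py uses hyperopt's TPE to search over weight decay; the idea can be approximated by a simple random search on a log scale (a sketch with a stand-in objective, not the repo's actual tuning code):

```python
import random

def tune_weight_decay(evaluate, n_trials=60, seed=0):
    # Random search over weight decay on a log scale. tuning.py uses
    # hyperopt's TPE suggestions instead, but the contract is the same:
    # evaluate(wd) returns a validation loss to minimize.
    rng = random.Random(seed)
    best_wd, best_loss = None, float("inf")
    for _ in range(n_trials):
        wd = 10 ** rng.uniform(-6.0, -2.0)
        loss = evaluate(wd)
        if loss < best_loss:
            best_wd, best_loss = wd, loss
    return best_wd

# Stand-in for "train SGC with this weight decay, return validation error".
best = tune_weight_decay(lambda wd: (wd - 1e-4) ** 2)
```

The selected value would then be pickled under SGC-tuning/ so that citation.py can load it via the --tuned flag.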

Reddit:

$ python reddit.py --inductive --test

Downstream

We collect the code for downstream tasks under downstream/. Currently, we are releasing only the SGC implementation for text classification.

Acknowledgement

This repo is modified from pygcn and FastGCN.

We thank the Deep Graph Library team for providing a reference implementation of SGC and for benchmarking SGC in Deep Graph Library. We thank Matthias Fey, author of PyTorch Geometric, for providing a reference implementation of SGC within PyTorch Geometric. We thank Daniele Grattarola, author of Spektral, for providing a reference implementation of SGC within Spektral.

Owner

  • Name: Tianyi
  • Login: Tiiiger
  • Kind: user
  • Location: Ithaca, NY

Graduate student at Stanford University.

Citation (citation.py)

import time
import argparse
import numpy as np
import torch
import torch.nn.functional as F
import torch.optim as optim
from utils import load_citation, sgc_precompute, set_seed
from models import get_model
from metrics import accuracy
import pickle as pkl
from args import get_citation_args
from time import perf_counter

# Arguments
args = get_citation_args()

if args.tuned:
    if args.model == "SGC":
        with open("{}-tuning/{}.txt".format(args.model, args.dataset), 'rb') as f:
            args.weight_decay = pkl.load(f)['weight_decay']
            print("using tuned weight decay: {}".format(args.weight_decay))
    else:
        raise NotImplementedError

# setting random seeds
set_seed(args.seed, args.cuda)

adj, features, labels, idx_train, idx_val, idx_test = load_citation(args.dataset, args.normalization, args.cuda)

model = get_model(args.model, features.size(1), labels.max().item()+1, args.hidden, args.dropout, args.cuda)

if args.model == "SGC":
    features, precompute_time = sgc_precompute(features, adj, args.degree)
    print("{:.4f}s".format(precompute_time))

def train_regression(model,
                     train_features, train_labels,
                     val_features, val_labels,
                     epochs=args.epochs, weight_decay=args.weight_decay,
                     lr=args.lr, dropout=args.dropout):

    optimizer = optim.Adam(model.parameters(), lr=lr,
                           weight_decay=weight_decay)
    t = perf_counter()
    for epoch in range(epochs):
        model.train()
        optimizer.zero_grad()
        output = model(train_features)
        loss_train = F.cross_entropy(output, train_labels)
        loss_train.backward()
        optimizer.step()
    train_time = perf_counter()-t

    with torch.no_grad():
        model.eval()
        output = model(val_features)
        acc_val = accuracy(output, val_labels)

    return model, acc_val, train_time

def test_regression(model, test_features, test_labels):
    model.eval()
    return accuracy(model(test_features), test_labels)

if args.model == "SGC":
    model, acc_val, train_time = train_regression(model, features[idx_train], labels[idx_train], features[idx_val], labels[idx_val],
                     args.epochs, args.weight_decay, args.lr, args.dropout)
    acc_test = test_regression(model, features[idx_test], labels[idx_test])

print("Validation Accuracy: {:.4f} Test Accuracy: {:.4f}".format(acc_val, acc_test))
print("Pre-compute time: {:.4f}s, train time: {:.4f}s, total: {:.4f}s".format(precompute_time, train_time, precompute_time+train_time))
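The accuracy helper imported from metrics above is not shown on this page; it can be assumed to follow the usual pygcn-style pattern (a sketch, not necessarily the repo's exact code):

```python
import torch

def accuracy(output, labels):
    # Fraction of examples whose argmax prediction matches the label.
    preds = output.max(1)[1].type_as(labels)
    correct = preds.eq(labels).double().sum()
    return correct / len(labels)

logits = torch.tensor([[0.1, 0.9], [0.8, 0.2]])
labels = torch.tensor([1, 0])
acc = accuracy(logits, labels)
```

This is the metric reported by test_regression in citation.py above.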

GitHub Events

Total
  • Watch event: 13
  • Fork event: 2
Last Year
  • Watch event: 13
  • Fork event: 2

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 19
  • Total Committers: 6
  • Avg Commits per committer: 3.167
  • Development Distribution Score (DDS): 0.421
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Tiiiger z****x@g****m 11
felixgwu f****u 3
liu-jc j****u@q****m 2
kelseyball k****l@g****m 1
hujunxianligong h****g 1
Bartimaeus c****2@c****u 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 29
  • Total pull requests: 7
  • Average time to close issues: 5 days
  • Average time to close pull requests: about 18 hours
  • Total issue authors: 25
  • Total pull request authors: 5
  • Average comments per issue: 2.48
  • Average comments per pull request: 2.14
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • liu-jc (3)
  • alamsaqib (2)
  • fmonta (2)
  • ghost (1)
  • polarwk (1)
  • MortonWang (1)
  • a370865882 (1)
  • youngflyasd (1)
  • yyl424525 (1)
  • tonyandsunny (1)
  • AIRobotZhang (1)
  • kentwhf (1)
  • Panxuran (1)
  • dsj96 (1)
  • jharrang (1)
Pull Request Authors
  • mro15 (2)
  • liu-jc (2)
  • kelseyball (1)
  • hujunxianligong (1)
  • ostapen (1)

Dependencies

downstream/TextSGC/requirements.txt pypi
  • hyperopt ==0.1.1
requirements.txt pypi
  • hyperopt ==0.1.1
  • networkx ==1.11
  • numpy *
  • scikit-learn *
  • scipy *