sgc

Official implementation for the paper "Simplifying Graph Convolutional Networks".

https://github.com/tiiiger/sgc

Science Score: 38.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org, scholar.google
  • Committers with academic emails
    1 of 6 committers (16.7%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.9%) to scientific vocabulary

Keywords

graph machine-learning
Last synced: 6 months ago

Repository

Official implementation for the paper "Simplifying Graph Convolutional Networks".

Basic Info
  • Host: GitHub
  • Owner: Tiiiger
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 5.96 MB
Statistics
  • Stars: 843
  • Watchers: 17
  • Forks: 146
  • Open Issues: 2
  • Releases: 0
Topics
graph machine-learning
Created about 7 years ago · Last pushed about 4 years ago
Metadata Files
Readme License Citation

README.md

Simplifying Graph Convolutional Networks

made-with-python License: MIT

Updates

  • As pointed out in #23, there was a subtle bug in our preprocessing code for the Reddit dataset. After fixing this bug, SGC achieves an F1 score of 95.0 (previously 94.9).
  • Practical advice: it is often very helpful to normalize the features to have zero mean and unit standard deviation, which accelerates the convergence of SGC (and many other linear models). For example, we apply this normalization to the Reddit dataset. Please consider doing this when applying SGC to other datasets. For relevant discussions, see Ross et al., 2013 and Li and Zhang, 1998.
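The normalization advice above can be sketched in a few lines of NumPy. This is an illustrative helper (normalize_features is not part of this repo); the key point is that the statistics are computed on the training split only and then applied to every split:

```python
import numpy as np

def normalize_features(train, test, eps=1e-8):
    # Compute mean/std on the training split only, then apply to both splits.
    mean = train.mean(axis=0)
    std = train.std(axis=0) + eps  # eps guards against constant features
    return (train - mean) / std, (test - mean) / std

X_train = np.array([[1.0, 10.0], [3.0, 30.0]])
X_test = np.array([[2.0, 20.0]])
X_train_n, X_test_n = normalize_features(X_train, X_test)
```

After this transform each feature column of the training split has zero mean and (approximately) unit standard deviation, which is the condition that helps linear models like SGC converge quickly.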

Authors:

*: Equal Contribution

Overview

This repo contains an example implementation of the Simple Graph Convolution (SGC) model, described in the ICML2019 paper Simplifying Graph Convolutional Networks.

SGC removes the nonlinearities and collapses the weight matrices of Graph Convolutional Networks (GCNs), and is therefore essentially a linear model (see the illustration in the paper).
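Concretely, a K-layer GCN's f(X) = softmax(S · ReLU(… S X W₁ …) W_K) collapses to softmax(S^K X W), so the propagation S^K X can be precomputed once before training. A minimal PyTorch sketch (illustrative only, not the repo's exact code; S stands for the normalized adjacency with self-loops, and the toy identity matrices are assumptions):

```python
import torch

def precompute(features, adj_norm, degree):
    # Apply K hops of propagation (S^K X) once, before any training.
    for _ in range(degree):
        features = adj_norm @ features
    return features

# Toy example: 3 nodes, 2 features; an identity "adjacency" leaves X unchanged.
X = torch.eye(3)[:, :2]
S = torch.eye(3)
X_prop = precompute(X, S, degree=2)

# The model itself is then just one linear layer (multinomial logistic regression).
classifier = torch.nn.Linear(2, 4)  # 4 classes, chosen arbitrarily for the sketch
logits = classifier(X_prop)
```

Because the propagation happens once up front, each training step is just a linear-layer forward/backward pass, which is where the speedups in the table below come from.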

SGC achieves competitive performance while requiring far less training time. For reference, on a GTX 1080 Ti:

Dataset  | Metric      | Training Time
:-------:|:-----------:|:-------------:
Cora     | Acc: 81.0 % | 0.13s
Citeseer | Acc: 71.9 % | 0.14s
Pubmed   | Acc: 78.9 % | 0.29s
Reddit   | F1: 94.9 %  | 2.7s

This home repo contains the implementation for the citation networks (Cora, Citeseer, and Pubmed) and a social network (Reddit). We have a work-in-progress branch, ablation, which contains additional code for our ablation studies.

If you find this repo useful, please cite:

@InProceedings{pmlr-v97-wu19e,
  title     = {Simplifying Graph Convolutional Networks},
  author    = {Wu, Felix and Souza, Amauri and Zhang, Tianyi and Fifty, Christopher and Yu, Tao and Weinberger, Kilian},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {6861--6871},
  year      = {2019},
  publisher = {PMLR},
}

Other reference implementations

Other reference implementations can be found in the following libraries. Note that the hyperparameters in these examples may differ from ours, so the results may differ from those reported in the paper.

Dependencies

Our implementation works with PyTorch>=1.0.0. Install other dependencies:

$ pip install -r requirements.txt

Data

We provide the citation network datasets under data/, which correspond to the public data splits. Due to space constraints, please download the Reddit dataset from FastGCN and put reddit_adj.npz and reddit.npz under data/.

Usage

Citation Networks: We tune the only hyperparameter, weight decay, with hyperopt and store the resulting hyperparameters under SGC-tuning. See tuning.py for more details on hyperparameter optimization.

$ python citation.py --dataset cora --tuned
$ python citation.py --dataset citeseer --tuned --epochs 150
$ python citation.py --dataset pubmed --tuned
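tuning.py uses hyperopt's TPE to search over weight decay; the idea can be approximated by a simple random search on a log scale (a sketch with a stand-in objective, not the repo's actual tuning code):

```python
import random

def tune_weight_decay(evaluate, n_trials=60, seed=0):
    # Random search over weight decay on a log scale. tuning.py uses
    # hyperopt's TPE suggestions instead, but the contract is the same:
    # evaluate(wd) returns a validation loss to minimize.
    rng = random.Random(seed)
    best_wd, best_loss = None, float("inf")
    for _ in range(n_trials):
        wd = 10 ** rng.uniform(-6.0, -2.0)
        loss = evaluate(wd)
        if loss < best_loss:
            best_wd, best_loss = wd, loss
    return best_wd

# Stand-in for "train SGC with this weight decay, return validation error".
best = tune_weight_decay(lambda wd: (wd - 1e-4) ** 2)
```

The selected value would then be pickled under SGC-tuning/ so that citation.py can load it via the --tuned flag.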

Reddit:

$ python reddit.py --inductive --test

Downstream

We collect the code for downstream tasks under downstream/. Currently, we are releasing only the SGC implementation for text classification.

Acknowledgement

This repo is modified from pygcn and FastGCN.

We thank the Deep Graph Library team for providing a reference implementation of SGC and for benchmarking SGC in Deep Graph Library. We thank Matthias Fey, author of PyTorch Geometric, for providing a reference implementation of SGC within PyTorch Geometric. We thank Daniele Grattarola, author of Spektral, for providing a reference implementation of SGC within Spektral.

Owner

  • Name: Tianyi
  • Login: Tiiiger
  • Kind: user
  • Location: Ithaca, NY

Graduate student at Stanford University.

Citation (citation.py)

import time
import argparse
import numpy as np
import torch
import torch.nn.functional as F
import torch.optim as optim
from utils import load_citation, sgc_precompute, set_seed
from models import get_model
from metrics import accuracy
import pickle as pkl
from args import get_citation_args
from time import perf_counter

# Arguments
args = get_citation_args()

if args.tuned:
    if args.model == "SGC":
        with open("{}-tuning/{}.txt".format(args.model, args.dataset), 'rb') as f:
            args.weight_decay = pkl.load(f)['weight_decay']
            print("using tuned weight decay: {}".format(args.weight_decay))
    else:
        raise NotImplementedError

# setting random seeds
set_seed(args.seed, args.cuda)

adj, features, labels, idx_train, idx_val, idx_test = load_citation(args.dataset, args.normalization, args.cuda)

model = get_model(args.model, features.size(1), labels.max().item()+1, args.hidden, args.dropout, args.cuda)

if args.model == "SGC":
    features, precompute_time = sgc_precompute(features, adj, args.degree)
    print("{:.4f}s".format(precompute_time))

def train_regression(model,
                     train_features, train_labels,
                     val_features, val_labels,
                     epochs=args.epochs, weight_decay=args.weight_decay,
                     lr=args.lr, dropout=args.dropout):

    optimizer = optim.Adam(model.parameters(), lr=lr,
                           weight_decay=weight_decay)
    t = perf_counter()
    for epoch in range(epochs):
        model.train()
        optimizer.zero_grad()
        output = model(train_features)
        loss_train = F.cross_entropy(output, train_labels)
        loss_train.backward()
        optimizer.step()
    train_time = perf_counter()-t

    with torch.no_grad():
        model.eval()
        output = model(val_features)
        acc_val = accuracy(output, val_labels)

    return model, acc_val, train_time

def test_regression(model, test_features, test_labels):
    model.eval()
    return accuracy(model(test_features), test_labels)

if args.model == "SGC":
    model, acc_val, train_time = train_regression(model, features[idx_train], labels[idx_train], features[idx_val], labels[idx_val],
                     args.epochs, args.weight_decay, args.lr, args.dropout)
    acc_test = test_regression(model, features[idx_test], labels[idx_test])

print("Validation Accuracy: {:.4f} Test Accuracy: {:.4f}".format(acc_val, acc_test))
print("Pre-compute time: {:.4f}s, train time: {:.4f}s, total: {:.4f}s".format(precompute_time, train_time, precompute_time+train_time))
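The accuracy helper imported from metrics above is not shown on this page; it can be assumed to follow the usual pygcn-style pattern (a sketch, not necessarily the repo's exact code):

```python
import torch

def accuracy(output, labels):
    # Fraction of examples whose argmax prediction matches the label.
    preds = output.max(1)[1].type_as(labels)
    correct = preds.eq(labels).double().sum()
    return correct / len(labels)

logits = torch.tensor([[0.1, 0.9], [0.8, 0.2]])
labels = torch.tensor([1, 0])
acc = accuracy(logits, labels)
```

This is the metric reported by test_regression in citation.py above.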

GitHub Events

Total
  • Watch event: 13
  • Fork event: 2
Last Year
  • Watch event: 13
  • Fork event: 2

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 19
  • Total Committers: 6
  • Avg Commits per committer: 3.167
  • Development Distribution Score (DDS): 0.421
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Tiiiger z****x@g****m 11
felixgwu f****u 3
liu-jc j****u@q****m 2
kelseyball k****l@g****m 1
hujunxianligong h****g 1
Bartimaeus c****2@c****u 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 29
  • Total pull requests: 7
  • Average time to close issues: 5 days
  • Average time to close pull requests: about 18 hours
  • Total issue authors: 25
  • Total pull request authors: 5
  • Average comments per issue: 2.48
  • Average comments per pull request: 2.14
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • liu-jc (3)
  • alamsaqib (2)
  • fmonta (2)
  • ghost (1)
  • polarwk (1)
  • MortonWang (1)
  • a370865882 (1)
  • youngflyasd (1)
  • yyl424525 (1)
  • tonyandsunny (1)
  • AIRobotZhang (1)
  • kentwhf (1)
  • Panxuran (1)
  • dsj96 (1)
  • jharrang (1)
Pull Request Authors
  • mro15 (2)
  • liu-jc (2)
  • kelseyball (1)
  • hujunxianligong (1)
  • ostapen (1)

Dependencies

downstream/TextSGC/requirements.txt pypi
  • hyperopt ==0.1.1
requirements.txt pypi
  • hyperopt ==0.1.1
  • networkx ==1.11
  • numpy *
  • scikit-learn *
  • scipy *