lgi-ls
[NeurIPS 2023] Latent Graph Inference with Limited Supervision
Science Score: 18.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low (11.7%)
Repository
Basic Info
- Host: GitHub
- Owner: Jianglin954
- License: MIT
- Language: JavaScript
- Default Branch: main
- Homepage: https://jianglin954.github.io/LGI-LS/
- Size: 2.38 MB
Statistics
- Stars: 13
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
LGI-LS (NeurIPS 2023)
Code for the NeurIPS 2023 paper "Latent Graph Inference with Limited Supervision".
Datasets
The Cora, Citeseer, and Pubmed datasets can be downloaded from here. Place the downloaded files in the data_tf folder. The ogbn-arxiv dataset is loaded automatically.
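Before running any experiments, it can help to confirm that the files are where the loader expects them. The following is a minimal sketch, assuming the naming pattern used in citation_networks.py (`data_tf/ind.<dataset>.<suffix>`); the helper name `missing_dataset_files` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Suffixes read by the loaders in citation_networks.py.
SUFFIXES = ["x", "y", "tx", "ty", "allx", "ally", "graph", "test.index"]


def missing_dataset_files(dataset, data_dir="data_tf"):
    """Return the expected files for `dataset` that are not present in `data_dir`."""
    root = Path(data_dir)
    expected = [root / f"ind.{dataset}.{suffix}" for suffix in SUFFIXES]
    return [str(path) for path in expected if not path.is_file()]


if __name__ == "__main__":
    for name in ("cora", "citeseer", "pubmed"):
        missing = missing_dataset_files(name)
        print(name, "ok" if not missing else f"missing: {missing}")
```

An empty list means every file for that dataset was found.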
Installation
```bash
conda create -n LGI python=3.7.2
conda activate LGI
pip install torch==1.5.1 torchvision==0.6.1
pip install scipy==1.2.1
pip install scikit-learn==0.21.3
pip install dgl-cu102==0.5.2
pip install ogb==1.2.3
wget https://data.pyg.org/whl/torch-1.5.0%2Bcu102/torch_scatter-2.0.5-cp37-cp37m-linux_x86_64.whl
wget https://data.pyg.org/whl/torch-1.5.0%2Bcu102/torch_sparse-0.6.5-cp37-cp37m-linux_x86_64.whl
wget https://data.pyg.org/whl/torch-1.5.0%2Bcu102/torch_cluster-1.5.4-cp37-cp37m-linux_x86_64.whl
wget https://data.pyg.org/whl/torch-1.5.0%2Bcu102/torch_spline_conv-1.2.0-cp37-cp37m-linux_x86_64.whl
pip install torch_scatter-2.0.5-cp37-cp37m-linux_x86_64.whl
pip install torch_sparse-0.6.5-cp37-cp37m-linux_x86_64.whl
pip install torch_cluster-1.5.4-cp37-cp37m-linux_x86_64.whl
pip install torch_spline_conv-1.2.0-cp37-cp37m-linux_x86_64.whl
pip install torch-geometric==1.6.1
```
Usage
We provide GCN+KNN, GCN+KNN_U, and GCN+KNN_R as examples because they are simple and effective. To evaluate them on the Pubmed dataset, run:
```bash
bash experiments.sh
```
The experimental results will be saved in the corresponding *.txt file.
Reference
@inproceedings{Jianglin2023LGI,
  title={Latent Graph Inference with Limited Supervision},
  author={Lu, Jianglin and Xu, Yi and Wang, Huan and Bai, Yue and Fu, Yun},
  booktitle={Advances in Neural Information Processing Systems},
  year={2023}
}
@inproceedings{fatemi2021slaps,
  title={SLAPS: Self-Supervision Improves Structure Learning for Graph Neural Networks},
  author={Fatemi, Bahare and Asri, Layla El and Kazemi, Seyed Mehran},
  booktitle={Advances in Neural Information Processing Systems},
  year={2021}
}
Acknowledgement
Our code is mainly based on SLAPS. For the other comparison methods, please refer to their publicly available code repositories. We thank the authors for their contributions.
Owner
- Name: Jianglin Lu
- Login: Jianglin954
- Kind: user
- Location: Boston
- Company: Northeastern University
- Website: https://jianglin954.github.io/
- Repositories: 6
- Profile: https://github.com/Jianglin954
I am a first-year Ph.D. student at Northeastern University, USA. My research interests mainly include computer vision, machine learning, and data mining.
Citation (citation_networks.py)
# The MIT License
# Copyright (c) 2016 Thomas Kipf
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
import pickle as pkl
import sys
import warnings
import numpy as np
import scipy.sparse as sp
import torch
warnings.simplefilter("ignore")
def parse_index_file(filename):
    """Parse index file."""
    index = []
    # Use a context manager so the file handle is closed promptly.
    with open(filename) as f:
        for line in f:
            index.append(int(line.strip()))
    return index
def sample_mask(idx, l):
    """Create mask."""
    mask = np.zeros(l)
    mask[idx] = 1
    # The np.bool alias was removed in NumPy 1.24; the builtin bool is equivalent.
    return np.array(mask, dtype=bool)
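As a quick illustration of how `sample_mask` turns index ranges into boolean node masks, here is a small self-contained sketch on toy sizes. The sizes (8 nodes, 3 training and 2 validation nodes) are made up for the example, and `dtype=bool` is used because the `np.bool` alias is no longer available in recent NumPy releases.

```python
import numpy as np


def sample_mask(idx, l):
    """Return a boolean vector of length `l` that is True at positions `idx`."""
    mask = np.zeros(l)
    mask[idx] = 1
    return np.array(mask, dtype=bool)


num_nodes = 8                                   # toy value
train_mask = sample_mask(range(3), num_nodes)   # nodes 0..2
val_mask = sample_mask(range(3, 5), num_nodes)  # nodes 3..4

print(train_mask.astype(int))
print(val_mask.astype(int))
```

The loaders build their train, validation, and test masks the same way, from `idx_train`, `idx_val`, and `idx_test`.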
def load_citation_network(dataset_str):
    """Load features, labels, and split masks from data_tf/ind.<dataset_str>.*"""
    names = ['x', 'y', 'tx', 'ty', 'allx', 'ally', 'graph']
    objects = []
    for i in range(len(names)):
        with open("data_tf/ind.{}.{}".format(dataset_str, names[i]), 'rb') as f:
            if sys.version_info > (3, 0):
                objects.append(pkl.load(f, encoding='latin1'))
            else:
                objects.append(pkl.load(f))
    x, y, tx, ty, allx, ally, graph = tuple(objects)
    test_idx_reorder = parse_index_file("data_tf/ind.{}.test.index".format(dataset_str))
    test_idx_range = np.sort(test_idx_reorder)

    if dataset_str == 'citeseer':
        # Some test indices are missing; pad tx and ty with zero rows so that
        # they cover the full index range.
        test_idx_range_full = range(min(test_idx_reorder), max(test_idx_reorder) + 1)
        tx_extended = sp.lil_matrix((len(test_idx_range_full), x.shape[1]))
        tx_extended[test_idx_range - min(test_idx_range), :] = tx
        tx = tx_extended
        ty_extended = np.zeros((len(test_idx_range_full), y.shape[1]))
        ty_extended[test_idx_range - min(test_idx_range), :] = ty
        ty = ty_extended

    features = sp.vstack((allx, tx)).tolil()
    features[test_idx_reorder, :] = features[test_idx_range, :]
    labels = np.vstack((ally, ty))
    labels[test_idx_reorder, :] = labels[test_idx_range, :]

    idx_test = test_idx_range.tolist()
    idx_train = range(len(y))
    idx_val = range(len(y), len(y) + 500)
    train_mask = sample_mask(idx_train, labels.shape[0])
    val_mask = sample_mask(idx_val, labels.shape[0])
    test_mask = sample_mask(idx_test, labels.shape[0])

    features = torch.FloatTensor(features.todense())
    labels = torch.LongTensor(labels)
    train_mask = torch.BoolTensor(train_mask)
    val_mask = torch.BoolTensor(val_mask)
    test_mask = torch.BoolTensor(test_mask)
    nfeats = features.shape[1]

    # Rows that are not one-hot are assigned to class 0. The hard-coded vector
    # has 6 entries, so this branch only works for a 6-class label matrix.
    for i in range(labels.shape[0]):
        sum_ = torch.sum(labels[i])
        if sum_ != 1:
            labels[i] = torch.tensor([1, 0, 0, 0, 0, 0])
    # Convert one-hot rows to class indices.
    labels = (labels == 1).nonzero()[:, 1]
    nclasses = torch.max(labels).item() + 1
    return features, nfeats, labels, nclasses, train_mask, val_mask, test_mask
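The Citeseer branch of the loader pads the test features and labels with zero rows so that they cover every index between the smallest and largest test index. The following is a minimal NumPy sketch of that padding step on toy data. The helper name `pad_missing_test_rows` and the toy arrays are illustrative assumptions, not part of the repository, and the sketch uses dense arrays where the loader uses a SciPy sparse matrix for the features.

```python
import numpy as np


def pad_missing_test_rows(test_idx, rows):
    """Place `rows` into a zero-filled array spanning min(test_idx)..max(test_idx).

    As in the loader, the k-th row is written at the offset of the k-th
    smallest test index; offsets with no test index stay all-zero.
    """
    test_idx_sorted = np.sort(test_idx)
    lo, hi = test_idx_sorted[0], test_idx_sorted[-1]
    padded = np.zeros((hi - lo + 1, rows.shape[1]), dtype=rows.dtype)
    padded[test_idx_sorted - lo, :] = rows
    return padded


# Toy example: index 6 does not appear among the test indices.
test_idx = np.array([7, 5, 8])
rows = np.array([[1, 0], [0, 1], [1, 0]])
padded = pad_missing_test_rows(test_idx, rows)
print(padded)
```

The all-zero label rows produced this way are the rows that the later `sum_ != 1` check reassigns to class 0.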
def load_citation_network_halftrain(dataset_str):
    """Same as load_citation_network, but keeps at most 10 training nodes per class."""
    names = ['x', 'y', 'tx', 'ty', 'allx', 'ally', 'graph']
    objects = []
    for i in range(len(names)):
        with open("data_tf/ind.{}.{}".format(dataset_str, names[i]), 'rb') as f:
            if sys.version_info > (3, 0):
                objects.append(pkl.load(f, encoding='latin1'))
            else:
                objects.append(pkl.load(f))
    x, y, tx, ty, allx, ally, graph = tuple(objects)
    test_idx_reorder = parse_index_file("data_tf/ind.{}.test.index".format(dataset_str))
    test_idx_range = np.sort(test_idx_reorder)

    if dataset_str == 'citeseer':
        # Some test indices are missing; pad tx and ty with zero rows so that
        # they cover the full index range.
        test_idx_range_full = range(min(test_idx_reorder), max(test_idx_reorder) + 1)
        tx_extended = sp.lil_matrix((len(test_idx_range_full), x.shape[1]))
        tx_extended[test_idx_range - min(test_idx_range), :] = tx
        tx = tx_extended
        ty_extended = np.zeros((len(test_idx_range_full), y.shape[1]))
        ty_extended[test_idx_range - min(test_idx_range), :] = ty
        ty = ty_extended

    features = sp.vstack((allx, tx)).tolil()
    features[test_idx_reorder, :] = features[test_idx_range, :]
    labels = np.vstack((ally, ty))
    labels[test_idx_reorder, :] = labels[test_idx_range, :]

    idx_test = test_idx_range.tolist()
    idx_train = range(len(y))
    idx_val = range(len(y), len(y) + 500)
    train_mask = sample_mask(idx_train, labels.shape[0])
    val_mask = sample_mask(idx_val, labels.shape[0])
    test_mask = sample_mask(idx_test, labels.shape[0])

    features = torch.FloatTensor(features.todense())
    labels = torch.LongTensor(labels)
    train_mask = torch.BoolTensor(train_mask)
    val_mask = torch.BoolTensor(val_mask)
    test_mask = torch.BoolTensor(test_mask)
    nfeats = features.shape[1]

    # Rows that are not one-hot are assigned to class 0. The hard-coded vector
    # has 6 entries, so this branch only works for a 6-class label matrix.
    for i in range(labels.shape[0]):
        sum_ = torch.sum(labels[i])
        if sum_ != 1:
            labels[i] = torch.tensor([1, 0, 0, 0, 0, 0])
    # Convert one-hot rows to class indices.
    labels = (labels == 1).nonzero()[:, 1]
    nclasses = torch.max(labels).item() + 1

    # Running per-class count of kept training nodes (one column per class).
    if dataset_str == 'pubmed':
        colum_sum = torch.zeros((1, 3))
    elif dataset_str == 'cora':
        colum_sum = torch.zeros((1, 7))
    elif dataset_str == 'citeseer':
        colum_sum = torch.zeros((1, 6))
    # Walk the training nodes in order; once a class already has 10 kept
    # nodes, undo the count and drop the node from the training mask.
    for iii in range(y.shape[0]):
        colum_sum = colum_sum + y[iii, :]
        if colum_sum.max() > 10:
            colum_sum = colum_sum - y[iii, :]
            train_mask[iii] = 0
    return features, nfeats, labels, nclasses, train_mask, val_mask, test_mask
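The loop at the end of `load_citation_network_halftrain` caps the number of labelled training nodes per class at 10. Below is a minimal NumPy sketch of the same idea on toy data, assuming every training row is one-hot. The function name `cap_labels_per_class` and the `max_per_class` parameter are hypothetical and introduced only for illustration; the repository hard-codes the limit.

```python
import numpy as np


def cap_labels_per_class(one_hot, train_mask, max_per_class=10):
    """Return a copy of `train_mask` keeping at most `max_per_class` nodes per class.

    Nodes are visited in order; a node is dropped from the mask once its
    class has already reached the cap. Assumes each row of `one_hot` has
    exactly one nonzero entry.
    """
    capped = np.array(train_mask, dtype=bool)
    counts = np.zeros(one_hot.shape[1], dtype=int)
    for i, row in enumerate(one_hot):
        cls = int(np.argmax(row))
        if counts[cls] >= max_per_class:
            capped[i] = False
        else:
            counts[cls] += 1
    return capped


# Toy example: classes of the four training nodes are 0, 0, 0, 1.
y = np.array([[1, 0], [1, 0], [1, 0], [0, 1]])
mask = np.ones(4, dtype=bool)
capped = cap_labels_per_class(y, mask, max_per_class=2)
print(capped)
```

With a cap of 2, the third class-0 node is removed from the training mask and the other three nodes are kept.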
def load_citation_network_calculate_starved_nodes(dataset_str):
    """Same as load_citation_network, but also returns the dense adjacency matrix."""
    names = ['x', 'y', 'tx', 'ty', 'allx', 'ally', 'graph']
    objects = []
    for i in range(len(names)):
        with open("data_tf/ind.{}.{}".format(dataset_str, names[i]), 'rb') as f:
            if sys.version_info > (3, 0):
                objects.append(pkl.load(f, encoding='latin1'))
            else:
                objects.append(pkl.load(f))
    x, y, tx, ty, allx, ally, graph = tuple(objects)

    # Build the dense adjacency matrix directly as a NumPy array rather than
    # as a nested Python list, which is far more memory-hungry.
    num_vertices = len(graph)
    adjacency = np.zeros((num_vertices, num_vertices), dtype=int)
    for i in range(num_vertices):
        for j in graph[i]:
            adjacency[i][j] = 1

    test_idx_reorder = parse_index_file("data_tf/ind.{}.test.index".format(dataset_str))
    test_idx_range = np.sort(test_idx_reorder)

    if dataset_str == 'citeseer':
        # Fix citeseer dataset (there are some isolated nodes in the graph)
        # Find isolated nodes, add them as zero-vecs into the right position
        test_idx_range_full = range(min(test_idx_reorder), max(test_idx_reorder) + 1)
        tx_extended = sp.lil_matrix((len(test_idx_range_full), x.shape[1]))
        tx_extended[test_idx_range - min(test_idx_range), :] = tx
        tx = tx_extended
        ty_extended = np.zeros((len(test_idx_range_full), y.shape[1]))
        ty_extended[test_idx_range - min(test_idx_range), :] = ty
        ty = ty_extended

    features = sp.vstack((allx, tx)).tolil()
    features[test_idx_reorder, :] = features[test_idx_range, :]
    labels = np.vstack((ally, ty))
    labels[test_idx_reorder, :] = labels[test_idx_range, :]

    idx_test = test_idx_range.tolist()
    idx_train = range(len(y))
    idx_val = range(len(y), len(y) + 500)
    train_mask = sample_mask(idx_train, labels.shape[0])
    val_mask = sample_mask(idx_val, labels.shape[0])
    test_mask = sample_mask(idx_test, labels.shape[0])

    features = torch.FloatTensor(features.todense())
    labels = torch.LongTensor(labels)
    train_mask = torch.BoolTensor(train_mask)
    val_mask = torch.BoolTensor(val_mask)
    test_mask = torch.BoolTensor(test_mask)
    nfeats = features.shape[1]

    # Rows that are not one-hot are assigned to class 0. The hard-coded vector
    # has 6 entries, so this branch only works for a 6-class label matrix.
    for i in range(labels.shape[0]):
        sum_ = torch.sum(labels[i])
        if sum_ != 1:
            labels[i] = torch.tensor([1, 0, 0, 0, 0, 0])
    # Convert one-hot rows to class indices.
    labels = (labels == 1).nonzero()[:, 1]
    nclasses = torch.max(labels).item() + 1
    return features, nfeats, labels, nclasses, train_mask, val_mask, test_mask, adjacency
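The adjacency matrix returned by `load_citation_network_calculate_starved_nodes` is built from `graph`, a mapping from each node index to a list of neighbour indices. The sketch below shows that construction on a toy graph. The helper name `dense_adjacency` and the toy dictionary are assumptions made for illustration, and the sketch assumes node indices run from 0 to `len(graph) - 1`.

```python
import numpy as np


def dense_adjacency(graph):
    """Build a dense 0/1 adjacency matrix from a {node: [neighbours]} mapping."""
    n = len(graph)
    adjacency = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in graph[i]:
            adjacency[i, j] = 1
    return adjacency


# Toy graph: node 0 is linked to nodes 1 and 2, node 3 has no neighbours.
toy_graph = {0: [1, 2], 1: [0], 2: [0], 3: []}
adjacency = dense_adjacency(toy_graph)
print(adjacency)
print("degrees:", adjacency.sum(axis=1))
```

A dense matrix grows quadratically with the number of nodes, so a sparse representation may be preferable for larger graphs.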