lifelong-learning
Lifelong Learning of Graph Neural Networks for Open-World Node Classification
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 5 DOI reference(s) in README
- ✓ Academic publication links: links to sciencedirect.com, zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (13.6%) to scientific vocabulary
Keywords
Repository
Lifelong Learning of Graph Neural Networks for Open-World Node Classification
Statistics
- Stars: 29
- Watchers: 2
- Forks: 4
- Open Issues: 0
- Releases: 0
Topics
Metadata Files
README.md
Lifelong Learning of Graph Neural Networks for Open-World Node Classification
Papers
Lukas Galke, Iacopo Vagliano, Benedikt Franke, Tobias Zielke, Marcel Hoffmann, Ansgar Scherp (2023). Lifelong Learning on Evolving Graphs Under the Constraints of Imbalanced Classes and New Classes. Neural Networks 164, 156-176.
Lukas Galke, Benedikt Franke, Tobias Zielke, Ansgar Scherp (2021). Lifelong Learning of Graph Neural Networks for Open-World Node Classification, In 2021 International Joint Conference on Neural Networks (IJCNN), IEEE.
```
@article{galke2023lifelong,
  title = {Lifelong learning on evolving graphs under the constraints of imbalanced classes and new classes},
  journal = {Neural Networks},
  volume = {164},
  pages = {156-176},
  year = {2023},
  issn = {0893-6080},
  doi = {10.1016/j.neunet.2023.04.022},
  url = {https://www.sciencedirect.com/science/article/pii/S0893608023002083},
  author = {Lukas Galke and Iacopo Vagliano and Benedikt Franke and Tobias Zielke and Marcel Hoffmann and Ansgar Scherp}
}

@inproceedings{galke2021lifelong,
  author = {Galke, Lukas and Franke, Benedikt and Zielke, Tobias and Scherp, Ansgar},
  booktitle = {2021 International Joint Conference on Neural Networks (IJCNN)},
  title = {Lifelong Learning of Graph Neural Networks for Open-World Node Classification},
  year = {2021},
  pages = {1-8},
  doi = {10.1109/IJCNN52387.2021.9533412}
}
```
Installation
- Setup a python virtual environment (recommended)
- Install pytorch as suited to your OS / python package manager
- Install dgl as suited to your OS / python package manager / CUDA version
- Install torch-geometric
- Install other requirements via
`pip install -r requirements.txt` within your copy of this repository. This will mainly install `numpy`, `scipy`, `pandas`, `matplotlib`, `seaborn`, and `scikit-learn`.
In requirements.txt, we list the versions we used for our experiments. Newer versions of these standard packages will likely work fine.
Get the datasets
The three datasets of our paper are available on zenodo.
Download the zip files and extract them into the data subdirectory, such that the structure looks exactly like this:
- data/dblp-easy
- data/dblp-hard
- data/pharmabio
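Before running experiments, it can help to verify that the datasets were extracted to the expected locations. A small sketch (this helper is illustrative, not part of the repository):

```python
import pathlib

def check_datasets(root="data", expected=("dblp-easy", "dblp-hard", "pharmabio")):
    """Return the names of dataset directories missing under `root`."""
    root = pathlib.Path(root)
    return [name for name in expected if not (root / name).is_dir()]

missing = check_datasets()
if missing:
    print("Missing dataset directories:", missing)
```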
Example call to run an experiment
The following example command runs an experiment with a GraphSAGE model with two graph convolutional layers and a hidden dimension of 32 on the dblp-easy dataset, starting evaluation at the task corresponding to year 2004 and using 200 annual epochs.
python3 run_experiment.py --seed 42 --backend dgl --model gs-mean --n_hidden 32 --start warm --lr "0.005" --history 3 --n_layers 1 --weight_decay 0 --dropout 0.5 --initial_epochs 0 --annual_epochs 200 --dataset "dblp-easy" --t_start 2004 --save "results.csv"
The results.csv file can be reused across multiple runs (e.g., with different seeds, models, or datasets); the script appends new results to the file.
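This append-and-accumulate pattern can be sketched with pandas (the column names here are illustrative, not the script's actual schema):

```python
import os
import pandas as pd

def append_results(path, rows):
    """Append experiment result rows to a CSV file, creating it with a header first."""
    df = pd.DataFrame(rows)
    # Write the header only when the file does not exist yet,
    # so repeated runs accumulate rows in the same file.
    df.to_csv(path, mode="a", header=not os.path.exists(path), index=False)

append_results("results.csv", [{"seed": 42, "model": "gs-mean", "accuracy": 0.61}])
append_results("results.csv", [{"seed": 43, "model": "gs-mean", "accuracy": 0.63}])
```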
Consult python3 run_experiment.py -h for more information. The other variant of our run script run_experiment_new.py is similar, but operates on preprocessed tasks (see below).
Tabularize results
To bring the results into a human readable form, we provide a tabularize.py script.
This script takes one or more results files as input and groups them by the attributes passed via the -g parameter, which accepts multiple attributes.
Example: Aggregate results by dataset, history size, model, restart configuration, and learning rate.
python3 tabularize.py results.csv -g dataset history model start initial_lr
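Such an aggregation boils down to a pandas group-by; a minimal sketch with illustrative columns and numbers (not the script's actual implementation):

```python
import pandas as pd

# Toy results, as they might accumulate in results.csv over several runs
results = pd.DataFrame({
    "dataset": ["dblp-easy"] * 4,
    "model": ["gs-mean", "gs-mean", "gcn", "gcn"],
    "accuracy": [0.61, 0.63, 0.58, 0.60],
})

# Group by the requested attributes and summarize the metric,
# mirroring what a `-g dataset model` call would produce.
table = results.groupby(["dataset", "model"])["accuracy"].agg(["mean", "std"])
print(table)
```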
The script tabularize_forward_transfer.py uses the exact grouping that we use for the results of the paper and also computes Forward Transfer (the averaged difference between warm and cold restarts).
Note that each configuration must have a value for both start=warm and start=cold, so that Forward Transfer can be computed.
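The Forward Transfer computation, as described above, amounts to pairing up warm and cold runs per configuration and averaging their difference. A sketch with made-up numbers (not the script's actual code):

```python
import pandas as pd

# Toy results: each model has one warm-restart and one cold-restart run
results = pd.DataFrame({
    "model": ["gcn", "gcn", "gs-mean", "gs-mean"],
    "start": ["warm", "cold", "warm", "cold"],
    "accuracy": [0.64, 0.60, 0.66, 0.61],
})

# Pivot so each configuration has a warm and a cold column,
# then take the averaged warm-minus-cold difference.
pivot = results.pivot_table(index="model", columns="start", values="accuracy")
forward_transfer = (pivot["warm"] - pivot["cold"]).mean()
print(forward_transfer)
```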
Visualize results
You can visualize with the visualize.py script:
python3 visualize.py --style "window size %RF" --hue model --col dataset --row start --save plot.png results.csv
where results.csv is the file where you've aggregated the results. You can also provide multiple results files, then they will be concatenated before plotting.
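The kind of faceted line plot that visualize.py produces can be sketched with plain matplotlib (illustrative numbers, not real results):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

# Illustrative data: per-task accuracy over time for two models
years = [2004, 2005, 2006, 2007]
curves = {"gs-mean": [0.60, 0.62, 0.63, 0.65], "gcn": [0.57, 0.59, 0.60, 0.61]}

fig, ax = plt.subplots()
for model, accuracies in curves.items():
    ax.plot(years, accuracies, marker="o", label=model)
ax.set_xlabel("task (year)")
ax.set_ylabel("accuracy")
ax.legend()
fig.savefig("plot.png")
```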
Preprocessing tasks for multiple experiments
Constructing tasks dynamically may take some time during experiments.
The tasks can be preprocessed with the script preprocess_dataset.py.
Important: the argument t_zero must be one task before the first evaluation task, i.e., if you want to start evaluation at year 2004, you must preprocess tasks starting at year 2003.
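The off-by-one relation between t_zero and the first evaluation task can be illustrated as follows (hypothetical helper, not part of the repository):

```python
def task_years(t_zero, t_end):
    """Tasks are consecutive years; the first task (t_zero) is only used
    for initial training, so evaluation starts one task later."""
    years = list(range(t_zero, t_end + 1))
    return years[0], years[1:]

warmup, evaluation = task_years(2003, 2006)
print(warmup)      # 2003: preprocessing must start here...
print(evaluation)  # [2004, 2005, 2006]: ...so evaluation can start at 2004
```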
Example to preprocess tasks for dblp-easy starting at year 2003 with a history size of 3 for models implemented via dgl:
python3 preprocess_dataset.py --t_zero 2003 --history 3 --backend dgl data/dblp-easy
In scripts/preprocess_all_datasets.bash, you find a shorthand to preprocess all datasets in all history size configurations from the paper for both backends.
This shorthand should be started with the repository's root as working directory and expects the directory structure of data/ as described above.
Then, you need to use the version of the run script that uses preprocessed tasks, namely run_experiment_new.py.
The interface of the script is nearly the same as that of run_experiment.py.
One difference is that reproducing our ablation study for comparison with once-trained static models is only possible with run_experiment.py.
Full reproduction of the paper's experiments
In the experiments/ directory, you find bash scripts to re-run all our experiments.
Issues?
Note: For the Python/bash scripts within this repository to work properly, you must call them from the root directory of this repository.
If there are any problems with using this repository, feel free to file an issue.
File Descriptions
| File | Description |
| --- | --- |
| analysis | scripts to perform analyses |
| datasets.py | dataset loading |
| lifelong_learning.py | lifelong learning module |
| drift_magnitude.py | drift magnitude module |
| preprocess_dataset.py | create a lifelong learning graph dataset by preprocessing tasks |
| experiments | bash scripts to reproduce experiments |
| scripts | other bash scripts to preprocess/compute t_diff distribution |
| models | GNN implementations |
| README.md | this file |
| requirements.txt | dependencies |
| run_experiment.py | main entry point for running a single experiment |
| run_experiment_new.py | main entry point for running a single experiment with preprocessed data |
| tabularize.py | aggregate results into a table |
| tabularize_forward_transfer.py | aggregate results into a table, including Forward Transfer computation |
| visualize.py | visualize results |
Notes
- The experiments for inductive vs transductive learning can be found in a different repository.
Owner
- Name: Lukas Galke
- Login: lgalke
- Kind: user
- Location: Nijmegen, NL
- Company: Max Planck Institute for Psycholinguistics
- Website: http://lpag.de
- Twitter: LukasGalke
- Repositories: 7
- Profile: https://github.com/lgalke
Early Postdoc at MPI Psycholinguistics exploring emergent communication with deep nets
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite the accompanying journal article."
preferred-citation:
type: article
authors:
- family-names: "Galke"
given-names: "Lukas"
orcid: "https://orcid.org/0000-0001-6124-1092"
- family-names: "Vagliano"
given-names: "Iacopo"
- family-names: "Franke"
given-names: "Benedikt"
- family-names: "Zielke"
given-names: "Tobias"
- family-names: "Hoffmann"
given-names: "Marcel"
- family-names: "Scherp"
given-names: "Ansgar"
title: "Lifelong learning on evolving graphs under the constraints of imbalanced classes and new classes"
doi: 10.1016/j.neunet.2023.04.022
year: 2023
journal: "Neural Networks"
volume: 164
start: 156
end: 176
GitHub Events
Total
- Watch event: 3
Last Year
- Watch event: 3
Dependencies
- dgl-cu110 ==0.5.3
- joblib *
- matplotlib ==3.3.4
- networkx *
- numpy *
- pandas ==1.2.1
- scikit-learn *
- scipy ==1.6.0
- seaborn ==0.11.0
- torch ==1.7.1
- torch-cluster ==1.5.8
- torch-geometric ==1.6.3
- torch-scatter ==2.0.5
- torch-sparse ==0.6.8
- torch-spline-conv ==1.2.0
- tqdm *