https://github.com/aspirina765/ultra
A foundation model for knowledge graph reasoning
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.3%) to scientific vocabulary
Last synced: 10 months ago
Repository
A foundation model for knowledge graph reasoning
Basic Info
- Host: GitHub
- Owner: aspirina765
- License: mit
- Default Branch: main
- Size: 5.74 MB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of DeepGraphLearning/ULTRA
Created over 2 years ago · Last pushed over 2 years ago
https://github.com/aspirina765/ULTRA/blob/main/
# ULTRA: Towards Foundation Models for Knowledge Graph Reasoning #

[PyTorch](https://pytorch.org/get-started/locally/) [PyG](https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html) [arXiv:2310.04562](https://arxiv.org/abs/2310.04562) [HuggingFace Hub](https://huggingface.co/collections/mgalkin/ultra-65699bb28369400a5827669d)

PyG implementation of [ULTRA], a foundation model for KG reasoning. Authored by [Michael Galkin], [Zhaocheng Zhu], and [Xinyu Yuan]. *Logo generated by DALLE 3.*

[Zhaocheng Zhu]: https://kiddozhu.github.io
[Michael Galkin]: https://migalkin.github.io/
[Xinyu Yuan]: https://github.com/KatarinaYuan
[Ultra]: https://deepgraphlearning.github.io/project/ultra

## Overview ##

ULTRA is a foundation model for knowledge graph (KG) reasoning. A single pre-trained ULTRA model performs link prediction on *any* multi-relational graph with any entity / relation vocabulary. Averaged over 50+ KGs, a single pre-trained ULTRA model performs better **in the 0-shot inference mode** than many SOTA models trained specifically on each graph. Following the *pretrain-finetune* paradigm of foundation models, you can run a pre-trained ULTRA checkpoint zero-shot on any graph right away, or fine-tune it further.

ULTRA provides **unified, learnable, transferable** representations for any KG. Under the hood, ULTRA employs graph neural networks and modified versions of [NBFNet](https://github.com/KiddoZhu/NBFNet-PyG). ULTRA does not learn any entity or relation embeddings specific to a downstream graph; instead, it obtains *relative relation representations* based on interactions between relations.

The original implementation with the TorchDrug framework is available [here](https://github.com/DeepGraphLearning/ultra_torchdrug) for reproduction purposes.

This repository is based on PyTorch 2.1 and PyTorch-Geometric 2.4.
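To make the idea of "interactions between relations" from the overview more concrete, here is a minimal sketch in plain Python. It is not the repository's implementation: the function name `relation_interactions` and the labels `h2h`, `h2t`, `t2h`, `t2t` are illustrative assumptions, and self-interactions are skipped for brevity. The sketch links two relations whenever they share an entity in a head or tail position, so the result mentions relations only relative to each other and never by a learned, graph-specific embedding.

```python
from collections import defaultdict
from itertools import product


def relation_interactions(triples):
    """Return (relation_a, interaction, relation_b) edges for a list of (head, relation, tail) triples.

    Hypothetical helper for illustration only; the labels are not taken from the repository.
    """
    heads = defaultdict(set)  # relation -> entities that appear as its head
    tails = defaultdict(set)  # relation -> entities that appear as its tail
    for head, relation, tail in triples:
        heads[relation].add(head)
        tails[relation].add(tail)

    relations = sorted(set(heads) | set(tails))
    edges = set()
    for a, b in product(relations, repeat=2):
        if a == b:
            continue  # simplification: ignore a relation's interaction with itself
        if heads[a] & heads[b]:
            edges.add((a, "h2h", b))  # both relations start from a shared entity
        if heads[a] & tails[b]:
            edges.add((a, "h2t", b))  # a starts where b ends
        if tails[a] & heads[b]:
            edges.add((a, "t2h", b))  # a ends where b starts
        if tails[a] & tails[b]:
            edges.add((a, "t2t", b))  # both relations end at a shared entity
    return edges


toy_triples = [
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "acme"),
    ("acme", "located_in", "berlin"),
]
toy_edges = relation_interactions(toy_triples)
print(sorted(toy_edges))
```

In the toy graph, `works_at` ends at the entity where `located_in` starts, so the only edges are a `t2h` edge from `works_at` to `located_in` and the mirrored `h2t` edge. Renaming every entity leaves the output unchanged, which is the property that lets the same model be applied to a graph with a different vocabulary.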
**Your superpowers**:

* Use the [pre-trained checkpoints](#checkpoints) to run zero-shot inference and fine-tuning on 57 transductive and inductive [datasets](#datasets).
* Run [training and inference](#run-inference-and-fine-tuning) with multiple GPUs.
* [Pre-train](#pretraining) ULTRA on your own mixture of graphs.
* Run [evaluation on many datasets](#run-on-many-datasets) sequentially.
* Use the pre-trained checkpoints to run inference and fine-tuning on [your own KGs](#adding-your-own-graph).

Table of contents:

* [Installation](#installation)
* [Checkpoints](#checkpoints)
* [Run inference and fine-tuning](#run-inference-and-fine-tuning)
    * [Single experiment](#run-a-single-experiment)
    * [Many experiments](#run-on-many-datasets)
* [Pretraining](#pretraining)
* [Datasets](#datasets)
    * [Adding custom datasets](#adding-your-own-graph)

## Updates

* **Jan 15th, 2024**: Accepted at [ICLR 2024](https://openreview.net/forum?id=jVEoydFOl9)!
* **Dec 4th, 2023**: Added a new ULTRA checkpoint `ultra_50g` pre-trained on 50 graphs. Averaged over 16 larger transductive graphs, it delivers 0.389 MRR / 0.549 Hits@10 compared to 0.329 MRR / 0.479 Hits@10 of the `ultra_3g` checkpoint. The inductive performance is still as good! Use this checkpoint for inference on larger graphs.
* **Dec 4th, 2023**: Pre-trained ULTRA models (3g, 4g, 50g) are now also available on the [HuggingFace Hub](https://huggingface.co/collections/mgalkin/ultra-65699bb28369400a5827669d)!

## Installation ##

You may install the dependencies via either conda or pip. Ultra PyG is implemented with Python 3.9, PyTorch 2.1 and PyG 2.4 (CUDA 11.8 or later when running on GPUs). If you are on a Mac, you may omit the CUDA toolkit requirements.
### From Conda ###

```bash
conda install pytorch=2.1.0 pytorch-cuda=11.8 cudatoolkit=11.8 pytorch-scatter=2.1.2 pyg=2.4.0 -c pytorch -c nvidia -c pyg -c conda-forge
conda install ninja easydict pyyaml -c conda-forge
```

### From Pip ###

```bash
pip install torch==2.1.0 --index-url https://download.pytorch.org/whl/cu118
pip install torch-scatter==2.1.2 torch-sparse==0.6.18 torch-geometric==2.4.0 -f https://data.pyg.org/whl/torch-2.1.0+cu118.html
pip install ninja easydict pyyaml
```

## Checkpoints ##

We provide two pre-trained ULTRA checkpoints in the `/ckpts` folder. Both have the same model size (6-layer GNNs per relation and entity graphs, 64d, 168k total parameters) and were trained on 4 x A100 GPUs with this codebase:

* `ultra_3g.pth`: trained on `FB15k237, WN18RR, CoDExMedium` for 800,000 steps, config is in `/config/transductive/pretrain_3g.yaml`
* `ultra_4g.pth`: trained on `FB15k237, WN18RR, CoDExMedium, NELL995` for 400,000 steps, config is in `/config/transductive/pretrain_4g.yaml`

You can use these checkpoints for zero-shot inference on any graph (including your own) or as a backbone for fine-tuning. Both checkpoints are rather small (2 MB each).

The results table further below compares the zero-shot performance of these checkpoints with the paper version (PyG experiments were run on a single RTX 3090, PyTorch 2.1, PyG 2.4, CUDA 11.8 using the `run_many.py` script in this repo).

Compilation of the `rspmm` kernel
To make relational message passing iteration `O(V)` instead of `O(E)`, we ship a custom `rspmm` kernel that is compiled automatically upon the first launch. The `rspmm` kernel supports the `transe` and `distmult` message functions; others, like `rotate`, fall back to full edge materialization and `O(E)` complexity.

The kernel can be compiled on both CPUs (including M1/M2 Macs) and GPUs; compilation happens only once and the result is cached. For GPUs, you need a CUDA 11.8+ toolkit with the `nvcc` compiler. If you are deploying in a Docker container, start from the `devel` images, which contain `nvcc` in addition to the plain CUDA runtime.

Make sure your `CUDA_HOME` variable is set properly to avoid potential compilation errors, e.g.

```bash
export CUDA_HOME=/usr/local/cuda-11.8/
```
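Before the first launch triggers compilation, it can help to check the prerequisites named above. The following is a small sketch using only the Python standard library. The helper name `environment_report` and the distribution names passed to `importlib.metadata` (`torch`, `torch-geometric`, `torch-scatter`) are assumptions and may differ depending on how the packages were installed. Missing packages are reported as `None` instead of raising.

```python
import os
import shutil
import sys
from importlib import metadata


def environment_report():
    """Collect Python/package versions and the CUDA build prerequisites (hypothetical helper)."""
    report = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}

    # Assumed distribution names; adjust if your installer registers them differently.
    for package in ("torch", "torch-geometric", "torch-scatter"):
        try:
            report[package] = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = None

    # nvcc is only needed when compiling the kernel for GPUs.
    report["nvcc"] = shutil.which("nvcc")
    report["CUDA_HOME"] = os.environ.get("CUDA_HOME")
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value if value is not None else 'not found'}")
```

A `None` entry for `nvcc` or `CUDA_HOME` is expected on CPU-only machines and on Macs. On a GPU machine it usually points to a runtime-only CUDA installation or a missing `export`.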
| Model | Inductive (e), 18 graphs: MRR | Inductive (e), 18 graphs: Hits@10 | Inductive (e,r), 23 graphs: MRR | Inductive (e,r), 23 graphs: Hits@10 |
|---|---|---|---|---|
| ULTRA (3g) Paper | 0.430 | 0.566 | 0.345 | 0.512 |
| ULTRA (4g) Paper | 0.439 | 0.580 | 0.352 | 0.518 |
| ULTRA (3g) PyG | 0.420 | 0.562 | 0.344 | 0.511 |
| ULTRA (4g) PyG | 0.444 | 0.588 | 0.344 | 0.513 |
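For readers unfamiliar with the two metrics in the table, the sketch below shows how MRR and Hits@10 are commonly derived from the rank of the correct entity for each test query. It is a generic illustration, not the evaluation code of this repository: the function names are hypothetical, and producing the ranks themselves (including any filtering of known true triples) is assumed to have happened already.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries; ranks are 1-based positions of the correct answer."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)


def hits_at_k(ranks, k=10):
    """Fraction of queries whose correct answer is ranked within the top k."""
    return sum(rank <= k for rank in ranks) / len(ranks)


# Toy ranks for four queries (made-up values, purely for illustration).
toy_ranks = [1, 2, 10, 20]
print(f"MRR: {mean_reciprocal_rank(toy_ranks):.4f}")
print(f"Hits@10: {hits_at_k(toy_ranks, k=10):.2f}")
```

For the toy ranks, the reciprocal ranks are 1, 0.5, 0.1 and 0.05, giving an MRR of 0.4125, and three of the four queries fall within the top 10, giving a Hits@10 of 0.75. Both metrics lie between 0 and 1, and higher is better.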
On the training graph mixture
Right now, 10 transductive datasets are supported for the pre-training mixture in the `JointDataset`:

* FB15k237
* WN18RR
* CoDExSmall
* CoDExMedium
* CoDExLarge
* NELL995
* YAGO310
* ConceptNet100k
* DBpedia100k
* AristoV4

You can add more datasets (from all 57 implemented ones as well as your custom ones) by modifying the `datasets_map` in `datasets.py`. When adding inductive datasets, you also need to add proper filtering datasets (similar to the `test()` function in `run.py`) to keep the evaluation protocol consistent.

Transductive datasets (16)
* `FB15k237`, `WN18RR`, `NELL995`, `YAGO310`, `CoDExSmall`, `CoDExMedium`, `CoDExLarge`, `Hetionet`, `ConceptNet100k`, `DBpedia100k`, `AristoV4` - full head/tail evaluation
* `WDsinger`, `NELL23k`, `FB15k237_10`, `FB15k237_20`, `FB15k237_50` - only tail evaluation

Inductive (entity) datasets (18) - new nodes but same relations at inference time
* 12 GraIL datasets (FB / WN / NELL) x (V1 / V2 / V3 / V4)
* 2 ILPC 2022 datasets
* 4 datasets from [INDIGO](https://github.com/shuwen-liu-ox/INDIGO)

| Dataset | Versions |
| :-------: | :-------:|
| `FB15k237Inductive`| `v1, v2, v3, v4` |
| `WN18RRInductive`| `v1, v2, v3, v4` |
| `NELLInductive`| `v1, v2, v3, v4` |
| `ILPC2022`| `small, large` |
| `HM`| `1k, 3k, 5k, indigo` |

Inductive (entity, relation) datasets (23) - both new nodes and relations at inference time
* 13 Ingram datasets: (FB / WK / NL) x (25 / 50 / 75 / 100), plus the `0` version of NL
* 10 [MTDEA](https://arxiv.org/abs/2307.06046) datasets

| Dataset | Versions |
| :-------: | :-------:|
| `FBIngram`| `25, 50, 75, 100` |
| `WKIngram`| `25, 50, 75, 100` |
| `NLIngram`| `0, 25, 50, 75, 100` |
| `WikiTopicsMT1`| `tax, health` |
| `WikiTopicsMT2`| `org, sci` |
| `WikiTopicsMT3`| `art, infra` |
| `WikiTopicsMT4`| `sci, health` |
| `Metafam`| single version |
| `FBNELL`| single version |

Code example
```python
class CustomDataset(TransductiveDataset):

    urls = [
        "link/to/train.txt",
        "link/to/valid.txt",
        "link/to/test.txt",
    ]
    name = "custom_data"
```

Code example
```python
class CustomDataset(InductiveDataset):

    urls = [
        "link/to/train.txt",
        "link/to/inference_graph.txt",
        "link/to/inference_valid.txt",
        "link/to/inference_test.txt",
    ]
    name = "custom_data"
```

Owner
- Login: aspirina765
- Kind: user
- Repositories: 423
- Profile: https://github.com/aspirina765