https://github.com/awslabs/dgl-ke

High performance, easy-to-use, and scalable package for learning large-scale knowledge graph embeddings.

Science Score: 20.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    1 of 26 committers (3.8%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.8%) to scientific vocabulary

Keywords

dgl graph-learning knowledge-graph knowledge-graphs-embeddings machine-learning

Keywords from Contributors

mxnet sagemaker drug-discovery geometric-deep-learning graph-neural-networks
Last synced: 5 months ago

Repository

High performance, easy-to-use, and scalable package for learning large-scale knowledge graph embeddings.

Basic Info
  • Host: GitHub
  • Owner: awslabs
  • License: apache-2.0
  • Language: Python
  • Default Branch: master
  • Homepage: https://dglke.dgl.ai/doc/
  • Size: 4.76 MB
Statistics
  • Stars: 1,317
  • Watchers: 27
  • Forks: 197
  • Open Issues: 60
  • Releases: 2
Topics
dgl graph-learning knowledge-graph knowledge-graphs-embeddings machine-learning
Created almost 6 years ago · Last pushed 7 months ago
Metadata Files
Readme · Contributing · License · Code of conduct

README.md

Knowledge graphs (KGs) are data structures that store information about different entities (nodes) and their relations (edges). A common approach to using KGs in various machine learning tasks is to compute knowledge graph embeddings. DGL-KE is a high-performance, easy-to-use, and scalable package for learning large-scale knowledge graph embeddings. The package is implemented on top of the Deep Graph Library (DGL), and developers can run DGL-KE on a CPU machine, a GPU machine, or a cluster of machines with a set of popular models, including TransE, TransR, RESCAL, DistMult, ComplEx, and RotatE.
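
For a concrete sense of what these models compute, each one scores a (head, relation, tail) triple with a simple algebraic function of the learned vectors. The following is a minimal NumPy sketch (not DGL-KE code) of the TransE (L2) and DistMult scores, using randomly initialized 400-dimensional embeddings purely for illustration:

import numpy as np

dim = 400                                   # embedding dimension (matches the quick start below)
rng = np.random.default_rng(0)
head = rng.normal(size=dim)                 # head entity embedding
rel = rng.normal(size=dim)                  # relation embedding
tail = rng.normal(size=dim)                 # tail entity embedding

# TransE (L2): a plausible triple should satisfy head + rel close to tail,
# so the score is a margin minus the L2 distance.
gamma = 19.9                                # margin, same value used in the quick-start command
transe_score = gamma - np.linalg.norm(head + rel - tail, ord=2)

# DistMult: a bilinear score, the element-wise product summed over dimensions.
distmult_score = np.sum(head * rel * tail)

print(transe_score, distmult_score)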

Figure: DGL-KE Overall Architecture

Currently DGL-KE supports three tasks:

  • Training: trains KG embeddings using dglke_train (single machine) or dglke_dist_train (distributed environment).
  • Evaluation: reads the pre-trained embeddings and evaluates them with a link prediction task on the test set using dglke_eval (a sketch of the reported metrics follows this list).
  • Inference: reads the pre-trained embeddings and runs entity/relation link prediction with dglke_predict, or embedding similarity inference with dglke_emb_sim.
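
The evaluation task reports standard link prediction metrics such as mean rank (MR), mean reciprocal rank (MRR), and Hits@k. As a rough illustration of how those numbers are derived (the ranks below are made up, not DGL-KE output):

import numpy as np

# Hypothetical ranks of the true entity among all candidates, one per test triple
# (rank 1 means the model scored the correct entity highest).
ranks = np.array([1, 3, 120, 2, 7, 1, 54])

mr = ranks.mean()                  # Mean Rank: lower is better
mrr = (1.0 / ranks).mean()         # Mean Reciprocal Rank: higher is better
hits_at_10 = (ranks <= 10).mean()  # fraction of test triples ranked in the top 10

print(f"MR={mr:.1f}  MRR={mrr:.3f}  Hits@10={hits_at_10:.3f}")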

Note

If you just want to train a KGE model using TransE, DistMult, or RotatE, please go to https://github.com/awslabs/graphstorm.

A Quick Start

To install the latest version of DGL-KE run:

sudo pip3 install dgl
sudo pip3 install dglke

Train a TransE model on the FB15k dataset by running the following command:

DGLBACKEND=pytorch dglke_train --model_name TransE_l2 --dataset FB15k --batch_size 1000 \
    --neg_sample_size 200 --hidden_dim 400 --gamma 19.9 --lr 0.25 --max_step 500 --log_interval 100 \
    --batch_size_eval 16 -adv --regularization_coef 1.00E-09 --test --num_thread 1 --num_proc 8

This command downloads the FB15k dataset, trains the TransE model, and saves the trained embeddings to a file.
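
The embeddings end up as NumPy arrays under a checkpoint directory. The exact path and file names below follow DGL-KE's usual naming convention but are assumptions, so check the training log for the actual location:

import numpy as np

# Assumed checkpoint location for the quick-start run above; adjust to your output.
ckpt = "ckpts/TransE_l2_FB15k_0"
entity_emb = np.load(f"{ckpt}/FB15k_TransE_l2_entity.npy")      # shape: (num_entities, 400)
relation_emb = np.load(f"{ckpt}/FB15k_TransE_l2_relation.npy")  # shape: (num_relations, 400)
print(entity_emb.shape, relation_emb.shape)

# Example: cosine similarity between two entity embeddings, i.e. the kind of
# computation dglke_emb_sim performs in bulk.
a, b = entity_emb[0], entity_emb[1]
print(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))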

Performance and Scalability

DGL-KE is designed for learning at scale. It introduces various novel optimizations that accelerate training on knowledge graphs with millions of nodes and billions of edges. Our benchmark on knowledge graphs consisting of over 86M nodes and 338M edges shows that DGL-KE can compute embeddings in 100 minutes on an EC2 instance with 8 GPUs and 30 minutes on an EC2 cluster with 4 machines (48 cores/machine). These results represent a 2×∼5× speedup over the best competing approaches.

Figure: DGL-KE vs GraphVite on FB15k

Figure: DGL-KE vs PyTorch-BigGraph on Freebase

Learn more in our documentation! If you are interested in the optimizations in DGL-KE, please check out our paper for more details.

Cite

If you use DGL-KE in a scientific publication, we would appreciate citations to the following paper:

@inproceedings{DGL-KE,
  author    = {Zheng, Da and Song, Xiang and Ma, Chao and Tan, Zeyuan and Ye, Zihao and Dong, Jin and Xiong, Hao and Zhang, Zheng and Karypis, George},
  title     = {DGL-KE: Training Knowledge Graph Embeddings at Scale},
  year      = {2020},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  booktitle = {Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval},
  pages     = {739--748},
  numpages  = {10},
  series    = {SIGIR '20}
}

License

This project is licensed under the Apache-2.0 License.

Owner

  • Name: Amazon Web Services - Labs
  • Login: awslabs
  • Kind: organization
  • Location: Seattle, WA

AWS Labs

GitHub Events

Total
  • Issues event: 2
  • Watch event: 56
  • Issue comment event: 3
  • Fork event: 6
Last Year
  • Issues event: 2
  • Watch event: 56
  • Issue comment event: 3
  • Fork event: 6

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 209
  • Total Committers: 26
  • Avg Commits per committer: 8.038
  • Development Distribution Score (DDS): 0.603
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Chao Ma m****0@g****m 83
Song c****g@g****m 27
Da Zheng z****6@g****m 22
Ubuntu u****u@i****l 21
Lingfei Huo m****v@g****m 20
Zheng d****n@1****m 6
Xiaoyu Zhai x****i@h****m 5
Ubuntu u****u@i****l 3
Phi p****r@g****m 3
Biyang Zeng 3****3 2
Jinjing Zhou V****n 2
Lingfei l****0@u****u 1
Ubuntu u****u@i****l 1
Amazon GitHub Automation 5****o 1
Daiki Katsuragawa 5****a 1
James Michael DuPont J****t@g****m 1
Marc van Oudheusden 6****s 1
Sebastian Nilsson 5****o 1
Tudor Andrei Dumitrascu T****i 1
Ugo Cottin u****n@g****m 1
ZhichenJiang 1****0@q****m 1
guillaume-be g****n@g****m 1
jmikedupont2 j****2@g****m 1
joker j****i@o****m 1
lroberts7 1****7 1
nbro n****o 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 81
  • Total pull requests: 28
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 2 months
  • Total issue authors: 70
  • Total pull request authors: 17
  • Average comments per issue: 3.1
  • Average comments per pull request: 0.71
  • Merged pull requests: 21
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ryantd (3)
  • Maristela-de-Jesus (3)
  • classicsong (3)
  • DJRavinszkha (2)
  • ccvalley (2)
  • Maester-Khris (2)
  • YijianLiu (2)
  • yulong-CSAI (2)
  • guillaume-be (1)
  • maryamag85 (1)
  • thomasthomasth (1)
  • zhjwy9343 (1)
  • trevorlazarus (1)
  • alexucb (1)
  • asaluja (1)
Pull Request Authors
  • ryantd (6)
  • menjarleev (4)
  • zby123 (3)
  • classicsong (2)
  • raifthenerd (2)
  • joker-xii (1)
  • TudorAndrei (1)
  • jeanimal (1)
  • jmikedupont2 (1)
  • guillaume-be (1)
  • moudheus (1)
  • ugocottin (1)
  • Michael1015198808 (1)
  • h4ck3rm1k3 (1)
  • aksnzhy (1)
Top Labels
Issue Labels
enhancement (7) bug (3) duplicate (1)
Pull Request Labels

Packages

  • Total packages: 2
  • Total downloads:
    • pypi: 259 last month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 5
    (may contain duplicates)
  • Total versions: 7
  • Total maintainers: 1
proxy.golang.org: github.com/awslabs/dgl-ke
  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.5%
Average: 5.7%
Dependent repos count: 5.9%
Last synced: 6 months ago
pypi.org: dglke

A distributed system to learn embeddings of large graphs

  • Versions: 6
  • Dependent Packages: 0
  • Dependent Repositories: 5
  • Downloads: 259 last month
Rankings
Stargazers count: 1.9%
Forks count: 3.7%
Dependent repos count: 6.6%
Average: 6.9%
Dependent packages count: 10.1%
Downloads: 12.3%
Maintainers (1)
Last synced: 6 months ago

Dependencies

python/setup.py pypi