hybridbackend

A high-performance framework for training wide-and-deep recommender systems on heterogeneous clusters

https://github.com/deeprec-ai/hybridbackend

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: ieee.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.8%) to scientific vocabulary

Keywords

deep-learning gpu hybrid-parallelism parquet recommender-system
Last synced: 6 months ago

Repository

A high-performance framework for training wide-and-deep recommender systems on heterogeneous clusters

Basic Info
  • Host: GitHub
  • Owner: DeepRec-AI
  • License: apache-2.0
  • Language: C++
  • Default Branch: main
  • Homepage:
  • Size: 2.89 MB
Statistics
  • Stars: 157
  • Watchers: 15
  • Forks: 31
  • Open Issues: 13
  • Releases: 7
Topics
deep-learning gpu hybrid-parallelism parquet recommender-system
Created about 4 years ago · Last pushed almost 2 years ago
Metadata Files
Readme Contributing License Citation Roadmap

README.md

HybridBackend


HybridBackend is a high-performance framework for training wide-and-deep recommender systems on heterogeneous clusters.

Features

  • Memory-efficient loading of categorical data
  • GPU-efficient orchestration of embedding layers
  • Communication-efficient training and evaluation at scale
  • Easy to use with existing AI workflows

Usage

A minimal example:

```python
import tensorflow as tf
import hybridbackend.tensorflow as hb

ds = hb.data.Dataset.from_parquet(filenames)
ds = ds.batch(batch_size)

# ...

with tf.device('/gpu:0'):
  embs = tf.nn.embedding_lookup_sparse(weights, input_ids)
  # ...
```
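To make the embedding lookup in the example more concrete, the following pure-Python sketch shows what a sparse embedding lookup computes conceptually: each example carries a variable-length list of categorical ids, the matching rows of an embedding table are gathered, and they are combined into one vector per example. The name `lookup_sparse`, the toy table, and the choice of a mean combiner are assumptions made for illustration; this is not HybridBackend's or TensorFlow's implementation.

```python
from typing import List, Sequence


def lookup_sparse(table: Sequence[Sequence[float]],
                  ids_per_example: Sequence[Sequence[int]]) -> List[List[float]]:
    """Gather embedding rows for each example and average them (hypothetical helper)."""
    dim = len(table[0])
    out = []
    for ids in ids_per_example:
        if not ids:
            # An example with no ids maps to a zero vector.
            out.append([0.0] * dim)
            continue
        summed = [0.0] * dim
        for i in ids:
            for d in range(dim):
                summed[d] += table[i][d]
        out.append([v / len(ids) for v in summed])
    return out


# Toy embedding table: 4 ids, 2 dimensions each.
table = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# Three examples with 2, 1, and 0 categorical ids respectively.
embs = lookup_sparse(table, [[1, 2], [3], []])
print(embs)
```

Because the number of ids varies per example, the work per row is irregular, which is one reason such lookups are usually treated as a separate optimization target from the dense layers of a model.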

Please see documentation for more information.

Install

Method 1: Install from PyPI

pip install {PACKAGE}

| {PACKAGE}                 | Dependency      | Python | CUDA | GLIBC  | Data Opt. | Embedding Opt. | Parallelism Opt. |
| ------------------------- | --------------- | ------ | ---- | ------ | --------- | -------------- | ---------------- |
| hybridbackend-tf115-cu121 | TensorFlow 1.15 | 3.8    | 12.1 | >=2.31 | ✓         | ✓              | ✓                |
| hybridbackend-tf115-cu100 | TensorFlow 1.15 | 3.6    | 10.0 | >=2.27 | ✓         | ✓              | ✗                |
| hybridbackend-tf115-cpu   | TensorFlow 1.15 | 3.6    | -    | >=2.24 | ✓         | ✗              | ✗                |
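The table can be read as a simple lookup from environment to package name. The sketch below encodes its three rows and returns the package whose Python and CUDA versions match exactly. `PACKAGES` and `pick_package` are hypothetical names introduced here for illustration and are not part of HybridBackend; the GLIBC requirement is not checked.

```python
from typing import Optional

# Rows of the table: (package, Python version, CUDA version or None for CPU-only).
PACKAGES = [
    ("hybridbackend-tf115-cu121", "3.8", "12.1"),
    ("hybridbackend-tf115-cu100", "3.6", "10.0"),
    ("hybridbackend-tf115-cpu", "3.6", None),
]


def pick_package(python: str, cuda: Optional[str] = None) -> Optional[str]:
    """Return the package matching the given Python/CUDA versions, or None."""
    for name, py, cu in PACKAGES:
        if py == python and cu == cuda:
            return name
    return None


print(pick_package("3.8", "12.1"))
print(pick_package("3.6"))
```

A combination that is not listed in the table yields `None`, in which case building from source is the remaining option.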

Method 2: Build from source

See Building Instructions.

We also provide prebuilt Docker images for the latest DeepRec: registry.cn-shanghai.aliyuncs.com/pai-dlc/hybridbackend:1.0.0-deeprec-py3.6-cu114-ubuntu18.04

License

HybridBackend is licensed under the Apache 2.0 License.

Community

    @inproceedings{zhang2022picasso,
      title={PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems},
      author={Zhang, Yuanxing and Chen, Langshi and Yang, Siran and Yuan, Man and Yi, Huimin and Zhang, Jie and Wang, Jiamang and Dong, Jianbo and Xu, Yunlong and Song, Yue and others},
      booktitle={2022 IEEE 38th International Conference on Data Engineering (ICDE)},
      year={2022},
      organization={IEEE}
    }

Contact Us

If you would like to share your experiences with others, you are welcome to contact us on DingTalk.

Owner

  • Name: DeepRec-AI
  • Login: DeepRec-AI
  • Kind: organization

Citation (CITATION.cff)

cff-version: 1.1.0
title: HybridBackend
doi: 10.5281/zenodo.6464188
type: software
url: "https://github.com/alibaba/HybridBackend"
authors:
  - given-names: Man
    family-names: Yuan
  - given-names: Langshi
    family-names: Chen
message: >-
  Please cite HybridBackend in your publications if it helps
preferred-citation:
  title: "PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems"
  type: conference-paper
  collection-title: "2022 IEEE 38th International Conference on Data Engineering (ICDE)"
  year: 2022
  authors:
  - family-names: "Zhang"
    given-names: "Yuanxing"
  - family-names: "Chen"
    given-names: "Langshi"
  - family-names: "Yang"
    given-names: "Siran"
  - family-names: "Yuan"
    given-names: "Man"
  - family-names: "Yi"
    given-names: "Huimin"
  - family-names: "Zhang"
    given-names: "Jie"
  - family-names: "Wang"
    given-names: "Jiamang"
  - family-names: "Dong"
    given-names: "Jianbo"
  - family-names: "Xu"
    given-names: "Yunlong"
  - family-names: "Song"
    given-names: "Yue"
  - family-names: "Li"
    given-names: "Yong"
  - family-names: "Zhang"
    given-names: "Di"
  - family-names: "Lin"
    given-names: "Wei"
  - family-names: "Qu"
    given-names: "Lin"
  - family-names: "Zheng"
    given-names: "Bo"

GitHub Events

Total
  • Watch event: 5
  • Fork event: 1
Last Year
  • Watch event: 5
  • Fork event: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 8
  • Total pull requests: 9
  • Average time to close issues: 11 days
  • Average time to close pull requests: 1 day
  • Total issue authors: 5
  • Total pull request authors: 3
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.11
  • Merged pull requests: 3
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • dixingxing0 (3)
  • karterotte (2)
  • DelightRun (1)
  • ZhuYuJin (1)
Pull Request Authors
  • francktcheng (6)
  • 2sin18 (3)
  • Nov11 (2)
Top Labels
Issue Labels
Pull Request Labels
enhancement (2)

Dependencies

.github/workflows/cibuild.yaml actions
  • EnricoMi/publish-unit-test-result-action v1 composite
  • actions/checkout v2 composite
  • actions/download-artifact v2 composite
  • actions/upload-artifact v2 composite
  • aliyun/ack-set-context v1 composite
  • dorny/test-reporter v1 composite
.github/workflows/cpu-legacy-nightly.yaml actions
  • actions/checkout v2 composite
  • aliyun/ack-set-context v1 composite
  • pypa/gh-action-pypi-publish release/v1 composite
.github/workflows/cpu-legacy.yaml actions
  • actions/checkout v2 composite
  • aliyun/ack-set-context v1 composite
  • pypa/gh-action-pypi-publish release/v1 composite
.github/workflows/cpu-nightly.yaml actions
  • actions/checkout v2 composite
  • aliyun/ack-set-context v1 composite
  • michaelhenry/create-report v2.0.0 composite
  • pypa/gh-action-pypi-publish release/v1 composite
.github/workflows/cpu.yaml actions
  • actions/checkout v2 composite
  • aliyun/ack-set-context v1 composite
  • pypa/gh-action-pypi-publish release/v1 composite
.github/workflows/gpu-nightly.yaml actions
  • actions/checkout v2 composite
  • aliyun/ack-set-context v1 composite
  • michaelhenry/create-report v2.0.0 composite
  • pypa/gh-action-pypi-publish release/v1 composite
.github/workflows/gpu.yaml actions
  • actions/checkout v2 composite
  • aliyun/ack-set-context v1 composite
  • pypa/gh-action-pypi-publish release/v1 composite
docs/requirements.txt pypi
  • docutils ==0.16
  • hybridbackend-cpu ==0.6.0a0
  • myst-parser *
  • sphinx *
  • sphinx_rtd_theme *
  • tensorflow ==1.15.5