https://github.com/deepset-ai/nvidia-triton-inference
This repository contains setup examples for hosting model inference using NVIDIA Triton.
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (4.3%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: deepset-ai
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 31.3 KB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 2
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
nvidia-triton-inference
This repository contains setup examples for hosting model inference using NVIDIA Triton.
How to build a Triton embedding image
- Set up your model and tokenizer files
  - Move model.onnx to `hf-embedding-template/onnx_model/1/`
  - Move any other model files (model and tokenizer config) to `hf-embedding-template/preprocessing/1/`
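The file placement above can be scripted. The following is a minimal sketch, not part of the repository: the function name `stage_model_files` and the source directory are hypothetical, and the script copies files rather than moving them so the original export stays intact. It assumes the exported model directory contains `model.onnx` alongside the model and tokenizer config files.

```python
import shutil
from pathlib import Path


def stage_model_files(source_dir: Path, template_dir: Path) -> list[Path]:
    """Copy an exported model into the template layout described above.

    model.onnx goes to onnx_model/1/; every other regular file
    (model and tokenizer config) goes to preprocessing/1/.
    """
    onnx_dir = template_dir / "onnx_model" / "1"
    prep_dir = template_dir / "preprocessing" / "1"
    onnx_dir.mkdir(parents=True, exist_ok=True)
    prep_dir.mkdir(parents=True, exist_ok=True)

    staged = []
    for path in sorted(source_dir.iterdir()):
        if not path.is_file():
            continue  # skip subdirectories; only flat files are handled here
        target_dir = onnx_dir if path.name == "model.onnx" else prep_dir
        # shutil.copy2 returns the path of the newly created file
        staged.append(Path(shutil.copy2(path, target_dir)))
    return staged
```

A call such as `stage_model_files(Path("exported-model"), Path("hf-embedding-template"))` would populate the template, where `exported-model` is a placeholder for wherever the ONNX export and tokenizer files live.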
Start Triton Server and attach shell
```
docker run --shm-size=16g --gpus all -it --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 -v /hf-embedding-template:/models nvcr.io/nvidia/tritonserver:24.08-py3 bash
```
Run inside the Triton Container
```
pip install transformers
tritonserver --model-repository=/models
```
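Once `tritonserver` is running, it may take a while before all models are loaded. The sketch below polls Triton's HTTP readiness endpoint (`/v2/health/ready`) on port 8000, which the `docker run` command maps to the host. It uses only the standard library; the helper names `ready_url` and `wait_until_ready` and the timeout values are illustrative assumptions, not part of the repository.

```python
import time
import urllib.error
import urllib.request


def ready_url(host: str = "localhost", port: int = 8000) -> str:
    """Build the URL of Triton's HTTP readiness endpoint."""
    return f"http://{host}:{port}/v2/health/ready"


def wait_until_ready(url: str, timeout_s: float = 60.0, interval_s: float = 2.0) -> bool:
    """Poll the readiness endpoint until it answers 200 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Server not reachable or not ready yet; retry after a pause.
            pass
        time.sleep(interval_s)
    return False
```

Calling `wait_until_ready(ready_url())` from the host before starting the client avoids connection errors while the models are still loading.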
Run client
```
pip install "tritonclient[http]"
python client.py
```
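The contents of `client.py` are not shown here. As a rough illustration of what a request to the server looks like, the sketch below builds a JSON body for Triton's HTTP inference endpoint (`POST /v2/models/<model>/infer`) with the standard library only. The input tensor name `TEXT`, the `[batch, 1]` shape, and the helper names are assumptions; the real values depend on the `config.pbtxt` files in the model repository.

```python
import json
import urllib.request


def build_infer_request(texts, input_name="TEXT"):
    """Build an inference request body for a batch of strings.

    input_name and the [batch, 1] shape are assumptions; check the model's
    config.pbtxt for the actual tensor name and dimensions.
    """
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": [len(texts), 1],
                "datatype": "BYTES",  # string tensors are sent as BYTES
                "data": list(texts),
            }
        ]
    }


def infer(model_name, texts, base_url="http://localhost:8000"):
    """POST the request to a running Triton server and return the parsed JSON."""
    body = json.dumps(build_infer_request(texts)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/v2/models/{model_name}/infer",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

The `tritonclient[http]` package installed above wraps the same protocol and is the more convenient choice in practice; the raw request is shown only to make the wire format visible.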
Helm charts
This repo comes with ready-to-run Helm charts, which can be found under `/helm`. For example, `text-embedder-trion` is preconfigured to run a Triton embedding server.
Owner
- Name: deepset
- Login: deepset-ai
- Kind: organization
- Email: hello@deepset.ai
- Location: Berlin, Germany
- Website: https://deepset.ai
- Twitter: deepset_ai
- Repositories: 14
- Profile: https://github.com/deepset-ai
Building enterprise search systems powered by latest NLP & open-source.
GitHub Events
Total
- Fork event: 1
Last Year
- Fork event: 1
Issues and Pull Requests
Last synced: 11 months ago
All Time
- Total issues: 0
- Total pull requests: 6
- Average time to close issues: N/A
- Average time to close pull requests: 10 minutes
- Total issue authors: 0
- Total pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 6
- Average time to close issues: N/A
- Average time to close pull requests: 10 minutes
- Issue authors: 0
- Pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- tstadel (9)
- SuperMohit (2)