tritonserver-docs-cn

Triton Inference Server Docs Chinese

https://github.com/guyue55/tritonserver-docs-cn

Science Score: 31.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (9.8%) to scientific vocabulary

Last synced: 10 months ago

Repository

Triton Inference Server Docs Chinese

Basic Info

Host: GitHub
Owner: guyue55
License: bsd-3-clause
Language: Python
Default Branch: main
Size: 4.76 MB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created over 1 year ago · Last pushed about 1 year ago

Metadata Files

Readme Contributing License Citation Security

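Triton Inference Server

[!WARNING] The current release is version 2.51.0, which corresponds to the 24.10 container release on NVIDIA GPU Cloud (NGC).

Triton Inference Server is open-source inference serving software that streamlines AI inferencing. Triton enables teams to deploy any AI model from multiple deep learning and machine learning frameworks, including TensorRT, TensorFlow, PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more. Triton Inference Server supports inference across cloud, data center, edge, and embedded devices on NVIDIA GPUs, x86 and ARM CPUs, or AWS Inferentia. It delivers optimized performance for many query types, including real-time, batched, ensembles, and audio/video streaming. Triton Inference Server is part of NVIDIA AI Enterprise, a software platform that accelerates the data science pipeline and streamlines the development and deployment of production AI.

Major features include:

Support for multiple deep learning frameworks
Support for multiple machine learning frameworks
Concurrent model execution
Dynamic batching
Sequence batching and implicit state management for stateful models
A Backend API that allows adding custom backends and pre/post-processing operations
Support for writing custom backends in Python, also known as Python-based backends
Model pipelines using ensembling or Business Logic Scripting (BLS)
HTTP/REST and gRPC inference protocols based on the community-developed KServe protocol
A C API and a Java API that allow Triton to link directly into your application for edge and other in-process use cases
Metrics indicating GPU utilization, server throughput, server latency, and more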
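
As a rough illustration of the metrics feature, the sketch below parses Prometheus-style text of the kind Triton serves from its metrics endpoint (by default `http://localhost:8002/metrics`). The helper name `parse_metrics` is hypothetical, the sample values are made up for the example, and the parser is simplified: it assumes one sample per line with no trailing timestamp.

```python
import re

# Matches one Prometheus sample line: metric name, optional {labels}, value.
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match is None:  # ignore lines this simplified parser cannot handle
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Illustrative input only: the numbers and the GPU UUID are invented.
SAMPLE_TEXT = """\
# HELP nv_inference_request_success Number of successful inference requests
# TYPE nv_inference_request_success counter
nv_inference_request_success{model="densenet_onnx",version="1"} 3
nv_gpu_utilization{gpu_uuid="GPU-hypothetical"} 0.25
"""

for name, labels, value in parse_metrics(SAMPLE_TEXT):
    print(name, labels, value)
```

In practice the text would come from an HTTP GET against the metrics endpoint of a running server rather than from a hard-coded string.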

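New to Triton Inference Server? Use these tutorials to begin your Triton journey!

Join the Triton and TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. Need enterprise support? NVIDIA global support is available for Triton Inference Server with the NVIDIA AI Enterprise software suite.

Deploy a Model in 3 Easy Steps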

```bash
# Step 1: Create the example model repository
git clone -b r24.10 https://github.com/triton-inference-server/server.git
cd server/docs/examples
./fetch_models.sh

# Step 2: Launch Triton from the NGC Triton container
docker run --gpus=1 --rm --net=host -v ${PWD}/model_repository:/models \
  nvcr.io/nvidia/tritonserver:24.10-py3 tritonserver --model-repository=/models

# Step 3: Send an inference request
# In a separate console, launch the image_client example from the NGC Triton SDK container
docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:24.10-py3-sdk \
  /workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg

# Inference should return the following
# Image '/workspace/images/mug.jpg':
#     15.346230 (504) = COFFEE MUG
#     13.224326 (968) = CUP
#     10.422965 (505) = COFFEEPOT
```
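
If you want to consume the classification output programmatically, a small parser is enough. The sketch below is one possible approach: `parse_classifications` is a hypothetical helper, not part of Triton or its SDK, and it assumes each result line has the `score (class_id) = LABEL` shape shown above.

```python
import re

# One result line looks like "15.346230 (504) = COFFEE MUG".
_RESULT = re.compile(r'^\s*(-?\d+(?:\.\d+)?)\s+\((\d+)\)\s+=\s+(.+?)\s*$')


def parse_classifications(output):
    """Return a list of (score, class_id, label) tuples from client output."""
    results = []
    for line in output.splitlines():
        match = _RESULT.match(line)
        if match:  # header lines such as "Image '...':" do not match
            score, class_id, label = match.groups()
            results.append((float(score), int(class_id), label))
    return results


# The output quoted in the quick-start example.
OUTPUT = """\
Image '/workspace/images/mug.jpg':
    15.346230 (504) = COFFEE MUG
    13.224326 (968) = CUP
    10.422965 (505) = COFFEEPOT
"""

top = parse_classifications(OUTPUT)
best_score, best_id, best_label = top[0]
print(best_label, best_id, best_score)
```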

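Please read the QuickStart guide for additional information about this example. The QuickStart guide also contains an example of how to launch Triton on CPU-only systems. New to Triton and wondering where to get started? Watch the Getting Started video.

Examples and Tutorials

Check out NVIDIA LaunchPad for free access to a set of hands-on labs with Triton Inference Server hosted on NVIDIA infrastructure.

Specific end-to-end examples, such as ResNet, BERT, and DLRM, are located on the NVIDIA Deep Learning Examples page on GitHub. The NVIDIA Developer Zone contains additional documentation, demos, and examples.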

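Documentation

Build and Deploy

Using Docker images is the recommended way to build and use Triton Inference Server.

Install Triton Inference Server with Docker containers (recommended)
Install Triton Inference Server without Docker containers
Build a custom Triton Inference Server Docker container
Build Triton Inference Server from source
Build Triton Inference Server for Windows 10
Examples for deploying Triton Inference Server with Kubernetes and Helm on GCP, AWS, and NVIDIA FleetCommand
Secure deployment considerations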

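Using Triton

Preparing Models for Triton Inference Server

The first step in using Triton to serve your models is to place one or more models into a model repository. Depending on the type of the model and on which Triton capabilities you want to enable for it, you may need to create a model configuration for the model.

Add custom operations to Triton if your model needs them
Enable model pipelining with model ensembles and Business Logic Scripting (BLS)
Optimize your models by setting scheduling and batching parameters and model instances
Use the Model Analyzer tool to help optimize your model configuration through profiling
Learn how to explicitly manage which models are available by loading and unloading models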
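
To make the model repository and model configuration ideas concrete, here is a minimal sketch that lays out a repository on disk in the `<repository>/<model>/config.pbtxt` plus `<repository>/<model>/<version>/model.onnx` shape. Everything specific in it is an assumption for illustration: the model name `my_model`, the tensor names `INPUT0` and `OUTPUT0`, the shapes, and the batching and instance settings. The model file is an empty placeholder, so Triton would not be able to load this repository as written.

```python
import tempfile
from pathlib import Path

MODEL_NAME = "my_model"  # hypothetical model name

# Hypothetical configuration: tensor names, shapes, and tuning values are
# illustrative and must match your real model in practice.
CONFIG_PBTXT = """\
name: "my_model"
backend: "onnxruntime"
max_batch_size: 8
input [
  {
    name: "INPUT0"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "OUTPUT0"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
instance_group [
  {
    count: 2
    kind: KIND_GPU
  }
]
dynamic_batching {
  max_queue_delay_microseconds: 100
}
"""


def create_model_repository(root, version=1):
    """Create <root>/my_model/config.pbtxt and <root>/my_model/<version>/model.onnx."""
    model_dir = Path(root) / MODEL_NAME
    version_dir = model_dir / str(version)
    version_dir.mkdir(parents=True, exist_ok=True)
    (model_dir / "config.pbtxt").write_text(CONFIG_PBTXT)
    # Placeholder only: a real deployment copies an exported ONNX model here.
    (version_dir / "model.onnx").touch()
    return model_dir


repo_root = tempfile.mkdtemp()
model_dir = create_model_repository(repo_root)
for path in sorted(model_dir.rglob("*")):
    print(path.relative_to(repo_root))
```

The `instance_group` and `dynamic_batching` sections are where model instance counts and scheduling and batching parameters would be tuned; the values here are starting points to experiment with, not recommendations.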

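Configure and Use Triton Inference Server

Read the QuickStart guide to run Triton Inference Server on both GPU and CPU
Triton supports multiple execution engines, called backends, including TensorRT, TensorFlow, PyTorch, ONNX, OpenVINO, Python, and more
Not all of the backends above are supported on every platform that Triton supports. See the Backend-Platform Support Matrix to learn which backends are supported on your target platform
Learn how to optimize performance using the Performance Analyzer and Model Analyzer
Learn how to manage loading and unloading models in Triton
Send requests directly to Triton with the HTTP/REST JSON-based or gRPC protocols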
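
The HTTP/REST path can be sketched with only the Python standard library. The example below builds the readiness, model-load, and inference URLs and a JSON inference body in the KServe v2 style that Triton's HTTP endpoint accepts. The model name `my_model`, the tensor name `INPUT0`, and the input shape are assumptions for illustration, and `post_json` is a hypothetical helper that is defined but not called, since it needs a running server.

```python
import json
import urllib.request


def ready_url(base_url):
    """Server readiness endpoint."""
    return f"{base_url}/v2/health/ready"


def load_url(base_url, model_name):
    """Model repository endpoint for explicitly loading a model."""
    return f"{base_url}/v2/repository/models/{model_name}/load"


def infer_url(base_url, model_name):
    """Inference endpoint for a given model."""
    return f"{base_url}/v2/models/{model_name}/infer"


def build_infer_request(input_name, shape, datatype, data):
    """Build a JSON-serializable inference request with a single input tensor."""
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": list(shape),
                "datatype": datatype,
                "data": list(data),
            }
        ]
    }


def post_json(url, payload, timeout=5.0):
    """POST a JSON payload and decode the JSON reply (empty body gives {})."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        raw = response.read()
    return json.loads(raw) if raw else {}


BASE_URL = "http://localhost:8000"  # Triton's default HTTP port
# Hypothetical model and tensor names; replace with those of your model.
body = build_infer_request("INPUT0", [1, 4], "FP32", [0.1, 0.2, 0.3, 0.4])
print("GET ", ready_url(BASE_URL))
print("POST", load_url(BASE_URL, "my_model"))
print("POST", infer_url(BASE_URL, "my_model"))
print(json.dumps(body, indent=2))
```

With a server running, `post_json(infer_url(BASE_URL, "my_model"), body)` would send the request. For anything beyond quick experiments, the official Triton client libraries are likely a better fit than hand-built requests.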

Client Support and Examples

A Triton client application sends inference and other requests to Triton. [Python and C

Owner

Name: 古月
Login: guyue55
Kind: user

Repositories: 1
Profile: https://github.com/guyue55

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
title: "Triton Inference Server: An Optimized Cloud and Edge Inferencing Solution."
url: https://github.com/triton-inference-server
repository-code: https://github.com/triton-inference-server/server
authors:
  - name: "NVIDIA Corporation"

GitHub Events

Total

Push event: 1
Create event: 2

Last Year

Push event: 1
Create event: 2

Dependencies

deploy/gke-marketplace-app/server-deployer/Dockerfile docker

gcr.io/cloud-marketplace-tools/k8s/deployer_helm/onbuild latest build

deploy/mlflow-triton-plugin/setup.py pypi

mlflow >=2.2.1,<3.0

pyproject.toml pypi
