bentoml
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 7 of 222 committers (3.2%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.4%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: bentoml
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://bentoml.com
- Size: 98.3 MB
Statistics
- Stars: 8,028
- Watchers: 80
- Forks: 871
- Open Issues: 139
- Releases: 177
Metadata Files
README.md
Unified Model Serving Framework
🍱 Build model inference APIs and multi-model serving systems with any open-source or custom AI models. 👉 Join our Slack community!
What is BentoML?
BentoML is a Python library for building online serving systems optimized for AI apps and model inference.
- 🍱 Easily build APIs for Any AI/ML Model. Turn any model inference script into a REST API server with just a few lines of code and standard Python type hints.
- 🐳 Docker Containers made simple. No more dependency hell! Manage your environments, dependencies and model versions with a simple config file. BentoML automatically generates Docker images, ensures reproducibility, and simplifies how you deploy to different environments.
- 🧭 Maximize CPU/GPU utilization. Build high performance inference APIs leveraging built-in serving optimization features like dynamic batching, model parallelism, multi-stage pipeline and multi-model inference-graph orchestration.
- 👩‍💻 Fully customizable. Easily implement your own APIs or task queues (see the sketch after this list), with custom business logic, model inference and multi-model composition. Supports any ML framework, modality, and inference runtime.
- 🚀 Ready for Production. Develop, run and debug locally. Seamlessly deploy to production with Docker containers or BentoCloud.
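To make the task-queue bullet above concrete, here is a minimal sketch of a background task endpoint. It assumes the `@bentoml.task` decorator available in recent BentoML releases; the `BatchSummarizer` service and `summarize_document` method are illustrative names, not from the README.

```python
import bentoml


@bentoml.service
class BatchSummarizer:
    # A task endpoint runs in the background: clients submit a job,
    # poll its status, and fetch the result once it completes.
    @bentoml.task
    def summarize_document(self, text: str) -> str:
        # Long-running inference would go here; this stub just truncates.
        return text[:100]
```

With a task endpoint, an HTTP client submits work and retrieves the result later, instead of holding a connection open for the entire inference.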
Getting started
Install BentoML:
```bash
# Requires Python ≥ 3.9
pip install -U bentoml
```
Define APIs in a service.py file.
```python
import bentoml


@bentoml.service(
    image=bentoml.images.Image(python_version="3.11").python_packages("torch", "transformers"),
)
class Summarization:
    def __init__(self) -> None:
        import torch
        from transformers import pipeline

        device = "cuda" if torch.cuda.is_available() else "cpu"
        self.pipeline = pipeline('summarization', device=device)

    @bentoml.api(batchable=True)
    def summarize(self, texts: list[str]) -> list[str]:
        results = self.pipeline(texts)
        return [item['summary_text'] for item in results]
```
💻 Run locally
Install the PyTorch and Transformers packages in your Python virtual environment.
```bash
pip install torch transformers  # additional dependencies for local run
```
Run the service code locally (serving at http://localhost:3000 by default):
```bash
bentoml serve
```
You should expect to see the following output.
```
[INFO] [cli] Starting production HTTP BentoServer from "service:Summarization" listening on http://localhost:3000 (Press CTRL+C to quit)
[INFO] [entry_service:Summarization:1] Service Summarization initialized
```
Now you can run inference from your browser at http://localhost:3000 or with a Python script:
```python
import bentoml

with bentoml.SyncHTTPClient('http://localhost:3000') as client:
    summarized_text: str = client.summarize([bentoml.__doc__])[0]
    print(f"Result: {summarized_text}")
```
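Equivalently, you can call the endpoint over plain HTTP. The sketch below assumes the default JSON encoding, where request body keys match the API method's parameter names:

```bash
curl -X POST http://localhost:3000/summarize \
  -H 'Content-Type: application/json' \
  -d '{"texts": ["BentoML is a Python library for building online serving systems."]}'
```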
🐳 Deploy using Docker
Run `bentoml build` to package the necessary code, models, and dependency configs into a Bento, the standardized deployable artifact in BentoML:
```bash
bentoml build
```
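The example service above already declares its image inline via `@bentoml.service`, so no extra configuration is needed. If you prefer keeping build options out of Python code, BentoML also reads a `bentofile.yaml`; a minimal sketch (the `service` value must point at your own module and class):

```yaml
service: "service:Summarization"  # import path of the service class
include:
  - "*.py"                        # source files to package into the Bento
python:
  packages:
    - torch
    - transformers
```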
Ensure Docker is running. Generate a Docker container image for deployment:
```bash
bentoml containerize summarization:latest
```
Run the generated image:
```bash
docker run --rm -p 3000:3000 summarization:latest
```
☁️ Deploy on BentoCloud
BentoCloud provides compute infrastructure for rapid and reliable GenAI adoption. It speeds up your BentoML development process by leveraging cloud compute resources, and simplifies how you deploy, scale, and operate BentoML in production.
Sign up for BentoCloud for personal access; for enterprise use cases, contact our team.
```bash
# After signup, run the following command to create an API token:
bentoml cloud login

# Deploy from the current directory:
bentoml deploy
```
For detailed explanations, read the Hello World example.
Examples
- LLMs: Llama 3.2, Mistral, DeepSeek Distil, and more.
- Image Generation: Stable Diffusion 3 Medium, Stable Video Diffusion, Stable Diffusion XL Turbo, ControlNet, and LCM LoRAs.
- Embeddings: SentenceTransformers and ColPali
- Audio: ChatTTS, XTTS, WhisperX, Bark
- Computer Vision: YOLO and ResNet
- Advanced examples: Function calling, LangGraph, CrewAI
Check out the full list for more sample code and usage.
Advanced topics
- Model composition (sketched below)
- Workers and model parallelization
- Adaptive batching
- GPU inference
- Distributed serving systems
- Concurrency and autoscaling
- Model loading and Model Store
- Observability
- BentoCloud deployment
See Documentation for more tutorials and guides.
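As a taste of the first topic, model composition, here is a hedged sketch of wiring one service into another with `bentoml.depends()`, which resolves to a local call in development and a remote call in distributed deployments. The two toy services are illustrative, not from the BentoML docs:

```python
import bentoml


@bentoml.service
class Preprocessor:
    @bentoml.api
    def clean(self, text: str) -> str:
        # Toy preprocessing step: collapse whitespace.
        return " ".join(text.split())


@bentoml.service
class Pipeline:
    # bentoml.depends() declares Preprocessor as a dependency;
    # BentoML injects a client for it at runtime.
    preprocessor = bentoml.depends(Preprocessor)

    @bentoml.api
    def run(self, text: str) -> str:
        cleaned = self.preprocessor.clean(text)
        return cleaned.upper()
```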
Community
Get involved and join our Community Slack 💬, where thousands of AI/ML engineers help each other, contribute to the project, and talk about building AI products.
To report a bug or suggest a feature request, use GitHub Issues.
Contributing
There are many ways to contribute to the project:
- Report bugs and "Thumbs up" on issues that are relevant to you.
- Investigate issues and review other developers' pull requests.
- Contribute code or documentation to the project by submitting a GitHub pull request.
- Check out the Contributing Guide and Development Guide to learn more.
- Share your feedback and discuss roadmap plans in the #bentoml-contributors channel here.
Thanks to all of our amazing contributors!
Usage tracking and feedback
The BentoML framework collects anonymous usage data that helps our community improve the product. Only BentoML's internal API calls are reported. This excludes any sensitive information, such as user code, model data, model names, or stack traces. Here's the code used for usage tracking. You can opt out of usage tracking with the --do-not-track CLI option:
```bash
bentoml [command] --do-not-track
```
Or by setting the environment variable:
```bash
export BENTOML_DO_NOT_TRACK=True
```
License
Apache License 2.0.
Owner
- Name: BentoML
- Login: bentoml
- Kind: organization
- Location: San Francisco
- Website: https://bentoml.com
- Twitter: bentomlai
- Repositories: 76
- Profile: https://github.com/bentoml
- Description: The most flexible way to serve AI models in production
Citation (CITATION.cff)
cff-version: 1.2.0
title: 'BentoML: The framework for building reliable, scalable and cost-efficient AI application'
message: >-
  If you use this software, please cite it using these
  metadata.
type: software
authors:
  - given-names: Chaoyu
    family-names: Yang
    email: chaoyu@bentoml.com
  - given-names: Sean
    family-names: Sheng
    email: ssheng@bentoml.com
  - given-names: Aaron
    family-names: Pham
    email: aarnphm@bentoml.com
    orcid: 'https://orcid.org/0009-0008-3180-5115'
  - given-names: Shenyang
    family-names: Zhao
    email: larme@bentoml.com
  - given-names: Sauyon
    family-names: Lee
    email: sauyon@bentoml.com
  - given-names: Bo
    family-names: Jiang
    email: jiang@bentoml.com
  - given-names: Fog
    family-names: Dong
    email: fog@bentoml.com
  - given-names: Xipeng
    family-names: Guan
    email: xipeng@bentoml.com
  - given-names: Frost
    family-names: Ming
    email: frost@bentoml.com
repository-code: 'https://github.com/bentoml/bentoml'
url: 'https://bentoml.com/'
keywords:
  - MLOps
  - LLMOps
  - LLM
  - Infrastructure
  - BentoML
  - LLM Serving
  - Model Serving
  - Serverless Deployment
license: Apache-2.0
GitHub Events
Total
- Create event: 150
- Issues event: 126
- Release event: 36
- Watch event: 834
- Delete event: 118
- Member event: 1
- Issue comment event: 281
- Push event: 380
- Pull request review comment event: 164
- Pull request event: 642
- Pull request review event: 396
- Fork event: 96
Last Year
- Create event: 150
- Issues event: 126
- Release event: 36
- Watch event: 834
- Delete event: 118
- Member event: 1
- Issue comment event: 281
- Push event: 380
- Pull request review comment event: 164
- Pull request event: 642
- Pull request review event: 396
- Fork event: 96
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Chaoyu | p****g@g****m | 562 |
| Aaron Pham | 2****m | 534 |
| Frost Ming | me@f****m | 413 |
| bojiang | 5****g | 355 |
| Bozhao | y****6@g****m | 325 |
| Sherlock Xu | 6****3 | 254 |
| Sauyon Lee | 2****n | 196 |
| dependabot[bot] | 4****] | 159 |
| Sean Sheng | s****g@g****m | 114 |
| Zhao Shenyang | d****v@z****m | 105 |
| Leon | i@l****m | 45 |
| Tianxin Dong | f****g@b****m | 37 |
| Jian Shen | j****2@g****m | 28 |
| xianxian.zhang | 1****l | 25 |
| pre-commit-ci[bot] | 6****] | 23 |
| Jacky Zhao | j****9@g****m | 17 |
| yetone | y****l@g****m | 17 |
| Steve Guo | 4****o | 16 |
| Judah Rand | 1****d | 11 |
| Jithin James | j****7@g****m | 11 |
| Jinyang Liu | l****g@b****e | 10 |
| Sungjun.Kim | s****m@l****m | 10 |
| Tim Liu | 9****l | 9 |
| Aanand Kainth | a****d@a****e | 9 |
| Tasha J. Kim | t****m@g****m | 6 |
| MingLiangDai | 9****i | 5 |
| Quan Nguyen | 8****r | 5 |
| devin-ai-integration[bot] | 1****] | 5 |
| lintingzhen | l****n@g****m | 5 |
| Mayur Newase | m****1@g****m | 5 |
| and 192 more... | | |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 405
- Total pull requests: 1,928
- Average time to close issues: 10 months
- Average time to close pull requests: 11 days
- Total issue authors: 254
- Total pull request authors: 96
- Average comments per issue: 1.97
- Average comments per pull request: 0.54
- Merged pull requests: 1,623
- Bot issues: 0
- Bot pull requests: 119
Past Year
- Issues: 69
- Pull requests: 747
- Average time to close issues: 7 days
- Average time to close pull requests: about 23 hours
- Issue authors: 58
- Pull request authors: 42
- Average comments per issue: 1.29
- Average comments per pull request: 0.39
- Merged pull requests: 648
- Bot issues: 0
- Bot pull requests: 48
Top Authors
Issue Authors
- aarnphm (36)
- ssheng (14)
- sauyon (10)
- parano (9)
- KimSoungRyoul (9)
- holzweber (8)
- smidm (7)
- judahrand (6)
- rlleshi (5)
- Hubert-Bonisseur (4)
- nadworny (4)
- Matthieu-Tinycoaching (4)
- MattiasDC (3)
- BangDaeng (3)
- isuyyy (3)
Pull Request Authors
- frostming (634)
- Sherlock113 (448)
- aarnphm (148)
- bojiang (76)
- FogDong (63)
- xianml (60)
- dependabot[bot] (56)
- ssheng (54)
- sauyon (46)
- pre-commit-ci[bot] (40)
- jianshen92 (39)
- devin-ai-integration[bot] (21)
- Haivilo (20)
- judahrand (19)
- larme (17)
Packages
- Total packages: 8
- Total downloads: 106,500 last month (PyPI)
- Total Docker downloads: 7,841
- Total dependent packages: 13 (may contain duplicates)
- Total dependent repositories: 502 (may contain duplicates)
- Total versions: 540
- Total maintainers: 6
- Total advisories: 8
pypi.org: bentoml
BentoML: The easiest way to serve AI apps and models
- Homepage: https://bentoml.com
- Documentation: https://docs.bentoml.com
- License: Apache-2.0
- Latest release: 1.4.23 (published 6 months ago)
Advisories (8)
- BentoML deserialization vulnerability
- BentoML Allows Remote Code Execution (RCE) via Insecure Deserialization
- BentoML vulnerable to Uncontrolled Resource Consumption
- BentoML's runner server Vulnerable to Remote Code Execution (RCE) via Insecure Deserialization
- BentoML SSRF Vulnerability in File Upload Processing
- BentoML Denial of Service (DoS) via Multipart Boundary
- Insecure deserialization in BentoML
- BentoML Open Redirect vulnerability
proxy.golang.org: github.com/bentoml/bentoml
- Homepage: https://github.com/bentoml/bentoml
- Documentation: https://pkg.go.dev/github.com/bentoml/bentoml#section-documentation
- License: Apache-2.0
- Latest release: v1.4.22 (published 6 months ago)
proxy.golang.org: github.com/bentoml/BentoML
- Homepage: https://github.com/bentoml/BentoML
- Documentation: https://pkg.go.dev/github.com/bentoml/BentoML#section-documentation
- License: Apache-2.0
- Latest release: v1.4.22 (published 6 months ago)
pypi.org: yatai
Model and deployment management for BentoML
- Homepage: https://github.com/bentoml/bentoml
- Documentation: https://yatai.readthedocs.io/
- License: apache-2.0
- Latest release: 0.0.1 (published over 4 years ago)
pypi.org: sentencebertservice
BentoML generated model module
- Homepage: https://github.com/bentoml/BentoML
- Documentation: https://sentencebertservice.readthedocs.io/
- License: apache-2.0
- Latest release: 20211205152102 (published about 4 years ago)
conda-forge.org: bentoml
BentoML simplifies ML model deployment and serves your models at production scale. PyPI: https://pypi.org/project/bentoml/
- Homepage: https://github.com/bentoml/BentoML
- License: Apache-2.0
- Latest release: 1.0.0 (published over 3 years ago)
pypi.org: bentoml-core
The rust core of BentoML: The Unified Model Serving Framework
- Documentation: https://docs.bentoml.org/en/latest/
- License: Apache-2.0
- Latest release: 0.1.0 (published over 2 years ago)
pypi.org: bentoml-unsloth
BentoML: The easiest way to serve AI apps and models
- Homepage: https://bentoml.com
- Documentation: https://docs.bentoml.com
- License: Apache-2.0
- Latest release: 0.1.2 (published over 1 year ago)
Dependencies
- actions/checkout v4 composite
- actions/download-artifact v3 composite
- actions/upload-artifact v3 composite
- docker/setup-buildx-action v3 composite
- docker/setup-qemu-action v3 composite
- marocchino/sticky-pull-request-comment v2 composite
- pdm-project/setup-pdm v3 composite
- re-actors/alls-green release/v1 composite
- actions/checkout v4 composite
- actions/checkout v4 composite
- github/codeql-action/analyze v2 composite
- github/codeql-action/autobuild v2 composite
- github/codeql-action/init v2 composite
- actions/checkout v4 composite
- pdm-project/setup-pdm v3 composite
- actions/checkout v4 composite
- actions/download-artifact v3 composite
- actions/setup-python v4 composite
- actions/upload-artifact v3 composite
- pypa/gh-action-pypi-publish release/v1 composite
- python 3-bullseye build
- iris_classifier klncyjcfqwldtgxi
- jaegertracing/all-in-one 1.38
- bentoml >=1.0.19
- pandas *
- scikit-learn *
- Jinja2 >=3.0.1
- PyYAML >=5.0
- aiohttp *
- attrs >=21.1.0
- cattrs >=22.1.0,<23.2.0
- circus >=0.17.0,!=0.17.2
- click >=7.0
- click-option-group *
- cloudpickle >=2.0.0
- deepmerge *
- fs *
- httpx *
- inflection *
- numpy *
- nvidia-ml-py <12
- opentelemetry-api ==1.20.0
- opentelemetry-instrumentation ==0.41b0
- opentelemetry-instrumentation-aiohttp-client ==0.41b0
- opentelemetry-instrumentation-asgi ==0.41b0
- opentelemetry-sdk ==1.20.0
- opentelemetry-semantic-conventions ==0.41b0
- opentelemetry-util-http ==0.41b0
- packaging >=22.0
- pathspec *
- pip-requirements-parser >=31.2.0
- pip-tools >=6.6.2
- prometheus-client >=0.10.0
- psutil *
- python-dateutil *
- python-json-logger *
- python-multipart *
- requests *
- rich >=11.2.0
- schema *
- simple-di >=0.1.4
- starlette >=0.24.0
- uvicorn *
- watchfiles >=0.15.0
- Pillow * test
- pydantic * test
- pandas * test
- pyarrow * test
- scikit-learn * test
- Pillow * test
- fastapi * test
- pydantic * test
- starlette <0.26 test
- cloudpickle >=2.0.0 test
- mlflow * test
- psutil >=5.8.0 test
- scikit-learn >=1.0.2 test
- pydantic >=2 test
- scikit-learn * test
- bentoml *
- torch *
- transformers ==4.30.0
- bentoml *
- pandas *
- pdf2img *
- pydub *
- torch *
- accelerate *
- bentoml *
- diffusers *
- torch *
- transformers *
- _libgcc_mutex 0.1
- _openmp_mutex 4.5
- blas 1.0
- ca-certificates 2021.10.26
- certifi 2021.10.8
- intel-openmp 2021.4.0
- joblib 1.1.0
- ld_impl_linux-64 2.35.1
- libffi 3.3
- libgcc-ng 9.3.0
- libgfortran-ng 7.5.0
- libgfortran4 7.5.0
- libgomp 9.3.0
- libstdcxx-ng 9.3.0
- mkl 2021.4.0
- mkl-service 2.4.0
- mkl_fft 1.3.1
- mkl_random 1.2.2
- ncurses 6.3
- numpy 1.21.2
- numpy-base 1.21.2
- openssl 1.1.1l
- pip 21.2.4
- python 3.9.7
- readline 8.1
- scikit-learn 1.0.1
- scipy 1.7.1
- setuptools 58.0.4
- six 1.16.0
- sqlite 3.36.0
- threadpoolctl 2.2.0
- tk 8.6.11
- tzdata 2021e
- wheel 0.37.0
- xz 5.2.5
- zlib 1.2.11
- bentoml *
- fastapi *
- gradio *
- torch *
- transformers *
- bentoml *
- mlflow *
- scikit-learn *
- bentoml *
- torch *
- transformers *
- bentoml *
- scikit-learn *
- xgboost *