Recent Releases of openllm
openllm - v0.6.26
What's Changed
- chore(deps): bump openai from 1.70.0 to 1.73.0 in the production-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1175
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1176
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.25...v0.6.26
Published by github-actions[bot] 10 months ago
openllm - v0.6.24
What's Changed
- chore(deps): bump bentoml from 1.4.7 to 1.4.8 in the pip group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1171
- chore(deps): bump openai from 1.69.0 to 1.70.0 in the production-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1169
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.23...v0.6.24
Published by github-actions[bot] 11 months ago
openllm - v0.6.21
What's Changed
- chore(deps): bump the production-dependencies group with 2 updates by @dependabot in https://github.com/bentoml/OpenLLM/pull/1157
- feat(types): enable strict mypy by @aarnphm in https://github.com/bentoml/OpenLLM/pull/1158
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1159
- chore(deps): bump openai from 1.66.3 to 1.68.2 in the production-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1160
- chore(deps): bump the actions-dependencies group with 2 updates by @dependabot in https://github.com/bentoml/OpenLLM/pull/1161
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1162
- chore(deps): bump the production-dependencies group with 2 updates by @dependabot in https://github.com/bentoml/OpenLLM/pull/1163
- chore(deps): bump actions/setup-python from 5.4.0 to 5.5.0 in the actions-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1164
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1165
- chore: update instructions for deploy with openllm by @aarnphm in https://github.com/bentoml/OpenLLM/pull/1166
- feat: support --env by @aarnphm in https://github.com/bentoml/OpenLLM/pull/1167
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.20...v0.6.21
Published by github-actions[bot] 11 months ago
openllm - v0.6.20
What's Changed
- chore(deps): bump openai from 1.61.1 to 1.63.0 in the production-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1146
- docs: Fix deploy command by @Sherlock113 in https://github.com/bentoml/OpenLLM/pull/1147
- chore(deps): bump the production-dependencies group with 2 updates by @dependabot in https://github.com/bentoml/OpenLLM/pull/1149
- chore(deps): bump actions/upload-artifact from 4.6.0 to 4.6.1 in the actions-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1150
- docs: Fix serve command by @Sherlock113 in https://github.com/bentoml/OpenLLM/pull/1148
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1151
- chore(deps): bump the production-dependencies group with 2 updates by @dependabot in https://github.com/bentoml/OpenLLM/pull/1152
- chore(deps): bump actions/download-artifact from 4.1.8 to 4.1.9 in the actions-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1153
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1154
- chore(deps): bump the production-dependencies group with 2 updates by @dependabot in https://github.com/bentoml/OpenLLM/pull/1155
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1156
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.19...v0.6.20
Published by github-actions[bot] 12 months ago
openllm - v0.6.19
What's Changed
- chore(deps): bump openai from 1.61.0 to 1.61.1 in the production-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1143
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1144
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.18...v0.6.19
Published by github-actions[bot] about 1 year ago
openllm - v0.6.18
What's Changed
- chore(deps): bump the actions-dependencies group with 2 updates by @dependabot in https://github.com/bentoml/OpenLLM/pull/1132
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1133
- chore(deps): bump openai from 1.59.3 to 1.59.6 in the production-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1131
- chore(deps): bump openai from 1.59.6 to 1.59.8 in the production-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1134
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1135
- chore(deps): bump openai from 1.59.8 to 1.60.1 in the production-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1138
- chore(deps): bump pypa/gh-action-pypi-publish from 1.12.3 to 1.12.4 in the actions-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1139
- chore(deps): bump openai from 1.60.1 to 1.61.0 in the production-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1141
- chore(deps): bump actions/setup-python from 5.3.0 to 5.4.0 in the actions-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1142
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1140
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.17...v0.6.18
Published by github-actions[bot] about 1 year ago
openllm - v0.6.17
What's Changed
- chore(deps): bump openai from 1.57.4 to 1.58.1 in the production-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1123
- chore(deps): bump actions/upload-artifact from 4.4.3 to 4.5.0 in the actions-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1124
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1125
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1127
- chore(deps): bump openai from 1.58.1 to 1.59.3 in the production-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1126
- docs: Add Llama 3.3 to readme by @Sherlock113 in https://github.com/bentoml/OpenLLM/pull/1128
- fix: respect Python version of bento by @bojiang in https://github.com/bentoml/OpenLLM/pull/1130
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.16...v0.6.17
Published by bojiang about 1 year ago
openllm - v0.6.16
What's Changed
- chore(deps): bump openai from 1.55.0 to 1.55.3 in the production-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1116
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1117
- chore(deps): bump the production-dependencies group across 1 directory with 2 updates by @dependabot in https://github.com/bentoml/OpenLLM/pull/1120
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1119
- chore(deps): bump the actions-dependencies group with 2 updates by @dependabot in https://github.com/bentoml/OpenLLM/pull/1121
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1122
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.15...v0.6.16
Published by bojiang about 1 year ago
openllm - v0.6.15
What's Changed
- chore(deps): bump the actions-dependencies group across 1 directory with 4 updates by @dependabot in https://github.com/bentoml/OpenLLM/pull/1106
- chore(deps): bump openai from 1.52.0 to 1.53.0 in the production-dependencies group across 1 directory by @dependabot in https://github.com/bentoml/OpenLLM/pull/1105
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1104
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1109
- chore(deps): bump the actions-dependencies group across 1 directory with 2 updates by @dependabot in https://github.com/bentoml/OpenLLM/pull/1111
- chore(deps): bump the production-dependencies group across 1 directory with 2 updates by @dependabot in https://github.com/bentoml/OpenLLM/pull/1110
- chore(deps): bump openai from 1.54.4 to 1.55.0 in the production-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1113
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1114
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.14...v0.6.15
Published by github-actions[bot] about 1 year ago
openllm - v0.6.14
What's Changed
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1100
- chore(deps): bump openai from 1.51.2 to 1.52.0 in the production-dependencies group by @dependabot in https://github.com/bentoml/OpenLLM/pull/1099
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.13...v0.6.14
Published by github-actions[bot] over 1 year ago
openllm - v0.6.12
What's Changed
- chore(deps): bump actions/checkout from 4.1.7 to 4.2.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1090
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1091
- chore(deps): bump openai from 1.47.0 to 1.50.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1089
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1094
- chore(deps): bump the actions-dependencies group with 3 updates by @dependabot in https://github.com/bentoml/OpenLLM/pull/1098
- chore(deps): bump openai from 1.50.2 to 1.51.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1095
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.11...v0.6.12
Published by github-actions[bot] over 1 year ago
openllm - v0.6.11
What's Changed
- chore(deps): bump openai from 1.41.0 to 1.42.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1069
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1070
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1075
- chore(deps): bump pypa/gh-action-pypi-publish from 1.9.0 to 1.10.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1074
- chore(deps): bump actions/setup-python from 5.1.1 to 5.2.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1073
- chore(deps): bump actions/upload-artifact from 4.3.6 to 4.4.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1072
- chore(deps): bump openai from 1.42.0 to 1.43.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1071
- chore(deps): bump pypa/gh-action-pypi-publish from 1.10.0 to 1.10.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1079
- chore(deps): bump openai from 1.43.0 to 1.44.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1077
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1080
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1082
- chore(deps): bump openai from 1.44.0 to 1.45.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1081
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1086
- chore(deps): bump pypa/gh-action-pypi-publish from 1.10.1 to 1.10.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1085
- chore(deps): bump openai from 1.45.0 to 1.47.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1084
- docs: Update model table by @Sherlock113 in https://github.com/bentoml/OpenLLM/pull/1087
- docs: Update model by @Sherlock113 in https://github.com/bentoml/OpenLLM/pull/1088
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.10...v0.6.11
Published by github-actions[bot] over 1 year ago
openllm - v0.6.10
What's Changed
- chore(deps): bump openai from 1.38.0 to 1.41.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1065
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1061
- chore(deps): bump actions/upload-artifact from 4.3.5 to 4.3.6 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1060
- fix(cloud): respect user-set BENTOML_HOME by @aarnphm in https://github.com/bentoml/OpenLLM/pull/1067
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1066
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.9...v0.6.10
Published by github-actions[bot] over 1 year ago
openllm - v0.6.9
What's Changed
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1058
- chore(deps): bump actions/upload-artifact from 4.3.4 to 4.3.5 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1057
- chore(deps): bump openai from 1.37.1 to 1.38.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1056
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.7...v0.6.9
Published by github-actions[bot] over 1 year ago
openllm - v0.6.7
What's Changed
- feat(venv): support pip options by @rickzx in https://github.com/bentoml/OpenLLM/pull/1052
- chore: use uv instead of venv layers by @bojiang in https://github.com/bentoml/OpenLLM/pull/1054
New Contributors
- @rickzx made their first contribution in https://github.com/bentoml/OpenLLM/pull/1052
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.6...v0.6.7
Published by github-actions[bot] over 1 year ago
openllm - v0.6.6
What's Changed
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1043
- chore(deps): bump openai from 1.35.12 to 1.35.13 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1041
- chore(deps): bump softprops/action-gh-release from 2.0.6 to 2.0.8 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1046
- chore(deps): bump openai from 1.35.13 to 1.36.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1045
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1047
- chore(deps): bump openai from 1.36.1 to 1.37.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1048
- docs: Update OpenLLM readme by @Sherlock113 in https://github.com/bentoml/OpenLLM/pull/1051
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.5...v0.6.6
Published by github-actions[bot] over 1 year ago
openllm - v0.6.3
What's Changed
- chore: make the UI link clickable in output by @bojiang in https://github.com/bentoml/OpenLLM/pull/1038
New Contributors
- @bojiang made their first contribution in https://github.com/bentoml/OpenLLM/pull/1038
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.6.2...v0.6.3
Published by ssheng over 1 year ago
openllm - v0.6.0
We are thrilled to announce the release of OpenLLM 0.6, which marks a significant shift in our project's philosophy. This release introduces breaking changes to the codebase, reflecting our renewed focus on streamlining cloud deployment for LLMs.
In the previous releases, our goal was to provide users with the ability to fully customize their LLM deployment. However, we realized that the customization support in OpenLLM led to scope creep, deviating from our core focus on making LLM deployment simple. With the rise of open source LLMs and the growing emphasis on LLM-focused application development, we have decided to concentrate on what OpenLLM does best - simplifying LLM deployment.
We have completely revamped the architecture to make OpenLLM a tool that simplifies running LLMs as an API endpoint, prioritizing ease of use and performance. As a result, 0.6 breaks away from many of the old Python APIs provided in 0.5, repositioning OpenLLM as an easy-to-use, cross-platform CLI tool for deploying open-source LLMs.
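To give a feel for the new CLI-first flow, here is a minimal sketch; the exact subcommands and model tags vary across 0.6.x versions, so treat the model name below as a placeholder and check openllm --help for the current interface:
```bash
# Explore the CLI interactively
openllm hello

# List models available in the default model repository
openllm model list

# Serve a model as an OpenAI-compatible API endpoint (model tag is a placeholder)
openllm serve llama3.2:1b
```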
To learn more about the exciting features and capabilities of OpenLLM, visit our GitHub repository. We invite you to explore the new release, provide feedback, and join us in our mission to make cloud deployment of LLMs accessible and efficient for everyone.
Thank you for your continued support and trust in OpenLLM. We look forward to seeing the incredible applications you will build with the tool.
Published by ssheng over 1 year ago
openllm - v0.5.7
Installation
```bash
pip install openllm==0.5.7
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.5.7
```
Usage
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.5.6...v0.5.7
Published by github-actions[bot] over 1 year ago
openllm - OpenLLM: v0.5
OpenLLM has undergone a significant upgrade in its v0.5 release to enhance compatibility with the BentoML 1.2 SDK. The CLI has also been streamlined to focus on delivering the easiest and most reliable experience for deploying open-source LLMs to production. However, version 0.5 introduces breaking changes.
Breaking changes, and the reason why.
After releasing version 0.4, we realized that while OpenLLM offers a high degree of flexibility and power, users encountered numerous issues when attempting to deploy their models. OpenLLM had been trying to accomplish a lot by supporting different backends (mainly PyTorch for CPU inference and vLLM for GPU inference) and accelerators. Although this let users test quickly on their local machines, it caused a lot of confusion between running OpenLLM locally and in the cloud. The difference between local and cloud deployment made it difficult for users to understand and control how the packaged Bento behaved in the cloud.
The motivation for 0.5 is to focus on cloud deployment. Cloud deployments typically prioritize high-throughput, high-concurrency serving, and GPUs are the most common hardware choice for achieving it. We therefore simplified backend support to just vLLM, which is the most suitable and reliable backend for serving LLMs on GPUs in the cloud.
Architecture changes and SDK.
For version 0.5, we have decided to reduce the scope and support the backend that yields the best performance (in this case, vLLM). This means that `pip install openllm` will also depend on vLLM, and CPU support is paused going forward. All interactions with OpenLLM servers should now be done through clients (i.e., BentoML's clients, the OpenAI SDK, etc.).
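As an illustration of the client-first workflow, the sketch below talks to a running OpenLLM server with the official openai Python client; the port (3000, BentoML's default) and the dummy api_key are assumptions on our part:
```python
from openai import OpenAI

# Point the OpenAI client at a locally running OpenLLM server
# (port 3000 is an assumption; use whatever your server reports on startup).
client = OpenAI(base_url="http://localhost:3000/v1", api_key="na")

# Discover the served model, then chat with it.
model_id = client.models.list().data[0].id
response = client.chat.completions.create(
    model=model_id,
    messages=[{"role": "user", "content": "Explain what OpenLLM does in one sentence."}],
)
print(response.choices[0].message.content)
```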
CLI
The CLI has been simplified to two commands: `openllm start` and `openllm build`.
HuggingFace models
openllm start
openllm start will continue to accept a HuggingFace model ID for supported model architectures:
```bash
openllm start microsoft/Phi-3-mini-4k-instruct --trust-remote-code
```
For any model that requires remote code execution, pass in --trust-remote-code.
openllm start will also accept serving from a local path directly. Make sure to pass in --trust-remote-code here as well if the model requires it:
```bash
openllm start path/to/custom-phi-instruct --trust-remote-code
```
openllm build
In previous versions, OpenLLM copied the local model cache into the generated Bento store, leaving two copies of each model on the user's machine. From v0.5 onward, models are no longer packaged with the Bento; they are downloaded into the Hugging Face cache the first time the deployment runs.
```bash
openllm build microsoft/Phi-3-mini-4k-instruct --trust-remote-code

Successfully built Bento 'microsoft--phi-3-mini-4k-instruct-service:5fa34190089f0ee40f9cce3cafc396b89b2e5e83'.

(OpenLLM ASCII banner)

📖 Next steps:

☁️  Deploy to BentoCloud:
    $ bentoml deploy microsoft--phi-3-mini-4k-instruct-service:5fa34190089f0ee40f9cce3cafc396b89b2e5e83 -n ${DEPLOYMENT_NAME}

☁️  Update existing deployment on BentoCloud:
    $ bentoml deployment update --bento microsoft--phi-3-mini-4k-instruct-service:5fa34190089f0ee40f9cce3cafc396b89b2e5e83 ${DEPLOYMENT_NAME}

🐳 Containerize BentoLLM:
    $ bentoml containerize microsoft--phi-3-mini-4k-instruct-service:5fa34190089f0ee40f9cce3cafc396b89b2e5e83 --opt progress=plain
```
For quantized models, make sure to also pass in the --quantize flag during build:
```bash
openllm build casperhansen/llama-3-70b-instruct-awq --quantize awq
```
See `openllm build --help` for more information.
Private models
openllm start
For private models, we recommend saving them to BentoML's model store first before using openllm start:
```python
import bentoml

# PrivateTrainedModel and MyTokenizer are placeholders for your own
# transformers model and tokenizer classes.
with bentoml.models.create(name="my-private-models") as model:
    PrivateTrainedModel.save_pretrained(model.path)
    MyTokenizer.save_pretrained(model.path)
```
Note: make sure to also save your tokenizer in this Bento model.
You can then pass the private model name directly to openllm start:
```bash
openllm start my-private-models
```
openllm build
Similar to openllm start, openllm build will only accept private models from BentoML’s model store:
```bash
openllm build my-private-models
```
What's next?
Currently, the OpenAI-compatible API supports only the /chat/completions and /models endpoints. We will bring /completions as well as function calling support soon, so stay tuned.
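For example, here is a sketch of streaming from the supported /chat/completions endpoint, under the same local-server assumptions as the client example above (streaming support is our assumption, typical of vLLM-backed OpenAI-compatible servers):
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3000/v1", api_key="na")

# Stream tokens as they are generated by the server.
stream = client.chat.completions.create(
    model=client.models.list().data[0].id,
    messages=[{"role": "user", "content": "Write a haiku about GPUs."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```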
Thank you for your continued support and trust in us. We would love to hear more of your feedback on the releases.
Published by github-actions[bot] over 1 year ago
openllm - v0.5.5
Installation
```bash
pip install openllm==0.5.5
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.5.5
```
Usage
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
Find more information about this release in the CHANGELOG.md
What's Changed
- feat(models): command-r by @aarnphm in https://github.com/bentoml/OpenLLM/pull/1005
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1007
- chore(deps): bump taiki-e/install-action from 2.33.34 to 2.34.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1006
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.5.4...v0.5.5
Published by github-actions[bot] over 1 year ago
openllm - v0.5.4
Installation
```bash
pip install openllm==0.5.4
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.5.4
```
Usage
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
Find more information about this release in the CHANGELOG.md
What's Changed
- feat(API): add light support for batch inference by @aarnphm in https://github.com/bentoml/OpenLLM/pull/1004
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.5.3...v0.5.4
Published by github-actions[bot] over 1 year ago
openllm - v0.5.3
Installation
```bash
pip install openllm==0.5.3
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.5.3
```
Usage
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.5.2...v0.5.3
Published by github-actions[bot] over 1 year ago
openllm - v0.5.2
Installation
```bash
pip install openllm==0.5.2
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.5.2
```
Usage
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.5.1...v0.5.2
Published by github-actions[bot] over 1 year ago
openllm - v0.5.1
Installation
```bash
pip install openllm==0.5.1
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.5.1
```
Usage
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.5.0...v0.5.1
Published by github-actions[bot] over 1 year ago
openllm - v0.5.0
Installation
```bash
pip install openllm==0.5.0
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.5.0
```
Usage
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
Find more information about this release in the CHANGELOG.md
What's Changed
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/870
- chore(deps): bump taiki-e/install-action from 2.25.9 to 2.26.18 by @dependabot in https://github.com/bentoml/OpenLLM/pull/899
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/909
- chore(deps): bump github/codeql-action from 3.23.1 to 3.24.3 by @dependabot in https://github.com/bentoml/OpenLLM/pull/908
- chore(deps): bump sigstore/cosign-installer from 3.3.0 to 3.4.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/907
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/931
- feat: 1.2 APIs by @aarnphm in https://github.com/bentoml/OpenLLM/pull/821
- chore(deps): bump taiki-e/install-action from 2.26.18 to 2.27.9 by @dependabot in https://github.com/bentoml/OpenLLM/pull/920
- chore(deps): bump next from 13.4.8 to 13.5.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/912
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/935
- chore(deps): bump marocchino/sticky-pull-request-comment from 2.8.0 to 2.9.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/933
- chore(deps): bump aquasecurity/trivy-action from 0.16.1 to 0.18.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/932
- chore(deps): bump docker/setup-buildx-action from 3.0.0 to 3.2.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/941
- chore(deps): bump github/codeql-action from 3.24.3 to 3.24.9 by @dependabot in https://github.com/bentoml/OpenLLM/pull/939
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/942
- fix(compat): use annotated type from typing_compat by @rudeigerc in https://github.com/bentoml/OpenLLM/pull/943
- docs: Update high-level messaging by @Sherlock113 in https://github.com/bentoml/OpenLLM/pull/949
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/947
- chore(deps): bump aquasecurity/trivy-action from 0.18.0 to 0.19.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/946
- chore(deps): bump taiki-e/install-action from 2.27.9 to 2.32.9 by @dependabot in https://github.com/bentoml/OpenLLM/pull/945
- Update README.md by @parano in https://github.com/bentoml/OpenLLM/pull/964
- chore(deps): bump taiki-e/install-action from 2.32.9 to 2.33.9 by @dependabot in https://github.com/bentoml/OpenLLM/pull/970
- chore(deps): bump sigstore/cosign-installer from 3.4.0 to 3.5.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/954
- chore(deps): bump docker/metadata-action from 5.5.0 to 5.5.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/956
- chore(deps): bump actions/setup-python from 5.0.0 to 5.1.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/955
- chore(deps): bump pypa/gh-action-pypi-publish from 1.8.11 to 1.8.14 by @dependabot in https://github.com/bentoml/OpenLLM/pull/958
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/959
- fix: update correct CompletionOutput object by @aarnphm in https://github.com/bentoml/OpenLLM/pull/973
- chore(deps): bump docker/build-push-action from 5.1.0 to 5.3.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/979
- chore(deps): bump docker/login-action from 3.0.0 to 3.1.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/978
- chore(deps): bump github/codeql-action from 3.24.9 to 3.25.3 by @dependabot in https://github.com/bentoml/OpenLLM/pull/977
- chore(deps): bump docker/setup-buildx-action from 3.2.0 to 3.3.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/975
- fix: make sure to respect additional parameters parse by @aarnphm in https://github.com/bentoml/OpenLLM/pull/981
- chore(deps): bump peter-evans/create-pull-request from 5.0.2 to 6.0.5 by @dependabot in https://github.com/bentoml/OpenLLM/pull/976
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/980
- chore(deps): bump rlespinasse/github-slug-action from 4.4.1 to 4.5.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/988
- chore(deps): bump softprops/action-gh-release from 1 to 2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/987
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/989
- chore(deps): bump taiki-e/install-action from 2.33.9 to 2.33.22 by @dependabot in https://github.com/bentoml/OpenLLM/pull/985
- chore(deps): bump actions/checkout from 4.1.1 to 4.1.5 by @dependabot in https://github.com/bentoml/OpenLLM/pull/984
- chore(deps): bump next from 13.4.8 to 14.1.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/983
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/994
- chore(deps): bump actions/checkout from 4.1.5 to 4.1.6 by @dependabot in https://github.com/bentoml/OpenLLM/pull/993
- chore(deps): bump github/codeql-action from 3.25.3 to 3.25.5 by @dependabot in https://github.com/bentoml/OpenLLM/pull/992
- chore(deps): bump aquasecurity/trivy-action from 0.19.0 to 0.20.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/991
- fix(docs): update correct BentoML links by @dennisrall in https://github.com/bentoml/OpenLLM/pull/995
- tests: add additional basic testing by @aarnphm in https://github.com/bentoml/OpenLLM/pull/982
- infra: prepare 0.5 releases by @aarnphm in https://github.com/bentoml/OpenLLM/pull/996
- chore(deps): bump actions/upload-artifact from 3.1.3 to 4.3.3 by @dependabot in https://github.com/bentoml/OpenLLM/pull/986
- chore(deps): bump actions/download-artifact from 3.0.2 to 4.1.7 by @dependabot in https://github.com/bentoml/OpenLLM/pull/990
- chore(qol): update CLI options and performance upgrade for build cache by @aarnphm in https://github.com/bentoml/OpenLLM/pull/997
- feat(ci): running CI on paperspace by @aarnphm in https://github.com/bentoml/OpenLLM/pull/998
- chore(deps): bump taiki-e/install-action from 2.33.22 to 2.33.34 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1000
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/1002
New Contributors
- @rudeigerc made their first contribution in https://github.com/bentoml/OpenLLM/pull/943
- @dennisrall made their first contribution in https://github.com/bentoml/OpenLLM/pull/995
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.44...v0.5.0
Published by github-actions[bot] over 1 year ago
openllm - v0.5.0-alpha.15
Installation
```bash
pip install openllm==0.5.0-alpha.15
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.5.0-alpha.15
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.5.0-alpha.15 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
What's Changed
- chore(deps): bump docker/setup-buildx-action from 3.0.0 to 3.2.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/941
- chore(deps): bump github/codeql-action from 3.24.3 to 3.24.9 by @dependabot in https://github.com/bentoml/OpenLLM/pull/939
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/942
- fix(compat): use annotated type from typing_compat by @rudeigerc in https://github.com/bentoml/OpenLLM/pull/943
- docs: Update high-level messaging by @Sherlock113 in https://github.com/bentoml/OpenLLM/pull/949
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/947
- chore(deps): bump aquasecurity/trivy-action from 0.18.0 to 0.19.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/946
- chore(deps): bump taiki-e/install-action from 2.27.9 to 2.32.9 by @dependabot in https://github.com/bentoml/OpenLLM/pull/945
- Update README.md by @parano in https://github.com/bentoml/OpenLLM/pull/964
- chore(deps): bump taiki-e/install-action from 2.32.9 to 2.33.9 by @dependabot in https://github.com/bentoml/OpenLLM/pull/970
- chore(deps): bump sigstore/cosign-installer from 3.4.0 to 3.5.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/954
- chore(deps): bump docker/metadata-action from 5.5.0 to 5.5.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/956
- chore(deps): bump actions/setup-python from 5.0.0 to 5.1.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/955
- chore(deps): bump pypa/gh-action-pypi-publish from 1.8.11 to 1.8.14 by @dependabot in https://github.com/bentoml/OpenLLM/pull/958
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/959
- fix: update correct CompletionOutput object by @aarnphm in https://github.com/bentoml/OpenLLM/pull/973
- chore(deps): bump docker/build-push-action from 5.1.0 to 5.3.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/979
- chore(deps): bump docker/login-action from 3.0.0 to 3.1.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/978
- chore(deps): bump github/codeql-action from 3.24.9 to 3.25.3 by @dependabot in https://github.com/bentoml/OpenLLM/pull/977
- chore(deps): bump docker/setup-buildx-action from 3.2.0 to 3.3.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/975
- fix: make sure to respect additional parameters parse by @aarnphm in https://github.com/bentoml/OpenLLM/pull/981
- chore(deps): bump peter-evans/create-pull-request from 5.0.2 to 6.0.5 by @dependabot in https://github.com/bentoml/OpenLLM/pull/976
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/980
- chore(deps): bump rlespinasse/github-slug-action from 4.4.1 to 4.5.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/988
- chore(deps): bump softprops/action-gh-release from 1 to 2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/987
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/989
- chore(deps): bump taiki-e/install-action from 2.33.9 to 2.33.22 by @dependabot in https://github.com/bentoml/OpenLLM/pull/985
- chore(deps): bump actions/checkout from 4.1.1 to 4.1.5 by @dependabot in https://github.com/bentoml/OpenLLM/pull/984
- chore(deps): bump next from 13.4.8 to 14.1.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/983
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/994
- chore(deps): bump actions/checkout from 4.1.5 to 4.1.6 by @dependabot in https://github.com/bentoml/OpenLLM/pull/993
- chore(deps): bump github/codeql-action from 3.25.3 to 3.25.5 by @dependabot in https://github.com/bentoml/OpenLLM/pull/992
- chore(deps): bump aquasecurity/trivy-action from 0.19.0 to 0.20.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/991
- fix(docs): update correct BentoML links by @dennisrall in https://github.com/bentoml/OpenLLM/pull/995
- tests: add additional basic testing by @aarnphm in https://github.com/bentoml/OpenLLM/pull/982
- infra: prepare 0.5 releases by @aarnphm in https://github.com/bentoml/OpenLLM/pull/996
- chore(deps): bump actions/upload-artifact from 3.1.3 to 4.3.3 by @dependabot in https://github.com/bentoml/OpenLLM/pull/986
- chore(deps): bump actions/download-artifact from 3.0.2 to 4.1.7 by @dependabot in https://github.com/bentoml/OpenLLM/pull/990
- chore(qol): update CLI options and performance upgrade for build cache by @aarnphm in https://github.com/bentoml/OpenLLM/pull/997
- feat(ci): running CI on paperspace by @aarnphm in https://github.com/bentoml/OpenLLM/pull/998
- chore(deps): bump taiki-e/install-action from 2.33.22 to 2.33.34 by @dependabot in https://github.com/bentoml/OpenLLM/pull/1000
New Contributors
- @rudeigerc made their first contribution in https://github.com/bentoml/OpenLLM/pull/943
- @dennisrall made their first contribution in https://github.com/bentoml/OpenLLM/pull/995
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.5.0-alpha.1...v0.5.0-alpha.15
Published by github-actions[bot] over 1 year ago
openllm - v0.4.44
Installation
```bash
pip install openllm==0.4.44
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.4.44
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.44 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
What's Changed
- fix: remove vllm dependency for pytorch bento by @larme in https://github.com/bentoml/OpenLLM/pull/893
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.43...v0.4.44
Published by github-actions[bot] about 2 years ago
openllm - v0.4.43
Installation
```bash
pip install openllm==0.4.43
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.4.43
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.43 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
What's Changed
- fix: limit BentoML version range by @larme in https://github.com/bentoml/OpenLLM/pull/881
- chore: bump up bentoml version to 1.1.11 by @larme in https://github.com/bentoml/OpenLLM/pull/883
- Bump BentoML version in tools by @larme in https://github.com/bentoml/OpenLLM/pull/884
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.42...v0.4.43
Published by github-actions[bot] about 2 years ago
openllm - v0.4.42
Installation
```bash
pip install openllm==0.4.42
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.4.42
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.42 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
What's Changed
- docs: Update opt example to ms-phi by @Sherlock113 in https://github.com/bentoml/OpenLLM/pull/805
- chore(script): run vendored scripts by @aarnphm in https://github.com/bentoml/OpenLLM/pull/808
- docs: README.md typo by @weibeu in https://github.com/bentoml/OpenLLM/pull/819
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/818
- chore(deps): bump docker/metadata-action from 5.3.0 to 5.4.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/814
- chore(deps): bump taiki-e/install-action from 2.22.5 to 2.23.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/813
- chore(deps): bump github/codeql-action from 3.22.11 to 3.22.12 by @dependabot in https://github.com/bentoml/OpenLLM/pull/815
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/825
- chore(deps): bump crazy-max/ghaction-import-gpg from 6.0.0 to 6.1.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/824
- chore(deps): bump taiki-e/install-action from 2.23.1 to 2.23.7 by @dependabot in https://github.com/bentoml/OpenLLM/pull/823
- docs: Add Llamaindex in freedom to build by @Sherlock113 in https://github.com/bentoml/OpenLLM/pull/826
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/836
- chore(deps): bump docker/metadata-action from 5.4.0 to 5.5.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/834
- chore(deps): bump aquasecurity/trivy-action from 0.16.0 to 0.16.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/832
- chore(deps): bump taiki-e/install-action from 2.23.7 to 2.24.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/833
- chore(deps): bump vllm to 0.2.7 by @aarnphm in https://github.com/bentoml/OpenLLM/pull/837
- chore: update discord link by @aarnphm in https://github.com/bentoml/OpenLLM/pull/838
- improv(package): use python slim base image and let pytorch install cuda by @larme in https://github.com/bentoml/OpenLLM/pull/807
- fix(dockerfile): conflict deps by @aarnphm in https://github.com/bentoml/OpenLLM/pull/841
- chore: fix typo in list_models pydoc by @fuzzie360 in https://github.com/bentoml/OpenLLM/pull/847
- docs: update README.md telemetry code link by @fuzzie360 in https://github.com/bentoml/OpenLLM/pull/842
- chore(deps): bump taiki-e/install-action from 2.24.1 to 2.25.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/846
- chore(deps): bump github/codeql-action from 3.22.12 to 3.23.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/844
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/848
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/858
- chore(deps): bump taiki-e/install-action from 2.25.1 to 2.25.9 by @dependabot in https://github.com/bentoml/OpenLLM/pull/856
- chore(deps): bump github/codeql-action from 3.23.0 to 3.23.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/855
- fix: proper SSE handling for vllm by @larme in https://github.com/bentoml/OpenLLM/pull/877
- chore: set stop to empty list by default by @larme in https://github.com/bentoml/OpenLLM/pull/878
- fix: all runners sse output by @larme in https://github.com/bentoml/OpenLLM/pull/880
New Contributors
- @weibeu made their first contribution in https://github.com/bentoml/OpenLLM/pull/819
- @fuzzie360 made their first contribution in https://github.com/bentoml/OpenLLM/pull/847
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.41...v0.4.42
Published by github-actions[bot] about 2 years ago
openllm - v0.4.41
GPTQ support
The vLLM backend now supports GPTQ with upstream vLLM:
```bash
openllm start TheBloke/Mistral-7B-Instruct-v0.2-GPTQ --backend vllm --quantise gptq
```
Installation
```bash
pip install openllm==0.4.41
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.4.41
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.41 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
What's Changed
- docs: add notes about dtypes usage. by @aarnphm in https://github.com/bentoml/OpenLLM/pull/786
- chore(deps): bump taiki-e/install-action from 2.22.0 to 2.22.5 by @dependabot in https://github.com/bentoml/OpenLLM/pull/790
- chore(deps): bump github/codeql-action from 2.22.9 to 3.22.11 by @dependabot in https://github.com/bentoml/OpenLLM/pull/794
- chore(deps): bump sigstore/cosign-installer from 3.2.0 to 3.3.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/793
- chore(deps): bump actions/download-artifact from 3.0.2 to 4.0.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/791
- chore(deps): bump actions/upload-artifact from 3.1.3 to 4.0.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/792
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/796
- fix(cli): avoid runtime __origin__ check for older Python by @aarnphm in https://github.com/bentoml/OpenLLM/pull/798
- feat(vllm): support GPTQ with 0.2.6 by @aarnphm in https://github.com/bentoml/OpenLLM/pull/797
- fix(ci): lock to v3 iteration of actions/artifacts workflow by @aarnphm in https://github.com/bentoml/OpenLLM/pull/799
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.40...v0.4.41
Published by github-actions[bot] about 2 years ago
openllm - v0.4.40
Installation
```bash
pip install openllm==0.4.40
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.4.40
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.40 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
What's Changed
- fix(infra): conform ruff to 150 LL by @aarnphm in https://github.com/bentoml/OpenLLM/pull/781
- infra: update blame ignore to formatter hash by @aarnphm in https://github.com/bentoml/OpenLLM/pull/782
- perf: upgrade mixtral to use expert parallelism by @aarnphm in https://github.com/bentoml/OpenLLM/pull/783
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.39...v0.4.40
Published by github-actions[bot] about 2 years ago
openllm - v0.4.39
Installation
```bash
pip install openllm==0.4.39
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.4.39
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.39 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
What's Changed
- fix(logprobs): correct check logprobs by @aarnphm in https://github.com/bentoml/OpenLLM/pull/779
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.38...v0.4.39
Published by github-actions[bot] about 2 years ago
openllm - v0.4.38
Installation
```bash
pip install openllm==0.4.38
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.4.38
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.38 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
What's Changed
- fix(mixtral): correct chat templates to remove additional spacing by @aarnphm in https://github.com/bentoml/OpenLLM/pull/774
- fix(cli): correct set arguments for openllm import and openllm build by @aarnphm in https://github.com/bentoml/OpenLLM/pull/775
- fix(mixtral): setup hack atm to load weights from pt specifically instead of safetensors by @aarnphm in https://github.com/bentoml/OpenLLM/pull/776
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.37...v0.4.38
Published by github-actions[bot] about 2 years ago
openllm - v0.4.37
Installation
```bash
pip install openllm==0.4.37
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.4.37
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.37 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
What's Changed
- feat(mixtral): correct support for mixtral by @aarnphm in https://github.com/bentoml/OpenLLM/pull/772
- chore: running all script when installation by @aarnphm in https://github.com/bentoml/OpenLLM/pull/773
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.36...v0.4.37
Published by github-actions[bot] about 2 years ago
openllm - v0.4.36
Mixtral support
Supports Mixtral on BentoCloud with vLLM and all required dependencies.
Bentos built with openllm now default to Python 3.11 to make this change work.
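As an illustration, serving Mixtral locally with the vLLM backend would look roughly like the sketch below; the Hugging Face model ID is our assumption, and a multi-GPU machine is typically required for a model of this size:
```bash
openllm start mistralai/Mixtral-8x7B-Instruct-v0.1 --backend vllm
```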
Installation
```bash
pip install openllm==0.4.36
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.4.36
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.36 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
What's Changed
- feat(openai): supports echo by @aarnphm in https://github.com/bentoml/OpenLLM/pull/760
- fix(openai): logprobs when echo is enabled by @aarnphm in https://github.com/bentoml/OpenLLM/pull/761
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/767
- chore(deps): bump docker/metadata-action from 5.2.0 to 5.3.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/766
- chore(deps): bump actions/setup-python from 4.7.1 to 5.0.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/765
- chore(deps): bump taiki-e/install-action from 2.21.26 to 2.22.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/764
- chore(deps): bump aquasecurity/trivy-action from 0.14.0 to 0.16.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/763
- chore(deps): bump github/codeql-action from 2.22.8 to 2.22.9 by @dependabot in https://github.com/bentoml/OpenLLM/pull/762
- feat: mixtral support by @aarnphm in https://github.com/bentoml/OpenLLM/pull/770
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.35...v0.4.36
Published by github-actions[bot] about 2 years ago
openllm - v0.4.35
Installation
```bash
pip install openllm==0.4.35
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.4.35
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.35 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
What's Changed
- chore(deps): bump pypa/gh-action-pypi-publish from 1.8.10 to 1.8.11 by @dependabot in https://github.com/bentoml/OpenLLM/pull/749
- chore(deps): bump docker/metadata-action from 5.0.0 to 5.2.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/751
- chore(deps): bump taiki-e/install-action from 2.21.19 to 2.21.26 by @dependabot in https://github.com/bentoml/OpenLLM/pull/750
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/753
- fix(logprobs): explicitly set logprobs=None by @aarnphm in https://github.com/bentoml/OpenLLM/pull/757
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.34...v0.4.35
Published by github-actions[bot] about 2 years ago
openllm - v0.4.34
Installation
```bash
pip install openllm==0.4.34
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.4.34
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.34 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
What's Changed
- feat(models): Support qwen by @yansheng105 in https://github.com/bentoml/OpenLLM/pull/742
New Contributors
- @yansheng105 made their first contribution in https://github.com/bentoml/OpenLLM/pull/742
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.33...v0.4.34
Published by github-actions[bot] about 2 years ago
openllm - v0.4.33
Installation
```bash
pip install openllm==0.4.33
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.4.33
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.33 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.32...v0.4.33
Published by github-actions[bot] about 2 years ago
openllm - v0.4.32
Installation
```bash
pip install openllm==0.4.32
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.4.32
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.32 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
What's Changed
- chore(deps): bump taiki-e/install-action from 2.21.17 to 2.21.19 by @dependabot in https://github.com/bentoml/OpenLLM/pull/735
- chore(deps): bump github/codeql-action from 2.22.7 to 2.22.8 by @dependabot in https://github.com/bentoml/OpenLLM/pull/734
- chore: revert back previous backend support PyTorch by @aarnphm in https://github.com/bentoml/OpenLLM/pull/739
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.31...v0.4.32
Published by github-actions[bot] about 2 years ago
openllm - v0.4.31
Installation
```bash
pip install openllm==0.4.31
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.4.31
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.31 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
What's Changed
- fix(docs): remove invalid options by @aarnphm in https://github.com/bentoml/OpenLLM/pull/733
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.30...v0.4.31
Published by github-actions[bot] about 2 years ago
openllm - v0.4.30
Installation
```bash
pip install openllm==0.4.30
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.4.30
```
Usage
All available models: `openllm models`
To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`
To run OpenLLM within a container environment (requires GPUs):
```bash
docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.30 start HuggingFaceH4/zephyr-7b-beta
```
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.29...v0.4.30
Published by github-actions[bot] about 2 years ago
openllm - v0.4.29
Installation
bash
pip install openllm==0.4.29
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.29
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.29 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.28...v0.4.29
- Python
Published by github-actions[bot] about 2 years ago
openllm - v0.4.28
Installation
bash
pip install openllm==0.4.28
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.28
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.28 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
What's Changed
- fix(baichuan): supported from baichuan 2 from now on. by @MingLiangDai in https://github.com/bentoml/OpenLLM/pull/728
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.27...v0.4.28
- Python
Published by github-actions[bot] about 2 years ago
openllm - v0.4.26
Installation
bash
pip install openllm==0.4.26
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.26
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.26 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
What's Changed
- fix(infra): setup higher timer for building container images by @aarnphm in https://github.com/bentoml/OpenLLM/pull/723
- fix(client): correct schemas parser from correct response output by @aarnphm in https://github.com/bentoml/OpenLLM/pull/724
- feat(openai): chat templates and complete control of prompt generation by @aarnphm in https://github.com/bentoml/OpenLLM/pull/725
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.25...v0.4.26
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.25
Installation
bash
pip install openllm==0.4.25
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.25
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.25 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
What's Changed
- fix(openai): correct stop tokens and finish_reason state by @aarnphm in https://github.com/bentoml/OpenLLM/pull/722
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.24...v0.4.25
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.24
Installation
bash
pip install openllm==0.4.24
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.24
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.24 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.23...v0.4.24
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.23
Installation
bash
pip install openllm==0.4.23
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.23
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.23 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
What's Changed
- chore: cleanup unused prompt templates by @aarnphm in https://github.com/bentoml/OpenLLM/pull/713
- feat(generation): add support for eos_token_id by @aarnphm in https://github.com/bentoml/OpenLLM/pull/714
- fix(ci): tests by @aarnphm in https://github.com/bentoml/OpenLLM/pull/715
- refactor: delete unused code by @aarnphm in https://github.com/bentoml/OpenLLM/pull/716
- chore(logger): fix logger and streamline style by @aarnphm in https://github.com/bentoml/OpenLLM/pull/717
- chore(strategy): compact and add stubs by @aarnphm in https://github.com/bentoml/OpenLLM/pull/718
- chore(types): append additional types change by @aarnphm in https://github.com/bentoml/OpenLLM/pull/719
- fix(base-image): update base image to include cuda for now by @aarnphm in https://github.com/bentoml/OpenLLM/pull/720
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.22...v0.4.23
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.22
Installation
bash
pip install openllm==0.4.22
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.22
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.22 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
What's Changed
- refactor: update runner helpers and add max_model_len by @aarnphm in https://github.com/bentoml/OpenLLM/pull/712
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.21...v0.4.22
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.21
Installation
bash
pip install openllm==0.4.21
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.21
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.21 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
What's Changed
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/711
- chore(deps): bump taiki-e/install-action from 2.21.11 to 2.21.17 by @dependabot in https://github.com/bentoml/OpenLLM/pull/709
- chore(deps): bump docker/build-push-action from 5.0.0 to 5.1.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/708
- chore(deps): bump github/codeql-action from 2.22.5 to 2.22.7 by @dependabot in https://github.com/bentoml/OpenLLM/pull/707
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.20...v0.4.21
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.20
Installation
bash
pip install openllm==0.4.20
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.20
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.20 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.19...v0.4.20
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.19
Installation
bash
pip install openllm==0.4.19
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.19
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.19 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.18...v0.4.19
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.18
Installation
bash
pip install openllm==0.4.18
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.18
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.18 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
What's Changed
- chore: update lower bound version of bentoml to avoid breakage by @aarnphm in https://github.com/bentoml/OpenLLM/pull/703
- feat(openai): dynamic model_type registration by @aarnphm in https://github.com/bentoml/OpenLLM/pull/704
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.17...v0.4.18
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.17
Installation
bash
pip install openllm==0.4.17
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.17
Usage
All available models: openllm models
To start an LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.17 start HuggingFaceH4/zephyr-7b-beta
Find more information about this release in the CHANGELOG.md
What's Changed
- infra: update generate notes and better local handle by @aarnphm in https://github.com/bentoml/OpenLLM/pull/701
- fix(backend): correct use variable for backend when initialisation by @aarnphm in https://github.com/bentoml/OpenLLM/pull/702
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.16...v0.4.17
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.16
Installation
bash
pip install openllm==0.4.16
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.16
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.16 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.16
Find more information about this release in the CHANGELOG.md
What's Changed
- feat(ctranslate): initial infrastructure support by @aarnphm in https://github.com/bentoml/OpenLLM/pull/694
- feat(vllm): bump to 0.2.2 by @aarnphm in https://github.com/bentoml/OpenLLM/pull/695
- feat(engine): CTranslate2 by @aarnphm in https://github.com/bentoml/OpenLLM/pull/698
- chore: update documentation about runtime by @aarnphm in https://github.com/bentoml/OpenLLM/pull/699
- chore: update changelog [skip ci] by @aarnphm in https://github.com/bentoml/OpenLLM/pull/700
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.15...v0.4.16
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.15
Installation
bash
pip install openllm==0.4.15
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.15
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.15 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.15
Find more information about this release in the CHANGELOG.md
What's Changed
- fix(cattrs): strictly lock <23.2 until we upgrade validation logic by @aarnphm in https://github.com/bentoml/OpenLLM/pull/690
- fix(annotations): check library through find_spec by @aarnphm in https://github.com/bentoml/OpenLLM/pull/691
- feat: heuristics logprobs by @aarnphm in https://github.com/bentoml/OpenLLM/pull/692
- chore: update documentation by @aarnphm in https://github.com/bentoml/OpenLLM/pull/693
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.14...v0.4.15
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.14
Installation
bash
pip install openllm==0.4.14
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.14
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.14 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.14
Find more information about this release in the CHANGELOG.md
What's Changed
- fix(dependencies): ignore broken cattrs release by @aarnphm in https://github.com/bentoml/OpenLLM/pull/689
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.13...v0.4.14
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.13
Installation
bash
pip install openllm==0.4.13
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.13
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.13 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.13
Find more information about this release in the CHANGELOG.md
What's Changed
- fix(llm): remove unnecessary check by @aarnphm in https://github.com/bentoml/OpenLLM/pull/683
- examples: improve instructions and cleanup simple API server by @aarnphm in https://github.com/bentoml/OpenLLM/pull/684
- fix(build): lock lower version based on each release and update infra by @aarnphm in https://github.com/bentoml/OpenLLM/pull/686
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.12...v0.4.13
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.12
Installation
bash
pip install openllm==0.4.12
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.12
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.12 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.12
Find more information about this release in the CHANGELOG.md
What's Changed
- fix(envvar): explicitly set NVIDIA_DRIVER_CAPABILITIES by @aarnphm in https://github.com/bentoml/OpenLLM/pull/681
- fix(torch_dtype): correctly infer based on options by @aarnphm in https://github.com/bentoml/OpenLLM/pull/682
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.11...v0.4.12
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.11
Installation
bash
pip install openllm==0.4.11
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.11
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.11 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.11
Find more information about this release in the CHANGELOG.md
What's Changed
- infra: update cbfmt options by @aarnphm in https://github.com/bentoml/OpenLLM/pull/676
- fix(examples): add support for streaming feature by @aarnphm in https://github.com/bentoml/OpenLLM/pull/677
- fix: correct set item for attrs >23.1 by @aarnphm in https://github.com/bentoml/OpenLLM/pull/678
- fix(build): correctly parse default env for container by @aarnphm in https://github.com/bentoml/OpenLLM/pull/679
- fix(env): correct format environment on docker by @aarnphm in https://github.com/bentoml/OpenLLM/pull/680
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.10...v0.4.11
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.10
Installation
bash
pip install openllm==0.4.10
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.10
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.10 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.10
Find more information about this release in the CHANGELOG.md
What's Changed
- fix(runner): remove keyword args for attrs.get() by @jeffwang0516 in https://github.com/bentoml/OpenLLM/pull/661
- fix: update notebook by @xianml in https://github.com/bentoml/OpenLLM/pull/662
- feat(type): provide structured annotations stubs by @aarnphm in https://github.com/bentoml/OpenLLM/pull/663
- feat(llm): respect warnings environment for dtype warning by @aarnphm in https://github.com/bentoml/OpenLLM/pull/664
- infra: makes huggingface-hub requirements on fine-tune by @aarnphm in https://github.com/bentoml/OpenLLM/pull/665
- types: update stubs for remaining entrypoints by @aarnphm in https://github.com/bentoml/OpenLLM/pull/667
- perf: reduce footprint by @aarnphm in https://github.com/bentoml/OpenLLM/pull/668
- perf(build): locking and improve build speed by @aarnphm in https://github.com/bentoml/OpenLLM/pull/669
- docs: add LlamaIndex integration by @aarnphm in https://github.com/bentoml/OpenLLM/pull/646
- infra: remove codegolf by @aarnphm in https://github.com/bentoml/OpenLLM/pull/671
- feat(models): Phi 1.5 by @aarnphm in https://github.com/bentoml/OpenLLM/pull/672
- fix(docs): chatglm support on vLLM by @aarnphm in https://github.com/bentoml/OpenLLM/pull/673
- chore(loading): include verbose warning about trust_remote_code by @aarnphm in https://github.com/bentoml/OpenLLM/pull/674
- perf: potentially reduce image size by @aarnphm in https://github.com/bentoml/OpenLLM/pull/675
New Contributors
- @jeffwang0516 made their first contribution in https://github.com/bentoml/OpenLLM/pull/661
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.9...v0.4.10
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.9
Installation
bash
pip install openllm==0.4.9
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.9
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.9 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.9
Find more information about this release in the CHANGELOG.md
What's Changed
- infra: update scripts to run update readme automatically by @aarnphm in https://github.com/bentoml/OpenLLM/pull/658
- chore: update requirements in README.md by @aarnphm in https://github.com/bentoml/OpenLLM/pull/659
- fix(falcon): remove early_stopping default arguments by @aarnphm in https://github.com/bentoml/OpenLLM/pull/660
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.8...v0.4.9
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.8
Installation
bash
pip install openllm==0.4.8
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.8
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.8 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.8
Find more information about this release in the CHANGELOG.md
What's Changed
- docs: update instruction adding new models and remove command docstring by @aarnphm in https://github.com/bentoml/OpenLLM/pull/654
- chore(cli): move playground to CLI components by @aarnphm in https://github.com/bentoml/OpenLLM/pull/655
- perf: improve build logics and cleanup speed by @aarnphm in https://github.com/bentoml/OpenLLM/pull/657
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.7...v0.4.8
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.7
Installation
bash
pip install openllm==0.4.7
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.7
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.7 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.7
Find more information about this release in the CHANGELOG.md
What's Changed
- refactor: use DEBUG env-var instead of OPENLLM_DEV_DEBUG by @aarnphm in https://github.com/bentoml/OpenLLM/pull/647
- fix(cli): update context name parsing correctly by @aarnphm in https://github.com/bentoml/OpenLLM/pull/652
- feat: Yi models by @aarnphm in https://github.com/bentoml/OpenLLM/pull/651
- fix: correct OPENLLM_DEV_BUILD check by @xianml in https://github.com/bentoml/OpenLLM/pull/653
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.6...v0.4.7
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.6
Installation
bash
pip install openllm==0.4.6
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.6
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.6 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.6
Find more information about this release in the CHANGELOG.md
What's Changed
- chore: cleanup unused code path by @aarnphm in https://github.com/bentoml/OpenLLM/pull/633
- perf(model): update mistral inference parameters and prompt format by @larme in https://github.com/bentoml/OpenLLM/pull/632
- infra: remove unused postprocess_generate by @aarnphm in https://github.com/bentoml/OpenLLM/pull/634
- docs: update README.md by @aarnphm in https://github.com/bentoml/OpenLLM/pull/635
- fix(client): correct destructor of the httpx object for both sync and async by @aarnphm in https://github.com/bentoml/OpenLLM/pull/636
- doc: update adding new model guide by @larme in https://github.com/bentoml/OpenLLM/pull/637
- fix(generation): compatibility dtype with CPU by @aarnphm in https://github.com/bentoml/OpenLLM/pull/638
- fix(cpu): more verbose definition for dtype casting by @aarnphm in https://github.com/bentoml/OpenLLM/pull/639
- fix(service): to yield out correct JSON objects by @aarnphm in https://github.com/bentoml/OpenLLM/pull/640
- fix(cli): set default dtype to auto infer by @aarnphm in https://github.com/bentoml/OpenLLM/pull/642
- fix(dependencies): lock build < 1 for now by @aarnphm in https://github.com/bentoml/OpenLLM/pull/643
- chore(openapi): unify inject param by @aarnphm in https://github.com/bentoml/OpenLLM/pull/645
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.5...v0.4.6
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.5
Installation
bash
pip install openllm==0.4.5
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.5
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.5 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.5
Find more information about this release in the CHANGELOG.md
What's Changed
- refactor(cli): move out to its own packages by @aarnphm in https://github.com/bentoml/OpenLLM/pull/619
- fix(cli): correct set working_dir by @aarnphm in https://github.com/bentoml/OpenLLM/pull/620
- chore(cli): always show available models by @aarnphm in https://github.com/bentoml/OpenLLM/pull/621
- fix(sdk): make sure build to quiet out stdout by @aarnphm in https://github.com/bentoml/OpenLLM/pull/622
- chore: update jupyter notebooks with new API by @aarnphm in https://github.com/bentoml/OpenLLM/pull/623
- fix(ruff): correct consistency between isort and formatter by @aarnphm in https://github.com/bentoml/OpenLLM/pull/624
- feat(vllm): support passing specific dtype by @aarnphm in https://github.com/bentoml/OpenLLM/pull/626
- chore(deps): bump taiki-e/install-action from 2.21.8 to 2.21.11 by @dependabot in https://github.com/bentoml/OpenLLM/pull/625
- feat(cli): --dtype arguments by @aarnphm in https://github.com/bentoml/OpenLLM/pull/627
- fix(cli): make sure to pass the dtype to subprocess service by @aarnphm in https://github.com/bentoml/OpenLLM/pull/628
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/629
- infra: removing clojure frontend from infra cycle by @aarnphm in https://github.com/bentoml/OpenLLM/pull/630
- fix(torch_dtype): load eagerly by @aarnphm in https://github.com/bentoml/OpenLLM/pull/631
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.4...v0.4.5
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.4
Installation
bash
pip install openllm==0.4.4
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.4
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.4 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.4
Find more information about this release in the CHANGELOG.md
What's Changed
- chore: no need compat workaround for setting cell_contents by @aarnphm in https://github.com/bentoml/OpenLLM/pull/616
- chore(llm): expose quantise and lazy load heavy imports by @aarnphm in https://github.com/bentoml/OpenLLM/pull/617
- feat(llm): update warning envvar and add embedded mode by @aarnphm in https://github.com/bentoml/OpenLLM/pull/618
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.3...v0.4.4
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.3
Installation
bash
pip install openllm==0.4.3
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.3
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.3 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.3
Find more information about this release in the CHANGELOG.md
What's Changed
- feat(server): helpers endpoints for conversation format by @aarnphm in https://github.com/bentoml/OpenLLM/pull/613
- feat(client): support return response_cls to string by @aarnphm in https://github.com/bentoml/OpenLLM/pull/614
- feat(client): add helpers subclass by @aarnphm in https://github.com/bentoml/OpenLLM/pull/615
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.2...v0.4.3
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.2
Installation
bash
pip install openllm==0.4.2
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.2
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.2 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.2
Find more information about this release in the CHANGELOG.md
What's Changed
- refactor(cli): cleanup API by @aarnphm in https://github.com/bentoml/OpenLLM/pull/592
- infra: move out clojure to external by @aarnphm in https://github.com/bentoml/OpenLLM/pull/593
- infra: using ruff formatter by @aarnphm in https://github.com/bentoml/OpenLLM/pull/594
- infra: remove tsconfig by @aarnphm in https://github.com/bentoml/OpenLLM/pull/595
- revert: configuration not to dump flatten by @aarnphm in https://github.com/bentoml/OpenLLM/pull/597
- package: add openllm core dependencies to labels by @aarnphm in https://github.com/bentoml/OpenLLM/pull/600
- fix: loading correct local models by @aarnphm in https://github.com/bentoml/OpenLLM/pull/599
- fix: correct importmodules locally by @aarnphm in https://github.com/bentoml/OpenLLM/pull/601
- fix: overload flattened dict by @aarnphm in https://github.com/bentoml/OpenLLM/pull/602
- feat(client): support authentication token and shim implementation by @aarnphm in https://github.com/bentoml/OpenLLM/pull/605
- fix(client): check for should retry header by @aarnphm in https://github.com/bentoml/OpenLLM/pull/606
- chore(client): remove unused state enum by @aarnphm in https://github.com/bentoml/OpenLLM/pull/609
- chore: remove generated stubs for now by @aarnphm in https://github.com/bentoml/OpenLLM/pull/610
- refactor(config): simplify configuration and update start CLI output by @aarnphm in https://github.com/bentoml/OpenLLM/pull/611
- docs: update supported feature set by @aarnphm in https://github.com/bentoml/OpenLLM/pull/612
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.1...v0.4.2
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.1
OpenLLM version 0.4.0 introduces several enhanced features.
Unified API and Continuous Batching support: 0.4.0 brings a revamped API for OpenLLM. Users can now run LLMs with two new APIs. Under an async context, calls to both llm.generate_iterator and llm.generate now support continuous batching for the most optimal throughput.
await llm.generate(prompt, stop, **kwargs): one-shot generation for any given prompt
```python
import openllm, asyncio

llm = openllm.LLM("HuggingFaceH4/zephyr-7b-beta")

async def infer(prompt, **kwargs):
    return await llm.generate(prompt, **kwargs)

asyncio.run(infer("Time is a definition of"))
```
await llm.generate_iterator(prompt, stop, **kwargs): streaming generation that returns tokens as they become ready
```python
import bentoml, openllm

llm = openllm.LLM("HuggingFaceH4/zephyr-7b-beta")

svc = bentoml.Service(name='zephyr-instruct', runners=[llm.runner])

@svc.api(input=bentoml.io.Text(), output=bentoml.io.Text(media_type='text/event-stream'))
async def prompt(input_text: str) -> str:
    async for generation in llm.generate_iterator(input_text):
        yield f"data: {generation.outputs[0].text}\n\n"
```
Internally, the runner can be accessed only with llm.runner and llm.runner.generate_iterator. The backend is now automatically inferred based on the presence of vllm in the environment. However, if you prefer to manually specify the backend, you can achieve this by using openllm.LLM("HuggingFaceH4/zephyr-7b-beta", backend='pt'). Quantization can also be passed directly to this new LLM API:
```python
llm = openllm.LLM("TheBloke/Mistral-7B-Instruct-v0.1-AWQ", quantize='awq')
```
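To make the two backend options above concrete, here is a minimal sketch (the model ID is the one used throughout these notes; whether the automatic path picks vLLM depends on what is installed locally):
```python
import openllm

# Backend is inferred automatically: vLLM when it is importable, PyTorch otherwise.
llm_auto = openllm.LLM("HuggingFaceH4/zephyr-7b-beta")

# Force the PyTorch backend explicitly, regardless of what is installed.
llm_pt = openllm.LLM("HuggingFaceH4/zephyr-7b-beta", backend='pt')
```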
Mistral Model: OpenLLM now supports Mistral. To start a Mistral server, simply execute openllm start mistral.
AWQ and SqueezeLLM Quantization: AWQ and SqueezeLLM are now supported with the vLLM backend. Simply pass --quantize [awq|squeezellm] to openllm start to use AWQ or SqueezeLLM quantization. IMPORTANT: For AWQ it is crucial that the model weights are already quantized with AWQ. Please look on the HuggingFace Hub for the AWQ variant of the model you want to use. Currently, only AWQ with vLLM is fully tested and supported.
General bug fixes: fixed a bug regarding tag generation. Standalone Bentos that use this new API should work as expected if the model already exists in the model store.
- For consistency, make sure to run openllm prune -y --include-bentos
Installation
bash
pip install openllm==0.4.1
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.1
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.1 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.1
Find more information about this release in the CHANGELOG.md
What's Changed
- chore(runner): yield the outputs directly by @aarnphm in https://github.com/bentoml/OpenLLM/pull/573
- chore(openai): simplify client examples by @aarnphm in https://github.com/bentoml/OpenLLM/pull/574
- fix(examples): correct dependencies in requirements.txt [skip ci] by @aarnphm in https://github.com/bentoml/OpenLLM/pull/575
- refactor: cleanup typing to expose correct API by @aarnphm in https://github.com/bentoml/OpenLLM/pull/576
- fix(stubs): update initialisation types by @aarnphm in https://github.com/bentoml/OpenLLM/pull/577
- refactor(strategies): move logics into openllm-python by @aarnphm in https://github.com/bentoml/OpenLLM/pull/578
- chore(service): cleanup API by @aarnphm in https://github.com/bentoml/OpenLLM/pull/579
- infra: disable npm updates and correct python packages by @aarnphm in https://github.com/bentoml/OpenLLM/pull/580
- chore(deps): bump aquasecurity/trivy-action from 0.13.1 to 0.14.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/583
- chore(deps): bump taiki-e/install-action from 2.21.7 to 2.21.8 by @dependabot in https://github.com/bentoml/OpenLLM/pull/581
- chore(deps): bump sigstore/cosign-installer from 3.1.2 to 3.2.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/582
- fix: device imports using strategies by @aarnphm in https://github.com/bentoml/OpenLLM/pull/584
- fix(gptq): update config fields by @aarnphm in https://github.com/bentoml/OpenLLM/pull/585
- fix: unbound variable for completion client by @aarnphm in https://github.com/bentoml/OpenLLM/pull/587
- fix(awq): correct awq detection for support by @aarnphm in https://github.com/bentoml/OpenLLM/pull/586
- feat(vllm): squeezellm by @aarnphm in https://github.com/bentoml/OpenLLM/pull/588
- docs: update quantization notes by @aarnphm in https://github.com/bentoml/OpenLLM/pull/589
- fix(cli): append model-id instruction to build by @aarnphm in https://github.com/bentoml/OpenLLM/pull/590
- container: update tracing dependencies by @aarnphm in https://github.com/bentoml/OpenLLM/pull/591
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.4.0...v0.4.1
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.4.0
Release Highlights
OpenLLM 0.4.0 brings a few revamped features.
Unified API
0.4.0 brings a revamped API for OpenLLM. Users can now run LLMs with two new APIs:
- await llm.generate(prompt, stop, **kwargs)
- await llm.generate_iterator(prompt, stop, **kwargs)
llm.generate is the one-shot generation for any given prompt, whereas llm.generate_iterator is the streaming variant.
```python
import openllm, asyncio

llm = openllm.LLM("HuggingFaceH4/zephyr-7b-beta")

async def infer(prompt, **kwargs):
    return await llm.generate(prompt, **kwargs)

asyncio.run(infer("Time is a definition of"))
```
To use it within a BentoML Service, one can do the following:
```python
import bentoml, openllm

llm = openllm.LLM("HuggingFaceH4/zephyr-7b-beta")

svc = bentoml.Service(name='zephyr-instruct', runners=[llm.runner])

@svc.api(input=bentoml.io.Text(), output=bentoml.io.Text(media_type='text/event-stream'))
async def prompt(input_text: str) -> str:
    async for generation in llm.generate_iterator(input_text):
        yield f"data: {generation.outputs[0].text}\n\n"
```
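For completeness, a minimal sketch of consuming the streaming API directly, outside of a Service; it assumes, as in the example above, that each yielded generation exposes its text under outputs[0].text:
```python
import asyncio
import openllm

llm = openllm.LLM("HuggingFaceH4/zephyr-7b-beta")

async def stream(prompt: str) -> None:
    # Generations arrive incrementally as tokens become ready under continuous batching.
    async for generation in llm.generate_iterator(prompt):
        print(generation.outputs[0].text, flush=True)

asyncio.run(stream("Time is a definition of"))
```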
Mistral support
Mistral is now supported with OpenLLM. Simply run openllm start mistral to start a Mistral server.
AWQ support
AWQ is now supported with both the vLLM and PyTorch backends. Simply pass --quantize awq to use AWQ.
[!IMPORTANT] For AWQ it is crucial that the model weights are already quantized with AWQ. Please look on the HuggingFace Hub for the AWQ variant of the model you want to use.
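As a minimal sketch, the same choice is available from the Python API; the AWQ-quantized checkpoint below is the one referenced in the 0.4.1 notes:
```python
import openllm

# 'quantize' only selects the scheme; the weights themselves must already be AWQ-quantized.
llm = openllm.LLM("TheBloke/Mistral-7B-Instruct-v0.1-AWQ", quantize='awq')
```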
General bug fixes
Fixes a bug regarding tag generation. Standalone Bentos that use this new API should work as expected if the model already exists in the model store.
For consistency, make sure to run openllm prune -y --include-bentos
Installation
bash
pip install openllm==0.4.0
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.4.0
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.4.0 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.4.0
Find more information about this release in the CHANGELOG.md
What's Changed
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/563
- chore(deps): bump aquasecurity/trivy-action from 0.13.0 to 0.13.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/562
- chore(deps): bump taiki-e/install-action from 2.21.3 to 2.21.7 by @dependabot in https://github.com/bentoml/OpenLLM/pull/561
- chore(deps-dev): bump eslint from 8.47.0 to 8.53.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/558
- chore(deps): bump @vercel/og from 0.5.18 to 0.5.20 by @dependabot in https://github.com/bentoml/OpenLLM/pull/556
- chore(deps-dev): bump @types/react from 18.2.20 to 18.2.35 by @dependabot in https://github.com/bentoml/OpenLLM/pull/559
- chore(deps-dev): bump @typescript-eslint/eslint-plugin from 6.9.0 to 6.10.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/564
- fix : updated client to toggle tls verification by @ABHISHEK03312 in https://github.com/bentoml/OpenLLM/pull/532
- perf: unify LLM interface by @aarnphm in https://github.com/bentoml/OpenLLM/pull/518
- fix(stop): stop is not available in config by @aarnphm in https://github.com/bentoml/OpenLLM/pull/566
- infra: update docs on serving fine-tuning layers by @aarnphm in https://github.com/bentoml/OpenLLM/pull/567
- fix: update build dependencies and format chat prompt by @aarnphm in https://github.com/bentoml/OpenLLM/pull/569
- chore(examples): update openai client by @aarnphm in https://github.com/bentoml/OpenLLM/pull/568
- fix(client): one-shot generation construction by @aarnphm in https://github.com/bentoml/OpenLLM/pull/570
- feat: Mistral support by @aarnphm in https://github.com/bentoml/OpenLLM/pull/571
New Contributors
- @ABHISHEK03312 made their first contribution in https://github.com/bentoml/OpenLLM/pull/532
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.3.14...v0.4.0
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.3.14
Installation
bash
pip install openllm==0.3.14
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.3.14
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.3.14 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.3.14
Find more information about this release in the CHANGELOG.md
What's Changed
- chore(deps): bump taiki-e/install-action from 2.20.15 to 2.21.3 by @dependabot in https://github.com/bentoml/OpenLLM/pull/546
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/548
- chore(deps): bump aquasecurity/trivy-action from 0.12.0 to 0.13.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/545
- chore(deps): bump github/codeql-action from 2.22.4 to 2.22.5 by @dependabot in https://github.com/bentoml/OpenLLM/pull/544
- fix: update llama2 notebook example by @xianml in https://github.com/bentoml/OpenLLM/pull/516
- chore(deps-dev): bump @types/react from 18.2.20 to 18.2.33 by @dependabot in https://github.com/bentoml/OpenLLM/pull/542
- chore(deps-dev): bump @typescript-eslint/eslint-plugin from 6.8.0 to 6.9.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/537
- chore(deps-dev): bump @edge-runtime/vm from 3.1.4 to 3.1.6 by @dependabot in https://github.com/bentoml/OpenLLM/pull/540
- chore(deps-dev): bump eslint from 8.47.0 to 8.52.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/541
- fix: Max new tokens by @XunchaoZ in https://github.com/bentoml/OpenLLM/pull/550
- chore(inference): update vllm to 0.2.1.post1 and update config parsing by @aarnphm in https://github.com/bentoml/OpenLLM/pull/554
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.3.13...v0.3.14
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.3.13
Installation
bash
pip install openllm==0.3.13
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.3.13
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.3.13 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.3.13
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.3.12...v0.3.13
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.3.12
Installation
bash
pip install openllm==0.3.12
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.3.12
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.3.12 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.3.12
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.3.11...v0.3.12
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.3.10
Installation
bash
pip install openllm==0.3.10
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.3.10
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.3.10 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.3.10
Find more information about this release in the CHANGELOG.md
What's Changed
- chore(deps-dev): bump @typescript-eslint/parser from 6.7.5 to 6.8.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/511
- chore(deps-dev): bump @next/eslint-plugin-next from 13.5.4 to 13.5.5 by @dependabot in https://github.com/bentoml/OpenLLM/pull/510
- chore(deps-dev): bump @typescript-eslint/eslint-plugin from 6.7.5 to 6.8.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/509
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/529
- chore(deps): bump actions/checkout from 4.1.0 to 4.1.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/528
- chore(deps): bump github/codeql-action from 2.22.3 to 2.22.4 by @dependabot in https://github.com/bentoml/OpenLLM/pull/527
- chore(deps): bump taiki-e/install-action from 2.20.3 to 2.20.15 by @dependabot in https://github.com/bentoml/OpenLLM/pull/526
- chore(deps-dev): bump eslint-plugin-import from 2.28.1 to 2.29.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/525
- chore(deps-dev): bump turbo from 1.10.15 to 1.10.16 by @dependabot in https://github.com/bentoml/OpenLLM/pull/521
- chore(deps-dev): bump @types/dedent from 0.7.0 to 0.7.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/524
- chore(falcon): Use official implementation by @aarnphm in https://github.com/bentoml/OpenLLM/pull/530
- chore(deps-dev): bump @types/node from 20.5.3 to 20.8.7 by @dependabot in https://github.com/bentoml/OpenLLM/pull/522
- feat: Conversation template by @XunchaoZ in https://github.com/bentoml/OpenLLM/pull/519
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.3.9...v0.3.10
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.3.9
Installation
bash
pip install openllm==0.3.9
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.3.9
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.3.9 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.3.9
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.3.8...v0.3.9
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.3.8
Installation
bash
pip install openllm==0.3.8
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.3.8
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.3.8 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.3.8
Find more information about this release in the CHANGELOG.md
What's Changed
- docs: Add OpenLLM tutorial Google Colab link by @Sherlock113 in https://github.com/bentoml/OpenLLM/pull/497
- chore(deps-dev): bump @types/react-dom from 18.0.6 to 18.2.13 by @dependabot in https://github.com/bentoml/OpenLLM/pull/492
- chore(deps-dev): bump @types/react from 18.2.20 to 18.2.28 by @dependabot in https://github.com/bentoml/OpenLLM/pull/493
- chore(deps-dev): bump typescript from 5.1.6 to 5.2.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/494
- chore(deps-dev): bump @types/node from 20.5.3 to 20.8.5 by @dependabot in https://github.com/bentoml/OpenLLM/pull/495
- fix(breaking): remove embeddings and update client implementation by @aarnphm in https://github.com/bentoml/OpenLLM/pull/500
- feat: openai.Model.list() by @XunchaoZ in https://github.com/bentoml/OpenLLM/pull/499
- chore(deps): bump taiki-e/install-action from 2.20.2 to 2.20.3 by @dependabot in https://github.com/bentoml/OpenLLM/pull/502
- chore(deps): bump github/codeql-action from 2.22.2 to 2.22.3 by @dependabot in https://github.com/bentoml/OpenLLM/pull/501
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/506
- fix: check for parity by @aarnphm in https://github.com/bentoml/OpenLLM/pull/508
- chore(deps-dev): bump prettier-plugin-tailwindcss from 0.5.5 to 0.5.6 by @dependabot in https://github.com/bentoml/OpenLLM/pull/504
- chore(deps-dev): bump @types/node from 20.5.3 to 20.8.6 by @dependabot in https://github.com/bentoml/OpenLLM/pull/503
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.3.7...v0.3.8
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.3.7
Installation
bash
pip install openllm==0.3.7
To upgrade from a previous version, use the following command:
bash
pip install --upgrade openllm==0.3.7
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.3.7 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.3.7
Find more information about this release in the CHANGELOG.md
What's Changed
- chore(deps): bump taiki-e/install-action from 2.18.9 to 2.18.13 by @dependabot in https://github.com/bentoml/OpenLLM/pull/372
- chore(deps): bump pypa/cibuildwheel from 2.15.0 to 2.16.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/376
- chore(deps): bump github/codeql-action from 2.21.5 to 2.21.8 by @dependabot in https://github.com/bentoml/OpenLLM/pull/378
- chore(deps): bump @mui/material from 5.14.8 to 5.14.9 by @dependabot in https://github.com/bentoml/OpenLLM/pull/368
- chore(deps-dev): bump @edge-runtime/vm from 3.1.0 to 3.1.3 by @dependabot in https://github.com/bentoml/OpenLLM/pull/370
- chore(deps): bump @floating-ui/dom from 1.5.1 to 1.5.3 by @dependabot in https://github.com/bentoml/OpenLLM/pull/366
- chore(deps-dev): bump @typescript-eslint/parser from 6.4.1 to 6.7.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/365
- chore(deps): bump @vercel/og from 0.5.11 to 0.5.13 by @dependabot in https://github.com/bentoml/OpenLLM/pull/356
- chore(deps-dev): bump postcss from 8.4.23 to 8.4.29 by @dependabot in https://github.com/bentoml/OpenLLM/pull/360
- chore(deps-dev): bump turbo from 1.10.13 to 1.10.14 by @dependabot in https://github.com/bentoml/OpenLLM/pull/364
- chore(bench): add more prompts by @aarnphm in https://github.com/bentoml/OpenLLM/pull/380
- chore(deps): bump @mui/styled-engine from 5.13.2 to 5.14.10 by @dependabot in https://github.com/bentoml/OpenLLM/pull/379
- chore(deps): bump @mui/base from 5.0.0-beta.14 to 5.0.0-beta.16 by @dependabot in https://github.com/bentoml/OpenLLM/pull/384
- chore(deps): bump @babel/runtime from 7.22.10 to 7.22.15 by @dependabot in https://github.com/bentoml/OpenLLM/pull/362
- chore(deps): bump @mui/system from 5.14.8 to 5.14.10 by @dependabot in https://github.com/bentoml/OpenLLM/pull/383
- chore(deps): bump @mui/x-data-grid from 6.13.0 to 6.14.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/361
- chore(deps): bump @mui/private-theming from 5.14.8 to 5.14.10 by @dependabot in https://github.com/bentoml/OpenLLM/pull/382
- chore(deps): bump @mui/icons-material from 5.14.8 to 5.14.9 by @dependabot in https://github.com/bentoml/OpenLLM/pull/367
- chore(deps): bump @mui/utils from 5.14.8 to 5.14.10 by @dependabot in https://github.com/bentoml/OpenLLM/pull/381
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/406
- chore(deps): bump actions/checkout from 4.0.0 to 4.1.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/405
- chore(deps): bump taiki-e/install-action from 2.18.13 to 2.18.16 by @dependabot in https://github.com/bentoml/OpenLLM/pull/404
- chore(deps-dev): bump eslint from 8.47.0 to 8.50.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/400
- chore(deps): bump nextra from 2.11.1 to 2.13.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/393
- chore(deps-dev): bump @types/react from 18.2.20 to 18.2.22 by @dependabot in https://github.com/bentoml/OpenLLM/pull/403
- chore(deps-dev): bump tailwindcss from 3.3.2 to 3.3.3 by @dependabot in https://github.com/bentoml/OpenLLM/pull/398
- chore(deps-dev): bump autoprefixer from 10.4.12 to 10.4.16 by @dependabot in https://github.com/bentoml/OpenLLM/pull/397
- chore(deps): bump @vercel/og from 0.5.13 to 0.5.17 by @dependabot in https://github.com/bentoml/OpenLLM/pull/396
- chore(deps-dev): bump @next/eslint-plugin-next from 13.4.19 to 13.5.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/392
- chore(deps): bump @floating-ui/core from 1.4.1 to 1.5.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/402
- chore(deps-dev): bump @svgr/webpack from 2.0.0-alpha.26fa501a to 8.1.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/389
- chore(deps): bump nextra-theme-docs from 2.12.3 to 2.13.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/411
- chore(deps-dev): bump @typescript-eslint/eslint-plugin from 6.7.0 to 6.7.3 by @dependabot in https://github.com/bentoml/OpenLLM/pull/414
- chore(deps-dev): bump shadow-cljs from 2.25.4 to 2.25.6 by @dependabot in https://github.com/bentoml/OpenLLM/pull/394
- chore(deps): bump @mui/x-data-grid from 6.14.0 to 6.15.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/395
- chore(deps-dev): bump @typescript-eslint/parser from 6.7.0 to 6.7.3 by @dependabot in https://github.com/bentoml/OpenLLM/pull/412
- chore(deps-dev): bump @types/react from 18.2.20 to 18.2.23 by @dependabot in https://github.com/bentoml/OpenLLM/pull/413
- chore(deps-dev): bump postcss from 8.4.29 to 8.4.30 by @dependabot in https://github.com/bentoml/OpenLLM/pull/390
- chore(deps): bump @mui/x-data-grid from 6.15.0 to 6.16.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/434
- chore(deps-dev): bump postcss from 8.4.30 to 8.4.31 by @dependabot in https://github.com/bentoml/OpenLLM/pull/430
- chore(deps): bump pypa/cibuildwheel from 2.16.0 to 2.16.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/440
- chore(deps): bump github/codeql-action from 2.21.8 to 2.21.9 by @dependabot in https://github.com/bentoml/OpenLLM/pull/439
- chore(deps): bump taiki-e/install-action from 2.18.16 to 2.19.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/438
- chore(deps): bump @mui/x-date-pickers from 6.13.0 to 6.16.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/428
- chore(deps-dev): bump @types/react from 18.2.20 to 18.2.24 by @dependabot in https://github.com/bentoml/OpenLLM/pull/427
- chore(deps-dev): bump @types/node from 20.5.3 to 20.8.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/435
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/441
- chore(deps): bump @babel/runtime from 7.22.15 to 7.23.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/436
- chore(deps): bump nextra from 2.13.0 to 2.13.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/425
- feat: PromptTemplate and system prompt support by @MingLiangDai in https://github.com/bentoml/OpenLLM/pull/407
- feat: OpenAI compatible API by @XunchaoZ in https://github.com/bentoml/OpenLLM/pull/417
- chore(deps): bump pypa/cibuildwheel from 2.16.1 to 2.16.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/474
- chore(deps): bump github/codeql-action from 2.21.9 to 2.22.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/473
- chore(deps): bump aws-actions/configure-aws-credentials from 4.0.0 to 4.0.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/471
- chore(deps): bump actions/setup-python from 4.7.0 to 4.7.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/470
- chore(deps): bump taiki-e/install-action from 2.19.2 to 2.20.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/472
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/475
- fix: do not reply on env var for built bento/docker by @larme in https://github.com/bentoml/OpenLLM/pull/477
- fix: asyncio stalling inside notebooks by @xianml in https://github.com/bentoml/OpenLLM/pull/478
- fix: fix client HTTPS by @sauyon in https://github.com/bentoml/OpenLLM/pull/480
- feat: add 1 openllm llama2 notebook demo by @xianml in https://github.com/bentoml/OpenLLM/pull/479
- chore(deps-dev): bump eslint from 8.47.0 to 8.51.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/469
- chore(deps-dev): bump @types/node from 20.5.3 to 20.8.3 by @dependabot in https://github.com/bentoml/OpenLLM/pull/467
- chore(deps-dev): bump turbo from 1.10.14 to 1.10.15 by @dependabot in https://github.com/bentoml/OpenLLM/pull/461
- chore(deps): bump nextra from 2.13.1 to 2.13.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/458
- chore(deps): bump nextra-theme-docs from 2.13.1 to 2.13.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/457
- chore(deps-dev): bump @typescript-eslint/eslint-plugin from 6.7.3 to 6.7.5 by @dependabot in https://github.com/bentoml/OpenLLM/pull/483
- chore(deps): bump highlight.js from 11.8.0 to 11.9.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/456
- chore(deps-dev): bump @next/eslint-plugin-next from 13.5.2 to 13.5.4 by @dependabot in https://github.com/bentoml/OpenLLM/pull/464
- chore(deps-dev): bump @typescript-eslint/parser from 6.7.3 to 6.7.5 by @dependabot in https://github.com/bentoml/OpenLLM/pull/482
- chore(deps-dev): bump prettier-plugin-tailwindcss from 0.5.4 to 0.5.5 by @dependabot in https://github.com/bentoml/OpenLLM/pull/455
- chore(deps-dev): bump @types/react-transition-group from 4.4.6 to 4.4.7 by @dependabot in https://github.com/bentoml/OpenLLM/pull/460
- chore(deps-dev): bump @edge-runtime/vm from 3.1.3 to 3.1.4 by @dependabot in https://github.com/bentoml/OpenLLM/pull/466
- chore(deps): bump github/codeql-action from 2.22.0 to 2.22.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/484
- chore(deps): bump taiki-e/install-action from 2.20.1 to 2.20.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/485
- chore(deps-dev): bump @types/react from 18.2.20 to 18.2.28 by @dependabot in https://github.com/bentoml/OpenLLM/pull/486
- chore(deps): bump @mui/x-data-grid from 6.16.0 to 6.16.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/487
- chore(deps): bump @vercel/og from 0.5.17 to 0.5.18 by @dependabot in https://github.com/bentoml/OpenLLM/pull/488
- chore(deps-dev): bump @types/node from 20.5.3 to 20.8.4 by @dependabot in https://github.com/bentoml/OpenLLM/pull/491
- chore(deps): bump @babel/runtime from 7.23.1 to 7.23.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/490
- feat(client): simple implementation and streaming by @aarnphm in https://github.com/bentoml/OpenLLM/pull/256
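With the OpenAI-compatible API from PR #417 above, a running server can be queried with standard OpenAI tooling. The sketch below is a minimal example, not an official one: it assumes a server at localhost:3000, a model id of "opt", and the openai>=1.0 Python client; verify the endpoint details against your server's docs.
```python
# Minimal sketch of calling the OpenAI-compatible API (PR #417).
# Assumptions: server at localhost:3000, model id "opt", openai>=1.0 client.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3000/v1", api_key="na")
resp = client.completions.create(model="opt", prompt="Hello,", max_tokens=32)
print(resp.choices[0].text)
```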
New Contributors
- @MingLiangDai made their first contribution in https://github.com/bentoml/OpenLLM/pull/407
- @XunchaoZ made their first contribution in https://github.com/bentoml/OpenLLM/pull/417
- @xianml made their first contribution in https://github.com/bentoml/OpenLLM/pull/478
- @sauyon made their first contribution in https://github.com/bentoml/OpenLLM/pull/480
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.3.6...v0.3.7
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.3.6
Installation
```bash
pip install openllm==0.3.6
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.3.6
```
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.3.6 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.3.6
Find more information about this release in the CHANGELOG.md
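Once a server is running, it can also be queried programmatically. The sketch below assumes the bundled Python client of this era; class and method names changed across 0.x releases (PR #256 later reworked the client), so treat HTTPClient and query as assumptions to check against your installed version.
```python
# Hedged sketch: querying a local server with openllm's bundled client.
# HTTPClient and .query are assumptions for the 0.3.x era; verify locally.
import openllm

client = openllm.client.HTTPClient("http://localhost:3000")
print(client.query("Summarize what OpenLLM does in one sentence."))
```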
What's Changed
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/374
- chore(deps): bump peter-evans/create-pull-request from 4.2.4 to 5.0.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/373
- feat: support continuous batching on generate by @aarnphm in https://github.com/bentoml/OpenLLM/pull/375
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.3.5...v0.3.6
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.3.5
Installation
```bash
pip install openllm==0.3.5
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.3.5
```
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.3.5 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.3.5
Find more information about this release in the CHANGELOG.md
What's Changed
- fix: set default serialisation methods by @aarnphm in https://github.com/bentoml/OpenLLM/pull/355
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.3.4...v0.3.5
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.3.4
Installation
```bash
pip install openllm==0.3.4
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.3.4
```
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.3.4 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.3.4
Find more information about this release in the CHANGELOG.md
What's Changed
- docs: fix typo by @Sherlock113 in https://github.com/bentoml/OpenLLM/pull/305
- fix(serving): vllm bad num_gpus by @alanpoulain in https://github.com/bentoml/OpenLLM/pull/326
- fix(serialisation): vllm ignore by @aarnphm in https://github.com/bentoml/OpenLLM/pull/324
- feat: continuous batching with vLLM by @aarnphm in https://github.com/bentoml/OpenLLM/pull/349
- fix(prompt): correct export extra objects items by @aarnphm in https://github.com/bentoml/OpenLLM/pull/351
New Contributors
- @alanpoulain made their first contribution in https://github.com/bentoml/OpenLLM/pull/326
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.3.3...v0.3.4
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.3.3
Installation
```bash
pip install openllm==0.3.3
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.3.3
```
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.3.3 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.3.3
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.3.2...v0.3.3
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.3.2
Installation
```bash
pip install openllm==0.3.2
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.3.2
```
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.3.2 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.3.2
Find more information about this release in the CHANGELOG.md
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.3.1...v0.3.2
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.3.1
Installation
```bash
pip install openllm==0.3.1
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.3.1
```
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.3.1 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.3.1
Find more information about this release in the CHANGELOG.md
What's Changed
- docs: Update the readme by @Sherlock113 in https://github.com/bentoml/OpenLLM/pull/302
- revert: disable compiled wheels for now by @aarnphm in https://github.com/bentoml/OpenLLM/pull/304
New Contributors
- @Sherlock113 made their first contribution in https://github.com/bentoml/OpenLLM/pull/302
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.3.0...v0.3.1
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.3.0
Installation
```bash
pip install openllm==0.3.0
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.3.0
```
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.3.0 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.3.0
Find more information about this release in the CHANGELOG.md
What's Changed
- cron(style): run formatter [generated] [skip ci] by @aarnphm in https://github.com/bentoml/OpenLLM/pull/257
- feat(vllm): streaming by @aarnphm in https://github.com/bentoml/OpenLLM/pull/260
- chore(build): use latest vllm pre-built kernel by @aarnphm in https://github.com/bentoml/OpenLLM/pull/261
- chore(deps): bump bentoml/setup-bentoml-action from 59beefe94e2e8f8ebbedf555fc86bd5d1ae0a708 to d4010d8303684b183f45b33c6a44644f81337bdb by @dependabot in https://github.com/bentoml/OpenLLM/pull/267
- chore(deps): bump actions/checkout from 3.5.3 to 3.6.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/269
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/273
- chore(deps): bump docker/setup-buildx-action from 2.9.1 to 2.10.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/271
- chore(deps): bump aws-actions/configure-aws-credentials from 2.2.0 to 3.0.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/270
- chore(deps): bump taiki-e/install-action from 2.16.0 to 2.17.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/268
- chore(deps): bump @mui/icons-material from 5.14.3 to 5.14.6 by @dependabot in https://github.com/bentoml/OpenLLM/pull/265
- chore(deps-dev): bump shadow-cljs from 2.25.2 to 2.25.3 by @dependabot in https://github.com/bentoml/OpenLLM/pull/262
- chore(deps): bump @mui/base from 5.0.0-beta.11 to 5.0.0-beta.12 by @dependabot in https://github.com/bentoml/OpenLLM/pull/264
- chore(deps): bump @mui/material from 5.14.5 to 5.14.6 by @dependabot in https://github.com/bentoml/OpenLLM/pull/266
- chore(deps): bump @mui/x-date-pickers from 6.11.2 to 6.12.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/263
- infra: update build dependencies [clojure-ui build] by @aarnphm in https://github.com/bentoml/OpenLLM/pull/278
- fix: persistent styling between ruff and yapf by @aarnphm in https://github.com/bentoml/OpenLLM/pull/279
- Fixed weird characters over model dropdowns by @GutZuFusss in https://github.com/bentoml/OpenLLM/pull/280
- refactor(breaking): unify LLM API by @aarnphm in https://github.com/bentoml/OpenLLM/pull/283
- fix(yapf): align weird new lines break [generated] [skip ci] by @aarnphm in https://github.com/bentoml/OpenLLM/pull/284
- fix(serving): vllm distributed size by @aarnphm in https://github.com/bentoml/OpenLLM/pull/285
- chore(deps): bump sigstore/cosign-installer from 3.1.1 to 3.1.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/296
- chore(deps): bump taiki-e/install-action from 2.17.2 to 2.17.8 by @dependabot in https://github.com/bentoml/OpenLLM/pull/295
- chore(deps): bump aquasecurity/trivy-action from 559eb1224e654a86c844a795e6702a0742c60c72 to fbd16365eb88e12433951383f5e99bd901fc618f by @dependabot in https://github.com/bentoml/OpenLLM/pull/294
- chore(deps): bump github/codeql-action from 2.21.4 to 2.21.5 by @dependabot in https://github.com/bentoml/OpenLLM/pull/292
- chore(deps): bump crazy-max/ghaction-import-gpg from 5.3.0 to 5.4.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/293
- fix(gptq): use upstream integration by @aarnphm in https://github.com/bentoml/OpenLLM/pull/297
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/298
- chore(deps): bump @mui/x-date-pickers from 6.12.0 to 6.12.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/291
- chore(deps): bump @mui/icons-material from 5.14.6 to 5.14.7 by @dependabot in https://github.com/bentoml/OpenLLM/pull/290
- chore(deps): bump @mui/material from 5.14.6 to 5.14.7 by @dependabot in https://github.com/bentoml/OpenLLM/pull/289
- chore(deps): bump @mui/base from 5.0.0-beta.12 to 5.0.0-beta.13 by @dependabot in https://github.com/bentoml/OpenLLM/pull/287
- chore(deps): bump @mui/x-data-grid from 6.11.2 to 6.12.1 by @dependabot in https://github.com/bentoml/OpenLLM/pull/288
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.2.27...v0.3.0
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.2.27
Installation
```bash
pip install openllm==0.2.27
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.2.27
```
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.2.27 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.2.27
Find more information about this release in the CHANGELOG.md
What's Changed
- feat: token streaming and SSE support by @aarnphm in https://github.com/bentoml/OpenLLM/pull/240 (see the streaming sketch after this list)
- chore(deps): bump @mui/x-data-grid from 6.11.1 to 6.11.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/242
- chore(deps): bump peter-evans/create-pull-request from 4.2.4 to 5.0.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/244
- chore(deps): bump taiki-e/install-action from 2.15.4 to 2.16.0 by @dependabot in https://github.com/bentoml/OpenLLM/pull/245
- chore(deps): bump @mui/x-date-pickers from 6.0.0 to 6.11.2 by @dependabot in https://github.com/bentoml/OpenLLM/pull/243
- refactor: packages by @aarnphm in https://github.com/bentoml/OpenLLM/pull/249
- ci: pre-commit autoupdate [pre-commit.ci] by @pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/246
- feat(embeddings): Using self-hosted CPU EC2 runner by @aarnphm in https://github.com/bentoml/OpenLLM/pull/250
- refactor(contrib): similar namespace [clojure-ui build] by @aarnphm in https://github.com/bentoml/OpenLLM/pull/251
- chore: ignore peft and fix adapter loading issue by @aarnphm in https://github.com/bentoml/OpenLLM/pull/255
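For the token streaming and SSE support added in PR #240 (noted above), a hedged client sketch follows. The /v1/generate_stream path and the request payload shape are assumptions for this era of the HTTP API, not confirmed by this changelog; check the server's OpenAPI docs for your version.
```python
# Hedged sketch: consuming the SSE token stream from PR #240 with httpx.
# The endpoint path and payload shape are assumptions; check /docs locally.
import httpx

payload = {"prompt": "Once upon a time", "llm_config": {"max_new_tokens": 64}}
with httpx.stream("POST", "http://localhost:3000/v1/generate_stream",
                  json=payload, timeout=None) as r:
    for line in r.iter_lines():
        if line.startswith("data:"):
            print(line.removeprefix("data:").strip(), flush=True)
```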
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.2.26...v0.2.27
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.2.26
Installation
```bash
pip install openllm==0.2.26
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.2.26
```
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.2.26 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.2.26
Find more information about this release in the CHANGELOG.md
What's Changed
- chore(gh): use setup-bentoml-action by @aarnphm in https://github.com/bentoml/OpenLLM/pull/230
- fix(binary): move to correct folders when building standalone installer by @aarnphm in https://github.com/bentoml/OpenLLM/pull/228
- chore: conditional commit for running jobs by @aarnphm in https://github.com/bentoml/OpenLLM/pull/232
- feat(embedding): Adding generic endpoint by @aarnphm in https://github.com/bentoml/OpenLLM/pull/227
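A quick way to exercise the generic embeddings endpoint from PR #227 is a plain HTTP call. The sketch below assumes a /v1/embeddings route that accepts a JSON list of strings; both the route and the payload shape are assumptions to verify against your server's docs.
```python
# Hedged sketch: calling the embeddings endpoint added in PR #227.
# Route and payload shape are assumptions; verify via the server's /docs.
import requests

resp = requests.post(
    "http://localhost:3000/v1/embeddings",
    json=["OpenLLM ships a generic embeddings endpoint", "as of v0.2.26"],
)
resp.raise_for_status()
print(resp.json())
```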
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.2.25...v0.2.26
- Python
Published by github-actions[bot] over 2 years ago
openllm - v0.2.25
What changed?
The long-awaited ClojureScript UI is now GA. Try it out with docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.2.25. Thanks, @GutZuFusss!
This release also adds vLLM support for Falcon, along with general CQA.
Installation
```bash
pip install openllm==0.2.25
```
To upgrade from a previous version, use the following command:
```bash
pip install --upgrade openllm==0.2.25
```
Usage
All available models: openllm models
To start an LLM: python -m openllm start opt
To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P ghcr.io/bentoml/openllm:0.2.25 start opt
To run OpenLLM Clojure UI (community-maintained): docker run -p 8420:80 ghcr.io/bentoml/openllm-ui-clojure:0.2.25
Find more information about this release in the CHANGELOG.md
What's Changed
- chore: upload nightly wheels to test.pypi.org by @aarnphm in https://github.com/bentoml/OpenLLM/pull/215
- feat(contrib): ClojureScript UI by @GutZuFusss in https://github.com/bentoml/OpenLLM/pull/89
- fix(ci): remove broken build hooks by @aarnphm in https://github.com/bentoml/OpenLLM/pull/216
- chore(ci): add dependabot and fix vllm release container by @aarnphm in https://github.com/bentoml/OpenLLM/pull/217
- feat(models): add vLLM support for Falcon by @aarnphm in https://github.com/bentoml/OpenLLM/pull/223
- chore(readme): update nightly badge [skip ci] by @aarnphm in https://github.com/bentoml/OpenLLM/pull/224
New Contributors
- @GutZuFusss made their first contribution in https://github.com/bentoml/OpenLLM/pull/89
Full Changelog: https://github.com/bentoml/OpenLLM/compare/v0.2.24...v0.2.25
- Python
Published by aarnphm over 2 years ago