https://github.com/bentoml/llmgateway

https://github.com/bentoml/llmgateway

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.4%) to scientific vocabulary
Last synced: 10 months ago · JSON representation

Repository

Basic Info
  • Host: GitHub
  • Owner: bentoml
  • Language: Python
  • Default Branch: main
  • Size: 13.7 KB
Statistics
  • Stars: 2
  • Watchers: 4
  • Forks: 0
  • Open Issues: 1
  • Releases: 0
Created almost 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme

README.md

LLMGateway

LLMGateway is an example project that demonstrates how to build a gateway application that works with different LLM APIs using BentoML. LLMGateway supports private LLM APIs like OpenAI and open-source LLM deployments such as Llama and Mistral. The project offers a unified API interface that makes it easier to work with different LLMs. In addition, LLMGateway demonstrates how to integrate with tools for detecting harmful prompts to keep usage safe and caching to make the LLMs more efficient.

Prerequisites

  • You have installed Python 3.8+ and pip. See the Python downloads page to learn more.
  • You have a basic understanding of key concepts in BentoML, such as Services. We recommend you read Quickstart first.
  • (Optional) We recommend you create a virtual environment for dependency isolation for this project. See the Conda documentation or the Python documentation for details.

Install dependencies

git clone https://github.com/bentoml/BentoSentenceTransformers.git cd BentoSentenceTransformers pip install -r requirements.txt

Run the LLM gateway

We have defined a BentoML Service in service.py. Run bentoml serve in your project directory to start the Service on your laptop.

$ bentoml serve .

The server is now active at http://localhost:3000.

Deploy to BentoCloud

After the Service is ready, you can deploy the application to BentoCloud for better management and scalability. Sign up if you haven't got a BentoCloud account.

Make sure you have logged in to BentoCloud, then run the following command to deploy it.

bash bentoml deploy .

Once the application is up and running on BentoCloud, you can access it via the exposed URL.

Test

Prepare the test client. export BASE_URL=[Local or BentoCloud URL] export OPENAI_API_KEY=xxx

Send a GPT-3.5 request: python test.py

Run again to hit cache: python test.py

Route to another model: MODEL=llama3.1 python test.py

Test toxic detection: MODEL=llama3.1 PROMPT="You are a worthless AI agent!" python test.py

Owner

  • Name: BentoML
  • Login: bentoml
  • Kind: organization
  • Location: San Francisco

The most flexible way to serve AI models in production

GitHub Events

Total
  • Watch event: 1
  • Pull request event: 1
Last Year
  • Watch event: 1
  • Pull request event: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 0
  • Total pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 2
Past Year
  • Issues: 0
  • Pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 2
Top Authors
Issue Authors
Pull Request Authors
  • dependabot[bot] (2)
Top Labels
Issue Labels
Pull Request Labels
dependencies (2)

Dependencies

requirements.txt pypi
  • bentoml >=1.2.20
  • fastapi ==0.111.1
  • openai ==1.36.0
  • sse-starlette ==2.1.2
  • torch ==2.3.0
  • transformers ==4.42.4