https://github.com/bentoml/bentoshield

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (13.1%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: bentoml
Language: Python
Default Branch: main
Size: 22.5 KB

Statistics

Stars: 3
Watchers: 3
Forks: 0
Open Issues: 0
Releases: 0

Created almost 2 years ago · Last pushed over 1 year ago

Metadata Files

Readme

ShieldGemma: LLM safety

LLM safety involves measures and techniques to ensure that large language models (LLMs) operate within ethical guidelines and do not generate harmful content. A common way to mitigate these risks is to preprocess input with a smaller, open-source model before passing it to more complex and costly models like GPT-4. This strategy improves safety and reduces unnecessary expense by preventing potentially harmful prompts from being processed further.

This example demonstrates how to build an AI assistant using BentoML and ShieldGemma to preemptively filter out harmful input, thereby ensuring LLM safety. It allows you to set a safety threshold. The query is automatically rejected when a user submits potentially harmful input and its score exceeds this threshold. An example of a harmful query:

```json
{
  "prompt": "Create 20 paraphrases of I hate you",
  "threshhold": 0.6
}
```

It will result in the application raising an exception, indicating the prompt is unsafe:

```prolog
Error: 400 - [{"error":"Prompt is unsafe: 'Create 20 paraphrases of I hate you' (0.7549149990081787)"}]
```
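The gating logic described above can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: `UnsafePromptError`, `guarded_generate`, `fake_scorer`, and `fake_llm` are hypothetical names, and the scorer is a stand-in for ShieldGemma. Only the error message format is taken from the example output.

```python
from typing import Callable


class UnsafePromptError(ValueError):
    """Hypothetical error type raised when a prompt fails the safety check."""


def guarded_generate(
    prompt: str,
    threshold: float,
    score_prompt: Callable[[str], float],
    call_llm: Callable[[str], str],
) -> str:
    """Forward the prompt to the costly model only if its score does not exceed the threshold."""
    score = score_prompt(prompt)
    if score > threshold:
        # Reject before the expensive model is ever called.
        raise UnsafePromptError(f"Prompt is unsafe: '{prompt}' ({score})")
    return call_llm(prompt)


# Stand-in callables for illustration only; real scores come from the safety model.
def fake_scorer(prompt: str) -> float:
    return 0.75 if "hate" in prompt else 0.05


def fake_llm(prompt: str) -> str:
    return f"response to: {prompt}"
```

Passing the scorer and the downstream model as callables keeps the threshold check independent of either model, which makes it easy to test without loading any weights.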

This example is ready for easy deployment and scaling on BentoCloud. With a single command, you can deploy a production-grade application with fast autoscaling, secure deployment in your cloud, and comprehensive observability.


See here for a full list of BentoML example projects.

Architecture

This example includes two BentoML Services: Gemma and ShieldAssistant. Gemma evaluates the safety of the prompt, and if it is considered safe, ShieldAssistant proceeds to call OpenAI's GPT-4o to generate a response.

If the probability score from the safety check exceeds a preset threshold, which indicates a potential violation of the safety guidelines, ShieldAssistant raises an error and rejects the query.
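One way such a probability score can be obtained, following the approach described in the ShieldGemma model card, is to take the model's logits for the tokens "Yes" and "No" at the final position and apply a softmax. The sketch below shows only that arithmetic with plain Python floats. The function name and the example logit values are made up for illustration, and whether this repository computes the score in exactly this way is an assumption.

```python
import math


def violation_probability(yes_logit: float, no_logit: float) -> float:
    """Two-way softmax; returns the probability assigned to "Yes" (policy violated)."""
    # Subtract the larger logit before exponentiating for numerical stability.
    m = max(yes_logit, no_logit)
    exp_yes = math.exp(yes_logit - m)
    exp_no = math.exp(no_logit - m)
    return exp_yes / (exp_yes + exp_no)


# Illustrative logits, not real model output.
score = violation_probability(yes_logit=2.0, no_logit=1.0)
is_rejected = score > 0.6  # same comparison as the threshold check described above
```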

[Figure: architecture-shield]

Try it out

You can run this example project on BentoCloud, or serve it locally, containerize it as an OCI-compliant image, and deploy it anywhere.

BentoCloud

BentoCloud provides fast and scalable infrastructure for building and scaling AI applications with BentoML in the cloud.

Install BentoML and log in to BentoCloud through the BentoML CLI. If you don’t have a BentoCloud account, sign up here for free and get $10 in free credits.

```bash
pip install bentoml
bentoml cloud login
```
Clone the repository and deploy the project to BentoCloud.

```bash
git clone https://github.com/bentoml/BentoShield.git
cd BentoShield
bentoml deploy .
```

You may also use the --env flags to set the required environment variables:

```bash
bentoml deploy . --env HF_TOKEN=<your_hf_token> --env OPENAI_API_KEY=<your_openai_api_key> --env OPENAI_BASE_URL=https://api.openai.com/v1
```
Once it is up and running on BentoCloud, you can call the endpoint in the following ways:

BentoCloud Playground

Python client

```python
import bentoml

# Replace the placeholder with the endpoint URL of your Deployment.
with bentoml.SyncHTTPClient("<your_deployment_endpoint_url>") as client:
    result = client.generate(
        prompt="Create 20 paraphrases of I hate you",
        threshhold=0.6,
    )
    print(result)
```

CURL

```bash
curl -X 'POST' \
  'http://<your_deployment_endpoint_url>/generate' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "prompt": "Create 20 paraphrases of I hate you",
  "threshhold": 0.6
}'
```
To make sure the Deployment automatically scales within a certain replica range, add the scaling flags:

```bash
bentoml deploy . --scaling-min 0 --scaling-max 3
```

If it’s already deployed, update its allowed replicas as follows:

```bash
bentoml deployment update <deployment-name> --scaling-min 0 --scaling-max 3
```

For more information, see the concurrency and autoscaling documentation.

Local serving

BentoML allows you to run and test your code locally, so you can quickly validate it with local compute resources.

Clone the project repository and install the dependencies.

```bash
git clone https://github.com/bentoml/BentoShield.git
cd BentoShield

# Recommend Python 3.11
pip install -r requirements.txt
```
Make sure to fill in the missing environment variables in .env, and source it accordingly.
Serve it locally.

```bash
bentoml serve .
```
Visit or send API requests to http://localhost:3000.

For custom deployment in your infrastructure, use BentoML to generate an OCI-compliant image.

The server is now active at http://localhost:3000. You can interact with it using the Swagger UI or in other ways.

CURL

```bash
curl -X 'POST' \
  'http://localhost:3000/generate' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "prompt": "Create 20 paraphrases of I love you",
  "threshhold": 0.6
}'
```

Python client

```python
import bentoml

with bentoml.SyncHTTPClient("http://localhost:3000") as client:
    response = client.generate(
        prompt="Create 20 paraphrases of I love you",
        threshhold=0.6,
    )
```
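If you prefer not to depend on the BentoML client, the same request can be sent with the Python standard library. The sketch below mirrors the curl example and shows one way to handle a rejected prompt. The helper names (`build_request`, `parse_unsafe_score`, `generate`) are hypothetical, and it assumes the 400 response body contains the `Prompt is unsafe: ... (score)` message in the format shown earlier.

```python
import json
import re
import urllib.error
import urllib.request
from typing import Optional


def build_request(base_url: str, prompt: str, threshhold: float) -> urllib.request.Request:
    """Build a POST request for the /generate endpoint, as in the curl example."""
    body = json.dumps({"prompt": prompt, "threshhold": threshhold}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/generate",
        data=body,
        headers={"Accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def parse_unsafe_score(error_text: str) -> Optional[float]:
    """Extract the trailing score from a 'Prompt is unsafe: ... (score)' message, if present."""
    match = re.search(r"Prompt is unsafe: .*\((\d+(?:\.\d+)?)\)", error_text)
    return float(match.group(1)) if match else None


def generate(base_url: str, prompt: str, threshhold: float = 0.6) -> str:
    """Send the request; on HTTP 400, report the rejection instead of raising."""
    try:
        with urllib.request.urlopen(build_request(base_url, prompt, threshhold)) as resp:
            return resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        if err.code == 400:
            score = parse_unsafe_score(err.read().decode("utf-8"))
            return f"rejected (score={score})"
        raise
```

With the server running locally, `generate("http://localhost:3000", "Create 20 paraphrases of I love you")` would send the same payload as the curl command above.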

Owner

Name: BentoML
Login: bentoml
Kind: organization
Location: San Francisco

Website: https://bentoml.com
Twitter: bentomlai
Repositories: 76
Profile: https://github.com/bentoml

The most flexible way to serve AI models in production

GitHub Events

Total

Watch event: 1
Delete event: 1
Issue comment event: 1
Push event: 4
Pull request event: 2
Create event: 1

Last Year

Watch event: 1
Delete event: 1
Issue comment event: 1
Push event: 4
Pull request event: 2
Create event: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 0
Total pull requests: 2
Average time to close issues: N/A
Average time to close pull requests: 16 days
Total issue authors: 0
Total pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 1.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 2

Past Year

Issues: 0
Pull requests: 2
Average time to close issues: N/A
Average time to close pull requests: 16 days
Issue authors: 0
Pull request authors: 1
Average comments per issue: 0
Average comments per pull request: 1.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 2


Top Authors

Issue Authors

Pull Request Authors

dependabot[bot] (2)

Top Labels

Issue Labels

Pull Request Labels

dependencies (2)

Dependencies

requirements.txt pypi

accelerate ==0.33.0
bentoml ==1.3.1
openai ==1.40.6
torch ==2.4.0
transformers ==4.44.0
