https://github.com/aaltorse/llmgateway

API gateway server for LLMs.

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (11.4%) to scientific vocabulary

Last synced: 10 months ago

Repository

API gateway server for LLMs.

Basic Info

Host: GitHub
Owner: AaltoRSE
Language: Python
Default Branch: main
Size: 569 KB

Statistics

Stars: 0
Watchers: 4
Forks: 0
Open Issues: 0
Releases: 0

Created over 2 years ago · Last pushed 12 months ago

Metadata Files

Readme

LLMGateway

This is a front-facing gateway for multiple LLMs. The server acts as a middleman between the user and multiple LLMs that provide OpenAI-compatible APIs. The gateway keeps track of each user's token usage (currently only completion tokens are tracked) and handles access to the LLM servers, i.e. it holds the secrets needed to use them.

Features

- Self-service auth via SAML
- Self-service checkout for key generation
- Admin management via REST API

TODO

Here are a few features which are currently on our TODO list:

- Admin front-end UI
- More fine-grained access key control (i.e. controlling which models can be accessed with a key)
- Proper usage logging (including prompt tokens; currently restricted to completion tokens)

Requirements and dependencies

Kubernetes
- Certbot Letsencrypt plugin
- Set up secrets for the admin key and the LLM API key
- MongoDB and Redis deployed on the cluster

Security
- The current authentication scheme for the user API is based on session cookies and SAML authentication.
- This means you need an existing IdP which has the gateway set up as a service provider.
- You will need to update the auth SAML router endpoints to match the kind of access you want to allow.
- The current assumption is that any key can be used with any model and that there is no usage restriction.
- If you want to implement this kind of restriction, you should add another dependency on the LLM endpoints.

Python dependencies:
- General:
  - fastapi
  - gunicorn
  - uvicorn
  - redis-py
  - pymongo
  - schedule
  - httpx
  - sse-starlette
  - itsdangerous (for session management)
  - python-multipart
  - python-jose
- For SAML:
  - python3-saml

Architecture

Mongo DB

The Mongo database employed in this gateway is used for storing the logging information along with user data.

The `apikeys` collection:

```python
{
    "user": str,     # User this key belongs to
    "active": bool,  # Whether the key is active
    "key": str,      # The actual key
    "name": str,     # Name given to the key
}
```

Likely future fields: `authorization: [str]`, to indicate which models a key may be used with.
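As a minimal sketch of how a document with this shape might be created and checked, assuming a random URL-safe token is an acceptable key format: the helper names `new_apikey_document` and `key_allows_model` are hypothetical and not taken from the gateway, and the `authorization` list is only the possible future field mentioned above.

```python
import secrets


def new_apikey_document(user: str, name: str) -> dict:
    """Build a document shaped like the `apikeys` schema."""
    return {
        "user": user,
        "active": True,
        "key": secrets.token_urlsafe(32),  # assumption: random URL-safe token
        "name": name,
    }


def key_allows_model(doc: dict, model: str) -> bool:
    """Reject inactive keys and honour a hypothetical `authorization` list."""
    if not doc.get("active", False):
        return False
    allowed = doc.get("authorization")  # hypothetical future field
    # No list present means no restriction, matching the current assumption
    # that any key can be used with any model.
    return allowed is None or model in allowed
```

A key without an `authorization` list is treated as unrestricted, so documents written before such a field exists would keep working.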

The `logs` collection

```python
{
    "tokencount": int,      # Number of tokens (currently the completion tokens)
    "isprompt": bool,       # Whether this is for prompt or completion
    "model": str,           # Which model was used for this usage
    "source": str,          # Key or user who caused this usage
    "sourcetype": str,      # Whether the source is a "user" or an "apikey"
    "timestamp": datetime,  # Current timestamp in UTC
}
```
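To illustrate how entries of this shape could be produced and summed, here is a small sketch that works on plain dictionaries rather than on a live MongoDB collection. The function names `make_log_entry` and `completion_tokens_by_source` are hypothetical, and the in-memory summation only stands in for whatever query the gateway actually runs.

```python
from collections import defaultdict
from datetime import datetime, timezone


def make_log_entry(tokencount, model, source, sourcetype="apikey", isprompt=False):
    """Build a dictionary shaped like a `logs` document."""
    return {
        "tokencount": tokencount,
        "isprompt": isprompt,
        "model": model,
        "source": source,
        "sourcetype": sourcetype,
        "timestamp": datetime.now(timezone.utc),  # timezone-aware UTC timestamp
    }


def completion_tokens_by_source(entries):
    """Sum completion tokens per (sourcetype, source) pair."""
    totals = defaultdict(int)
    for entry in entries:
        if not entry["isprompt"]:  # skip prompt-token entries
            totals[(entry["sourcetype"], entry["source"])] += entry["tokencount"]
    return dict(totals)
```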

The `user` collection

```python
{
    "username": str,  # The identifier of the user, provided by the IdP
    "keys": [str],    # The set of keys belonging to this user
}
```

Likely future fields:

`"isAdmin": bool`, indicating whether the user is an admin (default: false).

Redis

The Redis database is mainly used for fast retrieval of authentication keys and should therefore be kept in sync with the keys stored in MongoDB.
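One way such a sync could look is sketched below, assuming the active keys are held in a single Redis set. The set name `active_keys` and the function `sync_active_keys` are hypothetical; the real gateway may store keys differently. The client only needs the redis-py methods `smembers`, `sadd` and `srem`, and is assumed to be created with `decode_responses=True` so that members come back as strings. The `InMemorySets` class is a stand-in used only to try the function without a server.

```python
class InMemorySets:
    """Stand-in exposing the three redis-py set calls used below."""

    def __init__(self):
        self.sets = {}

    def smembers(self, name):
        return set(self.sets.get(name, set()))

    def sadd(self, name, *values):
        self.sets.setdefault(name, set()).update(values)

    def srem(self, name, *values):
        self.sets.setdefault(name, set()).difference_update(values)


def sync_active_keys(client, apikey_docs, set_name="active_keys"):
    """Make the Redis set match the active keys found in the given documents.

    Returns a tuple (number of keys added, number of keys removed).
    """
    wanted = {doc["key"] for doc in apikey_docs if doc["active"]}
    current = set(client.smembers(set_name))
    missing = wanted - current
    stale = current - wanted
    if stale:
        client.srem(set_name, *stale)
    if missing:
        client.sadd(set_name, *missing)
    return len(missing), len(stale)
```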

Logging / Usage

The way usage is currently logged and retrieved is potentially rather slow. If rate limits or daily (or similar) restrictions become necessary, a more efficient usage check than retrieval from MongoDB may be needed, as that database can become quite crowded. For a daily maximum usage, one option would be to record usage in the Redis database. It might also become necessary to assign additional "costs" to each model in the future.
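The Redis option mentioned above could be sketched roughly as follows, assuming one counter per source and day. The key layout `usage:<source>:<date>`, the two-day expiry, the integer `cost_per_token` weight and the function names are all assumptions for illustration. The client only needs the redis-py methods `incrby` and `expire`; `InMemoryCounter` is a stand-in for trying it without a server.

```python
from datetime import datetime, timezone


class InMemoryCounter:
    """Stand-in exposing the two redis-py calls used below."""

    def __init__(self):
        self.values = {}

    def incrby(self, name, amount=1):
        self.values[name] = self.values.get(name, 0) + amount
        return self.values[name]

    def expire(self, name, time):
        return True  # expiry is not simulated in this stand-in


def record_daily_usage(client, source, tokens, cost_per_token=1, now=None):
    """Add weighted token usage to today's counter and return the new total."""
    now = now or datetime.now(timezone.utc)
    key = f"usage:{source}:{now:%Y-%m-%d}"  # hypothetical key layout
    total = client.incrby(key, tokens * cost_per_token)
    client.expire(key, 2 * 24 * 3600)  # let old daily counters disappear
    return total


def within_daily_limit(total, daily_limit):
    """Return True while the accumulated usage does not exceed the limit."""
    return total <= daily_limit
```

Because the counter is keyed by date, a limit check needs only a single read instead of a scan over the `logs` collection.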

Run gateway locally

You will need to set the LLMDEFAULTURL environment variable (including any port specification) for the container to point to the location of your LLM server.

You will need at least one LLM running on your local machine. This model needs to accept requests on LLMDEFAULTURL//v1/..

The API of the model server needs to be compatible with the API provided by llama-cpp-python[server].
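As a quick check of this setup, the sketch below builds the chat completions URL from the environment variable. The variable name is used exactly as it is rendered on this page, `/v1/chat/completions` is the OpenAI-compatible path that llama-cpp-python's server exposes, and the function name `completions_url` is hypothetical.

```python
import os


def completions_url(env=None):
    """Return the chat completions URL derived from LLMDEFAULTURL."""
    env = os.environ if env is None else env
    base = env.get("LLMDEFAULTURL")
    if not base:
        raise RuntimeError("LLMDEFAULTURL is not set")
    # Strip a trailing slash so the path is joined with exactly one "/".
    return base.rstrip("/") + "/v1/chat/completions"
```

Calling `completions_url()` with no argument reads the process environment; passing a dictionary makes the function easy to try without changing the environment.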

In the future, LLM endpoints will also have to provide an additional /extras/tokenize/count endpoint, which calculates prompt tokens based on either a single input string, or a full ChatCompletionRequest.

The docker-compose.yml included in this repo is an example of how to test locally. For this to work, you will need to set up the Keycloak installation and point the gateway's SAML authentication to that Keycloak service.

Owner

Name: AaltoRSE
Login: AaltoRSE
Kind: organization

Repositories: 38
Profile: https://github.com/AaltoRSE

GitHub Events

Total

Push event: 34
Create event: 1

Last Year

Push event: 34
Create event: 1

Issues and Pull Requests

Last synced: about 2 years ago

All Time

Total issues: 0
Total pull requests: 9
Average time to close issues: N/A
Average time to close pull requests: 6 days
Total issue authors: 0
Total pull request authors: 2
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 9
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 9
Average time to close issues: N/A
Average time to close pull requests: 6 days
Issue authors: 0
Pull request authors: 2
Average comments per issue: 0
Average comments per pull request: 0.0
Merged pull requests: 9
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

Pull Request Authors

tpfau (7)
ruokolt (2)

Top Labels

Issue Labels

Pull Request Labels

Dependencies

Dockerfile docker

mambaorg/micromamba latest build

environment.yml conda

fastapi
gunicorn
httpx
pip
pymongo
python3-saml
redis-py
schedule
uvicorn
