Updated 6 months ago

bentoml • Rank 26.1 • Science 54%

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Updated 6 months ago

tiny_qa_benchmark_pp • Rank 2.1 • Science 77%

Tiny QA Benchmark++: a micro-benchmark suite (52-item gold set plus on-demand multilingual synthetic packs), generator CLI, and CI-ready eval harness for ultra-fast LLM smoke-testing and regression-catching.

Updated 6 months ago

tensorzero • Rank 23.4 • Science 54%

TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.

Updated 6 months ago

mlflow • Rank 35.0 • Science 36%

The open source developer platform to build AI/LLM applications and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integrated platform.

Updated 6 months ago

openllm • Rank 22.1 • Science 44%

Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.

Updated 6 months ago

nutcracker • Rank 1.9 • Science 54%

Large Model Evaluation Experiments

Updated 5 months ago

https://github.com/apache/hamilton • Rank 12.1 • Science 36%

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

Updated 6 months ago

@superagent-ai/poker-eval • Rank 4.0 • Science 44%

A comprehensive tool for assessing AI agents' performance in simulated poker environments

Updated 6 months ago

llmstack • Science 26%

No-code multi-agent framework to build LLM Agents, workflows and applications with your data

Updated 6 months ago

leo • Science 44%

v0.1.0-beta

Updated 6 months ago

promptfoo • Science 26%

Test your prompts, agents, and RAGs. AI Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.

Updated 6 months ago

gpt4dfci • Science 57%

A private and secure generative AI tool, based on GPT-4 and deployed for non-clinical use at Dana-Farber Cancer Institute

Updated 5 months ago

https://github.com/awslabs/aiops-modules • Science 26%

AIOps modules is a collection of reusable Infrastructure as Code (IaC) modules for Machine Learning (ML), Foundation Models (FM), Large Language Models (LLM) and GenAI development and operations on AWS

Updated 5 months ago

https://github.com/bentoml/transformers-nlp-service • Science 13%

Online Inference API for NLP Transformer models - summarization, text classification, sentiment analysis and more

Updated 6 months ago

smyth-docs • Science 26%

Everything you need to build, deploy, and collaborate with agents. Ride the llama, avoid the drama.

Updated 6 months ago

glide • Science 44%

🐦 An open, blazing-fast, simple model gateway for rapid development of production GenAI apps