promptfoo

Test your prompts, agents, and RAGs. AI Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.

https://github.com/promptfoo/promptfoo

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.7%) to scientific vocabulary

Keywords

ci ci-cd cicd evaluation evaluation-framework llm llm-eval llm-evaluation llm-evaluation-framework llmops pentesting prompt-engineering prompt-testing prompts rag red-teaming testing vulnerability-scanners
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: promptfoo
  • License: MIT
  • Language: TypeScript
  • Default Branch: main
  • Homepage: https://promptfoo.dev
  • Size: 286 MB
Statistics
  • Stars: 8,269
  • Watchers: 23
  • Forks: 682
  • Open Issues: 282
  • Releases: 338
Created almost 3 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing Funding License Citation Security

README.md

Promptfoo: LLM evals & red teaming

promptfoo is a developer-friendly local tool for testing LLM applications. Stop the trial-and-error approach - start shipping secure, reliable AI apps.

Quick Start

```sh
# Install and initialize project
npx promptfoo@latest init

# Run your first evaluation
npx promptfoo eval
```

See Getting Started (evals) or Red Teaming (vulnerability scanning) for more.

What can you do with Promptfoo?

  • Test your prompts and models with automated evaluations
  • Secure your LLM apps with red teaming and vulnerability scanning
  • Compare models side-by-side (OpenAI, Anthropic, Azure, Bedrock, Ollama, and more)
  • Automate checks in CI/CD
  • Share results with your team

Here's what it looks like in action:

[Screenshot: prompt evaluation matrix in the web viewer]

It works on the command line too:

[Screenshot: prompt evaluation matrix on the command line]

It can also generate security vulnerability reports:

[Screenshot: generative AI red team vulnerability report]
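
To make the idea of an evaluation matrix concrete, here is a minimal TypeScript sketch of the underlying pattern: every prompt is run against every provider for every test case, and each resulting cell is marked pass or fail. This is not promptfoo's API or implementation. The `Provider`, `TestCase`, and `Cell` types, the `render` and `evaluate` functions, and the two stub providers are all hypothetical names introduced for illustration, and the stubs replace real model calls so the sketch runs offline.

```typescript
// Hypothetical types for illustration only; these are not promptfoo's API.
type Provider = { id: string; call: (prompt: string) => string };
type TestCase = { vars: Record<string, string>; expectContains: string };
type Cell = { prompt: string; provider: string; output: string; pass: boolean };

// Fill {{name}} placeholders in a prompt template from the test variables.
function render(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => vars[key] ?? "");
}

// Run every prompt against every provider for every test case.
function evaluate(prompts: string[], providers: Provider[], tests: TestCase[]): Cell[] {
  const cells: Cell[] = [];
  for (const test of tests) {
    for (const prompt of prompts) {
      for (const provider of providers) {
        const output = provider.call(render(prompt, test.vars));
        // A case-sensitive substring check stands in for a real assertion.
        const pass = output.includes(test.expectContains);
        cells.push({ prompt, provider: provider.id, output, pass });
      }
    }
  }
  return cells;
}

// Stub providers stand in for real model calls so the sketch runs offline.
const providers: Provider[] = [
  { id: "stub-upper", call: (p) => p.toUpperCase() },
  { id: "stub-echo", call: (p) => p },
];

const cells = evaluate(
  ["Say hello to {{name}}", "Greet {{name}} politely"],
  providers,
  [{ vars: { name: "Ada" }, expectContains: "Ada" }],
);

const passed = cells.filter((c) => c.pass).length;
console.log(`${passed}/${cells.length} cells passed`);
```

With two prompts, two providers, and one test case, the matrix has four cells. The upper-casing stub fails the case-sensitive check while the echo stub passes, which is the kind of per-provider difference a side-by-side comparison is meant to surface.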

Why Promptfoo?

  • Developer-first: Fast, with features like live reload and caching
  • Private: Runs 100% locally - your prompts never leave your machine
  • Flexible: Works with any LLM API or programming language
  • Battle-tested: Powers LLM apps serving 10M+ users in production
  • Data-driven: Make decisions based on metrics, not gut feel
  • Open source: MIT licensed, with an active community

Contributing

We welcome contributions! Check out our contributing guide to get started.

Join our Discord community for help and discussion.

Owner

  • Name: promptfoo
  • Login: promptfoo
  • Kind: organization

Test your prompts

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 744
  • Total pull requests: 3,769
  • Average time to close issues: 16 days
  • Average time to close pull requests: 2 days
  • Total issue authors: 378
  • Total pull request authors: 190
  • Average comments per issue: 1.66
  • Average comments per pull request: 0.96
  • Merged pull requests: 2,796
  • Bot issues: 4
  • Bot pull requests: 707
Past Year
  • Issues: 357
  • Pull requests: 2,941
  • Average time to close issues: 7 days
  • Average time to close pull requests: 1 day
  • Issue authors: 215
  • Pull request authors: 109
  • Average comments per issue: 1.38
  • Average comments per pull request: 1.16
  • Merged pull requests: 2,090
  • Bot issues: 3
  • Bot pull requests: 656
Top Authors
Issue Authors
  • jamesbraza (18)
  • albertlieyingadrian (17)
  • aantn (15)
  • typpo (13)
  • mldangelo (11)
  • efung (10)
  • pelikhan (10)
  • SysOverdrive (10)
  • zhlmmc (8)
  • romaintoub (8)
  • chrismaltais (8)
  • sbichenko (8)
  • mshavliuk (7)
  • sangwoo-joh (7)
  • dhodun (7)
Pull Request Authors
  • mldangelo (1,092)
  • typpo (988)
  • gru-agent[bot] (393)
  • dependabot[bot] (272)
  • sklein12 (236)
  • will-holley (121)
  • MrFlounder (103)
  • faizanminhas (75)
  • AISimplyExplained (41)
  • vedantr (37)
  • devin-ai-integration[bot] (28)
  • vsauter (26)
  • abrayne (21)
  • use-tusk[bot] (14)
  • billybonks (10)
Top Labels
Issue Labels
bug (30), enhancement (29), question (25), good first issue (6), dependencies (5), in-progress (5), javascript (4), Open Source (4), documentation (3), help wanted (2), wontfix (2), codex (1)
Pull Request Labels
dependencies (272), javascript (252), codex (56), python (15), bug (6), enhancement (4), github_actions (4), sourcery (3), documentation (2), in-progress (1), question (1), good first issue (1)

Packages

  • Total packages: 4
  • Total downloads:
    • npm 163,514 last-month
    • pypi 2,422 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 1
    (may contain duplicates)
  • Total versions: 344
  • Total maintainers: 5
proxy.golang.org: github.com/promptfoo/promptfoo/examples/golang-provider

Package main implements a promptfoo provider that uses OpenAI's API. It demonstrates a simple implementation of the provider interface using shared code from the core and pkg1 packages.

  • Versions: 0
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 5.7%
Average: 5.9%
Dependent repos count: 6.1%
Last synced: 6 months ago
proxy.golang.org: github.com/promptfoo/promptfoo
  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 6.2%
Average: 6.4%
Dependent repos count: 6.6%
Last synced: 6 months ago
npmjs.org: promptfoo

LLM eval & testing toolkit

  • Versions: 342
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 163,514 Last month
Rankings
Downloads: 0.8%
Stargazers count: 2.3%
Forks count: 4.3%
Dependent repos count: 10.3%
Average: 13.9%
Dependent packages count: 51.9%
Maintainers (3)
Last synced: 6 months ago
pypi.org: promptfoo

LLM evals and red teaming

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 2,422 Last month
Rankings
Dependent packages count: 10.5%
Average: 34.7%
Dependent repos count: 58.9%
Maintainers (2)
Last synced: 6 months ago

Dependencies

.github/workflows/main.yml actions
  • actions/checkout v3 composite
  • actions/setup-node v3 composite
  • bahmutov/npm-install v1 composite
examples/jest-integration/package-lock.json npm
  • 284 dependencies
examples/jest-integration/package.json npm
  • @types/jest ^29.5.1 development
  • jest ^29.5.0 development
  • ts-jest ^29.1.0 development
  • typescript ^5.0.4 development
examples/node-package/package.json npm
package-lock.json npm
  • 495 dependencies
package.json npm
  • @types/async ^3.2.20 development
  • @types/cache-manager ^4.0.2 development
  • @types/cache-manager-fs-hash ^0.0.1 development
  • @types/cli-progress ^3.11.0 development
  • @types/cors ^2.8.13 development
  • @types/debounce ^1.2.1 development
  • @types/express ^4.17.17 development
  • @types/glob ^8.1.0 development
  • @types/jest ^29.5.1 development
  • @types/js-yaml ^4.0.5 development
  • @types/node-fetch ^2.6.4 development
  • @types/nunjucks ^3.2.2 development
  • @types/opener ^1.4.0 development
  • @types/semver ^7.5.0 development
  • babel-jest ^29.5.0 development
  • jest ^29.5.0 development
  • jest-watch-typeahead ^2.2.2 development
  • prettier ^2.8.8 development
  • ts-jest ^29.1.0 development
  • ts-node ^10.9.1 development
  • typescript ^5.0.4 development
  • @anthropic-ai/sdk ^0.5.2
  • @apidevtools/json-schema-ref-parser ^10.1.0
  • async ^3.2.4
  • cache-manager ^4.1.0
  • cache-manager-fs-hash ^1.0.0
  • chalk ^4.1.2
  • cli-progress ^3.12.0
  • cli-table3 ^0.6.3
  • commander ^10.0.1
  • cors ^2.8.5
  • csv-parse ^5.3.8
  • csv-stringify ^6.3.2
  • debounce ^1.2.1
  • express ^4.18.2
  • glob ^10.2.6
  • js-yaml ^4.1.0
  • node-fetch ^2.6.7
  • nunjucks ^3.2.4
  • opener ^1.5.2
  • replicate ^0.12.3
  • rouge ^1.0.3
  • semver ^7.5.3
  • socket.io ^4.6.1
  • tiny-invariant ^1.3.1
  • winston ^3.8.2
examples/langchain-python/requirements.txt pypi
  • PyYAML ==6.0
  • SQLAlchemy ==2.0.18
  • aiohttp ==3.8.5
  • aiosignal ==1.3.1
  • async-timeout ==4.0.2
  • attrs ==23.1.0
  • certifi ==2023.5.7
  • charset-normalizer ==3.1.0
  • dataclasses-json ==0.5.9
  • frozenlist ==1.3.3
  • greenlet ==2.0.2
  • idna ==3.4
  • langchain ==0.0.228
  • langchainplus-sdk ==0.0.20
  • marshmallow ==3.19.0
  • marshmallow-enum ==1.5.1
  • multidict ==6.0.4
  • mypy-extensions ==1.0.0
  • numexpr ==2.8.4
  • numpy ==1.25.0
  • openai ==0.27.8
  • openapi-schema-pydantic ==1.2.4
  • packaging ==23.1
  • pydantic ==1.10.11
  • requests ==2.31.0
  • tenacity ==8.2.2
  • tqdm ==4.65.0
  • typing-inspect ==0.9.0
  • typing_extensions ==4.7.1
  • urllib3 ==2.0.3
  • yarl ==1.9.2
Dockerfile docker
  • node 16-alpine build
src/web/nextui/package-lock.json npm
  • 414 dependencies
src/web/nextui/package.json npm
  • @types/js-yaml ^4.0.5 development
  • @types/prismjs ^1.26.0 development
  • prisma ^5.2.0 development
  • @emotion/react ^11.11.1
  • @emotion/styled ^11.11.0
  • @mui/icons-material ^5.14.3
  • @mui/material ^5.14.4
  • @prisma/client ^5.2.0
  • @tanstack/react-table ^8.9.3
  • @types/diff ^5.0.3
  • @types/node 20.4.10
  • @types/react 18.2.20
  • @types/react-dom 18.2.7
  • @types/react-syntax-highlighter ^15.5.7
  • @types/uuid ^9.0.2
  • chart.js ^4.3.3
  • debounce ^1.2.1
  • diff ^5.1.0
  • eslint 8.47.0
  • eslint-config-next 13.4.13
  • js-yaml ^4.1.0
  • next 13.4.13
  • opener ^1.5.2
  • prismjs ^1.29.0
  • react 18.2.0
  • react-dnd ^16.0.1
  • react-dnd-html5-backend ^16.0.1
  • react-dom 18.2.0
  • react-error-boundary ^4.0.11
  • react-simple-code-editor ^0.13.1
  • react-syntax-highlighter ^15.5.0
  • socket.io ^4.7.2
  • socket.io-client ^4.7.2
  • tiny-invariant ^1.3.1
  • typescript 5.1.6
  • uuid ^9.0.0
  • zustand ^4.4.1