Recent Releases of tensorzero

tensorzero - 2025.8.4

[!WARNING] Planned Deprecations

  • The OpenAI-compatible embeddings endpoint will require the prefix tensorzero::embedding_model_name:: for model names (e.g. tensorzero::embedding_model_name::openai::text-embedding-3-small). Support for unprefixed names will be removed in a future release (2025.12+).
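For illustration, a minimal sketch of an embeddings request body using the new prefixed model name (the payload shape follows the OpenAI embeddings API; how you send it to your gateway deployment is up to you):

```python
# Sketch: OpenAI-compatible embeddings request body with the new
# prefixed model name (unprefixed names go away in 2025.12+).
prefixed_model = "tensorzero::embedding_model_name::openai::text-embedding-3-small"

payload = {
    "model": prefixed_model,
    "input": "Hello, world!",
}

assert payload["model"].startswith("tensorzero::embedding_model_name::")
```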

Bug Fixes

  • Fix a ClickHouse warning that occurred when a model inference had input tokens set to null and output tokens non-null, or vice versa. This issue only caused warnings and did not affect TensorZero's user-facing functionality.

New Features

  • Add extra_body support for embedding model configurations to enable custom API request fields for various embedding providers. (thanks @ishbir!)
  • Update the Azure OpenAI Service model provider to use API version 2025-04-01-preview.
  • Add CrewAI integration example.
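As a hedged sketch of the first item, an `extra_body` entry on an embedding model provider might look like this (the model/provider names, the table path, and the `dimensions` field are illustrative assumptions, not a canonical config):

```toml
[embedding_models.my-embeddings.providers.openai]
type = "openai"
model_name = "text-embedding-3-small"

# Illustrative extra_body entry injecting a custom field into the
# outgoing provider request (pointer/value shape as used for extra_body
# elsewhere in TensorZero configuration).
[[embedding_models.my-embeddings.providers.openai.extra_body]]
pointer = "/dimensions"
value = 256
```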

& multiple under-the-hood and UI improvements (thanks @MengAiDev!)

- Rust
Published by GabrielBianconi 6 months ago

tensorzero - 2025.8.3

[!CAUTION] Breaking Changes

  • Temporarily removing support for batching writes to ClickHouse with the embedded gateway in Python: In the previous release, we added support for batching writes to ClickHouse to boost ingest throughput and reduce insert overhead at scale (default off). Later, we discovered that in rare scenarios, the Python GIL could interfere with this setting in embedded clients and cause a deadlock. While we investigate a solution, we are removing support for batching with the embedded client to prevent technical footguns. Batching remains available when using a standalone (HTTP) gateway.

New Features

  • Add support for splitting configuration into multiple files with glob patterns
  • Add an example for multimodal (vision) fine-tuning
  • Expose more hyperparameters for programmatic supervised fine-tuning with Fireworks
  • Optimize queries in the UI to improve the performance of assorted pages in large-scale deployments
  • Enable setting global labels for all created resources in Helm (thanks @jinnovation!)
  • Support embedding endpoint when using the OpenAI SDK with an embedded gateway (patch_openai_client)

& many under-the-hood and UI improvements (thanks @wliu4040!)

Published by GabrielBianconi 6 months ago

tensorzero - 2025.8.2

New Features

  • Add a Playground to the UI to compare variants side-by-side, iterate on prompts quickly, and replay inference requests.
  • Support batching writes to ClickHouse to boost ingest throughput and reduce insert overhead at scale.
  • Add a Jupyter notebook recipe for supervised fine-tuning with Unsloth.

& many under-the-hood and UI improvements (thanks @contrun @lblack00!)

Published by GabrielBianconi 6 months ago

tensorzero - 2025.8.1

New Features

  • Add an OpenAI-compatible endpoint for embeddings, with support for OpenAI (& OpenAI-compatible) and Azure OpenAI Service model providers.
  • Add support for self-hosted replicated ClickHouse databases.
  • Parse reasoning_content from Fireworks and vLLM model providers.
  • Improve error messages for AWS Bedrock and AWS SageMaker model providers.

Bug Fixes

  • Allow configuration to specify description for JSON functions.
  • Fix a regression where function descriptions were no longer rendered in the UI.

& many under-the-hood and UI improvements (thanks @yuvraj-kumar-dev)

Published by GabrielBianconi 6 months ago

tensorzero - 2025.8.0

New Features

  • Add gateway.observability.skip_completed_migrations configuration option to reduce gateway startup time and database load. When enabled, the gateway will skip running the ClickHouse migration workflow (i.e. verifying and potentially applying every migration) on startup for migrations that are already present in a database table that tracks migration history.
  • Support raw_text content blocks in the OpenAI-compatible inference endpoint. (Thanks @hongantran3804 @pykm05 @pycoder49!)
  • Allow users to collect outputs from "Try with variant" in the UI as demonstrations.
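A minimal sketch of opting into the new migration-skipping behavior (option name from the note above; the `enabled` field alongside it is illustrative):

```toml
[gateway.observability]
enabled = true
# Skip re-running ClickHouse migrations that are already recorded in the
# migration-history table, reducing startup time and database load.
skip_completed_migrations = true
```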

Bug Fixes

  • Fix handling of reasoning content blocks for DeepSeek-R1 on AWS Bedrock.
  • Set proper default value for max_tokens for the Anthropic and GCP Vertex AI Anthropic model providers. The gateway will now error if no value is provided in the configuration or request and the model is unknown.
  • Skip caching model inferences that generated invalid tool call arguments.

& many under-the-hood and UI improvements (thanks @michaldorsett @K-coder05 @dcaputo-harmoni @masonblier @Nicolasgarbarino!)

Published by GabrielBianconi 7 months ago

tensorzero - 2025.7.5

Experimental

  • Add gateway.unstable_disable_feedback_target_validation configuration option to improve the performance of the feedback endpoint in large-scale deployments (not recommended unless you know what you're doing).

& multiple under-the-hood and UI improvements (thanks @michaldorsett @HJStaiff @liamjdavis!)

Published by GabrielBianconi 7 months ago

tensorzero - 2025.7.4

Bug Fixes

  • Fixed an issue with inference caching where inference requests that were identical except for their inline (base64-encoded) file data incorrectly shared the same cache key, resulting in false cache hits. The cache key now includes a hash of the inline file data, ensuring that such requests are properly distinguished.

New Features

  • Added functionality for deleting datasets in the UI (soft deletion).

Experimental

  • Added support for filtering by time and tags to the experimental_list_inferences method.
  • Added support for ordering by metric value and time to the experimental_list_inferences method.

& multiple under-the-hood and UI improvements (thanks @NamNgHH!)

Published by GabrielBianconi 7 months ago

tensorzero - 2025.7.3

[!WARNING] Planned Deprecations

  • Migrate gateway.enable_template_filesystem_access = true to gateway.template_filesystem_access.enabled = true. We're about to add more fields to template_filesystem_access to support multi-file configuration.
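The migration amounts to a one-line change in the gateway configuration (both forms shown for contrast; only the new one should be used going forward):

```toml
# Deprecated:
# [gateway]
# enable_template_filesystem_access = true

# New form:
[gateway.template_filesystem_access]
enabled = true
```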

Bug Fixes

  • Remove a third-party dependency that was causing a memory leak in the UI.
  • Fix a regression that prevented the UI from running offline.

& multiple under-the-hood and UI improvements

Published by GabrielBianconi 7 months ago

tensorzero - 2025.7.2

Bug Fixes

  • Update TensorZero's ClickHouse client to match the parameter recommendations by ClickHouse. (This change aims to resolve occasional connection errors with ClickHouse Cloud.)

Published by GabrielBianconi 7 months ago

tensorzero - 2025.7.1

New Features

  • Improve UI components for rendering text, JSON, Markdown, and MiniJinja templates (syntax highlighting, line numbers, wrapping, etc.)
  • Improve the performance of the UI's episode list page
  • Add pseudonymous usage analytics to the gateway (see docs for details and instructions to opt out)

Experimental

  • Launch SFT jobs for Together AI and GCP Vertex AI Gemini programmatically
  • Return internal error details in the response body (gateway.unstable_error_json) (thanks @panesher)

& many under-the-hood and UI improvements (thanks @michaldorsett @itsrajatrai @caarlos0)

Published by GabrielBianconi 7 months ago

tensorzero - 2025.7.0

New Features

  • Revamped the UI's supervised fine-tuning workflow to fully support TensorZero's inference capabilities, including multimodal data (vision, documents, etc.), multi-turn tool use, and more.
  • Added streaming inference support for best-of-n and mixture-of-n variant types.
  • Optimized the performance of some database queries in the UI.

Experimental

Experimental features don't have a stable API. They may change or be removed in future releases.

  • Added methods to the Python client for programmatically launching (experimental_launch_optimization) and polling for (experimental_poll_optimization) optimization jobs. For now, these methods support supervised fine-tuning with OpenAI and Fireworks AI.
  • Added a method to the Python client for retrieving the configuration (experimental_get_config).
  • Updated experimental_render_inferences to accept outputs from both experimental_list_inferences and list_datapoints.

& many under-the-hood and UI improvements (thanks @jeevikasirwani!)

Published by GabrielBianconi 8 months ago

tensorzero - 2025.6.3

New Features

  • Add delete = true option to extra_body and extra_headers configuration fields to instruct the gateway to delete built-in fields from the request body or headers.
  • Add gateway.base_path field to configuration to instruct the gateway to prefix all endpoints with this path.
  • Add discard_unknown_chunks field to model provider configuration to instruct the gateway to discard chunks with unknown or unsupported types instead of throwing an error.
  • Add optional name field to tool configuration; if provided, the tool name will be sent to the LLMs instead of the tool ID, allowing for multiple tools with the same name.
  • Add functionality to filter list_datapoints by function name.
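Two of the options above sketched in configuration (the model/provider names and the deleted field are illustrative assumptions):

```toml
[gateway]
# Serve all gateway endpoints under this prefix, e.g. /llm/inference.
base_path = "/llm"

# Instruct the gateway to strip a built-in field from outgoing
# provider requests instead of overriding its value.
[[models.my-model.providers.my-provider.extra_body]]
pointer = "/stream_options"
delete = true
```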

& multiple under-the-hood and UI improvements

Published by GabrielBianconi 8 months ago

tensorzero - 2025.6.2

New Features

  • Add recipe for supervised fine-tuning with Google Vertex AI Gemini
  • Add granular timeouts ([timeouts]) to variant and model configuration blocks
  • Support short-hand model names for Groq (groq::...) and OpenRouter (openrouter::...) model providers
  • Support tool use with vLLM (thanks @CHRV @chaet1t!)
  • Add explicit stop_sequences inference parameter
  • Support dynamic credentials in OpenAI-compatible inference endpoint (tensorzero::credentials) (thanks @zmij!)
  • Support multimodal inference and file inputs on AWS Bedrock

& multiple under-the-hood and UI improvements

Published by GabrielBianconi 8 months ago

tensorzero - 2025.6.1

[!CAUTION] Breaking Changes

  • Streaming Inference + Tool Use: During streaming inferences, raw_name in a tool call chunk represents a delta that should be accumulated. If the tool name has finished streaming, this field will contain an empty string. Previously, TensorZero returned the same raw_name in every subsequent chunk for that tool call. The new behavior matches the OpenAI API's behavior.
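Under the new semantics, clients should concatenate raw_name deltas instead of reading the field once per chunk. A minimal sketch with mocked chunks:

```python
# Mocked tool-call chunks illustrating the new delta semantics for
# raw_name: each chunk contributes a fragment, and once the name has
# finished streaming the field is an empty string.
chunks = [
    {"raw_name": "get_", "raw_arguments": ""},
    {"raw_name": "weather", "raw_arguments": '{"city": '},
    {"raw_name": "", "raw_arguments": '"Tokyo"}'},  # name already complete
]

tool_name = ""
tool_arguments = ""
for chunk in chunks:
    tool_name += chunk["raw_name"]          # accumulate deltas (new behavior)
    tool_arguments += chunk["raw_arguments"]

assert tool_name == "get_weather"
assert tool_arguments == '{"city": "Tokyo"}'
```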

Bug Fixes

  • Return null instead of an empty string when missing service_tier in the OpenAI-compatible inference endpoint

New Features

  • Allow inference containing files with arbitrary MIME types
  • Add [timeouts] to model provider configuration for granular timeout functionality
  • Support templates without schemas; add built-in system_text, assistant_text, and user_text template variables
  • Support tags in OpenAI-compatible inference endpoint (tensorzero::tags)
  • Add experimental_list_inferences method to the client for retrieving historical inferences

& multiple under-the-hood and UI improvements (thanks @vr-varad!)

Published by GabrielBianconi 9 months ago

tensorzero - 2025.6.0

Bug Fixes

  • Increase database health check timeout in the gateway to 180s to gracefully handle warmup of serverless databases

New Features

  • Handle thinking and unknown content blocks for gcp_vertex_anthropic and gcp_vertex_gemini models
  • Add endpoint_id field in the configuration for gcp_vertex_anthropic and gcp_vertex_gemini models to support fine-tuned models
  • Add a dedicated Groq (groq) model provider (thanks @oliverbarnes!)
  • Support include_original_response during streaming inference

& multiple under-the-hood and UI improvements

Published by GabrielBianconi 9 months ago

tensorzero - 2025.5.9

Bug Fixes

  • Stop requiring the ClickHouse URL to have a port (use the scheme's default port instead)

New Features

  • Support inference with PDFs for OpenAI, Anthropic, and GCP Vertex AI Gemini model providers
  • Support short-hand model names for GCP Vertex AI Gemini and GCP Vertex AI Anthropic model providers

& multiple under-the-hood and UI improvements (thanks @GustavoFortti!)

Published by GabrielBianconi 9 months ago

tensorzero - 2025.5.8

Bug Fixes

  • Remove extraneous log entries introduced in the previous release

New Features

  • Add dynamic evaluations → Dynamic Evaluations Tutorial
  • Add TGI support to the AWS SageMaker model provider
  • Add a dedicated OpenRouter model provider (thanks @oliverbarnes)
  • Add an example deploying TensorZero with Kubernetes (k8s) and Helm (thanks @timusri)

& multiple under-the-hood and UI improvements (thanks @subham73 @omahs @arrrrny!)

Published by GabrielBianconi 9 months ago

tensorzero - 2025.5.7

Bug Fixes

  • Fix a regression in the evaluations UI where incremental results are not displayed until all generated outputs are ready.
  • Fix a regression in the evaluations UI where users could not open the detail page for completed datapoints in partial (e.g. failed) runs.

& multiple under-the-hood and UI improvements (thanks @adithya-adee @Garvity!)

Published by GabrielBianconi 9 months ago

tensorzero - 2025.5.6

Bug Fixes

  • Fix an edge case in the Python client affecting list_datapoints and get_datapoint for datapoints without output.

Published by GabrielBianconi 9 months ago

tensorzero - 2025.5.5

[!WARNING] Planned Deprecations

  • We are renaming the Python client types ChatInferenceDatapointInput and JsonInferenceDatapointInput to ChatDatapointInsert and JsonDatapointInsert for clarity. Both versions will be supported until 2025.8+ (#2131).

Bug Fixes

  • Handle a regression in the Fireworks AI SFT API

New Features

  • Add endpoints (and client methods) for listing and querying datapoints programmatically

& multiple under-the-hood and UI improvements

Published by GabrielBianconi 9 months ago

tensorzero - 2025.5.4

[!WARNING]

Completed Deprecations

  • This release completes the planned deprecation of the gateway.disable_observability configuration option. Use gateway.observability.enabled instead.

Bug Fixes

  • Allow multiple text content blocks in individual input messages.

Published by GabrielBianconi 9 months ago

tensorzero - 2025.5.3

[!IMPORTANT]

Please upgrade to this version if you're having performance issues with your ClickHouse queries.

Bug Fixes

  • Optimized the performance of additional ClickHouse queries in the UI that previously consumed excessive time and memory at scale.

& multiple under-the-hood and UI improvements (thanks @bhatt-priyadutt!)

Published by GabrielBianconi 9 months ago

tensorzero - 2025.5.2

[!WARNING] Planned Deprecations

  • This release patches our OpenAI-compatible inference API to match the OpenAI API format when using dynamic output schemas. Both API formats are accepted for now, but we plan to deprecate the legacy (incorrect) format in 2025.8+ (#2094).

Bug Fixes

  • Comply with the OpenAI API format when using dynamic output schemas in the OpenAI-compatible inference API

Published by GabrielBianconi 10 months ago

tensorzero - 2025.5.1

[!IMPORTANT]

Please upgrade to this version if you're having performance issues with your ClickHouse queries.

Bug Fixes

  • Optimized the performance of ClickHouse queries in the UI that previously consumed excessive time and memory at scale

New Features

  • Add support for managing datasets and datapoints programmatically
  • Enable tool use with SGLang (thanks @subygan!)
  • Improve error messages across the stack

& multiple under-the-hood and UI improvements (thanks @subham73 @Daksh14!)

Published by GabrielBianconi 10 months ago

tensorzero - 2025.5.0

Bug Fixes

  • Fix an issue in the UI where multimodal inferences fail to parse when no object storage region is specified.
  • Handle an edge case with Google AI Studio's streaming API where some response fields are missing in certain chunks.

New Features

  • Support tensorzero::extra_body and tensorzero::extra_headers in the OpenAI-compatible inference endpoint.
  • Allow users to specify inference caching behavior in the evaluations UI.
  • Improve the performance of some database queries in the UI.

& multiple under-the-hood and UI improvements

Published by GabrielBianconi 10 months ago

tensorzero - 2025.4.8

[!IMPORTANT]

This release addresses an issue affecting a small subset of users caused by a change to the OpenAI API. The OpenAI API suddenly started rejecting non-standard HTTP request headers sent by our gateway. If you receive the error message Input should be a valid dictionary from the OpenAI API, please upgrade to the latest version of TensorZero.

Bug Fixes

  • Fix an issue affecting a small subset of users caused by a recent change to the OpenAI API
  • Support Windows-specific signal handling to enable non-WSL Windows users to run the gateway natively

New Features

  • Support exporting OpenTelemetry traces (OTLP) from the TensorZero Gateway
  • Add batch inference for GCP Vertex AI Gemini models (gcp_vertex_gemini provider)
  • Enable users to send extra headers to model providers at inference time (thanks @oliverbarnes!)

& multiple under-the-hood and UI improvements (thanks @nyurik @rushatgabhane!)

Published by GabrielBianconi 10 months ago

tensorzero - 2025.4.7

Bug Fixes

  • Fix an edge case that could result in duplicate inference results in the database for batch inference jobs.
  • Improve the performance of a database query in the inference detail page in the UI.
  • Fix an edge case that prevented the UI from parsing tool choices correctly in some scenarios.
  • Avoid unnecessarily parsing tool results sent to GCP Vertex AI models.

New Features

  • Add examples integrating TensorZero with Cursor and OpenAI Codex.
  • Make the OpenAI-compatible inference endpoint respect include_usage.

& multiple under-the-hood and UI improvements

Published by GabrielBianconi 10 months ago

tensorzero - 2025.4.6

Bug Fixes

  • Fix a bug with usage logging for streaming inferences with Google AI Studio Gemini.

New Features

  • Add support for multimodal inference (vision) with GCP Vertex AI Anthropic.
  • Improve adherence to OpenAI API behavior in the OpenAI-compatible inference endpoint.

& multiple under-the-hood and UI improvements

Published by GabrielBianconi 10 months ago

tensorzero - 2025.4.5

[!CAUTION] Breaking Changes

  • LLM judges no longer perform chain-of-thought reasoning by default (including for chat_completion variants). This is now opt-in via the new experimental_chain_of_thought variant type.
  • The --format human_readable flag for the evaluations binary was renamed to --format pretty.

New Features

  • Add experimental_chain_of_thought variant type.
  • Enable users to provide manual feedback to inferences, episodes, and evaluations in the UI.
  • Support inference caching with OpenAI-compatible inference endpoint.
  • Add an example integrating LangGraph and TensorZero.

& multiple under-the-hood and UI improvements (thanks @oliverbarnes @subham73!)

Published by GabrielBianconi 10 months ago

tensorzero - 2025.4.4

Bug Fixes

  • Handle an edge case when OpenAI doesn't return any content blocks for a JSON function in strict mode
  • Accept snake case values for --format flag in evaluations binary (human-readable → human_readable)
  • Re-enable stream_options for xAI (they changed their API again)

& multiple under-the-hood and UI improvements (thanks @oliverbarnes @benashby!)

Published by GabrielBianconi 10 months ago

tensorzero - 2025.4.3

Notable Improvements

  • Released TensorZero Datasets & Evaluations! Learn more →
  • Add reasoning support for Fireworks AI (thanks @igortoliveira!)

& multiple under-the-hood and UI improvements

Published by GabrielBianconi 11 months ago

tensorzero - 2025.4.2

[!WARNING]

Moving forward, we're removing the leading zero from minor version numbers. This release will be version 2025.4.2 instead of 2025.04.2.

Bug Fixes

  • Handle a regression in the xAI API: they no longer support the stream_options parameter.

Notable Improvements

  • Add AWS SageMaker model provider.
  • Add Agentic RAG example.

& multiple under-the-hood and UI improvements

Published by GabrielBianconi 11 months ago

tensorzero - 2025.04.1

Bug Fixes

  • Fixed incorrect double-serialization of tool call arguments in the TensorZero Python client.

& multiple under-the-hood and UI improvements

Published by GabrielBianconi 11 months ago

tensorzero - 2025.04.0

[!CAUTION] Breaking Changes

  • The tensorzero/gateway binary now offers multiple log formats. The default format is now pretty (human-readable) instead of json (machine-readable). To restore the previous behavior, set --log-format json.

[!WARNING] Planned Deprecations

  • The tensorzero/gateway Docker container will no longer specify the configuration path by default (in CMD). Please specify --config-file path/to/tensorzero.toml or --default-config in your Docker Compose file or Docker run command. The docker-compose.yml files in the repository have been updated to reflect this. → Planned for deprecation in 2025.06+ (#1101).

Bug Fixes

  • Fixed an issue where the TensorZero Python client incorrectly logged a deprecation warning for valid tool call usage.

Notable Improvements

  • Added support for images and multi-modal models (VLMs) in the TensorZero UI.
  • Added support for extra_body overrides at inference time (in addition to configuration).
  • Added support for extra_headers overrides in model provider and variant configuration.
  • Added a --log-format CLI argument to the TensorZero Gateway: pretty (default), json.
  • Improved UI components for outputs, schemas, tags, and other data fields.
  • Improved assorted error and warning messages for better developer experience.

& multiple under-the-hood and UI improvements (thanks @adityabharadwaj198!)

Published by GabrielBianconi 11 months ago

tensorzero - 2025.03.4

Bug Fixes

  • Migrate Text type in Python client to new API format to avoid deprecation warning from #1170

Notable Changes

  • Support initializing AsyncTensorZeroGateway without awaiting (async_setup=False)
  • Display dataset statistics for OpenAI in the SFT UI (thanks @naotone!)
  • Simplify streaming inference with Anthropic Extended Thinking
  • Simplify tool use in multi-turn inferences (accept optional raw_name and raw_arguments fields in subsequent inferences)

& multiple under-the-hood and UI improvements (thanks @ewang0 @kumarlokesh!)

Published by GabrielBianconi 11 months ago

tensorzero - 2025.03.3

We're skipping this release due to an issue while publishing. Please use version 2025.03.4 instead.

Published by GabrielBianconi 11 months ago

tensorzero - v2025.03.2

[!CAUTION] Breaking Changes

  • parallel_tool_calls now defaults to null (i.e. follows the model provider's default behavior) instead of false (i.e. disabled). OpenAI and some other providers default to true (i.e. enabled).

[!WARNING] Planned Deprecations

  • When using the OpenAI-compatible inference endpoint, TensorZero-specific arguments (e.g. episode_id, variant_name) should be provided in the request's body using the tensorzero:: prefix. Previously, these arguments were provided as headers (without a prefix). → Planned for deprecation in 2025.05+.
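A sketch of the migrated request shape (the model, messages, and argument values are illustrative):

```python
import uuid

# TensorZero-specific arguments move from HTTP headers into the request
# body, namespaced with the "tensorzero::" prefix.
episode_id = str(uuid.uuid4())

body = {
    "model": "tensorzero::function_name::my_function",
    "messages": [{"role": "user", "content": "Hi"}],
    "tensorzero::episode_id": episode_id,    # previously an HTTP header
    "tensorzero::variant_name": "baseline",  # previously an HTTP header
}

assert "tensorzero::episode_id" in body
assert "tensorzero::variant_name" in body
```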

Bug Fixes

  • ClickHouse 2025.2 introduced a bug (https://github.com/ClickHouse/ClickHouse/issues/77848) that affected the inference detail page in the TensorZero UI. This release includes a workaround for the upstream bug.

Notable Changes

  • Add embedded gateway support for openai-python (using tensorzero.patch_openai_client)
  • Add recipe notebook for supervised fine-tuning on demonstrations for Together AI
  • Support signature field for thought content blocks for Anthropic Extended Thinking
  • Support multiple system / developer messages in OpenAI-compatible inference endpoint (via concatenation)

& multiple under-the-hood improvements

Published by GabrielBianconi 11 months ago

tensorzero - v2025.03.1

[!CAUTION] Breaking Changes

  • OpenAI-compatible inference endpoint now returns the fully-qualified variant name (tensorzero::function_name::xxx::variant_name::yyy) or model name (tensorzero::model_name::zzz) in the model field. Previously, it returned just the short variant name or model name (yyy or zzz).
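For illustration, a small helper (hypothetical, not part of the client) that splits the new fully-qualified model field back into its components:

```python
def parse_model_field(model: str) -> dict:
    """Split the fully-qualified `model` field returned by the
    OpenAI-compatible endpoint into its components (illustrative helper)."""
    parts = model.split("::")
    if (
        len(parts) == 5
        and parts[0:2] == ["tensorzero", "function_name"]
        and parts[3] == "variant_name"
    ):
        return {"function_name": parts[2], "variant_name": parts[4]}
    if len(parts) == 3 and parts[0:2] == ["tensorzero", "model_name"]:
        return {"model_name": parts[2]}
    raise ValueError(f"unrecognized model field: {model!r}")


assert parse_model_field(
    "tensorzero::function_name::draft_email::variant_name::gpt_4o_mini"
) == {"function_name": "draft_email", "variant_name": "gpt_4o_mini"}
assert parse_model_field("tensorzero::model_name::my_model") == {
    "model_name": "my_model"
}
```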

[!WARNING] Planned Deprecations

The following features are getting deprecated. Both versions will work for now, but we encourage you to migrate as soon as possible. A future release will remove the deprecated approach.

  • Launching the gateway without any configuration-related flags is deprecated. To launch without custom configuration (i.e. without --config-file), use --default-config.

Bug Fixes

  • Fix caching behavior for best-of-N/mixture-of-N sampling variants with repeated candidates (now cached independently).
  • Fix OpenAI inference behavior when combining parallel tool calls with multi-turn inferences.

Notable Changes

  • Support caching for embedding requests.
  • Add multimodal (vision) support for GCP Vertex AI Gemini models.
  • Add examples and tests using openai-node for inference.
  • Add [gateway.enable_template_filesystem_access] configuration flag to enable MiniJinja templates to import sub-templates in the file system using the include directive (gateway only; not supported by recipes for now).

& multiple under-the-hood improvements

Published by GabrielBianconi 11 months ago

tensorzero - v2025.03.0

[!WARNING] Breaking Changes

  • The gateway will no longer sample variants with zero weight unless they are explicitly pinned at inference time ("variant_name": "..."). The gateway will first sample among variants with positive weight, then fall back to variants with unspecified weight. If you'd like to set up "fallback-only variants", do not specify their weights.

[!WARNING] Deprecating Functionality

The following features are getting deprecated. Both versions will work for now, but we encourage you to migrate as soon as possible. A future release will remove the deprecated approach.

  • Normalize the format for text content blocks:
    • {"type": "text", "text": "Hello"} for standard text messages (when no template/schema are defined)
    • {"type": "text", "arguments": {"k": "v"}} to use prompts with templates and schemas (use "tensorzero::arguments" in the OpenAI-compatible endpoint)
    • {"type": "raw_text", "text": "Hello"} to ignore template/schema (not supported by OpenAI-compatible endpoint)

Notable Changes

  • Add recipe for automated prompt engineering with MIPRO (recipes/mipro).
  • OpenAI-compatible inference endpoint matches OpenAI responses more closely

& multiple under-the-hood improvements (thanks @naotone @baslia!)

Published by GabrielBianconi 12 months ago

tensorzero - v2025.02.6

Notable Changes

  • Support image inputs (VLMs) (+ optional object storage for observability).
  • Add extra_body field to variant and model provider configuration blocks to allow users to override TensorZero behaviors or access unsupported features by downstream providers.
  • Add notebook for fine-tuning models with Together AI.

& multiple under-the-hood & UI improvements

Published by GabrielBianconi 12 months ago

tensorzero - v2025.02.5

Bug Fix

  • Fixed an import issue for tensorzero.util.uuid7 in the new Python client

Published by GabrielBianconi about 1 year ago

tensorzero - v2025.02.4

Notable Changes

  • Add Python client with embedded gateway
  • Add support for thought content blocks (e.g. DeepSeek R1)

& multiple under-the-hood & UI improvements

Published by GabrielBianconi about 1 year ago

tensorzero - v2025.02.3

[!WARNING] Deprecating Functionality

The following features are getting deprecated. Both versions will work for now, but we encourage you to migrate as soon as possible. A future release will remove the deprecated approach.

  • The gateway no longer writes asynchronously to ClickHouse by default. Set gateway.observability.async_writes = true in the config to re-enable the behavior.
  • Use tensorzero::function_name::xxx instead of tensorzero::xxx with the OpenAI client.

Notable Changes

  • Allow using tensorzero::model_name::xxx with the OpenAI client.
  • Improved JSON schema handling for Gemini models.
  • Lower the client's minimum Python version to 3.9.
  • Automatically downgrade strict JSON mode for Deepseek.

& multiple under-the-hood improvements (thanks @Kaboomtastic!)

Published by GabrielBianconi about 1 year ago

tensorzero - v2025.02.2

Notable Changes

  • Support inference caching
  • Add Try with variant... button to inference detail page

`Try with variant...` button demo

& multiple under-the-hood improvements

Published by GabrielBianconi about 1 year ago

tensorzero - v2025.02.1

Notable Changes

  • Add raw_text input content block that doesn't apply schemas or templates
  • Support strict mode for Azure OpenAI Service
  • Support dynamic output schemas in Python clients

& multiple under-the-hood improvements

Published by GabrielBianconi about 1 year ago

tensorzero - v2025.02.0

Notable Changes

  • Implement caching for non-streaming inferences (streaming coming soon)
  • Add Deepseek model provider (thanks @ankit-varma10!)
  • Support short-hand model names for embedding models

& multiple under-the-hood improvements (thanks @Kannav02 @buyunwang @lekzzA @syed-ghufran-hassan @amerinor01!)

Published by GabrielBianconi about 1 year ago

tensorzero - v2025.01.9

Bug Fixes

  • Fixed a bug in the UI when ClickHouse was using non-default database names.

Notable Changes

  • Support model_name (as an alternative to function_name) in the /inference endpoint (HTTP only; Python client support coming in the next release)
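A sketch of a request body for direct model inference (the input shape is illustrative):

```python
import json

# /inference request addressing a model directly via model_name,
# bypassing function configuration (input shape is illustrative).
payload = {
    "model_name": "openai::gpt-4o-mini",
    "input": {"messages": [{"role": "user", "content": "Hello"}]},
}
body = json.dumps(payload)

assert "model_name" in payload and "function_name" not in payload
```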

& multiple under-the-hood improvements

Published by GabrielBianconi about 1 year ago

tensorzero - v2025.01.8

[!WARNING] Deprecating Functionality

The following features are getting deprecated. Both versions will work for now, but we encourage you to migrate as soon as possible. A future release will remove the deprecated approach.

  • Replace the gateway.observability.disable_observability configuration option with gateway.observability.enabled, which supports true (require), false (ignore), and null (warn if missing but proceed).
  • Rename environment variable CLICKHOUSE_URL to TENSORZERO_CLICKHOUSE_URL
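A sketch of the new observability setting (semantics per the note above; leaving the field unset yields the warn-but-proceed behavior):

```toml
[gateway.observability]
# true: require ClickHouse; false: disable observability entirely.
# Omit the field to warn if ClickHouse is unavailable but proceed.
enabled = true
```

Remember to also set `TENSORZERO_CLICKHOUSE_URL` (formerly `CLICKHOUSE_URL`) in the environment when observability is enabled.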

Notable Changes

  • Clean up haiku and NER examples (featuring TensorZero UI!)

& multiple under-the-hood improvements (thanks @ChetanXpro @atharvapatil4!)

Published by GabrielBianconi about 1 year ago

tensorzero - v2025.01.7

This release has multiple small under-the-hood fixes (thanks @Kannav02!).

Published by GabrielBianconi about 1 year ago

tensorzero - v2025.01.6

Notable Changes

  • TensorZero UI preview release: docker pull tensorzero/ui
  • SGLang model provider [docs]
  • Recipe for Direct Preference Optimization (DPO) with OpenAI (thanks @ankit-varma10!)

& multiple under-the-hood improvements (thanks @ChetanXpro!)

Published by GabrielBianconi about 1 year ago

tensorzero - v2025.01.5

This release makes progress towards the TensorZero UI (thanks @ChetanXpro!).

Published by GabrielBianconi about 1 year ago

tensorzero - v2025.01.4

This release has a small under-the-hood fix.

Published by GabrielBianconi about 1 year ago

tensorzero - v2025.01.3

This release has multiple under-the-hood improvements.

Thanks @szepeviktor for contributing!

Published by GabrielBianconi about 1 year ago

tensorzero - v2025.01.2

Notable Changes

  • Support batch inference with OpenAI
  • Add Huggingface TGI model provider (thanks @hfl0506!)

& progress towards the TensorZero dashboard

& multiple under-the-hood improvements (thanks @amerinor01 @ChetanXpro!)

& welcoming @Aaron1011 to the TensorZero team!

Published by GabrielBianconi about 1 year ago

tensorzero - v2025.01.1

Notable Changes

  • Support built-in functions in MiniJinja templates (e.g. tojson)

& progress towards the TensorZero dashboard

& multiple under-the-hood improvements (thanks @ChetanXpro!)

- Rust
Published by GabrielBianconi about 1 year ago

tensorzero - v2025.01.0

This release has multiple under-the-hood improvements. Notably, these improvements are paving the way for the upcoming TensorZero UI.

Thanks @ChetanXpro and @hk1997 for the contributions!

- Rust
Published by GabrielBianconi about 1 year ago

tensorzero - v2024.12.3

Notable Changes

  • Special handling for an edge case when feedback is submitted before ClickHouse inserts the corresponding inference

& progress towards the TensorZero dashboard (thanks @ChetanXpro!)

- Rust
Published by GabrielBianconi about 1 year ago

tensorzero - v2024.12.2

Notable Changes

  • Support for Anthropic's disable_parallel_tool_use flag (thanks @hk1997!)
  • Validate inference IDs for all inference-level feedback (thanks @hk1997!)
  • Add example covering streaming inference (thanks @vincent0426!)
  • Better error messages (thanks @Kaboomtastic!)
  • Fixed a DX issue with short-hand model configurations

& progress towards the TensorZero dashboard

& multiple improvements under the hood (thanks @shailahir!)

- Rust
Published by GabrielBianconi about 1 year ago

tensorzero - v2024.12.1

New Features

  • New model providers: xAI (new contributor — thanks @Kaboomtastic!) and Hyperbolic (new contributor — thanks @hk1997!)
  • Short-form model initialization when declaring variants (e.g. model_name = "openai::gpt-4o-mini")
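With the short form, a variant can reference a model directly without a separate models block. A minimal sketch, assuming a hypothetical function and variant name (the type field follows TensorZero's chat completion variant convention):

```toml
[functions.draft_email.variants.baseline]
type = "chat_completion"
# short form: "provider::model" resolves to a default model configuration
model_name = "openai::gpt-4o-mini"
```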

Other Notable Changes

  • Small improvements in tool use support across multiple providers
  • Better error messages

& multiple improvements under the hood

- Rust
Published by GabrielBianconi about 1 year ago

tensorzero - v2024.12.0

Notable Changes

  • Fixed an issue caused by a regression in the GCP Vertex Gemini API

- Rust
Published by GabrielBianconi about 1 year ago

tensorzero - v2024.11.4

Notable Changes

  • Fixed edge case in Python client during streaming

- Rust
Published by GabrielBianconi about 1 year ago

tensorzero - v2024.11.3

Breaking Changes

  • Dynamic credentials in inference requests should reference the credential name (e.g. api_key_location = "dynamic::ARGUMENT_NAME") instead of the provider type. See API Reference and Configuration Reference. This change was necessary to allow for multiple credentials per provider type (see below).

New Features

  • Allow client to add tags to inference requests.
  • Revamp credential management to allow for multiple credentials per provider type (e.g. api_key_location = "env::ENV_VAR_NAME", api_key_location = "file::PATH_TO_CREDENTIALS_FILE", api_key_location = "dynamic::ARGUMENT_NAME", api_key_location = "none").
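A sketch of how two credentials for the same provider type might be configured (the model, provider, and variable names are hypothetical):

```toml
[models.gpt_4o_mini.providers.primary]
type = "openai"
model_name = "gpt-4o-mini"
api_key_location = "env::OPENAI_API_KEY_PRIMARY"

[models.gpt_4o_mini.providers.byok]
type = "openai"
model_name = "gpt-4o-mini"
# supplied at inference time via the request's dynamic credentials
api_key_location = "dynamic::customer_openai_key"
```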

Other Notable Changes

  • Improve errors and logs in gateway.

- Rust
Published by GabrielBianconi about 1 year ago

tensorzero - v2024.11.2

Notable Changes

  • Fixed edge case in Python client

- Rust
Published by GabrielBianconi over 1 year ago

tensorzero - v2024.11.1

Notable Changes

  • Minor improvements to error handling in the gateway

- Rust
Published by GabrielBianconi over 1 year ago

tensorzero - v2024.11.0

New Features

  • TensorZero is now (mostly) compatible with the OpenAI Python client (openai-python)
  • Recipe for fine-tuning on demonstrations using Fireworks

Other Notable Changes

  • Added version to the gateway's status handler
  • Minor improvements to the TensorZero Python client

- Rust
Published by GabrielBianconi over 1 year ago

tensorzero - v2024.10.1

New Features

  • Exposed top_p, presence_penalty, and frequency_penalty inference parameters
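These parameters are set per variant type in the inference request's params object. A sketch of a request body (the function name is hypothetical):

```json
{
  "function_name": "draft_email",
  "input": {
    "messages": [{ "role": "user", "content": "Hello!" }]
  },
  "params": {
    "chat_completion": {
      "top_p": 0.9,
      "presence_penalty": 0.2,
      "frequency_penalty": 0.1
    }
  }
}
```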

- Rust
Published by GabrielBianconi over 1 year ago

tensorzero - v2024.10.0

New Features

  • Mixture-of-n inference-time optimization (variant type) [Example]
  • Support for dynamic credentials: provider credentials (e.g. API keys) defined at inference time
  • Configurable retries for variants
  • Model provider for Google AI Studio Gemini
  • User-defined tags for feedback

Other Notable Changes

  • Improved latency of feedback endpoint
  • Started storing more information (hydrated inputs and outputs) for observability

& multiple improvements under the hood.

- Rust
Published by GabrielBianconi over 1 year ago

tensorzero - v2024.09.3

Notable Changes

  • Added strict generation (structured outputs) for OpenAI's JSON Mode.

- Rust
Published by GabrielBianconi over 1 year ago

tensorzero - v2024.09.2

New Features

  • Added a synchronous Python client to the tensorzero library: TensorZeroGateway
  • Added support for the OpenAI o1 family of models

Other Notable Changes

  • Made variant weights optional (defaults to 0)

& multiple improvements under the hood.

- Rust
Published by GabrielBianconi over 1 year ago

tensorzero - v2024.09.1

Breaking Changes

  • Split the Inference table into one table per function type (ChatInference and JsonInference). Data recorded during the TensorZero preview release won't automatically appear in the new tables and should be migrated manually if relevant (the old table won't be deleted).

New Features

  • Support Anthropic models through GCP Vertex AI (gcp_vertex_anthropic)
  • Add dynamic output schemas for JSON functions (output schemas provided at inference time)

Other Notable Changes

  • Store the raw request content in the ModelInference table instead of the parsed input (which is already stored in the ChatInference and JsonInference tables)
  • Make output schema optional for JSON functions (defaults to a schema that accepts any valid JSON)

- Rust
Published by GabrielBianconi over 1 year ago

tensorzero - v2024.09.0

Megumin learns Rust

- Rust
Published by GabrielBianconi over 1 year ago