https://github.com/spiceai/spiceai -

- Rust
Published by Jeadie 9 months ago

https://github.com/spiceai/spiceai - v1.6.0

Spice v1.6.0 (Aug 26, 2025)

Spice 1.6.0 upgrades DataFusion to v48, which reduces the memory footprint of expressions by ~50% for faster planning and lower memory usage, eliminates unnecessary projections in queries, optimizes string functions such as ascii and character_length for up to a 3x speedup, and accelerates unbounded aggregate window functions by 5.6x. The release also adds Kafka and MongoDB connectors for real-time streaming and NoSQL data acceleration, supports the OpenAI Responses API for advanced model interactions, including OpenAI-hosted tools such as web_search and code_interpreter, adds usage tier configuration to the OpenAI Embeddings Connector for higher throughput via increased concurrent requests, introduces Model2Vec embeddings for ultra-low-latency encoding, and extends the Amazon S3 Vectors engine to support multi-column primary keys.

What's New in v1.6.0

DataFusion v48 Highlights

Spice.ai is built on the DataFusion query engine. The v48 release brings:

Performance & Size Improvements 🚀: The memory footprint of expressions was reduced by ~50%, resulting in faster planning and lower memory usage, with planning times improved by 10-20%. Queries now contain fewer unnecessary projections. The string functions ascii and character_length were optimized, with character_length achieving up to a 3x speedup. Queries with unbounded aggregate window functions are 5.6x faster because unnecessary computation is avoided for results that are constant across partitions. The Expr struct size was reduced from 272 to 144 bytes.

New Features & Enhancements ✨: Support was added for ORDER BY ALL for easy ordering of all columns in a query.

See the Apache DataFusion 48.0.0 Blog for details.
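As a quick check on the Expr size figures quoted above, the arithmetic can be sketched in Python. The one-million-expression workload is a hypothetical number chosen for illustration; it does not come from the DataFusion release.

```python
# Expr struct sizes quoted in the DataFusion v48 highlights (bytes).
OLD_EXPR_BYTES = 272
NEW_EXPR_BYTES = 144

# Fractional reduction in the inline size of a single Expr.
reduction = (OLD_EXPR_BYTES - NEW_EXPR_BYTES) / OLD_EXPR_BYTES

# Hypothetical workload: one million Expr values held in memory.
# Only the inline struct size is counted, not any heap data an Expr owns.
expr_count = 1_000_000
saved_mib = expr_count * (OLD_EXPR_BYTES - NEW_EXPR_BYTES) / (1024 * 1024)

print(f"per-Expr reduction: {reduction:.1%}")
print(f"saved for {expr_count:,} Exprs: {saved_mib:.1f} MiB")
```

The per-Expr reduction works out to roughly 47%, which is consistent with the "~50%" figure in the highlights.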

Runtime Highlights

Amazon S3 Vectors Multi-Column Primary Keys: The Amazon S3 Vectors engine now supports datasets with multi-column primary keys. This enables vector indexes for datasets where more than one column forms the primary key, such as those splitting documents into chunks for retrieval contexts. For multi-column keys, Spice serializes the keys using arrow-json format, storing them as single string keys in the vector index.
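The idea of collapsing several key columns into one string key can be sketched as follows. This is a minimal illustration only: the release notes do not show the exact arrow-json layout Spice writes, so the compact JSON object used here, the helper names, and the `doc_id`/`chunk_id` columns are all assumptions.

```python
import json


def composite_key(row: dict, key_columns: list[str]) -> str:
    """Serialize the primary-key columns of a row into one string key.

    Assumption: a compact JSON object keyed by column name. The real
    arrow-json encoding used by Spice may differ.
    """
    return json.dumps({col: row[col] for col in key_columns}, separators=(",", ":"))


def split_key(key: str) -> dict:
    """Recover the original key columns from a serialized key."""
    return json.loads(key)


# Hypothetical chunked-document row with a two-column primary key.
row = {"doc_id": 42, "chunk_id": 3, "text": "a chunk of the document"}
key = composite_key(row, ["doc_id", "chunk_id"])
print(key)
print(split_key(key))
```

Because the serialized key round-trips back to its columns, a vector index that only stores a single string key per vector can still be joined to a dataset whose primary key spans several columns.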

Model2Vec Embeddings: Spice now supports model2vec static embeddings with a new model2vec embeddings provider, for sentence transformers up to 500x faster and 15x smaller, enabling scenarios requiring low latency and high-throughput encoding.

```yaml
embeddings:
  - from: model2vec:minishlab/potion-base-8M # HuggingFace model
    name: potion
  - from: model2vec:path/to/my/local/model # local model
    name: local
```

Learn more in the Model2Vec Embeddings documentation.

Kafka Data Connector: Use from: kafka:<topic> to ingest data directly from Kafka topics for integration with existing Kafka-based event streaming infrastructure, providing real-time data acceleration and query without additional middleware.

Example Spicepod.yml:

```yaml
datasets:
  - from: kafka:orders_events
    name: orders
    acceleration:
      enabled: true
      refresh_mode: append
    params:
      kafka_bootstrap_servers: server:9092
```

Learn more in the Kafka Data Connector documentation.

MongoDB Data Connector: Use from: mongodb:<dataset> to access and accelerate data stored in MongoDB, deployed on-premises or in the cloud.

Example spicepod.yml:

```yaml
datasets:
  - from: mongodb:my_dataset
    name: my_dataset
    params:
      mongodb_host: localhost
      mongodb_db: my_database
      mongodb_user: my_user
      mongodb_pass: password
```

Learn more in the MongoDB Data Connector documentation.

OpenAI Responses API Support: Spice now supports the OpenAI Responses API (/v1/responses), OpenAI's most advanced interface for generating model responses.

You can now make requests to any Responses-compatible model using the new /v1/responses endpoint.

Example curl request:

```bash
curl http://localhost:8090/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "input": "Tell me a three sentence bedtime story about Spice AI."
  }'
```
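The same request can be built from Python with only the standard library. This is a sketch that mirrors the curl example above; the helper name `build_responses_request` is hypothetical, and the request is only constructed here, not sent, so it does not need a running Spice runtime.

```python
import json
import urllib.request


def build_responses_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for the /v1/responses endpoint."""
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/responses",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_responses_request(
    "http://localhost:8090",  # default HTTP address used in the curl example
    "gpt-4.1",
    "Tell me a three sentence bedtime story about Spice AI.",
)
print(req.get_method(), req.full_url)

# To send it against a running runtime:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```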

To use responses in spice chat, use the --responses flag.

Example:

```bash
spice chat --responses # Use the `/v1/responses` endpoint for all completions instead of `/v1/chat/completions`
```

Use OpenAI-hosted tools supported by OpenAI's Responses API by specifying the openai_responses_tools parameter.

Example spicepod.yml:

```yaml
models:
  - name: test
    from: openai:gpt-4.1
    params:
      openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }
      tools: sql, list_datasets
      openai_responses_tools: web_search, code_interpreter # 'code_interpreter' or 'web_search'
```

These OpenAI-specific tools are only available from the /v1/responses endpoint. Any other tools specified via the tools parameter are available from both the /v1/chat/completions and /v1/responses endpoints.

Learn more in the OpenAI Model Provider documentation.

OpenAI Embeddings & Models Connectors Usage Tier: The OpenAI Embeddings and Models Connectors now support specifying the account usage tier for embeddings and model requests. This improves the performance of generating text embeddings or calling models during dataset load and search by increasing the number of concurrent requests.

Example spicepod.yml:

```yaml
embeddings:
  - from: openai:text-embedding-3-small
    name: openai_embed
    params:
      openai_usage_tier: tier1
```

When the usage tier is set to match the tier of your OpenAI account, the Embeddings and Models Connectors raise the maximum number of concurrent requests to the limit for that tier.
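The effect of a tier-based concurrency cap can be illustrated with a semaphore. This is a sketch of the general technique, not Spice's rate controller: the `TIER_CONCURRENCY` numbers are made up for illustration, and the `embed` coroutine only sleeps in place of a real API call.

```python
import asyncio

# Hypothetical mapping from usage tier to maximum concurrent requests.
# These values are illustrative only and are not OpenAI's or Spice's limits.
TIER_CONCURRENCY = {"free": 1, "tier1": 4, "tier2": 8}


async def embed_all(texts: list[str], tier: str) -> tuple[list[int], int]:
    """Run one fake embedding call per text, capped by the tier's concurrency."""
    limit = asyncio.Semaphore(TIER_CONCURRENCY[tier])
    in_flight = 0
    peak = 0

    async def embed(text: str) -> int:
        nonlocal in_flight, peak
        async with limit:  # blocks once the tier's cap is reached
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for the HTTP request
            in_flight -= 1
            return len(text)  # stand-in for an embedding vector

    results = await asyncio.gather(*(embed(t) for t in texts))
    return list(results), peak


results, peak = asyncio.run(embed_all(["spice"] * 10, "tier1"))
print(f"{len(results)} requests completed, peak concurrency {peak}")
```

A higher tier raises the semaphore's capacity, so more requests overlap and a large dataset load finishes sooner without exceeding the account's limits.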

Learn more in the OpenAI Model Provider documentation.

Contributors

New Contributors

@krinart made their first contribution in github.com/spiceai/spiceai/pull/6573

Breaking Changes

No breaking changes.

Cookbook Updates

Added OpenAI Responses API - Use OpenAI's Responses API with Spice
Added Live Orders Analytics with Apache Kafka Data Connector - Combine real-time data streaming from Kafka with other datasets
Added MongoDB Data Connector - Use MongoDB as a data source with Spice

The Spice Cookbook includes 77 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.6.0, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.6.0 image:

```console
docker pull spiceai/spiceai:1.6.0
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

AWS Marketplace:

🎉 Spice is also now available in the AWS Marketplace!

What's Changed

Dependencies

DataFusion: Upgraded to v48
Rust: Upgraded from 1.86.0 to 1.87.0

Changelog

Support Streaming with Tool Calls (#6941) by @Advayp in #6941
Fix parameterized query planning in DataFusion (#6942) by @Jeadie in #6942
Update the UnableToLoadCredentials error with a pointer to docs (#6937) by @phillipleblanc in #6937
Fix spicecloud benchmark (#6935) by @krinart in #6935
[Debezium] Support for VariableScaleDecimal (#6934) by @krinart in #6934
Update to DF 48 (#6665) by @mach-kernel and @kczimm in #6665
Mark append-stream and CDC datasets as ready after first message (#6914) by @sgrebnov in #6914
Model2Vec embedding model support (#6846) by @mach-kernel in #6846
Update snapshot for S3 vector search test (#6920) by @Jeadie in #6920
remove [] from queryset in spicepod path for CI (#6919) by @Jeadie in #6919
Remove verbose tracing (#6915) by @Jeadie in #6915
Refactor how models supporting the Responses API are loaded (#6912) by @Advayp in #6912
Write tests for truncate formatting in arrow_tools and fix bug. (#6900) by @Jeadie in #6900
Support using the Responses API from spice chat (#6894) by @Advayp in #6894
Include GPT-5 into Text-To-SQL and Financebench benchmarks (#6907) by @sgrebnov in #6907
Better error message when credentials aren't loaded for S3 Vectors (#6910) by @phillipleblanc in #6910
Add tracing and system prompt support for the Responses API (#6893) by @Advayp in #6893
Constraint violation check is improved to control behavior when violations occur within a batch (#6897) by @phillipleblanc in #6897
fix: Multi-column text search with v1/search (#6905) by @peasee in #6905
fix: Correctly project text search primary keys to underlying projection (#6904) by @peasee in #6904
fix: Update benchmark snapshots (#6901) by @app/github-actions in #6901
In S3vector, do not pushdown on non-filterable columns (#6884) by @Jeadie in #6884
Run E2E Test CI macOS build on bigger runners (#6896) by @phillipleblanc in #6896
Enable configuration of the Responses API for the Azure model provider (#6891) by @Advayp in #6891
fix: Update benchmark snapshots (#6888) by @app/github-actions in #6888
Update OpenAPI specification for /v1/responses (#6889) by @Advayp in #6889
Add test to ensure tools are injected correctly in the Responses API (#6886) by @Advayp in #6886
Enable embeddings for append streams (#6878) by @sgrebnov in #6878
Show correct limit for EXPLAIN plans in S3VectorsQueryExec (#6852) by @Jeadie in #6852
Responses API support for Azure Open AI (#6879) by @Advayp in #6879
fix: Update search test case structure (#6865) by @peasee in #6865
Fix mongodb benchmark (#6883) by @phillipleblanc in #6883
Support multiple column primary keys for S3 vectors. (#6775) by @Jeadie in #6775
Kafka Data Connector: persist consumer between restarts (#6870) by @sgrebnov in #6870
Fix newlines in errors added in recent PRs (#6877) by @phillipleblanc in #6877
Add override parameter to force support for the Responses API (#6871) by @Advayp in #6871
Don't use metadata columns in VectorScanTableProvider (#6854) by @Jeadie in #6854
Add non-streaming tool call support (hosted and Spice tools) via the Responses API (#6869) by @Advayp in #6869
Update error guideline to remove newlines + remove newlines from error messages. (#6866) by @phillipleblanc in #6866
Remove void acceleration engine + optional table behaviors (#6868) by @phillipleblanc in #6868
Kafka Data Connector basic support (#6856) by @sgrebnov in #6856
Federated+Accelerated TPCH Benchmarks for MongoDB (#6788) by @krinart in #6788
Pass embeddings calculated in compute_index to the acceleration (#6792) by @phillipleblanc in #6792
Add non-streaming and streaming support for OpenAI Responses API endpoint (#6830) by @Advayp in #6830
Use latest version of OpenAI crate to resolve issues with Service Tier deserialization (#6853) by @Advayp in #6853
Update openapi.json (#6799) by @app/github-actions in #6799
Improve management message (#6850) by @lukekim in #6850
fix: Include FTS search column if it is the PK (#6836) by @peasee in #6836
Refactor Health Checks (#6848) by @Advayp in #6848
Introduce a Responses trait and LLM registry for model providers that support the OpenAI Responses API (#6798) by @Advayp in #6798
fix: Update datafusion-table-providers to include constraints (#6837) by @peasee in #6837
Bump postcard from 1.1.2 to 1.1.3 (#6841) by @app/dependabot in #6841
Bump governor from 0.10.0 to 0.10.1 (#6835) by @app/dependabot in #6835
Bump ctor from 0.2.9 to 0.5.0 (#6827) by @app/dependabot in #6827
Bump azure_core from 0.26.0 to 0.27.0 (#6826) by @app/dependabot in #6826
Bump rstest from 0.25.0 to 0.26.1 (#6825) by @app/dependabot in #6825
Use latest commit in our fork of async-openai (#6829) by @Advayp in #6829
Bump rustls from 0.23.27 to 0.23.31 (#6824) by @app/dependabot in #6824
Bump async-trait from 0.1.88 to 0.1.89 (#6823) by @app/dependabot in #6823
Bump hyper from 1.6.0 to 1.7.0 (#6814) by @app/dependabot in #6814
Bump serde_json from 1.0.140 to 1.0.142 (#6812) by @app/dependabot in #6812
Add s3 vector test retrieving vectors (#6786) by @Jeadie in #6786
fix: Allow v1/search with only FTS (#6811) by @peasee in #6811
Bump tantivy from 0.24.1 to 0.24.2 (#6806) by @app/dependabot in #6806
Bump tokio-util from 0.7.15 to 0.7.16 (#6810) by @app/dependabot in #6810
fix: Improve FTS index primary key handling (#6809) by @peasee in #6809
Bump logos from 0.15.0 to 0.15.1 (#6808) by @app/dependabot in #6808
Bump hf-hub from 0.4.2 to 0.4.3 (#6807) by @app/dependabot in #6807
Bump odbc-api from 13.0.1 to 13.1.0 (#6803) by @app/dependabot in #6803
fix: Spice search CLI with FTS supports string or slice unmarshalling (#6805) by @peasee in #6805
Bump uuid from 1.17.0 to 1.18.0 (#6797) by @app/dependabot in #6797
Bump reqwest from 0.12.22 to 0.12.23 (#6796) by @app/dependabot in #6796
Bump anyhow from 1.0.98 to 1.0.99 (#6795) by @app/dependabot in #6795
Bump clap from 4.5.41 to 4.5.45 (#6794) by @app/dependabot in #6794
Respect default MAX_DECODING_MESSAGE_SIZE (100MB) in Flight API (#6802) by @sgrebnov in #6802
Fix compilation errors caused by upgrading async-openai (#6793) by @Advayp in #6793
Remove outdated vector search benchmark (replaced with testoperator) (#6791) by @sgrebnov in #6791
Handle errors in vector ingestion pipeline (#6782) by @phillipleblanc in #6782
fix: Explicitly error when chunking is defined for vector engines (#6787) by @peasee in #6787
Make VectorScanTableProvider and VectorQueryTableProvider support multi-column primary keys (#6757) by @Jeadie in #6757
Use megascience/megascience Q+A dataset for text search testing. (#6702) by @Jeadie in #6702
Flight REPL autocomplete (#6589) by @krinart in #6589
use ref: github.event.pull_request.head.sha in integration_models.yml (#6780) by @Jeadie in #6780
fix: Move search telemetry calls in UDTF to scan (#6778) by @peasee in #6778
Fix Hugging Face models and embeddings loading in Docker (#6777) by @ewgenius in #6777
feat: Migrate bedrock rate limiter (#6773) by @peasee in #6773
Run the PR checks on the DEV runners (#6769) by @phillipleblanc in #6769
feat: add OpenAI models rate controller (#6767) by @peasee in #6767
Implement MongoDB data connector (#6594) by @krinart in #6594
fix: Use head ref for concurrency group (#6770) by @peasee in #6770
fix: Run enforce pulls with spice on pull_request_target (#6768) by @peasee in #6768
feat: Add OpenAI Embeddings Rate Controller (#6764) by @peasee in #6764
Move AWS SDK credential bridge integration test to the existing AWS SDK integration test run (#6766) by @phillipleblanc in #6766
Use Spice specific errors instead of OpenAIError in embedding module (#6748) by @kczimm in #6748
Use context in Glue Catalog Provider (#6763) by @Advayp in #6763
pin cargo-deny to previous version (#6762) by @kczimm in #6762
Bump actions/download-artifact from 4 to 5 (#6720) by @app/dependabot in #6720
Upgrade dependabot dependencies (#6754) by @phillipleblanc in #6754
Set E2E Test CI models build to 90 minute timeout (#6756) by @phillipleblanc in #6756
chore: upgrade to Rust 1.87.0 (#6614) by @kczimm in #6614
feat: Add initial runtime-rate-limiter crate (#6753) by @peasee in #6753
feat: Add more embedding traces, add MiniLM MTEB spicepod (#6742) by @peasee in #6742
Update QA analytics for release (#6740) by @Advayp in #6740
Always use 'returnData: true' for s3 vector query index (#6741) by @Jeadie in #6741
feat: Add Embedding and Search anonymous telemetry (#6737) by @peasee in #6737
Add 1.5.2 to SECURITY.md (#6739) by @ewgenius in #6739
Combine the Iceberg and Object Store AWS SDK bridges into one crate (#6718) by @Advayp in #6718
Updates to v1.5.2 release notes (#6736) by @lukekim in #6736
Update end game template - move glue catalog to catalogs section (#6732) by @ewgenius in #6732
Update v1.5.2.md (#6735) by @kczimm in #6735
Add note about S3 Vectors workaround (#6734) by @phillipleblanc in #6734
feat: Avoid joining for VectorScanTableProvider if the index is sufficient (#6714) by @peasee in #6714
update changelog (#6729) by @kczimm in #6729
remove unneeded autogenerated s3 vector code (#6715) by @Jeadie in #6715
fix: Set S3 vectors default limit to 30, add more tracing (#6712) by @peasee in #6712
docs: Add Hadoop cookbook to endgame template (#6708) by @peasee in #6708
Fix testoperator append mode compilation error (#6706) by @phillipleblanc in #6706
test: Add VectorScanTableProvider snapshot tests (#6701) by @peasee in #6701
feat: Add Hadoop catalog-mode benchmark (#6684) by @peasee in #6684
Move shared AWS crates used in bridges to workspace (#6705) by @Advayp in #6705
Use installation id to group connections (#6703) by @Advayp in #6703
Add Guardrails for AWS bedrock models (#6692) by @Jeadie in #6692
Update bedrock keys for CI. (#6693) by @Jeadie in #6693
Update acknowledgements (#6690) by @app/github-actions in #6690
ROADMAP updates Aug 1, 2025 (#6667) by @lukekim in #6667
Add retry logic for OpenAI embeddings creation (#6656) by @sgrebnov in #6656
Make models E2E chat test more robust (#6657) by @sgrebnov in #6657
Update Search GH Workflow to use Test Operator (#6650) by @sgrebnov in #6650
Score and P95 latency calculation for MTEB Quora-based vector search tests in Test Operator (#6640) by @sgrebnov in #6640
Fix multiple query error being classified as an internal error (#6635) by @Advayp in #6635
Add Support for S3 Table Buckets (#6573) by @krinart in #6573
set MISTRALRS_METAL_PRECOMPILE=0 for metal (#6652) by @kczimm in #6652
Vector search to push down udtf limit argument into logical sort plan (#6636) by @mach-kernel in #6636
docs: Update qa_analytics.csv (#6643) by @peasee in #6643
Update SECURITY.md (#6642) by @Jeadie in #6642
docs: Update qa_analytics.csv (#6641) by @peasee in #6641
Separate token usage (#6619) by @Advayp in #6619
Fix typo in release notes (#6634) by @Advayp in #6634
Add environment variable for org token (#6633) by @Advayp in #6633
CDC: Compute embeddings on ingest (#6612) by @mach-kernel in #6612
Add view name to view creation errors (#6611) by @lukekim in #6611
Add core logic for running MTEB Quora-based vector search tests in Test Operator (#6607) by @sgrebnov in #6607
Revert "Update generate-openapi.yml (#6584)" (#6620) by @Jeadie in #6620
Non-accelerated views should report as ready only after all dependent datasets are ready (#6617) by @sgrebnov in #6617

Published by sgrebnov 9 months ago

https://github.com/spiceai/spiceai - v1.5.2

Spice v1.5.2 (Aug 4, 2025)

Spice v1.5.2 introduces a new Amazon Bedrock Models Provider for converse API (Nova) compatible models, AWS Redshift support using the Postgres data connector, and Hadoop Catalog Support for Iceberg tables along with several bug fixes and improvements.

What's New in v1.5.2

Amazon Bedrock Models Provider: Adds a new Amazon Bedrock LLM Provider. Models compatible with the Converse API (Nova) are supported.

Amazon Bedrock provides access to a range of foundation models for generative AI. Spice supports using Bedrock-hosted models by specifying the bedrock prefix in the from field and configuring the required parameters.

Supported Model IDs:

us.amazon.nova-lite-v1:0
us.amazon.nova-micro-v1:0
us.amazon.nova-premier-v1:0
us.amazon.nova-pro-v1:0

Refer to the Amazon Bedrock documentation for details on available models.

Example Spicepod.yaml:

```yaml
models:
  - from: bedrock:us.amazon.nova-lite-v1:0
    name: novash
    params:
      aws_region: us-east-1
      aws_access_key_id: ${ secrets:AWS_ACCESS_KEY_ID }
      aws_secret_access_key: ${ secrets:AWS_SECRET_ACCESS_KEY }
      bedrock_guardrail_identifier: arn:aws:bedrock:abcdefg012927:0123456789876:guardrail/hello
      bedrock_guardrail_version: DRAFT
      bedrock_trace: enabled
      bedrock_temperature: 42
```

For more information, see the Amazon Bedrock Documentation.

AWS Redshift Support for Postgres Data Connector: Spice now supports connecting to Amazon Redshift using the PostgreSQL data connector. Redshift is a columnar OLAP database compatible with PostgreSQL, allowing you to use the same connector and configuration parameters.

To connect to Redshift, use the format postgres:schema.table in your Spicepod and set the connection parameters to match your Redshift cluster settings.

Example Spicepod.yaml:

```yaml
# Example datasets for Redshift TPCH tables
datasets:
  - from: postgres:public.customer
    name: customer
    params:
      pg_host: ${secrets:PGHOST}
      pg_port: 5439
      pg_sslmode: prefer
      pg_db: dev
      pg_user: ${secrets:PGUSER}
      pg_pass: ${secrets:PGPASS}
  - from: postgres:public.lineitem
    name: lineitem
    params:
      pg_host: ${secrets:PGHOST}
      pg_port: 5439
      pg_sslmode: prefer
      pg_db: dev
      pg_user: ${secrets:PGUSER}
      pg_pass: ${secrets:PGPASS}
```

Redshift types are mapped to PostgreSQL types. See the PostgreSQL connector documentation for details on supported types and configuration.

Hadoop Catalog Support for Iceberg: The Iceberg Data and Catalog connectors now support connecting to Hadoop catalogs on local filesystem (file://) or S3 object storage (s3://, s3a://). This enables connecting to Iceberg catalogs without a separate catalog provider service.

Example Spicepod.yaml:

```yaml
catalogs:
  - from: iceberg:file:///tmp/hadoopwarehouse/
    name: localhadoop
  - from: iceberg:s3://my-bucket/hadoopwarehouse/
    name: s3hadoop

# Example datasets
datasets:
  - from: iceberg:file:///data/hadoopwarehouse/test/mytable1
    name: localhadoop
  - from: iceberg:s3://my-bucket/hadoopwarehouse/test/mytable2
    name: s3hadoop
```

For more details, see the Iceberg Data Connector documentation and the Iceberg Catalog Connector documentation.

Contributors

Breaking Changes

N/A

Cookbook Updates

Added Amazon Redshift Support to the Postgres Data Connector cookbook: Connect to tables in Amazon Redshift.

The Spice Cookbook includes 75 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.2, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.5.2 image:

```console
docker pull spiceai/spiceai:1.5.2
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

AWS Marketplace:

🎉 Spice is also now available in the AWS Marketplace!

What's Changed

Dependencies

No major dependency updates.

Changelog

fixes for databricks OpenAI compatibility (#6629) by @Jeadie in #6629
Update spicepod.schema.json (#6632) by @app/github-actions in #6632
Remove 'stream_options' from databricks LLMs (#6637) by @Jeadie in #6637
Move retry and rate limiting logic for Amazon bedrock out of embeddings. (#6626) by @Jeadie in #6626
Disable Metal precompilation in integration_llms.yml (#6649) by @Jeadie in #6649
fix: Hadoop integration test (#6660) by @peasee in #6660
feat: Add Hadoop Catalog Data Component (#6658) by @peasee in #6658
update datafusion-table-providers to latest spiceai tag (#6661) by @mach-kernel in #6661
feat: Add Hadoop Catalog connectors for Iceberg (#6659) by @peasee in #6659
Make FullTextSearchExec robust to RecordBatch column ordering. (#6675) by @Jeadie in #6675
Make 'runtime-object-store' crate (#6674) by @Jeadie in #6674
fix: Support include for Iceberg (#6663) by @peasee in #6663
feat: Add Hadoop TPCH benchmark (#6678) by @peasee in #6678
feat: Add Hadoop metadata_path parameter (#6680) by @peasee in #6680
fix: Automatically infer Hadoop warehouse scheme (#6681) by @peasee in #6681
Amazon Bedrock, specifically Nova models (#6673) by @Jeadie in #6673
fix perplexity_auth_token parameters for web_search (#6685) by @Jeadie in #6685
Fix AWS Auth issue (#6699) by @Advayp in #6699
Limit Concurrent Requests for GitHub (#6672) by @Advayp in #6672
Add runtime parameter to enable more permissive parquet reading when page indexes are missing (#6716) by @phillipleblanc in #6716
Improve Flight REPL error messages (#6696) by @lukekim in #6696
Fixes from search tests (#6710) by @Jeadie in #6710

Published by ewgenius 10 months ago

https://github.com/spiceai/spiceai - v1.5.1

Spice v1.5.1 (July 28, 2025)

Spice v1.5.1 expands the GitHub data connector to include pull-request comments, adds configurable rate limiting for AWS Bedrock embedding models, expands partition pruning with inequality operators, and adds client-supplied cache keys for granular caching control in the HTTP and Arrow Flight SQL APIs.

What's New in v1.5.1

GitHub Data Connector Pull Request Comments: Configure GitHub pulls datasets to include comments.

Example Spicepod.yaml:

```yaml
datasets:
  - from: github:github.com/spiceai/spiceai/pulls
    name: spiceai.pulls
    params:
      include_comments: all # 'review', 'discussion', or 'none'. Defaults to 'none'.
      max_comments_fetched: '25' # Defaults to 100
      # ...
```

For details, see the GitHub Data Connector documentation.

AWS Bedrock Embedding Models Invocation Control: Improved rate limiting control for AWS Bedrock embedding models with max_concurrent_invocations configuration.

```yaml
embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      max_concurrent_invocations: '41'
      # ...
```

For details, see the AWS Bedrock Embeddings Model Provider documentation.

Improved Query Partitioning: Expanded partition pruning support with additional inequality operators (e.g. >, >=, <, <=).

For details, see the Query Partitioning documentation.
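The general idea behind pruning with inequality operators can be sketched as below. This illustrates the technique only and is not Spice's implementation: the partition layout (one partition value mapped to a list of files), the file names, and the `prune` helper are hypothetical.

```python
import operator

# Comparison operators a pruning step might understand.
OPS = {
    "=": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def prune(partitions: dict[int, list[str]], op: str, literal: int) -> dict[int, list[str]]:
    """Keep only the partitions whose value can satisfy `column <op> literal`."""
    compare = OPS[op]
    return {value: files for value, files in partitions.items() if compare(value, literal)}


# Hypothetical dataset partitioned by year.
partitions = {
    2021: ["year=2021/part-0.parquet"],
    2022: ["year=2022/part-0.parquet"],
    2023: ["year=2023/part-0.parquet"],
    2024: ["year=2024/part-0.parquet"],
}

# WHERE year >= 2023 only needs to read two of the four partitions.
kept = prune(partitions, ">=", 2023)
print(sorted(kept))
```

With only equality pruning, a range predicate such as `year >= 2023` would have to scan every partition; supporting inequalities lets the scan skip partitions that cannot match.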

Client-Supplied Cache Keys: Support for a new Spice-Cache-Key header/metadata-key in the HTTP and Arrow Flight SQL query APIs for fine-grained client-side caching control.

Example HTTP API usage:

```bash
$ curl -vvS -XPOST http://localhost:8090/v1/sql \
  -H"spice-cache-key: 1851400_20170216_north_america" \
  -d "select * from scihub_journals_accessed where user_id = '1851400' and date_trunc('DAY', timestamp) = '2017-02-16' and city = 'New York';"
```

Example Response:

```bash
< HTTP/1.1 200 OK
< content-type: application/json
< x-cache: Hit from spiceai
< results-cache-status: HIT
< vary: Spice-Cache-Key
< vary: origin, access-control-request-method, access-control-request-headers
< content-length: 604
< date: Wed, 23 Jul 2025 20:26:12 GMT
<
[{
    "timestamp": "2017-02-16 09:55:06",
    "doi": "10.1155/2012/650929",
    "ip_identifier": 1000856,
    "user_id": 1851400,
    "country": "United States",
    "city": "New York",
    "longitude": 40.7830603,
    "latitude": -73.9712488
  },
  ...
]
```

For details, see the Cache Control documentation.
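A client might derive the cache key from the query's parameters rather than hard-coding it. The sketch below reproduces the key used in the curl example; the `cache_key` and `build_sql_request` helpers and the `user_date_region` key scheme are assumptions for illustration, and the request is only built, not sent.

```python
import urllib.request
from datetime import date


def cache_key(user_id: int, day: date, region: str) -> str:
    """Derive a cache key from the values that determine the query result."""
    return f"{user_id}_{day:%Y%m%d}_{region}"


def build_sql_request(base_url: str, sql: str, key: str) -> urllib.request.Request:
    """Build a POST to /v1/sql that carries a client-supplied cache key."""
    return urllib.request.Request(
        f"{base_url}/v1/sql",
        data=sql.encode("utf-8"),
        headers={"Spice-Cache-Key": key},
        method="POST",
    )


key = cache_key(1851400, date(2017, 2, 16), "north_america")
sql = (
    "select * from scihub_journals_accessed "
    "where user_id = '1851400' "
    "and date_trunc('DAY', timestamp) = '2017-02-16' "
    "and city = 'New York';"
)
req = build_sql_request("http://localhost:8090", sql, key)
print(key)

# To send it against a running runtime:
# with urllib.request.urlopen(req) as resp:
#     print(resp.headers.get("results-cache-status"))
```

Queries that differ textually but share the same key (for example, different cities within the same region) would then be served from the same cache entry, which is the kind of control the header is meant to give the client.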

Contributors

New Contributors

@varunguleriaCodes made their first contribution in github.com/spiceai/spiceai/pull/6383

Breaking Changes

N/A

Cookbook Updates

No new recipes added in this release.

The Spice Cookbook includes 74 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.1, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.5.1 image:

```console
docker pull spiceai/spiceai:1.5.1
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

No major dependency updates.

Changelog

Fix refresh via Api when dataset is already accelerated and no refresh interval is set by @sgrebnov in https://github.com/spiceai/spiceai/pull/6549
Add support for custom GraphQL unnesting behavior by @Advayp in https://github.com/spiceai/spiceai/pull/6540
Regex Update to disallow hyphens dataset names by @varunguleriaCodes in https://github.com/spiceai/spiceai/pull/6383
Enforce max limit on comments fetched per PR by @Advayp in https://github.com/spiceai/spiceai/pull/6580
Fix accelerated refresh issue by @Advayp in https://github.com/spiceai/spiceai/pull/6590
Enable configurations of max invocations for Bedrock models by @Advayp in https://github.com/spiceai/spiceai/pull/6592
Client-supplied cache keys (Spice-Cache-Key) by @mach-kernel in https://github.com/spiceai/spiceai/pull/6579
Improved partition pruning by @kczimm in https://github.com/spiceai/spiceai/pull/6582
Fix retention filter when both retention_sql and period are set by @sgrebnov in https://github.com/spiceai/spiceai/pull/6595
Initial support for PR comments by @Advayp in https://github.com/spiceai/spiceai/pull/6569
chore: Update croner by @peasee in https://github.com/spiceai/spiceai/pull/6547
fix databricks streaming for Claude model by @peasee in https://github.com/spiceai/spiceai/pull/6601
Remove FullTextUDTFAnalyzerRule and move FTS code into search crate by @jeadie in https://github.com/spiceai/spiceai/pull/6596
Remove download of legacy sentence transformers config by @jeadie in https://github.com/spiceai/spiceai/pull/6605
re-add snapshot tests by @jeadie
Embedding column config to support client-specified vector sizes by @mach-kernel in https://github.com/spiceai/spiceai/pull/6610
Fix mismatch in columns for the GitHub PR table type by @Advayp in https://github.com/spiceai/spiceai/pull/6616
bump version to 1.5.1 by @phillipleblanc
fix issues with cherry-picking by @jeadie
Add integration tests for GitHub PRs with comments by @Advayp in https://github.com/spiceai/spiceai/pull/6581
Add view name to view creation errors by @lukekim in https://github.com/spiceai/spiceai/pull/6611
CDC: Compute embeddings on ingest by @mach-kernel in https://github.com/spiceai/spiceai/pull/6612

Published by Jeadie 10 months ago

https://github.com/spiceai/spiceai - v1.5.0

Published by ewgenius 10 months ago

https://github.com/spiceai/spiceai - v1.5.0-rc.3

Spice v1.5.0-rc.3 (July 16, 2025)

This is the third release candidate for v1.5.0, building on the capabilities introduced in v1.5.0-rc.2. This release introduces native support for Amazon S3 Vectors, enabling petabyte-scale vector search directly from S3 vector buckets, alongside SQL-integrated vector and full-text search, partitioning for DuckDB acceleration, and automated refreshes for search indexes and views. It also includes the AWS Bedrock Embeddings Model Provider, the Oracle Database connector, the now-stable Spice.ai Cloud Data Connector, and an upgrade to DuckDB v1.3.2.

What's New in v1.5.0-rc.3

Amazon S3 Vectors Support: Spice.ai now integrates with Amazon S3 Vectors, launched in public preview on July 15, 2025, enabling vector-native object storage with built-in indexing and querying. This integration supports semantic search, recommendation systems, and retrieval-augmented generation (RAG) at petabyte scale with S3’s durability and elasticity. Spice.ai manages the vector lifecycle—ingesting data, embedding it with models like Amazon Titan or Cohere via AWS Bedrock, or MiniLM L6 available from HuggingFace, and storing it in S3 Vector buckets.

Example Spicepod.yml configuration for S3 Vectors:

```yaml
datasets:
  - from: s3://my_vector_bucket/data/
    name: my_vectors
    params:
      file_format: parquet
    acceleration:
      enabled: true
    vectors:
      engine: s3_vectors
      params:
        s3_vectors_aws_region: us-east-2
        s3_vectors_bucket: my-s3-vectors-bucket
    columns:
      - name: content
        embeddings:
          - from: bedrock_titan
            row_id:
              - id
```

Example SQL query using S3 Vectors:

```sql
SELECT *
FROM vector_search(my_vectors, 'Cricket bats', 10)
WHERE price < 100
ORDER BY score
```

For more details, refer to the S3 Vectors Documentation.

Highlights in v1.5.0-rc.3

SQL-integrated Search: Vector and full-text search capabilities are now natively available in SQL queries, extending the power of the POST v1/search endpoint to all SQL workflows.

Example Vector-Similarity-Search (VSS) using the vector_search UDTF on the table reviews for the search term "Cricket bats":

```sql
SELECT review_id, review_text, review_date, score
FROM vector_search(reviews, 'Cricket bats')
WHERE country_code = 'AUS'
LIMIT 3
```

Example Full-Text-Search (FTS) using the text_search UDTF on the table reviews for the search term "Cricket bats":

```sql
SELECT review_id, review_text, review_date, score
FROM text_search(reviews, 'Cricket bats')
LIMIT 3
```

DuckDB v1.3.2 Upgrade: Upgraded DuckDB engine from v1.1.3 to v1.3.2. Key improvements include support for adding primary keys to existing tables, resolution of over-eager unique constraint checking for smoother inserts, and 13% reduced runtime on TPC-H SF100 queries through extensive optimizer refinements. The v1.2.x release of DuckDB was skipped due to a regression in indexes.

Read the DuckDB v1.2.0 announcement.
Read the DuckDB v1.3.0 announcement.

Partitioned Acceleration: DuckDB file-based accelerations now support partition_by expressions, enabling queries to scale to large datasets through automatic data partitioning and query predicate pruning. New UDFs, bucket and truncate, simplify partition logic.

New UDFs useful for partition_by expressions:

bucket(num_buckets, col): Partitions a column into a specified number of buckets based on a hash of the column value.
truncate(width, col): Truncates a column to a specified width, aligning values to the nearest lower multiple (e.g., truncate(10, 101) = 100).
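The semantics of the two UDFs can be sketched in a few lines of Python. This is an illustration only: the hash used by `bucket` is not stated in these notes, so CRC32 stands in as an assumed stable hash, and both functions operate on single values rather than columns.

```python
import zlib


def truncate(width: int, value: int) -> int:
    """Align value to the nearest lower multiple of width."""
    return value - (value % width)


def bucket(num_buckets: int, value) -> int:
    """Map a value to one of num_buckets buckets via a stable hash.

    CRC32 is a stand-in; the hash Spice actually uses is not stated here.
    """
    return zlib.crc32(str(value).encode("utf-8")) % num_buckets


aligned = truncate(10, 101)          # same inputs as the example above
slot = bucket(100, "account-42")     # hypothetical account_id value
```

Because both functions are deterministic, a query that filters on a single `account_id` value only needs the one partition that value maps to, which is what makes predicate pruning possible.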

Example Spicepod.yml configuration:

```yaml
datasets:
  - from: s3://my_bucket/some_large_table/
    name: my_table
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      partition_by: bucket(100, account_id) # Partition account_id into 100 buckets
```

Full-Text-Search (FTS) Index Refresh: Accelerated datasets with search indexes maintain up-to-date results with configurable refresh intervals.

Example refreshing search indexes on body every 10 seconds:

```yaml
datasets:
  - from: github:github.com/spiceai/docs/pulls
    name: spiceai.doc.pulls
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    acceleration:
      enabled: true
      refresh_mode: full
      refresh_check_interval: 10s
    columns:
      - name: body
        full_text_search:
          enabled: true
          row_id:
            - id
```

Scheduled View Refresh: Accelerated Views now support cron-based refresh schedules using refresh_cron, automating updates for accelerated data.

Example Spicepod.yml configuration:

```yaml
views:
  - name: my_view
    sql: SELECT 1
    acceleration:
      enabled: true
      refresh_cron: '0 * * * *' # Every hour
```

For more details, refer to Scheduled Refreshes.
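To make the schedule syntax concrete, here is a minimal sketch of how a five-field cron expression such as `'0 * * * *'` selects firing times. It is not Spice's scheduler: it assumes the standard minute, hour, day-of-month, month, day-of-week field order, supports only `*`, integers, and comma lists, and simply ANDs all fields together.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Match one cron field; supports '*', integers, and comma lists only."""
    if field == "*":
        return True
    return value in {int(part) for part in field.split(",")}


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if a 5-field cron expression fires at the given minute."""
    minute, hour, day, month, weekday = expr.split()
    # Cron weekdays count Sunday as 0; isoweekday() gives Sunday as 7.
    cron_weekday = when.isoweekday() % 7
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(day, when.day)
        and field_matches(month, when.month)
        and field_matches(weekday, cron_weekday)
    )


on_the_hour = cron_matches("0 * * * *", datetime(2025, 7, 14, 9, 0))
half_past = cron_matches("0 * * * *", datetime(2025, 7, 14, 9, 30))
```

With the hourly expression, only timestamps whose minute is zero match, so `on_the_hour` is true and `half_past` is false.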

Multi-column Vector Search: For datasets configured with embeddings on more than one column, POST v1/search and similarity_search perform parallel vector search on each column, aggregating results using reciprocal rank fusion.

Example Spicepod.yml for multi-column search:

```yaml
datasets:
  - from: github:github.com/apache/datafusion/issues
    name: datafusion.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    columns:
      - name: title
        embeddings:
          - from: hf_minilm
      - name: body
        embeddings:
          - from: openai_embeddings
```
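Reciprocal rank fusion combines per-column rankings without comparing raw similarity scores across embedding models. The sketch below shows the general technique, not Spice's implementation: the constant `k = 60` is a commonly used default assumed here, and the issue ids are made up for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of ids into one list ordered by RRF score.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


title_hits = ["issue-7", "issue-2", "issue-9"]  # hypothetical results for `title`
body_hits = ["issue-2", "issue-5", "issue-7"]   # hypothetical results for `body`
fused = reciprocal_rank_fusion([title_hits, body_hits])
```

Items that rank well in both columns rise to the top: `issue-2` and `issue-7` appear in both lists, so they outrank items found in only one.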

AWS Bedrock Embeddings Model Provider: Added support for AWS Bedrock embedding models, including Amazon Titan Text Embeddings and Cohere Text Embeddings.

Example Spicepod.yml:

```yaml
embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      aws_region: us-east-1
      input_type: search_document
      truncate: END
  - from: bedrock:amazon.titan-embed-text-v2:0
    name: titan-embeddings
    params:
      aws_region: us-east-1
      dimensions: '256'
```

For more details, refer to the AWS Bedrock Embedding Models Documentation.

Oracle Data Connector: Use from: oracle: to access and accelerate data stored in Oracle databases, deployed on-premises or in the cloud.

Example Spicepod.yml:

```yaml
datasets:
  - from: oracle:"SH"."PRODUCTS"
    name: products
    params:
      oracle_host: 127.0.0.1
      oracle_username: scott
      oracle_password: tiger
```

See the Oracle Data Connector documentation.

Spice.ai Cloud Data Connector: Graduated to Stable.

Contributors

Breaking Changes

Search HTTP API Response: POST v1/search response payload has changed. See the new API documentation for details.
Model Provider Parameter Prefixes: Model Provider parameters use provider-specific prefixes instead of openai_ prefixes (e.g., hf_temperature for HuggingFace, anthropic_max_completion_tokens for Anthropic, perplexity_tool_choice for Perplexity). The openai_ prefix remains supported for backward compatibility but is deprecated and will be removed in a future release.
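The prefix change can be pictured as a lookup with a deprecated fallback. The following is a sketch under assumptions, not the runtime's code: `resolve_param` is a hypothetical helper, and the precedence (provider-specific key first, then the `openai_` key) is inferred from the description above.

```python
import warnings


def resolve_param(params: dict, provider_prefix: str, name: str, default=None):
    """Look up a model parameter, preferring the provider-specific prefix."""
    preferred = f"{provider_prefix}_{name}"
    if preferred in params:
        return params[preferred]
    legacy = f"openai_{name}"
    if legacy in params:
        # Still accepted for backward compatibility, but deprecated.
        warnings.warn(
            f"'{legacy}' is deprecated; use '{preferred}'", DeprecationWarning
        )
        return params[legacy]
    return default


new_style = resolve_param({"hf_temperature": "0.2"}, "hf", "temperature")
old_style = resolve_param({"openai_temperature": "0.9"}, "hf", "temperature")
```

Existing Spicepods that still use `openai_` keys keep working under this model, while new configurations can adopt the provider-specific names.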

Cookbook Updates

Added Oracle Data Connector cookbook: Connect to tables in Oracle databases.
Added Hashed Partitioning with DuckDB cookbook: Accelerate data on large datasets by partitioning data into a fixed number of buckets.

The Spice Cookbook now includes 72 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.0-rc.3, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.5.0-rc.3 or pull the v1.5.0-rc.3 Docker image (spiceai/spiceai:1.5.0-rc.3).

What's Changed

Dependencies

delta_kernel: Upgraded to v0.12.1
DuckDB: Upgraded from v1.1.3 to v1.3.2
iceberg-rust: Upgraded from v0.4.0 to v0.5.1

Changelog

v1.5.0-rc.2 release notes (#6440) by @lukekim in #6440
Amazon S3 Vectors support by @lukekim in #6468

- Rust
Published by phillipleblanc 11 months ago

https://github.com/spiceai/spiceai - v1.5.0-rc.2

Spice v1.5.0-rc.2 (July 14, 2025)

This is the second release candidate for v1.5.0, which introduces SQL-integrated vector and full-text search, partitioning for DuckDB acceleration, and automated refreshes for search indexes and views. It adds a new AWS Bedrock Embeddings Model Provider, a new Oracle Database connector, and promotes the Spice.ai Cloud Data Connector to stable, alongside multi-column vector search for expanded search. This release also upgrades DuckDB from v1.1.3 to v1.3.2, accelerating Spice.ai datasets with improved indexes, query performance, and internal storage optimizations.

What's New in v1.5.0-rc.2

SQL-integrated Search: Vector and full-text search capabilities are now natively available in SQL queries, extending the power of the POST v1/search endpoint to all SQL workflows.

Example Vector-Similarity-Search (VSS) using the new vector_search UDTF on the table reviews for the search term "Cricket bats".

```sql
SELECT review_id, review_text, review_date, score
FROM vector_search(reviews, 'Cricket bats')
WHERE country_code = 'AUS'
LIMIT 3
```

Example Full-Text-Search (FTS) using the new text_search UDTF on the table reviews for the search term "Cricket bats".

```sql
SELECT review_id, review_text, review_date, score
FROM text_search(reviews, 'Cricket bats')
LIMIT 3
```

DuckDB v1.3.2 Upgrade: Upgraded DuckDB engine from v1.1.3 to v1.3.2. Key improvements include support for adding primary keys to existing tables, resolution of over-eager unique constraint checking for smoother inserts, and 13% reduced runtime on TPC-H SF100 queries through extensive optimizer refinements. The v1.2.x release of DuckDB was skipped due to a regression in indexes.

Read the DuckDB v1.2.0 announcement.
Read the DuckDB v1.3.0 announcement.

Partitioned Acceleration: DuckDB file-based accelerations now support partition_by expressions, enabling queries to scale to large datasets through automatic data partitioning and query predicate pruning. New UDFs, bucket and truncate, simplify partition logic.

New UDFs useful for partition_by expressions:

bucket(num_buckets, col): Partitions a column into a specified number of buckets based on a hash of the column value.
truncate(width, col): Truncates a column to a specified width, aligning values to the nearest lower multiple (e.g., truncate(10, 101) = 100).

Example Spicepod.yml configuration:

```yaml
datasets:
  - from: s3://my_bucket/some_large_table/
    name: my_table
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      partition_by: bucket(100, account_id) # Partition account_id into 100 buckets
```

Full-Text-Search (FTS) Index Refresh: Accelerated datasets with search indexes maintain up-to-date results with configurable refresh intervals.

Example refreshing search indexes on body every 10 seconds (based on acceleration.refresh_check_interval).

```yaml
datasets:
  - from: github:github.com/spiceai/docs/pulls
    name: spiceai.doc.pulls
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    acceleration:
      enabled: true
      refresh_mode: full
      refresh_check_interval: 10s
    columns:
      - name: body
        full_text_search:
          enabled: true
          row_id:
            - id
```

Scheduled View Refresh: Accelerated Views now support cron-based refresh schedules using refresh_cron, automating updates for accelerated data.

Example Spicepod.yml configuration:

```yaml
views:
  - name: my_view
    sql: SELECT 1
    acceleration:
      enabled: true
      refresh_cron: '0 * * * *' # Every hour
```

For more details, refer to Scheduled Refreshes.

Multi-column Vector Search: For datasets configured with embeddings on more than one column, POST v1/search and similarity_search will perform parallel vector search on each column, and aggregate results using a reciprocal rank fusion scoring method.

Example Spicepod.yml where search results consider both the GitHub issue's title and the content of its body:

```yaml
datasets:
  - from: github:github.com/apache/datafusion/issues
    name: datafusion.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    columns:
      - name: title
        embeddings:
          - from: hf_minilm
      - name: body
        embeddings:
          - from: openai_embeddings
```

AWS Bedrock Embeddings Model Provider: Added support for AWS Bedrock embedding models, including Amazon Titan Text Embeddings and Cohere Text Embeddings.

Example Spicepod.yaml:

```yaml
embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      aws_region: us-east-1
      input_type: search_document
      truncate: END
  - from: bedrock:amazon.titan-embed-text-v2:0
    name: titan-embeddings
    params:
      aws_region: us-east-1
      dimensions: '256'
```

For more details, refer to the AWS Bedrock Embedding Models Documentation.

Oracle Data Connector: Use from: oracle: to access and accelerate data stored in Oracle databases, deployed on-premises or in the cloud.

Example Spicepod.yml:

```yaml
datasets:
  - from: oracle:"SH"."PRODUCTS"
    name: products
    params:
      oracle_host: 127.0.0.1
      oracle_username: scott
      oracle_password: tiger
```

See the Oracle Data Connector documentation for details.

Spice.ai Cloud Data Connector: Graduated to Stable.

Contributors

Breaking Changes

Search HTTP API Response: POST v1/search response payload has changed. See the new API documentation for details.
Model Provider Parameter Prefixes: Model Provider parameters use provider-specific prefixes instead of openai_ prefixes (e.g., hf_temperature instead of openai_temperature for HuggingFace, anthropic_max_completion_tokens for Anthropic, perplexity_tool_choice for Perplexity). The openai_ prefix remains supported for backward compatibility but is now deprecated and will be removed in a future release.

Cookbook Updates

Added Oracle Data Connector cookbook: Connect to tables in Oracle databases.
Added Hashed Partitioning with DuckDB cookbook: Accelerate data on large datasets by partitioning data into a fixed number of buckets.

The Spice Cookbook now includes 72 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.0-rc.2, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.5.0-rc.2 or pull the v1.5.0-rc.2 Docker image (spiceai/spiceai:1.5.0-rc.2).

What's Changed

Dependencies

delta_kernel: Upgraded to v0.12.1
DuckDB: Upgraded from v1.1.3 to v1.3.2
iceberg: Upgraded from v0.4.0 to v0.5.1

Changelog

fix llm integraion test (#6398) by @Sevenannn in #6398
Promote spice cloud connector to stable quality (#6221) by @Sevenannn in #6221
v1.5.0-rc.1 release notes (#6397) by @lukekim in #6397
Fix model nsql integration tests (#6365) by @Sevenannn in #6365
Fix incorrect UDTF name and SQL query (#6404) by @lukekim in #6404
Update v1.5.0-rc.1.md (#6407) by @sgrebnov in #6407
Improve error messages (#6405) by @lukekim in #6405
build(deps): bump Jimver/cuda-toolkit from 0.2.25 to 0.2.26 (#6388) by @app/dependabot in #6388
Upgrade dependabot dependencies (#6411) by @phillipleblanc in #6411
Fix projection pushdown issues for document based file connector (#6362) by @Advayp in #6362
Create a new crate for UDFs (#6416) by @kczimm in #6416
Add a PartitionedDuckDB Accelerator (#6338) by @kczimm in #6338
Use vector_search() UDTF in HTTP APIs (#6417) by @Jeadie in #6417
add supported types (#6409) by @kczimm in #6409
Enable session time zone override for MySQL (#6426) by @sgrebnov in #6426
Acceleration-like indexing for full text search indexes. (#6382) by @Jeadie in #6382
Provide error message when partition by expression changes (#6415) by @kczimm in #6415
Add support for Oracle Autonomous Database connections (Oracle Cloud) (#6421) by @sgrebnov in #6421
prune partitions for exact and in list with and without UDFs (#6423) by @kczimm in #6423
Fixes and reenable FTS tests (#6431) by @Jeadie in #6431
Updating text-embedding-inference & mistralrs dependency (#6366) by @Jeadie in #6366
Upgrade DuckDB to 1.3.2 (#6434) by @phillipleblanc in #6434
Fix issue in limit clause for the Github Data connector (#6443) by @Advayp in #6443
Upgrade iceberg-rust to 0.5.1 (#6446) by @phillipleblanc in #6446

- Rust
Published by peasee 11 months ago

https://github.com/spiceai/spiceai - v1.5.0-rc.1

Spice v1.5.0-rc.1 (July 7, 2025)

This is the first release candidate for v1.5.0, which introduces partitioning for DuckDB acceleration, SQL-integrated vector and full-text search, and automated refreshes for search indexes and views. It adds a new AWS Bedrock Embeddings Model Provider, a new Oracle Database connector, and promotes the Spice.ai Cloud Data Connector to stable, alongside multi-column vector search for expanded search.

What's New in v1.5.0-rc.1

Partitioned Acceleration: DuckDB file-based accelerations now support partition_by expressions, enabling queries to scale to large datasets through automatic data partitioning and query predicate pruning. New UDFs, bucket and truncate, simplify partition logic.

New UDFs useful for partition_by expressions:

bucket(num_buckets, col): Partitions a column into a specified number of buckets based on a hash of the column value.
truncate(width, col): Truncates a column to a specified width, aligning values to the nearest lower multiple (e.g., truncate(10, 101) = 100).

Example Spicepod.yml configuration:

```yaml
datasets:
  - from: s3://my_bucket/some_large_table/
    name: my_table
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      partition_by: bucket(100, account_id) # Partition account_id into 100 buckets
```

SQL-integrated Search: Vector and full-text search capabilities are now natively available in SQL queries, extending the power of the POST v1/search endpoint to all SQL workflows.

Example Vector-Similarity-Search (VSS) using the new vector_search UDTF on the table reviews for the search term "Cricket bats".

```sql
SELECT review_id, review_text, review_date, score
FROM vector_search(reviews, 'Cricket bats')
WHERE country_code = 'AUS'
LIMIT 3
```

Example Full-Text-Search (FTS) using the new text_search UDTF on the table reviews for the search term "Cricket bats".

```sql
SELECT review_id, review_text, review_date, score
FROM text_search(reviews, 'Cricket bats')
LIMIT 3
```

Full-Text-Search (FTS) Index Refresh: Accelerated datasets with search indexes maintain up-to-date results with configurable refresh intervals.

Example refreshing search indexes on body every 10 seconds (based on acceleration.refresh_check_interval).

```yaml
datasets:
  - from: github:github.com/spiceai/docs/pulls
    name: spiceai.doc.pulls
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    acceleration:
      enabled: true
      refresh_mode: full
      refresh_check_interval: 10s
    columns:
      - name: body
        full_text_search:
          enabled: true
          row_id:
            - id
```

Scheduled View Refresh: Accelerated Views now support cron-based refresh schedules using refresh_cron, automating updates for accelerated data.

Example Spicepod.yml configuration:

```yaml
views:
  - name: my_view
    sql: SELECT 1
    acceleration:
      enabled: true
      refresh_cron: '0 * * * *' # Every hour
```

For more details, refer to Scheduled Refreshes.

Multi-column Vector Search: For datasets configured with embeddings on more than one column, POST v1/search and similarity_search will perform parallel vector search on each column, and aggregate results using a reciprocal rank fusion scoring method.

Example Spicepod.yml where search results consider both the GitHub issue's title and the content of its body:

```yaml
datasets:
  - from: github:github.com/apache/datafusion/issues
    name: datafusion.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    columns:
      - name: title
        embeddings:
          - from: hf_minilm
      - name: body
        embeddings:
          - from: openai_embeddings
```

AWS Bedrock Embeddings Model Provider: Added support for AWS Bedrock embedding models, including Amazon Titan Text Embeddings and Cohere Text Embeddings.

Example Spicepod.yaml:

```yaml
embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      aws_region: us-east-1
      input_type: search_document
      truncate: END
  - from: bedrock:amazon.titan-embed-text-v2:0
    name: titan-embeddings
    params:
      aws_region: us-east-1
      dimensions: '256'
```

For more details, refer to the AWS Bedrock Embedding Models Documentation.

Oracle Data Connector: Use from: oracle: to access and accelerate data stored in Oracle databases, deployed on-premises or in the cloud.

Example Spicepod.yml:

```yaml
datasets:
  - from: oracle:"SH"."PRODUCTS"
    name: products
    params:
      oracle_host: 127.0.0.1
      oracle_username: scott
      oracle_password: tiger
```

See the Oracle Data Connector documentation for details.

Spice.ai Cloud Data Connector: Graduated to Stable.

Contributors

Breaking Changes

Search HTTP API Response: POST v1/search response payload has changed. See the new API documentation for details.
Model Provider Parameter Prefixes: Model Provider parameters use provider-specific prefixes instead of openai_ prefixes (e.g., hf_temperature instead of openai_temperature for HuggingFace, anthropic_max_completion_tokens for Anthropic, perplexity_tool_choice for Perplexity). The openai_ prefix remains supported for backward compatibility but is now deprecated and will be removed in a future release.

Cookbook Updates

Added Oracle Data Connector cookbook: Connect to tables in Oracle databases.

The Spice Cookbook now includes 71 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.0-rc.1, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.5.0-rc.1 or pull the v1.5.0-rc.1 Docker image (spiceai/spiceai:1.5.0-rc.1).

What's Changed

Dependencies

delta_kernel: Upgraded to v0.12.1

Changelog

Jeadie/25 06 10/finance (#6182) by @Jeadie in #6182
chore: Update dependencies (#6196) by @peasee in #6196
Fix FlightSQL GetDbSchemas and GetTables schemas to fully match the protocol (#6197) by @sgrebnov in #6197
Use spice-rs in test operator and retry on connection reset error (#6136) by @Sevenannn in #6136
Move model-grading evals to testoperator (#6195) by @Jeadie in #6195
Don't use base table for full text search post apply vector search (#6215) by @Jeadie in #6215
Fix content-type header in v1/sql response (#6217) by @Jeadie in #6217
Add v1.4.0-rc.1 release into qa_analytics.csv (#6209) by @sgrebnov in #6209
fix: Reschedule AI benchmarks, set max parallel to 1 (#6224) by @peasee in #6224
task: Add MySQL indexes (#6227) by @peasee in #6227
fix pagination (#6222) by @Jeadie in #6222
Add build links to release notes (#6220) by @kczimm in #6220
feat: Enable additional testoperator tests (#6218) by @peasee in #6218
chore: Update testoperator release target to 1.4 (#6235) by @peasee in #6235
fix: Update benchmark snapshots (#6234) by @app/github-actions in #6234
fix: Lower SF100 memory limit (#6236) by @peasee in #6236
Add glue integration test using hive and iceberg tables (#6248) by @kczimm in #6248
allow database for empty patterns (#6258) by @kczimm in #6258
add Glue catalog to README.md (#6179) by @kczimm in #6179
Add bucket UDF for partitioning (#6200) by @kczimm in #6200
New tool parsley (#6232) by @Jeadie in #6232
Upgrade dependabot dependencies (#6261) by @phillipleblanc in #6261
Upgrade delta_kernel to 0.12.1 (#6263) by @phillipleblanc in #6263
fix: Throughput test dispatching (#6265) by @peasee in #6265
fix: badges on README.md show correct status (#6268) by @phillipleblanc in #6268
Extend Flight CommandGetTables with source native data type info (#6259) by @sgrebnov in #6259
fix: Docker image build with profile (#6270) by @peasee in #6270
docs: Post-release update (#6275) by @peasee in #6275
Improve error message for incorrect/missing Glue table or database (#6257) by @kczimm in #6257
Update spicepod.schema.json (#6274) by @app/github-actions in #6274
Update openapi.json (#6279) by @app/github-actions in #6279
Add Remote Spicepod support (#6233) by @phillipleblanc in #6233
Update QA analytics for v1.4.0 (#6277) by @ewgenius in #6277
Add truncate UDF (#6278) by @kczimm in #6278
Update qa_analytics.csv for 1.4.0 (#6284) by @sgrebnov in #6284
Default grok to 'grok-3' (#6285) by @Jeadie in #6285
For Spice.ai connectors, do not default to dev SCP for dev builds (#6254) by @Jeadie in #6254
fix: Deny extra caching parameters (#6288) by @peasee in #6288
Make DynamoDB connectivity errors more specific and actionable (#6294) by @sgrebnov in #6294
Create a table provider from full text search index + query (#6286) by @Jeadie in #6286
Update Flight CommandGetTables to Return Native DataFusion SQL Data Types (#6297) by @sgrebnov in #6297
Adds a synchronous get_table function on the DataFusion context (#6300) by @phillipleblanc in #6300
Better Glue connector error messages (#6289) by @kczimm in #6289
fix: consume response stream before reading authorization metadata (#6292) by @Sevenannn in #6292
feat: Use retryable stream in test operator (#6231) by @Sevenannn in #6231
Support reserved word column names in DynamoDB (#6308) by @sgrebnov in #6308
fix: Implement Default manually for SQLResultsCacheConfig (#6310) by @peasee in #6310
Add integration test for DynamoDB Data Connector (#6311) by @sgrebnov in #6311
fix: Warn about no configured datasets if no datasets and catalogs are present (#6296) by @Advayp in #6296
Add better error messages for cases when a port is already in use (#6313) by @Advayp in #6313
Disallow datasets with protected names (#6309) by @Advayp in #6309
Roadmap updates June 2025 (#6319) by @lukekim in #6319
Add partitioning models (#6298) by @kczimm in #6298
Encode ScalarValues for use in filenames (#6318) by @kczimm in #6318
Standardize model parameter handling & prioritize <model-prefix>_<param> for model default overrides (#6199) by @Sevenannn in #6199
Add initial support for Oracle Data Connector (#6321) by @sgrebnov in #6321
Oracle connector: Support all major Oracle data types (#6323) by @sgrebnov in #6323
Oracle connector: support filter predicate pushdown (#6326) by @sgrebnov in #6326
text_search UDTF and required AnalyzerRule. (#6280) by @Jeadie in #6280
Build indexes as part of accelerations (#6324) by @phillipleblanc in #6324
feat: Add support for cron-based view refresh (#6341) by @peasee in #6341
Surface table not found errors immediately (#6317) by @Advayp in #6317
runtime-datafusion-index: Stop infinite recursion for IndexTableScanOptimizerRule (#6353) by @phillipleblanc in #6353
Add optional behaviors to DataAccelerator tables + add WantsUnderlyingTableBehavior to VoidTable (#6354) by @phillipleblanc in #6354
AWS Bedrock models. (#6358) by @Jeadie in #6358
Ensure views load even if they're the only components defined (#6359) by @Advayp in #6359
Improve type conversion and add integration tests for the Oracle connector (#6327) by @sgrebnov in #6327
Upgrade dependabot dependencies (#6375) by @phillipleblanc in #6375
Don't run tests that require a Databricks cluster on every PR (#6379) by @phillipleblanc in #6379
Properly handle duplicate flags to spice run (#6364) by @Advayp in #6364
Fix the case sensitivity of the key in env secrets store (#6371) by @ewgenius in #6371
vector_search UDTF and related changes (#6381) by @Jeadie in #6381
Update end_game.md (#6380) by @sgrebnov in #6380
fix: openai model endpoint (#6394) by @Sevenannn in #6394
Enable Oracle connector in default build configuration by @sgrebnov in #6395
Enable configuring otel endpoint from spice run by @Advayp in #6360

- Rust
Published by phillipleblanc 11 months ago

https://github.com/spiceai/spiceai - v1.4.0

- Rust
Published by peasee 12 months ago

https://github.com/spiceai/spiceai - v1.4.0-rc.1

Spice v1.4.0-rc.1 (June 11, 2025)

This release candidate for v1.4.0 upgrades DataFusion to v47 and Arrow to v55 for faster queries, more efficient Parquet/CSV handling, and improved reliability. It introduces the AWS Glue Catalog and Data Connectors for native access to Glue-managed data on S3 and supports Databricks U2M OAuth for secure Databricks user authentication. New cron-based dataset refreshes and worker schedules enable automated task management, while dataset and search results caching improvements further optimize query, search, and RAG performance.

What's New in v1.4.0-rc.1

DataFusion v47 Highlights

Spice.ai is built on the DataFusion query engine. The v47 release brings:

Performance Improvements 🚀: This release delivers major query speedups through specialized GroupsAccumulator implementations for first_value, last_value, and min/max on Duration types, eliminating unnecessary sorting and computation. TopK operations are now up to 10x faster thanks to early exit optimizations, while sort performance is further enhanced by reusing row converters, removing redundant clones, and optimizing sort-preserving merge streams. Logical operations benefit from short-circuit evaluation for AND/OR, reducing overhead, and additional enhancements address high latency from sequential metadata fetching, improve int/string comparison efficiency, and simplify logical expressions for better execution.

Bug Fixes & Compatibility Improvements 🛠️: The release addresses issues with external sort, aggregation, and window functions, improves handling of NULL values and type casting in arrays and binary operations, and corrects problems with complex joins and nested window expressions. It also addresses SQL unparsing for subqueries, aliases, and UNION BY NAME.

See the Apache DataFusion 47.0.0 Changelog for details.

Arrow v55 Highlights

Arrow v55 delivers faster Parquet gzip compression, improved array concatenation, and better support for large files (4GB+) and modular encryption. Parquet metadata reads are now more efficient, with support for range requests and enhanced compatibility for INT96 timestamps and timezones. CSV parsing is more robust, with clearer error messages. These updates boost performance, compatibility, and reliability.

See the Arrow 55.0.0 Changelog and Arrow 55.1.0 Changelog for details.

Search Result Caching: Spice now supports runtime caching for search results, improving performance for subsequent searches and chat completion requests that use the document_similarity LLM tool. Caching is configurable with options like maximum size, item TTL, eviction policy, and hashing algorithm.

Example spicepod.yml configuration:

```yaml
runtime:
  caching:
    search_results:
      enabled: true
      max_size: 128mb
      item_ttl: 5s
      eviction_policy: lru
      hashing_algorithm: siphash
```

For more information, refer to the Caching documentation.
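The interaction between `item_ttl` and an `lru` eviction policy can be sketched as follows. This is an illustration of the general technique, not the runtime's cache: `TtlLruCache` is a hypothetical class, and it bounds the cache by item count rather than by bytes as `max_size` does.

```python
import time
from collections import OrderedDict


class TtlLruCache:
    """LRU cache whose entries also expire after ttl seconds."""

    def __init__(self, max_items: int, ttl: float, clock=time.monotonic):
        self.max_items = max_items
        self.ttl = ttl
        self.clock = clock
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._items[key]  # expired: treat as a miss
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (self.clock() + self.ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)  # evict least recently used


cache = TtlLruCache(max_items=2, ttl=5.0)
cache.put("cricket bats", ["review-1", "review-2"])  # hypothetical search result
hit = cache.get("cricket bats")
```

A short TTL such as `5s` keeps repeated searches fast while limiting how long a stale result can be served after the underlying data changes.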

AWS Glue Catalog Connector: Connect to AWS Glue Data Catalogs to query Iceberg, Parquet, or CSV tables in S3.

Example spicepod.yml configuration:

```yaml
catalogs:
  - from: glue
    name: my_glue_catalog
    params:
      glue_key: <your-access-key-id>
      glue_secret: <your-secret-access-key>
      glue_region: <your-region>
    include:
      - 'testdb.hive_*'
      - 'testdb.iceberg_*'
```

For more information, refer to the Glue Catalog Connector documentation.
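The `include` patterns act as filters on `database.table` names. The sketch below approximates that with shell-style globbing from Python's standard library; the exact pattern syntax Spice supports is not spelled out here, so treat the matching rules, the empty-list behavior, and the table names as assumptions.

```python
from fnmatch import fnmatchcase


def is_included(table: str, include: list[str]) -> bool:
    """Return True if a 'database.table' name matches any include pattern."""
    # An empty include list is assumed to mean "include everything".
    if not include:
        return True
    return any(fnmatchcase(table, pattern) for pattern in include)


include = ["testdb.hive_*", "testdb.iceberg_*"]
tables = [  # hypothetical tables in the Glue catalog
    "testdb.hive_orders",
    "testdb.iceberg_events",
    "testdb.raw_logs",
    "prod.hive_orders",
]
selected = [t for t in tables if is_included(t, include)]
```

Here only the two `testdb` tables with the `hive_` and `iceberg_` prefixes are selected; the other two are left out of the catalog.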

AWS Glue Data Connector: Connect to specific tables in AWS Glue Data Catalogs to query Iceberg, Parquet, or CSV in S3.

Example spicepod.yml configuration:

```yaml
datasets:
  - from: glue:my_database.my_table
    name: my_table
    params:
      glue_auth: key
      glue_region: us-east-1
      glue_key: ${secrets:AWS_ACCESS_KEY_ID}
      glue_secret: ${secrets:AWS_SECRET_ACCESS_KEY}
```

For more information, refer to the Glue Data Connector documentation.

Databricks U2M OAuth: Spice now supports User-to-Machine (U2M) authentication for Databricks when called with a compatible client, such as the Spice Cloud Platform.

```yaml
datasets:
  - from: databricks:spiceai_sandbox.default.messages
    name: messages
    params:
      databricks_endpoint: ${secrets:DATABRICKS_ENDPOINT}
      databricks_cluster_id: ${secrets:DATABRICKS_CLUSTER_ID}
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
```

Dataset Refresh Schedules: Accelerated datasets now support a refresh_cron parameter, automatically refreshing the dataset on a defined cron schedule. Cron scheduled refreshes respect the global dataset_refresh_parallelism parameter.

Example spicepod.yml configuration:

```yaml
datasets:
  - name: my_dataset
    from: s3://my-bucket/my_file.parquet
    acceleration:
      refresh_cron: 0 0 * * * # Daily refresh at midnight
```

For more information, refer to the Dataset Refresh Schedules documentation.

Worker Execution Schedules: Workers now support a cron parameter and will execute an LLM-prompt or SQL query automatically on the defined cron schedule, in conjunction with a provided params.prompt.

Example spicepod.yml configuration:

```yaml
workers:
  - name: email_reporter
    models:
      - from: gpt-4o
    params:
      prompt: 'Inspect the latest emails, and generate a summary report for them. Post the summary report to the connected Teams channel'
    cron: 0 2 * * * # Daily at 2am
```

For more information, refer to the Worker Execution Schedules documentation.

SQL Worker Actions: Spice now supports workers with sql actions, to execute automated SQL queries on a cron schedule:

```yaml
workers:
  - name: my_worker
    cron: 0 * * * *
    sql: 'SELECT * FROM lineitem'
```

For more information, refer to the Workers with a SQL action documentation.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

Added Glue Catalog Connector and Data Connector cookbooks: Connect to tables and databases in the AWS Glue Data catalog.

The Spice Cookbook now includes 69 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.4.0-rc.1, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.4.0-rc.1 or pull one of the nightly Docker images:

What's Changed

Dependencies

DataFusion: Upgraded to v47
arrow-rs: Upgraded to v55.1.0
delta_kernel: Upgraded to v0.11.0

Changelog

Update trunk to 1.4.0-unstable (#5878) by @phillipleblanc in #5878
Update openapi.json (#5885) by @app/github-actions in #5885
feat: Testoperator reports benchmark failure summary (#5889) by @peasee in #5889
fix: Publish binaries to dev when platform option is all (#5905) by @peasee in #5905
feat: Print dispatch current test count of total (#5906) by @peasee in #5906
Include multiple duckdb files acceleration scenarios into testoperator dispatch (#5913) by @sgrebnov in #5913
feat: Support building testoperator on dev (#5915) by @peasee in #5915
Update spicepod.schema.json (#5927) by @app/github-actions in #5927
Update ROADMAP & SECURITY for 1.3.0 (#5926) by @phillipleblanc in #5926
Define SearchGeneration paradigm & use in Vector Search (#5876) by @Jeadie in #5876
docs: Update qa_analytics.csv (#5928) by @peasee in #5928
fix: Properly publish binaries to dev on push (#5931) by @peasee in #5931
Load request context extensions on every flight incoming call (#5916) by @ewgenius in #5916
Fix deferred loading for datasets with embeddings (#5932) by @ewgenius in #5932
Schedule AI benchmarks to run every Mon and Thu evening PST (#5940) by @sgrebnov in #5940
Fix explain plan snapshots for TPCDS queries Q36, Q70 & Q86 not being deterministic after DF 46 upgrade (#5942) by @phillipleblanc in #5942
chore: Upgrade to Rust 1.86 (#5945) by @peasee in #5945
Standardise HTTP settings across CLI (#5769) by @Jeadie in #5769
Fix deferred flag for Databricks SQL warehouse mode (#5958) by @ewgenius in #5958
Add deferred catalog loading (#5950) by @ewgenius in #5950
Refactor deferred_load using ComponentInitialization enum for better clarity (#5961) by @ewgenius in #5961
Post-release housekeeping (#5964) by @phillipleblanc in #5964
add LTO for release builds (#5709) by @kczimm in #5709
Fix dependabot/192 (#5976) by @Jeadie in #5976
Fix Test-to-SQL benchmark scheduled run (#5977) by @sgrebnov in #5977
Fix JSON to ScalarValue type conversion to match DataFusion behavior (#5979) by @sgrebnov in #5979
Add v1.3.1 release notes (#5978) by @lukekim in #5978
Define CandidateAggregation trait and implement RRF for multi column vector search. (#5943) by @Jeadie in #5943
Regenerate nightly build workflow (#5995) by @ewgenius in #5995
Fix DataFusion dependency loading in Databricks request context extension (#5987) by @ewgenius in #5987
Update spicepod.schema.json (#6000) by @app/github-actions in #6000
feat: Run MySQL SF100 on dev runners (#5986) by @peasee in #5986
fix: Remove caching RwLock (#6001) by @peasee in #6001
1.3.1 Post-release housekeeping (#6002) by @phillipleblanc in #6002
feat: Add initial scheduler crate (#5923) by @peasee in #5923
fix flight request context scope (#6004) by @ewgenius in #6004
fix: Ensure snapshots on different scale factors are retained (#6009) by @peasee in #6009
fix: Allow dev runners in dispatch files (#6011) by @peasee in #6011
refactor: Deprecate resultscache for caching.sqlresults (#6008) by @peasee in #6008
Fix models benchmark results reporting (#6013) by @sgrebnov in #6013
fix: Run PR checks for tools/ changes (#6014) by @peasee in #6014
feat: Add a CronRequestChannel for scheduler (#6005) by @peasee in #6005
feat: Add refresh_cron acceleration parameter, start scheduler on table load (#6016) by @peasee in #6016
Update license check to allow dual license crates (#6021) by @sgrebnov in #6021
Initial worker concept (#5973) by @Jeadie in #5973
Don't fail if cargo-deny already installed (license check) (#6023) by @sgrebnov in #6023
Upgrade to DataFusion 47 and Arrow 55 (#5966) by @sgrebnov in #5966
Read Iceberg tables from Glue Catalog Connector (#5965) by @kczimm in #5965
Handle multiple highlights in v1/search UX (#5963) by @Jeadie in #5963
feat: Add cron scheduler configurations for workers (#6033) by @peasee in #6033
feat: Add search cache configuration and results wrapper (#6020) by @peasee in #6020
Fix GitHub Actions Ubuntu for more workflows (#6040) by @phillipleblanc in #6040
Fix Actions for testoperator dispatch manual (#6042) by @phillipleblanc in #6042
refactor: Remove worker type (#6039) by @peasee in #6039
feat: Support cron dataset refreshes (#6037) by @peasee in #6037
Upgrade datafusion-federation to 0.4.2 (#6022) by @phillipleblanc in #6022
Define SearchPipeline and use in runtime/vector_search.rs. (#6044) by @Jeadie in #6044
fix: Scheduler test when scheduler is running (#6051) by @peasee in #6051
doc: Spice Cloud Connector Limitation (#6035) by @Sevenannn in #6035
Add support for on_conflict:upsert for Arrow MemTable (#6059) by @sgrebnov in #6059
Enhance Arrow Flight DoPut operation tracing (#6053) by @sgrebnov in #6053
Update openapi.json (#6032) by @app/github-actions in #6032
Add tools enabled to MCP server capabilities (#6060) by @Jeadie in #6060
Upgrade to delta_kernel 0.11 (#6045) by @phillipleblanc in #6045
refactor: Replace refresh oneshot with notify (#6050) by @peasee in #6050
Enable Upsert OnConflictBehavior for runtime.task_history table (#6068) by @sgrebnov in #6068
feat: Add a workers integration test (#6069) by @peasee in #6069
Fix DuckDB acceleration ORDER BY rand() and ORDER BY NULL (#6071) by @phillipleblanc in #6071
Update Models Benchmarks to report unsuccessful evals as errors (#6070) by @sgrebnov in #6070
Revert: fix: Use HTTPS ubuntu sources (#6082) by @Sevenannn in #6082
Add initial support for Spice Cloud Platform management (#6089) by @sgrebnov in #6089
Run spiceai cloud connector TPC tests using spice dev apps (#6049) by @Sevenannn in #6049
feat: Add SQL worker action (#6093) by @peasee in #6093
Post-release housekeeping (#6097) by @phillipleblanc in #6097
Fix search bench (#6091) by @Jeadie in #6091
fix: Update benchmark snapshots (#6094) by @app/github-actions in #6094
fix: Update benchmark snapshots (#6095) by @app/github-actions in #6095
Glue catalog connector for hive style parquet (#6054) by @kczimm in #6054
Update openapi.json (#6100) by @app/github-actions in #6100
Improve Flight Client DoPut / Publish error handling (#6105) by @sgrebnov in #6105
Define PostApplyCandidateGeneration to handle all filters & projections. (#6096) by @Jeadie in #6096
refactor: Update the tracing task names for scheduled tasks (#6101) by @peasee in #6101
task: Switch GH runners in PR and testoperator (#6052) by @peasee in #6052
feat: Connect search caching for HTTP and tools (#6108) by @peasee in #6108
test: Add multi-dataset cron test (#6102) by @peasee in #6102
Sanitize the ListingTableURL (#6110) by @phillipleblanc in #6110
Avoid partial writes by FlightTableWriter (#6104) by @sgrebnov in #6104
fix: Update the TPCDS postgres acceleration indexes (#6111) by @peasee in #6111
Make Glue Catalog refreshable (#6103) by @kczimm in #6103
Refactor Glue catalog to use a new Glue data connector (#6125) by @kczimm in #6125
Emit retry error on flight transient connection failure (#6123) by @Sevenannn in #6123
Update Flight DoPut implementation to send single final PutResult (#6124) by @sgrebnov in #6124
feat: Add metrics for search results cache (#6129) by @peasee in #6129
update MCP crate (#6130) by @Jeadie in #6130
feat: Add search cache status header, respect cache control (#6131) by @peasee in #6131
fix: Allow specifying individual caching blocks (#6133) by @peasee in #6133
Update openapi.json (#6132) by @app/github-actions in #6132
Add CSV support to Glue data connector (#6138) by @kczimm in #6138
Update Spice Cloud Platform management UX (#6140) by @sgrebnov in #6140
Add TPCH bench for Glue catalog (#6055) by @kczimm in #6055
Enforce max_tokens_per_request limit in OpenAI embedding logic (#6144) by @sgrebnov in #6144
Enable Spice Cloud Control Plane connect (management) for FinanceBench (#6147) by @sgrebnov in #6147
Add integration test for Spice Cloud Platform management (#6150) by @sgrebnov in #6150
fix: Invalidate search cache on refresh (#6137) by @peasee in #6137
fix: Prevent registering cron schedule with change stream accelerations (#6152) by @peasee in #6152
test: Add an append cron integration test (#6151) by @peasee in #6151
fix: Cache search results with no-cache directive (#6155) by @peasee in #6155
fix: Glue catalog dispatch runner type (#6157) by @peasee in #6157
Fix: Glue S3 location for directories and Iceberg credentials (#6174) by @kczimm in #6174
Support multiple columns in FTS (#6156) by @Jeadie in #6156
fix: Add --cache-control flag for search CLI (#6158) by @peasee in #6158
Add Glue data connector tpch bench test for parquet and csv (#6170) by @kczimm in #6170
fix: Apply results cache deprecation correctly (#6177) by @peasee in #6177
Fix Linux CUDA build (use candle-core 0.8.4 and cudarc v0.12) (#6181) by @sgrebnov in #6181
fix: return empty stream when no results for Databricks SQL Warehouse (#6192) by @kczimm in #6192

Full Changelog: v1.3.2...v1.4.0-rc.1

- Rust
Published by kczimm 12 months ago

https://github.com/spiceai/spiceai - v1.3.2

Spice v1.3.2 (June 3, 2025)

Spice v1.3.2 improves DuckDB acceleration to accept ORDER BY rand() and ORDER BY NULL SQL queries, and supports the TIMESTAMP_NTZ(0) (timestamp with seconds precision) type in Snowflake.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook now includes 67 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.3.2, use one of the following methods:

CLI:

    spice upgrade

Homebrew:

    brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.3.2 image:

    docker pull spiceai/spiceai:1.3.2

For available tags, see DockerHub.

Helm:

    helm repo update
    helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

No major dependency changes.

Changelog

Handle Snowflake Timestamp NTZ with seconds precision (#6084) by @kczimm in #6084
Fix DuckDB acceleration ORDER BY rand() and ORDER BY NULL (#6071) by @phillipleblanc in #6071

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.3.1...v1.3.2

- Rust
Published by phillipleblanc 12 months ago

https://github.com/spiceai/spiceai - v1.3.1

- Rust
Published by phillipleblanc about 1 year ago

https://github.com/spiceai/spiceai - v1.3.0

Spice v1.3.0 (May 19, 2025)

Spice v1.3.0 accelerates data and AI applications with significantly improved query performance, reliability, and expanded Databricks integration. New support for the Databricks SQL Statement Execution API enables direct SQL queries on Databricks SQL Warehouses, complementing Mosaic AI model serving and embeddings (introduced in v1.2.2) and existing Databricks catalog and dataset integrations. This release upgrades to DataFusion v46, optimizes results caching performance, and strengthens security with least-privilege sandboxing improvements.

What's New in v1.3.0

Databricks SQL Statement Execution API Support: Added support for the Databricks SQL Statement Execution API, enabling direct SQL queries against Databricks SQL Warehouses for optimized performance in analytics and reporting workflows.

Example spicepod.yml configuration:

    datasets:
      - from: databricks:spiceai.datasets.my_awesome_table
        name: my_awesome_table
        params:
          mode: sql_warehouse
          databricks_endpoint: ${env:DATABRICKS_ENDPOINT}
          databricks_sql_warehouse_id: ${env:DATABRICKS_SQL_WAREHOUSE_ID}
          databricks_token: ${env:DATABRICKS_TOKEN}

For details, see the Databricks Data Connector documentation.

Improved Results Cache Performance & Hashing Algorithm: Spice now supports an alternative results cache hashing algorithm, ahash, in addition to siphash, which remains the default. Configure it via:

    runtime:
      results_cache:
        hashing_algorithm: ahash # or siphash

The hashing algorithm determines how cache keys are hashed before being stored, impacting both lookup speed and protection against potential DOS attacks.
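The trade-off described above can be sketched as a cache with a pluggable key hasher. This is a minimal illustration, not Spice's implementation: `ResultsCache`, `keyed_hash`, and `fast_hash` are hypothetical names, and the two hash functions are stand-ins from the Python standard library. Neither is siphash or ahash, and their properties should not be read as properties of those algorithms.

```python
import hashlib
import os
import zlib
from typing import Callable, Optional

Hasher = Callable[[bytes], int]

# Per-process random key (assumption: the key is never exposed to clients).
_SECRET = os.urandom(16)


def keyed_hash(data: bytes) -> int:
    """Keyed 64-bit hash: collisions are hard to precompute without the key."""
    digest = hashlib.blake2b(data, digest_size=8, key=_SECRET).digest()
    return int.from_bytes(digest, "big")


def fast_hash(data: bytes) -> int:
    """Cheap unkeyed checksum: quick to compute, but collisions are easy to craft."""
    return zlib.crc32(data)


class ResultsCache:
    """Hypothetical results cache keyed by a hash of the SQL text."""

    def __init__(self, hasher: Hasher) -> None:
        self._hasher = hasher
        self._entries: dict[int, tuple[str, list]] = {}

    def put(self, sql: str, rows: list) -> None:
        self._entries[self._hasher(sql.encode())] = (sql, rows)

    def get(self, sql: str) -> Optional[list]:
        entry = self._entries.get(self._hasher(sql.encode()))
        # Compare the stored SQL so a hash collision is never served as a hit.
        if entry is not None and entry[0] == sql:
            return entry[1]
        return None


cache = ResultsCache(keyed_hash)  # swap in fast_hash to change the trade-off
cache.put("SELECT 1", [(1,)])
hit = cache.get("SELECT 1")
miss = cache.get("SELECT 2")
```

The hasher is the only part that changes between configurations, which is why a single setting can shift both lookup cost and resistance to crafted collisions.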

Using ahash improves performance for large queries or query plans. Combined with results cache optimizations, it reduces 99th percentile request latency and increases total requests/second for queries with large result sets (100k+ cached rows). The following charts show performance tested against the TPCH Query #17 on a scale factor 5 dataset (30+ million rows, 5GB):

| Latency | Req/sec |
| --- | --- |
| Improvements for the 99th percentile query latency, compared against 1.2.2 with cache key type and hashing algorithm. | Improvements for the requests/second, compared against 1.2.2 with cache key type and hashing algorithm. |

Note: ahash was not available in v1.2.2, so it is excluded from comparisons.

To learn more, refer to the Results Cache Hashing Algorithm documentation.

SQL Query Performance: Optimized the critical SQL query path, reducing overhead and improving response times for simple queries by 10-20%.
DuckDB Acceleration: Fixed a bug in the DuckDB acceleration engine causing query failures under high concurrency when querying datasets accelerated into multiple DuckDB files.
Container Security: The container image now runs as a non-root user with enhanced sandboxing and includes only essential dependencies for a slimmer, more secure image.

DataFusion v46 Highlights

Spice.ai is built on the DataFusion query engine. The v46 release brings:

Faster Performance 🚀: DataFusion 46 introduces significant performance enhancements, including a 2x faster median() function for large datasets without grouping, 10–100% speed improvements in FIRST_VALUE and LAST_VALUE window functions by avoiding sorting, and a 40x faster uuid() function. Additional optimizations, such as a 50% faster repeat() string function, accelerated chr() and to_hex() functions, improved grouping algorithms, and Parquet row group pruning with NOT LIKE filters, further boost overall query efficiency.
New range() Table Function: A new table-valued function range(start, stop, step) has been added to make it easy to generate integer sequences — similar to PostgreSQL’s generate_series() or Spark’s range(). Example: SELECT * FROM range(1, 10, 2);
UNION [ALL | DISTINCT] BY NAME Support: DataFusion now supports UNION BY NAME and UNION ALL BY NAME, which align columns by name instead of position. This matches functionality found in systems like Spark and DuckDB and simplifies combining heterogeneously ordered result sets.

Example:

    SELECT col1, col2 FROM t1
    UNION ALL BY NAME
    SELECT col2, col1 FROM t2;
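To make the by-name alignment concrete, here is a small Python sketch of the semantics, with rows modeled as dictionaries. It is an illustration only, not DataFusion code: `union_all_by_name` is a hypothetical helper, and filling a column that is missing on one side with `None` (NULL) is an assumption about the behavior rather than something stated in these notes.

```python
from typing import Any

Row = dict[str, Any]


def union_all_by_name(left: list[Row], right: list[Row]) -> list[Row]:
    """Concatenate two row sets, matching columns by name instead of position."""
    # Output columns follow first appearance: left's order, then right-only columns.
    columns: list[str] = []
    for row in left + right:
        for name in row:
            if name not in columns:
                columns.append(name)
    # Assumption: a column absent from a row is filled with None (NULL).
    return [{name: row.get(name) for name in columns} for row in left + right]


t1 = [{"col1": 1, "col2": "a"}]
t2 = [{"col2": "b", "col1": 2}]  # same columns, opposite order
result = union_all_by_name(t1, t2)
print(result)  # [{'col1': 1, 'col2': 'a'}, {'col1': 2, 'col2': 'b'}]
```

A positional union of the same inputs would have placed `"b"` under `col1`; matching by name avoids that mix-up.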

See the DataFusion 46.0.0 release notes for details.

Spice.ai adopts the latest minus one DataFusion release for quality assurance and stability. The upgrade to DataFusion v47 is planned for Spice v1.4.0 in June.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

Added Accelerated Views: Pre-calculate and materialize data derived from one or more underlying datasets.

The Spice Cookbook now includes 67 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.3.0, use one of the following methods:

CLI:

    spice upgrade

Homebrew:

    brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.3.0 image:

    docker pull spiceai/spiceai:1.3.0

For available tags, see DockerHub.

Helm:

    helm repo update
    helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

DataFusion: Upgraded to v46
Apache Arrow: Upgraded to v54.3.0
delta_kernel: Upgraded to v0.10.0

Changelog

update to 1.2.2 by @Jeadie in #5806
Move sandboxing logic to Dockerfile by @phillipleblanc in #5808
Add note to run installation health workflow after release is marked as official by @Sevenannn in #5797
ROADMAP updates May 13, 2025 by @lukekim in #5809
Update qa_analytics.csv by @kczimm in #5810
post-release housekeeping by @Jeadie in #5811
Fix flaky DataBricks M2M integration tests by @phillipleblanc in #5818
Add DataFusion request context extension to http routes by @ewgenius in #5807
Use Utf8 for partition columns by @phillipleblanc in #5820
Use full path for location metadata column by @phillipleblanc in #5819
Remove the DataFusion reference from the flight service and use the reference from the request context instead by @ewgenius in #5821
Upgrade delta_kernel to 0.10 by @phillipleblanc in #5823
fix: Update benchmark snapshots by @app/github-actions in #5827
Update qa_analytics.csv by @kczimm in #5824
fix: Update benchmark snapshots by @app/github-actions in #5826
fix: Update benchmark snapshots by @app/github-actions in #5825
Fix dispatch spicepod reference for file[parquet]-duckdb[file]-indexes and file[parquet]-duckdb[memory]-indexes by @phillipleblanc in #5837
Fix spice run --http-endpoint in CLI by @Jeadie in #5812
Prevent excessively copying RawCacheKey by @peasee in #5838
Make DuckDB database attachments logic more robust by @sgrebnov in #5839
Simplify Databricks U2M auth flow, by moving user auth to the request context by @ewgenius in #5842
Update to new MCP crate by @Jeadie in #5758
Disable the query tracker when task history is disabled by @peasee in #5852
Set fsGroup on PodSpec to force volumes to be mounted with permission to docker image by @phillipleblanc in #5854
Clarify Helm release steps by @phillipleblanc in #5855
Avoid cloning cached results by @peasee in #5853
Upgrade to DataFusion 46 by @phillipleblanc in #5543
Update openapi.json by @app/github-actions in #5856
Adapt to Arrow 54 changes in Dict IDs preserving (Arrow IPC) by @sgrebnov in #5866
fix: Update benchmark snapshots by @app/github-actions in #5867
Fix s3[parquet]-duckdb[file-many] benchmark Spicepod configuration by @sgrebnov in #5868
fix: Update benchmark snapshots by @app/github-actions in #5869
feat: Refactor caching, support hashing algorithms by @peasee in #5859
Override health checks for Databricks models in U2M auth mode by @ewgenius in #5858
Update trunk to 1.4.0-unstable by @phillipleblanc in #5878
fix: Pass parameters to testoperator explain plan by @peasee in #5883
Disallow schema updates for existing accelerated tables by @phillipleblanc in #5887
Deferrable registration for Databricks U2M datasets by @ewgenius in #5860

See the full list of changes at: v1.2.2...v1.3.0

- Rust
Published by phillipleblanc about 1 year ago

https://github.com/spiceai/spiceai - v1.2.2

Spice v1.2.2 (May 12, 2025)

Spice v1.2.2 introduces support for Databricks Mosaic AI model serving and embeddings, alongside the existing Databricks catalog and dataset integrations. It adds configurable service ports in the Helm chart and resolves several bugs to improve stability and performance.

Highlights in v1.2.2

Databricks Model & Embedding Provider: Spice integrates with Databricks Model Serving for models and embeddings, enabling secure access via machine-to-machine (M2M) OAuth authentication with service principal credentials. The runtime automatically refreshes tokens using databricks_client_id and databricks_client_secret, ensuring uninterrupted operation. This feature supports Databricks-hosted large language models and embedding models.

```yaml
models:
  - from: databricks:databricks-llama-4-maverick
    name: llama-4-maverick
    params:
      databricks_endpoint: dbc-46470731-42e5.cloud.databricks.com
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
      databricks_client_secret: ${secrets:DATABRICKS_CLIENT_SECRET}

embeddings:
  - from: databricks:databricks-gte-large-en
    name: gte-large-en
    params:
      databricks_endpoint: dbc-42424242-4242.cloud.databricks.com
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
      databricks_client_secret: ${secrets:DATABRICKS_CLIENT_SECRET}
```

For detailed setup instructions, refer to the Databricks Model Provider documentation.

Configurable Helm Chart Service Ports: The Helm chart now supports custom service ports, allowing flexible network configurations across deployments. Specify non-default ports in your Helm values file.
Resolved Issues:
- MCP Nested Tool Calling: Fixed a bug preventing nested tool invocation when Spice operates as the MCP server federating to MCP clients.
- Dataset Load Concurrency: Corrected a failure to respect the dataset_load_parallelism setting during dataset loading.
- Acceleration Hot-Reload: Addressed an issue where changes to acceleration enable/disable settings were not detected during hot reload of Spicepod.yaml.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

Updated cookbooks:

Databricks Catalogs: Includes using Databricks Service Principal
Databricks: Includes using M2M auth
Python ADBC: Adds a dataset to be queried over ADBC.

The Spice Cookbook now includes 68 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.2.2, use one of the following methods:

CLI:

    spice upgrade

Homebrew:

    brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.2.2 image:

    docker pull spiceai/spiceai:1.2.2

For available tags, see DockerHub.

Helm:

    helm repo update
    helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

No major dependency changes.

Changelog

Update spark-connect-rs to override user agent string by @ewgenius in https://github.com/spiceai/spice/pull/5798
Merge pull request by @ewgenius in https://github.com/spiceai/spice/pull/5796
Pass the default user agent string to the Databricks Spark, Delta, and Unity clients by @ewgenius in https://github.com/spiceai/spice/pull/5717
bump to 1.2.2 by @Jeadie in https://github.com/spiceai/spice/pull/none
Helm chart: support for service ports overrides by @sgrebnov in https://github.com/spiceai/spice/pull/5774
Update spice cli login command with client-id and client-secret flags for Databricks by @ewgenius in https://github.com/spiceai/spice/pull/5788
Fix bug where setting Cache-Control: no-cache doesn't compute the cache key by @phillipleblanc in https://github.com/spiceai/spice/pull/5779
Update to datafusion-contrib/datafusion-table-providers#336 by @phillipleblanc in https://github.com/spiceai/spice/pull/5778
Lru cache: limit single cached record size to u32::MAX (4GB) by @sgrebnov in https://github.com/spiceai/spice/pull/5772
Fix LLMs calling nested MCP tools by @Jeadie in https://github.com/spiceai/spice/pull/5771
MySQL: Set the character_set_results/character_set_client/character_set_connection session variables on connection setup by @Sevenannn in https://github.com/spiceai/spice/pull/5770
Control the parallelism of acceleration refresh datasets with runtime.dataset_load_parallelism by @phillipleblanc in https://github.com/spiceai/spice/pull/5763
Fix Iceberg predicates not matching the Arrow type of columns read from parquet files by @phillipleblanc in https://github.com/spiceai/spice/pull/5761
fix: Use decimal_cmp for numerical BETWEEN in SQLite by @peasee in https://github.com/spiceai/spice/pull/5760
Support product name override in databricks user agent string by @ewgenius in https://github.com/spiceai/spice/pull/5749
Databricks U2M Token Provider support by @ewgenius in https://github.com/spiceai/spice/pull/5747
Remove HTTP auth from LLM config and simplify Databricks models logic by using static headers by @Jeadie in https://github.com/spiceai/spice/pull/5742
clear plan cache when dataset updates by @kczimm in https://github.com/spiceai/spice/pull/5741
Support Databricks M2M auth in LLMs + Embeddings by @Jeadie in https://github.com/spiceai/spice/pull/5720
Retrieve Github App tokens in background; make TokenProvider not async by @Jeadie in https://github.com/spiceai/spice/pull/5718
Make 'token_providers' crate by @Jeadie in https://github.com/spiceai/spice/pull/5716
Databricks AI: Embedding models & LLM streaming by @Jeadie in https://github.com/spiceai/spice/pull/5715

See the full list of changes at: v1.2.1...v1.2.2

- Rust
Published by Jeadie about 1 year ago

https://github.com/spiceai/spiceai - v1.2.1

Spice v1.2.1 (May 6, 2025)

Spice v1.2.1 includes several data connector fixes and improves query performance for accelerated views. This release also introduces Databricks Service Principal (M2M OAuth) authentication and expands parameterized queries.

Highlights in v1.2.1

Databricks Service Principal Support: Databricks datasets and catalogs now support Machine-to-Machine (M2M) OAuth authentication via Service Principals, enabling secure machine connections to Databricks.

Example spicepod.yaml:

    datasets:
      - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks unity catalog
        name: my_delta_lake_table
        params:
          mode: delta_lake
          databricks_endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com
          databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
          databricks_client_secret: ${secrets:DATABRICKS_CLIENT_SECRET}

For details, see documentation for:

Databricks Data Connector
Databricks Unity Catalog Connector
Iceberg Data Connector: Now supports cross-account table access via the AWS Glue Catalog Connector and fixes an issue when querying data from append mode datasets.
Iceberg Catalog API: Full compatibility with the Iceberg HTTP REST Catalog API to consume Spice datasets from Iceberg Catalog clients.

For details, see documentation for:

Iceberg Data Connector
S3 Data Connector
Improved Parameterized Query Support: Expanded type inference for placeholders in:
- IN list expressions
- LIKE patterns
- SIMILAR TO patterns
- LIMIT clauses
- Subqueries

New Contributors 🎉

@nuvic made their first contribution in #5673

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

New recipes for:

Language Model Evaluations: Use Spice.ai OSS to evaluate language models.
LLM as a Judge: Use LLM judge models to evaluate the performance of other language models.

The Spice Cookbook now includes 68 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.2.1, use one of the following methods:

CLI:

    spice upgrade

Homebrew:

    brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.2.1 image:

    docker pull spiceai/spiceai:1.2.1

For available tags, see DockerHub.

Helm:

    helm repo update
    helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

No major dependency changes.

Changelog

Fix: Specify metric type as a dimension for testoperator by @peasee in #5630
Fix: Add option to run dispatch schedule by @peasee in #5631
Infer placeholder datatype for InList, Like, and SimilarTo by @kczimm in #5626
Add QA analytics for 1.2.0 by @phillipleblanc in #5640
Fix: Use SPICED_COMMIT for spiced_commit_sha by @peasee in #5632
New crates/tools by @Jeadie in #5121
Update openapi.json by @github-actions in #5643
Enable metrics reporting for models benchmarks (evals) by @sgrebnov in #5639
Implement CatalogBuilder, add app and runtime references to catalog component, add runtime reference to connector params by @ewgenius in #5641
Fix eventing bug in LLM progress; Add tool and worker progress by @Jeadie in #5619
Handle small precision differences in TPCH answer validation by @phillipleblanc in #5642
Add TokenProviderRegistry to the runtime by @ewgenius in #5651
Provide ModelContextLayer for evals by @Jeadie in #5648
Databricks data_components refactor. Databricks Spark connect - add set_token method and writable spark session by @ewgenius in #5654
Extract AWS Glue warehouse for cross-account Iceberg tables by @phillipleblanc in #5656
Refactor Dataset component by @phillipleblanc in #5660
Fix Iceberg API returning 404 when schema contains a Dictionary by @phillipleblanc in #5665
Fix dependencies: downgrade swagger-ui to v8; force zip to 2.3.0 by @kczimm in #5664
Add DuckDB indexes spicepod, additional dispatches by @peasee in #5633
Update readme: update data federation link by @nuvic in #5673
Support metadata columns for object-store based data connectors by @phillipleblanc in #5661
Add model name to LLM judges, and add model_graded_scoring task by @Jeadie in #5655
Add SF1000 TPCH test spicepods for delta lake by @Sevenannn in #5606
Validate Github Connector resource existence before building the github connector graphql table by @Sevenannn in #5674
Remove hard-coded embedding performance tests in CI by @Sevenannn in #5675
Databricks M2M auth for spark connect data connector by @ewgenius in #5659
Enable federated data refresh support for accelerated views by @sgrebnov in #5677
Add pods watcher integration test by @Sevenannn in #5681
Add m2m support for databricks delta connector by @ewgenius in #5680
Update end_game.md by @sgrebnov in #5684
Update StaticTokenProvider to use SecretString instead of raw str value by @ewgenius in #5686
Add M2M Auth support for Databricks catalog connector by @ewgenius in #5687
Update UX to disable acceleration federation by @sgrebnov in #5682
Improve placeholder inference (LIMIT & Expr::InSubquery) by @phillipleblanc in #5692
Tweak default log to ignore aws_config::imds::region by @phillipleblanc in #5693
Make Spice properly Iceberg Catalog API compatible for load table API by @phillipleblanc in #5695
Use deterministic queries for Databricks m2m catalog tests by @ewgenius in #5696
Support retrieving the latest Iceberg table on table scan by @phillipleblanc in #5704
Infer partitions from schema_source_path if present by @phillipleblanc in #5721

Full Changelog: v1.2.0...v1.2.1

- Rust
Published by sgrebnov about 1 year ago

https://github.com/spiceai/spiceai - v1.2.0

Spice v1.2.0 (Apr 28, 2025)

Spice v1.2.0 is a significant update. It upgrades DataFusion to v45 and Arrow to v54. This release brings faster query performance, support for parameterized queries in SQL and HTTP APIs, and the ability to accelerate views. Several bugs have been fixed and dependencies updated for better stability and speed.

DataFusion v45 Highlights

Spice.ai is built on the DataFusion query engine. The v45 release brings:

Faster Performance 🚀: DataFusion is now the fastest single-node engine for Apache Parquet files in the clickbench benchmark. Performance improved by over 33% from v33 to v45. Arrow StringView is now on by default, making string and binary data queries much faster, especially with Parquet files.
Better Quality 📋: DataFusion now runs over 5 million SQL tests per push using the SQLite sqllogictest suite. There are new checks for logical plan correctness and more thorough pre-release testing.
New SQL Functions ✨: Added show functions, to_local_time, regexp_count, map_extract, array_distance, array_any_value, greatest, least, and arrays_overlap.

See the DataFusion 45.0.0 release notes for details.

Spice.ai upgrades to the latest minus one DataFusion release to ensure adequate testing and stability. The next upgrade to DataFusion v46 is planned for Spice v1.3.0 in May.

What's New in v1.2.0

Parameterized Queries: Parameterized queries are now supported with the Flight SQL API and HTTP API. Positional and named arguments via $1 and :param syntax are supported, respectively. Logical plans for SQL statements are cached for faster repeated queries.

Example Cookbook recipes:

See the API Documentation for additional details.
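The difference between positional and named placeholders can be sketched with Python's built-in `sqlite3` module as a stand-in. This example does not connect to Spice: SQLite spells positional placeholders `?` rather than `$1`, and the `orders` table and its columns are hypothetical. It only illustrates that values are bound separately from the SQL text instead of being interpolated into it.

```python
import sqlite3

# In-memory database used purely as a stand-in engine for the illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "emea", 10.0), (2, "apac", 25.5), (3, "emea", 40.0)],
)

# Positional binding: values are matched to placeholders by their order.
positional = conn.execute(
    "SELECT id FROM orders WHERE region = ? AND total > ? ORDER BY id",
    ("emea", 20.0),
).fetchall()

# Named binding: values are matched to placeholders by name (:param syntax).
named = conn.execute(
    "SELECT id FROM orders WHERE region = :region AND total > :min_total ORDER BY id",
    {"region": "emea", "min_total": 20.0},
).fetchall()

print(positional, named)
conn.close()
```

Because the SQL text stays identical across calls and only the bound values change, an engine can reuse a cached plan for the statement, which is the benefit the release note attributes to logical plan caching.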

Accelerated Views: Views, not just datasets, can now be accelerated. This provides much better performance for views that perform heavy computation.

Example spicepod.yaml:

    views:
      - name: accelerated_view
        acceleration:
          enabled: true
          engine: duckdb
          primary_key: id
          refresh_check_interval: 1h
        sql: |
          select * from dataset_a
          union all
          select * from dataset_b

See the Data Acceleration documentation.

Memory Usage Metrics & Configuration: The runtime now tracks memory usage as a metric, and a new runtime memory_limit parameter is available. The memory limit applies specifically to the runtime and should be used in addition to existing memory configuration, such as duckdb_memory_limit. Queries that exceed the memory limit spill to disk.

See the Memory Reference for details.

New Worker Component: Workers are new configurable compute units in the Spice runtime. They help manage compute across models and tools, handle errors, and balance load. Workers are configured in the workers section of spicepod.yaml.

Example spicepod.yaml:

    workers:
      - name: round-robin
        description: |
          Distributes requests between 'foo' and 'bar' models in a round-robin fashion.
        models:
          - from: foo
          - from: bar
      - name: fallback
        description: |
          Tries 'bar' first, then 'foo', then 'baz' if earlier models fail.
        models:
          - from: foo
            order: 2
          - from: bar
            order: 1
          - from: baz
            order: 3

See the Workers Documentation for details.
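The two routing behaviors in the worker example can be sketched in a few lines of Python. This is an illustration of the general round-robin and ordered-fallback patterns, not the Spice runtime's implementation: the `round_robin` and `fallback` helpers and the toy models are hypothetical.

```python
from itertools import cycle
from typing import Callable, Iterator

Model = Callable[[str], str]  # hypothetical: takes a prompt, returns a reply


def round_robin(models: dict[str, Model]) -> Iterator[str]:
    """Yield model names in rotation, one per incoming request."""
    return cycle(models)


def fallback(models: dict[str, Model], order: dict[str, int], prompt: str) -> str:
    """Try models in ascending `order`; return the first successful reply."""
    errors = []
    for name in sorted(models, key=lambda n: order[n]):
        try:
            return models[name](prompt)
        except Exception as exc:  # any failure moves on to the next model
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def failing(prompt: str) -> str:
    raise ConnectionError("unavailable")


models: dict[str, Model] = {
    "foo": lambda p: f"foo:{p}",
    "bar": failing,  # simulate an unavailable model
    "baz": lambda p: f"baz:{p}",
}

picker = round_robin({"foo": models["foo"], "bar": models["bar"]})
first_three = [next(picker) for _ in range(3)]  # rotates foo, bar, foo

# Same ordering as the `fallback` worker: bar (1), then foo (2), then baz (3).
reply = fallback(models, {"foo": 2, "bar": 1, "baz": 3}, "hi")
print(first_three, reply)
```

Here `bar` is tried first and fails, so the reply comes from `foo`; `baz` is only reached if both earlier models fail.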

Databricks Model Provider: Databricks models can now be used with from: databricks:model_name.

Example spicepod.yaml:

    models:
      - from: databricks:llama-3_2_1_1b_instruct
        name: llama-instruct
        params:
          databricks_endpoint: dbc-46470731-42e5.cloud.databricks.com
          databricks_token: ${ secrets:SPICE_DATABRICKS_TOKEN }

See the Databricks model documentation.

spice chat CLI Improvements: The spice chat command now supports an optional --temperature parameter. A one-shot chat can also be sent with spice chat <message>.
More Type Support: Added support for Postgres JSON type and DuckDB Dictionary type.
Other Improvements:
- New image tags let you pick memory allocators for different use-cases: jemalloc, sysalloc, and mimalloc.
- Better error handling and logging for chat and model operations.

Contributors

Cookbook Updates

New recipes for:

Python ADBC Client with Parameterized Queries: Using Parameterized Queries from Python over ADBC.
Java JDBC Client with Parameterized Queries: Using Parameterized Queries from Java over JDBC.
Scala JDBC Client with Parameterized Queries: Using Parameterized Queries from Scala over JDBC.

The Spice Cookbook now includes 68 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.2.0, use one of the following methods:

CLI:

    spice upgrade

Homebrew:

    brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.2.0 image:

    docker pull spiceai/spiceai:1.2.0

For available tags, see DockerHub.

Helm:

    helm repo update
    helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

DataFusion: upgraded to v45.
Apache Arrow: Upgraded to v54.3.0.

Spice is now built with Rust 1.85.0 and Rust Edition 2024.

Changelog

Update end_game.md (#5312) by @peasee in https://github.com/spiceai/spiceai/pull/5312
feat: Add initial testoperator query validation (#5311) by @peasee in https://github.com/spiceai/spiceai/pull/5311
Update Helm + Prepare for next release (#5317) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5317
Update spicepod.schema.json (#5319) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5319
add integration test for reading encrypted PDFs from S3 (#5308) by @kczimm in https://github.com/spiceai/spiceai/pull/5308
Stop load_components during runtime shutdown (#5306) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5306
Update openapi.json (#5321) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5321
feat: Implement record batch data validation (#5331) by @peasee in https://github.com/spiceai/spiceai/pull/5331
Update QA analytics for v1.1.1 (#5320) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5320
fix: Update benchmark snapshots (#5337) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5337
Enforce pulls with Spice v1.0.4 (#5339) by @lukekim in https://github.com/spiceai/spiceai/pull/5339
Upgrade to DataFusion 45, Arrow 54, Rust 1.85 & Edition 2024 (#5334) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5334
feat: Allow validating testoperator in benchmark workflow (#5342) by @peasee in https://github.com/spiceai/spiceai/pull/5342
Upgrade delta_kernel to 0.9 (#5343) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5343
deps: Update odbc-api (#5344) by @peasee in https://github.com/spiceai/spiceai/pull/5344
Fix schema inference for Snowflake tables with large number of columns (#5348) by @ewgenius in https://github.com/spiceai/spiceai/pull/5348
feat: Update testoperator dispatch for validation, version metric (#5349) by @peasee in https://github.com/spiceai/spiceai/pull/5349
fix: validate_results not validate (#5352) by @peasee in https://github.com/spiceai/spiceai/pull/5352
revert to previous pdf-extract; remove test for encrypted pdf support (#5355) by @kczimm in https://github.com/spiceai/spiceai/pull/5355
Stablize the test verify_similarity_search_chat_completion (#5284) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5284
Turn off delta_kernel::log_segment logging and refactor log filtering (#5367) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5367
Upgrade to DuckDB 1.2.2 (#5375) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5375
Update Readme - fix broken and outdated links (#5376) by @ewgenius in https://github.com/spiceai/spiceai/pull/5376
Upgrade dependabot dependencies (#5385) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5385
fix: Remove IMAP oauth (#5386) by @peasee in https://github.com/spiceai/spiceai/pull/5386
Bump Helm chart to 1.1.2 (#5389) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5389
Refactor accelerator registry as part of runtime. (#5318) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5318
Include vnd.spiceai.sql/nsql.v1+json response examples (openapi docs) (#5388) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5388
docs: Update endgame template with SpiceQA, update qa analytics (#5391) by @peasee in https://github.com/spiceai/spiceai/pull/5391
Make graceful shutdown timeout configurable (#5358) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5358
docs: Update release criteria with note on max columns (#5401) by @peasee in https://github.com/spiceai/spiceai/pull/5401
Update openapi.json (#5392) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5392
FinanceBench: update scorer instructions and switch scoring model to gpt-4.1 (#5395) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5395
feat: Write OTel metrics for testoperator (#5397) by @peasee in https://github.com/spiceai/spiceai/pull/5397
Update nsql openapi title (#5403) by @ewgenius in https://github.com/spiceai/spiceai/pull/5403
Track ai_inferences_count with used tools flag. Extensible runtime request context. (#5393) by @ewgenius in https://github.com/spiceai/spiceai/pull/5393
Include newly detected view as changed view (#5408) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5408
Track used_tools in ai_inferences_with_spice_count as number (#5409) by @ewgenius in https://github.com/spiceai/spiceai/pull/5409
Update openapi.json (#5406) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5406
Tweak enforce pulls with Spice (#5411) by @lukekim in https://github.com/spiceai/spiceai/pull/5411
Allow flightsql and spiceai connectors to override flight max message size (#5407) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5407
Retry model graded scorer once on successful, empty response (#5405) by @Jeadie in https://github.com/spiceai/spiceai/pull/5405
use span task name in 'spice trace' tree, not span_id (#5412) by @Jeadie in https://github.com/spiceai/spiceai/pull/5412
Rename to track_ai_inferences_with_spice_count in all places (#5410) by @ewgenius in https://github.com/spiceai/spiceai/pull/5410
Update qa_analytics.csv (#5421) by @peasee in https://github.com/spiceai/spiceai/pull/5421
Remove the filter for the list_datasets tool in the AI inferences metric count. (#5417) by @ewgenius in https://github.com/spiceai/spiceai/pull/5417
fix: Testoperator uses an exact API key for benchmark metric submission (#5413) by @peasee in https://github.com/spiceai/spiceai/pull/5413
feat: Enable testoperator metrics in workflow (#5422) by @peasee in https://github.com/spiceai/spiceai/pull/5422
Upgrade mistral.rs (#5404) by @Jeadie in https://github.com/spiceai/spiceai/pull/5404
Include all FinanceBench documents in benchmark tests (#5426) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5426
Handle second Ctrl-C to force runtime termination (#5427) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5427
Add optional --temperature parameter for spice chat CLI command (#5429) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5429
Remove with_runtime_status from the RuntimeBuilder (#5430) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5430
Fix spice chat error handling (#5433) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5433
Add more test models to FinanceBench benchmark (#5431) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5431
support 'from: databricks:model_name' (#5434) by @Jeadie in https://github.com/spiceai/spiceai/pull/5434
Upgrade Pulls with Spice to v1.0.6 and add concurrency control (#5442) by @lukekim in https://github.com/spiceai/spiceai/pull/5442
Upgrade DataFusion table providers (#5443) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5443
Test spice chat in e2e_test_spice_cli (#5447) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5447
Allow for one-shot chat request using spice chat <message> (#5444) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5444
Enable parallel data sampling for NSQL (#5449) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5449
Upgrade Go from v1.23.4 to v1.24.2 (#5462) by @lukekim in https://github.com/spiceai/spiceai/pull/5462
Update PULL_REQUEST_TEMPLATE.md (#5465) by @lukekim in https://github.com/spiceai/spiceai/pull/5465
Enable captured outputs by default when spiced is started by the CLI (spice run) (#5464) by @lukekim in https://github.com/spiceai/spiceai/pull/5464
Parameterized queries via Flight SQL API (#5420) by @kczimm in https://github.com/spiceai/spiceai/pull/5420
fix: Update benchmarks readme badge (#5466) by @peasee in https://github.com/spiceai/spiceai/pull/5466
delay auth check for binding parameterized queries (#5475) by @kczimm in https://github.com/spiceai/spiceai/pull/5475
Add support for ? placeholder syntax in parameterized queries (#5463) by @kczimm in https://github.com/spiceai/spiceai/pull/5463
enable task name override for non static span names (#5423) by @Jeadie in https://github.com/spiceai/spiceai/pull/5423
Allow parameter queries with no parameters (#5481) by @kczimm in https://github.com/spiceai/spiceai/pull/5481
Support unparsing UNION for distinct results (#5483) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5483
add rust-toolchain.toml (#5485) by @kczimm in https://github.com/spiceai/spiceai/pull/5485
Add parameterized query support to the HTTP API (#5484) by @kczimm in https://github.com/spiceai/spiceai/pull/5484
E2E test for spice chat behavior (#5451) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5451
Renable and fix huggingface models integration tests (#5478) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5478
Update openapi.json (#5488) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5488
feat: Record memory usage as a metric (#5489) by @peasee in https://github.com/spiceai/spiceai/pull/5489
fix: update dispatcher to run all benchmarks, rename metric, update spicepods, add scale factor (#5500) by @peasee in https://github.com/spiceai/spiceai/pull/5500
Fix ILIKE filters support (#5502) by @ewgenius in https://github.com/spiceai/spiceai/pull/5502
fix: Update test spicepod locations and names (#5505) by @peasee in https://github.com/spiceai/spiceai/pull/5505
fix: Update benchmark snapshots (#5508) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5508
fix: Update benchmark snapshots (#5512) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5512
Fix Delta Lake bug for: Found unmasked nulls for non-nullable StructArray field "predicate" (#5515) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5515
fix: working directory for duckdb e2e test spicepods (#5510) by @peasee in https://github.com/spiceai/spiceai/pull/5510
Tweaks to README.md (#5516) by @lukekim in https://github.com/spiceai/spiceai/pull/5516
Cache logical plans of SQL statements (#5487) by @kczimm in https://github.com/spiceai/spiceai/pull/5487
Fix content-type: application/json (#5517) by @Jeadie in https://github.com/spiceai/spiceai/pull/5517
Validate postgres results in testoperator dispatch (#5504) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5504
fix: Update benchmark snapshots (#5511) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5511
Fix results cache by SQL with prepared statements (#5518) by @kczimm in https://github.com/spiceai/spiceai/pull/5518
Add initial support for views acceleration (#5509) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5509
fix: Update benchmark snapshots (#5527) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5527
Support switching the memory allocator Spice uses via alloc-* features. (#5528) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5528
fix: Update benchmark snapshots (#5525) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5525
Add test spicepod for tpch mysql-duckdbfile acceleration by @Sevenannn in https://github.com/spiceai/spiceai/pull/5521
Fix nightly arm build - change tag -default to -models (#5529) by @ewgenius in https://github.com/spiceai/spiceai/pull/5529
LLM router via worker spicepod component (#5513) by @Jeadie in https://github.com/spiceai/spiceai/pull/5513
Apply Spice advanced acceleration logic and params support to accelerated views (#5526) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5526
Enable DatasetCheckpoint logic for accelerated views (#5533) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5533
Fix public '.model' name for router workers (#5535) by @Jeadie in https://github.com/spiceai/spiceai/pull/5535
feat: Add Runtime memory limit parameter (#5536) by @peasee in https://github.com/spiceai/spiceai/pull/5536
For fallback worker, check first item in chat/completion stream. (#5537) by @Jeadie in https://github.com/spiceai/spiceai/pull/5537
Move rate limit check to after parameterized query binding (#5540) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5540
Update spicepod.schema.json (#5545) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5545
Accelerate views: refresh_on_startup, ready_state, jitter params support (#5547) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5547
Add integration test for accelerated views (#5550) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5550
Don't install make or expect on spiceai-macos runners (#5554) by @lukekim in https://github.com/spiceai/spiceai/pull/5554
event_stream crate for emitting events from tracing::Span; used in v1/chat/completions streaming. (#5474) by @Jeadie in https://github.com/spiceai/spiceai/pull/5474
Fix typo in method (#5559) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5559
Run test operator every day and current and previous commits (#5557) by @lukekim in https://github.com/spiceai/spiceai/pull/5557
Add aws_allow_http parameter for delta lake connector (#5541) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5541
feat: Add branch name to metric dimensions in testoperator (#5563) by @peasee in https://github.com/spiceai/spiceai/pull/5563
fix: Update the tpch benchmark snapshots for: ./test/spicepods/tpch/sf1/federated/odbc[databricks].yaml (#5565) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5565
fix: Split scheduled dispatch into a separate job (#5567) by @peasee in https://github.com/spiceai/spiceai/pull/5567
fix: Use outputs.SPICED_COMMIT (#5568) by @peasee in https://github.com/spiceai/spiceai/pull/5568
fix: Use refs in testoperator dispatch instead of commits (#5569) by @peasee in https://github.com/spiceai/spiceai/pull/5569
fix: actions/checkout ref does not take a full ref (#5571) by @peasee in https://github.com/spiceai/spiceai/pull/5571
fix: Testoperator dispatch (#5572) by @peasee in https://github.com/spiceai/spiceai/pull/5572
Respect update-snapshots when running all benchmarks manually (#5577) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5577
Use FETCH_HEAD instead of ${{ inputs.ref }} to list commits in setup_spiced (#5579) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5579
Add additional test scenarios for benchmarks (#5582) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5582
fix: Update the tpch benchmark snapshots for: test/spicepods/tpch/sf1/accelerated/databricks[delta_lake]-duckdb[file].yaml (#5590) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5590
fix: Update the tpch benchmark snapshots for: test/spicepods/tpch/sf1/accelerated/mysql-duckdb[file].yaml (#5591) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5591
Fix Snowflake data connector rows ordering (#5599) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5599
fix: Update benchmark snapshots (#5595) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5595
fix: Update the tpch benchmark snapshots for: test/spicepods/tpch/sf1/accelerated/databricks[delta_lake]-arrow.yaml (#5594) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5594
fix: Update benchmark snapshots (#5589) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5589
fix: Update benchmark snapshots (#5583) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5583
Downgrade DuckDB to 1.1.3 (#5607) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5607
Add prepared statement integration tests (#5544) by @kczimm in https://github.com/spiceai/spiceai/pull/5544

Full Changelog: v1.1.2...v1.2.0

- Rust
Published by ewgenius about 1 year ago

https://github.com/spiceai/spiceai - v1.1.2

Spice v1.1.2 (Apr 14, 2025)

Spice v1.1.2 improves Delta Lake Data Connector performance, introduces new Accept headers for the /v1/sql and /v1/nsql endpoints to include query metadata with results, and resolves an issue with the Snowflake Data Connector when handling wide tables (>600 columns).

The official Tableau Connector for Spice.ai v0.1 has been released, making it easy to connect to both self-hosted Spice.ai and Spice Cloud instances using Tableau.

What's New in v1.1.2

Tableau Connector for Spice.ai: Released the initial version (v0.1) of the official Tableau Taco Connector (fully open-source), enabling data visualization and analytics in Tableau with self-hosted Spice.ai and Spice Cloud deployments.
- Official Release: github.com/spicehq/tableau-connector/releases/tag/v0.1.0
- Docs: spiceai.org/docs/clients/tableau
- Open Source Repository: github.com/spiceai/tableau-connector
Delta Lake Data Connector: Upgraded delta_kernel to v0.9, and optimized scan operations, reducing query execution time by up to 20% on large datasets.
Snowflake Data Connector: Fixed a bug that caused failures when loading tables with more than 600 columns.
Query Metadata (SQL and NSQL): Added support for the application/vnd.spiceai.sql.v1+json Accept header on the /v1/sql endpoint, and the application/vnd.spiceai.nsql.v1+json Accept header on the /v1/nsql endpoint, enabling responses to include metadata such as the executed SQL query and schema alongside results.

Example:

```bash
curl -XPOST "http://localhost:8090/v1/nsql" \
  -H "Content-Type: application/json" \
  -H "Accept: application/vnd.spiceai.nsql.v1+json" \
  -d '{
    "query": "What’s the highest tip any passenger gave?"
  }' | jq
```

Example response:

```json
{
  "row_count": 1,
  "schema": {
    "fields": [
      {
        "name": "highest_tip",
        "data_type": "Float64",
        "nullable": true,
        "dict_id": 0,
        "dict_is_ordered": false,
        "metadata": {}
      }
    ],
    "metadata": {}
  },
  "data": [
    {
      "highest_tip": 428.0
    }
  ],
  "sql": "SELECT MAX(\"tip_amount\") AS \"highest_tip\"\nFROM \"spice\".\"public\".\"taxi_trips\""
}
```

For details, see the SQL Query API and NSQL API documentation.
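A client that requests the metadata format can read the generated SQL and the schema next to the rows. The sketch below parses the example response from this section with Python's standard `json` module. The `summarize` helper and the shape of its return value are illustrative choices, not part of any Spice SDK.

```python
import json

# The example response from this section, pasted as a raw string so the
# JSON escapes (\" and \n) reach json.loads unchanged.
response_body = r"""
{
  "row_count": 1,
  "schema": {
    "fields": [
      {
        "name": "highest_tip",
        "data_type": "Float64",
        "nullable": true,
        "dict_id": 0,
        "dict_is_ordered": false,
        "metadata": {}
      }
    ],
    "metadata": {}
  },
  "data": [{"highest_tip": 428.0}],
  "sql": "SELECT MAX(\"tip_amount\") AS \"highest_tip\"\nFROM \"spice\".\"public\".\"taxi_trips\""
}
"""


def summarize(body: str) -> dict:
    """Pull the executed SQL, column types, and rows out of the response."""
    payload = json.loads(body)
    return {
        "sql": payload["sql"],
        "columns": {f["name"]: f["data_type"] for f in payload["schema"]["fields"]},
        "row_count": payload["row_count"],
        "rows": payload["data"],
    }


summary = summarize(response_body)
```

Logging `summary["sql"]` alongside the rows is one way to audit what an NSQL request actually executed.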

Contributors

Breaking Changes

No breaking changes in this release.

Cookbook Updates

No major cookbook additions.

The Spice Cookbook now includes 65 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.1.2, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.1.2 image:

```console
docker pull spiceai/spiceai:1.1.2
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

delta_kernel: updated to v0.9.0.

Changelog

Backport - Fix schema inference for Snowflake tables with large number of columns #5348 by @ewgenius in #5350
Upgrade delta_kernel to 0.9 (#5343) by @phillipleblanc in #5356
Add basic support for application/vnd.spiceai.sql.v1+json format (#5333) by @sgrebnov in #5333
Convert DataFusion filters to Delta Kernel predicates by @phillipleblanc in #5362
revert to previous pdf-extract; remove test for encrypted pdf support by @kczimm in #5355
Turn off delta_kernel::log_segment logging and refactor log filtering by @phillipleblanc in #5367
Extend application/vnd.spiceai.sql.v1+json with schema and row_count fields by @sgrebnov in #5365
Make separate vnd.spiceai.sql.v1+json and vnd.spiceai.nsql.v1+json MIME types by @sgrebnov in #5382

Full Changelog: v1.1.1...v1.1.2

- Rust
Published by phillipleblanc about 1 year ago

https://github.com/spiceai/spiceai - v1.1.1

Spice v1.1.1 (Apr 7, 2025)

Spice v1.1.1 introduces several key updates, including a new Component Metrics System, improved Delta Data Connector performance, improved MCP tool descriptions, and expanded runtime results caching options. This release also adds detailed MySQL connection pool metrics for better observability. Component Metrics are Prometheus-compatible and accessible via the metrics endpoint.

Highlights in v1.1.1

Component Metrics System: A new system for monitoring components, starting with MySQL connection pool metrics. These metrics provide insights into MySQL connection performance and can be selectively enabled in the dataset configuration. Metrics are exposed in Prometheus format via the metrics endpoint.

For more details, see the Component Metrics documentation.

Results Caching Enhancements: Added a cache_key_type option for runtime results caching. Options include:
- plan (Default): Uses the query's logical plan as the cache key. Matches semantically equivalent queries but requires query parsing.
- sql: Uses the raw SQL string as the cache key. Provides faster lookups but requires exact string matches. Use sql for predictable queries without dynamic functions like NOW().

Example spicepod.yaml configuration:

```yaml
runtime:
  results_cache:
    enabled: true
    cache_max_size: 128MiB
    cache_key_type: sql # Use SQL for the results cache key
    item_ttl: 1s
```

For more details, see the runtime configuration documentation.
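The trade-off between the two `cache_key_type` options can be illustrated with a toy cache. This is a sketch under stated assumptions, not Spice's cache: `ResultsCache`, `sql_key`, and `plan_key` are hypothetical names, and `plan_key` only imitates a logical-plan key by collapsing whitespace and ignoring case, whereas a real engine derives the key from the parsed plan.

```python
import re
from typing import Callable, Dict, Optional


def sql_key(query: str) -> str:
    """`sql`-style key: the raw query text, so only exact strings match."""
    return query


def plan_key(query: str) -> str:
    """Crude stand-in for a plan-based key (would mangle string literals)."""
    return re.sub(r"\s+", " ", query.strip()).lower()


class ResultsCache:
    """Toy results cache whose lookup key is produced by `key_fn`."""

    def __init__(self, key_fn: Callable[[str], str]) -> None:
        self._key_fn = key_fn
        self._entries: Dict[str, list] = {}

    def put(self, query: str, rows: list) -> None:
        self._entries[self._key_fn(query)] = rows

    def get(self, query: str) -> Optional[list]:
        return self._entries.get(self._key_fn(query))


by_sql = ResultsCache(sql_key)
by_sql.put("SELECT 1", [1])

by_plan = ResultsCache(plan_key)
by_plan.put("SELECT 1", [1])
```

Here `by_sql.get("select  1")` misses because the string differs, while `by_plan.get("select  1")` hits. The `sql` key skips the normalization step, which is the source of its faster lookups.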

Delta Lake Data Connector: Improved scan performance, resulting in faster queries.
MCP Tools: Improved descriptions for built-in MCP tools to improve usability.
MySQL Component Metrics: Added detailed metrics for monitoring MySQL connections, such as connection count and pool activity.

Example spicepod.yaml configuration:

```yaml
datasets:
  - from: mysql:my_table
    name: my_dataset
    metrics:
      - name: connection_count
        enabled: true
      - name: connections_in_pool
        enabled: true
      - name: active_wait_requests
        enabled: true
    params:
      mysql_host: localhost
      mysql_tcp_port: 3306
      mysql_user: root
      mysql_pass: ${secrets:MYSQL_PASS}
```

For more details, see the MySQL Data Connector documentation.
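Because component metrics are exposed in Prometheus format, any Prometheus-compatible scraper can read them. For a quick manual check, the text format is simple enough to parse by hand, as in the sketch below. The sample payload and its metric names are invented for illustration (the real names come from the metrics endpoint), and the parser assumes lines carry no trailing timestamp.

```python
from typing import Dict

# Hypothetical sample of Prometheus text exposition output. The metric names
# are made up; query the real metrics endpoint for the actual ones.
sample = """\
# HELP example_connection_count Hypothetical gauge.
# TYPE example_connection_count gauge
example_connection_count{dataset="my_dataset"} 3
example_connections_in_pool{dataset="my_dataset"} 2
"""


def parse_metrics(text: str) -> Dict[str, float]:
    """Map each `name{labels}` series to its value, skipping comment lines."""
    series: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token on the line.
        name, _, value = line.rpartition(" ")
        series[name] = float(value)
    return series


metrics = parse_metrics(sample)
```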

spice.js SDK: The spice.js SDK has been updated to v2.0.1 and includes several important security updates.

New Contributors 🎉

@kczimm made their first contribution in #5243

Contributors

Breaking Changes

No breaking changes in this release.

Cookbook Updates

The Spice Cookbook now includes 65 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.1.1, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.1.1 image:

```console
docker pull spiceai/spiceai:1.1.1
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

No major dependency changes.

Changelog

fix: Testoperator DuckDB, SQLite, Postgres, Spicecloud by @peasee in #5190
Update Helm Chart and SECURITY.md to v1.1.0 by @lukekim in #5223
Update version.txt to v1.1.1-unstable by @lukekim in #5224
Update Cargo.lock to v1.1.1-unstable by @lukekim in #5225
Add tests for verify_schema_source_path in ListingTableConnector by @phillipleblanc in #5221
Reduce noise from debug logging by @phillipleblanc in #5227
Improve openai_test_chat_messages integration test reliability by @Sevenannn in #5222
Verify the checkpoints existence before shutting down runtime in integration tests directly querying checkpoint by @Sevenannn in #5232
Fix CORS support for json content-type api by @sgrebnov in #5241
Fix ModelGradedScorer error: The 'metadata' parameter is only allowed when 'store' is enabled. by @sgrebnov in #5231
fix: Use pulls-with-spice-action and switch to spiceai-macos runners by @peasee in #5238
Use v1.0.3 pulls with spice action by @lukekim in #5244
feat: Build ODBC binaries, run testoperator on ODBC by @peasee in #5237
Bump timeout for several integration test runtime load_components & readiness check by @Sevenannn in #5229
Validate port is available before binding port for docker container in integration tests by @Sevenannn in #5248
Update datafusion-table-providers to fix the schema for PostgreSQL materialized views by @ewgenius in #5259
Verify flight server is ready for flight integration tests by @Sevenannn in #5240
fix: Publish to MinIO inside of matrix on build_and_release by @peasee in #5258
fix: TPCDS on zero results benchmarks by @peasee in #5263
Use model as a judge scorer for Financebench by @sgrebnov in #5264
Fix FinanceBench llm scorer secret name by @sgrebnov in #5276
Implements support for runtime.results_cache.cache_key_type by @phillipleblanc in #5265
fix: Testoperator MS SQL, query overrides, dispatcher by @peasee in #5279
refactor: Delete old benchmarks by @peasee in #5283
Imporve embedding column parsing performance test by @Sevenannn in #5268
Add Support for AWS Session Token in S3 Data Connector by @kczimm in #5243
Implement Component Metrics system + MySQL connection pool metrics by @phillipleblanc in #5290
Add default descriptions to built-in MCP tools by @lukekim in #5293
fix: Vector search with cased columns by @peasee in #5295
Run delta kernel scan in a blocking Tokio thread. by @phillipleblanc in #5296
Expose the mysql_pool_min and mysql_pool_max connection pool parameters by @phillipleblanc in #5297
use patched pdf-extract by @kczimm in #5270

Full Changelog: v1.1.0...v1.1.1

- Rust
Published by phillipleblanc about 1 year ago

https://github.com/spiceai/spiceai - v1.1.0

Spice v1.1.0 (Mar 31, 2025)

Spice v1.1.0 introduces full support for the Model Context Protocol (MCP), expanding how models and tools connect. Spice can now act as both an MCP Server, with the new /v1/mcp/sse API, and an MCP Client, supporting stdio and SSE-based servers. This release also introduces a new Web Search tool with Perplexity model support, advanced evaluation workflows with custom eval scorers (including LLM-as-a-judge), and an IMAP Data Connector for federated SQL queries across email servers. Alongside these features, v1.1.0 includes automatic NSQL query retries, expanded task tracing, and request drains for HTTP server shutdowns, delivering improved reliability, flexibility, and observability.

Highlights in v1.1.0

Spice as an MCP Server and Client: Spice now supports the Model Context Protocol (MCP), for expanded tool discovery and connectivity. Spice can:

- Run stdio-based MCP servers internally.
- Connect to external MCP servers over the SSE protocol (Streamable HTTP is coming soon!).

For more details, see the MCP documentation.

### Usage

```yaml
tools:
  - name: google_maps
    from: mcp:npx
    params:
      mcp_args: -y @modelcontextprotocol/server-google-maps
```

### Spice as an MCP Server

Tools in Spice can be accessed via MCP, for example by connecting to Spice from an IDE like Cursor or Windsurf. Set the MCP Server URL to http://localhost:8090/v1/mcp/sse.

Perplexity Model Support: Spice now supports Perplexity-hosted models, enabling advanced web search and retrieval capabilities. Example configuration:

```yaml
models:
  - name: webs
    from: perplexity:sonar
    params:
      perplexity_auth_token: ${ secrets:SPICE_PERPLEXITY_AUTH_TOKEN }
      perplexity_search_domain_filter:
        - docs.spiceai.org
        - huggingface.co
```

For more details, see the Perplexity documentation.

Web Search Tool: The new Web Search Tool enables Spice models to search the web for information using search engines like Perplexity. Example configuration:

```yaml
tools:
  - name: the_internet
    from: websearch
    description: 'Search the web for information.'
    params:
      engine: perplexity
      perplexity_auth_token: ${ secrets:SPICE_PERPLEXITY_AUTH_TOKEN }
```

For more details, see the Web Search Tool documentation.

Eval Scorers: Eval scorers assess model performance on evaluation cases. Spice includes built-in scorers:
- match: Exact match.
- json_match: JSON equivalence.
- includes: Checks if actual output includes expected output.
- fuzzy_match: Normalized subset matching.
- levenshtein: Levenshtein distance.

Custom scorers can use embedding models or LLMs as judges. Example:

```yaml
evals:
  - name: australia
    dataset: cricket_questions
    scorers:
      - hf_minilm
      - judge
      - match

embeddings:
  - name: hf_minilm
    from: huggingface:huggingface.co/sentence-transformers/all-MiniLM-L6-v2

models:
  - name: judge
    from: openai:gpt-4o
    params:
      openai_api_key: ${ secrets:OPENAI_API_KEY }
      system_prompt: |
        Compare these stories and score their similarity (0.0 to 1.0).
        Story A: {{ .actual }}
        Story B: {{ .ideal }}
```

For more details, see the Eval Scorers documentation.
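To give a feel for what the simpler built-in scorers compute, here is a minimal Python sketch of `match`, `json_match`, `includes`, and `levenshtein`. These are textbook versions written for illustration. The function signatures, the 0.0/1.0 return values, and returning the raw edit distance are assumptions, not a description of how Spice implements or normalizes its scorers.

```python
import json


def match(actual: str, ideal: str) -> float:
    """Exact string equality."""
    return 1.0 if actual == ideal else 0.0


def json_match(actual: str, ideal: str) -> float:
    """Equality after JSON parsing, so key order and spacing are ignored."""
    return 1.0 if json.loads(actual) == json.loads(ideal) else 0.0


def includes(actual: str, ideal: str) -> float:
    """True when the expected output appears inside the actual output."""
    return 1.0 if ideal in actual else 0.0


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning `a` into `b`."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]
```

Scorers like these are cheap and deterministic, which makes them a useful baseline next to embedding-based or LLM-judge scorers.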

IMAP Data Connector: Query emails stored in IMAP servers using federated SQL. Example:

```yaml
datasets:
  - from: imap:myawesomeemail@gmail.com
    name: emails
    params:
      imap_access_token: ${secrets:IMAP_ACCESS_TOKEN}
```

For more details, see the IMAP Data Connector documentation.

Automatic NSQL Query Retries: Failed NSQL queries are now automatically retried, improving reliability for federated queries. For more details, see the NSQL documentation.
Enhanced Task Tracing: Task history now includes chat completion IDs, and runtime readiness is traced for better observability. Use the runtime.task_history table to query task details. See the Task History documentation.
Vector Search with Keyword Filtering: The vector search API now includes an optional list of keywords as a parameter, to pre-filter SQL results before performing a vector search. When vector searching via a chat completion, models will automatically generate keywords relevant to the search. See the Vector Search API documentation.
Improved Refresh Behavior on Startup: Spice won't automatically refresh an accelerated dataset on startup if it doesn't need to. See the Refresh on Startup documentation.
Graceful Shutdown for HTTP Server: The HTTP server now drains requests for graceful shutdowns, ensuring smoother runtime termination.
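The keyword pre-filtering described under Vector Search with Keyword Filtering can be pictured as a two-stage pipeline: narrow the candidates by keyword, then rank the survivors by vector similarity. The sketch below is a toy in-memory version under stated assumptions. The `search` and `cosine` helpers, the two-dimensional embeddings, and the rule that a document passes if it contains any keyword are all illustrative, not the behaviour of the Spice API.

```python
import math
from typing import List, Sequence, Tuple

Doc = Tuple[str, Sequence[float]]  # (text, embedding)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    docs: List[Doc],
    query_vec: Sequence[float],
    keywords: List[str],
    limit: int = 2,
) -> List[str]:
    """Keep docs containing any keyword, then rank them by similarity."""
    lowered = [k.lower() for k in keywords]
    candidates = [
        d for d in docs
        if not lowered or any(k in d[0].lower() for k in lowered)
    ]
    ranked = sorted(candidates, key=lambda d: cosine(d[1], query_vec), reverse=True)
    return [text for text, _ in ranked[:limit]]


docs: List[Doc] = [
    ("spice supports duckdb", [1.0, 0.0]),
    ("duckdb file mode", [0.6, 0.8]),
    ("unrelated postgres note", [1.0, 0.0]),
]
```

Calling `search(docs, [1.0, 0.0], ["duckdb"])` drops the Postgres note before any similarity is computed, even though its embedding is as close to the query as the top result.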

New Contributors 🎉

@Garamda made their first contribution in https://github.com/spiceai/spiceai/pull/4840
@sergey-shandar made their first contribution in https://github.com/spiceai/spiceai/pull/4868
@benrussell made their first contribution in https://github.com/spiceai/spiceai/pull/5126

Contributors

@sgrebnov
@phillipleblanc
@peasee
@Jeadie
@lukekim
@benrussell
@Sevenannn
@sergey-shandar
@Garamda
@johnnynunez

Breaking Changes

No breaking changes.

Cookbook Updates

The Spice Cookbook now has 74 recipes that make it easy to get started with Spice!

Upgrading

To upgrade to v1.1.0, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.1.0 image:

```console
docker pull spiceai/spiceai:1.1.0
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

No major dependency changes.

Changelog

release: Bump chart, and versions for next release by @peasee in https://github.com/spiceai/spiceai/pull/4464
feat: Schedule testoperator by @peasee in https://github.com/spiceai/spiceai/pull/4503
fix: Remove on zero results arguments from benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/4533
fix: Don't snapshot clickbench benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/4534
docs: v1.0.1 release note by @Sevenannn in https://github.com/spiceai/spiceai/pull/4529
Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4535
In spiced_docker, propagate setup to publish-cuda by @Jeadie in https://github.com/spiceai/spiceai/pull/4543
Upgrade Rust to 1.84 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4541
Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4546
Revert "Use OpenAI golang client in spice chat (#4491)" by @Jeadie in https://github.com/spiceai/spiceai/pull/4564
feat: add schema inference for the Spice.ai Data Connector by @peasee in https://github.com/spiceai/spiceai/pull/4579
Remove 'tools: builtin' by @Jeadie in https://github.com/spiceai/spiceai/pull/4607
feat: Add initial IMAP connector by @peasee in https://github.com/spiceai/spiceai/pull/4587
feat: Add email content loading by @peasee in https://github.com/spiceai/spiceai/pull/4616
feat: Add SSL and Auth parameters for IMAP by @peasee in https://github.com/spiceai/spiceai/pull/4613
Change /v1/models to be OpenAI compatible by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4624
Use pdf-extract crate to extract text from PDF documents by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4615
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4628
Add 1.0.2 release notes by @sgrebnov in https://github.com/spiceai/spiceai/pull/4627
Fix cuda::ffi by @Jeadie in https://github.com/spiceai/spiceai/pull/4649
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/4654
fix: Spice.ai schema inference by @peasee in https://github.com/spiceai/spiceai/pull/4674
Add SQL Benchmark with sample eval configuration based on TPCH by @sgrebnov in https://github.com/spiceai/spiceai/pull/4549
Update Helm chart to Spice v1.0.2 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4655
Update v1.0.2 release notes by @sgrebnov in https://github.com/spiceai/spiceai/pull/4639
Fix E2E AI release install test on self-hosted runners (macos) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4675
Main performance metrics calculation for Text to SQL Benchmark by @sgrebnov in https://github.com/spiceai/spiceai/pull/4681
Add eval datasets / test scripts for model grading criteria by @Sevenannn in https://github.com/spiceai/spiceai/pull/4663
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4684
Add testoperator for evals running by @sgrebnov in https://github.com/spiceai/spiceai/pull/4688
Add GH Workflow to run Text to SQL benchmark by @sgrebnov in https://github.com/spiceai/spiceai/pull/4689
Add 1.0.2 as supported version to SECURITY.md by @sgrebnov in https://github.com/spiceai/spiceai/pull/4695
Text-To-SQL benchmark: trace failed tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/4705
Text-To-SQL benchmark: extend list of benchmarking models by @sgrebnov in https://github.com/spiceai/spiceai/pull/4707
Text-To-SQL: increase sql coverage, add more advanced tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/4713
Use model that supports tools in hf_test by @Jeadie in https://github.com/spiceai/spiceai/pull/4712
Fix Spice.ai E2E test by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4723
Return non-existing model for v1/chat endpoint by @Sevenannn in https://github.com/spiceai/spiceai/pull/4718
Update Helm chart for 1.0.3 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4742
Update dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4740
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/4744
Update SECURITY.md with 1.0.3 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4745
Add basic smoke test of perplexity LLM to llm integration tests. by @Jeadie in https://github.com/spiceai/spiceai/pull/4735
Don't run integration tests on PRs when only CLI is changed by @Jeadie in https://github.com/spiceai/spiceai/pull/4751
Prompt user to upgrade through brew / do another clean install when spice is installed through homebrew / at non-standard path by @Sevenannn in https://github.com/spiceai/spiceai/pull/4746
feat: Search with keyword filtering by @peasee in https://github.com/spiceai/spiceai/pull/4759
Fix search benchmark by @sgrebnov in https://github.com/spiceai/spiceai/pull/4765
feat: Add IMAP access token parameter by @peasee in https://github.com/spiceai/spiceai/pull/4769
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4774
Mark trunk builds as unstable by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4776
feat: Release Spice.ai RC by @peasee in https://github.com/spiceai/spiceai/pull/4753
fix: Validate columns and keywords in search by @peasee in https://github.com/spiceai/spiceai/pull/4775
Run models E2E tests on PR by @sgrebnov in https://github.com/spiceai/spiceai/pull/4798
fix: models runtime not required for cloud chat by @peasee in https://github.com/spiceai/spiceai/pull/4781
Only open one PR for openapi.json by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4807
docs: Release IMAP Alpha by @peasee in https://github.com/spiceai/spiceai/pull/4797
Add Results-Cache-Status to indicate query result came from cache by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4809
Initial spice cli e2e tests with spice upgrade tests by @Sevenannn in https://github.com/spiceai/spiceai/pull/4764
Log CLI and Runtime Versions on startup by @sgrebnov in https://github.com/spiceai/spiceai/pull/4816
Sort keys for openai by @Jeadie in https://github.com/spiceai/spiceai/pull/4766
Remove docs index trigger from the endgame template by @ewgenius in https://github.com/spiceai/spiceai/pull/4832
Release notes for v1.0.4 by @Jeadie in https://github.com/spiceai/spiceai/pull/4827
Update SECURITY.md by @Jeadie in https://github.com/spiceai/spiceai/pull/4829
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/4831
Don't print URL by @lukekim in https://github.com/spiceai/spiceai/pull/4838
add 'eval_run' to 'spice trace' by @Jeadie in https://github.com/spiceai/spiceai/pull/4841
Run benchmark tests w/o uploading test results (pending improvements) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4843
Fix 'actual" and "output" columns in eval.results. by @Jeadie in https://github.com/spiceai/spiceai/pull/4835
Fix string escaping of system prompt by @Jeadie in https://github.com/spiceai/spiceai/pull/4844
update helm chart to v1.0.4 by @Jeadie in https://github.com/spiceai/spiceai/pull/4828
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4806
fix: Skip sccache in PR for external users by @peasee in https://github.com/spiceai/spiceai/pull/4851
fix: Return BAD_REQUEST when not embeddings are configured by @peasee in https://github.com/spiceai/spiceai/pull/4804
Debug log cuda detection failure in spice by @Sevenannn in https://github.com/spiceai/spiceai/pull/4852
fix: Set RUSTC wrapper explicitly by @peasee in https://github.com/spiceai/spiceai/pull/4854
Improve trace UX for ai_completion, fix infinite tool calls by @Jeadie in https://github.com/spiceai/spiceai/pull/4853
Allow homebrew spice cli to upgrade the runtime by @Sevenannn in https://github.com/spiceai/spiceai/pull/4811
Add support for MCP tools by @Jeadie in https://github.com/spiceai/spiceai/pull/4808
fix: Rustc wrapper actions by @peasee in https://github.com/spiceai/spiceai/pull/4867
Provide link to supported OS list when user platform is not supported by @Garamda in https://github.com/spiceai/spiceai/pull/4840
Always download spice runtime version matched with spice cli version by @Sevenannn in https://github.com/spiceai/spiceai/pull/4761
Disable flaky integration test by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4871
fix: sccache actions setup by @peasee in https://github.com/spiceai/spiceai/pull/4873
Fixing Go installation in the setup script for Linux Arm64 by @sergey-shandar in https://github.com/spiceai/spiceai/pull/4868
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4864
DuckDB acceleration: Use temp table only for append with conflict resolution by @sgrebnov in https://github.com/spiceai/spiceai/pull/4874
Trace the output of streamed chat/completions to runtime.task_history. by @Jeadie in https://github.com/spiceai/spiceai/pull/4845
Always pass X-API-Key in spice api calls header if detected in env by @ewgenius in https://github.com/spiceai/spiceai/pull/4878
Revert "DuckDB acceleration: Use temp table only for append with conflict resolution" by @sgrebnov in https://github.com/spiceai/spiceai/pull/4886
Allow overriding spicerack base url in the CLI by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4892
Add test Spicepod for DuckDB full acceleration with constraints by @sgrebnov in https://github.com/spiceai/spiceai/pull/4891
Refactor Parameter Handling by @Advayp in https://github.com/spiceai/spiceai/pull/4833
Add test Spicepod for DuckDB append acceleration with constraints by @sgrebnov in https://github.com/spiceai/spiceai/pull/4898
Update to latest async-openai fork. Update secrecy by @Sevenannn in https://github.com/spiceai/spiceai/pull/4911
Fix mcp tools build by @sgrebnov in https://github.com/spiceai/spiceai/pull/4916
Add more test spicepods by @Sevenannn in https://github.com/spiceai/spiceai/pull/4923
task: Add more dispatch files by @peasee in https://github.com/spiceai/spiceai/pull/4933
run spiceai benchmark test using test operator by @Sevenannn in https://github.com/spiceai/spiceai/pull/4920
Convert sequential search code block to parallel async by @Garamda in https://github.com/spiceai/spiceai/pull/4936
fix: Throughput metric calculation by @peasee in https://github.com/spiceai/spiceai/pull/4938
Update dependabot dependencies & cargo update by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4872
Improve servers shutdown sequence during runtime termination by @sgrebnov in https://github.com/spiceai/spiceai/pull/4942
Semantic model for views. Views visible in table_schema & list_datasets tools. by @Jeadie in https://github.com/spiceai/spiceai/pull/4946
update openai-async by @Jeadie in https://github.com/spiceai/spiceai/pull/4948
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4961
fix: Redundant results snapshotting by @peasee in https://github.com/spiceai/spiceai/pull/4956
Create schema for views if not exist by @Jeadie in https://github.com/spiceai/spiceai/pull/4957
Bump Jimver/cuda-toolkit from 0.2.21 to 0.2.22 by @dependabot in https://github.com/spiceai/spiceai/pull/4969
List available operations in spice trace <operation> by @Jeadie in https://github.com/spiceai/spiceai/pull/4953
Initial commit of release analytics by @lukekim in https://github.com/spiceai/spiceai/pull/4975
Remove spaces from CSV by @lukekim in https://github.com/spiceai/spiceai/pull/4977
Fix Spice pods watcher by @sgrebnov in https://github.com/spiceai/spiceai/pull/4984
feat: Add appendable data sources for the testoperator by @peasee in https://github.com/spiceai/spiceai/pull/4949
Omit timestamp when warning regarding datasets with hyphens by @Advayp in https://github.com/spiceai/spiceai/pull/4987
Update helm chart to v1.0.5 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4990
docs: Update qa_analytics.csv by @peasee in https://github.com/spiceai/spiceai/pull/4989
Update end_game template by @sgrebnov in https://github.com/spiceai/spiceai/pull/4991
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/4993
Add v1.0.5 release notes by @sgrebnov in https://github.com/spiceai/spiceai/pull/4994
Supported Versions: include v1.0.5 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4995
Dependabot updates by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4992
Switch to basic markdown formatting for vector search by @sgrebnov in https://github.com/spiceai/spiceai/pull/4934
docs: Update qa_analytics.csv by @peasee in https://github.com/spiceai/spiceai/pull/5001
feat: Add TPCDS FileAppendableSource for testoperator by @peasee in https://github.com/spiceai/spiceai/pull/5002
Update ring by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5003
docs: Update qa_analytics.csv by @peasee in https://github.com/spiceai/spiceai/pull/5006
feat: Add ClickBench FileAppendableSource for testoperator by @peasee in https://github.com/spiceai/spiceai/pull/5004
feat: Validate append test table counts by @peasee in https://github.com/spiceai/spiceai/pull/5008
feat: Add append spicepods by @peasee in https://github.com/spiceai/spiceai/pull/5009
Improve Vector Search performance for large content w/o primary key defined by @sgrebnov in https://github.com/spiceai/spiceai/pull/5010
Don't try to downgrade Arc in test_acceleration_duckdb_single_instance by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5014
feat: Add an initial testoperator vector search command by @peasee in https://github.com/spiceai/spiceai/pull/5011
feat: Update testoperator workflows for automatic snapshot updates by @peasee in https://github.com/spiceai/spiceai/pull/5018
Fix Vector Search when additional columns include embedding column by @sgrebnov in https://github.com/spiceai/spiceai/pull/5022
Include test for primary key passed as additional column in Vector Search by @sgrebnov in https://github.com/spiceai/spiceai/pull/5024
fix: Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/5020
upgrade mistral.rs by @Jeadie in https://github.com/spiceai/spiceai/pull/4952
fix: Indexes for TPCDS SQLite Spicepod by @peasee in https://github.com/spiceai/spiceai/pull/5038
fix: Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/5035
Include local files in generated Spicepod package by @sgrebnov in https://github.com/spiceai/spiceai/pull/5041
update mistral.rs to 'spiceai' branch rev by @Jeadie in https://github.com/spiceai/spiceai/pull/5029
Configure spiced as an MCP SSE server by @Jeadie in https://github.com/spiceai/spiceai/pull/5039
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/5052
fix: Disable benchmarks schedule, enable testoperator schedule by @peasee in https://github.com/spiceai/spiceai/pull/5058
fix: Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/5060
Update ROADMAP.md March 2025 by @lukekim in https://github.com/spiceai/spiceai/pull/5061
fix: Testoperator data setup by @peasee in https://github.com/spiceai/spiceai/pull/5068
fix: All HTTP endpoints to hang when adding an invalid dataset with --pods-watcher-enabled by @sgrebnov in https://github.com/spiceai/spiceai/pull/5050
fix: Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/5073
Integration tests for MCP tooling by @Jeadie in https://github.com/spiceai/spiceai/pull/5053
OpenAPI docs for MCP by @Jeadie in https://github.com/spiceai/spiceai/pull/5057
fix: Acceleration federation test by @peasee in https://github.com/spiceai/spiceai/pull/5090
fix: Allow spiced commit in testoperator dispatch by @peasee in https://github.com/spiceai/spiceai/pull/5098
fix: Use RefreshOverrides for the refresh API definition by @peasee in https://github.com/spiceai/spiceai/pull/5095
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/5094
fix: Increase tries for refresh_status_change_to_ready test by @peasee in https://github.com/spiceai/spiceai/pull/5099
feat: Testoperator reports on max and median memory usage by @peasee in https://github.com/spiceai/spiceai/pull/5101
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/5105
fix: Fail testoperator on failed queries by @peasee in https://github.com/spiceai/spiceai/pull/5106
Update Helm chart to 1.0.6 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5107
Update SECURITY.md to include 1.0.6 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5109
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/5108
Add QA analytics for 1.0.6 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5110
add env variables to tools, usable in MCP stdio by @Jeadie in https://github.com/spiceai/spiceai/pull/5097
HF downloads obey SIGTERM by @Jeadie in https://github.com/spiceai/spiceai/pull/5044
Add v1.0.6 release notes into trunk by @sgrebnov in https://github.com/spiceai/spiceai/pull/5111
Remove redundant mod name for iceberg integration tests by @Sevenannn in https://github.com/spiceai/spiceai/pull/5112
Use fixed data directory for test operator by @Sevenannn in https://github.com/spiceai/spiceai/pull/5103
Improvements for evals by @Jeadie in https://github.com/spiceai/spiceai/pull/5040
Make McpProxy trait for MCP passthrough by @Jeadie in https://github.com/spiceai/spiceai/pull/5115
Properly handle '/' for tool names. by @Jeadie in https://github.com/spiceai/spiceai/pull/5116
Use retry logic when loading tools by @Jeadie in https://github.com/spiceai/spiceai/pull/5120
Exclude slow tests from regular pr runs by @Sevenannn in https://github.com/spiceai/spiceai/pull/5119
Fix test operator snapshot update by @Sevenannn in https://github.com/spiceai/spiceai/pull/5130
spice init: Fixes windows bug where full path is used for spicepod name by @benrussell in https://github.com/spiceai/spiceai/pull/5126
fix: Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/5131
Implement graceful shutdown for HTTP server by @sgrebnov in https://github.com/spiceai/spiceai/pull/5102
Update enhancement.md by @lukekim in https://github.com/spiceai/spiceai/pull/5142
Add GitHub Workflow and PoC Spicepod configuration to run FinanceBench tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/5145
Fix Postgres and MySQL installation on macos14-runner (E2E CI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5155
De-duplicate attachments in DuckDBAttachments by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5156
v1.0.7 release note by @Sevenannn in https://github.com/spiceai/spiceai/pull/5153
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/5160
Update Helm chart to 1.0.7 by @Sevenannn in https://github.com/spiceai/spiceai/pull/5159
Add github token to macos test release download tasks by @Sevenannn in https://github.com/spiceai/spiceai/pull/5161
update security.md for 1.0.7 by @Sevenannn in https://github.com/spiceai/spiceai/pull/5162
Update roadmap.md by @Sevenannn in https://github.com/spiceai/spiceai/pull/5163
Add a performance comparison section for 1.0.7 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5164
docs: Add snafu error variant point to style guide by @peasee in https://github.com/spiceai/spiceai/pull/5167
Fix 1.0.7 release note by @Sevenannn in https://github.com/spiceai/spiceai/pull/5168
Adjust DuckDB connection pool size based on DuckDB accelerator instances usage by @Sevenannn in https://github.com/spiceai/spiceai/pull/5117
Add automatic retry for NSQL queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/5169
Include chat completion id to task history by @sgrebnov in https://github.com/spiceai/spiceai/pull/5170
Trace when all runtime components are ready by @sgrebnov in https://github.com/spiceai/spiceai/pull/5171
Update qa_analytics.csv for 1.0.7 by @Sevenannn in https://github.com/spiceai/spiceai/pull/5165
Set default tool recursion limit to 10 to prevent infinite loops by @sgrebnov in https://github.com/spiceai/spiceai/pull/5173
Add support for schema_source_path param for object-store data connectors by @sgrebnov in https://github.com/spiceai/spiceai/pull/5178
Run license check and check changes on self-hosted macOS runners by @lukekim in https://github.com/spiceai/spiceai/pull/5179
Add MCP by @lukekim in https://github.com/spiceai/spiceai/pull/5183

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0...release/1.1

- Rust
Published by phillipleblanc about 1 year ago

https://github.com/spiceai/spiceai - v1.0.7

Spice v1.0.7 (Mar 26, 2025)

Spice v1.0.7 improves memory usage when using DuckDB, improves schema inference performance when using object-store based data connectors, and fixes a bug in Dremio schema inference.

Highlights in v1.0.7

DuckDB Memory Usage: DuckDB memory usage during data loads and refreshes has been significantly reduced through expanded use of zero-copy Arrow and multi-threaded loading. When a duckdb_memory_limit is specified, disk spilling has been improved for greater-than-memory workloads. In addition, a new temp_directory runtime parameter supports storing temporary files in a location other than that of the DuckDB data file, for higher throughput. For example, temp_directory could be set to a high-IOPS io2 EBS volume that is separate from the duckdb_file_path.

Automated end-to-end test coverage for the DuckDB Accelerator has been significantly expanded.

For configuration details, see the documentation for runtime parameters and the DuckDB Data Accelerator.

Schema Inference Performance for Object-Store Data Connectors: Schema inference is now faster for object-store based data connectors, especially with large numbers of objects (1M+), because object listing and selection are more efficient.

Contributors

@phillipleblanc
@sgrebnov
@peasee
@Sevenannn

Breaking Changes

No breaking changes.

Upgrading

To upgrade to v1.0.7, use one of the following methods:

CLI:

console spice upgrade

Homebrew:

console brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.7 image:

console docker pull spiceai/spiceai:1.0.7

For available tags, see DockerHub.

Helm:

console
helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

DataFusion Table Providers: Upgraded from 760ece6ac52b7d180d697f347642af403c2e711c to 9ba9dce19a1fdbd5e22cc2e445c5b3ea731944b4.

Changelog

fix: Remove on zero results arguments from benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/4533
Run benchmark tests w/o uploading test results (pending improvements) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4843
fix: Return BAD_REQUEST when not embeddings are configured by @peasee in https://github.com/spiceai/spiceai/pull/4804
Fix Dremio schema inference by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5114
Improve performance of schema inference for object-store data connectors by @sgrebnov in https://github.com/spiceai/spiceai/pull/5124
Always download spice runtime version matched with spice cli version by @Sevenannn in https://github.com/spiceai/spiceai/pull/4761
Fix go lint errors by @sgrebnov in https://github.com/spiceai/spiceai/pull/5147
Make DuckDB acceleration E2E tests more comprehensive by @sgrebnov in https://github.com/spiceai/spiceai/pull/5146
Enable Spice to load larger than memory datasets into DuckDB accelerations by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5149
Add temp_directory runtime parameter and insert it for DuckDB accelerations by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5152
Fix Postgres and MySQL installation on macos14-runner (E2E CI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5155
Enable E2E for DuckDB full mode acceleration with indexes only in CI by @sgrebnov in https://github.com/spiceai/spiceai/pull/5154

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.6...v1.0.7

- Rust
Published by Sevenannn about 1 year ago

https://github.com/spiceai/spiceai - v1.0.6

Spice v1.0.6 (Mar 17, 2025)

Spice v1.0.6 improves stability for DuckDB acceleration, improves the Iceberg Data/Catalog connectors when using AWS Glue, and fixes an issue with the ready_state: on_registration federation fallback when using DuckDB. In addition, redundant data refreshes on startup are avoided for accelerations with persistent data.

Highlights in v1.0.6

Iceberg Data/Catalog Connector Improvements: Improves Iceberg data & catalog connector reliability, including bug fixes for AWS Glue API rate-limiting and compatibility, REST API pagination support, explicit AWS credential handling, and support for AWS STS role assumption.
Fixes On-Registration Fallback when using DuckDB: Previously, when using DuckDB as a data accelerator with the ready_state: on_registration configuration, queries made during the initial data refresh did not properly fall back to the federated source. This is now fixed.
DuckDB downgraded for Stability: DuckDB has been downgraded to v1.1.3 due to a regression in memory handling tracked by duckdb/duckdb issue #16640. Once resolved and validated, Spice will re-upgrade to v1.2.x.
Expanded Integration Tests: Additional integration tests covering federated accelerator behavior and graceful shutdown processes have been added.
Optimized Data Refresh for Persistent Accelerations: Changed behavior in v1.0.6. When using persistent (file-mode) acceleration without a defined refresh interval, Spice performs a full refresh at startup only if no previously accelerated data is available. This ensures efficient startup behavior by avoiding unnecessary refreshes. This logic applies only to full refreshes when no refresh interval is specified.

To maintain the previous behavior and always refresh on every startup, set:

yaml
acceleration:
  refresh_on_startup: always
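The startup rule described above can be sketched as a small decision function. This is an illustrative sketch only, not Spice's implementation: `should_refresh_on_startup` and its parameter names are hypothetical, and the only setting taken from these notes is `refresh_on_startup: always`. It models just the case the notes cover, a full refresh of a persistent (file-mode) acceleration with no refresh interval defined.

```python
from typing import Optional


def should_refresh_on_startup(
    has_existing_data: bool,
    refresh_on_startup: Optional[str] = None,
) -> bool:
    """Decide whether a full refresh runs at startup.

    Hypothetical helper. Applies only to file-mode accelerations
    without a refresh interval, as described in the v1.0.6 notes.
    """
    if refresh_on_startup == "always":
        # Explicit opt-in to the pre-v1.0.6 behavior: refresh every time.
        return True
    # Otherwise refresh only when no previously accelerated data exists.
    return not has_existing_data
```

With this rule, a restart that finds existing accelerated data skips the refresh unless `always` is set.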

Contributors

@peasee
@phillipleblanc
@sgrebnov
@lukekim
@Sevenannn

Breaking Changes

Starting from v1.0.6, when using persistent (file-mode) acceleration without a defined refresh interval, Spice performs a full refresh at startup only if no previously accelerated data is available. To maintain the previous behavior and always refresh on every startup, set:

yaml
acceleration:
  refresh_on_startup: always

Cookbook Updates

No new recipes.

Upgrading

To upgrade to v1.0.6, use one of the following methods:

CLI:

console spice upgrade

Homebrew:

console brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.6 image:

console docker pull spiceai/spiceai:1.0.6

For available tags, see DockerHub.

Helm:

console
helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

duckdb-rs: Downgraded from 1.2.0 to 1.1.3

Changelog

Implement proper ready_state: on_registration for federation enabled accelerators by @phillipleblanc in #5019
Add indexes and primary keys mismatch detection for DuckDB Acceleration by @sgrebnov in #5045
Add comprehensive integration tests for the ready_state behavior by @phillipleblanc in #5042
Add test Spicepod for acceleration with constraints by @sgrebnov in #4891
Add test Spicepod for DuckDB append acceleration with constraints by @sgrebnov in #4898
Add DuckDB graceful shutdown test to E2E CI tests by @sgrebnov in #5047
Update duckdb_append_with_pk_and_indexes.yaml (work for duckdb 1.1.x) by @sgrebnov in #5067
fix: Downgrade to DuckDB 1.1.3 by @peasee in #5055
fix: Acceleration federation integration test by @peasee in #5070
Improvements to Iceberg Catalog/Data Connector by @phillipleblanc in #5071
Add Results-Cache-Status to indicate query result came from cache by @phillipleblanc in #4809
fix: Spice.ai schema inference by @peasee in #4674
Add refresh_on_startup Spicepod configuration param by @phillipleblanc and @sgrebnov in #5086
Test restart behavior of DuckDB file acceleration against glue iceberg table by @Sevenannn in #5075
Run Iceberg Data Connector - DuckDB File mode integration test by @Sevenannn in #5069
Integration test for glue iceberg catalog by @Sevenannn in #5077

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.5...v1.0.6

- Rust
Published by phillipleblanc about 1 year ago

https://github.com/spiceai/spiceai - v1.0.5

Spice v1.0.5 (Mar 10, 2025)

Spice v1.0.5 expands Iceberg support with the introduction of the Iceberg Data Connector, in addition to the existing Iceberg Catalog Connector. The new connector allows datasets to be created and configured directly for specific Iceberg objects, enabling federated and accelerated SQL queries on Apache Iceberg tables.

Performance improvements include enhanced Parquet pruning in append mode, where object-store metadata is now leveraged alongside Hive partitioning to optimize file pruning. This results in faster and more efficient queries.
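As a rough illustration of metadata-based pruning, the sketch below skips files whose object-store last-modified timestamp is not newer than the data already appended. It assumes pruning compares each object's last-modified time with an append watermark; the `ObjectMeta` type and the `prune_files` function are hypothetical names for illustration and are not part of Spice.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class ObjectMeta:
    """Hypothetical subset of object-store metadata for one file."""
    path: str
    last_modified: datetime


def prune_files(objects: List[ObjectMeta], watermark: datetime) -> List[ObjectMeta]:
    """Keep only objects modified after the watermark.

    Files at or before the watermark are assumed to be already loaded,
    so they can be skipped without being opened.
    """
    return [obj for obj in objects if obj.last_modified > watermark]
```

The benefit of this kind of check is that it needs only the object listing, not the contents of each Parquet file.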

DuckDB has been upgraded to v1.2.0, along with additional stability improvements, including improved graceful shutdown and the ability to configure the DuckDB memory limit.

Additional updates include support for the Arrow Map type.

Highlights in v1.0.5

New Iceberg Data Connector: Enables direct dataset creation and querying of Iceberg tables.

Example usage in spicepod.yaml:

yaml
datasets:
  - from: iceberg:https://iceberg-catalog-host.com/v1/namespaces/my_namespace/tables/my_table
    name: my_table
    params:
      # Same as Iceberg Catalog Connector
    acceleration:
      enabled: true

For detailed setup instructions, authentication options, and configuration parameters, refer to the Iceberg Data Connector documentation.

Improved Parquet pruning in append mode: Uses object-store metadata for more efficient file pruning.
DuckDB upgrade to v1.2.0 with improved graceful shutdown: Read the DuckDB v1.2.0 announcement for details, including breaking changes for map and list_reduce. Graceful shutdown of DuckDB has been improved for better stability across restarts.
Configurable DuckDB memory limit: Use the duckdb_memory_limit parameter to set the DuckDB acceleration memory limit:

yaml
- from: spice.ai:path.to.my_dataset
  name: my_dataset
  acceleration:
    params:
      duckdb_memory_limit: '2GB'
    enabled: true
    engine: duckdb
    mode: file

Contributors

@peasee
@phillipleblanc
@sgrebnov
@lukekim

Breaking Changes

DuckDB v1.2.0 includes breaking changes for map and list_reduce. See the DuckDB v1.2.0 announcement for details.

Upgrading

To upgrade to v1.0.5, use one of the following methods:

CLI:

console spice upgrade

Homebrew:

console brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.5 image:

console docker pull spiceai/spiceai:1.0.5

For available tags, see DockerHub.

Helm:

console
helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

duckdb-rs: Upgraded from 1.1.1 to 1.2.0

Changelog

fix: Update OpenAI model health check by @peasee in #4849
fix: Allow metrics endpoint setting in CLI by @peasee in #4939
DuckDB acceleration: fix Decimal with zero scale support by @sgrebnov in #4922
Introduce runtime shutdown state by @sgrebnov in #4917
Add support for Flight and HTTP endpoints configuration to Spice CLI (run and sql) by @sgrebnov and @lukekim in #4913
Fix Datafusion resources deallocation during shutdown by @sgrebnov in #4912
DuckDB: fix error handling during record batch insertion by @sgrebnov in #4894
DuckDB: add support for Map Arrow type for DuckDB acceleration by @sgrebnov in #4887
Upgrade to DuckDB v1.2.0 by @sgrebnov in #4842
Gracefully shutdown the runtime and deallocate static resources by @sgrebnov in #4879
Implement an Iceberg Data Connector by @phillipleblanc in #4941
Don't trace canceled dataset refresh during runtime termination by @sgrebnov in #4958
Use metadata column last_modified when specified as a time_column by @phillipleblanc in #4970
Add duckdb_memory_limit param support for DuckDB acceleration by @sgrebnov in #4971
Add Iceberg dataset integration test by @phillipleblanc in #4950

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.4...v1.0.5

- Rust
Published by sgrebnov about 1 year ago

https://github.com/spiceai/spiceai - v1.0.4

Spice v1.0.4 (Feb 17, 2025)

Spice v1.0.4 includes several bug fixes, including improved table column casing and normalization and Delta Lake partition pruning, along with improved tracing throughout spiced and added functionality for spice trace.

Highlights in v1.0.4

Improved spice trace functionality: A more detailed spice trace format with new flags --include-output, --include-input and --truncate:

```
>> spice trace ai_chat --include-input --truncate

TREE                   STATUS DURATION  TASK          INPUT
b28bab6b58971b7e       ✅     1352.12ms ai_chat       {"messages":[{"role":"user","content":"hello"}],"model":"openai_model","stream":... (45 characters omitted)
  └── 1a0ad7c6138abb09 ✅     1352.03ms ai_completion {"messages":[{"role":"user","content":"hello"}],"model":"openai_model","stream":... (45 characters omitted)
```
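The truncated INPUT column above suggests a simple rule: keep a prefix of the value and report how many characters were dropped. A minimal sketch of that idea follows, assuming a fixed character limit. The `truncate_field` function and the default limit of 80 are illustrative guesses, not the CLI's actual logic.

```python
def truncate_field(text: str, limit: int = 80) -> str:
    """Shorten text to `limit` characters and note how many were omitted.

    Hypothetical helper mirroring the "... (N characters omitted)"
    suffix shown in the trace output.
    """
    if len(text) <= limit:
        # Short values are returned unchanged.
        return text
    omitted = len(text) - limit
    return f"{text[:limit]}... ({omitted} characters omitted)"
```

Reporting the omitted count lets a reader judge how much of the payload is hidden before re-running the trace without --truncate.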

Contributors

@phillipleblanc
@Sevenannn
@sgrebnov
@peasee
@Jeadie
@lukekim

Breaking Changes

No breaking changes.

Cookbook Updates

Upgrading

To upgrade to v1.0.4, use one of the following methods:

CLI:

console spice upgrade

Homebrew:

console brew upgrade spiceai/spiceai/spice

Docker:

Pull the spiceai/spiceai:1.0.4 image:

console docker pull spiceai/spiceai:1.0.4

For available tags, see DockerHub.

Helm:

console
helm repo update
helm upgrade spiceai spiceai/spiceai

What's Changed

Dependencies

No major dependency changes.

Changelog

Do not return underlying content of chunked embedding column by default during tool_use::document_similarity by @Jeadie in https://github.com/spiceai/spiceai/pull/4802
Fix Snowflake Case-Sensitive Identifiers support by @sgrebnov in https://github.com/spiceai/spiceai/pull/4813
Prepare for 1.0.4 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4801
Add support for a time_partition_column by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4784
Prevent the automatic normalization of refresh_sql columns to lowercase by @sgrebnov in https://github.com/spiceai/spiceai/pull/4787
Implement partition pruning for Delta Lake tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4783
Fix constraint verification for columns with uppercase letters by @sgrebnov in https://github.com/spiceai/spiceai/pull/4785
Add truncate command for spice trace by @peasee in https://github.com/spiceai/spiceai/pull/4771
Implement Cache-Control: no-cache to bypass results cache by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4763
Prompt user to download runtime when running spice sql by @Sevenannn in https://github.com/spiceai/spiceai/pull/4747
Add vector search tracing by @peasee in https://github.com/spiceai/spiceai/pull/4757
Update spice trace output format by @Jeadie in https://github.com/spiceai/spiceai/pull/4750
Fix tool call arguments in Grok messages by @Jeadie in https://github.com/spiceai/spiceai/pull/4741

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.3...v1.0.4

- Rust
Published by phillipleblanc over 1 year ago

https://github.com/spiceai/spiceai - v1.0.3

Spice v1.0.3 (Feb 10, 2025)

Spice v1.0.3 provides several bug fixes, including a fix for the initial data load period when a retention policy has been set, and a new unsupported_type_action: string parameter to auto-convert unsupported types to strings.

Highlights in v1.0.3

PostgreSQL Data Connector: New unsupported_type_action: string parameter that auto-converts unsupported types such as JSONB to strings.
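
As a rough sketch of how this might look in a spicepod.yaml: the table, dataset name, and connection values below are hypothetical, and placing unsupported_type_action at the dataset level (rather than under params) is an assumption to verify against the PostgreSQL Data Connector documentation.

```yaml
datasets:
  # Hypothetical PostgreSQL table that contains a JSONB column.
  - from: postgres:public.events
    name: events
    # Assumed placement: convert unsupported types such as JSONB to strings.
    unsupported_type_action: string
    params:
      pg_host: localhost
      pg_port: "5432"
      pg_db: app
      pg_user: ${secrets:PG_USER}
      pg_pass: ${secrets:PG_PASS}
```

With a configuration along these lines, a JSONB column would be expected to surface as a string column that can be queried or parsed downstream.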

Contributors

@phillipleblanc
@Sevenannn
@sgrebnov
@peasee
@Jeadie
@lukekim

Breaking Changes

No breaking changes.

Cookbook Updates

Updated Kubernetes Deployment Recipe
Updated Data Retention Recipe

Upgrading

To upgrade to v1.0.3, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.0.3 image:

```console
docker pull spiceai/spiceai:1.0.3
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

No major dependency changes.

Changelog

For local models, use 'content=""' instead of None by @Jeadie and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4646
Perplexity Sonar LLM component by @Jeadie and @lukekim in https://github.com/spiceai/spiceai/pull/4673
Update async openai fork & support reasoning effort parameter by @Sevenannn and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4679
Web search tool by @Jeadie and @lukekim in https://github.com/spiceai/spiceai/pull/4687
Setup tpc-extension by @ewgenius and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4690
fix: Use PostgreSQL interval style for Spice.ai by @peasee and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4716
Fix spice upgrade command by @Sevenannn and @sgrebnov in https://github.com/spiceai/spiceai/pull/4699
Fix bug: Ensure refresh only retrieves data within the retention period by @sgrebnov and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4717
Implement unsupported_type_action: string for Postgres JSONB support by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4719
Fix the get latest release logic by @Sevenannn and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4721
add 'accelerated_refresh' to 'spice trace' allowlist by @Jeadie and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4711
Update version to 1.0.3 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4731
Truncate embedding columns within sampling tool by @Jeadie in https://github.com/spiceai/spiceai/pull/4722
Validate primary key columns during accelerated dataset initialization by @sgrebnov in https://github.com/spiceai/spiceai/pull/4736

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.2...v1.0.3

- Rust
Published by phillipleblanc over 1 year ago

https://github.com/spiceai/spiceai - v1.0.2

Spice v1.0.2 (Feb 3, 2025)

Spice v1.0.2 adds support for running local filesystem-hosted DeepSeek models including R1 (cloud-hosted via DeepSeek API was already supported) and improves the developer experience for debugging AI chat tasks along with several bug fixes. The HuggingFace and Filesystem-Hosted models providers have both graduated to Release Candidates (RC) and the Spice.ai Cloud Platform catalog provider has graduated to Beta.

Highlights in v1.0.2

spice trace: New spice trace CLI command that outputs a detailed breakdown of traces and tasks, including tool usage and AI completions.

Examples:

```shell
trace> spice trace ai_chat
61cc6bd0e571c783 ai_chat
├── 69362c30f238076f tool_use::get_readiness
├── b6b17f1a9a6b86dc ai_completion
├── c30d692c6c41c5ee tool_use::list_datasets
└── ce18756d5fef0df0 ai_completion

trace> spice trace ai_chat --trace-id 61cc6bd0e571c783

trace> spice trace ai_chat --id chatcmpl-AvXwmPSV1PMyGBi9dLfkEQTZPjhqz
```

The spice trace CLI outputs data available in the runtime.task_history table, which can also be queried with SQL.

To learn more, see:

spice trace Documentation
Task History Documentation
Filesystem-Hosted Models Provider: Graduated to Release Candidate (RC). To learn more, see the Filesystem-Hosted Models Provider Documentation.
HuggingFace Models Provider: Graduated to Release Candidate (RC). To learn more, see the HuggingFace Models Provider Documentation.
Spice.ai Cloud Platform Catalog: Graduated to Beta.

Contributors

@phillipleblanc
@johnnynunez
@Sevenannn
@sgrebnov
@peasee
@Jeadie
@lukekim

New Contributors

@johnnynunez made their first contribution in https://github.com/spiceai/spiceai/pull/4502

Breaking Changes

No breaking changes.

Cookbook Updates

Added Filesystem-Hosted Model Provider Recipe

Upgrading

To upgrade to v1.0.2, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.0.2 image:

```console
docker pull spiceai/spiceai:1.0.2
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

No major dependency changes.

Changelog

Update release branch naming by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4539
ready for arm buildings by @johnnynunez in https://github.com/spiceai/spiceai/pull/4502
Bump helm chart version to 1.0.1 by @Sevenannn in https://github.com/spiceai/spiceai/pull/4542
Include 1.0.1 as supported version in security.md by @Sevenannn in https://github.com/spiceai/spiceai/pull/4545
Update CI to build on hosted windows runners by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4540
docs: Update Windows install by @peasee in https://github.com/spiceai/spiceai/pull/4551
Fix spark spicepod for test operator by @Sevenannn in https://github.com/spiceai/spiceai/pull/4555
Improve hugging face model chat error by @Sevenannn in https://github.com/spiceai/spiceai/pull/4554
fix: Update Windows E2E install by @peasee in https://github.com/spiceai/spiceai/pull/4557
feat: Add Spice Cloud Catalog Spicepod, release Alpha by @peasee in https://github.com/spiceai/spiceai/pull/4561
Fix huggingface embedding errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/4558
feat: Load table schemas through REST for Spice Cloud Catalog by @peasee in https://github.com/spiceai/spiceai/pull/4563
Add upgrade instruction in release note by @Sevenannn in https://github.com/spiceai/spiceai/pull/4548
Add federated source information to refresh errors by @sgrebnov in https://github.com/spiceai/spiceai/pull/4560
docs: Update ROADMAP.md by @peasee in https://github.com/spiceai/spiceai/pull/4566
Merge mistral upstream by @Jeadie in https://github.com/spiceai/spiceai/pull/4562
Fix windows build by @Sevenannn in https://github.com/spiceai/spiceai/pull/4574
feat: Update Spice Cloud Catalog errors, release as Beta by @peasee in https://github.com/spiceai/spiceai/pull/4575
docs: Add TOC to README.md by @peasee in https://github.com/spiceai/spiceai/pull/4538
Updates to spiceai/mistral.rs by @Jeadie in https://github.com/spiceai/spiceai/pull/4580
Improve refresh error tracing by @sgrebnov in https://github.com/spiceai/spiceai/pull/4576
Add HTTP consistency & overhead to testoperator dispatch tool by @Jeadie in https://github.com/spiceai/spiceai/pull/4556
Fix append mode refresh with MySQL Data Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/4583
fix: Retry flaky tests by @peasee in https://github.com/spiceai/spiceai/pull/4577
Fix E2E models test build on macOS runners by @sgrebnov in https://github.com/spiceai/spiceai/pull/4585
spice trace chat support in CLI by @Jeadie in https://github.com/spiceai/spiceai/pull/4582
Include hf test specs, enable ready_wait in workflow by @Sevenannn in https://github.com/spiceai/spiceai/pull/4584
Add paths verification when loading models by @sgrebnov in https://github.com/spiceai/spiceai/pull/4591
Add generation_config.json support for Filesystem models by @sgrebnov in https://github.com/spiceai/spiceai/pull/4592
Promote Filesystem model provider to RC by @sgrebnov in https://github.com/spiceai/spiceai/pull/4593
docs: Add models grading criteria by @peasee in https://github.com/spiceai/spiceai/pull/4550
Fix typo in Alpha Release Criteria (models) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4588
fix: Retry AI integration tests by @peasee in https://github.com/spiceai/spiceai/pull/4595
Run LLM integration tests on Macs; add running local models by @Jeadie in https://github.com/spiceai/spiceai/pull/4495
Update version to 1.0.2 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4594
feat: Schedule testoperator by @peasee in https://github.com/spiceai/spiceai/pull/4503
Improve UX of downloading GGUF from HF by @Jeadie in https://github.com/spiceai/spiceai/pull/4601
Improve spice trace CLI command by @sgrebnov in https://github.com/spiceai/spiceai/pull/4629
Improve the UX of using huggingface models & embeddings by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4623
GGUF, hide metadata by @Jeadie in https://github.com/spiceai/spiceai/pull/4631
Promote hugging face to rc by @Sevenannn in https://github.com/spiceai/spiceai/pull/4626
Endgame Issue template improvements by @lukekim in https://github.com/spiceai/spiceai/pull/4647
feat: setup sccache for PR checks by @peasee in https://github.com/spiceai/spiceai/pull/4652
Run build_and_release_cuda.yml when crates/llms/Cargo.toml changes by @Jeadie in https://github.com/spiceai/spiceai/pull/4648
Update E2E installation tests to match model runtime version by @sgrebnov in https://github.com/spiceai/spiceai/pull/4653
fix: Postgres LargeUtf8 is equal to Utf8 by @peasee in https://github.com/spiceai/spiceai/pull/4664
Fix eager string formatting in mistral.rs by @Jeadie in https://github.com/spiceai/spiceai/pull/4665
Better error for spicepod parsing by @Sevenannn in https://github.com/spiceai/spiceai/pull/4632
Update datafusion-table-providers (MySQL improvements) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4670
Handle delta tables partitioned by a date column with large date values by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4672

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.1...v1.0.2

- Rust
Published by sgrebnov over 1 year ago

https://github.com/spiceai/spiceai - v1.0.1

Spice v1.0.1 (Jan 27, 2025)

Spice v1.0.1 focuses on an improved developer experience, with automatic CUDA GPU detection for local models, in addition to bug fixes. Notably, the Iceberg Catalog Connector now supports AWS Glue, including SigV4 authentication.

Highlights in v1.0.1

AWS Glue Support for Iceberg Catalog Connector: The Iceberg Catalog Connector now supports AWS Glue. Example spicepod.yaml configuration:

```yaml
- from: iceberg:https://glue.ap-northeast-2.amazonaws.com/iceberg/v1/catalogs/123456789012/namespaces
  name: glue
```

spice upgrade CLI Command: The spice upgrade CLI command detects more edge cases for a smoother upgrade experience.
GPU Acceleration Detection: The Spice CLI now automatically detects and enables CUDA (NVIDIA GPU) acceleration when supported, in addition to Metal (M-series on macOS).
Python SDK: The Python SDK (spicepy) has been updated to v3.0.0, aligning the SDK with the Runtime.

Breaking Changes

No breaking changes.

Dependencies

No major dependency changes.

Cookbook

Added DeepSeek Model Recipe
Added OpenAI LLM & Embeddings Recipe

Upgrading

To upgrade to v1.0.1, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.0.1 image:

```console
docker pull spiceai/spiceai:1.0.1
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

Contributors

@Jeadie
@phillipleblanc
@ewgenius
@peasee
@Sevenannn
@sgrebnov
@lukekim

What's Changed

Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4459
docs: 1.0 release notes by @peasee in https://github.com/spiceai/spiceai/pull/4440
Create a release-only workflow that uses a previous run's artifacts by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4461
Add publish-only CUDA workflow by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4462
Fix the CUDA release workflow by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4463
docs: Update SECURITY.md for stable by @peasee in https://github.com/spiceai/spiceai/pull/4465
docs: Update endgame by @peasee in https://github.com/spiceai/spiceai/pull/4460
docs: Promote HF and File model components by @peasee in https://github.com/spiceai/spiceai/pull/4457
fix: E2E test release installation by @peasee in https://github.com/spiceai/spiceai/pull/4466
Fix publish part of CUDA workflow by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4467
Fix broken docs links in README by @ewgenius in https://github.com/spiceai/spiceai/pull/4468
Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/4474
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4477
Add instruction to force-install CPU runtime to v1.0 release notes by @sgrebnov in https://github.com/spiceai/spiceai/pull/4469
feat: Add WIP testoperator dispatch workflow by @peasee in https://github.com/spiceai/spiceai/pull/4478
Fix Bug: invalid REPL cursor position on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/4480
feat: Download latest spiced commit for testoperators by @peasee in https://github.com/spiceai/spiceai/pull/4483
Add compute engine image by @lukekim in https://github.com/spiceai/spiceai/pull/4486
fix: Testoperator git fetch depth by @peasee in https://github.com/spiceai/spiceai/pull/4484
feat: New spicepods, testoperator improvements, TPCDS Q1 fix by @peasee in https://github.com/spiceai/spiceai/pull/4475
Add 87 CUDA compatibility to build CI by @Jeadie in https://github.com/spiceai/spiceai/pull/4489
Use OpenAI golang client in spice chat by @Jeadie in https://github.com/spiceai/spiceai/pull/4491
Verify search and chat on Windows as part of AI installation tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/4492
feat: Add testoperator dispatch command by @peasee in https://github.com/spiceai/spiceai/pull/4479
Run CUDA builds on non-GPU instances by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4496
Use upgraded spice cli when performing runtime upgrade in spice upgrade by @Sevenannn in https://github.com/spiceai/spiceai/pull/4490
Revert "Use OpenAI golang client in spice chat (#4491)" by @Jeadie in https://github.com/spiceai/spiceai/pull/4532
Make Anthropic rate limit error message friendlier by @sgrebnov in https://github.com/spiceai/spiceai/pull/4501
Update supported CUDA targets: add 87(cli), remove 75 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4509
Support AWS Glue for Iceberg catalog connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4517
Package CUDA runtime libraries into artifact for Windows by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4497

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0...v1.0.1

- Rust
Published by Sevenannn over 1 year ago

https://github.com/spiceai/spiceai - v1.0.0

Spice v1.0-stable (Jan 20, 2025)

🎉 After 47 releases, Spice.ai OSS has reached production readiness with the 1.0-stable milestone!

The core runtime and features such as query federation, query acceleration, catalog integration, search and AI-inference have all graduated to stable status along with key component graduations across data connectors, data accelerators, catalog connectors, and AI model providers.

Highlights in v1.0-stable

Stable Data Connectors: The following data connectors have graduated to Stable:
- Delta Lake
- MySQL
- Dremio
- PostgreSQL
- Databricks (mode: delta_lake)
- DuckDB
- S3
Stable Data Accelerators: The following data accelerators have graduated to Stable:
- DuckDB
- Arrow
Unity Catalog Connector: Graduated to Stable.
Databricks (mode: spark_connect) Data Connector: Graduated to Beta.
Beta Catalog Connectors: The Iceberg and Databricks catalog connectors graduated to Beta.
OpenAI Model & Embeddings Provider: Graduated to Release Candidate (RC).
Alpha Model Providers: The Anthropic and xAI (Grok) model providers graduated to Alpha.

Breaking Changes

Default Runtime Version: The CLI now installs the GPU-accelerated, AI-capable Runtime by default when running spice install or spice run.
Default OpenAI Model: The default OpenAI model is now gpt-4o-mini.
Identifier Normalization: Unquoted identifiers such as table names are no longer normalized to lowercase. Identifiers will now retain their exact case as provided.
Sandboxed Docker Image: The Runtime Docker Image now runs the spiced process as the nobody user in a minimal chroot sandbox.
Insecure S3 and ABFS endpoints: The S3 and ABFS connectors now enforce insecure endpoint checks, preventing HTTP endpoints unless allow_http is explicitly enabled. Refer to the documentation for details.
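
For the insecure endpoint change, a minimal sketch of opting in to plain HTTP for an S3-compatible endpoint might look like the following. The bucket, dataset name, and endpoint are hypothetical (a local MinIO-style server is assumed), and parameter names other than allow_http should be confirmed in the S3 Data Connector documentation.

```yaml
datasets:
  # Hypothetical bucket served by a local S3-compatible endpoint over HTTP.
  - from: s3://example-bucket/events/
    name: events
    params:
      file_format: parquet
      s3_endpoint: http://localhost:9000
      # Explicit opt-in; without it, an http:// endpoint is rejected.
      allow_http: true
```

Leaving allow_http unset keeps the secure default, so this opt-in is probably best limited to local development or trusted networks.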

Dependencies

No major dependency changes.

Upgrading

To upgrade to v1.0.0, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.0.0 image:

```console
docker pull spiceai/spiceai:1.0.0
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

Contributors

@peasee
@ewgenius
@Jeadie
@Sevenannn
@lukekim
@phillipleblanc
@sgrebnov

What's Changed

feat: Update load test criteria, testoperator updates by @peasee in https://github.com/spiceai/spiceai/pull/4311
Update helm for v1.0.0-rc.5 by @ewgenius in https://github.com/spiceai/spiceai/pull/4313
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/4318
Bump version to v1.0.0, update SECURITY.md by @ewgenius in https://github.com/spiceai/spiceai/pull/4314
Initial criteria for models, embeddings by @Jeadie in https://github.com/spiceai/spiceai/pull/4223
Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/4321
Add dremio param for running load test by @Sevenannn in https://github.com/spiceai/spiceai/pull/4315
Promote Databricks (mode: delta_lake) connector to stable by @Sevenannn in https://github.com/spiceai/spiceai/pull/4328
Handle failed query in load test by @Sevenannn in https://github.com/spiceai/spiceai/pull/4327
feat: Use load test hours for baseline query sets by @peasee in https://github.com/spiceai/spiceai/pull/4334
Fix typo in 1.0.0-rc.5 release notes by @ewgenius in https://github.com/spiceai/spiceai/pull/4329
feat: add testoperator data consistency by @peasee in https://github.com/spiceai/spiceai/pull/4319
docs: Release DuckDB connector stable by @peasee in https://github.com/spiceai/spiceai/pull/4335
Fix DocumentDB -> DynamoDB by @lukekim in https://github.com/spiceai/spiceai/pull/4339
Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/4337
fix: Download hits.parquet from MinIO for benchmark by @peasee in https://github.com/spiceai/spiceai/pull/4338
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4341
Remove evil averages by @lukekim in https://github.com/spiceai/spiceai/pull/4343
Don't run builds on non-code changes by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4344
Remove streaming requirement from Databricks spark Beta and Spark connector Beta by @ewgenius in https://github.com/spiceai/spiceai/pull/4345
Update s3 tpcds spicepods by @ewgenius in https://github.com/spiceai/spiceai/pull/4346
Explicitly set required scale factor for throughput and load tests by @ewgenius in https://github.com/spiceai/spiceai/pull/4347
Fix s3 tpcds dataset name by @ewgenius in https://github.com/spiceai/spiceai/pull/4348
Promote Iceberg Catalog Connector to Beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4350
Update s3 clickbench benchmark snapshots by @ewgenius in https://github.com/spiceai/spiceai/pull/4351
fix: DuckDB clickbench on zero results by @peasee in https://github.com/spiceai/spiceai/pull/4349
Add integration test with snapshots for databricks catalog connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/4353
refactor: Remove on zero results from benchmarks, add data consistency workflow by @peasee in https://github.com/spiceai/spiceai/pull/4354
Fix Bug: No field named body_embedding when do vector search with refresh sql containing subset of columns by @sgrebnov in https://github.com/spiceai/spiceai/pull/4297
docs: Update roadmap by @peasee in https://github.com/spiceai/spiceai/pull/4364
feat: Release accelerators stable by @peasee in https://github.com/spiceai/spiceai/pull/4361
Add TPCH/TPCDS test spicepods for MySQL by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4365
Catch when an insecure (http) S3 and ABFS data connectors endpoint is used without specifying the allow_http parameter by @ewgenius in https://github.com/spiceai/spiceai/pull/4363
Update ROADMAP - Iceberg catalog alpha for v1.0 by @ewgenius in https://github.com/spiceai/spiceai/pull/4367
Promote databricks catalog and databricks (spark_connect) connector to beta by @Sevenannn in https://github.com/spiceai/spiceai/pull/4369
Update Roadmap - Iceberg beta by @ewgenius in https://github.com/spiceai/spiceai/pull/4373
Build CUDA binaries for Linux by @Jeadie in https://github.com/spiceai/spiceai/pull/4320
Promote Nvidia NIM as Alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4380
Promote xai to alpha by @Sevenannn in https://github.com/spiceai/spiceai/pull/4381
Update stable criteria for object store based connectors by @ewgenius in https://github.com/spiceai/spiceai/pull/4383
Testoperator: http consistency and overhead tests, fixes and ci by @ewgenius in https://github.com/spiceai/spiceai/pull/4382
Promote S3 Data Connector to Stable by @ewgenius in https://github.com/spiceai/spiceai/pull/4385
Download platform-supported CUDA binary version on Linux by @sgrebnov in https://github.com/spiceai/spiceai/pull/4356
Fix http consistency test workflow, add overhead workflow by @ewgenius in https://github.com/spiceai/spiceai/pull/4387
feat: Add Postgres test spicepods by @peasee in https://github.com/spiceai/spiceai/pull/4388
Fix typos + specific in model criteria; Make explicit alpha/beta tests for LLMS in crates/llms/tests. by @Jeadie in https://github.com/spiceai/spiceai/pull/4377
Fix federation bug for correlated subqueries of deeply nested Dremio tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4389
Fix http overhead workflow by @ewgenius in https://github.com/spiceai/spiceai/pull/4390
Tweak model tests, fix embedding input by @ewgenius in https://github.com/spiceai/spiceai/pull/4391
Promote Dremio to Stable quality by @Sevenannn in https://github.com/spiceai/spiceai/pull/4392
Add beta functionality tests for embedding models. by @Jeadie in https://github.com/spiceai/spiceai/pull/4352
docs: Release postgres connector stable by @peasee in https://github.com/spiceai/spiceai/pull/4398
Increase timeout for model response in E2E tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/4399
Disable ident normalization (i.e. SELECT MyColumn from table works) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4400
Preserve schema metadata by @ewgenius in https://github.com/spiceai/spiceai/pull/4402
Make models integration tests tracing less verbose by @sgrebnov in https://github.com/spiceai/spiceai/pull/4403
Fix cuda feature build on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/4404
Promote MySQL to Stable by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4406
docs: Release Delta Lake and Unity catalog by @peasee in https://github.com/spiceai/spiceai/pull/4405
Use gpt-4o-mini as a default model for openai provider by @ewgenius in https://github.com/spiceai/spiceai/pull/4410
Fix streaming for Openai and Anthropic by @Jeadie in https://github.com/spiceai/spiceai/pull/4409
Tweak model loading and missing tool errors messages by @ewgenius in https://github.com/spiceai/spiceai/pull/4412
Spice CLI: fallback to CPU build for unsupported GPU Compute Capability by @sgrebnov in https://github.com/spiceai/spiceai/pull/4407
Build Windows CUDA binaries as part of build_and_release workflow by @sgrebnov in https://github.com/spiceai/spiceai/pull/4386
Update docs link by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4416
feat: Add CPU models install escape hatch by @peasee in https://github.com/spiceai/spiceai/pull/4419
Handle OpenAI API Errors by @ewgenius in https://github.com/spiceai/spiceai/pull/4417
Update spice cli to use GH_TOKEN or GITHUB_TOKEN env variables when calling releases api by @ewgenius in https://github.com/spiceai/spiceai/pull/4175
Implement secure sandboxing for Docker image by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4411
Automatically install supported CUDA binary on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/4420
Metrics for LLMs+ embeddings by @Jeadie in https://github.com/spiceai/spiceai/pull/4418
Jeadie/25 01 17/beta perf by @Jeadie in https://github.com/spiceai/spiceai/pull/4397
Pass GitHub token to all CI steps calling spice run by @ewgenius in https://github.com/spiceai/spiceai/pull/4423
Run the models integration tests on PRs by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4421
Run CUDA builds in a separate workflow by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4430
Promote OpenAI models and embeddings providers to RC by @ewgenius in https://github.com/spiceai/spiceai/pull/4432
Update link to retrieval-augmented generation (RAG) details by @sgrebnov in https://github.com/spiceai/spiceai/pull/4433
Unity catalog should strip parameter prefix before passing parameters to delta lake factory by @Sevenannn in https://github.com/spiceai/spiceai/pull/4436
Update quickstart traces to match current version by @sgrebnov in https://github.com/spiceai/spiceai/pull/4435
Update Supported Embeddings Providers Readme section by @sgrebnov in https://github.com/spiceai/spiceai/pull/4434
Local models can stream tools by @Jeadie in https://github.com/spiceai/spiceai/pull/4429
fix: Use MetricsCollector::show() for HTTP testoperator commands by @peasee in https://github.com/spiceai/spiceai/pull/4442
Fix run query action by @ewgenius in https://github.com/spiceai/spiceai/pull/4444
Default to AI-enabled runtime for spice run/spice install by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4443
Change no spicepod.yaml log to warning by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4447
refactor: Update Catalog Connector error messages by @peasee in https://github.com/spiceai/spiceai/pull/4441
Fix panic when converting OTel metrics by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4449
refactor: Update model errors by @peasee in https://github.com/spiceai/spiceai/pull/4446
Update spiceai/mistral.rs to silence metadata logs by @ewgenius in https://github.com/spiceai/spiceai/pull/4452
fix xAI; don't use openai defaults by @Jeadie in https://github.com/spiceai/spiceai/pull/4450
Improves the UX of using huggingface models by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4451
Add GH Workflow to test spice ai runtime installation by @sgrebnov in https://github.com/spiceai/spiceai/pull/4448
fix: Use specific model errors where available by @peasee in https://github.com/spiceai/spiceai/pull/4454
Detect and report unsupported embedding column type during dataset registration by @sgrebnov in https://github.com/spiceai/spiceai/pull/4456
Handle Errors by @Jeadie in https://github.com/spiceai/spiceai/pull/4455
Catch and report negative openai_temperature error by @Sevenannn in https://github.com/spiceai/spiceai/pull/4453
Clarify release check error message if it is caused by wrong GH token by @ewgenius in https://github.com/spiceai/spiceai/pull/4458

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.5...v1.0.0

- Rust
Published by peasee over 1 year ago

https://github.com/spiceai/spiceai - v1.0.0-rc.5

Spice v1.0-rc.5 (Jan 13, 2025)

Spice v1.0.0-rc.5 is the fifth release candidate for the first major version of Spice.ai OSS. This release focuses on production readiness and critical bug fixes. In addition, a new DynamoDB data connector has been added, along with automatic detection of GPU acceleration when running Spice using the CLI.

Highlights in v1.0-rc.5

Automatic GPU Acceleration Detection: Automatically detect and utilize GPU acceleration when running Spice with the CLI. Install AI components locally using the CLI command spice install ai. Currently supports NVIDIA CUDA and Apple Metal (M-series).
DynamoDB Data Connector: Query AWS DynamoDB tables using SQL with the new DynamoDB Data Connector.

```yaml
datasets:
  - from: dynamodb:users
    name: users
    params:
      dynamodb_aws_region: us-west-2
      dynamodb_aws_access_key_id: ${secrets:aws_access_key_id}
      dynamodb_aws_secret_access_key: ${secrets:aws_secret_access_key}
    acceleration:
      enabled: true
```

```console
sql> describe users;
+----------------+-----------+-------------+
| column_name    | data_type | is_nullable |
+----------------+-----------+-------------+
| created_at     | Utf8      | YES         |
| date_of_birth  | Utf8      | YES         |
| email          | Utf8      | YES         |
| account_status | Utf8      | YES         |
| updated_at     | Utf8      | YES         |
| full_name      | Utf8      | YES         |
| ...                                      |
+----------------+-----------+-------------+
```

File Data Connector: Graduated to Stable.
Dremio Data Connector: Graduated to Release Candidate (RC).
Spice.ai, Spark, and Snowflake Data Connectors: Graduated to Beta.

Dependencies

No major dependency changes.

Contributors

@Jeadie
@phillipleblanc
@ewgenius
@peasee
@Sevenannn
@lukekim

What's Changed

Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4190
Ensure non-nullity of primary keys in MemTable; check validity of initial data. by @Jeadie in https://github.com/spiceai/spiceai/pull/4158
Bump version to v1.0.0 stable by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4191
Fix metal + models download by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4193
Update spice.ai connector beta roadmap by @ewgenius in https://github.com/spiceai/spiceai/pull/4194
feat: verify on zero results snapshots by @peasee in https://github.com/spiceai/spiceai/pull/4195
Add throughput test module to test-framework by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4196
Update Spice.ai TPCH snapshots by @ewgenius in https://github.com/spiceai/spiceai/pull/4202
Replace all usage of lazy_static! with LazyLock by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4199
Fix model + metal download by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4200
Run Clickbench for Dremio by @Sevenannn in https://github.com/spiceai/spiceai/pull/4138
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4205
Fix the typo in connector stable criteria by @Sevenannn in https://github.com/spiceai/spiceai/pull/4213
feat: Add throughput test example by @peasee in https://github.com/spiceai/spiceai/pull/4214
feat: calculate throughput test query percentiles by @peasee in https://github.com/spiceai/spiceai/pull/4215
feat: Add throughput test to actions by @peasee in https://github.com/spiceai/spiceai/pull/4217
Implement DynamoDB Data Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4218
1.0 doc updates by @lukekim in https://github.com/spiceai/spiceai/pull/4181
Improve clarity and concison of use-cases by @lukekim in https://github.com/spiceai/spiceai/pull/4220
Remove macOS Intel build by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4221
fix: Test operator throughput test workflow by @peasee in https://github.com/spiceai/spiceai/pull/4222
DynamoDB: Automatically load AWS credentials from IAM roles if access key not provided by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4226
File connector clickbench snapshots results by @ewgenius in https://github.com/spiceai/spiceai/pull/4225
Spice.ai Catalog Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4204
feat: Add test framework metrics collection by @peasee in https://github.com/spiceai/spiceai/pull/4227
Add badges for build/test status on README.md by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4228
Release Dremio to RC by @Sevenannn in https://github.com/spiceai/spiceai/pull/4224
feat: Add more test spicepods by @peasee in https://github.com/spiceai/spiceai/pull/4229
feat: Add load test to testoperator by @peasee in https://github.com/spiceai/spiceai/pull/4231
Add TSV format to all object_store-based connectors by @Jeadie in https://github.com/spiceai/spiceai/pull/4192
Move test-framework to dev-dependencies for Runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4230
Document limitation for correlated subqueries in TPCH for Spice.ai connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4235
Changes for CUDA by @Jeadie in https://github.com/spiceai/spiceai/pull/4130
fix: Collect batches from test framework, load test updates by @peasee in https://github.com/spiceai/spiceai/pull/4234
Suppress opentelemetry_sdk warnings - they aren't useful by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4243
fix: Set dataset status first, update test framework by @peasee in https://github.com/spiceai/spiceai/pull/4244
feat: Re-enable defaults on test spicepods by @peasee in https://github.com/spiceai/spiceai/pull/4248
Add usage for streaming local models; Fix spice chat usage bar TPS expansion by @Jeadie in https://github.com/spiceai/spiceai/pull/4232
refactor: Use composite testoperator setup, add query overrides by @peasee in https://github.com/spiceai/spiceai/pull/4246
Enable expand_views_at_output for DF optimizer and transform schema to expanded view types by @ewgenius in https://github.com/spiceai/spiceai/pull/4237
Add throughput test spicepod for databricks delta mode connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/4241
Spark data connector - update and enable TPCH and TPCDS benchmarks by @ewgenius in https://github.com/spiceai/spiceai/pull/4240
Increase the timeout minutes of load test to 10 hours by @Sevenannn in https://github.com/spiceai/spiceai/pull/4254
Improve partition column counts error for delta table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4247
Add e2e test for databricks catalog connector (mode: delta_lake) by @Sevenannn in https://github.com/spiceai/spiceai/pull/4255
Spark connector integration tests by @ewgenius in https://github.com/spiceai/spiceai/pull/4256
Run benchmark test with the new test framework by @Sevenannn in https://github.com/spiceai/spiceai/pull/4245
Configure databricks delta secrets to run load test by @Sevenannn in https://github.com/spiceai/spiceai/pull/4257
Support properties for emitted telemetry by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4249
feat: Add ready_wait test operator workflow input by @peasee in https://github.com/spiceai/spiceai/pull/4259
Handle 'LargeStringArray' for embedding tables by @Jeadie in https://github.com/spiceai/spiceai/pull/4263
llms tests for alpha/beta model criteria by @Jeadie in https://github.com/spiceai/spiceai/pull/4261
Configurable runner type for load and throughput tests by @ewgenius in https://github.com/spiceai/spiceai/pull/4262
Handle NULL partition columns for Delta Lake tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4264
Add integration test for Snowflake by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4266
Add Snowflake TPCH queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4268
Handle LargeStringArray in v1/search. by @Jeadie in https://github.com/spiceai/spiceai/pull/4265
Fix build_cuda in Update spiced_docker.yml by @Jeadie in https://github.com/spiceai/spiceai/pull/4269
Run Snowflake benchmark in GitHub Actions by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4270
Allow Snowflake query override for CI tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4271
Don't run GPU builds for trunk by @Jeadie in https://github.com/spiceai/spiceai/pull/4272
Fix InvalidTypeAction not working by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4273
Add xAI key to llm integration tests by @Jeadie in https://github.com/spiceai/spiceai/pull/4274
Update openai snapshots by @Jeadie in https://github.com/spiceai/spiceai/pull/4275
Fix federation bug for correlated subqueries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4276
Update end_game.md by @ewgenius in https://github.com/spiceai/spiceai/pull/4278
Promote Snowflake to Beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4277
Set version to 1.0.0-rc.5 by @ewgenius in https://github.com/spiceai/spiceai/pull/4283
Update cargo.lock by @ewgenius in https://github.com/spiceai/spiceai/pull/4285
Update spice.ai data connector snapshots by @ewgenius in https://github.com/spiceai/spiceai/pull/4281
Promote the Spice.ai Data Connector to Beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4282
Revert change to integration_models__models__search__openai_chunking_response.snap by @Jeadie in https://github.com/spiceai/spiceai/pull/4279
Allow for a subset of build artifacts to be published to minio by @Jeadie in https://github.com/spiceai/spiceai/pull/4280
Promote File Data Connector to Stable by @ewgenius in https://github.com/spiceai/spiceai/pull/4286
Add Iceberg to Supported Catalogs by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4287
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4289
Fix Spark benchmark credentials, add back overrides by @ewgenius in https://github.com/spiceai/spiceai/pull/4295
Promote Spark Data Connector to Beta by @ewgenius in https://github.com/spiceai/spiceai/pull/4296
Add Dremio throughput test spicepod by @Sevenannn in https://github.com/spiceai/spiceai/pull/4233
Add error message for invalid databricks mode parameter by @Sevenannn in https://github.com/spiceai/spiceai/pull/4299
Fix pre-release check to look for build string by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4300
Promote databricks catalog connector (mode: delta_lake) to beta by @Sevenannn in https://github.com/spiceai/spiceai/pull/4301
Properly delegate load_table to Rest Catalog by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4303
Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4302
docs: Update ROADMAP.md by @peasee in https://github.com/spiceai/spiceai/pull/4306
v1.0.0-rc.5 Release Notes by @ewgenius in https://github.com/spiceai/spiceai/pull/4298

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.4...v1.0.0-rc.5

- Rust
Published by ewgenius over 1 year ago

https://github.com/spiceai/spiceai - v1.0.0-rc.4

Spice v1.0-rc.4 (Jan 6, 2025)

Happy New Year 🎆!

Spice v1.0.0-rc.4 is the fourth release candidate for the first major version of Spice.ai OSS. This release continues the focus on production readiness. In addition, xAI has been added as a model provider.

Highlights in v1.0-rc.4

xAI Model Provider: Adds support for xAI hosted models.

models:
  - from: xai:grok2-latest
    name: xai
    params:
      xai_api_key: ${secrets:SPICE_XAI_API_KEY}
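With a model named xai defined as above, a chat request might be shaped as follows. This is a minimal sketch that assumes the runtime exposes an OpenAI-compatible chat completions endpoint and that the model field takes the name from the spicepod; build_chat_payload and the prompt are hypothetical.

```python
import json


def build_chat_payload(model_name: str, prompt: str) -> str:
    """Serialize an OpenAI-style chat completion body for the named model."""
    body = {
        "model": model_name,  # assumption: the `name` given in the spicepod
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(body)


# Hypothetical prompt; the payload would be POSTed to the runtime's
# chat completions endpoint with Content-Type: application/json.
payload = build_chat_payload("xai", "Which datasets are available?")
```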

datasets:
  - from: file://my_table.tsv
    name: table

Spicepod Spec Version: Spicepod spec version v1 is now the default. v1beta1 will continue to work.

version: v1
kind: Spicepod
name: my_pod

GitHub Data Connector: Graduated to Stable.
PostgreSQL Data Accelerator: Graduated to Release Candidate (RC).

Cookbook

Added xAI model provider recipe.

Dependencies

No major dependency changes.

Contributors

@lukekim
@phillipleblanc
@peasee
@karifabri
@sgrebnov
@Jeadie
@ewgenius

What's Changed

Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4087
Update Helm chart for v1.0.0-rc.3 (v0.2.2) by @lukekim in https://github.com/spiceai/spiceai/pull/4088
Rev version to v1.0.0-rc.4 by @lukekim in https://github.com/spiceai/spiceai/pull/4090
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/4089
Fix OpenAI Models Integration tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4084
fix: Update Postgres TPCDS and ClickBench queries by @peasee in https://github.com/spiceai/spiceai/pull/4092
fix: Check Postgres acceleration schema on insert by @peasee in https://github.com/spiceai/spiceai/pull/4094
Update v1.0.0-rc.3.md by @karifabri in https://github.com/spiceai/spiceai/pull/4096
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4093
First-class TSV for file data connector by @lukekim in https://github.com/spiceai/spiceai/pull/4098
Allow Flight DoPut only for write api-keys by @sgrebnov in https://github.com/spiceai/spiceai/pull/4010
Only create tables eval.runs and eval.results when an eval is defined by @Jeadie in https://github.com/spiceai/spiceai/pull/4099
Update Copyright year to include 2025 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4100
feat: add postgres clickbench accelerator, release postgres accelerator by @peasee in https://github.com/spiceai/spiceai/pull/4111
Add spice binaries with metal to releases; detect metal device in spice install/upgrade. by @Jeadie in https://github.com/spiceai/spiceai/pull/4097
docs: Clarify connector release criteria by @peasee in https://github.com/spiceai/spiceai/pull/4112
Update datafusion-federation to fix LIMIT with OFFSET handling in logical plan rewrite by @ewgenius in https://github.com/spiceai/spiceai/pull/4115
Support Grok AI. by @Jeadie in https://github.com/spiceai/spiceai/pull/4113
Fix spice chat usage bar. by @Jeadie in https://github.com/spiceai/spiceai/pull/4119
Set unified max encoding and decoding message size for all flight client configurations across runtime by @ewgenius in https://github.com/spiceai/spiceai/pull/4116
feat: Add the file connector as an appendable benchmark connector by @peasee in https://github.com/spiceai/spiceai/pull/4120
Add spice eval command by @lukekim in https://github.com/spiceai/spiceai/pull/4118
Support multi-level table nesting for Dremio by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4129
feat: run append TPCH benchmarks in workflow (Arrow, DuckDB) by @peasee in https://github.com/spiceai/spiceai/pull/4131
Fix bug in Iceberg tables selecting a subset of columns by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4132
feat: Run append TPCDS benchmarks in workflow (Arrow, DuckDB) by @peasee in https://github.com/spiceai/spiceai/pull/4141
Setup spice.ai clickbench by @ewgenius in https://github.com/spiceai/spiceai/pull/4134
Data is streamed when reading from the GitHub connector (GraphQL tables) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4142
Mark the GitHub Data Connector as Stable by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4143
Fix table quoting for Databricks Spark connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4145
Extend flight compute context for spice.ai connector with org and app names, to fix federated queries from different spice.ai data sources by @ewgenius in https://github.com/spiceai/spiceai/pull/4144
Enforce Flight DoPut policies: Rate Limiting, Read Timeout, and Max Records per Batch by @sgrebnov in https://github.com/spiceai/spiceai/pull/4117
Fix bug Changes in catalog.yaml would require saving in spicepod.yaml to apply by @sgrebnov in https://github.com/spiceai/spiceai/pull/4147
Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/4137
Add test-framework crate to contain all common benchmark, E2E, integration testing logic. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4157
Fix platform_option variable in build_and_release.yml. by @Jeadie in https://github.com/spiceai/spiceai/pull/4154
feat: Add Clickbench append benchmark for DuckDB and Arrow by @peasee in https://github.com/spiceai/spiceai/pull/4160
Upload artifacts to Minio on build_and_release by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4159
feat: add on zero results benchmark by @peasee in https://github.com/spiceai/spiceai/pull/4164
Update spice.ai connector tests by @ewgenius in https://github.com/spiceai/spiceai/pull/4161

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.3...v1.0.0-rc.4

- Rust
Published by phillipleblanc over 1 year ago

https://github.com/spiceai/spiceai - v1.0.0-rc.3

Spice v1.0-rc.3 (Dec 30, 2024)

Spice v1.0.0-rc.3 is the third release candidate for the first major version of Spice.ai OSS. This release continues the focus on production readiness and includes new Iceberg Catalog APIs, DuckDB improvements, and a new Iceberg Catalog Connector.

Highlights in v1.0-rc.3

Iceberg Catalog APIs: Spice now functions as an Iceberg Catalog provider, implementing a core subset of the Iceberg Catalog APIs. This lets Iceberg Catalog clients natively discover datasets and schemas through Spice APIs.
GET /v1/namespaces - List all catalogs registered in Spice.
GET /v1/namespaces?parent=catalog - List schemas registered under a given catalog.
GET /v1/namespaces/:catalog_schema/tables - List tables registered under a given schema.
GET /v1/namespaces/:catalog_schema/tables/:table - Get the schema of a given table.
Iceberg Catalog Connector: The Iceberg Catalog Connector is a new integration to discover and query datasets from a remote Iceberg Catalog.

Example connecting to a remote Iceberg Catalog with tables stored in S3:

catalogs:
  - from: iceberg:https://my-iceberg-catalog.com/v1/namespaces
    name: ice
    params:
      iceberg_s3_access_key_id: ${secrets:ICEBERG_S3_ACCESS_KEY_ID}
      iceberg_s3_secret_access_key: ${secrets:ICEBERG_S3_SECRET_ACCESS_KEY}
      iceberg_s3_region: us-east-1

View the Iceberg Catalog Connector documentation for more details.
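For the Iceberg Catalog APIs that Spice itself serves, the four GET routes listed above can be assembled as in this sketch. The base address http://localhost:8090, the helper names, and the example namespace spice.public are assumptions for illustration; how Spice encodes the combined catalog and schema in :catalog_schema is not specified here, so the namespace is passed through as an opaque string.

```python
from urllib.parse import quote, urlencode

# Assumption: a local Spice runtime with its HTTP endpoint on port 8090.
BASE_URL = "http://localhost:8090"


def list_catalogs_url(base: str = BASE_URL) -> str:
    """GET /v1/namespaces: list all catalogs registered in Spice."""
    return f"{base}/v1/namespaces"


def list_schemas_url(catalog: str, base: str = BASE_URL) -> str:
    """GET /v1/namespaces?parent=catalog: list schemas under a catalog."""
    return f"{base}/v1/namespaces?{urlencode({'parent': catalog})}"


def list_tables_url(namespace: str, base: str = BASE_URL) -> str:
    """GET /v1/namespaces/:catalog_schema/tables: list tables under a schema."""
    return f"{base}/v1/namespaces/{quote(namespace, safe='')}/tables"


def table_schema_url(namespace: str, table: str, base: str = BASE_URL) -> str:
    """GET /v1/namespaces/:catalog_schema/tables/:table: get a table's schema."""
    return f"{list_tables_url(namespace, base)}/{quote(table, safe='')}"


# Hypothetical namespace and table names.
urls = [
    list_catalogs_url(),
    list_schemas_url("spice"),
    list_tables_url("spice.public"),
    table_schema_url("spice.public", "taxi_trips"),
]
```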

DuckDB Improvements: Added cosine_distance support for DuckDB-backed vector search, improved handling of unnest, array_element, and nested types in lists, and optimized query performance.
SQLite Data Accelerator: Graduated to Release Candidate (RC).
File Data Accelerator: Graduated to Release Candidate (RC).

Breaking changes

API: v1/datasets/sample has been removed. It was not particularly useful and can be replicated via SQL or via the tools endpoint POST v1/tools/:name.

Cookbook

New Language Model Evals Recipe showing how to measure the performance of a language model using LLM-as-Judge, configured entirely in the Spice runtime.
New Iceberg Catalog Recipe showing how to use Spice to query Iceberg tables from an Iceberg catalog.

Dependencies

OpenTelemetry: Upgraded from 0.26.0 to 0.27.1
Go: Upgraded from 1.22 to 1.23 (CLI)

Contributors

@sgrebnov
@phillipleblanc
@peasee
@Jeadie
@Sevenannn
@lukekim
@ewgenius

What's Changed

Add CI configuration for search benchmark dataset access by @sgrebnov in https://github.com/spiceai/spiceai/pull/3888
Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/3895
Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3896
chore: Update helm chart for RC.2 by @peasee in https://github.com/spiceai/spiceai/pull/3899
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/3903
chore: Update MacOS test release install to macos-13 by @peasee in https://github.com/spiceai/spiceai/pull/3901
Add usage to spice chat and fix v1/models?status=true. by @Jeadie in https://github.com/spiceai/spiceai/pull/3898
chore: Bump versions for rc3 by @peasee in https://github.com/spiceai/spiceai/pull/3902
docs: Update endgame with a step to verify dependencies in release notes by @peasee in https://github.com/spiceai/spiceai/pull/3897
Ensure eval dataset input and ouput of correct length by @Jeadie in https://github.com/spiceai/spiceai/pull/3900
spice add/connect/dataset configure should update spicepod, not overwrite it & upgrade to Go 1.23 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3905
Bump opentelemetry from 0.26.0 to 0.27.1 by @dependabot in https://github.com/spiceai/spiceai/pull/3879
Ensure trace_id is overridden for prior written spans by @Jeadie in https://github.com/spiceai/spiceai/pull/3906
add 'role': 'assistant' for local models by @Jeadie in https://github.com/spiceai/spiceai/pull/3910
Run tpcds benchmark for file connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3924
Update to reference cookbook instead of quickstarts/samples by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3928
Fix/remove flaky integration tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3930
Implement /v1/iceberg/namespaces & /v1/iceberg/config APIs by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3923
Add script for creating tpcds parquet files and spicepod for file connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3931
Use utoipa to generate openapi.json and swagger for dev by @Jeadie in https://github.com/spiceai/spiceai/pull/3927
fuzzy_match, json_match, includes scorer by @Jeadie in https://github.com/spiceai/spiceai/pull/3926
Implement /v1/iceberg/namespaces/:namespace by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3933
Implement GET /v1/iceberg/namespaces/:namespace/tables API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3934
Add custom Spice DuckDB dialect with cosine_distance support by @sgrebnov in https://github.com/spiceai/spiceai/pull/3938
Fix NSQL error: all columns in a record batch must have the same length by @sgrebnov in https://github.com/spiceai/spiceai/pull/3947
Don't include tools use in hf test model by @Jeadie in https://github.com/spiceai/spiceai/pull/3955
Implement GET /v1/namespaces/{namespace}/tables/{table} API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3940
Update dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3967
DuckDB: add support for nested types in Lists by @sgrebnov in https://github.com/spiceai/spiceai/pull/3961
Add script to set up clickbench for file connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3945
docs: Add connector stable criteria by @peasee in https://github.com/spiceai/spiceai/pull/3908
Update Roadmp Dec 23, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/3978
Improve CI testing for OpenAPI, new tool spiceschema, fix broken OpenAPI stuff. by @Jeadie in https://github.com/spiceai/spiceai/pull/3948
remove v1/datasets/sample by @Jeadie in https://github.com/spiceai/spiceai/pull/3981
feat: add SQLite ClickBench benchmark by @peasee in https://github.com/spiceai/spiceai/pull/3975
Remove feature 'llms/mistralrs' by @Jeadie in https://github.com/spiceai/spiceai/pull/3984
Add support for 'params.spice_tools: nsql' by @Jeadie in https://github.com/spiceai/spiceai/pull/3985
Fix integration tests - add missing format query parameter in /v1/status requests by @ewgenius in https://github.com/spiceai/spiceai/pull/3989
Enhance AI tools sampling logic for robust handling of large fields by @sgrebnov in https://github.com/spiceai/spiceai/pull/3959
Fix subquery federation by @Sevenannn in https://github.com/spiceai/spiceai/pull/3991
Fix unnest and add DuckDB support for array_element by @sgrebnov in https://github.com/spiceai/spiceai/pull/3995
Add score value snapshotting to vector similarity search tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3996
Use Llama-3.2-3B-Instruct for Hugging Face integration testing by @sgrebnov in https://github.com/spiceai/spiceai/pull/3992
Simplify construct_chunk_query_sql for DuckDB compatibility by @sgrebnov in https://github.com/spiceai/spiceai/pull/3988
Update TPCH and TPCDS benchmarks for spice.ai connector by @ewgenius in https://github.com/spiceai/spiceai/pull/3982
Correctly pass Hugging Face token in models integration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3997
Fix: on_zero_results causes TransactionContext Error: Catalog write-write conflict on create with "attachment_0" by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3998
Add DuckDB acceleration to search benchmarks by @sgrebnov in https://github.com/spiceai/spiceai/pull/4000
Enable Postgres write via non-default postgres-write feature flag by @sgrebnov in https://github.com/spiceai/spiceai/pull/4004
Allow search benchmark to write test results by @sgrebnov in https://github.com/spiceai/spiceai/pull/4008
Make Flight DoPut atomic and commit write only on successful stream completion by @sgrebnov in https://github.com/spiceai/spiceai/pull/4002
Create a CatalogConnector abstraction by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4003
Fix generate-openapi.yml and add .schema/openapi.json. by @Jeadie in https://github.com/spiceai/spiceai/pull/3983
Enable spice.ai tpcds bench workflow. Comment failing tpch queries. by @ewgenius in https://github.com/spiceai/spiceai/pull/4001
feat: Add SQLite ClickBench overrides by @peasee in https://github.com/spiceai/spiceai/pull/4016
Implement Iceberg Catalog Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4053
feat: Datafusion updates for SQLite fixes and release by @peasee in https://github.com/spiceai/spiceai/pull/4054
docs: Add accelerator stable release criteria by @peasee in https://github.com/spiceai/spiceai/pull/4017
Add dremio tpch / tpcds benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/4063
Update docs, and make PR to spiceai/docs for new openapi.json. by @Jeadie in https://github.com/spiceai/spiceai/pull/4019
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4065
Fix dremio subquery rewrite by @Sevenannn in https://github.com/spiceai/spiceai/pull/4064
Update generate-openapi.yml by @Jeadie in https://github.com/spiceai/spiceai/pull/4073
docs: Add catalog criteria by @peasee in https://github.com/spiceai/spiceai/pull/4052
fix distinct_columns in auto/nsql tool groups by @Jeadie in https://github.com/spiceai/spiceai/pull/4074
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4075
Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4076
Implement window_func_support_window_frame from DremioDialect by @Sevenannn in https://github.com/spiceai/spiceai/pull/4012
Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4079
Promote file connector to rc by @Sevenannn in https://github.com/spiceai/spiceai/pull/4080
Add Iceberg to README by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4085
Fix '/v1/status' default format by @Jeadie in https://github.com/spiceai/spiceai/pull/4081

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.2...v1.0.0-rc.3

- Rust
Published by lukekim over 1 year ago

https://github.com/spiceai/spiceai - v1.0.0-rc.2

Spice v1.0-rc.2 (Dec 16, 2024)

Spice v1.0.0-rc.2 is the second release candidate for the first major version of Spice.ai OSS. This release continues to build on the stability of Spice for production use, including key Data Connector graduations, bug fixes, and AI features.

Highlights in v1.0-rc.2

MS SQL and File Data Connectors: Graduated from Alpha to Beta.
GraphQL and Databricks Delta Lake Data Connectors: Graduated from Beta to Release Candidate.
gospice SDK Release: The Spice Go SDK has been updated to v7.0, adding support for refreshing datasets and upgrading dependencies.
Azure AI Support: Added support for both LLMs and embedding models. Example spicepod.yml configuration:

embeddings:
  - name: azure
    from: azure:text-embedding-3-small
    params:
      endpoint: https://your-resource-name.openai.azure.com
      azure_api_version: 2024-08-01-preview
      azure_deployment_name: text-embedding-3-small
      azure_api_key: ${ secrets:SPICE_AZURE_API_KEY }
models:
  - name: azure
    from: azure:gpt-4o-mini
    params:
      endpoint: https://your-resource-name.openai.azure.com
      azure_api_version: 2024-08-01-preview
      azure_deployment_name: gpt-4o-mini
      azure_api_key: ${ secrets:SPICE_AZURE_TOKEN }

Accelerate subsets of columns: Spice now supports acceleration for specific columns from a federated source. Specify the desired columns directly in the Refresh SQL for more selective and efficient data acceleration.

Example spicepod.yaml configuration:

datasets:
  - from: s3://spiceai-demo-datasets/taxi_trips/2024/
    name: taxi_trips
    params:
      file_format: parquet
    acceleration:
      refresh_sql: SELECT tpep_pickup_datetime, tpep_dropoff_datetime, trip_distance, total_amount FROM taxi_trips

Breaking changes

Sharepoint Authentication Parameters: now use access tokens instead of authorization codes, using the sharepoint_bearer_token parameter. The sharepoint_auth_code parameter has been removed.

Data Connector Delimiters: now support / and ://, in addition to : in the from parameter of the dataset configuration. The following examples are equivalent:

from: postgres://my_postgres_table
from: postgres/my_postgres_table
from: postgres:my_postgres_table

Some data connectors, such as s3 which only accepts ://, place further restrictions on the allowed delimiter.

The file data connector has changed how it interprets the :// delimiter to reflect how most other URL parsers work. Given file://my_file_path, the path was previously interpreted as /my_file_path; it is now interpreted as the relative path my_file_path.
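The delimiter rule can be illustrated with a small parser. This is a sketch of the behavior described above, not Spice's actual implementation; split_from and its regular expression are hypothetical, and connector-specific restrictions (such as s3 accepting only ://) are not modeled.

```python
import re
from typing import Tuple

# "://" is tried before ":" and "/" so it is matched as one delimiter
# rather than as ":" followed by a path starting with "//".
_FROM_PATTERN = re.compile(r"^(?P<connector>[^:/]+)(?P<delimiter>://|:|/)(?P<path>.*)$")


def split_from(value: str) -> Tuple[str, str]:
    """Split a dataset `from` value into (connector, path)."""
    match = _FROM_PATTERN.match(value)
    if match is None:
        raise ValueError(f"no connector delimiter found in {value!r}")
    return match.group("connector"), match.group("path")


forms = [
    "postgres://my_postgres_table",
    "postgres/my_postgres_table",
    "postgres:my_postgres_table",
]
# All three spellings collapse to a single (connector, path) pair.
parsed = {split_from(form) for form in forms}

# The file connector example from the text yields a relative path.
file_connector, file_path = split_from("file://my_file_path")
```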

Spice Search limit: is now applied to the final search result. Previously, it was applied separately to each dataset involved in a search, before aggregation.
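The difference between the two behaviors can be sketched with toy data. The function names, dataset names, and scores below are hypothetical, and the code only models the ordering of the limit and aggregation steps as described, not the search implementation itself.

```python
import heapq
from typing import Dict, List, Tuple

Hit = Tuple[str, float]  # (document id, similarity score); higher is better


def limit_per_dataset(results: Dict[str, List[Hit]], limit: int) -> List[Hit]:
    """Previous behavior: take the top `limit` hits per dataset, then aggregate."""
    merged: List[Hit] = []
    for hits in results.values():
        merged.extend(heapq.nlargest(limit, hits, key=lambda hit: hit[1]))
    return sorted(merged, key=lambda hit: hit[1], reverse=True)


def limit_final(results: Dict[str, List[Hit]], limit: int) -> List[Hit]:
    """New behavior: aggregate all hits first, then take the top `limit` overall."""
    all_hits = [hit for hits in results.values() for hit in hits]
    return heapq.nlargest(limit, all_hits, key=lambda hit: hit[1])


results = {
    "docs": [("d1", 0.91), ("d2", 0.40), ("d3", 0.35)],
    "tickets": [("t1", 0.88), ("t2", 0.86), ("t3", 0.10)],
}
old_style = limit_per_dataset(results, limit=2)  # up to 2 hits per dataset
new_style = limit_final(results, limit=2)        # exactly 2 hits in total
```

With two datasets and a limit of 2, the per-dataset variant returns four hits while the final-result variant returns two, which is the change callers are most likely to notice.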

Dependencies

Rust: Upgraded to 1.83

Contributors

@phillipleblanc
@ewgenius
@Jeadie
@sgrebnov
@peasee
@Sevenannn
@Advayp

New Contributors

@Advayp made their first contribution in https://github.com/spiceai/spiceai/pull/3862

What's Changed

Fix install scripts to handle the RC release by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3718
Update helm chart to v1.0.0-rc.1 by @ewgenius in https://github.com/spiceai/spiceai/pull/3720
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/3719
Add logic to ignore task cancellations due to runtime shutdown by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3717
Update to next relese version v1.0.0-rc.2 by @ewgenius in https://github.com/spiceai/spiceai/pull/3721
Handle parsing OTel KeyValues from the baggage header by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3722
Update llms dependencies: mistralrs, async-openai by @Jeadie in https://github.com/spiceai/spiceai/pull/3725
Support jsonl for object store by @Jeadie in https://github.com/spiceai/spiceai/pull/3726
Fix NSQL models integration tests for HF by @sgrebnov in https://github.com/spiceai/spiceai/pull/3727
standardise 'csv_schema_infer_max_records' -> 'schema_infer_max_records'; include deprecation messages for dataset params by @Jeadie in https://github.com/spiceai/spiceai/pull/3732
feat: Add script to generate TPC-H data for file connector by @peasee in https://github.com/spiceai/spiceai/pull/3737
feat: Add file connector integration test by @peasee in https://github.com/spiceai/spiceai/pull/3735
fix: Add explicit message for ODBC connector when not installed by @peasee in https://github.com/spiceai/spiceai/pull/3736
Remove Box::leak in create_accelerated_table by @sgrebnov in https://github.com/spiceai/spiceai/pull/3739
docs: Update enhancement and PR template by @peasee in https://github.com/spiceai/spiceai/pull/3740
feat: add file connector benchmark by @peasee in https://github.com/spiceai/spiceai/pull/3734
docs: Release file connector beta by @peasee in https://github.com/spiceai/spiceai/pull/3738
For embeddings, use sentence_*_config.json, download HF async, use TEI functions by @Jeadie in https://github.com/spiceai/spiceai/pull/3724
Optimize build & release workflow for trunk builds by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3741
Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3752
Skip Spice cloud integration tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3755
Add http_requests metric and deprecate http_requests_total by @sgrebnov in https://github.com/spiceai/spiceai/pull/3748
Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3759
fix: Parquet file generation script by @peasee in https://github.com/spiceai/spiceai/pull/3762
fix: Use InvalidConfiguration error for GraphQL query errors by @peasee in https://github.com/spiceai/spiceai/pull/3763
Extend Spice Search integration and E2E tests to cover chunking by @sgrebnov in https://github.com/spiceai/spiceai/pull/3750
test: Add GraphQL integration tests from external sources by @peasee in https://github.com/spiceai/spiceai/pull/3756
docs: Release GraphQL release candidate by @peasee in https://github.com/spiceai/spiceai/pull/3764
Accelerate a subset of columns from source dataset in Refresh SQL by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3765
Run TPCDS benchmark for databricks delta mode by @Sevenannn in https://github.com/spiceai/spiceai/pull/3751
Update dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3747
Implement vector search benchmark initialization by @sgrebnov in https://github.com/spiceai/spiceai/pull/3774
Implement InvalidTypeAction for PostgreSQL Data Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3767
fix: Check ODBC parameters are positive integers by @peasee in https://github.com/spiceai/spiceai/pull/3777
Fix Delta DataType Map type mapping to arrow type by @Sevenannn in https://github.com/spiceai/spiceai/pull/3776
Update Databricks & Delta Lake Connector RC criteria by @Sevenannn in https://github.com/spiceai/spiceai/pull/3778
Add a /v1/packages/generate API to generate a Spicepod package from a GitHub repo. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3782
Set Spice-Target-Source header for spice add by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3783
Call v1 spicerack API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3784
Run models integration tests on self-hosted macOS runners by @sgrebnov in https://github.com/spiceai/spiceai/pull/3785
Fix OpenAI models integration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3786
Integration test for Databricks delta_lake mode by @Sevenannn in https://github.com/spiceai/spiceai/pull/3779
Add spice connect for connecting to existing Spice.ai instances by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3790
Add eval spicepod component; basic HTTP api to run eval. by @Jeadie in https://github.com/spiceai/spiceai/pull/3766
Release RC for databricks delta_lake mode by @Sevenannn in https://github.com/spiceai/spiceai/pull/3792
Include Huggingface model to E2E models tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3788
Enable trace_id & parent_span_id overrides for v1/chat/completion by @Jeadie in https://github.com/spiceai/spiceai/pull/3791
Search benchmark: run search workload and measure result by @sgrebnov in https://github.com/spiceai/spiceai/pull/3793
Search benchmark: measure search precision by @sgrebnov in https://github.com/spiceai/spiceai/pull/3804
Use MinIO instead of S3 for benchmark tests by @Sevenannn in https://github.com/spiceai/spiceai/pull/3794
Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3814
Only verify TPCH / TPCDS official query results for DuckDB by @Sevenannn in https://github.com/spiceai/spiceai/pull/3816
Fixes for the Debezium connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3819
Fix insert statement when all columns are constraint columns by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3820
docs: Move ODBC to Beta for current state of roadmap by @peasee in https://github.com/spiceai/spiceai/pull/3823
Accept :, / or :// as the delimiter for the data connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3821
Update dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3826
Enable read_write mode support for Postgres Data Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/3813
feat: add Databricks ODBC TPCDS benchmark by @peasee in https://github.com/spiceai/spiceai/pull/3825
Change spice.ai data connector dataset path format to <org>/<app>/datasets/<table_reference> by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3828
fix: enable tpcds explain snapshotting by @peasee in https://github.com/spiceai/spiceai/pull/3830
Azure AI support for both LLMs & embedding models by @Jeadie in https://github.com/spiceai/spiceai/pull/3824
Add Github Workflow to run Search Benchmark by @sgrebnov in https://github.com/spiceai/spiceai/pull/3834
Fetch access token with Microsoft OAuth, and use access token to initiate Sharepoint data connector graph client by @Sevenannn in https://github.com/spiceai/spiceai/pull/3836
Initialize accelerator for datasets dynamically included by @Sevenannn in https://github.com/spiceai/spiceai/pull/3714
Update cargo.lock by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3838
feat: add MS SQL TPCH benchmark by @peasee in https://github.com/spiceai/spiceai/pull/3833
Improve Azure AI models support by @sgrebnov in https://github.com/spiceai/spiceai/pull/3835
Primary key support for Arrow's Memtable by @Jeadie in https://github.com/spiceai/spiceai/pull/3829
Update Tokenizer to 0.21 and mistral.rs by @Jeadie in https://github.com/spiceai/spiceai/pull/3839
Fix models integration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3843
Enable spice login abfs by @Sevenannn in https://github.com/spiceai/spiceai/pull/3844
update crates/llms dependencies to 'spiceai' branch by @Jeadie in https://github.com/spiceai/spiceai/pull/3846
Make eval runs non-blocking; spice.eval.{results, runs} tables. by @Jeadie in https://github.com/spiceai/spiceai/pull/3780
fix: Update GraphQL snapshots by @peasee in https://github.com/spiceai/spiceai/pull/3849
Update to Rust 1.83 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3847
feat: add mssql integration test by @peasee in https://github.com/spiceai/spiceai/pull/3848
Prepend user-specified user agent in flight repl by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3850
fix: trim CHAR in mssql by @peasee in https://github.com/spiceai/spiceai/pull/3852
Fix column quoting for SpiceCloudPlatform dialect by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3857
Optimize builds by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3861
Endgame template: Add recently added AI/ML quickstarts and samples by @sgrebnov in https://github.com/spiceai/spiceai/pull/3859
docs: Release MS SQL Beta by @peasee in https://github.com/spiceai/spiceai/pull/3853
Fix nsql sampling for tables with embeddings by @sgrebnov in https://github.com/spiceai/spiceai/pull/3860
Make GH workflows with spiceai-macos runners more stable by @sgrebnov in https://github.com/spiceai/spiceai/pull/3863
fix: Remove GraphQL swapi test by @peasee in https://github.com/spiceai/spiceai/pull/3867
create 1 tokio::test per test/model by @Jeadie in https://github.com/spiceai/spiceai/pull/3696
handle max_completion_tokens vs max_tokens for openai vs azure by @Jeadie in https://github.com/spiceai/spiceai/pull/3869
Search benchmark: write results to dataset by @sgrebnov in https://github.com/spiceai/spiceai/pull/3871
Create evalconverter that creates spice eval components. by @Jeadie in https://github.com/spiceai/spiceai/pull/3864
Update quickstart in README.md by @ewgenius in https://github.com/spiceai/spiceai/pull/3876
Remove reference to spiceai-smart-demo from the repo home by @sgrebnov in https://github.com/spiceai/spiceai/pull/3885
Trace evals accelerated tables updates in debug mode by @sgrebnov in https://github.com/spiceai/spiceai/pull/3884
Clarify confusing log message by @Advayp in https://github.com/spiceai/spiceai/pull/3862
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/3840
Azure OpenAI models: make endpoint parameter required by @sgrebnov in https://github.com/spiceai/spiceai/pull/3883
Use spiceai delta kernel fork, actionable message for delta checkpoint errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/3856
Add support for GGUF files in HF by @Jeadie in https://github.com/spiceai/spiceai/pull/3875

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.1...v1.0.0-rc.2

- Rust
Published by peasee over 1 year ago

https://github.com/spiceai/spiceai - v1.0.0-rc.1

Spice v1.0-rc.1 (Nov 27, 2024)

Spice v1.0.0-rc.1 marks the release candidate for the first major version of Spice.ai OSS. This milestone includes key Connector and Accelerator graduations and bug fixes, positioning Spice for a stable and production-ready release.

Highlights in v1.0-rc.1

API Key Authentication: Spice now supports optional authentication for API endpoints via configurable API keys, for additional security and control over runtime access.

Example Spicepod.yml configuration:

    runtime:
      auth:
        api-key:
          enabled: true
          keys:
            - ${ secrets:api_key } # Load from a secret store
            - my-api-key # Or specify directly

Usage:

HTTP API: Include the API key in the X-API-Key header.
Flight SQL: Use the API key in the Authorization header as a Bearer token.
Spice CLI: Provide the --api-key flag for CLI commands.

For more details on using API Key auth, refer to the API Auth documentation.
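
As a minimal sketch of the usage above, the following Python builds (without sending) an HTTP request that carries the key in the X-API-Key header, plus the Bearer header a Flight SQL client would attach. The base URL `http://localhost:8090`, the key value, and both helper names are assumptions for illustration, not part of Spice.

```python
import urllib.request


def build_http_request(base_url: str, path: str, api_key: str) -> urllib.request.Request:
    """Build a GET request with the API key in the X-API-Key header."""
    req = urllib.request.Request(base_url.rstrip("/") + path, method="GET")
    # urllib stores header names capitalized ("X-api-key"); HTTP header
    # names are case-insensitive, so the server sees the same header.
    req.add_header("X-API-Key", api_key)
    return req


def flight_authorization_header(api_key: str) -> dict:
    """Header for Flight SQL clients: the API key sent as a Bearer token."""
    return {"Authorization": f"Bearer {api_key}"}


# Hypothetical endpoint and key, for illustration only.
request = build_http_request("http://localhost:8090", "/v1/datasets", "my-api-key")
```

Passing `request` to `urllib.request.urlopen` would send it; in practice the key is better loaded from a secret store or environment variable than hard-coded.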

DuckDB Data Connector: Has graduated from Beta to Release Candidate.

Arrow and DuckDB Data Accelerators: Both have graduated from Beta to Release Candidate.

Debezium Kafka Integration: Spice now supports secure authentication and encryption options for Kafka connections when using Debezium for Change Data Capture (CDC). The previous limitation of PLAINTEXT protocol-only connections has been lifted. Spice now supports the following Kafka security configurations:

Security protocol: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL
SASL mechanisms: PLAIN, SCRAM-SHA-256, SCRAM-SHA-512

Example Spicepod.yml configuration:

    datasets:
      - from: debezium:my_kafka_topic_with_debezium_changes
        name: my_dataset
        params:
          kafka_security_protocol: SASL_SSL
          kafka_sasl_mechanism: SCRAM-SHA-512
          kafka_sasl_username: kafka
          kafka_sasl_password: ${secrets:kafka_sasl_password}
          kafka_ssl_ca_location: ./certs/kafka_ca_cert.pem
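
A configuration like this can be checked before it is deployed. The sketch below is a hypothetical pre-flight check, not Spice's own validation: it compares the params against the protocol and mechanism values listed above (spelled with underscores, as in the example configuration). Requiring a username and password for SASL protocols is an assumption of this sketch.

```python
ALLOWED_PROTOCOLS = {"PLAINTEXT", "SSL", "SASL_PLAINTEXT", "SASL_SSL"}
ALLOWED_MECHANISMS = {"PLAIN", "SCRAM-SHA-256", "SCRAM-SHA-512"}


def check_kafka_params(params: dict) -> list:
    """Return a list of problems found in Debezium Kafka security params."""
    problems = []
    # SASL_SSL is the default protocol as of v1.0-rc.1.
    protocol = params.get("kafka_security_protocol", "SASL_SSL")
    if protocol not in ALLOWED_PROTOCOLS:
        problems.append(f"unsupported kafka_security_protocol: {protocol}")
    if protocol.startswith("SASL_"):
        # Only validate the mechanism when one is given; the runtime may
        # apply a default this sketch does not know about.
        mechanism = params.get("kafka_sasl_mechanism")
        if mechanism is not None and mechanism not in ALLOWED_MECHANISMS:
            problems.append(f"unsupported kafka_sasl_mechanism: {mechanism}")
        # Assumption: SASL connections need both credentials.
        for key in ("kafka_sasl_username", "kafka_sasl_password"):
            if not params.get(key):
                problems.append(f"missing {key}")
    return problems
```

An empty list means nothing suspicious was found; each string otherwise names one parameter to review.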

Breaking changes

Model Parameters: The params.spice_tools parameter has been replaced by params.tools. Backward compatibility is maintained for existing configurations using params.spice_tools.
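
For tooling that generates or lints Spicepod model params, the rename can be applied ahead of time. The helper below is a hypothetical sketch: its name and the choice to prefer `tools` when both keys are present are assumptions, and the runtime's own precedence is not stated in these notes.

```python
def normalize_model_params(params: dict) -> dict:
    """Return a copy of params that uses 'tools' instead of 'spice_tools'."""
    out = dict(params)
    legacy = out.pop("spice_tools", None)
    # Assumption: an explicit 'tools' value wins over the legacy key.
    if "tools" not in out and legacy is not None:
        out["tools"] = legacy
    return out
```

The input dictionary is left unmodified, so the helper is safe to call on configuration that is shared elsewhere.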

Dataset Accelerator State: The ready_state parameter has been moved to the dataset level.

Ready Handler Response: The response body of the /v1/ready handler has been changed from Ready (uppercase) to ready (lowercase) for consistency and adherence to standards.
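
Health checks that compare the body against the exact string `Ready` will stop matching after this change. A minimal sketch of a check that accepts both the old and new bodies (the function name is hypothetical):

```python
def is_ready(body: str) -> bool:
    """True for the /v1/ready body before ('Ready') and after ('ready') v1.0-rc.1."""
    return body.strip().lower() == "ready"
```

Comparing case-insensitively lets the same probe work across a rolling upgrade.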

Default Kafka Security for Debezium: The default value of the kafka_security_protocol parameter for Debezium datasets has changed from PLAINTEXT to SASL_SSL, improving security by default.

Metrics Name Updates: Adjustments have been made to specific metrics for improved observability and accuracy:

| Before | v1.0-rc.1 |
| --- | --- |
| catalogs_load_error | catalog_load_errors |
| catalogs_status | catalog_load_state |
| datasets_acceleration_append_duration_ms, datasets_acceleration_load_duration_ms | dataset_acceleration_refresh_duration_ms {mode: append/full} |
| datasets_acceleration_last_refresh_time | dataset_acceleration_last_refresh_time_ms |
| datasets_acceleration_refresh_error | dataset_acceleration_refresh_errors |
| datasets_count | dataset_active_count |
| datasets_load_error | dataset_load_errors |
| datasets_status | dataset_load_state |
| datasets_unavailable_time | dataset_unavailable_time_ms |
| embeddings_count | embeddings_active_count |
| embeddings_load_error | embeddings_load_errors |
| embeddings_status | embeddings_load_state |
| flight_do_action_duration_ms, flight_do_get_get_primary_keys_duration_ms, flight_do_get_get_catalogs_duration_ms, flight_do_get_get_schemas_duration_ms, flight_do_get_get_sql_info_duration_ms, flight_do_get_table_types_duration_ms, flight_do_get_get_tables_duration_ms, flight_do_get_prepared_statement_query_duration_ms, flight_do_get_simple_duration_ms, flight_do_get_statement_query_duration_ms, flight_do_put_duration_ms, flight_handshake_request_duration_ms, flight_list_actions_duration_ms, flight_get_flight_info_request_duration_ms | flight_request_duration_ms {method: method_name, command: command_name} |
| flight_do_action_requests, flight_do_exchange_data_updates_sent, flight_do_exchange_requests, flight_do_put_requests, flight_do_get_requests, flight_handshake_requests, flight_list_actions_requests, flight_list_flights_requests, flight_get_flight_info_requests, flight_get_schema_requests | flight_requests {method: method_name, command: command_name} |
| http_requests_duration_ms | http_request_duration_ms |
| models_count | model_active_count |
| models_load_duration_ms | model_load_duration_ms |
| models_load_error | model_load_errors |
| models_status | model_load_state |
| tool_count | tool_active_count |
| tool_load_error | tool_load_errors |
| tools_status | tool_load_state |
| query_count | query_executions |
| query_execution_duration | query_execution_duration_ms |
| results_cache_hit_count | results_cache_hits |
| results_cache_item_count | results_cache_items_count |
| results_cache_max_size | results_cache_max_size_bytes |
| results_cache_request_count | results_cache_requests |
| results_cache_size | results_cache_size_bytes |
| secrets_stores_load_duration_ms | secrets_store_load_duration_ms |
| bytes_processed | query_processed_bytes |
| bytes_returned | query_returned_bytes |
| spiced_runtime_flight_server_start | runtime_flight_server_started |
| spiced_runtime_http_server_start | runtime_http_server_started |
| views_load_error | view_load_errors |
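
Dashboards and alert expressions that reference the old names need updating. The sketch below rewrites a handful of the one-to-one renames from the table in a query string. Metric names are spelled in snake_case, following names such as query_executions used in the changelog entries; the helper name and the sample expression are hypothetical. Renames that merge several metrics into one labelled metric (for example the Flight duration metrics) need manual attention and are left out.

```python
import re

# Subset of the one-to-one renames from the table (snake_case spelling assumed).
RENAMES = {
    "catalogs_load_error": "catalog_load_errors",
    "datasets_count": "dataset_active_count",
    "models_count": "model_active_count",
    "query_count": "query_executions",
    "bytes_processed": "query_processed_bytes",
}

# \b treats "_" as a word character, so only whole metric names match.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def migrate_expression(expr: str) -> str:
    """Rewrite pre-v1.0-rc.1 metric names in a dashboard or alert expression."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], expr)
```

Because matching is on whole identifiers, a custom metric such as `my_datasets_count` is left untouched.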

Contributors

@phillipleblanc
@sgrebnov
@Jeadie
@Sevenannn
@peasee
@slyons
@barracudarin
@lukekim
@ewgenius

What's changed

Update to next release version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3372
Update Helm chart to v0.20.0-beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3373
Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3375
E2E: Add a test to confirm refreshing with custom refresh-sql via CLI by @sgrebnov in https://github.com/spiceai/spiceai/pull/3374
Fix regression in inferring embedding model vector size for non-default models by @Jeadie in https://github.com/spiceai/spiceai/pull/3376
add AI quickstarts to endgame by @Jeadie in https://github.com/spiceai/spiceai/pull/3378
Remove need for params.model_type for most HF LLMs by @Jeadie in https://github.com/spiceai/spiceai/pull/3342
Replace query_duration_seconds and http_requests_duration_seconds with milliseconds metrics by @sgrebnov in https://github.com/spiceai/spiceai/pull/3251
Add Extension<Runtime> to HTTP routes to simplify tooling in NSQL. by @Jeadie in https://github.com/spiceai/spiceai/pull/3384
Update datafusion patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/3386
Ensure hyperparameters are obeyed in recursive chat/completion calls. by @Jeadie in https://github.com/spiceai/spiceai/pull/3395
fix: update odbc benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/3394
Implement traits & plumbing for pluggable HTTP Auth by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3397
Add allow_http parameter for S3 data connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3398
Add column field to dataset spicepod component by @Jeadie in https://github.com/spiceai/spiceai/pull/3336
feat: add duckdb connector benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/3403
Add integration tests for OpenAI NSQL functionality by @sgrebnov in https://github.com/spiceai/spiceai/pull/3402
Implement optional api-key auth for the HTTP endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3405
Add integration tests for Search API (OpenAI and HF models) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3410
HTTP APIs: list tools, call tool by @Jeadie in https://github.com/spiceai/spiceai/pull/3404
Implement optional api-key auth for the Flight/FlightSQL endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3412
Adding semicolons to some TPCH queries to make sure they run on the CLI by @slyons in https://github.com/spiceai/spiceai/pull/3420
Add GrpcAuth to protect the OpenTelemetry endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3417
Support Kafka-native authentication and TLS connections for Debezium connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3419
Add integration tests for Embeddings API (OpenAI and HF models) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3416
Support base64 embedding format by @Jeadie in https://github.com/spiceai/spiceai/pull/3418
Give local models some love by @Jeadie in https://github.com/spiceai/spiceai/pull/3425
Have views update on --pods-watcher-enabled by @Jeadie in https://github.com/spiceai/spiceai/pull/3428
Simplify running models integration tests locally by @sgrebnov in https://github.com/spiceai/spiceai/pull/3424
Make Debezium connector MySQL compatible by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3432
Store + load memory tooling, enable by @Jeadie in https://github.com/spiceai/spiceai/pull/3413
Statically compile OpenSSL by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3434
Build macOS x64 on macos-14 (Sonoma) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3435
Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3443
Bump azure_core from 0.20.0 to 0.21.0 by @dependabot in https://github.com/spiceai/spiceai/pull/3436
Add integration tests for chat completion API (HF and OpenAI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3433
Run Clickbench with Spice Benchmark Binary by @Sevenannn in https://github.com/spiceai/spiceai/pull/3389
Use datatype_is_semantically_equal in verify_schema by @Sevenannn in https://github.com/spiceai/spiceai/pull/3423
Use spiceai-large-runners to build benchmark binary by @Sevenannn in https://github.com/spiceai/spiceai/pull/3446
Skip reqwest_retry::middleware tracing in non verbose configuration by @sgrebnov in https://github.com/spiceai/spiceai/pull/3445
feat: Add invalid type action handling for DuckDB by @peasee in https://github.com/spiceai/spiceai/pull/3430
Fix benchmark: Lock poisoning issue from INSTA by @Sevenannn in https://github.com/spiceai/spiceai/pull/3457
docs: Release DuckDB Connector RC by @peasee in https://github.com/spiceai/spiceai/pull/3459
DR: Code Pattern For Obtaining Milliseconds-Based Duration by @sgrebnov in https://github.com/spiceai/spiceai/pull/3460
Improve ClickBench setup script: avoid re-downloading test data every time by @sgrebnov in https://github.com/spiceai/spiceai/pull/3463
Fix TableReference quoting for MySQL by @Jeadie in https://github.com/spiceai/spiceai/pull/3461
Tool use and model name for local models by @Jeadie in https://github.com/spiceai/spiceai/pull/3458
params.tools, not params.spice_tools. Allow backwards compatibility to params.spice_tools. by @Jeadie in https://github.com/spiceai/spiceai/pull/3473
fix: Support DuckDB boolean list by @peasee in https://github.com/spiceai/spiceai/pull/3474
Upgrade to DataFusion 43 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3462
Build explicit ODBC Docker image by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3476
Promote Arrow acceleration to RC by @sgrebnov in https://github.com/spiceai/spiceai/pull/3478
Update benchmark workflow to create PR for updating snapshot by @Sevenannn in https://github.com/spiceai/spiceai/pull/3479
Update benchmark snapshots for spice.ai connector tpch by @github-actions in https://github.com/spiceai/spiceai/pull/3481
Update setup-make action by @Sevenannn in https://github.com/spiceai/spiceai/pull/3488
Option to return sql from v1/nsql by @Jeadie in https://github.com/spiceai/spiceai/pull/3487
Adding scripts to run and monitor TPC-H/-DS queries at larger scale factors by @slyons in https://github.com/spiceai/spiceai/pull/3483
Update Datafusion and Datafusion-Table-Providers patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/3489
docs: Update Accelerator RC to specify clickbench in all modes by @peasee in https://github.com/spiceai/spiceai/pull/3490
Add logos and marks by @lukekim in https://github.com/spiceai/spiceai/pull/3485
Updates to repo docs by @lukekim in https://github.com/spiceai/spiceai/pull/3486
Change document_similarity to return markdown, not JSON. by @Jeadie in https://github.com/spiceai/spiceai/pull/3477
Add support for creating embeddings for Utf8View type columns by @sgrebnov in https://github.com/spiceai/spiceai/pull/3498
Add vector search support for Utf8View type columns by @sgrebnov in https://github.com/spiceai/spiceai/pull/3500
Update datafusion-table-providers version by @Jeadie in https://github.com/spiceai/spiceai/pull/3503
Update text-embeddings-inference and mistral.rs from downstream. by @Jeadie in https://github.com/spiceai/spiceai/pull/3505
Fix snapshot update PR push in benchmark by @Sevenannn in https://github.com/spiceai/spiceai/pull/3484
Run FederationAnalyzerRule before ResolveGroupingFunction rule by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3508
Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3509
docs: Release DuckDB accelerator RC by @peasee in https://github.com/spiceai/spiceai/pull/3512
Upgrade datafusion-functions-json to 0.43 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3511
Update Datafusion Table Provider patch to fix MySQL refresh append mode by @Sevenannn in https://github.com/spiceai/spiceai/pull/3514
Handle panics in HF API calls by @Jeadie in https://github.com/spiceai/spiceai/pull/3521
Update Runtime metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3518
Update Flight metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3515
Update Results Cache metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3520
Move ready_state to dataset level by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3526
Add --force option to spice upgrade to force it to upgrade to the latest released version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3527
Refactor runtime initialization into separate modules by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3531
Update Anonymous telemetry metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3529
Add Metrics naming principles and guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3516
Update Dataset Acceleration metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3528
Improve localpod startup to register immediately after its parent is registered by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3532
AI/LLM integration tests: make tests more robust and verify more ai_tools by @sgrebnov in https://github.com/spiceai/spiceai/pull/3513
Update dashboards to match new metrics names by @sgrebnov in https://github.com/spiceai/spiceai/pull/3530
Clarify source of prefixes for data component parameters. by @Jeadie in https://github.com/spiceai/spiceai/pull/3541
Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3564
Update Spice release process to support release branches by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3525
fix: Validate the endpoint for ABFS and S3 by @peasee in https://github.com/spiceai/spiceai/pull/3565
Vector Search: Default to datasets with embeddings only when none are specified by @sgrebnov in https://github.com/spiceai/spiceai/pull/3575
Lowercase the ready handler response by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3577
Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3579
Improve spice search error handling by @sgrebnov in https://github.com/spiceai/spiceai/pull/3571
Load components in parallel, not concurrently by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3566
fix: Make S3 auth parameter validation more robust: by @peasee in https://github.com/spiceai/spiceai/pull/3578
fix: Infer if the specified file format is correct in object store by @peasee in https://github.com/spiceai/spiceai/pull/3580
Add ability to configure CORS on the HTTP server by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3581
fix: Handle invalid S3 auth and region better by @peasee in https://github.com/spiceai/spiceai/pull/3582
allow setting of replicaCount to a falsy-value by @barracudarin in https://github.com/spiceai/spiceai/pull/3586
spice search to default to only datasets with embeddings by @sgrebnov in https://github.com/spiceai/spiceai/pull/3588
Run AI integration tests as part of CI by @sgrebnov in https://github.com/spiceai/spiceai/pull/3572
Load datasets in parallel by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3585
Run integration test on smaller runners by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3583
Use folders for model component by @Jeadie in https://github.com/spiceai/spiceai/pull/3584
Improve models integration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3592
Change default task_history captured_output to none by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3598
Add timeout to /v1/datasets APIs when app is locked by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3601
Properly drop the read lock on the runtime app in http.start by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3603
Make integration tests more robust on fewer cores by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3604
refactor: First pass data connector error messages update by @peasee in https://github.com/spiceai/spiceai/pull/3602
Add log if no datasets are configured by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3605
Upgrade to DuckDB 1.1.3 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3606
Add E2E test for spice search and chat functionality (OpenAI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3599
Use spiceai-runners for TPCH / TPCDS benchmark by @Sevenannn in https://github.com/spiceai/spiceai/pull/3507
docs: Update error handling guide by @peasee in https://github.com/spiceai/spiceai/pull/3611
Improve default description for sql tool by @Jeadie in https://github.com/spiceai/spiceai/pull/3612
Update metric name from query_invocations to query_executions by @sgrebnov in https://github.com/spiceai/spiceai/pull/3613
Don't provide runtime tools to health check. by @Jeadie in https://github.com/spiceai/spiceai/pull/3615
Sort vector search results based on similarity score by @sgrebnov in https://github.com/spiceai/spiceai/pull/3620
Allow overriding runtime configuration with --set-runtime CLI flags by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3619
Some bugs by @Jeadie in https://github.com/spiceai/spiceai/pull/3621
Improve S3 errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/3640
Update Databricks, Delta Lake, DuckDB error messages by @Sevenannn in https://github.com/spiceai/spiceai/pull/3642
docs: Add error message UX to beta connector criteria by @peasee in https://github.com/spiceai/spiceai/pull/3639
feat: Make REPL identify it's waiting on a new line by @peasee in https://github.com/spiceai/spiceai/pull/3617
Wrap Server-Sent-Events chat errors as OpenAI error events by @sgrebnov in https://github.com/spiceai/spiceai/pull/3641
refactor: Update accelerated table errors, dataset health monitor errors by @peasee in https://github.com/spiceai/spiceai/pull/3614
Extend v1/datasets api to indicate if dataset can be used in vector search by @sgrebnov in https://github.com/spiceai/spiceai/pull/3644
feat: Unnest DataFusion errors by @peasee in https://github.com/spiceai/spiceai/pull/3646
feat: Add RateLimited DataConnectorError by @peasee in https://github.com/spiceai/spiceai/pull/3648
Setup nightly docker release workflow by @ewgenius in https://github.com/spiceai/spiceai/pull/3649
Make LLM integration tests more extensible. by @Jeadie in https://github.com/spiceai/spiceai/pull/3576
feat: Update ODBC error messages by @peasee in https://github.com/spiceai/spiceai/pull/3651
feat: Better tonic errors by @peasee in https://github.com/spiceai/spiceai/pull/3650
Nightly release workflow fixes by @ewgenius in https://github.com/spiceai/spiceai/pull/3652
Fix missing ARM64 image for nightly publish step by @ewgenius in https://github.com/spiceai/spiceai/pull/3653
Use GitHub GraphQL rate limiting responses to rate limit requests by @lukekim in https://github.com/spiceai/spiceai/pull/3610
Fix typo in nightly release publish step by @ewgenius in https://github.com/spiceai/spiceai/pull/3654
Handle GitHub rate-limiting for the Rest API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3656
Adding custom User-Agent parameters to chat, nsql and flightrepl by @slyons in https://github.com/spiceai/spiceai/pull/3609
Remove "nightly-" prefix from tag by @ewgenius in https://github.com/spiceai/spiceai/pull/3671
Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3670
spice search to warn if dataset is not ready and won't be included in search by @sgrebnov in https://github.com/spiceai/spiceai/pull/3590
Fix keyring secret store to try both prefixed & unprefixed secrets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3672
Handle empty embeds by allowing for nulls by @Jeadie in https://github.com/spiceai/spiceai/pull/3600
Improve github connector error by @Sevenannn in https://github.com/spiceai/spiceai/pull/3677
Update FlightSQL error messages by @sgrebnov in https://github.com/spiceai/spiceai/pull/3676
Update Datafusion Table Provider Patch to include error message improvements by @Sevenannn in https://github.com/spiceai/spiceai/pull/3678
Integration tests for llms crate, with basic Anthropic test. by @Jeadie in https://github.com/spiceai/spiceai/pull/3647
Allow E2E model tests to complete even if parallel platform tests failed by @sgrebnov in https://github.com/spiceai/spiceai/pull/3679
Add Openai to llms testing by @Jeadie in https://github.com/spiceai/spiceai/pull/3680
Fix .env in '.github/workflows/integration_llms.yml' by @Jeadie in https://github.com/spiceai/spiceai/pull/3686
Improve error messages for spice ai connector, separate errors to different lines for DuckDB, Delta Lake, Databricks connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3643
Add microsoft/Phi-3-mini-4k-instruct to llms crate testing, with MODEL_SKIPLIST & MODEL_ALLOWLIST by @Jeadie in https://github.com/spiceai/spiceai/pull/3690
Add nightly label to spiced version in Cargo.toml by @ewgenius in https://github.com/spiceai/spiceai/pull/3691
Disable HF in models integration tests (not supported) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3693
Add log when CORS is enabled by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3695
Fix nightly release workflow by @ewgenius in https://github.com/spiceai/spiceai/pull/3698
Correctly set nightly labels for both release and pre-release versions by @ewgenius in https://github.com/spiceai/spiceai/pull/3699
Improve REPL error handling for multiline error messages by @sgrebnov in https://github.com/spiceai/spiceai/pull/3692
Determine supportfilterpushdown based on Accelerator federated reader & ZeroResultsAction by @Sevenannn in https://github.com/spiceai/spiceai/pull/3694
Fix rdkafka duplicated version by @Sevenannn in https://github.com/spiceai/spiceai/pull/3707
feat: Render multiline errors better in REPL by @peasee in https://github.com/spiceai/spiceai/pull/3701
refactor: Update UnableToAttachDataConnector error message by @peasee in https://github.com/spiceai/spiceai/pull/3706
refactor: Update errors for Alpha connectors by @peasee in https://github.com/spiceai/spiceai/pull/3705
Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3704
Implement a RequestContext that automatically propagates request details to metric dimensions by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3709
Fix acceleration in append mode with refresh_sql specified by @sgrebnov in https://github.com/spiceai/spiceai/pull/3697
Bump github.com/stretchr/testify from 1.9.0 to 1.10.0 by @dependabot in https://github.com/spiceai/spiceai/pull/3655
Tokenizer for OpenAI embedding models for accurate chunking by @Jeadie in https://github.com/spiceai/spiceai/pull/3519
Update error message when dataset isn't configured with time_column in append refresh by @Sevenannn in https://github.com/spiceai/spiceai/pull/3703
Add the missing winver dependency in runtime crate by @Sevenannn in https://github.com/spiceai/spiceai/pull/3711
deps: Update table providers by @peasee in https://github.com/spiceai/spiceai/pull/3712
Add special tokens in chunk sizer by @Jeadie in https://github.com/spiceai/spiceai/pull/3713
Disable results cache for benchmark tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3715

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.20.0-beta...v1.0.0-rc.1

- Rust
Published by ewgenius over 1 year ago

https://github.com/spiceai/spiceai - v0.20.0-beta

Spice v0.20.0-beta (Nov 04, 2024)

Spice v0.20.0-beta improves federated query performance with column pruning and adds support for Metal (Apple Silicon) and CUDA (NVIDIA) accelerators. The S3, PostgreSQL, MySQL, and GitHub Data Connectors have graduated from Beta to Release Candidate. The Arrow, DuckDB, and SQLite Data Accelerators have graduated from Alpha to Beta.

Highlights in v0.20.0-beta

Data Connectors: The S3, PostgreSQL, MySQL, and GitHub Data Connectors have graduated from beta to release candidate.

Data Accelerators: The Arrow, DuckDB, and SQLite Data Accelerators have graduated from alpha to beta.

Metal and CUDA Support: Added support for Metal (Apple Silicon) and CUDA (NVIDIA) for AI/ML workloads including embeddings and local LLM inference.

For instructions on compiling a Metal or CUDA binary, see the Installation Docs.

Breaking Changes

The ODBC Data Connector now requires that ODBC drivers specified in connection strings be registered in the system ODBC driver manager.

Example invalid connection string:

```bash
DRIVER={/path/to/driver.so};SERVER=localhost;DATABASE=master
```

Example valid connection string:

```bash
DRIVER={My ODBC Driver};SERVER=localhost;DATABASE=master
```

Where My ODBC Driver is the name of an ODBC driver registered in the ODBC driver manager.
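
One way to find connection strings affected by this change before upgrading is to inspect the DRIVER value. The sketch below is a hypothetical helper, not part of Spice or of any ODBC library. It assumes that a DRIVER value containing a path separator or ending in a shared-library extension is a file path rather than a registered driver name; that heuristic is an assumption and may need adjusting for a given environment.

```python
import re

# Hypothetical pre-upgrade check; not part of Spice or any ODBC library.
_DRIVER_RE = re.compile(r"(?:^|;)\s*DRIVER\s*=\s*\{?([^;}]*)\}?", re.IGNORECASE)
_LIBRARY_SUFFIXES = (".so", ".dylib", ".dll")


def driver_value(connection_string):
    """Return the DRIVER value of an ODBC connection string, or None if absent."""
    match = _DRIVER_RE.search(connection_string)
    return match.group(1).strip() if match else None


def looks_like_driver_path(connection_string):
    """Heuristic: True when DRIVER appears to name a file instead of a registered driver."""
    value = driver_value(connection_string)
    if value is None:
        return False
    has_separator = "/" in value or "\\" in value
    return has_separator or value.lower().endswith(_LIBRARY_SUFFIXES)
```

Running `looks_like_driver_path` over the two example strings above flags the first and accepts the second. A string that passes this check can still fail at runtime if the named driver is not actually registered with the driver manager.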

Contributors

@ewgenius
@peasee
@phillipleblanc
@sgrebnov
@Jeadie
@barracudarin
@Sevenannn

What's Changed

Update Helm for v0.19.4-beta and add release notes by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3310
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/3311
metal & cuda flags for spice by @Jeadie in https://github.com/spiceai/spiceai/pull/3212
Promote postgres connector to RC quality by @Sevenannn in https://github.com/spiceai/spiceai/pull/3305
docs: Update ROADMAP.md by @peasee in https://github.com/spiceai/spiceai/pull/3322
feat: Enable federation for in-memory accelerators by @peasee in https://github.com/spiceai/spiceai/pull/3325
fix: Only allow env files from the current dir by @peasee in https://github.com/spiceai/spiceai/pull/3327
Always read TimezoneTZ from PostgreSQL as UTC by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3330
For multi-sink acceleration refreshes, ensure parent table completes before the children. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3329
Update TPC-DS Q49 (Decimal to Float) to match SQLite's type system by @sgrebnov in https://github.com/spiceai/spiceai/pull/3323
Enable parquet pushdown in Spice by @Sevenannn in https://github.com/spiceai/spiceai/pull/3245
Use spice object_store fork to fix S3 ambiguous error by @Sevenannn in https://github.com/spiceai/spiceai/pull/3304
Don't mix commented out queries for s3 connectors and accelerators by @Sevenannn in https://github.com/spiceai/spiceai/pull/3331
Allow only valid WHERE conditions in vector searches by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3335
fix: Allow only ODBC profiles by @peasee in https://github.com/spiceai/spiceai/pull/3324
Track how many times an acceleration falls back during initialization by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3339
Anthropic model regex and fix tool parsing aggregation bug by @Jeadie in https://github.com/spiceai/spiceai/pull/3334
Upgrade runtime along with CLI on spice upgrade by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3341
Update upcoming Roadmap by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3343
fix: Prevent acceleration files outside of working directory by @peasee in https://github.com/spiceai/spiceai/pull/3340
Document S3 connector limitations by @Sevenannn in https://github.com/spiceai/spiceai/pull/3333
Update Object Store Patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/3361
Promote SQLite Data Accelerator to Beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/3365
Promote S3 connector to RC quality by @Sevenannn in https://github.com/spiceai/spiceai/pull/3362
Revert "fix: Only allow env files from the current dir" by @peasee in https://github.com/spiceai/spiceai/pull/3368
docs: Fix typo for S3 release status in README.md by @peasee in https://github.com/spiceai/spiceai/pull/3370
Include unnecessary columns pruning step during federated plan creation by @sgrebnov in https://github.com/spiceai/spiceai/pull/3363

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.19.4-beta...v0.20.0-beta

- Rust
Published by phillipleblanc over 1 year ago

https://github.com/spiceai/spiceai - v0.19.4-beta

Spice v0.19.4 (Oct 30, 2024)

Spice v0.19.4-beta introduces a new localpod Data Connector, improvements to accelerator resiliency and control, and a new configuration to control when accelerated datasets are considered ready.

Highlights in v0.19.4

localpod Connector: Implement a "tiered" acceleration strategy with a new localpod Data Connector that can be used to accelerate datasets from other datasets registered in Spice.

```yaml
datasets:
  - from: s3://my_bucket/my_dataset
    name: my_dataset
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      refresh_check_interval: 60s
  - from: localpod:my_dataset
    name: my_localpod_dataset
    acceleration:
      enabled: true
```

Refreshes on the localpod's parent dataset will automatically be synchronized with the localpod dataset.

Improved Accelerator Resiliency: When Spice is restarted, if the federated source for a dataset configured with a file-based accelerator is not available, the dataset will still load from the existing file data and will attempt to connect to the federated source in the background for future refreshes.

Accelerator Ready State: Control when an accelerated dataset is considered "ready" by the runtime with the new ready_state parameter.

```yaml
datasets:
  - from: s3://my_bucket/my_dataset
    name: my_dataset
    acceleration:
      enabled: true
      ready_state: on_load # or on_registration
```

ready_state: on_load: Default. The dataset is considered ready after the initial load of the accelerated data. For file-based accelerated datasets that have existing data, this means the dataset is ready immediately.
ready_state: on_registration: The dataset is considered ready when the dataset is registered in Spice. Queries against this dataset before the data is loaded will fall back to the federated source.

Breaking changes

Accelerated datasets configured with ready_state: on_load (the default behavior) that are not ready will return an error instead of returning zero results.
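
The ready_state rules and the breaking change above can be summarized as a small decision function. This is a toy model of the behavior described in these notes, not the Spice runtime's implementation; the names `ReadyState`, `DatasetNotReadyError`, and `route_query` are hypothetical.

```python
from enum import Enum


class ReadyState(Enum):
    ON_LOAD = "on_load"                  # default
    ON_REGISTRATION = "on_registration"


class DatasetNotReadyError(Exception):
    """Raised in this model when a query hits a dataset that is not ready."""


def route_query(ready_state, registered, initial_load_done):
    """Return where a query is served from: 'accelerator' or 'federated source'."""
    if not registered:
        raise DatasetNotReadyError("dataset is not registered")
    if initial_load_done:
        # Accelerated data is available, regardless of ready_state.
        return "accelerator"
    if ready_state is ReadyState.ON_REGISTRATION:
        # Ready at registration: fall back to the federated source until data loads.
        return "federated source"
    # on_load: not ready until the initial load completes, so the query
    # errors instead of returning zero results.
    raise DatasetNotReadyError("initial load has not completed")
```

For a file-based accelerator with existing data, `initial_load_done` would be true immediately after startup in this model, which matches the note that such datasets are ready right away.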

Contributors

@Sevenannn
@peasee
@phillipleblanc
@sgrebnov
@barracudarin
@Jeadie
@ewgenius

What's Changed

Update helm for v0.19.3-beta by @ewgenius in https://github.com/spiceai/spiceai/pull/3274
docs: Mark GitHub as Beta in README.md by @peasee in https://github.com/spiceai/spiceai/pull/3272
Fix docker publish by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3273
Add SQLite TPC-DS Limitations: ROLLUP and GROUPING by @sgrebnov in https://github.com/spiceai/spiceai/pull/3277
Update version to 1.0.0-rc.1 by @sgrebnov in https://github.com/spiceai/spiceai/pull/3276
Synchronize localpod acceleration with parent acceleration refreshes by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3264
feat: Update Datafusion, promote DuckDB and MySQL by @peasee in https://github.com/spiceai/spiceai/pull/3278
Add SQLite TPC-DS Limitations: stddev by @sgrebnov in https://github.com/spiceai/spiceai/pull/3279
fix indentation issue with service annotations by @barracudarin in https://github.com/spiceai/spiceai/pull/3281
fix: Expose GitHub ratelimit errors by @peasee in https://github.com/spiceai/spiceai/pull/3258
Revert Datafusion parquet changes by @Sevenannn in https://github.com/spiceai/spiceai/pull/3286
Promote arrow accelerator to beta by @Sevenannn in https://github.com/spiceai/spiceai/pull/3287
Add SQLite TPC-DS Limitations: casting to DECIMAL by @sgrebnov in https://github.com/spiceai/spiceai/pull/3282
Accelerated datasets can fallback to federated source while loading by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3280
Enable overlap_size correctly by @Jeadie in https://github.com/spiceai/spiceai/pull/3229
Avoid duplicated filter conditions in rewritten SQL by @Sevenannn in https://github.com/spiceai/spiceai/pull/3284
Fix SQLite records conversion with NULL in first row by @sgrebnov in https://github.com/spiceai/spiceai/pull/3295
fix: Update datafusion by @peasee in https://github.com/spiceai/spiceai/pull/3297
Display shorter name for benchmark workflow matrix by @Sevenannn in https://github.com/spiceai/spiceai/pull/3299
Update spice_sys_dataset_checkpoint to store federated table schema by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3303
Update postgres connector/accelerator snapshot by @Sevenannn in https://github.com/spiceai/spiceai/pull/3298
Accelerated tables with existing file data can load without a connection to the federated source by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3306
Ensure synchronized tables complete their insertion at the same time by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3307

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.19.3-beta...v0.19.4-beta

- Rust
Published by phillipleblanc over 1 year ago

https://github.com/spiceai/spiceai - v0.19.3-beta

Spice v0.19.3 (Oct 28, 2024)

Spice v0.19.3-beta improves the performance and stability of data connectors and accelerators, including faster queries across multiple federated sources by optimizing how filters are applied. Anthropic has also been added as an LLM provider.

Highlights in v0.19.3

DataFusion Fixes: Resolved bugs in DataFusion and DataFusion Table Providers, expanding TPC-DS coverage and correctness.

GitHub Data Connector Beta Milestone: The GitHub Data Connector has graduated to Beta after extensive testing and improvements to stability and performance.

Anthropic Models Provider: Anthropic has been added as an LLM provider, including support for streaming.

Example spicepod.yml:

```yaml
models:
  - from: anthropic:claude-3-5-sonnet-20240620
    name: claude_3_5_sonnet
    params:
      anthropic_api_key: ${ secrets:SPICE_ANTHROPIC_API_KEY }
```

Breaking changes

None.

Contributors

@Jeadie
@Sevenannn
@phillipleblanc
@peasee
@sgrebnov
@nlamirault
@barracudarin
@lukekim
@slyons

New Contributors

@nlamirault made their first contribution in https://github.com/spiceai/spiceai/pull/3207
@barracudarin made their first contribution in https://github.com/spiceai/spiceai/pull/3228

What's Changed

Make Anthropic OpenAI compatible. by @Jeadie in https://github.com/spiceai/spiceai/pull/3087
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/3200
Bump version to 1.0.0-rc.1 by @Sevenannn in https://github.com/spiceai/spiceai/pull/3202
Fix clickhouse schema inference for non-default database by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3201
Update endgame template by @Sevenannn in https://github.com/spiceai/spiceai/pull/3198
Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3197
fix: dataset refresh defaults properties to None by @peasee in https://github.com/spiceai/spiceai/pull/3205
Upgrade OTEL to v0.26 and make seconds based metrics reported precisely by @sgrebnov in https://github.com/spiceai/spiceai/pull/3203
use text_embedding_inference::Infer for more complete embedding solution by @Jeadie in https://github.com/spiceai/spiceai/pull/3199
Add S3 parquet file - arrow accelerator e2e test by @Sevenannn in https://github.com/spiceai/spiceai/pull/3154
feat: Add script to setup clickbench on mysql by @peasee in https://github.com/spiceai/spiceai/pull/3176
Update helm chart version to v0.19.2 by @Sevenannn in https://github.com/spiceai/spiceai/pull/3210
Add sample dataset option in v1/nsql. by @Jeadie in https://github.com/spiceai/spiceai/pull/3105
Split spiced_docker build across architectures by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3206
feat(helm): do not install demo dataset by default by @nlamirault in https://github.com/spiceai/spiceai/pull/3207
Split integration test across build/run steps by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3215
feat(helm): Refactoring Kubernetes labels by @nlamirault in https://github.com/spiceai/spiceai/pull/3208
Define 'tool_recursion_limit' for LLMs, and limit internal tool calling recursion. by @Jeadie in https://github.com/spiceai/spiceai/pull/3214
Improve filters pushdown for federated queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/3183
Implement native schema inference for PostgreSQL by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3209
docs: Update release criteria by @peasee in https://github.com/spiceai/spiceai/pull/3219
Run SQLite acceleration TPC-DS tests using smaller scale by @sgrebnov in https://github.com/spiceai/spiceai/pull/3227
bind the serviceAccount if a name is given or if we're creating one by @barracudarin in https://github.com/spiceai/spiceai/pull/3228
Only emit channel send error log when its not a closed channel error by @Jeadie in https://github.com/spiceai/spiceai/pull/3230
Enable Parquet Exec filter pushdown in Spice by @Sevenannn in https://github.com/spiceai/spiceai/pull/3216
Add snapshots for SQLite TPC-DS benchmark (file mode) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3234
docs: Add SDK release checks to endgame by @peasee in https://github.com/spiceai/spiceai/pull/3256
Implement localpod Data Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3249
Revert "Enable Parquet Exec filter pushdown in Spice (#3216)" by @Sevenannn in https://github.com/spiceai/spiceai/pull/3244
refactor: Use existing action for detecting changes by @peasee in https://github.com/spiceai/spiceai/pull/3255
feat: Add GitHub integration test by @peasee in https://github.com/spiceai/spiceai/pull/3226
Add get_readiness tool to retrieve status of all registered components by @lukekim in https://github.com/spiceai/spiceai/pull/3035
Improve CLI error output when REPL can't connect to the Flight endpoint by @slyons in https://github.com/spiceai/spiceai/pull/3188
Fixing FTP link in Endgame by @slyons in https://github.com/spiceai/spiceai/pull/3267
Update version to 0.19.3-beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/3269
add service type and annotation customizations in https://github.com/spiceai/spiceai/pull/3268

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.19.2-beta...v0.19.3-beta

- Rust
Published by sgrebnov over 1 year ago

https://github.com/spiceai/spiceai - v0.19.2-beta

Spice v0.19.2 (Oct 21, 2024)

Spice v0.19.2-beta continues to improve performance and stability of data connectors and data accelerators, further expands TPC-DS coverage, and includes several bug fixes.

Highlights in v0.19.2

DataFusion Fixes: Resolved bugs in DataFusion and DataFusion Table Providers, improving TPC-DS query support and correctness.

TPC-DS Snapshots: Extended support for TPC-DS benchmarks with added snapshot tests for validating query plans and result accuracy.

PostgreSQL Accelerator Beta: The PostgreSQL Data Accelerator has been promoted to Beta quality.

Breaking changes

The hive_infer_partitions parameter has been renamed to hive_partitioning_enabled. It now defaults to false and must be explicitly enabled.
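
A minimal sketch of carrying an existing setting over to the new parameter name, assuming the dataset params have already been loaded into a plain dictionary (for example, from spicepod.yml). The function name `migrate_hive_params` is hypothetical and not part of Spice.

```python
OLD_KEY = "hive_infer_partitions"
NEW_KEY = "hive_partitioning_enabled"


def migrate_hive_params(params):
    """Return a copy of dataset params that uses the renamed hive partitioning key."""
    migrated = dict(params)
    if OLD_KEY in migrated:
        old_value = migrated.pop(OLD_KEY)
        # Keep an explicit new-style setting if one is already present.
        migrated.setdefault(NEW_KEY, old_value)
    return migrated
```

The helper only renames a key that is already present. Because the new parameter defaults to false, datasets that rely on Hive-style partition columns and never set the old key need hive_partitioning_enabled set to true by hand.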

Contributors

@ewgenius
@sgrebnov
@slyons
@Jeadie
@Sevenannn
@phillipleblanc
@dependabot
@peasee

Dependencies

DataFusion Table Providers: Upgraded to rev 2bcf481b4abe9d0bd6bb2479ce49020df66ff97f.
duckdb-rs: Upgraded from 1.0.0 to 1.1.1.

What's Changed

Update Helm chart for v0.19.1-beta by @ewgenius in https://github.com/spiceai/spiceai/pull/3106
Add more TPC-DS snapshots for Postgres acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/3107
Bumping version to 1.0.0-rc.1 by @slyons in https://github.com/spiceai/spiceai/pull/3109
New table sampling methods: sample_distinct_columns, random_sample, top_n_sample by @Jeadie in https://github.com/spiceai/spiceai/pull/3108
Add TPCDS snapshot tests for file-based and in-mem duckdb by @Sevenannn in https://github.com/spiceai/spiceai/pull/3115
Add Postgres acceleration E2E test for MySQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/3110
Update datafusion logical plan to avoid wrong group_by columns in aggregation by @Sevenannn in https://github.com/spiceai/spiceai/pull/3111
Warn if user tries to embed column that does not exist by @Jeadie in https://github.com/spiceai/spiceai/pull/3120
Changes for Rust version upgrade by @Sevenannn in https://github.com/spiceai/spiceai/pull/3134
Add unnest support for federated plans by @sgrebnov in https://github.com/spiceai/spiceai/pull/3133
Don't .clone() unnecessarily by @Jeadie in https://github.com/spiceai/spiceai/pull/3128
Fix Flight get_schema to construct logical plan and return that schema. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3131
Bump clap from 4.5.19 to 4.5.20 by @dependabot in https://github.com/spiceai/spiceai/pull/3099
Add GitHub Workflow to build spice-postgres-tpcds-bench image by @sgrebnov in https://github.com/spiceai/spiceai/pull/3140
test: Add basic MySQL integration test by @peasee in https://github.com/spiceai/spiceai/pull/3143
Bump datafusion-federation and datafusion-table-providers crates by @sgrebnov in https://github.com/spiceai/spiceai/pull/3148
docs: Add MySQL limitation for division by zero by @peasee in https://github.com/spiceai/spiceai/pull/3144
fix: Dataset refresh by @peasee in https://github.com/spiceai/spiceai/pull/3147
Update arrow, duckdb, postgres accelerator tpcds snapshots by @Sevenannn in https://github.com/spiceai/spiceai/pull/3145
Add TPC-DS benchmarks for Postgres data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/3149
Update E2E test ci to include tests for accelerating Postgres into accelerators by @Sevenannn in https://github.com/spiceai/spiceai/pull/3137
Add TPCDS Benchmark test and snapshots for S3 by @Sevenannn in https://github.com/spiceai/spiceai/pull/3152
[cli] Include 200 in acceptable response codes for doRuntimeApiRequest by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3157
Use -build.{GIT_SHA} for unreleased versions by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3159
Upgrade to Rust 1.82 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3158
Disable hive_infer_partitions by default by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3160
Upgrade to DuckDB 1.1.1 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3161
feat: Add MySQL TPCDS results snapshots and exclude workarounds by @peasee in https://github.com/spiceai/spiceai/pull/3165
Fix task_history output for sql, add output to table_schema & list_datasets tool by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3166
feat: Add ClickBench queries as separate files by @peasee in https://github.com/spiceai/spiceai/pull/3169
Calculate embeddings in a separate blocking thread by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3170
docs: Update ROADMAP.md and release criterias by @peasee in https://github.com/spiceai/spiceai/pull/3124
Handle OpenTelemetry errors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3173
Update version to 0.19.2-beta by @Sevenannn in https://github.com/spiceai/spiceai/pull/3182

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.19.1-beta...v0.19.2-beta

- Rust
Published by Sevenannn over 1 year ago

https://github.com/spiceai/spiceai - v0.19.1-beta

Spice v0.19.1 (Oct 14, 2024)

Spice v0.19.1 brings further performance and stability improvements to data connectors, including improved query push-down for file-based connectors (s3, abfs, file, ftp, sftp) that use Hive-style partitioning.

Highlights in v0.19.1

TPC-H and TPC-DS Coverage: Expanded coverage for TPC-H and TPC-DS benchmarking suites across accelerators and connectors.

GitHub Connector Array Filter: The GitHub connector now supports filter push down for the array_contains function in SQL queries using search query mode.

NSQL CLI Command: A new spice nsql CLI command has been added to easily query datasets with natural language from the command line.

Breaking changes

None

Contributors

@peasee
@Sevenannn
@sgrebnov
@karifabri
@phillipleblanc
@lukekim
@Jeadie
@slyons

Dependencies

DataFusion Table Providers: Upgraded to rev f22b96601891856e02a73d482cca4f6100137df8.

What's Changed

release: Update helm chart for v0.19.0-beta by @peasee in https://github.com/spiceai/spiceai/pull/3024
Set fail-fast = true for benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2997
release: Update next version and ROADMAP by @peasee in https://github.com/spiceai/spiceai/pull/3033
Verify TPCH benchmark query results for Spark connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2993
feat: Add x-spice-user-agent header to Spice REPL by @peasee in https://github.com/spiceai/spiceai/pull/2979
Update to object store file formats documentation link by @karifabri in https://github.com/spiceai/spiceai/pull/3036
Use teraswitch-runners for Linux x64 workflows + builds by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3042
feat: Support array contains in GitHub pushdown by @peasee in https://github.com/spiceai/spiceai/pull/2983
Bump text-splitter from 0.16.1 to 0.17.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2987
Revert integration tests back to hosted runner by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3046
Tune Github runner resources to allow in memory TPCDS benchmark to run by @Sevenannn in https://github.com/spiceai/spiceai/pull/3025
fix: add winver by @peasee in https://github.com/spiceai/spiceai/pull/3054
refactor: Use is modifier for checking GitHub state filter by @peasee in https://github.com/spiceai/spiceai/pull/3056
Enable merge_group checks for PR workflows by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3058
Fix issues with merge group by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3059
Validate in-memory arrow accelertion TPCDS result correctness by @Sevenannn in https://github.com/spiceai/spiceai/pull/3044
Fix rev parsing for PR checks by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3060
Use 'Accept' header for /v1/sql/ and /v1/nsql by @Jeadie in https://github.com/spiceai/spiceai/pull/3032
Verify Postgres acceleration TPCDS result correctness by @Sevenannn in https://github.com/spiceai/spiceai/pull/3043
Add NSQL CLI REPL command by @lukekim in https://github.com/spiceai/spiceai/pull/2856
Preserve query results order and add TPCH benchmark results verification for duckdb:file mode by @sgrebnov in https://github.com/spiceai/spiceai/pull/3034
Refactor benchmark to include MySQL tpcds bench, tweaks to makefile target for generating mysql tpcds data by @Sevenannn in https://github.com/spiceai/spiceai/pull/2967
Support runtime parameter for sql_query_keep_partition_by_columns & enable by default by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3065
Document TPC-DS limitations: EXCEPT, INTERSECT, duplicate names by @sgrebnov in https://github.com/spiceai/spiceai/pull/3069
Adding ABFS benchmark by @slyons in https://github.com/spiceai/spiceai/pull/3062
Add support for GitHub app installation auth for GitHub connector by @ewgenius in https://github.com/spiceai/spiceai/pull/3063
docs: Document stack overflow workaround, add helper script by @peasee in https://github.com/spiceai/spiceai/pull/3070
Tune MySQL TPCDS image to allow for successful benchmark test run by @Sevenannn in https://github.com/spiceai/spiceai/pull/3067
Automatically infer partitions for hive-style partitioned files for object store based connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3073
Support hf_token from params/secrets by @Jeadie in https://github.com/spiceai/spiceai/pull/3071
Inherit embedding columns from source, when available. by @Jeadie in https://github.com/spiceai/spiceai/pull/3045
Validate identifiers for component names by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3079
docs: Add workaround for TPC-DS Q97 in MySQL by @peasee in https://github.com/spiceai/spiceai/pull/3080
Document TPC-DS Postgres column alias in a CASE statement limitation by @sgrebnov in https://github.com/spiceai/spiceai/pull/3083
Update plan snapshots for TPC-H bench queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/3088
Update Datafusion crate to include recent unparsing fixes by @sgrebnov in https://github.com/spiceai/spiceai/pull/3089
Sample SQL table data tool and API by @Jeadie in https://github.com/spiceai/spiceai/pull/3081
chore: Update datafusion-table-providers by @peasee in https://github.com/spiceai/spiceai/pull/3090
Add hive_infer_partitions to remaining object store connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3086
deps: Update datafusion-table-providers by @peasee in https://github.com/spiceai/spiceai/pull/3093
For local embedding models, return usage input tokens. by @Jeadie in https://github.com/spiceai/spiceai/pull/3095
Update end_game.md with Accelerator/Connector criteria check by @slyons in https://github.com/spiceai/spiceai/pull/3092
Update TPC-DS Q90 by @sgrebnov in https://github.com/spiceai/spiceai/pull/3094
docs: Add RC connector criteria by @peasee in https://github.com/spiceai/spiceai/pull/3026
Update version to 0.19.1-beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/3101

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.19.0-beta...v0.19.1-beta

- Rust
Published by slyons over 1 year ago

https://github.com/spiceai/spiceai - v0.19.0-beta

Spice v0.19.0-beta (Oct 7, 2024)

Spice v0.19.0-beta brings performance improvements for accelerators and expanded TPC-DS coverage. A new Azure Blob Storage data connector has also been added.

Highlights in v0.19.0-beta

Improved TPC-DS Coverage: Enhanced support for TPC-DS derived queries.

CLI SQL REPL: The CLI SQL REPL (spice sql) now supports multi-line editing and tab indentation. Note that a terminating semicolon (;) is now required for each executed SQL block.
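
To illustrate why multi-line editing needs an explicit terminator, the sketch below groups input lines into SQL blocks that end at a trailing semicolon. It is not the Spice CLI's implementation; `split_sql_blocks` is a hypothetical name, and the sketch deliberately ignores semicolons inside string literals and comments.

```python
def split_sql_blocks(lines):
    """Group input lines into SQL blocks terminated by a trailing ';'.

    Returns (blocks, pending), where pending holds lines that are still
    waiting for a terminating semicolon.
    """
    blocks = []
    pending = []
    for line in lines:
        pending.append(line)
        if line.rstrip().endswith(";"):
            # The terminator closes the block; everything buffered so far runs together.
            blocks.append("\n".join(pending))
            pending = []
    return blocks, pending
```

With the input lines `SELECT 1`, `FROM t;`, `SELECT 2`, the first two lines form one executable block and the third stays pending until a semicolon is entered.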

Azure Storage Data Connector: A new Azure Blob Storage data connector (abfs://) has been added, enabling federated SQL queries on files stored in Azure Blob-compatible endpoints, including Azure BlobFS (abfss://) and Azure Data Lake (adl://). Supported file formats can be specified using the file_format parameter.

Example spicepod.yml:

```yaml
datasets:
  - from: abfs://foocontainer/taxi_sample.csv
    name: azure_test
    params:
      azure_account: spiceadls
      azure_access_key: abc123==
      file_format: csv
```

For a full list of supported file formats, see the Object Store File Formats documentation.

For more details, see the Azure Blob Storage Data Connector documentation.

Breaking Changes

Spice.ai Data Connector: The key for the Spice.ai Cloud Platform Data Connector has changed from spiceai to spice.ai. To upgrade, change uses of from: spiceai: to from: spice.ai:.
GitHub Data Connector: Pull Requests column login has been renamed to author.
CLI SQL REPL: A terminating semi-colon ';' is now required for each executed SQL block.
Spicepod Hot-Reload: When running spiced directly, hot-reload of spicepod.yml configuration is now disabled. Run with spice run to use hot-reload.
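
For the Spice.ai Data Connector key change, a text rewrite over spicepod.yml is usually enough. The sketch below is a hypothetical helper, not a Spice tool. It assumes unquoted `from:` values written one per line, and the dataset path used in the usage note is made up for illustration.

```python
import re

# Matches a `from:` entry (optionally a list item) whose value starts with the old key.
_OLD_PREFIX = re.compile(r"^(\s*-?\s*from:\s*)spiceai:", re.MULTILINE)


def rename_spiceai_connector(spicepod_text):
    """Rewrite `from: spiceai:` entries to `from: spice.ai:` and leave other lines untouched."""
    return _OLD_PREFIX.sub(r"\1spice.ai:", spicepod_text)
```

For example, a line such as `  - from: spiceai:path.to.my_dataset` becomes `  - from: spice.ai:path.to.my_dataset`, while a `name: spiceai` line is left as-is. Quoted values are not handled and would need a manual edit.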

Contributors

@sgrebnov
@Jeadie
@Sevenannn
@peasee
@ewgenius
@slyons
@phillipleblanc
@lukekim

Dependencies

DataFusion Table Providers: Upgraded to rev 826814ab149aad8ee668454c83a0650fb8b18d60.

What's Changed

Bump tonic from 0.12.2 to 0.12.3 by @dependabot in https://github.com/spiceai/spiceai/pull/2880
Verify benchmark query results using snapshot testing (s3 connector) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2902
Fix paths-ignore: by @Jeadie in https://github.com/spiceai/spiceai/pull/2906
Rename spiceai data connector to spice.ai by @sgrebnov in https://github.com/spiceai/spiceai/pull/2899
Update ROADMAP.md by @Jeadie in https://github.com/spiceai/spiceai/pull/2907
Helm update for helm for 0.18.3-beta by @Jeadie in https://github.com/spiceai/spiceai/pull/2910
Add tpcds queries by @Sevenannn in https://github.com/spiceai/spiceai/pull/2918
Fix paths-ignore for docs. by @Jeadie in https://github.com/spiceai/spiceai/pull/2911
feat: Support LIKE expressions in GitHub filter pushdown by @peasee in https://github.com/spiceai/spiceai/pull/2903
feat: Support date comparison pushdown in GitHub connector by @peasee in https://github.com/spiceai/spiceai/pull/2904
Improve aggregation and union queries unparsing by @sgrebnov in https://github.com/spiceai/spiceai/pull/2925
Initialize file based accelerators on dataset reload by @Sevenannn in https://github.com/spiceai/spiceai/pull/2923
Update spiceai/spiceai for next release by @Jeadie in https://github.com/spiceai/spiceai/pull/2928
Verify TPC-H benchmark query results for arrow acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/2927
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2912
Use structured output for NSQL by @Jeadie in https://github.com/spiceai/spiceai/pull/2922
Update TPC-DS queries to use supported date addition format by @sgrebnov in https://github.com/spiceai/spiceai/pull/2930
Add busy_timeout accelerator param for Sqlite by @Sevenannn in https://github.com/spiceai/spiceai/pull/2855
Use Cosine Similarity in vector search by @Jeadie in https://github.com/spiceai/spiceai/pull/2932
Add support for passing x-spiceai-app-id metadata in spiceai data connector by @ewgenius in https://github.com/spiceai/spiceai/pull/2934
docs: update beta accelerator criteria by @peasee in https://github.com/spiceai/spiceai/pull/2905
Azure Connector implementation by @slyons in https://github.com/spiceai/spiceai/pull/2926
Local embedding model from relative paths by @Jeadie in https://github.com/spiceai/spiceai/pull/2908
Add Markdown aware chunker when params.file_format: md. by @Jeadie in https://github.com/spiceai/spiceai/pull/2943
'spice version' without structured logging by @Jeadie in https://github.com/spiceai/spiceai/pull/2944
Bump tempfile from 3.12.0 to 3.13.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2878
feat: GraphQL commit query parameters by @peasee in https://github.com/spiceai/spiceai/pull/2945
Update OpenAI client and use new request fields by @Jeadie in https://github.com/spiceai/spiceai/pull/2951
refactor: Rename GitHub pulls login to author by @peasee in https://github.com/spiceai/spiceai/pull/2954
Run tpcds benchmarks for accelerators by @Sevenannn in https://github.com/spiceai/spiceai/pull/2853
Add spiced arg --pods-watcher-enabled. Watcher disabled by default for spiced. by @ewgenius in https://github.com/spiceai/spiceai/pull/2953
Add error message when spicepod has embeddings or models without '--features models' by @Jeadie in https://github.com/spiceai/spiceai/pull/2952
Adding multi-line editing and tab indentation to sql REPL by @slyons in https://github.com/spiceai/spiceai/pull/2949
Update MySQL ghcr image to include tpcds data by @Sevenannn in https://github.com/spiceai/spiceai/pull/2941
Document DataFusion limitation: The context only support single SQL Statement, Date Arithmetic like date + 3 not supported by @Sevenannn in https://github.com/spiceai/spiceai/pull/2970
Bump snafu from 0.8.4 to 0.8.5 by @dependabot in https://github.com/spiceai/spiceai/pull/2876
Bump async-trait from 0.1.82 to 0.1.83 by @dependabot in https://github.com/spiceai/spiceai/pull/2879
Bump async-graphql from 7.0.9 to 7.0.11 in the cargo group by @dependabot in https://github.com/spiceai/spiceai/pull/2950
Verify TPC-H benchmark query results for MySQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2972
Verify TPCH benchmark query results for Postgres by @sgrebnov in https://github.com/spiceai/spiceai/pull/2973
Verify TPCH benchmark query results for sqlite acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/2974
Verify TPCH benchmark query results for duckdb (in-memory) acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/2975
Support for mdx file extensions to apply a markdown splitter by @ewgenius in https://github.com/spiceai/spiceai/pull/2977
Don't assume first vector or content will be non-null/zero by @Jeadie in https://github.com/spiceai/spiceai/pull/2940
use custom chunk sizers for HF, local and OpenAI models by @Jeadie in https://github.com/spiceai/spiceai/pull/2971
Ensure we return N unique documents, not N unique chunks by @Jeadie in https://github.com/spiceai/spiceai/pull/2976
Fix issues parsing messages[*].tool_calls for local models by @Jeadie in https://github.com/spiceai/spiceai/pull/2957
text -> SQL trait to customise per model. by @Jeadie in https://github.com/spiceai/spiceai/pull/2942
Remove system message from ToolUsingChat. by @Jeadie in https://github.com/spiceai/spiceai/pull/2978
Make logical plan to sql more robust (improve ORDER BY; support round for Postgres) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2984
Add connection_pool_size parameter for Postgres accelerator by @Sevenannn in https://github.com/spiceai/spiceai/pull/2969
Fix dataset configure prompt by @sgrebnov in https://github.com/spiceai/spiceai/pull/2991
Verify TPCH benchmark query results for Databricks(odbc) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2989
Verify TPCH benchmark query results for Databricks (delta_lake) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2982
Set log level for anonymous telemetry traces to trace by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2995
Improvements to issue templates by @lukekim in https://github.com/spiceai/spiceai/pull/2992
spice login writes to .env.local if present by @slyons in https://github.com/spiceai/spiceai/pull/2996

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.18.3-beta...v0.19.0-beta

- Rust
Published by peasee over 1 year ago

https://github.com/spiceai/spiceai - v0.18.3-beta

Spice v0.18.3-beta (Sep 30, 2024)

The Spice v0.18.3-beta release includes several quality-of-life improvements including verbosity flags for spiced and the Spice CLI, vector search over larger documents with support for chunking dataset embeddings, and multiple performance enhancements. Additionally, the release includes several bug fixes, dependency updates, and optimizations, including updated table providers and significantly improved GitHub data connector performance for issues and pull requests.

Highlights in v0.18.3-beta

GitHub Query Mode: A new github_query_mode: search parameter has been added to the GitHub Data Connector. It uses the GitHub Search API to make querying issues and pull requests faster and more efficient when filters are applied.

Example spicepod.yml:

```yaml
- from: github:github.com/spiceai/spiceai/issues/trunk
  name: spiceai.issues
  params:
    github_query_mode: search # Use GitHub Search API
    github_token: ${secrets:GITHUB_TOKEN}
```
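To illustrate why search mode helps with filtered queries, here is a minimal sketch of how filters could be folded into a single GitHub search query string. The function name `build_search_query` and the mapping from filters to qualifiers are assumptions made for illustration; they are not taken from the Spice connector. Only the qualifiers themselves (`repo:`, `is:issue`, `is:pr`, `author:`, `state:`) are standard GitHub search syntax.

```python
def build_search_query(owner, repo, kind, filters):
    """Build a GitHub search query string from simple equality filters.

    Hypothetical helper: the real connector may translate filters differently.
    """
    kinds = {"issues": "is:issue", "pulls": "is:pr"}
    if kind not in kinds:
        raise ValueError(f"unsupported kind: {kind!r}")

    parts = [f"repo:{owner}/{repo}", kinds[kind]]
    # Sort keys so the generated query is deterministic.
    for key in sorted(filters):
        value = str(filters[key])
        # Values containing spaces must be quoted in GitHub search syntax.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{key}:{value}")
    return " ".join(parts)


query = build_search_query(
    "spiceai", "spiceai", "issues", {"state": "open", "author": "peasee"}
)
print(query)
```

With the filters pushed into the query, the server returns only matching rows instead of the client paging through every issue and filtering locally.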

Output Verbosity: Higher verbosity output levels can be specified through flags for both spiced and the Spice CLI.

Example command line:

```shell
spice -v
spice --very-verbose

spiced -vv
spiced --verbose
```

Embedding Chunking: Chunking can be enabled and configured to preprocess input data before generating dataset embeddings. This improves the relevance and precision for larger pieces of content.

Example spicepod.yml:

```yaml
- name: support_tickets
  embeddings:
    - column: conversation_history
      use: openai_embeddings
      chunking:
        enabled: true
        target_chunk_size: 128
        overlap_size: 16
        trim_whitespace: true
```

For details, see the Search Documentation.
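As a rough illustration of how target_chunk_size and overlap_size interact, the sketch below splits a token list into overlapping windows. It is not the Spice implementation: the function name `chunk_tokens` is hypothetical, and it assumes the input has already been tokenized, whereas the runtime uses model-specific chunk sizing.

```python
def chunk_tokens(tokens, target_chunk_size=128, overlap_size=16):
    """Split tokens into windows of at most target_chunk_size items.

    Each window after the first starts with the last overlap_size tokens
    of the previous window.
    """
    if target_chunk_size <= 0:
        raise ValueError("target_chunk_size must be positive")
    if not 0 <= overlap_size < target_chunk_size:
        raise ValueError("overlap_size must be in [0, target_chunk_size)")

    step = target_chunk_size - overlap_size
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + target_chunk_size])
        # Stop once the current window has reached the end of the input.
        if start + target_chunk_size >= len(tokens):
            break
        start += step
    return chunks


chunks = chunk_tokens(list(range(300)), target_chunk_size=128, overlap_size=16)
print([len(c) for c in chunks])
```

The overlap means content that straddles a chunk boundary appears in full in at least one chunk, at the cost of embedding some tokens twice.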

Dependencies

DataFusion Table Providers: Upgraded to rev b0af91992699ecbf5adf2036a07122578f06150e.

Contributors

@Sevenannn
@peasee
@Jeadie
@sgrebnov
@phillipleblanc
@ewgenius
@slyons

What's Changed

Update datafusion table provider patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2817
refactor: Set max_rows_per_batch for ODBC to 4000 by @peasee in https://github.com/spiceai/spiceai/pull/2822
Use User message for health check by @Jeadie in https://github.com/spiceai/spiceai/pull/2823
Upgrade Helm chart (Spice v0.18.2-beta) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2820
Add verbosity flags for spiced, spice: -v, -vv, --verbose, --very-verbose. by @Jeadie in https://github.com/spiceai/spiceai/pull/2831
Rename spiceai data connector to spice.ai by @sgrebnov in https://github.com/spiceai/spiceai/pull/2680
Prepare for v0.19.0-beta release (version bump) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2821
Bump clap from 4.5.17 to 4.5.18 (#2801) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2848
Enable "rc" feature for serde in spicepod crate by @ewgenius in https://github.com/spiceai/spiceai/pull/2851
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2852
chore: update table providers by @peasee in https://github.com/spiceai/spiceai/pull/2858
fix: Use GitHub search for issues in GraphQL by @peasee in https://github.com/spiceai/spiceai/pull/2845
fix: Use GitHub search for pull_requests by @peasee in https://github.com/spiceai/spiceai/pull/2847
Support chunking dataset embeddings by @Jeadie in https://github.com/spiceai/spiceai/pull/2854
refactor: Update GraphQL client to be more robust for filter push down by @peasee in https://github.com/spiceai/spiceai/pull/2864
docs: Update accelerator beta criteria by @peasee in https://github.com/spiceai/spiceai/pull/2865
Change BytesProcessedRule to be an optimizer rather than an analyzer rule by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2867
Don't run E2E or PR tests on documentation by @Jeadie in https://github.com/spiceai/spiceai/pull/2869
Verify benchmark query results using snapshot testing (spice.ai connector) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2866
feat: Add GraphQLOptimizer by @peasee in https://github.com/spiceai/spiceai/pull/2868
Update quickstarts for Endgame by @Jeadie in https://github.com/spiceai/spiceai/pull/2863
Update version to v0.18.3-beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/2882
Update DataFusion: fix coalesce, Aggregation with Window functions unparsing support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2884
Revert "Rename spiceai data connector to spice.ai" by @sgrebnov in https://github.com/spiceai/spiceai/pull/2881
Adding integration test for DuckDB read functions by @slyons in https://github.com/spiceai/spiceai/pull/2857
Show more informative mysql error message by @Sevenannn in https://github.com/spiceai/spiceai/pull/2883
Fix no process-level CryptoProvider available when using REPL and TLS by @sgrebnov in https://github.com/spiceai/spiceai/pull/2887
Change UX for chunking and enable overlap_size in chunking by @Jeadie in https://github.com/spiceai/spiceai/pull/2890
Add log/slog to spice CLI tool by @Jeadie in https://github.com/spiceai/spiceai/pull/2859
feat: Add GitHub GraphQLOptimizer by @peasee in https://github.com/spiceai/spiceai/pull/2870
Fix mysql invalid tablename error message by @Sevenannn in https://github.com/spiceai/spiceai/pull/2896
fix: Remove login column rename in pulls and update Optimizer by @peasee in https://github.com/spiceai/spiceai/pull/2897
Fix require check checking. by @Jeadie in https://github.com/spiceai/spiceai/pull/2898

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.18.2-beta...v0.18.3-beta

- Rust
Published by Jeadie over 1 year ago

https://github.com/spiceai/spiceai - v0.18.2-beta

Spice v0.18.2-beta (Sep 24, 2024)

The v0.18.2-beta release improves the reliability of the sharepoint data connector and spice search functionality.

Contributors

@Jeadie
@sgrebnov

New Contributors

None

What's Changed

Issue with sharepoint Site by @Jeadie in https://github.com/spiceai/spiceai/pull/2810
Upgrade Helm chart (Spice v0.18.1-beta) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2812
Prepare for v0.18.2-beta release by @sgrebnov in https://github.com/spiceai/spiceai/pull/2811
Fix issues with spice search by @Jeadie in https://github.com/spiceai/spiceai/pull/2814

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.18.1-beta...0.18.2

- Rust
Published by github-actions[bot] over 1 year ago

https://github.com/spiceai/spiceai - v0.18.1-beta

Spice v0.18.1-beta (Sep 23, 2024)

The v0.18.1-beta release continues to improve runtime performance and reliability. Performance for accelerated queries joining multiple datasets has been significantly improved with join push-down support. GraphQL, MySQL, and SharePoint data connectors have better reliability and error handling, and a new Microsoft SQL Server data connector has been introduced. Task History now has fine-grained configuration, including the ability to disable the feature entirely. A new spice search CLI command has been added, enabling development-time embeddings-based searches across datasets.

Highlights in v0.18.1-beta

Join push-down for accelerations: Queries to the same accelerator will now push-down joins, significantly improving acceleration performance for queries joining multiple tables.

Microsoft SQL Server Data Connector: Use from: mssql: to access and accelerate Microsoft SQL Server datasets.

Example spicepod.yml:

```yaml
datasets:
  - from: mssql:path.to.my_dataset
    name: my_dataset
    params:
      mssql_connection_string: ${secrets:mssql_connection_string}
```

See the Microsoft SQL Server Data Connector documentation.

Task History: Task History can be configured in spicepod.yml, including whether captured outputs, such as the results of a SQL query, are included or truncated.

Example spicepod.yml:

```yaml
runtime:
  task_history:
    enabled: true
    captured_output: truncated
    retention_period: 8h
    retention_check_interval: 15m
```

See the Task History Spicepod reference for more information on possible values and behaviors.
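The retention settings can be read as "every retention_check_interval, drop tasks older than retention_period". The sketch below shows that rule under stated assumptions: the helper names `parse_duration` and `expired` are hypothetical, and the parser only accepts a whole number followed by s, m, h, or d, which is narrower than what the runtime may accept.

```python
import re
from datetime import datetime, timedelta, timezone

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text):
    """Parse strings such as '8h' or '15m' into a timedelta (illustrative subset)."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def expired(task_start_times, retention_period, now):
    """Return the start times that fall outside the retention window."""
    cutoff = now - parse_duration(retention_period)
    return [t for t in task_start_times if t < cutoff]


now = datetime(2024, 9, 23, 12, 0, tzinfo=timezone.utc)
tasks = [now - timedelta(hours=9), now - timedelta(hours=1)]
print(expired(tasks, "8h", now))
```

A shorter check interval removes expired rows sooner but runs the cleanup more often.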

Search CLI Command: Use the spice search CLI command to perform embeddings-based searches across datasets configured for search. Note: Search requires the ai feature to be installed.

Refresh on File Changes: File Data Connector data refreshes can be configured to be triggered when the source file is modified through a file system watcher. Enable the watcher by adding file_watcher: enabled to the acceleration parameters.

Example spicepod.yml:

```yaml
datasets:
  - from: file://path/to/my_file.csv
    name: my_file
    acceleration:
      enabled: true
      refresh_mode: full
      params:
        file_watcher: enabled
```
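The idea of refreshing only when the source file changes can be sketched with a simple change detector. This is a swapped-in technique for illustration: the release describes a file system watcher, while the hypothetical `FileChangeDetector` class below polls the file's modification time and size instead.

```python
import os


class FileChangeDetector:
    """Reports whether a file changed since the previous check.

    Polling-based stand-in for an OS-level file system watcher.
    """

    def __init__(self, path):
        self.path = path
        self._last = self._signature()

    def _signature(self):
        # A missing file is represented as None so creation and deletion
        # are both detected as changes.
        try:
            st = os.stat(self.path)
        except FileNotFoundError:
            return None
        return (st.st_mtime_ns, st.st_size)

    def changed(self):
        current = self._signature()
        if current != self._last:
            self._last = current
            return True
        return False
```

A caller would invoke `changed()` periodically and trigger a dataset refresh whenever it returns True.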

Breaking Changes

The Query History table runtime.query_history has been deprecated and removed in favor of the Task History table runtime.task_history. The Task History table tracks tasks across all features such as SQL query, vector search, and AI completion in a unified table.

See the Task History documentation.

Dependencies

DataFusion: Upgraded from v41 to v42.
Apache Arrow: Upgraded from v52 to v53.
DuckDB: Upgraded from v1.0 to v1.1.

Contributors

@phillipleblanc
@Jeadie
@lukekim
@sgrebnov
@peasee
@Sevenannn
@ewgenius
@slyons

New Contributors

@slyons made their first contribution in https://github.com/spiceai/spiceai/pull/2724

What's Changed

Update Helm Chart for 0.18.0-beta release by @sgrebnov in https://github.com/spiceai/spiceai/pull/2711
Use a single instance for all DuckDB accelerated datasets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2669
Dependabot upgrades by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2715
Use a single instance for all SQLite accelerated datasets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2720
Prepare for v0.18.1-beta release by @sgrebnov in https://github.com/spiceai/spiceai/pull/2692
For GraphQL, remove necessity of json_pointer and improve error messaging. by @Jeadie in https://github.com/spiceai/spiceai/pull/2713
Postgres accelerator benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2652
Trace query result while running benchmark tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2684
Early check EmbeddingConnector if embedding models do not exist by @Jeadie in https://github.com/spiceai/spiceai/pull/2717
Move table creation for spice_sys_dataset_checkpoint to DatasetCheckpoint::try_new by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2732
Don't load tools immediately by @Jeadie in https://github.com/spiceai/spiceai/pull/2731
Renable accelerator federation on trunk by @Sevenannn in https://github.com/spiceai/spiceai/pull/2725
Fixing Data Connectors link in README.md by @slyons in https://github.com/spiceai/spiceai/pull/2724
Enable rehydration tests for DuckDB by @sgrebnov in https://github.com/spiceai/spiceai/pull/2729
Check pageInfo is correct at initialisation of GraphQL connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2730
Microsoft SQL Server data connector initial support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2741
Add spice search CLI command by @lukekim in https://github.com/spiceai/spiceai/pull/2739
Update threat model by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2738
Upgrade to Arrow 53, DataFusion 42 and DuckDB 1.1 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2744
Update datafusion table provider patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2747
feat: Add enabled config option for task_history by @peasee in https://github.com/spiceai/spiceai/pull/2758
Remove v0.18.0-beta from the Roadmap by @sgrebnov in https://github.com/spiceai/spiceai/pull/2748
Fix spark-connect to use native roots for TLS again by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2766
Fix benchmark test - Install default crypto provider by @Sevenannn in https://github.com/spiceai/spiceai/pull/2752
Resolve primary keys for datasets with catalog or schema by @Jeadie in https://github.com/spiceai/spiceai/pull/2749
MSSQL: include table name in schema retrieval error by @sgrebnov in https://github.com/spiceai/spiceai/pull/2746
File Format parsing for Document tables, support for docx + pdf by @Jeadie in https://github.com/spiceai/spiceai/pull/2740
Add Document parsing to Sharepoint connector. by @Jeadie in https://github.com/spiceai/spiceai/pull/2760
Execution plan with BinaryExpr predicates pushdown support for MS SQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2768
Update datafusion patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2772
Support for standalone config parameters for MS SQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2773
Utilize DataConnectorError for MySQL Data Connector Errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/2759
Add Score to search results by @lukekim in https://github.com/spiceai/spiceai/pull/2774
Don't call GetComponentStatuses when --metrics not enabled by @Jeadie in https://github.com/spiceai/spiceai/pull/2779
Implement better error handling for spicepods by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2767
Make integration tests more robust by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2782
Query results streaming support for MS SQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2781
Update benchmark snapshots by @Sevenannn in https://github.com/spiceai/spiceai/pull/2778
For Sharepoint connector, if client_secret and auth_code are both provided, default to auth_code by @Jeadie in https://github.com/spiceai/spiceai/pull/2780
Add modified pk/indexes scenario to rehydration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2743
Run benchmarks on Wed, Fri, Sat, and Sun. by @lukekim in https://github.com/spiceai/spiceai/pull/2786
Update PULL_REQUEST_TEMPLATE.md to include a section for Documentation by @slyons in https://github.com/spiceai/spiceai/pull/2785
Add E2E test for MS SQL data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2788
More types support for MS SQL data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2789
feat: Add captured_output option for task_history by @peasee in https://github.com/spiceai/spiceai/pull/2783
Add ability to refresh when file data connector detects changes by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2787
Propagate MySQL invalid table name error by @Sevenannn in https://github.com/spiceai/spiceai/pull/2776
feat: Add retention options for task_history config by @peasee in https://github.com/spiceai/spiceai/pull/2784
fix: Move task history check after query history creation by @peasee in https://github.com/spiceai/spiceai/pull/2793
MS SQL connector should ignore all unsupported types by @sgrebnov in https://github.com/spiceai/spiceai/pull/2795
Improve Sharepoint DX by @Jeadie in https://github.com/spiceai/spiceai/pull/2791
Replace query history with task history by @peasee in https://github.com/spiceai/spiceai/pull/2792
Fix datasets_health_monitor spice.runtime.task_history not found warning by @sgrebnov in https://github.com/spiceai/spiceai/pull/2805
Upgrade macOS x86_64 test runner to macOS 13.6.9 Ventura by @sgrebnov in https://github.com/spiceai/spiceai/pull/2803
Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2808
Add mssql to the list of supported data connectors by @sgrebnov in https://github.com/spiceai/spiceai/pull/28

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.18.0-beta...v0.18.1-beta

- Rust
Published by sgrebnov over 1 year ago

https://github.com/spiceai/spiceai - v0.18.0-beta

Spice v0.18.0-beta (Sep 16, 2024)

The v0.18.0-beta release adds new Sharepoint and File data connectors, introduces AWS Identity and Access Management (IAM) support for the S3 Data Connector, improves performance of the GitHub connector, and increases the overall reliability of all data accelerators. The /ready API endpoint was enhanced to report as ready only when all components, including loaded data, have successfully reported readiness.

Highlights in v0.18.0-beta

Sharepoint Data Connector: Use from: sharepoint: to access and accelerate documents stored in Microsoft 365 OneDrive for Business (Sharepoint).

Example spicepod.yml:

```yaml
datasets:
  - from: sharepoint:drive:Documents/path:/important_documents/
    name: important_documents
    params:
      sharepoint_client_id: ${secrets:SPICE_SHAREPOINT_CLIENT_ID}
      sharepoint_tenant_id: ${secrets:SPICE_SHAREPOINT_TENANT_ID}
      sharepoint_client_secret: ${secrets:SPICE_SHAREPOINT_CLIENT_SECRET}
```

See the Sharepoint Data Connector documentation.

AWS Identity and Access Management (IAM) for S3: A new s3_auth parameter for the S3 Data Connector configures the authentication method used when connecting to S3. Supported values are public, key, and iam_role. Use s3_auth: iam_role to assume the instance IAM role.

Example spicepod.yml:

```yaml
datasets:
  - from: s3://my-bucket
    name: bucket
    params:
      s3_auth: iam_role # Assume IAM role of instance
```

See the S3 Data Connector documentation.

File Data Connector: Use from: file: to query files stored on locally accessible filesystems.

Example spicepod.yml:

```yaml
datasets:
  - from: file://path/to/customer.parquet
    name: customer
    params:
      file_format: parquet
```

See the File Data Connector documentation.

Improved /ready API: The endpoint now includes the initial data load for accelerated datasets, in addition to component readiness, so that readiness is reported only when data has loaded and can be queried successfully.

Breaking Changes

GitHub Data Connector: The data type for time-related columns has changed from Utf8 to Timestamp. To upgrade, change data type references to timestamp. For example, if using time_format:, change uses of time_format: ISO8601 to time_format: timestamp.
Ready API: The /ready API reports ready only when all components have reported ready and data is fully loaded. To upgrade, evaluate uses of the Ready API (such as Kubernetes readiness probes) and consider how it might affect system behavior.
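Because readiness now depends on the initial data load, clients that previously assumed a fast ready signal may need to wait longer. The sketch below polls a readiness endpoint with a bounded number of attempts. The helper names `http_ready` and `wait_until_ready` are hypothetical, and the URL in the usage comment (host, port, and path) is an assumption to be replaced with the address of your runtime.

```python
import time
import urllib.error
import urllib.request


def http_ready(url, timeout=2.0):
    """Return True if the endpoint answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and non-2xx responses all mean "not ready".
        return False


def wait_until_ready(probe, attempts=30, delay=1.0, sleep=time.sleep):
    """Call probe() until it returns True or the attempts are exhausted."""
    for attempt in range(attempts):
        if probe():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Usage (assumed URL, adjust to your deployment):
# wait_until_ready(lambda: http_ready("http://localhost:8090/ready"))
```

For large accelerated datasets, the attempt count and delay should be sized to the expected initial load time rather than to process start-up time.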

Dependencies

No major dependency updates.

Contributors

@phillipleblanc
@Jeadie
@lukekim
@sgrebnov
@peasee
@eltociear
@Sevenannn
@ewgenius
@karifabri

New Contributors

@karifabri made their first contribution in https://github.com/spiceai/spiceai/pull/2601

What's Changed

Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2585
Set helm to v0.17.4-beta by @ewgenius in https://github.com/spiceai/spiceai/pull/2595
Bump to next v0.18.0-beta version by @ewgenius in https://github.com/spiceai/spiceai/pull/2596
Add snapshot test docs / Update beta criteria for data accelerators by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2594
Enable federation for accelerated queries (sqlite, duckdb, postgres) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2598
spelling updates on v0.17.4 release notes by @karifabri in https://github.com/spiceai/spiceai/pull/2601
Update endgame template by @ewgenius in https://github.com/spiceai/spiceai/pull/2591
fix: Re-attach DuckDB attachments on each query by @peasee in https://github.com/spiceai/spiceai/pull/2602
Speed up sqlite accelerator benchmark test with indexes by @Sevenannn in https://github.com/spiceai/spiceai/pull/2597
Fix refresh API using refresh_mode: append by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2609
Tweak /ready to only report ready when components have all reported Ready by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2600
Add s3_auth parameter to configure IAM role authentication by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2611
Bump fundu from 2.0.0 to 2.0.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2576
fix: Remove comments from SQL files by @peasee in https://github.com/spiceai/spiceai/pull/2627
Utilize runtime.status().is_ready() to check acceleration dataset readiness in benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2614
Allow for prefix to be kept in internal Parameters by @Jeadie in https://github.com/spiceai/spiceai/pull/2603
Bump itertools from 0.12.1 to 0.13.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2572
Bump golang.org/x/mod from 0.20.0 to 0.21.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2571
Add initial threat model using OWASP Threat Dragon by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2599
fix: Explicitly error for duplicate duckdb file accelerators by @peasee in https://github.com/spiceai/spiceai/pull/2628
Benchmark test binary can parse command line option by @Sevenannn in https://github.com/spiceai/spiceai/pull/2626
Snapshot tests shouldn't crash the Spice benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2613
Bump anyhow from 1.0.86 to 1.0.87 by @dependabot in https://github.com/spiceai/spiceai/pull/2573
Upgrade datafusion to improve SQLite subquery tables aliasing support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2634
Run benchmark separately using workflow by @Sevenannn in https://github.com/spiceai/spiceai/pull/2631
Sharepoint UX changes by @Jeadie in https://github.com/spiceai/spiceai/pull/2633
Improve /ready to only mark a dataset ready iff the initial refresh completed by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2630
Support relative paths for file connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2637
Fix error decoding response body GitHub file connector bug by @sgrebnov in https://github.com/spiceai/spiceai/pull/2645
GraphQL pagination and robustness. by @Jeadie in https://github.com/spiceai/spiceai/pull/2632
docs: Update bug template by @peasee in https://github.com/spiceai/spiceai/pull/2629
Define GitHub issues data connector schema upfront by @sgrebnov in https://github.com/spiceai/spiceai/pull/2646
Add support for loading from Sharepoint Group's default drive. by @Jeadie in https://github.com/spiceai/spiceai/pull/2642
Fix typo in workflow, fix the postgres connector container readiness check by @Sevenannn in https://github.com/spiceai/spiceai/pull/2654
Fix check all features by @Sevenannn in https://github.com/spiceai/spiceai/pull/2653
Enable Warn/Error traces from dependency components by @sgrebnov in https://github.com/spiceai/spiceai/pull/2655
Use lower case iso8601 for time_column by @Sevenannn in https://github.com/spiceai/spiceai/pull/2551
Add basic integration test for Spice spill-to-disk and re-hydration scenario by @sgrebnov in https://github.com/spiceai/spiceai/pull/2643
Add 'RefreshOverrides::max_jitter' to 'POST /v1/datasets/:name/acceleration/refresh' by @Jeadie in https://github.com/spiceai/spiceai/pull/2641
Bump rustls-pemfile from 1.0.4 to 2.1.3 by @dependabot in https://github.com/spiceai/spiceai/pull/2575
Update dependencies to support querying postgres enum types by @Sevenannn in https://github.com/spiceai/spiceai/pull/2657
Upgrade table-providers by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2659
Improve spill_to_disk_and_rehydration integration test by @sgrebnov in https://github.com/spiceai/spiceai/pull/2658
Enhance GitHub connector robustness with explicit table schema definitions by @sgrebnov in https://github.com/spiceai/spiceai/pull/2661
Rename sharepoint fields by @Jeadie in https://github.com/spiceai/spiceai/pull/2668
Disable dataset checkpoint for DuckDB acceleration by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2676
Revert "Enable federation for accelerated queries (sqlite, duckdb, postgres) (#2598)" by @Sevenannn in https://github.com/spiceai/spiceai/pull/2683

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.4-beta...v0.18.0-beta

- Rust
Published by sgrebnov over 1 year ago

https://github.com/spiceai/spiceai - v0.17.4-beta.1

This is the release candidate 0.17.4-beta.1

- Rust
Published by phillipleblanc over 1 year ago

https://github.com/spiceai/spiceai - v0.17.4-beta

- Rust
Published by ewgenius over 1 year ago

https://github.com/spiceai/spiceai - v0.17.3-beta

Spice v0.17.3-beta (Sep 2, 2024)

The v0.17.3-beta release further improves data accelerator robustness and adds a new github data connector that makes accelerating GitHub Issues, Pull Requests, Commits, and Blobs easy.

Highlights in v0.17.3-beta

Improved benchmarking, testing, and robustness of data accelerators: Continued improvements to benchmarking and testing of data accelerators, leading to more robust and reliable data accelerators.

GitHub Connector (alpha): Connect to GitHub and accelerate Issues, Pull Requests, Commits, and Blobs.

```yaml
datasets:
  # Fetch all rust and golang files from spiceai/spiceai
  - from: github:github.com/spiceai/spiceai/files/trunk
    name: spiceai.files
    params:
      include: '**/*.rs; **/*.go'
      github_token: ${secrets:GITHUB_TOKEN}

  # Fetch all issues from spiceai/spiceai. Similar for pull requests, commits, and more.
  - from: github:github.com/spiceai/spiceai/issues
    name: spiceai.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
```

Breaking Changes

None.

Contributors

@phillipleblanc
@Jeadie
@peasee
@sgrebnov
@Sevenannn
@lukekim
@dependabot
@ewgenius

What's Changed

Dependencies

delta_kernel from 0.2.0 to 0.3.0.

Commits

Prepare version for v0.17.3-beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2388
Add a basic Github Connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2365
task: Re-enable federation by @peasee in https://github.com/spiceai/spiceai/pull/2389
fix: Implement custom PartialEq for Dataset by @peasee in https://github.com/spiceai/spiceai/pull/2390
GitHub Data Connector files support (basic fields) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2393
Add a --force flag to spice install to force it to install the latest released version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2395
Improve experience of using spice chat by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2396
Fix view loading on startup by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2398
Add include param support to GitHub Data Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2397
Postgres integration test to cover on-conflict behavior by @Sevenannn in https://github.com/spiceai/spiceai/pull/2359
Create dependabot.yml by @lukekim in https://github.com/spiceai/spiceai/pull/2399
Add content column to GitHub Connector when dataset is accelerated by @sgrebnov in https://github.com/spiceai/spiceai/pull/2400
Fix dependabot indentation by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2402
Bump docker/setup-buildx-action from 1 to 3 by @dependabot in https://github.com/spiceai/spiceai/pull/2403
Bump github/codeql-action from 2 to 3 by @dependabot in https://github.com/spiceai/spiceai/pull/2404
Bump docker/login-action from 1 to 3 by @dependabot in https://github.com/spiceai/spiceai/pull/2405
Bump yogevbd/enforce-label-action from 2.1.0 to 2.2.2 by @dependabot in https://github.com/spiceai/spiceai/pull/2406
Bump actions/checkout from 3 to 4 by @dependabot in https://github.com/spiceai/spiceai/pull/2407
Bump go.uber.org/zap from 1.21.0 to 1.27.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2408
Bump github.com/prometheus/client_model from 0.6.0 to 0.6.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2409
Bump github.com/spf13/cobra from 1.6.0 to 1.8.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2412
Bump chrono-tz from 0.8.6 to 0.9.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2413
Bump tokio from 1.39.2 to 1.39.3 by @dependabot in https://github.com/spiceai/spiceai/pull/2414
Bump tokenizers from 0.19.1 to 0.20.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2415
Bump serde from 1.0.207 to 1.0.209 by @dependabot in https://github.com/spiceai/spiceai/pull/2416
Bump gopkg.in/natefinch/lumberjack.v2 from 2.0.0 to 2.2.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2410
Bump ndarray from 0.15.6 to 0.16.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2417
Bump golang.org/x/mod from 0.14.0 to 0.20.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2411
Add correct labels to dependabot.yml by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2418
Fix build break by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2430
Dependabot updates by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2431
Bump github.com/stretchr/testify from 1.8.1 to 1.9.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2422
Preserve timezone information in constructing expr by @Sevenannn in https://github.com/spiceai/spiceai/pull/2392
Bump github.com/spf13/viper from 1.12.0 to 1.19.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2420
Fix repeated base table data in acceleration with embeddings by @Sevenannn in https://github.com/spiceai/spiceai/pull/2401
Fix tool calling with Groq (and potentially other tool-enabled models) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2435
Remove candle from crates/llms/src/chat/ by @Jeadie in https://github.com/spiceai/spiceai/pull/2439
fix: Only attach successfully initialized accelerators by @peasee in https://github.com/spiceai/spiceai/pull/2433
Support overriding OpenAI default values in a model param; add token usage telemetry to task_history. by @Jeadie in https://github.com/spiceai/spiceai/pull/2434
Enable message chains and tool calls for local LLMs by @Jeadie in https://github.com/spiceai/spiceai/pull/2180
DuckDB on-conflict integration test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2437
Fix MySQL E2E tests and include MySQL acceleration testing by @sgrebnov in https://github.com/spiceai/spiceai/pull/2441
Use rtcontext for proper cloud/local context in spice chat by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2442
Fix MySQL connector to respect the source column's decimal precision by @sgrebnov in https://github.com/spiceai/spiceai/pull/2443
Improve Github Data Connector tables schema by @sgrebnov in https://github.com/spiceai/spiceai/pull/2448
Improve GitHub Connector error msg when invalid token or permissions by @sgrebnov in https://github.com/spiceai/spiceai/pull/2449
Proper error tracking across tracing spans by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2454
task: Disable and update federation by @peasee in https://github.com/spiceai/spiceai/pull/2457
GitHub connector: convert labels and hashes to primitive arrays by @sgrebnov in https://github.com/spiceai/spiceai/pull/2452
Bump datafusion version to the latest by @sgrebnov in https://github.com/spiceai/spiceai/pull/2456
Trim trailing / for S3 data connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2458
Add accelerated_refresh to task_history table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2459
Add assignees and labels fields to github issues and github pulls datasets by @ewgenius in https://github.com/spiceai/spiceai/pull/2467
Native clickhouse schema inference by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2466
List GitHub connector in readme by @ewgenius in https://github.com/spiceai/spiceai/pull/2468
Fix LLMs health check; Add updatedAt field to GitHub connector by @ewgenius in https://github.com/spiceai/spiceai/pull/2474
Remove non existing updated_at from github.pulls dataset by @ewgenius in https://github.com/spiceai/spiceai/pull/2475
GitHub connector: add pulls labels and rm duplicate milestoneId and milestoneTitle for issues by @sgrebnov in https://github.com/spiceai/spiceai/pull/2477
Bump delta_kernel from 0.2.0 to 0.3.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2472
Add back GitHub connector Pull Request updated_at by @lukekim in https://github.com/spiceai/spiceai/pull/2479
Update ROADMAP Sep 2, 2024. by @lukekim in https://github.com/spiceai/spiceai/pull/2478

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.2-beta...v0.17.3-beta

- Rust
Published by Jeadie over 1 year ago

https://github.com/spiceai/spiceai - v0.17.2-beta.1

This is the release candidate 0.17.2-beta.1

- Rust
Published by phillipleblanc almost 2 years ago

https://github.com/spiceai/spiceai - v0.17.2-beta

Spice v0.17.2-beta (Aug 26, 2024)

The v0.17.2-beta release focuses on improving data accelerator compatibility, stability, and performance. Expanded data type support for DuckDB, SQLite, and PostgreSQL data accelerators (and data connectors) enables significantly more data types to be accelerated. Error handling and logging have also been improved, and several bugs have been fixed.

Highlights in v0.17.2-beta

Expanded Data Type Support for Data Accelerators: DuckDB, SQLite, and PostgreSQL Data Accelerators now support a wider range of data types, enabling acceleration of more diverse datasets.

Enhanced Error Handling and Logging: Improvements have been made to aid in troubleshooting and debugging.

Anonymous Usage Telemetry: Optional, anonymous, aggregated telemetry has been added to help improve Spice. This feature can be disabled. For details about collected data, see the telemetry documentation.

To opt out of telemetry:

Using the CLI flag:

spice run -- --telemetry-enabled false

Add configuration to spicepod.yaml:

runtime:
  telemetry:
    enabled: false

Improved Benchmarking: A suite of performance benchmarking tests has been added to the project, helping to maintain and improve runtime performance, a top priority for the project.

Breaking Changes

None.

Contributors

@Jeadie
@y-f-u
@phillipleblanc
@sgrebnov
@Sevenannn
@peasee
@ewgenius

What's Changed

Dependencies

DataFusion: Upgraded from v40 to v41

Commits

Pin actions/upload-artifact to v4.3.4 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2200
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2202
Update to next release version, v0.17.2-beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2203
add accelerator beta criteria by @y-f-u in https://github.com/spiceai/spiceai/pull/2201
update helm chart to 0.17.1-beta by @Sevenannn in https://github.com/spiceai/spiceai/pull/2205
add dockerignore to avoid copy target and test folder by @y-f-u in https://github.com/spiceai/spiceai/pull/2206
add client timeout for deltalake connector by @y-f-u in https://github.com/spiceai/spiceai/pull/2208
Upgrade tonic and opentelemetry-proto by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2223
Add index and resource tuning for postgres ghcr image to support postgres benchmark in sf1 by @Sevenannn in https://github.com/spiceai/spiceai/pull/2196
Remove embedding columns from retrieved_primary_keys in v1/search by @Jeadie in https://github.com/spiceai/spiceai/pull/2176
use file as dbpathparam as the param prefix is trimmed by @y-f-u in https://github.com/spiceai/spiceai/pull/2230
use file for sqlite db path param by @y-f-u in https://github.com/spiceai/spiceai/pull/2231
docs: Clarify the global requirement for local_infile when loading TPCH by @peasee in https://github.com/spiceai/spiceai/pull/2228
Revert pinning actions/upload-artifact@v4 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2232
Runtime tools to chat models by @Jeadie in https://github.com/spiceai/spiceai/pull/2207
Create runtime.task_history table for queries, and embeddings by @Jeadie in https://github.com/spiceai/spiceai/pull/2191
chore: Update Databricks ODBC Bench to use TPCH SF1 by @peasee in https://github.com/spiceai/spiceai/pull/2238
Replace metrics-rs with OpenTelemetry Metrics by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2240
fix: Remove dead code by @peasee in https://github.com/spiceai/spiceai/pull/2249
Improve tool quality and add vector search tool by @Jeadie in https://github.com/spiceai/spiceai/pull/2250
fix missing partition cols in delta lake by @y-f-u in https://github.com/spiceai/spiceai/pull/2253
download file from remote for delta testing by @y-f-u in https://github.com/spiceai/spiceai/pull/2254
feat: Set SQLite DB path to .spice/data by @peasee in https://github.com/spiceai/spiceai/pull/2242
Support tools for chat completions in streaming mode by @ewgenius in https://github.com/spiceai/spiceai/pull/2255
Load component description field from spicepod.yaml and include in LLM context by @ewgenius in https://github.com/spiceai/spiceai/pull/2261
Add parameter for connection_pool_size in the Postgres Data Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2251
Add primary keys to response of DocumentSimilarityTool by @Jeadie in https://github.com/spiceai/spiceai/pull/2263
run queries bash script by @y-f-u in https://github.com/spiceai/spiceai/pull/2262
Run benchmark test on schedule by @Sevenannn in https://github.com/spiceai/spiceai/pull/2277
feat: Add a reference to originating App for a Dataset by @peasee in https://github.com/spiceai/spiceai/pull/2283
Tool use & telemetry productionisation. by @Jeadie in https://github.com/spiceai/spiceai/pull/2286
Fix cron in benchmarks.yml by @Sevenannn in https://github.com/spiceai/spiceai/pull/2288
Upgrade to DataFusion v41 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2290
Chat completions adjustments and fixes by @ewgenius in https://github.com/spiceai/spiceai/pull/2292
Define the new metrics Arrow schema based on Open Telemetry by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2295
OpenTelemetry Metrics Arrow exporter to runtime.metrics table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2296
Calculate summary metrics from histograms for Prometheus endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2302
Add back Spice DF runtime_env during SessionContext construction by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2304
Add integration test for S3 data connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2305
Fix secrets.inject_secrets when secret not found. by @Jeadie in https://github.com/spiceai/spiceai/pull/2306
Intra-table federation query on duckdb accelerated table by @y-f-u in https://github.com/spiceai/spiceai/pull/2299
Postgres federation on acceleration by @y-f-u in https://github.com/spiceai/spiceai/pull/2309
sqlite intra table federation on acceleration by @y-f-u in https://github.com/spiceai/spiceai/pull/2308
feat: Add DataAccelerator::init() for SQLite acceleration federation by @peasee in https://github.com/spiceai/spiceai/pull/2293
Initial framework for collecting anonymous usage telemetry by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2310
Add gRPC action to trigger accelerated dataset refresh by @sgrebnov in https://github.com/spiceai/spiceai/pull/2316
add disable_query_push_down option to acceleration settings by @y-f-u in https://github.com/spiceai/spiceai/pull/2327
Remove v1/assist by @Jeadie in https://github.com/spiceai/spiceai/pull/2312
bump table provider version to set the correct dialect for postgres writer by @y-f-u in https://github.com/spiceai/spiceai/pull/2329
Send telemetry on startup by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2331
Calculate resource IDs for telemetry by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2332
Refactor v1/search: include WHERE condition, allow extra columns in projection. by @Jeadie in https://github.com/spiceai/spiceai/pull/2328
Add integration test for gRPC dataset refresh action by @sgrebnov in https://github.com/spiceai/spiceai/pull/2330
Propagate errors through all task_history nested spans by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2337
Improve tools by @Jeadie in https://github.com/spiceai/spiceai/pull/2338
update duckdb rs version to support more types: interval/duration/etc by @y-f-u in https://github.com/spiceai/spiceai/pull/2336
feat: Add DuckDB accelerator init, attach databases for federation by @peasee in https://github.com/spiceai/spiceai/pull/2335
Add query telemetry metrics by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2333
Add system prompts for LLMs; system prompts for tool using models. by @Jeadie in https://github.com/spiceai/spiceai/pull/2342
Fix benchmark test to keep running when there's failed queries by @Sevenannn in https://github.com/spiceai/spiceai/pull/2347
Tools as a spicepod first class citizen. by @Jeadie in https://github.com/spiceai/spiceai/pull/2344
Add bytes_processed telemetry metric by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2343
fix misaligned columns from delta lake by @y-f-u in https://github.com/spiceai/spiceai/pull/2356
Emit telemetry metrics to runtime.metrics/Prometheus as well by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2352
Use UTC timezone for telemetry timestamps by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2354
Fix MetricType deserialization by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2358
Add dataset details to tool using LLMs; early check tables in vector search by @Jeadie in https://github.com/spiceai/spiceai/pull/2353
Bump datafusion-federation/datafusion-table-providers dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2360
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2362
fix: Disable DuckDB and SQLite federation by @peasee in https://github.com/spiceai/spiceai/pull/2371
Fix system prompt in ToolUsingChat, fix builtin registration by @Jeadie in https://github.com/spiceai/spiceai/pull/2367
fix: Use --profile release for benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/2372
nql parameter 'use' -> 'model' by @Jeadie in https://github.com/spiceai/spiceai/pull/2366

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.1-beta...v0.17.2-beta

- Rust
Published by phillipleblanc almost 2 years ago

https://github.com/spiceai/spiceai - v0.17.1-beta

Spice v0.17.1-beta (Aug 5, 2024)

The v0.17.1-beta minor release focuses on enhancing stability, performance, and usability. The Flight interface now supports the GetSchema API and s3, ftp, sftp, http, https, and databricks data connectors have added support for a client_timeout parameter.

Highlights in v0.17.1-beta

Flight API GetSchema: The GetSchema API is now supported by the Flight interface. A schema can be retrieved using GetSchema with the PATH or CMD FlightDescriptor types. The PATH type retrieves the schema of a dataset. The CMD type retrieves the schema of an arbitrary SQL query, passed as the CMD bytes.

Client Timeout: A client_timeout parameter has been added for Data Connectors: s3, ftp, sftp, http, https, and databricks. When defined, the client timeout configures Spice to stop waiting for a response from the data source after the specified duration. The default timeout is 30 seconds.

datasets:
  - from: ftp://remote-ftp-server.com/path/to/folder/
    name: my_dataset
    params:
      file_format: csv
      # Example client timeout
      client_timeout: 30s
      ftp_user: my-ftp-user
      ftp_pass: ${secrets:my_ftp_password}

Breaking Changes

TLS is now required to be explicitly enabled. Enable TLS on the command line using --tls-enabled true:

spice run -- --tls-enabled true --tls-certificate-file /path/to/cert.pem --tls-key-file /path/to/key.pem

Or in the spicepod.yml with enabled: true:

runtime:
  tls:
    # TLS explicitly enabled
    enabled: true
    certificate_file: /path/to/cert.pem
    key_file: /path/to/key.pem

Contributors

@Jeadie
@y-f-u
@phillipleblanc
@sgrebnov
@peasee
@Sevenannn

What's Changed

Dependencies

Rust: Upgraded from v1.79.0 to v1.80.0

Commits

Update README.md by @Jeadie in https://github.com/spiceai/spiceai/pull/2142
update helm chart to 0.17.0-beta by @y-f-u in https://github.com/spiceai/spiceai/pull/2144
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2143
Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2141
Update Spice runtime to require explicit enablement for TLS by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2148
Update next version, ROADMAP, End Game template, move alpha release notes by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2145
Update EXTENSIBILITY to be correct, update README.md with Beta connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2146
Add benchmark tests for duckdb acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/2151
fix: Increase benchmark dataset setup timeout for Databricks by @peasee in https://github.com/spiceai/spiceai/pull/2149
Add LLMs to v1/models by @Jeadie in https://github.com/spiceai/spiceai/pull/2152
Dataset with acceleration enabled = false shouldn't go through accelerated dataset hot reload by @Sevenannn in https://github.com/spiceai/spiceai/pull/2155
Show single error string in Spice SQL REPL command line by @Sevenannn in https://github.com/spiceai/spiceai/pull/2150
Add CI to build makefile install targets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2157
Make the FlightClient struct cheap to clone by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2162
Fix bugs with local Unity Catalog server by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2160
Benchmark: data connector tests should continue on query error (s3) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2161
fix hanging spiced when odbc loading data and received a cancel signal by @y-f-u in https://github.com/spiceai/spiceai/pull/2156
Improve MySql schema extraction and add InList and ScalarFunction expr support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2158
Fix issue with use of EmbeddingConnector by @Jeadie in https://github.com/spiceai/spiceai/pull/2165
add client timeout for all object store providers by @y-f-u in https://github.com/spiceai/spiceai/pull/2168
Benchmark: include sqlite acceleration and enable more tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2172
feat: Use datafusion SQLite streaming updates by @peasee in https://github.com/spiceai/spiceai/pull/2171
Benchmark: include arrow acceleration and enable more tests (tpch_q22) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2173
Localhost -> Sink; Fix Sink connector to not require schema via CREATE TABLE... and infer on first write by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2167
Fix misspelled acceleration engine name in benchmark tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2175
update spark bench catalog by @y-f-u in https://github.com/spiceai/spiceai/pull/2178
Benchmark: Discard first measurement of sql query, disable result caching by @Sevenannn in https://github.com/spiceai/spiceai/pull/2179
clear message when invalid params configured for accelerator by @y-f-u in https://github.com/spiceai/spiceai/pull/2177
Implement the Flight GetSchema API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2169
Support AppendStream for SpiceAI data connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2181
Support MySQL BINARY, VARBINARY, Postgres BYTEA and improve MySQL auth error message by @sgrebnov in https://github.com/spiceai/spiceai/pull/2184
Benchmark: use SF1 for MySQL TPC-H tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2183
fix windows build broken by adding tokio unix signal by @y-f-u in https://github.com/spiceai/spiceai/pull/2193
Adds TLS support for flightsubscriber/flightpublisher tools by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2194
Update README output samples by @ewgenius in https://github.com/spiceai/spiceai/pull/2195
Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2197

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.0-beta...v0.17.1-beta

- Rust
Published by phillipleblanc almost 2 years ago

https://github.com/spiceai/spiceai - v0.17.0-beta

Spice v0.17-beta (July 29, 2024)

Announcing the first beta release of Spice.ai OSS! 🎉

The core Spice runtime has graduated from alpha to beta! Components, such as Data Connectors and Models, follow independent release milestones. Data Connectors graduating from alpha to beta include databricks, spiceai, postgres, s3, odbc, and mysql. From beta to 1.0, the project will focus on improving performance and scaling to larger datasets.

This release also includes enhanced security with Transport Layer Security (TLS) secured APIs, a new spice install CLI command, and several performance and stability improvements.

Highlights in v0.17-beta

Encryption in transit with TLS: The HTTP, gRPC, Metrics, and OpenTelemetry (OTEL) API endpoints can be secured with TLS by specifying a certificate and private key in PEM format.

Enable TLS using the --tls-certificate-file and --tls-key-file command-line flags:

spice run -- --tls-certificate-file /path/to/cert.pem --tls-key-file /path/to/key.pem

Or configure in the spicepod.yml:

runtime:
  tls:
    certificate_file: /path/to/cert.pem
    key_file: /path/to/key.pem

Get started with TLS by following the TLS Sample. For more details see the TLS Documentation.
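
For illustration, a client can verify the runtime's certificate when calling a TLS-secured HTTP endpoint. The Python sketch below is not part of Spice: the helper names `make_tls_context` and `fetch`, and the URL in the usage comment (host, port 8090, and the /health path), are assumptions chosen for the example.

```python
import ssl
import urllib.request


def make_tls_context(ca_file=None):
    """Build a client TLS context that verifies the server certificate.

    ca_file: optional path to a PEM file to trust, for example a self-signed
    cert.pem. When omitted, the system trust store is used.
    """
    context = ssl.create_default_context(cafile=ca_file)
    # create_default_context already enables both checks; they are set
    # explicitly here to make the intent visible.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context


def fetch(url, ca_file=None, timeout=5.0):
    """GET a URL over HTTPS and return (status code, body text)."""
    context = make_tls_context(ca_file)
    with urllib.request.urlopen(url, context=context, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")


# Hypothetical usage; adjust host, port, and path to the deployment:
# status, body = fetch("https://localhost:8090/health", ca_file="/path/to/cert.pem")
```

Passing the certificate as `ca_file` only makes sense for self-signed certificates; for certificates issued by a public CA, the system trust store is sufficient.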

spice install: Running the spice install CLI command will download and install the latest version of the runtime.

spice install

Improved SQLite and DuckDB compatibility: The SQLite and DuckDB accelerators support more complex queries and additional data types.
Pass through arguments from spice run to runtime: Arguments passed to spice run are now passed through to the runtime.
Secrets replacement within connection strings: Secrets are now replaced within connection strings:

datasets:
  - from: mysql:my_table
    name: my_table
    params:
      mysql_connection_string: mysql://user:${secrets:mysql_pw}@localhost:3306/db

Breaking Changes

The odbc data connector is now optional and has been removed from the released binaries. To use the odbc data connector, use the official Spice Docker image or build the Spice runtime from source.

To build Spice from source with the odbc feature:

cargo build --release --features odbc

To use the official Spice Docker image from DockerHub:

```bash
# Pull the latest official Spice image
docker pull spiceai/spiceai:latest

# Pull the official v0.17-beta Spice image
docker pull spiceai/spiceai:0.17.0-beta
```

Contributors

@y-f-u
@peasee
@digadeesh
@phillipleblanc
@ewgenius
@sgrebnov
@Sevenannn
@lukekim

What's Changed

Dependencies

Upgraded delta-kernel-rs to v0.2.0.

Commits

update helm chart versions for v0.16.0-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/2057
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2060
fix: Install unixodbc for E2E test release installation by @peasee in https://github.com/spiceai/spiceai/pull/2063
update next release to 0.16.1-beta by @digadeesh in https://github.com/spiceai/spiceai/pull/2065
update version to 0.17.0-beta by @digadeesh in https://github.com/spiceai/spiceai/pull/2068
Update ROADMAP.md - removing delivered features and updating Beta timeline. by @digadeesh in https://github.com/spiceai/spiceai/pull/2066
make bench works for more connectors by @y-f-u in https://github.com/spiceai/spiceai/pull/2042
enable spark benchmark by @y-f-u in https://github.com/spiceai/spiceai/pull/2069
Make the json_pointer param optional for the GraphQL connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2072
Fix secrets init to not bail if a secret store can't load by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2073
Update end_game.md by @ewgenius in https://github.com/spiceai/spiceai/pull/2059
Fix time predicate with timezone info casting for Dremio by @sgrebnov in https://github.com/spiceai/spiceai/pull/2058
Add benchmark tests for S3 data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2049
Add benchmark tests for MySQL data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2048
fix: Add Athena dialect for ODBC by @peasee in https://github.com/spiceai/spiceai/pull/2084
Workflow to build MySQL image with TPCH benchmark data by @sgrebnov in https://github.com/spiceai/spiceai/pull/2070
Fix secrets replacement within connection strings by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2086
fix: Correctly prefix missing required parameters by @peasee in https://github.com/spiceai/spiceai/pull/2088
Add Postgres Data Connector TPCH Benchmark Tests by @Sevenannn in https://github.com/spiceai/spiceai/pull/2009
Add spice install CLI command by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2090
Use MySQL service container for benchmark tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2089
Remove ODBC from default released binaries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2092
Add cfg flag to properly support build w / wo feature in benchmark tests by @Sevenannn in https://github.com/spiceai/spiceai/pull/2095
Move Prometheus metrics server to runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2093
fix: Remove unixodbc from test release install by @peasee in https://github.com/spiceai/spiceai/pull/2103
Upgrade delta_kernel to 0.2.0 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2102
Allow DuckDB to load extensions in Docker by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2104
Spawn the metrics server in the background. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2105
fix: suffix delta kernel table location with slash if none by @y-f-u in https://github.com/spiceai/spiceai/pull/2107
Bump object_store from 0.10.1 to 0.10.2 by @dependabot in https://github.com/spiceai/spiceai/pull/2094
Decision Record: Default HTTP and GRPC ports for Spice.ai OSS by @digadeesh in https://github.com/spiceai/spiceai/pull/2091
Enable TLS for metrics endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2108
Use Postgres container for tpch bench by @Sevenannn in https://github.com/spiceai/spiceai/pull/2112
Add workflow to build Postgres Docker image using tpch data by @Sevenannn in https://github.com/spiceai/spiceai/pull/2101
Enable TLS for HTTP endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2109
Enable TLS on the Flight GRPC endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2110
add timeout parameters for object store client options by @y-f-u in https://github.com/spiceai/spiceai/pull/2114
Enable TLS on the OpenTelemetry GRPC endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2111
feat: Add ODBC Databricks Benches by @peasee in https://github.com/spiceai/spiceai/pull/2113
Support configuring TLS in the spicepod by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2118
add broken tpch simple queries by @y-f-u in https://github.com/spiceai/spiceai/pull/2116
Add integration test for TLS by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2121
Improve SQLite and DuckDB compatibility by @sgrebnov in https://github.com/spiceai/spiceai/pull/2122
Pass through arguments from spice run and spice sql to runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2123
Handle TLS in the spice CLI when connecting to the runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2124
Handle connecting over TLS for spice sql by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2125
Remove --tls flag by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2128
fix: Handle SQLResult error instead of unwrapping by @peasee in https://github.com/spiceai/spiceai/pull/2127
Add delta bench by @y-f-u in https://github.com/spiceai/spiceai/pull/2120
feat: Add Athena ODBC benches by @peasee in https://github.com/spiceai/spiceai/pull/2129
fix: Use odbc-api fork for decimal conversion fix by @peasee in https://github.com/spiceai/spiceai/pull/2133
Update benchmarks job env for delta testing by @y-f-u in https://github.com/spiceai/spiceai/pull/2134
Use forked dotenvy to disable variable substitution by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2135
Remove unnecessary memory allocations in the query path by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2136
upgrade spiceai df for tpch simple 6 and 7 by @y-f-u in https://github.com/spiceai/spiceai/pull/2137
Avoid more unnecessary allocations in the query path by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2138

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.16.0-alpha...v0.17-beta

- Rust
Published by phillipleblanc almost 2 years ago

https://github.com/spiceai/spiceai - v0.16.0-alpha

Spice v0.16-alpha (July 22, 2024)

The v0.16-alpha release is the first candidate release for the beta milestone on a path to finalizing the v1.0 developer and user experience. Upgraders should be aware of several breaking changes designed to improve the Secrets configuration experience and to make authoring spicepod.yml files more consistent. See the Breaking Changes section below for details. Additionally, the Spice Java SDK was released, providing Java developers a simple but powerful native experience to query Spice.

Highlights in v0.16-alpha

Secret Stores: More than one Secret Store can now be specified. For example, to configure Spice with both Environment Variable and AWS Secrets Manager Secret Stores, use the following secrets configuration in spicepod.yaml:

secrets:
  - from: env
    name: env
  - from: aws_secrets_manager:my_secret_name
    name: aws_secret

Secrets managed by configured Secret Stores can be referenced in component params using the syntax ${<store_name>:<key>}. For example:

datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_pass: ${ env:MY_PG_PASS }
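
To make the replacement syntax concrete, here is a minimal Python sketch of how a ${<store_name>:<key>} reference could be resolved. It is an illustration only, not the Spice implementation: the `resolve_secrets` function, the regular expression, and the store contents are assumptions, and the store names `env` and `aws_secret` are borrowed from the configuration shown earlier.

```python
import re

# Matches ${store:key} and tolerates spaces inside the braces,
# for example ${ env:MY_PG_PASS }.
SECRET_REF = re.compile(r"\$\{\s*([A-Za-z0-9_]+)\s*:\s*([A-Za-z0-9_]+)\s*\}")


def resolve_secrets(value, stores):
    """Replace each ${store:key} reference in value with stores[store][key]."""

    def lookup(match):
        store_name, key = match.group(1), match.group(2)
        try:
            return stores[store_name][key]
        except KeyError:
            raise KeyError(
                f"secret '{key}' not found in store '{store_name}'"
            ) from None

    return SECRET_REF.sub(lookup, value)


# Hypothetical store contents, for illustration only.
stores = {
    "env": {"MY_PG_PASS": "s3cr3t"},
    "aws_secret": {"api_token": "abc123"},
}

print(resolve_secrets("${ env:MY_PG_PASS }", stores))
print(resolve_secrets("postgres://user:${env:MY_PG_PASS}@localhost:5432/db", stores))
```

Because the pattern is applied with a substitution rather than a whole-string match, the same function also covers references embedded inside a longer value such as a connection string.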

Java Client SDK: The Spice Java SDK has been released for JDK 17 or greater.
Federated SQL Query: Significant stability and reliability improvements have been made to federated SQL query support in most data connectors.
ODBC Data Connector: Providing a specific SQL dialect to query ODBC data sources is now supported using the sql_dialect param. For example, when querying Databricks using ODBC, the databricks dialect can be specified to ensure compatibility. Read the ODBC Data Connector documentation for more details.

Breaking Changes

Secret Stores: Secret Stores support has been overhauled, including required changes to the spicepod.yml schema. File-based secrets stored in the ~/.spice/auth file are no longer supported. See Secret Stores Documentation for full reference.

To upgrade Secret Stores, rename any parameters ending in _key to remove the _key suffix and specify a secret inline via the secret replacement syntax (${<secret_store>:<key>}):

datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_pass_key: my_pg_pass

to:

datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_pass: ${secrets:my_pg_pass}

And ensure the MY_PG_PASS environment variable is set.
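
The rename described above is mechanical, so it can be sketched as a small Python helper. This is a hypothetical migration aid, not a tool shipped with Spice: the `migrate_secret_params` name is an assumption, and the output should be reviewed by hand in case a parameter ends in _key for an unrelated reason.

```python
def migrate_secret_params(params):
    """Rewrite `<name>_key: <secret>` entries as `<name>: ${secrets:<secret>}`.

    All other parameters are copied through unchanged.
    """
    suffix = "_key"
    migrated = {}
    for name, value in params.items():
        if name.endswith(suffix):
            new_name = name[: -len(suffix)]
            migrated[new_name] = "${secrets:" + str(value) + "}"
        else:
            migrated[name] = value
    return migrated


old_params = {"pg_host": "localhost", "pg_port": 5432, "pg_pass_key": "my_pg_pass"}
print(migrate_secret_params(old_params))
```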

Datasets: The default value of time_format has changed from unix_seconds to timestamp.

To upgrade:

datasets:
  - from:
    name: my_dataset
    # Explicitly define format when not specified.
    time_format: unix_seconds
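
As a rough illustration of why the default matters, the Python sketch below shows one instant in both representations. It assumes that unix_seconds means an integer count of seconds since the Unix epoch and that timestamp means a native timestamp value; the sample value is made up.

```python
from datetime import datetime, timezone

# A time column value as stored under time_format: unix_seconds (sample value).
unix_seconds = 1721606400

# The same instant expressed as a timezone-aware timestamp.
as_timestamp = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
print(as_timestamp.isoformat())  # 2024-07-22T00:00:00+00:00
```

A dataset whose time column holds integers like the one above needs time_format: unix_seconds set explicitly after upgrading, since the column would otherwise be read under the new default.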

HTTP Port: The default HTTP port has changed from port 3000 to port 8090 to avoid conflicting with frontend apps which typically use the 3000 range. If an SDK is used, upgrade it at the same time as the runtime.

To upgrade and continue using port 3000, run spiced with the --http command line argument:

```shell
# Using Dockerfile or spiced directly
spiced --http 127.0.0.1:3000
```

HTTP Metrics Port: The default HTTP Metrics port has changed from port 9000 to 9090 to avoid conflicting with other metrics protocols which typically use port 9000.

To upgrade and continue using port 9000, run spiced with the --metrics command line argument:

```shell
# Using Dockerfile or spiced directly
spiced --metrics 127.0.0.1:9000
```

GraphQL Data Connector: json_path has been replaced with json_pointer to access nested data from the result of the GraphQL query. See the GraphQL Data Connector documentation for full details and RFC-6901 - JSON Pointer.

To upgrade, change:

json_path: my.json.path

To:

json_pointer: /my/json/pointer
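
For readers unfamiliar with JSON Pointer, the following Python sketch shows how a pointer selects nested data under RFC 6901. It is a simplified illustration, not the connector's code: the `resolve_json_pointer` function and the sample response are assumptions, and the special "-" array token is not handled.

```python
def resolve_json_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer against parsed JSON (dicts and lists)."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")

    current = document
    for raw_token in pointer[1:].split("/"):
        # Unescape in the order the RFC requires: "~1" -> "/" first, then "~0" -> "~".
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]  # array elements are addressed by index
        else:
            current = current[token]  # object members are addressed by name
    return current


# Hypothetical GraphQL response, for illustration only.
response = {"data": {"users": {"nodes": [{"login": "a"}, {"login": "b"}]}}}

print(resolve_json_pointer(response, "/data/users/nodes"))
print(resolve_json_pointer(response, "/data/users/nodes/1/login"))
```

Unlike a dotted path, each segment is separated by "/" and the pointer always starts at the document root, which is why my.json.path becomes /my/json/pointer style syntax.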

Data Connector Configuration: Consistent connector name prefixing has been applied to connector-specific params. Prefixed parameter names help ensure parameters do not collide.

For example, params specific to the Databricks data connector are now prefixed with databricks_. A configuration written for an earlier release:

datasets:
  - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks unity catalog
    name: my_delta_lake_table
    params:
      mode: spark_connect
      endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com
      token: MY_TOKEN

To upgrade:

datasets:
  # Example for Spark Connect
  - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks unity catalog
    name: my_delta_lake_table
    params:
      mode: spark_connect
      databricks_endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com # Now prefixed with databricks
      databricks_token: ${secrets:my_token} # Now prefixed with databricks

Refer to the Data Connector documentation for parameter naming changes in this release.
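
The prefixing itself can be sketched as a small Python helper for bulk-editing older configurations. This is a hypothetical aid, not part of Spice: the `prefix_connector_params` name and the `unprefixed` argument are assumptions, and which params stay unprefixed for a given connector must be taken from the Data Connector documentation (only `mode` is shown unprefixed in the example above).

```python
def prefix_connector_params(params, connector, unprefixed=("mode",)):
    """Prefix connector-specific param names with `<connector>_`.

    Names listed in `unprefixed`, and names that already carry the prefix,
    are copied through unchanged.
    """
    prefix = connector + "_"
    renamed = {}
    for name, value in params.items():
        if name in unprefixed or name.startswith(prefix):
            renamed[name] = value
        else:
            renamed[prefix + name] = value
    return renamed


old_params = {
    "mode": "spark_connect",
    "endpoint": "dbc-a1b2345c-d6e7.cloud.databricks.com",
    "token": "MY_TOKEN",
}
print(prefix_connector_params(old_params, "databricks"))
```

The helper only renames keys; moving a literal token into a secret reference such as ${secrets:my_token} is a separate step.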

Clickhouse Data Connector: The clickhouse_connection_timeout parameter has been renamed to connection_timeout as it applies to the client and is not Clickhouse configuration itself.

To upgrade, change:

clickhouse_connection_timeout: time

To:

connection_timeout: time

Contributors

@y-f-u
@phillipleblanc
@ewgenius
@github-actions
@sgrebnov
@lukekim
@digadeesh
@peasee
@Sevenannn

What's Changed

Dependencies

No major dependency updates.

Commits

bump helm chart versions to 0.15.2-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/1975
Remove unused Cargo.toml fields by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1981
Update version to 0.16.0-beta by @ewgenius in https://github.com/spiceai/spiceai/pull/1983
Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/1984
Enable sqlite acceleration testing in E2E by @sgrebnov in https://github.com/spiceai/spiceai/pull/1980
Revert "Revert "fix: validate time column and time format when constructing accelerated table refresh"" by @y-f-u in https://github.com/spiceai/spiceai/pull/1982
Add Datadog dashboard skeleton by @sgrebnov in https://github.com/spiceai/spiceai/pull/1971
Format Cargo.toml with taplo by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1988
Spice cli spice chat command, to interact with deployed spiced instance in spice.ai cloud by @ewgenius in https://github.com/spiceai/spiceai/pull/1990
Use platform api /v1/chat/completions with streaming in spice chat cli command by @ewgenius in https://github.com/spiceai/spiceai/pull/1998
update spiceai datafusion version to fix tpch queries by @y-f-u in https://github.com/spiceai/spiceai/pull/2001
Install a rustls default CryptoProvider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2003
Roadmap update July, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/2002
Add local spice runtime support for spice chat command, add --model flag by @ewgenius in https://github.com/spiceai/spiceai/pull/2007
fix: GraphQL Data Connector - Change json path to json pointer by @digadeesh in https://github.com/spiceai/spiceai/pull/1930
Update ROADMAP.md to include MySQL data connector in Beta by @digadeesh in https://github.com/spiceai/spiceai/pull/2016
Load secrets from multiple secret stores & secrets UX refresh by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2011
upgrade spiceai datafusion to fix tpch simple query 3 by @y-f-u in https://github.com/spiceai/spiceai/pull/2021
feat: Autodetect ODBC dialect by @peasee in https://github.com/spiceai/spiceai/pull/1997
feat: Use CustomDialectBuilder for Databricks ODBC dialect by @peasee in https://github.com/spiceai/spiceai/pull/2020
Switch the secret replacement syntax to ${ <secret>:<key> } by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2026
fix spiceai connector lengthy error by @y-f-u in https://github.com/spiceai/spiceai/pull/2024
Log parameter key instead of value when injecting secret by @Sevenannn in https://github.com/spiceai/spiceai/pull/2031
Update benchmark yml to support postgres benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2032
Separate data connector parameters into connector and runtime categories by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2028
Fix spice chat prompt and spinner by @ewgenius in https://github.com/spiceai/spiceai/pull/2029
Build spiced with odbc for release binaries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2036
MySQL timestamp, int64 casting, date part extraction and intervals support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2035
updating default http and metrics ports by @digadeesh in https://github.com/spiceai/spiceai/pull/2034
enable spark connect federated query by @y-f-u in https://github.com/spiceai/spiceai/pull/2041
fix: Use MySQL Interval for Databricks ODBC by @peasee in https://github.com/spiceai/spiceai/pull/2037
Re-enable test_quickstart_dremio E2E test by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2045
Fix ODBC build for release binaries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2046
chore: Remove unused dependencies by @peasee in https://github.com/spiceai/spiceai/pull/2044
fix: Change version to alpha breaking by @peasee in https://github.com/spiceai/spiceai/pull/2051
Add connector prefix for dataset configure endpoint param by @sgrebnov in https://github.com/spiceai/spiceai/pull/2052
Fix unprefixed runtime parameters by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2050
Fix make install-with-models by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2054
Bump openssl from 0.10.64 to 0.10.66 by @dependabot in https://github.com/spiceai/spiceai/pull/2047
Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2056
ignore empty constraints when creating accelerated table by @y-f-u in https://github.com/spiceai/spiceai/pull/2055

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.15.2-alpha...v0.16.0-alpha

- Rust
Published by digadeesh almost 2 years ago

https://github.com/spiceai/spiceai - v0.15.2-alpha

Spice v0.15.2-alpha (July 15, 2024)

The v0.15.2-alpha minor release focuses on enhancing stability and performance, and introduces Catalog Providers for streamlined access to Data Catalog tables. Unity Catalog, Databricks Unity Catalog, and the Spice.ai Cloud Platform Catalog are supported in v0.15.2-alpha. The reliability of federated query push-down has also been improved for the MySQL, PostgreSQL, ODBC, S3, Databricks, and Spice.ai Cloud Platform data connectors.

Highlights in v0.15.2-alpha

Catalog Providers: Catalog Providers streamline access to Data Catalog tables. Initial catalog providers supported are Databricks Unity Catalog, Unity Catalog and Spice.ai Cloud Platform Catalog.

For example, to configure Spice to connect to tpch tables in the Spice.ai Cloud Platform Catalog, use the new catalogs: section in the spicepod.yml:

```yaml
catalogs:
  - name: spiceai
    from: spiceai
    include:
      - tpch.*
```

Time: 0.001866958 seconds. 9 rows.
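As a rough illustration of what an include pattern such as tpch.* selects, the sketch below filters fully qualified table names with Python's fnmatch module. Glob-style matching is an assumption made for illustration, not a description of Spice's implementation, and the table names are hypothetical.

```python
from fnmatch import fnmatchcase


def included_tables(tables, include_patterns):
    """Keep tables whose `schema.table` name matches any include pattern."""
    return [
        name
        for name in tables
        if any(fnmatchcase(name, pattern) for pattern in include_patterns)
    ]


# Hypothetical catalog contents, for illustration only.
catalog_tables = ["tpch.customer", "tpch.orders", "eth.blocks"]
selected = included_tables(catalog_tables, ["tpch.*"])
print(selected)
```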

ODBC Data Connector Push-Down: The ODBC Data Connector now supports query push-down for joins, improving performance for joined datasets configured with the same odbc_connection_string.

Improved Spicepod Validation: Improved spicepod.yml validation has been added, including warnings when loading resources with duplicate names (datasets, views, models, embeddings).

Breaking Changes

None.

Contributors

@phillipleblanc
@peasee
@y-f-u
@ewgenius
@Sevenannn
@sgrebnov
@lukekim

What's Changed

Dependencies

Upgraded Apache DataFusion to v40.0.0.

Commits

Update to next release version v0.15.2-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1901
release: Update helm 0.15.1-alpha by @peasee in https://github.com/spiceai/spiceai/pull/1902
fix: Detect and error on duplicate component names on spiced (re)load by @peasee in https://github.com/spiceai/spiceai/pull/1905
fix: flaky test - test_refresh_status_change_to_ready by @y-f-u in https://github.com/spiceai/spiceai/pull/1908
Add support for parsing catalog from Spicepod. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1903
Add catalog component to Runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1906
Adds a RuntimeBuilder and make most items on Runtime private by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1913
Bump zerovec-derive from 0.10.2 to 0.10.3 by @dependabot in https://github.com/spiceai/spiceai/pull/1914
Add separate tagged image with enabled models feature by @ewgenius in https://github.com/spiceai/spiceai/pull/1909
Update datafusion-table-providers to use newest head by @Sevenannn in https://github.com/spiceai/spiceai/pull/1927
Add MySQL support for TPC-H test data generation script by @sgrebnov in https://github.com/spiceai/spiceai/pull/1932
fix: Expose ODBC task errors if error is before data stream begins by @peasee in https://github.com/spiceai/spiceai/pull/1924
Use public.ecr.aws/docker/library/{postgres/mysql}:latest for integration test images by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1934
Implement spice.ai CatalogProvider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1925
fix: validate time column and time format when constructing accelerated table refresh by @y-f-u in https://github.com/spiceai/spiceai/pull/1926
Add support for filtering tables included by a catalog by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1933
Add UnityCatalog catalog provider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1940
Implement Databricks catalog provider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1941
Copy params into dataset_params by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1947
Make integration tests more stable by using logged-in registry during CI by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1955
Add integration test for Spice.ai catalog provider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1956
Add GET /v1/catalogs API and catalogs CMD by @lukekim in https://github.com/spiceai/spiceai/pull/1957
feat: Enable ODBC JoinPushDown with hashed connection string by @peasee in https://github.com/spiceai/spiceai/pull/1954
Fix bug: arrow acceleration reports zero results during refresh by @sgrebnov in https://github.com/spiceai/spiceai/pull/1962
Revert "fix: validate time column and time format when constructing accelerated table refresh" by @y-f-u in https://github.com/spiceai/spiceai/pull/1964
fix: Update arrow-odbc to use our fork for pending fixes by @peasee in https://github.com/spiceai/spiceai/pull/1965
Upgrade to DataFusion 40 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1963
Do exchange shouldn't require table to be writable by @Sevenannn in https://github.com/spiceai/spiceai/pull/1958
Use custom dialect rule for flight federated request by @y-f-u in https://github.com/spiceai/spiceai/pull/1946
upgrade datafusion federation to have the table rewrite fix for tpch-q9 by @y-f-u in https://github.com/spiceai/spiceai/pull/1970
Create v0.15.2-alpha.md Release notes by @digadeesh in https://github.com/spiceai/spiceai/pull/1969
Fix Unity Catalog API response for Azure Databricks by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1973
Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1976

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.15.1-alpha...v0.15.2-alpha

- Rust
Published by digadeesh almost 2 years ago

https://github.com/spiceai/spiceai - v0.15.1-alpha

Spice v0.15.1-alpha (July 8, 2024)

The v0.15.1-alpha minor release focuses on enhancing stability, performance, and usability. Memory usage has been significantly improved for the postgres and duckdb acceleration engines, which now use stream processing. A new Delta Lake Data Connector has been added. It shares a delta-kernel-rs based implementation, which supports deletion vectors, with the Databricks Data Connector.

Highlights

Improved memory usage for PostgreSQL and DuckDB acceleration engines: Large dataset acceleration with PostgreSQL and DuckDB engines has reduced memory consumption by streaming data directly to the accelerated table as it is read from the source.

Delta Lake Data Connector: A new Delta Lake Data Connector has been added for using Delta Lake outside of Databricks.

ODBC Data Connector Streaming: The ODBC Data Connector now streams results, reducing memory usage and improving performance.

GraphQL Object Unnesting: The GraphQL Data Connector can automatically unnest objects from GraphQL queries using the unnest_depth parameter.
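To make the idea of depth-limited unnesting concrete, here is a small Python sketch that flattens nested objects up to a given depth. It is not the connector's implementation: the `unnest` function, the underscore-joined column names, and the sample record are all assumptions for illustration.

```python
def unnest(record, depth, sep="_"):
    """Flatten nested dicts up to `depth` levels; deeper objects stay nested."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, dict) and depth > 0:
            # Recurse with one less level of unnesting available.
            for child_key, child_value in unnest(value, depth - 1, sep).items():
                flat[f"{key}{sep}{child_key}"] = child_value
        else:
            flat[key] = value
    return flat


# Hypothetical GraphQL result row.
row = {"id": 1, "author": {"name": "Ada", "address": {"city": "London"}}}
depth_one = unnest(row, depth=1)
depth_two = unnest(row, depth=2)
print(depth_one)
print(depth_two)
```

With a depth of 1 the `address` object is kept as a nested value, while a depth of 2 turns it into its own column.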

Breaking Changes

None.

New Contributors

None.

Contributors

What's Changed

Dependencies

The MySQL, PostgreSQL, SQLite and DuckDB DataFusion TableProviders developed by Spice AI have been donated to the datafusion-contrib/datafusion-table-providers community repository.

Commits

Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1842
Update ROADMAP.md - Remove v0.15.0-alpha roadmap items. by @digadeesh in https://github.com/spiceai/spiceai/pull/1843
update helm chart for v0.15.0-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/1845
update cargo.toml and version.txt to 0.15.1-alpha (for next release) by @digadeesh in https://github.com/spiceai/spiceai/pull/1844
Fix check for outdated Cargo.lock & update Cargo.lock by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1846
Add Debezium to README by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1847
use snmalloc as global allocator by @y-f-u in https://github.com/spiceai/spiceai/pull/1848
Various improvements for mistral.rs by @Jeadie in https://github.com/spiceai/spiceai/pull/1831
Enable streaming for accelerated tables refresh (common logic) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1863
Use in-memory DB pool for DuckDB functions by @Jeadie in https://github.com/spiceai/spiceai/pull/1849
Generate Spicepod JSON Schema by @ewgenius in https://github.com/spiceai/spiceai/pull/1865
Update http param names by @Jeadie in https://github.com/spiceai/spiceai/pull/1872
Replace DuckDB, PostgreSQL, Sqlite and MySQL providers with the datafusion-table-providers crate by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1873
Remove more dead code moved to datafusion-table-providers by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1874
feat: Optimize ODBC for streaming results by @peasee in https://github.com/spiceai/spiceai/pull/1862
Fix how models uses secrets by @Jeadie in https://github.com/spiceai/spiceai/pull/1875
fix: Add support for varying duplicate columns behavior in GraphQL unnesting by @peasee in https://github.com/spiceai/spiceai/pull/1876
fix: Remove GraphQL duplicate rename support by @peasee in https://github.com/spiceai/spiceai/pull/1877
fix: Remove Overwrite GraphQL duplicates behavior by @peasee in https://github.com/spiceai/spiceai/pull/1882
fix: Use tokio mpsc channels for ODBC streaming by @peasee in https://github.com/spiceai/spiceai/pull/1883
Upgrade table providers to enable DuckDB streaming write by @sgrebnov in https://github.com/spiceai/spiceai/pull/1884
Update ROADMAP.md - Add debezium (alpha) to connector list. by @digadeesh in https://github.com/spiceai/spiceai/pull/1880
Allow defining user for mysql data connector via secrets by @sgrebnov in https://github.com/spiceai/spiceai/pull/1886
Replace delta-rs with delta-kernel-rs and add new delta data connector. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1878
Update README images by @lukekim in https://github.com/spiceai/spiceai/pull/1890
Handle deletion vectors for delta tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1891
Rename delta to delta_lake by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1892
Add where is the AI to the FAQ. by @lukekim in https://github.com/spiceai/spiceai/pull/1885
update df table providers rev version by @y-f-u in https://github.com/spiceai/spiceai/pull/1889
Enable other cloud providers for delta_lake integration by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1893
Add CLI parameters for logging into Databricks with Azure/GCP cloud storage by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1894
Bump zerovec from 0.10.2 to 0.10.4 by @dependabot in https://github.com/spiceai/spiceai/pull/1896
Add 'Content-Type' to metrics exporter to be prometheus exposition format compliant by @sgrebnov in https://github.com/spiceai/spiceai/pull/1897
Update enforce-labels.yml so it accepts depdenabot updates with kind/… by @digadeesh in https://github.com/spiceai/spiceai/pull/1898

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.15.0-alpha...v0.15.1-alpha

- Rust
Published by digadeesh almost 2 years ago

https://github.com/spiceai/spiceai - v0.15.0-alpha

Spice v0.15-alpha (July 1, 2024)

The v0.15-alpha release introduces support for streaming database changes with Change Data Capture (CDC) into accelerated tables via a new Debezium connector, configurable retry logic for data refresh, and a new C# SDK for building with Spice in .NET.

Highlights

Debezium data connector with Change Data Capture (CDC): Sync accelerated datasets with Debezium data sources over Kafka in real-time.
Data Refresh Retries: By default, accelerated datasets attempt to retry data refreshes on transient errors. This behavior can be configured using refresh_retry_enabled and refresh_retry_max_attempts.
C# Client SDK: A new C# Client SDK has been released for developing applications in .NET.

Debezium data connector with Change Data Capture (CDC)

Integrating Debezium CDC is straightforward. Get started with the Debezium CDC Sample, read more about CDC in Spice, and read the Debezium data connector documentation.

Example Spicepod using Debezium CDC:

```yaml
datasets:
  - from: debezium:cdc.public.customer_addresses
    name: customer_addresses_cdc
    params:
      debezium_transport: kafka
      debezium_message_format: json
      kafka_bootstrap_servers: localhost:19092
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      refresh_mode: changes
```
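Conceptually, refresh_mode: changes means the accelerated table is kept current by applying individual change events rather than re-reading the source. The Python sketch below illustrates this with the `op`, `before`, and `after` fields of a Debezium change event. It is not how Spice implements CDC; the `apply_change` function, the `id` key, and the sample rows are hypothetical.

```python
def apply_change(table, event, key="id"):
    """Apply one Debezium-style change event to an in-memory table.

    `table` maps primary-key values to rows. Only the `op`, `before` and
    `after` fields of the event are used.
    """
    op = event["op"]
    if op in ("c", "r", "u"):  # create, snapshot read, update
        row = event["after"]
        table[row[key]] = row
    elif op == "d":  # delete
        table.pop(event["before"][key], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")
    return table


# Hypothetical change events for a customer_addresses table.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "city": "Seattle"}},
    {"op": "u", "before": {"id": 1, "city": "Seattle"}, "after": {"id": 1, "city": "Seoul"}},
    {"op": "c", "before": None, "after": {"id": 2, "city": "Oslo"}},
    {"op": "d", "before": {"id": 2, "city": "Oslo"}, "after": None},
]
accelerated = {}
for event in events:
    apply_change(accelerated, event)
print(accelerated)
```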

Data Refresh Retries

Example Spicepod configuration limiting refresh retries to a maximum of 10 attempts:

```yaml
datasets:
  - from: eth.blocks
    name: blocks
    acceleration:
      refresh_retry_enabled: true
      refresh_retry_max_attempts: 10
      refresh_check_interval: 30s
```
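The v0.14.1-alpha commit list mentions a refresh retry based on Fibonacci backoff. The Python sketch below shows what a bounded retry loop of that shape could look like. The delay values, the use of `ConnectionError` as a stand-in for a transient error, and the reading of the limit as "retries after the first failure" are all assumptions, not documented Spice behavior.

```python
import time


def fibonacci_delays(max_attempts, base_seconds=1.0):
    """Yield one delay per retry, growing along the Fibonacci sequence."""
    a, b = 1, 1
    for _ in range(max_attempts):
        yield a * base_seconds
        a, b = b, a + b


def refresh_with_retries(refresh, max_attempts=10, sleep=time.sleep):
    """Call `refresh`, retrying transient failures up to `max_attempts` times."""
    delays = fibonacci_delays(max_attempts)
    while True:
        try:
            return refresh()
        except ConnectionError:  # stand-in for a transient error
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted, surface the last error
            sleep(delay)


# Hypothetical refresh that fails three times before succeeding.
calls = {"count": 0}


def flaky_refresh():
    calls["count"] += 1
    if calls["count"] <= 3:
        raise ConnectionError("source temporarily unavailable")
    return "refreshed"


waits = []
result = refresh_with_retries(flaky_refresh, max_attempts=10, sleep=waits.append)
print(result, waits)
```

Passing `sleep` as a parameter keeps the example fast to run: the recorded waits show the schedule without actually pausing.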

Breaking Changes

None.

New Contributors

@rupurt made their first contribution in https://github.com/spiceai/spiceai/pull/1791

Contributors

What's Changed

Dependencies

No major dependency updates.

Commits

Update version to 0.15.0-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1784
Update helm for v0.14.1-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1786
Run PR checks on PRs merging into feature-- branches by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1788
Enable retries for accelerated table refresh by @sgrebnov in https://github.com/spiceai/spiceai/pull/1762
enable more tpch benchmark queries as a result of decimal unparsing by @y-f-u in https://github.com/spiceai/spiceai/pull/1790
add nix flake by @rupurt in https://github.com/spiceai/spiceai/pull/1791
Support local and HF embedding models by @Jeadie in https://github.com/spiceai/spiceai/pull/1789
fix(bin/spice): Implement custom Unmarshaller for DatasetOrReference by @peasee in https://github.com/spiceai/spiceai/pull/1787
For windows, move symlink -> symlink_file. by @Jeadie in https://github.com/spiceai/spiceai/pull/1793
docs: Add PULL_REQUEST_TEMPLATE.md by @peasee in https://github.com/spiceai/spiceai/pull/1794
Fix Unsupported DataType: conversion for time predicates by @sgrebnov in https://github.com/spiceai/spiceai/pull/1795
Use incremental backoff for initial dataset registration retries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1805
Basic HTTP/S connector by @Jeadie in https://github.com/spiceai/spiceai/pull/1792
Scale support for Snowflake fixed-point numbers by @sgrebnov in https://github.com/spiceai/spiceai/pull/1804
bump datafusion federation to resolve the join query failures by @y-f-u in https://github.com/spiceai/spiceai/pull/1806
fix: Stream PostgreSQL data in by @peasee in https://github.com/spiceai/spiceai/pull/1798
Remove clippy::module_name_repetitions lint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1812
Improve Snowflake fixed-point numbers casting by @sgrebnov in https://github.com/spiceai/spiceai/pull/1809
Case insensitive secret getter by @ewgenius in https://github.com/spiceai/spiceai/pull/1813
refactor: Format TOML with Taplo by @peasee in https://github.com/spiceai/spiceai/pull/1808
feat: Update PR template, add label enforcement in PR by @peasee in https://github.com/spiceai/spiceai/pull/1815
fix bug that append may miss updates when the incremental changes are not able to be contained in one record batch by @y-f-u in https://github.com/spiceai/spiceai/pull/1817
add integration test for inner join across federated table and accelerated table by @y-f-u in https://github.com/spiceai/spiceai/pull/1811
Unify spicepod.llms into spicepod.models and refactor UX of spicepod.models by @Jeadie in https://github.com/spiceai/spiceai/pull/1818
Fix issue with querying accelerated tables where the dataset name has a schema by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1823
Fix schema support for refresh_sql and improve e2e tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/1826
feat: Add GraphQL unnesting by @peasee in https://github.com/spiceai/spiceai/pull/1822
fix: Allow kind/optimization labels, increase Postgres test timeout by @peasee in https://github.com/spiceai/spiceai/pull/1830
Implement Real-time acceleration updates via Debezium CDC by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1832
Remove println statement from PG Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/1835
Don't try to "hot reload" Debezium accelerated datasets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1837
Create v1/search that performs vector search. by @Jeadie in https://github.com/spiceai/spiceai/pull/1836
Align spicepod UX of embeddings with models by @Jeadie in https://github.com/spiceai/spiceai/pull/1829
Add "cmake-build" feature to rdkafka for windows by @Jeadie in https://github.com/spiceai/spiceai/pull/1840
Add a better error message when trying to configure refresh_mode=changes on a data connector that doesn't support it. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1839

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.14.1-alpha...v0.15.0-alpha

- Rust
Published by digadeesh almost 2 years ago

https://github.com/spiceai/spiceai - v0.14.1-alpha

Spice v0.14.1-alpha (Jun 24, 2024)

The v0.14.1-alpha release is focused on quality, stability, and type support with improvements in PostgreSQL, DuckDB, and GraphQL data connectors.

Highlights

PostgreSQL acceleration and data connector: Support for Composite Types and UUID data types.
DuckDB acceleration and data connector: Support for LargeUTF8 and DuckDB functions.
GraphQL data connector: Improved error handling on invalid query syntax.
Refresh SQL: Improved stability when overwriting STRUCT data types.

Breaking Changes

None.

New Contributors

@phungleson made their first contribution in https://github.com/spiceai/spiceai/pull/1750
@peasee made their first contribution in https://github.com/spiceai/spiceai/pull/1769

Contributors

@lukekim
@y-f-u
@ewgenius
@phillipleblanc
@Jeadie
@sgrebnov
@gloomweaver
@phungleson
@peasee
@digadeesh

What's Changed

Dependencies

No major dependency updates.

Commits

Update Helm to v0.14.0-alpha by @sgrebnov in https://github.com/spiceai/spiceai/pull/1720
Update version to 0.14.1-alpha by @sgrebnov in https://github.com/spiceai/spiceai/pull/1721
Use spiceai/async-openai to solve Deserialize issue in v1/embed by @Jeadie in https://github.com/spiceai/spiceai/pull/1707
Add greatest least user defined functions by @y-f-u in https://github.com/spiceai/spiceai/pull/1722
default timeunit to be seconds when time column is a numeric column by @y-f-u in https://github.com/spiceai/spiceai/pull/1727
use system conf to construct dns resolver by @y-f-u in https://github.com/spiceai/spiceai/pull/1728
fix a bug that dataset refresh api does not work for table with schema by @y-f-u in https://github.com/spiceai/spiceai/pull/1729
Move secret crate to runtime module by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1723
Return schema in get_flight_info_simple by @gloomweaver in https://github.com/spiceai/spiceai/pull/1724
Refactor vector search component of v1/assist into a VectorSearch struct by @Jeadie in https://github.com/spiceai/spiceai/pull/1699
Update ROADMAP.md. Fix a broken link for the "Get in touch" link. by @digadeesh in https://github.com/spiceai/spiceai/pull/1725
Secret keys in params should be case insensitive by @ewgenius in https://github.com/spiceai/spiceai/pull/1737
expose error log when refresh encountered some issue, also add more debug logs by @y-f-u in https://github.com/spiceai/spiceai/pull/1739
Support Struct in PostgreSQL accelerator by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1733
rewrite refresh append update dedup logic using arrow comparators by @y-f-u in https://github.com/spiceai/spiceai/pull/1743
Add health checks when loading {llms, embeddings} by @Jeadie in https://github.com/spiceai/spiceai/pull/1738
Support DuckDB function in DuckDB datasets by @Jeadie in https://github.com/spiceai/spiceai/pull/1742
Update version of spiceai/duckdb-rs, support LargeUTF8 by @Jeadie in https://github.com/spiceai/spiceai/pull/1746
Split refresh into coordination and execution layers by @sgrebnov in https://github.com/spiceai/spiceai/pull/1744
bump duckdb rs git sha to resolve duckdb incorrect null value issue by @y-f-u in https://github.com/spiceai/spiceai/pull/1747
cargo.lock file update with #1747 duckdb-rs sha by @y-f-u in https://github.com/spiceai/spiceai/pull/1748
Fix error when GraphQL error locations is missing by @phungleson in https://github.com/spiceai/spiceai/pull/1750
Tweak refresh scheduling logic by @sgrebnov in https://github.com/spiceai/spiceai/pull/1749
Ensure tonic package is in duckdb feature by @Jeadie in https://github.com/spiceai/spiceai/pull/1756
Change tonic::async_trait -> async_trait::async_trait by @Jeadie in https://github.com/spiceai/spiceai/pull/1757
Streaming in v1/chat/completion by @Jeadie in https://github.com/spiceai/spiceai/pull/1741
Add refresh_retry_enabled/max_attempts acceleration params by @sgrebnov in https://github.com/spiceai/spiceai/pull/1753
Implement refresh retry based on fibonacci backoff (not enabled) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1752
Add VSCode debug target to debug runtime benchmark test by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1760
update spiceai datafusion to include more unparser rules by @y-f-u in https://github.com/spiceai/spiceai/pull/1764
Show UUID types as String instead of base64 binary. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1767
docs: Add linux contributor guide for setup by @peasee in https://github.com/spiceai/spiceai/pull/1769
Do not expose connection url on object store error by @ewgenius in https://github.com/spiceai/spiceai/pull/1761
Support secrets in llm and embeddings params by @ewgenius in https://github.com/spiceai/spiceai/pull/1770
Bump github.com/hashicorp/go-retryablehttp from 0.7.1 to 0.7.7 by @dependabot in https://github.com/spiceai/spiceai/pull/1775
Update ROADMAP.md with latest roadmap changes for v0.15.0 by @digadeesh in https://github.com/spiceai/spiceai/pull/1773
Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1776
Strip kwarg '=' in DuckDB function parsing by @Jeadie in https://github.com/spiceai/spiceai/pull/1777

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.14.0-alpha...v0.14.1-alpha

- Rust
Published by digadeesh almost 2 years ago

https://github.com/spiceai/spiceai - v0.14.0-alpha

Spice v0.14-alpha (June 17, 2024)

The v0.14-alpha release focuses on enhancing accelerated dataset performance and data integrity, with support for configuring primary keys and indexes. Additionally, the GraphQL data connector has been introduced, along with improved dataset registration and loading error information.

Highlights

Accelerated Datasets: Ensure data integrity using primary key and unique index constraints. Configure conflict handling to either upsert new data or drop it. Create indexes on frequently filtered columns for faster queries on larger datasets.
GraphQL Data Connector: Initial support for using GraphQL as a data source.

Example Spicepod showing how to use primary keys and indexes with accelerated datasets:

```yaml
datasets:
  - from: eth.blocks
    name: blocks
    acceleration:
      engine: duckdb # Use DuckDB acceleration engine
      primary_key: '(hash, timestamp)'
      indexes:
        number: enabled # same as `CREATE INDEX ON blocks (number);`
        '(number, hash)': unique # same as `CREATE UNIQUE INDEX ON blocks (number, hash);`
      on_conflict:
        '(hash, number)': drop # possible values: drop (default), upsert
        '(hash, timestamp)': upsert
```

Primary Keys, constraints, and indexes are currently supported when using SQLite, DuckDB, and PostgreSQL acceleration engines.
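The difference between the drop and upsert conflict behaviors can be seen with plain SQL. The sketch below uses Python's built-in sqlite3 module and a simplified blocks table. How Spice translates on_conflict into statements for each engine is not stated here, so the SQL is an assumption chosen to show the semantics only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blocks (hash TEXT, timestamp INTEGER, number INTEGER, "
    "PRIMARY KEY (hash, timestamp))"
)
conn.execute("CREATE INDEX blocks_number ON blocks (number)")
conn.execute("INSERT INTO blocks VALUES ('0xabc', 100, 1)")

# drop: the conflicting row is discarded and the existing row is kept.
conn.execute(
    "INSERT INTO blocks VALUES ('0xabc', 100, 2) "
    "ON CONFLICT (hash, timestamp) DO NOTHING"
)
after_drop = conn.execute("SELECT number FROM blocks").fetchall()

# upsert: the conflicting row overwrites the existing row's values.
conn.execute(
    "INSERT INTO blocks VALUES ('0xabc', 100, 3) "
    "ON CONFLICT (hash, timestamp) DO UPDATE SET number = excluded.number"
)
after_upsert = conn.execute("SELECT number FROM blocks").fetchall()
print(after_drop, after_upsert)
```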

Learn more with the indexing quickstart and the primary key sample.

Read the Local Acceleration documentation.

Breaking Changes

None.

Contributors

@phillipleblanc
@ewgenius
@sgrebnov
@Jeadie
@digadeesh
@gloomweaver
@y-f-u
@lukekim
@edmondop

What's Changed

Dependencies

Apache DataFusion: Upgraded from 38.0.0 to 39.0.0
Apache Arrow/Parquet: Upgraded from 51.0.0 to 52.0.0
Rust: Upgraded from 1.78.0 to 1.79.0

Commits

Update Helm chart for v0.13.3-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1671
Bump version to v0.14.0-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1673
Dependency upgrades: DataFusion 39, Arrow/Parquet 52, object_store 0.10.1, arrow-odbc 11.1.0 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1674
Generate unique runtime instance name and store in runtime.metrics table by @ewgenius in https://github.com/spiceai/spiceai/pull/1678
Proper support for Snowflake TIMESTAMP_NTZ by @sgrebnov in https://github.com/spiceai/spiceai/pull/1677
Enable tpch_q2 and tpch_q21 in the benchmark queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1679
Start runtime metrics recorder after loading secrets and extensions by @ewgenius in https://github.com/spiceai/spiceai/pull/1680
Validate table constraints (Primary Keys/Unique Index) on accelerated tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1658
Store labels as JSON string in runtime.metrics by @ewgenius in https://github.com/spiceai/spiceai/pull/1681
Atomic updates for DuckDB tables with constraints by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1682
Rename metrics column labels to properties and make it nullable by @ewgenius in https://github.com/spiceai/spiceai/pull/1686
Fix federation_optimizer_rule schema error for tpch_q7, tpch_q8, tpch_q9, tpch_q14 by @sgrebnov in https://github.com/spiceai/spiceai/pull/1683
Better prompt for /v1/assist by @Jeadie in https://github.com/spiceai/spiceai/pull/1685
Support stream in v1/assist by @Jeadie in https://github.com/spiceai/spiceai/pull/1653
Fix cache hit rate chart loading for Grafana v9.5 by @sgrebnov in https://github.com/spiceai/spiceai/pull/1691
Update ROADMAP.md to include data connector statuses by @digadeesh in https://github.com/spiceai/spiceai/pull/1684
Support primary_key in Spicepod and create in accelerated table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1687
Datasets with schema support for availability monitoring by @sgrebnov in https://github.com/spiceai/spiceai/pull/1690
Improve dataset registration output by @sgrebnov in https://github.com/spiceai/spiceai/pull/1692
Readme: update dataset registration traces by @sgrebnov in https://github.com/spiceai/spiceai/pull/1694
Improved error logging for datasets load error by @edmondop in https://github.com/spiceai/spiceai/pull/1695
Improve ArrayDistance scalar UDF by @Jeadie in https://github.com/spiceai/spiceai/pull/1697
Implement on_conflict behavior for accelerated tables with constraints by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1688
Fix datasets live update (Spice file watcher) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1702
Grafana Dashboard: replace Quantile with Percentile filter by @sgrebnov in https://github.com/spiceai/spiceai/pull/1703
refresh with append overlap by @y-f-u in https://github.com/spiceai/spiceai/pull/1706
Fix error message on DuckDB constraint violation by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1709
Add warning when configuring indexes/primary_key/on_conflict for Arrow engine. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1710
ensure schema to be existing when query timestamp during refresh by @y-f-u in https://github.com/spiceai/spiceai/pull/1711
Improve README clarity and add comparison table by @lukekim in https://github.com/spiceai/spiceai/pull/1713
Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1716
Update README.md to include GraphQL data connector in supported table by @digadeesh in https://github.com/spiceai/spiceai/pull/1717
Fix quoting issue for databricks identifier by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1718

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.3-alpha...v0.14.0-alpha

- Rust
Published by github-actions[bot] almost 2 years ago

https://github.com/spiceai/spiceai - v0.13.3-alpha

Spice v0.13.3-alpha (June 10, 2024)

The v0.13.3-alpha release is focused on quality and stability with improvements to metrics, telemetry, and operability.

Highlights

Ready API: Adds a /v1/ready API that returns success once all datasets and models are loaded and ready.

Enhanced Grafana dashboard: The dashboard now includes charts for query duration and failures, the last update time of accelerated datasets, the count of refresh errors, and the last time the runtime successfully accessed federated datasets.
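As a rough illustration of how a deployment script might use the readiness endpoint described above, the following sketch polls /v1/ready until it answers with HTTP 200 (the status mentioned in the commit list below). The base URL, timeout values, and helper names are assumptions made for this example and are not part of the release.

```python
import time
import urllib.error
import urllib.request


def http_status(url: str) -> int:
    """Return the HTTP status of a GET request, or 0 if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses raise HTTPError; the status code is still useful.
        return err.code
    except (urllib.error.URLError, OSError):
        return 0


def wait_until_ready(base_url, timeout_s=60.0, interval_s=1.0,
                     get_status=http_status, sleep=time.sleep):
    """Poll <base_url>/v1/ready until it returns 200 or the timeout expires."""
    url = base_url.rstrip("/") + "/v1/ready"
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status(url) == 200:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Hypothetical address; substitute the HTTP address of the running runtime.
    print(wait_until_ready("http://localhost:3000", timeout_s=10))
```

The status and sleep functions are injectable so the loop can be exercised without a running runtime.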

Contributors

@Jeadie
@ewgenius
@phillipleblanc
@sgrebnov
@gloomweaver
@y-f-u
@mach-kernel

What's Changed

Dependencies

DuckDB 1.0.0: Upgrades embedded DuckDB to 1.0.0.

Commits

Scalar UDF array_distance as euclidean distance between Float32[] by @Jeadie in https://github.com/spiceai/spiceai/pull/1601
Update version to v0.14.0-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1614
Update helm for v0.13.2-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1618
Upgrade duckdb-rs to DuckDB 1.0.0 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1615
initial idea for 'POST v1/assist' by @Jeadie in https://github.com/spiceai/spiceai/pull/1585
openai server trait and move HTTP endpoints to crates/runtime/src/http/v1/ by @Jeadie in https://github.com/spiceai/spiceai/pull/1619
Add branching policy & updated endgame instructions by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1620
Update Cargo.lock & add CI check for updated Cargo.lock by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1627
Add first-class support for views by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1622
Add /v1/ready API that returns 200 when all datasets have loaded by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1629
Separate NQL logic from LLM Chat messages, and add OpenAI compatiblility per LLM trait. by @Jeadie in https://github.com/spiceai/spiceai/pull/1628
Log queries failing on getflightinfo step (Flight Api) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1626
Graphql Data Connector by @gloomweaver in https://github.com/spiceai/spiceai/pull/1624
GraphQL improved Error formatting, proper format request body by @gloomweaver in https://github.com/spiceai/spiceai/pull/1637
Fix v1/assist response and panic bug. Include primary keys in response too by @Jeadie in https://github.com/spiceai/spiceai/pull/1635
skip integration test if no secret by @y-f-u in https://github.com/spiceai/spiceai/pull/1638
[append] Refresher::getlatesttimestamp / getdf to add refreshsql predicates to scan by @mach-kernel in https://github.com/spiceai/spiceai/pull/1636
GraphQL integration test by @gloomweaver in https://github.com/spiceai/spiceai/pull/1600
Add err_code to query_failures metric by @sgrebnov in https://github.com/spiceai/spiceai/pull/1639
use epoch_ms to replace epoch to work with timestamptz by @y-f-u in https://github.com/spiceai/spiceai/pull/1641
fix the schema mismatch issue on the fallback plan use schema casting by @y-f-u in https://github.com/spiceai/spiceai/pull/1642
bug report template update by @y-f-u in https://github.com/spiceai/spiceai/pull/1640
Add query duration, failures and accelerated dataset metrics to Grafana dashboard by @sgrebnov in https://github.com/spiceai/spiceai/pull/1598
Fix FTP/sftp support for ObjectStoreMetadataTable & ObjectStoreTextTable by @Jeadie in https://github.com/spiceai/spiceai/pull/1649
Support accelerated embedding tables in v1/assist by @Jeadie in https://github.com/spiceai/spiceai/pull/1648
GraphQL pagination, limit pushdown and refactor by @gloomweaver in https://github.com/spiceai/spiceai/pull/1643
Support indexes in accelerated tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1644
Federated datasets availability monitoring by @sgrebnov in https://github.com/spiceai/spiceai/pull/1650
Reset federated dataset availability during dataset registration by @sgrebnov in https://github.com/spiceai/spiceai/pull/1661
Change to v0.13.3-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1666
Add Time Since Offline chart to Grafana dashboard by @sgrebnov in https://github.com/spiceai/spiceai/pull/1664
readme fix to correct the number of rows for show tables by @y-f-u in https://github.com/spiceai/spiceai/pull/1667
Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1668
Add missing dependency on arrowsqlgen from duckdb data_component by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1669

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.2-alpha...v0.13.3-alpha

- Rust
Published by phillipleblanc almost 2 years ago

https://github.com/spiceai/spiceai - v0.13.2-alpha

Spice v0.13.2-alpha (June 3, 2024)

The v0.13.2-alpha release is focused on quality and stability with improvements to federated query push-down, telemetry, and query history.

Highlights

Filesystem Data Connector: Adds the Filesystem Data Connector for directly using files as data sources.
Federated Query Push-Down: Improved stability and schema compatibility for federated queries.
Enhanced Telemetry: Runtime Metrics now include last update time for accelerated datasets, count of refresh errors, and new metrics for query duration and failures.
Query History: Enabled query history logging for Arrow Flight queries in addition to HTTP queries.

Contributors

@lukekim
@y-f-u
@ewgenius
@phillipleblanc
@Jeadie
@Sevenannn
@sgrebnov
@gloomweaver
@mach-kernel

What's Changed

Update ROADMAP.md May 27, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1535
update helm chart version and use v0.13.1-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/1536
version correction in v0.13.1 release note by @y-f-u in https://github.com/spiceai/spiceai/pull/1538
update version to v0.14.0-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/1539
Update spice_cloud - connect to cloud api by @ewgenius in https://github.com/spiceai/spiceai/pull/1523
Update spice_cloud extension params, and remove logging by @ewgenius in https://github.com/spiceai/spiceai/pull/1541
Update MSRV to 1.78 and remove unused Rust Version parameter in CI by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1540
Improve llm UX in spicepod.yaml by @Jeadie in https://github.com/spiceai/spiceai/pull/1545
Store local runtime metrics in Timestamp with nanoseconds precision and UTC time by @ewgenius in https://github.com/spiceai/spiceai/pull/1548
Object store metadata Table provider by @Jeadie in https://github.com/spiceai/spiceai/pull/1518
Remove clickhouse password requirement by @Sevenannn in https://github.com/spiceai/spiceai/pull/1547
pretty print loaded rows number by @y-f-u in https://github.com/spiceai/spiceai/pull/1553
Fix UNION ALL federated push down by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1550
Update mistral, fix bugs and improve local file DX by @Jeadie in https://github.com/spiceai/spiceai/pull/1552
Cast runtime.metrics schema, if remote (spiceai) data connector provided by @ewgenius in https://github.com/spiceai/spiceai/pull/1554
Use proper MySQL dialect during federation push-down by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1555
parallel load dataset when starting up by @y-f-u in https://github.com/spiceai/spiceai/pull/1551
fix linter warning on Scanf return value by @y-f-u in https://github.com/spiceai/spiceai/pull/1556
Update spice cloud connect api endpoint by @ewgenius in https://github.com/spiceai/spiceai/pull/1557
Create new HTTP endpoint to create embeddings. by @Jeadie in https://github.com/spiceai/spiceai/pull/1558
Query History support for Flight API by @sgrebnov in https://github.com/spiceai/spiceai/pull/1549
Don't cache queries for runtime tables by @sgrebnov in https://github.com/spiceai/spiceai/pull/1561
Fix schema incompatibility on federated push-down queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1560
move 'embeddings' to top-level concept in spicepod.yaml by @Jeadie in https://github.com/spiceai/spiceai/pull/1564
object_store table provider for UTF8 data formats by @Jeadie in https://github.com/spiceai/spiceai/pull/1562
Improve connectivity for JDBC clients, like Tableau by @sgrebnov in https://github.com/spiceai/spiceai/pull/1563
Enable datasets from local filesystem by @Jeadie in https://github.com/spiceai/spiceai/pull/1584
Adds benchmarking tests for Spice by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1577
Push down correct timestamp expr to SQLite, add binary type mapping by @mach-kernel in https://github.com/spiceai/spiceai/pull/1566
Add query_duration_seconds and query_failures metrics by @sgrebnov in https://github.com/spiceai/spiceai/pull/1575
Use /app as a default workdir in spiceai docker image by @ewgenius in https://github.com/spiceai/spiceai/pull/1586
Add support for both file:// and file:/ by @Jeadie in https://github.com/spiceai/spiceai/pull/1587
put loaddatasets as the latest step along with startservers by @y-f-u in https://github.com/spiceai/spiceai/pull/1559
Embedding columns (from embedding providers) are now run inside datafusion plans. by @Jeadie in https://github.com/spiceai/spiceai/pull/1576
Support BinaryArray in DuckDB accelerations by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1595
Add cache header to Flight API and Spice REPL indicator by @sgrebnov in https://github.com/spiceai/spiceai/pull/1591
Add accelerated datasets refresh metrics by @sgrebnov in https://github.com/spiceai/spiceai/pull/1589
update the error when starting spice sql with no runtime to be actionable by @digadeesh in https://github.com/spiceai/spiceai/pull/1597
add odbc integration test by @y-f-u in https://github.com/spiceai/spiceai/pull/1590
Fix bug in instantiating EmbeddingConnector by @Jeadie in https://github.com/spiceai/spiceai/pull/1592
readme change to reflect new cli output by @y-f-u in https://github.com/spiceai/spiceai/pull/1602
Update version v0.13.2 by @ewgenius in https://github.com/spiceai/spiceai/pull/1604
Roadmap changes Jun 3, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1609

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.1-alpha...v0.13.2

- Rust
Published by ewgenius almost 2 years ago

https://github.com/spiceai/spiceai - v0.13.1-alpha

Spice v0.13.1-alpha (May 27, 2024)

The v0.13.1-alpha release of Spice is a minor update focused on stability, quality, and operability. Query result caching provides protection against bursts of queries, and schema support for datasets has been added for logical grouping. An issue where Refresh SQL predicates were not pushed down to underlying data sources has been resolved, along with improved Acceleration Refresh logging.
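To make the idea of burst protection concrete, here is a minimal sketch of a time-to-live (TTL) results cache keyed by query text. It is an illustration only, not the runtime's implementation: the class name, method name, and the choice to key on the raw SQL string are assumptions. The 1-second default mirrors the default TTL listed in the highlights.

```python
import time


class ResultsCache:
    """Illustrative TTL cache: repeated queries inside the TTL reuse one result."""

    def __init__(self, ttl_s=1.0, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._items = {}  # sql text -> (expiry time, cached result)

    def get_or_compute(self, sql, run_query):
        """Return (result, was_cache_hit) for the given query text."""
        now = self._clock()
        entry = self._items.get(sql)
        if entry is not None and entry[0] > now:
            return entry[1], True
        # Miss or expired entry: run the query and store a fresh result.
        result = run_query(sql)
        self._items[sql] = (now + self._ttl_s, result)
        return result, False
```

With this shape, a burst of identical queries arriving within one second reaches the underlying source only once; the first call after expiry recomputes the result.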

Highlights in v0.13.1-alpha

Results Caching: Introduced query results caching to handle bursts of requests and support caching of non-accelerated results, such as refresh data returned on zero results. Results caching is enabled by default with a 1s item time-to-live (TTL).
Query History Logging: Recent queries are now logged in the new spice.runtime.query_history dataset with a default retention of 24 hours. Query history is initially enabled for HTTP queries only (not Arrow Flight queries).
Dataset Schemas: Added support for dataset schemas, allowing logical grouping of datasets by separating the schema name from the table name with a period (.). For example:

```yaml
datasets:
  - from: mysql:app1.identities
    name: app.users

  - from: postgres:app2.purchases
    name: app.purchases
```

In this example, queries against app.users will be federated to app1.identities, and app.purchases will be federated to app2.purchases.
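The naming rule above can be sketched as a small helper that splits a dataset name into its schema and table parts and groups datasets by schema. This is an illustrative sketch, not the runtime's parser: the function names are hypothetical, and the "public" default schema for unqualified names is an assumption.

```python
from typing import Dict, Iterable, List, NamedTuple


class DatasetName(NamedTuple):
    schema: str
    table: str


def split_dataset_name(name: str, default_schema: str = "public") -> DatasetName:
    """Split 'schema.table' on the period; unqualified names get the default schema."""
    parts = name.split(".")
    if len(parts) == 1:
        return DatasetName(default_schema, parts[0])
    if len(parts) == 2:
        return DatasetName(parts[0], parts[1])
    # This sketch only handles one- and two-part names.
    raise ValueError(f"unsupported dataset name: {name!r}")


def group_by_schema(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group table names under their schema, preserving input order."""
    groups: Dict[str, List[str]] = {}
    for name in names:
        parsed = split_dataset_name(name)
        groups.setdefault(parsed.schema, []).append(parsed.table)
    return groups
```

For the spicepod above, both datasets land in the same logical schema: `group_by_schema(["app.users", "app.purchases"])` groups `users` and `purchases` under `app`.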

Contributors

@y-f-u @Jeadie @sgrebnov @ewgenius @phillipleblanc @lukekim @gloomweaver @Sevenannn

New in this release

Add more type support on mysql connector by @y-f-u in https://github.com/spiceai/spiceai/pull/1449
Add in-memory caching support for Arrow Flight queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1450
Fix the table reference to use the full table reference, not just the table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1460
Make file_format parameter required for S3/FTP/SFTP connector by @ewgenius in https://github.com/spiceai/spiceai/pull/1455
Add more verbose logging when acceleration refresh update is finished by @y-f-u in https://github.com/spiceai/spiceai/pull/1453
Fix snowflake dataset path when using federation query by @y-f-u in https://github.com/spiceai/spiceai/pull/1474
Update cargo to use spiceai datafusion fork by @y-f-u in https://github.com/spiceai/spiceai/pull/1475
Enable in-memory results caching by default by @sgrebnov in https://github.com/spiceai/spiceai/pull/1473
Add basic integration test for MySQL federation by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1477
Update results_cache config names per final spec by @sgrebnov in https://github.com/spiceai/spiceai/pull/1487
Add DuckDB quickstart to E2E tests by @lukekim in https://github.com/spiceai/spiceai/pull/1461
Add X-Cache header for http queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1472
Add telemetry for in-memory caching by @sgrebnov in https://github.com/spiceai/spiceai/pull/1456
Pin Git dependencies to a specific commit hash by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1490
Detect file_format from dataset path by @ewgenius in https://github.com/spiceai/spiceai/pull/1489
Add file_format to helm chart sample dataset by @ewgenius in https://github.com/spiceai/spiceai/pull/1493
Improve duckdb data connector error messages by @Sevenannn in https://github.com/spiceai/spiceai/pull/1486
Add file_format prompt for s3 and ftp datasets in Dataset Configure CLI if no extension detected by @ewgenius in https://github.com/spiceai/spiceai/pull/1494
Add llms to the spicepod definition and use throughout by @Jeadie in https://github.com/spiceai/spiceai/pull/1447
Fix duckdb acceleration converting null into default values. by @y-f-u in https://github.com/spiceai/spiceai/pull/1500
Separate runtime Dataset from spicepod Dataset by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1503
Duckdb e2e test OSX support by @y-f-u in https://github.com/spiceai/spiceai/pull/1505
Use TableReference for dataset name by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1506
Tweak Results Cache naming and output by @lukekim in https://github.com/spiceai/spiceai/pull/1509
Fix refresh_sql not properly passing down filters by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1510
Allow datasets to specify a schema by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1507
Cache invalidation for accelerated tables by @sgrebnov in https://github.com/spiceai/spiceai/pull/1498
Improve spark data connector error messages by @Sevenannn in https://github.com/spiceai/spiceai/pull/1497
Parse postgres table schema from prepare statement to support empty tables by @ewgenius in https://github.com/spiceai/spiceai/pull/1445
Improve clarity of README and add FAQ by @lukekim in https://github.com/spiceai/spiceai/pull/1512
Use binary data transfer for ftp by @gloomweaver in https://github.com/spiceai/spiceai/pull/1517
Add support for time64 for SQL insertion statement by @y-f-u in https://github.com/spiceai/spiceai/pull/1519
Add Spice Extensions PoC by @ewgenius in https://github.com/spiceai/spiceai/pull/1476
Add results cache metrics, pod and quantile filters to Grafana dashboard by @sgrebnov in https://github.com/spiceai/spiceai/pull/1513
Add unit tests for results caching utils by @sgrebnov in https://github.com/spiceai/spiceai/pull/1514
Add E2E tests for results caching by @sgrebnov in https://github.com/spiceai/spiceai/pull/1515
Pass tablereference full string into sparksession table so it can query across schemas or catalogs by @y-f-u in https://github.com/spiceai/spiceai/pull/1521
Trace on debug level for tables in runtime schema by @ewgenius in https://github.com/spiceai/spiceai/pull/1524
Update SparkSessionBuilder::remote and update spark fork hash by @Sevenannn in https://github.com/spiceai/spiceai/pull/1495
Fix federation push-down for datasets with schemas by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1526
Store history of queries in 'spice.runtime.query_history' by @Jeadie in https://github.com/spiceai/spiceai/pull/1501
Disable cache for system queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1528
Register runtime tables with runtime schema by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1532
Fix acknowledgments workflow to include all cargo features by @Jeadie in https://github.com/spiceai/spiceai/pull/1531

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.0-alpha...v0.13.1-alpha

- Rust
Published by y-f-u about 2 years ago

https://github.com/spiceai/spiceai - v0.13.0-alpha

Spice v0.13-alpha (May 20, 2024)

The v0.13.0-alpha release significantly improves federated query performance and efficiency with Query Push-Down. Query push-down allows SQL queries to be directly executed by underlying data sources, such as joining tables using the same data connector. Query push-down is supported for all SQL-based and Arrow Flight data connectors. Additionally, runtime metrics, including query duration, are now collected and can be accessed in the spice.runtime.metrics table. This release also includes a new FTP/SFTP data connector and improved CSV support for the S3 data connector.

Highlights

Federated Query Push-Down (#1394): All SQL and Arrow Flight data connectors support federated query push-down.
Runtime Metrics (#1361): Runtime metric collection can be enabled using the --metrics flag and accessed through the spice.runtime.metrics table.
FTP & SFTP data connector (#1355) (#1399): Added support for using FTP and SFTP as data sources.
Improved CSV support (#1411) (#1414): S3/FTP/SFTP data connectors support CSV files with expanded CSV options.
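The "same data connector" condition for push-down can be illustrated with a deliberately simplified check: if every table a query references comes from one source, the whole query is a candidate for execution at that source. This is a sketch of the idea only. A real planner works on query plans and can push down parts of a query, and the connector keys and function name used here are hypothetical.

```python
from typing import Iterable, Mapping, Optional


def pushdown_connector(referenced_tables: Iterable[str],
                       table_to_connector: Mapping[str, str]) -> Optional[str]:
    """Return the single connector shared by all referenced tables, else None."""
    connectors = {table_to_connector[table] for table in referenced_tables}
    if len(connectors) == 1:
        return next(iter(connectors))
    # Zero tables or several different sources: no whole-query push-down.
    return None


# Hypothetical dataset-to-source mapping for illustration.
sources = {
    "orders": "postgres:main",
    "customers": "postgres:main",
    "events": "s3:logs",
}

# A join of orders and customers shares one source and can be sent there as-is.
same_source = pushdown_connector(["orders", "customers"], sources)
# A join of orders and events spans two sources and must be combined locally.
mixed_sources = pushdown_connector(["orders", "events"], sources)
```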

Contributors

@Jeadie
@digadeesh
@ewgenius
@gloomweaver
@lukekim
@phillipleblanc
@sgrebnov
@y-f-u

What's Changed

Remove milestones from Enhancement template by @lukekim in https://github.com/spiceai/spiceai/pull/1373
Update version.txt and Cargo.toml to 0.13.0-alpha by @sgrebnov in https://github.com/spiceai/spiceai/pull/1375
Helm chart for Spice v0.12.2-alpha by @sgrebnov in https://github.com/spiceai/spiceai/pull/1374
Add release cargo feature to docker builds by @ewgenius in https://github.com/spiceai/spiceai/pull/1377
FTP connector by @gloomweaver in https://github.com/spiceai/spiceai/pull/1355
Provide ability to specify timeout for s3 data connector by @gloomweaver in https://github.com/spiceai/spiceai/pull/1378
clickhouse-rs use tag instead of branch by @gloomweaver in https://github.com/spiceai/spiceai/pull/1313
Store runtime metrics in spice.runtime.metrics table by @ewgenius in https://github.com/spiceai/spiceai/pull/1361
Update bug_report.md to include the kind/bug label by @digadeesh in https://github.com/spiceai/spiceai/pull/1381
Remove redundant [refresh] in log by @lukekim in https://github.com/spiceai/spiceai/pull/1384
Implement federation for DuckDB Data Connector (POC) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1380
Update wording for spice cloud connection by @ewgenius in https://github.com/spiceai/spiceai/pull/1386
fix dataset refreshing status by @y-f-u in https://github.com/spiceai/spiceai/pull/1387
clickhouse friendly error by @y-f-u in https://github.com/spiceai/spiceai/pull/1388
Initial work for NQL crate and API by @Jeadie in https://github.com/spiceai/spiceai/pull/1366
Fully implement federation for all SqlTable-based Data Connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1394
use df logical plan to query latest timestamp when refreshing incrementally by @y-f-u in https://github.com/spiceai/spiceai/pull/1393
Refactor datafusion.write_data to use table reference by @ewgenius in https://github.com/spiceai/spiceai/pull/1402
Add federation to FlightTable based DataConnectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1401
SFTP Data Connector by @gloomweaver in https://github.com/spiceai/spiceai/pull/1399
Use GPT3.5 for NSQL task by @Jeadie in https://github.com/spiceai/spiceai/pull/1400
Update ROADMAP May 16, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1405
Add ftp/sftp connector to readme by @gloomweaver in https://github.com/spiceai/spiceai/pull/1404
Add FlightSQL federation provider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1403
Refactor runtime metrics to use localhost accelerated table by @ewgenius in https://github.com/spiceai/spiceai/pull/1395
Use JSON response in OpenAI, text -> SQL model by @Jeadie in https://github.com/spiceai/spiceai/pull/1407
support more common csv options by @y-f-u in https://github.com/spiceai/spiceai/pull/1411
add a TLS error message in data connector and implement it for clickhouse by @y-f-u in https://github.com/spiceai/spiceai/pull/1413
Add CSV to s3 data formats by @gloomweaver in https://github.com/spiceai/spiceai/pull/1414
fix up dependencies now 0.5.0 disappeared by @Jeadie in https://github.com/spiceai/spiceai/pull/1417
Add NSQL to FlightRepl by @Jeadie in https://github.com/spiceai/spiceai/pull/1409
Update Cargo.lock by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1418
Enable spice.ai replication for runtime.metrics table by @ewgenius in https://github.com/spiceai/spiceai/pull/1408
Restructure the runtime struct to make it easier to test by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1420
Make it easier to construct an App programatically by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1421
Add an integration test for federation by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1426
wait 2 seconds for the status to turn ready in refreshing status test by @y-f-u in https://github.com/spiceai/spiceai/pull/1419
Add functional tests for federation push-down by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1428
Enable push-down federation by default by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1429
Add guides and examples about error handling by @ewgenius in https://github.com/spiceai/spiceai/pull/1427
Add LRU cache support for http-based queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1410
Update README.md - Remove bigquery from tablet of connectors by @digadeesh in https://github.com/spiceai/spiceai/pull/1434
Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1433
CLI wording and logs change reflected on readme by @y-f-u in https://github.com/spiceai/spiceai/pull/1435
Add databricksusessl parameter by @Sevenannn in https://github.com/spiceai/spiceai/pull/1406
Update helm version and use v0.13.0-alpha by @Jeadie in https://github.com/spiceai/spiceai/pull/1436
Don't include feature 'llms/candles' by default by @Jeadie in https://github.com/spiceai/spiceai/pull/1437
Correctly map NullBuilder for Null arrow types by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1438
Propagate object store error by @gloomweaver in https://github.com/spiceai/spiceai/pull/1415

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.12.2-alpha...v0.13.0-alpha

- Rust
Published by Jeadie about 2 years ago

https://github.com/spiceai/spiceai - v0.12.2-alpha

Spice v0.12.2-alpha (May 13, 2024)

The v0.12.2-alpha release introduces data streaming and key-pair authentication for the Snowflake data connector, enables general append mode data refreshes for time-series data, improves connectivity error messages, adds nested folders support for the S3 data connector, and exposes nodeSelector and affinity keys in the Helm chart for better Kubernetes management.

Highlights

Improved Connectivity Error Messages: Error messages provide clearer, actionable guidance for misconfigured settings or unreachable data connectors.
Snowflake Data Connector Improvements: Enables data streaming by default and adds support for key-pair authentication in addition to passwords.
API for Refresh SQL Updates: Update dataset Refresh SQL via API.
Append Data Refresh: Append mode data refreshes for time-series data are now supported for all data connectors. Specify a dataset time_column with refresh_mode: append to only fetch data more recent than the latest local data.
Docker Image Update: The spiceai/spiceai:latest Docker image now includes the ODBC data connector. For a smaller footprint, use spiceai/spiceai:latest-slim.
Helm Chart Improvements: nodeSelector and affinity keys are now supported in the Helm chart for improved Kubernetes deployment management.
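The append refresh behavior described above (fetch only rows newer than the latest local row) can be sketched as follows. This is a minimal illustration under stated assumptions: rows are plain dictionaries, `fetch_since` is a hypothetical callback standing in for the data connector, and the comparison is strictly greater-than.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional

Row = Dict[str, Any]


def append_refresh(local_rows: List[Row],
                   fetch_since: Callable[[Optional[Any]], Iterable[Row]],
                   time_column: str) -> List[Row]:
    """Append only source rows newer than the latest local time_column value."""
    # With no local data there is no watermark, so everything is fetched.
    latest = max((row[time_column] for row in local_rows), default=None)
    return local_rows + list(fetch_since(latest))


def make_source(rows: List[Row], time_column: str):
    """Build a fake source that filters by the watermark, as a connector would."""
    def fetch_since(latest):
        return [r for r in rows if latest is None or r[time_column] > latest]
    return fetch_since
```

Only the rows past the local high-water mark cross the wire, which is what makes this mode suited to time-series data that is never rewritten.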

Breaking Changes

The API to trigger accelerated dataset refreshes has changed from POST /v1/datasets/:name/refresh to POST /v1/datasets/:name/acceleration/refresh to be consistent with the Spicepod.yaml structure.
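For clients affected by this change, a small helper that builds the new path may ease migration. The sketch below assumes a hypothetical base URL and dataset name, and sends an empty POST body; any request body or response format beyond the path shown above is not covered by these notes.

```python
import urllib.parse
import urllib.request


def refresh_url(base_url: str, dataset: str) -> str:
    """Build the refresh endpoint URL using the new /acceleration/refresh path."""
    name = urllib.parse.quote(dataset, safe="")
    return f"{base_url.rstrip('/')}/v1/datasets/{name}/acceleration/refresh"


def trigger_refresh(base_url: str, dataset: str) -> int:
    """POST to the refresh endpoint and return the HTTP status code."""
    request = urllib.request.Request(refresh_url(base_url, dataset),
                                     data=b"", method="POST")
    with urllib.request.urlopen(request, timeout=10) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical address and dataset name; requires a running runtime.
    print(trigger_refresh("http://localhost:3000", "taxi_trips"))
```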

Contributors

@mach-kernel
@y-f-u
@sgrebnov
@ewgenius
@Jeadie
@Sevenannn
@digadeesh
@phillipleblanc
@lukekim

What's Changed

Fix list type support in spark connect by @y-f-u in https://github.com/spiceai/spiceai/pull/1341
Add nested folder support in S3 Parquet connector by @y-f-u in https://github.com/spiceai/spiceai/pull/1342
Improves S3 connector using DataFusion ListingTable table provider by @y-f-u in https://github.com/spiceai/spiceai/pull/1326
Update ROADMAP May 6, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1315
List flightsql and snowflake as supported connectors in README.md by @sgrebnov in https://github.com/spiceai/spiceai/pull/1317
Helm chart for v0.12.1-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1323
Read sqlite_file param and use it as path by @Sevenannn in https://github.com/spiceai/spiceai/pull/1309
Compile spiced with release feature in docker image by @ewgenius in https://github.com/spiceai/spiceai/pull/1324
Add support for Snowflake key-pair authentication by @sgrebnov in https://github.com/spiceai/spiceai/pull/1314
Wrap postgres errors in common DataConnectorError by @ewgenius in https://github.com/spiceai/spiceai/pull/1327
Fix TPCH tests runner by @sgrebnov in https://github.com/spiceai/spiceai/pull/1330
Spice CLI support for Snowflake key-pair auth by @sgrebnov in https://github.com/spiceai/spiceai/pull/1325
sqlproviderdatafusion: Support TimestampMicrosecond, Date32, Date64 by @mach-kernel in https://github.com/spiceai/spiceai/pull/1329
Resolve dangling reference for SQLite by @Sevenannn in https://github.com/spiceai/spiceai/pull/1312
Select columns from Spark Dataframe according to projected_schema by @Sevenannn in https://github.com/spiceai/spiceai/pull/1336
Expose nodeselector and affinity keys in Helm chart by @mach-kernel in https://github.com/spiceai/spiceai/pull/1338
Use streaming for Snowflake queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1337
Publish ODBC images by @mach-kernel in https://github.com/spiceai/spiceai/pull/1271
Include Postgres acceleration engine to types support tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/1343
Refactor dataconnector providers getters to return common DataConnectorResult and DataConnectorError by @ewgenius in https://github.com/spiceai/spiceai/pull/1339
s3 csv support to validate the listing table extensibility by @y-f-u in https://github.com/spiceai/spiceai/pull/1344
Move model code into separate, feature-flagged crate by @Jeadie in https://github.com/spiceai/spiceai/pull/1335
Initial setup for federated queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1350
Refactor dbconnection errors, and catch invalid postgres table name case by @ewgenius in https://github.com/spiceai/spiceai/pull/1353
Rename default datafusion catalog to "spice", add internal "spice.runtime" schema by @ewgenius in https://github.com/spiceai/spiceai/pull/1359
Add API to set Refresh SQL for accelerated table by @sgrebnov in https://github.com/spiceai/spiceai/pull/1356
Set next version to v0.12.2 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1367
Upgrade to DataFusion 38 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1368
Incremental append based on time column by @y-f-u in https://github.com/spiceai/spiceai/pull/1360
Update README.md to include correct output when running show tables from quickstart by @digadeesh in https://github.com/spiceai/spiceai/pull/1371

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.12.1-alpha...v0.12.2-alpha

- Rust
Published by github-actions[bot] about 2 years ago

https://github.com/spiceai/spiceai - v0.12.1-alpha

Spice v0.12.1-alpha (May 6, 2024)

The v0.12.1-alpha release introduces a new Snowflake data connector, support for UUID and TimestampTZ types in the PostgreSQL connector, and improved error messages across all data connectors. The Clickhouse data connector enables data streaming by default. The public SQL interface now restricts DML and DDL queries. Additionally, accelerated tables now fully support NULL values, and issues with schema conversion in these tables have been resolved.

Highlights

Snowflake Data Connector: Initial support for Snowflake as a data source.
Clickhouse Data Streaming: Enables data streaming by default, eliminating in-memory result collection.
Read-only SQL Interface: Disables DML (INSERT/UPDATE/DELETE) and DDL (CREATE/ALTER TABLE) queries for improved data source security.
Error Message Improvements: Improved the error messages for commonly encountered issues with data connectors.
Accelerated Tables: Supports NULL values across all data types and fixes schema conversion errors for consistent type handling.

Contributors

@ahirner
@y-f-u
@sgrebnov
@ewgenius
@Jeadie
@gloomweaver
@Sevenannn
@digadeesh
@phillipleblanc

What's Changed

Add schema types check for query result by @sgrebnov in https://github.com/spiceai/spiceai/pull/1212
helm chart for v0.12.0-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/1235
Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1232
Bump spiceai version to v0.12.1-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1239
Update ROADMAP.md - remove v0.12.0-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1241
Raise errors in InsertBuilder by @Jeadie in https://github.com/spiceai/spiceai/pull/1242
Update endgame template by @ewgenius in https://github.com/spiceai/spiceai/pull/1240
Add E2E tests for acceleration engines types support by @sgrebnov in https://github.com/spiceai/spiceai/pull/1218
Stream blocks to arrow by @gloomweaver in https://github.com/spiceai/spiceai/pull/1203
Update enhancement.md to include a checklist item have a release notes entry for each enhancement. by @digadeesh in https://github.com/spiceai/spiceai/pull/1245
arrowsqlgen data column conversion by @Sevenannn in https://github.com/spiceai/spiceai/pull/1230
Implement the Localhost Data Connector & fix DoPut by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1266
Update postgres parameter check by @Sevenannn in https://github.com/spiceai/spiceai/pull/1244
Record batch casting to fix SQLite data type issues by @y-f-u in https://github.com/spiceai/spiceai/pull/1261
typo fix on Decimal in postgres arrowsqlgen by @y-f-u in https://github.com/spiceai/spiceai/pull/1277
Move verifyschema to arrowtools by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1284
Support UUID and TimestampTZ type for Postgres as Data Connector by @ahirner & @y-f-u in https://github.com/spiceai/spiceai/pull/1276
Fix linter warnings by @ewgenius in https://github.com/spiceai/spiceai/pull/1286
Add Snowflake data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/1278
Add Snowflake login support (username and password) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1272
convert timestamp properly in sql gen by @y-f-u in https://github.com/spiceai/spiceai/pull/1291
Add if not exists clause to create statement on when creating a table using duckdb acceleration. by @digadeesh in https://github.com/spiceai/spiceai/pull/1290
Disable DML & DDL queries in the public SQL interface by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1294
Refactor duckdb to properly set access_mode for connection by @ewgenius in https://github.com/spiceai/spiceai/pull/1285
do not insert batch for sqlite and postgres if no records in the record batch by @y-f-u in https://github.com/spiceai/spiceai/pull/1293
Postgres - add custom error message for invalid error table by @ewgenius in https://github.com/spiceai/spiceai/pull/1295
SQLite/Accelerators handle null values by @gloomweaver in https://github.com/spiceai/spiceai/pull/1298
Add command to attach to running process by @gloomweaver in https://github.com/spiceai/spiceai/pull/1297
Use the GITHUB_TOKEN environment variable in the installation script, if available, to avoid rate limiting in CI workflows by @ewgenius in https://github.com/spiceai/spiceai/pull/1302
Fix unsupported SSL mode options for PostgreSQL connection string by @ewgenius in https://github.com/spiceai/spiceai/pull/1300
Add CLI cmd spice login spark by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1303
Check only the latest published release to avoid installing pre-release versions by @ewgenius in https://github.com/spiceai/spiceai/pull/1301
Postgres data connector - handle invalid host/port and username/password errors by @ewgenius in https://github.com/spiceai/spiceai/pull/1292
Fix the panic on bad clickhouse connection by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1306
Improve Snowflake Data Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/1296

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.12.0-alpha...v0.12.1-alpha

- Rust
Published by phillipleblanc about 2 years ago

https://github.com/spiceai/spiceai - v0.12-alpha

Spice v0.12-alpha (Apr 29, 2024)

The v0.12-alpha release introduces Clickhouse and Apache Spark data connectors, adds support for limiting refresh data periods for temporal datasets, and includes upgraded Spice Client SDKs compatible with Spice OSS.

Highlights

Clickhouse data connector: Use Clickhouse as a data source with the clickhouse: scheme.
Apache Spark Connect data connector: Use Apache Spark Connect connections as a data source using the spark: scheme.
Refresh data window: Limit accelerated dataset refreshes to a specified window, configured as a duration back from the current time, for faster and more efficient refreshes.
ODBC data connector: Use ODBC connections as a data source using the odbc: scheme. The ODBC data connector is currently optional and not included in default builds. It can be conditionally compiled using the odbc cargo feature when building from source.
Spice Client SDK Support: The official Spice SDKs have been upgraded with support for Spice OSS.

Breaking Changes

Refresh interval: The refresh_interval acceleration setting has been renamed to refresh_check_interval to make it clearer that it sets the check interval rather than the data interval.
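
For spicepods written before this release, the rename amounts to moving one key. The following is a minimal Python sketch, assuming the acceleration settings have already been parsed into a plain dict. The helper name `migrate_acceleration` and the sample values are hypothetical and not part of Spice.

```python
def migrate_acceleration(acceleration: dict) -> dict:
    """Return a copy with refresh_interval renamed to refresh_check_interval."""
    migrated = dict(acceleration)  # shallow copy; the caller's dict is left untouched
    if "refresh_interval" in migrated:
        old_value = migrated.pop("refresh_interval")
        # If the new key is already set explicitly, keep that value.
        migrated.setdefault("refresh_check_interval", old_value)
    return migrated


# Hypothetical acceleration settings as they might look before v0.12-alpha.
old = {"enabled": True, "refresh_interval": "10s"}
new = migrate_acceleration(old)
print(new)
```

The helper copies its input rather than mutating it, so the original settings stay available for comparison or rollback.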

Contributors

@phillipleblanc
@Jeadie
@ewgenius
@sgrebnov
@y-f-u
@lukekim
@digadeesh
@gloomweaver
@edmondop
@mach-kernel

New Contributors

Thanks to @mach-kernel who made their first contribution in https://github.com/spiceai/spiceai/pull/1204 by adding the ODBC data connector!

What's Changed

Update helm version by @Jeadie in https://github.com/spiceai/spiceai/pull/1167
Handle and trace errors in secret stores by @ewgenius in https://github.com/spiceai/spiceai/pull/1149
bump the release versions to 0.12.0 by @y-f-u in https://github.com/spiceai/spiceai/pull/1171
Don't fail acknowledgments flow if no changes detected by @ewgenius in https://github.com/spiceai/spiceai/pull/1170
Allow Spice CLI to control runtime installation on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1173
Allow SELECT count(*) for Sqlite Data Accelerator by @sgrebnov in https://github.com/spiceai/spiceai/pull/1166
add refresh_period param in acceleration by @y-f-u in https://github.com/spiceai/spiceai/pull/1180
Properly support Spark Connect filter pushdown by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1186
Avoid rate-limiting on arduino/setup-protoc@v3 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1189
Clickhouse DataConnector base implementation by @gloomweaver in https://github.com/spiceai/spiceai/pull/1168
rename refresh_interval to refresh_check_interval by @y-f-u in https://github.com/spiceai/spiceai/pull/1190
Fix timestamp & add support for Decimal to Databricks/Spark by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1194
Convert temporal column and refresh period to datafusion expr by @y-f-u in https://github.com/spiceai/spiceai/pull/1187
Hot reload accelerated table on dataset update by @ewgenius in https://github.com/spiceai/spiceai/pull/1195
Upgrade DataFusion to 37.1 & DuckDB to 10.2 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1200
Update version.txt for 0.11.2 release by @digadeesh in https://github.com/spiceai/spiceai/pull/1199
Clickhouse E2E by @gloomweaver in https://github.com/spiceai/spiceai/pull/1193
Clickhouse: fix darwin ci pipeline by @gloomweaver in https://github.com/spiceai/spiceai/pull/1201
Add table_type to show tables in Spice SQL & update next version to v0.12.0-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1206
print WARN if time_column does not exists in federated schema by @y-f-u in https://github.com/spiceai/spiceai/pull/1207
Add FallbackOnZeroResultsScanExec for executing an input ExecutionPlan and optionally falling back to a TableProvider.scan() if the input has zero results by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1196
Clickhouse refactor connection code and set secure option by @gloomweaver in https://github.com/spiceai/spiceai/pull/1198
E2E: reusable Spice installation by @sgrebnov in https://github.com/spiceai/spiceai/pull/1205
Clickhouse blocktoarrow unit test by @gloomweaver in https://github.com/spiceai/spiceai/pull/1202
rename refresh_period to refresh_data_period by @y-f-u in https://github.com/spiceai/spiceai/pull/1210
Refactor E2E tests: dataset verification and PostgreSQL installation by @sgrebnov in https://github.com/spiceai/spiceai/pull/1211
Add BI dashboard acceleration video to README.md by @lukekim in https://github.com/spiceai/spiceai/pull/1219
Improve clarity and consistency of output messages by @lukekim in https://github.com/spiceai/spiceai/pull/1214
Update ROADMAP Apr 29, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1220
Stand-alone Spark Connect: Isolate Spark Connect from Databricks Connect to make it reusable by @edmondop in https://github.com/spiceai/spiceai/pull/1213
Optimize build time in dev mode by @gloomweaver in https://github.com/spiceai/spiceai/pull/1215
Feature: Support ODBC reads using unixodbc by @mach-kernel in https://github.com/spiceai/spiceai/pull/1204
Use non-fork deltalake by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1223
Support Date32 & Date64 in arrowsqlgen by @Jeadie in https://github.com/spiceai/spiceai/pull/1217
Update REPL output to be consistent with the latest Spice version by @sgrebnov in https://github.com/spiceai/spiceai/pull/1231
rename refresh_data_period to refresh_data_window by @y-f-u in https://github.com/spiceai/spiceai/pull/1233
Update README.md to include ODBC, Spark Connect, and Clickhouse data connectors in support data connector matrix. by @digadeesh in https://github.com/spiceai/spiceai/pull/1234
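
The FallbackOnZeroResultsScanExec entry in the list above describes running an input plan and falling back to a second scan only when the input returns nothing. The control flow can be sketched in Python as follows. This is an illustration of the idea only, not the Rust implementation: `scan_with_fallback`, the list inputs, and the row values are all hypothetical stand-ins.

```python
from typing import Callable, Iterable, Iterator


def scan_with_fallback(primary: Iterable, fallback: Callable[[], Iterable]) -> Iterator:
    """Yield rows from primary; if it yields nothing, yield rows from fallback()."""
    produced = False
    for row in primary:
        produced = True
        yield row
    if not produced:
        # The fallback scan is only invoked when the primary produced zero rows.
        yield from fallback()


# An empty primary result triggers the fallback; a non-empty one does not.
empty_case = list(scan_with_fallback([], lambda: ["row-from-fallback"]))
hit_case = list(scan_with_fallback(["row-from-input"], lambda: ["row-from-fallback"]))
print(empty_case, hit_case)
```

Passing the fallback as a callable keeps it lazy, so the second scan costs nothing when the first one has results.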

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.11.1-alpha...v0.12.0-alpha

- Rust
Published by ewgenius about 2 years ago

https://github.com/spiceai/spiceai - 0.11.1-alpha

Spice v0.11.1-alpha (Apr 22, 2024)

The v0.11.1-alpha release introduces retention policies for accelerated datasets, native Windows installation support, and integration of catalog and schema settings for the Databricks Spark connector. Several bugs have also been fixed for improved stability.

Highlights

Retention Policies for Accelerated Datasets: Automatic eviction of data from accelerated time-series datasets when a specified temporal column exceeds the retention period, optimizing resource utilization.
Windows Installation Support: Native Windows installation support, including upgrades.
Databricks Spark Connect Catalog and Schema Settings: Improved translation between DataFusion and Spark, providing better Spark Catalog support.
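
The retention behavior described above can be pictured as a periodic filter over the temporal column. The following is a minimal Python sketch, assuming rows are dicts and the retention period is a `timedelta`. The function name `evict_expired` and the row layout are illustrative assumptions and do not reflect how the runtime implements eviction.

```python
from datetime import datetime, timedelta, timezone


def evict_expired(rows, time_column, retention_period, now=None):
    """Keep only rows whose timestamp falls within the retention period."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - retention_period
    return [row for row in rows if row[time_column] >= cutoff]


now = datetime(2024, 4, 22, tzinfo=timezone.utc)
rows = [
    {"id": 1, "ts": now - timedelta(days=10)},  # older than the period: evicted
    {"id": 2, "ts": now - timedelta(hours=1)},  # within the period: kept
]
kept = evict_expired(rows, "ts", timedelta(days=7), now=now)
print([row["id"] for row in kept])
```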

Contributors

@phillipleblanc
@Jeadie
@ewgenius
@sgrebnov
@y-f-u
@lukekim
@digadeesh
@Sevenannn
@gloomweaver

New in this release

PowerShell script to install Spice on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1128
Support catalog and schema in Databricks Spark Connect by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1137
Retention handlers by @y-f-u in https://github.com/spiceai/spiceai/pull/1096

What's Changed

Update CONTRIBUTING with new dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1121
Fix the Helm tag by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1122
Upgrade Spice version to 0.11.1 by @sgrebnov in https://github.com/spiceai/spiceai/pull/1123
Remove 0.11 from roadmap by @ewgenius in https://github.com/spiceai/spiceai/pull/1124
Include refresh_sql and manual refresh to e2e tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/1125
Respect executables file extension on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1130
Use quoted strings when performing federated SQL queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1129
Make Windows artifact names consistent with other platforms by @sgrebnov in https://github.com/spiceai/spiceai/pull/1132
Make Windows installation less verbose by @sgrebnov in https://github.com/spiceai/spiceai/pull/1138
Document Windows installation and add test by @sgrebnov in https://github.com/spiceai/spiceai/pull/1134
Use transaction for DuckDB Table Writer by @Sevenannn in https://github.com/spiceai/spiceai/pull/1135
Update Windows installation script url by @sgrebnov in https://github.com/spiceai/spiceai/pull/1143
Update roadmap Apr 18, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1142
Test connection when new connection pool created by @ewgenius in https://github.com/spiceai/spiceai/pull/1126
Enable clippy::clone_on_ref_ptr by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1146
Allow only alphanumeric dataset names when using spice dataset configure by @ewgenius in https://github.com/spiceai/spiceai/pull/1140
Extend PR check to build with no default features, and each individual feature by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1156
Bump rustls from 0.21.10 to 0.21.11 by @dependabot in https://github.com/spiceai/spiceai/pull/1150
Serde rule for ISO8601 time format by @y-f-u in https://github.com/spiceai/spiceai/pull/1151
Add static linking for vcruntime dependencies on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1152
Use clearer retention param key - retention_check_enabled instead by @y-f-u in https://github.com/spiceai/spiceai/pull/1158
spice upgrade on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1155

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.11.0-alpha...v0.11.1-alpha

- Rust
Published by y-f-u about 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.11.0-alpha

The Spice v0.11.0-alpha release significantly improves the Databricks data connector with Databricks Connect (Spark Connect) support, adds the DuckDB data connector, and adds the AWS Secrets Manager secret store. In addition, enhanced control over accelerated dataset refreshes, improved SSL security for MySQL and PostgreSQL connections, and overall stability improvements have been added.

Highlights in v0.11.0-alpha

DuckDB data connector: Use DuckDB databases or connections as a data source.

AWS Secrets Manager Secret Store: Use AWS Secrets Manager as a secret store.

Custom Refresh SQL: Specify a custom SQL query for dataset refresh using refresh_sql.

Dataset Refresh API: Trigger a dataset refresh using the new CLI command spice refresh or via API.

Expanded SSL support for Postgres: SSL mode now supports the disable, require, prefer, verify-ca, and verify-full options, with the default mode changed to require. Added the pg_sslrootcert parameter for setting a custom root certificate. The pg_insecure parameter is now redundant and no longer supported.

Databricks Connect: Choose between using Spark Connect or Delta Lake when using the Databricks data connector for improved performance.

Internal architecture refactor: The internal architecture of spiced was refactored to simplify the creation of data components and to improve alignment with DataFusion concepts.

New Contributors

@edmondop made their first contribution in https://github.com/spiceai/spiceai/pull/1110

Contributors

@phillipleblanc
@Jeadie
@ewgenius
@sgrebnov
@y-f-u
@lukekim
@digadeesh
@Sevenannn
@gloomweaver
@ahirner

New in this release

Fixes MySQL NULL values by @gloomweaver in https://github.com/spiceai/spiceai/pull/1067
Fixes PostgreSQL NULL values for NUMERIC by @gloomweaver in https://github.com/spiceai/spiceai/pull/1068
Adds Custom Refresh SQL support by @lukekim and @phillipleblanc in https://github.com/spiceai/spiceai/pull/1073
Adds DuckDB data connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/1085
Adds AWS Secrets Manager secret store by @sgrebnov in https://github.com/spiceai/spiceai/pull/1063, https://github.com/spiceai/spiceai/pull/1064
Adds Dataset refresh API by @sgrebnov in https://github.com/spiceai/spiceai/pull/1075, https://github.com/spiceai/spiceai/pull/1078, https://github.com/spiceai/spiceai/pull/1083
Adds spice refresh CLI command for dataset refresh by @sgrebnov in https://github.com/spiceai/spiceai/pull/1112
Adds TEXT and DECIMAL type support and proper NULL handling for MySQL by @gloomweaver in https://github.com/spiceai/spiceai/pull/1067
Adds DATE and TINYINT type support for MySQL by @ewgenius in https://github.com/spiceai/spiceai/pull/1065
Adds ssl_rootcert_path parameter for MySQL data connector by @ewgenius in https://github.com/spiceai/spiceai/pull/1079
Adds LargeUtf8 support and explicitly passing the schema to data accelerator SqlTable by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1077
Adds Ability to configure data retention for accelerated datasets by @y-f-u in https://github.com/spiceai/spiceai/issues/1086
Adds Custom SSL certificates for PostgreSQL data connector by @ewgenius in https://github.com/spiceai/spiceai/pull/1081
Adds Conditional compile for Dremio by @ahirner in https://github.com/spiceai/spiceai/pull/1100
Adds Ability for Databricks connector to use spark-connect-rs as the mechanism to execute queries against Databricks by @edmondop in https://github.com/spiceai/spiceai/pull/1110
Adds Ability to choose between Spark Connect and Delta Lake implementation for Databricks by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1115/files
Updates Databricks login parameters by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1113
Updates Architecture to simplify data components development by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1040
Updates Improved readability of GitHub Actions test job names by @lukekim in https://github.com/spiceai/spiceai/pull/1071
Updates Upgrade Arrow, DataFusion, Tonic dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1097
Updates Handling non-string spicepod params by @ewgenius in https://github.com/spiceai/spiceai/pull/1098
Updates Optional features compile: duckdb, databricks by @ahirner in https://github.com/spiceai/spiceai/pull/1100
Updates Helm version to 0.1.3 by @Jeadie in https://github.com/spiceai/spiceai/pull/1120
Removes pg_insecure parameter support from Postgres by @ewgenius in https://github.com/spiceai/spiceai/pull/1081

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.10.2-alpha...v0.11.0-alpha

- Rust
Published by sgrebnov about 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.10.2-alpha

The v0.10.2-alpha release adds the MySQL data connector and makes external data connections more robust on initialization.

Highlights in v0.10.2-alpha

MySQL data connector: Connect to any MySQL server, including SSL support.
Data connections verified at initialization: Verify endpoints and authorization for external data connections (e.g. databricks, spice.ai) at initialization.

New Contributors

@rthomas made their first contribution in https://github.com/spiceai/spiceai/pull/1022
@ahirner made their first contribution in https://github.com/spiceai/spiceai/pull/1029
@gloomweaver made their first contribution in https://github.com/spiceai/spiceai/pull/1004

Contributors

@phillipleblanc
@y-f-u
@ewgenius
@sgrebnov
@lukekim
@digadeesh
@Jeadie

New in this release

Adds MySQL data connector by @gloomweaver in https://github.com/spiceai/spiceai/pull/1004
Fixes show tables; parsing in the Spice SQL REPL.
Adds data connector verification at initialization
- For Dremio by @sgrebnov in https://github.com/spiceai/spiceai/pull/1017
- For Databricks by @sgrebnov in https://github.com/spiceai/spiceai/pull/1019
- For Spice.ai by @sgrebnov in https://github.com/spiceai/spiceai/pull/1020
Fixes Ensures unit and doc tests compile and run by @rthomas in https://github.com/spiceai/spiceai/pull/1022
Improves Helm chart + Grafana dashboard by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1030
Fixes Makes data connectors optional features by @ahirner in https://github.com/spiceai/spiceai/pull/1029
Fixes SpiceAI E2E for external contributors in GitHub Actions by @ewgenius in https://github.com/spiceai/spiceai/pull/1023
Fixes remove hardcoded lookback_size (& improve SpiceAI's ModelSource) by @Jeadie in https://github.com/spiceai/spiceai/pull/1016

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.10.1-alpha...v0.10.2-alpha

- Rust
Published by Jeadie about 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.10.1-alpha

The v0.10.1-alpha release focuses on stability, bug fixes, and usability by improving error messages when using SQLite data accelerators, improving the PostgreSQL support, and adding a basic Helm chart.

Highlights in v0.10.1-alpha

Improved PostgreSQL support for Data Connectors: TLS is now supported with PostgreSQL Data Connectors and there is improved VARCHAR and BPCHAR conversions through Spice.

Improved Error messages: Simplified error messages from Spice when propagating errors from Data Connectors and Accelerator Engines.

Spice Pods Command: The spice pods command can give you quick statistics about models, dependencies, and datasets that are loaded by the Spice runtime.

Contributors

@phillipleblanc
@mitchdevenport
@ewgenius
@sgrebnov
@lukekim
@digadeesh

New in this release

Adds Basic Helm Chart for spiceai (https://github.com/spiceai/spiceai/pull/1002)
Adds Support for spice login in environments with no browser. (https://github.com/spiceai/spiceai/pull/994)
Adds TLS support in Postgres connector. (https://github.com/spiceai/spiceai/pull/970)
Fixes Improve Postgres VARCHAR and BPCHAR conversion. (https://github.com/spiceai/spiceai/pull/993)
Fixes spice pods returning incorrect counts. (https://github.com/spiceai/spiceai/pull/998)
Fixes Return friendly error messages for unsupported types in sqlite. (https://github.com/spiceai/spiceai/pull/982)
Fixes Pass Tonic errors when receiving errors from dependencies. (https://github.com/spiceai/spiceai/pull/995)

- Rust
Published by digadeesh about 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.10-alpha

Announcing the release of Spice.ai v0.10-alpha! 🎉

The Spice.ai v0.10-alpha release focused on additions and updates to improve stability, usability, and the overall Spice developer experience.

Highlights in v0.10-alpha

Public Bucket Support for S3 Data Connector: The S3 Data Connector now supports public buckets in addition to buckets requiring an access id and key.

JDBC-Client Connectivity: Improved connectivity for JDBC clients, like Tableau.

User Experience Improvements:

Friendlier error messages across the board to make debugging and development better.
Added a spice login postgres command, streamlining the process for connecting to PostgreSQL databases.
Added PostgreSQL connection verification and connection string support, enhancing usability for PostgreSQL users.

Grafana Dashboard: A standard Grafana dashboard is now available, improving the ability to monitor Spice deployments.

Contributors

@phillipleblanc
@mitchdevenport
@Jeadie
@ewgenius
@sgrebnov
@y-f-u
@lukekim
@digadeesh

New in this release

Fixes Gracefully handle Arrow Flight DoExchange connection resets
Adds Grafana Dashboard
Adds Flight SQL CommandGetTableTypes Command support (improves JDBC-client connectivity)
Adds Friendlier error messages
Adds spice login postgres command
Adds PostgreSQL connection verification
Adds PostgreSQL connection string support
Adds Linux aarch64 build
Updates Improves spice status with dataset metrics
Updates CLI REPL improved show tables output
Updates CLI REPL limit output to 500 rows
Updates Improved README.md with architecture diagram updates
Updates Improved CI run time.
Updates Use macOS hosted Actions runner

- Rust
Published by phillipleblanc about 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.9.1-alpha

The v0.9.1 release focused on stability, bug fixes, and usability by adding spice CLI commands for listing Spicepods (spice pods), Models (spice models), Datasets (spice datasets), and improved status (spice status) details. In addition, the Arrow Flight SQL (flightsql) data connector and SQLite (sqlite) data store were added.

Highlights in v0.9.1-alpha

FlightSQL data connector: Arrow Flight SQL can now be used as a connector for federated SQL query.

SQLite data backend: SQLite can now be used as a data store for acceleration.

Contributors

@phillipleblanc
@mitchdevenport
@Jeadie
@ewgenius
@sgrebnov
@y-f-u
@lukekim

New in this release

Adds FlightSQL data connector (flightsql).
Adds SQLite data store, supports both in-memory and file based (sqlite).
Adds support for date, varchar, bpchar, and primitive list types for the PostgreSQL data connector and data store.
Adds spice pods, spice status, spice datasets, and spice models CLI commands.
Adds GET /v1/spicepods API for listing loaded Spicepods.
Adds spiced Docker CI build and release.
Adds E2E tests for release installation and local acceleration.
Adds E2E tests and instructions to run basic TPC-H benchmark tests.
Adds linux/arm64 binary build.
Fixes spice sql REPL panics when query result is too large. (https://github.com/spiceai/spiceai/pull/875)
Fixes --access-secret in spice s3 login. (https://github.com/spiceai/spiceai/pull/894)
Fixes version check upgrade logic.

- Rust
Published by y-f-u about 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.9-alpha

The v0.9 release adds several data connectors including the Spice data connector for the ability to connect to other spiced instances. Improved observability for spiced has been added with the new /metrics endpoint for monitoring deployed instances.

Highlights in v0.9-alpha

Arrow Flight SQL endpoint: The Arrow Flight endpoint now supports Flight SQL, including JDBC, ODBC, and ADBC, enabling database clients like DBeaver or BI applications like Tableau to connect to and query the Spice runtime.

Spice.ai data connector: Use other Spice runtime instances as data connectors for federated SQL query across Spice deployments and for chaining Spice runtimes.

Keyring secret store: Use the operating system native credential store, like macOS keychain for storing secrets used by spiced.

PostgreSQL data connector: PostgreSQL can now be used as both a data store for acceleration and as a connector for federated SQL query.

Databricks data connector: Databricks as a connector for federated SQL query across Delta Lake tables.

S3 data connector: S3 as a connector for federated SQL query across Parquet files stored in S3.

Metrics endpoint: Added new /metrics endpoint for spiced observability and monitoring with the following metrics:

- spiced_runtime_http_server_start counter
- spiced_runtime_flight_server_start counter
- datasets_count gauge
- load_dataset summary
- load_secrets summary
- datasets/load_error counter
- datasets/count counter
- models/load_error counter
- models/count counter

Contributors

@phillipleblanc
@mitchdevenport
@Jeadie
@ewgenius
@sgrebnov
@Sevenannn
@y-f-u
@digadeesh
@lukekim

New in this release

Adds Keyring secret store (keyring).
Adds PostgreSQL data connector (postgres).
Adds Spice.ai data connector (spiceai).
Adds Arrow Flight SQL (JDBC/ODBC/ADBC) support.
Adds Databricks data connector (databricks) - Delta Lake support.
Adds S3 data connector (s3) - Parquet support.
Adds /v1/models API.
Adds /v1/status API.
Adds /metrics API.

- Rust
Published by sgrebnov about 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.8-alpha

Announcing the release of Spice v0.8-alpha! 🏹

This is a minor release that builds on the new Rust-based runtime, adding stability and a preview of new features for the first major release.

Highlights in v0.8-alpha

Secrets management: The Spice 0.8 runtime can now configure and retrieve secrets from local environment variables and from a Kubernetes cluster.

Data tables can be locally accelerated using PostgreSQL

New in this release

Adds Secrets management in local environment variables and Kubernetes clusters.
Adds (Preview) PostgreSQL as a data table acceleration engine.

- Rust
Published by ewgenius about 2 years ago

https://github.com/spiceai/spiceai - Spice v0.7-alpha

Announcing the release of Spice v0.7-alpha! 🏹

Spice v0.7-alpha is an all new implementation of Spice written in Rust. The Spice v0.7 runtime provides developers with a unified SQL query interface to locally accelerate and query data tables sourced from any database, data warehouse, or data lake.

Learn more and get started in minutes with the updated Quickstart in the repository README!

Highlights in v0.7-alpha

DataFusion SQL Query Engine: Spice v0.7 leverages the Apache DataFusion query engine to provide very fast, high-quality SQL query across one or more local or remote data sources.

Data tables can be locally accelerated using Apache Arrow in-memory or by DuckDB.

New in this release

Adds runtime rewritten in Rust for high-performance.
Adds Apache DataFusion SQL query engine.
Adds The Spice.ai platform as a data source.
Adds Dremio as a data source.
Adds OpenTelemetry (OTEL) collector.
Adds local data table acceleration.
Adds DuckDB file or in-memory as a data table acceleration engine.
Adds In-memory Apache Arrow as a data table acceleration engine.
Removes the built-in AI training engine; now cloud-based and provided by the Spice.ai platform.
Removes the built-in dashboard and web-interface; now cloud-based and provided by the Spice.ai platform.

- Rust
Published by phillipleblanc over 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.6.2-alpha

Announcing the release of Spice.ai v0.6.2-alpha! 🐞

This release fixes a bug in the CLI that prevented users from adding Spicepods from spicerack.org

- Rust
Published by phillipleblanc almost 3 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.6.1-alpha

Announcing the release of Spice.ai v0.6.1-alpha! 🌶

Building upon the Apache Arrow support in v0.6-alpha, Spice.ai now includes new Apache Arrow data processor and Apache Arrow Flight data connector components! Together, these create a high-performance bulk-data transport directly into the Spice.ai ML engine. Coupled with big data systems from the Apache Arrow ecosystem like Hive, Drill, Spark, Snowflake, and BigQuery, it's now easier than ever to combine big data with Spice.ai.

And we're also excited to announce the release of Spice.xyz! 🎉

Spice.xyz is data and AI infrastructure for web3. It’s web3 data made easy. Insanely fast and purpose designed for applications and ML.

Spice.xyz delivers data in Apache Arrow format, over high-performance Apache Arrow Flight APIs to your application, notebook, ML pipeline, and of course through these new data components, to the Spice.ai runtime.

Read the announcement post at blog.spice.ai.

New in this release

Adds Apache Arrow Data Processor
Adds Apache Arrow Flight Data Connector

Now built with Go 1.18.

Dependency updates

Updates to React 18
Updates to CRA 5
Updates to Glide DataGrid 4
Updates to SWR 1.2
Updates to TypeScript 4.6

- Rust
Published by lukekim about 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.6-alpha

Announcing the release of Spice.ai v0.6-alpha! 🏹

Spice.ai now scales to datasets 10-100x larger, enabling new classes of use cases and applications! 🚀 We've completely rebuilt Spice.ai's data processing and transport upon Apache Arrow, a high-performance platform that uses an in-memory columnar format. Spice.ai joins other major projects including Apache Spark, pandas, and InfluxDB in being powered by Apache Arrow. This also paves the way for high-performance data connections to the Spice.ai runtime using Apache Arrow Flight and import/export of data using Apache Parquet. We're incredibly excited about the potential this architecture has for building intelligent applications on top of a high-performance transport between application data sources and the Spice.ai AI engine.

Highlights in v0.6-alpha

Massive improvement in data loading performance and dataset scale

From data connectors, to REST API, to AI engine, we've now rebuilt Spice.ai's data processing and transport on the Apache Arrow project. Specifically, using the Apache Arrow for Go implementation. Many thanks to Matt Topol for his contributions to the project and guidance on using it.

This release changes the transport between the Spice.ai runtime and the AI Engine from sending text CSV over gRPC to sending Apache Arrow Records over IPC (Unix sockets).

This is a breaking change to the Data Processor interface, as it now uses arrow.Record instead of Observation.

Benchmarking v0.6

Before v0.6, Spice.ai would not scale into the hundreds of thousands of rows.

| Format | Row Number | Data Size | Process Time | Load Time | Transport time | Memory Usage |
| ------ | ---------- | --------- | ------------ | --------- | -------------- | ------------ |
| csv | 2,000 | 163.15KiB | 3.0005s | 0.0000s | 0.0100s | 423.754MiB |
| csv | 20,000 | 1.61MiB | 2.9765s | 0.0000s | 0.0938s | 479.644MiB |
| csv | 200,000 | 16.31MiB | 0.2778s | 0.0000s | NA (error) | 0.000MiB |
| csv | 2,000,000 | 164.97MiB | 0.2573s | 0.0050s | NA (error) | 0.000MiB |
| json | 2,000 | 301.79KiB | 3.0261s | 0.0000s | 0.0282s | 422.135MiB |
| json | 20,000 | 2.97MiB | 2.9020s | 0.0000s | 0.2541s | 459.138MiB |
| json | 200,000 | 29.85MiB | 0.2782s | 0.0010s | NA (error) | 0.000MiB |
| json | 2,000,000 | 300.39MiB | 0.3353s | 0.0080s | NA (error) | 0.000MiB |

After building on Arrow, Spice.ai now easily scales beyond millions of rows.

| Format | Row Number | Data Size | Process Time | Load Time | Transport time | Memory Usage |
| ------ | ---------- | --------- | ------------ | --------- | -------------- | ------------ |
| csv | 2,000 | 163.14KiB | 2.8281s | 0.0000s | 0.0194s | 439.580MiB |
| csv | 20,000 | 1.61MiB | 2.7297s | 0.0000s | 0.0658s | 461.836MiB |
| csv | 200,000 | 16.30MiB | 2.8072s | 0.0020s | 0.4830s | 639.763MiB |
| csv | 2,000,000 | 164.97MiB | 2.8707s | 0.0400s | 4.2680s | 1897.738MiB |
| json | 2,000 | 301.80KiB | 2.7275s | 0.0000s | 0.0367s | 436.238MiB |
| json | 20,000 | 2.97MiB | 2.8284s | 0.0000s | 0.2334s | 473.550MiB |
| json | 200,000 | 29.85MiB | 2.8862s | 0.0100s | 1.7725s | 824.089MiB |
| json | 2,000,000 | 300.39MiB | 2.7437s | 0.0920s | 16.5743s | 4044.118MiB |
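
To put the post-Arrow transport times in perspective, they can be converted to rows per second. The short Python sketch below does that arithmetic using the row counts and transport times copied from the table above. The variable and function names are illustrative, and the derived rates are not figures reported in the release.

```python
# (format, rows, transport time in seconds), copied from the post-Arrow table.
results = [
    ("csv", 2_000, 0.0194),
    ("csv", 20_000, 0.0658),
    ("csv", 200_000, 0.4830),
    ("csv", 2_000_000, 4.2680),
    ("json", 2_000, 0.0367),
    ("json", 20_000, 0.2334),
    ("json", 200_000, 1.7725),
    ("json", 2_000_000, 16.5743),
]


def rows_per_second(rows: int, seconds: float) -> float:
    """Transport throughput for one benchmark row."""
    return rows / seconds


throughput = {(fmt, rows): rows_per_second(rows, secs) for fmt, rows, secs in results}

for (fmt, rows), rate in throughput.items():
    print(f"{fmt:>4} {rows:>9,} rows: {rate:>10,.0f} rows/s")
```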

New in this release

Adds Apache Arrow data processing and transport.
Fixes TensorBoard logging and monitoring when using GitHub Codespaces and Docker.
Adds Polling HTTP Data Connector

Dependency updates

Updates to numpy 1.21.0
Updates to marked 3.0.8
Updates to follow-redirects 1.14.7
Updates nanoid to 3.2.0

- Rust
Published by phillipleblanc over 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.5.1-alpha

Announcing the release of Spice.ai v0.5.1-alpha! 📈

This minor release builds upon v0.5-alpha adding the ability to start training from the dashboard plus support for monitoring training runs with TensorBoard.

Highlights in v0.5.1-alpha

Start training from dashboard

A "Start Training" button has been added to the pod page on the dashboard so that you can easily start training runs from that context.

Training runs can now be started by:

Modifications to the Spicepod YAML file.
The spice train command.
The "Start Training" dashboard button.
POST API calls to /api/v0.1/pods/{pod name}/train
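
As a rough illustration of the last option, the sketch below builds the POST request with Python's standard library. The base URL `http://localhost:8000` matches the API examples elsewhere in these notes; the empty request body and the helper name `build_train_request` are assumptions.

```python
import urllib.request


def build_train_request(
    pod_name: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts a training run."""
    url = f"{base_url}/api/v0.1/pods/{pod_name}/train"
    # Assumption: no request body is required to start a run.
    return urllib.request.Request(url, data=b"", method="POST")


request = build_train_request("trader")
print(request.get_method(), request.full_url)

# To send it against a running Spice.ai runtime:
# urllib.request.urlopen(request)
```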

Video: https://user-images.githubusercontent.com/80174/146122241-f8073266-ead6-4628-8563-93e98d74e9f0.mov

TensorBoard monitoring

TensorBoard monitoring is now supported when using DQL (default) or the new SACD learning algorithm that was announced in v0.5-alpha.

When enabled, TensorBoard logs will automatically be collected and an "Open TensorBoard" button will be shown on the pod page in the dashboard.

Logging can be enabled at the pod level with the training_loggers pod param or per training run with the CLI --training-loggers argument.

Video: https://user-images.githubusercontent.com/80174/146382503-2bb2570b-5111-4de0-9b80-a1dc4a5dcc35.mov

Support for VPG will be added in v0.6-alpha. The design allows for additional loggers to be added in the future. Let us know what you'd like to see!

New in this release

Adds a start training button on the dashboard pod page.
Adds TensorBoard logging and monitoring when using DQL and SACD learning algorithms.

Dependency updates

Updates to Tailwind 3.0.6
Updates to Glide Data Grid 3.2.1

- Rust
Published by phillipleblanc over 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.5-alpha

We are excited to announce the release of Spice.ai v0.5-alpha! 🥇

Highlights include a new learning algorithm called "Soft Actor-Critic" (SAC), fixes to the behavior of spice upgrade, and a more consistent authoring experience for reward functions.

If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.

Highlights in v0.5-alpha

Soft Actor-Critic (Discrete) (SAC) Learning Algorithm

The addition of the Soft Actor-Critic (Discrete) (SAC) learning algorithm is a significant improvement to the power of the AI engine. It is not set as the default algorithm yet, so to start using it pass the --learning-algorithm sacd parameter to spice train. We'd love to get your feedback on how it's working!

Consistent reward authoring experience

With the addition of the reward function files that allow you to edit your reward function in a Python file, the behavior of starting a new training session by editing the reward function code was lost. With this release, that behavior is restored.

In addition, there is a breaking change to the variables used to access the observation state and interpretations. This change was made to better reflect the purpose of the variables and make them easier to work with in Python.

| Previous (Type) | New (Type) |
| ----------------------------------- | -------------------------------------- |
| prev_state (SimpleNamespace) | current_state (dict) |
| prev_state.interpretations (list) | current_state_interpretations (list) |
| new_state (SimpleNamespace) | next_state (dict) |
| new_state.interpretations (list) | next_state_interpretations (list) |
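
The following sketch shows how the same reward calculation might look before and after the rename. The variable names and types come from the table above; the `price` field is a hypothetical measurement used only for illustration.

```python
from types import SimpleNamespace

# Before v0.5: states were SimpleNamespace objects, and interpretations
# were an attribute of each state.
prev_state = SimpleNamespace(price=100.0, interpretations=[])
new_state = SimpleNamespace(price=105.0, interpretations=[])
old_reward = new_state.price - prev_state.price

# From v0.5: states are plain dicts, and interpretations are separate lists.
current_state = {"price": 100.0}
current_state_interpretations = []
next_state = {"price": 105.0}
next_state_interpretations = []
new_reward = next_state["price"] - current_state["price"]

print(old_reward, new_reward)
```

In practice the migration is mostly a matter of replacing attribute access (`state.field`) with key access (`state["field"]`) and reading interpretations from their own variables.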

Improved `spice upgrade` behavior

The Spice.ai CLI will no longer recommend "upgrading" to an older version. An issue was also fixed where trying to upgrade the Spice.ai CLI using spice upgrade on Linux would return an error.
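
A minimal sketch of the kind of check that avoids recommending an older version is shown below. It is not the CLI's actual implementation; the function names are hypothetical, and pre-release suffixes such as `-alpha` are deliberately ignored.

```python
import re


def parse_version(tag: str) -> tuple:
    """Turn a tag like 'v0.4.1-alpha' into (0, 4, 1); a missing patch is 0."""
    match = re.match(r"v?(\d+)\.(\d+)(?:\.(\d+))?", tag)
    if match is None:
        raise ValueError(f"unrecognized version tag: {tag!r}")
    return tuple(int(part) if part else 0 for part in match.groups())


def should_recommend_upgrade(current: str, latest: str) -> bool:
    """Recommend an upgrade only when the latest release is strictly newer."""
    return parse_version(latest) > parse_version(current)


print(should_recommend_upgrade("v0.4.1-alpha", "v0.5-alpha"))
print(should_recommend_upgrade("v0.5-alpha", "v0.4.1-alpha"))
```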

New in this release

Adds a new learning algorithm called "Soft Actor-Critic (Discrete)" (SAC).
Updates the reward function parameters for the YAML code blocks from prev_state and new_state to current_state and next_state to be consistent with the reward function files.
Fixes an issue where editing a reward function file would not automatically trigger training.
Fixes the normalization of values for the Deep-Q Learning algorithm to handle larger values.
Fixes an issue where the Spice.ai CLI would not upgrade on Linux with the spice upgrade command.
Fixes an issue where the Spice.ai CLI would recommend an "upgrade" to an older version.

- Rust
Published by phillipleblanc over 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.4.1-alpha

Announcing the release of Spice.ai v0.4.1-alpha! ✅

This point release focuses on fixes and improvements to v0.4-alpha. Highlights include AI engine performance improvements, updates to the dashboard observations data grid, notification of new CLI versions, and several bug fixes.

A special acknowledgment to @Adm28, who added the CLI upgrade detection and prompt, which notifies users of new CLI versions and prompts to upgrade.

Highlights in v0.4.1-alpha

AI engine performance improvements

Overall training performance has been improved up to 13% by removing a lock in the AI engine.

In versions before v0.4.1-alpha, performance was especially impacted when streaming new data during a training run.

Dashboard Observations Datagrid

The dashboard observations datagrid now automatically resizes to the window width, and headers are easier to read, with automatic grouping into dataspaces. In addition, column widths are also resizable.

CLI version detection and upgrade prompt

When run, the Spice.ai CLI now automatically checks for new CLI versions, at most once a day.

If it detects a new version, it will print a notification to the console on spice version, spice run or spice add commands prompting the user to upgrade using the new spice upgrade command.

New in this release

Adds automatic resizing of the observations datagrid.
Adds header group by dataspace to the observations datagrid.
Adds CLI version detection and prompt for upgrade on version, run, and add commands.
Adds support for parsing hex-encoded times and measurements. Use the time_format of hex or prefix with 0x.
Updates AI engine with improved training performance.
Updates Go and NPM dependencies.
Fixes detection of Spicepods in the Spicepods directory, and a resulting error when loading a non-Spicepod file.
Fixes a potential "zip slip" security issue.
Fixes an issue where the AI engine may not gracefully shutdown.
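
The hex-encoded time support listed above can be pictured with the sketch below. It only illustrates the two triggers named in the notes (a `time_format` of `hex`, or a `0x` prefix); the function name and the decimal fallback are assumptions, and measurements are not covered.

```python
def parse_time(raw: str, time_format: str = "") -> int:
    """Parse a Unix timestamp given as decimal or hex-encoded text."""
    text = raw.strip()
    if time_format == "hex" or text.lower().startswith("0x"):
        # int() with base 16 accepts both '5fb57d40' and '0x5fb57d40'.
        return int(text, 16)
    return int(text)


print(parse_time("0x5fb57d40"))
print(parse_time("5fb57d40", time_format="hex"))
print(parse_time("1605729600"))
```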

- Rust
Published by lukekim over 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.4-alpha

We are excited to announce the release of Spice.ai v0.4-alpha! 🏄‍♂️

Highlights include support for authoring reward functions in a code file, the ability to specify the time of recommendation, and ingestion support for transaction/correlation ids. Authoring reward functions in a code file is a significant improvement to the developer experience over specifying functions inline in the YAML manifest, and we are looking forward to your feedback on it!

If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.

Highlights in v0.4-alpha

Upgrade using `spice upgrade`

The spice upgrade command was added in the v0.3.1-alpha release, so you can now upgrade from v0.3.1 to v0.4 by simply running spice upgrade in your terminal. Special thanks to community member @Adm28 for contributing this feature!

Reward Function Files

In addition to defining reward code inline, it is now possible to author reward code in functions in a separate Python file.

The reward function file path is defined by the reward_funcs property.

A function defined in the code file is mapped to an action by authoring its name in the with property of the relevant reward.

Example:

```yaml
training:
  reward_funcs: my_reward.py
  rewards:
    - reward: buy
      with: buy_reward
    - reward: sell
      with: sell_reward
    - reward: hold
      with: hold_reward
```

Learn more in the documentation: docs.spiceai.org/concepts/rewards/external
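
For orientation, a reward function file matching the manifest example might look like the sketch below. The function names come from the example; the parameter list follows the variable names documented for reward functions in later releases and is an assumption here, as is the `price` field. Consult the linked documentation for the authoritative signature.

```python
# my_reward.py (illustrative sketch, not the official template)


def buy_reward(current_state, current_state_interpretations,
               next_state, next_state_interpretations):
    """Reward buying when the (hypothetical) price rises afterwards."""
    return next_state["price"] - current_state["price"]


def sell_reward(current_state, current_state_interpretations,
                next_state, next_state_interpretations):
    """Reward selling when the (hypothetical) price falls afterwards."""
    return current_state["price"] - next_state["price"]


def hold_reward(current_state, current_state_interpretations,
                next_state, next_state_interpretations):
    """Holding is treated as neutral in this sketch."""
    return 0.0
```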

Time Categories

Spice.ai can now learn from cyclical patterns, such as daily, weekly, or monthly cycles.

To enable automatic cyclical field generation from the observation time, specify one or more time categories in the pod manifest, such as a month or weekday in the time section.

For example, by specifying month the Spice.ai engine automatically creates a field in the AI engine data stream called time_month_{month}, with its value calculated from the month to which the observation's timestamp relates.

Example:

```yaml
time:
  categories:
    - month
    - dayofweek
```

Supported category values are: month, dayofmonth, dayofweek, and hour.
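
One way to picture the generated fields is the sketch below, which derives `time_{category}_{value}` fields from a Unix timestamp. The one-hot 0/1 encoding, the use of UTC, and the value numbering (months 1 to 12, weekdays 0 to 6 starting on Monday) are assumptions made for illustration; only the `time_month_{month}` naming pattern comes from the notes.

```python
from datetime import datetime, timezone


def time_category_fields(unix_timestamp: int, categories: list) -> dict:
    """Build 0/1 fields for each requested time category."""
    moment = datetime.fromtimestamp(unix_timestamp, tz=timezone.utc)
    # category -> (all possible values, value for this timestamp)
    spec = {
        "month": (range(1, 13), moment.month),
        "dayofmonth": (range(1, 32), moment.day),
        "dayofweek": (range(0, 7), moment.weekday()),
        "hour": (range(0, 24), moment.hour),
    }
    fields = {}
    for category in categories:
        values, actual = spec[category]
        for value in values:
            fields[f"time_{category}_{value}"] = 1 if value == actual else 0
    return fields


# 1605729600 is 2020-11-18 20:00:00 UTC, a Wednesday.
fields = time_category_fields(1605729600, ["month", "dayofweek"])
print({name: flag for name, flag in fields.items() if flag})
```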

Learn more in the documentation: docs.spiceai.org/reference/pod/#time

Get recommendation for a specific time

It is now possible to specify the time of recommendations fetched from the /recommendation API.

Valid times are from pod epoch_time to epoch_time + period.

Previously the API only supported recommendations based on the time of the last ingested observation.

Requests are made in the following format: GET http://localhost:8000/api/v0.1/pods/{pod}/recommendation?time={unix_timestamp}

An example for quickstarts/trader

GET http://localhost:8000/api/v0.1/pods/trader/recommendation?time=1605729600

Specifying {unix_timestamp} as 0 will return a recommendation based on the latest data. An invalid {unix_timestamp} will return a result that has the valid time range in the error message:

```json
{
  "response": {
    "result": "invalid_recommendation_time",
    "message": "The time specified (1610060201) is outside of the allowed range: (1610057600, 1610060200)",
    "error": true
  }
}
```
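
The range rule can be checked client-side before calling the API. The sketch below is a hedged illustration: the helper names are hypothetical, and treating both ends of the range as inclusive is an assumption based on the error message above, where 1610060201 is rejected against an upper bound of 1610060200.

```python
def recommendation_url(pod: str, unix_timestamp: int = 0,
                       base_url: str = "http://localhost:8000") -> str:
    """Build the recommendation URL; a time of 0 asks for the latest data."""
    return f"{base_url}/api/v0.1/pods/{pod}/recommendation?time={unix_timestamp}"


def is_valid_recommendation_time(unix_timestamp: int, epoch_time: int,
                                 period_seconds: int) -> bool:
    """Valid times run from epoch_time to epoch_time + period."""
    if unix_timestamp == 0:
        return True  # 0 means "use the latest data"
    return epoch_time <= unix_timestamp <= epoch_time + period_seconds


# Values taken from the error message: the range spans 2600 seconds.
epoch, period = 1610057600, 1610060200 - 1610057600
print(is_valid_recommendation_time(1610060201, epoch, period))
print(recommendation_url("trader", 1605729600))
```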

New in this release

Adds time categories configuration to the pod manifest to enable learning from cyclical patterns in data - e.g. hour, day of week, day of month, and month
Adds support for defining reward functions in a rewards functions code file.
Adds the ability to specify recommendation time making it possible to now see which action Spice.ai recommends at any time during the pod period.
Adds support for ingestion of transaction/correlation identifiers (e.g. order_id, trace_id) in the pod manifest.
Adds validation for invalid dataspace names in the pod manifest.
Adds the ability to resize columns to the dashboard observation data grid.
Updates to TensorFlow 2.7 and Keras 2.7
Fixes a bug where data processors were using data connector params
Fixes a dashboard issue in the pod observations data grid where a column might not be shown.
Fixes a crash on pod load if the training section is not included in the manifest.
Fixes an issue where data manager stats errors were incorrectly being printed to console.
Fixes an issue where selectors may not match due to surrounding whitespace.

- Rust
Published by phillipleblanc over 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.3.1-alpha

We are excited to announce the release of Spice.ai v0.3.1-alpha! 🎃

This point release focuses on fixes and improvements to v0.3-alpha. Highlights include the ability to specify both seed and runtime data, to select custom named fields for time and tags, a new spice upgrade command and several bug fixes.

A special acknowledgment to @Adm28, who added the new spice upgrade command, which enables the CLI to self-update, which in turn will auto-update the runtime.

Highlights in v0.3.1-alpha

Upgrade command

The CLI can now be updated using the new spice upgrade command. This command will check for, download, and install the latest Spice.ai CLI release, which will become active on its next run.

When run, the CLI will check for the matching version of the Spice.ai runtime, and will automatically download and install it as necessary.

The version of both the Spice.ai CLI and runtime can be checked with the spice version CLI command.

Seed data

When working with streaming data sources, like market prices, it's often also useful to seed the dataspace with historical data. Spice.ai enables this with the new seed_data node in the dataspace configuration. The syntax is exactly the same as the data syntax. For example:

```yaml
dataspaces:
  - from: coinbase
    name: btcusd
    seed_data:
      connector: file
      params:
        path: path/to/seed/data.csv
      processor:
        name: csv
    data:
      connector: coinbase
      params:
        product_ids: BTC-USD
      processor:
        name: json
```

The seed data will be fetched first, before the runtime data is initialized. Both sets of connectors and processors use the dataspace-scoped measurements, categories, and tags for processing, and both data sources are merged into the pod-scoped observation timeline.
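
Conceptually, merging the two sources into one timeline resembles merging two time-ordered streams. The sketch below illustrates that idea with the standard library; the observation shape `(unix_time, data)` and the `price` field are assumptions, and this is not the runtime's actual implementation.

```python
import heapq

# Each observation is (unix_time, data); both sources are already time-ordered.
seed_observations = [(100, {"price": 10.0}), (200, {"price": 11.0})]
runtime_observations = [(150, {"price": 10.5}), (300, {"price": 12.0})]

# Merge by timestamp only, so the data dicts are never compared.
timeline = list(
    heapq.merge(seed_observations, runtime_observations, key=lambda obs: obs[0])
)

for unix_time, data in timeline:
    print(unix_time, data)
```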

Time field selectors

Before v0.3.1-alpha, data was required to include a specific time field. In v0.3.1-alpha, the JSON and CSV data processors now support the ability to select a specific field to populate the time field. An example selector to use the created_at column for time is:

```yaml
data:
  processor:
    name: csv
    params:
      time_selector: created_at
```
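
The effect of a time selector can be sketched as follows: a named column is read as the observation time and the remaining columns become measurements. The default column name `time`, the output shape, and the sample data are assumptions for illustration only.

```python
import csv
import io


def read_observations(csv_text: str, time_selector: str = "time") -> list:
    """Parse CSV rows, taking the observation time from the selected column."""
    observations = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        timestamp = int(row.pop(time_selector))
        measurements = {name: float(value) for name, value in row.items()}
        observations.append({"time": timestamp, "data": measurements})
    return observations


sample = "created_at,likes\n1605729600,12\n1605733200,15\n"
observations = read_observations(sample, time_selector="created_at")
print(observations)
```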

Tag field selectors

Before v0.3.1-alpha, tags were required to be placed in a _tags field. In v0.3.1-alpha, any field can now be selected to populate tags. Tags are pod-unique string values, and the union of all selected fields will make up the resulting tag list. For example:

```yaml
dataspace:
  from: twitter
  name: tweets
  tags:
    selectors:
      - tags
      - author_id
    values:
      - spiceaihq
      - spicy
```
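
The "union of all selected fields" can be pictured with the sketch below, which gathers unique string tags from the selected fields of one record. The record shape and the function name are hypothetical, and the role of the `values` list in the manifest is not modeled here.

```python
def collect_tags(record: dict, selectors: list) -> list:
    """Return the ordered union of string tags found in the selected fields."""
    tags = []
    for selector in selectors:
        value = record.get(selector)
        if value is None:
            continue
        # A selected field may hold one tag or a list of tags.
        candidates = value if isinstance(value, list) else [value]
        for candidate in candidates:
            tag = str(candidate)
            if tag not in tags:
                tags.append(tag)
    return tags


tweet = {"tags": ["spicy"], "author_id": "spiceaihq", "text": "hello"}
tags = collect_tags(tweet, ["tags", "author_id"])
print(tags)
```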

New in this release

Adds a new spice upgrade command for self-upgrade of the Spice.ai CLI.
Adds a new seed_data node to the dataspace configuration, enabling the dataspace to be seeded with an alternative source of data.
Adds the ability to select a custom time field in JSON and CSV data processors with the time_selector parameter.
Adds the ability to select custom tag fields in the dataspace configuration with selectors list.
Adds error reporting for AI engine crashes, where previously it would fail silently.
Fixes the dashboard pods list from "jumping" around due to being unsorted.
Fixes rare cases where categorical data might be sent to the AI engine in the wrong format.

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.3-alpha

Spice.ai v0.3-alpha

We are excited to announce the release of Spice.ai v0.3-alpha! 🎉

This release adds support for ingestion, automatic encoding, and training of categorical data, enabling more use-cases and datasets beyond just numerical measurements. For example, perhaps you want to learn from data that includes a category of t-shirt sizes, with discrete values, such as small, medium, and large. The v0.3 engine now supports this and automatically encodes the categorical string values into numerical values that the AI engine can use. Also included is a preview of data visualizations in the dashboard, which is helpful for developers as they author Spicepods and dataspaces.

A special acknowledgment to @sboorlagadda, who submitted the first Spice.ai feature contribution from the community ever! He added the ability to list pods from the CLI with the new spice pods list command. Thank you, @sboorlagadda!!!

If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.

Highlights in v0.3-alpha

Categorical data

In v0.1, the runtime and AI engine only supported ingesting numerical data. In v0.2, tagged data was accepted and automatically encoded into fields available for learning. In this release, v0.3, categorical data can now also be ingested and automatically encoded into fields available for learning. This is a breaking change: the manifest format now separates numerical measurements from categorical data.

Pre-v0.3, the manifest author specified numerical data using the fields node.

In v0.3, numerical data is now specified under measurements and categorical data under categories. E.g.

```yaml
dataspaces:
  - from: event
    name: stream
    measurements:
      - name: duration
        selector: length_of_time
        fill: none
      - name: guest_count
        selector: num_guests
        fill: none
    categories:
      - name: event_type
        values:
          - dinner
          - party
      - name: target_audience
        values:
          - employees
          - investors
    tags:
      - tagA
      - tagB
```
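
These notes state that v0.3 uses one-hot encoding for categorical data. The sketch below shows what that means for the `event_type` category from the example; the generated field names are an assumption, not the engine's actual naming.

```python
def one_hot(category_name: str, allowed_values: list, observed: str) -> dict:
    """Encode one categorical value as a set of 0/1 fields."""
    if observed not in allowed_values:
        raise ValueError(f"{observed!r} is not a declared value of {category_name}")
    return {
        f"{category_name}_{value}": 1 if value == observed else 0
        for value in allowed_values
    }


encoded = one_hot("event_type", ["dinner", "party"], "party")
print(encoded)
```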

Data visualizations preview

A top piece of community feedback was the ability to visualize data. After first running Spice.ai, we'd often hear from developers, "how do I see the data?". A preview of data visualizations is now included in the dashboard on the pod page.

Listing pods

Once the Spice.ai runtime has started, you can view the loaded pods on the dashboard and fetch them via API call localhost:8000/api/v0.1/pods. To make it even easier, we've added the ability to list them via the CLI with the new spice pods list command, which shows the list of pods and their manifest paths.

Coinbase data connector

A new Coinbase data connector is included in v0.3, enabling the streaming of live market ticker prices from Coinbase Pro. Enable it by specifying the coinbase data connector and providing a list of Coinbase Pro product ids, e.g. "BTC-USD". A new sample that demonstrates the connector is also available, with its associated Spicepod available from the spicerack.org registry. Get it with spice add samples/trader.

Tweet Recommendation Quickstart

A new Tweet Recommendation Quickstart has been added. Given past tweet activity and metrics of a given account, this app can recommend when to tweet, comment, or retweet to maximize for like count, interaction rates, and outreach of said given Twitter account.

Trader Sample

A new Trader Sample has been added in addition to the Trader Quickstart. The sample uses the new Coinbase data connector to stream live Coinbase Pro ticker data for learning.

New in this release

Adds support for ingesting, encoding, and training on categorical data. v0.3 uses one-hot-encoding.
Changes the Spicepod manifest fields node to measurements and adds the categories node.
Adds the ability to select a field from the source data and map it to a different field name in the dataspace. See an example for measurements in docs.
Adds support for JSON content type when fetching from the /observations API. Previously, only CSV was supported.
Adds a preview version of data visualizations to the dashboard. The grid has several limitations, one of which is it currently cannot be resized.
Adds the ability to select which learning algorithm to use via the CLI, the API, and specified in the Spicepod manifest. Possible choices are currently "vpg", Vanilla Policy Gradient and "dql", Deep Q-Learning. Shout out to @corentin-pro, who added this feature on his second day on the team!
Adds the ability to list loaded pods with the CLI command spice pods list.
Adds a new coinbase data connector for Coinbase Pro market prices.
Adds a new Tweet Recommendation Quickstart.
Adds a new Trader Sample.
Fixes bug where the /observations endpoint was not providing fully qualified field names.
Fixes issue where debugging messages were printed when using spice add.

- Rust
Published by phillipleblanc over 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.2.1-alpha

Spice.ai v0.2.1-alpha

Announcing the release of Spice.ai v0.2.1-alpha! 🚚

This point release focuses on fixes and improvements to v0.2-alpha. Highlights include the ability to specify how missing data should be treated and a new production mode for spiced.

This release supports the ability to specify how the runtime should treat missing data. Previous releases filled missing data with the last value (or initial value) in the series. While this makes sense for some data, e.g., market prices of a stock or cryptocurrency, it does not make sense for discrete data, e.g., ratings. In v0.2.1, developers can now add the fill parameter on a dataspace field to specify the behavior. This release supports fill types previous and none. The default is previous.

Example in a manifest:

```yaml
dataspaces:
  - from: twitter
    name: tweets
    fields:
      - name: likes
        fill: none # The new fill parameter
```
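
The difference between the two fill types can be sketched as below. Missing values are represented as `None` and the initial value defaults to `0.0`; both are assumptions for illustration, not the runtime's internal representation.

```python
def fill_missing(values: list, fill: str = "previous", initial: float = 0.0) -> list:
    """Apply the 'previous' or 'none' fill behavior to a series."""
    if fill not in ("previous", "none"):
        raise ValueError(f"unsupported fill type: {fill!r}")
    if fill == "none":
        return list(values)  # leave gaps as they are
    filled, last = [], initial
    for value in values:
        if value is not None:
            last = value
        filled.append(last)  # carry the last seen (or initial) value forward
    return filled


series = [None, 5.0, None, 7.0]
print(fill_missing(series, fill="previous"))
print(fill_missing(series, fill="none"))
```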

spiced now defaults to a new production mode when run standalone (not via the CLI), with development mode now explicitly set with the --development flag. Production mode does not activate development time features, such as the Spicepod file watcher. The CLI always runs spiced in development mode as it is not expected to be used in production deployments.

New in this release

Adds a fill parameter to dataspace fields to specify how missing values should be treated.
Adds the ability to specify the fill behavior of empty values in a dataspace.
Simplifies releases with a single spiceai release instead of separate spice and spiced releases.
Adds an explicit development mode to spiced. Production mode does not activate the file watcher.
Fixes a bug when the pod parameter epoch_time was not set which would cause data not to be sent to the AI engine.
Fixes a bug where the User-Agent was not set correctly from CLI calls to api.spicerack.org

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.2-alpha

- Rust
Published by lukekim over 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.2-alpha

Spice.ai v0.2-alpha

We are excited to announce the release of Spice.ai v0.2-alpha! 🎉

This release is the first major version since the initial v0.1 announcement and includes significant improvements based upon community and early customer feedback. If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.

Highlights in v0.2-alpha

Tagged data

In the first release, the runtime and AI engine could only ingest numerical data. In v0.2, tagged data is accepted and automatically encoded into fields available for learning. For example, it's now possible to include a "liked" tag when using tweet data, automatically encoded to a 0/1 field for training. Both CSV and the new JSON observation formats support tags. The v0.3 release will add additional support for sets of categorical data.

Streaming data

Previously, the runtime would trigger each data connector to fetch on a 15-second interval. In v0.2, we upgraded the interface for data connectors to a push/streaming model, which enables continuous streaming data into the environment and AI engine.

Interpreted data

Spice.ai works together with your application code and works best when it's provided continuous feedback. This feedback could be from the application itself, for example, ratings, likes, thumbs-up/down, profit from trades, or external expertise. The interpretations API was introduced in v0.1.1, and v0.2 adds AI engine support, providing a way to give meaning or an interpretation to ranges of time-series data, which are then available within reward functions. For example, a time range of stock prices could be a "good time to buy," or perhaps Tuesday mornings is a "good time to tweet," and an application or expert can teach the AI engine this through interpretations, providing a shortcut to its learning.

New in this release

Adds core runtime and AI engine tagged data support
Adds tagged data support to the CSV processor
Adds streaming data support to the engine and data connectors
Adds a new JSON data processor for ingesting JSON data
Adds a new Twitter data connector with JSON processor support
Adds a new /pods//dataspaces API
Adds support for using interpretations in reward functions. Learn more.
Adds support for downloading zipped pods from the spicerack.org registry
Adds support for adding data along with the pod manifest when adding a pod from the spicerack.org registry
Adds basic /pods//diagnostics API
Fixes pod period, interval, and granularity not being correctly set when trying to use a "d" format
Fixes the color scheme of action counts in the dashboard to improve readability

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - v0.1.1-alpha

alpha

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.1.1-alpha

Spice.ai v0.1.1-alpha

Announcing the release of Spice.ai v0.1.1-alpha! 🙌

This is the first point release following the public launch of v0.1-alpha and is focused on fixes and improvements to v0.1-alpha before the bigger v0.2-alpha release.

Highlights include initial support for interpretations and the addition of a new JSON Data Processor, which enables observations to be posted in JSON to a new Dataspaces API. The ability to post observations directly to the Dataspace also now makes Data Connectors optional.

Interpretations will enable end-users and external systems to participate in training by providing expert interpretation of the data, ultimately creating smarter pods. v0.1.1-alpha includes the ability to add and get interpretations by API and through import/export of Spicepods. Reward function authors will be able to use interpretations in reward functions from the v0.2-alpha release.

Previously, observations could only be added in CSV format. JSON is now supported by calling the new dataspace observations API, which uses the new JSON processor located in the data-components-contrib repository. The JSON processor defaults to parsing the Spice.ai observation format and is extensible to other schemas.

The dashboard has also been improved to show action counts during a training run, making it easier to visualize the learning process.

New in this release

Adds visualization of action counts during a training run in the dashboard.
Adds a new interpretations API, along with support for importing and exporting interpretations to pods. Learn more.
Adds a new API for ingesting dataspace observations. Learn more.
Adds an official DockerHub repository for spiceai/spiceai.
Fixes bug where the dashboard would not load on browser refresh.

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.1.1-alpha-rc

This is the release candidate 0.1.1-alpha-rc

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.1.1-alpha-rc

This is the release candidate 0.1.1-alpha-rc

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.2.0-alpha-rc

This is the release candidate 0.2.0-alpha-rc

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.2.0-alpha-rc

This is the release candidate 0.2.0-alpha-rc

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.1.0-alpha

Spice.ai v0.1.0-alpha

Announcing the public release of Spice.ai v0.1.0-alpha! 🎉

See the blog post at blog.spiceai.org.

New in this release

Made public github.com/spiceai/spiceai
Made public github.com/spiceai/data-components-contrib
Made public github.com/spiceai/docs
Made public github.com/spiceai/quickstarts
Made public github.com/spiceai/samples
Adds spicerack.org homepage

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.1.0-alpha

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.1.0-alpha-rc

This is the release candidate 0.1.0-alpha-rc

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.1.0-alpha-rc

This is the release candidate 0.1.0-alpha-rc

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.1.0-alpha.5

Spice.ai v0.1.0-alpha.5

Announcing the release of Spice.ai v0.1.0-alpha.5! 🎉

This release focused on preparation for the public launch of the project, including more comprehensive and easier-to-understand documentation, quickstarts and samples.

Data Connectors and Data Processors have now been moved to their own repository spiceai/data-components-contrib

To better improve the developer experience, the following breaking changes have been made:

The pods directory .spice/pods (and thus manifests) and the config file .spice/config.yaml have been moved from the .spice directory to the app root ./. This allows the .spice directory to be added to the .gitignore and the manifest changes to be easily tracked in the project.
Flights have been renamed to more understandable Training Runs in user interfaces.

New in this release

Adds Open source acknowledgements to the dashboard
Adds improved error messages for several scenarios
Updates all Quickstarts and Samples to be clearer, easier to understand and better show the value of Spice.ai. The LogPruner sample has also been renamed ServerOps
Updates the dashboard to show a message when no pods have been trained
Updates all documentation links to docs.spiceai.org
Updates to use Python 3.8.12
Fixes bug where the dashboards showed undefined episode number
Fixes issue where the manifest.json was not being served to the React app
Fixes the config.yaml being written when not required
Removes the ability to load a custom dashboard - this may come back in a future release

Breaking changes

Changes the location of .spice/pods to ./spicepods
Changes the location of .spice/config.yaml to .spice.config.yaml

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.1.0-alpha.5

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.1.0-alpha.5-rc

This is the release candidate 0.1.0-alpha.5-rc

- Rust
Published by github-actions[bot] almost 5 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.1.0-alpha.5-rc

This is the release candidate 0.1.0-alpha.5-rc

- Rust
Published by github-actions[bot] almost 5 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.1.0-alpha.4

Spice.ai v0.1.0-alpha.4

Announcing the release of Spice.ai v0.1.0-alpha.4! 🎉

We have a project name update. The project will now be referred to as "Spice.ai" instead of "Spice AI" and the project website will be located at spiceai.org.

This release now uses the new spicerack.org AI package registry instead of fetching packages directly from GitHub.

Added support for importing and exporting Spice.ai pods with spice import and spice export commands.

The CLI has been streamlined by removing the pod command:

pod add changes from spice pod add <pod path> to just spice add <pod path>
pod train changes from spice pod train <pod name> to just spice train <pod name>

We've also updated the names of some concepts:

"DataSources" are now "Dataspaces"
"Inference" is now "Recommendation"

New in this release

Adds a new Gardener to intelligently decide on the best time to water a simulated garden
Adds support for importing and exporting Spice.ai pods with spice import and spice export commands
Adds a complete end-to-end test suite
Adds installation by friendly URL: curl https://install.spiceai.org | /bin/bash
Adds the spice binary to PATH automatically by shell config (e.g. .bashrc, .zshrc)
Adds support for targeting hosting contexts (docker or metal) specifically with a --context command line flag
Removes the model downloader. This will return with better support in a later version
Updates Trader quickstart with demo Node.js application to better demonstrate its use
Updates LogPruner quickstart with demo PowerShell Core script to better demonstrate its use
Updates TensorFlow from 2.5.0 to 2.5.1
Fixes potential mismatch of CLI and runtime by only automatically upgrading to the same version
Fixes issue with .spice/config.yml creation in Docker due to incorrect permissions
Fixes dashboard title from React App to Spice.ai

Breaking changes

Changes datasources section in the pod manifest to dataspaces
Changes /api/v0.1/pods/<pod>/inference API to /api/v0.1/pods/<pod>/recommendation
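
Client code and manifests written against earlier versions need both renames applied. The sketch below illustrates one way to do that. The helper names migrate_manifest and migrate_endpoint are hypothetical, and it assumes that datasources appears as a top-level key once the pod manifest has been loaded into a dictionary.

```python
import re


def migrate_manifest(manifest: dict) -> dict:
    """Return a copy with the 'datasources' section renamed to 'dataspaces'.

    Assumption: 'datasources' is a top-level key of the loaded manifest.
    """
    updated = dict(manifest)
    if "datasources" in updated:
        updated["dataspaces"] = updated.pop("datasources")
    return updated


# Matches only the old inference route for a single pod.
_OLD_ENDPOINT = re.compile(r"^(/api/v0\.1/pods/[^/]+)/inference$")


def migrate_endpoint(path: str) -> str:
    """Rewrite the old inference route to the recommendation route."""
    return _OLD_ENDPOINT.sub(r"\1/recommendation", path)
```

Paths that do not match the old inference route are returned unchanged, so the helper can be applied to every request path without special-casing.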

- Rust
Published by github-actions[bot] almost 5 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.1.0-alpha.4

Spice.ai v0.1.0-alpha.4

Announcing the release of Spice.ai v0.1.0-alpha.4! 🎉

We have a project name update. The project will now be referred to as "Spice.ai" instead of "Spice AI" and the project website will be located at spiceai.org.

This release now uses the new spicerack.org AI package registry instead of fetching packages directly from GitHub.

The CLI has been streamlined by removing the pod command:

pod add changes from spice pod add <pod path> to just spice add <pod path>
pod train changes from spice pod train <pod name> to just spice train <pod name>
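
Scripts that still call the old command forms can be updated mechanically. As a rough sketch (the helper name migrate_command is hypothetical and not part of the Spice CLI), the change amounts to dropping the pod token before add or train:

```python
def migrate_command(argv: list[str]) -> list[str]:
    """Rewrite 'spice pod add|train ...' to 'spice add|train ...'."""
    if (
        len(argv) >= 3
        and argv[0] == "spice"
        and argv[1] == "pod"
        and argv[2] in ("add", "train")
    ):
        # Keep the program name, drop the removed 'pod' subcommand.
        return [argv[0], *argv[2:]]
    # Anything else is passed through unchanged.
    return list(argv)
```

For example, migrate_command(["spice", "pod", "train", "trader"]) yields the argument list for spice train trader.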

We've also updated the names of some concepts:

"DataSources" are now "Dataspaces"
"Inference" is now "Recommendation"

New in this release

Adds a new Gardener to intelligently decide on the best time to water a simulated garden
Adds a complete end-to-end test suite
Adds installation by friendly URL: curl https://install.spiceai.org | /bin/bash
Adds the spice binary to PATH automatically by shell config (e.g. .bashrc, .zshrc)
Adds support for targeting hosting contexts (docker or metal) specifically with a --context command line flag
Removes the model downloader. This will return with better support in a later version
Updates [Trader](https://github.com/spiceai/quickstarts/tree/trunk/trader) quickstart with demo Node.js application to better demonstrate its use
Updates [LogPruner](https://github.com/spiceai/quickstarts/tree/trunk/logpruner) quickstart with demo PowerShell Core script to better demonstrate its use
Updates TensorFlow from 2.5.0 to 2.5.1
Fixes potential mismatch of CLI and runtime by only automatically upgrading to the same version
Fixes issue with .spice/config.yml creation in Docker due to incorrect permissions
Fixes dashboard title from React App to Spice.ai

Breaking changes

Changes datasources section in the pod manifest to dataspaces
Changes /api/v0.1/pods/<pod>/inference API to /api/v0.1/pods/<pod>/recommendation

- Rust
Published by github-actions[bot] almost 5 years ago

Recent Releases of https://github.com/spiceai/spiceai

https://github.com/spiceai/spiceai -

https://github.com/spiceai/spiceai - v1.6.0

Spice v1.6.0 (Aug 26, 2025)

What's New in v1.6.0

DataFusion v48 Highlights

Runtime Highlights

Contributors

New Contributors

Breaking Changes

Cookbook Updates

Upgrading

What's Changed

Dependencies

Changelog

https://github.com/spiceai/spiceai - v1.5.2

Spice v1.5.2 (Aug 4, 2025)

What's New in v1.5.2

Example datasets for Redshift TPCH tables

Contributors

Breaking Changes

Cookbook Updates

Upgrading

What's Changed

Dependencies

Changelog

https://github.com/spiceai/spiceai - v1.5.1

Spice v1.5.1 (July 28, 2025)

What's New in v1.5.1

Contributors

New Contributors

Breaking Changes

Cookbook Updates

Upgrading

What's Changed

Dependencies

Changelog

https://github.com/spiceai/spiceai - v1.5.0

https://github.com/spiceai/spiceai - v1.5.0-rc.3

Spice v1.5.0-rc.3 (July 16, 2025)

What's New in v1.5.0-rc.3

Highlights in v1.5.0-rc.3

Contributors

Breaking Changes

Cookbook Updates

Upgrading

What's Changed

Dependencies

Changelog

https://github.com/spiceai/spiceai - v1.5.0-rc.2

Spice v1.5.0-rc.2 (July 14, 2025)

What's New in v1.5.0-rc.2

Contributors

Breaking Changes

Cookbook Updates

Upgrading

What's Changed

Dependencies

Changelog

https://github.com/spiceai/spiceai - v1.5.0-rc.1

Spice v1.5.0-rc.1 (July 7, 2025)

What's New in v1.5.0-rc.1

Contributors

Breaking Changes

Cookbook Updates

Upgrading

What's Changed

Dependencies

Changelog

https://github.com/spiceai/spiceai - v1.4.0

https://github.com/spiceai/spiceai - v1.4.0-rc.1

Spice v1.4.0-rc.1 (June 11, 2025)

What's New in v1.4.0-rc.1

DataFusion v47 Highlights

Arrow v55 Highlights

Contributors

Breaking Changes

Cookbook Updates

Upgrading

What's Changed