Recent Releases of https://github.com/spiceai/spiceai
https://github.com/spiceai/spiceai - v1.6.0
Spice v1.6.0 (Aug 26, 2025)
Spice 1.6.0 upgrades DataFusion to v48, which reduces the memory footprint of expressions by ~50% for faster planning and lower memory usage, eliminates unnecessary projections in queries, optimizes string functions such as ascii and character_length for up to a 3x speedup, and accelerates unbounded aggregate window functions by 5.6x. The release also adds Kafka and MongoDB connectors for real-time streaming and NoSQL data acceleration; supports the OpenAI Responses API for advanced model interactions, including OpenAI-hosted tools such as web_search and code_interpreter; adds usage tier configuration to the OpenAI Embeddings Connector for higher throughput via more concurrent requests; introduces Model2Vec embeddings for ultra-low-latency encoding; and extends the Amazon S3 Vectors engine to support multi-column primary keys.
What's New in v1.6.0
DataFusion v48 Highlights
Spice.ai is built on the DataFusion query engine. The v48 release brings:
Performance & Size Improvements 🚀: The memory footprint of expressions was reduced by ~50% (the Expr struct shrank from 272 to 144 bytes), lowering memory usage and improving planning times by 10-20%. Queries now contain fewer unnecessary projections. The string functions ascii and character_length were optimized, with character_length achieving up to a 3x speedup. Queries with unbounded aggregate window functions run 5.6x faster by avoiding unnecessary computation for results that are constant across partitions.
New Features & Enhancements ✨: Support was added for ORDER BY ALL for easy ordering of all columns in a query.
See the Apache DataFusion 48.0.0 Blog for details.
Runtime Highlights
Amazon S3 Vectors Multi-Column Primary Keys: The Amazon S3 Vectors engine now supports datasets with multi-column primary keys. This enables vector indexes for datasets where more than one column forms the primary key, such as those splitting documents into chunks for retrieval contexts. For multi-column keys, Spice serializes the keys using arrow-json format, storing them as single string keys in the vector index.
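As a rough illustration of collapsing a multi-column key into one string key, the sketch below serializes the key columns of a row as a compact JSON object. The function name and the exact JSON layout are assumptions made for illustration; the precise arrow-json encoding Spice emits is not specified in these notes.

```python
import json

def composite_key(row: dict, key_columns: list[str]) -> str:
    """Serialize the primary-key columns of a row into one JSON string.

    Hypothetical helper: the layout only approximates the idea of a
    JSON-encoded composite key and is not Spice's actual encoding.
    """
    # Keep the declared column order so equal keys always serialize identically.
    key = {col: row[col] for col in key_columns}
    return json.dumps(key, separators=(",", ":"))

# A document chunk identified by (doc_id, chunk_index).
chunk = {"doc_id": 42, "chunk_index": 3, "text": "..."}
print(composite_key(chunk, ["doc_id", "chunk_index"]))
```

Serializing the columns in a fixed order matters: two rows with the same key values must map to the same string, or the index would treat them as different vectors.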
Model2Vec Embeddings: Spice now supports model2vec static embeddings through a new model2vec embeddings provider. Static embeddings are up to 500x faster and 15x smaller than sentence transformers, enabling scenarios that require low-latency, high-throughput encoding.
```yaml
embeddings:
  - from: model2vec:minishlab/potion-base-8M # HuggingFace model
    name: potion
  - from: model2vec:path/to/my/local/model # local model
    name: local
```
Learn more in the Model2Vec Embeddings documentation.
Kafka Data Connector: Use from: kafka:<topic> to ingest data directly from Kafka topics for integration with existing Kafka-based event streaming infrastructure, providing real-time data acceleration and query without additional middleware.
Example Spicepod.yml:
```yaml
- from: kafka:orders_events
  name: orders
  acceleration:
    enabled: true
    refresh_mode: append
  params:
    kafka_bootstrap_servers: server:9092
```
Learn more in the Kafka Data Connector documentation.
MongoDB Data Connector: Use from: mongodb:<dataset> to access and accelerate data stored in MongoDB, deployed on-premises or in the cloud.
Example spicepod.yml:
```yaml
datasets:
  - from: mongodb:my_dataset
    name: my_dataset
    params:
      mongodb_host: localhost
      mongodb_db: my_database
      mongodb_user: my_user
      mongodb_pass: password
```
Learn more in the MongoDB Data Connector documentation.
OpenAI Responses API Support: Spice now supports the OpenAI Responses API (/v1/responses), OpenAI's most advanced interface for generating model responses.
You can make requests to any Responses-compatible model using the new /v1/responses endpoint.
Example curl request:
```bash
curl http://localhost:8090/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "input": "Tell me a three sentence bedtime story about Spice AI."
  }'
```
To use responses in spice chat, use the --responses flag.
Example:
```bash
spice chat --responses # Use the `/v1/responses` endpoint for all completions instead of `/v1/chat/completions`
```
To use OpenAI-hosted tools supported by OpenAI's Responses API, specify the openai_responses_tools parameter.
Example spicepod.yml:
```yaml
models:
  - name: test
    from: openai:gpt-4.1
    params:
      openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }
      tools: sql, list_datasets
      openai_responses_tools: web_search, code_interpreter # 'code_interpreter' or 'web_search'
```
These OpenAI-specific tools are only available from the /v1/responses endpoint. Any other tools specified via the tools parameter are available from both the /v1/chat/completions and /v1/responses endpoints.
Learn more in the OpenAI Model Provider documentation.
OpenAI Embeddings & Models Connectors Usage Tier: The OpenAI Embeddings and Models Connectors now support specifying an account usage tier for embeddings and model requests. Raising the number of concurrent requests improves the performance of generating text embeddings or calling models during dataset load and search.
Example spicepod.yml:
```yaml
embeddings:
  - from: openai:text-embedding-3-small
    name: openai_embed
    params:
      openai_usage_tier: tier1
```
When the usage tier is set to match the tier of your OpenAI account, the Embeddings and Models Connectors raise the maximum number of concurrent requests to the limit for that tier.
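The effect of a tier-based concurrency cap can be sketched with a semaphore that bounds the number of in-flight requests. The tier-to-limit mapping and the function names below are hypothetical placeholders; the actual limits Spice applies per tier are not given in these notes.

```python
import asyncio

# Hypothetical limits for illustration only; real per-tier values are not stated here.
TIER_CONCURRENCY = {"tier1": 4, "tier2": 8}

async def embed_all(texts, tier, embed_one):
    """Run embed_one over texts with at most TIER_CONCURRENCY[tier] calls in flight."""
    limit = asyncio.Semaphore(TIER_CONCURRENCY[tier])

    async def guarded(text):
        # Each call waits here once the tier's concurrency budget is used up.
        async with limit:
            return await embed_one(text)

    # gather preserves input order, so results line up with texts.
    return await asyncio.gather(*(guarded(t) for t in texts))

async def fake_embed(text):
    await asyncio.sleep(0)  # stand-in for a network call to an embeddings API
    return [float(len(text))]

print(asyncio.run(embed_all(["a", "bb", "ccc"], "tier1", fake_embed)))
```

A higher tier simply widens the semaphore, so more requests overlap and a large dataset load finishes sooner, provided the account's real rate limits allow it.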
Learn more in the OpenAI Model Provider documentation.
Contributors
- @Jeadie
- @peasee
- @sgrebnov
- @Sevenannn
- @kczimm
- @phillipleblanc
- @Advayp
- @lukekim
- @ewgenius
- @mach-kernel
- @krinart
New Contributors
- @krinart made their first contribution in github.com/spiceai/spiceai/pull/6573
Breaking Changes
No breaking changes.
Cookbook Updates
- Added OpenAI Responses API - Use OpenAI's Responses API with Spice
- Added Live Orders Analytics with Apache Kafka Data Connector - Combine real-time data streaming from Kafka with other datasets
- Added MongoDB Data Connector - Use MongoDB as a data source with Spice
The Spice Cookbook includes 77 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.6.0, use one of the following methods:
CLI:
```console
spice upgrade
```
Homebrew:
```console
brew upgrade spiceai/spiceai/spice
```
Docker:
Pull the spiceai/spiceai:1.6.0 image:
```console
docker pull spiceai/spiceai:1.6.0
```
For available tags, see DockerHub.
Helm:
```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```
AWS Marketplace:
🎉 Spice is also now available in the AWS Marketplace!
What's Changed
Dependencies
- DataFusion: Upgraded to v48
- Rust: Upgraded from 1.86.0 to 1.87.0
Changelog
- Support Streaming with Tool Calls (#6941) by @Advayp in #6941
- Fix parameterized query planning in DataFusion (#6942) by @Jeadie in #6942
- Update the UnableToLoadCredentials error with a pointer to docs (#6937) by @phillipleblanc in #6937
- Fix spicecloud benchmark (#6935) by @krinart in #6935
- [Debezium] Support for VariableScaleDecimal (#6934) by @krinart in #6934
- Update to DF 48 (#6665) by @mach-kernel and @kczimm in #6665
- Mark append-stream and CDC datasets as ready after first message (#6914) by @sgrebnov in #6914
- Model2Vec embedding model support (#6846) by @mach-kernel in #6846
- Update snapshot for S3 vector search test (#6920) by @Jeadie in #6920
- remove [] from queryset in spicepod path for CI (#6919) by @Jeadie in #6919
- Remove verbose tracing (#6915) by @Jeadie in #6915
- Refactor how models supporting the Responses API are loaded (#6912) by @Advayp in #6912
- Write tests for truncate formatting in arrow_tools and fix bug. (#6900) by @Jeadie in #6900
- Support using the Responses API from spice chat (#6894) by @Advayp in #6894
- Include GPT-5 into Text-To-SQL and Financebench benchmarks (#6907) by @sgrebnov in #6907
- Better error message when credentials aren't loaded for S3 Vectors (#6910) by @phillipleblanc in #6910
- Add tracing and system prompt support for the Responses API (#6893) by @Advayp in #6893
- Constraint violation check is improved to control behavior when violations occur within a batch (#6897) by @phillipleblanc in #6897
- fix: Multi-column text search with v1/search (#6905) by @peasee in #6905
- fix: Correctly project text search primary keys to underlying projection (#6904) by @peasee in #6904
- fix: Update benchmark snapshots (#6901) by @app/github-actions in #6901
- In S3vector, do not pushdown on non-filterable columns (#6884) by @Jeadie in #6884
- Run E2E Test CI macOS build on bigger runners (#6896) by @phillipleblanc in #6896
- Enable configuration of the Responses API for the Azure model provider (#6891) by @Advayp in #6891
- fix: Update benchmark snapshots (#6888) by @app/github-actions in #6888
- Update OpenAPI specification for /v1/responses (#6889) by @Advayp in #6889
- Add test to ensure tools are injected correctly in the Responses API (#6886) by @Advayp in #6886
- Enable embeddings for append streams (#6878) by @sgrebnov in #6878
- Show correct limit for EXPLAIN plans in S3VectorsQueryExec (#6852) by @Jeadie in #6852
- Responses API support for Azure Open AI (#6879) by @Advayp in #6879
- fix: Update search test case structure (#6865) by @peasee in #6865
- Fix mongodb benchmark (#6883) by @phillipleblanc in #6883
- Support multiple column primary keys for S3 vectors. (#6775) by @Jeadie in #6775
- Kafka Data Connector: persist consumer between restarts (#6870) by @sgrebnov in #6870
- Fix newlines in errors added in recent PRs (#6877) by @phillipleblanc in #6877
- Add override parameter to force support for the Responses API (#6871) by @Advayp in #6871
- Don't use metadata columns in VectorScanTableProvider (#6854) by @Jeadie in #6854
- Add non-streaming tool call support (hosted and Spice tools) via the Responses API (#6869) by @Advayp in #6869
- Update error guideline to remove newlines + remove newlines from error messages. (#6866) by @phillipleblanc in #6866
- Remove void acceleration engine + optional table behaviors (#6868) by @phillipleblanc in #6868
- Kafka Data Connector basic support (#6856) by @sgrebnov in #6856
- Federated+Accelerated TPCH Benchmarks for MongoDB (#6788) by @krinart in #6788
- Pass embeddings calculated in compute_index to the acceleration (#6792) by @phillipleblanc in #6792
- Add non-streaming and streaming support for OpenAI Responses API endpoint (#6830) by @Advayp in #6830
- Use latest version of OpenAI crate to resolve issues with Service Tier deserialization (#6853) by @Advayp in #6853
- Update openapi.json (#6799) by @app/github-actions in #6799
- Improve management message (#6850) by @lukekim in #6850
- fix: Include FTS search column if it is the PK (#6836) by @peasee in #6836
- Refactor Health Checks (#6848) by @Advayp in #6848
- Introduce a Responses trait and LLM registry for model providers that support the OpenAI Responses API (#6798) by @Advayp in #6798
- fix: Update datafusion-table-providers to include constraints (#6837) by @peasee in #6837
- Bump postcard from 1.1.2 to 1.1.3 (#6841) by @app/dependabot in #6841
- Bump governor from 0.10.0 to 0.10.1 (#6835) by @app/dependabot in #6835
- Bump ctor from 0.2.9 to 0.5.0 (#6827) by @app/dependabot in #6827
- Bump azure_core from 0.26.0 to 0.27.0 (#6826) by @app/dependabot in #6826
- Bump rstest from 0.25.0 to 0.26.1 (#6825) by @app/dependabot in #6825
- Use latest commit in our fork of async-openai (#6829) by @Advayp in #6829
- Bump rustls from 0.23.27 to 0.23.31 (#6824) by @app/dependabot in #6824
- Bump async-trait from 0.1.88 to 0.1.89 (#6823) by @app/dependabot in #6823
- Bump hyper from 1.6.0 to 1.7.0 (#6814) by @app/dependabot in #6814
- Bump serde_json from 1.0.140 to 1.0.142 (#6812) by @app/dependabot in #6812
- Add s3 vector test retrieving vectors (#6786) by @Jeadie in #6786
- fix: Allow v1/search with only FTS (#6811) by @peasee in #6811
- Bump tantivy from 0.24.1 to 0.24.2 (#6806) by @app/dependabot in #6806
- Bump tokio-util from 0.7.15 to 0.7.16 (#6810) by @app/dependabot in #6810
- fix: Improve FTS index primary key handling (#6809) by @peasee in #6809
- Bump logos from 0.15.0 to 0.15.1 (#6808) by @app/dependabot in #6808
- Bump hf-hub from 0.4.2 to 0.4.3 (#6807) by @app/dependabot in #6807
- Bump odbc-api from 13.0.1 to 13.1.0 (#6803) by @app/dependabot in #6803
- fix: Spice search CLI with FTS supports string or slice unmarshalling (#6805) by @peasee in #6805
- Bump uuid from 1.17.0 to 1.18.0 (#6797) by @app/dependabot in #6797
- Bump reqwest from 0.12.22 to 0.12.23 (#6796) by @app/dependabot in #6796
- Bump anyhow from 1.0.98 to 1.0.99 (#6795) by @app/dependabot in #6795
- Bump clap from 4.5.41 to 4.5.45 (#6794) by @app/dependabot in #6794
- Respect default MAX_DECODING_MESSAGE_SIZE (100MB) in Flight API (#6802) by @sgrebnov in #6802
- Fix compilation errors caused by upgrading async-openai (#6793) by @Advayp in #6793
- Remove outdated vector search benchmark (replaced with testoperator) (#6791) by @sgrebnov in #6791
- Handle errors in vector ingestion pipeline (#6782) by @phillipleblanc in #6782
- fix: Explicitly error when chunking is defined for vector engines (#6787) by @peasee in #6787
- Make VectorScanTableProvider and VectorQueryTableProvider support multi-column primary keys (#6757) by @Jeadie in #6757
- Use megascience/megascience Q+A dataset for text search testing. (#6702) by @Jeadie in #6702
- Flight REPL autocomplete (#6589) by @krinart in #6589
- use ref: github.event.pull_request.head.sha in integration_models.yml (#6780) by @Jeadie in #6780
- fix: Move search telemetry calls in UDTF to scan (#6778) by @peasee in #6778
- Fix Hugging Face models and embeddings loading in Docker (#6777) by @ewgenius in #6777
- feat: Migrate bedrock rate limiter (#6773) by @peasee in #6773
- Run the PR checks on the DEV runners (#6769) by @phillipleblanc in #6769
- feat: add OpenAI models rate controller (#6767) by @peasee in #6767
- Implement MongoDB data connector (#6594) by @krinart in #6594
- fix: Use head ref for concurrency group (#6770) by @peasee in #6770
- fix: Run enforce pulls with spice on pull_request_target (#6768) by @peasee in #6768
- feat: Add OpenAI Embeddings Rate Controller (#6764) by @peasee in #6764
- Move AWS SDK credential bridge integration test to the existing AWS SDK integration test run (#6766) by @phillipleblanc in #6766
- Use Spice specific errors instead of OpenAIError in embedding module (#6748) by @kczimm in #6748
- Use context in Glue Catalog Provider (#6763) by @Advayp in #6763
- pin cargo-deny to previous version (#6762) by @kczimm in #6762
- Bump actions/download-artifact from 4 to 5 (#6720) by @app/dependabot in #6720
- Upgrade dependabot dependencies (#6754) by @phillipleblanc in #6754
- Set E2E Test CI models build to 90 minute timeout (#6756) by @phillipleblanc in #6756
- chore: upgrade to Rust 1.87.0 (#6614) by @kczimm in #6614
- feat: Add initial runtime-rate-limiter crate (#6753) by @peasee in #6753
- feat: Add more embedding traces, add MiniLM MTEB spicepod (#6742) by @peasee in #6742
- Update QA analytics for release (#6740) by @Advayp in #6740
- Always use 'returnData: true' for s3 vector query index (#6741) by @Jeadie in #6741
- feat: Add Embedding and Search anonymous telemetry (#6737) by @peasee in #6737
- Add 1.5.2 to SECURITY.md (#6739) by @ewgenius in #6739
- Combine the Iceberg and Object Store AWS SDK bridges into one crate (#6718) by @Advayp in #6718
- Updates to v1.5.2 release notes (#6736) by @lukekim in #6736
- Update end game template - move glue catalog to catalogs section (#6732) by @ewgenius in #6732
- Update v1.5.2.md (#6735) by @kczimm in #6735
- Add note about S3 Vectors workaround (#6734) by @phillipleblanc in #6734
- feat: Avoid joining for VectorScanTableProvider if the index is sufficient (#6714) by @peasee in #6714
- update changelog (#6729) by @kczimm in #6729
- remove unneeded autogenerated s3 vector code (#6715) by @Jeadie in #6715
- fix: Set S3 vectors default limit to 30, add more tracing (#6712) by @peasee in #6712
- docs: Add Hadoop cookbook to endgame template (#6708) by @peasee in #6708
- Fix testoperator append mode compilation error (#6706) by @phillipleblanc in #6706
- test: Add VectorScanTableProvider snapshot tests (#6701) by @peasee in #6701
- feat: Add Hadoop catalog-mode benchmark (#6684) by @peasee in #6684
- Move shared AWS crates used in bridges to workspace (#6705) by @Advayp in #6705
- Use installation id to group connections (#6703) by @Advayp in #6703
- Add Guardrails for AWS bedrock models (#6692) by @Jeadie in #6692
- Update bedrock keys for CI. (#6693) by @Jeadie in #6693
- Update acknowledgements (#6690) by @app/github-actions in #6690
- ROADMAP updates Aug 1, 2025 (#6667) by @lukekim in #6667
- Add retry logic for OpenAI embeddings creation (#6656) by @sgrebnov in #6656
- Make models E2E chat test more robust (#6657) by @sgrebnov in #6657
- Update Search GH Workflow to use Test Operator (#6650) by @sgrebnov in #6650
- Score and P95 latency calculation for MTEB Quora-based vector search tests in Test Operator (#6640) by @sgrebnov in #6640
- Fix multiple query error being classified as an internal error (#6635) by @Advayp in #6635
- Add Support for S3 Table Buckets (#6573) by @krinart in #6573
- set MISTRALRS_METAL_PRECOMPILE=0 for metal (#6652) by @kczimm in #6652
- Vector search to push down udtf limit argument into logical sort plan (#6636) by @mach-kernel in #6636
- docs: Update qa_analytics.csv (#6643) by @peasee in #6643
- Update SECURITY.md (#6642) by @Jeadie in #6642
- docs: Update qa_analytics.csv (#6641) by @peasee in #6641
- Separate token usage (#6619) by @Advayp in #6619
- Fix typo in release notes (#6634) by @Advayp in #6634
- Add environment variable for org token (#6633) by @Advayp in #6633
- CDC: Compute embeddings on ingest (#6612) by @mach-kernel in #6612
- Add view name to view creation errors (#6611) by @lukekim in #6611
- Add core logic for running MTEB Quora-based vector search tests in Test Operator (#6607) by @sgrebnov in #6607
- Revert "Update generate-openapi.yml (#6584)" (#6620) by @Jeadie in #6620
- Non-accelerated views should report as ready only after all dependent datasets are ready (#6617) by @sgrebnov in #6617
Published by sgrebnov 6 months ago
https://github.com/spiceai/spiceai - v1.5.2
Spice v1.5.2 (Aug 4, 2025)
Spice v1.5.2 introduces a new Amazon Bedrock Models Provider for Converse API-compatible (Nova) models, AWS Redshift support through the Postgres data connector, and Hadoop catalog support for Iceberg tables, along with several bug fixes and improvements.
What's New in v1.5.2
Amazon Bedrock Models Provider: Adds a new Amazon Bedrock LLM Provider. Models compatible with the Converse API (Nova) are supported.
Amazon Bedrock provides access to a range of foundation models for generative AI. Spice supports using Bedrock-hosted models by specifying the bedrock prefix in the from field and configuring the required parameters.
Supported Model IDs:
- us.amazon.nova-lite-v1:0
- us.amazon.nova-micro-v1:0
- us.amazon.nova-premier-v1:0
- us.amazon.nova-pro-v1:0
Refer to the Amazon Bedrock documentation for details on available models.
Example Spicepod.yaml:
```yaml
models:
  - from: bedrock:us.amazon.nova-lite-v1:0
    name: novash
    params:
      aws_region: us-east-1
      aws_access_key_id: ${ secrets:AWS_ACCESS_KEY_ID }
      aws_secret_access_key: ${ secrets:AWS_SECRET_ACCESS_KEY }
      bedrock_guardrail_identifier: arn:aws:bedrock:abcdefg012927:0123456789876:guardrail/hello
      bedrock_guardrail_version: DRAFT
      bedrock_trace: enabled
      bedrock_temperature: 42
```
For more information, see the Amazon Bedrock Documentation.
AWS Redshift Support for Postgres Data Connector: Spice now supports connecting to Amazon Redshift using the PostgreSQL data connector. Redshift is a columnar OLAP database compatible with PostgreSQL, allowing you to use the same connector and configuration parameters.
To connect to Redshift, use the format postgres:schema.table in your Spicepod and set the connection parameters to match your Redshift cluster settings.
Example Spicepod.yaml:
```yaml
# Example datasets for Redshift TPCH tables
datasets:
  - from: postgres:public.customer
    name: customer
    params:
      pg_host: ${secrets:PGHOST}
      pg_port: 5439
      pg_sslmode: prefer
      pg_db: dev
      pg_user: ${secrets:PGUSER}
      pg_pass: ${secrets:PGPASS}
  - from: postgres:public.lineitem
    name: lineitem
    params:
      pg_host: ${secrets:PGHOST}
      pg_port: 5439
      pg_sslmode: prefer
      pg_db: dev
      pg_user: ${secrets:PGUSER}
      pg_pass: ${secrets:PGPASS}
```
Redshift types are mapped to PostgreSQL types. See the PostgreSQL connector documentation for details on supported types and configuration.
Hadoop Catalog Support for Iceberg: The Iceberg Data and Catalog connectors now support connecting to Hadoop catalogs on local filesystem (file://) or S3 object storage (s3://, s3a://). This enables connecting to Iceberg catalogs without a separate catalog provider service.
Example Spicepod.yaml:
```yaml
catalogs:
  - from: iceberg:file:///tmp/hadoopwarehouse/
    name: localhadoop
  - from: iceberg:s3://my-bucket/hadoopwarehouse/
    name: s3hadoop

# Example datasets
datasets:
  - from: iceberg:file:///data/hadoopwarehouse/test/mytable1
    name: localhadoop
  - from: iceberg:s3://my-bucket/hadoopwarehouse/test/mytable2
    name: s3hadoop
```
For more details, see the Iceberg Data Connector documentation and the Iceberg Catalog Connector documentation.
Contributors
Breaking Changes
- N/A
Cookbook Updates
- Added Amazon Redshift Support to the Postgres Data Connector cookbook: Connect to tables in Amazon Redshift.
The Spice Cookbook includes 75 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.5.2, use one of the following methods:
CLI:
```console
spice upgrade
```
Homebrew:
```console
brew upgrade spiceai/spiceai/spice
```
Docker:
Pull the spiceai/spiceai:1.5.2 image:
```console
docker pull spiceai/spiceai:1.5.2
```
For available tags, see DockerHub.
Helm:
```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```
AWS Marketplace:
🎉 Spice is also now available in the AWS Marketplace!
What's Changed
Dependencies
No major dependency updates.
Changelog
- fixes for databricks OpenAI compatibility (#6629) by @Jeadie in #6629
- Update spicepod.schema.json (#6632) by @app/github-actions in #6632
- Remove 'stream_options' from databricks LLMs (#6637) by @Jeadie in #6637
- Move retry and rate limiting logic for Amazon bedrock out of embeddings. (#6626) by @Jeadie in #6626
- Disable Metal precompilation in integration_llms.yml (#6649) by @Jeadie in #6649
- fix: Hadoop integration test (#6660) by @peasee in #6660
- feat: Add Hadoop Catalog Data Component (#6658) by @peasee in #6658
- update datafusion-table-providers to latest spiceai tag (#6661) by @mach-kernel in #6661
- feat: Add Hadoop Catalog connectors for Iceberg (#6659) by @peasee in #6659
- Make FullTextSearchExec robust to RecordBatch column ordering. (#6675) by @Jeadie in #6675
- Make 'runtime-object-store' crate (#6674) by @Jeadie in #6674
- fix: Support include for Iceberg (#6663) by @peasee in #6663
- feat: Add Hadoop TPCH benchmark (#6678) by @peasee in #6678
- feat: Add Hadoop metadata_path parameter (#6680) by @peasee in #6680
- fix: Automatically infer Hadoop warehouse scheme (#6681) by @peasee in #6681
- Amazon Bedrock, specifically Nova models (#6673) by @Jeadie in #6673
- fix perplexity_auth_token parameters for web_search (#6685) by @Jeadie in #6685
- Fix AWS Auth issue (#6699) by @Advayp in #6699
- Limit Concurrent Requests for GitHub (#6672) by @Advayp in #6672
- Add runtime parameter to enable more permissive parquet reading when page indexes are missing (#6716) by @phillipleblanc in #6716
- Improve Flight REPL error messages (#6696) by @lukekim in #6696
- Fixes from search tests (#6710) by @Jeadie in #6710
Published by ewgenius 7 months ago
https://github.com/spiceai/spiceai - v1.5.1
Spice v1.5.1 (July 28, 2025)
Spice v1.5.1 expands the GitHub data connector to include pull-request comments, adds configurable rate limiting for AWS Bedrock embedding models, expands partition pruning with inequality operators, and adds client-supplied cache keys for granular caching control in the HTTP and Arrow Flight SQL APIs.
What's New in v1.5.1
GitHub Data Connector Pull Request Comments: Configure GitHub pulls datasets to include comments.
Example Spicepod.yaml:
```yaml
datasets:
  - from: github:github.com/spiceai/spiceai/pulls
    name: spiceai.pulls
    params:
      include_comments: all # 'review', 'discussion', or 'none'. Defaults to 'none'.
      max_comments_fetched: '25' # Defaults to 100
      # ...
```
For details, see the GitHub Data Connector documentation.
AWS Bedrock Embedding Models Invocation Control: Improved rate limiting control for AWS Bedrock embedding models with max_concurrent_invocations configuration.
```yaml
embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      max_concurrent_invocations: '41'
      # ...
```
For details, see the AWS Bedrock Embeddings Model Provider documentation.
Improved Query Partitioning: Expanded partition pruning support with additional inequality operators (e.g. >, >=, <, <=).
For details, see the Query Partitioning documentation.
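To make the idea of pruning with inequality operators concrete, here is a minimal sketch that keeps only the partitions whose value range could satisfy a predicate such as col >= 150. The (name, min, max) partition layout and the prune function are assumptions for illustration, not Spice's internal representation.

```python
import operator

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}

def prune(partitions, op, literal):
    """Keep partitions whose [min, max] range could contain a row matching `col <op> literal`."""
    kept = []
    for name, lo, hi in partitions:
        # For > and >= the most favorable value is the partition max; for < and <= the min.
        candidate = hi if op in (">", ">=") else lo
        if OPS[op](candidate, literal):
            kept.append(name)
    return kept

# Hypothetical partitions described as (name, min value, max value).
parts = [("p0", 0, 99), ("p1", 100, 199), ("p2", 200, 299)]
print(prune(parts, ">=", 150))  # ['p1', 'p2']
print(prune(parts, "<", 100))   # ['p0']
```

Partitions that are pruned never need to be scanned, which is where the query-time savings come from.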
Client-Supplied Cache Keys: The HTTP and Arrow Flight SQL query APIs now accept a Spice-Cache-Key header (or metadata key) for fine-grained, client-controlled caching.
Example HTTP API usage:
```bash
$ curl -vvS -XPOST http://localhost:8090/v1/sql \
  -H"spice-cache-key: 1851400_20170216_north_america" \
  -d "select * from scihub_journals_accessed
  where user_id = '1851400'
  and date_trunc('DAY', timestamp) = '2017-02-16'
  and city = 'New York';"
```
Example Response:
```bash
< HTTP/1.1 200 OK
< content-type: application/json
< x-cache: Hit from spiceai
< results-cache-status: HIT
< vary: Spice-Cache-Key
< vary: origin, access-control-request-method, access-control-request-headers
< content-length: 604
< date: Wed, 23 Jul 2025 20:26:12 GMT
<
[{
    "timestamp": "2017-02-16 09:55:06",
    "doi": "10.1155/2012/650929",
    "ip_identifier": 1000856,
    "user_id": 1851400,
    "country": "United States",
    "city": "New York",
    "longitude": 40.7830603,
    "latitude": -73.9712488
  },
  ...
]
```
For details, see the Cache Control documentation.
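The same request can be prepared from a client program. The sketch below builds, without sending, a POST to /v1/sql that carries a cache key, using only the Python standard library. The helper name and default base URL are illustrative assumptions; the endpoint and header name come from the example above.

```python
import urllib.request

def build_sql_request(sql: str, cache_key: str, base_url: str = "http://localhost:8090"):
    """Build (but do not send) a POST to /v1/sql carrying a client-supplied cache key."""
    return urllib.request.Request(
        f"{base_url}/v1/sql",
        data=sql.encode("utf-8"),
        headers={"Spice-Cache-Key": cache_key},
        method="POST",
    )

req = build_sql_request("select 1", "1851400_20170216_north_america")
# urllib normalizes stored header names with str.capitalize(), hence "Spice-cache-key".
print(req.get_method(), req.full_url, req.get_header("Spice-cache-key"))
```

Passing the request to urllib.request.urlopen would send it to a running Spice instance; the results-cache-status response header shown above then indicates whether the key produced a cache hit.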
Contributors
- @Jeadie
- @Advayp
- @sgrebnov
- @kczimm
- @lukekim
- @phillipleblanc
- @mach-kernel
- @varunguleriaCodes
- @peasee
- @Sevenannn
- @ewgenius
New Contributors
- @varunguleriaCodes made their first contribution in github.com/spiceai/spiceai/pull/6383
Breaking Changes
- N/A
Cookbook Updates
No new recipes added in this release.
The Spice Cookbook includes 74 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.5.1, use one of the following methods:
CLI:
```console
spice upgrade
```
Homebrew:
```console
brew upgrade spiceai/spiceai/spice
```
Docker:
Pull the spiceai/spiceai:1.5.1 image:
```console
docker pull spiceai/spiceai:1.5.1
```
For available tags, see DockerHub.
Helm:
```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```
What's Changed
Dependencies
No major dependency updates.
Changelog
- Fix refresh via Api when dataset is already accelerated and no refresh interval is set by @sgrebnov in https://github.com/spiceai/spiceai/pull/6549
- Add support for custom GraphQL unnesting behavior by @Advayp in https://github.com/spiceai/spiceai/pull/6540
- Regex Update to disallow hyphens dataset names by @varunguleriaCodes in https://github.com/spiceai/spiceai/pull/6383
- Enforce max limit on comments fetched per PR by @Advayp in https://github.com/spiceai/spiceai/pull/6580
- Fix accelerated refresh issue by @Advayp in https://github.com/spiceai/spiceai/pull/6590
- Enable configurations of max invocations for Bedrock models by @Advayp in https://github.com/spiceai/spiceai/pull/6592
- Client-supplied cache keys (Spice-Cache-Key) by @mach-kernel in https://github.com/spiceai/spiceai/pull/6579
- Improved partition pruning by @kczimm in https://github.com/spiceai/spiceai/pull/6582
- Fix retention filter when both retention_sql and period are set by @sgrebnov in https://github.com/spiceai/spiceai/pull/6595
- Initial support for PR comments by @Advayp in https://github.com/spiceai/spiceai/pull/6569
- chore: Update croner by @peasee in https://github.com/spiceai/spiceai/pull/6547
- fix databricks streaming for Claude model by @peasee in https://github.com/spiceai/spiceai/pull/6601
- Remove FullTextUDTFAnalyzerRule and move FTS code into search crate by @jeadie in https://github.com/spiceai/spiceai/pull/6596
- Remove download of legacy sentence transformers config by @jeadie in https://github.com/spiceai/spiceai/pull/6605
- re-add snapshot tests by @jeadie
- Embedding column config to support client-specified vector sizes by @mach-kernel in https://github.com/spiceai/spiceai/pull/6610
- Fix mismatch in columns for the GitHub PR table type by @Advayp in https://github.com/spiceai/spiceai/pull/6616
- bump version to 1.5.1 by @phillipleblanc
- fix issues with cherry-picking by @jeadie
- Add integration tests for GitHub PRs with comments by @Advayp in https://github.com/spiceai/spiceai/pull/6581
- Add view name to view creation errors by @lukekim in https://github.com/spiceai/spiceai/pull/6611
- CDC: Compute embeddings on ingest by @mach-kernel in https://github.com/spiceai/spiceai/pull/6612
Published by Jeadie 7 months ago
https://github.com/spiceai/spiceai - v1.5.0-rc.3
Spice v1.5.0-rc.3 (July 16, 2025)
This is the third release candidate for v1.5.0, building on the capabilities introduced in v1.5.0-rc.2. This release introduces native support for Amazon S3 Vectors, enabling petabyte-scale vector search directly from S3 vector buckets, alongside SQL-integrated vector and full-text search, partitioning for DuckDB acceleration, and automated refreshes for search indexes and views. It also includes the AWS Bedrock Embeddings Model Provider, the Oracle Database connector, the now-stable Spice.ai Cloud Data Connector, and an upgrade to DuckDB v1.3.2.
What's New in v1.5.0-rc.3
Amazon S3 Vectors Support: Spice.ai now integrates with Amazon S3 Vectors, launched in public preview on July 15, 2025, enabling vector-native object storage with built-in indexing and querying. This integration supports semantic search, recommendation systems, and retrieval-augmented generation (RAG) at petabyte scale with S3’s durability and elasticity. Spice.ai manages the vector lifecycle—ingesting data, embedding it with models like Amazon Titan or Cohere via AWS Bedrock, or MiniLM L6 available from HuggingFace, and storing it in S3 Vector buckets.
Example Spicepod.yml configuration for S3 Vectors:
```yaml
datasets:
  - from: s3://my_vector_bucket/data/
    name: my_vectors
    params:
      file_format: parquet
    acceleration:
      enabled: true
    vectors:
      engine: s3_vectors
      params:
        s3_vectors_aws_region: us-east-2
        s3_vectors_bucket: my-s3-vectors-bucket
    columns:
      - name: content
        embeddings:
          - from: bedrock_titan
            row_id:
              - id
```
Example SQL query using S3 Vectors:
```sql
SELECT *
FROM vector_search(my_vectors, 'Cricket bats', 10)
WHERE price < 100
ORDER BY score
```
For more details, refer to the S3 Vectors Documentation.
Highlights in v1.5.0-rc.3
SQL-integrated Search: Vector and full-text search capabilities are now natively available in SQL queries, extending the power of the POST v1/search endpoint to all SQL workflows.
Example Vector-Similarity-Search (VSS) using the vector_search UDTF on the table reviews for the search term 'Cricket bats':
```sql
SELECT review_id, review_text, review_date, score
FROM vector_search(reviews, 'Cricket bats')
WHERE country_code = 'AUS'
LIMIT 3
```
Example Full-Text-Search (FTS) using the text_search UDTF on the table reviews for the search term 'Cricket bats':
```sql
SELECT review_id, review_text, review_date, score
FROM text_search(reviews, 'Cricket bats')
LIMIT 3
```
DuckDB v1.3.2 Upgrade: Upgraded DuckDB engine from v1.1.3 to v1.3.2. Key improvements include support for adding primary keys to existing tables, resolution of over-eager unique constraint checking for smoother inserts, and 13% reduced runtime on TPC-H SF100 queries through extensive optimizer refinements. The v1.2.x release of DuckDB was skipped due to a regression in indexes.
- Read the DuckDB v1.2.0 announcement.
- Read the DuckDB v1.3.0 announcement.
Partitioned Acceleration: DuckDB file-based accelerations now support partition_by expressions, enabling queries to scale to large datasets through automatic data partitioning and query predicate pruning. New UDFs, bucket and truncate, simplify partition logic.
New UDFs useful for partition_by expressions:
- bucket(num_buckets, col): Partitions a column into a specified number of buckets based on a hash of the column value.
- truncate(width, col): Truncates a column to a specified width, aligning values to the nearest lower multiple (e.g., truncate(10, 101) = 100).
Example Spicepod.yml configuration:
yaml
datasets:
  - from: s3://my_bucket/some_large_table/
    name: my_table
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      partition_by: bucket(100, account_id) # Partition account_id into 100 buckets
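The semantics of the two partitioning UDFs can be illustrated with a short Python sketch. This is an approximation of the behavior described above, not Spice's implementation. In particular, zlib.crc32 is only a stand-in, because the hash function used by bucket is not stated in these notes.

```python
import zlib


def truncate(width: int, value: int) -> int:
    """Align value down to the nearest lower multiple of width."""
    return value - (value % width)


def bucket(num_buckets: int, value: str) -> int:
    """Map a value to one of num_buckets partitions via a hash.

    zlib.crc32 is a stand-in hash chosen for this sketch (an assumption).
    """
    return zlib.crc32(value.encode("utf-8")) % num_buckets


print(truncate(10, 101))  # 100

accounts = ["acct-1", "acct-2", "acct-3"]  # hypothetical account_id values
partitions = {account: bucket(100, account) for account in accounts}
# An equality predicate such as account_id = 'acct-2' only needs the single
# partition bucket(100, 'acct-2'); all other partitions can be skipped.
```

Because the partition is a pure function of the column value, a predicate on that column identifies the partitions to read before any data is scanned.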
Full-Text-Search (FTS) Index Refresh: Accelerated datasets with search indexes maintain up-to-date results with configurable refresh intervals.
Example refreshing search indexes on body every 10 seconds:
yaml
datasets:
  - from: github:github.com/spiceai/docs/pulls
    name: spiceai.doc.pulls
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    acceleration:
      enabled: true
      refresh_mode: full
      refresh_check_interval: 10s
    columns:
      - name: body
        full_text_search:
          enabled: true
          row_id:
            - id
Scheduled View Refresh: Accelerated Views now support cron-based refresh schedules using refresh_cron, automating updates for accelerated data.
Example Spicepod.yml configuration:
yaml
views:
  - name: my_view
    sql: SELECT 1
    acceleration:
      enabled: true
      refresh_cron: '0 * * * *' # Every hour
For more details, refer to Scheduled Refreshes.
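To see when a five-field cron expression such as '0 * * * *' fires, the following Python sketch computes the next matching minute. It is a deliberately simplified matcher written for illustration: it supports only '*' and single integers, and it omits ranges, lists, steps, and the day-of-month/day-of-week OR rule of full cron. It is not the scheduler Spice uses.

```python
from datetime import datetime, timedelta


def field_matches(field: str, value: int) -> bool:
    """Match one cron field; only '*' and a single integer are supported here."""
    return field == "*" or int(field) == value


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if a five-field cron expression fires at the given minute."""
    minute, hour, day, month, weekday = expr.split()
    cron_weekday = (when.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(day, when.day)
        and field_matches(month, when.month)
        and field_matches(weekday, cron_weekday)
    )


def next_run(expr: str, after: datetime, max_minutes: int = 366 * 24 * 60) -> datetime:
    """Scan forward minute by minute (up to about a year) for the next match."""
    candidate = after.replace(second=0, microsecond=0)
    for _ in range(max_minutes):
        candidate += timedelta(minutes=1)
        if cron_matches(expr, candidate):
            return candidate
    raise ValueError(f"no match for {expr!r} within {max_minutes} minutes")


print(next_run("0 * * * *", datetime(2025, 7, 14, 9, 17)))  # 2025-07-14 10:00:00
```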
Multi-column Vector Search: For datasets configured with embeddings on more than one column, POST v1/search and similarity_search perform parallel vector search on each column, aggregating results using reciprocal rank fusion.
Example Spicepod.yml for multi-column search:
yaml
datasets:
  - from: github:github.com/apache/datafusion/issues
    name: datafusion.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    columns:
      - name: title
        embeddings:
          - from: hf_minilm
      - name: body
        embeddings:
          - from: openai_embeddings
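Reciprocal rank fusion combines several ranked lists by summing 1 / (k + rank) for each item across the lists. The Python sketch below shows the general technique for two per-column result lists. The constant k = 60 is a commonly used default and an assumption here, as are the issue IDs; the notes do not state which constant or tie-breaking rule Spice applies.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: each list adds 1 / (k + rank) per item (rank starts at 1)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


title_hits = ["issue-7", "issue-2", "issue-9"]  # hypothetical ranking from the title embedding
body_hits = ["issue-2", "issue-5", "issue-7"]   # hypothetical ranking from the body embedding
fused = reciprocal_rank_fusion([title_hits, body_hits])
print([doc_id for doc_id, _ in fused])  # ['issue-2', 'issue-7', 'issue-5', 'issue-9']
```

Items that rank well in both columns (issue-2 and issue-7) rise above items that appear in only one list.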
AWS Bedrock Embeddings Model Provider: Added support for AWS Bedrock embedding models, including Amazon Titan Text Embeddings and Cohere Text Embeddings.
Example Spicepod.yml:
yaml
embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      aws_region: us-east-1
      input_type: search_document
      truncate: END
  - from: bedrock:amazon.titan-embed-text-v2:0
    name: titan-embeddings
    params:
      aws_region: us-east-1
      dimensions: '256'
For more details, refer to the AWS Bedrock Embedding Models Documentation.
Oracle Data Connector: Use from: oracle: to access and accelerate data stored in Oracle databases, deployed on-premises or in the cloud.
Example Spicepod.yml:
yaml
datasets:
  - from: oracle:"SH"."PRODUCTS"
    name: products
    params:
      oracle_host: 127.0.0.1
      oracle_username: scott
      oracle_password: tiger
See the Oracle Data Connector documentation.
Spice.ai Cloud Data Connector: Graduated to Stable.
Contributors
Breaking Changes
- Search HTTP API Response: The POST v1/search response payload has changed. See the new API documentation for details.
- Model Provider Parameter Prefixes: Model Provider parameters use provider-specific prefixes instead of openai_ prefixes (e.g., hf_temperature for HuggingFace, anthropic_max_completion_tokens for Anthropic, perplexity_tool_choice for Perplexity). The openai_ prefix remains supported for backward compatibility but is deprecated and will be removed in a future release.
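The prefix change can be pictured as a lookup that prefers the provider-specific key and falls back to the deprecated openai_ key. The Python sketch below illustrates that precedence only. The function name resolve_param and the parameter values are hypothetical, and the runtime's actual resolution logic may differ.

```python
from typing import Optional


def resolve_param(params: dict[str, str], provider_prefix: str, name: str) -> Optional[str]:
    """Prefer `<provider_prefix>_<name>`, then fall back to the deprecated `openai_<name>`."""
    preferred = f"{provider_prefix}_{name}"
    if preferred in params:
        return params[preferred]
    return params.get(f"openai_{name}")


legacy = {"openai_temperature": "0.2"}                              # old-style configuration
migrated = {"hf_temperature": "0.7", "openai_temperature": "0.2"}   # both keys present
print(resolve_param(legacy, "hf", "temperature"))    # 0.2
print(resolve_param(migrated, "hf", "temperature"))  # 0.7
```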
Cookbook Updates
- Added Oracle Data Connector cookbook: Connect to tables in Oracle databases.
- Added Hashed Partitioning with DuckDB cookbook: Accelerate data on large datasets by partitioning data into a fixed number of buckets.
The Spice Cookbook now includes 72 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.5.0-rc.3, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.5.0-rc.3 or pull the v1.5.0-rc.3 Docker image (spiceai/spiceai:1.5.0-rc.3).
What's Changed
Dependencies
- delta_kernel: Upgraded to v0.12.1
- DuckDB: Upgraded from v1.1.3 to v1.3.2
- iceberg-rust: Upgraded from v0.4.0 to v0.5.1
Changelog
- v1.5.0-rc.2 release notes (#6440) by @lukekim in #6440
- Amazon S3 Vectors support by @lukekim in #6468
Published by phillipleblanc 8 months ago
https://github.com/spiceai/spiceai - v1.5.0-rc.2
Spice v1.5.0-rc.2 (July 14, 2025)
This is the second release candidate for v1.5.0, which introduces SQL-integrated vector and full-text search, partitioning for DuckDB acceleration, and automated refreshes for search indexes and views. It adds a new AWS Bedrock Embeddings Model Provider, a new Oracle Database connector, and promotes the Spice.ai Cloud Data Connector to stable, alongside multi-column vector search for expanded search. This release also upgrades DuckDB from v1.1.3 to v1.3.2, accelerating Spice.ai datasets with improved indexes, query performance, and internal storage optimizations.
What's New in v1.5.0-rc.2
SQL-integrated Search: Vector and full-text search capabilities are now natively available in SQL queries, extending the power of the POST v1/search endpoint to all SQL workflows.
Example Vector-Similarity-Search (VSS) using the new vector_search UDTF on the table reviews for the search term "Cricket bats".
sql
SELECT review_id, review_text, review_date, score
FROM vector_search(reviews, 'Cricket bats')
WHERE country_code = 'AUS'
LIMIT 3
Example Full-Text-Search (FTS) using the new text_search UDTF on the table reviews for the search term "Cricket bats".
sql
SELECT review_id, review_text, review_date, score
FROM text_search(reviews, 'Cricket bats')
LIMIT 3
DuckDB v1.3.2 Upgrade: Upgraded DuckDB engine from v1.1.3 to v1.3.2. Key improvements include support for adding primary keys to existing tables, resolution of over-eager unique constraint checking for smoother inserts, and 13% reduced runtime on TPC-H SF100 queries through extensive optimizer refinements. The v1.2.x release of DuckDB was skipped due to a regression in indexes.
- Read the DuckDB v1.2.0 announcement.
- Read the DuckDB v1.3.0 announcement.
Partitioned Acceleration: DuckDB file-based accelerations now support partition_by expressions, enabling queries to scale to large datasets through automatic data partitioning and query predicate pruning. New UDFs, bucket and truncate, simplify partition logic.
New UDFs useful for partition_by expressions:
- bucket(num_buckets, col): Partitions a column into a specified number of buckets based on a hash of the column value.
- truncate(width, col): Truncates a column to a specified width, aligning values to the nearest lower multiple (e.g., truncate(10, 101) = 100).
Example Spicepod.yml configuration:
yaml
datasets:
  - from: s3://my_bucket/some_large_table/
    name: my_table
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      partition_by: bucket(100, account_id) # Partition account_id into 100 buckets
Full-Text-Search (FTS) Index Refresh: Accelerated datasets with search indexes maintain up-to-date results with configurable refresh intervals.
Example refreshing search indexes on body every 10 seconds (based on acceleration.refresh_check_interval).
yaml
datasets:
  - from: github:github.com/spiceai/docs/pulls
    name: spiceai.doc.pulls
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    acceleration:
      enabled: true
      refresh_mode: full
      refresh_check_interval: 10s
    columns:
      - name: body
        full_text_search:
          enabled: true
          row_id:
            - id
Scheduled View Refresh: Accelerated Views now support cron-based refresh schedules using refresh_cron, automating updates for accelerated data.
Example Spicepod.yml configuration:
yaml
views:
  - name: my_view
    sql: SELECT 1
    acceleration:
      enabled: true
      refresh_cron: '0 * * * *' # Every hour
For more details, refer to Scheduled Refreshes.
Multi-column Vector Search: For datasets configured with embeddings on more than one column, POST v1/search and similarity_search will perform parallel vector search on each column, and aggregate results using a reciprocal rank fusion scoring method.
Example Spicepod.yml where search results will consider both the GitHub issue's title and the content of its body.
yaml
datasets:
  - from: github:github.com/apache/datafusion/issues
    name: datafusion.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    columns:
      - name: title
        embeddings:
          - from: hf_minilm
      - name: body
        embeddings:
          - from: openai_embeddings
AWS Bedrock Embeddings Model Provider: Added support for AWS Bedrock embedding models, including Amazon Titan Text Embeddings and Cohere Text Embeddings.
Example Spicepod.yml:
yaml
embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      aws_region: us-east-1
      input_type: search_document
      truncate: END
  - from: bedrock:amazon.titan-embed-text-v2:0
    name: titan-embeddings
    params:
      aws_region: us-east-1
      dimensions: '256'
For more details, refer to the AWS Bedrock Embedding Models Documentation.
Oracle Data Connector: Use from: oracle: to access and accelerate data stored in Oracle databases, deployed on-premises or in the cloud.
Example Spicepod.yml:
yaml
datasets:
  - from: oracle:"SH"."PRODUCTS"
    name: products
    params:
      oracle_host: 127.0.0.1
      oracle_username: scott
      oracle_password: tiger
See the Oracle Data Connector documentation for details.
Spice.ai Cloud Data Connector: Graduated to Stable.
Contributors
Breaking Changes
- Search HTTP API Response: The POST v1/search response payload has changed. See the new API documentation for details.
- Model Provider Parameter Prefixes: Model Provider parameters use provider-specific prefixes instead of openai_ prefixes (e.g., hf_temperature instead of openai_temperature for HuggingFace, anthropic_max_completion_tokens for Anthropic, perplexity_tool_choice for Perplexity). The openai_ prefix remains supported for backward compatibility but is now deprecated and will be removed in a future release.
Cookbook Updates
- Added Oracle Data Connector cookbook: Connect to tables in Oracle databases.
- Added Hashed Partitioning with DuckDB cookbook: Accelerate data on large datasets by partitioning data into a fixed number of buckets.
The Spice Cookbook now includes 72 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.5.0-rc.2, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.5.0-rc.2 or pull the v1.5.0-rc.2 Docker image (spiceai/spiceai:1.5.0-rc.2).
What's Changed
Dependencies
- delta_kernel: Upgraded to v0.12.1
- DuckDB: Upgraded from v1.1.3 to v1.3.2
- iceberg: Upgraded from v0.4.0 to v0.5.1
Changelog
- fix llm integraion test (#6398) by @Sevenannn in #6398
- Promote spice cloud connector to stable quality (#6221) by @Sevenannn in #6221
- v1.5.0-rc.1 release notes (#6397) by @lukekim in #6397
- Fix model nsql integration tests (#6365) by @Sevenannn in #6365
- Fix incorrect UDTF name and SQL query (#6404) by @lukekim in #6404
- Update v1.5.0-rc.1.md (#6407) by @sgrebnov in #6407
- Improve error messages (#6405) by @lukekim in #6405
- build(deps): bump Jimver/cuda-toolkit from 0.2.25 to 0.2.26 (#6388) by @app/dependabot in #6388
- Upgrade dependabot dependencies (#6411) by @phillipleblanc in #6411
- Fix projection pushdown issues for document based file connector (#6362) by @Advayp in #6362
- Create a new crate for UDFs (#6416) by @kczimm in #6416
- Add a PartitionedDuckDB Accelerator (#6338) by @kczimm in #6338
- Use vector_search() UDTF in HTTP APIs (#6417) by @Jeadie in #6417
- add supported types (#6409) by @kczimm in #6409
- Enable session time zone override for MySQL (#6426) by @sgrebnov in #6426
- Acceleration-like indexing for full text search indexes. (#6382) by @Jeadie in #6382
- Provide error message when partition by expression changes (#6415) by @kczimm in #6415
- Add support for Oracle Autonomous Database connections (Oracle Cloud) (#6421) by @sgrebnov in #6421
- prune partitions for exact and in list with and without UDFs (#6423) by @kczimm in #6423
- Fixes and reenable FTS tests (#6431) by @Jeadie in #6431
- Updating text-embedding-inference & mistralrs dependency (#6366) by @Jeadie in #6366
- Upgrade DuckDB to 1.3.2 (#6434) by @phillipleblanc in #6434
- Fix issue in limit clause for the Github Data connector (#6443) by @Advayp in #6443
- Upgrade iceberg-rust to 0.5.1 (#6446) by @phillipleblanc in #6446
Published by peasee 8 months ago
https://github.com/spiceai/spiceai - v1.5.0-rc.1
Spice v1.5.0-rc.1 (July 7, 2025)
This is the first release candidate for v1.5.0, which introduces partitioning for DuckDB acceleration, SQL-integrated vector and full-text search, and automated refreshes for search indexes and views. It adds a new AWS Bedrock Embeddings Model Provider, a new Oracle Database connector, and promotes the Spice.ai Cloud Data Connector to stable, alongside multi-column vector search for expanded search.
What's New in v1.5.0-rc.1
Partitioned Acceleration: DuckDB file-based accelerations now support partition_by expressions, enabling queries to scale to large datasets through automatic data partitioning and query predicate pruning. New UDFs, bucket and truncate, simplify partition logic.
New UDFs useful for partition_by expressions:
- bucket(num_buckets, col): Partitions a column into a specified number of buckets based on a hash of the column value.
- truncate(width, col): Truncates a column to a specified width, aligning values to the nearest lower multiple (e.g., truncate(10, 101) = 100).
Example Spicepod.yml configuration:
yaml
datasets:
  - from: s3://my_bucket/some_large_table/
    name: my_table
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      partition_by: bucket(100, account_id) # Partition account_id into 100 buckets
SQL-integrated Search: Vector and full-text search capabilities are now natively available in SQL queries, extending the power of the POST v1/search endpoint to all SQL workflows.
Example Vector-Similarity-Search (VSS) using the new vector_search UDTF on the table reviews for the search term "Cricket bats".
sql
SELECT review_id, review_text, review_date, score
FROM vector_search(reviews, 'Cricket bats')
WHERE country_code = 'AUS'
LIMIT 3
Example Full-Text-Search (FTS) using the new text_search UDTF on the table reviews for the search term "Cricket bats".
sql
SELECT review_id, review_text, review_date, score
FROM text_search(reviews, 'Cricket bats')
LIMIT 3
Full-Text-Search (FTS) Index Refresh: Accelerated datasets with search indexes maintain up-to-date results with configurable refresh intervals.
Example refreshing search indexes on body every 10 seconds (based on acceleration.refresh_check_interval).
yaml
datasets:
  - from: github:github.com/spiceai/docs/pulls
    name: spiceai.doc.pulls
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    acceleration:
      enabled: true
      refresh_mode: full
      refresh_check_interval: 10s
    columns:
      - name: body
        full_text_search:
          enabled: true
          row_id:
            - id
Scheduled View Refresh: Accelerated Views now support cron-based refresh schedules using refresh_cron, automating updates for accelerated data.
Example Spicepod.yml configuration:
yaml
views:
  - name: my_view
    sql: SELECT 1
    acceleration:
      enabled: true
      refresh_cron: '0 * * * *' # Every hour
For more details, refer to Scheduled Refreshes.
Multi-column Vector Search: For datasets configured with embeddings on more than one column, POST v1/search and similarity_search will perform parallel vector search on each column, and aggregate results using a reciprocal rank fusion scoring method.
Example Spicepod.yml where search results will consider both the GitHub issue's title and the content of its body.
yaml
datasets:
  - from: github:github.com/apache/datafusion/issues
    name: datafusion.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    columns:
      - name: title
        embeddings:
          - from: hf_minilm
      - name: body
        embeddings:
          - from: openai_embeddings
AWS Bedrock Embeddings Model Provider: Added support for AWS Bedrock embedding models, including Amazon Titan Text Embeddings and Cohere Text Embeddings.
Example Spicepod.yml:
yaml
embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      aws_region: us-east-1
      input_type: search_document
      truncate: END
  - from: bedrock:amazon.titan-embed-text-v2:0
    name: titan-embeddings
    params:
      aws_region: us-east-1
      dimensions: '256'
For more details, refer to the AWS Bedrock Embedding Models Documentation.
Oracle Data Connector: Use from: oracle: to access and accelerate data stored in Oracle databases, deployed on-premises or in the cloud.
Example Spicepod.yml:
yaml
datasets:
  - from: oracle:"SH"."PRODUCTS"
    name: products
    params:
      oracle_host: 127.0.0.1
      oracle_username: scott
      oracle_password: tiger
See the Oracle Data Connector documentation for details.
Spice.ai Cloud Data Connector: Graduated to Stable.
Contributors
Breaking Changes
- Search HTTP API Response: The POST v1/search response payload has changed. See the new API documentation for details.
- Model Provider Parameter Prefixes: Model Provider parameters use provider-specific prefixes instead of openai_ prefixes (e.g., hf_temperature instead of openai_temperature for HuggingFace, anthropic_max_completion_tokens for Anthropic, perplexity_tool_choice for Perplexity). The openai_ prefix remains supported for backward compatibility but is now deprecated and will be removed in a future release.
Cookbook Updates
- Added Oracle Data Connector cookbook: Connect to tables in Oracle databases.
The Spice Cookbook now includes 71 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.5.0-rc.1, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.5.0-rc.1 or pull the v1.5.0-rc.1 Docker image (spiceai/spiceai:1.5.0-rc.1).
What's Changed
Dependencies
- delta_kernel: Upgraded to v0.12.1
Changelog
- Jeadie/25 06 10/finance (#6182) by @Jeadie in #6182
- chore: Update dependencies (#6196) by @peasee in #6196
- Fix FlightSQL GetDbSchemas and GetTables schemas to fully match the protocol (#6197) by @sgrebnov in #6197
- Use spice-rs in test operator and retry on connection reset error (#6136) by @Sevenannn in #6136
- Move model-grading evals to testoperator (#6195) by @Jeadie in #6195
- Don't use base table for full text search post apply vector search (#6215) by @Jeadie in #6215
- Fix content-type header in v1/sql response (#6217) by @Jeadie in #6217
- Add v1.4.0-rc.1 release into qa_analytics.csv (#6209) by @sgrebnov in #6209
- fix: Reschedule AI benchmarks, set max parallel to 1 (#6224) by @peasee in #6224
- task: Add MySQL indexes (#6227) by @peasee in #6227
- fix pagination (#6222) by @Jeadie in #6222
- Add build links to release notes (#6220) by @kczimm in #6220
- feat: Enable additional testoperator tests (#6218) by @peasee in #6218
- chore: Update testoperator release target to 1.4 (#6235) by @peasee in #6235
- fix: Update benchmark snapshots (#6234) by @app/github-actions in #6234
- fix: Lower SF100 memory limit (#6236) by @peasee in #6236
- Add glue integration test using hive and iceberg tables (#6248) by @kczimm in #6248
- allow database for empty patterns (#6258) by @kczimm in #6258
- add Glue catalog to README.md (#6179) by @kczimm in #6179
- Add bucket UDF for partitioning (#6200) by @kczimm in #6200
- New tool parsley (#6232) by @Jeadie in #6232
- Upgrade dependabot dependencies (#6261) by @phillipleblanc in #6261
- Upgrade delta_kernel to 0.12.1 (#6263) by @phillipleblanc in #6263
- fix: Throughput test dispatching (#6265) by @peasee in #6265
- fix: badges on README.md show correct status (#6268) by @phillipleblanc in #6268
- Extend Flight CommandGetTables with source native data type info (#6259) by @sgrebnov in #6259
- fix: Docker image build with profile (#6270) by @peasee in #6270
- docs: Post-release update (#6275) by @peasee in #6275
- Improve error message for incorrect/missing Glue table or database (#6257) by @kczimm in #6257
- Update spicepod.schema.json (#6274) by @app/github-actions in #6274
- Update openapi.json (#6279) by @app/github-actions in #6279
- Add Remote Spicepod support (#6233) by @phillipleblanc in #6233
- Update QA analytics for v1.4.0 (#6277) by @ewgenius in #6277
- Add truncate UDF (#6278) by @kczimm in #6278
- Update qa_analytics.csv for 1.4.0 (#6284) by @sgrebnov in #6284
- Default grok to 'grok-3' (#6285) by @Jeadie in #6285
- For Spice.ai connectors, do not default to dev SCP for dev builds (#6254) by @Jeadie in #6254
- fix: Deny extra caching parameters (#6288) by @peasee in #6288
- Make DynamoDB connectivity errors more specific and actionable (#6294) by @sgrebnov in #6294
- Create a table provider from full text search index + query (#6286) by @Jeadie in #6286
- Update Flight CommandGetTables to Return Native DataFusion SQL Data Types (#6297) by @sgrebnov in #6297
- Adds a synchronous get_table function on the DataFusion context (#6300) by @phillipleblanc in #6300
- Better Glue connector error messages (#6289) by @kczimm in #6289
- fix: consume response stream before reading authorization metadata (#6292) by @Sevenannn in #6292
- feat: Use retryable stream in test operator (#6231) by @Sevenannn in #6231
- Support reserved word column names in DynamoDB (#6308) by @sgrebnov in #6308
- fix: Implement Default manually for SQLResultsCacheConfig (#6310) by @peasee in #6310
- Add integration test for DynamoDB Data Connector (#6311) by @sgrebnov in #6311
- fix: Warn about no configured datasets if no datasets and catalogs are present (#6296) by @Advayp in #6296
- Add better error messages for cases when a port is already in use (#6313) by @Advayp in #6313
- Disallow datasets with protected names (#6309) by @Advayp in #6309
- Roadmap updates June 2025 (#6319) by @lukekim in #6319
- Add partitioning models (#6298) by @kczimm in #6298
- Encode ScalarValues for use in filenames (#6318) by @kczimm in #6318
- Standardize model parameter handling & prioritize <model-prefix>_<param> for model default overrides (#6199) by @Sevenannn in #6199
- Add initial support for Oracle Data Connector (#6321) by @sgrebnov in #6321
- Oracle connector: Support all major Oracle data types (#6323) by @sgrebnov in #6323
- Oracle connector: support filter predicate pushdown (#6326) by @sgrebnov in #6326
- text_search UDTF and required AnalyzerRule. (#6280) by @Jeadie in #6280
- Build indexes as part of accelerations (#6324) by @phillipleblanc in #6324
- feat: Add support for cron-based view refresh (#6341) by @peasee in #6341
- Surface table not found errors immediately (#6317) by @Advayp in #6317
- runtime-datafusion-index: Stop infinite recursion for IndexTableScanOptimizerRule (#6353) by @phillipleblanc in #6353
- Add optional behaviors to DataAccelerator tables + add WantsUnderlyingTableBehavior to VoidTable (#6354) by @phillipleblanc in #6354
- AWS Bedrock models. (#6358) by @Jeadie in #6358
- Ensure views load even if they're the only components defined (#6359) by @Advayp in #6359
- Improve type conversion and add integration tests for the Oracle connector (#6327) by @sgrebnov in #6327
- Upgrade dependabot dependencies (#6375) by @phillipleblanc in #6375
- Don't run tests that require a Databricks cluster on every PR (#6379) by @phillipleblanc in #6379
- Properly handle duplicate flags to spice run (#6364) by @Advayp in #6364
- Fix the case sensitivity of the key in env secrets store (#6371) by @ewgenius in #6371
- vector_search UDTF and related changes (#6381) by @Jeadie in #6381
- Update end_game.md (#6380) by @sgrebnov in #6380
- fix: openai model endpoint (#6394) by @Sevenannn in #6394
- Enable Oracle connector in default build configuration by @sgrebnov in #6395
- Enable configuring otel endpoint from spice run by @Advayp in #6360
Published by phillipleblanc 8 months ago
https://github.com/spiceai/spiceai - v1.4.0-rc.1
Spice v1.4.0-rc.1 (June 11, 2025)
This release candidate for v1.4.0 upgrades DataFusion to v47 and Arrow to v55 for faster queries, more efficient Parquet/CSV handling, and improved reliability. It introduces the AWS Glue Catalog and Data Connectors for native access to Glue-managed data on S3 and supports Databricks U2M OAuth for secure Databricks user authentication. New cron-based dataset refreshes and worker schedules enable automated task management, while dataset and search results caching improvements further optimize query, search, and RAG performance.
What's New in v1.4.0-rc.1
DataFusion v47 Highlights
Spice.ai is built on the DataFusion query engine. The v47 release brings:
Performance Improvements 🚀: This release delivers major query speedups through specialized GroupsAccumulator implementations for first_value, last_value, and min/max on Duration types, eliminating unnecessary sorting and computation. TopK operations are now up to 10x faster thanks to early exit optimizations, while sort performance is further enhanced by reusing row converters, removing redundant clones, and optimizing sort-preserving merge streams. Logical operations benefit from short-circuit evaluation for AND/OR, reducing overhead, and additional enhancements address high latency from sequential metadata fetching, improve int/string comparison efficiency, and simplify logical expressions for better execution.
Bug Fixes & Compatibility Improvements 🛠️: The release addresses issues with external sort, aggregation, and window functions, improves handling of NULL values and type casting in arrays and binary operations, and corrects problems with complex joins and nested window expressions. It also addresses SQL unparsing for subqueries, aliases, and UNION BY NAME.
See the Apache DataFusion 47.0.0 Changelog for details.
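To show why TopK queries (ORDER BY with a LIMIT) can avoid sorting the full input, the Python sketch below keeps only the k best values in a bounded heap and discards everything else as it streams past. It illustrates the general TopK idea under simple assumptions; it is not DataFusion's operator and does not reproduce its early-exit optimization.

```python
import heapq
from typing import Iterable


def top_k_largest(values: Iterable[float], k: int) -> list[float]:
    """Return the k largest values in descending order using a size-k min-heap."""
    if k <= 0:
        return []
    heap: list[float] = []
    for value in values:
        if len(heap) < k:
            heapq.heappush(heap, value)
        elif value > heap[0]:
            # Replace the smallest retained value; smaller inputs are skipped cheaply.
            heapq.heapreplace(heap, value)
    return sorted(heap, reverse=True)


print(top_k_largest([5.0, 1.0, 9.0, 3.0, 7.0], k=2))  # [9.0, 7.0]
```

Memory stays proportional to k rather than to the input size, which is what makes a dedicated TopK path cheaper than a full sort followed by a limit.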
Arrow v55 Highlights
Arrow v55 delivers faster Parquet gzip compression, improved array concatenation, and better support for large files (4GB+) and modular encryption. Parquet metadata reads are now more efficient, with support for range requests and enhanced compatibility for INT96 timestamps and timezones. CSV parsing is more robust, with clearer error messages. These updates boost performance, compatibility, and reliability.
See the Arrow 55.0.0 Changelog and Arrow 55.1.0 Changelog for details.
Search Result Caching: Spice now supports runtime caching for search results, improving performance for subsequent searches and chat completion requests that use the document_similarity LLM tool. Caching is configurable with options like maximum size, item TTL, eviction policy, and hashing algorithm.
Example spicepod.yml configuration:
yaml
runtime:
  caching:
    search_results:
      enabled: true
      max_size: 128mb
      item_ttl: 5s
      eviction_policy: lru
      hashing_algorithm: siphash
For more information, refer to the Caching documentation.
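The interaction between item_ttl and the lru eviction policy can be sketched in Python. This is a simplified model for illustration: it bounds the cache by item count rather than by bytes (unlike max_size: 128mb), the class name LruTtlCache is hypothetical, and the clock is injected so the behavior can be demonstrated without waiting.

```python
import time
from collections import OrderedDict
from typing import Callable, Optional


class LruTtlCache:
    """Bounded cache: entries expire after ttl seconds; least recently used is evicted first."""

    def __init__(self, max_items: int, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.max_items = max_items
        self.ttl = ttl
        self.clock = clock
        self._entries = OrderedDict()  # key -> (stored_at, value), oldest use first

    def get(self, key: str) -> Optional[object]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at >= self.ttl:
            del self._entries[key]  # expired: drop and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key: str, value: object) -> None:
        self._entries[key] = (self.clock(), value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_items:
            self._entries.popitem(last=False)  # evict least recently used


now = [0.0]  # fake clock, in seconds
cache = LruTtlCache(max_items=2, ttl=5.0, clock=lambda: now[0])
cache.put("q1", ["doc-a"])
cache.put("q2", ["doc-b"])
cache.get("q1")             # q1 becomes most recently used
cache.put("q3", ["doc-c"])  # cache is full, so q2 (least recently used) is evicted
```

Advancing the fake clock past 5 seconds would turn the remaining entries into misses, which mirrors item_ttl: 5s in the configuration above.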
AWS Glue Catalog Connector: Connect to AWS Glue Data Catalogs to query Iceberg, Parquet, or CSV tables in S3.
Example spicepod.yml configuration:
yaml
catalogs:
  - from: glue
    name: my_glue_catalog
    params:
      glue_key: <your-access-key-id>
      glue_secret: <your-secret-access-key>
      glue_region: <your-region>
    include:
      - 'testdb.hive_*'
      - 'testdb.iceberg_*'
sql
sql> show tables;
+-----------------+--------------+-------------------+------------+
| table_catalog | table_schema | table_name | table_type |
+-----------------+--------------+-------------------+------------+
| my_glue_catalog | testdb | hive_table_001 | BASE TABLE |
| my_glue_catalog | testdb | iceberg_table_001 | BASE TABLE |
| spice | runtime | task_history | BASE TABLE |
+-----------------+--------------+-------------------+------------+
For more information, refer to the Glue Catalog Connector documentation.
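The effect of the include patterns can be approximated with Python's fnmatch module, assuming the patterns are globs matched against database.table names. The last two table names are hypothetical additions for contrast, and the runtime's matcher may differ in edge cases from fnmatch.

```python
from fnmatch import fnmatchcase

include = ["testdb.hive_*", "testdb.iceberg_*"]
tables = [
    "testdb.hive_table_001",
    "testdb.iceberg_table_001",
    "testdb.staging_events",   # hypothetical table that no pattern matches
    "otherdb.hive_table_001",  # hypothetical table in another database
]


def is_included(qualified_name: str, patterns: list[str]) -> bool:
    """Return True if the database-qualified table name matches any include pattern."""
    return any(fnmatchcase(qualified_name, pattern) for pattern in patterns)


selected = [table for table in tables if is_included(table, include)]
print(selected)  # ['testdb.hive_table_001', 'testdb.iceberg_table_001']
```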
AWS Glue Data Connector: Connect to specific tables in AWS Glue Data Catalogs to query Iceberg, Parquet, or CSV in S3.
Example spicepod.yml configuration:
yaml
datasets:
  - from: glue:my_database.my_table
    name: my_table
    params:
      glue_auth: key
      glue_region: us-east-1
      glue_key: ${secrets:AWS_ACCESS_KEY_ID}
      glue_secret: ${secrets:AWS_SECRET_ACCESS_KEY}
For more information, refer to the Glue Data Connector documentation.
Databricks U2M OAuth: Spice now supports User-to-Machine (U2M) authentication for Databricks when called with a compatible client, such as the Spice Cloud Platform.
yaml
datasets:
  - from: databricks:spiceai_sandbox.default.messages
    name: messages
    params:
      databricks_endpoint: ${secrets:DATABRICKS_ENDPOINT}
      databricks_cluster_id: ${secrets:DATABRICKS_CLUSTER_ID}
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
Dataset Refresh Schedules: Accelerated datasets now support a refresh_cron parameter, automatically refreshing the dataset on a defined cron schedule. Cron scheduled refreshes respect the global dataset_refresh_parallelism parameter.
Example spicepod.yml configuration:
yaml
datasets:
  - name: my_dataset
    from: s3://my-bucket/my_file.parquet
    acceleration:
      refresh_cron: 0 0 * * * # Daily refresh at midnight
For more information, refer to the Dataset Refresh Schedules documentation.
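When several cron schedules fire at the same moment, a global parallelism limit caps how many refreshes run at once. The Python sketch below models that idea with an asyncio semaphore. It is a conceptual illustration only: the function names are hypothetical, the sleep stands in for real refresh work, and it does not describe how the runtime implements dataset_refresh_parallelism.

```python
import asyncio


async def refresh_all(datasets: list[str], parallelism: int) -> int:
    """Refresh every dataset, never running more than `parallelism` at once.

    Returns the peak number of refreshes observed running concurrently.
    """
    limit = asyncio.Semaphore(parallelism)
    running = 0
    peak = 0

    async def refresh(name: str) -> None:
        nonlocal running, peak
        async with limit:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # placeholder for the actual refresh work
            running -= 1

    await asyncio.gather(*(refresh(name) for name in datasets))
    return peak


peak = asyncio.run(refresh_all(["a", "b", "c", "d", "e"], parallelism=2))
print(peak)  # 2
```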
Worker Execution Schedules: Workers now support a cron parameter and will execute an LLM-prompt or SQL query automatically on the defined cron schedule, in conjunction with a provided params.prompt.
Example spicepod.yml configuration:
yaml
workers:
  - name: email_reporter
    models:
      - from: gpt-4o
    params:
      prompt: 'Inspect the latest emails, and generate a summary report for them. Post the summary report to the connected Teams channel'
    cron: 0 2 * * * # Daily at 2am
For more information, refer to the Worker Execution Schedules documentation.
SQL Worker Actions: Spice now supports workers with sql actions, to execute automated SQL queries on a cron schedule:
yaml
workers:
  - name: my_worker
    cron: 0 * * * *
    sql: 'SELECT * FROM lineitem'
For more information, refer to the Workers with a SQL action documentation.
Contributors
Breaking Changes
- No breaking changes.
Cookbook Updates
- Added Glue Catalog Connector and Data Connector cookbooks: Connect to tables and databases in the AWS Glue Data catalog.
The Spice Cookbook now includes 69 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.4.0-rc.1, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.4.0-rc.1 or pull one of the nightly Docker images:
- 20250611-1883d57-models
- 20250611-1883d57-models-sysalloc
- 20250611-1883d57-models-mimalloc
- 20250611-1883d57-models-jemalloc
What's Changed
Dependencies
- DataFusion: Upgraded to v47
- arrow-rs: Upgraded to v55.1.0
- delta_kernel: Upgraded to v0.11.0
Changelog
- Update trunk to 1.4.0-unstable (#5878) by @phillipleblanc in #5878
- Update openapi.json (#5885) by @app/github-actions in #5885
- feat: Testoperator reports benchmark failure summary (#5889) by @peasee in #5889
- fix: Publish binaries to dev when platform option is all (#5905) by @peasee in #5905
- feat: Print dispatch current test count of total (#5906) by @peasee in #5906
- Include multiple duckdb files acceleration scenarios into testoperator dispatch (#5913) by @sgrebnov in #5913
- feat: Support building testoperator on dev (#5915) by @peasee in #5915
- Update spicepod.schema.json (#5927) by @app/github-actions in #5927
- Update ROADMAP & SECURITY for 1.3.0 (#5926) by @phillipleblanc in #5926
- Define SearchGeneration paradigm & use in Vector Search (#5876) by @Jeadie in #5876
- docs: Update qa_analytics.csv (#5928) by @peasee in #5928
- fix: Properly publish binaries to dev on push (#5931) by @peasee in #5931
- Load request context extensions on every flight incoming call (#5916) by @ewgenius in #5916
- Fix deferred loading for datasets with embeddings (#5932) by @ewgenius in #5932
- Schedule AI benchmarks to run every Mon and Thu evening PST (#5940) by @sgrebnov in #5940
- Fix explain plan snapshots for TPCDS queries Q36, Q70 & Q86 not being deterministic after DF 46 upgrade (#5942) by @phillipleblanc in #5942
- chore: Upgrade to Rust 1.86 (#5945) by @peasee in #5945
- Standardise HTTP settings across CLI (#5769) by @Jeadie in #5769
- Fix deferred flag for Databricks SQL warehouse mode (#5958) by @ewgenius in #5958
- Add deferred catalog loading (#5950) by @ewgenius in #5950
- Refactor deferred_load using ComponentInitialization enum for better clarity (#5961) by @ewgenius in #5961
- Post-release housekeeping (#5964) by @phillipleblanc in #5964
- add LTO for release builds (#5709) by @kczimm in #5709
- Fix dependabot/192 (#5976) by @Jeadie in #5976
- Fix Test-to-SQL benchmark scheduled run (#5977) by @sgrebnov in #5977
- Fix JSON to ScalarValue type conversion to match DataFusion behavior (#5979) by @sgrebnov in #5979
- Add v1.3.1 release notes (#5978) by @lukekim in #5978
- Define CandidateAggregation trait and implement RRF for multi column vector search. (#5943) by @Jeadie in #5943
- Regenerate nightly build workflow (#5995) by @ewgenius in #5995
- Fix DataFusion dependency loading in Databricks request context extension (#5987) by @ewgenius in #5987
- Update spicepod.schema.json (#6000) by @app/github-actions in #6000
- feat: Run MySQL SF100 on dev runners (#5986) by @peasee in #5986
- fix: Remove caching RwLock (#6001) by @peasee in #6001
- 1.3.1 Post-release housekeeping (#6002) by @phillipleblanc in #6002
- feat: Add initial scheduler crate (#5923) by @peasee in #5923
- fix flight request context scope (#6004) by @ewgenius in #6004
- fix: Ensure snapshots on different scale factors are retained (#6009) by @peasee in #6009
- fix: Allow dev runners in dispatch files (#6011) by @peasee in #6011
- refactor: Deprecate resultscache for caching.sqlresults (#6008) by @peasee in #6008
- Fix models benchmark results reporting (#6013) by @sgrebnov in #6013
- fix: Run PR checks for tools/ changes (#6014) by @peasee in #6014
- feat: Add a CronRequestChannel for scheduler (#6005) by @peasee in #6005
- feat: Add refresh_cron acceleration parameter, start scheduler on table load (#6016) by @peasee in #6016
- Update license check to allow dual license crates (#6021) by @sgrebnov in #6021
- Initial worker concept (#5973) by @Jeadie in #5973
- Don't fail if cargo-deny already installed (license check) (#6023) by @sgrebnov in #6023
- Upgrade to DataFusion 47 and Arrow 55 (#5966) by @sgrebnov in #5966
- Read Iceberg tables from Glue Catalog Connector (#5965) by @kczimm in #5965
- Handle multiple highlights in v1/search UX (#5963) by @Jeadie in #5963
- feat: Add cron scheduler configurations for workers (#6033) by @peasee in #6033
- feat: Add search cache configuration and results wrapper (#6020) by @peasee in #6020
- Fix GitHub Actions Ubuntu for more workflows (#6040) by @phillipleblanc in #6040
- Fix Actions for testoperator dispatch manual (#6042) by @phillipleblanc in #6042
- refactor: Remove worker type (#6039) by @peasee in #6039
- feat: Support cron dataset refreshes (#6037) by @peasee in #6037
- Upgrade datafusion-federation to 0.4.2 (#6022) by @phillipleblanc in #6022
- Define SearchPipeline and use in runtime/vector_search.rs. (#6044) by @Jeadie in #6044
- fix: Scheduler test when scheduler is running (#6051) by @peasee in #6051
- doc: Spice Cloud Connector Limitation (#6035) by @Sevenannn in #6035
- Add support for on_conflict:upsert for Arrow MemTable (#6059) by @sgrebnov in #6059
- Enhance Arrow Flight DoPut operation tracing (#6053) by @sgrebnov in #6053
- Update openapi.json (#6032) by @app/github-actions in #6032
- Add tools enabled to MCP server capabilities (#6060) by @Jeadie in #6060
- Upgrade to delta_kernel 0.11 (#6045) by @phillipleblanc in #6045
- refactor: Replace refresh oneshot with notify (#6050) by @peasee in #6050
- Enable Upsert OnConflictBehavior for runtime.task_history table (#6068) by @sgrebnov in #6068
- feat: Add a workers integration test (#6069) by @peasee in #6069
- Fix DuckDB acceleration ORDER BY rand() and ORDER BY NULL (#6071) by @phillipleblanc in #6071
- Update Models Benchmarks to report unsuccessful evals as errors (#6070) by @sgrebnov in #6070
- Revert: fix: Use HTTPS ubuntu sources (#6082) by @Sevenannn in #6082
- Add initial support for Spice Cloud Platform management (#6089) by @sgrebnov in #6089
- Run spiceai cloud connector TPC tests using spice dev apps (#6049) by @Sevenannn in #6049
- feat: Add SQL worker action (#6093) by @peasee in #6093
- Post-release housekeeping (#6097) by @phillipleblanc in #6097
- Fix search bench (#6091) by @Jeadie in #6091
- fix: Update benchmark snapshots (#6094) by @app/github-actions in #6094
- fix: Update benchmark snapshots (#6095) by @app/github-actions in #6095
- Glue catalog connector for hive style parquet (#6054) by @kczimm in #6054
- Update openapi.json (#6100) by @app/github-actions in #6100
- Improve Flight Client DoPut / Publish error handling (#6105) by @sgrebnov in #6105
- Define PostApplyCandidateGeneration to handle all filters & projections. (#6096) by @Jeadie in #6096
- refactor: Update the tracing task names for scheduled tasks (#6101) by @peasee in #6101
- task: Switch GH runners in PR and testoperator (#6052) by @peasee in #6052
- feat: Connect search caching for HTTP and tools (#6108) by @peasee in #6108
- test: Add multi-dataset cron test (#6102) by @peasee in #6102
- Sanitize the ListingTableURL (#6110) by @phillipleblanc in #6110
- Avoid partial writes by FlightTableWriter (#6104) by @sgrebnov in #6104
- fix: Update the TPCDS postgres acceleration indexes (#6111) by @peasee in #6111
- Make Glue Catalog refreshable (#6103) by @kczimm in #6103
- Refactor Glue catalog to use a new Glue data connector (#6125) by @kczimm in #6125
- Emit retry error on flight transient connection failure (#6123) by @Sevenannn in #6123
- Update Flight DoPut implementation to send single final PutResult (#6124) by @sgrebnov in #6124
- feat: Add metrics for search results cache (#6129) by @peasee in #6129
- update MCP crate (#6130) by @Jeadie in #6130
- feat: Add search cache status header, respect cache control (#6131) by @peasee in #6131
- fix: Allow specifying individual caching blocks (#6133) by @peasee in #6133
- Update openapi.json (#6132) by @app/github-actions in #6132
- Add CSV support to Glue data connector (#6138) by @kczimm in #6138
- Update Spice Cloud Platform management UX (#6140) by @sgrebnov in #6140
- Add TPCH bench for Glue catalog (#6055) by @kczimm in #6055
- Enforce max_tokens_per_request limit in OpenAI embedding logic (#6144) by @sgrebnov in #6144
- Enable Spice Cloud Control Plane connect (management) for FinanceBench (#6147) by @sgrebnov in #6147
- Add integration test for Spice Cloud Platform management (#6150) by @sgrebnov in #6150
- fix: Invalidate search cache on refresh (#6137) by @peasee in #6137
- fix: Prevent registering cron schedule with change stream accelerations (#6152) by @peasee in #6152
- test: Add an append cron integration test (#6151) by @peasee in #6151
- fix: Cache search results with no-cache directive (#6155) by @peasee in #6155
- fix: Glue catalog dispatch runner type (#6157) by @peasee in #6157
- Fix: Glue S3 location for directories and Iceberg credentials (#6174) by @kczimm in #6174
- Support multiple columns in FTS (#6156) by @Jeadie in #6156
- fix: Add --cache-control flag for search CLI (#6158) by @peasee in #6158
- Add Glue data connector tpch bench test for parquet and csv (#6170) by @kczimm in #6170
- fix: Apply results cache deprecation correctly (#6177) by @peasee in #6177
- Fix Linux CUDA build (use candle-core 0.8.4 and cudarc v0.12) (#6181) by @sgrebnov in #6181
- fix: return empty stream when no results for Databricks SQL Warehouse (#6192) by @kczimm in #6192
Full Changelog: v1.3.2...v1.4.0-rc.1
- Rust
Published by kczimm 9 months ago
https://github.com/spiceai/spiceai - v1.3.2
Spice v1.3.2 (June 3, 2025)
Spice v1.3.2 improves DuckDB acceleration to accept ORDER BY rand() and ORDER BY NULL SQL queries, and supports the TIMESTAMP_NTZ(0) (timestamp with seconds precision) type in Snowflake.
Contributors
Breaking Changes
No breaking changes.
Cookbook Updates
No new cookbook recipes.
The Spice Cookbook now includes 67 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.3.2, use one of the following methods:
CLI:
console
spice upgrade
Homebrew:
console
brew upgrade spiceai/spiceai/spice
Docker:
Pull the spiceai/spiceai:1.3.2 image:
console
docker pull spiceai/spiceai:1.3.2
For available tags, see DockerHub.
Helm:
console
helm repo update
helm upgrade spiceai spiceai/spiceai
What's Changed
Dependencies
No major dependency changes.
Changelog
- Handle Snowflake Timestamp NTZ with seconds precision (#6084) by @kczimm in #6084
- Fix DuckDB acceleration ORDER BY rand() and ORDER BY NULL (#6071) by @phillipleblanc in #6071
Full Changelog: https://github.com/spiceai/spiceai/compare/v1.3.1...v1.3.2
- Rust
Published by phillipleblanc 9 months ago
https://github.com/spiceai/spiceai - v1.3.0
Spice v1.3.0 (May 19, 2025)
Spice v1.3.0 accelerates data and AI applications with significantly improved query performance, reliability, and expanded Databricks integration. New support for the Databricks SQL Statement Execution API enables direct SQL queries on Databricks SQL Warehouses, complementing Mosaic AI model serving and embeddings (introduced in v1.2.2) and existing Databricks catalog and dataset integrations. This release upgrades to DataFusion v46, optimizes results caching performance, and strengthens security with least-privilege sandboxed improvements.
What's New in v1.3.0
- Databricks SQL Statement Execution API Support: Added support for the Databricks SQL Statement Execution API, enabling direct SQL queries against Databricks SQL Warehouses for optimized performance in analytics and reporting workflows.
Example spicepod.yml configuration:
yaml
datasets:
  - from: databricks:spiceai.datasets.my_awesome_table
    name: my_awesome_table
    params:
      mode: sql_warehouse
      databricks_endpoint: ${env:DATABRICKS_ENDPOINT}
      databricks_sql_warehouse_id: ${env:DATABRICKS_SQL_WAREHOUSE_ID}
      databricks_token: ${env:DATABRICKS_TOKEN}
For details, see the Databricks Data Connector documentation.
- Improved Results Cache Performance & Hashing Algorithm: Spice now supports an alternative results cache hashing algorithm, ahash, in addition to the default, siphash. Configure it via:
yaml
runtime:
  results_cache:
    hashing_algorithm: ahash # or siphash
The hashing algorithm determines how cache keys are hashed before being stored, which affects both lookup speed and protection against potential denial-of-service (DoS) attacks.
Using ahash improves performance for large queries or query plans. Combined with results cache optimizations, it reduces 99th percentile request latency and increases total requests/second for queries with large result sets (100k+ cached rows). The following charts show performance tested against the TPCH Query #17 on a scale factor 5 dataset (30+ million rows, 5GB):
[Charts omitted: Latency and Req/sec.]
Note: ahash was not available in v1.2.2, so it is excluded from comparisons.
To learn more, refer to the Results Cache Hashing Algorithm documentation.
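As a rough illustration of why the hashing algorithm is a separate setting, the sketch below builds a tiny results cache whose key hasher is pluggable. Everything here is an assumption for illustration: `ResultsCache`, `make_keyed_hasher`, and `fast_hasher` are hypothetical names, and BLAKE2b and CRC32 are stand-ins from the Python standard library, not the siphash or ahash implementations Spice uses.

```python
import hashlib
import os
import zlib
from typing import Callable, Dict, List, Optional, Tuple

Hasher = Callable[[str], int]
Rows = List[tuple]


def make_keyed_hasher(key: bytes) -> Hasher:
    """Keyed 64-bit hash (BLAKE2b), a stand-in for a keyed algorithm."""
    def hasher(text: str) -> int:
        digest = hashlib.blake2b(text.encode("utf-8"), key=key, digest_size=8).digest()
        return int.from_bytes(digest, "big")
    return hasher


def fast_hasher(text: str) -> int:
    """Unkeyed CRC32, a stand-in for a cheaper hash."""
    return zlib.crc32(text.encode("utf-8"))


class ResultsCache:
    """Hypothetical cache: entries are stored under the hash of the SQL text."""

    def __init__(self, hasher: Hasher) -> None:
        self._hasher = hasher
        self._entries: Dict[int, Tuple[str, Rows]] = {}

    def put(self, sql: str, rows: Rows) -> None:
        self._entries[self._hasher(sql)] = (sql, rows)

    def get(self, sql: str) -> Optional[Rows]:
        entry = self._entries.get(self._hasher(sql))
        # Compare the stored SQL so a hash collision cannot return wrong rows.
        if entry is not None and entry[0] == sql:
            return entry[1]
        return None


# Swapping the hasher changes key computation only; cache behaviour is the same.
keyed_cache = ResultsCache(make_keyed_hasher(os.urandom(16)))
keyed_cache.put("SELECT 1", [(1,)])
```

The cache logic never changes when the hasher is swapped, which is why the choice can be exposed as a single configuration value.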
SQL Query Performance: Optimized the critical SQL query path, reducing overhead and improving response times for simple queries by 10-20%.
DuckDB Acceleration: Fixed a bug in the DuckDB acceleration engine causing query failures under high concurrency when querying datasets accelerated into multiple DuckDB files.
Container Security: The container image now runs as a non-root user with enhanced sandboxing and includes only essential dependencies for a slimmer, more secure image.
DataFusion v46 Highlights
Spice.ai is built on the DataFusion query engine. The v46 release brings:
Faster Performance 🚀: DataFusion 46 introduces significant performance enhancements, including a 2x faster median() function for large datasets without grouping, 10–100% speed improvements in FIRST_VALUE and LAST_VALUE window functions by avoiding sorting, and a 40x faster uuid() function. Additional optimizations, such as a 50% faster repeat() string function, accelerated chr() and to_hex() functions, improved grouping algorithms, and Parquet row group pruning with NOT LIKE filters, further boost overall query efficiency.
New range() Table Function: A new table-valued function range(start, stop, step) has been added to make it easy to generate integer sequences, similar to PostgreSQL’s generate_series() or Spark’s range(). Example: SELECT * FROM range(1, 10, 2);
UNION [ALL | DISTINCT] BY NAME Support: DataFusion now supports UNION BY NAME and UNION ALL BY NAME, which align columns by name instead of position. This matches functionality found in systems like Spark and DuckDB and simplifies combining heterogeneously ordered result sets.
Example:
sql
SELECT col1, col2 FROM t1
UNION ALL BY NAME
SELECT col2, col1 FROM t2;
See the DataFusion 46.0.0 release notes for details.
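To make the by-name alignment concrete, here is a minimal Python sketch of what UNION ALL BY NAME does to the second input's columns. The function name `union_all_by_name` is hypothetical, and the sketch assumes both inputs have exactly the same set of column names; it does not model how an engine treats columns present on only one side.

```python
from typing import List, Sequence, Tuple

Row = Tuple[object, ...]


def union_all_by_name(
    left_cols: Sequence[str],
    left_rows: Sequence[Row],
    right_cols: Sequence[str],
    right_rows: Sequence[Row],
) -> List[Row]:
    """Append right_rows to left_rows, reordering right columns to match left."""
    if set(left_cols) != set(right_cols):
        raise ValueError("this sketch requires identical column names on both sides")
    # For each left column, find where the same name sits in the right input.
    positions = [list(right_cols).index(name) for name in left_cols]
    reordered = [tuple(row[i] for i in positions) for row in right_rows]
    return list(left_rows) + reordered


# t1 is (col1, col2); t2 is (col2, col1), as in the SQL example.
combined = union_all_by_name(
    ["col1", "col2"], [(1, "a")],
    ["col2", "col1"], [("b", 2)],
)
```

A positional UNION ALL would instead have paired t1.col1 with t2.col2, mixing the two columns.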
Spice.ai adopts the latest minus one DataFusion release for quality assurance and stability. The upgrade to DataFusion v47 is planned for Spice v1.4.0 in June.
Contributors
Breaking Changes
No breaking changes.
Cookbook Updates
- Added Accelerated Views: Pre-calculate and materialize data derived from one or more underlying datasets.
The Spice Cookbook now includes 67 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.3.0, use one of the following methods:
CLI:
console
spice upgrade
Homebrew:
console
brew upgrade spiceai/spiceai/spice
Docker:
Pull the spiceai/spiceai:1.3.0 image:
console
docker pull spiceai/spiceai:1.3.0
For available tags, see DockerHub.
Helm:
console
helm repo update
helm upgrade spiceai spiceai/spiceai
What's Changed
Dependencies
- DataFusion: Upgraded to v46
- Apache Arrow: Upgraded to v54.3.0
- delta_kernel: Upgraded to v0.10.0
Changelog
- update to 1.2.2 by @Jeadie in #5806
- Move sandboxing logic to Dockerfile by @phillipleblanc in #5808
- Add note to run installation health workflow after release is marked as official by @Sevenannn in #5797
- ROADMAP updates May 13, 2025 by @lukekim in #5809
- Update qa_analytics.csv by @kczimm in #5810
- post-release housekeeping by @Jeadie in #5811
- Fix flaky DataBricks M2M integration tests by @phillipleblanc in #5818
- Add DataFusion request context extension to http routes by @ewgenius in #5807
- Use Utf8 for partition columns by @phillipleblanc in #5820
- Use full path for location metadata column by @phillipleblanc in #5819
- Remove the DataFusion reference from the flight service and use the reference from the request context instead by @ewgenius in #5821
- Upgrade delta_kernel to 0.10 by @phillipleblanc in #5823
- fix: Update benchmark snapshots by @app/github-actions in #5827
- Update qa_analytics.csv by @kczimm in #5824
- fix: Update benchmark snapshots by @app/github-actions in #5826
- fix: Update benchmark snapshots by @app/github-actions in #5825
- Fix dispatch spicepod reference for file[parquet]-duckdb[file]-indexes and file[parquet]-duckdb[memory]-indexes by @phillipleblanc in #5837
- Fix spice run --http-endpoint in CLI by @Jeadie in #5812
- Prevent excessively copying RawCacheKey by @peasee in #5838
- Make DuckDB database attachments logic more robust by @sgrebnov in #5839
- Simplify Databricks U2M auth flow, by moving user auth to the request context by @ewgenius in #5842
- Update to new MCP crate by @Jeadie in #5758
- Disable the query tracker when task history is disabled by @peasee in #5852
- Set fsGroup on PodSpec to force volumes to be mounted with permission to docker image by @phillipleblanc in #5854
- Clarify Helm release steps by @phillipleblanc in #5855
- Avoid cloning cached results by @peasee in #5853
- Upgrade to DataFusion 46 by @phillipleblanc in #5543
- Update openapi.json by @app/github-actions in #5856
- Adapt to Arrow 54 changes in Dict IDs preserving (Arrow IPC) by @sgrebnov in #5866
- fix: Update benchmark snapshots by @app/github-actions in #5867
- Fix s3[parquet]-duckdb[file-many] benchmark Spicepod configuration by @sgrebnov in #5868
- fix: Update benchmark snapshots by @app/github-actions in #5869
- feat: Refactor caching, support hashing algorithms by @peasee in #5859
- Override health checks for Databricks models in U2M auth mode by @ewgenius in #5858
- Update trunk to 1.4.0-unstable by @phillipleblanc in #5878
- fix: Pass parameters to testoperator explain plan by @peasee in #5883
- Disallow schema updates for existing accelerated tables by @phillipleblanc in #5887
- Deferrable registration for Databricks U2M datasets by @ewgenius in #5860
See the full list of changes at: v1.2.2...v1.3.0
- Rust
Published by phillipleblanc 9 months ago
https://github.com/spiceai/spiceai - v1.2.2
Spice v1.2.2 (May 12, 2025)
Spice v1.2.2 introduces support for Databricks Mosaic AI model serving and embeddings, alongside the existing Databricks catalog and dataset integrations. It adds configurable service ports in the Helm chart and resolves several bugs to improve stability and performance.
Highlights in v1.2.2
- Databricks Model & Embedding Provider: Spice integrates with Databricks Model Serving for models and embeddings, enabling secure access via machine-to-machine (M2M) OAuth authentication with service principal credentials. The runtime automatically refreshes tokens using databricks_client_id and databricks_client_secret, ensuring uninterrupted operation. This feature supports Databricks-hosted large language models and embedding models.
```yaml
models:
  - from: databricks:databricks-llama-4-maverick
    name: llama-4-maverick
    params:
      databricks_endpoint: dbc-46470731-42e5.cloud.databricks.com
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
      databricks_client_secret: ${secrets:DATABRICKS_CLIENT_SECRET}

embeddings:
  - from: databricks:databricks-gte-large-en
    name: gte-large-en
    params:
      databricks_endpoint: dbc-42424242-4242.cloud.databricks.com
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
      databricks_client_secret: ${secrets:DATABRICKS_CLIENT_SECRET}
```
For detailed setup instructions, refer to the Databricks Model Provider documentation.
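The automatic token refresh mentioned above can be pictured with a small sketch. This is not Spice's implementation: `RefreshingTokenProvider` and its `fetch` callback are hypothetical, and the callback stands in for whatever request exchanges the client ID and secret for an access token. The idea shown is only that a cached token is reused until it is close to expiry, then replaced.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Token:
    value: str
    expires_at: float  # clock reading at which the token stops being valid


class RefreshingTokenProvider:
    """Hypothetical provider that re-fetches a token shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],  # returns (token, lifetime in seconds)
        clock: Callable[[], float] = time.monotonic,
        refresh_margin: float = 60.0,
    ) -> None:
        self._fetch = fetch
        self._clock = clock
        self._margin = refresh_margin
        self._token: Optional[Token] = None

    def get(self) -> str:
        now = self._clock()
        # Refresh when there is no token yet or it is within the safety margin.
        if self._token is None or now >= self._token.expires_at - self._margin:
            value, lifetime = self._fetch()
            self._token = Token(value, now + lifetime)
        return self._token.value
```

Injecting the clock and the fetch function keeps the sketch testable without any network access.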
Configurable Helm Chart Service Ports: The Helm chart now supports custom ports for flexible network configurations for deployments. Specify non-default ports in your Helm values file.
Resolved Issues:
- MCP Nested Tool Calling: Fixed a bug preventing nested tool invocation when Spice operates as the MCP server federating to MCP clients.
- Dataset Load Concurrency: Corrected a failure to respect the dataset_load_parallelism setting during dataset loading.
- Acceleration Hot-Reload: Addressed an issue where changes to acceleration enable/disable settings were not detected during hot reload of Spicepod.yaml.
Contributors
Breaking Changes
No breaking changes.
Cookbook Updates
Updated cookbooks:
- Databricks Catalogs: Includes using Databricks Service Principal
- Databricks: Includes using M2M auth
- Python ADBC: Adds a dataset to be queried over ADBC.
The Spice Cookbook now includes 68 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.2.2, use one of the following methods:
CLI:
console
spice upgrade
Homebrew:
console
brew upgrade spiceai/spiceai/spice
Docker:
Pull the spiceai/spiceai:1.2.2 image:
console
docker pull spiceai/spiceai:1.2.2
For available tags, see DockerHub.
Helm:
console
helm repo update
helm upgrade spiceai spiceai/spiceai
What's Changed
Dependencies
- No major dependency changes.
Changelog
- Update spark-connect-rs to override user agent string by @ewgenius in https://github.com/spiceai/spice/pull/5798
- Merge pull request by @ewgenius in https://github.com/spiceai/spice/pull/5796
- Pass the default user agent string to the Databricks Spark, Delta, and Unity clients by @ewgenius in https://github.com/spiceai/spice/pull/5717
- bump to 1.2.2 by @Jeadie in https://github.com/spiceai/spice/pull/none
- Helm chart: support for service ports overrides by @sgrebnov in https://github.com/spiceai/spice/pull/5774
- Update spice cli login command with client-id and client-secret flags for Databricks by @ewgenius in https://github.com/spiceai/spice/pull/5788
- Fix bug where setting Cache-Control: no-cache doesn't compute the cache key by @phillipleblanc in https://github.com/spiceai/spice/pull/5779
- Update to datafusion-contrib/datafusion-table-providers#336 by @phillipleblanc in https://github.com/spiceai/spice/pull/5778
- Lru cache: limit single cached record size to u32::MAX (4GB) by @sgrebnov in https://github.com/spiceai/spice/pull/5772
- Fix LLMs calling nested MCP tools by @Jeadie in https://github.com/spiceai/spice/pull/5771
- MySQL: Set the character_set_results/character_set_client/character_set_connection session variables on connection setup by @Sevenannn in https://github.com/spiceai/spice/pull/5770
- Control the parallelism of acceleration refresh datasets with runtime.dataset_load_parallelism by @phillipleblanc in https://github.com/spiceai/spice/pull/5763
- Fix Iceberg predicates not matching the Arrow type of columns read from parquet files by @phillipleblanc in https://github.com/spiceai/spice/pull/5761
- fix: Use decimal_cmp for numerical BETWEEN in SQLite by @peasee in https://github.com/spiceai/spice/pull/5760
- Support product name override in databricks user agent string by @ewgenius in https://github.com/spiceai/spice/pull/5749
- Databricks U2M Token Provider support by @ewgenius in https://github.com/spiceai/spice/pull/5747
- Remove HTTP auth from LLM config and simplify Databricks models logic by using static headers by @Jeadie in https://github.com/spiceai/spice/pull/5742
- clear plan cache when dataset updates by @kczimm in https://github.com/spiceai/spice/pull/5741
- Support Databricks M2M auth in LLMs + Embeddings by @Jeadie in https://github.com/spiceai/spice/pull/5720
- Retrieve Github App tokens in background; make TokenProvider not async by @Jeadie in https://github.com/spiceai/spice/pull/5718
- Make 'token_providers' crate by @Jeadie in https://github.com/spiceai/spice/pull/5716
- Databricks AI: Embedding models & LLM streaming by @Jeadie in https://github.com/spiceai/spice/pull/5715
See the full list of changes at: v1.2.1...v1.2.2
- Rust
Published by Jeadie 10 months ago
https://github.com/spiceai/spiceai - v1.2.1
Spice v1.2.1 (May 6, 2025)
Spice v1.2.1 includes several data connector fixes and improves query performance for accelerated views. This release also introduces Databricks Service Principal (M2M OAuth) authentication and expands parameterized queries.
Highlights in v1.2.1
- Databricks Service Principal Support: Databricks datasets and catalogs now support Machine-to-Machine (M2M) OAuth authentication via Service Principals, enabling secure machine connections to Databricks.
Example spicepod.yaml:
yaml
datasets:
  - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks unity catalog
    name: my_delta_lake_table
    params:
      mode: delta_lake
      databricks_endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
      databricks_client_secret: ${secrets:DATABRICKS_CLIENT_SECRET}
For details, see documentation for:
- Databricks Data Connector
Iceberg Data Connector: Now supports cross-account table access via the AWS Glue Catalog Connector and fixes an issue when querying data from append mode datasets.
Iceberg Catalog API: Full compatibility with the Iceberg HTTP REST Catalog API to consume Spice datasets from Iceberg Catalog clients.
For details, see documentation for:
- Iceberg Data Connector
Improved Parameterized Query Support: Expanded type inference for placeholders in:
- IN list expressions
- LIKE patterns
- SIMILAR TO patterns
- LIMIT clauses
- Subqueries
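To show what placeholders in these positions look like in a query, the sketch below uses Python's built-in sqlite3 module as a stand-in engine. It is not a Spice client: SQLite uses ? markers rather than the $1 style, and the table and data are made up for the example. Only the IN list, LIKE, and LIMIT positions are exercised.

```python
import sqlite3
from typing import List, Tuple


def find_items(conn: sqlite3.Connection, ids: Tuple[int, int, int], pattern: str, limit: int) -> List[tuple]:
    """Bind values into an IN list, a LIKE pattern, and a LIMIT clause."""
    sql = (
        "SELECT id, name FROM items "
        "WHERE id IN (?, ?, ?) AND name LIKE ? "
        "ORDER BY id LIMIT ?"
    )
    return conn.execute(sql, (*ids, pattern, limit)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(1, "apple"), (2, "apricot"), (3, "banana"), (4, "avocado")],
)

rows = find_items(conn, (1, 2, 3), "ap%", 1)
```

The engine has to infer a type for each marker from its context (integer for the IN list and LIMIT, string for LIKE), which is the inference this release expands.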
New Contributors 🎉
Contributors
Breaking Changes
No breaking changes.
Cookbook Updates
New recipes for:
- Language Model Evaluations: Use Spice.ai OSS to evaluate language models.
- LLM as a Judge: Use LLM judge models to evaluate the performance of other language models.
The Spice Cookbook now includes 68 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.2.1, use one of the following methods:
CLI:
console
spice upgrade
Homebrew:
console
brew upgrade spiceai/spiceai/spice
Docker:
Pull the spiceai/spiceai:1.2.1 image:
console
docker pull spiceai/spiceai:1.2.1
For available tags, see DockerHub.
Helm:
console
helm repo update
helm upgrade spiceai spiceai/spiceai
What's Changed
Dependencies
- No major dependency changes.
Changelog
- Fix: Specify metric type as a dimension for testoperator by @peasee in #5630
- Fix: Add option to run dispatch schedule by @peasee in #5631
- Infer placeholder datatype for InList, Like, and SimilarTo by @kczimm in #5626
- Add QA analytics for 1.2.0 by @phillipleblanc in #5640
- Fix: Use SPICED_COMMIT for spiced_commit_sha by @peasee in #5632
- New crates/tools by @Jeadie in #5121
- Update openapi.json by @github-actions in #5643
- Enable metrics reporting for models benchmarks (evals) by @sgrebnov in #5639
- Implement CatalogBuilder, add app and runtime references to catalog component, add runtime reference to connector params by @ewgenius in #5641
- Fix eventing bug in LLM progress; Add tool and worker progress by @Jeadie in #5619
- Handle small precision differences in TPCH answer validation by @phillipleblanc in #5642
- Add TokenProviderRegistry to the runtime by @ewgenius in #5651
- Provide ModelContextLayer for evals by @Jeadie in #5648
- Databricks datacomponents refactor. Databricks Spark connect - add settoken method and writable spark session by @ewgenius in #5654
- Extract AWS Glue warehouse for cross-account Iceberg tables by @phillipleblanc in #5656
- Refactor Dataset component by @phillipleblanc in #5660
- Fix Iceberg API returning 404 when schema contains a Dictionary by @phillipleblanc in #5665
- Fix dependencies: downgrade swagger-ui to v8; force zip to 2.3.0 by @kczimm in #5664
- Add DuckDB indexes spicepod, additional dispatches by @peasee in #5633
- Update readme: update data federation link by @nuvic in #5673
- Support metadata columns for object-store based data connectors by @phillipleblanc in #5661
- Add model name to LLM judges, and add model_graded_scoring task by @Jeadie in #5655
- Add SF1000 TPCH test spicepods for delta lake by @Sevenannn in #5606
- Validate Github Connector resource existence before building the github connector graphql table by @Sevenannn in #5674
- Remove hard-coded embedding performance tests in CI by @Sevenannn in #5675
- Databricks M2M auth for spark connect data connector by @ewgenius in #5659
- Enable federated data refresh support for accelerated views by @sgrebnov in #5677
- Add pods watcher integration test by @Sevenannn in #5681
- Add m2m support for databricks delta connector by @ewgenius in #5680
- Update end_game.md by @sgrebnov in #5684
- Update StaticTokenProvider to use SecretString instead of raw str value by @ewgenius in #5686
- Add M2M Auth support for Databricks catalog connector by @ewgenius in #5687
- Update UX to disable acceleration federation by @sgrebnov in #5682
- Improve placeholder inference (LIMIT & Expr::InSubquery) by @phillipleblanc in #5692
- Tweak default log to ignore aws_config::imds::region by @phillipleblanc in #5693
- Make Spice properly Iceberg Catalog API compatible for load table API by @phillipleblanc in #5695
- Use deterministic queries for Databricks m2m catalog tests by @ewgenius in #5696
- Support retrieving the latest Iceberg table on table scan by @phillipleblanc in #5704
- Infer partitions from schema_source_path if present by @phillipleblanc in #5721
Full Changelog: v1.2.0...v1.2.1
- Rust
Published by sgrebnov 10 months ago
https://github.com/spiceai/spiceai - v1.2.0
Spice v1.2.0 (Apr 28, 2025)
Spice v1.2.0 is a significant update. It upgrades DataFusion to v45 and Arrow to v54. This release brings faster query performance, support for parameterized queries in SQL and HTTP APIs, and the ability to accelerate views. Several bugs have been fixed and dependencies updated for better stability and speed.
DataFusion v45 Highlights
Spice.ai is built on the DataFusion query engine. The v45 release brings:
Faster Performance 🚀: DataFusion is now the fastest single-node engine for Apache Parquet files in the clickbench benchmark. Performance improved by over 33% from v33 to v45. Arrow StringView is now on by default, making string and binary data queries much faster, especially with Parquet files.
Better Quality 📋: DataFusion now runs over 5 million SQL tests per push using the SQLite sqllogictest suite. There are new checks for logical plan correctness and more thorough pre-release testing.
New SQL Functions ✨: Added show functions, to_local_time, regexp_count, map_extract, array_distance, array_any_value, greatest, least, and arrays_overlap.
See the DataFusion 45.0.0 release notes for details.
Spice.ai upgrades to the latest minus one DataFusion release to ensure adequate testing and stability. The next upgrade to DataFusion v46 is planned for Spice v1.3.0 in May.
What's New in v1.2.0
- Parameterized Queries: Parameterized queries are now supported with the Flight SQL API and HTTP API. Positional and named arguments via $1 and :param syntax are supported, respectively. Logical plans for SQL statements are cached for faster repeated queries.
Example Cookbook recipes:
See the API Documentation for additional details.
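The benefit of caching a plan per SQL statement can be sketched as follows. This is an illustration under assumptions, not Spice's planner: `PreparedQueries` is a hypothetical class, the "plan" is just the normalized SQL text, and sqlite3 stands in as the engine (it accepts :name for named and ? for positional parameters).

```python
import sqlite3
from typing import Dict, List, Mapping, Sequence, Union

Params = Union[Sequence[object], Mapping[str, object]]


class PreparedQueries:
    """Caches one 'plan' per distinct SQL text; parameters are bound per call."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._plans: Dict[str, str] = {}
        self.plan_builds = 0  # counts how often planning work actually ran

    def _plan(self, sql: str) -> str:
        if sql not in self._plans:
            self.plan_builds += 1
            self._plans[sql] = sql.strip()  # placeholder for real planning work
        return self._plans[sql]

    def run(self, sql: str, params: Params = ()) -> List[tuple]:
        return self._conn.execute(self._plan(sql), params).fetchall()


queries = PreparedQueries(sqlite3.connect(":memory:"))
named_first = queries.run("SELECT :n * 2", {"n": 21})   # plans the statement
named_again = queries.run("SELECT :n * 2", {"n": 5})    # reuses the cached plan
positional = queries.run("SELECT ? + 1", (1,))          # new text, new plan
```

Because the parameter values are not part of the SQL text, repeated calls with different values hit the same cache entry.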
- Accelerated Views: Views, not just datasets, can now be accelerated. This provides much better performance for views that perform heavy computation.
Example spicepod.yaml:
yaml
views:
  - name: accelerated_view
    acceleration:
      enabled: true
      engine: duckdb
      primary_key: id
      refresh_check_interval: 1h
    sql: |
      select * from dataset_a
      union all
      select * from dataset_b
See the Data Acceleration documentation.
- Memory Usage Metrics & Configuration: The runtime now tracks memory usage as a metric, and a new runtime memory_limit parameter is available. The memory limit parameter applies specifically to the runtime and should be used in addition to existing memory usage configuration, such as duckdb_memory_limit. Memory usage for queries beyond the memory limit will spill to disk.
See the Memory Reference for details.
- New Worker Component: Workers are new configurable compute units in the Spice runtime. They help manage compute across models and tools, handle errors, and balance load. Workers are configured in the workers section of spicepod.yaml.
Example spicepod.yaml:
yaml
workers:
  - name: round-robin
    description: |
      Distributes requests between 'foo' and 'bar' models in a round-robin fashion.
    models:
      - from: foo
      - from: bar
  - name: fallback
    description: |
      Tries 'bar' first, then 'foo', then 'baz' if earlier models fail.
    models:
      - from: foo
        order: 2
      - from: bar
        order: 1
      - from: baz
        order: 3
See the Workers Documentation for details.
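The two behaviours described in the example configuration, round-robin distribution and ordered fallback, can be sketched in a few lines of Python. The class names and the idea of a model as a plain callable are assumptions for illustration; this is not how the Spice runtime implements workers.

```python
from itertools import cycle
from typing import Callable, List, Sequence, Tuple

Model = Callable[[str], str]


class RoundRobinWorker:
    """Sends each request to the next model in turn."""

    def __init__(self, models: Sequence[Tuple[str, Model]]) -> None:
        self._cycle = cycle(list(models))

    def call(self, prompt: str) -> Tuple[str, str]:
        name, model = next(self._cycle)
        return name, model(prompt)


class FallbackWorker:
    """Tries models in ascending 'order', moving on when one raises."""

    def __init__(self, models: Sequence[Tuple[int, str, Model]]) -> None:
        self._models = sorted(models, key=lambda entry: entry[0])

    def call(self, prompt: str) -> Tuple[str, str]:
        errors: List[str] = []
        for _, name, model in self._models:
            try:
                return name, model(prompt)
            except Exception as exc:  # any failure triggers the next model
                errors.append(f"{name}: {exc}")
        raise RuntimeError("all models failed: " + "; ".join(errors))
```

With the orders from the configuration (bar 1, foo 2, baz 3), a `FallbackWorker` calls bar first and reaches foo only if bar raises.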
- Databricks Model Provider: Databricks models can now be used with from: databricks:model_name.
Example spicepod.yaml:
yaml
models:
  - from: databricks:llama-3_2_1_1b_instruct
    name: llama-instruct
    params:
      databricks_endpoint: dbc-46470731-42e5.cloud.databricks.com
      databricks_token: ${ secrets:SPICE_DATABRICKS_TOKEN }
See the Databricks model documentation.
spice chat CLI Improvements: The spice chat command now supports an optional --temperature parameter. A one-shot chat can also be sent with spice chat <message>.
More Type Support: Added support for Postgres JSON type and DuckDB Dictionary type.
Other Improvements:
- New image tags let you pick memory allocators for different use-cases: jemalloc, sysalloc, and mimalloc.
- Better error handling and logging for chat and model operations.
Contributors
Cookbook Updates
New recipes for:
- Python ADBC Client with Parameterized Queries: Using Parameterized Queries from Python over ADBC.
- Java JDBC Client with Parameterized Queries: Using Parameterized Queries from Java over JDBC.
- Scala JDBC Client with Parameterized Queries: Using Parameterized Queries from Scala over JDBC.
The Spice Cookbook now includes 68 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.2.0, use one of the following methods:
CLI:
console
spice upgrade
Homebrew:
console
brew upgrade spiceai/spiceai/spice
Docker:
Pull the spiceai/spiceai:1.2.0 image:
console
docker pull spiceai/spiceai:1.2.0
For available tags, see DockerHub.
Helm:
console
helm repo update
helm upgrade spiceai spiceai/spiceai
What's Changed
Dependencies
- DataFusion: Upgraded to v45.
- Apache Arrow: Upgraded to v54.3.0.
Spice is now built with Rust 1.85.0 and the Rust 2024 edition.
Changelog
- Update end_game.md (#5312) by @peasee in https://github.com/spiceai/spiceai/pull/5312
- feat: Add initial testoperator query validation (#5311) by @peasee in https://github.com/spiceai/spiceai/pull/5311
- Update Helm + Prepare for next release (#5317) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5317
- Update spicepod.schema.json (#5319) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5319
- add integration test for reading encrypted PDFs from S3 (#5308) by @kczimm in https://github.com/spiceai/spiceai/pull/5308
- Stop load_components during runtime shutdown (#5306) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5306
- Update openapi.json (#5321) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5321
- feat: Implement record batch data validation (#5331) by @peasee in https://github.com/spiceai/spiceai/pull/5331
- Update QA analytics for v1.1.1 (#5320) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5320
- fix: Update benchmark snapshots (#5337) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5337
- Enforce pulls with Spice v1.0.4 (#5339) by @lukekim in https://github.com/spiceai/spiceai/pull/5339
- Upgrade to DataFusion 45, Arrow 54, Rust 1.85 & Edition 2024 (#5334) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5334
- feat: Allow validating testoperator in benchmark workflow (#5342) by @peasee in https://github.com/spiceai/spiceai/pull/5342
- Upgrade delta_kernel to 0.9 (#5343) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5343
- deps: Update odbc-api (#5344) by @peasee in https://github.com/spiceai/spiceai/pull/5344
- Fix schema inference for Snowflake tables with large number of columns (#5348) by @ewgenius in https://github.com/spiceai/spiceai/pull/5348
- feat: Update testoperator dispatch for validation, version metric (#5349) by @peasee in https://github.com/spiceai/spiceai/pull/5349
- fix: validate_results not validate (#5352) by @peasee in https://github.com/spiceai/spiceai/pull/5352
- revert to previous pdf-extract; remove test for encrypted pdf support (#5355) by @kczimm in https://github.com/spiceai/spiceai/pull/5355
- Stablize the test verify_similarity_search_chat_completion (#5284) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5284
- Turn off delta_kernel::log_segment logging and refactor log filtering (#5367) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5367
- Upgrade to DuckDB 1.2.2 (#5375) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5375
- Update Readme - fix broken and outdated links (#5376) by @ewgenius in https://github.com/spiceai/spiceai/pull/5376
- Upgrade dependabot dependencies (#5385) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5385
- fix: Remove IMAP oauth (#5386) by @peasee in https://github.com/spiceai/spiceai/pull/5386
- Bump Helm chart to 1.1.2 (#5389) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5389
- Refactor accelerator registry as part of runtime. (#5318) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5318
- Include vnd.spiceai.sql/nsql.v1+json response examples (openapi docs) (#5388) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5388
- docs: Update endgame template with SpiceQA, update qa analytics (#5391) by @peasee in https://github.com/spiceai/spiceai/pull/5391
- Make graceful shutdown timeout configurable (#5358) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5358
- docs: Update release criteria with note on max columns (#5401) by @peasee in https://github.com/spiceai/spiceai/pull/5401
- Update openapi.json (#5392) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5392
- FinanceBench: update scorer instructions and switch scoring model to gpt-4.1 (#5395) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5395
- feat: Write OTel metrics for testoperator (#5397) by @peasee in https://github.com/spiceai/spiceai/pull/5397
- Update nsql openapi title (#5403) by @ewgenius in https://github.com/spiceai/spiceai/pull/5403
- Track ai_inferences_count with used tools flag. Extensible runtime request context. (#5393) by @ewgenius in https://github.com/spiceai/spiceai/pull/5393
- Include newly detected view as changed view (#5408) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5408
- Track used_tools in ai_inferences_with_spice_count as number (#5409) by @ewgenius in https://github.com/spiceai/spiceai/pull/5409
- Update openapi.json (#5406) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5406
- Tweak enforce pulls with Spice (#5411) by @lukekim in https://github.com/spiceai/spiceai/pull/5411
- Allow flightsql and spiceai connectors to override flight max message size (#5407) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5407
- Retry model graded scorer once on successful, empty response (#5405) by @Jeadie in https://github.com/spiceai/spiceai/pull/5405
- use span task name in 'spice trace' tree, not span_id (#5412) by @Jeadie in https://github.com/spiceai/spiceai/pull/5412
- Rename to track_ai_inferences_with_spice_count in all places (#5410) by @ewgenius in https://github.com/spiceai/spiceai/pull/5410
- Update qa_analytics.csv (#5421) by @peasee in https://github.com/spiceai/spiceai/pull/5421
- Remove the filter for the list_datasets tool in the AI inferences metric count. (#5417) by @ewgenius in https://github.com/spiceai/spiceai/pull/5417
- fix: Testoperator uses an exact API key for benchmark metric submission (#5413) by @peasee in https://github.com/spiceai/spiceai/pull/5413
- feat: Enable testoperator metrics in workflow (#5422) by @peasee in https://github.com/spiceai/spiceai/pull/5422
- Upgrade mistral.rs (#5404) by @Jeadie in https://github.com/spiceai/spiceai/pull/5404
- Include all FinanceBench documents in benchmark tests (#5426) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5426
- Handle second Ctrl-C to force runtime termination (#5427) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5427
- Add optional --temperature parameter for spice chat CLI command (#5429) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5429
- Remove with_runtime_status from the RuntimeBuilder (#5430) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5430
- Fix spice chat error handling (#5433) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5433
- Add more test models to FinanceBench benchmark (#5431) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5431
- support 'from: databricks:model_name' (#5434) by @Jeadie in https://github.com/spiceai/spiceai/pull/5434
- Upgrade Pulls with Spice to v1.0.6 and add concurrency control (#5442) by @lukekim in https://github.com/spiceai/spiceai/pull/5442
- Upgrade DataFusion table providers (#5443) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5443
- Test spice chat in e2e_test_spice_cli (#5447) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5447
- Allow for one-shot chat request using spice chat <message> (#5444) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5444
- Enable parallel data sampling for NSQL (#5449) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5449
- Upgrade Go from v1.23.4 to v1.24.2 (#5462) by @lukekim in https://github.com/spiceai/spiceai/pull/5462
- Update PULL_REQUEST_TEMPLATE.md (#5465) by @lukekim in https://github.com/spiceai/spiceai/pull/5465
- Enable captured outputs by default when spiced is started by the CLI (spice run) (#5464) by @lukekim in https://github.com/spiceai/spiceai/pull/5464
- Parameterized queries via Flight SQL API (#5420) by @kczimm in https://github.com/spiceai/spiceai/pull/5420
- fix: Update benchmarks readme badge (#5466) by @peasee in https://github.com/spiceai/spiceai/pull/5466
- delay auth check for binding parameterized queries (#5475) by @kczimm in https://github.com/spiceai/spiceai/pull/5475
- Add support for ? placeholder syntax in parameterized queries (#5463) by @kczimm in https://github.com/spiceai/spiceai/pull/5463
- enable task name override for non static span names (#5423) by @Jeadie in https://github.com/spiceai/spiceai/pull/5423
- Allow parameter queries with no parameters (#5481) by @kczimm in https://github.com/spiceai/spiceai/pull/5481
- Support unparsing UNION for distinct results (#5483) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5483
- add rust-toolchain.toml (#5485) by @kczimm in https://github.com/spiceai/spiceai/pull/5485
- Add parameterized query support to the HTTP API (#5484) by @kczimm in https://github.com/spiceai/spiceai/pull/5484
- E2E test for spice chat behavior (#5451) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5451
- Renable and fix huggingface models integration tests (#5478) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5478
- Update openapi.json (#5488) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5488
- feat: Record memory usage as a metric (#5489) by @peasee in https://github.com/spiceai/spiceai/pull/5489
- fix: update dispatcher to run all benchmarks, rename metric, update spicepods, add scale factor (#5500) by @peasee in https://github.com/spiceai/spiceai/pull/5500
- Fix ILIKE filters support (#5502) by @ewgenius in https://github.com/spiceai/spiceai/pull/5502
- fix: Update test spicepod locations and names (#5505) by @peasee in https://github.com/spiceai/spiceai/pull/5505
- fix: Update benchmark snapshots (#5508) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5508
- fix: Update benchmark snapshots (#5512) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5512
- Fix Delta Lake bug for: Found unmasked nulls for non-nullable StructArray field "predicate" (#5515) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5515
- fix: working directory for duckdb e2e test spicepods (#5510) by @peasee in https://github.com/spiceai/spiceai/pull/5510
- Tweaks to README.md (#5516) by @lukekim in https://github.com/spiceai/spiceai/pull/5516
- Cache logical plans of SQL statements (#5487) by @kczimm in https://github.com/spiceai/spiceai/pull/5487
- Fix content-type: application/json (#5517) by @Jeadie in https://github.com/spiceai/spiceai/pull/5517
- Validate postgres results in testoperator dispatch (#5504) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5504
- fix: Update benchmark snapshots (#5511) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5511
- Fix results cache by SQL with prepared statements (#5518) by @kczimm in https://github.com/spiceai/spiceai/pull/5518
- Add initial support for views acceleration (#5509) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5509
- fix: Update benchmark snapshots (#5527) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5527
- Support switching the memory allocator Spice uses via alloc-* features. (#5528) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5528
- fix: Update benchmark snapshots (#5525) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5525
- Add test spicepod for tpch mysql-duckdbfile acceleration by @Sevenannn in https://github.com/spiceai/spiceai/pull/5521
- Fix nightly arm build - change tag -default to -models (#5529) by @ewgenius in https://github.com/spiceai/spiceai/pull/5529
- LLM router via worker spicepod component (#5513) by @Jeadie in https://github.com/spiceai/spiceai/pull/5513
- Apply Spice advanced acceleration logic and params support to accelerated views (#5526) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5526
- Enable DatasetCheckpoint logic for accelerated views (#5533) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5533
- Fix public '.model' name for router workers (#5535) by @Jeadie in https://github.com/spiceai/spiceai/pull/5535
- feat: Add Runtime memory limit parameter (#5536) by @peasee in https://github.com/spiceai/spiceai/pull/5536
- For fallback worker, check first item in chat/completion stream. (#5537) by @Jeadie in https://github.com/spiceai/spiceai/pull/5537
- Move rate limit check to after parameterized query binding (#5540) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5540
- Update spicepod.schema.json (#5545) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5545
- Accelerate views: refresh_on_startup, ready_state, jitter params support (#5547) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5547
- Add integration test for accelerated views (#5550) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5550
- Don't install make or expect on spiceai-macos runners (#5554) by @lukekim in https://github.com/spiceai/spiceai/pull/5554
- event_stream crate for emitting events from tracing::Span; used in v1/chat/completions streaming. (#5474) by @Jeadie in https://github.com/spiceai/spiceai/pull/5474
- Fix typo in method (#5559) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5559
- Run test operator every day and current and previous commits (#5557) by @lukekim in https://github.com/spiceai/spiceai/pull/5557
- Add aws_allow_http parameter for delta lake connector (#5541) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5541
- feat: Add branch name to metric dimensions in testoperator (#5563) by @peasee in https://github.com/spiceai/spiceai/pull/5563
- fix: Update the tpch benchmark snapshots for: ./test/spicepods/tpch/sf1/federated/odbc[databricks].yaml (#5565) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5565
- fix: Split scheduled dispatch into a separate job (#5567) by @peasee in https://github.com/spiceai/spiceai/pull/5567
- fix: Use outputs.SPICED_COMMIT (#5568) by @peasee in https://github.com/spiceai/spiceai/pull/5568
- fix: Use refs in testoperator dispatch instead of commits (#5569) by @peasee in https://github.com/spiceai/spiceai/pull/5569
- fix: actions/checkout ref does not take a full ref (#5571) by @peasee in https://github.com/spiceai/spiceai/pull/5571
- fix: Testoperator dispatch (#5572) by @peasee in https://github.com/spiceai/spiceai/pull/5572
- Respect update-snapshots when running all benchmarks manually (#5577) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5577
- Use FETCH_HEAD instead of ${{ inputs.ref }} to list commits in setup_spiced (#5579) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5579
- Add additional test scenarios for benchmarks (#5582) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5582
- fix: Update the tpch benchmark snapshots for: test/spicepods/tpch/sf1/accelerated/databricks[delta_lake]-duckdb[file].yaml (#5590) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5590
- fix: Update the tpch benchmark snapshots for: test/spicepods/tpch/sf1/accelerated/mysql-duckdb[file].yaml (#5591) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5591
- Fix Snowflake data connector rows ordering (#5599) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5599
- fix: Update benchmark snapshots (#5595) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5595
- fix: Update the tpch benchmark snapshots for: test/spicepods/tpch/sf1/accelerated/databricks[delta_lake]-arrow.yaml (#5594) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5594
- fix: Update benchmark snapshots (#5589) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5589
- fix: Update benchmark snapshots (#5583) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5583
- Downgrade DuckDB to 1.1.3 (#5607) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5607
- Add prepared statement integration tests (#5544) by @kczimm in https://github.com/spiceai/spiceai/pull/5544
Full Changelog: v1.1.2...v1.2.0
- Rust
Published by ewgenius 10 months ago
https://github.com/spiceai/spiceai - v1.1.2
Spice v1.1.2 (Apr 14, 2025)
Spice v1.1.2 improves Delta Lake Data Connector performance, introduces new Accept headers for the /v1/sql and /v1/nsql endpoints to include query metadata with results, and resolves an issue with the Snowflake Data Connector when handling wide tables (>600 columns).
The official Tableau Connector for Spice.ai v0.1 has been released, making it easy to connect to both self-hosted Spice.ai and Spice Cloud instances using Tableau.
What's New in v1.1.2
Tableau Connector for Spice.ai: Released the initial version (v0.1) of the official Tableau Taco Connector (fully open-source), enabling data visualization and analytics in Tableau with self-hosted Spice.ai and Spice Cloud deployments.
- Official Release: github.com/spicehq/tableau-connector/releases/tag/v0.1.0
- Docs: spiceai.org/docs/clients/tableau
- Open Source Repository: github.com/spiceai/tableau-connector
Delta Lake Data Connector: Upgraded delta_kernel to v0.9 and optimized scan operations, reducing query execution time by up to 20% on large datasets.
Snowflake Data Connector: Fixed a bug that caused failures when loading tables with more than 600 columns.
Query Metadata (SQL and NSQL): Added support for the application/vnd.spiceai.sql.v1+json Accept header on the /v1/sql endpoint, and the application/vnd.spiceai.nsql.v1+json Accept header on the /v1/nsql endpoint, enabling responses to include metadata such as the executed SQL query and schema alongside results.
Example:
bash
curl -XPOST "http://localhost:8090/v1/nsql" \
-H "Content-Type: application/json" \
-H "Accept: application/vnd.spiceai.nsql.v1+json" \
-d '{
"query": "What’s the highest tip any passenger gave?"
}' | jq
Example response:
json
{
  "row_count": 1,
  "schema": {
    "fields": [
      {
        "name": "highest_tip",
        "data_type": "Float64",
        "nullable": true,
        "dict_id": 0,
        "dict_is_ordered": false,
        "metadata": {}
      }
    ],
    "metadata": {}
  },
  "data": [
    {
      "highest_tip": 428.0
    }
  ],
  "sql": "SELECT MAX(\"tip_amount\") AS \"highest_tip\"\nFROM \"spice\".\"public\".\"taxi_trips\""
}
For details, see the SQL Query API and NSQL API documentation.
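The same request can be issued from a script. The following is a minimal Python sketch using only the standard library; it assumes a local runtime at http://localhost:8090 (as in the curl example) and a response shaped like the example above. The helper names `build_nsql_request`, `summarize`, and `ask` are hypothetical and not part of any Spice SDK.

```python
import json
import urllib.request


def build_nsql_request(question: str, base_url: str = "http://localhost:8090") -> urllib.request.Request:
    """Build a POST to /v1/nsql that asks for the metadata-rich response format."""
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/nsql",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/vnd.spiceai.nsql.v1+json",
        },
        method="POST",
    )


def summarize(payload: dict) -> dict:
    """Pick out the generated SQL, column names, and rows from a response body."""
    return {
        "sql": payload["sql"],
        "row_count": payload["row_count"],
        "columns": [field["name"] for field in payload["schema"]["fields"]],
        "rows": payload["data"],
    }


def ask(question: str) -> dict:
    """Send the question to a running Spice instance (requires network access)."""
    with urllib.request.urlopen(build_nsql_request(question)) as response:
        return summarize(json.load(response))
```

Calling `ask("What's the highest tip any passenger gave?")` against a running instance would return the generated SQL next to the result rows, which is useful for logging or auditing what the model actually executed.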
Contributors
Breaking Changes
No breaking changes in this release.
Cookbook Updates
No major cookbook additions.
The Spice Cookbook now includes 65 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.1.2, use one of the following methods:
CLI:
console
spice upgrade
Homebrew:
console
brew upgrade spiceai/spiceai/spice
Docker:
Pull the spiceai/spiceai:1.1.2 image:
console
docker pull spiceai/spiceai:1.1.2
For available tags, see DockerHub.
Helm:
console
helm repo update
helm upgrade spiceai spiceai/spiceai
What's Changed
Dependencies
- delta_kernel: updated to v0.9.0.
Changelog
- Backport - Fix schema inference for Snowflake tables with large number of columns #5348 by @ewgenius in #5350
- Upgrade delta_kernel to 0.9 (#5343) by @phillipleblanc in #5356
- Add basic support for application/vnd.spiceai.sql.v1+json format (#5333) by @sgrebnov in #5333
- Convert DataFusion filters to Delta Kernel predicates by @phillipleblanc in #5362
- revert to previous pdf-extract; remove test for encrypted pdf support by @kczimm in #5355
- Turn off delta_kernel::log_segment logging and refactor log filtering by @phillipleblanc in #5367
- Extend application/vnd.spiceai.sql.v1+json with schema and row_count fields by @sgrebnov in #5365
- Make separate vnd.spiceai.sql.v1+json and vnd.spiceai.nsql.v1+json MIME types by @sgrebnov in #5382
Full Changelog: v1.1.1...v1.1.2
- Rust
Published by phillipleblanc 11 months ago
https://github.com/spiceai/spiceai - v1.1.1
Spice v1.1.1 (Apr 7, 2025)
Spice v1.1.1 introduces several key updates, including a new Component Metrics System, improved Delta Data Connector performance, improved MCP tool descriptions, and expanded runtime results caching options. This release also adds detailed MySQL connection pool metrics for better observability. Component Metrics are Prometheus-compatible and accessible via the metrics endpoint.
Highlights in v1.1.1
- Component Metrics System: A new system for monitoring components, starting with MySQL connection pool metrics. These metrics provide insights into MySQL connection performance and can be selectively enabled in the dataset configuration. Metrics are exposed in Prometheus format via the metrics endpoint.
For more details, see the Component Metrics documentation.
- Results Caching Enhancements: Added a cache_key_type option for runtime results caching. Options include:
  - plan (Default): Uses the query's logical plan as the cache key. Matches semantically equivalent queries but requires query parsing.
  - sql: Uses the raw SQL string as the cache key. Provides faster lookups but requires exact string matches. Use sql for predictable queries without dynamic functions like NOW().
Example spicepod.yaml configuration:
yaml
runtime:
  results_cache:
    enabled: true
    cache_max_size: 128MiB
    cache_key_type: sql # Use SQL for the results cache key
    item_ttl: 1s
For more details, see the runtime configuration documentation.
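The trade-off between the two key types can be illustrated with a small Python sketch. This is not how Spice computes its keys: `sql_key` hashes the raw string, while `normalized_key` is a crude, hypothetical stand-in for a plan-derived key that merely collapses whitespace and case (a real logical plan captures far more than that).

```python
import hashlib
import re


def sql_key(query: str) -> str:
    """Key on the raw SQL text: cheap, but any textual difference is a miss."""
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


def normalized_key(query: str) -> str:
    """Toy stand-in for a plan-based key: normalize the text before hashing."""
    canonical = re.sub(r"\s+", " ", query.strip()).lower()
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = "SELECT * FROM trips"
b = "select  *\nfrom trips"

# Raw-string keys differ even though the queries mean the same thing.
print(sql_key(a) == sql_key(b))
# The normalized keys agree, at the cost of extra work per lookup.
print(normalized_key(a) == normalized_key(b))
```

The sketch shows why sql suits workloads that send byte-identical query strings, while plan tolerates formatting differences between equivalent queries.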
Delta Data Connector: Improved scan performance for faster queries.
MCP Tools: Refined the descriptions of built-in MCP tools to improve usability.
MySQL Component Metrics: Added detailed metrics for monitoring MySQL connections, such as connection count and pool activity.
Example spicepod.yaml configuration:
yaml
datasets:
  - from: mysql:my_table
    name: my_dataset
    metrics:
      - name: connection_count
        enabled: true
      - name: connections_in_pool
        enabled: true
      - name: active_wait_requests
        enabled: true
    params:
      mysql_host: localhost
      mysql_tcp_port: 3306
      mysql_user: root
      mysql_pass: ${secrets:MYSQL_PASS}
For more details, see the MySQL Data Connector documentation.
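Because component metrics are exposed in Prometheus format, they can be read with a few lines of Python once the metrics endpoint's text has been fetched. The sketch below is a simplified parser that assumes one sample per line with no trailing timestamp; the helper names `parse_samples` and `select` are hypothetical, and the exact metric names Spice emits should be taken from the documentation rather than from this example.

```python
from typing import Dict


def parse_samples(text: str) -> Dict[str, float]:
    """Parse Prometheus text exposition lines into {series: value}.

    Simplification: comment lines (# HELP / # TYPE) are skipped and
    optional timestamps are not handled.
    """
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def select(samples: Dict[str, float], needle: str) -> Dict[str, float]:
    """Keep only series whose metric name (before any labels) contains `needle`."""
    return {
        series: value
        for series, value in samples.items()
        if needle in series.split("{", 1)[0]
    }
```

For example, `select(parse_samples(body), "mysql")` would narrow a full metrics scrape down to the MySQL-related series, assuming those series carry "mysql" in their names.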
- spice.js SDK: The spice.js SDK has been updated to v2.0.1 and includes several important security updates.
New Contributors 🎉
Contributors
Breaking Changes
No breaking changes in this release.
Cookbook Updates
The Spice Cookbook now includes 65 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.1.1, use one of the following methods:
CLI:
console
spice upgrade
Homebrew:
console
brew upgrade spiceai/spiceai/spice
Docker:
Pull the spiceai/spiceai:1.1.1 image:
console
docker pull spiceai/spiceai:1.1.1
For available tags, see DockerHub.
Helm:
console
helm repo update
helm upgrade spiceai spiceai/spiceai
What's Changed
Dependencies
- No major dependency changes.
Changelog
- fix: Testoperator DuckDB, SQLite, Postgres, Spicecloud by @peasee in #5190
- Update Helm Chart and SECURITY.md to v1.1.0 by @lukekim in #5223
- Update version.txt to v1.1.1-unstable by @lukekim in #5224
- Update Cargo.lock to v1.1.1-unstable by @lukekim in #5225
- Add tests for verify_schema_source_path in ListingTableConnector by @phillipleblanc in #5221
- Reduce noise from debug logging by @phillipleblanc in #5227
- Improve openai_test_chat_messages integration test reliability by @Sevenannn in #5222
- Verify the checkpoints existence before shutting down runtime in integration tests directly querying checkpoint by @Sevenannn in #5232
- Fix CORS support for json content-type api by @sgrebnov in #5241
- Fix ModelGradedScorer error: The 'metadata' parameter is only allowed when 'store' is enabled. by @sgrebnov in #5231
- fix: Use pulls-with-spice-action and switch to spiceai-macos runners by @peasee in #5238
- Use v1.0.3 pulls with spice action by @lukekim in #5244
- feat: Build ODBC binaries, run testoperator on ODBC by @peasee in #5237
- Bump timeout for several integration test runtime load_components & readiness check by @Sevenannn in #5229
- Validate port is available before binding port for docker container in integration tests by @Sevenannn in #5248
- Update datafusion-table-providers to fix the schema for PostgreSQL materialized views by @ewgenius in #5259
- Verify flight server is ready for flight integration tests by @Sevenannn in #5240
- fix: Publish to MinIO inside of matrix on build_and_release by @peasee in #5258
- fix: TPCDS on zero results benchmarks by @peasee in #5263
- Use model as a judge scorer for Financebench by @sgrebnov in #5264
- Fix FinanceBench llm scorer secret name by @sgrebnov in #5276
- Implements support for runtime.results_cache.cache_key_type by @phillipleblanc in #5265
- fix: Testoperator MS SQL, query overrides, dispatcher by @peasee in #5279
- refactor: Delete old benchmarks by @peasee in #5283
- Imporve embedding column parsing performance test by @Sevenannn in #5268
- Add Support for AWS Session Token in S3 Data Connector by @kczimm in #5243
- Implement Component Metrics system + MySQL connection pool metrics by @phillipleblanc in #5290
- Add default descriptions to built-in MCP tools by @lukekim in #5293
- fix: Vector search with cased columns by @peasee in #5295
- Run delta kernel scan in a blocking Tokio thread. by @phillipleblanc in #5296
- Expose the mysql_pool_min and mysql_pool_max connection pool parameters by @phillipleblanc in #5297
- use patched pdf-extract by @kczimm in #5270
Full Changelog: v1.1.0...v1.1.1
- Rust
Published by phillipleblanc 11 months ago
https://github.com/spiceai/spiceai - v1.1.0
Spice v1.1.0 (Mar 31, 2025)
Spice v1.1.0 introduces full support for the Model-Context-Protocol (MCP), expanding how models and tools connect. Spice can now act as both an MCP Server, with the new /v1/mcp/sse API, and an MCP Client, supporting stdio and SSE-based servers. This release also introduces a new Web Search tool with Perplexity model support, advanced evaluation workflows with custom eval scorers, including LLM-as-a-judge, and adds an IMAP Data Connector for federated SQL queries across email servers. Alongside these features, v1.1.0 includes automatic NSQL query retries, expanded task tracing, and request draining for HTTP server shutdowns, delivering improved reliability, flexibility, and observability.
Highlights in v1.1.0
- Spice as an MCP Server and Client: Spice now supports the Model Context Protocol (MCP), for expanded tool discovery and connectivity. Spice can:
- Run stdio-based MCP servers internally.
- Connect to external MCP servers over SSE protocol (Streamable HTTP is coming soon!)
For more details, see the MCP documentation.
### Usage
yaml
tools:
  - name: google_maps
    from: mcp:npx
    params:
      mcp_args: -y @modelcontextprotocol/server-google-maps
### Spice as an MCP Server
Tools in Spice can be accessed via MCP. For example, connecting from an IDE like Cursor or Windsurf to Spice. Set the MCP Server URL to http://localhost:8090/v1/mcp/sse.
- Perplexity Model Support: Spice now supports Perplexity-hosted models, enabling advanced web search and retrieval capabilities. Example configuration:
yaml
models:
  - name: webs
    from: perplexity:sonar
    params:
      perplexity_auth_token: ${ secrets:SPICE_PERPLEXITY_AUTH_TOKEN }
      perplexity_search_domain_filter:
        - docs.spiceai.org
        - huggingface.co
For more details, see the Perplexity documentation.
- Web Search Tool: The new Web Search Tool enables Spice models to search the web for information using search engines like Perplexity. Example configuration:
yaml
tools:
  - name: the_internet
    from: websearch
    description: 'Search the web for information.'
    params:
      engine: perplexity
      perplexity_auth_token: ${ secrets:SPICE_PERPLEXITY_AUTH_TOKEN }
For more details, see the Web Search Tool documentation.
Eval Scorers: Eval scorers assess model performance on evaluation cases. Spice includes built-in scorers:
- match: Exact match.
- json_match: JSON equivalence.
- includes: Checks if actual output includes expected output.
- fuzzy_match: Normalized subset matching.
- levenshtein: Levenshtein distance.
Custom scorers can use embedding models or LLMs as judges. Example:
yaml
evals:
  - name: australia
    dataset: cricket_questions
    scorers:
      - hf_minilm
      - judge
      - match
embeddings:
  - name: hf_minilm
    from: huggingface:huggingface.co/sentence-transformers/all-MiniLM-L6-v2
models:
  - name: judge
    from: openai:gpt-4o
    params:
      openai_api_key: ${ secrets:OPENAI_API_KEY }
      system_prompt: |
        Compare these stories and score their similarity (0.0 to 1.0).
        Story A: {{ .actual }}
        Story B: {{ .ideal }}
For more details, see the Eval Scorers documentation.
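To give a feel for what the simpler built-in scorers compute, here is a Python sketch of match, includes, and a Levenshtein-based score. It is an illustration only, not the runtime's code: the function names are hypothetical, and turning the edit distance into a 0.0 to 1.0 score by dividing by the longer string's length is an assumption made for this example.

```python
def match_score(actual: str, ideal: str) -> float:
    """1.0 when the output equals the expected answer exactly, else 0.0."""
    return 1.0 if actual == ideal else 0.0


def includes_score(actual: str, ideal: str) -> float:
    """1.0 when the expected answer appears somewhere in the output."""
    return 1.0 if ideal in actual else 0.0


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def levenshtein_score(actual: str, ideal: str) -> float:
    """Assumed normalization: 1.0 for identical strings, approaching 0.0 as they diverge."""
    longest = max(len(actual), len(ideal))
    return 1.0 if longest == 0 else 1.0 - levenshtein(actual, ideal) / longest
```

Exact and substring scorers suit short factual answers, whereas a distance-based score tolerates small spelling or formatting differences.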
- IMAP Data Connector: Query emails stored in IMAP servers using federated SQL. Example:
yaml
datasets:
  - from: imap:myawesomeemail@gmail.com
    name: emails
    params:
      imap_access_token: ${secrets:IMAP_ACCESS_TOKEN}
For more details, see the IMAP Data Connector documentation.
Automatic NSQL Query Retries: Failed NSQL queries are now automatically retried, improving reliability for federated queries. For more details, see the NSQL documentation.
Enhanced Task Tracing: Task history now includes chat completion IDs, and runtime readiness is traced for better observability. Use the runtime.task_history table to query task details. See the Task History documentation.
Vector Search with Keyword Filtering: The vector search API now includes an optional list of keywords as a parameter, to pre-filter SQL results before performing a vector search. When vector searching via a chat completion, models will automatically generate keywords relevant to the search. See the Vector Search API documentation.
Improved Refresh Behavior on Startup: Spice won't automatically refresh an accelerated dataset on startup if it doesn't need to. See the Refresh on Startup documentation.
Graceful Shutdown for HTTP Server: The HTTP server now drains requests for graceful shutdowns, ensuring smoother runtime termination.
New Contributors 🎉
- @Garamda made their first contribution in https://github.com/spiceai/spiceai/pull/4840
- @sergey-shandar made their first contribution in https://github.com/spiceai/spiceai/pull/4868
- @benrussell made their first contribution in https://github.com/spiceai/spiceai/pull/5126
Contributors
- @sgrebnov
- @phillipleblanc
- @peasee
- @Jeadie
- @lukekim
- @benrussell
- @Sevenannn
- @sergey-shandar
- @Garamda
- @johnnynunez
Breaking Changes
No breaking changes.
Cookbook Updates
The Spice Cookbook now has 74 recipes that make it easy to get started with Spice!
Upgrading
To upgrade to v1.1.0, use one of the following methods:
CLI:
console
spice upgrade
Homebrew:
console
brew upgrade spiceai/spiceai/spice
Docker:
Pull the spiceai/spiceai:1.1.0 image:
console
docker pull spiceai/spiceai:1.1.0
For available tags, see DockerHub.
Helm:
console
helm repo update
helm upgrade spiceai spiceai/spiceai
What's Changed
Dependencies
- No major dependency changes.
Changelog
- release: Bump chart, and versions for next release by @peasee in https://github.com/spiceai/spiceai/pull/4464
- feat: Schedule testoperator by @peasee in https://github.com/spiceai/spiceai/pull/4503
- fix: Remove on zero results arguments from benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/4533
- fix: Don't snapshot clickbench benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/4534
- docs: v1.0.1 release note by @Sevenannn in https://github.com/spiceai/spiceai/pull/4529
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4535
- In spiced_docker, propagate setup to publish-cuda by @Jeadie in https://github.com/spiceai/spiceai/pull/4543
- Upgrade Rust to 1.84 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4541
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4546
- Revert "Use OpenAI golang client in spice chat (#4491)" by @Jeadie in https://github.com/spiceai/spiceai/pull/4564
- feat: add schema inference for the Spice.ai Data Connector by @peasee in https://github.com/spiceai/spiceai/pull/4579
- Remove 'tools: builtin' by @Jeadie in https://github.com/spiceai/spiceai/pull/4607
- feat: Add initial IMAP connector by @peasee in https://github.com/spiceai/spiceai/pull/4587
- feat: Add email content loading by @peasee in https://github.com/spiceai/spiceai/pull/4616
- feat: Add SSL and Auth parameters for IMAP by @peasee in https://github.com/spiceai/spiceai/pull/4613
- Change /v1/models to be OpenAI compatible by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4624
- Use `pdf-extract` crate to extract text from PDF documents by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4615
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4628
- Add 1.0.2 release notes by @sgrebnov in https://github.com/spiceai/spiceai/pull/4627
- Fix cuda::ffi by @Jeadie in https://github.com/spiceai/spiceai/pull/4649
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/4654
- fix: Spice.ai schema inference by @peasee in https://github.com/spiceai/spiceai/pull/4674
- Add SQL Benchmark with sample eval configuration based on TPCH by @sgrebnov in https://github.com/spiceai/spiceai/pull/4549
- Update Helm chart to Spice v1.0.2 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4655
- Update v1.0.2 release notes by @sgrebnov in https://github.com/spiceai/spiceai/pull/4639
- Fix E2E AI release install test on self-hosted runners (macos) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4675
- Main performance metrics calculation for Text to SQL Benchmark by @sgrebnov in https://github.com/spiceai/spiceai/pull/4681
- Add eval datasets / test scripts for model grading criteria by @Sevenannn in https://github.com/spiceai/spiceai/pull/4663
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4684
- Add testoperator for `evals` running by @sgrebnov in https://github.com/spiceai/spiceai/pull/4688
- Add GH Workflow to run Text to SQL benchmark by @sgrebnov in https://github.com/spiceai/spiceai/pull/4689
- Add 1.0.2 as supported version to SECURITY.md by @sgrebnov in https://github.com/spiceai/spiceai/pull/4695
- Text-To-SQL benchmark: trace failed tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/4705
- Text-To-SQL benchmark: extend list of benchmarking models by @sgrebnov in https://github.com/spiceai/spiceai/pull/4707
- Text-To-SQL: increase sql coverage, add more advanced tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/4713
- Use model that supports tools in hf_test by @Jeadie in https://github.com/spiceai/spiceai/pull/4712
- Fix Spice.ai E2E test by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4723
- Return non-existing model for v1/chat endpoint by @Sevenannn in https://github.com/spiceai/spiceai/pull/4718
- Update Helm chart for 1.0.3 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4742
- Update dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4740
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/4744
- Update SECURITY.md with 1.0.3 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4745
- Add basic smoke test of perplexity LLM to llm integration tests. by @Jeadie in https://github.com/spiceai/spiceai/pull/4735
- Don't run integration tests on PRs when only CLI is changed by @Jeadie in https://github.com/spiceai/spiceai/pull/4751
- Prompt user to upgrade through brew / do another clean install when spice is installed through homebrew / at non-standard path by @Sevenannn in https://github.com/spiceai/spiceai/pull/4746
- feat: Search with keyword filtering by @peasee in https://github.com/spiceai/spiceai/pull/4759
- Fix search benchmark by @sgrebnov in https://github.com/spiceai/spiceai/pull/4765
- feat: Add IMAP access token parameter by @peasee in https://github.com/spiceai/spiceai/pull/4769
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4774
- Mark trunk builds as unstable by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4776
- feat: Release Spice.ai RC by @peasee in https://github.com/spiceai/spiceai/pull/4753
- fix: Validate columns and keywords in search by @peasee in https://github.com/spiceai/spiceai/pull/4775
- Run models E2E tests on PR by @sgrebnov in https://github.com/spiceai/spiceai/pull/4798
- fix: models runtime not required for cloud chat by @peasee in https://github.com/spiceai/spiceai/pull/4781
- Only open one PR for openapi.json by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4807
- docs: Release IMAP Alpha by @peasee in https://github.com/spiceai/spiceai/pull/4797
- Add Results-Cache-Status to indicate query result came from cache by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4809
- Initial spice cli e2e tests with spice upgrade tests by @Sevenannn in https://github.com/spiceai/spiceai/pull/4764
- Log CLI and Runtime Versions on startup by @sgrebnov in https://github.com/spiceai/spiceai/pull/4816
- Sort keys for openai by @Jeadie in https://github.com/spiceai/spiceai/pull/4766
- Remove docs index trigger from the endgame template by @ewgenius in https://github.com/spiceai/spiceai/pull/4832
- Release notes for v1.0.4 by @Jeadie in https://github.com/spiceai/spiceai/pull/4827
- Update SECURITY.md by @Jeadie in https://github.com/spiceai/spiceai/pull/4829
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/4831
- Don't print URL by @lukekim in https://github.com/spiceai/spiceai/pull/4838
- add 'eval_run' to 'spice trace' by @Jeadie in https://github.com/spiceai/spiceai/pull/4841
- Run benchmark tests w/o uploading test results (pending improvements) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4843
- Fix "actual" and "output" columns in `eval.results`. by @Jeadie in https://github.com/spiceai/spiceai/pull/4835
- Fix string escaping of system prompt by @Jeadie in https://github.com/spiceai/spiceai/pull/4844
- update helm chart to v1.0.4 by @Jeadie in https://github.com/spiceai/spiceai/pull/4828
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4806
- fix: Skip sccache in PR for external users by @peasee in https://github.com/spiceai/spiceai/pull/4851
- fix: Return BAD_REQUEST when not embeddings are configured by @peasee in https://github.com/spiceai/spiceai/pull/4804
- Debug log cuda detection failure in spice by @Sevenannn in https://github.com/spiceai/spiceai/pull/4852
- fix: Set RUSTC wrapper explicitly by @peasee in https://github.com/spiceai/spiceai/pull/4854
- Improve trace UX for `ai_completion`, fix infinite tool calls by @Jeadie in https://github.com/spiceai/spiceai/pull/4853
- Allow homebrew spice cli to upgrade the runtime by @Sevenannn in https://github.com/spiceai/spiceai/pull/4811
- Add support for MCP tools by @Jeadie in https://github.com/spiceai/spiceai/pull/4808
- fix: Rustc wrapper actions by @peasee in https://github.com/spiceai/spiceai/pull/4867
- Provide link to supported OS list when user platform is not supported by @Garamda in https://github.com/spiceai/spiceai/pull/4840
- Always download spice runtime version matched with spice cli version by @Sevenannn in https://github.com/spiceai/spiceai/pull/4761
- Disable flaky integration test by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4871
- fix: sccache actions setup by @peasee in https://github.com/spiceai/spiceai/pull/4873
- Fixing Go installation in the setup script for Linux Arm64 by @sergey-shandar in https://github.com/spiceai/spiceai/pull/4868
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4864
- DuckDB acceleration: Use temp table only for append with conflict resolution by @sgrebnov in https://github.com/spiceai/spiceai/pull/4874
- Trace the output of streamed `chat/completions` to runtime.task_history. by @Jeadie in https://github.com/spiceai/spiceai/pull/4845
- Always pass `X-API-Key` in spice api calls header if detected in env by @ewgenius in https://github.com/spiceai/spiceai/pull/4878
- Revert "DuckDB acceleration: Use temp table only for append with conflict resolution" by @sgrebnov in https://github.com/spiceai/spiceai/pull/4886
- Allow overriding spicerack base url in the CLI by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4892
- Add test Spicepod for DuckDB full acceleration with constraints by @sgrebnov in https://github.com/spiceai/spiceai/pull/4891
- Refactor Parameter Handling by @Advayp in https://github.com/spiceai/spiceai/pull/4833
- Add test Spicepod for DuckDB append acceleration with constraints by @sgrebnov in https://github.com/spiceai/spiceai/pull/4898
- Update to latest async-openai fork. Update secrecy by @Sevenannn in https://github.com/spiceai/spiceai/pull/4911
- Fix mcp tools build by @sgrebnov in https://github.com/spiceai/spiceai/pull/4916
- Add more test spicepods by @Sevenannn in https://github.com/spiceai/spiceai/pull/4923
- task: Add more dispatch files by @peasee in https://github.com/spiceai/spiceai/pull/4933
- run spiceai benchmark test using test operator by @Sevenannn in https://github.com/spiceai/spiceai/pull/4920
- Convert sequential search code block to parallel async by @Garamda in https://github.com/spiceai/spiceai/pull/4936
- fix: Throughput metric calculation by @peasee in https://github.com/spiceai/spiceai/pull/4938
- Update dependabot dependencies & `cargo update` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4872
- Improve servers shutdown sequence during runtime termination by @sgrebnov in https://github.com/spiceai/spiceai/pull/4942
- Semantic model for views. Views visible in `table_schema` & `list_datasets` tools. by @Jeadie in https://github.com/spiceai/spiceai/pull/4946
- update openai-async by @Jeadie in https://github.com/spiceai/spiceai/pull/4948
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4961
- fix: Redundant results snapshotting by @peasee in https://github.com/spiceai/spiceai/pull/4956
- Create schema for views if not exist by @Jeadie in https://github.com/spiceai/spiceai/pull/4957
- Bump Jimver/cuda-toolkit from 0.2.21 to 0.2.22 by @dependabot in https://github.com/spiceai/spiceai/pull/4969
- List available operations in `spice trace <operation>` by @Jeadie in https://github.com/spiceai/spiceai/pull/4953
- Initial commit of release analytics by @lukekim in https://github.com/spiceai/spiceai/pull/4975
- Remove spaces from CSV by @lukekim in https://github.com/spiceai/spiceai/pull/4977
- Fix Spice pods watcher by @sgrebnov in https://github.com/spiceai/spiceai/pull/4984
- feat: Add appendable data sources for the testoperator by @peasee in https://github.com/spiceai/spiceai/pull/4949
- Omit timestamp when warning regarding datasets with hyphens by @Advayp in https://github.com/spiceai/spiceai/pull/4987
- Update helm chart to v1.0.5 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4990
- docs: Update qa_analytics.csv by @peasee in https://github.com/spiceai/spiceai/pull/4989
- Update end_game template by @sgrebnov in https://github.com/spiceai/spiceai/pull/4991
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/4993
- Add v1.0.5 release notes by @sgrebnov in https://github.com/spiceai/spiceai/pull/4994
- Supported Versions: include v1.0.5 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4995
- Dependabot updates by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4992
- Switch to basic markdown formatting for vector search by @sgrebnov in https://github.com/spiceai/spiceai/pull/4934
- docs: Update qa_analytics.csv by @peasee in https://github.com/spiceai/spiceai/pull/5001
- feat: Add TPCDS FileAppendableSource for testoperator by @peasee in https://github.com/spiceai/spiceai/pull/5002
- Update `ring` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5003
- docs: Update qa_analytics.csv by @peasee in https://github.com/spiceai/spiceai/pull/5006
- feat: Add ClickBench FileAppendableSource for testoperator by @peasee in https://github.com/spiceai/spiceai/pull/5004
- feat: Validate append test table counts by @peasee in https://github.com/spiceai/spiceai/pull/5008
- feat: Add append spicepods by @peasee in https://github.com/spiceai/spiceai/pull/5009
- Improve Vector Search performance for large content w/o primary key defined by @sgrebnov in https://github.com/spiceai/spiceai/pull/5010
- Don't try to downgrade Arc in test_acceleration_duckdb_single_instance by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5014
- feat: Add an initial testoperator vector search command by @peasee in https://github.com/spiceai/spiceai/pull/5011
- feat: Update testoperator workflows for automatic snapshot updates by @peasee in https://github.com/spiceai/spiceai/pull/5018
- Fix Vector Search when additional columns include embedding column by @sgrebnov in https://github.com/spiceai/spiceai/pull/5022
- Include test for primary key passed as additional column in Vector Search by @sgrebnov in https://github.com/spiceai/spiceai/pull/5024
- fix: Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/5020
- upgrade mistral.rs by @Jeadie in https://github.com/spiceai/spiceai/pull/4952
- fix: Indexes for TPCDS SQLite Spicepod by @peasee in https://github.com/spiceai/spiceai/pull/5038
- fix: Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/5035
- Include local files in generated Spicepod package by @sgrebnov in https://github.com/spiceai/spiceai/pull/5041
- update mistral.rs to 'spiceai' branch rev by @Jeadie in https://github.com/spiceai/spiceai/pull/5029
- Configure spiced as an MCP SSE server by @Jeadie in https://github.com/spiceai/spiceai/pull/5039
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/5052
- fix: Disable benchmarks schedule, enable testoperator schedule by @peasee in https://github.com/spiceai/spiceai/pull/5058
- fix: Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/5060
- Update ROADMAP.md March 2025 by @lukekim in https://github.com/spiceai/spiceai/pull/5061
- fix: Testoperator data setup by @peasee in https://github.com/spiceai/spiceai/pull/5068
- fix: All HTTP endpoints to hang when adding an invalid dataset with --pods-watcher-enabled by @sgrebnov in https://github.com/spiceai/spiceai/pull/5050
- fix: Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/5073
- Integration tests for MCP tooling by @Jeadie in https://github.com/spiceai/spiceai/pull/5053
- OpenAPI docs for MCP by @Jeadie in https://github.com/spiceai/spiceai/pull/5057
- fix: Acceleration federation test by @peasee in https://github.com/spiceai/spiceai/pull/5090
- fix: Allow spiced commit in testoperator dispatch by @peasee in https://github.com/spiceai/spiceai/pull/5098
- fix: Use RefreshOverrides for the refresh API definition by @peasee in https://github.com/spiceai/spiceai/pull/5095
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/5094
- fix: Increase tries for refresh_status_change_to_ready test by @peasee in https://github.com/spiceai/spiceai/pull/5099
- feat: Testoperator reports on max and median memory usage by @peasee in https://github.com/spiceai/spiceai/pull/5101
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/5105
- fix: Fail testoperator on failed queries by @peasee in https://github.com/spiceai/spiceai/pull/5106
- Update Helm chart to 1.0.6 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5107
- Update SECURITY.md to include 1.0.6 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5109
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/5108
- Add QA analytics for 1.0.6 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5110
- add env variables to tools, usable in MCP stdio by @Jeadie in https://github.com/spiceai/spiceai/pull/5097
- HF downloads obey SIGTERM by @Jeadie in https://github.com/spiceai/spiceai/pull/5044
- Add v1.0.6 release notes into trunk by @sgrebnov in https://github.com/spiceai/spiceai/pull/5111
- Remove redundant mod name for iceberg integration tests by @Sevenannn in https://github.com/spiceai/spiceai/pull/5112
- Use fixed data directory for test operator by @Sevenannn in https://github.com/spiceai/spiceai/pull/5103
- Improvements for evals by @Jeadie in https://github.com/spiceai/spiceai/pull/5040
- Make McpProxy trait for MCP passthrough by @Jeadie in https://github.com/spiceai/spiceai/pull/5115
- Properly handle '/' for tool names. by @Jeadie in https://github.com/spiceai/spiceai/pull/5116
- Use retry logic when loading tools by @Jeadie in https://github.com/spiceai/spiceai/pull/5120
- Exclude slow tests from regular pr runs by @Sevenannn in https://github.com/spiceai/spiceai/pull/5119
- Fix test operator snapshot update by @Sevenannn in https://github.com/spiceai/spiceai/pull/5130
- spice init: Fixes windows bug where full path is used for spicepod name by @benrussell in https://github.com/spiceai/spiceai/pull/5126
- fix: Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/5131
- Implement graceful shutdown for HTTP server by @sgrebnov in https://github.com/spiceai/spiceai/pull/5102
- Update enhancement.md by @lukekim in https://github.com/spiceai/spiceai/pull/5142
- Add GitHub Workflow and PoC Spicepod configuration to run FinanceBench tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/5145
- Fix Postgres and MySQL installation on macos14-runner (E2E CI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5155
- De-duplicate attachments in DuckDBAttachments by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5156
- v1.0.7 release note by @Sevenannn in https://github.com/spiceai/spiceai/pull/5153
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/5160
- Update Helm chart to 1.0.7 by @Sevenannn in https://github.com/spiceai/spiceai/pull/5159
- Add github token to macos test release download tasks by @Sevenannn in https://github.com/spiceai/spiceai/pull/5161
- update security.md for 1.0.7 by @Sevenannn in https://github.com/spiceai/spiceai/pull/5162
- Update roadmap.md by @Sevenannn in https://github.com/spiceai/spiceai/pull/5163
- Add a performance comparison section for 1.0.7 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5164
- docs: Add snafu error variant point to style guide by @peasee in https://github.com/spiceai/spiceai/pull/5167
- Fix 1.0.7 release note by @Sevenannn in https://github.com/spiceai/spiceai/pull/5168
- Adjust DuckDB connection pool size based on DuckDB accelerator instances usage by @Sevenannn in https://github.com/spiceai/spiceai/pull/5117
- Add automatic retry for NSQL queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/5169
- Include chat completion id to task history by @sgrebnov in https://github.com/spiceai/spiceai/pull/5170
- Trace when all runtime components are ready by @sgrebnov in https://github.com/spiceai/spiceai/pull/5171
- Update qa_analytics.csv for 1.0.7 by @Sevenannn in https://github.com/spiceai/spiceai/pull/5165
- Set default tool recursion limit to 10 to prevent infinite loops by @sgrebnov in https://github.com/spiceai/spiceai/pull/5173
- Add support for `schema_source_path` param for object-store data connectors by @sgrebnov in https://github.com/spiceai/spiceai/pull/5178
- Run license check and check changes on self-hosted macOS runners by @lukekim in https://github.com/spiceai/spiceai/pull/5179
- Add MCP by @lukekim in https://github.com/spiceai/spiceai/pull/5183
Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0...release/1.1
Published by phillipleblanc 11 months ago
https://github.com/spiceai/spiceai - v1.0.7
Spice v1.0.7 (Mar 26, 2025)
Spice v1.0.7 improves memory usage when using DuckDB, improves schema inference performance when using object-store based data connectors, and fixes a bug in Dremio schema inference.
Highlights in v1.0.7
- DuckDB Memory Usage: Memory usage when using DuckDB has been significantly improved for data loads and refreshes through expanded use of zero-copy Arrow and multi-threading for data loads. When a `duckdb_memory_limit` is specified, disk spilling has been improved for greater-than-memory workloads. In addition, a new `temp_directory` runtime parameter supports storing temporary files in a different location from the DuckDB data file for higher throughput. For example, `temp_directory` could be set to a high-IOPS IO2 EBS volume that is separate from the `duckdb_file_path`.
Automated end-to-end test coverage for the DuckDB Accelerator has been significantly expanded.
For configuration details, see the documentation for runtime parameters and the DuckDB Data Accelerator.
- Schema Inference Performance for Object-Store Data Connectors: Schema inference performance has been improved, especially for large numbers of objects (1M+ objects) when using object-store based data connectors by making the object-listing and selection more efficient.
Contributors
- @phillipleblanc
- @sgrebnov
- @peasee
- @Sevenannn
Breaking Changes
No breaking changes.
Upgrading
To upgrade to v1.0.7, use one of the following methods:
CLI:
console
spice upgrade
Homebrew:
console
brew upgrade spiceai/spiceai/spice
Docker:
Pull the spiceai/spiceai:1.0.7 image:
console
docker pull spiceai/spiceai:1.0.7
For available tags, see DockerHub.
Helm:
console
helm repo update
helm upgrade spiceai spiceai/spiceai
What's Changed
Dependencies
- DataFusion Table Providers: Upgraded from `760ece6ac52b7d180d697f347642af403c2e711c` to `9ba9dce19a1fdbd5e22cc2e445c5b3ea731944b4`.
Changelog
- fix: Remove on zero results arguments from benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/4533
- Run benchmark tests w/o uploading test results (pending improvements) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4843
- fix: Return BAD_REQUEST when not embeddings are configured by @peasee in https://github.com/spiceai/spiceai/pull/4804
- Fix Dremio schema inference by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5114
- Improve performance of schema inference for object-store data connectors by @sgrebnov in https://github.com/spiceai/spiceai/pull/5124
- Always download spice runtime version matched with spice cli version by @Sevenannn in https://github.com/spiceai/spiceai/pull/4761
- Fix go lint errors by @sgrebnov in https://github.com/spiceai/spiceai/pull/5147
- Make DuckDB acceleration E2E tests more comprehensive by @sgrebnov in https://github.com/spiceai/spiceai/pull/5146
- Enable Spice to load larger than memory datasets into DuckDB accelerations by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5149
- Add `temp_directory` runtime parameter and insert it for DuckDB accelerations by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5152
- Fix Postgres and MySQL installation on macos14-runner (E2E CI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5155
- Enable E2E for DuckDB full mode acceleration with indexes only in CI by @sgrebnov in https://github.com/spiceai/spiceai/pull/5154
Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.6...v1.0.7
Published by Sevenannn 11 months ago
https://github.com/spiceai/spiceai - v1.0.6
Spice v1.0.6 (Mar 17, 2025)
Spice v1.0.6 improves stability for DuckDB acceleration, improves the Iceberg Data and Catalog connectors when using AWS Glue, and fixes an issue with the `ready_state: on_registration` federation fallback when using DuckDB. In addition, redundant data refreshes on startup are avoided for accelerations with persistent data.
Highlights in v1.0.6
Iceberg Data/Catalog Connector Improvements: Improves Iceberg data & catalog connector reliability, including bug fixes for AWS Glue API rate-limiting and compatibility, REST API pagination support, explicit AWS credential handling, and support for AWS STS role assumption.
Fixes On-Registration Fallback when using DuckDB: Previously, when using DuckDB as a data accelerator with the `ready_state: on_registration` configuration, queries made during the initial data refresh did not properly fall back to the federated source. This is now fixed.
DuckDB downgraded for Stability: DuckDB has been downgraded to v1.1.3 due to a regression in memory handling tracked by duckdb/duckdb issue #16640. Once resolved and validated, Spice will re-upgrade to v1.2.x.
Expanded Integration Tests: Additional integration tests covering federated accelerator behavior and graceful shutdown processes have been added.
Optimized Data Refresh for Persistent Accelerations: Changed behavior in v1.0.6. When using persistent (file-mode) acceleration without a defined refresh interval, Spice performs a full refresh at startup only if no previously accelerated data is available. This ensures efficient startup behavior by avoiding unnecessary refreshes. This logic applies only to full refreshes when no refresh interval is specified.
To maintain the previous behavior and always refresh on every startup, set:
yaml
acceleration:
  refresh_on_startup: always
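The startup rule above can be read as a small decision function. The following is a minimal Python sketch of the rule as described in these notes, not the runtime's implementation. The function name, its arguments, and the `auto` default value are assumptions made for illustration, and cases outside the rule are assumed to refresh at startup as before.

```python
from typing import Optional


def refresh_at_startup(
    file_mode: bool,
    refresh_mode: str,
    refresh_interval: Optional[str],
    has_accelerated_data: bool,
    refresh_on_startup: str = "auto",  # "auto" is an assumed name for the default
) -> bool:
    """Return True if a dataset would be refreshed when the runtime starts."""
    # Setting `refresh_on_startup: always` restores the pre-v1.0.6 behavior.
    if refresh_on_startup == "always":
        return True

    # The v1.0.6 rule covers only persistent (file-mode) accelerations that
    # use full refreshes and have no refresh interval defined.
    covered = file_mode and refresh_mode == "full" and refresh_interval is None
    if covered:
        # Refresh only when no previously accelerated data is available.
        return not has_accelerated_data

    # Outside that case the notes describe no change; assumed here to refresh.
    return True
```

For example, `refresh_at_startup(True, "full", None, has_accelerated_data=True)` evaluates to `False`, while passing `refresh_on_startup="always"` to the same call evaluates to `True`.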
Contributors
- @peasee
- @phillipleblanc
- @sgrebnov
- @lukekim
- @Sevenannn
Breaking Changes
Starting from v1.0.6 when using persistent (file-mode) acceleration without a defined refresh interval, Spice performs a full refresh at startup only if no previously accelerated data is available. To maintain the previous behavior and always refresh on every startup, set:
yaml
acceleration:
  refresh_on_startup: always
Cookbook Updates
No new recipes.
Upgrading
To upgrade to v1.0.6, use one of the following methods:
CLI:
console
spice upgrade
Homebrew:
console
brew upgrade spiceai/spiceai/spice
Docker:
Pull the spiceai/spiceai:1.0.6 image:
console
docker pull spiceai/spiceai:1.0.6
For available tags, see DockerHub.
Helm:
console
helm repo update
helm upgrade spiceai spiceai/spiceai
What's Changed
Dependencies
- duckdb-rs: Downgraded from 1.2.0 to 1.1.3
Changelog
- Implement proper ready_state: on_registration for federation enabled accelerators by @phillipleblanc in #5019
- Add indexes and primary keys mismatch detection for DuckDB Acceleration by @sgrebnov in #5045
- Add comprehensive integration tests for the ready_state behavior by @phillipleblanc in #5042
- Add test Spicepod for acceleration with constraints by @sgrebnov in #4891
- Add test Spicepod for DuckDB append acceleration with constraints by @sgrebnov in #4898
- Add DuckDB graceful shutdown test to E2E CI tests by @sgrebnov in #5047
- Update duckdb_append_with_pk_and_indexes.yaml (work for duckdb 1.1.x) by @sgrebnov in #5067
- fix: Downgrade to DuckDB 1.1.3 by @peasee in #5055
- fix: Acceleration federation integration test by @peasee in #5070
- Improvements to Iceberg Catalog/Data Connector by @phillipleblanc in #5071
- Add Results-Cache-Status to indicate query result came from cache by @phillipleblanc in #4809
- fix: Spice.ai schema inference by @peasee in #4674
- Add `refresh_on_startup` Spicepod configuration param by @phillipleblanc and @sgrebnov in #5086
- Test restart behavior of DuckDB file acceleration against glue iceberg table by @Sevenannn in #5075
- Run Iceberg Data Connector - DuckDB File mode integration test by @Sevenannn in #5069
- Integration test for glue iceberg catalog by @Sevenannn in #5077
Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.5...v1.0.6
Published by phillipleblanc 12 months ago
https://github.com/spiceai/spiceai - v1.0.5
Spice v1.0.5 (Mar 10, 2025)
Spice v1.0.5 expands Iceberg support with the introduction of the Iceberg Data Connector, in addition to the existing Iceberg Catalog Connector. This new connector enables direct dataset creation and configuration for specific Iceberg objects, enabling federated and accelerated SQL queries on Apache Iceberg tables.
Performance improvements include enhanced Parquet pruning in append mode, where object-store metadata is now leveraged alongside Hive partitioning to optimize file pruning. This results in faster and more efficient queries.
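To illustrate the idea, here is a minimal Python sketch of pruning a file listing by Hive partition values and by the object-store last-modified timestamp. It is an illustration only, not Spice's implementation: `ObjectMeta`, `hive_partitions`, and `prune` are hypothetical names, and the watermark is assumed to be the newest timestamp already loaded by an earlier append refresh.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ObjectMeta:
    """Listing entry as an object store might report it (hypothetical shape)."""
    path: str  # e.g. "year=2025/month=03/part-0.parquet"
    last_modified: datetime


def hive_partitions(path: str) -> dict:
    """Parse key=value directory segments from an object path."""
    parts = {}
    for segment in path.split("/")[:-1]:  # skip the file name itself
        if "=" in segment:
            key, value = segment.split("=", 1)
            parts[key] = value
    return parts


def prune(objects, partition_filter, modified_after):
    """Keep objects matching the partition filter and newer than the watermark."""
    kept = []
    for obj in objects:
        parts = hive_partitions(obj.path)
        if any(parts.get(k) != v for k, v in partition_filter.items()):
            continue  # pruned by Hive partition value
        if obj.last_modified <= modified_after:
            continue  # pruned by object-store metadata (assumed already loaded)
        kept.append(obj)
    return kept


if __name__ == "__main__":
    utc = timezone.utc
    listing = [
        ObjectMeta("year=2025/month=02/a.parquet", datetime(2025, 2, 1, tzinfo=utc)),
        ObjectMeta("year=2025/month=03/b.parquet", datetime(2025, 3, 1, tzinfo=utc)),
        ObjectMeta("year=2025/month=03/c.parquet", datetime(2025, 3, 9, tzinfo=utc)),
    ]
    watermark = datetime(2025, 3, 5, tzinfo=utc)
    for obj in prune(listing, {"month": "03"}, watermark):
        print(obj.path)
```

Only files that pass both checks need to be opened, so the Parquet footers of the remaining files are never read.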
DuckDB has been upgraded to v1.2.0, along with additional stability improvements, including improved graceful shutdown and the ability to configure the DuckDB memory limit.
Additional updates include support for the Arrow Map type.
Highlights in v1.0.5
- New Iceberg Data Connector: Enables direct dataset creation and querying of Iceberg tables.
Example usage in spicepod.yaml:
yaml
datasets:
  - from: iceberg:https://iceberg-catalog-host.com/v1/namespaces/my_namespace/tables/my_table
    name: my_table
    params:
      # Same as Iceberg Catalog Connector
    acceleration:
      enabled: true
For detailed setup instructions, authentication options, and configuration parameters, refer to the Iceberg Data Connector documentation.
- Improved Parquet pruning in append mode: Uses object-store metadata for more efficient file pruning.
- DuckDB upgrade to v1.2.0 with improved graceful shutdown: Read the DuckDB v1.2.0 announcement for details, including breaking changes for `map` and `list_reduce`. Graceful shutdown of DuckDB has been improved for better stability across restarts.
- Configurable DuckDB memory limit: Use the `duckdb_memory_limit` parameter to set the DuckDB acceleration memory limit:
yaml
- from: spice.ai:path.to.my_dataset
  name: my_dataset
  acceleration:
    params:
      duckdb_memory_limit: '2GB'
    enabled: true
    engine: duckdb
    mode: file
Contributors
- @peasee
- @phillipleblanc
- @sgrebnov
- @lukekim
Breaking Changes
- DuckDB v1.2.0 includes breaking changes, including changes to map and list_reduce. See the DuckDB v1.2.0 announcement for details.
Upgrading
To upgrade to v1.0.5, use one of the following methods:
CLI:
console
spice upgrade
Homebrew:
console
brew upgrade spiceai/spiceai/spice
Docker:
Pull the spiceai/spiceai:1.0.5 image:
console
docker pull spiceai/spiceai:1.0.5
For available tags, see DockerHub.
Helm:
console
helm repo update
helm upgrade spiceai spiceai/spiceai
What's Changed
Dependencies
- duckdb-rs: Upgraded from 1.1.1 to 1.2.0
Changelog
- fix: Update OpenAI model health check by @peasee in #4849
- fix: Allow metrics endpoint setting in CLI by @peasee in #4939
- DuckDB acceleration: fix Decimal with zero scale support by @sgrebnov in #4922
- Introduce runtime shutdown state by @sgrebnov in #4917
- Add support for Flight and HTTP endpoints configuration to Spice CLI (run and sql) by @sgrebnov and @lukekim in #4913
- Fix Datafusion resources deallocation during shutdown by @sgrebnov in #4912
- DuckDB: fix error handling during record batch insertion by @sgrebnov in #4894
- DuckDB: add support for Map Arrow type for DuckDB acceleration by @sgrebnov in #4887
- Upgrade to DuckDB v1.2.0 by @sgrebnov in #4842
- Gracefully shutdown the runtime and deallocate static resources by @sgrebnov in #4879
- Implement an Iceberg Data Connector by @phillipleblanc in #4941
- Don't trace canceled dataset refresh during runtime termination by @sgrebnov in #4958
- Use metadata column last_modified when specified as a time_column by @phillipleblanc in #4970
- Add duckdb_memory_limit param support for DuckDB acceleration by @sgrebnov in #4971
- Add Iceberg dataset integration test by @phillipleblanc in #4950
Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.4...v1.0.5
Published by sgrebnov 12 months ago
https://github.com/spiceai/spiceai - v1.0.4
Spice v1.0.4 (Feb 17, 2025)
Spice v1.0.4 includes several bug fixes and improvements: improved table column casing and normalization, Delta Lake partition pruning, improved tracing throughout spiced, and added functionality in spice trace.
Highlights in v1.0.4
- Improved
spice tracefunctionality: A more detailedspice traceformat with new flags--include-output,--include-inputand--truncate``` >> spice trace ai_chat --include-input --truncate
TREE STATUS DURATION TASK INPUT
b28bab6b58971b7e ✅ 1352.12ms aichat {"messages":[{"role":"user","content":"hello"}],"model":"openaimodel","stream":... (45 characters omitted)
└── 1a0ad7c6138abb09 ✅ 1352.03ms aicompletion {"messages":[{"role":"user","content":"hello"}],"model":"openaimodel","stream":... (45 characters omitted)
```
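The truncated INPUT column above can be read as a fixed-width cut followed by a count of the dropped characters. The following is a minimal Python sketch of that formatting, not the CLI's actual implementation: the function name `truncate_field` and the default width of 80 characters are assumptions made for illustration.

```python
def truncate_field(text: str, max_chars: int = 80) -> str:
    """Shorten text to max_chars and report how many characters were dropped.

    Hypothetical helper: the real width used by `spice trace --truncate`
    is not stated in these notes, so max_chars is an assumed default.
    """
    if len(text) <= max_chars:
        # Short values are shown unchanged.
        return text
    omitted = len(text) - max_chars
    return f"{text[:max_chars]}... ({omitted} characters omitted)"


if __name__ == "__main__":
    payload = '{"messages":[{"role":"user","content":"hello"}],"model":"openai_model","stream":false}'
    print(truncate_field(payload, max_chars=40))
```

Keeping the omitted-character count in the output makes it clear that the value was shortened for display rather than stored in that form.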
Contributors
- @phillipleblanc
- @Sevenannn
- @sgrebnov
- @peasee
- @Jeadie
- @lukekim
Breaking Changes
No breaking changes.
Cookbook Updates
Upgrading
To upgrade to v1.0.4, use one of the following methods:
CLI:
```console
spice upgrade
```
Homebrew:
```console
brew upgrade spiceai/spiceai/spice
```
Docker:
Pull the spiceai/spiceai:1.0.4 image:
```console
docker pull spiceai/spiceai:1.0.4
```
For available tags, see DockerHub.
Helm:
```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```
What's Changed
Dependencies
No major dependency changes.
Changelog
- Do not return underlying content of chunked embedding column by default during tool_use::document_similarity by @Jeadie in https://github.com/spiceai/spiceai/pull/4802
- Fix Snowflake Case-Sensitive Identifiers support by @sgrebnov in https://github.com/spiceai/spiceai/pull/4813
- Prepare for 1.0.4 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4801
- Add support for a time_partition_column by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4784
- Prevent the automatic normalization of refresh_sql columns to lowercase by @sgrebnov in https://github.com/spiceai/spiceai/pull/4787
- Implement partition pruning for Delta Lake tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4783
- Fix constraint verification for columns with uppercase letters by @sgrebnov in https://github.com/spiceai/spiceai/pull/4785
- Add truncate command for spice trace by @peasee in https://github.com/spiceai/spiceai/pull/4771
- Implement Cache-Control: no-cache to bypass results cache by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4763
- Prompt user to download runtime when running spice sql by @Sevenannn in https://github.com/spiceai/spiceai/pull/4747
- Add vector search tracing by @peasee in https://github.com/spiceai/spiceai/pull/4757
- Update spice trace output format by @Jeadie in https://github.com/spiceai/spiceai/pull/4750
- Fix tool call arguments in Grok messages by @Jeadie in https://github.com/spiceai/spiceai/pull/4741
Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.3...v1.0.4
Published by phillipleblanc about 1 year ago
https://github.com/spiceai/spiceai - v1.0.3
Spice v1.0.3 (Feb 10, 2025)
Spice v1.0.3 provides several bug fixes, including a fix for the initial data load period when a retention policy has been set, and a new unsupported_type_action: string parameter to auto-convert unsupported types to strings.
Highlights in v1.0.3
- PostgreSQL Data Connector: New unsupported_type_action: string parameter that auto-converts unsupported types such as JSONB to strings.
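As a rough illustration of what converting an unsupported type to a string means for a JSONB value, here is a minimal Python sketch. It is not the connector's implementation: the function `coerce_value`, the set of "supported" Python types, and the fallback behavior of raising an error for any action other than `string` are all assumptions made for the example.

```python
import json


def coerce_value(value, action: str = "error"):
    """Return supported scalars unchanged; otherwise apply the given action.

    Hypothetical helper. Only the "string" action is described in the
    release notes; raising for other actions is an assumption.
    """
    supported = (type(None), bool, int, float, str, bytes)
    if isinstance(value, supported):
        return value
    if action == "string":
        # JSONB-like values (dicts, lists) are serialized to a JSON string.
        return json.dumps(value, sort_keys=True)
    raise TypeError(f"unsupported type: {type(value).__name__}")


# A row as it might arrive from PostgreSQL, with a JSONB column decoded to a dict.
row = {"id": 1, "profile": {"plan": "pro", "seats": 3}}
converted = {key: coerce_value(val, action="string") for key, val in row.items()}

if __name__ == "__main__":
    print(converted)
```

The trade-off is that the column stays queryable as text, but any structure inside the original value has to be parsed again by the consumer.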
Contributors
- @phillipleblanc
- @Sevenannn
- @sgrebnov
- @peasee
- @Jeadie
- @lukekim
Breaking Changes
No breaking changes.
Cookbook Updates
- Updated Kubernetes Deployment Recipe
- Updated Data Retention Recipe
Upgrading
To upgrade to v1.0.3, use one of the following methods:
CLI:
```console
spice upgrade
```
Homebrew:
```console
brew upgrade spiceai/spiceai/spice
```
Docker:
Pull the spiceai/spiceai:1.0.3 image:
```console
docker pull spiceai/spiceai:1.0.3
```
For available tags, see DockerHub.
Helm:
```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```
What's Changed
Dependencies
No major dependency changes.
Changelog
- For local models, use 'content=""' instead of None by @Jeadie and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4646
- Perplexity Sonar LLM component by @Jeadie and @lukekim in https://github.com/spiceai/spiceai/pull/4673
- Update async openai fork & support reasoning effort parameter by @Sevenannn and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4679
- Web search tool by @Jeadie and @lukekim in https://github.com/spiceai/spiceai/pull/4687
- Setup tpc-extension by @ewgenius and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4690
- fix: Use PostgreSQL interval style for Spice.ai by @peasee and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4716
- Fix spice upgrade command by @Sevenannn and @sgrebnov in https://github.com/spiceai/spiceai/pull/4699
- Fix bug: Ensure refresh only retrieves data within the retention period by @sgrebnov and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4717
- Implement unsupported_type_action: string for Postgres JSONB support by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4719
- Fix the get latest release logic by @Sevenannn and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4721
- add 'accelerated_refresh' to 'spice trace' allowlist by @Jeadie and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4711
- Update version to 1.0.3 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4731
- Truncate embedding columns within sampling tool by @Jeadie in https://github.com/spiceai/spiceai/pull/4722
- Validate primary key columns during accelerated dataset initialization by @sgrebnov in https://github.com/spiceai/spiceai/pull/4736
Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.2...v1.0.3
Published by phillipleblanc about 1 year ago
https://github.com/spiceai/spiceai - v1.0.2
Spice v1.0.2 (Feb 3, 2025)
Spice v1.0.2 adds support for running local filesystem-hosted DeepSeek models including R1 (cloud-hosted via DeepSeek API was already supported) and improves the developer experience for debugging AI chat tasks along with several bug fixes. The HuggingFace and Filesystem-Hosted models providers have both graduated to Release Candidates (RC) and the Spice.ai Cloud Platform catalog provider has graduated to Beta.
Highlights in v1.0.2
- spice trace: New spice trace CLI command that outputs a detailed breakdown of traces and tasks, including tool usage and AI completions.
Examples:
```shell
trace> spice trace ai_chat
61cc6bd0e571c783 ai_chat
├── 69362c30f238076f tool_use::get_readiness
├── b6b17f1a9a6b86dc ai_completion
├── c30d692c6c41c5ee tool_use::list_datasets
└── ce18756d5fef0df0 ai_completion

trace> spice trace ai_chat --trace-id 61cc6bd0e571c783
trace> spice trace ai_chat --id chatcmpl-AvXwmPSV1PMyGBi9dLfkEQTZPjhqz
```
The spice trace CLI simply outputs data available in the runtime.task_history table, which can also be queried with SQL.
To learn more, see the spice trace Documentation.
- Filesystem-Hosted Models Provider: Graduated to Release Candidate (RC). To learn more, see the Filesystem-Hosted Models Provider Documentation.
- HuggingFace Models Provider: Graduated to Release Candidate (RC). To learn more, see the HuggingFace Models Provider Documentation.
- Spice.ai Cloud Platform Catalog: Graduated to Beta.
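Because spice trace reads from the runtime.task_history table, the tree it prints can be thought of as a parent/child walk over task rows. The Python sketch below shows that idea on in-memory rows. It is an illustration only: the `(span_id, parent_span_id, task)` row shape and the function `render_trace` are assumptions, not the CLI's code or the table's confirmed schema.

```python
from collections import defaultdict


def render_trace(rows):
    """Render (span_id, parent_span_id, task) rows as an indented tree.

    Hypothetical helper: the row shape is assumed for illustration.
    """
    children = defaultdict(list)
    for span_id, parent_id, task in rows:
        children[parent_id].append((span_id, task))

    lines = []

    def walk(parent_id, prefix):
        kids = children.get(parent_id, [])
        for index, (span_id, task) in enumerate(kids):
            last = index == len(kids) - 1
            if parent_id is None:
                # Root spans are printed without a branch marker.
                lines.append(f"{span_id} {task}")
                walk(span_id, "")
            else:
                branch = "└── " if last else "├── "
                lines.append(f"{prefix}{branch}{span_id} {task}")
                walk(span_id, prefix + ("    " if last else "│   "))

    walk(None, "")
    return "\n".join(lines)


# Sample rows reusing the span IDs from the example trace.
rows = [
    ("61cc6bd0e571c783", None, "ai_chat"),
    ("69362c30f238076f", "61cc6bd0e571c783", "tool_use::get_readiness"),
    ("b6b17f1a9a6b86dc", "61cc6bd0e571c783", "ai_completion"),
    ("c30d692c6c41c5ee", "61cc6bd0e571c783", "tool_use::list_datasets"),
    ("ce18756d5fef0df0", "61cc6bd0e571c783", "ai_completion"),
]

if __name__ == "__main__":
    print(render_trace(rows))
```

Grouping rows by parent first keeps the rendering to a single pass over the data, which matters little here but scales to traces with many nested tool calls.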
Contributors
- @phillipleblanc
- @johnnynunez
- @Sevenannn
- @sgrebnov
- @peasee
- @Jeadie
- @lukekim
New Contributors
- @johnnynunez made their first contribution in github.com/spiceai/spiceai/pull/4502
Breaking Changes
No breaking changes.
Cookbook Updates
Upgrading
To upgrade to v1.0.2, use one of the following methods:
CLI:
```console
spice upgrade
```
Homebrew:
```console
brew upgrade spiceai/spiceai/spice
```
Docker:
Pull the spiceai/spiceai:1.0.2 image:
```console
docker pull spiceai/spiceai:1.0.2
```
For available tags, see DockerHub.
Helm:
```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```
What's Changed
Dependencies
No major dependency changes.
Changelog
- Update release branch naming by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4539
- ready for arm buildings by @johnnynunez in https://github.com/spiceai/spiceai/pull/4502
- Bump helm chart version to 1.0.1 by @Sevenannn in https://github.com/spiceai/spiceai/pull/4542
- Include 1.0.1 as supported version in security.md by @Sevenannn in https://github.com/spiceai/spiceai/pull/4545
- Update CI to build on hosted windows runners by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4540
- docs: Update Windows install by @peasee in https://github.com/spiceai/spiceai/pull/4551
- Fix spark spicepod for test operator by @Sevenannn in https://github.com/spiceai/spiceai/pull/4555
- Improve hugging face model chat error by @Sevenannn in https://github.com/spiceai/spiceai/pull/4554
- fix: Update Windows E2E install by @peasee in https://github.com/spiceai/spiceai/pull/4557
- feat: Add Spice Cloud Catalog Spicepod, release Alpha by @peasee in https://github.com/spiceai/spiceai/pull/4561
- Fix huggingface embedding errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/4558
- feat: Load table schemas through REST for Spice Cloud Catalog by @peasee in https://github.com/spiceai/spiceai/pull/4563
- Add upgrade instruction in release note by @Sevenannn in https://github.com/spiceai/spiceai/pull/4548
- Add federated source information to refresh errors by @sgrebnov in https://github.com/spiceai/spiceai/pull/4560
- docs: Update ROADMAP.md by @peasee in https://github.com/spiceai/spiceai/pull/4566
- Merge mistral upstream by @Jeadie in https://github.com/spiceai/spiceai/pull/4562
- Fix windows build by @Sevenannn in https://github.com/spiceai/spiceai/pull/4574
- feat: Update Spice Cloud Catalog errors, release as Beta by @peasee in https://github.com/spiceai/spiceai/pull/4575
- docs: Add TOC to README.md by @peasee in https://github.com/spiceai/spiceai/pull/4538
- Updates to spiceai/mistral.rs by @Jeadie in https://github.com/spiceai/spiceai/pull/4580
- Improve refresh error tracing by @sgrebnov in https://github.com/spiceai/spiceai/pull/4576
- Add HTTP consistency & overhead to testoperator dispatch tool by @Jeadie in https://github.com/spiceai/spiceai/pull/4556
- Fix append mode refresh with MySQL Data Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/4583
- fix: Retry flaky tests by @peasee in https://github.com/spiceai/spiceai/pull/4577
- Fix E2E models test build on macOS runners by @sgrebnov in https://github.com/spiceai/spiceai/pull/4585
- spice trace chat support in CLI by @Jeadie in https://github.com/spiceai/spiceai/pull/4582
- Include hf test specs, enable ready_wait in workflow by @Sevenannn in https://github.com/spiceai/spiceai/pull/4584
- Add paths verification when loading models by @sgrebnov in https://github.com/spiceai/spiceai/pull/4591
- Add generation_config.json support for Filesystem models by @sgrebnov in https://github.com/spiceai/spiceai/pull/4592
- Promote Filesystem model provider to RC by @sgrebnov in https://github.com/spiceai/spiceai/pull/4593
- docs: Add models grading criteria by @peasee in https://github.com/spiceai/spiceai/pull/4550
- Fix typo in Alpha Release Criteria (models) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4588
- fix: Retry AI integration tests by @peasee in https://github.com/spiceai/spiceai/pull/4595
- Run LLM integration tests on Macs; add running local models by @Jeadie in https://github.com/spiceai/spiceai/pull/4495
- Update version to 1.0.2 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4594
- feat: Schedule testoperator by @peasee in https://github.com/spiceai/spiceai/pull/4503
- Improve UX of downloading GGUF from HF by @Jeadie in https://github.com/spiceai/spiceai/pull/4601
- Improve spice trace CLI command by @sgrebnov in https://github.com/spiceai/spiceai/pull/4629
- Improve the UX of using huggingface models & embeddings by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4623
- GGUF, hide metadata by @Jeadie in https://github.com/spiceai/spiceai/pull/4631
- Promote hugging face to rc by @Sevenannn in https://github.com/spiceai/spiceai/pull/4626
- Endgame Issue template improvements by @lukekim in https://github.com/spiceai/spiceai/pull/4647
- feat: setup sccache for PR checks by @peasee in https://github.com/spiceai/spiceai/pull/4652
- Run build_and_release_cuda.yml when crates/llms/Cargo.toml changes by @Jeadie in https://github.com/spiceai/spiceai/pull/4648
- Update E2E installation tests to match model runtime version by @sgrebnov in https://github.com/spiceai/spiceai/pull/4653
- fix: Postgres LargeUtf8 is equal to Utf8 by @peasee in https://github.com/spiceai/spiceai/pull/4664
- Fix eager string formatting in mistral.rs by @Jeadie in https://github.com/spiceai/spiceai/pull/4665
- Better error for spicepod parsing by @Sevenannn in https://github.com/spiceai/spiceai/pull/4632
- Update datafusion-table-providers (MySQL improvements) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4670
- Handle delta tables partitioned by a date column with large date values by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4672
Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.1...v1.0.2
Published by sgrebnov about 1 year ago
https://github.com/spiceai/spiceai - v1.0.1
Spice v1.0.1 (Jan 27, 2025)
Spice v1.0.1 focuses on an improved developer experience, with automatic CUDA GPU detection for local models, in addition to bug fixes. Notably, the Iceberg Catalog Connector now supports AWS Glue including Sig v4 authentication.
Highlights in v1.0.1
- AWS Glue Support for Iceberg Catalog Connector: The Iceberg Catalog Connector now supports AWS Glue. Example spicepod.yaml configuration:
```yaml
- from: iceberg:https://glue.ap-northeast-2.amazonaws.com/iceberg/v1/catalogs/123456789012/namespaces
  name: glue
```
- spice upgrade CLI Command: The spice upgrade CLI command detects more edge cases for a smoother upgrade experience.
- GPU Acceleration Detection: The Spice CLI now automatically detects and enables CUDA (NVIDIA GPUs) GPU acceleration when supported, in addition to Metal (M-Series on macOS).
- Python SDK: The Python SDK (spicepy) has been updated to v3.0.0, aligning the SDK with the Runtime.
Breaking changes
No breaking changes.
Dependencies
No major dependency changes.
Cookbook
- Added DeepSeek Model Recipe
- Added OpenAI LLM & Embeddings Recipe
Upgrading
To upgrade to v1.0.1, use one of the following methods:
CLI:
```console
spice upgrade
```
Homebrew:
```console
brew upgrade spiceai/spiceai/spice
```
Docker:
Pull the spiceai/spiceai:1.0.1 image:
```console
docker pull spiceai/spiceai:1.0.1
```
For available tags, see DockerHub.
Helm:
```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```
Contributors
- @Jeadie
- @phillipleblanc
- @ewgenius
- @peasee
- @Sevenannn
- @sgrebnov
- @lukekim
What's Changed
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4459
- docs: 1.0 release notes by @peasee in https://github.com/spiceai/spiceai/pull/4440
- Create a release-only workflow that uses a previous run's artifacts by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4461
- Add publish-only CUDA workflow by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4462
- Fix the CUDA release workflow by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4463
- docs: Update SECURITY.md for stable by @peasee in https://github.com/spiceai/spiceai/pull/4465
- docs: Update endgame by @peasee in https://github.com/spiceai/spiceai/pull/4460
- docs: Promote HF and File model components by @peasee in https://github.com/spiceai/spiceai/pull/4457
- fix: E2E test release installation by @peasee in https://github.com/spiceai/spiceai/pull/4466
- Fix publish part of CUDA workflow by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4467
- Fix broken docs links in README by @ewgenius in https://github.com/spiceai/spiceai/pull/4468
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/4474
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4477
- Add instruction to force-install CPU runtime to v1.0 release notes by @sgrebnov in https://github.com/spiceai/spiceai/pull/4469
- feat: Add WIP testoperator dispatch workflow by @peasee in https://github.com/spiceai/spiceai/pull/4478
- Fix Bug: invalid REPL cursor position on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/4480
- feat: Download latest spiced commit for testoperators by @peasee in https://github.com/spiceai/spiceai/pull/4483
- Add compute engine image by @lukekim in https://github.com/spiceai/spiceai/pull/4486
- fix: Testoperator git fetch depth by @peasee in https://github.com/spiceai/spiceai/pull/4484
- feat: New spicepods, testoperator improvements, TPCDS Q1 fix by @peasee in https://github.com/spiceai/spiceai/pull/4475
- Add 87 CUDA compatiblity to build CI by @Jeadie in https://github.com/spiceai/spiceai/pull/4489
- Use OpenAI golang client in spice chat by @Jeadie in https://github.com/spiceai/spiceai/pull/4491
- Verify search and chat on Windows as part of AI installation tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/4492
- feat: Add testoperator dispatch command by @peasee in https://github.com/spiceai/spiceai/pull/4479
- Run CUDA builds on non-GPU instances by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4496
- Use upgraded spice cli when performing runtime upgrade in spice upgrade by @Sevenannn in https://github.com/spiceai/spiceai/pull/4490
- Revert "Use OpenAI golang client in spice chat (#4491)" by @Jeadie in https://github.com/spiceai/spiceai/pull/4532
- Make Anthropic rate limit error message friendlier by @sgrebnov in https://github.com/spiceai/spiceai/pull/4501
- Update supported CUDA targets: add 87(cli), remove 75 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4509
- Support AWS Glue for Iceberg catalog connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4517
- Package CUDA runtime libraries into artifact for Windows by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4497
Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0...v1.0.1
Published by Sevenannn about 1 year ago
https://github.com/spiceai/spiceai - v1.0.0
Spice v1.0-stable (Jan 20, 2025)
🎉 After 47 releases, Spice.ai OSS has reached production readiness with the 1.0-stable milestone!
The core runtime and features such as query federation, query acceleration, catalog integration, search and AI-inference have all graduated to stable status along with key component graduations across data connectors, data accelerators, catalog connectors, and AI model providers.
Highlights in v1.0-stable
Stable Data Connectors: The following data connectors have graduated to Stable:
Stable Data Accelerators: The following data accelerators have graduated to Stable:
Unity Catalog Connector: Graduated to Stable.
Databricks (mode: spark_connect) Data Connector: Graduated to Beta.
Beta Catalog Connectors: The Iceberg and Databricks catalog connectors graduated to Beta.
OpenAI Model & Embeddings Provider: Graduated to Release Candidate (RC).
Alpha Model Providers: The Anthropic and xAI (Grok) model providers graduated to Alpha.
Breaking Changes
Default Runtime Version: The CLI now installs the GPU-accelerated, AI-capable Runtime by default when running spice install or spice run.
Default OpenAI Model: The default OpenAI model has been updated to gpt-4o-mini.
Identifier Normalization: Unquoted identifiers such as table names are no longer normalized to lowercase. Identifiers will now retain their exact case as provided.
Sandboxed Docker Image: The Runtime Docker Image now runs the spiced process as the nobody user in a minimal chroot sandbox.
Insecure S3 and ABFS endpoints: The S3 and ABFS connectors now enforce insecure endpoint checks, preventing HTTP endpoints unless allow_http is explicitly enabled. Refer to the documentation for details.
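The insecure endpoint check described above amounts to rejecting plain HTTP endpoints unless the user has opted in. The Python sketch below illustrates that rule in isolation. It is not the connector's implementation: the function `check_endpoint`, its error messages, and the rejection of schemes other than http and https are assumptions made for the example; only the allow_http opt-in comes from these notes.

```python
from urllib.parse import urlparse


def check_endpoint(endpoint: str, allow_http: bool = False) -> str:
    """Return the endpoint if it is acceptable, otherwise raise ValueError.

    Hypothetical helper mirroring the rule in the notes: HTTP endpoints are
    refused unless allow_http is explicitly enabled.
    """
    scheme = urlparse(endpoint).scheme.lower()
    if scheme == "https":
        return endpoint
    if scheme == "http":
        if allow_http:
            return endpoint
        raise ValueError(
            f"insecure endpoint {endpoint!r}: set allow_http to permit HTTP"
        )
    # Assumption: anything other than http/https is treated as invalid here.
    raise ValueError(f"unsupported endpoint scheme: {scheme!r}")


if __name__ == "__main__":
    print(check_endpoint("https://s3.example.com"))
    print(check_endpoint("http://localhost:9000", allow_http=True))
```

Failing at configuration time, rather than on the first request, surfaces an accidental HTTP endpoint before any data or credentials are sent over an unencrypted connection.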
Dependencies
No major dependency changes.
Upgrading
To upgrade to v1.0.0, use one of the following methods:
CLI:
```console
spice upgrade
```
Homebrew:
```console
brew upgrade spiceai/spiceai/spice
```
Docker:
Pull the spiceai/spiceai:1.0.0 image:
```console
docker pull spiceai/spiceai:1.0.0
```
For available tags, see DockerHub.
Helm:
```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```
Contributors
- @peasee
- @ewgenius
- @Jeadie
- @Sevenannn
- @lukekim
- @phillipleblanc
- @sgrebnov
What's Changed
- feat: Update load test criteria, testoperator updates by @peasee in https://github.com/spiceai/spiceai/pull/4311
- Update helm for v1.0.0-rc.5 by @ewgenius in https://github.com/spiceai/spiceai/pull/4313
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/4318
- Bump version to v1.0.0, update SECURITY.md by @ewgenius in https://github.com/spiceai/spiceai/pull/4314
- Initial criteria for models, embeddings by @Jeadie in https://github.com/spiceai/spiceai/pull/4223
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/4321
- Add dremio param for running load test by @Sevenannn in https://github.com/spiceai/spiceai/pull/4315
- Promote Databricks (mode: delta_lake) connector to stable by @Sevenannn in https://github.com/spiceai/spiceai/pull/4328
- Handle failed query in load test by @Sevenannn in https://github.com/spiceai/spiceai/pull/4327
- feat: Use load test hours for baseline query sets by @peasee in https://github.com/spiceai/spiceai/pull/4334
- Fix typo in 1.0.0-rc.5 release notes by @ewgenius in https://github.com/spiceai/spiceai/pull/4329
- feat: add testoperator data consistency by @peasee in https://github.com/spiceai/spiceai/pull/4319
- docs: Release DuckDB connector stable by @peasee in https://github.com/spiceai/spiceai/pull/4335
- Fix DocumentDB -> DynamoDB by @lukekim in https://github.com/spiceai/spiceai/pull/4339
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/4337
- fix: Download hits.parquet from MinIO for benchmark by @peasee in https://github.com/spiceai/spiceai/pull/4338
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4341
- Remove evil averages by @lukekim in https://github.com/spiceai/spiceai/pull/4343
- Don't run builds on non-code changes by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4344
- Remove streaming requirement from Databricks spark Beta and Spark connector Beta by @ewgenius in https://github.com/spiceai/spiceai/pull/4345
- Update s3 tpcds spicepods by @ewgenius in https://github.com/spiceai/spiceai/pull/4346
- Explicitly set required scale factor for throughput and load tests by @ewgenius in https://github.com/spiceai/spiceai/pull/4347
- Fix s3 tpcds dataset name by @ewgenius in https://github.com/spiceai/spiceai/pull/4348
- Promote Iceberg Catalog Connector to Beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4350
- Update s3 clickbench benchmark snapshots by @ewgenius in https://github.com/spiceai/spiceai/pull/4351
- fix: DuckDB clickbench on zero results by @peasee in https://github.com/spiceai/spiceai/pull/4349
- Add integration test with snapshots for databricks catalog connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/4353
- refactor: Remove on zero results from benchmarks, add data consistency workflow by @peasee in https://github.com/spiceai/spiceai/pull/4354
- Fix Bug: No field named body_embedding when do vector search with refresh sql containing subset of columns by @sgrebnov in https://github.com/spiceai/spiceai/pull/4297
- docs: Update roadmap by @peasee in https://github.com/spiceai/spiceai/pull/4364
- feat: Release accelerators stable by @peasee in https://github.com/spiceai/spiceai/pull/4361
- Add TPCH/TPCDS test spicepods for MySQL by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4365
- Catch when an insecure (http) S3 and ABFS data connectors endpoint is used without specifying the allow_http parameter by @ewgenius in https://github.com/spiceai/spiceai/pull/4363
- Update ROADMAP - Iceberg catalog alpha for v1.0 by @ewgenius in https://github.com/spiceai/spiceai/pull/4367
- Promote databricks catalog and databricks (spark_connect) connector to beta by @Sevenannn in https://github.com/spiceai/spiceai/pull/4369
- Update Roadmap - Iceberg beta by @ewgenius in https://github.com/spiceai/spiceai/pull/4373
- Build CUDA binaries for Linux by @Jeadie in https://github.com/spiceai/spiceai/pull/4320
- Promote Nvidia NIM as Alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4380
- Promote xai to alpha by @Sevenannn in https://github.com/spiceai/spiceai/pull/4381
- Update stable criteria for object store based connectors by @ewgenius in https://github.com/spiceai/spiceai/pull/4383
- Testoperator: http consistency and overhead tests, fixes and ci by @ewgenius in https://github.com/spiceai/spiceai/pull/4382
- Promote S3 Data Connector to Stable by @ewgenius in https://github.com/spiceai/spiceai/pull/4385
- Download platform-supported CUDA binary version on Linux by @sgrebnov in https://github.com/spiceai/spiceai/pull/4356
- Fix http consistency test workflow, add overhead workflow by @ewgenius in https://github.com/spiceai/spiceai/pull/4387
- feat: Add Postgres test spicepods by @peasee in https://github.com/spiceai/spiceai/pull/4388
- Fix typos + specific in model criteria; Make explicit alpha/beta tests for LLMS in crates/llms/tests. by @Jeadie in https://github.com/spiceai/spiceai/pull/4377
- Fix federation bug for correlated subqueries of deeply nested Dremio tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4389
- Fix http overhead workflow by @ewgenius in https://github.com/spiceai/spiceai/pull/4390
- Tweak model tests, fix embedding input by @ewgenius in https://github.com/spiceai/spiceai/pull/4391
- Promote Dremio to Stable quality by @Sevenannn in https://github.com/spiceai/spiceai/pull/4392
- Add beta functionality tests for embedding models. by @Jeadie in https://github.com/spiceai/spiceai/pull/4352
- docs: Release postgres connector stable by @peasee in https://github.com/spiceai/spiceai/pull/4398
- Increase timeout for model response in E2E tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/4399
- Disable ident normalization (i.e. SELECT MyColumn from table works) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4400
- Preserve schema metadata by @ewgenius in https://github.com/spiceai/spiceai/pull/4402
- Make models integration tests tracing less verbose by @sgrebnov in https://github.com/spiceai/spiceai/pull/4403
- Fix cuda feature build on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/4404
- Promote MySQL to Stable by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4406
- docs: Release Delta Lake and Unity catalog by @peasee in https://github.com/spiceai/spiceai/pull/4405
- Use gpt-4o-mini as a default model for openai provider by @ewgenius in https://github.com/spiceai/spiceai/pull/4410
- Fix streaming for Openai and Anthropic by @Jeadie in https://github.com/spiceai/spiceai/pull/4409
- Tweak model loading and missing tool errors messages by @ewgenius in https://github.com/spiceai/spiceai/pull/4412
- Spice CLI: fallback to CPU build for unsupported GPU Compute Capability by @sgrebnov in https://github.com/spiceai/spiceai/pull/4407
- Build Windows CUDA binaries as part of build_and_release workflow by @sgrebnov in https://github.com/spiceai/spiceai/pull/4386
- Update docs link by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4416
- feat: Add CPU models install escape hatch by @peasee in https://github.com/spiceai/spiceai/pull/4419
- Handle OpenAI API Errors by @ewgenius in https://github.com/spiceai/spiceai/pull/4417
- Update spice cli to use GH_TOKEN or GITHUB_TOKEN env variables when calling releases api by @ewgenius in https://github.com/spiceai/spiceai/pull/4175
- Implement secure sandboxing for Docker image by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4411
- Automatically install supported CUDA binary on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/4420
- Metrics for LLMs+ embeddings by @Jeadie in https://github.com/spiceai/spiceai/pull/4418
- Jeadie/25 01 17/beta perf by @Jeadie in https://github.com/spiceai/spiceai/pull/4397
- Pass GitHub token to all CI steps calling spice run by @ewgenius in https://github.com/spiceai/spiceai/pull/4423
- Run the models integration tests on PRs by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4421
- Run CUDA builds in a separate workflow by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4430
- Promote OpenAI models and embeddings providers to RC by @ewgenius in https://github.com/spiceai/spiceai/pull/4432
- Update link to retrieval-augmented generation (RAG) details by @sgrebnov in https://github.com/spiceai/spiceai/pull/4433
- Unity catalog should strip parameter prefix before passing parameters to delta lake factory by @Sevenannn in https://github.com/spiceai/spiceai/pull/4436
- Update quickstart traces to match current version by @sgrebnov in https://github.com/spiceai/spiceai/pull/4435
- Update Supported Embeddings Providers Readme section by @sgrebnov in https://github.com/spiceai/spiceai/pull/4434
- Local models can stream tools by @Jeadie in https://github.com/spiceai/spiceai/pull/4429
- fix: Use MetricsCollector::show() for HTTP testoperator commands by @peasee in https://github.com/spiceai/spiceai/pull/4442
- Fix run query action by @ewgenius in https://github.com/spiceai/spiceai/pull/4444
- Default to AI-enabled runtime for spice run/spice install by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4443
- Change no spicepod.yaml log to warning by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4447
- refactor: Update Catalog Connector error messages by @peasee in https://github.com/spiceai/spiceai/pull/4441
- Fix panic when converting OTel metrics by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4449
- refactor: Update model errors by @peasee in https://github.com/spiceai/spiceai/pull/4446
- Update spiceai/mistral.rs to silence metadata logs by @ewgenius in https://github.com/spiceai/spiceai/pull/4452
- fix xAI; don't use openai defaults by @Jeadie in https://github.com/spiceai/spiceai/pull/4450
- Improves the UX of using huggingface models by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4451
- Add GH Workflow to test spice ai runtime installation by @sgrebnov in https://github.com/spiceai/spiceai/pull/4448
- fix: Use specific model errors where available by @peasee in https://github.com/spiceai/spiceai/pull/4454
- Detect and report unsupported embedding column type during dataset registration by @sgrebnov in https://github.com/spiceai/spiceai/pull/4456
- Handle Errors by @Jeadie in https://github.com/spiceai/spiceai/pull/4455
- Catch and report negative openai_temperature error by @Sevenannn in https://github.com/spiceai/spiceai/pull/4453
- Clarify release check error message if it is caused by wrong GH token by @ewgenius in https://github.com/spiceai/spiceai/pull/4458
Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.5...v1.0.0
Published by peasee about 1 year ago
https://github.com/spiceai/spiceai - v1.0.0-rc.5
Spice v1.0-rc.5 (Jan 13, 2025)
Spice v1.0.0-rc.5 is the fifth release candidate for the first major version of Spice.ai OSS. This release focuses on production readiness and critical bug fixes. In addition, a new DynamoDB data connector has been added, along with automatic detection of GPU acceleration when running Spice using the CLI.
Highlights in v1.0-rc.5
Automatic GPU Acceleration Detection: Automatically detect and utilize GPU acceleration when running Spice via the CLI. Install AI components locally using the CLI command spice install ai. Currently supports NVIDIA CUDA and Apple Metal (M-series).
DynamoDB Data Connector: Query AWS DynamoDB tables using SQL with the new DynamoDB Data Connector.
```yaml
datasets:
  - from: dynamodb:users
    name: users
    params:
      dynamodb_aws_region: us-west-2
      dynamodb_aws_access_key_id: ${secrets:aws_access_key_id}
      dynamodb_aws_secret_access_key: ${secrets:aws_secret_access_key}
    acceleration:
      enabled: true
```
```console
sql> describe users;
+----------------+-----------+-------------+
| column_name    | data_type | is_nullable |
+----------------+-----------+-------------+
| created_at     | Utf8      | YES         |
| date_of_birth  | Utf8      | YES         |
| email          | Utf8      | YES         |
| account_status | Utf8      | YES         |
| updated_at     | Utf8      | YES         |
| full_name      | Utf8      | YES         |
| ...                                      |
+----------------+-----------+-------------+
```
File Data Connector: Graduated to Stable.
Dremio Data Connector: Graduated to Release Candidate (RC).
Spice.ai, Spark, and Snowflake Data Connectors: Graduated to Beta.
Dependencies
No major dependency changes.
Contributors
- @Jeadie
- @phillipleblanc
- @ewgenius
- @peasee
- @Sevenannn
- @lukekim
What's Changed
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4190
- Ensure non-nullity of primary keys in MemTable; check validity of initial data. by @Jeadie in https://github.com/spiceai/spiceai/pull/4158
- Bump version to v1.0.0 stable by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4191
- Fix metal + models download by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4193
- Update spice.ai connector beta roadmap by @ewgenius in https://github.com/spiceai/spiceai/pull/4194
- feat: verify on zero results snapshots by @peasee in https://github.com/spiceai/spiceai/pull/4195
- Add throughput test module to test-framework by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4196
- Update Spice.ai TPCH snapshots by @ewgenius in https://github.com/spiceai/spiceai/pull/4202
- Replace all usage of lazy_static! with LazyLock by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4199
- Fix model + metal download by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4200
- Run Clickbench for Dremio by @Sevenannn in https://github.com/spiceai/spiceai/pull/4138
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4205
- Fix the typo in connector stable criteria by @Sevenannn in https://github.com/spiceai/spiceai/pull/4213
- feat: Add throughput test example by @peasee in https://github.com/spiceai/spiceai/pull/4214
- feat: calculate throughput test query percentiles by @peasee in https://github.com/spiceai/spiceai/pull/4215
- feat: Add throughput test to actions by @peasee in https://github.com/spiceai/spiceai/pull/4217
- Implement DynamoDB Data Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4218
- 1.0 doc updates by @lukekim in https://github.com/spiceai/spiceai/pull/4181
- Improve clarity and concison of use-cases by @lukekim in https://github.com/spiceai/spiceai/pull/4220
- Remove macOS Intel build by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4221
- fix: Test operator throughput test workflow by @peasee in https://github.com/spiceai/spiceai/pull/4222
- DynamoDB: Automatically load AWS credentials from IAM roles if access key not provided by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4226
- File connector clickbench snapshots results by @ewgenius in https://github.com/spiceai/spiceai/pull/4225
- Spice.ai Catalog Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4204
- feat: Add test framework metrics collection by @peasee in https://github.com/spiceai/spiceai/pull/4227
- Add badges for build/test status on README.md by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4228
- Release Dremio to RC by @Sevenannn in https://github.com/spiceai/spiceai/pull/4224
- feat: Add more test spicepods by @peasee in https://github.com/spiceai/spiceai/pull/4229
- feat: Add load test to testoperator by @peasee in https://github.com/spiceai/spiceai/pull/4231
- Add TSV format to all object_store-based connectors by @Jeadie in https://github.com/spiceai/spiceai/pull/4192
- Move test-framework to dev-dependencies for Runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4230
- Document limitation for correlated subqueries in TPCH for Spice.ai connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4235
- Changes for CUDA by @Jeadie in https://github.com/spiceai/spiceai/pull/4130
- fix: Collect batches from test framework, load test updates by @peasee in https://github.com/spiceai/spiceai/pull/4234
- Suppress opentelemetry_sdk warnings - they aren't useful by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4243
- fix: Set dataset status first, update test framework by @peasee in https://github.com/spiceai/spiceai/pull/4244
- feat: Re-enable defaults on test spicepods by @peasee in https://github.com/spiceai/spiceai/pull/4248
- Add usage for streaming local models; Fix spice chat usage bar TPS expansion by @Jeadie in https://github.com/spiceai/spiceai/pull/4232
- refactor: Use composite testoperator setup, add query overrides by @peasee in https://github.com/spiceai/spiceai/pull/4246
- Enable expand_views_at_output for DF optimizer and transform schema to expanded view types by @ewgenius in https://github.com/spiceai/spiceai/pull/4237
- Add throughput test spicepod for databricks delta mode connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/4241
- Spark data connector - update and enable TPCH and TPCDS benchmarks by @ewgenius in https://github.com/spiceai/spiceai/pull/4240
- Increase the timeout minutes of load test to 10 hours by @Sevenannn in https://github.com/spiceai/spiceai/pull/4254
- Improve partition column counts error for delta table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4247
- Add e2e test for databricks catalog connector (mode: delta_lake) by @Sevenannn in https://github.com/spiceai/spiceai/pull/4255
- Spark connector integration tests by @ewgenius in https://github.com/spiceai/spiceai/pull/4256
- Run benchmark test with the new test framework by @Sevenannn in https://github.com/spiceai/spiceai/pull/4245
- Configure databricks delta secrets to run load test by @Sevenannn in https://github.com/spiceai/spiceai/pull/4257
- Support properties for emitted telemetry by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4249
- feat: Add ready_wait test operator workflow input by @peasee in https://github.com/spiceai/spiceai/pull/4259
- Handle 'LargeStringArray' for embedding tables by @Jeadie in https://github.com/spiceai/spiceai/pull/4263
- llms tests for alpha/beta model criteria by @Jeadie in https://github.com/spiceai/spiceai/pull/4261
- Configurable runner type for load and throughput tests by @ewgenius in https://github.com/spiceai/spiceai/pull/4262
- Handle NULL partition columns for Delta Lake tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4264
- Add integration test for Snowflake by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4266
- Add Snowflake TPCH queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4268
- Handle LargeStringArray in v1/search. by @Jeadie in https://github.com/spiceai/spiceai/pull/4265
- Fix build_cuda in Update spiced_docker.yml by @Jeadie in https://github.com/spiceai/spiceai/pull/4269
- Run Snowflake benchmark in GitHub Actions by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4270
- Allow Snowflake query override for CI tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4271
- Don't run GPU builds for trunk by @Jeadie in https://github.com/spiceai/spiceai/pull/4272
- Fix InvalidTypeAction not working by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4273
- Add xAI key to llm integration tests by @Jeadie in https://github.com/spiceai/spiceai/pull/4274
- Update openai snapshots by @Jeadie in https://github.com/spiceai/spiceai/pull/4275
- Fix federation bug for correlated subqueries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4276
- Update end_game.md by @ewgenius in https://github.com/spiceai/spiceai/pull/4278
- Promote Snowflake to Beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4277
- Set version to 1.0.0-rc.5 by @ewgenius in https://github.com/spiceai/spiceai/pull/4283
- Update cargo.lock by @ewgenius in https://github.com/spiceai/spiceai/pull/4285
- Update spice.ai data connector snapshots by @ewgenius in https://github.com/spiceai/spiceai/pull/4281
- Promote the Spice.ai Data Connector to Beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4282
- Revert change to integration_models__models__search__openai_chunking_response.snap by @Jeadie in https://github.com/spiceai/spiceai/pull/4279
- Allow for a subset of build artifacts to be published to minio by @Jeadie in https://github.com/spiceai/spiceai/pull/4280
- Promote File Data Connector to Stable by @ewgenius in https://github.com/spiceai/spiceai/pull/4286
- Add Iceberg to Supported Catalogs by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4287
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4289
- Fix Spark benchmark credentials, add back overrides by @ewgenius in https://github.com/spiceai/spiceai/pull/4295
- Promote Spark Data Connector to Beta by @ewgenius in https://github.com/spiceai/spiceai/pull/4296
- Add Dremio throughput test spicepod by @Sevenannn in https://github.com/spiceai/spiceai/pull/4233
- Add error message for invalid databricks mode parameter by @Sevenannn in https://github.com/spiceai/spiceai/pull/4299
- Fix pre-release check to look for build string by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4300
- Promote databricks catalog connector (mode: delta_lake) to beta by @Sevenannn in https://github.com/spiceai/spiceai/pull/4301
- Properly delegate load_table to Rest Catalog by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4303
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4302
- docs: Update ROADMAP.md by @peasee in https://github.com/spiceai/spiceai/pull/4306
- v1.0.0-rc.5 Release Notes by @ewgenius in https://github.com/spiceai/spiceai/pull/4298
Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.4...v1.0.0-rc.5
Published by ewgenius about 1 year ago
https://github.com/spiceai/spiceai - v1.0.0-rc.4
Spice v1.0-rc.4 (Jan 6, 2025)
Happy New Year 🎆!
Spice v1.0.0-rc.4 is the fourth release candidate for the first major version of Spice.ai OSS. This release continues the focus on production readiness. In addition, xAI has been added as a model provider.
Highlights in v1.0-rc.4
- xAI Model Provider: Adds support for xAI hosted models.
yaml
models:
  - from: xai:grok2-latest
    name: xai
    params:
      xai_api_key: ${secrets:SPICE_XAI_API_KEY}
yaml
datasets:
  - from: file://my_table.tsv
    name: table
- Spicepod Spec Version: Spicepod spec version v1 is now the default. v1beta1 will continue to work.
yaml
version: v1
kind: Spicepod
name: my_pod
GitHub Data Connector: Graduated to Stable.
PostgreSQL Data Accelerator: Graduated to Release Candidate (RC).
Cookbook
- Added xAI model provider recipe.
Dependencies
No major dependency changes.
Contributors
- @lukekim
- @phillipleblanc
- @peasee
- @karifabri
- @sgrebnov
- @Jeadie
- @ewgenius
What's Changed
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4087
- Update Helm chart for v1.0.0-rc.3 (v0.2.2) by @lukekim in https://github.com/spiceai/spiceai/pull/4088
- Rev version to v1.0.0-rc.4 by @lukekim in https://github.com/spiceai/spiceai/pull/4090
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/4089
- Fix OpenAI Models Integration tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4084
- fix: Update Postgres TPCDS and ClickBench queries by @peasee in https://github.com/spiceai/spiceai/pull/4092
- fix: Check Postgres acceleration schema on insert by @peasee in https://github.com/spiceai/spiceai/pull/4094
- Update v1.0.0-rc.3.md by @karifabri in https://github.com/spiceai/spiceai/pull/4096
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4093
- First-class TSV for file data connector by @lukekim in https://github.com/spiceai/spiceai/pull/4098
- Allow Flight DoPut only for write api-keys by @sgrebnov in https://github.com/spiceai/spiceai/pull/4010
- Only create tables eval.runs and eval.results when an eval is defined by @Jeadie in https://github.com/spiceai/spiceai/pull/4099
- Update Copyright year to include 2025 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4100
- feat: add postgres clickbench accelerator, release postgres accelerator by @peasee in https://github.com/spiceai/spiceai/pull/4111
- Add spice binaries with metal to releases; detect metal device in spice install/upgrade. by @Jeadie in https://github.com/spiceai/spiceai/pull/4097
- docs: Clarify connector release criteria by @peasee in https://github.com/spiceai/spiceai/pull/4112
- Update datafusion-federation to fix LIMIT with OFFSET handling in logical plan rewrite by @ewgenius in https://github.com/spiceai/spiceai/pull/4115
- Support Grok AI. by @Jeadie in https://github.com/spiceai/spiceai/pull/4113
- Fix spice chat usage bar. by @Jeadie in https://github.com/spiceai/spiceai/pull/4119
- Set unified max encoding and decoding message size for all flight client configurations across runtime by @ewgenius in https://github.com/spiceai/spiceai/pull/4116
- feat: Add the file connector as an appendable benchmark connector by @peasee in https://github.com/spiceai/spiceai/pull/4120
- Add spice eval command by @lukekim in https://github.com/spiceai/spiceai/pull/4118
- Support multi-level table nesting for Dremio by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4129
- feat: run append TPCH benchmarks in workflow (Arrow, DuckDB) by @peasee in https://github.com/spiceai/spiceai/pull/4131
- Fix bug in Iceberg tables selecting a subset of columns by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4132
- feat: Run append TPCDS benchmarks in workflow (Arrow, DuckDB) by @peasee in https://github.com/spiceai/spiceai/pull/4141
- Setup spice.ai clickbench by @ewgenius in https://github.com/spiceai/spiceai/pull/4134
- Data is streamed when reading from the GitHub connector (GraphQL tables) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4142
- Mark the GitHub Data Connector as Stable by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4143
- Fix table quoting for Databricks Spark connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4145
- Extend flight compute context for spice.ai connector with org and app names, to fix federated queries from different spice.ai data sources by @ewgenius in https://github.com/spiceai/spiceai/pull/4144
- Enforce Flight DoPut policies: Rate Limiting, Read Timeout, and Max Records per Batch by @sgrebnov in https://github.com/spiceai/spiceai/pull/4117
- Fix bug Changes in catalog.yaml would require saving in spicepod.yaml to apply by @sgrebnov in https://github.com/spiceai/spiceai/pull/4147
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/4137
- Add test-framework crate to contain all common benchmark, E2E, integration testing logic. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4157
- Fix platform_option variable in build_and_release.yml. by @Jeadie in https://github.com/spiceai/spiceai/pull/4154
- feat: Add Clickbench append benchmark for DuckDB and Arrow by @peasee in https://github.com/spiceai/spiceai/pull/4160
- Upload artifacts to Minio on build_and_release by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4159
- feat: add on zero results benchmark by @peasee in https://github.com/spiceai/spiceai/pull/4164
- Update spice.ai connector tests by @ewgenius in https://github.com/spiceai/spiceai/pull/4161
Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.3...v1.0.0-rc.4
Published by phillipleblanc about 1 year ago
https://github.com/spiceai/spiceai - v1.0.0-rc.3
Spice v1.0-rc.3 (Dec 30, 2024)
Spice v1.0.0-rc.3 is the third release candidate for the first major version of Spice.ai OSS. This release continues the focus on production readiness and includes new Iceberg Catalog APIs, DuckDB improvements, and a new Iceberg Catalog Connector.
Highlights in v1.0-rc.3
Iceberg Catalog APIs: Spice now functions as an Iceberg Catalog provider, implementing a core subset of the Iceberg Catalog APIs. This lets Iceberg Catalog clients natively discover datasets and schemas through Spice APIs.
- GET /v1/namespaces - List all catalogs registered in Spice.
- GET /v1/namespaces?parent=catalog - List schemas registered under a given catalog.
- GET /v1/namespaces/:catalog_schema/tables - List tables registered under a given schema.
- GET /v1/namespaces/:catalog_schema/tables/:table - Get the schema of a given table.
Iceberg Catalog Connector: The Iceberg Catalog Connector is a new integration to discover and query datasets from a remote Iceberg Catalog.
Example connecting to a remote Iceberg Catalog with tables stored in S3:
yaml
catalogs:
  - from: iceberg:https://my-iceberg-catalog.com/v1/namespaces
    name: ice
    params:
      iceberg_s3_access_key_id: ${secrets:ICEBERG_S3_ACCESS_KEY_ID}
      iceberg_s3_secret_access_key: ${secrets:ICEBERG_S3_SECRET_ACCESS_KEY}
      iceberg_s3_region: us-east-1
View the Iceberg Catalog Connector documentation for more details.
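As a rough illustration of the Iceberg Catalog API endpoints listed in the highlights above, the Python sketch below only builds the request URLs and does not contact a server. The base URL http://localhost:8090, the helper names, and the treatment of :catalog_schema as an opaque, percent-encoded path segment are assumptions, not documented behavior. The changelog entries for this release refer to /v1/iceberg/... paths, so check the API documentation for the exact prefix before relying on these URLs.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Assumption: address of a locally running Spice HTTP endpoint.
DEFAULT_BASE_URL = "http://localhost:8090"


def namespaces_url(parent: Optional[str] = None,
                   base_url: str = DEFAULT_BASE_URL) -> str:
    """URL for GET /v1/namespaces, optionally filtered by ?parent=<catalog>."""
    url = f"{base_url.rstrip('/')}/v1/namespaces"
    if parent is not None:
        url += "?" + urlencode({"parent": parent})
    return url


def tables_url(catalog_schema: str,
               table: Optional[str] = None,
               base_url: str = DEFAULT_BASE_URL) -> str:
    """URL for listing tables under a schema, or for one table's schema."""
    # Hypothetical choice: pass the namespace through as one encoded segment.
    url = f"{base_url.rstrip('/')}/v1/namespaces/{quote(catalog_schema, safe='')}/tables"
    if table is not None:
        url += "/" + quote(table, safe="")
    return url


if __name__ == "__main__":
    print(namespaces_url())
    print(namespaces_url(parent="ice"))
    print(tables_url("my_schema"))
    print(tables_url("my_schema", "my_table"))
```

The resulting strings can be passed to any HTTP client; keeping URL construction separate from the request makes the path layout easy to adjust if the prefix differs in your deployment.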
DuckDB Improvements: Added cosine_distance support for DuckDB-backed vector search, improved unnest nested type handling for array_element and lists, and optimized query performance.
SQLite Data Accelerator: Graduated to Release Candidate (RC).
File Data Accelerator: Graduated to Release Candidate (RC).
Breaking changes
- API: v1/datasets/sample has been removed because it is not particularly useful and can be replicated via SQL or via the tools endpoint POST v1/tools/:name.
Cookbook
New Language Model Evals Recipe showing how to measure the performance of a language model using LLM-as-Judge, configured entirely in the Spice runtime.
New Iceberg Catalog Recipe showing how to use Spice to query Iceberg tables from an Iceberg catalog.
Dependencies
- OpenTelemetry: Upgraded from 0.26.0 to 0.27.1
- Go: Upgraded from 1.22 to 1.23 (CLI)
Contributors
- @sgrebnov
- @phillipleblanc
- @peasee
- @Jeadie
- @Sevenannn
- @lukekim
- @ewgenius
What's Changed
- Add CI configuration for search benchmark dataset access by @sgrebnov in https://github.com/spiceai/spiceai/pull/3888
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/3895
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3896
- chore: Update helm chart for RC.2 by @peasee in https://github.com/spiceai/spiceai/pull/3899
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/3903
- chore: Update MacOS test release install to macos-13 by @peasee in https://github.com/spiceai/spiceai/pull/3901
- Add usage to spice chat and fix v1/models?status=true. by @Jeadie in https://github.com/spiceai/spiceai/pull/3898
- chore: Bump versions for rc3 by @peasee in https://github.com/spiceai/spiceai/pull/3902
- docs: Update endgame with a step to verify dependencies in release notes by @peasee in https://github.com/spiceai/spiceai/pull/3897
- Ensure eval dataset input and ouput of correct length by @Jeadie in https://github.com/spiceai/spiceai/pull/3900
- spice add/connect/dataset configure should update spicepod, not overwrite it & upgrade to Go 1.23 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3905
- Bump opentelemetry from 0.26.0 to 0.27.1 by @dependabot in https://github.com/spiceai/spiceai/pull/3879
- Ensure trace_id is overridden for prior written spans by @Jeadie in https://github.com/spiceai/spiceai/pull/3906
- add 'role': 'assistant' for local models by @Jeadie in https://github.com/spiceai/spiceai/pull/3910
- Run tpcds benchmark for file connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3924
- Update to reference cookbook instead of quickstarts/samples by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3928
- Fix/remove flaky integration tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3930
- Implement /v1/iceberg/namespaces & /v1/iceberg/config APIs by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3923
- Add script for creating tpcds parquet files and spicepod for file connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3931
- Use utoipa to generate openapi.json and swagger for dev by @Jeadie in https://github.com/spiceai/spiceai/pull/3927
- fuzzy_match, json_match, includes scorer by @Jeadie in https://github.com/spiceai/spiceai/pull/3926
- Implement /v1/iceberg/namespaces/:namespace by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3933
- Implement GET /v1/iceberg/namespaces/:namespace/tables API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3934
- Add custom Spice DuckDB dialect with cosine_distance support by @sgrebnov in https://github.com/spiceai/spiceai/pull/3938
- Fix NSQL error: all columns in a record batch must have the same length by @sgrebnov in https://github.com/spiceai/spiceai/pull/3947
- Don't include tools use in hf test model by @Jeadie in https://github.com/spiceai/spiceai/pull/3955
- Implement GET /v1/namespaces/{namespace}/tables/{table} API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3940
- Update dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3967
- DuckDB: add support for nested types in Lists by @sgrebnov in https://github.com/spiceai/spiceai/pull/3961
- Add script to set up clickbench for file connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3945
- docs: Add connector stable criteria by @peasee in https://github.com/spiceai/spiceai/pull/3908
- Update Roadmp Dec 23, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/3978
- Improve CI testing for OpenAPI, new tool spiceschema, fix broken OpenAPI stuff. by @Jeadie in https://github.com/spiceai/spiceai/pull/3948
- remove v1/datasets/sample by @Jeadie in https://github.com/spiceai/spiceai/pull/3981
- feat: add SQLite ClickBench benchmark by @peasee in https://github.com/spiceai/spiceai/pull/3975
- Remove feature 'llms/mistralrs' by @Jeadie in https://github.com/spiceai/spiceai/pull/3984
- Add support for 'params.spice_tools: nsql' by @Jeadie in https://github.com/spiceai/spiceai/pull/3985
- Fix integration tests - add missing format query parameter in /v1/status requests by @ewgenius in https://github.com/spiceai/spiceai/pull/3989
- Enhance AI tools sampling logic for robust handling of large fields by @sgrebnov in https://github.com/spiceai/spiceai/pull/3959
- Fix subquery federation by @Sevenannn in https://github.com/spiceai/spiceai/pull/3991
- Fix unnest and add DuckDB support for array_element by @sgrebnov in https://github.com/spiceai/spiceai/pull/3995
- Add score value snapshotting to vector similarity search tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3996
- Use Llama-3.2-3B-Instruct for Hugging Face integration testing by @sgrebnov in https://github.com/spiceai/spiceai/pull/3992
- Simplify construct_chunk_query_sql for DuckDB compatibility by @sgrebnov in https://github.com/spiceai/spiceai/pull/3988
- Update TPCH and TPCDS benchmarks for spice.ai connector by @ewgenius in https://github.com/spiceai/spiceai/pull/3982
- Correctly pass Hugging Face token in models integration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3997
- Fix: on_zero_results causes TransactionContext Error: Catalog write-write conflict on create with "attachment_0" by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3998
- Add DuckDB acceleration to search benchmarks by @sgrebnov in https://github.com/spiceai/spiceai/pull/4000
- Enable Postgres write via non-default postgres-write feature flag by @sgrebnov in https://github.com/spiceai/spiceai/pull/4004
- Allow search benchmark to write test results by @sgrebnov in https://github.com/spiceai/spiceai/pull/4008
- Make Flight DoPut atomic and commit write only on successful stream completion by @sgrebnov in https://github.com/spiceai/spiceai/pull/4002
- Create a CatalogConnector abstraction by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4003
- Fix generate-openapi.yml and add .schema/openapi.json. by @Jeadie in https://github.com/spiceai/spiceai/pull/3983
- Enable spice.ai tpcds bench workflow. Comment failing tpch queries. by @ewgenius in https://github.com/spiceai/spiceai/pull/4001
- feat: Add SQLite ClickBench overrides by @peasee in https://github.com/spiceai/spiceai/pull/4016
- Implement Iceberg Catalog Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4053
- feat: Datafusion updates for SQLite fixes and release by @peasee in https://github.com/spiceai/spiceai/pull/4054
- docs: Add accelerator stable release criteria by @peasee in https://github.com/spiceai/spiceai/pull/4017
- Add dremio tpch / tpcds benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/4063
- Update docs, and make PR to spiceai/docs for new openapi.json. by @Jeadie in https://github.com/spiceai/spiceai/pull/4019
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4065
- Fix dremio subquery rewrite by @Sevenannn in https://github.com/spiceai/spiceai/pull/4064
- Update generate-openapi.yml by @Jeadie in https://github.com/spiceai/spiceai/pull/4073
- docs: Add catalog criteria by @peasee in https://github.com/spiceai/spiceai/pull/4052
- fix distinct_columns in auto/nsql tool groups by @Jeadie in https://github.com/spiceai/spiceai/pull/4074
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4075
- Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4076
- Implement window_func_support_window_frame from DremioDialect by @Sevenannn in https://github.com/spiceai/spiceai/pull/4012
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4079
- Promote file connector to rc by @Sevenannn in https://github.com/spiceai/spiceai/pull/4080
- Add Iceberg to README by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4085
- Fix '/v1/status' default format by @Jeadie in https://github.com/spiceai/spiceai/pull/4081
Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.2...v1.0.0-rc.3
Published by lukekim about 1 year ago
https://github.com/spiceai/spiceai - v1.0.0-rc.2
Spice v1.0-rc.2 (Dec 16, 2024)
Spice v1.0.0-rc.2 is the second release candidate for the first major version of Spice.ai OSS. This release continues to build on the stability of Spice for production use, including key Data Connector graduations, bug fixes, and AI features.
Highlights in v1.0-rc.2
MS SQL and File Data Connectors: Graduated from Alpha to Beta.
GraphQL and Databricks Delta Lake Data Connectors: Graduated from Beta to Release Candidate.
gospice SDK Release: The Spice Go SDK has been updated to v7.0, adding support for refreshing datasets and upgrading dependencies.
Azure AI Support: Added support for both LLMs and embedding models. Example spicepod.yml configuration:
yaml
embeddings:
  - name: azure
    from: azure:text-embedding-3-small
    params:
      endpoint: https://your-resource-name.openai.azure.com
      azure_api_version: 2024-08-01-preview
      azure_deployment_name: text-embedding-3-small
      azure_api_key: ${ secrets:SPICE_AZURE_API_KEY }
models:
  - name: azure
    from: azure:gpt-4o-mini
    params:
      endpoint: https://your-resource-name.openai.azure.com
      azure_api_version: 2024-08-01-preview
      azure_deployment_name: gpt-4o-mini
      azure_api_key: ${ secrets:SPICE_AZURE_TOKEN }
Accelerate subsets of columns: Spice now supports acceleration for specific columns from a federated source. Specify the desired columns directly in the Refresh SQL for more selective and efficient data acceleration.
Example spicepod.yaml configuration:
yaml
datasets:
  - from: s3://spiceai-demo-datasets/taxi_trips/2024/
    name: taxi_trips
    params:
      file_format: parquet
    acceleration:
      refresh_sql: SELECT tpep_pickup_datetime, tpep_dropoff_datetime, trip_distance, total_amount FROM taxi_trips
Breaking changes
Sharepoint Authentication Parameters: now use access tokens instead of authorization codes, using the sharepoint_bearer_token parameter. The sharepoint_auth_code parameter has been removed.
Data Connector Delimiters: now support / and ://, in addition to : in the from parameter of the dataset configuration. The following examples are equivalent:
- from: postgres://my_postgres_table
- from: postgres/my_postgres_table
- from: postgres:my_postgres_table
Some data connectors, such as s3 which only accepts ://, place further restrictions on the allowed delimiter.
The file data connector has changed how it interprets the :// delimiter to match how most other URL parsers work. Given file://my_file_path, the path was previously interpreted as /my_file_path; it is now interpreted as the relative path my_file_path.
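The delimiter rules above can be sketched in a few lines of Python. This is a minimal illustration and not the runtime's parser: the function name split_from and the regular expression are hypothetical, and connector-specific restrictions (such as s3 accepting only ://) are not modeled.

```python
import re
from typing import Tuple

# Hypothetical pattern: connector name, then one of the three delimiters.
# "://" is tried before ":" and "/" so the longest delimiter wins.
_FROM_PATTERN = re.compile(
    r"^(?P<connector>[A-Za-z0-9_]+)(?P<delimiter>://|:|/)(?P<path>.*)$"
)


def split_from(value: str) -> Tuple[str, str]:
    """Split a dataset `from` value into (connector, path)."""
    match = _FROM_PATTERN.match(value)
    if match is None:
        raise ValueError(f"no connector delimiter found in {value!r}")
    return match.group("connector"), match.group("path")


if __name__ == "__main__":
    for value in (
        "postgres://my_postgres_table",
        "postgres/my_postgres_table",
        "postgres:my_postgres_table",
        "file://my_file_path",
    ):
        print(value, "->", split_from(value))
```

Under this sketch the three postgres forms yield the same connector and path, and file://my_file_path yields the relative path my_file_path, matching the behavior described above.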
Spice Search limit: is now applied to the final search result. Previously, it was applied separately to each dataset involved in a search before aggregation.
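To make the change in limit semantics concrete, the following Python sketch contrasts the two behaviors on made-up data. The dataset names, scores, and function names are all hypothetical, and the assumption that hits are ranked by a single descending score is a simplification; the actual ranking and aggregation logic in Spice is not shown in these notes.

```python
from typing import Dict, List, Tuple

Hit = Tuple[str, float]  # (document id, relevance score); higher ranks first


def _ranked(hits: List[Hit]) -> List[Hit]:
    """Sort hits from most to least relevant."""
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def limit_per_dataset(results: Dict[str, List[Hit]], limit: int) -> List[Hit]:
    """Earlier behavior: truncate each dataset's hits, then aggregate."""
    merged: List[Hit] = []
    for hits in results.values():
        merged.extend(_ranked(hits)[:limit])
    return _ranked(merged)


def limit_final_result(results: Dict[str, List[Hit]], limit: int) -> List[Hit]:
    """Current behavior: aggregate all hits, then truncate once."""
    merged = [hit for hits in results.values() for hit in hits]
    return _ranked(merged)[:limit]


# Made-up scores for two hypothetical datasets.
example = {
    "docs": [("docs-1", 0.9), ("docs-2", 0.8), ("docs-3", 0.1)],
    "issues": [("issues-1", 0.7), ("issues-2", 0.6)],
}

if __name__ == "__main__":
    print(len(limit_per_dataset(example, 2)), "hits with a per-dataset limit")
    print(len(limit_final_result(example, 2)), "hits with a final limit")
```

With a limit of 2, the per-dataset variant can return up to 2 hits per dataset, while the final-result variant never returns more than 2 hits in total. Clients that relied on the larger result set may need to raise their limit.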
Dependencies
- Rust: Upgraded to 1.83
Contributors
- @phillipleblanc
- @ewgenius
- @Jeadie
- @sgrebnov
- @peasee
- @Sevenannn
- @Advayp
New Contributors
- @Advayp made their first contribution in https://github.com/spiceai/spiceai/pull/3862
What's Changed
- Fix install scripts to handle the RC release by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3718
- Update helm chart to v1.0.0-rc.1 by @ewgenius in https://github.com/spiceai/spiceai/pull/3720
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/3719
- Add logic to ignore task cancellations due to runtime shutdown by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3717
- Update to next relese version v1.0.0-rc.2 by @ewgenius in https://github.com/spiceai/spiceai/pull/3721
- Handle parsing OTel KeyValues from the baggage header by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3722
- Update llms dependencies: mistralrs, async-openai by @Jeadie in https://github.com/spiceai/spiceai/pull/3725
- Support jsonl for object store by @Jeadie in https://github.com/spiceai/spiceai/pull/3726
- Fix NSQL models integration tests for HF by @sgrebnov in https://github.com/spiceai/spiceai/pull/3727
- standardise 'csv_schema_infer_max_records' -> 'schema_infer_max_records'; include deprecation messages for dataset params by @Jeadie in https://github.com/spiceai/spiceai/pull/3732
- feat: Add script to generate TPC-H data for file connector by @peasee in https://github.com/spiceai/spiceai/pull/3737
- feat: Add file connector integration test by @peasee in https://github.com/spiceai/spiceai/pull/3735
- fix: Add explicit message for ODBC connector when not installed by @peasee in https://github.com/spiceai/spiceai/pull/3736
- Remove Box::leak in create_accelerated_table by @sgrebnov in https://github.com/spiceai/spiceai/pull/3739
- docs: Update enhancement and PR template by @peasee in https://github.com/spiceai/spiceai/pull/3740
- feat: add file connector benchmark by @peasee in https://github.com/spiceai/spiceai/pull/3734
- docs: Release file connector beta by @peasee in https://github.com/spiceai/spiceai/pull/3738
- For embeddings, use sentence_*_config.json, download HF async, use TEI functions by @Jeadie in https://github.com/spiceai/spiceai/pull/3724
- Optimize build & release workflow for trunk builds by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3741
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3752
- Skip Spice cloud integration tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3755
- Add http_requests metric and deprecate http_requests_total by @sgrebnov in https://github.com/spiceai/spiceai/pull/3748
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3759
- fix: Parquet file generation script by @peasee in https://github.com/spiceai/spiceai/pull/3762
- fix: Use InvalidConfiguration error for GraphQL query errors by @peasee in https://github.com/spiceai/spiceai/pull/3763
- Extend Spice Search integration and E2E tests to cover chunking by @sgrebnov in https://github.com/spiceai/spiceai/pull/3750
- test: Add GraphQL integration tests from external sources by @peasee in https://github.com/spiceai/spiceai/pull/3756
- docs: Release GraphQL release candidate by @peasee in https://github.com/spiceai/spiceai/pull/3764
- Accelerate a subset of columns from source dataset in Refresh SQL by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3765
- Run TPCDS benchmark for databricks delta mode by @Sevenannn in https://github.com/spiceai/spiceai/pull/3751
- Update dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3747
- Implement vector search benchmark initialization by @sgrebnov in https://github.com/spiceai/spiceai/pull/3774
- Implement InvalidTypeAction for PostgreSQL Data Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3767
- fix: Check ODBC parameters are positive integers by @peasee in https://github.com/spiceai/spiceai/pull/3777
- Fix Delta DataType Map type mapping to arrow type by @Sevenannn in https://github.com/spiceai/spiceai/pull/3776
- Update Databricks & Delta Lake Connector RC criteria by @Sevenannn in https://github.com/spiceai/spiceai/pull/3778
- Add a /v1/packages/generate API to generate a Spicepod package from a GitHub repo. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3782
- Set Spice-Target-Source header for spice add by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3783
- Call v1 spicerack API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3784
- Run models integration tests on self-hosted macOS runners by @sgrebnov in https://github.com/spiceai/spiceai/pull/3785
- Fix OpenAI models integration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3786
- Integration test for Databricks delta_lake mode by @Sevenannn in https://github.com/spiceai/spiceai/pull/3779
- Add spice connect for connecting to existing Spice.ai instances by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3790
- Add eval spicepod component; basic HTTP api to run eval. by @Jeadie in https://github.com/spiceai/spiceai/pull/3766
- Release RC for databricks delta_lake mode by @Sevenannn in https://github.com/spiceai/spiceai/pull/3792
- Include Huggingface model to E2E models tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3788
- Enable trace_id & parent_span_id overrides for v1/chat/completion by @Jeadie in https://github.com/spiceai/spiceai/pull/3791
- Search benchmark: run search workload and measure result by @sgrebnov in https://github.com/spiceai/spiceai/pull/3793
- Search benchmark: measure search precision by @sgrebnov in https://github.com/spiceai/spiceai/pull/3804
- Use MinIO instead of S3 for benchmark tests by @Sevenannn in https://github.com/spiceai/spiceai/pull/3794
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3814
- Only verify TPCH / TPCDS official query results for DuckDB by @Sevenannn in https://github.com/spiceai/spiceai/pull/3816
- Fixes for the Debezium connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3819
- Fix insert statement when all columns are constraint columns by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3820
- docs: Move ODBC to Beta for current state of roadmap by @peasee in https://github.com/spiceai/spiceai/pull/3823
- Accept :, / or :// as the delimiter for the data connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3821
- Update dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3826
- Enable read_write mode support for Postgres Data Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/3813
- feat: add Databricks ODBC TPCDS benchmark by @peasee in https://github.com/spiceai/spiceai/pull/3825
- Change spice.ai data connector dataset path format to <org>/<app>/datasets/<table_reference> by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3828
- fix: enable tpcds explain snapshotting by @peasee in https://github.com/spiceai/spiceai/pull/3830
- Azure AI support for both LLMs & embedding models by @Jeadie in https://github.com/spiceai/spiceai/pull/3824
- Add Github Workflow to run Search Benchmark by @sgrebnov in https://github.com/spiceai/spiceai/pull/3834
- Fetch access token with Microsoft OAuth, and use access token to initiate Sharepoint data connector graph client by @Sevenannn in https://github.com/spiceai/spiceai/pull/3836
- Initialize accelerator for datasets dynamically included by @Sevenannn in https://github.com/spiceai/spiceai/pull/3714
- Update cargo.lock by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3838
- feat: add MS SQL TPCH benchmark by @peasee in https://github.com/spiceai/spiceai/pull/3833
- Improve Azure AI models support by @sgrebnov in https://github.com/spiceai/spiceai/pull/3835
- Primary key support for Arrow's Memtable by @Jeadie in https://github.com/spiceai/spiceai/pull/3829
- Update Tokenizer to 0.21 and mistral.rs by @Jeadie in https://github.com/spiceai/spiceai/pull/3839
- Fix models integration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3843
- Enable spice login abfs by @Sevenannn in https://github.com/spiceai/spiceai/pull/3844
- update crates/llms dependencies to 'spiceai' branch by @Jeadie in https://github.com/spiceai/spiceai/pull/3846
- Make eval runs non-blocking; spice.eval.{results, runs} tables. by @Jeadie in https://github.com/spiceai/spiceai/pull/3780
- fix: Update GraphQL snapshots by @peasee in https://github.com/spiceai/spiceai/pull/3849
- Update to Rust 1.83 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3847
- feat: add mssql integration test by @peasee in https://github.com/spiceai/spiceai/pull/3848
- Prepend user-specified user agent in flight repl by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3850
- fix: trim CHAR in mssql by @peasee in https://github.com/spiceai/spiceai/pull/3852
- Fix column quoting for SpiceCloudPlatform dialect by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3857
- Optimize builds by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3861
- Endgame template: Add recently added AI/ML quickstarts and samples by @sgrebnov in https://github.com/spiceai/spiceai/pull/3859
- docs: Release MS SQL Beta by @peasee in https://github.com/spiceai/spiceai/pull/3853
- Fix nsql sampling for tables with embeddings by @sgrebnov in https://github.com/spiceai/spiceai/pull/3860
- Make GH workflows with spiceai-macos runners more stable by @sgrebnov in https://github.com/spiceai/spiceai/pull/3863
- fix: Remove GraphQL swapi test by @peasee in https://github.com/spiceai/spiceai/pull/3867
- create 1 tokio::test per test/model by @Jeadie in https://github.com/spiceai/spiceai/pull/3696
- handle max_completion_tokens vs max_tokens for openai vs azure by @Jeadie in https://github.com/spiceai/spiceai/pull/3869
- Search benchmark: write results to dataset by @sgrebnov in https://github.com/spiceai/spiceai/pull/3871
- Create evalconverter that creates spice eval components. by @Jeadie in https://github.com/spiceai/spiceai/pull/3864
- Update quickstart in README.md by @ewgenius in https://github.com/spiceai/spiceai/pull/3876
- Remove reference to spiceai-smart-demo from the repo home by @sgrebnov in https://github.com/spiceai/spiceai/pull/3885
- Trace evals accelerated tables updates in debug mode by @sgrebnov in https://github.com/spiceai/spiceai/pull/3884
- Clarify confusing log message by @Advayp in https://github.com/spiceai/spiceai/pull/3862
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/3840
- Azure OpenAI models: make endpoint parameter required by @sgrebnov in https://github.com/spiceai/spiceai/pull/3883
- Use spiceai delta kernel fork, actionable message for delta checkpoint errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/3856
- Add support for GGUF files in HF by @Jeadie in https://github.com/spiceai/spiceai/pull/3875
Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.1...v1.0.0-rc.2
- Rust
Published by peasee about 1 year ago
https://github.com/spiceai/spiceai - v1.0.0-rc.1
Spice v1.0-rc.1 (Nov 27, 2024)
Spice v1.0.0-rc.1 marks the release candidate for the first major version of Spice.ai OSS. This milestone includes key Connector and Accelerator graduations and bug fixes, positioning Spice for a stable and production-ready release.
Highlights in v1.0-rc.1
API Key Authentication: Spice now supports optional authentication for API endpoints via configurable API keys, for additional security and control over runtime access.
Example Spicepod.yml configuration:
yaml
runtime:
  auth:
    api-key:
      enabled: true
      keys:
        - ${ secrets:api_key } # Load from a secret store
        - my-api-key # Or specify directly
Usage:
- HTTP API: Include the API key in the X-API-Key header.
- Flight SQL: Use the API key in the Authorization header as a Bearer token.
- Spice CLI: Provide the --api-key flag for CLI commands.
For more details on using API Key auth, refer to the API Auth documentation.
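The header conventions above can be sketched in a few lines of Python. This is a minimal illustration, not the official client: the helper names are hypothetical, the base URL `http://localhost:8090` is an assumed local runtime address, and `/v1/datasets` is used only as an example endpoint. The request is built but not sent.

```python
import urllib.request


def http_api_headers(api_key: str) -> dict:
    """Headers for the HTTP API: the key goes in the X-API-Key header."""
    return {"X-API-Key": api_key}


def flight_sql_headers(api_key: str) -> dict:
    """Headers for Flight SQL: the key is sent as a Bearer token."""
    return {"Authorization": f"Bearer {api_key}"}


def build_datasets_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated HTTP API request.

    The base URL is an assumption for illustration; adjust it to
    wherever the runtime is actually listening.
    """
    url = base_url.rstrip("/") + "/v1/datasets"
    return urllib.request.Request(url, headers=http_api_headers(api_key))


if __name__ == "__main__":
    req = build_datasets_request("http://localhost:8090", "my-api-key")
    print(req.full_url)
    print(flight_sql_headers("my-api-key"))
```

In practice the key would be read from an environment variable or secret store rather than written inline, mirroring the `${ secrets:api_key }` form in the Spicepod example.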
DuckDB Data Connector: Has graduated from Beta to Release Candidate.
Arrow and DuckDB Data Accelerators: Both have graduated from Beta to Release Candidates.
Debezium Kafka Integration: Spice now supports secure authentication and encryption options for Kafka connections when using Debezium for Change Data Capture (CDC). The previous limitation of PLAINTEXT protocol-only connections has been lifted. Spice now supports the following Kafka security configurations:
- Security protocol: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL
- SASL mechanisms: PLAIN, SCRAM-SHA-256, SCRAM-SHA-512
Example Spicepod.yml configuration:
yaml
datasets:
  - from: debezium:my_kafka_topic_with_debezium_changes
    name: my_dataset
    params:
      kafka_security_protocol: SASL_SSL
      kafka_sasl_mechanism: SCRAM-SHA-512
      kafka_sasl_username: kafka
      kafka_sasl_password: ${secrets:kafka_sasl_password}
      kafka_ssl_ca_location: ./certs/kafka_ca_cert.pem
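Before deploying a configuration like this, the `params` block can be sanity-checked against the supported values listed above. The following Python sketch is a hypothetical pre-flight check, not part of Spice: the function name is invented, and the rule that SASL protocols need a username and password is an assumption of this sketch. The `SASL_SSL` fallback reflects the new default described in these release notes.

```python
# Supported values as listed in the release notes.
ALLOWED_PROTOCOLS = {"PLAINTEXT", "SSL", "SASL_PLAINTEXT", "SASL_SSL"}
ALLOWED_MECHANISMS = {"PLAIN", "SCRAM-SHA-256", "SCRAM-SHA-512"}
DEFAULT_PROTOCOL = "SASL_SSL"  # default for Debezium datasets as of v1.0-rc.1


def check_kafka_security(params: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    protocol = params.get("kafka_security_protocol", DEFAULT_PROTOCOL)
    if protocol not in ALLOWED_PROTOCOLS:
        problems.append(f"unsupported kafka_security_protocol: {protocol}")

    mechanism = params.get("kafka_sasl_mechanism")
    if mechanism is not None and mechanism not in ALLOWED_MECHANISMS:
        problems.append(f"unsupported kafka_sasl_mechanism: {mechanism}")

    # Assumption of this sketch: SASL protocols require credentials.
    if protocol.startswith("SASL_"):
        for key in ("kafka_sasl_username", "kafka_sasl_password"):
            if not params.get(key):
                problems.append(f"{key} is required when using {protocol}")
    return problems


if __name__ == "__main__":
    params = {
        "kafka_security_protocol": "SASL_SSL",
        "kafka_sasl_mechanism": "SCRAM-SHA-512",
        "kafka_sasl_username": "kafka",
        "kafka_sasl_password": "${secrets:kafka_sasl_password}",
    }
    print(check_kafka_security(params))
```

Note that a dataset with no security parameters at all is flagged by this check, because the protocol falls back to `SASL_SSL` rather than `PLAINTEXT`.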
Breaking changes
Model Parameters: The params.spice_tools parameter has been replaced by params.tools. Backward compatibility is maintained for existing configurations using params.spice_tools.
Dataset Accelerator State: The ready_state parameter has been moved to the dataset level.
Ready Handler Response: The response body of the /v1/ready handler has been changed from Ready (uppercase) to ready (lowercase) for consistency and adherence to standards.
Default Kafka Security for Debezium: The default Kafka kafka_security_protocol parameter for Debezium datasets has changed from PLAINTEXT to SASL_SSL, improving security by default.
Metrics Name Updates: Adjustments have been made to specific metrics for improved observability and accuracy:
| Before | v1.0-rc.1 |
| --- | --- |
| catalogs_load_error | catalog_load_errors |
| catalogs_status | catalog_load_state |
| datasets_acceleration_append_duration_ms, datasets_acceleration_load_duration_ms | dataset_acceleration_refresh_duration_ms {mode: append/full} |
| datasets_acceleration_last_refresh_time | dataset_acceleration_last_refresh_time_ms |
| datasets_acceleration_refresh_error | dataset_acceleration_refresh_errors |
| datasets_count | dataset_active_count |
| datasets_load_error | dataset_load_errors |
| datasets_status | dataset_load_state |
| datasets_unavailable_time | dataset_unavailable_time_ms |
| embeddings_count | embeddings_active_count |
| embeddings_load_error | embeddings_load_errors |
| embeddings_status | embeddings_load_state |
| flight_do_action_duration_ms, flight_do_get_get_primary_keys_duration_ms, flight_do_get_get_catalogs_duration_ms, flight_do_get_get_schemas_duration_ms, flight_do_get_get_sql_info_duration_ms, flight_do_get_table_types_duration_ms, flight_do_get_get_tables_duration_ms, flight_do_get_prepared_statement_query_duration_ms, flight_do_get_simple_duration_ms, flight_do_get_statement_query_duration_ms, flight_do_put_duration_ms, flight_handshake_request_duration_ms, flight_list_actions_duration_ms, flight_get_flight_info_request_duration_ms | flight_request_duration_ms {method: method_name, command: command_name} |
| flight_do_action_requests, flight_do_exchange_data_updates_sent, flight_do_exchange_requests, flight_do_put_requests, flight_do_get_requests, flight_handshake_requests, flight_list_actions_requests, flight_list_flights_requests, flight_get_flight_info_requests, flight_get_schema_requests | flight_requests {method: method_name, command: command_name} |
| http_requests_duration_ms | http_request_duration_ms |
| models_count | model_active_count |
| models_load_duration_ms | model_load_duration_ms |
| models_load_error | model_load_errors |
| models_status | model_load_state |
| tool_count | tool_active_count |
| tool_load_error | tool_load_errors |
| tools_status | tool_load_state |
| query_count | query_executions |
| query_execution_duration | query_execution_duration_ms |
| results_cache_hit_count | results_cache_hits |
| results_cache_item_count | results_cache_items_count |
| results_cache_max_size | results_cache_max_size_bytes |
| results_cache_request_count | results_cache_requests |
| results_cache_size | results_cache_size_bytes |
| secrets_stores_load_duration_ms | secrets_store_load_duration_ms |
| bytes_processed | query_processed_bytes |
| bytes_returned | query_returned_bytes |
| spiced_runtime_flight_server_start | runtime_flight_server_started |
| spiced_runtime_http_server_start | runtime_http_server_started |
| views_load_error | view_load_errors |
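Dashboards and alert rules that reference the old names need updating after this change. As a rough sketch, the one-to-one renames can be applied to query expressions mechanically. The helper below is hypothetical and covers only a handful of rows from the table; it assumes the metric names use underscores as word separators, and it deliberately skips the Flight metrics, where many old names collapse into one new name with `method` and `command` dimensions.

```python
import re

# A subset of the one-to-one renames from the table above (old -> new).
RENAMES = {
    "catalogs_load_error": "catalog_load_errors",
    "datasets_count": "dataset_active_count",
    "models_count": "model_active_count",
    "query_count": "query_executions",
    "results_cache_hit_count": "results_cache_hits",
    "bytes_processed": "query_processed_bytes",
}

# Match whole metric names only. Underscores count as word characters,
# so "my_query_count" or "datasets_count_total" is left untouched.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in RENAMES) + r")\b"
)


def migrate_expression(expr: str) -> str:
    """Rewrite old metric names in a query expression to their new names."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], expr)


if __name__ == "__main__":
    print(migrate_expression("rate(query_count[5m]) / datasets_count"))
```

Renames that also change units or semantics, such as the move to `_ms` and `_bytes` suffixes, still deserve a manual review of the surrounding query.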
Contributors
- @phillipleblanc
- @sgrebnov
- @Jeadie
- @Sevenannn
- @peasee
- @slyons
- @barracudarin
- @lukekim
- @ewgenius
What's changed
- Update to next release version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3372
- Update Helm chart to v0.20.0-beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3373
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3375
- E2E: Add a test to confirm refreshing with custom refresh-sql via CLI by @sgrebnov in https://github.com/spiceai/spiceai/pull/3374
- Fix regression in inferring embedding model vector size for non-default models by @Jeadie in https://github.com/spiceai/spiceai/pull/3376
- add AI quickstarts to endgame by @Jeadie in https://github.com/spiceai/spiceai/pull/3378
- Remove need for params.model_type for most HF LLMs by @Jeadie in https://github.com/spiceai/spiceai/pull/3342
- Replace query_duration_seconds and http_requests_duration_seconds with milliseconds metrics by @sgrebnov in https://github.com/spiceai/spiceai/pull/3251
- Add Extension<Runtime> to HTTP routes to simplify tooling in NSQL. by @Jeadie in https://github.com/spiceai/spiceai/pull/3384
- Update datafusion patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/3386
- Ensure hyperparameters are obeyed in recursive chat/completion calls. by @Jeadie in https://github.com/spiceai/spiceai/pull/3395
- fix: update odbc benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/3394
- Implement traits & plumbing for pluggable HTTP Auth by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3397
- Add allow_http parameter for S3 data connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3398
- Add column field to dataset spicepod component by @Jeadie in https://github.com/spiceai/spiceai/pull/3336
- feat: add duckdb connector benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/3403
- Add integration tests for OpenAI NSQL functionality by @sgrebnov in https://github.com/spiceai/spiceai/pull/3402
- Implement optional api-key auth for the HTTP endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3405
- Add integration tests for Search API (OpenAI and HF models) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3410
- HTTP APIs: list tools, call tool by @Jeadie in https://github.com/spiceai/spiceai/pull/3404
- Implement optional api-key auth for the Flight/FlightSQL endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3412
- Adding semicolons to some TPCH queries to make sure they run on the CLI by @slyons in https://github.com/spiceai/spiceai/pull/3420
- Add GrpcAuth to protect the OpenTelemetry endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3417
- Support Kafka-native authentication and TLS connections for Debezium connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3419
- Add integration tests for Embeddings API (OpenAI and HF models) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3416
- Support base64 embedding format by @Jeadie in https://github.com/spiceai/spiceai/pull/3418
- Give local models some love by @Jeadie in https://github.com/spiceai/spiceai/pull/3425
- Have views update on --pods-watcher-enabled by @Jeadie in https://github.com/spiceai/spiceai/pull/3428
- Simplify running models integration tests locally by @sgrebnov in https://github.com/spiceai/spiceai/pull/3424
- Make Debezium connector MySQL compatible by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3432
- Store + load memory tooling, enable by @Jeadie in https://github.com/spiceai/spiceai/pull/3413
- Statically compile OpenSSL by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3434
- Build macOS x64 on macos-14 (Sonoma) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3435
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3443
- Bump azure_core from 0.20.0 to 0.21.0 by @dependabot in https://github.com/spiceai/spiceai/pull/3436
- Add integration tests for chat completion API (HF and OpenAI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3433
- Run Clickbench with Spice Benchmark Binary by @Sevenannn in https://github.com/spiceai/spiceai/pull/3389
- Use datatype_is_semantically_equal in verify_schema by @Sevenannn in https://github.com/spiceai/spiceai/pull/3423
- Use spiceai-large-runners to build benchmark binary by @Sevenannn in https://github.com/spiceai/spiceai/pull/3446
- Skip reqwest_retry::middleware tracing in non verbose configuration by @sgrebnov in https://github.com/spiceai/spiceai/pull/3445
- feat: Add invalid type action handling for DuckDB by @peasee in https://github.com/spiceai/spiceai/pull/3430
- Fix benchmark: Lock poisoning issue from INSTA by @Sevenannn in https://github.com/spiceai/spiceai/pull/3457
- docs: Release DuckDB Connector RC by @peasee in https://github.com/spiceai/spiceai/pull/3459
- DR: Code Pattern For Obtaining Milliseconds-Based Duration by @sgrebnov in https://github.com/spiceai/spiceai/pull/3460
- Improve ClickBench setup script: avoid re-downloading test data every time by @sgrebnov in https://github.com/spiceai/spiceai/pull/3463
- Fix TableReference quoting for MySQL by @Jeadie in https://github.com/spiceai/spiceai/pull/3461
- Tool use and model name for local models by @Jeadie in https://github.com/spiceai/spiceai/pull/3458
- params.tools, not params.spice_tools. Allow backwards compatibility to params.spice_tools. by @Jeadie in https://github.com/spiceai/spiceai/pull/3473
- fix: Support DuckDB boolean list by @peasee in https://github.com/spiceai/spiceai/pull/3474
- Upgrade to DataFusion 43 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3462
- Build explicit ODBC Docker image by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3476
- Promote Arrow acceleration to RC by @sgrebnov in https://github.com/spiceai/spiceai/pull/3478
- Update benchmark workflow to create PR for updating snapshot by @Sevenannn in https://github.com/spiceai/spiceai/pull/3479
- Update benchmark snapshots for spice.ai connector tpch by @github-actions in https://github.com/spiceai/spiceai/pull/3481
- Update setup-make action by @Sevenannn in https://github.com/spiceai/spiceai/pull/3488
- Option to return sql from v1/nsql by @Jeadie in https://github.com/spiceai/spiceai/pull/3487
- Adding scripts to run and monitor TPC-H/-DS queries at larger scale factors by @slyons in https://github.com/spiceai/spiceai/pull/3483
- Update Datafusion and Datafusion-Table-Providers patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/3489
- docs: Update Accelerator RC to specify clickbench in all modes by @peasee in https://github.com/spiceai/spiceai/pull/3490
- Add logos and marks by @lukekim in https://github.com/spiceai/spiceai/pull/3485
- Updates to repo docs by @lukekim in https://github.com/spiceai/spiceai/pull/3486
- Change document_similarity to return markdown, not JSON. by @Jeadie in https://github.com/spiceai/spiceai/pull/3477
- Add support for creating embeddings for Utf8View type columns by @sgrebnov in https://github.com/spiceai/spiceai/pull/3498
- Add vector search support for Utf8View type columns by @sgrebnov in https://github.com/spiceai/spiceai/pull/3500
- Update datafusion-table-providers version by @Jeadie in https://github.com/spiceai/spiceai/pull/3503
- Update text-embeddings-inference and mistral.rs from downstream. by @Jeadie in https://github.com/spiceai/spiceai/pull/3505
- Fix snapshot update PR push in benchmark by @Sevenannn in https://github.com/spiceai/spiceai/pull/3484
- Run FederationAnalyzerRule before ResolveGroupingFunction rule by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3508
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3509
- docs: Release DuckDB accelerator RC by @peasee in https://github.com/spiceai/spiceai/pull/3512
- Upgrade datafusion-functions-json to 0.43 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3511
- Update Datafusion Table Provider patch to fix MySQL refresh append mode by @Sevenannn in https://github.com/spiceai/spiceai/pull/3514
- Handle panics in HF API calls by @Jeadie in https://github.com/spiceai/spiceai/pull/3521
- Update Runtime metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3518
- Update Flight metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3515
- Update Results Cache metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3520
- Move ready_state to dataset level by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3526
- Add --force option to spice upgrade to force it to upgrade to the latest released version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3527
- Refactor runtime initialization into separate modules by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3531
- Update Anonymous telemetry metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3529
- Add Metrics naming principles and guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3516
- Update Dataset Acceleration metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3528
- Improve localpod startup to register immediately after its parent is registered by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3532
- AI/LLM integration tests: make tests more robust and verify more ai_tools by @sgrebnov in https://github.com/spiceai/spiceai/pull/3513
- Update dashboards to match new metrics names by @sgrebnov in https://github.com/spiceai/spiceai/pull/3530
- Clarify source of prefixes for data component parameters. by @Jeadie in https://github.com/spiceai/spiceai/pull/3541
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3564
- Update Spice release process to support release branches by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3525
- fix: Validate the endpoint for ABFS and S3 by @peasee in https://github.com/spiceai/spiceai/pull/3565
- Vector Search: Default to datasets with embeddings only when none are specified by @sgrebnov in https://github.com/spiceai/spiceai/pull/3575
- Lowercase the ready handler response by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3577
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3579
- Improve spice search error handling by @sgrebnov in https://github.com/spiceai/spiceai/pull/3571
- Load components in parallel, not concurrently by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3566
- fix: Make S3 auth parameter validation more robust: by @peasee in https://github.com/spiceai/spiceai/pull/3578
- fix: Infer if the specified file format is correct in object store by @peasee in https://github.com/spiceai/spiceai/pull/3580
- Add ability to configure CORS on the HTTP server by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3581
- fix: Handle invalid S3 auth and region better by @peasee in https://github.com/spiceai/spiceai/pull/3582
- allow setting of replicaCount to a falsy-value by @barracudarin in https://github.com/spiceai/spiceai/pull/3586
- spice search to default to only datasets with embeddings by @sgrebnov in https://github.com/spiceai/spiceai/pull/3588
- Run AI integration tests as part of CI by @sgrebnov in https://github.com/spiceai/spiceai/pull/3572
- Load datasets in parallel by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3585
- Run integration test on smaller runners by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3583
- Use folders for model component by @Jeadie in https://github.com/spiceai/spiceai/pull/3584
- Improve models integration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3592
- Change default task_history captured_output to none by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3598
- Add timeout to /v1/datasets APIs when app is locked by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3601
- Properly drop the read lock on the runtime app in http.start by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3603
- Make integration tests more robust on fewer cores by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3604
- refactor: First pass data connector error messages update by @peasee in https://github.com/spiceai/spiceai/pull/3602
- Add log if no datasets are configured by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3605
- Upgrade to DuckDB 1.1.3 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3606
- Add E2E test for spice search and chat functionality (OpenAI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3599
- Use spiceai-runners for TPCH / TPCDS benchmark by @Sevenannn in https://github.com/spiceai/spiceai/pull/3507
- docs: Update error handling guide by @peasee in https://github.com/spiceai/spiceai/pull/3611
- Improve default description for sql tool by @Jeadie in https://github.com/spiceai/spiceai/pull/3612
- Update metric name from query_invocations to query_executions by @sgrebnov in https://github.com/spiceai/spiceai/pull/3613
- Don't provide runtime tools to health check. by @Jeadie in https://github.com/spiceai/spiceai/pull/3615
- Sort vector search results based on similarity score by @sgrebnov in https://github.com/spiceai/spiceai/pull/3620
- Allow overriding runtime configuration with --set-runtime CLI flags by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3619
- Some bugs by @Jeadie in https://github.com/spiceai/spiceai/pull/3621
- Improve S3 errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/3640
- Update Databricks, Delta Lake, DuckDB error messages by @Sevenannn in https://github.com/spiceai/spiceai/pull/3642
- docs: Add error message UX to beta connector criteria by @peasee in https://github.com/spiceai/spiceai/pull/3639
- feat: Make REPL identify it's waiting on a new line by @peasee in https://github.com/spiceai/spiceai/pull/3617
- Wrap Server-Sent-Events chat errors as OpenAI error events by @sgrebnov in https://github.com/spiceai/spiceai/pull/3641
- refactor: Update accelerated table errors, dataset health monitor errors by @peasee in https://github.com/spiceai/spiceai/pull/3614
- Extend v1/datasets api to indicate if dataset can be used in vector search by @sgrebnov in https://github.com/spiceai/spiceai/pull/3644
- feat: Unnest DataFusion errors by @peasee in https://github.com/spiceai/spiceai/pull/3646
- feat: Add RateLimited DataConnectorError by @peasee in https://github.com/spiceai/spiceai/pull/3648
- Setup nightly docker release workflow by @ewgenius in https://github.com/spiceai/spiceai/pull/3649
- Make LLM integration tests more extensible. by @Jeadie in https://github.com/spiceai/spiceai/pull/3576
- feat: Update ODBC error messages by @peasee in https://github.com/spiceai/spiceai/pull/3651
- feat: Better tonic errors by @peasee in https://github.com/spiceai/spiceai/pull/3650
- Nightly release workflow fixes by @ewgenius in https://github.com/spiceai/spiceai/pull/3652
- Fix missing ARM64 image for nightly publish step by @ewgenius in https://github.com/spiceai/spiceai/pull/3653
- Use GitHub GraphQL rate limiting responses to rate limit requests by @lukekim in https://github.com/spiceai/spiceai/pull/3610
- Fix typo in nightly release publish step by @ewgenius in https://github.com/spiceai/spiceai/pull/3654
- Handle GitHub rate-limiting for the Rest API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3656
- Adding custom User-Agent parameters to chat, nsql and flightrepl by @slyons in https://github.com/spiceai/spiceai/pull/3609
- Remove "nightly-" prefix from tag by @ewgenius in https://github.com/spiceai/spiceai/pull/3671
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3670
- spice search to warn if dataset is not ready and won't be included in search by @sgrebnov in https://github.com/spiceai/spiceai/pull/3590
- Fix keyring secret store to try both prefixed & unprefixed secrets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3672
- Handle empty embeds by allowing for nulls by @Jeadie in https://github.com/spiceai/spiceai/pull/3600
- Improve github connector error by @Sevenannn in https://github.com/spiceai/spiceai/pull/3677
- Update FlightSQL error messages by @sgrebnov in https://github.com/spiceai/spiceai/pull/3676
- Update Datafusion Table Provider Patch to include error message improvements by @Sevenannn in https://github.com/spiceai/spiceai/pull/3678
- Integration tests for llms crate, with basic Anthropic test. by @Jeadie in https://github.com/spiceai/spiceai/pull/3647
- Allow E2E model tests to complete even if parallel platform tests failed by @sgrebnov in https://github.com/spiceai/spiceai/pull/3679
- Add Openai to llms testing by @Jeadie in https://github.com/spiceai/spiceai/pull/3680
- Fix .env in '.github/workflows/integration_llms.yml' by @Jeadie in https://github.com/spiceai/spiceai/pull/3686
- Improve error messages for spice ai connector, separate errors to different lines for DuckDB, Delta Lake, Databricks connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3643
- Add microsoft/Phi-3-mini-4k-instruct to llms crate testing, with MODEL_SKIPLIST & MODEL_ALLOWLIST by @Jeadie in https://github.com/spiceai/spiceai/pull/3690
- Add nightly label to spiced version in Cargo.toml by @ewgenius in https://github.com/spiceai/spiceai/pull/3691
- Disable HF in models integration tests (not supported) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3693
- Add log when CORS is enabled by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3695
- Fix nightly release workflow by @ewgenius in https://github.com/spiceai/spiceai/pull/3698
- Correctly set nightly labels for both release and pre-release versions by @ewgenius in https://github.com/spiceai/spiceai/pull/3699
- Improve REPL error handling for multiline error messages by @sgrebnov in https://github.com/spiceai/spiceai/pull/3692
- Determine supportfilterpushdown based on Accelerator federated reader & ZeroResultsAction by @Sevenannn in https://github.com/spiceai/spiceai/pull/3694
- Fix rdkafka duplicated version by @Sevenannn in https://github.com/spiceai/spiceai/pull/3707
- feat: Render multiline errors better in REPL by @peasee in https://github.com/spiceai/spiceai/pull/3701
- refactor: Update UnableToAttachDataConnector error message by @peasee in https://github.com/spiceai/spiceai/pull/3706
- refactor: Update errors for Alpha connectors by @peasee in https://github.com/spiceai/spiceai/pull/3705
- Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3704
- Implement a RequestContext that automatically propagates request details to metric dimensions by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3709
- Fix acceleration in append mode with refresh_sql specified by @sgrebnov in https://github.com/spiceai/spiceai/pull/3697
- Bump github.com/stretchr/testify from 1.9.0 to 1.10.0 by @dependabot in https://github.com/spiceai/spiceai/pull/3655
- Tokenizer for OpenAI embedding models for accurate chunking by @Jeadie in https://github.com/spiceai/spiceai/pull/3519
- Update error message when dataset isn't configured with time_column in append refresh by @Sevenannn in https://github.com/spiceai/spiceai/pull/3703
- Add the missing winver dependency in runtime crate by @Sevenannn in https://github.com/spiceai/spiceai/pull/3711
- deps: Update table providers by @peasee in https://github.com/spiceai/spiceai/pull/3712
- Add special tokens in chunk sizer by @Jeadie in https://github.com/spiceai/spiceai/pull/3713
- Disable results cache for benchmark tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3715
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.20.0-beta...v1.0.0-rc.1
- Rust
Published by ewgenius over 1 year ago
https://github.com/spiceai/spiceai - v0.20.0-beta
Spice v0.20.0-beta (Nov 04, 2024)
Spice v0.20.0-beta improves federated query performance with column pruning and adds support for Metal (Apple Silicon) and CUDA (NVidia) accelerators. The S3, PostgreSQL, MySQL, and GitHub Data Connectors have graduated from Beta to Release Candidates. The Arrow, DuckDB, and SQLite Data Accelerators have graduated from Alpha to Beta.
Highlights in v0.20.0-beta
Data Connectors: The S3, PostgreSQL, MySQL, and GitHub Data Connectors have graduated from beta to release candidate.
Data Accelerators: The Arrow, DuckDB, and SQLite Data Accelerators have graduated from alpha to beta.
Metal and CUDA Support: Added support for Metal (Apple Silicon) and CUDA (NVidia) for AI/ML workloads including embeddings and local LLM inference.
For instructions on compiling a Metal or CUDA binary, see the Installation Docs.
Breaking Changes
- The ODBC Data Connector now requires that ODBC drivers specified in connection strings be registered in the system ODBC driver manager.
Example invalid connection string:
bash
DRIVER={/path/to/driver.so};SERVER=localhost;DATABASE=master
Example valid connection string:
bash
DRIVER={My ODBC Driver};SERVER=localhost;DATABASE=master
Where My ODBC Driver is the name of an ODBC driver registered in the ODBC driver manager.
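As a rough illustration of where such a connection string would sit in a spicepod.yml, the sketch below uses a registered driver name. The dataset name my_table is a placeholder, and the odbc_connection_string parameter name is an assumption not stated in these notes; check the ODBC Data Connector documentation for the exact key.

```yaml
datasets:
  # Hypothetical dataset; "my_table" is a placeholder name.
  - from: odbc:my_table
    name: my_table
    params:
      # Assumed parameter name. The DRIVER value must be a driver name
      # registered in the system ODBC driver manager, not a file path.
      odbc_connection_string: 'DRIVER={My ODBC Driver};SERVER=localhost;DATABASE=master'
```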
Contributors
- @ewgenius
- @peasee
- @phillipleblanc
- @sgrebnov
- @Jeadie
- @barracudarin
- @Sevenannn
What's Changed
- Update Helm for v0.19.4-beta and add release notes by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3310
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/3311
- metal & cuda flags for spice by @Jeadie in https://github.com/spiceai/spiceai/pull/3212
- Promote postgres connector to RC quality by @Sevenannn in https://github.com/spiceai/spiceai/pull/3305
- docs: Update ROADMAP.md by @peasee in https://github.com/spiceai/spiceai/pull/3322
- feat: Enable federation for in-memory accelerators by @peasee in https://github.com/spiceai/spiceai/pull/3325
- fix: Only allow env files from the current dir by @peasee in https://github.com/spiceai/spiceai/pull/3327
- Always read TimezoneTZ from PostgreSQL as UTC by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3330
- For multi-sink acceleration refreshes, ensure parent table completes before the children. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3329
- Update TPC-DS Q49 (Decimal to Float) to match SQLite's type system by @sgrebnov in https://github.com/spiceai/spiceai/pull/3323
- Enable parquet pushdown in Spice by @Sevenannn in https://github.com/spiceai/spiceai/pull/3245
- Use spice object_store fork to fix S3 ambiguous error by @Sevenannn in https://github.com/spiceai/spiceai/pull/3304
- Don't mix commented out queries for s3 connectors and accelerators by @Sevenannn in https://github.com/spiceai/spiceai/pull/3331
- Allow only valid WHERE conditions in vector searches by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3335
- fix: Allow only ODBC profiles by @peasee in https://github.com/spiceai/spiceai/pull/3324
- Track how many times an acceleration falls back during initialization by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3339
- Anthropic model regex and fix tool parsing aggregation bug by @Jeadie in https://github.com/spiceai/spiceai/pull/3334
- Upgrade runtime along with CLI on spice upgrade by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3341
- Update upcoming Roadmap by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3343
- fix: Prevent acceleration files outside of working directory by @peasee in https://github.com/spiceai/spiceai/pull/3340
- Document S3 connector limitations by @Sevenannn in https://github.com/spiceai/spiceai/pull/3333
- Update Object Store Patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/3361
- Promote SQLite Data Accelerator to Beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/3365
- Promote S3 connector to RC quality by @Sevenannn in https://github.com/spiceai/spiceai/pull/3362
- Revert "fix: Only allow env files from the current dir" by @peasee in https://github.com/spiceai/spiceai/pull/3368
- docs: Fix typo for S3 release status in README.md by @peasee in https://github.com/spiceai/spiceai/pull/3370
- Include unnecessary columns pruning step during federated plan creation by @sgrebnov in https://github.com/spiceai/spiceai/pull/3363
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.19.4-beta...v0.20.0-beta
- Rust
Published by phillipleblanc over 1 year ago
https://github.com/spiceai/spiceai - v0.19.4-beta
Spice v0.19.4 (Oct 30, 2024)
Spice v0.19.4-beta introduces a new localpod Data Connector, improvements to accelerator resiliency and control, and a new configuration to control when accelerated datasets are considered ready.
Highlights in v0.19.4
localpod Connector: Implement a "tiered" acceleration strategy with a new localpod Data Connector that can be used to accelerate datasets from other datasets registered in Spice.
yaml
datasets:
  - from: s3://my_bucket/my_dataset
    name: my_dataset
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      refresh_check_interval: 60s
  - from: localpod:my_dataset
    name: my_localpod_dataset
    acceleration:
      enabled: true
Refreshes on the localpod's parent dataset will automatically be synchronized with the localpod dataset.
Improved Accelerator Resiliency: When Spice is restarted, if the federated source for a dataset configured with a file-based accelerator is not available, the dataset will still load from the existing file data and will attempt to connect to the federated source in the background for future refreshes.
Accelerator Ready State: Control when an accelerated dataset is considered "ready" by the runtime with the new ready_state parameter.
yaml
datasets:
  - from: s3://my_bucket/my_dataset
    name: my_dataset
    acceleration:
      enabled: true
      ready_state: on_load # or on_registration
- ready_state: on_load: Default. The dataset is considered ready after the initial load of the accelerated data. For file-based accelerated datasets that have existing data, this means the dataset is ready immediately.
- ready_state: on_registration: The dataset is considered ready when the dataset is registered in Spice. Queries against this dataset before the data is loaded will fall back to the federated source.
Breaking changes
Accelerated datasets configured with ready_state: on_load (the default behavior) that are not ready will return an error instead of returning zero results.
Contributors
- @Sevenannn
- @peasee
- @phillipleblanc
- @sgrebnov
- @barracudarin
- @Jeadie
- @ewgenius
What's Changed
- Update helm for v0.19.3-beta by @ewgenius in https://github.com/spiceai/spiceai/pull/3274
- docs: Mark GitHub as Beta in README.md by @peasee in https://github.com/spiceai/spiceai/pull/3272
- Fix docker publish by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3273
- Add SQLite TPC-DS Limitations: ROLLUP and GROUPING by @sgrebnov in https://github.com/spiceai/spiceai/pull/3277
- Update version to 1.0.0-rc.1 by @sgrebnov in https://github.com/spiceai/spiceai/pull/3276
- Synchronize localpod acceleration with parent acceleration refreshes by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3264
- feat: Update Datafusion, promote DuckDB and MySQL by @peasee in https://github.com/spiceai/spiceai/pull/3278
- Add SQLite TPC-DS Limitations: stddev by @sgrebnov in https://github.com/spiceai/spiceai/pull/3279
- fix indentation issue with service annotations by @barracudarin in https://github.com/spiceai/spiceai/pull/3281
- fix: Expose GitHub ratelimit errors by @peasee in https://github.com/spiceai/spiceai/pull/3258
- Revert Datafusion parquet changes by @Sevenannn in https://github.com/spiceai/spiceai/pull/3286
- Promote arrow accelerator to beta by @Sevenannn in https://github.com/spiceai/spiceai/pull/3287
- Add SQLite TPC-DS Limitations: casting to DECIMAL by @sgrebnov in https://github.com/spiceai/spiceai/pull/3282
- Accelerated datasets can fallback to federated source while loading by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3280
- Enable overlap_size correctly by @Jeadie in https://github.com/spiceai/spiceai/pull/3229
- Avoid duplicated filter conditions in rewritten SQL by @Sevenannn in https://github.com/spiceai/spiceai/pull/3284
- Fix SQLite records conversion with NULL in first row by @sgrebnov in https://github.com/spiceai/spiceai/pull/3295
- fix: Update datafusion by @peasee in https://github.com/spiceai/spiceai/pull/3297
- Display shorter name for benchmark workflow matrix by @Sevenannn in https://github.com/spiceai/spiceai/pull/3299
- Update spice_sys_dataset_checkpoint to store federated table schema by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3303
- Update postgres connector/accelerator snapshot by @Sevenannn in https://github.com/spiceai/spiceai/pull/3298
- Accelerated tables with existing file data can load without a connection to the federated source by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3306
- Ensure synchronized tables complete their insertion at the same time by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3307
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.19.3-beta...v0.19.4-beta
- Rust
Published by phillipleblanc over 1 year ago
https://github.com/spiceai/spiceai - v0.19.3-beta
Spice v0.19.3 (Oct 28, 2024)
Spice v0.19.3-beta improves the performance and stability of data connectors and accelerators, including faster queries across multiple federated sources by optimizing how filters are applied. Anthropic has also been added as an LLM provider.
Highlights in v0.19.3
DataFusion Fixes: Resolved bugs in DataFusion and DataFusion Table Providers, expanding TPC-DS coverage and correctness.
GitHub Data Connector Beta Milestone: The GitHub Data Connector has graduated to Beta after extensive testing, stability, and performance improvements.
Anthropic Models Provider: Anthropic has been added as an LLM provider, including support for streaming.
Example spicepod.yml:
yaml
models:
  - from: anthropic:claude-3-5-sonnet-20240620
    name: claude_3_5_sonnet
    params:
      anthropic_api_key: ${ secrets:SPICE_ANTHROPIC_API_KEY }
Breaking changes
None.
Contributors
- @Jeadie
- @Sevenannn
- @phillipleblanc
- @peasee
- @sgrebnov
- @nlamirault
- @barracudarin
- @lukekim
- @slyons
New Contributors
- @nlamirault made their first contribution in https://github.com/spiceai/spiceai/pull/3207
- @barracudarin made their first contribution in https://github.com/spiceai/spiceai/pull/3228
What's Changed
- Make Anthropic OpenAI compatible. by @Jeadie in https://github.com/spiceai/spiceai/pull/3087
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/3200
- Bump version to 1.0.0-rc.1 by @Sevenannn in https://github.com/spiceai/spiceai/pull/3202
- Fix clickhouse schema inference for non-default database by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3201
- Update endgame template by @Sevenannn in https://github.com/spiceai/spiceai/pull/3198
- Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3197
- fix: dataset refresh defaults properties to None by @peasee in https://github.com/spiceai/spiceai/pull/3205
- Upgrade OTEL to v0.26 and make seconds based metrics reported precisely by @sgrebnov in https://github.com/spiceai/spiceai/pull/3203
- use text_embedding_inference::Infer for more complete embedding solution by @Jeadie in https://github.com/spiceai/spiceai/pull/3199
- Add S3 parquet file - arrow accelerator e2e test by @Sevenannn in https://github.com/spiceai/spiceai/pull/3154
- feat: Add script to setup clickbench on mysql by @peasee in https://github.com/spiceai/spiceai/pull/3176
- Update helm chart version to v0.19.2 by @Sevenannn in https://github.com/spiceai/spiceai/pull/3210
- Add sample dataset option in v1/nsql. by @Jeadie in https://github.com/spiceai/spiceai/pull/3105
- Split spiced_docker build across architectures by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3206
- feat(helm): do not install demo dataset by default by @nlamirault in https://github.com/spiceai/spiceai/pull/3207
- Split integration test across build/run steps by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3215
- feat(helm): Refactoring Kubernetes labels by @nlamirault in https://github.com/spiceai/spiceai/pull/3208
- Define 'tool_recursion_limit' for LLMs, and limit internal tool calling recursion. by @Jeadie in https://github.com/spiceai/spiceai/pull/3214
- Improve filters pushdown for federated queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/3183
- Implement native schema inference for PostgreSQL by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3209
- docs: Update release criteria by @peasee in https://github.com/spiceai/spiceai/pull/3219
- Run SQLite acceleration TPC-DS tests using smaller scale by @sgrebnov in https://github.com/spiceai/spiceai/pull/3227
- bind the serviceAccount if a name is given or if we're creating one by @barracudarin in https://github.com/spiceai/spiceai/pull/3228
- Only emit channel send error log when its not a closed channel error by @Jeadie in https://github.com/spiceai/spiceai/pull/3230
- Enable Parquet Exec filter pushdown in Spice by @Sevenannn in https://github.com/spiceai/spiceai/pull/3216
- Add snapshots for SQLite TPC-DS benchmark (file mode) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3234
- docs: Add SDK release checks to endgame by @peasee in https://github.com/spiceai/spiceai/pull/3256
- Implement localpod Data Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3249
- Revert "Enable Parquet Exec filter pushdown in Spice (#3216)" by @Sevenannn in https://github.com/spiceai/spiceai/pull/3244
- refactor: Use existing action for detecting changes by @peasee in https://github.com/spiceai/spiceai/pull/3255
- feat: Add GitHub integration test by @peasee in https://github.com/spiceai/spiceai/pull/3226
- Add get_readiness tool to retrieve status of all registered components by @lukekim in https://github.com/spiceai/spiceai/pull/3035
- Improve CLI error output when REPL can't connect to the Flight endpoint by @slyons in https://github.com/spiceai/spiceai/pull/3188
- Fixing FTP link in Endgame by @slyons in https://github.com/spiceai/spiceai/pull/3267
- Update version to 0.19.3-beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/3269
- add service type and annotation customizations in https://github.com/spiceai/spiceai/pull/3268
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.19.2-beta...v0.19.3-beta
- Rust
Published by sgrebnov over 1 year ago
https://github.com/spiceai/spiceai - v0.19.2-beta
Spice v0.19.2 (Oct 21, 2024)
Spice v0.19.2-beta continues to improve performance and stability of data connectors and data accelerators, further expands TPC-DS coverage, and includes several bug fixes.
Highlights in v0.19.2
DataFusion Fixes: Resolved bugs in DataFusion and DataFusion Table Providers, improving TPC-DS query support and correctness.
TPC-DS Snapshots: Extended support for TPC-DS benchmarks with added snapshot tests for validating query plans and result accuracy.
PostgreSQL Accelerator Beta: The PostgreSQL Data Accelerator has been promoted to Beta quality.
Breaking changes
- The hive_infer_partitions parameter has been renamed to hive_partitioning_enabled. It now defaults to false and must be explicitly enabled.
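A minimal sketch of opting back in to Hive-style partitioning after this change. The bucket path and dataset name are placeholders, and placing the setting under params is an assumption based on how other object-store connector settings (such as file_format) are configured in these notes.

```yaml
datasets:
  # Hypothetical dataset over Hive-style partitioned files.
  - from: s3://my_bucket/my_partitioned_dataset/
    name: my_partitioned_dataset
    params:
      file_format: parquet
      # Partition inference is off by default, so enable it explicitly.
      hive_partitioning_enabled: true
```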
Contributors
- @ewgenius
- @sgrebnov
- @slyons
- @Jeadie
- @Sevenannn
- @phillipleblanc
- @dependabot
- @peasee
Dependencies
- DataFusion Table Providers: Upgraded to rev 2bcf481b4abe9d0bd6bb2479ce49020df66ff97f.
- duckdb-rs: Upgraded from 1.0.0 to 1.1.1.
What's Changed
- Update Helm chart for v0.19.1-beta by @ewgenius in https://github.com/spiceai/spiceai/pull/3106
- Add more TPC-DS snapshots for Postgres acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/3107
- Bumping version to 1.0.0-rc.1 by @slyons in https://github.com/spiceai/spiceai/pull/3109
- New table sampling methods: sample_distinct_columns, random_sample, top_n_sample by @Jeadie in https://github.com/spiceai/spiceai/pull/3108
- Add TPCDS snapshot tests for file-based and in-mem duckdb by @Sevenannn in https://github.com/spiceai/spiceai/pull/3115
- Add Postgres acceleration E2E test for MySQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/3110
- Update datafusion logical plan to avoid wrong group_by columns in aggregation by @Sevenannn in https://github.com/spiceai/spiceai/pull/3111
- Warn if user tries to embed column that does not exist by @Jeadie in https://github.com/spiceai/spiceai/pull/3120
- Changes for Rust version upgrade by @Sevenannn in https://github.com/spiceai/spiceai/pull/3134
- Add unnest support for federated plans by @sgrebnov in https://github.com/spiceai/spiceai/pull/3133
- Don't .clone() unnecessarily by @Jeadie in https://github.com/spiceai/spiceai/pull/3128
- Fix Flight get_schema to construct logical plan and return that schema. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3131
- Bump clap from 4.5.19 to 4.5.20 by @dependabot in https://github.com/spiceai/spiceai/pull/3099
- Add GitHub Workflow to build spice-postgres-tpcds-bench image by @sgrebnov in https://github.com/spiceai/spiceai/pull/3140
- test: Add basic MySQL integration test by @peasee in https://github.com/spiceai/spiceai/pull/3143
- Bump datafusion-federation and datafusion-table-providers crates by @sgrebnov in https://github.com/spiceai/spiceai/pull/3148
- docs: Add MySQL limitation for division by zero by @peasee in https://github.com/spiceai/spiceai/pull/3144
- fix: Dataset refresh by @peasee in https://github.com/spiceai/spiceai/pull/3147
- Update arrow, duckdb, postgres accelerator tpcds snapshots by @Sevenannn in https://github.com/spiceai/spiceai/pull/3145
- Add TPC-DS benchmarks for Postgres data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/3149
- Update E2E test ci to include tests for accelerating Postgres into accelerators by @Sevenannn in https://github.com/spiceai/spiceai/pull/3137
- Add TPCDS Benchmark test and snapshots for S3 by @Sevenannn in https://github.com/spiceai/spiceai/pull/3152
- [cli] Include 200 in acceptable response codes for doRuntimeApiRequest by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3157
- Use -build.{GIT_SHA} for unreleased versions by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3159
- Upgrade to Rust 1.82 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3158
- Disable hive_infer_partitions by default by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3160
- Upgrade to DuckDB 1.1.1 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3161
- feat: Add MySQL TPCDS results snapshots and exclude workarounds by @peasee in https://github.com/spiceai/spiceai/pull/3165
- Fix task_history output for sql, add output to table_schema & list_datasets tool by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3166
- feat: Add ClickBench queries as separate files by @peasee in https://github.com/spiceai/spiceai/pull/3169
- Calculate embeddings in a separate blocking thread by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3170
- docs: Update ROADMAP.md and release criterias by @peasee in https://github.com/spiceai/spiceai/pull/3124
- Handle OpenTelemetry errors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3173
- Update version to 0.19.2-beta by @Sevenannn in https://github.com/spiceai/spiceai/pull/3182
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.19.1-beta...v0.19.2-beta
- Rust
Published by Sevenannn over 1 year ago
https://github.com/spiceai/spiceai - v0.19.1-beta
Spice v0.19.1 (Oct 14, 2024)
Spice v0.19.1 brings further performance and stability improvements to data connectors, including improved query push-down for file-based connectors (s3, abfs, file, ftp, sftp) that use Hive-style partitioning.
Highlights in v0.19.1
TPC-H and TPC-DS Coverage: Expanded coverage for TPC-H and TPC-DS benchmarking suites across accelerators and connectors.
GitHub Connector Array Filter: The GitHub connector now supports filter push down for the array_contains function in SQL queries using search query mode.
NSQL CLI Command: A new spice nsql CLI command has been added to easily query datasets with natural language from the command line.
Breaking changes
None
Contributors
- @peasee
- @Sevenannn
- @sgrebnov
- @karifabri
- @phillipleblanc
- @lukekim
- @Jeadie
- @slyons
Dependencies
- DataFusion Table Providers: Upgraded to rev f22b96601891856e02a73d482cca4f6100137df8.
What's Changed
- release: Update helm chart for v0.19.0-beta by @peasee in https://github.com/spiceai/spiceai/pull/3024
- Set fail-fast = true for benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2997
- release: Update next version and ROADMAP by @peasee in https://github.com/spiceai/spiceai/pull/3033
- Verify TPCH benchmark query results for Spark connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2993
- feat: Add x-spice-user-agent header to Spice REPL by @peasee in https://github.com/spiceai/spiceai/pull/2979
- Update to object store file formats documentation link by @karifabri in https://github.com/spiceai/spiceai/pull/3036
- Use teraswitch-runners for Linux x64 workflows + builds by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3042
- feat: Support array contains in GitHub pushdown by @peasee in https://github.com/spiceai/spiceai/pull/2983
- Bump text-splitter from 0.16.1 to 0.17.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2987
- Revert integration tests back to hosted runner by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3046
- Tune Github runner resources to allow in memory TPCDS benchmark to run by @Sevenannn in https://github.com/spiceai/spiceai/pull/3025
- fix: add winver by @peasee in https://github.com/spiceai/spiceai/pull/3054
- refactor: Use is modifier for checking GitHub state filter by @peasee in https://github.com/spiceai/spiceai/pull/3056
- Enable merge_group checks for PR workflows by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3058
- Fix issues with merge group by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3059
- Validate in-memory arrow acceleration TPCDS result correctness by @Sevenannn in https://github.com/spiceai/spiceai/pull/3044
- Fix rev parsing for PR checks by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3060
- Use 'Accept' header for /v1/sql/ and /v1/nsql by @Jeadie in https://github.com/spiceai/spiceai/pull/3032
- Verify Postgres acceleration TPCDS result correctness by @Sevenannn in https://github.com/spiceai/spiceai/pull/3043
- Add NSQL CLI REPL command by @lukekim in https://github.com/spiceai/spiceai/pull/2856
- Preserve query results order and add TPCH benchmark results verification for duckdb:file mode by @sgrebnov in https://github.com/spiceai/spiceai/pull/3034
- Refactor benchmark to include MySQL tpcds bench, tweaks to makefile target for generating mysql tpcds data by @Sevenannn in https://github.com/spiceai/spiceai/pull/2967
- Support runtime parameter for sql_query_keep_partition_by_columns & enable by default by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3065
- Document TPC-DS limitations: EXCEPT, INTERSECT, duplicate names by @sgrebnov in https://github.com/spiceai/spiceai/pull/3069
- Adding ABFS benchmark by @slyons in https://github.com/spiceai/spiceai/pull/3062
- Add support for GitHub app installation auth for GitHub connector by @ewgenius in https://github.com/spiceai/spiceai/pull/3063
- docs: Document stack overflow workaround, add helper script by @peasee in https://github.com/spiceai/spiceai/pull/3070
- Tune MySQL TPCDS image to allow for successful benchmark test run by @Sevenannn in https://github.com/spiceai/spiceai/pull/3067
- Automatically infer partitions for hive-style partitioned files for object store based connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3073
- Support hf_token from params/secrets by @Jeadie in https://github.com/spiceai/spiceai/pull/3071
- Inherit embedding columns from source, when available. by @Jeadie in https://github.com/spiceai/spiceai/pull/3045
- Validate identifiers for component names by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3079
- docs: Add workaround for TPC-DS Q97 in MySQL by @peasee in https://github.com/spiceai/spiceai/pull/3080
- Document TPC-DS Postgres column alias in a CASE statement limitation by @sgrebnov in https://github.com/spiceai/spiceai/pull/3083
- Update plan snapshots for TPC-H bench queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/3088
- Update Datafusion crate to include recent unparsing fixes by @sgrebnov in https://github.com/spiceai/spiceai/pull/3089
- Sample SQL table data tool and API by @Jeadie in https://github.com/spiceai/spiceai/pull/3081
- chore: Update datafusion-table-providers by @peasee in https://github.com/spiceai/spiceai/pull/3090
- Add hive_infer_partitions to remaining object store connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3086
- deps: Update datafusion-table-providers by @peasee in https://github.com/spiceai/spiceai/pull/3093
- For local embedding models, return usage input tokens. by @Jeadie in https://github.com/spiceai/spiceai/pull/3095
- Update end_game.md with Accelerator/Connector criteria check by @slyons in https://github.com/spiceai/spiceai/pull/3092
- Update TPC-DS Q90 by @sgrebnov in https://github.com/spiceai/spiceai/pull/3094
- docs: Add RC connector criteria by @peasee in https://github.com/spiceai/spiceai/pull/3026
- Update version to 0.19.1-beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/3101
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.19.0-beta...v0.19.1-beta
- Rust
Published by slyons over 1 year ago
https://github.com/spiceai/spiceai - v0.19.0-beta
Spice v0.19.0-beta (Oct 7, 2024)
Spice v0.19.0-beta brings performance improvements for accelerators and expanded TPC-DS coverage. A new Azure Blob Storage data connector has also been added.
Highlights in v0.19.0-beta
Improved TPC-DS Coverage: Enhanced support for TPC-DS derived queries.
CLI SQL REPL: The CLI SQL REPL (spice sql) now supports multi-line editing and tab indentation. Note that a terminating semicolon ';' is now required for each executed SQL block.
Azure Storage Data Connector: A new Azure Blob Storage data connector (abfs://) has been added, enabling federated SQL queries on files stored in Azure Blob-compatible endpoints, including Azure BlobFS (abfss://) and Azure Data Lake (adl://). Supported file formats can be specified using the file_format parameter.
Example spicepod.yml:
yaml
datasets:
  - from: abfs://foocontainer/taxi_sample.csv
    name: azure_test
    params:
      azure_account: spiceadls
      azure_access_key: abc123==
      file_format: csv
For a full list of supported files, see the Object Store File Formats documentation.
For more details, see the Azure Blob Storage Data Connector documentation.
Breaking Changes
- Spice.ai Data Connector: The key for the Spice.ai Cloud Platform Data Connector has changed from spiceai to spice.ai. To upgrade, change uses of from: spiceai: to from: spice.ai:.
- GitHub Data Connector: Pull Requests column login has been renamed to author.
- CLI SQL REPL: A terminating semicolon ';' is now required for each executed SQL block.
- Spicepod Hot-Reload: When running spiced directly, hot-reload of spicepod.yml configuration is now disabled. Run with spice run to use hot-reload.
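The Spice.ai Data Connector key change listed above amounts to a one-line edit per dataset in spicepod.yml. The sketch below shows the idea; the dataset path my_org/my_app/datasets/my_table and the name my_table are hypothetical placeholders, not values from these notes.

```yaml
datasets:
  # Before v0.19.0-beta (no longer recognized):
  # - from: spiceai:my_org/my_app/datasets/my_table
  # From v0.19.0-beta onward, use the spice.ai key:
  - from: spice.ai:my_org/my_app/datasets/my_table
    name: my_table
```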
Contributors
- @sgrebnov
- @Jeadie
- @Sevenannn
- @peasee
- @ewgenius
- @slyons
- @phillipleblanc
- @lukekim
Dependencies
- DataFusion Table Providers: Upgraded to rev 826814ab149aad8ee668454c83a0650fb8b18d60.
What's Changed
- Bump tonic from 0.12.2 to 0.12.3 by @dependabot in https://github.com/spiceai/spiceai/pull/2880
- Verify benchmark query results using snapshot testing (s3 connector) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2902
- Fix paths-ignore: by @Jeadie in https://github.com/spiceai/spiceai/pull/2906
- Rename spiceai data connector to spice.ai by @sgrebnov in https://github.com/spiceai/spiceai/pull/2899
- Update ROADMAP.md by @Jeadie in https://github.com/spiceai/spiceai/pull/2907
- Helm update for helm for 0.18.3-beta by @Jeadie in https://github.com/spiceai/spiceai/pull/2910
- Add tpcds queries by @Sevenannn in https://github.com/spiceai/spiceai/pull/2918
- Fix paths-ignore for docs. by @Jeadie in https://github.com/spiceai/spiceai/pull/2911
- feat: Support LIKE expressions in GitHub filter pushdown by @peasee in https://github.com/spiceai/spiceai/pull/2903
- feat: Support date comparison pushdown in GitHub connector by @peasee in https://github.com/spiceai/spiceai/pull/2904
- Improve aggregation and union queries unparsing by @sgrebnov in https://github.com/spiceai/spiceai/pull/2925
- Initialize file based accelerators on dataset reload by @Sevenannn in https://github.com/spiceai/spiceai/pull/2923
- Update spiceai/spiceai for next release by @Jeadie in https://github.com/spiceai/spiceai/pull/2928
- Verify TPC-H benchmark query results for arrow acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/2927
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2912
- Use structured output for NSQL by @Jeadie in https://github.com/spiceai/spiceai/pull/2922
- Update TPC-DS queries to use supported date addition format by @sgrebnov in https://github.com/spiceai/spiceai/pull/2930
- Add busy_timeout accelerator param for Sqlite by @Sevenannn in https://github.com/spiceai/spiceai/pull/2855
- Use Cosine Similarity in vector search by @Jeadie in https://github.com/spiceai/spiceai/pull/2932
- Add support for passing x-spiceai-app-id metadata in spiceai data connector by @ewgenius in https://github.com/spiceai/spiceai/pull/2934
- docs: update beta accelerator criteria by @peasee in https://github.com/spiceai/spiceai/pull/2905
- Azure Connector implementation by @slyons in https://github.com/spiceai/spiceai/pull/2926
- Local embedding model from relative paths by @Jeadie in https://github.com/spiceai/spiceai/pull/2908
- Add Markdown aware chunker when params.file_format: md. by @Jeadie in https://github.com/spiceai/spiceai/pull/2943
- 'spice version' without structured logging by @Jeadie in https://github.com/spiceai/spiceai/pull/2944
- Bump tempfile from 3.12.0 to 3.13.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2878
- feat: GraphQL commit query parameters by @peasee in https://github.com/spiceai/spiceai/pull/2945
- Update OpenAI client and use new request fields by @Jeadie in https://github.com/spiceai/spiceai/pull/2951
- refactor: Rename GitHub pulls login to author by @peasee in https://github.com/spiceai/spiceai/pull/2954
- Run tpcds benchmarks for accelerators by @Sevenannn in https://github.com/spiceai/spiceai/pull/2853
- Add spiced arg `--pods-watcher-enabled`. Watcher disabled by default for spiced. by @ewgenius in https://github.com/spiceai/spiceai/pull/2953
- Add error message when spicepod has embeddings or models without '--features models' by @Jeadie in https://github.com/spiceai/spiceai/pull/2952
- Adding multi-line editing and tab indentation to sql REPL by @slyons in https://github.com/spiceai/spiceai/pull/2949
- Update MySQL ghcr image to include tpcds data by @Sevenannn in https://github.com/spiceai/spiceai/pull/2941
- Document DataFusion limitation: The context only support single SQL Statement, Date Arithmetic like date + 3 not supported by @Sevenannn in https://github.com/spiceai/spiceai/pull/2970
- Bump snafu from 0.8.4 to 0.8.5 by @dependabot in https://github.com/spiceai/spiceai/pull/2876
- Bump async-trait from 0.1.82 to 0.1.83 by @dependabot in https://github.com/spiceai/spiceai/pull/2879
- Bump async-graphql from 7.0.9 to 7.0.11 in the cargo group by @dependabot in https://github.com/spiceai/spiceai/pull/2950
- Verify TPC-H benchmark query results for MySQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2972
- Verify TPCH benchmark query results for Postgres by @sgrebnov in https://github.com/spiceai/spiceai/pull/2973
- Verify TPCH benchmark query results for sqlite acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/2974
- Verify TPCH benchmark query results for duckdb (in-memory) acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/2975
- Support for `mdx` file extensions to apply a markdown splitter by @ewgenius in https://github.com/spiceai/spiceai/pull/2977
- Don't assume first vector or content will be non-null/zero by @Jeadie in https://github.com/spiceai/spiceai/pull/2940
- use custom chunk sizers for HF, local and OpenAI models by @Jeadie in https://github.com/spiceai/spiceai/pull/2971
- Ensure we return N unique documents, not N unique chunks by @Jeadie in https://github.com/spiceai/spiceai/pull/2976
- Fix issues parsing `messages[*].tool_calls` for local models by @Jeadie in https://github.com/spiceai/spiceai/pull/2957
- text -> SQL trait to customise per model. by @Jeadie in https://github.com/spiceai/spiceai/pull/2942
- Remove system message from ToolUsingChat. by @Jeadie in https://github.com/spiceai/spiceai/pull/2978
- Make logical plan to sql more robust (improve ORDER BY; support `round` for Postgres) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2984
- Add connection_pool_size parameter for Postgres accelerator by @Sevenannn in https://github.com/spiceai/spiceai/pull/2969
- Fix dataset configure prompt by @sgrebnov in https://github.com/spiceai/spiceai/pull/2991
- Verify TPCH benchmark query results for Databricks(odbc) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2989
- Verify TPCH benchmark query results for Databricks (delta_lake) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2982
- Set log level for anonymous telemetry traces to `trace` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2995
- Improvements to issue templates by @lukekim in https://github.com/spiceai/spiceai/pull/2992
- `spice login` writes to `.env.local` if present by @slyons in https://github.com/spiceai/spiceai/pull/2996
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.18.3-beta...v0.19.0-beta
Published by peasee over 1 year ago
https://github.com/spiceai/spiceai - v0.18.3-beta
Spice v0.18.3-beta (Sep 30, 2024)
The Spice v0.18.3-beta release includes several quality-of-life improvements including verbosity flags for spiced and the Spice CLI, vector search over larger documents with support for chunking dataset embeddings, and multiple performance enhancements. Additionally, the release includes several bug fixes, dependency updates, and optimizations, including updated table providers and significantly improved GitHub data connector performance for issues and pull requests.
Highlights in v0.18.3-beta
GitHub Query Mode: A new github_query_mode: search parameter has been added to the GitHub Data Connector, which uses the GitHub Search API to enable faster and more efficient query of issues and pull requests when using filters.
Example spicepod.yml:
```yaml
- from: github:github.com/spiceai/spiceai/issues/trunk
  name: spiceai.issues
  params:
    github_query_mode: search # Use GitHub Search API
    github_token: ${secrets:GITHUB_TOKEN}
```
Output Verbosity: Higher verbosity output levels can be specified through flags for both spiced and the Spice CLI.
Example command line:
```shell
spice -v
spice --very-verbose

spiced -vv
spiced --verbose
```
Embedding Chunking: Chunking can be enabled and configured to preprocess input data before generating dataset embeddings. This improves the relevance and precision for larger pieces of content.
Example spicepod.yml:
```yaml
- name: support_tickets
  embeddings:
    - column: conversation_history
      use: openai_embeddings
      chunking:
        enabled: true
        target_chunk_size: 128
        overlap_size: 16
        trim_whitespace: true
```
For details, see the Search Documentation.
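As a rough illustration of what target_chunk_size and overlap_size control, the sketch below splits a token list into overlapping windows. It is not Spice's chunker: the function name `chunk_tokens` and the use of a plain token list are assumptions made for illustration, and the runtime's actual sizing and splitting rules may differ.

```python
def chunk_tokens(tokens, target_chunk_size=128, overlap_size=16):
    """Split tokens into windows of at most target_chunk_size items,
    where consecutive windows share overlap_size items.

    Hypothetical helper for illustration only.
    """
    if target_chunk_size <= 0:
        raise ValueError("target_chunk_size must be positive")
    if not 0 <= overlap_size < target_chunk_size:
        raise ValueError("overlap_size must be in [0, target_chunk_size)")

    # Each new window starts this many tokens after the previous one.
    step = target_chunk_size - overlap_size
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + target_chunk_size])
        # Stop once the current window reaches the end of the input.
        if start + target_chunk_size >= len(tokens):
            break
        start += step
    return chunks
```

With the values from the example above, each chunk would hold up to 128 tokens and repeat the last 16 tokens of the previous chunk, so content that straddles a chunk boundary still appears whole in at least one chunk.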
Dependencies
- DataFusion Table Providers: Upgraded to rev `b0af91992699ecbf5adf2036a07122578f06150e`.
Contributors
- @Sevenannn
- @peasee
- @Jeadie
- @sgrebnov
- @phillipleblanc
- @ewgenius
- @slyons
What's Changed
- Update datafusion table provider patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2817
- refactor: Set max_rows_per_batch for ODBC to 4000 by @peasee in https://github.com/spiceai/spiceai/pull/2822
- Use User message for health check by @Jeadie in https://github.com/spiceai/spiceai/pull/2823
- Upgrade Helm chart (Spice v0.18.2-beta) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2820
- Add verbosity flags for spiced, spice: `-v`, `-vv`, `--verbose`, `--very-verbose`. by @Jeadie in https://github.com/spiceai/spiceai/pull/2831
- Rename `spiceai` data connector to `spice.ai` by @sgrebnov in https://github.com/spiceai/spiceai/pull/2680
- Prepare for v0.19.0-beta release (version bump) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2821
- Bump clap from 4.5.17 to 4.5.18 (#2801) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2848
- Enable "rc" feature for serde in spicepod crate by @ewgenius in https://github.com/spiceai/spiceai/pull/2851
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2852
- chore: update table providers by @peasee in https://github.com/spiceai/spiceai/pull/2858
- fix: Use GitHub search for issues in GraphQL by @peasee in https://github.com/spiceai/spiceai/pull/2845
- fix: Use GitHub search for pull_requests by @peasee in https://github.com/spiceai/spiceai/pull/2847
- Support chunking dataset embeddings by @Jeadie in https://github.com/spiceai/spiceai/pull/2854
- refactor: Update GraphQL client to be more robust for filter push down by @peasee in https://github.com/spiceai/spiceai/pull/2864
- docs: Update accelerator beta criteria by @peasee in https://github.com/spiceai/spiceai/pull/2865
- Change `BytesProcessedRule` to be an optimizer rather than an analyzer rule by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2867
- Don't run E2E or PR tests on documentation by @Jeadie in https://github.com/spiceai/spiceai/pull/2869
- Verify benchmark query results using snapshot testing (spice.ai connector) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2866
- feat: Add GraphQLOptimizer by @peasee in https://github.com/spiceai/spiceai/pull/2868
- Update quickstarts for Endgame by @Jeadie in https://github.com/spiceai/spiceai/pull/2863
- Update version to v0.18.3-beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/2882
- Update DataFusion: fix coalesce, Aggregation with Window functions unparsing support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2884
- Revert "Rename `spiceai` data connector to `spice.ai`" by @sgrebnov in https://github.com/spiceai/spiceai/pull/2881
- Adding integration test for DuckDB read functions by @slyons in https://github.com/spiceai/spiceai/pull/2857
- Show more informative mysql error message by @Sevenannn in https://github.com/spiceai/spiceai/pull/2883
- Fix `no process-level CryptoProvider available` when using REPL and TLS by @sgrebnov in https://github.com/spiceai/spiceai/pull/2887
- Change UX for chunking and enable overlap_size in chunking by @Jeadie in https://github.com/spiceai/spiceai/pull/2890
- Add `log/slog` to spice CLI tool by @Jeadie in https://github.com/spiceai/spiceai/pull/2859
- feat: Add GitHub GraphQLOptimizer by @peasee in https://github.com/spiceai/spiceai/pull/2870
- Fix mysql invalid tablename error message by @Sevenannn in https://github.com/spiceai/spiceai/pull/2896
- fix: Remove login column rename in pulls and update Optimizer by @peasee in https://github.com/spiceai/spiceai/pull/2897
- Fix require check checking. by @Jeadie in https://github.com/spiceai/spiceai/pull/2898
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.18.2-beta...v0.18.3-beta
Published by Jeadie over 1 year ago
https://github.com/spiceai/spiceai - v0.18.2-beta
Spice v0.18.2-beta (Sep 24, 2024)
The v0.18.2-beta release improves the reliability of the sharepoint data connector and spice search functionality.
Contributors
- @Jeadie
- @sgrebnov
New Contributors
- None
What's Changed
- Issue with sharepoint Site by @Jeadie in https://github.com/spiceai/spiceai/pull/2810
- Upgrade Helm chart (Spice v0.18.1-beta) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2812
- Prepare for v0.18.2-beta release by @sgrebnov in https://github.com/spiceai/spiceai/pull/2811
- Fix issues with spice search by @Jeadie in https://github.com/spiceai/spiceai/pull/2814
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.18.1-beta...0.18.2
Published by github-actions[bot] over 1 year ago
https://github.com/spiceai/spiceai - v0.18.1-beta
Spice v0.18.1-beta (Sep 23, 2024)
The v0.18.1-beta release continues to improve runtime performance and reliability. Performance for accelerated queries joining multiple datasets has been significantly improved with join push-down support. GraphQL, MySQL, and SharePoint data connectors have better reliability and error handling, and a new Microsoft SQL Server data connector has been introduced. Task History now has fine-grained configuration, including the ability to disable the feature entirely. A new spice search CLI command has been added, enabling development-time embeddings-based searches across datasets.
Highlights in v0.18.1-beta
Join push-down for accelerations: Queries to the same accelerator will now push-down joins, significantly improving acceleration performance for queries joining multiple tables.
Microsoft SQL Server Data Connector: Use from: mssql: to access and accelerate Microsoft SQL Server datasets.
Example spicepod.yml:
```yaml
datasets:
  - from: mssql:path.to.my_dataset
    name: my_dataset
    params:
      mssql_connection_string: ${secrets:mssql_connection_string}
```
See the Microsoft SQL Server Data Connector documentation.
Task History: Task History can be configured in the spicepod.yml, including whether captured outputs, such as the results of a SQL query, are included or truncated.
Example spicepod.yml:
```yaml
runtime:
  task_history:
    enabled: true
    captured_output: truncated
    retention_period: 8h
    retention_check_interval: 15m
```
See the Task History Spicepod reference for more information on possible values and behaviors.
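To make the two retention settings concrete, the sketch below shows one way a retention check could interpret values such as 8h and 15m. It is an illustration only: the helper names `parse_duration` and `is_expired`, and the set of accepted unit suffixes, are assumptions rather than the runtime's implementation.

```python
from datetime import datetime, timedelta, timezone

# Assumed unit suffixes; the runtime may accept a different set.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text):
    """Parse a simple '<integer><unit>' duration such as '8h' or '15m'."""
    if len(text) < 2 or text[-1] not in _UNITS or not text[:-1].isdigit():
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(**{_UNITS[text[-1]]: int(text[:-1])})


def is_expired(task_started_at, now, retention_period="8h"):
    """Return True when a task record is older than the retention period."""
    return now - task_started_at > parse_duration(retention_period)


# Example: a record from 9 hours ago falls outside an 8h retention period.
now = datetime(2024, 9, 23, 12, 0, tzinfo=timezone.utc)
old_record = now - timedelta(hours=9)
recent_record = now - timedelta(hours=1)
```

Under this reading, retention_check_interval would govern how often a check like `is_expired` runs, while retention_period decides which records it removes.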
Search CLI Command: Use the spice search CLI command to perform embeddings-based searches across search-configured datasets. Note: Search requires the ai feature to be installed.
Refresh on File Changes: File Data Connector data refreshes can be configured to be triggered when the source file is modified through a file system watcher. Enable the watcher by adding file_watcher: enabled to the acceleration parameters.
Example spicepod.yml:
```yaml
datasets:
  - from: file://path/to/my_file.csv
    name: my_file
    acceleration:
      enabled: true
      refresh_mode: full
      params:
        file_watcher: enabled
```
Breaking Changes
The Query History table runtime.query_history has been deprecated and removed in favor of the Task History table runtime.task_history. The Task History table tracks tasks across all features such as SQL query, vector search, and AI completion in a unified table.
See the Task History documentation.
Dependencies
- DataFusion: Upgraded from v41 to v42.
- Apache Arrow: Upgraded from v52 to v53.
- DuckDB: Upgraded from v1.0 to v1.1.
Contributors
- @phillipleblanc
- @Jeadie
- @lukekim
- @sgrebnov
- @peasee
- @Sevenannn
- @ewgenius
- @slyons
New Contributors
- @slyons made their first contribution in https://github.com/spiceai/spiceai/pull/2724
What's Changed
- Update Helm Chart for 0.18.0-beta release by @sgrebnov in https://github.com/spiceai/spiceai/pull/2711
- Use a single instance for all DuckDB accelerated datasets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2669
- Dependabot upgrades by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2715
- Use a single instance for all SQLite accelerated datasets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2720
- Prepare for v0.18.1-beta release by @sgrebnov in https://github.com/spiceai/spiceai/pull/2692
- For GraphQL, remove necessity of `json_pointer` and improve error messaging. by @Jeadie in https://github.com/spiceai/spiceai/pull/2713
- Postgres accelerator benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2652
- Trace query result while running benchmark tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2684
- Early check EmbeddingConnector if embedding models do not exist by @Jeadie in https://github.com/spiceai/spiceai/pull/2717
- Move table creation for spice_sys_dataset_checkpoint to DatasetCheckpoint::try_new by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2732
- Don't load tools immediately by @Jeadie in https://github.com/spiceai/spiceai/pull/2731
- Renable accelerator federation on trunk by @Sevenannn in https://github.com/spiceai/spiceai/pull/2725
- Fixing Data Connectors link in README.md by @slyons in https://github.com/spiceai/spiceai/pull/2724
- Enable rehydration tests for DuckDB by @sgrebnov in https://github.com/spiceai/spiceai/pull/2729
- Check pageInfo is correct at initialisation of GraphQL connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2730
- Microsoft SQL Server data connector initial support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2741
- Add `spice search` CLI command by @lukekim in https://github.com/spiceai/spiceai/pull/2739
- Update threat model by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2738
- Upgrade to Arrow 53, DataFusion 42 and DuckDB 1.1 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2744
- Update datafusion table provider patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2747
- feat: Add enabled config option for task_history by @peasee in https://github.com/spiceai/spiceai/pull/2758
- Remove v0.18.0-beta from the Roadmap by @sgrebnov in https://github.com/spiceai/spiceai/pull/2748
- Fix spark-connect to use native roots for TLS again by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2766
- Fix benchmark test - Install default crypto provider by @Sevenannn in https://github.com/spiceai/spiceai/pull/2752
- Resolve primary keys for datasets with catalog or schema by @Jeadie in https://github.com/spiceai/spiceai/pull/2749
- MSSQL: include table name in schema retrieval error by @sgrebnov in https://github.com/spiceai/spiceai/pull/2746
- File Format parsing for Document tables, support for docx + pdf by @Jeadie in https://github.com/spiceai/spiceai/pull/2740
- Add Document parsing to Sharepoint connector. by @Jeadie in https://github.com/spiceai/spiceai/pull/2760
- Execution plan with BinaryExpr predicates pushdown support for MS SQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2768
- Update datafusion patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2772
- Support for standalone config parameters for MS SQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2773
- Utilize DataConnectorError for MySQL Data Connector Errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/2759
- Add Score to search results by @lukekim in https://github.com/spiceai/spiceai/pull/2774
- Don't call GetComponentStatuses when --metrics not enabled by @Jeadie in https://github.com/spiceai/spiceai/pull/2779
- Implement better error handling for spicepods by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2767
- Make integration tests more robust by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2782
- Query results streaming support for MS SQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2781
- Update benchmark snapshots by @Sevenannn in https://github.com/spiceai/spiceai/pull/2778
- For Sharepoint connector, if client_secret and auth_code are both provided, default to auth_code by @Jeadie in https://github.com/spiceai/spiceai/pull/2780
- Add modified pk/indexes scenario to rehydration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2743
- Run benchmarks on Wed, Fri, Sat, and Sun. by @lukekim in https://github.com/spiceai/spiceai/pull/2786
- Update PULL_REQUEST_TEMPLATE.md to include a section for Documentation by @slyons in https://github.com/spiceai/spiceai/pull/2785
- Add E2E test for MS SQL data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2788
- More types support for MS SQL data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2789
- feat: Add captured_output option for task_history by @peasee in https://github.com/spiceai/spiceai/pull/2783
- Add ability to refresh when file data connector detects changes by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2787
- Propagate MySQL invalid table name error by @Sevenannn in https://github.com/spiceai/spiceai/pull/2776
- feat: Add retention options for task_history config by @peasee in https://github.com/spiceai/spiceai/pull/2784
- fix: Move task history check after query history creation by @peasee in https://github.com/spiceai/spiceai/pull/2793
- MS SQL connector should ignore all unsupported types by @sgrebnov in https://github.com/spiceai/spiceai/pull/2795
- Improve Sharepoint DX by @Jeadie in https://github.com/spiceai/spiceai/pull/2791
- Replace query history with task history by @peasee in https://github.com/spiceai/spiceai/pull/2792
- Fix datasets_health_monitor spice.runtime.task_history not found warning by @sgrebnov in https://github.com/spiceai/spiceai/pull/2805
- Upgrade macOS x86_64 test runner to macOS 13.6.9 Ventura by @sgrebnov in https://github.com/spiceai/spiceai/pull/2803
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2808
- Add mssql to the list of supported data connectors by @sgrebnov in https://github.com/spiceai/spiceai/pull/28
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.18.0-beta...v0.18.1-beta
Published by sgrebnov over 1 year ago
https://github.com/spiceai/spiceai - v0.18.0-beta
Spice v0.18.0-beta (Sep 16, 2024)
The v0.18.0-beta release adds new Sharepoint and File data connectors, introduces AWS Identity and Access Management (IAM) support for the S3 Data Connector, improves performance of the GitHub connector, and increases the overall reliability of all data accelerators. The /ready API endpoint was enhanced to report as ready only when all components, including loaded data, have successfully reported readiness.
Highlights in v0.18.0-beta
Sharepoint Data Connector: Use from: sharepoint: to access and accelerate documents stored in Microsoft 365 OneDrive for Business (Sharepoint).
Example spicepod.yml:
```yaml
datasets:
  - from: sharepoint:drive:Documents/path:/important_documents/
    name: important_documents
    params:
      sharepoint_client_id: ${secrets:SPICE_SHAREPOINT_CLIENT_ID}
      sharepoint_tenant_id: ${secrets:SPICE_SHAREPOINT_TENANT_ID}
      sharepoint_client_secret: ${secrets:SPICE_SHAREPOINT_CLIENT_SECRET}
```
See the Sharepoint Data Connector documentation.
AWS Identity and Access Management (IAM) for S3: A new s3_auth parameter for the s3 data connector configures the authentication method used when connecting to S3. Supported values are public, key, and iam_role. Use s3_auth: iam_role to assume the instance IAM role.
Example spicepod.yml:
```yaml
datasets:
  - from: s3://my-bucket
    name: bucket
    params:
      s3_auth: iam_role # Assume IAM role of instance
```
See the S3 Data Connector documentation.
File Data Connector: Use from: file: to query files stored on locally accessible filesystems.
Example spicepod.yml:
```yaml
datasets:
  - from: file://path/to/customer.parquet
    name: customer
    params:
      file_format: parquet
```
See the File Data Connector documentation.
Improved /ready API: The /ready API now includes the initial data load for accelerated datasets in addition to component readiness, so that readiness is reported only when data has loaded and can be successfully queried.
Breaking Changes
GitHub Data Connector: The data type for time-related columns has changed from `Utf8` to `Timestamp`. To upgrade, change data type references to timestamp. For example, if using `time_format:`, change uses of `time_format: ISO8601` to `time_format: timestamp`.
Ready API: The `/ready` API reports ready only when all components have reported ready and data is fully loaded. To upgrade, evaluate uses of the Ready API (such as Kubernetes readiness probes) and consider how it might affect system behavior.
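When evaluating how the stricter readiness behavior affects startup scripts or probes, a polling loop like the sketch below can help measure how long the initial data load delays readiness. It is a minimal sketch under stated assumptions: the URL passed to `http_ready` is a placeholder for wherever the runtime's ready endpoint is exposed, the endpoint is assumed to answer with HTTP 200 only when ready, and both function names are hypothetical.

```python
import time
import urllib.error
import urllib.request


def http_ready(url, timeout=2.0):
    """Return True if a GET to url answers with HTTP 200 (assumed to mean ready)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and non-2xx responses all count as not ready.
        return False


def wait_until_ready(check, timeout_s=60.0, interval_s=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller might run `wait_until_ready(lambda: http_ready("http://localhost:8090/v1/ready"))`, where the host, port, and path are assumptions to adjust for the deployment. Datasets with a large initial load may need a longer timeout than before this change.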
Dependencies
No major dependency updates.
Contributors
- @phillipleblanc
- @Jeadie
- @lukekim
- @sgrebnov
- @peasee
- @eltociear
- @Sevenannn
- @ewgenius
- @karifabri
New Contributors
- @karifabri made their first contribution in https://github.com/spiceai/spiceai/pull/2601
What's Changed
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2585
- Set helm to v0.17.4-beta by @ewgenius in https://github.com/spiceai/spiceai/pull/2595
- Bump to next v0.18.0-beta version by @ewgenius in https://github.com/spiceai/spiceai/pull/2596
- Add snapshot test docs / Update beta criteria for data accelerators by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2594
- Enable federation for accelerated queries (sqlite, duckdb, postgres) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2598
- spelling updates on v0.17.4 release notes by @karifabri in https://github.com/spiceai/spiceai/pull/2601
- Update endgame template by @ewgenius in https://github.com/spiceai/spiceai/pull/2591
- fix: Re-attach DuckDB attachments on each query by @peasee in https://github.com/spiceai/spiceai/pull/2602
- Speed up sqlite accelerator benchmark test with indexes by @Sevenannn in https://github.com/spiceai/spiceai/pull/2597
- Fix refresh API using `refresh_mode: append` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2609
- Tweak `/ready` to only report ready when components have all reported Ready by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2600
- Add `s3_auth` parameter to configure IAM role authentication by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2611
- Bump fundu from 2.0.0 to 2.0.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2576
- fix: Remove comments from SQL files by @peasee in https://github.com/spiceai/spiceai/pull/2627
- Utilize runtime.status().is_ready() to check acceleration dataset readiness in benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2614
- Allow for prefix to be kept in internal Parameters by @Jeadie in https://github.com/spiceai/spiceai/pull/2603
- Bump itertools from 0.12.1 to 0.13.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2572
- Bump golang.org/x/mod from 0.20.0 to 0.21.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2571
- Add initial threat model using OWASP Threat Dragon by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2599
- fix: Explicitly error for duplicate duckdb file accelerators by @peasee in https://github.com/spiceai/spiceai/pull/2628
- Benchmark test binary can parse command line option by @Sevenannn in https://github.com/spiceai/spiceai/pull/2626
- Snapshot tests shouldn't crash the Spice benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2613
- Bump anyhow from 1.0.86 to 1.0.87 by @dependabot in https://github.com/spiceai/spiceai/pull/2573
- Upgrade datafusion to improve SQLite subquery tables aliasing support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2634
- Run benchmark separately using workflow by @Sevenannn in https://github.com/spiceai/spiceai/pull/2631
- Sharepoint UX changes by @Jeadie in https://github.com/spiceai/spiceai/pull/2633
- Improve `/ready` to only mark a dataset ready iff the initial refresh completed by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2630
- Support relative paths for file connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2637
- Fix `error decoding response body` GitHub file connector bug by @sgrebnov in https://github.com/spiceai/spiceai/pull/2645
- GraphQL pagination and robustness. by @Jeadie in https://github.com/spiceai/spiceai/pull/2632
- docs: Update bug template by @peasee in https://github.com/spiceai/spiceai/pull/2629
- Define GitHub `issues` data connector schema upfront by @sgrebnov in https://github.com/spiceai/spiceai/pull/2646
- Add support for loading from Sharepoint Group's default drive. by @Jeadie in https://github.com/spiceai/spiceai/pull/2642
- Fix typo in workflow, fix the postgres connector container readiness check by @Sevenannn in https://github.com/spiceai/spiceai/pull/2654
- Fix check all features by @Sevenannn in https://github.com/spiceai/spiceai/pull/2653
- Enable Warn/Error traces from dependency components by @sgrebnov in https://github.com/spiceai/spiceai/pull/2655
- Use lower case iso8601 for time_column by @Sevenannn in https://github.com/spiceai/spiceai/pull/2551
- Add basic integration test for Spice spill-to-disk and re-hydration scenario by @sgrebnov in https://github.com/spiceai/spiceai/pull/2643
- Add 'RefreshOverrides::max_jitter' to 'POST /v1/datasets/:name/acceleration/refresh' by @Jeadie in https://github.com/spiceai/spiceai/pull/2641
- Bump rustls-pemfile from 1.0.4 to 2.1.3 by @dependabot in https://github.com/spiceai/spiceai/pull/2575
- Update dependencies to support querying postgres enum types by @Sevenannn in https://github.com/spiceai/spiceai/pull/2657
- Upgrade table-providers by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2659
- Improve `spill_to_disk_and_rehydration` integration test by @sgrebnov in https://github.com/spiceai/spiceai/pull/2658
- Enhance GitHub connector robustness with explicit table schema definitions by @sgrebnov in https://github.com/spiceai/spiceai/pull/2661
- Rename sharepoint fields by @Jeadie in https://github.com/spiceai/spiceai/pull/2668
- Disable dataset checkpoint for DuckDB acceleration by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2676
- Revert "Enable federation for accelerated queries (sqlite, duckdb, postgres) (#2598)" by @Sevenannn in https://github.com/spiceai/spiceai/pull/2683
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.4-beta...v0.18.0-beta
Published by sgrebnov over 1 year ago
https://github.com/spiceai/spiceai - v0.17.4-beta.1
This is the release candidate 0.17.4-beta.1
Published by phillipleblanc over 1 year ago
https://github.com/spiceai/spiceai - v0.17.3-beta
Spice v0.17.3-beta (Sep 2, 2024)
The v0.17.3-beta release further improves data accelerator robustness and adds a new github data connector that makes accelerating GitHub Issues, Pull Requests, Commits, and Blobs easy.
Highlights in v0.17.3-beta
Improved benchmarking, testing, and robustness of data accelerators: Continued improvements to benchmarking and testing of data accelerators, leading to more robust and reliable data accelerators.
GitHub Connector (alpha): Connect to GitHub and accelerate Issues, Pull Requests, Commits, and Blobs.
```yaml
datasets:
  # Fetch all rust and golang files from spiceai/spiceai
  - from: github:github.com/spiceai/spiceai/files/trunk
    name: spiceai.files
    params:
      include: '**/*.rs; **/*.go'
      github_token: ${secrets:GITHUB_TOKEN}

  # Fetch all issues from spiceai/spiceai. Similar for pull requests, commits, and more.
  - from: github:github.com/spiceai/spiceai/issues
    name: spiceai.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
```
Breaking Changes
None.
Contributors
- @phillipleblanc
- @Jeadie
- @peasee
- @sgrebnov
- @Sevenannn
- @lukekim
- @dependabot
- @ewgenius
What's Changed
Dependencies
- `delta_kernel` from 0.2.0 to 0.3.0.
Commits
- Prepare version for v0.17.3-beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2388
- Add a basic Github Connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2365
- task: Re-enable federation by @peasee in https://github.com/spiceai/spiceai/pull/2389
- fix: Implement custom PartialEq for Dataset by @peasee in https://github.com/spiceai/spiceai/pull/2390
- GitHub Data Connector `files` support (basic fields) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2393
- Add a `--force` flag to `spice install` to force it to install the latest released version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2395
- Improve experience of using `spice chat` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2396
- Fix view loading on startup by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2398
- Add `include` param support to GitHub Data Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2397
- Postgres integration test to cover on-conflict behavior by @Sevenannn in https://github.com/spiceai/spiceai/pull/2359
- Create dependabot.yml by @lukekim in https://github.com/spiceai/spiceai/pull/2399
- Add `content` column to GitHub Connector when dataset is accelerated by @sgrebnov in https://github.com/spiceai/spiceai/pull/2400
- Fix dependabot indentation by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2402
- Bump docker/setup-buildx-action from 1 to 3 by @dependabot in https://github.com/spiceai/spiceai/pull/2403
- Bump github/codeql-action from 2 to 3 by @dependabot in https://github.com/spiceai/spiceai/pull/2404
- Bump docker/login-action from 1 to 3 by @dependabot in https://github.com/spiceai/spiceai/pull/2405
- Bump yogevbd/enforce-label-action from 2.1.0 to 2.2.2 by @dependabot in https://github.com/spiceai/spiceai/pull/2406
- Bump actions/checkout from 3 to 4 by @dependabot in https://github.com/spiceai/spiceai/pull/2407
- Bump go.uber.org/zap from 1.21.0 to 1.27.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2408
- Bump github.com/prometheus/client_model from 0.6.0 to 0.6.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2409
- Bump github.com/spf13/cobra from 1.6.0 to 1.8.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2412
- Bump chrono-tz from 0.8.6 to 0.9.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2413
- Bump tokio from 1.39.2 to 1.39.3 by @dependabot in https://github.com/spiceai/spiceai/pull/2414
- Bump tokenizers from 0.19.1 to 0.20.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2415
- Bump serde from 1.0.207 to 1.0.209 by @dependabot in https://github.com/spiceai/spiceai/pull/2416
- Bump gopkg.in/natefinch/lumberjack.v2 from 2.0.0 to 2.2.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2410
- Bump ndarray from 0.15.6 to 0.16.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2417
- Bump golang.org/x/mod from 0.14.0 to 0.20.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2411
- Add correct labels to dependabot.yml by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2418
- Fix build break by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2430
- Dependabot updates by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2431
- Bump github.com/stretchr/testify from 1.8.1 to 1.9.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2422
- Preserve timezone information in constructing expr by @Sevenannn in https://github.com/spiceai/spiceai/pull/2392
- Bump github.com/spf13/viper from 1.12.0 to 1.19.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2420
- Fix repeated base table data in acceleration with embeddings by @Sevenannn in https://github.com/spiceai/spiceai/pull/2401
- Fix tool calling with Groq (and potentially other tool-enabled models) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2435
- Remove candle from crates/llms/src/chat/ by @Jeadie in https://github.com/spiceai/spiceai/pull/2439
- fix: Only attach successfully initialized accelerators by @peasee in https://github.com/spiceai/spiceai/pull/2433
- Support overriding OpenAI default values in a model param; add token usage telemetry to task_history. by @Jeadie in https://github.com/spiceai/spiceai/pull/2434
- Enable message chains and tool calls for local LLMs by @Jeadie in https://github.com/spiceai/spiceai/pull/2180
- DuckDB on-conflict integration test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2437
- Fix MySQL E2E tests and include MySQL acceleration testing by @sgrebnov in https://github.com/spiceai/spiceai/pull/2441
- Use rtcontext for proper cloud/local context in spice chat by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2442
- Fix MySQL connector to respect the source column's decimal precision by @sgrebnov in https://github.com/spiceai/spiceai/pull/2443
- Improve Github Data Connector tables schema by @sgrebnov in https://github.com/spiceai/spiceai/pull/2448
- Improve GitHub Connector error msg when invalid token or permissions by @sgrebnov in https://github.com/spiceai/spiceai/pull/2449
- Proper error tracking across tracing spans by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2454
- task: Disable and update federation by @peasee in https://github.com/spiceai/spiceai/pull/2457
- GitHub connector: convert labels and hashes to primitive arrays by @sgrebnov in https://github.com/spiceai/spiceai/pull/2452
- Bump datafusion version to the latest by @sgrebnov in https://github.com/spiceai/spiceai/pull/2456
- Trim trailing / for S3 data connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2458
- Add accelerated_refresh to task_history table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2459
- Add assignees and labels fields to github issues and github pulls datasets by @ewgenius in https://github.com/spiceai/spiceai/pull/2467
- Native clickhouse schema inference by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2466
- List GitHub connector in readme by @ewgenius in https://github.com/spiceai/spiceai/pull/2468
- Fix LLMs health check; Add updatedAt field to GitHub connector by @ewgenius in https://github.com/spiceai/spiceai/pull/2474
- Remove non existing updated_at from github.pulls dataset by @ewgenius in https://github.com/spiceai/spiceai/pull/2475
- GitHub connector: add pulls labels and rm duplicate milestoneId and milestoneTitle for issues by @sgrebnov in https://github.com/spiceai/spiceai/pull/2477
- Bump delta_kernel from 0.2.0 to 0.3.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2472
- Add back GitHub connector Pull Request updated_at by @lukekim in https://github.com/spiceai/spiceai/pull/2479
- Update ROADMAP Sep 2, 2024. by @lukekim in https://github.com/spiceai/spiceai/pull/2478
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.2-beta...v0.17.3-beta
- Rust
Published by Jeadie over 1 year ago
https://github.com/spiceai/spiceai - v0.17.2-beta.1
This is the release candidate 0.17.2-beta.1
- Rust
Published by phillipleblanc over 1 year ago
https://github.com/spiceai/spiceai - v0.17.2-beta
Spice v0.17.2-beta (Aug 26, 2024)
The v0.17.2-beta release focuses on improving data accelerator compatibility, stability, and performance. Expanded data type support for DuckDB, SQLite, and PostgreSQL data accelerators (and data connectors) enables significantly more data types to be accelerated. Error handling and logging have also been improved, and several bugs have been fixed.
Highlights in v0.17.2-beta
Expanded Data Type Support for Data Accelerators: DuckDB, SQLite, and PostgreSQL Data Accelerators now support a wider range of data types, enabling acceleration of more diverse datasets.
Enhanced Error Handling and Logging: Improvements have been made to aid in troubleshooting and debugging.
Anonymous Usage Telemetry: Optional, anonymous, aggregated telemetry has been added to help improve Spice. This feature can be disabled. For details about collected data, see the telemetry documentation.
To opt out of telemetry:
- Using the CLI flag:
```bash
spice run -- --telemetry-enabled false
```
- Add configuration to spicepod.yaml:
```yaml
runtime:
  telemetry:
    enabled: false
```
Improved Benchmarking: A suite of performance benchmarking tests has been added to the project to help maintain and improve runtime performance, a top priority for the project.
Breaking Changes
None.
Contributors
- @Jeadie
- @y-f-u
- @phillipleblanc
- @sgrebnov
- @Sevenannn
- @peasee
- @ewgenius
What's Changed
Dependencies
- DataFusion: Upgraded from v40 to v41
Commits
- Pin actions/upload-artifact to v4.3.4 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2200
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2202
- Update to next release version, v0.17.2-beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2203
- add accelerator beta criteria by @y-f-u in https://github.com/spiceai/spiceai/pull/2201
- update helm chart to 0.17.1-beta by @Sevenannn in https://github.com/spiceai/spiceai/pull/2205
- add dockerignore to avoid copy target and test folder by @y-f-u in https://github.com/spiceai/spiceai/pull/2206
- add client timeout for deltalake connector by @y-f-u in https://github.com/spiceai/spiceai/pull/2208
- Upgrade tonic and opentelemetry-proto by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2223
- Add index and resource tuning for postgres ghcr image to support postgres benchmark in sf1 by @Sevenannn in https://github.com/spiceai/spiceai/pull/2196
- Remove embedding columns from retrieved_primary_keys in v1/search by @Jeadie in https://github.com/spiceai/spiceai/pull/2176
- use file as dbpathparam as the param prefix is trimmed by @y-f-u in https://github.com/spiceai/spiceai/pull/2230
- use file for sqlite db path param by @y-f-u in https://github.com/spiceai/spiceai/pull/2231
- docs: Clarify the global requirement for local_infile when loading TPCH by @peasee in https://github.com/spiceai/spiceai/pull/2228
- Revert pinning actions/upload-artifact@v4 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2232
- Runtime tools to chat models by @Jeadie in https://github.com/spiceai/spiceai/pull/2207
- Create runtime.task_history table for queries, and embeddings by @Jeadie in https://github.com/spiceai/spiceai/pull/2191
- chore: Update Databricks ODBC Bench to use TPCH SF1 by @peasee in https://github.com/spiceai/spiceai/pull/2238
- Replace metrics-rs with OpenTelemetry Metrics by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2240
- fix: Remove dead code by @peasee in https://github.com/spiceai/spiceai/pull/2249
- Improve tool quality and add vector search tool by @Jeadie in https://github.com/spiceai/spiceai/pull/2250
- fix missing partition cols in delta lake by @y-f-u in https://github.com/spiceai/spiceai/pull/2253
- download file from remote for delta testing by @y-f-u in https://github.com/spiceai/spiceai/pull/2254
- feat: Set SQLite DB path to .spice/data by @peasee in https://github.com/spiceai/spiceai/pull/2242
- Support tools for chat completions in streaming mode by @ewgenius in https://github.com/spiceai/spiceai/pull/2255
- Load component description field from spicepod.yaml and include in LLM context by @ewgenius in https://github.com/spiceai/spiceai/pull/2261
- Add parameter for connection_pool_size in the Postgres Data Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2251
- Add primary keys to response of DocumentSimilarityTool by @Jeadie in https://github.com/spiceai/spiceai/pull/2263
- run queries bash script by @y-f-u in https://github.com/spiceai/spiceai/pull/2262
- Run benchmark test on schedule by @Sevenannn in https://github.com/spiceai/spiceai/pull/2277
- feat: Add a reference to originating App for a Dataset by @peasee in https://github.com/spiceai/spiceai/pull/2283
- Tool use & telemetry productionisation. by @Jeadie in https://github.com/spiceai/spiceai/pull/2286
- Fix cron in benchmarks.yml by @Sevenannn in https://github.com/spiceai/spiceai/pull/2288
- Upgrade to DataFusion v41 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2290
- Chat completions adjustments and fixes by @ewgenius in https://github.com/spiceai/spiceai/pull/2292
- Define the new metrics Arrow schema based on Open Telemetry by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2295
- OpenTelemetry Metrics Arrow exporter to runtime.metrics table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2296
- Calculate summary metrics from histograms for Prometheus endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2302
- Add back Spice DF runtime_env during SessionContext construction by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2304
- Add integration test for S3 data connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2305
- Fix secrets.inject_secrets when secret not found. by @Jeadie in https://github.com/spiceai/spiceai/pull/2306
- Intra-table federation query on duckdb accelerated table by @y-f-u in https://github.com/spiceai/spiceai/pull/2299
- Postgres federation on acceleration by @y-f-u in https://github.com/spiceai/spiceai/pull/2309
- sqlite intra table federation on acceleration by @y-f-u in https://github.com/spiceai/spiceai/pull/2308
- feat: Add DataAccelerator::init() for SQLite acceleration federation by @peasee in https://github.com/spiceai/spiceai/pull/2293
- Initial framework for collecting anonymous usage telemetry by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2310
- Add gRPC action to trigger accelerated dataset refresh by @sgrebnov in https://github.com/spiceai/spiceai/pull/2316
- add disable_query_push_down option to acceleration settings by @y-f-u in https://github.com/spiceai/spiceai/pull/2327
- Remove v1/assist by @Jeadie in https://github.com/spiceai/spiceai/pull/2312
- bump table provider version to set the correct dialect for postgres writer by @y-f-u in https://github.com/spiceai/spiceai/pull/2329
- Send telemetry on startup by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2331
- Calculate resource IDs for telemetry by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2332
- Refactor v1/search: include WHERE condition, allow extra columns in projection. by @Jeadie in https://github.com/spiceai/spiceai/pull/2328
- Add integration test for gRPC dataset refresh action by @sgrebnov in https://github.com/spiceai/spiceai/pull/2330
- Propagate errors through all task_history nested spans by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2337
- Improve tools by @Jeadie in https://github.com/spiceai/spiceai/pull/2338
- update duckdb rs version to support more types: interval/duration/etc by @y-f-u in https://github.com/spiceai/spiceai/pull/2336
- feat: Add DuckDB accelerator init, attach databases for federation by @peasee in https://github.com/spiceai/spiceai/pull/2335
- Add query telemetry metrics by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2333
- Add system prompts for LLMs; system prompts for tool using models. by @Jeadie in https://github.com/spiceai/spiceai/pull/2342
- Fix benchmark test to keep running when there's failed queries by @Sevenannn in https://github.com/spiceai/spiceai/pull/2347
- Tools as a spicepod first class citizen. by @Jeadie in https://github.com/spiceai/spiceai/pull/2344
- Add bytes_processed telemetry metric by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2343
- fix misaligned columns from delta lake by @y-f-u in https://github.com/spiceai/spiceai/pull/2356
- Emit telemetry metrics to runtime.metrics/Prometheus as well by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2352
- Use UTC timezone for telemetry timestamps by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2354
- Fix MetricType deserialization by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2358
- Add dataset details to tool using LLMs; early check tables in vector search by @Jeadie in https://github.com/spiceai/spiceai/pull/2353
- Bump datafusion-federation/datafusion-table-providers dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2360
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2362
- fix: Disable DuckDB and SQLite federation by @peasee in https://github.com/spiceai/spiceai/pull/2371
- Fix system prompt in ToolUsingChat, fix builtin registration by @Jeadie in https://github.com/spiceai/spiceai/pull/2367
- fix: Use --profile release for benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/2372
- nql parameter 'use' -> 'model' by @Jeadie in https://github.com/spiceai/spiceai/pull/2366
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.1-beta...v0.17.2-beta
- Rust
Published by phillipleblanc over 1 year ago
https://github.com/spiceai/spiceai - v0.17.1-beta
Spice v0.17.1-beta (Aug 5, 2024)
The v0.17.1-beta minor release focuses on enhancing stability, performance, and usability. The Flight interface now supports the GetSchema API, and the s3, ftp, sftp, http, https, and databricks data connectors now support a client_timeout parameter.
Highlights in v0.17.1-beta
Flight API GetSchema: The GetSchema API is now supported by the Flight interface. The schema of a dataset can be retrieved using GetSchema with the PATH or CMD FlightDescriptor types. The CMD FlightDescriptor type is used to get the schema of an arbitrary SQL query as the CMD bytes. The PATH FlightDescriptor type is used to retrieve the schema of a dataset.
Client Timeout: A client_timeout parameter has been added for Data Connectors: ftp, sftp, http, https, and databricks. When defined, the client timeout configures Spice to stop waiting for a response from the data source after the specified duration. The default timeout is 30 seconds.
```yaml
datasets:
  - from: ftp://remote-ftp-server.com/path/to/folder/
    name: my_dataset
    params:
      file_format: csv
      # Example client timeout
      client_timeout: 30s
      ftp_user: my-ftp-user
      ftp_pass: ${secrets:my_ftp_password}
```
Breaking Changes
TLS is now required to be explicitly enabled. Enable TLS on the command line using --tls-enabled true:
```bash
spice run -- --tls-enabled true --tls-certificate-file /path/to/cert.pem --tls-key-file /path/to/key.pem
```
Or in the spicepod.yml with enabled: true:
```yaml
runtime:
  tls:
    # TLS explicitly enabled
    enabled: true
    certificate_file: /path/to/cert.pem
    key_file: /path/to/key.pem
```
Contributors
- @Jeadie
- @y-f-u
- @phillipleblanc
- @sgrebnov
- @peasee
- @Sevenannn
What's Changed
Dependencies
- Rust: Upgraded from v1.79.0 to v1.80.0
Commits
- Update README.md by @Jeadie in https://github.com/spiceai/spiceai/pull/2142
- update helm chart to 0.17.0-beta by @y-f-u in https://github.com/spiceai/spiceai/pull/2144
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2143
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2141
- Update Spice runtime to require explicit enablement for TLS by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2148
- Update next version, ROADMAP, End Game template, move alpha release notes by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2145
- Update EXTENSIBILITY to be correct, update README.md with Beta connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2146
- Add benchmark tests for duckdb acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/2151
- fix: Increase benchmark dataset setup timeout for Databricks by @peasee in https://github.com/spiceai/spiceai/pull/2149
- Add LLMs to v1/models by @Jeadie in https://github.com/spiceai/spiceai/pull/2152
- Dataset with acceleration enabled = false shouldn't go through accelerated dataset hot reload by @Sevenannn in https://github.com/spiceai/spiceai/pull/2155
- Show single error string in Spice SQL REPL command line by @Sevenannn in https://github.com/spiceai/spiceai/pull/2150
- Add CI to build makefile install targets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2157
- Make the FlightClient struct cheap to clone by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2162
- Fix bugs with local Unity Catalog server by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2160
- Benchmark: data connector tests should continue on query error (s3) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2161
- fix hanging spiced when odbc loading data and received a cancel signal by @y-f-u in https://github.com/spiceai/spiceai/pull/2156
- Improve MySql schema extraction and add InList and ScalarFunction expr support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2158
- Fix issue with use of EmbeddingConnector by @Jeadie in https://github.com/spiceai/spiceai/pull/2165
- add client timeout for all object store providers by @y-f-u in https://github.com/spiceai/spiceai/pull/2168
- Benchmark: include sqlite acceleration and enable more tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2172
- feat: Use datafusion SQLite streaming updates by @peasee in https://github.com/spiceai/spiceai/pull/2171
- Benchmark: include arrow acceleration and enable more tests (tpch_q22) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2173
- Localhost -> Sink; Fix Sink connector to not require schema via CREATE TABLE... and infer on first write by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2167
- Fix misspelled acceleration engine name in benchmark tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2175
- update spark bench catalog by @y-f-u in https://github.com/spiceai/spiceai/pull/2178
- Benchmark: Discard first measurement of sql query, disable result caching by @Sevenannn in https://github.com/spiceai/spiceai/pull/2179
- clear message when invalid params configured for accelerator by @y-f-u in https://github.com/spiceai/spiceai/pull/2177
- Implement the Flight GetSchema API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2169
- Support AppendStream for SpiceAI data connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2181
- Support MySQL BINARY, VARBINARY, Postgres BYTEA and improve MySQL auth error message by @sgrebnov in https://github.com/spiceai/spiceai/pull/2184
- Benchmark: use SF1 for MySQL TPC-H tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2183
- fix windows build broken by adding tokio unix signal by @y-f-u in https://github.com/spiceai/spiceai/pull/2193
- Adds TLS support for flightsubscriber/flightpublisher tools by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2194
- Update README output samples by @ewgenius in https://github.com/spiceai/spiceai/pull/2195
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2197
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.0-beta...v0.17.1-beta
- Rust
Published by phillipleblanc over 1 year ago
https://github.com/spiceai/spiceai - v0.17.0-beta
Spice v0.17-beta (July 29, 2024)
Announcing the first beta release of Spice.ai OSS! 🎉
The core Spice runtime has graduated from alpha to beta! Components, such as Data Connectors and Models, follow independent release milestones. Data Connectors graduating from alpha to beta include databricks, spiceai, postgres, s3, odbc, and mysql. From beta to 1.0, the project's focus will be on improving performance and scaling to larger datasets.
This release also includes enhanced security with Transport Layer Security (TLS) secured APIs, a new spice install CLI command, and several performance and stability improvements.
Highlights in v0.17-beta
- Encryption in transit with TLS: The HTTP, gRPC, Metrics, and OpenTelemetry (OTEL) API endpoints can be secured with TLS by specifying a certificate and private key in PEM format.
Enable TLS using the --tls-certificate-file and --tls-key-file command-line flags:
```bash
spice run -- --tls-certificate-file /path/to/cert.pem --tls-key-file /path/to/key.pem
```
Or configure in the spicepod.yml:
```yaml
runtime:
  tls:
    certificate_file: /path/to/cert.pem
    key_file: /path/to/key.pem
```
Get started with TLS by following the TLS Sample. For more details see the TLS Documentation.
- spice install: Running the spice install CLI command will download and install the latest version of the runtime.
```bash
spice install
```
- Improved SQLite and DuckDB compatibility: The SQLite and DuckDB accelerators support more complex queries and additional data types.
- Pass through arguments from spice run to runtime: Arguments passed to spice run are now passed through to the runtime.
- Secrets replacement within connection strings: Secrets are now replaced within connection strings:
```yaml
datasets:
  - from: mysql:my_table
    name: my_table
    params:
      mysql_connection_string: mysql://user:${secrets:mysql_pw}@localhost:3306/db
```
Breaking Changes
The odbc data connector is now optional and has been removed from the released binaries. To use the odbc data connector, use the official Spice Docker image or build the Spice runtime from source.
To build Spice from source with the odbc feature:
```bash
cargo build --release --features odbc
```
To use the official Spice Docker image from DockerHub:
```bash
# Pull the latest official Spice image
docker pull spiceai/spiceai:latest
# Pull the official v0.17-beta Spice image
docker pull spiceai/spiceai:0.17.0-beta
```
Contributors
- @y-f-u
- @peasee
- @digadeesh
- @phillipleblanc
- @ewgenius
- @sgrebnov
- @Sevenannn
- @lukekim
What's Changed
Dependencies
- Upgraded delta-kernel-rs to v0.2.0.
Commits
- update helm chart versions for v0.16.0-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/2057
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2060
- fix: Install unixodbc for E2E test release installation by @peasee in https://github.com/spiceai/spiceai/pull/2063
- update next release to 0.16.1-beta by @digadeesh in https://github.com/spiceai/spiceai/pull/2065
- update version to 0.17.0-beta by @digadeesh in https://github.com/spiceai/spiceai/pull/2068
- Update ROADMAP.md - removing delivered features and updating Beta timeline. by @digadeesh in https://github.com/spiceai/spiceai/pull/2066
- make bench works for more connectors by @y-f-u in https://github.com/spiceai/spiceai/pull/2042
- enable spark benchmark by @y-f-u in https://github.com/spiceai/spiceai/pull/2069
- Make the json_pointer param optional for the GraphQL connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2072
- Fix secrets init to not bail if a secret store can't load by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2073
- Update end_game.md by @ewgenius in https://github.com/spiceai/spiceai/pull/2059
- Fix time predicate with timezone info casting for Dremio by @sgrebnov in https://github.com/spiceai/spiceai/pull/2058
- Add benchmark tests for S3 data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2049
- Add benchmark tests for MySQL data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2048
- fix: Add Athena dialect for ODBC by @peasee in https://github.com/spiceai/spiceai/pull/2084
- Workflow to build MySQL image with TPCH benchmark data by @sgrebnov in https://github.com/spiceai/spiceai/pull/2070
- Fix secrets replacement within connection strings by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2086
- fix: Correctly prefix missing required parameters by @peasee in https://github.com/spiceai/spiceai/pull/2088
- Add Postgres Data Connector TPCH Benchmark Tests by @Sevenannn in https://github.com/spiceai/spiceai/pull/2009
- Add spice install CLI command by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2090
- Use MySQL service container for benchmark tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2089
- Remove ODBC from default released binaries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2092
- Add cfg flag to properly support build w / wo feature in benchmark tests by @Sevenannn in https://github.com/spiceai/spiceai/pull/2095
- Move Prometheus metrics server to runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2093
- fix: Remove unixodbc from test release install by @peasee in https://github.com/spiceai/spiceai/pull/2103
- Upgrade delta_kernel to 0.2.0 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2102
- Allow DuckDB to load extensions in Docker by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2104
- Spawn the metrics server in the background. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2105
- fix: suffix delta kernel table location with slash if none by @y-f-u in https://github.com/spiceai/spiceai/pull/2107
- Bump object_store from 0.10.1 to 0.10.2 by @dependabot in https://github.com/spiceai/spiceai/pull/2094
- Decision Record: Default HTTP and GRPC ports for Spice.ai OSS by @digadeesh in https://github.com/spiceai/spiceai/pull/2091
- Enable TLS for metrics endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2108
- Use Postgres container for tpch bench by @Sevenannn in https://github.com/spiceai/spiceai/pull/2112
- Add workflow to build Postgres Docker image using tpch data by @Sevenannn in https://github.com/spiceai/spiceai/pull/2101
- Enable TLS for HTTP endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2109
- Enable TLS on the Flight GRPC endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2110
- add timeout parameters for object store client options by @y-f-u in https://github.com/spiceai/spiceai/pull/2114
- Enable TLS on the OpenTelemetry GRPC endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2111
- feat: Add ODBC Databricks Benches by @peasee in https://github.com/spiceai/spiceai/pull/2113
- Support configuring TLS in the spicepod by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2118
- add broken tpch simple queries by @y-f-u in https://github.com/spiceai/spiceai/pull/2116
- Add integration test for TLS by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2121
- Improve SQLite and DuckDB compatibility by @sgrebnov in https://github.com/spiceai/spiceai/pull/2122
- Pass through arguments from spice run and spice sql to runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2123
- Handle TLS in the spice CLI when connecting to the runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2124
- Handle connecting over TLS for spice sql by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2125
- Remove --tls flag by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2128
- fix: Handle SQLResult error instead of unwrapping by @peasee in https://github.com/spiceai/spiceai/pull/2127
- Add delta bench by @y-f-u in https://github.com/spiceai/spiceai/pull/2120
- feat: Add Athena ODBC benches by @peasee in https://github.com/spiceai/spiceai/pull/2129
- fix: Use odbc-api fork for decimal conversion fix by @peasee in https://github.com/spiceai/spiceai/pull/2133
- Update benchmarks job env for delta testing by @y-f-u in https://github.com/spiceai/spiceai/pull/2134
- Use forked dotenvy to disable variable substitution by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2135
- Remove unnecessary memory allocations in the query path by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2136
- upgrade spiceai df for tpch simple 6 and 7 by @y-f-u in https://github.com/spiceai/spiceai/pull/2137
- Avoid more unnecessary allocations in the query path by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2138
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.16.0-alpha...v0.17-beta
- Rust
Published by phillipleblanc over 1 year ago
https://github.com/spiceai/spiceai - v0.16.0-alpha
Spice v0.16-alpha (July 22, 2024)
The v0.16-alpha release is the first candidate release for the beta milestone on a path to finalizing the v1.0 developer and user experience. Upgraders should be aware of several breaking changes designed to improve the Secrets configuration experience and to make authoring spicepod.yml files more consistent. See the Breaking Changes section below for details. Additionally, the Spice Java SDK was released, providing Java developers a simple but powerful native experience to query Spice.
Highlights in v0.16-alpha
- Secret Stores: More than one Secret Store can now be specified. For example, to configure Spice with both Environment Variable and AWS Secrets Manager Secret Stores, use the following secrets configuration in spicepod.yaml:
```yaml
secrets:
  - from: env
    name: env
  - from: aws_secrets_manager:my_secret_name
    name: aws_secret
```
Secrets managed by configured Secret Stores can be referenced in component params using the syntax ${<store_name>:<key>}. E.g.
```yaml
datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_pass: ${ env:MY_PG_PASS }
```
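The placeholder syntax can be illustrated with a small resolver. This is a minimal sketch in Python, not the Spice runtime's implementation: the function name resolve_secrets, the regular expression, and the callable-backed stores are all assumptions made for illustration. Only the ${<store_name>:<key>} shape and the store names come from the notes above.

```python
import re
from typing import Callable, Mapping, Optional

# Matches ${store:key}, tolerating spaces inside the braces as in "${ env:MY_PG_PASS }".
PLACEHOLDER = re.compile(
    r"\$\{\s*(?P<store>[A-Za-z0-9_]+)\s*:\s*(?P<key>[^}\s]+)\s*\}"
)

# Hypothetical store interface: a callable that maps a key to a secret (or None).
SecretLookup = Callable[[str], Optional[str]]


def resolve_secrets(value: str, stores: Mapping[str, SecretLookup]) -> str:
    """Replace every ${store:key} placeholder in `value` with the looked-up secret."""

    def substitute(match: "re.Match[str]") -> str:
        lookup = stores.get(match["store"])
        if lookup is None:
            raise KeyError(f"unknown secret store: {match['store']}")
        secret = lookup(match["key"])
        if secret is None:
            raise KeyError(f"secret not found: {match['store']}:{match['key']}")
        return secret

    return PLACEHOLDER.sub(substitute, value)


if __name__ == "__main__":
    # Illustrative in-memory store standing in for the "env" Secret Store.
    stores = {"env": {"MY_PG_PASS": "example-password"}.get}
    print(resolve_secrets("${ env:MY_PG_PASS }", stores))
```

Because the substitution works on the whole string, the same sketch also covers placeholders embedded inside a longer value such as a connection string.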
- Java Client SDK: The Spice Java SDK has been released for JDK 17 or greater.
- Federated SQL Query: Significant stability and reliability improvements have been made to federated SQL query support in most data connectors.
- ODBC Data Connector: Providing a specific SQL dialect to query ODBC data sources is now supported using the sql_dialect param. For example, when querying Databricks using ODBC, the databricks dialect can be specified to ensure compatibility. Read the ODBC Data Connector documentation for more details.
Breaking Changes
- Secret Stores: Secret Stores support has been overhauled, including required changes to the spicepod.yml schema. File-based secrets stored in the ~/.spice/auth file are no longer supported. See Secret Stores Documentation for full reference.
To upgrade Secret Stores, rename any parameters ending in _key to remove the _key suffix and specify a secret inline via the secret replacement syntax (${<secret_store>:<key>}):
```yaml
datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_pass_key: my_pg_pass
```
to:
```yaml
datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_pass: ${secrets:my_pg_pass}
```
And ensure the MY_PG_PASS environment variable is set.
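For spicepods with many datasets, the rename can be scripted. The following is a minimal sketch in Python over an already-parsed params mapping. The helper name migrate_key_params is hypothetical, and it assumes every parameter ending in _key holds a secret name and that the target store is called secrets, as in the example above. Review the output before using it, since a parameter may end in _key for unrelated reasons.

```python
from typing import Any, Dict

KEY_SUFFIX = "_key"


def migrate_key_params(params: Dict[str, Any], store: str = "secrets") -> Dict[str, Any]:
    """Rewrite `<name>_key: <secret>` entries as `<name>: ${<store>:<secret>}`."""
    migrated: Dict[str, Any] = {}
    for name, value in params.items():
        if name.endswith(KEY_SUFFIX):
            # Drop the suffix and wrap the old value in the replacement syntax.
            new_name = name[: -len(KEY_SUFFIX)]
            migrated[new_name] = "${" + store + ":" + str(value) + "}"
        else:
            migrated[name] = value
    return migrated


if __name__ == "__main__":
    old_params = {"pg_host": "localhost", "pg_port": 5432, "pg_pass_key": "my_pg_pass"}
    print(migrate_key_params(old_params))
```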
- Datasets: The default value of time_format has changed from unix_seconds to timestamp.
To upgrade:
```yaml
datasets:
  - from:
    name: my_dataset
    # Explicitly define format when not specified.
    time_format: unix_seconds
```
- HTTP Port: The default HTTP port has changed from port 3000 to port 8090 to avoid conflicting with frontend apps which typically use the 3000 range. If an SDK is used, upgrade it at the same time as the runtime.
To upgrade and continue using port 3000, run spiced with the --http command line argument:
```shell
# Using Dockerfile or spiced directly
spiced --http 127.0.0.1:3000
```
- HTTP Metrics Port: The default HTTP Metrics port has changed from port 9000 to 9090 to avoid conflicting with other metrics protocols, which typically use port 9000.
To upgrade and continue using port 9000, run spiced with the --metrics command line argument:
```shell
# Using Dockerfile or spiced directly
spiced --metrics 127.0.0.1:9000
```
- GraphQL Data Connector: json_path has been replaced with json_pointer to access nested data from the result of the GraphQL query. See the GraphQL Data Connector documentation for full details and RFC-6901 - JSON Pointer.
To upgrade, change:
yaml
json_path: my.json.path
To:
yaml
json_pointer: /my/json/pointer
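To make the difference from a dotted path concrete, the following is a minimal Python sketch of how an RFC 6901 JSON Pointer is resolved against parsed JSON. It illustrates the standard only; the response shape is hypothetical and nothing here reflects how the Spice runtime implements json_pointer internally.

```python
def resolve_json_pointer(document, pointer: str):
    """Resolve an RFC 6901 JSON Pointer against parsed JSON data."""
    if pointer == "":
        return document  # The empty pointer refers to the whole document.
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for raw_token in pointer[1:].split("/"):
        # Unescape in the order required by RFC 6901: "~1" first, then "~0".
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]  # Array elements are addressed by index.
        else:
            current = current[token]
    return current


# Hypothetical GraphQL response shape, for illustration only.
response = {"data": {"users": {"nodes": [{"id": 1}, {"id": 2}]}}}
nodes = resolve_json_pointer(response, "/data/users/nodes")
print(nodes)
```

Each segment is separated by `/` rather than `.`, so a dotted path such as my.json.path becomes /my/json/path.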
- Data Connector Configuration: Consistent connector name prefixing has been applied to connector specific params parameters. Prefixed parameter names help ensure parameters do not collide.
For example, the Databricks data connector specific params are now prefixed with databricks_. A configuration that previously looked like this:
yaml
datasets:
  - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks unity catalog
    name: my_delta_lake_table
    params:
      mode: spark_connect
      endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com
      token: MY_TOKEN
To upgrade:
yaml
datasets:
  # Example for Spark Connect
  - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks unity catalog
    name: my_delta_lake_table
    params:
      mode: spark_connect
      databricks_endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com # Now prefixed with databricks
      databricks_token: ${secrets:my_token} # Now prefixed with databricks
Refer to the Data Connector documentation for parameter naming changes in this release.
- Clickhouse Data Connector: The clickhouse_connection_timeout parameter has been renamed to connection_timeout as it applies to the client and is not Clickhouse configuration itself.
To upgrade, change:
yaml
clickhouse_connection_timeout: time
To:
yaml
connection_timeout: time
Contributors
- @y-f-u
- @phillipleblanc
- @ewgenius
- @github-actions
- @sgrebnov
- @lukekim
- @digadeesh
- @peasee
- @Sevenannn
What's Changed
Dependencies
No major dependency updates.
Commits
- bump helm chart versions to 0.15.2-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/1975
- Remove unused Cargo.toml fields by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1981
- Update version to 0.16.0-beta by @ewgenius in https://github.com/spiceai/spiceai/pull/1983
- Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/1984
- Enable sqlite acceleration testing in E2E by @sgrebnov in https://github.com/spiceai/spiceai/pull/1980
- Revert "Revert "fix: validate time column and time format when constructing accelerated table refresh"" by @y-f-u in https://github.com/spiceai/spiceai/pull/1982
- Add Datadog dashboard skeleton by @sgrebnov in https://github.com/spiceai/spiceai/pull/1971
- Format Cargo.toml with taplo by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1988
- Spice cli spice chat command, to interact with deployed spiced instance in spice.ai cloud by @ewgenius in https://github.com/spiceai/spiceai/pull/1990
- Use platform api /v1/chat/completions with streaming in spice chat cli command by @ewgenius in https://github.com/spiceai/spiceai/pull/1998
- update spiceai datafusion version to fix tpch queries by @y-f-u in https://github.com/spiceai/spiceai/pull/2001
- Install a rustls default CryptoProvider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2003
- Roadmap update July, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/2002
- Add local spice runtime support for spice chat command, add --model flag by @ewgenius in https://github.com/spiceai/spiceai/pull/2007
- fix: GraphQL Data Connector - Change json path to json pointer by @digadeesh in https://github.com/spiceai/spiceai/pull/1930
- Update ROADMAP.md to include MySQL data connector in Beta by @digadeesh in https://github.com/spiceai/spiceai/pull/2016
- Load secrets from multiple secret stores & secrets UX refresh by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2011
- upgrade spiceai datafusion to fix tpch simple query 3 by @y-f-u in https://github.com/spiceai/spiceai/pull/2021
- feat: Autodetect ODBC dialect by @peasee in https://github.com/spiceai/spiceai/pull/1997
- feat: Use CustomDialectBuilder for Databricks ODBC dialect by @peasee in https://github.com/spiceai/spiceai/pull/2020
- Switch the secret replacement syntax to ${ <secret>:<key> } by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2026
- fix spiceai connector lengthy error by @y-f-u in https://github.com/spiceai/spiceai/pull/2024
- Log parameter key instead of value when injecting secret by @Sevenannn in https://github.com/spiceai/spiceai/pull/2031
- Update benchmark yml to support postgres benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2032
- Separate data connector parameters into connector and runtime categories by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2028
- Fix spice chat prompt and spinner by @ewgenius in https://github.com/spiceai/spiceai/pull/2029
- Build spiced with odbc for release binaries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2036
- MySQL timestamp, int64 casting, date part extraction and intervals support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2035
- updating default http and metrics ports by @digadeesh in https://github.com/spiceai/spiceai/pull/2034
- enable spark connect federated query by @y-f-u in https://github.com/spiceai/spiceai/pull/2041
- fix: Use MySQL Interval for Databricks ODBC by @peasee in https://github.com/spiceai/spiceai/pull/2037
- Re-enable test_quickstart_dremio E2E test by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2045
- Fix ODBC build for release binaries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2046
- chore: Remove unused dependencies by @peasee in https://github.com/spiceai/spiceai/pull/2044
- fix: Change version to alpha breaking by @peasee in https://github.com/spiceai/spiceai/pull/2051
- Add connector prefix for dataset configure endpoint param by @sgrebnov in https://github.com/spiceai/spiceai/pull/2052
- Fix unprefixed runtime parameters by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2050
- Fix make install-with-models by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2054
- Bump openssl from 0.10.64 to 0.10.66 by @dependabot in https://github.com/spiceai/spiceai/pull/2047
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2056
- ignore empty constraints when creating accelerated table by @y-f-u in https://github.com/spiceai/spiceai/pull/2055
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.15.2-alpha...v0.16.0-alpha
Published by digadeesh over 1 year ago
https://github.com/spiceai/spiceai - v0.15.2-alpha
Spice v0.15.2-alpha (July 15, 2024)
The v0.15.2-alpha minor release focuses on enhancing stability and performance, and introduces Catalog Providers for streamlined access to Data Catalog tables. Unity Catalog, Databricks Unity Catalog, and the Spice.ai Cloud Platform Catalog are supported in v0.15.2-alpha. The reliability of federated query push-down has also been improved for the MySQL, PostgreSQL, ODBC, S3, Databricks, and Spice.ai Cloud Platform data connectors.
Highlights in v0.15.2-alpha
Catalog Providers: Catalog Providers streamline access to Data Catalog tables. Initial catalog providers supported are Databricks Unity Catalog, Unity Catalog and Spice.ai Cloud Platform Catalog.
For example, to configure Spice to connect to tpch tables in the Spice.ai Cloud Platform Catalog, use the new catalogs: section in the spicepod.yml:
yaml
catalogs:
  - name: spiceai
    from: spiceai
    include:
      - tpch.*
```bash
sql> show tables
+---------------+--------------+---------------+------------+
| table_catalog | table_schema | table_name    | table_type |
+---------------+--------------+---------------+------------+
| spiceai       | tpch         | region        | BASE TABLE |
| spiceai       | tpch         | part          | BASE TABLE |
| spiceai       | tpch         | customer      | BASE TABLE |
| spiceai       | tpch         | lineitem      | BASE TABLE |
| spiceai       | tpch         | partsupp      | BASE TABLE |
| spiceai       | tpch         | supplier      | BASE TABLE |
| spiceai       | tpch         | nation        | BASE TABLE |
| spiceai       | tpch         | orders        | BASE TABLE |
| spice         | runtime      | query_history | BASE TABLE |
+---------------+--------------+---------------+------------+

Time: 0.001866958 seconds. 9 rows.
```
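The include entry acts as a filter over the tables a catalog exposes. The following is a rough Python sketch of that filtering, assuming shell-style glob matching against schema.table names; the exact matching rules used by the runtime are not stated in these notes, `included_tables` is a hypothetical helper, and eth.blocks is a made-up table added for contrast.

```python
from fnmatch import fnmatchcase


def included_tables(tables, patterns):
    """Keep the tables whose `schema.table` name matches any include pattern."""
    return [name for name in tables if any(fnmatchcase(name, p) for p in patterns)]


# Hypothetical catalog contents, for illustration only.
available = ["tpch.region", "tpch.orders", "eth.blocks"]
selected = included_tables(available, ["tpch.*"])
print(selected)
```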
ODBC Data Connector Push-Down: The ODBC Data Connector now supports query push-down for joins, improving performance for joined datasets configured with the same odbc_connection_string.
Improved Spicepod Validation: Improved spicepod.yml validation has been added, including warnings when loading resources with duplicate names (datasets, views, models, embeddings).
Breaking Changes
None.
Contributors
- @phillipleblanc
- @peasee
- @y-f-u
- @ewgenius
- @Sevenannn
- @sgrebnov
- @lukekim
What's Changed
Dependencies
- Upgraded Apache DataFusion to v40.0.0.
Commits
- Update to next release version v0.15.2-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1901
- release: Update helm 0.15.1-alpha by @peasee in https://github.com/spiceai/spiceai/pull/1902
- fix: Detect and error on duplicate component names on spiced (re)load by @peasee in https://github.com/spiceai/spiceai/pull/1905
- fix: flaky test - test_refresh_status_change_to_ready by @y-f-u in https://github.com/spiceai/spiceai/pull/1908
- Add support for parsing catalog from Spicepod. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1903
- Add catalog component to Runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1906
- Adds a RuntimeBuilder and make most items on Runtime private by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1913
- Bump zerovec-derive from 0.10.2 to 0.10.3 by @dependabot in https://github.com/spiceai/spiceai/pull/1914
- Add separate tagged image with enabled models feature by @ewgenius in https://github.com/spiceai/spiceai/pull/1909
- Update datafusion-table-providers to use newest head by @Sevenannn in https://github.com/spiceai/spiceai/pull/1927
- Add MySQL support for TPC-H test data generation script by @sgrebnov in https://github.com/spiceai/spiceai/pull/1932
- fix: Expose ODBC task errors if error is before data stream begins by @peasee in https://github.com/spiceai/spiceai/pull/1924
- Use public.ecr.aws/docker/library/{postgres/mysql}:latest for integration test images by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1934
- Implement spice.ai CatalogProvider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1925
- fix: validate time column and time format when constructing accelerated table refresh by @y-f-u in https://github.com/spiceai/spiceai/pull/1926
- Add support for filtering tables included by a catalog by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1933
- Add UnityCatalog catalog provider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1940
- Implement Databricks catalog provider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1941
- Copy params into dataset_params by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1947
- Make integration tests more stable by using logged-in registry during CI by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1955
- Add integration test for Spice.ai catalog provider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1956
- Add GET /v1/catalogs API and catalogs CMD by @lukekim in https://github.com/spiceai/spiceai/pull/1957
- feat: Enable ODBC JoinPushDown with hashed connection string by @peasee in https://github.com/spiceai/spiceai/pull/1954
- Fix bug: arrow acceleration reports zero results during refresh by @sgrebnov in https://github.com/spiceai/spiceai/pull/1962
- Revert "fix: validate time column and time format when constructing accelerated table refresh" by @y-f-u in https://github.com/spiceai/spiceai/pull/1964
- fix: Update arrow-odbc to use our fork for pending fixes by @peasee in https://github.com/spiceai/spiceai/pull/1965
- Upgrade to DataFusion 40 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1963
- Do exchange shouldn't require table to be writable by @Sevenannn in https://github.com/spiceai/spiceai/pull/1958
- Use custom dialect rule for flight federated request by @y-f-u in https://github.com/spiceai/spiceai/pull/1946
- upgrade datafusion federation to have the table rewrite fix for tpch-q9 by @y-f-u in https://github.com/spiceai/spiceai/pull/1970
- Create v0.15.2-alpha.md Release notes by @digadeesh in https://github.com/spiceai/spiceai/pull/1969
- Fix Unity Catalog API response for Azure Databricks by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1973
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1976
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.15.1-alpha...v0.15.2-alpha
Published by digadeesh over 1 year ago
https://github.com/spiceai/spiceai - v0.15.1-alpha
Spice v0.15.1-alpha (July 8, 2024)
The v0.15.1-alpha minor release focuses on enhancing stability, performance, and usability. Memory usage has been significantly improved for the postgres and duckdb acceleration engines, which now use stream processing. A new Delta Lake Data Connector has been added. It shares a delta-kernel-rs based implementation with the Databricks Data Connector and supports deletion vectors.
Highlights
Improved memory usage for PostgreSQL and DuckDB acceleration engines: Large dataset acceleration with PostgreSQL and DuckDB engines has reduced memory consumption by streaming data directly to the accelerated table as it is read from the source.
Delta Lake Data Connector: A new Delta Lake Data Connector has been added for using Delta Lake outside of Databricks.
ODBC Data Connector Streaming: The ODBC Data Connector now streams results, reducing memory usage and improving performance.
GraphQL Object Unnesting: The GraphQL Data Connector can automatically unnest objects from GraphQL queries using the unnest_depth parameter.
Breaking Changes
None.
New Contributors
None.
Contributors
What's Changed
Dependencies
The MySQL, PostgreSQL, SQLite and DuckDB DataFusion TableProviders developed by Spice AI have been donated to the datafusion-contrib/datafusion-table-providers community repository.
Commits
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1842
- Update ROADMAP.md - Remove v0.15.0-alpha roadmap items. by @digadeesh in https://github.com/spiceai/spiceai/pull/1843
- update helm chart for v0.15.0-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/1845
- update cargo.toml and version.txt to 0.15.1-alpha (for next release) by @digadeesh in https://github.com/spiceai/spiceai/pull/1844
- Fix check for outdated Cargo.lock & update Cargo.lock by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1846
- Add Debezium to README by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1847
- use snmalloc as global allocator by @y-f-u in https://github.com/spiceai/spiceai/pull/1848
- Various improvements for mistral.rs by @Jeadie in https://github.com/spiceai/spiceai/pull/1831
- Enable streaming for accelerated tables refresh (common logic) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1863
- Use in-memory DB pool for DuckDB functions by @Jeadie in https://github.com/spiceai/spiceai/pull/1849
- Generate Spicepod JSON Schema by @ewgenius in https://github.com/spiceai/spiceai/pull/1865
- Update http param names by @Jeadie in https://github.com/spiceai/spiceai/pull/1872
- Replace DuckDB, PostgreSQL, Sqlite and MySQL providers with the datafusion-table-providers crate by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1873
- Remove more dead code moved to datafusion-table-providers by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1874
- feat: Optimize ODBC for streaming results by @peasee in https://github.com/spiceai/spiceai/pull/1862
- Fix how models uses secrets by @Jeadie in https://github.com/spiceai/spiceai/pull/1875
- fix: Add support for varying duplicate columns behavior in GraphQL unnesting by @peasee in https://github.com/spiceai/spiceai/pull/1876
- fix: Remove GraphQL duplicate rename support by @peasee in https://github.com/spiceai/spiceai/pull/1877
- fix: Remove Overwrite GraphQL duplicates behavior by @peasee in https://github.com/spiceai/spiceai/pull/1882
- fix: Use tokio mpsc channels for ODBC streaming by @peasee in https://github.com/spiceai/spiceai/pull/1883
- Upgrade table providers to enable DuckDB streaming write by @sgrebnov in https://github.com/spiceai/spiceai/pull/1884
- Update ROADMAP.md - Add debezium (alpha) to connector list. by @digadeesh in https://github.com/spiceai/spiceai/pull/1880
- Allow defining user for mysql data connector via secrets by @sgrebnov in https://github.com/spiceai/spiceai/pull/1886
- Replace delta-rs with delta-kernel-rs and add new delta data connector. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1878
- Update README images by @lukekim in https://github.com/spiceai/spiceai/pull/1890
- Handle deletion vectors for delta tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1891
- Rename delta to delta_lake by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1892
- Add where is the AI to the FAQ. by @lukekim in https://github.com/spiceai/spiceai/pull/1885
- update df table providers rev version by @y-f-u in https://github.com/spiceai/spiceai/pull/1889
- Enable other cloud providers for delta_lake integration by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1893
- Add CLI parameters for logging into Databricks with Azure/GCP cloud storage by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1894
- Bump zerovec from 0.10.2 to 0.10.4 by @dependabot in https://github.com/spiceai/spiceai/pull/1896
- Add 'Content-Type' to metrics exporter to be prometheus exposition format compliant by @sgrebnov in https://github.com/spiceai/spiceai/pull/1897
- Update enforce-labels.yml so it accepts depdenabot updates with kind/… by @digadeesh in https://github.com/spiceai/spiceai/pull/1898
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.15.0-alpha...v0.15.1-alpha
Published by digadeesh over 1 year ago
https://github.com/spiceai/spiceai - v0.15.0-alpha
Spice v0.15-alpha (July 1, 2024)
The v0.15-alpha release introduces support for streaming database changes with Change Data Capture (CDC) into accelerated tables via a new Debezium connector, configurable retry logic for data refresh, and a new C# SDK to build with Spice in Dotnet.
Highlights
Debezium data connector with Change Data Capture (CDC): Sync accelerated datasets with Debezium data sources over Kafka in real-time.
Data Refresh Retries: By default, accelerated datasets attempt to retry data refreshes on transient errors. This behavior can be configured using refresh_retry_enabled and refresh_retry_max_attempts.
C# Client SDK: A new C# Client SDK has been released for developing applications in Dotnet.
Debezium data connector with Change Data Capture (CDC)
Integrating Debezium CDC is straightforward. Get started with the Debezium CDC Sample, read more about CDC in Spice, and read the Debezium data connector documentation.
Example Spicepod using Debezium CDC:
yaml
datasets:
  - from: debezium:cdc.public.customer_addresses
    name: customer_addresses_cdc
    params:
      debezium_transport: kafka
      debezium_message_format: json
      kafka_bootstrap_servers: localhost:19092
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      refresh_mode: changes
Data Refresh Retries
Example Spicepod configuration limiting refresh retries to a maximum of 10 attempts:
yaml
datasets:
  - from: eth.blocks
    name: blocks
    acceleration:
      refresh_retry_enabled: true
      refresh_retry_max_attempts: 10
      refresh_check_interval: 30s
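As a rough illustration of bounded retries, the following Python sketch retries a refresh on transient errors and gives up after a maximum number of retries. It is not the runtime's implementation: the Fibonacci-style delays are borrowed from a commit title in the v0.14.1-alpha changelog, and the base delay, the `is_transient` check, and all function names are assumptions made for this example.

```python
import time


def fibonacci_delays(max_attempts: int, base_seconds: float = 1.0):
    """Yield one delay per retry attempt, growing along the Fibonacci sequence."""
    previous, current = 1, 1
    for _ in range(max_attempts):
        yield previous * base_seconds
        previous, current = current, previous + current


def refresh_with_retries(refresh, max_attempts: int, is_transient, sleep=time.sleep):
    """Call `refresh`, retrying up to `max_attempts` times on transient errors."""
    delays = fibonacci_delays(max_attempts)
    while True:
        try:
            return refresh()
        except Exception as error:
            delay = next(delays, None)
            if delay is None or not is_transient(error):
                raise  # Out of retries, or the error is not worth retrying.
            sleep(delay)


# Usage: a refresh that fails twice with a transient error, then succeeds.
calls = []
slept = []


def flaky_refresh():
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("transient failure")
    return "refreshed"


result = refresh_with_retries(
    flaky_refresh, 10, lambda e: isinstance(e, ConnectionError), sleep=slept.append
)
print(result, slept)
```

Passing `sleep` as a parameter keeps the example testable without real waiting.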
Breaking Changes
None.
New Contributors
- @rupurt made their first contribution in https://github.com/spiceai/spiceai/pull/1791
Contributors
What's Changed
Dependencies
No major dependency updates.
Commits
- Update version to 0.15.0-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1784
- Update helm for v0.14.1-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1786
- Run PR checks on PRs merging into feature--branches by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1788
- Enable retries for accelerated table refresh by @sgrebnov in https://github.com/spiceai/spiceai/pull/1762
- enable more tpch benchmark queries as a result of decimal unparsing by @y-f-u in https://github.com/spiceai/spiceai/pull/1790
- add nix flake by @rupurt in https://github.com/spiceai/spiceai/pull/1791
- Support local and HF embedding models by @Jeadie in https://github.com/spiceai/spiceai/pull/1789
- fix(bin/spice): Implement custom Unmarshaller for DatasetOrReference by @peasee in https://github.com/spiceai/spiceai/pull/1787
- For windows, move symlink -> symlink_file. by @Jeadie in https://github.com/spiceai/spiceai/pull/1793
- docs: Add PULL_REQUEST_TEMPLATE.md by @peasee in https://github.com/spiceai/spiceai/pull/1794
- Fix Unsupported DataType: conversion for time predicates by @sgrebnov in https://github.com/spiceai/spiceai/pull/1795
- Use incremental backoff for initial dataset registration retries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1805
- Basic HTTP/S connector by @Jeadie in https://github.com/spiceai/spiceai/pull/1792
- Scale support for Snowflake fixed-point numbers by @sgrebnov in https://github.com/spiceai/spiceai/pull/1804
- bump datafusion federation to resolve the join query failures by @y-f-u in https://github.com/spiceai/spiceai/pull/1806
- fix: Stream PostgreSQL data in by @peasee in https://github.com/spiceai/spiceai/pull/1798
- Remove clippy::module_name_repetitions lint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1812
- Improve Snowflake fixed-point numbers casting by @sgrebnov in https://github.com/spiceai/spiceai/pull/1809
- Case insensitive secret getter by @ewgenius in https://github.com/spiceai/spiceai/pull/1813
- refactor: Format TOML with Taplo by @peasee in https://github.com/spiceai/spiceai/pull/1808
- feat: Update PR template, add label enforcement in PR by @peasee in https://github.com/spiceai/spiceai/pull/1815
- fix bug that append may miss updates when the incremental changes are not able to be contained in one record batch by @y-f-u in https://github.com/spiceai/spiceai/pull/1817
- add integration test for inner join across federated table and accelerated table by @y-f-u in https://github.com/spiceai/spiceai/pull/1811
- Unify spicepod.llms into spicepod.models and refactor UX of spicepod.models by @Jeadie in https://github.com/spiceai/spiceai/pull/1818
- Fix issue with querying accelerated tables where the dataset name has a schema by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1823
- Fix schema support for refresh_sql and improve e2e tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/1826
- feat: Add GraphQL unnesting by @peasee in https://github.com/spiceai/spiceai/pull/1822
- fix: Allow kind/optimization labels, increase Postgres test timeout by @peasee in https://github.com/spiceai/spiceai/pull/1830
- Implement Real-time acceleration updates via Debezium CDC by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1832
- Remove println statement from PG Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/1835
- Don't try to "hot reload" Debezium accelerated datasets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1837
- Create v1/search that performs vector search. by @Jeadie in https://github.com/spiceai/spiceai/pull/1836
- Align spicepod UX of embeddings with models by @Jeadie in https://github.com/spiceai/spiceai/pull/1829
- Add "cmake-build" feature to rdkafka for windows by @Jeadie in https://github.com/spiceai/spiceai/pull/1840
- Add a better error message when trying to configure refresh_mode=changes on a data connector that doesn't support it. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1839
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.14.1-alpha...v0.15.0-alpha
Published by digadeesh over 1 year ago
https://github.com/spiceai/spiceai - v0.14.1-alpha
Spice v0.14.1-alpha (Jun 24, 2024)
The v0.14.1-alpha release is focused on quality, stability, and type support with improvements in PostgreSQL, DuckDB, and GraphQL data connectors.
Highlights
- PostgreSQL acceleration and data connector: Support for Composite Types and UUID data types.
- DuckDB acceleration and data connector: Support for LargeUTF8 and DuckDB functions.
- GraphQL data connector: Improved error handling on invalid query syntax.
- Refresh SQL: Improved stability when overwriting STRUCT data types.
Breaking Changes
None.
New Contributors
- @phungleson made their first contribution in https://github.com/spiceai/spiceai/pull/1750
- @peasee made their first contribution in https://github.com/spiceai/spiceai/pull/1769
Contributors
- @lukekim
- @y-f-u
- @ewgenius
- @phillipleblanc
- @Jeadie
- @sgrebnov
- @gloomweaver
- @phungleson
- @peasee
- @digadeesh
What's Changed
Dependencies
No major dependency updates.
Commits
- Update Helm to v0.14.0-alpha by @sgrebnov in https://github.com/spiceai/spiceai/pull/1720
- Update version to 0.14.1-alpha by @sgrebnov in https://github.com/spiceai/spiceai/pull/1721
- Use spiceai/async-openai to solve Deserialize issue in v1/embed by @Jeadie in https://github.com/spiceai/spiceai/pull/1707
- Add greatest least user defined functions by @y-f-u in https://github.com/spiceai/spiceai/pull/1722
- default timeunit to be seconds when time column is a numeric column by @y-f-u in https://github.com/spiceai/spiceai/pull/1727
- use system conf to construct dns resolver by @y-f-u in https://github.com/spiceai/spiceai/pull/1728
- fix a bug that dataset refresh api does not work for table with schema by @y-f-u in https://github.com/spiceai/spiceai/pull/1729
- Move secret crate to runtime module by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1723
- Return schema in get_flight_info_simple by @gloomweaver in https://github.com/spiceai/spiceai/pull/1724
- Refactor vector search component of v1/assist into a VectorSearch struct by @Jeadie in https://github.com/spiceai/spiceai/pull/1699
- Update ROADMAP.md. Fix a broken link for the "Get in touch" link. by @digadeesh in https://github.com/spiceai/spiceai/pull/1725
- Secret keys in params should be case insensitive by @ewgenius in https://github.com/spiceai/spiceai/pull/1737
- expose error log when refresh encountered some issue, also add more debug logs by @y-f-u in https://github.com/spiceai/spiceai/pull/1739
- Support Struct in PostgreSQL accelerator by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1733
- rewrite refresh append update dedup logic using arrow comparators by @y-f-u in https://github.com/spiceai/spiceai/pull/1743
- Add health checks when loading {llms, embeddings} by @Jeadie in https://github.com/spiceai/spiceai/pull/1738
- Support DuckDB function in DuckDB datasets by @Jeadie in https://github.com/spiceai/spiceai/pull/1742
- Update version of spiceai/duckdb-rs, support LargeUTF8 by @Jeadie in https://github.com/spiceai/spiceai/pull/1746
- Split refresh into coordination and execution layers by @sgrebnov in https://github.com/spiceai/spiceai/pull/1744
- bump duckdb rs git sha to resolve duckdb incorrect null value issue by @y-f-u in https://github.com/spiceai/spiceai/pull/1747
- cargo.lock file update with #1747 duckdb-rs sha by @y-f-u in https://github.com/spiceai/spiceai/pull/1748
- Fix error when GraphQL error locations is missing by @phungleson in https://github.com/spiceai/spiceai/pull/1750
- Tweak refresh scheduling logic by @sgrebnov in https://github.com/spiceai/spiceai/pull/1749
- Ensure tonic package is in duckdb feature by @Jeadie in https://github.com/spiceai/spiceai/pull/1756
- Change tonic::async_trait -> async_trait::async_trait by @Jeadie in https://github.com/spiceai/spiceai/pull/1757
- Streaming in v1/chat/completion by @Jeadie in https://github.com/spiceai/spiceai/pull/1741
- Add refresh_retry_enabled/max_attempts acceleration params by @sgrebnov in https://github.com/spiceai/spiceai/pull/1753
- Implement refresh retry based on fibonacci backoff (not enabled) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1752
- Add VSCode debug target to debug runtime benchmark test by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1760
- update spiceai datafusion to include more unparser rules by @y-f-u in https://github.com/spiceai/spiceai/pull/1764
- Show UUID types as String instead of base64 binary. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1767
- docs: Add linux contributor guide for setup by @peasee in https://github.com/spiceai/spiceai/pull/1769
- Do not expose connection url on object store error by @ewgenius in https://github.com/spiceai/spiceai/pull/1761
- Support secrets in llm and embeddings params by @ewgenius in https://github.com/spiceai/spiceai/pull/1770
- Bump github.com/hashicorp/go-retryablehttp from 0.7.1 to 0.7.7 by @dependabot in https://github.com/spiceai/spiceai/pull/1775
- Update ROADMAP.md with latest roadmap changes for v0.15.0 by @digadeesh in https://github.com/spiceai/spiceai/pull/1773
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1776
- Strip kwarg '=' in DuckDB function parsing by @Jeadie in https://github.com/spiceai/spiceai/pull/1777
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.14.0-alpha...v0.14.1-alpha
Published by digadeesh over 1 year ago
https://github.com/spiceai/spiceai - v0.14.0-alpha
Spice v0.14-alpha (June 17, 2024)
The v0.14-alpha release focuses on enhancing accelerated dataset performance and data integrity, with support for configuring primary keys and indexes. Additionally, the GraphQL data connector has been introduced, along with improved dataset registration and loading error information.
Highlights
Accelerated Datasets: Ensure data integrity using primary key and unique index constraints. Configure conflict handling to either upsert new data or drop it. Create indexes on frequently filtered columns for faster queries on larger datasets.
GraphQL Data Connector: Initial support for using GraphQL as a data source.
Example Spicepod showing how to use primary keys and indexes with accelerated datasets:
yaml
datasets:
  - from: eth.blocks
    name: blocks
    acceleration:
      engine: duckdb # Use DuckDB acceleration engine
      primary_key: '(hash, timestamp)'
      indexes:
        number: enabled # same as `CREATE INDEX ON blocks (number);`
        '(number, hash)': unique # same as `CREATE UNIQUE INDEX ON blocks (number, hash);`
      on_conflict:
        '(hash, number)': drop # possible values: drop (default), upsert
        '(hash, timestamp)': upsert
Primary Keys, constraints, and indexes are currently supported when using SQLite, DuckDB, and PostgreSQL acceleration engines.
Learn more with the indexing quickstart and the primary key sample.
Read the Local Acceleration documentation.
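The difference between the drop and upsert conflict behaviors can be seen with a small standalone example. The following Python sketch uses the standard library's sqlite3 module directly, not Spice, and assumes a SQLite build with UPSERT support (3.24 or later). The single-column key and table layout are simplified assumptions, and how the runtime translates on_conflict for each acceleration engine is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blocks (hash TEXT, number INTEGER, PRIMARY KEY (hash))")
conn.execute("INSERT INTO blocks VALUES ('0xabc', 1)")

# drop: the conflicting incoming row is discarded and the existing row is kept.
conn.execute("INSERT INTO blocks VALUES ('0xabc', 2) ON CONFLICT (hash) DO NOTHING")
after_drop = conn.execute("SELECT number FROM blocks WHERE hash = '0xabc'").fetchone()[0]

# upsert: the conflicting incoming row overwrites the existing row.
conn.execute(
    "INSERT INTO blocks VALUES ('0xabc', 3) "
    "ON CONFLICT (hash) DO UPDATE SET number = excluded.number"
)
after_upsert = conn.execute("SELECT number FROM blocks WHERE hash = '0xabc'").fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM blocks").fetchone()[0]
print(after_drop, after_upsert, row_count)
```

In both cases the table still holds a single row for the key; only the surviving values differ.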
Breaking Changes
None.
Contributors
- @phillipleblanc
- @ewgenius
- @sgrebnov
- @Jeadie
- @digadeesh
- @gloomweaver
- @y-f-u
- @lukekim
- @edmondop
What's Changed
Dependencies
- Apache DataFusion: Upgraded from 38.0.0 to 39.0.0
- Apache Arrow/Parquet: Upgraded from 51.0.0 to 52.0.0
- Rust: Upgraded from 1.78.0 to 1.79.0
Commits
- Update Helm chart for v0.13.3-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1671
- Bump version to v0.14.0-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1673
- Dependency upgrades: DataFusion 39, Arrow/Parquet 52, object_store 0.10.1, arrow-odbc 11.1.0 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1674
- Generate unique runtime instance name and store in runtime.metrics table by @ewgenius in https://github.com/spiceai/spiceai/pull/1678
- Proper support for Snowflake TIMESTAMP_NTZ by @sgrebnov in https://github.com/spiceai/spiceai/pull/1677
- Enable tpch_q2 and tpch_q21 in the benchmark queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1679
- Start runtime metrics recorder after loading secrets and extensions by @ewgenius in https://github.com/spiceai/spiceai/pull/1680
- Validate table constraints (Primary Keys/Unique Index) on accelerated tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1658
- Store labels as JSON string in `runtime.metrics` by @ewgenius in https://github.com/spiceai/spiceai/pull/1681
- Atomic updates for DuckDB tables with constraints by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1682
- Rename metrics column `labels` to `properties` and make it nullable by @ewgenius in https://github.com/spiceai/spiceai/pull/1686
- Fix federationoptimizerrule schema error for `tpch_q7`, `tpch_q8`, `tpch_q9`, `tpch_q14` by @sgrebnov in https://github.com/spiceai/spiceai/pull/1683
- Better prompt for /v1/assist by @Jeadie in https://github.com/spiceai/spiceai/pull/1685
- Support stream in `v1/assist` by @Jeadie in https://github.com/spiceai/spiceai/pull/1653
- Fix cache hit rate chart loading for Grafana v9.5 by @sgrebnov in https://github.com/spiceai/spiceai/pull/1691
- Update ROADMAP.md to include data connector statuses by @digadeesh in https://github.com/spiceai/spiceai/pull/1684
- Support `primary_key` in Spicepod and create in accelerated table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1687
- Datasets with schema support for availability monitoring by @sgrebnov in https://github.com/spiceai/spiceai/pull/1690
- Improve dataset registration output by @sgrebnov in https://github.com/spiceai/spiceai/pull/1692
- Readme: update dataset registration traces by @sgrebnov in https://github.com/spiceai/spiceai/pull/1694
- Improved error logging for datasets load error by @edmondop in https://github.com/spiceai/spiceai/pull/1695
- Improve `ArrayDistance` scalar UDF by @Jeadie in https://github.com/spiceai/spiceai/pull/1697
- Implement `on_conflict` behavior for accelerated tables with constraints by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1688
- Fix datasets live update (Spice file watcher) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1702
- Grafana Dashboard: replace Quantile with Percentile filter by @sgrebnov in https://github.com/spiceai/spiceai/pull/1703
- refresh with append overlap by @y-f-u in https://github.com/spiceai/spiceai/pull/1706
- Fix error message on DuckDB constraint violation by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1709
- Add warning when configuring indexes/primarykey/onconflict for Arrow engine. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1710
- ensure schema to be existing when query timestamp during refresh by @y-f-u in https://github.com/spiceai/spiceai/pull/1711
- Improve README clarity and add comparison table by @lukekim in https://github.com/spiceai/spiceai/pull/1713
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1716
- Update README.md to include GraphQL data connector in supported table by @digadeesh in https://github.com/spiceai/spiceai/pull/1717
- Fix quoting issue for databricks identifier by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1718
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.3-alpha...v0.14.0-alpha
Published by github-actions[bot] over 1 year ago
https://github.com/spiceai/spiceai - v0.13.3-alpha
Spice v0.13.3-alpha (June 10, 2024)
The v0.13.3-alpha release is focused on quality and stability with improvements to metrics, telemetry, and operability.
Highlights
Ready API: Adds the /v1/ready API, which returns success once all datasets and models are loaded and ready.
Enhanced Grafana dashboard: The dashboard now includes charts for query duration and failures, the last update time of accelerated datasets, the count of refresh errors, and the last time the runtime successfully accessed federated datasets.
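A readiness endpoint like this is typically polled by deployment scripts before traffic is sent. The following is a minimal Python sketch of such a poller, assuming only what the release notes state (the endpoint returns HTTP 200 once everything has loaded). The helper names, retry parameters, and the host and port in the usage note are assumptions for illustration, not part of Spice.

```python
import time
import urllib.error
import urllib.request


def http_probe(url):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered with a non-2xx status
    except (urllib.error.URLError, OSError):
        return None      # the runtime is not reachable yet


def wait_until_ready(url, attempts=30, delay=1.0, probe=http_probe, sleep=time.sleep):
    """Poll `url` until it returns 200; return True if ready, False on timeout."""
    for _ in range(attempts):
        if probe(url) == 200:
            return True
        sleep(delay)
    return False
```

For example, `wait_until_ready("http://localhost:3000/v1/ready")` would block for up to about 30 seconds; the address is a placeholder and should be replaced with wherever the runtime's HTTP endpoint is actually exposed.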
Contributors
- @Jeadie
- @ewgenius
- @phillipleblanc
- @sgrebnov
- @gloomweaver
- @y-f-u
- @mach-kernel
What's Changed
Dependencies
- DuckDB 1.0.0: Upgrades embedded DuckDB to 1.0.0.
Commits
- Scalar UDF `array_distance` as euclidean distance between Float32[] by @Jeadie in https://github.com/spiceai/spiceai/pull/1601
- Update version to v0.14.0-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1614
- Update helm for v0.13.2-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1618
- Upgrade duckdb-rs to DuckDB 1.0.0 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1615
- initial idea for 'POST v1/assist' by @Jeadie in https://github.com/spiceai/spiceai/pull/1585
- openai server trait and move HTTP endpoints to `crates/runtime/src/http/v1/` by @Jeadie in https://github.com/spiceai/spiceai/pull/1619
- Add branching policy & updated endgame instructions by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1620
- Update Cargo.lock & add CI check for updated Cargo.lock by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1627
- Add first-class support for views by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1622
- Add `/v1/ready` API that returns 200 when all datasets have loaded by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1629
- Separate NQL logic from LLM Chat messages, and add OpenAI compatibility per LLM trait. by @Jeadie in https://github.com/spiceai/spiceai/pull/1628
- Log queries failing on getflightinfo step (Flight Api) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1626
- Graphql Data Connector by @gloomweaver in https://github.com/spiceai/spiceai/pull/1624
- GraphQL improved Error formatting, proper format request body by @gloomweaver in https://github.com/spiceai/spiceai/pull/1637
- Fix `v1/assist` response and panic bug. Include primary keys in response too by @Jeadie in https://github.com/spiceai/spiceai/pull/1635
- skip integration test if no secret by @y-f-u in https://github.com/spiceai/spiceai/pull/1638
- [append] Refresher::getlatesttimestamp / getdf to add refreshsql predicates to scan by @mach-kernel in https://github.com/spiceai/spiceai/pull/1636
- GraphQL integration test by @gloomweaver in https://github.com/spiceai/spiceai/pull/1600
- Add `err_code` to `query_failures` metric by @sgrebnov in https://github.com/spiceai/spiceai/pull/1639
- use epoch_ms to replace epoch to work with timestamptz by @y-f-u in https://github.com/spiceai/spiceai/pull/1641
- fix the schema mismatch issue on the fallback plan use schema casting by @y-f-u in https://github.com/spiceai/spiceai/pull/1642
- bug report template update by @y-f-u in https://github.com/spiceai/spiceai/pull/1640
- Add query duration, failures and accelerated dataset metrics to Grafana dashboard by @sgrebnov in https://github.com/spiceai/spiceai/pull/1598
- Fix FTP/sftp support for `ObjectStoreMetadataTable` & `ObjectStoreTextTable` by @Jeadie in https://github.com/spiceai/spiceai/pull/1649
- Support accelerated embedding tables in `v1/assist` by @Jeadie in https://github.com/spiceai/spiceai/pull/1648
- GraphQL pagination, limit pushdown and refactor by @gloomweaver in https://github.com/spiceai/spiceai/pull/1643
- Support indexes in accelerated tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1644
- Federated datasets availability monitoring by @sgrebnov in https://github.com/spiceai/spiceai/pull/1650
- Reset federated dataset availability during dataset registration by @sgrebnov in https://github.com/spiceai/spiceai/pull/1661
- Change to v0.13.3-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1666
- Add `Time Since Offline` chart to Grafana dashboard by @sgrebnov in https://github.com/spiceai/spiceai/pull/1664
- readme fix to correct the number of rows for show tables by @y-f-u in https://github.com/spiceai/spiceai/pull/1667
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1668
- Add missing dependency on arrowsqlgen from duckdb data_component by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1669
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.2-alpha...v0.13.3-alpha
Published by phillipleblanc over 1 year ago
https://github.com/spiceai/spiceai - v0.13.2-alpha
Spice v0.13.2-alpha (June 3, 2024)
The v0.13.2-alpha release is focused on quality and stability with improvements to federated query push-down, telemetry, and query history.
Highlights
Filesystem Data Connector: Adds the Filesystem Data Connector for directly using files as data sources.
Federated Query Push-Down: Improved stability and schema compatibility for federated queries.
Enhanced Telemetry: Runtime Metrics now include last update time for accelerated datasets, count of refresh errors, and new metrics for query duration and failures.
Query History: Enabled query history logging for Arrow Flight queries in addition to HTTP queries.
Contributors
- @lukekim
- @y-f-u
- @ewgenius
- @phillipleblanc
- @Jeadie
- @Sevenannn
- @sgrebnov
- @gloomweaver
- @mach-kernel
What's Changed
- Update ROADMAP.md May 27, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1535
- update helm chart version and use v0.13.1-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/1536
- version correction in v0.13.1 release note by @y-f-u in https://github.com/spiceai/spiceai/pull/1538
- update version to v0.14.0-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/1539
- Update `spice_cloud` - connect to cloud api by @ewgenius in https://github.com/spiceai/spiceai/pull/1523
- Update spice_cloud extension params, and remove logging by @ewgenius in https://github.com/spiceai/spiceai/pull/1541
- Update MSRV to 1.78 and remove unused Rust Version parameter in CI by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1540
- Improve `llm` UX in `spicepod.yaml` by @Jeadie in https://github.com/spiceai/spiceai/pull/1545
- Store local runtime metrics in Timestamp with nanoseconds precision and UTC time by @ewgenius in https://github.com/spiceai/spiceai/pull/1548
- Object store metadata Table provider by @Jeadie in https://github.com/spiceai/spiceai/pull/1518
- Remove clickhouse password requirement by @Sevenannn in https://github.com/spiceai/spiceai/pull/1547
- pretty print loaded rows number by @y-f-u in https://github.com/spiceai/spiceai/pull/1553
- Fix UNION ALL federated push down by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1550
- Update mistral, fix bugs and improve local file DX by @Jeadie in https://github.com/spiceai/spiceai/pull/1552
- Cast `runtime.metrics` schema, if remote (spiceai) data connector provided by @ewgenius in https://github.com/spiceai/spiceai/pull/1554
- Use proper MySQL dialect during federation push-down by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1555
- parallel load dataset when starting up by @y-f-u in https://github.com/spiceai/spiceai/pull/1551
- fix linter warning on Scanf return value by @y-f-u in https://github.com/spiceai/spiceai/pull/1556
- Update spice cloud connect api endpoint by @ewgenius in https://github.com/spiceai/spiceai/pull/1557
- Create new HTTP endpoint to create embeddings. by @Jeadie in https://github.com/spiceai/spiceai/pull/1558
- Query History support for Flight API by @sgrebnov in https://github.com/spiceai/spiceai/pull/1549
- Don't cache queries for runtime tables by @sgrebnov in https://github.com/spiceai/spiceai/pull/1561
- Fix schema incompatibility on federated push-down queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1560
- move 'embeddings' to top-level concept in spicepod.yaml by @Jeadie in https://github.com/spiceai/spiceai/pull/1564
- `object_store` table provider for UTF8 data formats by @Jeadie in https://github.com/spiceai/spiceai/pull/1562
- Improve connectivity for JDBC clients, like Tableau by @sgrebnov in https://github.com/spiceai/spiceai/pull/1563
- Enable datasets from local filesystem by @Jeadie in https://github.com/spiceai/spiceai/pull/1584
- Adds benchmarking tests for Spice by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1577
- Push down correct timestamp expr to SQLite, add binary type mapping by @mach-kernel in https://github.com/spiceai/spiceai/pull/1566
- Add `query_duration_seconds` and `query_failures` metrics by @sgrebnov in https://github.com/spiceai/spiceai/pull/1575
- Use `/app` as a default workdir in spiceai docker image by @ewgenius in https://github.com/spiceai/spiceai/pull/1586
- Add support for both file:// and file:/ by @Jeadie in https://github.com/spiceai/spiceai/pull/1587
- put loaddatasets as the latest step along with startservers by @y-f-u in https://github.com/spiceai/spiceai/pull/1559
- Embedding columns (from embedding providers) are now run inside datafusion plans. by @Jeadie in https://github.com/spiceai/spiceai/pull/1576
- Support BinaryArray in DuckDB accelerations by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1595
- Add cache header to Flight API and Spice REPL indicator by @sgrebnov in https://github.com/spiceai/spiceai/pull/1591
- Add accelerated datasets refresh metrics by @sgrebnov in https://github.com/spiceai/spiceai/pull/1589
- update the error when starting spice sql with no runtime to be actionable by @digadeesh in https://github.com/spiceai/spiceai/pull/1597
- add odbc integration test by @y-f-u in https://github.com/spiceai/spiceai/pull/1590
- Fix bug in instantiating `EmbeddingConnector` by @Jeadie in https://github.com/spiceai/spiceai/pull/1592
- readme change to reflect new cli output by @y-f-u in https://github.com/spiceai/spiceai/pull/1602
- Update version v0.13.2 by @ewgenius in https://github.com/spiceai/spiceai/pull/1604
- Roadmap changes Jun 3, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1609
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.1-alpha...v0.13.2
Published by ewgenius over 1 year ago
https://github.com/spiceai/spiceai - v0.13.1-alpha
Spice v0.13.1-alpha (May 27, 2024)
The v0.13.1-alpha release of Spice is a minor update focused on stability, quality, and operability. Query result caching provides protection against bursts of queries, and schema support for datasets has been added for logical grouping. An issue where Refresh SQL predicates were not pushed down to underlying data sources has been resolved, along with improved Acceleration Refresh logging.
Highlights in v0.13.1-alpha
Results Caching: Introduced query results caching to handle bursts of requests and support caching of non-accelerated results, such as refresh data returned on zero results. Results caching is enabled by default with a `1s` item time-to-live (TTL).
Query History Logging: Recent queries are now logged in the new `spice.runtime.query_history` dataset with a default retention of 24 hours. Query history is initially enabled for HTTP queries only (not Arrow Flight queries).
Dataset Schemas: Added support for dataset schemas, allowing logical grouping of datasets by separating the schema name from the table name with a `.`. E.g.
```yaml
datasets:
  - from: mysql:app1.identities
    name: app.users

  - from: postgres:app2.purchases
    name: app.purchases
```
In this example, queries against app.users will be federated to app1.identities, and app.purchases will be federated to app2.purchases.
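To show why a short item TTL protects against bursts of identical queries, here is a small Python sketch of a time-to-live results cache. It is an illustration of the general technique only, not Spice's implementation; the `TtlResultsCache` class and its parameters are hypothetical, and only the 1-second default TTL is taken from the notes above.

```python
import time


class TtlResultsCache:
    """Tiny in-memory cache that keeps each query result for `ttl_seconds`."""

    def __init__(self, ttl_seconds=1.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # SQL text -> (expiry time, cached result)

    def get_or_run(self, sql, run_query):
        """Return (result, hit): the cached result if still fresh, else run the query."""
        now = self.clock()
        entry = self._entries.get(sql)
        if entry is not None and entry[0] > now:
            return entry[1], True
        result = run_query(sql)
        self._entries[sql] = (now + self.ttl_seconds, result)
        return result, False
```

Within one TTL window, repeated requests for the same SQL text reach the underlying source only once; after the window expires, the next request refreshes the entry.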
Contributors
@y-f-u @Jeadie @sgrebnov @ewgenius @phillipleblanc @lukekim @gloomweaver @Sevenannn
New in this release
- Add more type support on mysql connector by @y-f-u in https://github.com/spiceai/spiceai/pull/1449
- Add in-memory caching support for Arrow Flight queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1450
- Fix the table reference to use the full table reference, not just the table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1460
- Make `file_format` parameter required for S3/FTP/SFTP connector by @ewgenius in https://github.com/spiceai/spiceai/pull/1455
- Add more verbose logging when acceleration refresh update is finished by @y-f-u in https://github.com/spiceai/spiceai/pull/1453
- Fix snowflake dataset path when using federation query by @y-f-u in https://github.com/spiceai/spiceai/pull/1474
- Update cargo to use spiceai datafusion fork by @y-f-u in https://github.com/spiceai/spiceai/pull/1475
- Enable in-memory results caching by default by @sgrebnov in https://github.com/spiceai/spiceai/pull/1473
- Add basic integration test for MySQL federation by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1477
- Update results_cache config names per final spec by @sgrebnov in https://github.com/spiceai/spiceai/pull/1487
- Add DuckDB quickstart to E2E tests by @lukekim in https://github.com/spiceai/spiceai/pull/1461
- Add X-Cache header for http queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1472
- Add telemetry for in-memory caching by @sgrebnov in https://github.com/spiceai/spiceai/pull/1456
- Pin Git dependencies to a specific commit hash by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1490
- Detect `file_format` from dataset path by @ewgenius in https://github.com/spiceai/spiceai/pull/1489
- Add `file_format` to helm chart sample dataset by @ewgenius in https://github.com/spiceai/spiceai/pull/1493
- Improve duckdb data connector error messages by @Sevenannn in https://github.com/spiceai/spiceai/pull/1486
- Add `file_format` prompt for s3 and ftp datasets in Dataset Configure CLI if no extension detected by @ewgenius in https://github.com/spiceai/spiceai/pull/1494
- Add llms to the spicepod definition and use throughout by @Jeadie in https://github.com/spiceai/spiceai/pull/1447
- Fix duckdb acceleration converting null into default values. by @y-f-u in https://github.com/spiceai/spiceai/pull/1500
- Separate runtime Dataset from spicepod Dataset by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1503
- Duckdb e2e test OSX support by @y-f-u in https://github.com/spiceai/spiceai/pull/1505
- Use TableReference for dataset name by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1506
- Tweak Results Cache naming and output by @lukekim in https://github.com/spiceai/spiceai/pull/1509
- Fix refresh_sql not properly passing down filters by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1510
- Allow datasets to specify a schema by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1507
- Cache invalidation for accelerated tables by @sgrebnov in https://github.com/spiceai/spiceai/pull/1498
- Improve spark data connector error messages by @Sevenannn in https://github.com/spiceai/spiceai/pull/1497
- Parse postgres table schema from prepare statement to support empty tables by @ewgenius in https://github.com/spiceai/spiceai/pull/1445
- Improve clarity of README and add FAQ by @lukekim in https://github.com/spiceai/spiceai/pull/1512
- Use binary data transfer for ftp by @gloomweaver in https://github.com/spiceai/spiceai/pull/1517
- Add support for time64 for SQL insertion statement by @y-f-u in https://github.com/spiceai/spiceai/pull/1519
- Add Spice Extensions PoC by @ewgenius in https://github.com/spiceai/spiceai/pull/1476
- Add results cache metrics, pod and quantile filters to Grafana dashboard by @sgrebnov in https://github.com/spiceai/spiceai/pull/1513
- Add unit tests for results caching utils by @sgrebnov in https://github.com/spiceai/spiceai/pull/1514
- Add E2E tests for results caching by @sgrebnov in https://github.com/spiceai/spiceai/pull/1515
- Pass tablereference full string into sparksession table so it can query across schemas or catalogs by @y-f-u in https://github.com/spiceai/spiceai/pull/1521
- Trace on debug level for tables in `runtime` schema by @ewgenius in https://github.com/spiceai/spiceai/pull/1524
- Update SparkSessionBuilder::remote and update spark fork hash by @Sevenannn in https://github.com/spiceai/spiceai/pull/1495
- Fix federation push-down for datasets with schemas by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1526
- Store history of queries in 'spice.runtime.query_history' by @Jeadie in https://github.com/spiceai/spiceai/pull/1501
- Disable cache for system queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1528
- Register runtime tables with runtime schema by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1532
- Fix acknowledgments workflow to include all cargo features by @Jeadie in https://github.com/spiceai/spiceai/pull/1531
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.0-alpha...v0.13.1-alpha
Published by y-f-u almost 2 years ago
https://github.com/spiceai/spiceai - v0.13.0-alpha
Spice v0.13-alpha (May 20, 2024)
The v0.13.0-alpha release significantly improves federated query performance and efficiency with Query Push-Down. Query push-down allows SQL queries to be directly executed by underlying data sources, such as joining tables using the same data connector. Query push-down is supported for all SQL-based and Arrow Flight data connectors. Additionally, runtime metrics, including query duration, are now collected and accessible in the spice.runtime.metrics table. This release also includes a new FTP/SFTP data connector and improved CSV support for the S3 data connector.
Highlights
Federated Query Push-Down (#1394): All SQL and Arrow Flight data connectors support federated query push-down.
Runtime Metrics (#1361): Runtime metric collection can be enabled using the `--metrics` flag and accessed by the `spice.runtime.metrics` table.
FTP & SFTP data connector (#1355) (#1399): Added support for using FTP and SFTP as data sources.
Improved CSV support (#1411) (#1414): S3/FTP/SFTP data connectors support CSV files with expanded CSV options.
Contributors
- @Jeadie
- @digadeesh
- @ewgenius
- @gloomweaver
- @lukekim
- @phillipleblanc
- @sgrebnov
- @y-f-u
What's Changed
- Remove milestones from Enhancement template by @lukekim in https://github.com/spiceai/spiceai/pull/1373
- Update version.txt and Cargo.toml to 0.13.0-alpha by @sgrebnov in https://github.com/spiceai/spiceai/pull/1375
- Helm chart for Spice v0.12.2-alpha by @sgrebnov in https://github.com/spiceai/spiceai/pull/1374
- Add `release` cargo feature to docker builds by @ewgenius in https://github.com/spiceai/spiceai/pull/1377
- FTP connector by @gloomweaver in https://github.com/spiceai/spiceai/pull/1355
- Provide ability to specify timeout for s3 data connector by @gloomweaver in https://github.com/spiceai/spiceai/pull/1378
- clickhouse-rs use tag instead of branch by @gloomweaver in https://github.com/spiceai/spiceai/pull/1313
- Store runtime metrics in `spice.runtime.metrics` table by @ewgenius in https://github.com/spiceai/spiceai/pull/1361
- Update bug_report.md to include the kind/bug label by @digadeesh in https://github.com/spiceai/spiceai/pull/1381
- Remove redundant [refresh] in log by @lukekim in https://github.com/spiceai/spiceai/pull/1384
- Implement federation for DuckDB Data Connector (POC) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1380
- Update wording for spice cloud connection by @ewgenius in https://github.com/spiceai/spiceai/pull/1386
- fix dataset refreshing status by @y-f-u in https://github.com/spiceai/spiceai/pull/1387
- clickhouse friendly error by @y-f-u in https://github.com/spiceai/spiceai/pull/1388
- Initial work for NQL crate and API by @Jeadie in https://github.com/spiceai/spiceai/pull/1366
- Fully implement federation for all SqlTable-based Data Connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1394
- use df logical plan to query latest timestamp when refreshing incrementally by @y-f-u in https://github.com/spiceai/spiceai/pull/1393
- Refactor datafusion.write_data to use table reference by @ewgenius in https://github.com/spiceai/spiceai/pull/1402
- Add federation to FlightTable based DataConnectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1401
- SFTP Data Connector by @gloomweaver in https://github.com/spiceai/spiceai/pull/1399
- Use GPT3.5 for NSQL task by @Jeadie in https://github.com/spiceai/spiceai/pull/1400
- Update ROADMAP May 16, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1405
- Add ftp/sftp connector to readme by @gloomweaver in https://github.com/spiceai/spiceai/pull/1404
- Add FlightSQL federation provider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1403
- Refactor runtime metrics to use localhost accelerated table by @ewgenius in https://github.com/spiceai/spiceai/pull/1395
- Use JSON response in OpenAI, text -> SQL model by @Jeadie in https://github.com/spiceai/spiceai/pull/1407
- support more common csv options by @y-f-u in https://github.com/spiceai/spiceai/pull/1411
- add a TLS error message in data connector and implement it for clickhouse by @y-f-u in https://github.com/spiceai/spiceai/pull/1413
- Add CSV to s3 data formats by @gloomweaver in https://github.com/spiceai/spiceai/pull/1414
- fix up dependencies now 0.5.0 disappeared by @Jeadie in https://github.com/spiceai/spiceai/pull/1417
- Add NSQL to FlightRepl by @Jeadie in https://github.com/spiceai/spiceai/pull/1409
- Update Cargo.lock by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1418
- Enable spice.ai replication for `runtime.metrics` table by @ewgenius in https://github.com/spiceai/spiceai/pull/1408
- Restructure the runtime struct to make it easier to test by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1420
- Make it easier to construct an App programmatically by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1421
- Add an integration test for federation by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1426
- wait 2 seconds for the status to turn ready in refreshing status test by @y-f-u in https://github.com/spiceai/spiceai/pull/1419
- Add functional tests for federation push-down by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1428
- Enable push-down federation by default by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1429
- Add guides and examples about error handling by @ewgenius in https://github.com/spiceai/spiceai/pull/1427
- Add LRU cache support for http-based queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1410
- Update README.md - Remove bigquery from table of connectors by @digadeesh in https://github.com/spiceai/spiceai/pull/1434
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1433
- CLI wording and logs change reflected on readme by @y-f-u in https://github.com/spiceai/spiceai/pull/1435
- Add databricksusessl parameter by @Sevenannn in https://github.com/spiceai/spiceai/pull/1406
- Update helm version and use v0.13.0-alpha by @Jeadie in https://github.com/spiceai/spiceai/pull/1436
- Don't include feature 'llms/candles' by default by @Jeadie in https://github.com/spiceai/spiceai/pull/1437
- Correctly map NullBuilder for Null arrow types by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1438
- Propagate object store error by @gloomweaver in https://github.com/spiceai/spiceai/pull/1415
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.12.2-alpha...v0.13.0-alpha
Published by Jeadie almost 2 years ago
https://github.com/spiceai/spiceai - v0.12.2-alpha
Spice v0.12.2-alpha (May 13, 2024)
The v0.12.2-alpha release introduces data streaming and key-pair authentication for the Snowflake data connector, enables general append mode data refreshes for time-series data, improves connectivity error messages, adds nested folders support for the S3 data connector, and exposes nodeSelector and affinity keys in the Helm chart for better Kubernetes management.
Highlights
Improved Connectivity Error Messages: Error messages provide clearer, actionable guidance for misconfigured settings or unreachable data connectors.
Snowflake Data Connector Improvements: Enables data streaming by default and adds support for key-pair authentication in addition to passwords.
API for Refresh SQL Updates: Update dataset Refresh SQL via API.
Append Data Refresh: Append mode data refreshes for time-series data are now supported for all data connectors. Specify a dataset `time_column` with `refresh_mode: append` to only fetch data more recent than the latest local data.
Docker Image Update: The `spiceai/spiceai:latest` Docker image now includes the ODBC data connector. For a smaller footprint, use `spiceai/spiceai:latest-slim`.
Helm Chart Improvements: `nodeSelector` and `affinity` keys are now supported in the Helm chart for improved Kubernetes deployment management.
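The append refresh described in the highlights above boils down to "find the newest local timestamp, then fetch only rows after it". The following Python sketch illustrates that idea with plain lists of dictionaries. It is a simplified model, not Spice's code; `append_refresh` and the `fetch_since` callable are hypothetical names introduced for illustration.

```python
def append_refresh(local_rows, fetch_since, time_column):
    """Append only source rows newer than the latest local `time_column` value.

    `fetch_since(ts)` is a hypothetical callable returning source rows whose
    time column is greater than `ts`, or all rows when `ts` is None.
    Returns the number of rows appended.
    """
    # The newest timestamp already held locally; None if nothing is loaded yet.
    latest = max((row[time_column] for row in local_rows), default=None)
    new_rows = fetch_since(latest)
    local_rows.extend(new_rows)
    return len(new_rows)
```

Because only the tail of the time series is fetched, each refresh costs roughly the size of the new data rather than the size of the whole dataset.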
Breaking Changes
- API to trigger accelerated dataset refreshes has changed from `POST /v1/datasets/:name/refresh` to `POST /v1/datasets/:name/acceleration/refresh` to be consistent with the `Spicepod.yaml` structure.
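Clients that trigger refreshes need to switch to the new path. As a rough sketch of that migration, the Python helper below builds (without sending) a request against the renamed endpoint. Only the path comes from the breaking change above; the `refresh_request` helper and the base URL in the usage note are assumptions for illustration.

```python
import urllib.parse
import urllib.request


def refresh_request(base_url, dataset):
    """Build (but do not send) a POST to the renamed acceleration refresh endpoint."""
    name = urllib.parse.quote(dataset, safe="")
    url = f"{base_url.rstrip('/')}/v1/datasets/{name}/acceleration/refresh"
    return urllib.request.Request(url, method="POST")
```

For example, `urllib.request.urlopen(refresh_request("http://localhost:3000", "blocks"))` would send the request; the host and port are placeholders for wherever the runtime's HTTP API is listening.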
Contributors
- @mach-kernel
- @y-f-u
- @sgrebnov
- @ewgenius
- @Jeadie
- @Sevenannn
- @digadeesh
- @phillipleblanc
- @lukekim
What's Changed
- Fix list type support in spark connect by @y-f-u in https://github.com/spiceai/spiceai/pull/1341
- Add nested folder support in S3 Parquet connector by @y-f-u in https://github.com/spiceai/spiceai/pull/1342
- Improves S3 connector using DataFusion ListingTable table provider by @y-f-u in https://github.com/spiceai/spiceai/pull/1326
- Update ROADMAP May 6, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1315
- List flightsql and snowflake as supported connectors in README.md by @sgrebnov in https://github.com/spiceai/spiceai/pull/1317
- Helm chart for v0.12.1-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1323
- Read sqlite_file param and use it as path by @Sevenannn in https://github.com/spiceai/spiceai/pull/1309
- Compile spiced with `release` feature in docker image by @ewgenius in https://github.com/spiceai/spiceai/pull/1324
- Add support for Snowflake key-pair authentication by @sgrebnov in https://github.com/spiceai/spiceai/pull/1314
- Wrap postgres errors in common DataConnectorError by @ewgenius in https://github.com/spiceai/spiceai/pull/1327
- Fix TPCH tests runner by @sgrebnov in https://github.com/spiceai/spiceai/pull/1330
- Spice CLI support for Snowflake key-pair auth by @sgrebnov in https://github.com/spiceai/spiceai/pull/1325
- sqlproviderdatafusion: Support TimestampMicrosecond, Date32, Date64 by @mach-kernel in https://github.com/spiceai/spiceai/pull/1329
- Resolve dangling reference for SQLite by @Sevenannn in https://github.com/spiceai/spiceai/pull/1312
- Select columns from Spark Dataframe according to projected_schema by @Sevenannn in https://github.com/spiceai/spiceai/pull/1336
- Expose nodeselector and affinity keys in Helm chart by @mach-kernel in https://github.com/spiceai/spiceai/pull/1338
- Use streaming for Snowflake queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1337
- Publish ODBC images by @mach-kernel in https://github.com/spiceai/spiceai/pull/1271
- Include Postgres acceleration engine to types support tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/1343
- Refactor dataconnector providers getters to return common `DataConnectorResult` and `DataConnectorError` by @ewgenius in https://github.com/spiceai/spiceai/pull/1339
- s3 csv support to validate the listing table extensibility by @y-f-u in https://github.com/spiceai/spiceai/pull/1344
- Move model code into separate, feature-flagged crate by @Jeadie in https://github.com/spiceai/spiceai/pull/1335
- Initial setup for federated queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1350
- Refactor dbconnection errors, and catch invalid postgres table name case by @ewgenius in https://github.com/spiceai/spiceai/pull/1353
- Rename default datafusion catalog to "spice", add internal "spice.runtime" schema by @ewgenius in https://github.com/spiceai/spiceai/pull/1359
- Add API to set Refresh SQL for accelerated table by @sgrebnov in https://github.com/spiceai/spiceai/pull/1356
- Set next version to v0.12.2 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1367
- Upgrade to DataFusion 38 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1368
- Incremental append based on time column by @y-f-u in https://github.com/spiceai/spiceai/pull/1360
- Update README.md to include correct output when running show tables from quickstart by @digadeesh in https://github.com/spiceai/spiceai/pull/1371
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.12.1-alpha...v0.12.2-alpha
Published by github-actions[bot] almost 2 years ago
https://github.com/spiceai/spiceai - v0.12.1-alpha
Spice v0.12.1-alpha (May 6, 2024)
The v0.12.1-alpha release introduces a new Snowflake data connector, support for UUID and TimestampTZ types in the PostgreSQL connector, and improved error messages across all data connectors. The Clickhouse data connector enables data streaming by default. The public SQL interface now restricts DML and DDL queries. Additionally, accelerated tables now fully support NULL values, and issues with schema conversion in these tables have been resolved.
Highlights
Snowflake Data Connector: Initial support for Snowflake as a data source.
Clickhouse Data Streaming: Enables data streaming by default, eliminating in-memory result collection.
Read-only SQL Interface: Disables DML (INSERT/UPDATE/DELETE) and DDL (CREATE/ALTER TABLE) queries for improved data source security.
Error Message Improvements: Improved the error messages for commonly encountered issues with data connectors.
Accelerated Tables: Supports NULL values across all data types and fixes schema conversion errors for consistent type handling.
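The read-only SQL interface highlight can be illustrated with a small sketch. This is not Spice's implementation (the runtime is written in Rust and would more plausibly inspect the parsed statement); it is a hypothetical Python check that looks only at the leading keyword, using the statement types named in the highlight. The function names are assumptions for illustration.

```python
import re

# Statement keywords named in the highlight: DML (INSERT/UPDATE/DELETE)
# and DDL (CREATE/ALTER). Anything else is treated as read-only here.
BLOCKED_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "CREATE", "ALTER"}


def leading_keyword(sql: str) -> str:
    """Return the first keyword of a statement, upper-cased, skipping comments."""
    without_block = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL)
    without_line = re.sub(r"--[^\n]*", " ", without_block)
    match = re.match(r"\s*([A-Za-z]+)", without_line)
    return match.group(1).upper() if match else ""


def is_read_only(sql: str) -> bool:
    """Hypothetical check: reject statements that start with a DML/DDL keyword."""
    return leading_keyword(sql) not in BLOCKED_KEYWORDS
```

A keyword check like this misses cases such as a WITH clause that wraps an INSERT, which is one reason a real implementation would work on the parsed statement instead.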
Contributors
- @ahirner
- @y-f-u
- @sgrebnov
- @ewgenius
- @Jeadie
- @gloomweaver
- @Sevenannn
- @digadeesh
- @phillipleblanc
What's Changed
- Add schema types check for query result by @sgrebnov in https://github.com/spiceai/spiceai/pull/1212
- helm chart for v0.12.0-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/1235
- Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1232
- Bump spiceai version to v0.12.1-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1239
- Update ROADMAP.md - remove v0.12.0-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1241
- Raise errors in InsertBuilder by @Jeadie in https://github.com/spiceai/spiceai/pull/1242
- Update endgame template by @ewgenius in https://github.com/spiceai/spiceai/pull/1240
- Add E2E tests for acceleration engines types support by @sgrebnov in https://github.com/spiceai/spiceai/pull/1218
- Stream blocks to arrow by @gloomweaver in https://github.com/spiceai/spiceai/pull/1203
- Update enhancement.md to include a checklist item to have a release notes entry for each enhancement by @digadeesh in https://github.com/spiceai/spiceai/pull/1245
- arrow_sql_gen data column conversion by @Sevenannn in https://github.com/spiceai/spiceai/pull/1230
- Implement the Localhost Data Connector & fix DoPut by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1266
- Update postgres parameter check by @Sevenannn in https://github.com/spiceai/spiceai/pull/1244
- Record batch casting to fix SQLite data type issues by @y-f-u in https://github.com/spiceai/spiceai/pull/1261
- typo fix on Decimal in postgres arrow_sql_gen by @y-f-u in https://github.com/spiceai/spiceai/pull/1277
- Move verify_schema to arrow_tools by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1284
- Support UUID and TimestampTZ type for Postgres as Data Connector by @ahirner & @y-f-u in https://github.com/spiceai/spiceai/pull/1276
- Fix linter warnings by @ewgenius in https://github.com/spiceai/spiceai/pull/1286
- Add Snowflake data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/1278
- Add Snowflake login support (username and password) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1272
- convert timestamp properly in sql gen by @y-f-u in https://github.com/spiceai/spiceai/pull/1291
- Add IF NOT EXISTS clause to the CREATE statement when creating a table using DuckDB acceleration by @digadeesh in https://github.com/spiceai/spiceai/pull/1290
- Disable DML & DDL queries in the public SQL interface by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1294
- Refactor duckdb to properly set access_mode for connection by @ewgenius in https://github.com/spiceai/spiceai/pull/1285
- do not insert batch for sqlite and postgres if no records in the record batch by @y-f-u in https://github.com/spiceai/spiceai/pull/1293
- Postgres - add custom error message for invalid error table by @ewgenius in https://github.com/spiceai/spiceai/pull/1295
- SQLite/Accelerators handle null values by @gloomweaver in https://github.com/spiceai/spiceai/pull/1298
- Add command to attach to running process by @gloomweaver in https://github.com/spiceai/spiceai/pull/1297
- Use the GITHUB_TOKEN environment variable in the installation script, if available, to avoid rate limiting in CI workflows by @ewgenius in https://github.com/spiceai/spiceai/pull/1302
- Fix unsupported SSL mode options for PostgreSQL connection string by @ewgenius in https://github.com/spiceai/spiceai/pull/1300
- Add CLI cmd spice login spark by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1303
- Check only the latest published release to avoid installing pre-release versions by @ewgenius in https://github.com/spiceai/spiceai/pull/1301
- Postgres data connector - handle invalid host/port and username/password errors by @ewgenius in https://github.com/spiceai/spiceai/pull/1292
- Fix the panic on bad clickhouse connection by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1306
- Improve Snowflake Data Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/1296
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.12.0-alpha...v0.12.1-alpha
Published by phillipleblanc almost 2 years ago
https://github.com/spiceai/spiceai - v0.12-alpha
Spice v0.12-alpha (Apr 29, 2024)
The v0.12-alpha release introduces Clickhouse and Apache Spark data connectors, adds support for limiting refresh data periods for temporal datasets, and includes upgraded Spice Client SDKs compatible with Spice OSS.
Highlights
Clickhouse data connector: Use Clickhouse as a data source with the clickhouse: scheme.
Apache Spark Connect data connector: Use Apache Spark Connect connections as a data source using the spark: scheme.
Refresh data window: Limit accelerated dataset data refreshes to a specified window, configured as a duration from now, for faster and more efficient refreshes.
ODBC data connector: Use ODBC connections as a data source using the odbc: scheme. The ODBC data connector is currently optional and not included in default builds. It can be conditionally compiled using the odbc cargo feature when building from source.
Spice Client SDK Support: The official Spice SDKs have been upgraded with support for Spice OSS.
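To make the refresh data window idea concrete, here is a minimal sketch of turning a "duration from now" into a filter on a dataset's time column. It is an illustration only, not the runtime's code: the function name, the created_at column, and the TIMESTAMP literal format are assumptions.

```python
from datetime import datetime, timedelta, timezone


def refresh_window_predicate(time_column, window, now=None):
    """Build a SQL predicate that keeps only rows newer than `now - window`."""
    if window <= timedelta(0):
        raise ValueError("window must be a positive duration")
    now = now or datetime.now(timezone.utc)
    cutoff = now - window
    # Format the cutoff as a UTC timestamp literal (assumed literal style).
    literal = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    return f"{time_column} >= TIMESTAMP '{literal}'"


# Example: refresh only the last six hours of a hypothetical `created_at` column.
predicate = refresh_window_predicate("created_at", timedelta(hours=6))
```

Appending a predicate like this to the refresh query means only rows inside the window are fetched, which is where the faster refreshes would come from.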
Breaking Changes
- Refresh interval: The refresh_interval acceleration setting has been changed to refresh_check_interval to make it clearer that it is the check interval rather than the data interval.
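For anyone updating existing configuration by script, the rename could be applied along these lines. This is a hedged sketch that treats the acceleration settings as a plain dictionary; the helper name and the example values are hypothetical, and only the two setting names come from the release notes.

```python
# Old setting name -> new setting name, as described in the breaking change.
RENAMED_KEYS = {"refresh_interval": "refresh_check_interval"}


def migrate_acceleration_settings(settings: dict) -> dict:
    """Return a copy of `settings` with renamed keys applied."""
    migrated = {}
    for key, value in settings.items():
        new_key = RENAMED_KEYS.get(key, key)
        if new_key in migrated:
            # Both the old and the new name were present; refuse to guess.
            raise ValueError(f"conflicting values for {new_key!r}")
        migrated[new_key] = value
    return migrated


# Example with hypothetical values.
updated = migrate_acceleration_settings({"enabled": True, "refresh_interval": "10s"})
```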
Contributors
- @phillipleblanc
- @Jeadie
- @ewgenius
- @sgrebnov
- @y-f-u
- @lukekim
- @digadeesh
- @gloomweaver
- @edmondop
- @mach-kernel
New Contributors
- Thanks to @mach-kernel who made their first contribution in https://github.com/spiceai/spiceai/pull/1204 by adding the ODBC data connector!
What's Changed
- Update helm version by @Jeadie in https://github.com/spiceai/spiceai/pull/1167
- Handle and trace errors in secret stores by @ewgenius in https://github.com/spiceai/spiceai/pull/1149
- bump the release versions to 0.12.0 by @y-f-u in https://github.com/spiceai/spiceai/pull/1171
- Don't fail acknowledgments flow if no changes detected by @ewgenius in https://github.com/spiceai/spiceai/pull/1170
- Allow Spice CLI to control runtime installation on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1173
- Allow SELECT count(*) for Sqlite Data Accelerator by @sgrebnov in https://github.com/spiceai/spiceai/pull/1166
- add refresh_period param in acceleration by @y-f-u in https://github.com/spiceai/spiceai/pull/1180
- Properly support Spark Connect filter pushdown by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1186
- Avoid rate-limiting on arduino/setup-protoc@v3 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1189
- Clickhouse DataConnector base implementation by @gloomweaver in https://github.com/spiceai/spiceai/pull/1168
- rename refresh_interval to refresh_check_interval by @y-f-u in https://github.com/spiceai/spiceai/pull/1190
- Fix timestamp & add support for Decimal to Databricks/Spark by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1194
- Convert temporal column and refresh period to datafusion expr by @y-f-u in https://github.com/spiceai/spiceai/pull/1187
- Hot reload accelerated table on dataset update by @ewgenius in https://github.com/spiceai/spiceai/pull/1195
- Upgrade DataFusion to 37.1 & DuckDB to 10.2 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1200
- Update version.txt for 0.11.2 release by @digadeesh in https://github.com/spiceai/spiceai/pull/1199
- Clickhouse E2E by @gloomweaver in https://github.com/spiceai/spiceai/pull/1193
- Clickhouse: fix darwin ci pipeline by @gloomweaver in https://github.com/spiceai/spiceai/pull/1201
- Add table_type to show tables in Spice SQL & update next version to v0.12.0-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1206
- print WARN if time_column does not exist in federated schema by @y-f-u in https://github.com/spiceai/spiceai/pull/1207
- Add FallbackOnZeroResultsScanExec for executing an input ExecutionPlan and optionally falling back to a TableProvider.scan() if the input has zero results by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1196
- Clickhouse refactor connection code and set secure option by @gloomweaver in https://github.com/spiceai/spiceai/pull/1198
- E2E: reusable Spice installation by @sgrebnov in https://github.com/spiceai/spiceai/pull/1205
- Clickhouse block_to_arrow unit test by @gloomweaver in https://github.com/spiceai/spiceai/pull/1202
- rename refresh_period to refresh_data_period by @y-f-u in https://github.com/spiceai/spiceai/pull/1210
- Refactor E2E tests: dataset verification and PostgreSQL installation by @sgrebnov in https://github.com/spiceai/spiceai/pull/1211
- Add BI dashboard acceleration video to README.md by @lukekim in https://github.com/spiceai/spiceai/pull/1219
- Improve clarity and consistency of output messages by @lukekim in https://github.com/spiceai/spiceai/pull/1214
- Update ROADMAP Apr 29, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1220
- Stand-alone Spark Connect: Isolate Spark Connect from Databricks Connect to make it reusable by @edmondop in https://github.com/spiceai/spiceai/pull/1213
- Optimize build time in dev mode by @gloomweaver in https://github.com/spiceai/spiceai/pull/1215
- Feature: Support ODBC reads using unixodbc by @mach-kernel in https://github.com/spiceai/spiceai/pull/1204
- Use non-fork deltalake by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1223
- Support Date32 & Date64 in arrow_sql_gen by @Jeadie in https://github.com/spiceai/spiceai/pull/1217
- Update REPL output to be consistent with the latest Spice version by @sgrebnov in https://github.com/spiceai/spiceai/pull/1231
- rename refresh_data_period to refresh_data_window by @y-f-u in https://github.com/spiceai/spiceai/pull/1233
- Update README.md to include ODBC, Spark Connect, and Clickhouse data connectors in support data connector matrix. by @digadeesh in https://github.com/spiceai/spiceai/pull/1234
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.11.1-alpha...v0.12.0-alpha
Published by ewgenius almost 2 years ago
https://github.com/spiceai/spiceai - 0.11.1-alpha
Spice v0.11.1-alpha (Apr 22, 2024)
The v0.11.1-alpha release introduces retention policies for accelerated datasets, native Windows installation support, and integration of catalog and schema settings for the Databricks Spark connector. Several bugs have also been fixed for improved stability.
Highlights
Retention Policies for Accelerated Datasets: Automatic eviction of data from accelerated time-series datasets when a specified temporal column exceeds the retention period, optimizing resource utilization.
Windows Installation Support: Native Windows installation support, including upgrades.
Databricks Spark Connect Catalog and Schema Settings: Improved translation between DataFusion and Spark, providing better Spark Catalog support.
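The retention policy highlight describes evicting rows once the value in a chosen temporal column is older than the retention period. A minimal in-memory sketch of that rule follows; the row representation (a list of dictionaries) and the function name are assumptions for illustration and do not reflect how the runtime stores accelerated data.

```python
from datetime import datetime, timedelta, timezone


def evict_expired(rows, time_column, retention_period, now=None):
    """Remove rows older than the retention period in place; return how many were evicted."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - retention_period
    kept = [row for row in rows if row[time_column] >= cutoff]
    evicted = len(rows) - len(kept)
    rows[:] = kept  # mutate the caller's list so the "table" shrinks
    return evicted
```

Running a check like this on a schedule keeps the accelerated copy bounded in size, which is the resource benefit the highlight refers to.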
Contributors
- @phillipleblanc
- @Jeadie
- @ewgenius
- @sgrebnov
- @y-f-u
- @lukekim
- @digadeesh
- @Sevenannn
- @gloomweaver
New in this release
- PowerShell script to install Spice on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1128
- Support catalog and schema in Databricks Spark Connect by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1137
- Retention handlers by @y-f-u in https://github.com/spiceai/spiceai/pull/1096
What's Changed
- Update CONTRIBUTING with new dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1121
- Fix the Helm tag by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1122
- Upgrade Spice version to 0.11.1 by @sgrebnov in https://github.com/spiceai/spiceai/pull/1123
- Remove 0.11 from roadmap by @ewgenius in https://github.com/spiceai/spiceai/pull/1124
- Include refresh_sql and manual refresh to e2e tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/1125
- Respect executables file extension on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1130
- Use quoted strings when performing federated SQL queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1129
- Make Windows artifact names consistent with other platforms by @sgrebnov in https://github.com/spiceai/spiceai/pull/1132
- Make Windows installation less verbose by @sgrebnov in https://github.com/spiceai/spiceai/pull/1138
- Document Windows installation and add test by @sgrebnov in https://github.com/spiceai/spiceai/pull/1134
- Use transaction for DuckDB Table Writer by @Sevenannn in https://github.com/spiceai/spiceai/pull/1135
- Update Windows installation script url by @sgrebnov in https://github.com/spiceai/spiceai/pull/1143
- Update roadmap Apr 18, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1142
- Test connection when new connection pool created by @ewgenius in https://github.com/spiceai/spiceai/pull/1126
- Enable clippy::clone_on_ref_ptr by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1146
- Allow only alphanumeric dataset names when using spice dataset configure by @ewgenius in https://github.com/spiceai/spiceai/pull/1140
- Extend PR check to build with no default features, and each individual feature by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1156
- Bump rustls from 0.21.10 to 0.21.11 by @dependabot in https://github.com/spiceai/spiceai/pull/1150
- Serde rule for ISO8601 time format by @y-f-u in https://github.com/spiceai/spiceai/pull/1151
- Add static linking for vcruntime dependencies on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1152
- Use clearer retention param key - retention_check_enabled instead by @y-f-u in https://github.com/spiceai/spiceai/pull/1158
- spice upgrade on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1155
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.11.0-alpha...v0.11.1-alpha
Published by y-f-u almost 2 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.11.0-alpha
The Spice v0.11.0-alpha release significantly improves the Databricks data connector with Databricks Connect (Spark Connect) support, adds the DuckDB data connector, and adds the AWS Secrets Manager secret store. In addition, enhanced control over accelerated dataset refreshes, improved SSL security for MySQL and PostgreSQL connections, and overall stability improvements have been added.
Highlights in v0.11.0-alpha
DuckDB data connector: Use DuckDB databases or connections as a data source.
AWS Secrets Manager Secret Store: Use AWS Secrets Manager as a secret store.
Custom Refresh SQL: Specify a custom SQL query for dataset refresh using refresh_sql.
Dataset Refresh API: Trigger a dataset refresh using the new CLI command spice refresh or via API.
Expanded SSL support for Postgres: SSL mode now supports disable, require, prefer, verify-ca, verify-full options with the default mode changed to require. Added pg_sslrootcert parameter for setting a custom root certificate and the pg_insecure parameter is no longer supported.
Databricks Connect: Choose between using Spark Connect or Delta Lake when using the Databricks data connector for improved performance.
Internal architecture refactor: The internal architecture of spiced was refactored to simplify the creation of data components and to improve alignment with DataFusion concepts.
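The Postgres SSL changes in this release (five supported modes, require as the default, pg_sslrootcert for a custom root certificate, and pg_insecure no longer accepted) can be summarized as a small validation routine. This is a sketch of the rules as stated in the highlights, not the connector's code; in particular, the pg_sslmode parameter name and the output key names are assumptions.

```python
SUPPORTED_SSL_MODES = ("disable", "require", "prefer", "verify-ca", "verify-full")
DEFAULT_SSL_MODE = "require"


def resolve_ssl_params(params: dict) -> dict:
    """Validate connector params and return the effective SSL settings."""
    if "pg_insecure" in params:
        raise ValueError("pg_insecure is no longer supported; set an SSL mode instead")
    # `pg_sslmode` is an assumed parameter name used only for this sketch.
    mode = params.get("pg_sslmode", DEFAULT_SSL_MODE)
    if mode not in SUPPORTED_SSL_MODES:
        raise ValueError(f"unsupported SSL mode: {mode!r}")
    resolved = {"sslmode": mode}
    root_cert = params.get("pg_sslrootcert")
    if root_cert:
        resolved["sslrootcert"] = root_cert
    return resolved
```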
New Contributors
- @edmondop made their first contribution in https://github.com/spiceai/spiceai/pull/1110
Contributors
- @phillipleblanc
- @Jeadie
- @ewgenius
- @sgrebnov
- @y-f-u
- @lukekim
- @digadeesh
- @Sevenannn
- @gloomweaver
- @ahirner
New in this release
- Fixes MySQL NULL values by @gloomweaver in https://github.com/spiceai/spiceai/pull/1067
- Fixes PostgreSQL NULL values for NUMERIC by @gloomweaver in https://github.com/spiceai/spiceai/pull/1068
- Adds Custom Refresh SQL support by @lukekim and @phillipleblanc in https://github.com/spiceai/spiceai/pull/1073
- Adds DuckDB data connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/1085
- Adds AWS Secrets Manager secret store by @sgrebnov in https://github.com/spiceai/spiceai/pull/1063, https://github.com/spiceai/spiceai/pull/1064
- Adds Dataset refresh API by @sgrebnov in https://github.com/spiceai/spiceai/pull/1075, https://github.com/spiceai/spiceai/pull/1078, https://github.com/spiceai/spiceai/pull/1083
- Adds spice refresh CLI command for dataset refresh by @sgrebnov in https://github.com/spiceai/spiceai/pull/1112
- Adds TEXT and DECIMAL types support and properly handling NULL for MySQL by @gloomweaver in https://github.com/spiceai/spiceai/pull/1067
- Adds DATE and TINYINT types support for MySQL by @ewgenius in https://github.com/spiceai/spiceai/pull/1065
- Adds ssl_rootcert_path parameter for MySQL data connector by @ewgenius in https://github.com/spiceai/spiceai/pull/1079
- Adds LargeUtf8 support and explicitly passing the schema to data accelerator SqlTable by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1077
- Adds Ability to configure data retention for accelerated datasets by @y-f-u in https://github.com/spiceai/spiceai/issues/1086
- Adds Custom SSL certificates for PostgreSQL data connector by @ewgenius in https://github.com/spiceai/spiceai/pull/1081
- Adds Conditional compile for Dremio by @ahirner in https://github.com/spiceai/spiceai/pull/1100
- Adds Ability for Databricks connector to use spark-connect-rs as the mechanism to execute queries against the Databricks by @edmondop in https://github.com/spiceai/spiceai/pull/1110
- Adds Ability to choose between Spark Connect and Delta Lake implementation for Databricks by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1115/files
- Updates Databricks login parameters by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1113
- Updates Architecture to simplify data components development by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1040
- Updates Improved readability of GitHub Actions test job names by @lukekim in https://github.com/spiceai/spiceai/pull/1071
- Updates Upgrade Arrow, DataFusion, Tonic dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1097
- Updates Handling non-string spicepod params by @ewgenius in https://github.com/spiceai/spiceai/pull/1098
- Updates Optional features compile: duckdb, databricks by @ahirner in https://github.com/spiceai/spiceai/pull/1100
- Updates Helm version to 0.1.3 by @Jeadie in https://github.com/spiceai/spiceai/pull/1120
- Removes pg_insecure parameter support from Postgres by @ewgenius in https://github.com/spiceai/spiceai/pull/1081
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.10.2-alpha...v0.11.0-alpha
Published by sgrebnov almost 2 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.10.2-alpha
The v0.10.2-alpha release adds the MySQL data connector and makes external data connections more robust on initialization.
Highlights in v0.10.2-alpha
MySQL data connector: Connect to any MySQL server, including SSL support.
Data connections verified at initialization: Verify endpoints and authorization for external data connections (e.g. databricks, spice.ai) at initialization.
New Contributors
- @rthomas made their first contribution in https://github.com/spiceai/spiceai/pull/1022
- @ahirner made their first contribution in https://github.com/spiceai/spiceai/pull/1029
- @gloomweaver made their first contribution in https://github.com/spiceai/spiceai/pull/1004
Contributors
- @phillipleblanc
- @y-f-u
- @ewgenius
- @sgrebnov
- @lukekim
- @digadeesh
- @jeadie
New in this release
- Adds MySQL data connector by @gloomweaver in https://github.com/spiceai/spiceai/pull/1004
- Fixes show tables; parsing in the Spice SQL repl.
- Adds data connector verification at initialization
  - For Dremio by @sgrebnov in https://github.com/spiceai/spiceai/pull/1017
  - For Databricks by @sgrebnov in https://github.com/spiceai/spiceai/pull/1019
  - For Spice.ai by @sgrebnov in https://github.com/spiceai/spiceai/pull/1020
- Fixes Ensures unit and doc tests compile and run by @rthomas in https://github.com/spiceai/spiceai/pull/1022
- Improves Helm chart + Grafana dashboard by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1030
- Fixes Makes data connectors optional features by @ahirner in https://github.com/spiceai/spiceai/pull/1029
- Fixes Fixes SpiceAI E2E for external contributors in Github actions by @ewgenius in https://github.com/spiceai/spiceai/pull/1023
- Fixes remove hardcoded lookback_size (& improve SpiceAI's ModelSource) by @Jeadie in https://github.com/spiceai/spiceai/pull/1016
Full Changelog: https://github.com/spiceai/spiceai/compare/v0.10.1-alpha...v0.10.2-alpha
Published by Jeadie almost 2 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.10.1-alpha
The v0.10.1-alpha release focuses on stability, bug fixes, and usability by improving error messages when using SQLite data accelerators, improving the PostgreSQL support, and adding a basic Helm chart.
Highlights in v0.10.1-alpha
Improved PostgreSQL support for Data Connectors: TLS is now supported with PostgreSQL Data Connectors and there are improved VARCHAR and BPCHAR conversions through Spice.
Improved Error messages: Simplified error messages from Spice when propagating errors from Data Connectors and Accelerator Engines.
Spice Pods Command: The spice pods command can give you quick statistics about models, dependencies, and datasets that are loaded by the Spice runtime.
Contributors
- @phillipleblanc
- @mitchdevenport
- @ewgenius
- @sgrebnov
- @lukekim
- @digadeesh
New in this release
- Adds Basic Helm Chart for spiceai (https://github.com/spiceai/spiceai/pull/1002)
- Adds Support for spice login in environments with no browser. (https://github.com/spiceai/spiceai/pull/994)
- Adds TLS support in Postgres connector. (https://github.com/spiceai/spiceai/pull/970)
- Fixes Improve Postgres VARCHAR and BPCHAR conversion. (https://github.com/spiceai/spiceai/pull/993)
- Fixes spice pods Returns incorrect counts. (https://github.com/spiceai/spiceai/pull/998)
- Fixes Return friendly error messages for unsupported types in sqlite. (https://github.com/spiceai/spiceai/pull/982)
- Fixes Pass Tonic errors when receiving errors from dependencies. (https://github.com/spiceai/spiceai/pull/995)
Published by digadeesh almost 2 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.10-alpha
Announcing the release of Spice.ai v0.10-alpha! 🎉
The Spice.ai v0.10-alpha release focused on additions and updates to improve stability, usability, and the overall Spice developer experience.
Highlights in v0.10-alpha
Public Bucket Support for S3 Data Connector: The S3 Data Connector now supports public buckets in addition to buckets requiring an access id and key.
JDBC-Client Connectivity: Improved connectivity for JDBC clients, like Tableau.
User Experience Improvements:
- Friendlier error messages across the board to make debugging and development better.
- Added a spice login postgres command, streamlining the process for connecting to PostgreSQL databases.
- Added PostgreSQL connection verification and connection string support, enhancing usability for PostgreSQL users.
Grafana Dashboard: Improving the ability to monitor Spice deployments, a standard Grafana dashboard is now available.
Contributors
- @phillipleblanc
- @mitchdevenport
- @Jeadie
- @ewgenius
- @sgrebnov
- @y-f-u
- @lukekim
- @digadeesh
New in this release
- Fixes Gracefully handle Arrow Flight DoExchange connection resets
- Adds Grafana Dashboard
- Adds Flight SQL CommandGetTableTypes Command support (improves JDBC-client connectivity)
- Adds Friendlier error messages
- Adds spice login postgres command
- Adds PostgreSQL connection verification
- Adds PostgreSQL connection string support
- Adds Linux aarch64 build
- Updates Improves spice status with dataset metrics
- Updates CLI REPL improved show tables output
- Updates CLI REPL limit output to 500 rows
- Updates Improved README.md with architecture diagram updates
- Updates Improved CI run time.
- Updates Use macOS hosted Actions runner
Published by phillipleblanc almost 2 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.9.1-alpha
The v0.9.1 release focused on stability, bug fixes, and usability by adding spice CLI commands for listing Spicepods (spice pods), Models (spice models), Datasets (spice datasets), and improved status (spice status) details. In addition, the Arrow Flight SQL (flightsql) data connector and SQLite (sqlite) data store were added.
Highlights in v0.9.1-alpha
FlightSQL data connector: Arrow Flight SQL can now be used as a connector for federated SQL query.
SQLite data backend: SQLite can now be used as a data store for acceleration.
Contributors
- @phillipleblanc
- @mitchdevenport
- @Jeadie
- @ewgenius
- @sgrebnov
- @y-f-u
- @lukekim
New in this release
- Adds FlightSQL data connector (flightsql).
- Adds SQLite data store, supports both in-memory and file based (sqlite).
- Adds support for date, varchar, bpchar, and primitive list types for the PostgreSQL data connector and data store.
- Adds spice pods, spice status, spice datasets, and spice models CLI commands.
- Adds GET /v1/spicepods API for listing loaded Spicepods.
- Adds spiced Docker CI build and release.
- Adds E2E tests for release installation and local acceleration.
- Adds E2E tests and instructions to run basic TPC-H benchmark tests.
- Adds linux/arm64 binary build.
- Fixes spice sql REPL panics when query result is too large. (https://github.com/spiceai/spiceai/pull/875)
- Fixes --access-secret in spice s3 login. (https://github.com/spiceai/spiceai/pull/894)
- Fixes version check upgrade logic.
Published by y-f-u almost 2 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.9-alpha
The v0.9 release adds several data connectors including the Spice data connector for the ability to connect to other spiced instances. Improved observability for spiced has been added with the new /metrics endpoint for monitoring deployed instances.
Highlights in v0.9-alpha
Arrow Flight SQL endpoint: The Arrow Flight endpoint now supports Flight SQL, including JDBC, ODBC, and ADBC enabling database clients like DBeaver or BI applications like Tableau to connect to and query the Spice runtime.
Spice.ai data connector: Use other Spice runtime instances as data connectors for federated SQL query across Spice deployments and for chaining Spice runtimes.
Keyring secret store: Use the operating system native credential store, like macOS keychain for storing secrets used by spiced.
PostgreSQL data connector: PostgreSQL can now be used as both a data store for acceleration and as a connector for federated SQL query.
Databricks data connector: Databricks as a connector for federated SQL query across Delta Lake tables.
S3 data connector: S3 as a connector for federated SQL query across Parquet files stored in S3.
Metrics endpoint: Added new /metrics endpoint for spiced observability and monitoring with the following metrics:
- spiced_runtime_http_server_start counter
- spiced_runtime_flight_server_start counter
- datasets_count gauge
- load_dataset summary
- load_secrets summary
- datasets/load_error counter
- datasets/count counter
- models/load_error counter
- models/count counter
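As a rough illustration of consuming the /metrics endpoint, the sketch below parses metric samples from a text response. It assumes the endpoint serves the Prometheus text exposition format, which the release notes do not state, and it ignores optional trailing timestamps. The sample payload is made up; only the metric names come from the list above.

```python
def parse_metrics(text: str) -> dict:
    """Parse `name value` sample lines, skipping blank lines and # comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the last space so label sets such as {quantile="0.5"} stay in the name.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


# Made-up payload for illustration only.
SAMPLE = """
# TYPE spiced_runtime_http_server_start counter
spiced_runtime_http_server_start 1
# TYPE datasets_count gauge
datasets_count 3
load_dataset{quantile="0.5"} 0.25
"""

metrics = parse_metrics(SAMPLE)
```

In practice the text would come from an HTTP GET against the running spiced instance; the host and port depend on the deployment and are not given in these notes.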
Contributors
- @phillipleblanc
- @mitchdevenport
- @Jeadie
- @ewgenius
- @sgrebnov
- @Sevenannn
- @y-f-u
- @digadeesh
- @lukekim
New in this release
- Adds Keyring secret store (keyring).
- Adds PostgreSQL data connector (postgres).
- Adds Spice.ai data connector (spiceai).
- Adds Arrow Flight SQL (JDBC/ODBC/ADBC) support.
- Adds Databricks data connector (databricks) - Delta Lake support.
- Adds S3 data connector (s3) - Parquet support.
- Adds /v1/models API.
- Adds /v1/status API.
- Adds /metrics API.
Published by sgrebnov almost 2 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.8-alpha
Announcing the release of Spice v0.8-alpha! 🏹
This is a minor release that builds on the new Rust-based runtime, adding stability and a preview of new features for the first major release.
Highlights in v0.8-alpha
Secrets management: Spice 0.8 runtime can now configure and retrieve secrets from local environment variables and in a Kubernetes cluster.
Data tables can be locally accelerated using PostgreSQL
New in this release
- Adds Secrets management in local environment variables and Kubernetes clusters.
- Adds (Preview) PostgreSQL as a data table acceleration engine.
Published by ewgenius almost 2 years ago
https://github.com/spiceai/spiceai - Spice v0.7-alpha
Announcing the release of Spice v0.7-alpha! 🏹
Spice v0.7-alpha is an all new implementation of Spice written in Rust. The Spice v0.7 runtime provides developers with a unified SQL query interface to locally accelerate and query data tables sourced from any database, data warehouse, or data lake.
Learn more and get started in minutes with the updated Quickstart in the repository README!
Highlights in v0.7-alpha
DataFusion SQL Query Engine: Spice v0.7 leverages the Apache DataFusion query engine to provide very fast, high quality SQL query across one or more local or remote data sources.
Data tables can be locally accelerated using Apache Arrow in-memory or by DuckDB.
New in this release
- Adds runtime rewritten in Rust for high-performance.
- Adds Apache DataFusion SQL query engine.
- Adds The Spice.ai platform as a data source.
- Adds Dremio as a data source.
- Adds OpenTelemetry (OTEL) collector.
- Adds local data table acceleration.
- Adds DuckDB file or in-memory as a data table acceleration engine.
- Adds In-memory Apache Arrow as a data table acceleration engine.
- Removes the built-in AI training engine; now cloud-based and provided by the Spice.ai platform.
- Removes the built-in dashboard and web-interface; now cloud-based and provided by the Spice.ai platform.
Published by phillipleblanc about 2 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.6.2-alpha
Announcing the release of Spice.ai v0.6.2-alpha! 🐞
This release fixes a bug in the CLI that prevented users from adding Spicepods from spicerack.org
Published by phillipleblanc over 2 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.6.1-alpha
Announcing the release of Spice.ai v0.6.1-alpha! 🌶
Building upon the Apache Arrow support in v0.6-alpha, Spice.ai now includes new Apache Arrow data processor and Apache Arrow Flight data connector components! Together, these create a high-performance bulk-data transport directly into the Spice.ai ML engine. Coupled with big data systems from the Apache Arrow ecosystem like Hive, Drill, Spark, Snowflake, and BigQuery, it's now easier than ever to combine big data with Spice.ai.
And we're also excited to announce the release of Spice.xyz! 🎉
Spice.xyz is data and AI infrastructure for web3. It’s web3 data made easy. Insanely fast and purpose designed for applications and ML.
Spice.xyz delivers data in Apache Arrow format, over high-performance Apache Arrow Flight APIs to your application, notebook, ML pipeline, and of course through these new data components, to the Spice.ai runtime.
Read the announcement post at blog.spice.ai.
New in this release
Now built with Go 1.18.
Dependency updates
- Updates to React 18
- Updates to CRA 5
- Updates to Glide DataGrid 4
- Updates to SWR 1.2
- Updates to TypeScript 4.6
- Rust
Published by lukekim almost 4 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.6-alpha
Announcing the release of Spice.ai v0.6-alpha! 🏹
Spice.ai now scales to datasets 10-100x larger, enabling new classes of use cases and applications! 🚀 We've completely rebuilt Spice.ai's data processing and transport upon Apache Arrow, a high-performance platform that uses an in-memory columnar format. Spice.ai joins other major projects including Apache Spark, pandas, and InfluxDB in being powered by Apache Arrow. This also paves the way for high-performance data connections to the Spice.ai runtime using Apache Arrow Flight and import/export of data using Apache Parquet. We're incredibly excited about the potential this architecture has for building intelligent applications on top of a high-performance transport between application data sources and the Spice.ai AI engine.
Highlights in v0.6-alpha
Massive improvement in data loading performance and dataset scale
From data connectors, to REST API, to AI engine, we've now rebuilt Spice.ai's data processing and transport on the Apache Arrow project, specifically using the Apache Arrow for Go implementation. Many thanks to Matt Topol for his contributions to the project and guidance on using it.
This release changes the transport between the Spice.ai runtime and the AI engine from sending text CSV over gRPC to sending Apache Arrow Records over IPC (Unix sockets).
This is a breaking change to the Data Processor interface, as it now uses arrow.Record instead of Observation.
Benchmarking v0.6
Before v0.6, Spice.ai would not scale into the hundreds of thousands of rows.
| Format | Row Number | Data Size | Process Time | Load Time | Transport Time | Memory Usage |
| ------ | ---------- | --------- | ------------ | --------- | -------------- | ------------ |
| csv | 2,000 | 163.15KiB | 3.0005s | 0.0000s | 0.0100s | 423.754MiB |
| csv | 20,000 | 1.61MiB | 2.9765s | 0.0000s | 0.0938s | 479.644MiB |
| csv | 200,000 | 16.31MiB | 0.2778s | 0.0000s | NA (error) | 0.000MiB |
| csv | 2,000,000 | 164.97MiB | 0.2573s | 0.0050s | NA (error) | 0.000MiB |
| json | 2,000 | 301.79KiB | 3.0261s | 0.0000s | 0.0282s | 422.135MiB |
| json | 20,000 | 2.97MiB | 2.9020s | 0.0000s | 0.2541s | 459.138MiB |
| json | 200,000 | 29.85MiB | 0.2782s | 0.0010s | NA (error) | 0.000MiB |
| json | 2,000,000 | 300.39MiB | 0.3353s | 0.0080s | NA (error) | 0.000MiB |
After building on Arrow, Spice.ai now easily scales beyond millions of rows.
| Format | Row Number | Data Size | Process Time | Load Time | Transport Time | Memory Usage |
| ------ | ---------- | --------- | ------------ | --------- | -------------- | ------------ |
| csv | 2,000 | 163.14KiB | 2.8281s | 0.0000s | 0.0194s | 439.580MiB |
| csv | 20,000 | 1.61MiB | 2.7297s | 0.0000s | 0.0658s | 461.836MiB |
| csv | 200,000 | 16.30MiB | 2.8072s | 0.0020s | 0.4830s | 639.763MiB |
| csv | 2,000,000 | 164.97MiB | 2.8707s | 0.0400s | 4.2680s | 1897.738MiB |
| json | 2,000 | 301.80KiB | 2.7275s | 0.0000s | 0.0367s | 436.238MiB |
| json | 20,000 | 2.97MiB | 2.8284s | 0.0000s | 0.2334s | 473.550MiB |
| json | 200,000 | 29.85MiB | 2.8862s | 0.0100s | 1.7725s | 824.089MiB |
| json | 2,000,000 | 300.39MiB | 2.7437s | 0.0920s | 16.5743s | 4044.118MiB |
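To put the post-Arrow transport times in perspective, the 2,000,000-row results can be converted into approximate rows-per-second rates. The snippet below is a small worked calculation that uses only the row counts and transport times from the table above; the function name is illustrative.

```python
def rows_per_second(rows: int, seconds: float) -> float:
    """Return the average number of rows handled per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return rows / seconds


# Row counts and transport times copied from the 2,000,000-row results above.
csv_rate = rows_per_second(2_000_000, 4.2680)
json_rate = rows_per_second(2_000_000, 16.5743)

print(f"csv:  {csv_rate:,.0f} rows/s")
print(f"json: {json_rate:,.0f} rows/s")
```

On these figures, CSV transport works out to roughly 470 thousand rows per second and JSON to roughly 120 thousand.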
New in this release
- Adds Apache Arrow data processing and transport.
- Fixes TensorBoard logging and monitoring when using GitHub Codespaces and Docker.
- Adds Polling HTTP Data Connector
Dependency updates
- Updates to numpy 1.21.0
- Updates to marked 3.0.8
- Updates to follow-redirects 1.14.7
- Updates nanoid to 3.2.0
- Rust
Published by phillipleblanc about 4 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.5.1-alpha
Announcing the release of Spice.ai v0.5.1-alpha! 📈
This minor release builds upon v0.5-alpha adding the ability to start training from the dashboard plus support for monitoring training runs with TensorBoard.
Highlights in v0.5.1-alpha
Start training from dashboard
A "Start Training" button has been added to the pod page on the dashboard so that you can easily start training runs from that context.
Training runs can now be started by:
- Modifications to the Spicepod YAML file.
- The spice train command.
- The "Start Training" dashboard button.
- POST API calls to /api/v0.1/pods/{pod name}/train
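As a rough illustration of the API option, the sketch below builds a POST request for the train endpoint using only the Python standard library. The base address http://localhost:8000, the empty request body, and the pod name "trader" are assumptions made for the example; the release notes only specify the path.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local runtime address


def build_train_request(pod_name: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request asking the runtime to start training."""
    path = f"/api/v0.1/pods/{urllib.parse.quote(pod_name, safe='')}/train"
    # An empty body is an assumption; the endpoint may accept parameters.
    return urllib.request.Request(base_url + path, data=b"", method="POST")


if __name__ == "__main__":
    request = build_train_request("trader")
    print(request.get_method(), request.full_url)
    # To send it against a running runtime:
    # with urllib.request.urlopen(request) as response:
    #     print(response.status)
```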
Video: https://user-images.githubusercontent.com/80174/146122241-f8073266-ead6-4628-8563-93e98d74e9f0.mov
TensorBoard monitoring
TensorBoard monitoring is now supported when using DQL (default) or the new SACD learning algorithm that was announced in v0.5-alpha.
When enabled, TensorBoard logs will automatically be collected and an "Open TensorBoard" button will be shown on the pod page in the dashboard.
Logging can be enabled at the pod level with the training_loggers pod param or per training run with the CLI --training-loggers argument.
Video: https://user-images.githubusercontent.com/80174/146382503-2bb2570b-5111-4de0-9b80-a1dc4a5dcc35.mov
Support for VPG will be added in v0.6-alpha. The design allows for additional loggers to be added in the future. Let us know what you'd like to see!
New in this release
- Adds a start training button on the dashboard pod page.
- Adds TensorBoard logging and monitoring when using DQL and SACD learning algorithms.
Dependency updates
- Updates to Tailwind 3.0.6
- Updates to Glide Data Grid 3.2.1
- Rust
Published by phillipleblanc about 4 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.5-alpha
We are excited to announce the release of Spice.ai v0.5-alpha! 🥇
Highlights include a new learning algorithm called "Soft Actor-Critic" (SAC), fixes to the behavior of spice upgrade, and a more consistent authoring experience for reward functions.
If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.
Highlights in v0.5-alpha
Soft Actor-Critic (Discrete) (SAC) Learning Algorithm
The addition of the Soft Actor-Critic (Discrete) (SAC) learning algorithm is a significant improvement to the power of the AI engine. It is not set as the default algorithm yet, so to start using it, pass the --learning-algorithm sacd parameter to spice train. We'd love to get your feedback on how it's working!
Consistent reward authoring experience
With the addition of the reward function files that allow you to edit your reward function in a Python file, the behavior of starting a new training session by editing the reward function code was lost. With this release, that behavior is restored.
In addition, there is a breaking change to the variables used to access the observation state and interpretations. This change was made to better reflect the purpose of the variables and make them easier to work with in Python.
| Previous (Type) | New (Type) |
| ----------------------------------- | -------------------------------------- |
| prev_state (SimpleNamespace) | current_state (dict) |
| prev_state.interpretations (list) | current_state_interpretations (list) |
| new_state (SimpleNamespace) | next_state (dict) |
| new_state.interpretations (list) | next_state_interpretations (list) |
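The practical effect of the type change in the table above is that state values move from attribute access to key access. The comparison below is a minimal sketch; the "price" field and its values are hypothetical and not taken from any Spicepod.

```python
from types import SimpleNamespace

# Before v0.5-alpha: state was a SimpleNamespace, read with attribute access.
prev_state = SimpleNamespace(price=100.0)  # "price" is a hypothetical field
new_state = SimpleNamespace(price=105.0)
old_style_reward = new_state.price - prev_state.price

# From v0.5-alpha: state is a dict, read with key access.
current_state = {"price": 100.0}
next_state = {"price": 105.0}
new_style_reward = next_state["price"] - current_state["price"]

print(old_style_reward, new_style_reward)
```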
Improved spice upgrade behavior
The Spice.ai CLI will no longer recommend "upgrading" to an older version. An issue was also fixed where trying to upgrade the Spice.ai CLI using spice upgrade on Linux would return an error.
New in this release
- Adds a new learning algorithm called "Soft Actor-Critic (Discrete)" (SAC).
- Updates the reward function parameters for the YAML code blocks from prev_state and new_state to current_state and next_state to be consistent with the reward function files.
- Fixes an issue where editing a reward functions file would not automatically trigger training.
- Fixes the normalization of values for the Deep-Q Learning algorithm to handle larger values.
- Fixes an issue where the Spice.ai CLI would not upgrade on Linux with the spice upgrade command.
- Fixes an issue where the Spice.ai CLI would recommend an "upgrade" to an older version.
- Rust
Published by phillipleblanc about 4 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.4.1-alpha
Announcing the release of Spice.ai v0.4.1-alpha! ✅
This point release focuses on fixes and improvements to v0.4-alpha. Highlights include AI engine performance improvements, updates to the dashboard observations data grid, notification of new CLI versions, and several bug fixes.
A special acknowledgment to @Adm28, who added the CLI upgrade detection and prompt, which notifies users of new CLI versions and prompts to upgrade.
Highlights in v0.4.1-alpha
AI engine performance improvements
Overall training performance has been improved up to 13% by removing a lock in the AI engine.
In versions before v0.4.1-alpha, performance was especially impacted when streaming new data during a training run.
Dashboard Observations Datagrid
The dashboard observations datagrid now automatically resizes to the window width, and headers are easier to read, with automatic grouping into dataspaces. In addition, column widths are also resizable.
CLI version detection and upgrade prompt
When it is run, the Spice.ai CLI will now automatically check for new CLI versions at most once a day.
If it detects a new version, it will print a notification to the console on spice version, spice run or spice add commands prompting the user to upgrade using the new spice upgrade command.
New in this release
- Adds automatic resizing of the observations datagrid.
- Adds header group by dataspace to the observations datagrid.
- Adds CLI version detection and prompt for upgrade on version, run, and add commands.
- Adds support for parsing hex-encoded times and measurements. Use the time_format of hex or prefix with 0x.
- Updates AI engine with improved training performance.
- Updates Go and NPM dependencies.
- Fixes detection of Spicepods in the Spicepods directory, and a resulting error when loading a non-Spicepod file.
- Fixes a potential "zip slip" security issue.
- Fixes an issue where the AI engine may not shut down gracefully.
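The hex-encoded value support listed above can be pictured with a short sketch. This is not the runtime's parser; it is a hypothetical helper showing one way the two triggers named in the notes (a time_format of hex, or a 0x prefix) could be handled.

```python
def parse_number(value: str, time_format: str = "") -> int:
    """Parse a decimal or hex-encoded integer string.

    A value is treated as hex when time_format is "hex" or when the value
    itself carries a "0x" prefix; otherwise it is read as decimal.
    """
    text = value.strip()
    if time_format == "hex" or text.lower().startswith("0x"):
        return int(text, 16)  # int() accepts an optional "0x" prefix for base 16
    return int(text)


print(parse_number("0x10"))       # hex because of the prefix
print(parse_number("ff", "hex"))  # hex because of the format setting
print(parse_number("42"))         # plain decimal
```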
- Rust
Published by lukekim over 4 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.4-alpha
We are excited to announce the release of Spice.ai v0.4-alpha! 🏄♂️
Highlights include support for authoring reward functions in a code file, the ability to specify the time of recommendation, and ingestion support for transaction/correlation ids. Authoring reward functions in a code file is a significant improvement to the developer experience over specifying functions inline in the YAML manifest, and we are looking forward to your feedback on it!
If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.
Highlights in v0.4-alpha
Upgrade using spice upgrade
The spice upgrade command was added in the v0.3.1-alpha release, so you can now upgrade from v0.3.1 to v0.4 by simply running spice upgrade in your terminal. Special thanks to community member @Adm28 for contributing this feature!
Reward Function Files
In addition to defining reward code inline, it is now possible to author reward code in functions in a separate Python file.
The reward function file path is defined by the reward_funcs property.
A function defined in the code file is mapped to an action by authoring its name in the with property of the relevant reward.
Example:
```yaml
training:
  reward_funcs: my_reward.py
  rewards:
    - reward: buy
      with: buy_reward
    - reward: sell
      with: sell_reward
    - reward: hold
      with: hold_reward
```
Learn more in the documentation: docs.spiceai.org/concepts/rewards/external
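For orientation, a reward function file matching the manifest above might look like the sketch below. The function names come from the example manifest, but the four-parameter signature (following the current_state and next_state variables described in the v0.5-alpha notes), the "price" field, and the reward values are assumptions for illustration; consult the rewards documentation for the exact contract.

```python
# my_reward.py: a hypothetical reward function file.
# The signature and the "price" field are assumptions, not the documented API.


def buy_reward(current_state, current_state_interpretations,
               next_state, next_state_interpretations):
    """Reward buying when the price rises afterwards."""
    return next_state["price"] - current_state["price"]


def sell_reward(current_state, current_state_interpretations,
                next_state, next_state_interpretations):
    """Reward selling when the price falls afterwards."""
    return current_state["price"] - next_state["price"]


def hold_reward(current_state, current_state_interpretations,
                next_state, next_state_interpretations):
    """Give a small constant reward for holding."""
    return 0.1
```

Each function name is what the with property of the corresponding reward refers to.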
Time Categories
Spice.ai can now learn from cyclical patterns, such as daily, weekly, or monthly cycles.
To enable automatic cyclical field generation from the observation time, specify one or more time categories in the pod manifest, such as a month or weekday in the time section.
For example, by specifying month, the Spice.ai engine automatically creates a field in the AI engine data stream called time_month_{month}, with the value calculated from the month to which that timestamp relates.
Example:
```yaml
time:
  categories:
    - month
    - dayofweek
```
Supported category values are:
- month
- dayofmonth
- dayofweek
- hour
Learn more in the documentation: docs.spiceai.org/reference/pod/#time
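To make the idea of generated time fields concrete, here is a minimal sketch that derives fields named time_{category}_{value} from a Unix timestamp. The 0/1 encoding, the use of UTC, and the numbering of days and hours are assumptions made for this example; the release notes only give the time_month_{month} naming pattern.

```python
from datetime import datetime, timezone


def time_category_fields(unix_timestamp: int, categories: list[str]) -> dict[str, int]:
    """Return 0/1 fields such as time_month_11 for a timestamp (interpreted as UTC)."""
    moment = datetime.fromtimestamp(unix_timestamp, tz=timezone.utc)
    # Possible values per category; dayofweek uses Python's Monday=0 convention.
    ranges = {
        "month": (range(1, 13), moment.month),
        "dayofmonth": (range(1, 32), moment.day),
        "dayofweek": (range(0, 7), moment.weekday()),
        "hour": (range(0, 24), moment.hour),
    }
    fields = {}
    for category in categories:
        values, actual = ranges[category]
        for value in values:
            fields[f"time_{category}_{value}"] = 1 if value == actual else 0
    return fields


# 1605729600 falls in November, so time_month_11 is the month field set to 1.
fields = time_category_fields(1605729600, ["month", "dayofweek"])
print(fields["time_month_11"])
```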
Get recommendation for a specific time
It is now possible to specify the time of recommendations fetched from the /recommendation API.
Valid times are from pod epoch_time to epoch_time + period.
Previously the API only supported recommendations based on the time of the last ingested observation.
Requests are made in the following format:
GET http://localhost:8000/api/v0.1/pods/{pod}/recommendation?time={unix_timestamp}
An example for quickstarts/trader:
GET http://localhost:8000/api/v0.1/pods/trader/recommendation?time=1605729600
Specifying {unix_timestamp} as 0 will return a recommendation based on the latest data. An invalid {unix_timestamp} will return a result that has the valid time range in the error message:
```json
{
  "response": {
    "result": "invalid_recommendation_time",
    "message": "The time specified (1610060201) is outside of the allowed range: (1610057600, 1610060200)",
    "error": true
  }
}
```
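A client could build the request URL and pre-check the timestamp before calling the API, as in the sketch below. Both helper names are hypothetical, and treating the allowed range as inclusive at both ends is an assumption inferred from the example error message, not a documented guarantee.

```python
import urllib.parse

BASE_URL = "http://localhost:8000"


def recommendation_url(pod: str, unix_timestamp: int = 0, base_url: str = BASE_URL) -> str:
    """Build the recommendation URL; a timestamp of 0 asks for the latest data."""
    query = urllib.parse.urlencode({"time": unix_timestamp})
    pod_segment = urllib.parse.quote(pod, safe="")
    return f"{base_url}/api/v0.1/pods/{pod_segment}/recommendation?{query}"


def is_valid_recommendation_time(unix_timestamp: int, epoch_time: int, period: int) -> bool:
    """Client-side pre-check: 0 (latest) or within epoch_time to epoch_time + period."""
    if unix_timestamp == 0:
        return True
    return epoch_time <= unix_timestamp <= epoch_time + period


print(recommendation_url("trader", 1605729600))
# Values from the example error: the range is 1610057600 to 1610060200 (period of 2600 seconds).
print(is_valid_recommendation_time(1610060201, epoch_time=1610057600, period=2600))
```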
New in this release
- Adds time categories configuration to the pod manifest to enable learning from cyclical patterns in data - e.g. hour, day of week, day of month, and month
- Adds support for defining reward functions in a rewards functions code file.
- Adds the ability to specify recommendation time making it possible to now see which action Spice.ai recommends at any time during the pod period.
- Adds support for ingestion of transaction/correlation identifiers (e.g. order_id, trace_id) in the pod manifest.
- Adds validation for invalid dataspace names in the pod manifest.
- Adds the ability to resize columns to the dashboard observation data grid.
- Updates to TensorFlow 2.7 and Keras 2.7
- Fixes a bug where data processors were using data connector params
- Fixes a dashboard issue in the pod observations data grid where a column might not be shown.
- Fixes a crash on pod load if the training section is not included in the manifest.
- Fixes an issue where data manager stats errors were incorrectly being printed to console.
- Fixes an issue where selectors may not match due to surrounding whitespace.
- Rust
Published by phillipleblanc over 4 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.3.1-alpha
We are excited to announce the release of Spice.ai v0.3.1-alpha! 🎃
This point release focuses on fixes and improvements to v0.3-alpha. Highlights include the ability to specify both seed and runtime data, to select custom named fields for time and tags, a new spice upgrade command and several bug fixes.
A special acknowledgment to @Adm28, who added the new spice upgrade command, which enables the CLI to self-update, which in turn will auto-update the runtime.
Highlights in v0.3.1-alpha
Upgrade command
The CLI can now be updated using the new spice upgrade command. This command will check for, download, and install the latest Spice.ai CLI release, which will become active on its next run.
When run, the CLI will check for the matching version of the Spice.ai runtime, and will automatically download and install it as necessary.
The version of both the Spice.ai CLI and runtime can be checked with the spice version CLI command.
Seed data
When working with streaming data sources, like market prices, it's often also useful to seed the dataspace with historical data. Spice.ai enables this with the new seed_data node in the dataspace configuration. The syntax is exactly the same as the data syntax. For example:
```yaml
dataspaces:
  - from: coinbase
    name: btcusd
    seed_data:
      connector: file
      params:
        path: path/to/seed/data.csv
      processor:
        name: csv
    data:
      connector: coinbase
      params:
        product_ids: BTC-USD
      processor:
        name: json
```
The seed data will be fetched first, before the runtime data is initialized. Both sets of connectors and processors use the dataspace-scoped measurements, categories and tags for processing, and both data sources are merged into the pod-scoped observation timeline.
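Conceptually, merging seed and runtime data into one timeline resembles merging two time-sorted streams. The sketch below illustrates that idea with the standard library; the Observation type, its fields, and the sample values are hypothetical and do not reflect the runtime's internal data structures.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Observation(NamedTuple):
    time: int      # Unix timestamp in seconds
    source: str    # "seed" or "runtime" (labels used only in this sketch)
    price: float   # hypothetical measurement


def merge_timeline(seed: Iterable[Observation],
                   runtime: Iterable[Observation]) -> Iterator[Observation]:
    """Merge two time-sorted observation streams into one time-ordered timeline."""
    return heapq.merge(seed, runtime, key=lambda obs: obs.time)


seed = [Observation(100, "seed", 9.5), Observation(200, "seed", 9.8)]
runtime = [Observation(150, "runtime", 9.7), Observation(300, "runtime", 10.1)]
timeline = list(merge_timeline(seed, runtime))
print([obs.time for obs in timeline])
```

heapq.merge expects each input to be sorted already, which matches historical seed files and live streams that arrive in time order.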
Time field selectors
Before v0.3.1-alpha, data was required to include a specific time field. In v0.3.1-alpha, the JSON and CSV data processors now support the ability to select a specific field to populate the time field. An example selector to use the created_at column for time is:
```yaml
data:
  processor:
    name: csv
    params:
      time_selector: created_at
```
Tag field selectors
Before v0.3.1-alpha, tags were required to be placed in a _tags field. In v0.3.1-alpha, any field can now be selected to populate tags. Tags are pod-unique string values, and the union of all selected fields will make up the resulting tag list. For example:
```yaml
dataspace:
  from: twitter
  name: tweets
  tags:
    selectors:
      - tags
      - author_id
    values:
      - spiceaihq
      - spicy
```
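The "union of all selected fields" behavior can be sketched as follows. This is an illustrative helper rather than the processor's implementation; the record layout and the handling of list-valued fields are assumptions.

```python
def collect_tags(record: dict, selectors: list[str]) -> list[str]:
    """Return the sorted union of string values found in the selected fields."""
    tags = set()
    for selector in selectors:
        value = record.get(selector)
        if value is None:
            continue
        # A selected field may hold one value or a list of values.
        values = value if isinstance(value, list) else [value]
        tags.update(str(item) for item in values)
    return sorted(tags)


# A hypothetical tweet record; "text" is not selected, so it is ignored.
tweet = {"tags": ["spicy", "spiceaihq"], "author_id": "spiceaihq", "text": "hello"}
print(collect_tags(tweet, ["tags", "author_id"]))
```

Because tags are collected into a set, a value that appears in more than one selected field is only listed once.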
New in this release
- Adds a new spice upgrade command for self-upgrade of the Spice.ai CLI.
- Adds a new seed_data node to the dataspace configuration, enabling the dataspace to be seeded with an alternative source of data.
- Adds the ability to select a custom time field in JSON and CSV data processors with the time_selector parameter.
- Adds the ability to select custom tag fields in the dataspace configuration with selectors list.
- Adds error reporting for AI engine crashes, where previously it would fail silently.
- Fixes the dashboard pods list from "jumping" around due to being unsorted.
- Fixes rare cases where categorical data might be sent to the AI engine in the wrong format.
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.3-alpha
Spice.ai v0.3-alpha
We are excited to announce the release of Spice.ai v0.3-alpha! 🎉
This release adds support for ingestion, automatic encoding, and training of categorical data, enabling more use-cases and datasets beyond just numerical measurements. For example, perhaps you want to learn from data that includes a category of t-shirt sizes, with discrete values, such as small, medium, and large. The v0.3 engine now supports this and automatically encodes the categorical string values into numerical values that the AI engine can use. Also included is a preview of data visualizations in the dashboard, which is helpful for developers as they author Spicepods and dataspaces.
A special acknowledgment to @sboorlagadda, who submitted the first Spice.ai feature contribution from the community ever! He added the ability to list pods from the CLI with the new spice pods list command. Thank you, @sboorlagadda!!!
If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.
Highlights in v0.3-alpha
Categorical data
In v0.1, the runtime and AI engine only supported ingesting numerical data. In v0.2, tagged data was accepted and automatically encoded into fields available for learning. In this release, v0.3, categorical data can now also be ingested and automatically encoded into fields available for learning. This is a breaking change: the manifest format now separates numerical measurements from categorical data.
Pre-v0.3, the manifest author specified numerical data using the fields node.
In v0.3, numerical data is now specified under measurements and categorical data under categories. E.g.
```yaml
dataspaces:
  - from: event
    name: stream
    measurements:
      - name: duration
        selector: length_of_time
        fill: none
      - name: guest_count
        selector: num_guests
        fill: none
    categories:
      - name: event_type
        values:
          - dinner
          - party
      - name: target_audience
        values:
          - employees
          - investors
    tags:
      - tagA
      - tagB
```
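The release notes state that v0.3 uses one-hot encoding for categorical data. The sketch below shows what that means for the event_type category in the manifest above; the generated field names and the error on undeclared values are assumptions of this example, not the engine's actual naming or behavior.

```python
def one_hot(category: str, value: str, allowed_values: list[str]) -> dict[str, int]:
    """Encode one categorical value as 0/1 fields, one per allowed value."""
    if value not in allowed_values:
        raise ValueError(f"{value!r} is not a declared value of {category!r}")
    return {f"{category}_{candidate}": int(candidate == value)
            for candidate in allowed_values}


encoded = one_hot("event_type", "party", ["dinner", "party"])
print(encoded)
```

Each declared value becomes its own numerical field, so the AI engine never has to interpret the string itself.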
Data visualizations preview
A top piece of community feedback was the ability to visualize data. After first running Spice.ai, we'd often hear from developers, "how do I see the data?". A preview of data visualizations is now included in the dashboard on the pod page.
Listing pods
Once the Spice.ai runtime has started, you can view the loaded pods on the dashboard and fetch them via API call localhost:8000/api/v0.1/pods. To make it even easier, we've added the ability to list them via the CLI with the new spice pods list command, which shows the list of pods and their manifest paths.
Coinbase data connector
A new Coinbase data connector is included in v0.3, enabling the streaming of live market ticker prices from Coinbase Pro. Enable it by specifying the coinbase data connector and providing a list of Coinbase Pro product ids, e.g. "BTC-USD". A new sample which demonstrates it is also available, with its associated Spicepod available from the spicerack.org registry. Get it with spice add samples/trader.
Tweet Recommendation Quickstart
A new Tweet Recommendation Quickstart has been added. Given past tweet activity and metrics of a given account, this app can recommend when to tweet, comment, or retweet to maximize for like count, interaction rates, and outreach of said given Twitter account.
Trader Sample
A new Trader Sample has been added in addition to the Trader Quickstart. The sample uses the new Coinbase data connector to stream live Coinbase Pro ticker data for learning.
New in this release
- Adds support for ingesting, encoding, and training on categorical data. v0.3 uses one-hot-encoding.
- Changes Spicepod manifest fields node to measurements and adds the categories node.
- Adds the ability to select a field from the source data and map it to a different field name in the dataspace. See an example for measurements in docs.
- Adds support for JSON content type when fetching from the /observations API. Previously, only CSV was supported.
- Adds a preview version of data visualizations to the dashboard. The grid has several limitations, one of which is it currently cannot be resized.
- Adds the ability to select which learning algorithm to use via the CLI, the API, and specified in the Spicepod manifest. Possible choices are currently "vpg", Vanilla Policy Gradient and "dql", Deep Q-Learning. Shout out to @corentin-pro, who added this feature on his second day on the team!
- Adds the ability to list loaded pods with the CLI command spice pods list.
- Adds a new coinbase data connector for Coinbase Pro market prices.
- Adds a new Tweet Recommendation Quickstart.
- Adds a new Trader Sample.
- Fixes bug where the /observations endpoint was not providing fully qualified field names.
- Fixes issue where debugging messages were printed when using spice add.
- Rust
Published by phillipleblanc over 4 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.2.1-alpha
Spice.ai v0.2.1-alpha
Announcing the release of Spice.ai v0.2.1-alpha! 🚚
This point release focuses on fixes and improvements to v0.2-alpha. Highlights include the ability to specify how missing data should be treated and a new production mode for spiced.
This release supports the ability to specify how the runtime should treat missing data. Previous releases filled missing data with the last value (or initial value) in the series. While this makes sense for some data, i.e., market prices of a stock or cryptocurrency, it does not make sense for discrete data, i.e., ratings. In v0.2.1, developers can now add the fill parameter on a dataspace field to specify the behavior. This release supports fill types previous and none. The default is previous.
Example in a manifest:
```yaml
dataspaces:
  - from: twitter
    name: tweets
    fields:
      - name: likes
        fill: none # The new fill parameter
```
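The difference between the two fill types can be illustrated with a short sketch. This is not the runtime's code; it is a hypothetical helper that uses None to mark missing values and mirrors the described behavior of previous (carry the last or initial value forward) and none (leave gaps as they are).

```python
from typing import Optional


def fill_series(values: list[Optional[float]], fill: str = "previous",
                initial: Optional[float] = None) -> list[Optional[float]]:
    """Apply a fill strategy to a series where None marks a missing value."""
    if fill == "none":
        return list(values)  # leave gaps untouched
    if fill != "previous":
        raise ValueError(f"unsupported fill type: {fill}")
    filled, last = [], initial
    for value in values:
        if value is not None:
            last = value
        filled.append(last)  # carry the last seen (or initial) value forward
    return filled


likes = [3.0, None, None, 5.0]
print(fill_series(likes, "previous"))
print(fill_series(likes, "none"))
```

For a price series, carrying values forward is usually reasonable; for discrete data such as ratings, leaving gaps avoids inventing observations that never happened.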
spiced now defaults to a new production mode when run standalone (not via the CLI), with development mode now explicitly set with the --development flag. Production mode does not activate development time features, such as the Spicepod file watcher. The CLI always runs spiced in development mode as it is not expected to be used in production deployments.
New in this release
- Adds a fill parameter to dataspace fields to specify how missing values should be treated.
- Adds the ability to specify the fill behavior of empty values in a dataspace.
- Simplifies releases with a single spiceai release instead of separate spice and spiced releases.
- Adds an explicit development mode to spiced. Production mode does not activate the file watcher.
- Fixes a bug when the pod parameter epoch_time was not set which would cause data not to be sent to the AI engine.
- Fixes a bug where the User-Agent was not set correctly from CLI calls to api.spicerack.org
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice CLI v0.2-alpha
- Rust
Published by lukekim over 4 years ago
https://github.com/spiceai/spiceai - Spice.ai v0.2-alpha
Spice.ai v0.2-alpha
We are excited to announce the release of Spice.ai v0.2-alpha! 🎉
This release is the first major version since the initial v0.1 announcement and includes significant improvements based upon community and early customer feedback. If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.
Highlights in v0.2-alpha
Tagged data
In the first release, the runtime and AI engine could only ingest numerical data. In v0.2, tagged data is accepted and automatically encoded into fields available for learning. For example, it's now possible to include a "liked" tag when using tweet data, automatically encoded to a 0/1 field for training. Both CSV and the new JSON observation formats support tags. The v0.3 release will add additional support for sets of categorical data.
Streaming data
Previously, the runtime would trigger each data connector to fetch on a 15-second interval. In v0.2, we upgraded the interface for data connectors to a push/streaming model, which enables continuous streaming data into the environment and AI engine.
Interpreted data
Spice.ai works together with your application code and works best when it's provided continuous feedback. This feedback could be from the application itself, for example, ratings, likes, thumbs-up/down, profit from trades, or external expertise. The interpretations API was introduced in v0.1.1, and v0.2 adds AI engine support providing a way to give meaning or an interpretation of ranges of time-series data, which are then available within reward functions. For example, a time range of stock prices could be a "good time to buy," or perhaps Tuesday mornings is a "good time to tweet," and an application or expert can teach the AI engine this through interpretations, providing a shortcut to its learning.
New in this release
- Adds core runtime and AI engine tagged data support
- Adds tagged data support to the CSV processor
- Adds streaming data support to the engine and data connectors
- Adds a new JSON data processor for ingesting JSON data
- Adds a new Twitter data connector with JSON processor support
- Adds a new /pods//dataspaces API
- Adds support for using interpretations in reward functions. Learn more.
- Adds support for downloading zipped pods from the spicerack.org registry
- Adds support for adding data along with the pod manifest when adding a pod from the spicerack.org registry
- Adds basic /pods//diagnostics API
- Fixes pod period, interval, and granularity not being correctly set when trying to use a "d" format
- Fixes the color scheme of action counts in the dashboard to improve readability
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - v0.1.1-alpha
alpha
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice Runtime v0.1.1-alpha
Spice.ai v0.1.1-alpha
Announcing the release of Spice.ai v0.1.1-alpha! 🙌
This is the first point release following the public launch of v0.1-alpha and is focused on fixes and improvements to v0.1-alpha before the bigger v0.2-alpha release.
Highlights include initial support for interpretations and the addition of a new JSON Data Processor which enables observations to be posted in JSON to a new Dataspaces API. The ability to post observations directly to the Dataspace also now makes Data Connectors optional.
Interpretations will enable end-users and external systems to participate in training by providing expert interpretation of the data, ultimately creating smarter pods. v0.1.1-alpha includes the ability to add and get interpretations by API and through import/export of Spicepods. Reward function authors will be able to use interpretations in reward functions from the v0.2-alpha release.
Previously observations could only be added in CSV format. JSON is now supported by calling the new dataspace observations API that leverages the also new JSON processor located in the data-components-contrib repository. The JSON processor defaults to parsing the Spice.ai observation format and is extensible to other schemas.
The dashboard has also been improved to show action counts during a training run, making it easier to visualize the learning process.

New in this release
- Adds visualization of actions counts during a training run in the dashboard.
- Adds a new interpretations API, along with support for importing and exporting interpretations to pods. Learn more.
- Adds a new API for ingesting dataspace observations. Learn more.
- Adds an official DockerHub repository for spiceai/spiceai.
- Fixes bug where the dashboard would not load on browser refresh.
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice CLI v0.1.1-alpha-rc
This is the release candidate 0.1.1-alpha-rc
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice Runtime v0.1.1-alpha-rc
This is the release candidate 0.1.1-alpha-rc
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice Runtime v0.2.0-alpha-rc
This is the release candidate 0.2.0-alpha-rc
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice CLI v0.2.0-alpha-rc
This is the release candidate 0.2.0-alpha-rc
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice Runtime v0.1.0-alpha
Spice.ai v0.1.0-alpha
Announcing the public release of Spice.ai v0.1.0-alpha! 🎉
See the blog post at blog.spiceai.org.
New in this release
- Made public github.com/spiceai/spiceai
- Made public github.com/spiceai/data-components-contrib
- Made public github.com/spiceai/docs
- Made public github.com/spiceai/quickstarts
- Made public github.com/spiceai/samples
- Adds spicerack.org homepage
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice CLI v0.1.0-alpha
Spice.ai v0.1.0-alpha
Announcing the public release of Spice.ai v0.1.0-alpha! 🎉
See the blog post at blog.spiceai.org.
New in this release
- Made public github.com/spiceai/spiceai
- Made public github.com/spiceai/data-components-contrib
- Made public github.com/spiceai/docs
- Made public github.com/spiceai/quickstarts
- Made public github.com/spiceai/samples
- Adds spicerack.org homepage
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice Runtime v0.1.0-alpha-rc
This is the release candidate 0.1.0-alpha-rc
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice CLI v0.1.0-alpha-rc
This is the release candidate 0.1.0-alpha-rc
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice Runtime v0.1.0-alpha.5
Spice.ai v0.1.0-alpha.5
Announcing the release of Spice.ai v0.1.0-alpha.5! 🎉
This release focused on preparation for the public launch of the project, including more comprehensive and easier-to-understand documentation, quickstarts and samples.
Data Connectors and Data Processors have now been moved to their own repository, spiceai/data-components-contrib.
To improve the developer experience, the following breaking changes have been made:
- The pods directory .spice/pods (and thus manifests) and the config file .spice/config.yaml have been moved from the .spice directory to the app root (./). This allows the .spice directory to be added to the .gitignore and manifest changes to be easily tracked in the project.
- Flights have been renamed to the more understandable Training Runs in user interfaces.
New in this release
- Adds Open source acknowledgements to the dashboard
- Adds improved error messages for several scenarios
- Updates all Quickstarts and Samples to be clearer, easier to understand and better show the value of Spice.ai. The LogPruner sample has also been renamed ServerOps
- Updates the dashboard to show a message when no pods have been trained
- Updates all documentation links to docs.spiceai.org
- Updates to use Python 3.8.12
- Fixes bug where the dashboards showed undefined episode number
- Fixes issue where the manifest.json was not being served to the React app
- Fixes the config.yaml being written when not required
- Removes the ability to load a custom dashboard - this may come back in a future release
Breaking changes
- Changes the location of .spice/pods to ./spicepods
- Changes the location of .spice/config.yaml to .spice.config.yaml
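For projects created with an earlier alpha, the relocation above could be scripted. The following is a minimal sketch, not part of Spice.ai itself: it takes the old and new locations verbatim from the breaking-changes list, and the function names `plan_relocations` and `apply_relocations` are hypothetical helpers introduced here for illustration.

```python
from pathlib import Path
from typing import List, Tuple

# Old -> new locations, relative to the app root, copied from the
# breaking-changes list in these release notes.
RELOCATIONS = {
    ".spice/pods": "spicepods",
    ".spice/config.yaml": ".spice.config.yaml",
}


def plan_relocations(app_root: Path) -> List[Tuple[Path, Path]]:
    """Return (source, destination) pairs for old-layout paths that exist."""
    moves = []
    for old, new in RELOCATIONS.items():
        source = app_root / old
        if source.exists():
            moves.append((source, app_root / new))
    return moves


def apply_relocations(app_root: Path) -> None:
    """Move each existing old-layout path to its new location."""
    for source, destination in plan_relocations(app_root):
        # Refuse to overwrite anything already at the new location.
        if destination.exists():
            raise FileExistsError(f"{destination} already exists")
        source.rename(destination)
```

Calling `plan_relocations(Path("."))` first gives a dry run of what would be moved before `apply_relocations` changes anything on disk.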
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice CLI v0.1.0-alpha.5
Spice.ai v0.1.0-alpha.5
Announcing the release of Spice.ai v0.1.0-alpha.5! 🎉
This release focused on preparation for the public launch of the project, including more comprehensive and easier-to-understand documentation, quickstarts and samples.
Data Connectors and Data Processors have now been moved to their own repository, spiceai/data-components-contrib.
To improve the developer experience, the following breaking changes have been made:
- The pods directory .spice/pods (and thus manifests) and the config file .spice/config.yaml have been moved from the .spice directory to the app root (./). This allows the .spice directory to be added to the .gitignore and manifest changes to be easily tracked in the project.
- Flights have been renamed to the more understandable Training Runs in user interfaces.
New in this release
- Adds Open source acknowledgements to the dashboard
- Adds improved error messages for several scenarios
- Updates all Quickstarts and Samples to be clearer, easier to understand and better show the value of Spice.ai. The LogPruner sample has also been renamed ServerOps
- Updates the dashboard to show a message when no pods have been trained
- Updates all documentation links to docs.spiceai.org
- Updates to use Python 3.8.12
- Fixes bug where the dashboards showed undefined episode number
- Fixes issue where the manifest.json was not being served to the React app
- Fixes the config.yaml being written when not required
- Removes the ability to load a custom dashboard - this may come back in a future release
Breaking changes
- Changes the location of .spice/pods to ./spicepods
- Changes the location of .spice/config.yaml to .spice.config.yaml
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice Runtime v0.1.0-alpha.5-rc
This is the release candidate 0.1.0-alpha.5-rc
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice CLI v0.1.0-alpha.5-rc
This is the release candidate 0.1.0-alpha.5-rc
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice Runtime v0.1.0-alpha.4
Spice.ai v0.1.0-alpha.4
Announcing the release of Spice.ai v0.1.0-alpha.4! 🎉
We have a project name update. The project will now be referred to as "Spice.ai" instead of "Spice AI" and the project website will be located at spiceai.org.
This release now uses the new spicerack.org AI package registry instead of fetching packages directly from GitHub.
Added support for importing and exporting Spice.ai pods with spice import and spice export commands.
The CLI has been streamlined by removing the pod command:
- pod add changes from spice pod add <pod path> to just spice add <pod path>
- pod train changes from spice pod train <pod name> to just spice train <pod name>
We've also updated the names of some concepts:
- "DataSources" are now "Dataspaces"
- "Inference" is now "Recommendation"
New in this release
- Adds a new Gardener to intelligently decide on the best time to water a simulated garden
- Adds support for importing and exporting Spice.ai pods with spice import and spice export commands
- Adds a complete end-to-end test suite
- Adds installing by friendly URL: curl https://install.spiceai.org | /bin/bash
- Adds the spice binary to PATH automatically by shell config (e.g. .bashrc, .zshrc)
- Adds support for targeting hosting contexts (docker or metal) specifically with a --context command line flag
- Removes the model downloader. This will return with better support in a later version
- Updates Trader quickstart with demo Node.js application to better demonstrate its use
- Updates LogPruner quickstart with demo PowerShell Core script to better demonstrate its use
- Updates TensorFlow from 2.5.0 to 2.5.1
- Fixes potential mismatch of CLI and runtime by only automatically upgrading to the same version
- Fixes issue with .spice/config.yml creation in Docker due to incorrect permissions
- Fixes dashboard title from React App to Spice.ai
Breaking changes
- Changes datasources section in the pod manifest to dataspaces
- Changes /api/v0.1/pods/<pod>/inference API to /api/v0.1/pods/<pod>/recommendation
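The two renames above are mechanical, so client code and stored manifests could be updated with a small helper. This is a minimal sketch under stated assumptions: the manifest is assumed to be already parsed into a dict with `datasources` as a top-level key, and the function names are hypothetical rather than part of the Spice.ai CLI or runtime. Only the section names and the two API paths come from the release notes.

```python
from copy import deepcopy

# API prefix as it appears in the breaking-changes list.
API_PREFIX = "/api/v0.1/pods"


def migrate_manifest(manifest: dict) -> dict:
    """Return a copy of a parsed pod manifest with `datasources` renamed to `dataspaces`."""
    migrated = deepcopy(manifest)
    if "datasources" in migrated:
        # Avoid silently discarding a section if both names are present.
        if "dataspaces" in migrated:
            raise ValueError("manifest has both 'datasources' and 'dataspaces'")
        migrated["dataspaces"] = migrated.pop("datasources")
    return migrated


def recommendation_path(pod: str) -> str:
    """Build the renamed recommendation path for a pod."""
    return f"{API_PREFIX}/{pod}/recommendation"


def migrate_api_path(path: str) -> str:
    """Rewrite an old .../inference path to .../recommendation; leave other paths alone."""
    suffix = "/inference"
    if path.startswith(API_PREFIX + "/") and path.endswith(suffix):
        return path[: -len(suffix)] + "/recommendation"
    return path
```

For example, `migrate_api_path("/api/v0.1/pods/trader/inference")` yields the new recommendation path for a pod named `trader`, while paths that do not match the old pattern pass through unchanged.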
- Rust
Published by github-actions[bot] over 4 years ago
https://github.com/spiceai/spiceai - Spice CLI v0.1.0-alpha.4
Spice.ai v0.1.0-alpha.4
Announcing the release of Spice.ai v0.1.0-alpha.4! 🎉
We have a project name update. The project will now be referred to as "Spice.ai" instead of "Spice AI" and the project website will be located at spiceai.org.
This release now uses the new spicerack.org AI package registry instead of fetching packages directly from GitHub.
The CLI has been streamlined by removing the pod command:
- pod add changes from spice pod add <pod path> to just spice add <pod path>
- pod train changes from spice pod train <pod name> to just spice train <pod name>
We've also updated the names of some concepts:
- "DataSources" are now "Dataspaces"
- "Inference" is now "Recommendation"
New in this release
- Adds a new Gardener to intelligently decide on the best time to water a simulated garden
- Adds a complete end-to-end test suite
- Adds installing by friendly URL: curl https://install.spiceai.org | /bin/bash
- Adds the spice binary to PATH automatically by shell config (e.g. .bashrc, .zshrc)
- Adds support for targeting hosting contexts (docker or metal) specifically with a --context command line flag
- Removes the model downloader. This will return with better support in a later version
- Updates the Trader quickstart (https://github.com/spiceai/quickstarts/tree/trunk/trader) with a demo Node.js application to better demonstrate its use
- Updates the LogPruner quickstart (https://github.com/spiceai/quickstarts/tree/trunk/logpruner) with a demo PowerShell Core script to better demonstrate its use
- Updates TensorFlow from 2.5.0 to 2.5.1
- Fixes potential mismatch of CLI and runtime by only automatically upgrading to the same version
- Fixes issue with .spice/config.yml creation in Docker due to incorrect permissions
- Fixes dashboard title from React App to Spice.ai
Breaking changes
- Changes datasources section in the pod manifest to dataspaces
- Changes /api/v0.1/pods/<pod>/inference API to /api/v0.1/pods/<pod>/recommendation
- Rust
Published by github-actions[bot] over 4 years ago