Recent Releases of https://github.com/spiceai/spiceai

https://github.com/spiceai/spiceai -

- Rust
Published by Jeadie 6 months ago

https://github.com/spiceai/spiceai - v1.6.0

Spice v1.6.0 (Aug 26, 2025)

Spice 1.6.0 upgrades DataFusion to v48, reducing expressions memory footprint by ~50% for faster planning and lower memory usage, eliminating unnecessary projections in queries, optimizing string functions like ascii and character_length for up to 3x speedup, and accelerating unbounded aggregate window functions by 5.6x. The release adds Kafka and MongoDB connectors for real-time streaming and NoSQL data acceleration, supports OpenAI Responses API for advanced model interactions including OpenAI-hosted tools like web_search and code_interpreter, improves the OpenAI Embeddings Connector with usage tier configuration for higher throughput via increased concurrent requests, introduces Model2Vec embeddings for ultra-low-latency encoding, and improves the Amazon S3 Vectors engine to support multi-column primary keys.

What's New in v1.6.0

DataFusion v48 Highlights

Spice.ai is built on the DataFusion query engine. The v48 release brings:

Performance & Size Improvements 🚀: Expressions memory footprint was reduced by ~50% resulting in faster planning and lower memory usage, with planning times improved by 10-20%. There are now fewer unnecessary projections in queries. The string functions, ascii and character_length were optimized for improved performance, with character_length achieving up to 3x speedup. Queries with unbounded aggregate window functions have improved performance by 5.6 times via avoided unnecessary computation for constant results across partitions. The Expr struct size was reduced from 272 to 144 bytes.

New Features & Enhancements ✨: Support was added for ORDER BY ALL for easy ordering of all columns in a query.

See the Apache DataFusion 48.0.0 Blog for details.

Runtime Highlights

Amazon S3 Vectors Multi-Column Primary Keys: The Amazon S3 Vectors engine now supports datasets with multi-column primary keys. This enables vector indexes for datasets where more than one column forms the primary key, such as those splitting documents into chunks for retrieval contexts. For multi-column keys, Spice serializes the keys using arrow-json format, storing them as single string keys in the vector index.
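The idea of collapsing a multi-column key into one string can be sketched as follows. This is only an illustration: it uses Python's standard `json` module as a stand-in, and the exact arrow-json layout Spice writes may differ. The column names `doc_id` and `chunk_id` and both helper functions are hypothetical.

```python
import json


def serialize_key(row: dict, key_columns: list[str]) -> str:
    """Join the primary-key columns of one row into a single string key."""
    # Keep only the key columns, in the declared order, so the same row
    # always produces the same string.
    key = {col: row[col] for col in key_columns}
    return json.dumps(key, separators=(",", ":"))


def deserialize_key(key: str) -> dict:
    """Recover the individual key columns from a stored string key."""
    return json.loads(key)


# Hypothetical chunked-document row: (doc_id, chunk_id) is the primary key.
row = {"doc_id": 42, "chunk_id": 3, "text": "..."}
key = serialize_key(row, ["doc_id", "chunk_id"])
print(key)  # {"doc_id":42,"chunk_id":3}
```

Because the string round-trips back to its columns, a vector index that only accepts single string keys can still be joined to the source rows.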

Model2Vec Embeddings: Spice now supports model2vec static embeddings with a new model2vec embeddings provider, for sentence transformers up to 500x faster and 15x smaller, enabling scenarios requiring low latency and high-throughput encoding.

```yaml
embeddings:
  - from: model2vec:minishlab/potion-base-8M # HuggingFace model
    name: potion
  - from: model2vec:path/to/my/local/model # local model
    name: local
```

Learn more in the Model2Vec Embeddings documentation.

Kafka Data Connector: Use from: kafka:<topic> to ingest data directly from Kafka topics for integration with existing Kafka-based event streaming infrastructure, providing real-time data acceleration and query without additional middleware.

Example Spicepod.yml:

```yaml
- from: kafka:orders_events
  name: orders
  acceleration:
    enabled: true
    refresh_mode: append
  params:
    kafka_bootstrap_servers: server:9092
```

Learn more in the Kafka Data Connector documentation.

MongoDB Data Connector: Use from: mongodb:<dataset> to access and accelerate data stored in MongoDB, deployed on-premises or in the cloud.

Example spicepod.yml:

```yaml
datasets:
  - from: mongodb:my_dataset
    name: my_dataset
    params:
      mongodb_host: localhost
      mongodb_db: my_database
      mongodb_user: my_user
      mongodb_pass: password
```

Learn more in the MongoDB Data Connector documentation.

OpenAI Responses API Support: The OpenAI Responses API (/v1/responses) is now supported, which is OpenAI's most advanced interface for generating model responses.

You can now make requests to any Responses-compatible model using the new /v1/responses endpoint.

Example curl request:

```bash
curl http://localhost:8090/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "input": "Tell me a three sentence bedtime story about Spice AI."
  }'
```
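The same request can be sketched from Python using only the standard library. The endpoint, port, model, and payload fields are taken from the curl example; the helper names `build_responses_request` and `send` are hypothetical, and the shape of the response body is not assumed here.

```python
import json
import urllib.request


def build_responses_request(
    prompt: str,
    model: str = "gpt-4.1",
    base_url: str = "http://localhost:8090",
) -> urllib.request.Request:
    """Build a POST request for the /v1/responses endpoint."""
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/responses",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (needs a running runtime)."""
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


req = build_responses_request("Tell me a three sentence bedtime story about Spice AI.")
# send(req) would perform the call against a locally running Spice runtime.
```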

To use responses in spice chat, use the --responses flag.

Example:

```bash
spice chat --responses # Use the `/v1/responses` endpoint for all completions instead of `/v1/chat/completions`
```

Use OpenAI-hosted tools supported by OpenAI's Responses API by specifying the openai_responses_tools parameter.

Example spicepod.yml:

```yaml
models:
  - name: test
    from: openai:gpt-4.1
    params:
      openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }
      tools: sql, list_datasets
      openai_responses_tools: web_search, code_interpreter # 'code_interpreter' or 'web_search'
```

These OpenAI-specific tools are only available from the /v1/responses endpoint. Any other tools specified via the tools parameter are available from both the /v1/chat/completions and /v1/responses endpoints.

Learn more in the OpenAI Model Provider documentation.

OpenAI Embeddings & Models Connectors Usage Tier: The OpenAI Embeddings and Models Connectors now support specifying the account usage tier for embeddings and model requests. Raising the number of concurrent requests improves the performance of generating text embeddings or calling models during dataset load and search.

Example spicepod.yml:

```yaml
embeddings:
  - from: openai:text-embedding-3-small
    name: openai_embed
    params:
      openai_usage_tier: tier1
```

By setting the usage tier to the matching usage tier for your OpenAI account, the Embeddings and Models Connector will increase the maximum number of concurrent requests to match the specified tier.

Learn more in the OpenAI Model Provider documentation.

Contributors

New Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

The Spice Cookbook includes 77 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.6.0, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.6.0 image:

```console
docker pull spiceai/spiceai:1.6.0
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

AWS Marketplace:

🎉 Spice is also now available in the AWS Marketplace!

What's Changed

Dependencies

Changelog

  • Support Streaming with Tool Calls (#6941) by @Advayp in #6941
  • Fix parameterized query planning in DataFusion (#6942) by @Jeadie in #6942
  • Update the UnableToLoadCredentials error with a pointer to docs (#6937) by @phillipleblanc in #6937
  • Fix spicecloud benchmark (#6935) by @krinart in #6935
  • [Debezium] Support for VariableScaleDecimal (#6934) by @krinart in #6934
  • Update to DF 48 (#6665) by @mach-kernel and @kczimm in #6665
  • Mark append-stream and CDC datasets as ready after first message (#6914) by @sgrebnov in #6914
  • Model2Vec embedding model support (#6846) by @mach-kernel in #6846
  • Update snapshot for S3 vector search test (#6920) by @Jeadie in #6920
  • remove [] from queryset in spicepod path for CI (#6919) by @Jeadie in #6919
  • Remove verbose tracing (#6915) by @Jeadie in #6915
  • Refactor how models supporting the Responses API are loaded (#6912) by @Advayp in #6912
  • Write tests for truncate formatting in arrow_tools and fix bug. (#6900) by @Jeadie in #6900
  • Support using the Responses API from spice chat (#6894) by @Advayp in #6894
  • Include GPT-5 into Text-To-SQL and Financebench benchmarks (#6907) by @sgrebnov in #6907
  • Better error message when credentials aren't loaded for S3 Vectors (#6910) by @phillipleblanc in #6910
  • Add tracing and system prompt support for the Responses API (#6893) by @Advayp in #6893
  • Constraint violation check is improved to control behavior when violations occur within a batch (#6897) by @phillipleblanc in #6897
  • fix: Multi-column text search with v1/search (#6905) by @peasee in #6905
  • fix: Correctly project text search primary keys to underlying projection (#6904) by @peasee in #6904
  • fix: Update benchmark snapshots (#6901) by @app/github-actions in #6901
  • In S3vector, do not pushdown on non-filterable columns (#6884) by @Jeadie in #6884
  • Run E2E Test CI macOS build on bigger runners (#6896) by @phillipleblanc in #6896
  • Enable configuration of the Responses API for the Azure model provider (#6891) by @Advayp in #6891
  • fix: Update benchmark snapshots (#6888) by @app/github-actions in #6888
  • Update OpenAPI specification for /v1/responses (#6889) by @Advayp in #6889
  • Add test to ensure tools are injected correctly in the Responses API (#6886) by @Advayp in #6886
  • Enable embeddings for append streams (#6878) by @sgrebnov in #6878
  • Show correct limit for EXPLAIN plans in S3VectorsQueryExec (#6852) by @Jeadie in #6852
  • Responses API support for Azure Open AI (#6879) by @Advayp in #6879
  • fix: Update search test case structure (#6865) by @peasee in #6865
  • Fix mongodb benchmark (#6883) by @phillipleblanc in #6883
  • Support multiple column primary keys for S3 vectors. (#6775) by @Jeadie in #6775
  • Kafka Data Connector: persist consumer between restarts (#6870) by @sgrebnov in #6870
  • Fix newlines in errors added in recent PRs (#6877) by @phillipleblanc in #6877
  • Add override parameter to force support for the Responses API (#6871) by @Advayp in #6871
  • Don't use metadata columns in VectorScanTableProvider (#6854) by @Jeadie in #6854
  • Add non-streaming tool call support (hosted and Spice tools) via the Responses API (#6869) by @Advayp in #6869
  • Update error guideline to remove newlines + remove newlines from error messages. (#6866) by @phillipleblanc in #6866
  • Remove void acceleration engine + optional table behaviors (#6868) by @phillipleblanc in #6868
  • Kafka Data Connector basic support (#6856) by @sgrebnov in #6856
  • Federated+Accelerated TPCH Benchmarks for MongoDB (#6788) by @krinart in #6788
  • Pass embeddings calculated in compute_index to the acceleration (#6792) by @phillipleblanc in #6792
  • Add non-streaming and streaming support for OpenAI Responses API endpoint (#6830) by @Advayp in #6830
  • Use latest version of OpenAI crate to resolve issues with Service Tier deserialization (#6853) by @Advayp in #6853
  • Update openapi.json (#6799) by @app/github-actions in #6799
  • Improve management message (#6850) by @lukekim in #6850
  • fix: Include FTS search column if it is the PK (#6836) by @peasee in #6836
  • Refactor Health Checks (#6848) by @Advayp in #6848
  • Introduce a Responses trait and LLM registry for model providers that support the OpenAI Responses API (#6798) by @Advayp in #6798
  • fix: Update datafusion-table-providers to include constraints (#6837) by @peasee in #6837
  • Bump postcard from 1.1.2 to 1.1.3 (#6841) by @app/dependabot in #6841
  • Bump governor from 0.10.0 to 0.10.1 (#6835) by @app/dependabot in #6835
  • Bump ctor from 0.2.9 to 0.5.0 (#6827) by @app/dependabot in #6827
  • Bump azure_core from 0.26.0 to 0.27.0 (#6826) by @app/dependabot in #6826
  • Bump rstest from 0.25.0 to 0.26.1 (#6825) by @app/dependabot in #6825
  • Use latest commit in our fork of async-openai (#6829) by @Advayp in #6829
  • Bump rustls from 0.23.27 to 0.23.31 (#6824) by @app/dependabot in #6824
  • Bump async-trait from 0.1.88 to 0.1.89 (#6823) by @app/dependabot in #6823
  • Bump hyper from 1.6.0 to 1.7.0 (#6814) by @app/dependabot in #6814
  • Bump serde_json from 1.0.140 to 1.0.142 (#6812) by @app/dependabot in #6812
  • Add s3 vector test retrieving vectors (#6786) by @Jeadie in #6786
  • fix: Allow v1/search with only FTS (#6811) by @peasee in #6811
  • Bump tantivy from 0.24.1 to 0.24.2 (#6806) by @app/dependabot in #6806
  • Bump tokio-util from 0.7.15 to 0.7.16 (#6810) by @app/dependabot in #6810
  • fix: Improve FTS index primary key handling (#6809) by @peasee in #6809
  • Bump logos from 0.15.0 to 0.15.1 (#6808) by @app/dependabot in #6808
  • Bump hf-hub from 0.4.2 to 0.4.3 (#6807) by @app/dependabot in #6807
  • Bump odbc-api from 13.0.1 to 13.1.0 (#6803) by @app/dependabot in #6803
  • fix: Spice search CLI with FTS supports string or slice unmarshalling (#6805) by @peasee in #6805
  • Bump uuid from 1.17.0 to 1.18.0 (#6797) by @app/dependabot in #6797
  • Bump reqwest from 0.12.22 to 0.12.23 (#6796) by @app/dependabot in #6796
  • Bump anyhow from 1.0.98 to 1.0.99 (#6795) by @app/dependabot in #6795
  • Bump clap from 4.5.41 to 4.5.45 (#6794) by @app/dependabot in #6794
  • Respect default MAX_DECODING_MESSAGE_SIZE (100MB) in Flight API (#6802) by @sgrebnov in #6802
  • Fix compilation errors caused by upgrading async-openai (#6793) by @Advayp in #6793
  • Remove outdated vector search benchmark (replaced with testoperator) (#6791) by @sgrebnov in #6791
  • Handle errors in vector ingestion pipeline (#6782) by @phillipleblanc in #6782
  • fix: Explicitly error when chunking is defined for vector engines (#6787) by @peasee in #6787
  • Make VectorScanTableProvider and VectorQueryTableProvider support multi-column primary keys (#6757) by @Jeadie in #6757
  • Use megascience/megascience Q+A dataset for text search testing. (#6702) by @Jeadie in #6702
  • Flight REPL autocomplete (#6589) by @krinart in #6589
  • use ref: github.event.pull_request.head.sha in integration_models.yml (#6780) by @Jeadie in #6780
  • fix: Move search telemetry calls in UDTF to scan (#6778) by @peasee in #6778
  • Fix Hugging Face models and embeddings loading in Docker (#6777) by @ewgenius in #6777
  • feat: Migrate bedrock rate limiter (#6773) by @peasee in #6773
  • Run the PR checks on the DEV runners (#6769) by @phillipleblanc in #6769
  • feat: add OpenAI models rate controller (#6767) by @peasee in #6767
  • Implement MongoDB data connector (#6594) by @krinart in #6594
  • fix: Use head ref for concurrency group (#6770) by @peasee in #6770
  • fix: Run enforce pulls with spice on pull_request_target (#6768) by @peasee in #6768
  • feat: Add OpenAI Embeddings Rate Controller (#6764) by @peasee in #6764
  • Move AWS SDK credential bridge integration test to the existing AWS SDK integration test run (#6766) by @phillipleblanc in #6766
  • Use Spice specific errors instead of OpenAIError in embedding module (#6748) by @kczimm in #6748
  • Use context in Glue Catalog Provider (#6763) by @Advayp in #6763
  • pin cargo-deny to previous version (#6762) by @kczimm in #6762
  • Bump actions/download-artifact from 4 to 5 (#6720) by @app/dependabot in #6720
  • Upgrade dependabot dependencies (#6754) by @phillipleblanc in #6754
  • Set E2E Test CI models build to 90 minute timeout (#6756) by @phillipleblanc in #6756
  • chore: upgrade to Rust 1.87.0 (#6614) by @kczimm in #6614
  • feat: Add initial runtime-rate-limiter crate (#6753) by @peasee in #6753
  • feat: Add more embedding traces, add MiniLM MTEB spicepod (#6742) by @peasee in #6742
  • Update QA analytics for release (#6740) by @Advayp in #6740
  • Always use 'returnData: true' for s3 vector query index (#6741) by @Jeadie in #6741
  • feat: Add Embedding and Search anonymous telemetry (#6737) by @peasee in #6737
  • Add 1.5.2 to SECURITY.md (#6739) by @ewgenius in #6739
  • Combine the Iceberg and Object Store AWS SDK bridges into one crate (#6718) by @Advayp in #6718
  • Updates to v1.5.2 release notes (#6736) by @lukekim in #6736
  • Update end game template - move glue catalog to catalogs section (#6732) by @ewgenius in #6732
  • Update v1.5.2.md (#6735) by @kczimm in #6735
  • Add note about S3 Vectors workaround (#6734) by @phillipleblanc in #6734
  • feat: Avoid joining for VectorScanTableProvider if the index is sufficient (#6714) by @peasee in #6714
  • update changelog (#6729) by @kczimm in #6729
  • remove unneeded autogenerated s3 vector code (#6715) by @Jeadie in #6715
  • fix: Set S3 vectors default limit to 30, add more tracing (#6712) by @peasee in #6712
  • docs: Add Hadoop cookbook to endgame template (#6708) by @peasee in #6708
  • Fix testoperator append mode compilation error (#6706) by @phillipleblanc in #6706
  • test: Add VectorScanTableProvider snapshot tests (#6701) by @peasee in #6701
  • feat: Add Hadoop catalog-mode benchmark (#6684) by @peasee in #6684
  • Move shared AWS crates used in bridges to workspace (#6705) by @Advayp in #6705
  • Use installation id to group connections (#6703) by @Advayp in #6703
  • Add Guardrails for AWS bedrock models (#6692) by @Jeadie in #6692
  • Update bedrock keys for CI. (#6693) by @Jeadie in #6693
  • Update acknowledgements (#6690) by @app/github-actions in #6690
  • ROADMAP updates Aug 1, 2025 (#6667) by @lukekim in #6667
  • Add retry logic for OpenAI embeddings creation (#6656) by @sgrebnov in #6656
  • Make models E2E chat test more robust (#6657) by @sgrebnov in #6657
  • Update Search GH Workflow to use Test Operator (#6650) by @sgrebnov in #6650
  • Score and P95 latency calculation for MTEB Quora-based vector search tests in Test Operator (#6640) by @sgrebnov in #6640
  • Fix multiple query error being classified as an internal error (#6635) by @Advayp in #6635
  • Add Support for S3 Table Buckets (#6573) by @krinart in #6573
  • set MISTRALRS_METAL_PRECOMPILE=0 for metal (#6652) by @kczimm in #6652
  • Vector search to push down udtf limit argument into logical sort plan (#6636) by @mach-kernel in #6636
  • docs: Update qa_analytics.csv (#6643) by @peasee in #6643
  • Update SECURITY.md (#6642) by @Jeadie in #6642
  • docs: Update qa_analytics.csv (#6641) by @peasee in #6641
  • Separate token usage (#6619) by @Advayp in #6619
  • Fix typo in release notes (#6634) by @Advayp in #6634
  • Add environment variable for org token (#6633) by @Advayp in #6633
  • CDC: Compute embeddings on ingest (#6612) by @mach-kernel in #6612
  • Add view name to view creation errors (#6611) by @lukekim in #6611
  • Add core logic for running MTEB Quora-based vector search tests in Test Operator (#6607) by @sgrebnov in #6607
  • Revert "Update generate-openapi.yml (#6584)" (#6620) by @Jeadie in #6620
  • Non-accelerated views should report as ready only after all dependent datasets are ready (#6617) by @sgrebnov in #6617

- Rust
Published by sgrebnov 6 months ago

https://github.com/spiceai/spiceai - v1.5.2

Spice v1.5.2 (Aug 4, 2025)

Spice v1.5.2 introduces a new Amazon Bedrock Models Provider for Converse API (Nova) compatible models, AWS Redshift support using the Postgres data connector, and Hadoop Catalog support for Iceberg tables, along with several bug fixes and improvements.

What's New in v1.5.2

Amazon Bedrock Models Provider: Adds a new Amazon Bedrock LLM Provider. Models compatible with the Converse API (Nova) are supported.

Amazon Bedrock provides access to a range of foundation models for generative AI. Spice supports using Bedrock-hosted models by specifying the bedrock prefix in the from field and configuring the required parameters.

Supported Model IDs:

  • us.amazon.nova-lite-v1:0
  • us.amazon.nova-micro-v1:0
  • us.amazon.nova-premier-v1:0
  • us.amazon.nova-pro-v1:0

Refer to the Amazon Bedrock documentation for details on available models.

Example Spicepod.yaml:

```yaml
models:
  - from: bedrock:us.amazon.nova-lite-v1:0
    name: novash
    params:
      aws_region: us-east-1
      aws_access_key_id: ${ secrets:AWS_ACCESS_KEY_ID }
      aws_secret_access_key: ${ secrets:AWS_SECRET_ACCESS_KEY }
      bedrock_guardrail_identifier: arn:aws:bedrock:abcdefg012927:0123456789876:guardrail/hello
      bedrock_guardrail_version: DRAFT
      bedrock_trace: enabled
      bedrock_temperature: 42
```

For more information, see the Amazon Bedrock Documentation.

AWS Redshift Support for Postgres Data Connector: Spice now supports connecting to Amazon Redshift using the PostgreSQL data connector. Redshift is a columnar OLAP database compatible with PostgreSQL, allowing you to use the same connector and configuration parameters.

To connect to Redshift, use the format postgres:schema.table in your Spicepod and set the connection parameters to match your Redshift cluster settings.

Example Spicepod.yaml:

```yaml
# Example datasets for Redshift TPCH tables
datasets:
  - from: postgres:public.customer
    name: customer
    params:
      pg_host: ${secrets:PGHOST}
      pg_port: 5439
      pg_sslmode: prefer
      pg_db: dev
      pg_user: ${secrets:PGUSER}
      pg_pass: ${secrets:PGPASS}
  - from: postgres:public.lineitem
    name: lineitem
    params:
      pg_host: ${secrets:PGHOST}
      pg_port: 5439
      pg_sslmode: prefer
      pg_db: dev
      pg_user: ${secrets:PGUSER}
      pg_pass: ${secrets:PGPASS}
```

Redshift types are mapped to PostgreSQL types. See the PostgreSQL connector documentation for details on supported types and configuration.

Hadoop Catalog Support for Iceberg: The Iceberg Data and Catalog connectors now support connecting to Hadoop catalogs on local filesystem (file://) or S3 object storage (s3://, s3a://). This enables connecting to Iceberg catalogs without a separate catalog provider service.

Example Spicepod.yaml:

```yaml
catalogs:
  - from: iceberg:file:///tmp/hadoopwarehouse/
    name: localhadoop
  - from: iceberg:s3://my-bucket/hadoopwarehouse/
    name: s3hadoop

# Example datasets
datasets:
  - from: iceberg:file:///data/hadoopwarehouse/test/mytable1
    name: localhadoop
  - from: iceberg:s3://my-bucket/hadoopwarehouse/test/mytable2
    name: s3hadoop
```

For more details, see the Iceberg Data Connector documentation and the Iceberg Catalog Connector documentation.

Contributors

Breaking Changes

  • N/A

Cookbook Updates

The Spice Cookbook includes 75 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.2, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.5.2 image:

```console
docker pull spiceai/spiceai:1.5.2
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

AWS Marketplace:

🎉 Spice is also now available in the AWS Marketplace!

What's Changed

Dependencies

No major dependency updates.

Changelog

  • fixes for databricks OpenAI compatibility (#6629) by @Jeadie in #6629
  • Update spicepod.schema.json (#6632) by @app/github-actions in #6632
  • Remove 'stream_options' from databricks LLMs (#6637) by @Jeadie in #6637
  • Move retry and rate limiting logic for Amazon bedrock out of embeddings. (#6626) by @Jeadie in #6626
  • Disable Metal precompilation in integration_llms.yml (#6649) by @Jeadie in #6649
  • fix: Hadoop integration test (#6660) by @peasee in #6660
  • feat: Add Hadoop Catalog Data Component (#6658) by @peasee in #6658
  • update datafusion-table-providers to latest spiceai tag (#6661) by @mach-kernel in #6661
  • feat: Add Hadoop Catalog connectors for Iceberg (#6659) by @peasee in #6659
  • Make FullTextSearchExec robust to RecordBatch column ordering. (#6675) by @Jeadie in #6675
  • Make 'runtime-object-store' crate (#6674) by @Jeadie in #6674
  • fix: Support include for Iceberg (#6663) by @peasee in #6663
  • feat: Add Hadoop TPCH benchmark (#6678) by @peasee in #6678
  • feat: Add Hadoop metadata_path parameter (#6680) by @peasee in #6680
  • fix: Automatically infer Hadoop warehouse scheme (#6681) by @peasee in #6681
  • Amazon Bedrock, specifically Nova models (#6673) by @Jeadie in #6673
  • fix perplexity_auth_token parameters for web_search (#6685) by @Jeadie in #6685
  • Fix AWS Auth issue (#6699) by @Advayp in #6699
  • Limit Concurrent Requests for GitHub (#6672) by @Advayp in #6672
  • Add runtime parameter to enable more permissive parquet reading when page indexes are missing (#6716) by @phillipleblanc in #6716
  • Improve Flight REPL error messages (#6696) by @lukekim in #6696
  • Fixes from search tests (#6710) by @Jeadie in #6710

- Rust
Published by ewgenius 7 months ago

https://github.com/spiceai/spiceai - v1.5.1

Spice v1.5.1 (July 28, 2025)

Spice v1.5.1 expands the GitHub data connector to include pull-request comments, adds configurable rate limiting for AWS Bedrock embedding models, expands partition pruning with inequality operators, and adds client-supplied cache keys for granular caching control in the HTTP and Arrow Flight SQL APIs.

What's New in v1.5.1

GitHub Data Connector Pull Request Comments: Configure GitHub pulls datasets to include comments.

Example Spicepod.yaml:

```yaml
datasets:
  - from: github:github.com/spiceai/spiceai/pulls
    name: spiceai.pulls
    params:
      include_comments: all # 'review', 'discussion', or 'none'. Defaults to 'none'.
      max_comments_fetched: '25' # Defaults to 100
      # ...
```

For details, see the GitHub Data Connector documentation.

AWS Bedrock Embedding Models Invocation Control: Improved rate limiting control for AWS Bedrock embedding models with max_concurrent_invocations configuration.

```yaml
embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      max_concurrent_invocations: '41'
      # ...
```

For details, see the AWS Bedrock Embeddings Model Provider documentation.
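What a cap on concurrent invocations means in practice can be sketched with an `asyncio.Semaphore`. This is a conceptual illustration, not Spice's implementation: `embed_all` and `fake_embed` are hypothetical, and the fake call simply sleeps in place of a real Bedrock request.

```python
import asyncio


async def embed_all(texts, embed_one, max_concurrent_invocations):
    """Embed every text while keeping at most N calls in flight at once."""
    semaphore = asyncio.Semaphore(max_concurrent_invocations)

    async def guarded(text):
        # Each call waits here until one of the N slots is free.
        async with semaphore:
            return await embed_one(text)

    return await asyncio.gather(*(guarded(t) for t in texts))


# Track how many fake calls overlap so the cap can be observed.
stats = {"in_flight": 0, "peak": 0}


async def fake_embed(text):
    """Stand-in for a remote embedding call."""
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)
    stats["in_flight"] -= 1
    return [float(len(text))]


vectors = asyncio.run(embed_all([f"doc {i}" for i in range(20)], fake_embed, 4))
print(len(vectors), "embeddings, peak concurrency", stats["peak"])
```

A lower cap trades throughput for staying under provider rate limits; a higher cap does the reverse.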

Improved Query Partitioning: Expanded partition pruning support with additional inequality operators (e.g. >, >=, <, <=).

For details, see the Query Partitioning documentation.
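Partition pruning with inequality predicates can be modelled roughly as below. This is a simplified sketch that assumes each partition is described by a minimum and maximum value; how Spice actually represents and prunes partitions is not specified here, and the names `may_match` and `prune` are hypothetical.

```python
import operator

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def may_match(part_min, part_max, op, value):
    """Return True if some value in [part_min, part_max] can satisfy `col op value`."""
    if op in (">", ">="):
        # Only the largest value in the partition needs to pass.
        return OPS[op](part_max, value)
    # For < and <=, only the smallest value needs to pass.
    return OPS[op](part_min, value)


def prune(partitions, op, value):
    """Keep only the partitions that could contain matching rows."""
    return [name for name, (lo, hi) in partitions.items() if may_match(lo, hi, op, value)]


partitions = {"p0": (0, 99), "p1": (100, 199), "p2": (200, 299)}
print(prune(partitions, ">=", 150))  # ['p1', 'p2']
print(prune(partitions, "<", 100))   # ['p0']
```

Partitions that cannot match are never scanned, which is where the query-time saving comes from.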

Client-Supplied Cache Keys: Support for a new Spice-Cache-Key header/metadata-key in the HTTP and Arrow Flight SQL query APIs for fine-grained client-side caching control.

Example HTTP API usage:

```bash
$ curl -vvS -XPOST http://localhost:8090/v1/sql \
  -H"spice-cache-key: 1851400_20170216_north_america" \
  -d "select * from scihub_journals_accessed where user_id = '1851400' and date_trunc('DAY', timestamp) = '2017-02-16' and city = 'New York';"
```

Example Response:

```bash
< HTTP/1.1 200 OK
< content-type: application/json
< x-cache: Hit from spiceai
< results-cache-status: HIT
< vary: Spice-Cache-Key
< vary: origin, access-control-request-method, access-control-request-headers
< content-length: 604
< date: Wed, 23 Jul 2025 20:26:12 GMT
<
[{
    "timestamp": "2017-02-16 09:55:06",
    "doi": "10.1155/2012/650929",
    "ip_identifier": 1000856,
    "user_id": 1851400,
    "country": "United States",
    "city": "New York",
    "longitude": 40.7830603,
    "latitude": -73.9712488
  },
  ...
]
```

For details, see the Cache Control documentation.

Contributors

New Contributors

Breaking Changes

  • N/A

Cookbook Updates

No new recipes added in this release.

The Spice Cookbook includes 74 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.1, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.5.1 image:

```console
docker pull spiceai/spiceai:1.5.1
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

No major dependency updates.

Changelog

- Rust
Published by Jeadie 7 months ago

https://github.com/spiceai/spiceai - v1.5.0

- Rust
Published by ewgenius 7 months ago

https://github.com/spiceai/spiceai - v1.5.0-rc.3

Spice v1.5.0-rc.3 (July 16, 2025)

This is the third release candidate for v1.5.0, building on the capabilities introduced in v1.5.0-rc.2. This release introduces native support for Amazon S3 Vectors, enabling petabyte-scale vector search directly from S3 vector buckets, alongside SQL-integrated vector and full-text search, partitioning for DuckDB acceleration, and automated refreshes for search indexes and views. It also includes the AWS Bedrock Embeddings Model Provider, the Oracle Database connector, the now-stable Spice.ai Cloud Data Connector, and an upgrade to DuckDB v1.3.2.

What's New in v1.5.0-rc.3

Amazon S3 Vectors Support: Spice.ai now integrates with Amazon S3 Vectors, launched in public preview on July 15, 2025, enabling vector-native object storage with built-in indexing and querying. This integration supports semantic search, recommendation systems, and retrieval-augmented generation (RAG) at petabyte scale with S3’s durability and elasticity. Spice.ai manages the vector lifecycle—ingesting data, embedding it with models like Amazon Titan or Cohere via AWS Bedrock, or MiniLM L6 available from HuggingFace, and storing it in S3 Vector buckets.

Example Spicepod.yml configuration for S3 Vectors:

```yaml
datasets:
  - from: s3://my_vector_bucket/data/
    name: my_vectors
    params:
      file_format: parquet
    acceleration:
      enabled: true
    vectors:
      engine: s3_vectors
      params:
        s3_vectors_aws_region: us-east-2
        s3_vectors_bucket: my-s3-vectors-bucket
    columns:
      - name: content
        embeddings:
          - from: bedrock_titan
            row_id:
              - id
```

Example SQL query using S3 Vectors:

```sql
SELECT *
FROM vector_search(my_vectors, 'Cricket bats', 10)
WHERE price < 100
ORDER BY score
```

For more details, refer to the S3 Vectors Documentation.

Highlights in v1.5.0-rc.3

SQL-integrated Search: Vector and full-text search capabilities are now natively available in SQL queries, extending the power of the POST v1/search endpoint to all SQL workflows.

Example Vector-Similarity-Search (VSS) using the vector_search UDTF on the table reviews for the search term "Cricket bats":

```sql
SELECT review_id, review_text, review_date, score
FROM vector_search(reviews, 'Cricket bats')
WHERE country_code = 'AUS'
LIMIT 3
```

Example Full-Text-Search (FTS) using the text_search UDTF on the table reviews for the search term "Cricket bats":

```sql
SELECT review_id, review_text, review_date, score
FROM text_search(reviews, 'Cricket bats')
LIMIT 3
```

DuckDB v1.3.2 Upgrade: Upgraded DuckDB engine from v1.1.3 to v1.3.2. Key improvements include support for adding primary keys to existing tables, resolution of over-eager unique constraint checking for smoother inserts, and 13% reduced runtime on TPC-H SF100 queries through extensive optimizer refinements. The v1.2.x release of DuckDB was skipped due to a regression in indexes.

Partitioned Acceleration: DuckDB file-based accelerations now support partition_by expressions, enabling queries to scale to large datasets through automatic data partitioning and query predicate pruning. New UDFs, bucket and truncate, simplify partition logic.

New UDFs useful for partition_by expressions:

  • bucket(num_buckets, col): Partitions a column into a specified number of buckets based on a hash of the column value.
  • truncate(width, col): Truncates a column to a specified width, aligning values to the nearest lower multiple (e.g., truncate(10, 101) = 100).
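The behaviour of the two UDFs can be approximated in a few lines. This is a sketch of the semantics described above, not the runtime's code: `truncate` here handles integers only, and CRC32 is an assumed stand-in for whatever hash function the runtime actually uses in `bucket`.

```python
import zlib


def truncate(width: int, value: int) -> int:
    """Align an integer down to the nearest lower multiple of `width`."""
    return (value // width) * width


def bucket(num_buckets: int, value) -> int:
    """Map a value to one of `num_buckets` buckets via a stable hash."""
    # Assumption: CRC32 stands in for the runtime's real hash function.
    digest = zlib.crc32(str(value).encode("utf-8"))
    return digest % num_buckets


print(truncate(10, 101))  # 100
print(0 <= bucket(100, "account-123") < 100)  # True
```

Both functions are deterministic, so rows with the same key always land in the same partition, which is what allows a query predicate on the key to skip every other partition.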

Example Spicepod.yml configuration:

```yaml
datasets:
  - from: s3://my_bucket/some_large_table/
    name: my_table
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      partition_by: bucket(100, account_id) # Partition account_id into 100 buckets
```

Full-Text-Search (FTS) Index Refresh: Accelerated datasets with search indexes maintain up-to-date results with configurable refresh intervals.

Example refreshing search indexes on body every 10 seconds:

```yaml
datasets:
  - from: github:github.com/spiceai/docs/pulls
    name: spiceai.doc.pulls
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    acceleration:
      enabled: true
      refresh_mode: full
      refresh_check_interval: 10s
    columns:
      - name: body
        full_text_search:
          enabled: true
          row_id:
            - id
```

Scheduled View Refresh: Accelerated Views now support cron-based refresh schedules using refresh_cron, automating updates for accelerated data.

Example Spicepod.yml configuration:

```yaml
views:
  - name: my_view
    sql: SELECT 1
    acceleration:
      enabled: true
      refresh_cron: '0 * * * *' # Every hour
```

For more details, refer to Scheduled Refreshes.

Multi-column Vector Search: For datasets configured with embeddings on more than one column, POST v1/search and similarity_search perform parallel vector search on each column, aggregating results using reciprocal rank fusion.

Example Spicepod.yml for multi-column search:

```yaml
datasets:
  - from: github:github.com/apache/datafusion/issues
    name: datafusion.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    columns:
      - name: title
        embeddings:
          - from: hf_minilm
      - name: body
        embeddings:
          - from: openai_embeddings
```
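Reciprocal rank fusion, the aggregation step named above, can be sketched as follows. The constant `k = 60` is the value commonly used in the literature and is an assumption here; the constant and tie-breaking Spice uses are not stated in these notes, and the issue IDs are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists; each hit contributes 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical per-column results for one query.
title_hits = ["issue-7", "issue-2", "issue-9"]
body_hits = ["issue-2", "issue-4", "issue-7"]
fused = reciprocal_rank_fusion([title_hits, body_hits])
print(fused)  # ['issue-2', 'issue-7', 'issue-4', 'issue-9']
```

Because only ranks are used, columns embedded with different models (and therefore incomparable raw scores) can still be combined fairly.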

AWS Bedrock Embeddings Model Provider: Added support for AWS Bedrock embedding models, including Amazon Titan Text Embeddings and Cohere Text Embeddings.

Example Spicepod.yml:

```yaml
embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      aws_region: us-east-1
      input_type: search_document
      truncate: END
  - from: bedrock:amazon.titan-embed-text-v2:0
    name: titan-embeddings
    params:
      aws_region: us-east-1
      dimensions: '256'
```

For more details, refer to the AWS Bedrock Embedding Models Documentation.

Oracle Data Connector: Use from: oracle: to access and accelerate data stored in Oracle databases, deployed on-premises or in the cloud.

Example Spicepod.yml:

```yaml
datasets:
  - from: oracle:"SH"."PRODUCTS"
    name: products
    params:
      oracle_host: 127.0.0.1
      oracle_username: scott
      oracle_password: tiger
```

See the Oracle Data Connector documentation.

Spice.ai Cloud Data Connector: Graduated to Stable.

Contributors

Breaking Changes

  • Search HTTP API Response: POST v1/search response payload has changed. See the new API documentation for details.
  • Model Provider Parameter Prefixes: Model Provider parameters use provider-specific prefixes instead of openai_ prefixes (e.g., hf_temperature for HuggingFace, anthropic_max_completion_tokens for Anthropic, perplexity_tool_choice for Perplexity). The openai_ prefix remains supported for backward compatibility but is deprecated and will be removed in a future release.
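The fallback behaviour described above can be pictured with a small sketch. The function name `resolve_param` and the exact precedence (provider-specific key first, then the deprecated openai_ key) are assumptions made for illustration, not the runtime's actual code.

```python
def resolve_param(params: dict, provider_prefix: str, name: str):
    """Look up a model parameter, preferring the provider-specific key.

    Falls back to the deprecated `openai_` key when the provider key
    is absent. Hypothetical helper, not part of the Spice runtime.
    """
    provider_key = f"{provider_prefix}_{name}"
    if provider_key in params:
        return params[provider_key]
    return params.get(f"openai_{name}")


# Both keys set: the provider-specific value wins.
print(resolve_param({"hf_temperature": 0.2, "openai_temperature": 0.9}, "hf", "temperature"))
# Only the deprecated key set: it is still honored.
print(resolve_param({"openai_temperature": 0.9}, "hf", "temperature"))
```

When migrating a Spicepod, renaming openai_ keys to the provider prefix ahead of the removal avoids relying on this fallback.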

Cookbook Updates

The Spice Cookbook now includes 72 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.0-rc.3, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.5.0-rc.3 or pull the v1.5.0-rc.3 Docker image (spiceai/spiceai:1.5.0-rc.3).

What's Changed

Dependencies

Changelog

- Rust
Published by phillipleblanc 8 months ago

https://github.com/spiceai/spiceai - v1.5.0-rc.2

Spice v1.5.0-rc.2 (July 14, 2025)

This is the second release candidate for v1.5.0, which introduces SQL-integrated vector and full-text search, partitioning for DuckDB acceleration, and automated refreshes for search indexes and views. It adds a new AWS Bedrock Embeddings Model Provider, a new Oracle Database connector, and promotes the Spice.ai Cloud Data Connector to stable, alongside multi-column vector search for expanded search. This release also upgrades DuckDB from v1.1.3 to v1.3.2, accelerating Spice.ai datasets with improved indexes, query performance, and internal storage optimizations.

What's New in v1.5.0-rc.2

SQL-integrated Search: Vector and full-text search capabilities are now natively available in SQL queries, extending the power of the POST v1/search endpoint to all SQL workflows.

Example Vector-Similarity-Search (VSS) using the new vector_search UDTF on the table reviews for the search term "Cricket bats".

```sql
SELECT review_id, review_text, review_date, score
FROM vector_search(reviews, 'Cricket bats')
WHERE country_code = 'AUS'
LIMIT 3
```

Example Full-Text-Search (FTS) using the new text_search UDTF on the table reviews for the search term "Cricket bats".

```sql
SELECT review_id, review_text, review_date, score
FROM text_search(reviews, 'Cricket bats')
LIMIT 3
```

DuckDB v1.3.2 Upgrade: Upgraded DuckDB engine from v1.1.3 to v1.3.2. Key improvements include support for adding primary keys to existing tables, resolution of over-eager unique constraint checking for smoother inserts, and 13% reduced runtime on TPC-H SF100 queries through extensive optimizer refinements. The v1.2.x release of DuckDB was skipped due to a regression in indexes.

Partitioned Acceleration: DuckDB file-based accelerations now support partition_by expressions, enabling queries to scale to large datasets through automatic data partitioning and query predicate pruning. New UDFs, bucket and truncate, simplify partition logic.

New UDFs useful for partition_by expressions:

  • bucket(num_buckets, col): Partitions a column into a specified number of buckets based on a hash of the column value.
  • truncate(width, col): Truncates a column to a specified width, aligning values to the nearest lower multiple (e.g., truncate(10, 101) = 100).

Example Spicepod.yml configuration:

```yaml
datasets:
  - from: s3://my_bucket/some_large_table/
    name: my_table
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      partition_by: bucket(100, account_id) # Partition account_id into 100 buckets
```
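To see why partitioning enables predicate pruning, consider how an equality or IN-list filter on account_id maps to partitions. The sketch below is a rough model, assuming a stand-in CRC32 hash (the hash Spice uses is not described here) and a hypothetical helper `partitions_to_scan`.

```python
import zlib

NUM_BUCKETS = 100


def bucket(num_buckets: int, value) -> int:
    # Stand-in hash for illustration; not Spice's actual hash function.
    return zlib.crc32(str(value).encode("utf-8")) % num_buckets


def partitions_to_scan(account_ids):
    """Partitions that can hold rows matching `account_id = x`
    or `account_id IN (...)`; all other partitions can be skipped."""
    return sorted({bucket(NUM_BUCKETS, a) for a in account_ids})


# An equality predicate touches a single partition out of 100.
print(partitions_to_scan([42]))
# An IN list touches at most one partition per listed value.
print(partitions_to_scan([42, 43, 44]))
```

A filter on a column that is not part of the partition expression gives no such shortcut, so every partition would still be read.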

Full-Text-Search (FTS) Index Refresh: Accelerated datasets with search indexes maintain up-to-date results with configurable refresh intervals.

Example refreshing search indexes on body every 10 seconds (based on acceleration.refresh_check_interval).

```yaml
datasets:
  - from: github:github.com/spiceai/docs/pulls
    name: spiceai.doc.pulls
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    acceleration:
      enabled: true
      refresh_mode: full
      refresh_check_interval: 10s
    columns:
      - name: body
        full_text_search:
          enabled: true
          row_id:
            - id
```

Scheduled View Refresh: Accelerated Views now support cron-based refresh schedules using refresh_cron, automating updates for accelerated data.

Example Spicepod.yml configuration:

```yaml
views:
  - name: my_view
    sql: SELECT 1
    acceleration:
      enabled: true
      refresh_cron: '0 * * * *' # Every hour
```

For more details, refer to Scheduled Refreshes.

Multi-column Vector Search: For datasets configured with embeddings on more than one column, POST v1/search and similarity_search perform parallel vector search on each column and aggregate results using reciprocal rank fusion.

Example Spicepod.yml where search results consider both the GitHub issue's title and the content of its body:

```yaml
datasets:
  - from: github:github.com/apache/datafusion/issues
    name: datafusion.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    columns:
      - name: title
        embeddings:
          - from: hf_minilm
      - name: body
        embeddings:
          - from: openai_embeddings
```
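Reciprocal rank fusion combines the per-column rankings by rank position rather than by raw similarity score. A minimal sketch of the general technique follows; the constant k=60 is a commonly used default and an assumption here, as are the sample issue ids, and this is not taken from the Spice implementation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one ranking.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical per-column results for one query.
title_hits = ["issue-7", "issue-2", "issue-9"]
body_hits = ["issue-2", "issue-5", "issue-7"]
print(reciprocal_rank_fusion([title_hits, body_hits]))
```

Items that rank well in both columns rise to the top, while items found by only one column still appear lower in the fused list.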

AWS Bedrock Embeddings Model Provider: Added support for AWS Bedrock embedding models, including Amazon Titan Text Embeddings and Cohere Text Embeddings.

Example Spicepod.yaml:

```yaml
embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      aws_region: us-east-1
      input_type: search_document
      truncate: END
  - from: bedrock:amazon.titan-embed-text-v2:0
    name: titan-embeddings
    params:
      aws_region: us-east-1
      dimensions: '256'
```

For more details, refer to the AWS Bedrock Embedding Models Documentation.

Oracle Data Connector: Use from: oracle: to access and accelerate data stored in Oracle databases, deployed on-premises or in the cloud.

Example Spicepod.yml:

```yaml
datasets:
  - from: oracle:"SH"."PRODUCTS"
    name: products
    params:
      oracle_host: 127.0.0.1
      oracle_username: scott
      oracle_password: tiger
```

See the Oracle Data Connector documentation for details.

Spice.ai Cloud Data Connector: Graduated to Stable.

Contributors

Breaking Changes

  • Search HTTP API Response: POST v1/search response payload has changed. See the new API documentation for details.

  • Model Provider Parameter Prefixes: Model Provider parameters use provider-specific prefixes instead of openai_ prefixes (e.g., hf_temperature instead of openai_temperature for HuggingFace, anthropic_max_completion_tokens for Anthropic, perplexity_tool_choice for Perplexity). The openai_ prefix remains supported for backward compatibility but is now deprecated and will be removed in a future release.

Cookbook Updates

The Spice Cookbook now includes 72 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.0-rc.2, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.5.0-rc.2 or pull the v1.5.0-rc.2 Docker image (spiceai/spiceai:1.5.0-rc.2).

What's Changed

Dependencies

Changelog

  • fix llm integraion test (#6398) by @Sevenannn in #6398
  • Promote spice cloud connector to stable quality (#6221) by @Sevenannn in #6221
  • v1.5.0-rc.1 release notes (#6397) by @lukekim in #6397
  • Fix model nsql integration tests (#6365) by @Sevenannn in #6365
  • Fix incorrect UDTF name and SQL query (#6404) by @lukekim in #6404
  • Update v1.5.0-rc.1.md (#6407) by @sgrebnov in #6407
  • Improve error messages (#6405) by @lukekim in #6405
  • build(deps): bump Jimver/cuda-toolkit from 0.2.25 to 0.2.26 (#6388) by @app/dependabot in #6388
  • Upgrade dependabot dependencies (#6411) by @phillipleblanc in #6411
  • Fix projection pushdown issues for document based file connector (#6362) by @Advayp in #6362
  • Create a new crate for UDFs (#6416) by @kczimm in #6416
  • Add a PartitionedDuckDB Accelerator (#6338) by @kczimm in #6338
  • Use vector_search() UDTF in HTTP APIs (#6417) by @Jeadie in #6417
  • add supported types (#6409) by @kczimm in #6409
  • Enable session time zone override for MySQL (#6426) by @sgrebnov in #6426
  • Acceleration-like indexing for full text search indexes. (#6382) by @Jeadie in #6382
  • Provide error message when partition by expression changes (#6415) by @kczimm in #6415
  • Add support for Oracle Autonomous Database connections (Oracle Cloud) (#6421) by @sgrebnov in #6421
  • prune partitions for exact and in list with and without UDFs (#6423) by @kczimm in #6423
  • Fixes and reenable FTS tests (#6431) by @Jeadie in #6431
  • Updating text-embedding-inference & mistralrs dependency (#6366) by @Jeadie in #6366
  • Upgrade DuckDB to 1.3.2 (#6434) by @phillipleblanc in #6434
  • Fix issue in limit clause for the Github Data connector (#6443) by @Advayp in #6443
  • Upgrade iceberg-rust to 0.5.1 (#6446) by @phillipleblanc in #6446

- Rust
Published by peasee 8 months ago

https://github.com/spiceai/spiceai - v1.5.0-rc.1

Spice v1.5.0-rc.1 (July 7, 2025)

This is the first release candidate for v1.5.0, which introduces partitioning for DuckDB acceleration, SQL-integrated vector and full-text search, and automated refreshes for search indexes and views. It adds a new AWS Bedrock Embeddings Model Provider, a new Oracle Database connector, and promotes the Spice.ai Cloud Data Connector to stable, alongside multi-column vector search for expanded search.

What's New in v1.5.0-rc.1

Partitioned Acceleration: DuckDB file-based accelerations now support partition_by expressions, enabling queries to scale to large datasets through automatic data partitioning and query predicate pruning. New UDFs, bucket and truncate, simplify partition logic.

New UDFs useful for partition_by expressions:

  • bucket(num_buckets, col): Partitions a column into a specified number of buckets based on a hash of the column value.
  • truncate(width, col): Truncates a column to a specified width, aligning values to the nearest lower multiple (e.g., truncate(10, 101) = 100).

Example Spicepod.yml configuration:

```yaml
datasets:
  - from: s3://my_bucket/some_large_table/
    name: my_table
    params:
      file_format: parquet
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      partition_by: bucket(100, account_id) # Partition account_id into 100 buckets
```

SQL-integrated Search: Vector and full-text search capabilities are now natively available in SQL queries, extending the power of the POST v1/search endpoint to all SQL workflows.

Example Vector-Similarity-Search (VSS) using the new vector_search UDTF on the table reviews for the search term "Cricket bats".

```sql
SELECT review_id, review_text, review_date, score
FROM vector_search(reviews, 'Cricket bats')
WHERE country_code = 'AUS'
LIMIT 3
```

Example Full-Text-Search (FTS) using the new text_search UDTF on the table reviews for the search term "Cricket bats".

```sql
SELECT review_id, review_text, review_date, score
FROM text_search(reviews, 'Cricket bats')
LIMIT 3
```

Full-Text-Search (FTS) Index Refresh: Accelerated datasets with search indexes maintain up-to-date results with configurable refresh intervals.

Example refreshing search indexes on body every 10 seconds (based on acceleration.refresh_check_interval).

```yaml
datasets:
  - from: github:github.com/spiceai/docs/pulls
    name: spiceai.doc.pulls
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    acceleration:
      enabled: true
      refresh_mode: full
      refresh_check_interval: 10s
    columns:
      - name: body
        full_text_search:
          enabled: true
          row_id:
            - id
```

Scheduled View Refresh: Accelerated Views now support cron-based refresh schedules using refresh_cron, automating updates for accelerated data.

Example Spicepod.yml configuration:

```yaml
views:
  - name: my_view
    sql: SELECT 1
    acceleration:
      enabled: true
      refresh_cron: '0 * * * *' # Every hour
```

For more details, refer to Scheduled Refreshes.
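For readers less familiar with cron syntax, the five fields are minute, hour, day of month, month, and day of week. The sketch below checks whether a timestamp matches an expression; it is a simplified illustration that handles only `*`, single numbers, and comma lists, and it omits ranges, steps, and cron's special handling of combined day-of-month and day-of-week fields. It is not the scheduler Spice uses.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Match one cron field: '*', a single number, or a comma list."""
    if field == "*":
        return True
    return value in {int(part) for part in field.split(",")}


def cron_matches(expr: str, when: datetime) -> bool:
    """Check a five-field expression: minute hour day-of-month month day-of-week."""
    minute, hour, dom, month, dow = expr.split()
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(dom, when.day)
        and field_matches(month, when.month)
        # Cron counts Sunday as 0; isoweekday() counts it as 7.
        and field_matches(dow, when.isoweekday() % 7)
    )


print(cron_matches("0 * * * *", datetime(2025, 7, 7, 13, 0)))   # True
print(cron_matches("0 * * * *", datetime(2025, 7, 7, 13, 30)))  # False
```

So `'0 * * * *'` fires at minute 0 of every hour, which is the "every hour" schedule shown in the configuration above.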

Multi-column Vector Search: For datasets configured with embeddings on more than one column, POST v1/search and similarity_search perform parallel vector search on each column and aggregate results using reciprocal rank fusion.

Example Spicepod.yml where search results consider both the GitHub issue's title and the content of its body:

```yaml
datasets:
  - from: github:github.com/apache/datafusion/issues
    name: datafusion.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
    columns:
      - name: title
        embeddings:
          - from: hf_minilm
      - name: body
        embeddings:
          - from: openai_embeddings
```

AWS Bedrock Embeddings Model Provider: Added support for AWS Bedrock embedding models, including Amazon Titan Text Embeddings and Cohere Text Embeddings.

Example Spicepod.yaml:

```yaml
embeddings:
  - from: bedrock:cohere.embed-english-v3
    name: cohere-embeddings
    params:
      aws_region: us-east-1
      input_type: search_document
      truncate: END
  - from: bedrock:amazon.titan-embed-text-v2:0
    name: titan-embeddings
    params:
      aws_region: us-east-1
      dimensions: '256'
```

For more details, refer to the AWS Bedrock Embedding Models Documentation.

Oracle Data Connector: Use from: oracle: to access and accelerate data stored in Oracle databases, deployed on-premises or in the cloud.

Example Spicepod.yml:

```yaml
datasets:
  - from: oracle:"SH"."PRODUCTS"
    name: products
    params:
      oracle_host: 127.0.0.1
      oracle_username: scott
      oracle_password: tiger
```

See the Oracle Data Connector documentation for details.

Spice.ai Cloud Data Connector: Graduated to Stable.

Contributors

Breaking Changes

  • Search HTTP API Response: POST v1/search response payload has changed. See the new API documentation for details.

  • Model Provider Parameter Prefixes: Model Provider parameters use provider-specific prefixes instead of openai_ prefixes (e.g., hf_temperature instead of openai_temperature for HuggingFace, anthropic_max_completion_tokens for Anthropic, perplexity_tool_choice for Perplexity). The openai_ prefix remains supported for backward compatibility but is now deprecated and will be removed in a future release.

Cookbook Updates

  • Added Oracle Data Connector cookbook: Connect to tables in Oracle databases.

The Spice Cookbook now includes 71 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.5.0-rc.1, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.5.0-rc.1 or pull the v1.5.0-rc.1 Docker image (spiceai/spiceai:1.5.0-rc.1).

What's Changed

Dependencies

Changelog

  • Jeadie/25 06 10/finance (#6182) by @Jeadie in #6182
  • chore: Update dependencies (#6196) by @peasee in #6196
  • Fix FlightSQL GetDbSchemas and GetTables schemas to fully match the protocol (#6197) by @sgrebnov in #6197
  • Use spice-rs in test operator and retry on connection reset error (#6136) by @Sevenannn in #6136
  • Move model-grading evals to testoperator (#6195) by @Jeadie in #6195
  • Don't use base table for full text search post apply vector search (#6215) by @Jeadie in #6215
  • Fix content-type header in v1/sql response (#6217) by @Jeadie in #6217
  • Add v1.4.0-rc.1 release into qa_analytics.csv (#6209) by @sgrebnov in #6209
  • fix: Reschedule AI benchmarks, set max parallel to 1 (#6224) by @peasee in #6224
  • task: Add MySQL indexes (#6227) by @peasee in #6227
  • fix pagination (#6222) by @Jeadie in #6222
  • Add build links to release notes (#6220) by @kczimm in #6220
  • feat: Enable additional testoperator tests (#6218) by @peasee in #6218
  • chore: Update testoperator release target to 1.4 (#6235) by @peasee in #6235
  • fix: Update benchmark snapshots (#6234) by @app/github-actions in #6234
  • fix: Lower SF100 memory limit (#6236) by @peasee in #6236
  • Add glue integration test using hive and iceberg tables (#6248) by @kczimm in #6248
  • allow database for empty patterns (#6258) by @kczimm in #6258
  • add Glue catalog to README.md (#6179) by @kczimm in #6179
  • Add bucket UDF for partitioning (#6200) by @kczimm in #6200
  • New tool parsley (#6232) by @Jeadie in #6232
  • Upgrade dependabot dependencies (#6261) by @phillipleblanc in #6261
  • Upgrade delta_kernel to 0.12.1 (#6263) by @phillipleblanc in #6263
  • fix: Throughput test dispatching (#6265) by @peasee in #6265
  • fix: badges on README.md show correct status (#6268) by @phillipleblanc in #6268
  • Extend Flight CommandGetTables with source native data type info (#6259) by @sgrebnov in #6259
  • fix: Docker image build with profile (#6270) by @peasee in #6270
  • docs: Post-release update (#6275) by @peasee in #6275
  • Improve error message for incorrect/missing Glue table or database (#6257) by @kczimm in #6257
  • Update spicepod.schema.json (#6274) by @app/github-actions in #6274
  • Update openapi.json (#6279) by @app/github-actions in #6279
  • Add Remote Spicepod support (#6233) by @phillipleblanc in #6233
  • Update QA analytics for v1.4.0 (#6277) by @ewgenius in #6277
  • Add truncate UDF (#6278) by @kczimm in #6278
  • Update qa_analytics.csv for 1.4.0 (#6284) by @sgrebnov in #6284
  • Default grok to 'grok-3' (#6285) by @Jeadie in #6285
  • For Spice.ai connectors, do not default to dev SCP for dev builds (#6254) by @Jeadie in #6254
  • fix: Deny extra caching parameters (#6288) by @peasee in #6288
  • Make DynamoDB connectivity errors more specific and actionable (#6294) by @sgrebnov in #6294
  • Create a table provider from full text search index + query (#6286) by @Jeadie in #6286
  • Update Flight CommandGetTables to Return Native DataFusion SQL Data Types (#6297) by @sgrebnov in #6297
  • Adds a synchronous get_table function on the DataFusion context (#6300) by @phillipleblanc in #6300
  • Better Glue connector error messages (#6289) by @kczimm in #6289
  • fix: consume response stream before reading authorization metadata (#6292) by @Sevenannn in #6292
  • feat: Use retryable stream in test operator (#6231) by @Sevenannn in #6231
  • Support reserved word column names in DynamoDB (#6308) by @sgrebnov in #6308
  • fix: Implement Default manually for SQLResultsCacheConfig (#6310) by @peasee in #6310
  • Add integration test for DynamoDB Data Connector (#6311) by @sgrebnov in #6311
  • fix: Warn about no configured datasets if no datasets and catalogs are present (#6296) by @Advayp in #6296
  • Add better error messages for cases when a port is already in use (#6313) by @Advayp in #6313
  • Disallow datasets with protected names (#6309) by @Advayp in #6309
  • Roadmap updates June 2025 (#6319) by @lukekim in #6319
  • Add partitioning models (#6298) by @kczimm in #6298
  • Encode ScalarValues for use in filenames (#6318) by @kczimm in #6318
  • Standardize model parameter handling & prioritize <model-prefix>_<param> for model default overrides (#6199) by @Sevenannn in #6199
  • Add initial support for Oracle Data Connector (#6321) by @sgrebnov in #6321
  • Oracle connector: Support all major Oracle data types (#6323) by @sgrebnov in #6323
  • Oracle connector: support filter predicate pushdown (#6326) by @sgrebnov in #6326
  • text_search UDTF and required AnalyzerRule. (#6280) by @Jeadie in #6280
  • Build indexes as part of accelerations (#6324) by @phillipleblanc in #6324
  • feat: Add support for cron-based view refresh (#6341) by @peasee in #6341
  • Surface table not found errors immediately (#6317) by @Advayp in #6317
  • runtime-datafusion-index: Stop infinite recursion for IndexTableScanOptimizerRule (#6353) by @phillipleblanc in #6353
  • Add optional behaviors to DataAccelerator tables + add WantsUnderlyingTableBehavior to VoidTable (#6354) by @phillipleblanc in #6354
  • AWS Bedrock models. (#6358) by @Jeadie in #6358
  • Ensure views load even if they're the only components defined (#6359) by @Advayp in #6359
  • Improve type conversion and add integration tests for the Oracle connector (#6327) by @sgrebnov in #6327
  • Upgrade dependabot dependencies (#6375) by @phillipleblanc in #6375
  • Don't run tests that require a Databricks cluster on every PR (#6379) by @phillipleblanc in #6379
  • Properly handle duplicate flags to spice run (#6364) by @Advayp in #6364
  • Fix the case sensitivity of the key in env secrets store (#6371) by @ewgenius in #6371
  • vector_search UDTF and related changes (#6381) by @Jeadie in #6381
  • Update end_game.md (#6380) by @sgrebnov in #6380
  • fix: openai model endpoint (#6394) by @Sevenannn in #6394
  • Enable Oracle connector in default build configuration by @sgrebnov in #6395
  • Enable configuring otel endpoint from spice run by @Advayp in #6360

- Rust
Published by phillipleblanc 8 months ago

https://github.com/spiceai/spiceai - v1.4.0

- Rust
Published by peasee 8 months ago

https://github.com/spiceai/spiceai - v1.4.0-rc.1

Spice v1.4.0-rc.1 (June 11, 2025)

This release candidate for v1.4.0 upgrades DataFusion to v47 and Arrow to v55 for faster queries, more efficient Parquet/CSV handling, and improved reliability. It introduces the AWS Glue Catalog and Data Connectors for native access to Glue-managed data on S3 and supports Databricks U2M OAuth for secure Databricks user authentication. New cron-based dataset refreshes and worker schedules enable automated task management, while dataset and search results caching improvements further optimize query, search, and RAG performance.

What's New in v1.4.0-rc.1

DataFusion v47 Highlights

Spice.ai is built on the DataFusion query engine. The v47 release brings:

Performance Improvements 🚀: This release delivers major query speedups through specialized GroupsAccumulator implementations for first_value, last_value, and min/max on Duration types, eliminating unnecessary sorting and computation. TopK operations are now up to 10x faster thanks to early exit optimizations, while sort performance is further enhanced by reusing row converters, removing redundant clones, and optimizing sort-preserving merge streams. Logical operations benefit from short-circuit evaluation for AND/OR, reducing overhead, and additional enhancements address high latency from sequential metadata fetching, improve int/string comparison efficiency, and simplify logical expressions for better execution.

Bug Fixes & Compatibility Improvements 🛠️: The release addresses issues with external sort, aggregation, and window functions, improves handling of NULL values and type casting in arrays and binary operations, and corrects problems with complex joins and nested window expressions. It also addresses SQL unparsing for subqueries, aliases, and UNION BY NAME.

See the Apache DataFusion 47.0.0 Changelog for details.

Arrow v55 Highlights

Arrow v55 delivers faster Parquet gzip compression, improved array concatenation, and better support for large files (4GB+) and modular encryption. Parquet metadata reads are now more efficient, with support for range requests and enhanced compatibility for INT96 timestamps and timezones. CSV parsing is more robust, with clearer error messages. These updates boost performance, compatibility, and reliability.

See the Arrow 55.0.0 Changelog and Arrow 55.1.0 Changelog for details.

Search Result Caching: Spice now supports runtime caching for search results, improving performance for subsequent searches and chat completion requests that use the document_similarity LLM tool. Caching is configurable with options like maximum size, item TTL, eviction policy, and hashing algorithm.

Example spicepod.yml configuration:

```yaml
runtime:
  caching:
    search_results:
      enabled: true
      max_size: 128mb
      item_ttl: 5s
      eviction_policy: lru
      hashing_algorithm: siphash
```

For more information, refer to the Caching documentation.
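The interaction between item_ttl and the lru eviction policy can be illustrated with a small model. This is a sketch under stated assumptions, not the runtime's cache: the class `TtlLruCache` is hypothetical, and it bounds the cache by entry count rather than by byte size (as max_size does) to keep the example short.

```python
import time
from collections import OrderedDict


class TtlLruCache:
    """Tiny LRU cache with a per-item TTL, bounded by entry count."""

    def __init__(self, max_items, ttl_seconds, clock=time.monotonic):
        self.max_items = max_items
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._items[key]  # expired: treat as a miss
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (self.clock() + self.ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)  # evict least recently used


cache = TtlLruCache(max_items=2, ttl_seconds=5)
cache.put("cricket bats", ["review-1", "review-8"])
print(cache.get("cricket bats"))
```

A short TTL such as 5s keeps repeated searches within one chat turn fast while limiting how stale a cached result can be.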

AWS Glue Catalog Connector: Connect to AWS Glue Data Catalogs to query Iceberg, Parquet, or CSV tables in S3.

Example spicepod.yml configuration:

```yaml
catalogs:
  - from: glue
    name: my_glue_catalog
    params:
      glue_key: <your-access-key-id>
      glue_secret: <your-secret-access-key>
      glue_region: <your-region>
    include:
      - 'testdb.hive_*'
      - 'testdb.iceberg_*'
```

```sql
sql> show tables;
+-----------------+--------------+-------------------+------------+
| table_catalog   | table_schema | table_name        | table_type |
+-----------------+--------------+-------------------+------------+
| my_glue_catalog | testdb       | hive_table_001    | BASE TABLE |
| my_glue_catalog | testdb       | iceberg_table_001 | BASE TABLE |
| spice           | runtime      | task_history      | BASE TABLE |
+-----------------+--------------+-------------------+------------+
```

For more information, refer to the Glue Catalog Connector documentation.
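The include patterns in the catalog configuration above select which `database.table` names are exposed. As a rough illustration, assuming shell-style glob semantics (the connector's exact matching rules are not described in these notes), the filtering behaves like this; the extra table names are hypothetical.

```python
from fnmatch import fnmatchcase

include = ["testdb.hive_*", "testdb.iceberg_*"]


def is_included(qualified_name: str, patterns) -> bool:
    """True if `database.table` matches at least one include pattern."""
    return any(fnmatchcase(qualified_name, p) for p in patterns)


# Hypothetical tables present in the Glue catalog.
tables = [
    "testdb.hive_table_001",
    "testdb.iceberg_table_001",
    "testdb.csv_table_001",
    "otherdb.hive_table_002",
]
print([t for t in tables if is_included(t, include)])
```

Only the two tables matching the patterns would appear under my_glue_catalog, consistent with the show tables output above.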

AWS Glue Data Connector: Connect to specific tables in AWS Glue Data Catalogs to query Iceberg, Parquet, or CSV in S3.

Example spicepod.yml configuration:

```yaml
datasets:
  - from: glue:my_database.my_table
    name: my_table
    params:
      glue_auth: key
      glue_region: us-east-1
      glue_key: ${secrets:AWS_ACCESS_KEY_ID}
      glue_secret: ${secrets:AWS_SECRET_ACCESS_KEY}
```

For more information, refer to the Glue Data Connector documentation.

Databricks U2M OAuth: Spice now supports User-to-Machine (U2M) authentication for Databricks when called with a compatible client, such as the Spice Cloud Platform.

```yaml
datasets:
  - from: databricks:spiceai_sandbox.default.messages
    name: messages
    params:
      databricks_endpoint: ${secrets:DATABRICKS_ENDPOINT}
      databricks_cluster_id: ${secrets:DATABRICKS_CLUSTER_ID}
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
```

Dataset Refresh Schedules: Accelerated datasets now support a refresh_cron parameter, automatically refreshing the dataset on a defined cron schedule. Cron scheduled refreshes respect the global dataset_refresh_parallelism parameter.

Example spicepod.yml configuration:

```yaml
datasets:
  - name: my_dataset
    from: s3://my-bucket/my_file.parquet
    acceleration:
      refresh_cron: 0 0 * * * # Daily refresh at midnight
```

For more information, refer to the Dataset Refresh Schedules documentation.

Worker Execution Schedules: Workers now support a cron parameter and will execute an LLM-prompt or SQL query automatically on the defined cron schedule, in conjunction with a provided params.prompt.

Example spicepod.yml configuration:

```yaml
workers:
  - name: email_reporter
    models:
      - from: gpt-4o
    params:
      prompt: 'Inspect the latest emails, and generate a summary report for them. Post the summary report to the connected Teams channel'
    cron: 0 2 * * * # Daily at 2am
```

For more information, refer to the Worker Execution Schedules documentation.

SQL Worker Actions: Spice now supports workers with sql actions, to execute automated SQL queries on a cron schedule:

```yaml
workers:
  - name: my_worker
    cron: 0 * * * *
    sql: 'SELECT * FROM lineitem'
```

For more information, refer to the Workers with a SQL action documentation.

Contributors

Breaking Changes

  • No breaking changes.

Cookbook Updates

The Spice Cookbook now includes 69 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.4.0-rc.1, download and install the specific binary from github.com/spiceai/spiceai/releases/tag/v1.4.0-rc.1 or pull one of the nightly Docker images:

What's Changed

Dependencies

Changelog

  • Update trunk to 1.4.0-unstable (#5878) by @phillipleblanc in #5878
  • Update openapi.json (#5885) by @app/github-actions in #5885
  • feat: Testoperator reports benchmark failure summary (#5889) by @peasee in #5889
  • fix: Publish binaries to dev when platform option is all (#5905) by @peasee in #5905
  • feat: Print dispatch current test count of total (#5906) by @peasee in #5906
  • Include multiple duckdb files acceleration scenarios into testoperator dispatch (#5913) by @sgrebnov in #5913
  • feat: Support building testoperator on dev (#5915) by @peasee in #5915
  • Update spicepod.schema.json (#5927) by @app/github-actions in #5927
  • Update ROADMAP & SECURITY for 1.3.0 (#5926) by @phillipleblanc in #5926
  • Define SearchGeneration paradigm & use in Vector Search (#5876) by @Jeadie in #5876
  • docs: Update qa_analytics.csv (#5928) by @peasee in #5928
  • fix: Properly publish binaries to dev on push (#5931) by @peasee in #5931
  • Load request context extensions on every flight incoming call (#5916) by @ewgenius in #5916
  • Fix deferred loading for datasets with embeddings (#5932) by @ewgenius in #5932
  • Schedule AI benchmarks to run every Mon and Thu evening PST (#5940) by @sgrebnov in #5940
  • Fix explain plan snapshots for TPCDS queries Q36, Q70 & Q86 not being deterministic after DF 46 upgrade (#5942) by @phillipleblanc in #5942
  • chore: Upgrade to Rust 1.86 (#5945) by @peasee in #5945
  • Standardise HTTP settings across CLI (#5769) by @Jeadie in #5769
  • Fix deferred flag for Databricks SQL warehouse mode (#5958) by @ewgenius in #5958
  • Add deferred catalog loading (#5950) by @ewgenius in #5950
  • Refactor deferred_load using ComponentInitialization enum for better clarity (#5961) by @ewgenius in #5961
  • Post-release housekeeping (#5964) by @phillipleblanc in #5964
  • add LTO for release builds (#5709) by @kczimm in #5709
  • Fix dependabot/192 (#5976) by @Jeadie in #5976
  • Fix Test-to-SQL benchmark scheduled run (#5977) by @sgrebnov in #5977
  • Fix JSON to ScalarValue type conversion to match DataFusion behavior (#5979) by @sgrebnov in #5979
  • Add v1.3.1 release notes (#5978) by @lukekim in #5978
  • Define CandidateAggregation trait and implement RRF for multi column vector search. (#5943) by @Jeadie in #5943
  • Regenerate nightly build workflow (#5995) by @ewgenius in #5995
  • Fix DataFusion dependency loading in Databricks request context extension (#5987) by @ewgenius in #5987
  • Update spicepod.schema.json (#6000) by @app/github-actions in #6000
  • feat: Run MySQL SF100 on dev runners (#5986) by @peasee in #5986
  • fix: Remove caching RwLock (#6001) by @peasee in #6001
  • 1.3.1 Post-release housekeeping (#6002) by @phillipleblanc in #6002
  • feat: Add initial scheduler crate (#5923) by @peasee in #5923
  • fix flight request context scope (#6004) by @ewgenius in #6004
  • fix: Ensure snapshots on different scale factors are retained (#6009) by @peasee in #6009
  • fix: Allow dev runners in dispatch files (#6011) by @peasee in #6011
  • refactor: Deprecate results_cache for caching.sql_results (#6008) by @peasee in #6008
  • Fix models benchmark results reporting (#6013) by @sgrebnov in #6013
  • fix: Run PR checks for tools/ changes (#6014) by @peasee in #6014
  • feat: Add a CronRequestChannel for scheduler (#6005) by @peasee in #6005
  • feat: Add refresh_cron acceleration parameter, start scheduler on table load (#6016) by @peasee in #6016
  • Update license check to allow dual license crates (#6021) by @sgrebnov in #6021
  • Initial worker concept (#5973) by @Jeadie in #5973
  • Don't fail if cargo-deny already installed (license check) (#6023) by @sgrebnov in #6023
  • Upgrade to DataFusion 47 and Arrow 55 (#5966) by @sgrebnov in #5966
  • Read Iceberg tables from Glue Catalog Connector (#5965) by @kczimm in #5965
  • Handle multiple highlights in v1/search UX (#5963) by @Jeadie in #5963
  • feat: Add cron scheduler configurations for workers (#6033) by @peasee in #6033
  • feat: Add search cache configuration and results wrapper (#6020) by @peasee in #6020
  • Fix GitHub Actions Ubuntu for more workflows (#6040) by @phillipleblanc in #6040
  • Fix Actions for testoperator dispatch manual (#6042) by @phillipleblanc in #6042
  • refactor: Remove worker type (#6039) by @peasee in #6039
  • feat: Support cron dataset refreshes (#6037) by @peasee in #6037
  • Upgrade datafusion-federation to 0.4.2 (#6022) by @phillipleblanc in #6022
  • Define SearchPipeline and use in runtime/vector_search.rs. (#6044) by @Jeadie in #6044
  • fix: Scheduler test when scheduler is running (#6051) by @peasee in #6051
  • doc: Spice Cloud Connector Limitation (#6035) by @Sevenannn in #6035
  • Add support for on_conflict:upsert for Arrow MemTable (#6059) by @sgrebnov in #6059
  • Enhance Arrow Flight DoPut operation tracing (#6053) by @sgrebnov in #6053
  • Update openapi.json (#6032) by @app/github-actions in #6032
  • Add tools enabled to MCP server capabilities (#6060) by @Jeadie in #6060
  • Upgrade to delta_kernel 0.11 (#6045) by @phillipleblanc in #6045
  • refactor: Replace refresh oneshot with notify (#6050) by @peasee in #6050
  • Enable Upsert OnConflictBehavior for runtime.task_history table (#6068) by @sgrebnov in #6068
  • feat: Add a workers integration test (#6069) by @peasee in #6069
  • Fix DuckDB acceleration ORDER BY rand() and ORDER BY NULL (#6071) by @phillipleblanc in #6071
  • Update Models Benchmarks to report unsuccessful evals as errors (#6070) by @sgrebnov in #6070
  • Revert: fix: Use HTTPS ubuntu sources (#6082) by @Sevenannn in #6082
  • Add initial support for Spice Cloud Platform management (#6089) by @sgrebnov in #6089
  • Run spiceai cloud connector TPC tests using spice dev apps (#6049) by @Sevenannn in #6049
  • feat: Add SQL worker action (#6093) by @peasee in #6093
  • Post-release housekeeping (#6097) by @phillipleblanc in #6097
  • Fix search bench (#6091) by @Jeadie in #6091
  • fix: Update benchmark snapshots (#6094) by @app/github-actions in #6094
  • fix: Update benchmark snapshots (#6095) by @app/github-actions in #6095
  • Glue catalog connector for hive style parquet (#6054) by @kczimm in #6054
  • Update openapi.json (#6100) by @app/github-actions in #6100
  • Improve Flight Client DoPut / Publish error handling (#6105) by @sgrebnov in #6105
  • Define PostApplyCandidateGeneration to handle all filters & projections. (#6096) by @Jeadie in #6096
  • refactor: Update the tracing task names for scheduled tasks (#6101) by @peasee in #6101
  • task: Switch GH runners in PR and testoperator (#6052) by @peasee in #6052
  • feat: Connect search caching for HTTP and tools (#6108) by @peasee in #6108
  • test: Add multi-dataset cron test (#6102) by @peasee in #6102
  • Sanitize the ListingTableURL (#6110) by @phillipleblanc in #6110
  • Avoid partial writes by FlightTableWriter (#6104) by @sgrebnov in #6104
  • fix: Update the TPCDS postgres acceleration indexes (#6111) by @peasee in #6111
  • Make Glue Catalog refreshable (#6103) by @kczimm in #6103
  • Refactor Glue catalog to use a new Glue data connector (#6125) by @kczimm in #6125
  • Emit retry error on flight transient connection failure (#6123) by @Sevenannn in #6123
  • Update Flight DoPut implementation to send single final PutResult (#6124) by @sgrebnov in #6124
  • feat: Add metrics for search results cache (#6129) by @peasee in #6129
  • update MCP crate (#6130) by @Jeadie in #6130
  • feat: Add search cache status header, respect cache control (#6131) by @peasee in #6131
  • fix: Allow specifying individual caching blocks (#6133) by @peasee in #6133
  • Update openapi.json (#6132) by @app/github-actions in #6132
  • Add CSV support to Glue data connector (#6138) by @kczimm in #6138
  • Update Spice Cloud Platform management UX (#6140) by @sgrebnov in #6140
  • Add TPCH bench for Glue catalog (#6055) by @kczimm in #6055
  • Enforce max_tokens_per_request limit in OpenAI embedding logic (#6144) by @sgrebnov in #6144
  • Enable Spice Cloud Control Plane connect (management) for FinanceBench (#6147) by @sgrebnov in #6147
  • Add integration test for Spice Cloud Platform management (#6150) by @sgrebnov in #6150
  • fix: Invalidate search cache on refresh (#6137) by @peasee in #6137
  • fix: Prevent registering cron schedule with change stream accelerations (#6152) by @peasee in #6152
  • test: Add an append cron integration test (#6151) by @peasee in #6151
  • fix: Cache search results with no-cache directive (#6155) by @peasee in #6155
  • fix: Glue catalog dispatch runner type (#6157) by @peasee in #6157
  • Fix: Glue S3 location for directories and Iceberg credentials (#6174) by @kczimm in #6174
  • Support multiple columns in FTS (#6156) by @Jeadie in #6156
  • fix: Add --cache-control flag for search CLI (#6158) by @peasee in #6158
  • Add Glue data connector tpch bench test for parquet and csv (#6170) by @kczimm in #6170
  • fix: Apply results cache deprecation correctly (#6177) by @peasee in #6177
  • Fix Linux CUDA build (use candle-core 0.8.4 and cudarc v0.12) (#6181) by @sgrebnov in #6181
  • fix: return empty stream when no results for Databricks SQL Warehouse (#6192) by @kczimm in #6192

Full Changelog: v1.3.2...v1.4.0-rc.1

- Rust
Published by kczimm 9 months ago

https://github.com/spiceai/spiceai - v1.3.2

Spice v1.3.2 (June 3, 2025)

Spice v1.3.2 improves DuckDB acceleration to accept ORDER BY rand() and ORDER BY NULL SQL queries, and supports the TIMESTAMP_NTZ(0) (timestamp with seconds precision) type in Snowflake.
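As a rough sketch of the DuckDB fix, queries of the following shape should now be accepted against a DuckDB-accelerated dataset. The dataset name `my_accelerated_dataset` is hypothetical and not taken from the release.

```sql
-- Hypothetical DuckDB-accelerated dataset: my_accelerated_dataset.
-- Return ten rows in random order.
SELECT * FROM my_accelerated_dataset ORDER BY rand() LIMIT 10;

-- A constant sort key; it imposes no particular ordering on the result.
SELECT * FROM my_accelerated_dataset ORDER BY NULL;
```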

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

No new cookbook recipes.

The Spice Cookbook now includes 67 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.3.2, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.3.2 image:

```console
docker pull spiceai/spiceai:1.3.2
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

No major dependency changes.

Changelog

  • Handle Snowflake Timestamp NTZ with seconds precision (#6084) by @kczimm in #6084
  • Fix DuckDB acceleration ORDER BY rand() and ORDER BY NULL (#6071) by @phillipleblanc in #6071

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.3.1...v1.3.2

- Rust
Published by phillipleblanc 9 months ago

https://github.com/spiceai/spiceai - v1.3.1

- Rust
Published by phillipleblanc 9 months ago

https://github.com/spiceai/spiceai - v1.3.0

Spice v1.3.0 (May 19, 2025)

Spice v1.3.0 accelerates data and AI applications with significantly improved query performance, reliability, and expanded Databricks integration. New support for the Databricks SQL Statement Execution API enables direct SQL queries on Databricks SQL Warehouses, complementing Mosaic AI model serving and embeddings (introduced in v1.2.2) and the existing Databricks catalog and dataset integrations. This release upgrades to DataFusion v46, optimizes results caching performance, and strengthens container security with least-privilege sandboxing improvements.

What's New in v1.3.0

  • Databricks SQL Statement Execution API Support: Added support for the Databricks SQL Statement Execution API, enabling direct SQL queries against Databricks SQL Warehouses for optimized performance in analytics and reporting workflows.

Example spicepod.yml configuration:

```yaml
datasets:
  - from: databricks:spiceai.datasets.my_awesome_table
    name: my_awesome_table
    params:
      mode: sql_warehouse
      databricks_endpoint: ${env:DATABRICKS_ENDPOINT}
      databricks_sql_warehouse_id: ${env:DATABRICKS_SQL_WAREHOUSE_ID}
      databricks_token: ${env:DATABRICKS_TOKEN}
```

For details, see the Databricks Data Connector documentation.

  • Improved Results Cache Performance & Hashing Algorithm: Spice now supports an alternative results cache hashing algorithm, ahash, in addition to siphash, which remains the default. Configure it via:

```yaml
runtime:
  results_cache:
    hashing_algorithm: ahash # or siphash
```

The hashing algorithm determines how cache keys are hashed before being stored, which affects both lookup speed and resistance to potential denial-of-service (DoS) attacks.

Using ahash improves performance for large queries or query plans. Combined with results cache optimizations, it reduces 99th percentile request latency and increases total requests/second for queries with large result sets (100k+ cached rows). The following charts show performance tested against the TPCH Query #17 on a scale factor 5 dataset (30+ million rows, 5GB):

| Latency | Req/sec |
| --- | --- |
| Improvements for the 99th percentile query latency, compared against 1.2.2 with cache key type and hashing algorithm. | Improvements for the requests/second, compared against 1.2.2 with cache key type and hashing algorithm. |

Note: ahash was not available in v1.2.2, so it is excluded from comparisons.

To learn more, refer to the Results Cache Hashing Algorithm documentation.

  • SQL Query Performance: Optimized the critical SQL query path, reducing overhead and improving response times for simple queries by 10-20%.

  • DuckDB Acceleration: Fixed a bug in the DuckDB acceleration engine causing query failures under high concurrency when querying datasets accelerated into multiple DuckDB files.

  • Container Security: The container image now runs as a non-root user with enhanced sandboxing and includes only essential dependencies for a slimmer, more secure image.

DataFusion v46 Highlights

Spice.ai is built on the DataFusion query engine. The v46 release brings:

  • Faster Performance 🚀: DataFusion 46 introduces significant performance enhancements, including a 2x faster median() function for large datasets without grouping, 10–100% speed improvements in FIRST_VALUE and LAST_VALUE window functions by avoiding sorting, and a 40x faster uuid() function. Additional optimizations, such as a 50% faster repeat() string function, accelerated chr() and to_hex() functions, improved grouping algorithms, and Parquet row group pruning with NOT LIKE filters, further boost overall query efficiency.

  • New range() Table Function: A new table-valued function range(start, stop, step) has been added to make it easy to generate integer sequences — similar to PostgreSQL’s generate_series() or Spark’s range(). Example: SELECT * FROM range(1, 10, 2);

  • UNION [ALL | DISTINCT] BY NAME Support: DataFusion now supports UNION BY NAME and UNION ALL BY NAME, which align columns by name instead of position. This matches functionality found in systems like Spark and DuckDB and simplifies combining heterogeneously ordered result sets.

Example:

```sql
SELECT col1, col2 FROM t1
UNION ALL BY NAME
SELECT col2, col1 FROM t2;
```

See the DataFusion 46.0.0 release notes for details.

Spice.ai adopts the latest minus one DataFusion release for quality assurance and stability. The upgrade to DataFusion v47 is planned for Spice v1.4.0 in June.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

  • Added Accelerated Views: Pre-calculate and materialize data derived from one or more underlying datasets.

The Spice Cookbook now includes 67 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.3.0, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.3.0 image:

```console
docker pull spiceai/spiceai:1.3.0
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

Changelog

See the full list of changes at: v1.2.2...v1.3.0

- Rust
Published by phillipleblanc 9 months ago

https://github.com/spiceai/spiceai - v1.2.2

Spice v1.2.2 (May 12, 2025)

Spice v1.2.2 introduces support for Databricks Mosaic AI model serving and embeddings, alongside the existing Databricks catalog and dataset integrations. It adds configurable service ports in the Helm chart and resolves several bugs to improve stability and performance.

Highlights in v1.2.2

  • Databricks Model & Embedding Provider: Spice integrates with Databricks Model Serving for models and embeddings, enabling secure access via machine-to-machine (M2M) OAuth authentication with service principal credentials. The runtime automatically refreshes tokens using databricks_client_id and databricks_client_secret, ensuring uninterrupted operation. This feature supports Databricks-hosted large language models and embedding models.

```yaml
models:
  - from: databricks:databricks-llama-4-maverick
    name: llama-4-maverick
    params:
      databricks_endpoint: dbc-46470731-42e5.cloud.databricks.com
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
      databricks_client_secret: ${secrets:DATABRICKS_CLIENT_SECRET}

embeddings:
  - from: databricks:databricks-gte-large-en
    name: gte-large-en
    params:
      databricks_endpoint: dbc-42424242-4242.cloud.databricks.com
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
      databricks_client_secret: ${secrets:DATABRICKS_CLIENT_SECRET}
```

For detailed setup instructions, refer to the Databricks Model Provider documentation.

  • Configurable Helm Chart Service Ports: The Helm chart now supports custom ports for flexible network configurations for deployments. Specify non-default ports in your Helm values file.

  • Resolved Issues:

    • MCP Nested Tool Calling: Fixed a bug preventing nested tool invocation when Spice operates as the MCP server federating to MCP clients.
    • Dataset Load Concurrency: Corrected a failure to respect the dataset_load_parallelism setting during dataset loading.
    • Acceleration Hot-Reload: Addressed an issue where changes to acceleration enable/disable settings were not detected during hot reload of Spicepod.yaml.

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

Updated cookbooks:

The Spice Cookbook now includes 68 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.2.2, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.2.2 image:

```console
docker pull spiceai/spiceai:1.2.2
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

  • No major dependency changes.

Changelog

  • Update spark-connect-rs to override user agent string by @ewgenius in https://github.com/spiceai/spice/pull/5798
  • Merge pull request by @ewgenius in https://github.com/spiceai/spice/pull/5796
  • Pass the default user agent string to the Databricks Spark, Delta, and Unity clients by @ewgenius in https://github.com/spiceai/spice/pull/5717
  • bump to 1.2.2 by @Jeadie in https://github.com/spiceai/spice/pull/none
  • Helm chart: support for service ports overrides by @sgrebnov in https://github.com/spiceai/spice/pull/5774
  • Update spice cli login command with client-id and client-secret flags for Databricks by @ewgenius in https://github.com/spiceai/spice/pull/5788
  • Fix bug where setting Cache-Control: no-cache doesn't compute the cache key by @phillipleblanc in https://github.com/spiceai/spice/pull/5779
  • Update to datafusion-contrib/datafusion-table-providers#336 by @phillipleblanc in https://github.com/spiceai/spice/pull/5778
  • Lru cache: limit single cached record size to u32::MAX (4GB) by @sgrebnov in https://github.com/spiceai/spice/pull/5772
  • Fix LLMs calling nested MCP tools by @Jeadie in https://github.com/spiceai/spice/pull/5771
  • MySQL: Set the character_set_results/character_set_client/character_set_connection session variables on connection setup by @Sevenannn in https://github.com/spiceai/spice/pull/5770
  • Control the parallelism of acceleration refresh datasets with runtime.dataset_load_parallelism by @phillipleblanc in https://github.com/spiceai/spice/pull/5763
  • Fix Iceberg predicates not matching the Arrow type of columns read from parquet files by @phillipleblanc in https://github.com/spiceai/spice/pull/5761
  • fix: Use decimal_cmp for numerical BETWEEN in SQLite by @peasee in https://github.com/spiceai/spice/pull/5760
  • Support product name override in databricks user agent string by @ewgenius in https://github.com/spiceai/spice/pull/5749
  • Databricks U2M Token Provider support by @ewgenius in https://github.com/spiceai/spice/pull/5747
  • Remove HTTP auth from LLM config and simplify Databricks models logic by using static headers by @Jeadie in https://github.com/spiceai/spice/pull/5742
  • clear plan cache when dataset updates by @kczimm in https://github.com/spiceai/spice/pull/5741
  • Support Databricks M2M auth in LLMs + Embeddings by @Jeadie in https://github.com/spiceai/spice/pull/5720
  • Retrieve Github App tokens in background; make TokenProvider not async by @Jeadie in https://github.com/spiceai/spice/pull/5718
  • Make 'token_providers' crate by @Jeadie in https://github.com/spiceai/spice/pull/5716
  • Databricks AI: Embedding models & LLM streaming by @Jeadie in https://github.com/spiceai/spice/pull/5715

See the full list of changes at: v1.2.1...v1.2.2

- Rust
Published by Jeadie 10 months ago

https://github.com/spiceai/spiceai - v1.2.1

Spice v1.2.1 (May 6, 2025)

Spice v1.2.1 includes several data connector fixes and improves query performance for accelerated views. This release also introduces Databricks Service Principal (M2M OAuth) authentication and expands parameterized queries.

Highlights in v1.2.1

  • Databricks Service Principal Support: Databricks datasets and catalogs now support Machine-to-Machine (M2M) OAuth authentication via Service Principals, enabling secure machine connections to Databricks.

Example spicepod.yaml:

```yaml
datasets:
  - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks unity catalog
    name: my_delta_lake_table
    params:
      mode: delta_lake
      databricks_endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com
      databricks_client_id: ${secrets:DATABRICKS_CLIENT_ID}
      databricks_client_secret: ${secrets:DATABRICKS_CLIENT_SECRET}
```

For details, see documentation for:

  • Databricks Data Connector
  • Databricks Unity Catalog Connector

  • Iceberg Data Connector: Now supports cross-account table access via the AWS Glue Catalog Connector and fixes an issue when querying data from append mode datasets.

  • Iceberg Catalog API: Full compatibility with the Iceberg HTTP REST Catalog API to consume Spice datasets from Iceberg Catalog clients.

For details, see documentation for:

  • Iceberg Data Connector
  • S3 Data Connector

  • Improved Parameterized Query Support: Expanded type inference for placeholders in:

    • IN list expressions
    • LIKE patterns
    • SIMILAR TO patterns
    • LIMIT clauses
    • Subqueries
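The placeholder positions listed above can be sketched as follows. This is an illustration only: the tables `orders` and `customers` and their columns are hypothetical, and the subquery form shown is one plausible reading of the "Subqueries" item (a placeholder tested against a subquery result).

```sql
-- Hypothetical tables: orders(id, customer_id, status), customers(id, name).

-- IN list: placeholder types can be inferred from the compared column.
SELECT * FROM orders WHERE status IN ($1, $2);

-- LIKE and SIMILAR TO patterns.
SELECT * FROM customers WHERE name LIKE $1;
SELECT * FROM customers WHERE name SIMILAR TO $1;

-- LIMIT clause.
SELECT * FROM orders LIMIT $1;

-- Subquery: the placeholder is compared against the subquery's output column.
SELECT * FROM orders WHERE $1 IN (SELECT id FROM customers);
```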

New Contributors 🎉

Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

New recipes for:

The Spice Cookbook now includes 68 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.2.1, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.2.1 image:

```console
docker pull spiceai/spiceai:1.2.1
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

  • No major dependency changes.

Changelog

  • Fix: Specify metric type as a dimension for testoperator by @peasee in #5630
  • Fix: Add option to run dispatch schedule by @peasee in #5631
  • Infer placeholder datatype for InList, Like, and SimilarTo by @kczimm in #5626
  • Add QA analytics for 1.2.0 by @phillipleblanc in #5640
  • Fix: Use SPICED_COMMIT for spiced_commit_sha by @peasee in #5632
  • New crates/tools by @Jeadie in #5121
  • Update openapi.json by @github-actions in #5643
  • Enable metrics reporting for models benchmarks (evals) by @sgrebnov in #5639
  • Implement CatalogBuilder, add app and runtime references to catalog component, add runtime reference to connector params by @ewgenius in #5641
  • Fix eventing bug in LLM progress; Add tool and worker progress by @Jeadie in #5619
  • Handle small precision differences in TPCH answer validation by @phillipleblanc in #5642
  • Add TokenProviderRegistry to the runtime by @ewgenius in #5651
  • Provide ModelContextLayer for evals by @Jeadie in #5648
  • Databricks data_components refactor. Databricks Spark connect - add set_token method and writable spark session by @ewgenius in #5654
  • Extract AWS Glue warehouse for cross-account Iceberg tables by @phillipleblanc in #5656
  • Refactor Dataset component by @phillipleblanc in #5660
  • Fix Iceberg API returning 404 when schema contains a Dictionary by @phillipleblanc in #5665
  • Fix dependencies: downgrade swagger-ui to v8; force zip to 2.3.0 by @kczimm in #5664
  • Add DuckDB indexes spicepod, additional dispatches by @peasee in #5633
  • Update readme: update data federation link by @nuvic in #5673
  • Support metadata columns for object-store based data connectors by @phillipleblanc in #5661
  • Add model name to LLM judges, and add model_graded_scoring task by @Jeadie in #5655
  • Add SF1000 TPCH test spicepods for delta lake by @Sevenannn in #5606
  • Validate Github Connector resource existence before building the github connector graphql table by @Sevenannn in #5674
  • Remove hard-coded embedding performance tests in CI by @Sevenannn in #5675
  • Databricks M2M auth for spark connect data connector by @ewgenius in #5659
  • Enable federated data refresh support for accelerated views by @sgrebnov in #5677
  • Add pods watcher integration test by @Sevenannn in #5681
  • Add m2m support for databricks delta connector by @ewgenius in #5680
  • Update end_game.md by @sgrebnov in #5684
  • Update StaticTokenProvider to use SecretString instead of raw str value by @ewgenius in #5686
  • Add M2M Auth support for Databricks catalog connector by @ewgenius in #5687
  • Update UX to disable acceleration federation by @sgrebnov in #5682
  • Improve placeholder inference (LIMIT & Expr::InSubquery) by @phillipleblanc in #5692
  • Tweak default log to ignore aws_config::imds::region by @phillipleblanc in #5693
  • Make Spice properly Iceberg Catalog API compatible for load table API by @phillipleblanc in #5695
  • Use deterministic queries for Databricks m2m catalog tests by @ewgenius in #5696
  • Support retrieving the latest Iceberg table on table scan by @phillipleblanc in #5704
  • Infer partitions from schema_source_path if present by @phillipleblanc in #5721

Full Changelog: v1.2.0...v1.2.1

- Rust
Published by sgrebnov 10 months ago

https://github.com/spiceai/spiceai - v1.2.0

Spice v1.2.0 (Apr 28, 2025)

Spice v1.2.0 is a significant update. It upgrades DataFusion to v45 and Arrow to v54. This release brings faster query performance, support for parameterized queries in SQL and HTTP APIs, and the ability to accelerate views. Several bugs have been fixed and dependencies updated for better stability and speed.

DataFusion v45 Highlights

Spice.ai is built on the DataFusion query engine. The v45 release brings:

  • Faster Performance 🚀: DataFusion is now the fastest single-node engine for Apache Parquet files in the ClickBench benchmark. Performance improved by over 33% from v33 to v45. Arrow StringView is now on by default, making string and binary data queries much faster, especially with Parquet files.

  • Better Quality 📋: DataFusion now runs over 5 million SQL tests per push using the SQLite sqllogictest suite. There are new checks for logical plan correctness and more thorough pre-release testing.

  • New SQL Functions ✨: Added show functions, to_local_time, regexp_count, map_extract, array_distance, array_any_value, greatest, least, and arrays_overlap.

See the DataFusion 45.0.0 release notes for details.
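A few of the new functions can be tried with literal inputs, as in the sketch below. The argument values are arbitrary illustrations, and the calls assume the function signatures documented for DataFusion.

```sql
-- Largest and smallest of a list of values.
SELECT greatest(3, 7, 5);  -- 7
SELECT least(3, 7, 5);     -- 3

-- Count the matches of a regular expression in a string.
SELECT regexp_count('abcabc', 'abc');

-- Euclidean distance between two numeric arrays.
SELECT array_distance(make_array(1.0, 2.0), make_array(1.0, 4.0));

-- Whether two arrays share at least one element.
SELECT arrays_overlap(make_array(1, 2), make_array(2, 3));
```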

Spice.ai upgrades to the latest minus one DataFusion release to ensure adequate testing and stability. The next upgrade to DataFusion v46 is planned for Spice v1.3.0 in May.

What's New in v1.2.0

  • Parameterized Queries: Parameterized queries are now supported with the Flight SQL API and HTTP API. Positional and named arguments via $1 and :param syntax are supported, respectively. Logical plans for SQL statements are cached for faster repeated queries.

Example Cookbook recipes:

See the API Documentation for additional details.
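As a minimal sketch of the two placeholder styles, the statements below use a hypothetical `orders` table. How parameter values are supplied depends on the client (Flight SQL or HTTP) and is not shown here.

```sql
-- Hypothetical table: orders(id, customer_id, total).

-- Positional placeholders: values are bound in order to $1, $2, ...
SELECT id, total FROM orders WHERE customer_id = $1 AND total > $2;

-- Named placeholder: the value is bound by name.
SELECT id, total FROM orders WHERE customer_id = :customer_id;
```

Because logical plans are cached per SQL statement, keeping the statement text constant and varying only the bound values lets repeated queries reuse the cached plan.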

  • Accelerated Views: Views, not just datasets, can now be accelerated. This provides much better performance for views that perform heavy computation.

Example spicepod.yaml:

```yaml
views:
  - name: accelerated_view
    acceleration:
      enabled: true
      engine: duckdb
      primary_key: id
      refresh_check_interval: 1h
    sql: |
      select * from dataset_a
      union all
      select * from dataset_b
```

See the Data Acceleration documentation.

  • Memory Usage Metrics & Configuration: The runtime now tracks memory usage as a metric, and a new runtime memory_limit parameter is available. The memory limit applies specifically to the runtime and should be used in addition to existing memory usage configuration, such as duckdb_memory_limit. Queries that exceed the memory limit spill to disk.

See the Memory Reference for details.

  • New Worker Component: Workers are new configurable compute units in the Spice runtime. They help manage compute across models and tools, handle errors, and balance load. Workers are configured in the workers section of spicepod.yaml.

Example spicepod.yaml:

```yaml
workers:
  - name: round-robin
    description: |
      Distributes requests between 'foo' and 'bar' models in a round-robin fashion.
    models:
      - from: foo
      - from: bar
  - name: fallback
    description: |
      Tries 'bar' first, then 'foo', then 'baz' if earlier models fail.
    models:
      - from: foo
        order: 2
      - from: bar
        order: 1
      - from: baz
        order: 3
```

See the Workers Documentation for details.

  • Databricks Model Provider: Databricks models can now be used with from: databricks:model_name.

Example spicepod.yaml:

```yaml
models:
  - from: databricks:llama-3_2_1_1b_instruct
    name: llama-instruct
    params:
      databricks_endpoint: dbc-46470731-42e5.cloud.databricks.com
      databricks_token: ${ secrets:SPICE_DATABRICKS_TOKEN }
```

See the Databricks model documentation.

  • spice chat CLI Improvements: The spice chat command now supports an optional --temperature parameter. A one-shot chat can also be sent with spice chat <message>.

  • More Type Support: Added support for Postgres JSON type and DuckDB Dictionary type.

  • Other Improvements:

    • New image tags let you pick memory allocators for different use-cases: jemalloc, sysalloc, and mimalloc.
    • Better error handling and logging for chat and model operations.

Contributors

Cookbook Updates

New recipes for:

The Spice Cookbook now includes 68 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.2.0, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.2.0 image:

```console
docker pull spiceai/spiceai:1.2.0
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

Spice is now built with Rust 1.85.0 and the Rust 2024 edition.

Changelog

  • Update end_game.md (#5312) by @peasee in https://github.com/spiceai/spiceai/pull/5312
  • feat: Add initial testoperator query validation (#5311) by @peasee in https://github.com/spiceai/spiceai/pull/5311
  • Update Helm + Prepare for next release (#5317) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5317
  • Update spicepod.schema.json (#5319) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5319
  • add integration test for reading encrypted PDFs from S3 (#5308) by @kczimm in https://github.com/spiceai/spiceai/pull/5308
  • Stop load_components during runtime shutdown (#5306) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5306
  • Update openapi.json (#5321) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5321
  • feat: Implement record batch data validation (#5331) by @peasee in https://github.com/spiceai/spiceai/pull/5331
  • Update QA analytics for v1.1.1 (#5320) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5320
  • fix: Update benchmark snapshots (#5337) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5337
  • Enforce pulls with Spice v1.0.4 (#5339) by @lukekim in https://github.com/spiceai/spiceai/pull/5339
  • Upgrade to DataFusion 45, Arrow 54, Rust 1.85 & Edition 2024 (#5334) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5334
  • feat: Allow validating testoperator in benchmark workflow (#5342) by @peasee in https://github.com/spiceai/spiceai/pull/5342
  • Upgrade delta_kernel to 0.9 (#5343) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5343
  • deps: Update odbc-api (#5344) by @peasee in https://github.com/spiceai/spiceai/pull/5344
  • Fix schema inference for Snowflake tables with large number of columns (#5348) by @ewgenius in https://github.com/spiceai/spiceai/pull/5348
  • feat: Update testoperator dispatch for validation, version metric (#5349) by @peasee in https://github.com/spiceai/spiceai/pull/5349
  • fix: validate_results not validate (#5352) by @peasee in https://github.com/spiceai/spiceai/pull/5352
  • revert to previous pdf-extract; remove test for encrypted pdf support (#5355) by @kczimm in https://github.com/spiceai/spiceai/pull/5355
  • Stablize the test verify_similarity_search_chat_completion (#5284) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5284
  • Turn off delta_kernel::log_segment logging and refactor log filtering (#5367) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5367
  • Upgrade to DuckDB 1.2.2 (#5375) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5375
  • Update Readme - fix broken and outdated links (#5376) by @ewgenius in https://github.com/spiceai/spiceai/pull/5376
  • Upgrade dependabot dependencies (#5385) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5385
  • fix: Remove IMAP oauth (#5386) by @peasee in https://github.com/spiceai/spiceai/pull/5386
  • Bump Helm chart to 1.1.2 (#5389) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5389
  • Refactor accelerator registry as part of runtime. (#5318) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5318
  • Include vnd.spiceai.sql/nsql.v1+json response examples (openapi docs) (#5388) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5388
  • docs: Update endgame template with SpiceQA, update qa analytics (#5391) by @peasee in https://github.com/spiceai/spiceai/pull/5391
  • Make graceful shutdown timeout configurable (#5358) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5358
  • docs: Update release criteria with note on max columns (#5401) by @peasee in https://github.com/spiceai/spiceai/pull/5401
  • Update openapi.json (#5392) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5392
  • FinanceBench: update scorer instructions and switch scoring model to gpt-4.1 (#5395) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5395
  • feat: Write OTel metrics for testoperator (#5397) by @peasee in https://github.com/spiceai/spiceai/pull/5397
  • Update nsql openapi title (#5403) by @ewgenius in https://github.com/spiceai/spiceai/pull/5403
  • Track ai_inferences_count with used tools flag. Extensible runtime request context. (#5393) by @ewgenius in https://github.com/spiceai/spiceai/pull/5393
  • Include newly detected view as changed view (#5408) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5408
  • Track used_tools in ai_inferences_with_spice_count as number (#5409) by @ewgenius in https://github.com/spiceai/spiceai/pull/5409
  • Update openapi.json (#5406) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5406
  • Tweak enforce pulls with Spice (#5411) by @lukekim in https://github.com/spiceai/spiceai/pull/5411
  • Allow flightsql and spiceai connectors to override flight max message size (#5407) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5407
  • Retry model graded scorer once on successful, empty response (#5405) by @Jeadie in https://github.com/spiceai/spiceai/pull/5405
  • use span task name in 'spice trace' tree, not span_id (#5412) by @Jeadie in https://github.com/spiceai/spiceai/pull/5412
  • Rename to track_ai_inferences_with_spice_count in all places (#5410) by @ewgenius in https://github.com/spiceai/spiceai/pull/5410
  • Update qa_analytics.csv (#5421) by @peasee in https://github.com/spiceai/spiceai/pull/5421
  • Remove the filter for the list_datasets tool in the AI inferences metric count. (#5417) by @ewgenius in https://github.com/spiceai/spiceai/pull/5417
  • fix: Testoperator uses an exact API key for benchmark metric submission (#5413) by @peasee in https://github.com/spiceai/spiceai/pull/5413
  • feat: Enable testoperator metrics in workflow (#5422) by @peasee in https://github.com/spiceai/spiceai/pull/5422
  • Upgrade mistral.rs (#5404) by @Jeadie in https://github.com/spiceai/spiceai/pull/5404
  • Include all FinanceBench documents in benchmark tests (#5426) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5426
  • Handle second Ctrl-C to force runtime termination (#5427) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5427
  • Add optional --temperature parameter for spice chat CLI command (#5429) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5429
  • Remove with_runtime_status from the RuntimeBuilder (#5430) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5430
  • Fix spice chat error handling (#5433) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5433
  • Add more test models to FinanceBench benchmark (#5431) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5431
  • support 'from: databricks:model_name' (#5434) by @Jeadie in https://github.com/spiceai/spiceai/pull/5434
  • Upgrade Pulls with Spice to v1.0.6 and add concurrency control (#5442) by @lukekim in https://github.com/spiceai/spiceai/pull/5442
  • Upgrade DataFusion table providers (#5443) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5443
  • Test spice chat in e2e_test_spice_cli (#5447) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5447
  • Allow for one-shot chat request using spice chat <message> (#5444) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5444
  • Enable parallel data sampling for NSQL (#5449) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5449
  • Upgrade Go from v1.23.4 to v1.24.2 (#5462) by @lukekim in https://github.com/spiceai/spiceai/pull/5462
  • Update PULL_REQUEST_TEMPLATE.md (#5465) by @lukekim in https://github.com/spiceai/spiceai/pull/5465
  • Enable captured outputs by default when spiced is started by the CLI (spice run) (#5464) by @lukekim in https://github.com/spiceai/spiceai/pull/5464
  • Parameterized queries via Flight SQL API (#5420) by @kczimm in https://github.com/spiceai/spiceai/pull/5420
  • fix: Update benchmarks readme badge (#5466) by @peasee in https://github.com/spiceai/spiceai/pull/5466
  • delay auth check for binding parameterized queries (#5475) by @kczimm in https://github.com/spiceai/spiceai/pull/5475
  • Add support for ? placeholder syntax in parameterized queries (#5463) by @kczimm in https://github.com/spiceai/spiceai/pull/5463
  • enable task name override for non static span names (#5423) by @Jeadie in https://github.com/spiceai/spiceai/pull/5423
  • Allow parameter queries with no parameters (#5481) by @kczimm in https://github.com/spiceai/spiceai/pull/5481
  • Support unparsing UNION for distinct results (#5483) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5483
  • add rust-toolchain.toml (#5485) by @kczimm in https://github.com/spiceai/spiceai/pull/5485
  • Add parameterized query support to the HTTP API (#5484) by @kczimm in https://github.com/spiceai/spiceai/pull/5484
  • E2E test for spice chat behavior (#5451) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5451
  • Re-enable and fix huggingface models integration tests (#5478) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5478
  • Update openapi.json (#5488) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5488
  • feat: Record memory usage as a metric (#5489) by @peasee in https://github.com/spiceai/spiceai/pull/5489
  • fix: update dispatcher to run all benchmarks, rename metric, update spicepods, add scale factor (#5500) by @peasee in https://github.com/spiceai/spiceai/pull/5500
  • Fix ILIKE filters support (#5502) by @ewgenius in https://github.com/spiceai/spiceai/pull/5502
  • fix: Update test spicepod locations and names (#5505) by @peasee in https://github.com/spiceai/spiceai/pull/5505
  • fix: Update benchmark snapshots (#5508) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5508
  • fix: Update benchmark snapshots (#5512) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5512
  • Fix Delta Lake bug for: Found unmasked nulls for non-nullable StructArray field "predicate" (#5515) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5515
  • fix: working directory for duckdb e2e test spicepods (#5510) by @peasee in https://github.com/spiceai/spiceai/pull/5510
  • Tweaks to README.md (#5516) by @lukekim in https://github.com/spiceai/spiceai/pull/5516
  • Cache logical plans of SQL statements (#5487) by @kczimm in https://github.com/spiceai/spiceai/pull/5487
  • Fix content-type: application/json (#5517) by @Jeadie in https://github.com/spiceai/spiceai/pull/5517
  • Validate postgres results in testoperator dispatch (#5504) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5504
  • fix: Update benchmark snapshots (#5511) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5511
  • Fix results cache by SQL with prepared statements (#5518) by @kczimm in https://github.com/spiceai/spiceai/pull/5518
  • Add initial support for views acceleration (#5509) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5509
  • fix: Update benchmark snapshots (#5527) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5527
  • Support switching the memory allocator Spice uses via alloc-* features. (#5528) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5528
  • fix: Update benchmark snapshots (#5525) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5525
  • Add test spicepod for tpch mysql-duckdbfile acceleration by @Sevenannn in https://github.com/spiceai/spiceai/pull/5521
  • Fix nightly arm build - change tag -default to -models (#5529) by @ewgenius in https://github.com/spiceai/spiceai/pull/5529
  • LLM router via worker spicepod component (#5513) by @Jeadie in https://github.com/spiceai/spiceai/pull/5513
  • Apply Spice advanced acceleration logic and params support to accelerated views (#5526) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5526
  • Enable DatasetCheckpoint logic for accelerated views (#5533) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5533
  • Fix public '.model' name for router workers (#5535) by @Jeadie in https://github.com/spiceai/spiceai/pull/5535
  • feat: Add Runtime memory limit parameter (#5536) by @peasee in https://github.com/spiceai/spiceai/pull/5536
  • For fallback worker, check first item in chat/completion stream. (#5537) by @Jeadie in https://github.com/spiceai/spiceai/pull/5537
  • Move rate limit check to after parameterized query binding (#5540) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5540
  • Update spicepod.schema.json (#5545) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5545
  • Accelerate views: refresh_on_startup, ready_state, jitter params support (#5547) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5547
  • Add integration test for accelerated views (#5550) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5550
  • Don't install make or expect on spiceai-macos runners (#5554) by @lukekim in https://github.com/spiceai/spiceai/pull/5554
  • event_stream crate for emitting events from tracing::Span; used in v1/chat/completions streaming. (#5474) by @Jeadie in https://github.com/spiceai/spiceai/pull/5474
  • Fix typo in method (#5559) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5559
  • Run test operator every day and current and previous commits (#5557) by @lukekim in https://github.com/spiceai/spiceai/pull/5557
  • Add aws_allow_http parameter for delta lake connector (#5541) by @Sevenannn in https://github.com/spiceai/spiceai/pull/5541
  • feat: Add branch name to metric dimensions in testoperator (#5563) by @peasee in https://github.com/spiceai/spiceai/pull/5563
  • fix: Update the tpch benchmark snapshots for: ./test/spicepods/tpch/sf1/federated/odbc[databricks].yaml (#5565) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5565
  • fix: Split scheduled dispatch into a separate job (#5567) by @peasee in https://github.com/spiceai/spiceai/pull/5567
  • fix: Use outputs.SPICED_COMMIT (#5568) by @peasee in https://github.com/spiceai/spiceai/pull/5568
  • fix: Use refs in testoperator dispatch instead of commits (#5569) by @peasee in https://github.com/spiceai/spiceai/pull/5569
  • fix: actions/checkout ref does not take a full ref (#5571) by @peasee in https://github.com/spiceai/spiceai/pull/5571
  • fix: Testoperator dispatch (#5572) by @peasee in https://github.com/spiceai/spiceai/pull/5572
  • Respect update-snapshots when running all benchmarks manually (#5577) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5577
  • Use FETCH_HEAD instead of ${{ inputs.ref }} to list commits in setup_spiced (#5579) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5579
  • Add additional test scenarios for benchmarks (#5582) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5582
  • fix: Update the tpch benchmark snapshots for: test/spicepods/tpch/sf1/accelerated/databricks[delta_lake]-duckdb[file].yaml (#5590) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5590
  • fix: Update the tpch benchmark snapshots for: test/spicepods/tpch/sf1/accelerated/mysql-duckdb[file].yaml (#5591) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5591
  • Fix Snowflake data connector rows ordering (#5599) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5599
  • fix: Update benchmark snapshots (#5595) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5595
  • fix: Update the tpch benchmark snapshots for: test/spicepods/tpch/sf1/accelerated/databricks[delta_lake]-arrow.yaml (#5594) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5594
  • fix: Update benchmark snapshots (#5589) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5589
  • fix: Update benchmark snapshots (#5583) by @app/github-actions in https://github.com/spiceai/spiceai/pull/5583
  • Downgrade DuckDB to 1.1.3 (#5607) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5607
  • Add prepared statement integration tests (#5544) by @kczimm in https://github.com/spiceai/spiceai/pull/5544

Full Changelog: v1.1.2...v1.2.0

- Rust
Published by ewgenius 10 months ago

https://github.com/spiceai/spiceai - v1.1.2

Spice v1.1.2 (Apr 14, 2025)

Spice v1.1.2 improves Delta Lake Data Connector performance, introduces new Accept headers for the /v1/sql and /v1/nsql endpoints to include query metadata with results, and resolves an issue with the Snowflake Data Connector when handling wide tables (>600 columns).

The official Tableau Connector for Spice.ai v0.1 has been released, making it easy to connect to both self-hosted Spice.ai and Spice Cloud instances using Tableau.

What's New in v1.1.2

  • Tableau Connector for Spice.ai: Released the initial version (v0.1) of the official Tableau Taco Connector (fully open-source), enabling data visualization and analytics in Tableau with self-hosted Spice.ai and Spice Cloud deployments.

  • Delta Lake Data Connector: Upgraded delta_kernel to v0.9, and optimized scan operations, reducing query execution time by up to 20% on large datasets.

  • Snowflake Data Connector: Fixed a bug that caused failures when loading tables with more than 600 columns.

  • Query Metadata (SQL and NSQL): Added support for the application/vnd.spiceai.sql.v1+json Accept header on the /v1/sql endpoint, and the application/vnd.spiceai.nsql.v1+json Accept header on the /v1/nsql endpoint, enabling responses to include metadata such as the executed SQL query and schema alongside results.

Example:

```bash
curl -XPOST "http://localhost:8090/v1/nsql" \
  -H "Content-Type: application/json" \
  -H "Accept: application/vnd.spiceai.nsql.v1+json" \
  -d '{
    "query": "What’s the highest tip any passenger gave?"
  }' | jq
```

Example response:

```json
{
  "row_count": 1,
  "schema": {
    "fields": [
      {
        "name": "highest_tip",
        "data_type": "Float64",
        "nullable": true,
        "dict_id": 0,
        "dict_is_ordered": false,
        "metadata": {}
      }
    ],
    "metadata": {}
  },
  "data": [
    {
      "highest_tip": 428.0
    }
  ],
  "sql": "SELECT MAX(\"tip_amount\") AS \"highest_tip\"\nFROM \"spice\".\"public\".\"taxi_trips\""
}
```

For details, see the SQL Query API and NSQL API documentation.
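As a rough client-side sketch of the same flow, the Python snippet below builds (but does not send) a POST to /v1/nsql with the new Accept header, then reads the metadata fields out of a payload shaped like the example response above. The helper names `build_nsql_request` and `summarize` are illustrative only, and the base URL is simply the local default from the curl example.

```python
import json
import urllib.request

NSQL_ACCEPT = "application/vnd.spiceai.nsql.v1+json"


def build_nsql_request(question, base_url="http://localhost:8090"):
    """Build (without sending) a POST to /v1/nsql that asks for query metadata."""
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/nsql",
        data=body,
        headers={"Content-Type": "application/json", "Accept": NSQL_ACCEPT},
        method="POST",
    )


def summarize(payload):
    """Pull the generated SQL, column names, and rows out of a metadata response."""
    columns = [field["name"] for field in payload["schema"]["fields"]]
    return {
        "sql": payload["sql"],
        "columns": columns,
        "row_count": payload["row_count"],
        "rows": payload["data"],
    }


# Payload trimmed from the example response above.
sample = {
    "row_count": 1,
    "schema": {"fields": [{"name": "highest_tip", "data_type": "Float64", "nullable": True}]},
    "data": [{"highest_tip": 428.0}],
    "sql": 'SELECT MAX("tip_amount") AS "highest_tip"\nFROM "spice"."public"."taxi_trips"',
}

request = build_nsql_request("What's the highest tip any passenger gave?")
summary = summarize(sample)
print(request.get_method(), request.full_url)
print(summary["columns"], summary["rows"])
```

To actually run the query, the request object could be passed to `urllib.request.urlopen` against a running Spice instance and the body decoded with `json.load`.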

Contributors

Breaking Changes

No breaking changes in this release.

Cookbook Updates

No major cookbook additions.

The Spice Cookbook now includes 65 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.1.2, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.1.2 image:

```console
docker pull spiceai/spiceai:1.1.2
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

Changelog

  • Backport - Fix schema inference for Snowflake tables with large number of columns #5348 by @ewgenius in #5350
  • Upgrade delta_kernel to 0.9 (#5343) by @phillipleblanc in #5356
  • Add basic support for application/vnd.spiceai.sql.v1+json format (#5333) by @sgrebnov in #5333
  • Convert DataFusion filters to Delta Kernel predicates by @phillipleblanc in #5362
  • revert to previous pdf-extract; remove test for encrypted pdf support by @kczimm in #5355
  • Turn off delta_kernel::log_segment logging and refactor log filtering by @phillipleblanc in #5367
  • Extend application/vnd.spiceai.sql.v1+json with schema and row_count fields by @sgrebnov in #5365
  • Make separate vnd.spiceai.sql.v1+json and vnd.spiceai.nsql.v1+json MIME types by @sgrebnov in #5382

Full Changelog: v1.1.1...v1.1.2

- Rust
Published by phillipleblanc 11 months ago

https://github.com/spiceai/spiceai - v1.1.1

Spice v1.1.1 (Apr 7, 2025)

Spice v1.1.1 introduces several key updates, including a new Component Metrics System, improved Delta Data Connector performance, improved MCP tool descriptions, and expanded runtime results caching options. This release also adds detailed MySQL connection pool metrics for better observability. Component Metrics are Prometheus-compatible and accessible via the metrics endpoint.

Highlights in v1.1.1

  • Component Metrics System: A new system for monitoring components, starting with MySQL connection pool metrics. These metrics provide insights into MySQL connection performance and can be selectively enabled in the dataset configuration. Metrics are exposed in Prometheus format via the metrics endpoint.

For more details, see the Component Metrics documentation.

  • Results Caching Enhancements: Added a cache_key_type option for runtime results caching. Options include:
    • plan (Default): Uses the query's logical plan as the cache key. Matches semantically equivalent queries but requires query parsing.
    • sql: Uses the raw SQL string as the cache key. Provides faster lookups but requires exact string matches. Use sql for predictable queries without dynamic functions like NOW().

Example spicepod.yaml configuration:

```yaml
runtime:
  results_cache:
    enabled: true
    cache_max_size: 128MiB
    cache_key_type: sql # Use SQL for the results cache key
    item_ttl: 1s
```

For more details, see the runtime configuration documentation.
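The trade-off between the two key types can be pictured with a small Python sketch. It is not Spice's implementation: `sql_cache_key` hashes the raw text, while `plan_like_cache_key` is a hypothetical stand-in that only normalizes case and whitespace, whereas a real logical-plan key comes from parsing and planning the query.

```python
import hashlib
import re


def sql_cache_key(sql):
    """Key on the raw SQL text: cheap to compute, but any textual difference is a miss."""
    return hashlib.sha256(sql.encode("utf-8")).hexdigest()


def plan_like_cache_key(sql):
    """Stand-in for a logical-plan key (assumption): normalize case and whitespace
    so trivially different spellings of the same query share one key."""
    normalized = re.sub(r"\s+", " ", sql.strip()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


a = "SELECT id FROM orders WHERE total > 10"
b = "select id  from orders\nwhere total > 10"

# Same query, different spelling: the raw-text keys differ, the normalized keys agree.
print(sql_cache_key(a) == sql_cache_key(b))
print(plan_like_cache_key(a) == plan_like_cache_key(b))
```

The toy normalization would also lowercase string literals, which a real planner would not do. It is only meant to show why `sql` needs exact string matches while `plan` tolerates equivalent spellings at the cost of extra work per lookup.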

  • Delta Data Connector: Improved scan performance, resulting in faster queries.

  • MCP Tools: Clearer descriptions for built-in MCP tools to improve usability.

  • MySQL Component Metrics: Added detailed metrics for monitoring MySQL connections, such as connection count and pool activity.

Example spicepod.yaml configuration:

```yaml
datasets:
  - from: mysql:my_table
    name: my_dataset
    metrics:
      - name: connection_count
        enabled: true
      - name: connections_in_pool
        enabled: true
      - name: active_wait_requests
        enabled: true
    params:
      mysql_host: localhost
      mysql_tcp_port: 3306
      mysql_user: root
      mysql_pass: ${secrets:MYSQL_PASS}
```

For more details, see the MySQL Data Connector documentation.
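Because the metrics are exposed in Prometheus text format, a scrape can be inspected with a few lines of Python. The sketch below is a minimal parser for simple `name{labels} value` lines without timestamps. The sample payload is hypothetical: the `dataset_mysql_` prefix and the `name` label are assumptions, and real metric names and labels may differ.

```python
def parse_prometheus_text(text):
    """Parse simple `name{labels} value` lines, skipping comments and blank lines."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token (no timestamps assumed).
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hypothetical scrape output; real metric names and labels may differ.
scrape = """
# TYPE dataset_mysql_connection_count gauge
dataset_mysql_connection_count{name="my_dataset"} 4
dataset_mysql_connections_in_pool{name="my_dataset"} 2
dataset_mysql_active_wait_requests{name="my_dataset"} 0
"""

metrics = parse_prometheus_text(scrape)
pooled = {k: v for k, v in metrics.items() if "connections_in_pool" in k}
print(pooled)
```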

  • spice.js SDK: The spice.js SDK has been updated to v2.0.1 and includes several important security updates.

New Contributors 🎉

Contributors

Breaking Changes

No breaking changes in this release.

Cookbook Updates

The Spice Cookbook now includes 65 recipes to help you get started with Spice quickly and easily.

Upgrading

To upgrade to v1.1.1, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.1.1 image:

```console
docker pull spiceai/spiceai:1.1.1
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

  • No major dependency changes.

Changelog

  • fix: Testoperator DuckDB, SQLite, Postgres, Spicecloud by @peasee in #5190
  • Update Helm Chart and SECURITY.md to v1.1.0 by @lukekim in #5223
  • Update version.txt to v1.1.1-unstable by @lukekim in #5224
  • Update Cargo.lock to v1.1.1-unstable by @lukekim in #5225
  • Add tests for verify_schema_source_path in ListingTableConnector by @phillipleblanc in #5221
  • Reduce noise from debug logging by @phillipleblanc in #5227
  • Improve openai_test_chat_messages integration test reliability by @Sevenannn in #5222
  • Verify the checkpoints existence before shutting down runtime in integration tests directly querying checkpoint by @Sevenannn in #5232
  • Fix CORS support for json content-type api by @sgrebnov in #5241
  • Fix ModelGradedScorer error: The 'metadata' parameter is only allowed when 'store' is enabled. by @sgrebnov in #5231
  • fix: Use pulls-with-spice-action and switch to spiceai-macos runners by @peasee in #5238
  • Use v1.0.3 pulls with spice action by @lukekim in #5244
  • feat: Build ODBC binaries, run testoperator on ODBC by @peasee in #5237
  • Bump timeout for several integration test runtime load_components & readiness check by @Sevenannn in #5229
  • Validate port is available before binding port for docker container in integration tests by @Sevenannn in #5248
  • Update datafusion-table-providers to fix the schema for PostgreSQL materialized views by @ewgenius in #5259
  • Verify flight server is ready for flight integration tests by @Sevenannn in #5240
  • fix: Publish to MinIO inside of matrix on buildandrelease by @peasee in #5258
  • fix: TPCDS on zero results benchmarks by @peasee in #5263
  • Use model as a judge scorer for Financebench by @sgrebnov in #5264
  • Fix FinanceBench llm scorer secret name by @sgrebnov in #5276
  • Implements support for runtime.results_cache.cache_key_type by @phillipleblanc in #5265
  • fix: Testoperator MS SQL, query overrides, dispatcher by @peasee in #5279
  • refactor: Delete old benchmarks by @peasee in #5283
  • Improve embedding column parsing performance test by @Sevenannn in #5268
  • Add Support for AWS Session Token in S3 Data Connector by @kczimm in #5243
  • Implement Component Metrics system + MySQL connection pool metrics by @phillipleblanc in #5290
  • Add default descriptions to built-in MCP tools by @lukekim in #5293
  • fix: Vector search with cased columns by @peasee in #5295
  • Run delta kernel scan in a blocking Tokio thread. by @phillipleblanc in #5296
  • Expose the mysql_pool_min and mysql_pool_max connection pool parameters by @phillipleblanc in #5297
  • use patched pdf-extract by @kczimm in #5270

Full Changelog: v1.1.0...v1.1.1

- Rust
Published by phillipleblanc 11 months ago

https://github.com/spiceai/spiceai - v1.1.0

Spice v1.1.0 (Mar 31, 2025)

Spice v1.1.0 introduces full support for the Model Context Protocol (MCP), expanding how models and tools connect. Spice can now act as both an MCP Server, with the new /v1/mcp/sse API, and an MCP Client, supporting stdio and SSE-based servers. This release also introduces a new Web Search tool with Perplexity model support, advanced evaluation workflows with custom eval scorers (including LLM-as-a-judge), and an IMAP Data Connector for federated SQL queries across email servers. Alongside these features, v1.1.0 includes automatic NSQL query retries, expanded task tracing, and request draining for HTTP server shutdowns, delivering improved reliability, flexibility, and observability.

Highlights in v1.1.0

  • Spice as an MCP Server and Client: Spice now supports the Model Context Protocol (MCP), for expanded tool discovery and connectivity. Spice can:
  1. Run stdio-based MCP servers internally.
  2. Connect to external MCP servers over the SSE protocol (Streamable HTTP is coming soon!)

For more details, see the MCP documentation.

### Usage

```yaml
tools:
  - name: google_maps
    from: mcp:npx
    params:
      mcp_args: -y @modelcontextprotocol/server-google-maps
```

### Spice as an MCP Server

Tools in Spice can be accessed via MCP, for example by connecting to Spice from an IDE like Cursor or Windsurf. Set the MCP Server URL to http://localhost:8090/v1/mcp/sse.

  • Perplexity Model Support: Spice now supports Perplexity-hosted models, enabling advanced web search and retrieval capabilities. Example configuration:

```yaml
models:
  - name: webs
    from: perplexity:sonar
    params:
      perplexity_auth_token: ${ secrets:SPICE_PERPLEXITY_AUTH_TOKEN }
      perplexity_search_domain_filter:
        - docs.spiceai.org
        - huggingface.co
```

For more details, see the Perplexity documentation.

  • Web Search Tool: The new Web Search Tool enables Spice models to search the web for information using search engines like Perplexity. Example configuration:

```yaml
tools:
  - name: the_internet
    from: websearch
    description: 'Search the web for information.'
    params:
      engine: perplexity
      perplexity_auth_token: ${ secrets:SPICE_PERPLEXITY_AUTH_TOKEN }
```

For more details, see the Web Search Tool documentation.

  • Eval Scorers: Eval scorers assess model performance on evaluation cases. Spice includes built-in scorers:

    • match: Exact match.
    • json_match: JSON equivalence.
    • includes: Checks if actual output includes expected output.
    • fuzzy_match: Normalized subset matching.
    • levenshtein: Levenshtein distance.

Custom scorers can use embedding models or LLMs as judges. Example:

```yaml
evals:
  - name: australia
    dataset: cricket_questions
    scorers:
      - hf_minilm
      - judge
      - match

embeddings:
  - name: hf_minilm
    from: huggingface:huggingface.co/sentence-transformers/all-MiniLM-L6-v2

models:
  - name: judge
    from: openai:gpt-4o
    params:
      openai_api_key: ${ secrets:OPENAI_API_KEY }
      system_prompt: |
        Compare these stories and score their similarity (0.0 to 1.0).
        Story A: {{ .actual }}
        Story B: {{ .ideal }}
```

For more details, see the Eval Scorers documentation.
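To give a feel for what the simpler built-in scorers compute, here is an illustrative Python sketch of exact match, substring inclusion, and Levenshtein distance. It is not Spice's implementation: the function names mirror the scorer names for readability, and the `levenshtein_score` normalization to a 0.0 to 1.0 range is an assumption.

```python
def match(actual, ideal):
    """Exact match: 1.0 only when the strings are identical."""
    return 1.0 if actual == ideal else 0.0


def includes(actual, ideal):
    """1.0 when the expected text appears anywhere in the actual output."""
    return 1.0 if ideal in actual else 0.0


def levenshtein(a, b):
    """Edit distance computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_score(actual, ideal):
    """Assumed normalization: 1 - distance / longest length, so 1.0 means identical."""
    longest = max(len(actual), len(ideal))
    return 1.0 if longest == 0 else 1.0 - levenshtein(actual, ideal) / longest


print(match("Sydney", "Sydney"), includes("The answer is Sydney.", "Sydney"))
print(levenshtein("kitten", "sitting"))
```

Exact match is the strictest of the three, while the distance-based score degrades gradually as the output drifts from the expected answer.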

  • IMAP Data Connector: Query emails stored in IMAP servers using federated SQL. Example:

```yaml
datasets:
  - from: imap:myawesomeemail@gmail.com
    name: emails
    params:
      imap_access_token: ${secrets:IMAP_ACCESS_TOKEN}
```

For more details, see the IMAP Data Connector documentation.

  • Automatic NSQL Query Retries: Failed NSQL queries are now automatically retried, improving reliability for federated queries. For more details, see the NSQL documentation.

  • Enhanced Task Tracing: Task history now includes chat completion IDs, and runtime readiness is traced for better observability. Use the runtime.task_history table to query task details. See the Task History documentation.

  • Vector Search with Keyword Filtering: The vector search API now includes an optional list of keywords as a parameter, to pre-filter SQL results before performing a vector search. When vector searching via a chat completion, models will automatically generate keywords relevant to the search. See the Vector Search API documentation.

  • Improved Refresh Behavior on Startup: Spice won't automatically refresh an accelerated dataset on startup if it doesn't need to. See the Refresh on Startup documentation.

  • Graceful Shutdown for HTTP Server: The HTTP server now drains requests for graceful shutdowns, ensuring smoother runtime termination.
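To illustrate the keyword pre-filtering described under Vector Search with Keyword Filtering above, here is a minimal Python sketch of a two-stage search: rows are first narrowed by keyword, then the survivors are ranked by cosine similarity. The data, the `search` helper, and the choice to keep rows matching any keyword are all assumptions for illustration, not the Spice API or its exact semantics.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(rows, query_vec, keywords=None, limit=2):
    """Pre-filter rows by keyword (if any), then rank the survivors by similarity."""
    if keywords:
        wanted = [k.lower() for k in keywords]
        rows = [r for r in rows if any(k in r["text"].lower() for k in wanted)]
    ranked = sorted(rows, key=lambda r: cosine(r["embedding"], query_vec), reverse=True)
    return [r["id"] for r in ranked[:limit]]


# Toy corpus with two-dimensional embeddings (hypothetical data).
docs = [
    {"id": 1, "text": "DuckDB memory limits", "embedding": [0.9, 0.1]},
    {"id": 2, "text": "Iceberg catalog setup", "embedding": [0.8, 0.2]},
    {"id": 3, "text": "DuckDB file acceleration", "embedding": [0.1, 0.9]},
]

print(search(docs, [1.0, 0.0]))
print(search(docs, [1.0, 0.0], keywords=["duckdb"]))
```

With the keyword filter applied, the Iceberg row is dropped before ranking even though its embedding is close to the query, which is the effect pre-filtering is meant to have.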

New Contributors 🎉

Contributors

  • @sgrebnov
  • @phillipleblanc
  • @peasee
  • @Jeadie
  • @lukekim
  • @benrussell
  • @Sevenannn
  • @sergey-shandar
  • @Garamda
  • @johnnynunez

Breaking Changes

No breaking changes.

Cookbook Updates

The Spice Cookbook now has 74 recipes that make it easy to get started with Spice!

Upgrading

To upgrade to v1.1.0, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.1.0 image:

```console
docker pull spiceai/spiceai:1.1.0
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

  • No major dependency changes.

Changelog

Full Changelog: github.com/spiceai/spiceai/compare/v1.0.0...release/1.1

- Rust
Published by phillipleblanc 11 months ago

https://github.com/spiceai/spiceai - v1.0.7

Spice v1.0.7 (Mar 26, 2025)

Spice v1.0.7 improves memory usage when using DuckDB, improves schema inference performance when using object-store based data connectors, and fixes a bug in Dremio schema inference.

Highlights in v1.0.7

  • DuckDB Memory Usage: Memory usage when using DuckDB has been significantly improved for data loads and refreshes through expanded use of zero-copy Arrow and multi-threading for data loads. When a duckdb_memory_limit is specified, disk spilling has been improved for greater-than-memory workloads. In addition, a new temp_directory runtime parameter supports storing temporary files in a location other than the DuckDB data file for higher throughput. For example, temp_directory could be set to a high-IOPS io2 EBS volume that is separate from the duckdb_file_path.

Automated end-to-end test coverage for the DuckDB Accelerator has been significantly expanded.

For configuration details, see the documentation for runtime parameters and the DuckDB Data Accelerator.

  • Schema Inference Performance for Object-Store Data Connectors: Schema inference performance has been improved, especially for large numbers of objects (1M+ objects) when using object-store based data connectors by making the object-listing and selection more efficient.

Contributors

  • @phillipleblanc
  • @sgrebnov
  • @peasee
  • @Sevenannn

Breaking Changes

No breaking changes.

Upgrading

To upgrade to v1.0.7, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.0.7 image:

```console
docker pull spiceai/spiceai:1.0.7
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

Changelog

  • fix: Remove on zero results arguments from benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/4533
  • Run benchmark tests w/o uploading test results (pending improvements) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4843
  • fix: Return BAD_REQUEST when no embeddings are configured by @peasee in https://github.com/spiceai/spiceai/pull/4804
  • Fix Dremio schema inference by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5114
  • Improve performance of schema inference for object-store data connectors by @sgrebnov in https://github.com/spiceai/spiceai/pull/5124
  • Always download spice runtime version matched with spice cli version by @Sevenannn in https://github.com/spiceai/spiceai/pull/4761
  • Fix go lint errors by @sgrebnov in https://github.com/spiceai/spiceai/pull/5147
  • Make DuckDB acceleration E2E tests more comprehensive by @sgrebnov in https://github.com/spiceai/spiceai/pull/5146
  • Enable Spice to load larger than memory datasets into DuckDB accelerations by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5149
  • Add temp_directory runtime parameter and insert it for DuckDB accelerations by @phillipleblanc in https://github.com/spiceai/spiceai/pull/5152
  • Fix Postgres and MySQL installation on macos14-runner (E2E CI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/5155
  • Enable E2E for DuckDB full mode acceleration with indexes only in CI by @sgrebnov in https://github.com/spiceai/spiceai/pull/5154

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.6...v1.0.7

- Rust
Published by Sevenannn 11 months ago

https://github.com/spiceai/spiceai - v1.0.6

Spice v1.0.6 (Mar 17, 2025)

Spice v1.0.6 improves stability for DuckDB acceleration, improves the Iceberg Data/Catalog connectors when using AWS Glue, and fixes an issue with the ready_state: on_registration federation fallback when using DuckDB. In addition, redundant data refreshes on startup are avoided for accelerations with persistent data.

Highlights in v1.0.6

  • Iceberg Data/Catalog Connector Improvements: Improves Iceberg data & catalog connector reliability, including bug fixes for AWS Glue API rate-limiting and compatibility, REST API pagination support, explicit AWS credential handling, and support for AWS STS role assumption.

  • Fixes On-Registration Fallback when using DuckDB: Previously, when using DuckDB as a data accelerator with the ready_state: on_registration configuration, queries made during the initial data refresh did not properly fall back to the federated source. This is now fixed.

  • DuckDB downgraded for Stability: DuckDB has been downgraded to v1.1.3 due to a regression in memory handling tracked by duckdb/duckdb issue #16640. Once resolved and validated, Spice will re-upgrade to v1.2.x.

  • Expanded Integration Tests: Additional integration tests covering federated accelerator behavior and graceful shutdown processes have been added.

  • Optimized Data Refresh for Persistent Accelerations: Changed behavior in v1.0.6. When using persistent (file-mode) acceleration without a defined refresh interval, Spice performs a full refresh at startup only if no previously accelerated data is available. This ensures efficient startup behavior by avoiding unnecessary refreshes. This logic applies only to full refreshes when no refresh interval is specified.

To maintain the previous behavior and always refresh on every startup, set:

```yaml
acceleration:
  refresh_on_startup: always
```
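The startup rule above can be summarized as a small decision function. This Python sketch only restates the behavior described in these notes: the function and argument names are hypothetical, `always` is the one setting value taken from the release text, and the final branch (refreshing in every other case) is an assumption.

```python
def should_refresh_on_startup(persistent, has_accelerated_data,
                              refresh_interval=None, refresh_on_startup=None):
    """Decide whether a full refresh runs at startup, per the v1.0.6 notes."""
    # Explicit opt-in to the pre-v1.0.6 behavior.
    if refresh_on_startup == "always":
        return True
    # File-mode acceleration with no refresh interval: refresh only when
    # there is no previously accelerated data to reuse.
    if persistent and refresh_interval is None:
        return not has_accelerated_data
    # Outside the case described above (assumption): keep refreshing at startup.
    return True


print(should_refresh_on_startup(persistent=True, has_accelerated_data=True))
print(should_refresh_on_startup(persistent=True, has_accelerated_data=True,
                                refresh_on_startup="always"))
```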

Contributors

  • @peasee
  • @phillipleblanc
  • @sgrebnov
  • @lukekim
  • @Sevenannn

Breaking Changes

Starting from v1.0.6, when using persistent (file-mode) acceleration without a defined refresh interval, Spice performs a full refresh at startup only if no previously accelerated data is available. To maintain the previous behavior and always refresh on every startup, set:

```yaml
acceleration:
  refresh_on_startup: always
```

Cookbook Updates

No new recipes.

Upgrading

To upgrade to v1.0.6, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.0.6 image:

```console
docker pull spiceai/spiceai:1.0.6
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

Changelog

  • Implement proper ready_state: on_registration for federation enabled accelerators by @phillipleblanc in #5019
  • Add indexes and primary keys mismatch detection for DuckDB Acceleration by @sgrebnov in #5045
  • Add comprehensive integration tests for the ready_state behavior by @phillipleblanc in #5042
  • Add test Spicepod for acceleration with constraints by @sgrebnov in #4891
  • Add test Spicepod for DuckDB append acceleration with constraints by @sgrebnov in #4898
  • Add DuckDB graceful shutdown test to E2E CI tests by @sgrebnov in #5047
  • Update duckdb_append_with_pk_and_indexes.yaml (work for duckdb 1.1.x) by @sgrebnov in #5067
  • fix: Downgrade to DuckDB 1.1.3 by @peasee in #5055
  • fix: Acceleration federation integration test by @peasee in #5070
  • Improvements to Iceberg Catalog/Data Connector by @phillipleblanc in #5071
  • Add Results-Cache-Status to indicate query result came from cache by @phillipleblanc in #4809
  • fix: Spice.ai schema inference by @peasee in #4674
  • Add refresh_on_startup Spicepod configuration param by @phillipleblanc and @sgrebnov in #5086
  • Test restart behavior of DuckDB file acceleration against glue iceberg table by @Sevenannn in #5075
  • Run Iceberg Data Connector - DuckDB File mode integration test by @Sevenannn in #5069
  • Integration test for glue iceberg catalog by @Sevenannn in #5077

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.5...v1.0.6

- Rust
Published by phillipleblanc 12 months ago

https://github.com/spiceai/spiceai - v1.0.5

Spice v1.0.5 (Mar 10, 2025)

Spice v1.0.5 expands Iceberg support with the introduction of the Iceberg Data Connector, in addition to the existing Iceberg Catalog Connector. The new connector allows datasets to be created and configured for specific Iceberg objects, enabling federated and accelerated SQL queries on Apache Iceberg tables.

Performance improvements include enhanced Parquet pruning in append mode, where object-store metadata is now leveraged alongside Hive partitioning to optimize file pruning. This results in faster and more efficient queries.

DuckDB has been upgraded to v1.2.0, along with additional stability improvements, including improved graceful shutdown and the ability to configure the DuckDB memory limit.

Additional updates include support for the Arrow Map type.

Highlights in v1.0.5

  • New Iceberg Data Connector: Enables direct dataset creation and querying of Iceberg tables.

Example usage in spicepod.yaml:

```yaml
datasets:
  - from: iceberg:https://iceberg-catalog-host.com/v1/namespaces/my_namespace/tables/my_table
    name: my_table
    params:
      # Same as Iceberg Catalog Connector
    acceleration:
      enabled: true
```

For detailed setup instructions, authentication options, and configuration parameters, refer to the Iceberg Data Connector documentation.

  • Improved Parquet pruning in append mode: Uses object-store metadata for more efficient file pruning.

  • DuckDB upgrade to v1.2.0 with improved graceful shutdown: Read the DuckDB v1.2.0 announcement for details, including breaking changes for map and list_reduce. Graceful shutdown of DuckDB has been improved for better stability across restarts.

  • Configurable DuckDB memory limit: Use the duckdb_memory_limit parameter to set the DuckDB acceleration memory limit:

```yaml
- from: spice.ai:path.to.my_dataset
  name: my_dataset
  acceleration:
    params:
      duckdb_memory_limit: '2GB'
    enabled: true
    engine: duckdb
    mode: file
```
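To picture the append-mode Parquet pruning highlighted above, the sketch below filters files by their object-store `last_modified` timestamp. It is a rough Python illustration under a stated assumption: that an append refresh only needs files modified after the newest data already accelerated. The `ObjectMeta` class, the `prune_for_append` function, and the watermark concept are hypothetical and are not taken from Spice.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class ObjectMeta:
    """Metadata an object store reports for a file (hypothetical shape)."""
    path: str
    last_modified: datetime


def prune_for_append(files: List[ObjectMeta], watermark: datetime) -> List[ObjectMeta]:
    """Keep only files modified after the watermark; older files are skipped unread."""
    return [f for f in files if f.last_modified > watermark]


files = [
    ObjectMeta("year=2025/month=01/a.parquet", datetime(2025, 1, 15, tzinfo=timezone.utc)),
    ObjectMeta("year=2025/month=03/b.parquet", datetime(2025, 3, 2, tzinfo=timezone.utc)),
]
# Hypothetical watermark: the newest timestamp already present in the acceleration.
watermark = datetime(2025, 2, 1, tzinfo=timezone.utc)

if __name__ == "__main__":
    for kept in prune_for_append(files, watermark):
        print(kept.path)
```

The benefit of metadata-based pruning is that a file can be excluded from its listing entry alone, before any Parquet footer is fetched.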

Contributors

  • @peasee
  • @phillipleblanc
  • @sgrebnov
  • @lukekim

Breaking Changes

Upgrading

To upgrade to v1.0.5, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.0.5 image:

```console
docker pull spiceai/spiceai:1.0.5
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

Changelog

  • fix: Update OpenAI model health check by @peasee in #4849
  • fix: Allow metrics endpoint setting in CLI by @peasee in #4939
  • DuckDB acceleration: fix Decimal with zero scale support by @sgrebnov in #4922
  • Introduce runtime shutdown state by @sgrebnov in #4917
  • Add support for Flight and HTTP endpoints configuration to Spice CLI (run and sql) by @sgrebnov and @lukekim in #4913
  • Fix Datafusion resources deallocation during shutdown by @sgrebnov in #4912
  • DuckDB: fix error handling during record batch insertion by @sgrebnov in #4894
  • DuckDB: add support for Map Arrow type for DuckDB acceleration by @sgrebnov in #4887
  • Upgrade to DuckDB v1.2.0 by @sgrebnov in #4842
  • Gracefully shutdown the runtime and deallocate static resources by @sgrebnov in #4879
  • Implement an Iceberg Data Connector by @phillipleblanc in #4941
  • Don't trace canceled dataset refresh during runtime termination by @sgrebnov in #4958
  • Use metadata column last_modified when specified as a time_column by @phillipleblanc in #4970
  • Add duckdb_memory_limit param support for DuckDB acceleration by @sgrebnov in #4971
  • Add Iceberg dataset integration test by @phillipleblanc in #4950

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.4...v1.0.5

- Rust
Published by sgrebnov 12 months ago

https://github.com/spiceai/spiceai - v1.0.4

Spice v1.0.4 (Feb 17, 2025)

Spice v1.0.4 includes several bug fixes, including improved table column casing and normalization and Delta Lake partition pruning, along with improved tracing throughout spiced and new functionality in spice trace.

Highlights in v1.0.4

  • Improved spice trace functionality: A more detailed spice trace format with new flags --include-output, --include-input and --truncate:

```
>> spice trace ai_chat --include-input --truncate

TREE                   STATUS DURATION  TASK          INPUT
b28bab6b58971b7e       ✅     1352.12ms ai_chat       {"messages":[{"role":"user","content":"hello"}],"model":"openai_model","stream":... (45 characters omitted)
└── 1a0ad7c6138abb09   ✅     1352.03ms ai_completion {"messages":[{"role":"user","content":"hello"}],"model":"openai_model","stream":... (45 characters omitted)
```
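The truncated INPUT column shown above ends with a marker such as "... (45 characters omitted)". A minimal Python sketch of that style of truncation follows. The function name and the default limit of 80 characters are assumptions made for illustration; the release notes do not state the cut-off that spice trace uses.

```python
def truncate(text: str, limit: int = 80) -> str:
    """Shorten text to `limit` characters and report how many were dropped.

    The limit is a hypothetical default, not the value used by spice trace.
    """
    if len(text) <= limit:
        return text
    omitted = len(text) - limit
    return f"{text[:limit]}... ({omitted} characters omitted)"


if __name__ == "__main__":
    payload = '{"messages":[{"role":"user","content":"hello"}],"model":"openai_model"}'
    print(truncate(payload, limit=40))
```

Reporting the omitted count, rather than only appending an ellipsis, tells the reader how much of the payload is missing before they rerun the command without truncation.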

Contributors

  • @phillipleblanc
  • @Sevenannn
  • @sgrebnov
  • @peasee
  • @Jeadie
  • @lukekim

Breaking Changes

No breaking changes.

Cookbook Updates

Upgrading

To upgrade to v1.0.4, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.0.4 image:

```console
docker pull spiceai/spiceai:1.0.4
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

No major dependency changes.

Changelog

  • Do not return underlying content of chunked embedding column by default during tool_use::document_similarity by @Jeadie in https://github.com/spiceai/spiceai/pull/4802
  • Fix Snowflake Case-Sensitive Identifiers support by @sgrebnov in https://github.com/spiceai/spiceai/pull/4813
  • Prepare for 1.0.4 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4801
  • Add support for a time_partition_column by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4784
  • Prevent the automatic normalization of refresh_sql columns to lowercase by @sgrebnov in https://github.com/spiceai/spiceai/pull/4787
  • Implement partition pruning for Delta Lake tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4783
  • Fix constraint verification for columns with uppercase letters by @sgrebnov in https://github.com/spiceai/spiceai/pull/4785
  • Add truncate command for spice trace by @peasee in https://github.com/spiceai/spiceai/pull/4771
  • Implement Cache-Control: no-cache to bypass results cache by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4763
  • Prompt user to download runtime when running spice sql by @Sevenannn in https://github.com/spiceai/spiceai/pull/4747
  • Add vector search tracing by @peasee in https://github.com/spiceai/spiceai/pull/4757
  • Update spice trace output format by @Jeadie in https://github.com/spiceai/spiceai/pull/4750
  • Fix tool call arguments in Grok messages by @Jeadie in https://github.com/spiceai/spiceai/pull/4741

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.3...v1.0.4

- Rust
Published by phillipleblanc about 1 year ago

https://github.com/spiceai/spiceai - v1.0.3

Spice v1.0.3 (Feb 10, 2025)

Spice v1.0.3 provides several bug fixes, including a fix for the initial data load period when a retention policy has been set, and a new unsupported_type_action: string parameter to auto-convert unsupported types to strings.

Highlights in v1.0.3

  • PostgreSQL Data Connector: New unsupported_type_action: string parameter that auto-converts unsupported types such as JSONB to strings.
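The effect of `unsupported_type_action: string` can be illustrated with a small Python sketch. It is only a model of the idea: nested values stand in for a type such as JSONB, the `convert_row` function is hypothetical, and the way Spice actually serializes such values is not described in these notes. Treating every other setting as an error is likewise an assumption made for the example.

```python
import json
from typing import Any, Dict


def convert_row(row: Dict[str, Any], unsupported_type_action: str = "string") -> Dict[str, Any]:
    """Convert values of an unsupported type to strings instead of failing."""
    converted: Dict[str, Any] = {}
    for column, value in row.items():
        if isinstance(value, (dict, list)):  # stand-in for a type such as JSONB
            if unsupported_type_action != "string":
                # Hypothetical behavior for any setting other than "string".
                raise TypeError(f"unsupported type in column {column!r}")
            value = json.dumps(value, sort_keys=True)
        converted[column] = value
    return converted


if __name__ == "__main__":
    print(convert_row({"id": 1, "payload": {"b": 2, "a": 1}}))
```

Converting to a string keeps the column queryable at the cost of its structure, so downstream queries need to parse the text if they want individual fields.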

Contributors

  • @phillipleblanc
  • @Sevenannn
  • @sgrebnov
  • @peasee
  • @Jeadie
  • @lukekim

Breaking Changes

No breaking changes.

Cookbook Updates

Upgrading

To upgrade to v1.0.3, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.0.3 image:

```console
docker pull spiceai/spiceai:1.0.3
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

No major dependency changes.

Changelog

  • For local models, use 'content=""' instead of None by @Jeadie and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4646
  • Perplexity Sonar LLM component by @Jeadie and @lukekim in https://github.com/spiceai/spiceai/pull/4673
  • Update async openai fork & support reasoning effort parameter by @Sevenannn and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4679
  • Web search tool by @Jeadie and @lukekim in https://github.com/spiceai/spiceai/pull/4687
  • Setup tpc-extension by @ewgenius and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4690
  • fix: Use PostgreSQL interval style for Spice.ai by @peasee and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4716
  • Fix spice upgrade command by @Sevenannn and @sgrebnov in https://github.com/spiceai/spiceai/pull/4699
  • Fix bug: Ensure refresh only retrieves data within the retention period by @sgrebnov and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4717
  • Implement unsupported_type_action: string for Postgres JSONB support by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4719
  • Fix the get latest release logic by @Sevenannn and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4721
  • add 'accelerated_refresh' to 'spice trace' allowlist by @Jeadie and @phillipleblanc in https://github.com/spiceai/spiceai/pull/4711
  • Update version to 1.0.3 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4731
  • Truncate embedding columns within sampling tool by @Jeadie in https://github.com/spiceai/spiceai/pull/4722
  • Validate primary key columns during accelerated dataset initialization by @sgrebnov in https://github.com/spiceai/spiceai/pull/4736

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.2...v1.0.3

- Rust
Published by phillipleblanc about 1 year ago

https://github.com/spiceai/spiceai - v1.0.2

Spice v1.0.2 (Feb 3, 2025)

Spice v1.0.2 adds support for running local, filesystem-hosted DeepSeek models, including R1 (cloud-hosted models via the DeepSeek API were already supported), and improves the developer experience for debugging AI chat tasks, along with several bug fixes. The HuggingFace and Filesystem-Hosted model providers have both graduated to Release Candidate (RC), and the Spice.ai Cloud Platform catalog provider has graduated to Beta.

Highlights in v1.0.2

  • spice trace: New spice trace CLI command that outputs a detailed breakdown of traces and tasks, including tool usage and AI completions.

Examples:

```shell
trace> spice trace ai_chat
61cc6bd0e571c783 ai_chat
├── 69362c30f238076f tool_use::get_readiness
├── b6b17f1a9a6b86dc ai_completion
├── c30d692c6c41c5ee tool_use::list_datasets
└── ce18756d5fef0df0 ai_completion

trace> spice trace ai_chat --trace-id 61cc6bd0e571c783

trace> spice trace ai_chat --id chatcmpl-AvXwmPSV1PMyGBi9dLfkEQTZPjhqz
```

The spice trace CLI outputs data available in the runtime.task_history table, which can also be queried with SQL.
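Because the trace view is derived from rows in a table, the tree can be rebuilt from parent/child links. The Python sketch below shows one way to do that. The row shape (`span_id`, `parent_span_id`, `task`) is an assumption about what such rows might contain, the identifiers are placeholders, and `render_trace` is a hypothetical helper rather than part of the Spice CLI.

```python
from collections import defaultdict
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def render_trace(rows: List[Row]) -> str:
    """Render task rows as an indented tree, one line per task."""
    children: Dict[Optional[str], List[Row]] = defaultdict(list)
    for row in rows:
        children[row["parent_span_id"]].append(row)

    lines: List[str] = []

    def walk(row: Row, indent: str, connector: str) -> None:
        lines.append(f"{indent}{connector}{row['span_id']} {row['task']}")
        # Extend the indent so grandchildren line up under their parent.
        if connector == "├── ":
            indent += "│   "
        elif connector == "└── ":
            indent += "    "
        kids = children[row["span_id"]]
        for i, kid in enumerate(kids):
            walk(kid, indent, "└── " if i == len(kids) - 1 else "├── ")

    for root in children[None]:  # rows without a parent are trace roots
        walk(root, "", "")
    return "\n".join(lines)


rows = [
    {"span_id": "root", "parent_span_id": None, "task": "ai_chat"},
    {"span_id": "s1", "parent_span_id": "root", "task": "tool_use::list_datasets"},
    {"span_id": "s2", "parent_span_id": "root", "task": "ai_completion"},
]

if __name__ == "__main__":
    print(render_trace(rows))
```

Grouping rows by parent first keeps the rendering to a single pass over the data and a single recursive walk.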

To learn more, see:

Contributors

  • @phillipleblanc
  • @johnnynunez
  • @Sevenannn
  • @sgrebnov
  • @peasee
  • @Jeadie
  • @lukekim

New Contributors

Breaking Changes

No breaking changes.

Cookbook Updates

Upgrading

To upgrade to v1.0.2, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.0.2 image:

```console
docker pull spiceai/spiceai:1.0.2
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

What's Changed

Dependencies

No major dependency changes.

Changelog

  • Update release branch naming by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4539
  • ready for arm buildings by @johnnynunez in https://github.com/spiceai/spiceai/pull/4502
  • Bump helm chart version to 1.0.1 by @Sevenannn in https://github.com/spiceai/spiceai/pull/4542
  • Include 1.0.1 as supported version in security.md by @Sevenannn in https://github.com/spiceai/spiceai/pull/4545
  • Update CI to build on hosted windows runners by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4540
  • docs: Update Windows install by @peasee in https://github.com/spiceai/spiceai/pull/4551
  • Fix spark spicepod for test operator by @Sevenannn in https://github.com/spiceai/spiceai/pull/4555
  • Improve hugging face model chat error by @Sevenannn in https://github.com/spiceai/spiceai/pull/4554
  • fix: Update Windows E2E install by @peasee in https://github.com/spiceai/spiceai/pull/4557
  • feat: Add Spice Cloud Catalog Spicepod, release Alpha by @peasee in https://github.com/spiceai/spiceai/pull/4561
  • Fix huggingface embedding errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/4558
  • feat: Load table schemas through REST for Spice Cloud Catalog by @peasee in https://github.com/spiceai/spiceai/pull/4563
  • Add upgrade instruction in release note by @Sevenannn in https://github.com/spiceai/spiceai/pull/4548
  • Add federated source information to refresh errors by @sgrebnov in https://github.com/spiceai/spiceai/pull/4560
  • docs: Update ROADMAP.md by @peasee in https://github.com/spiceai/spiceai/pull/4566
  • Merge mistral upstream by @Jeadie in https://github.com/spiceai/spiceai/pull/4562
  • Fix windows build by @Sevenannn in https://github.com/spiceai/spiceai/pull/4574
  • feat: Update Spice Cloud Catalog errors, release as Beta by @peasee in https://github.com/spiceai/spiceai/pull/4575
  • docs: Add TOC to README.md by @peasee in https://github.com/spiceai/spiceai/pull/4538
  • Updates to spiceai/mistral.rs by @Jeadie in https://github.com/spiceai/spiceai/pull/4580
  • Improve refresh error tracing by @sgrebnov in https://github.com/spiceai/spiceai/pull/4576
  • Add HTTP consistency & overhead to testoperator dispatch tool by @Jeadie in https://github.com/spiceai/spiceai/pull/4556
  • Fix append mode refresh with MySQL Data Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/4583
  • fix: Retry flaky tests by @peasee in https://github.com/spiceai/spiceai/pull/4577
  • Fix E2E models test build on macOS runners by @sgrebnov in https://github.com/spiceai/spiceai/pull/4585
  • spice trace chat support in CLI by @Jeadie in https://github.com/spiceai/spiceai/pull/4582
  • Include hf test specs, enable ready_wait in workflow by @Sevenannn in https://github.com/spiceai/spiceai/pull/4584
  • Add paths verification when loading models by @sgrebnov in https://github.com/spiceai/spiceai/pull/4591
  • Add generation_config.json support for Filesystem models by @sgrebnov in https://github.com/spiceai/spiceai/pull/4592
  • Promote Filesystem model provider to RC by @sgrebnov in https://github.com/spiceai/spiceai/pull/4593
  • docs: Add models grading criteria by @peasee in https://github.com/spiceai/spiceai/pull/4550
  • Fix typo in Alpha Release Criteria (models) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4588
  • fix: Retry AI integration tests by @peasee in https://github.com/spiceai/spiceai/pull/4595
  • Run LLM integration tests on Macs; add running local models by @Jeadie in https://github.com/spiceai/spiceai/pull/4495
  • Update version to 1.0.2 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4594
  • feat: Schedule testoperator by @peasee in https://github.com/spiceai/spiceai/pull/4503
  • Improve UX of downloading GGUF from HF by @Jeadie in https://github.com/spiceai/spiceai/pull/4601
  • Improve spice trace CLI command by @sgrebnov in https://github.com/spiceai/spiceai/pull/4629
  • Improve the UX of using huggingface models & embeddings by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4623
  • GGUF, hide metadata by @Jeadie in https://github.com/spiceai/spiceai/pull/4631
  • Promote hugging face to rc by @Sevenannn in https://github.com/spiceai/spiceai/pull/4626
  • Endgame Issue template improvements by @lukekim in https://github.com/spiceai/spiceai/pull/4647
  • feat: setup sccache for PR checks by @peasee in https://github.com/spiceai/spiceai/pull/4652
  • Run build_and_release_cuda.yml when crates/llms/Cargo.toml changes by @Jeadie in https://github.com/spiceai/spiceai/pull/4648
  • Update E2E installation tests to match model runtime version by @sgrebnov in https://github.com/spiceai/spiceai/pull/4653
  • fix: Postgres LargeUtf8 is equal to Utf8 by @peasee in https://github.com/spiceai/spiceai/pull/4664
  • Fix eager string formatting in mistral.rs by @Jeadie in https://github.com/spiceai/spiceai/pull/4665
  • Better error for spicepod parsing by @Sevenannn in https://github.com/spiceai/spiceai/pull/4632
  • Update datafusion-table-providers (MySQL improvements) by @sgrebnov in https://github.com/spiceai/spiceai/pull/4670
  • Handle delta tables partitioned by a date column with large date values by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4672

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.1...v1.0.2

- Rust
Published by sgrebnov about 1 year ago

https://github.com/spiceai/spiceai - v1.0.1

Spice v1.0.1 (Jan 27, 2025)

Spice v1.0.1 focuses on an improved developer experience, with automatic CUDA GPU detection for local models, in addition to bug fixes. Notably, the Iceberg Catalog Connector now supports AWS Glue including Sig v4 authentication.

Highlights in v1.0.1

  • AWS Glue Support for Iceberg Catalog Connector: The Iceberg Catalog Connector now supports AWS Glue. Example spicepod.yaml configuration:

```yaml
- from: iceberg:https://glue.ap-northeast-2.amazonaws.com/iceberg/v1/catalogs/123456789012/namespaces
  name: glue
```

  • spice upgrade CLI Command: The spice upgrade CLI command detects more edge cases for a smoother upgrade experience.

  • GPU Acceleration Detection: The Spice CLI now automatically detects and enables CUDA GPU acceleration (NVIDIA GPUs) when supported, in addition to Metal (M-series on macOS).

  • Python SDK: The Python SDK (spicepy) has been updated to v3.0.0, aligning the SDK with the Runtime.

Breaking changes

No breaking changes.

Dependencies

No major dependency changes.

Cookbook

Upgrading

To upgrade to v1.0.1, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.0.1 image:

```console
docker pull spiceai/spiceai:1.0.1
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

Contributors

  • @Jeadie
  • @phillipleblanc
  • @ewgenius
  • @peasee
  • @Sevenannn
  • @sgrebnov
  • @lukekim

What's Changed

  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4459
  • docs: 1.0 release notes by @peasee in https://github.com/spiceai/spiceai/pull/4440
  • Create a release-only workflow that uses a previous run's artifacts by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4461
  • Add publish-only CUDA workflow by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4462
  • Fix the CUDA release workflow by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4463
  • docs: Update SECURITY.md for stable by @peasee in https://github.com/spiceai/spiceai/pull/4465
  • docs: Update endgame by @peasee in https://github.com/spiceai/spiceai/pull/4460
  • docs: Promote HF and File model components by @peasee in https://github.com/spiceai/spiceai/pull/4457
  • fix: E2E test release installation by @peasee in https://github.com/spiceai/spiceai/pull/4466
  • Fix publish part of CUDA workflow by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4467
  • Fix broken docs links in README by @ewgenius in https://github.com/spiceai/spiceai/pull/4468
  • Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/4474
  • Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4477
  • Add instruction to force-install CPU runtime to v1.0 release notes by @sgrebnov in https://github.com/spiceai/spiceai/pull/4469
  • feat: Add WIP testoperator dispatch workflow by @peasee in https://github.com/spiceai/spiceai/pull/4478
  • Fix Bug: invalid REPL cursor position on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/4480
  • feat: Download latest spiced commit for testoperators by @peasee in https://github.com/spiceai/spiceai/pull/4483
  • Add compute engine image by @lukekim in https://github.com/spiceai/spiceai/pull/4486
  • fix: Testoperator git fetch depth by @peasee in https://github.com/spiceai/spiceai/pull/4484
  • feat: New spicepods, testoperator improvements, TPCDS Q1 fix by @peasee in https://github.com/spiceai/spiceai/pull/4475
  • Add 87 CUDA compatiblity to build CI by @Jeadie in https://github.com/spiceai/spiceai/pull/4489
  • Use OpenAI golang client in spice chat by @Jeadie in https://github.com/spiceai/spiceai/pull/4491
  • Verify search and chat on Windows as part of AI installation tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/4492
  • feat: Add testoperator dispatch command by @peasee in https://github.com/spiceai/spiceai/pull/4479
  • Run CUDA builds on non-GPU instances by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4496
  • Use upgraded spice cli when performing runtime upgrade in spice upgrade by @Sevenannn in https://github.com/spiceai/spiceai/pull/4490
  • Revert "Use OpenAI golang client in spice chat (#4491)" by @Jeadie in https://github.com/spiceai/spiceai/pull/4532
  • Make Anthropic rate limit error message friendlier by @sgrebnov in https://github.com/spiceai/spiceai/pull/4501
  • Update supported CUDA targets: add 87(cli), remove 75 by @sgrebnov in https://github.com/spiceai/spiceai/pull/4509
  • Support AWS Glue for Iceberg catalog connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4517
  • Package CUDA runtime libraries into artifact for Windows by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4497

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0...v1.0.1

- Rust
Published by Sevenannn about 1 year ago

https://github.com/spiceai/spiceai - v1.0.0

Spice v1.0-stable (Jan 20, 2025)

🎉 After 47 releases, Spice.ai OSS has reached production readiness with the 1.0-stable milestone!

The core runtime and features such as query federation, query acceleration, catalog integration, search and AI-inference have all graduated to stable status along with key component graduations across data connectors, data accelerators, catalog connectors, and AI model providers.

Highlights in v1.0-stable

Breaking Changes

  • Default Runtime Version: The CLI will install the GPU accelerated AI-capable Runtime by default, when running spice install or spice run.

  • Default OpenAI Model: The default OpenAI model has been updated to gpt-4o-mini.

  • Identifier Normalization: Unquoted identifiers such as table names are no longer normalized to lowercase. Identifiers will now retain their exact case as provided.

  • Sandboxed Docker Image: The Runtime Docker Image now runs the spiced process as the nobody user in a minimal chroot sandbox.

  • Insecure S3 and ABFS endpoints: The S3 and ABFS connectors now enforce insecure endpoint checks, preventing HTTP endpoints unless allow_http is explicitly enabled. Refer to the documentation for details.
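The insecure-endpoint check in the last item above amounts to rejecting plain-HTTP URLs unless the user opts in. The following Python sketch illustrates that rule. Only the `allow_http` name comes from the release notes; the `check_endpoint` function, its error message, and the example hosts are hypothetical, and the connectors' exact parameter names should be taken from the documentation.

```python
from urllib.parse import urlparse


def check_endpoint(endpoint: str, allow_http: bool = False) -> str:
    """Return the endpoint if it is acceptable, otherwise raise ValueError."""
    scheme = urlparse(endpoint).scheme.lower()
    if scheme == "http" and not allow_http:
        raise ValueError(
            f"insecure endpoint {endpoint!r}: enable allow_http to permit HTTP"
        )
    return endpoint


if __name__ == "__main__":
    print(check_endpoint("https://storage.example.com"))
    print(check_endpoint("http://localhost:9000", allow_http=True))
    try:
        check_endpoint("http://localhost:9000")
    except ValueError as err:
        print(f"rejected: {err}")
```

Defaulting to rejection makes an unencrypted connection a deliberate choice, which is usually only appropriate for local development endpoints.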

Dependencies

No major dependency changes.

Upgrading

To upgrade to v1.0.0, use one of the following methods:

CLI:

```console
spice upgrade
```

Homebrew:

```console
brew upgrade spiceai/spiceai/spice
```

Docker:

Pull the spiceai/spiceai:1.0.0 image:

```console
docker pull spiceai/spiceai:1.0.0
```

For available tags, see DockerHub.

Helm:

```console
helm repo update
helm upgrade spiceai spiceai/spiceai
```

Contributors

  • @peasee
  • @ewgenius
  • @Jeadie
  • @Sevenannn
  • @lukekim
  • @phillipleblanc
  • @sgrebnov

What's Changed

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.5...v1.0.0

- Rust
Published by peasee about 1 year ago

https://github.com/spiceai/spiceai - v1.0.0-rc.5

Spice v1.0-rc.5 (Jan 13, 2025)

Spice v1.0.0-rc.5 is the fifth release candidate for the first major version of Spice.ai OSS. This release focuses on production readiness and critical bug fixes. In addition, a new DynamoDB data connector has been added, along with automatic detection of GPU acceleration when running Spice using the CLI.

Highlights in v1.0-rc.5

  • Automatic GPU Acceleration Detection: Automatically detect and utilize GPU acceleration when running Spice using the CLI. Install AI components locally using the CLI command spice install ai. Currently supports NVIDIA CUDA and Apple Metal (M-series).

  • DynamoDB Data Connector: Query AWS DynamoDB tables using SQL with the new DynamoDB Data Connector.

```yaml
datasets:
  - from: dynamodb:users
    name: users
    params:
      dynamodb_aws_region: us-west-2
      dynamodb_aws_access_key_id: ${secrets:aws_access_key_id}
      dynamodb_aws_secret_access_key: ${secrets:aws_secret_access_key}
    acceleration:
      enabled: true
```

```console
sql> describe users;
+----------------+-----------+-------------+
| column_name    | data_type | is_nullable |
+----------------+-----------+-------------+
| created_at     | Utf8      | YES         |
| date_of_birth  | Utf8      | YES         |
| email          | Utf8      | YES         |
| account_status | Utf8      | YES         |
| updated_at     | Utf8      | YES         |
| full_name      | Utf8      | YES         |
| ...                                      |
+----------------+-----------+-------------+
```

  • File Data Connector: Graduated to Stable.

  • Dremio Data Connector: Graduated to Release Candidate (RC).

  • Spice.ai, Spark, and Snowflake Data Connectors: Graduated to Beta.

Dependencies

No major dependency changes.

Contributors

  • @Jeadie
  • @phillipleblanc
  • @ewgenius
  • @peasee
  • @Sevenannn
  • @lukekim

What's Changed

  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4190
  • Ensure non-nullity of primary keys in MemTable; check validity of initial data. by @Jeadie in https://github.com/spiceai/spiceai/pull/4158
  • Bump version to v1.0.0 stable by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4191
  • Fix metal + models download by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4193
  • Update spice.ai connector beta roadmap by @ewgenius in https://github.com/spiceai/spiceai/pull/4194
  • feat: verify on zero results snapshots by @peasee in https://github.com/spiceai/spiceai/pull/4195
  • Add throughput test module to test-framework by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4196
  • Update Spice.ai TPCH snapshots by @ewgenius in https://github.com/spiceai/spiceai/pull/4202
  • Replace all usage of lazy_static! with LazyLock by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4199
  • Fix model + metal download by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4200
  • Run Clickbench for Dremio by @Sevenannn in https://github.com/spiceai/spiceai/pull/4138
  • Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4205
  • Fix the typo in connector stable criteria by @Sevenannn in https://github.com/spiceai/spiceai/pull/4213
  • feat: Add throughput test example by @peasee in https://github.com/spiceai/spiceai/pull/4214
  • feat: calculate throughput test query percentiles by @peasee in https://github.com/spiceai/spiceai/pull/4215
  • feat: Add throughput test to actions by @peasee in https://github.com/spiceai/spiceai/pull/4217
  • Implement DynamoDB Data Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4218
  • 1.0 doc updates by @lukekim in https://github.com/spiceai/spiceai/pull/4181
  • Improve clarity and concison of use-cases by @lukekim in https://github.com/spiceai/spiceai/pull/4220
  • Remove macOS Intel build by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4221
  • fix: Test operator throughput test workflow by @peasee in https://github.com/spiceai/spiceai/pull/4222
  • DynamoDB: Automatically load AWS credentials from IAM roles if access key not provided by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4226
  • File connector clickbench snapshots results by @ewgenius in https://github.com/spiceai/spiceai/pull/4225
  • Spice.ai Catalog Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4204
  • feat: Add test framework metrics collection by @peasee in https://github.com/spiceai/spiceai/pull/4227
  • Add badges for build/test status on README.md by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4228
  • Release Dremio to RC by @Sevenannn in https://github.com/spiceai/spiceai/pull/4224
  • feat: Add more test spicepods by @peasee in https://github.com/spiceai/spiceai/pull/4229
  • feat: Add load test to testoperator by @peasee in https://github.com/spiceai/spiceai/pull/4231
  • Add TSV format to all object_store-based connectors by @Jeadie in https://github.com/spiceai/spiceai/pull/4192
  • Move test-framework to dev-dependencies for Runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4230
  • Document limitation for correlated subqueries in TPCH for Spice.ai connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4235
  • Changes for CUDA by @Jeadie in https://github.com/spiceai/spiceai/pull/4130
  • fix: Collect batches from test framework, load test updates by @peasee in https://github.com/spiceai/spiceai/pull/4234
  • Suppress opentelemetry_sdk warnings - they aren't useful by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4243
  • fix: Set dataset status first, update test framework by @peasee in https://github.com/spiceai/spiceai/pull/4244
  • feat: Re-enable defaults on test spicepods by @peasee in https://github.com/spiceai/spiceai/pull/4248
  • Add usage for streaming local models; Fix spice chat usage bar TPS expansion by @Jeadie in https://github.com/spiceai/spiceai/pull/4232
  • refactor: Use composite testoperator setup, add query overrides by @peasee in https://github.com/spiceai/spiceai/pull/4246
  • Enable expand_views_at_output for DF optimizer and transform schema to expanded view types by @ewgenius in https://github.com/spiceai/spiceai/pull/4237
  • Add throughput test spicepod for databricks delta mode connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/4241
  • Spark data connector - update and enable TPCH and TPCDS benchmarks by @ewgenius in https://github.com/spiceai/spiceai/pull/4240
  • Increase the timeout minutes of load test to 10 hours by @Sevenannn in https://github.com/spiceai/spiceai/pull/4254
  • Improve partition column counts error for delta table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4247
  • Add e2e test for databricks catalog connector (mode: delta_lake) by @Sevenannn in https://github.com/spiceai/spiceai/pull/4255
  • Spark connector integration tests by @ewgenius in https://github.com/spiceai/spiceai/pull/4256
  • Run benchmark test with the new test framework by @Sevenannn in https://github.com/spiceai/spiceai/pull/4245
  • Configure databricks delta secrets to run load test by @Sevenannn in https://github.com/spiceai/spiceai/pull/4257
  • Support properties for emitted telemetry by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4249
  • feat: Add ready_wait test operator workflow input by @peasee in https://github.com/spiceai/spiceai/pull/4259
  • Handle 'LargeStringArray' for embedding tables by @Jeadie in https://github.com/spiceai/spiceai/pull/4263
  • llms tests for alpha/beta model criteria by @Jeadie in https://github.com/spiceai/spiceai/pull/4261
  • Configurable runner type for load and throughput tests by @ewgenius in https://github.com/spiceai/spiceai/pull/4262
  • Handle NULL partition columns for Delta Lake tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4264
  • Add integration test for Snowflake by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4266
  • Add Snowflake TPCH queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4268
  • Handle LargeStringArray in v1/search. by @Jeadie in https://github.com/spiceai/spiceai/pull/4265
  • Fix build_cuda in Update spiced_docker.yml by @Jeadie in https://github.com/spiceai/spiceai/pull/4269
  • Run Snowflake benchmark in GitHub Actions by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4270
  • Allow Snowflake query override for CI tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4271
  • Don't run GPU builds for trunk by @Jeadie in https://github.com/spiceai/spiceai/pull/4272
  • Fix InvalidTypeAction not working by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4273
  • Add xAI key to llm integration tests by @Jeadie in https://github.com/spiceai/spiceai/pull/4274
  • Update openai snapshots by @Jeadie in https://github.com/spiceai/spiceai/pull/4275
  • Fix federation bug for correlated subqueries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4276
  • Update end_game.md by @ewgenius in https://github.com/spiceai/spiceai/pull/4278
  • Promote Snowflake to Beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4277
  • Set version to 1.0.0-rc.5 by @ewgenius in https://github.com/spiceai/spiceai/pull/4283
  • Update cargo.lock by @ewgenius in https://github.com/spiceai/spiceai/pull/4285
  • Update spice.ai data connector snapshots by @ewgenius in https://github.com/spiceai/spiceai/pull/4281
  • Promote the Spice.ai Data Connector to Beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4282
  • Revert change to integration_models__models__search__openai_chunking_response.snap by @Jeadie in https://github.com/spiceai/spiceai/pull/4279
  • Allow for a subset of build artifacts to be published to minio by @Jeadie in https://github.com/spiceai/spiceai/pull/4280
  • Promote File Data Connector to Stable by @ewgenius in https://github.com/spiceai/spiceai/pull/4286
  • Add Iceberg to Supported Catalogs by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4287
  • Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4289
  • Fix Spark benchmark credentials, add back overrides by @ewgenius in https://github.com/spiceai/spiceai/pull/4295
  • Promote Spark Data Connector to Beta by @ewgenius in https://github.com/spiceai/spiceai/pull/4296
  • Add Dremio throughput test spicepod by @Sevenannn in https://github.com/spiceai/spiceai/pull/4233
  • Add error message for invalid databricks mode parameter by @Sevenannn in https://github.com/spiceai/spiceai/pull/4299
  • Fix pre-release check to look for build string by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4300
  • Promote databricks catalog connector (mode: delta_lake) to beta by @Sevenannn in https://github.com/spiceai/spiceai/pull/4301
  • Properly delegate load_table to Rest Catalog by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4303
  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4302
  • docs: Update ROADMAP.md by @peasee in https://github.com/spiceai/spiceai/pull/4306
  • v1.0.0-rc.5 Release Notes by @ewgenius in https://github.com/spiceai/spiceai/pull/4298

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.4...v1.0.0-rc.5

- Rust
Published by ewgenius about 1 year ago

https://github.com/spiceai/spiceai - v1.0.0-rc.4

Spice v1.0-rc.4 (Jan 6, 2025)

Happy New Year 🎆!

Spice v1.0.0-rc.4 is the fourth release candidate for the first major version of Spice.ai OSS. This release continues the focus on production readiness. In addition, xAI has been added as a model provider.

Highlights in v1.0-rc.4

  • xAI Model Provider: Adds support for xAI hosted models.

    models:
      - from: xai:grok2-latest
        name: xai
        params:
          xai_api_key: ${secrets:SPICE_XAI_API_KEY}

    datasets:
      - from: file://my_table.tsv
        name: table

  • Spicepod Spec Version: Spicepod spec version v1 is now the default. v1beta1 will continue to work.

    version: v1
    kind: Spicepod
    name: my_pod

Cookbook

Dependencies

No major dependency changes.

Contributors

  • @lukekim
  • @phillipleblanc
  • @peasee
  • @karifabri
  • @sgrebnov
  • @Jeadie
  • @ewgenius

What's Changed

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.3...v1.0.0-rc.4

- Rust
Published by phillipleblanc about 1 year ago

https://github.com/spiceai/spiceai - v1.0.0-rc.3

Spice v1.0-rc.3 (Dec 30, 2024)

Spice v1.0.0-rc.3 is the third release candidate for the first major version of Spice.ai OSS. This release continues the focus on production readiness and includes new Iceberg Catalog APIs, DuckDB improvements, and a new Iceberg Catalog Connector.

Highlights in v1.0-rc.3

  • Iceberg Catalog APIs: Spice now functions as an Iceberg Catalog provider, implementing a core subset of the Iceberg Catalog APIs. This lets Iceberg Catalog clients natively discover datasets and schemas through Spice APIs.

  • GET /v1/namespaces - List all catalogs registered in Spice.

  • GET /v1/namespaces?parent=catalog - List schemas registered under a given catalog.

  • GET /v1/namespaces/:catalog_schema/tables - List tables registered under a given schema.

  • GET /v1/namespaces/:catalog_schema/tables/:table - Get the schema of a given table.
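The four endpoints above can be called with any HTTP client. As a minimal sketch, the helper below only builds the request URLs. The base URL `http://localhost:8090`, the example names, and the function `iceberg_catalog_urls` are assumptions for illustration and are not part of Spice.

```python
from urllib.parse import quote, urlencode


def iceberg_catalog_urls(base_url, catalog, catalog_schema, table):
    """Build the four GET URLs listed above.

    `catalog_schema` is treated as an opaque path segment; how a catalog and
    schema are combined into it is not specified in these notes.
    """
    root = base_url.rstrip("/") + "/v1/namespaces"
    ns = quote(catalog_schema, safe="")
    return {
        "list_catalogs": root,
        "list_schemas": root + "?" + urlencode({"parent": catalog}),
        "list_tables": f"{root}/{ns}/tables",
        "table_schema": f"{root}/{ns}/tables/{quote(table, safe='')}",
    }


# Hypothetical values, for illustration only.
urls = iceberg_catalog_urls("http://localhost:8090", "ice", "ice.public", "orders")
for name, url in urls.items():
    print(name, url)
```

Each URL can then be passed to an HTTP client of choice, adding authentication headers if the runtime requires them.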

  • Iceberg Catalog Connector: The Iceberg Catalog Connector is a new integration to discover and query datasets from a remote Iceberg Catalog.

Example connecting to a remote Iceberg Catalog with tables stored in S3:

    catalogs:
      - from: iceberg:https://my-iceberg-catalog.com/v1/namespaces
        name: ice
        params:
          iceberg_s3_access_key_id: ${secrets:ICEBERG_S3_ACCESS_KEY_ID}
          iceberg_s3_secret_access_key: ${secrets:ICEBERG_S3_SECRET_ACCESS_KEY}
          iceberg_s3_region: us-east-1

View the Iceberg Catalog Connector documentation for more details.

  • DuckDB Improvements: Added cosine_distance support for DuckDB-backed vector search, improved unnest nested type handling for array_element and lists, and optimized query performance.

  • SQLite Data Accelerator: Graduated to Release Candidate (RC).

  • File Data Accelerator: Graduated to Release Candidate (RC).

Breaking changes

  • API: v1/datasets/sample has been removed. It was of limited use and can be replicated with SQL or through the tools endpoint POST v1/tools/:name.

Cookbook

  • New Language Model Evals Recipe showing how to measure the performance of a language model using LLM-as-Judge, configured entirely in the Spice runtime.

  • New Iceberg Catalog Recipe showing how to use Spice to query Iceberg tables from an Iceberg catalog.

Dependencies

  • OpenTelemetry: Upgraded from 0.26.0 to 0.27.1
  • Go: Upgraded from 1.22 to 1.23 (CLI)

Contributors

  • @sgrebnov
  • @phillipleblanc
  • @peasee
  • @Jeadie
  • @Sevenannn
  • @lukekim
  • @ewgenius

What's Changed

  • Add CI configuration for search benchmark dataset access by @sgrebnov in https://github.com/spiceai/spiceai/pull/3888
  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/3895
  • Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3896
  • chore: Update helm chart for RC.2 by @peasee in https://github.com/spiceai/spiceai/pull/3899
  • Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/3903
  • chore: Update MacOS test release install to macos-13 by @peasee in https://github.com/spiceai/spiceai/pull/3901
  • Add usage to spice chat and fix v1/models?status=true. by @Jeadie in https://github.com/spiceai/spiceai/pull/3898
  • chore: Bump versions for rc3 by @peasee in https://github.com/spiceai/spiceai/pull/3902
  • docs: Update endgame with a step to verify dependencies in release notes by @peasee in https://github.com/spiceai/spiceai/pull/3897
  • Ensure eval dataset input and ouput of correct length by @Jeadie in https://github.com/spiceai/spiceai/pull/3900
  • spice add/connect/dataset configure should update spicepod, not overwrite it & upgrade to Go 1.23 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3905
  • Bump opentelemetry from 0.26.0 to 0.27.1 by @dependabot in https://github.com/spiceai/spiceai/pull/3879
  • Ensure trace_id is overridden for prior written spans by @Jeadie in https://github.com/spiceai/spiceai/pull/3906
  • add 'role': 'assistant' for local models by @Jeadie in https://github.com/spiceai/spiceai/pull/3910
  • Run tpcds benchmark for file connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3924
  • Update to reference cookbook instead of quickstarts/samples by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3928
  • Fix/remove flaky integration tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3930
  • Implement /v1/iceberg/namespaces & /v1/iceberg/config APIs by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3923
  • Add script for creating tpcds parquet files and spicepod for file connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3931
  • Use utoipa to generate openapi.json and swagger for dev by @Jeadie in https://github.com/spiceai/spiceai/pull/3927
  • fuzzy_match, json_match, includes scorer by @Jeadie in https://github.com/spiceai/spiceai/pull/3926
  • Implement /v1/iceberg/namespaces/:namespace by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3933
  • Implement GET /v1/iceberg/namespaces/:namespace/tables API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3934
  • Add custom Spice DuckDB dialect with cosine_distance support by @sgrebnov in https://github.com/spiceai/spiceai/pull/3938
  • Fix NSQL error: all columns in a record batch must have the same length by @sgrebnov in https://github.com/spiceai/spiceai/pull/3947
  • Don't include tools use in hf test model by @Jeadie in https://github.com/spiceai/spiceai/pull/3955
  • Implement GET /v1/namespaces/{namespace}/tables/{table} API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3940
  • Update dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3967
  • DuckDB: add support for nested types in Lists by @sgrebnov in https://github.com/spiceai/spiceai/pull/3961
  • Add script to set up clickbench for file connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3945
  • docs: Add connector stable criteria by @peasee in https://github.com/spiceai/spiceai/pull/3908
  • Update Roadmp Dec 23, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/3978
  • Improve CI testing for OpenAPI, new tool spiceschema, fix broken OpenAPI stuff. by @Jeadie in https://github.com/spiceai/spiceai/pull/3948
  • remove v1/datasets/sample by @Jeadie in https://github.com/spiceai/spiceai/pull/3981
  • feat: add SQLite ClickBench benchmark by @peasee in https://github.com/spiceai/spiceai/pull/3975
  • Remove feature 'llms/mistralrs' by @Jeadie in https://github.com/spiceai/spiceai/pull/3984
  • Add support for 'params.spice_tools: nsql' by @Jeadie in https://github.com/spiceai/spiceai/pull/3985
  • Fix integration tests - add missing format query parameter in /v1/status requests by @ewgenius in https://github.com/spiceai/spiceai/pull/3989
  • Enhance AI tools sampling logic for robust handling of large fields by @sgrebnov in https://github.com/spiceai/spiceai/pull/3959
  • Fix subquery federation by @Sevenannn in https://github.com/spiceai/spiceai/pull/3991
  • Fix unnest and add DuckDB support for array_element by @sgrebnov in https://github.com/spiceai/spiceai/pull/3995
  • Add score value snapshotting to vector similarity search tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3996
  • Use Llama-3.2-3B-Instruct for Hugging Face integration testing by @sgrebnov in https://github.com/spiceai/spiceai/pull/3992
  • Simplify construct_chunk_query_sql for DuckDB compatibility by @sgrebnov in https://github.com/spiceai/spiceai/pull/3988
  • Update TPCH and TPCDS benchmarks for spice.ai connector by @ewgenius in https://github.com/spiceai/spiceai/pull/3982
  • Correctly pass Hugging Face token in models integration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3997
  • Fix: on_zero_results causes TransactionContext Error: Catalog write-write conflict on create with "attachment_0" by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3998
  • Add DuckDB acceleration to search benchmarks by @sgrebnov in https://github.com/spiceai/spiceai/pull/4000
  • Enable Postgres write via non-default postgres-write feature flag by @sgrebnov in https://github.com/spiceai/spiceai/pull/4004
  • Allow search benchmark to write test results by @sgrebnov in https://github.com/spiceai/spiceai/pull/4008
  • Make Flight DoPut atomic and commit write only on successful stream completion by @sgrebnov in https://github.com/spiceai/spiceai/pull/4002
  • Create a CatalogConnector abstraction by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4003
  • Fix generate-openapi.yml and add .schema/openapi.json. by @Jeadie in https://github.com/spiceai/spiceai/pull/3983
  • Enable spice.ai tpcds bench workflow. Comment failing tpch queries. by @ewgenius in https://github.com/spiceai/spiceai/pull/4001
  • feat: Add SQLite ClickBench overrides by @peasee in https://github.com/spiceai/spiceai/pull/4016
  • Implement Iceberg Catalog Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4053
  • feat: Datafusion updates for SQLite fixes and release by @peasee in https://github.com/spiceai/spiceai/pull/4054
  • docs: Add accelerator stable release criteria by @peasee in https://github.com/spiceai/spiceai/pull/4017
  • Add dremio tpch / tpcds benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/4063
  • Update docs, and make PR to spiceai/docs for new openapi.json. by @Jeadie in https://github.com/spiceai/spiceai/pull/4019
  • Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4065
  • Fix dremio subquery rewrite by @Sevenannn in https://github.com/spiceai/spiceai/pull/4064
  • Update generate-openapi.yml by @Jeadie in https://github.com/spiceai/spiceai/pull/4073
  • docs: Add catalog criteria by @peasee in https://github.com/spiceai/spiceai/pull/4052
  • fix distinct_columns in auto/nsql tool groups by @Jeadie in https://github.com/spiceai/spiceai/pull/4074
  • Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4075
  • Update openapi.json by @github-actions in https://github.com/spiceai/spiceai/pull/4076
  • Implement window_func_support_window_frame from DremioDialect by @Sevenannn in https://github.com/spiceai/spiceai/pull/4012
  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/4079
  • Promote file connector to rc by @Sevenannn in https://github.com/spiceai/spiceai/pull/4080
  • Add Iceberg to README by @phillipleblanc in https://github.com/spiceai/spiceai/pull/4085
  • Fix '/v1/status' default format by @Jeadie in https://github.com/spiceai/spiceai/pull/4081

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.2...v1.0.0-rc.3

- Rust
Published by lukekim about 1 year ago

https://github.com/spiceai/spiceai - v1.0.0-rc.2

Spice v1.0-rc.2 (Dec 16, 2024)

Spice v1.0.0-rc.2 is the second release candidate for the first major version of Spice.ai OSS. This release continues to build on the stability of Spice for production use, including key Data Connector graduations, bug fixes, and AI features.

Highlights in v1.0-rc.2

  • MS SQL and File Data Connectors: Graduated from Alpha to Beta.

  • GraphQL and Databricks Delta Lake Data Connectors: Graduated from Beta to Release Candidate.

  • gospice SDK Release: The Spice Go SDK has been updated to v7.0, adding support for refreshing datasets and upgrading dependencies.

  • Azure AI Support: Added support for both LLMs and embedding models. Example spicepod.yml configuration:

    embeddings:
      - name: azure
        from: azure:text-embedding-3-small
        params:
          endpoint: https://your-resource-name.openai.azure.com
          azure_api_version: 2024-08-01-preview
          azure_deployment_name: text-embedding-3-small
          azure_api_key: ${ secrets:SPICE_AZURE_API_KEY }

    models:
      - name: azure
        from: azure:gpt-4o-mini
        params:
          endpoint: https://your-resource-name.openai.azure.com
          azure_api_version: 2024-08-01-preview
          azure_deployment_name: gpt-4o-mini
          azure_api_key: ${ secrets:SPICE_AZURE_TOKEN }

Accelerate subsets of columns: Spice now supports acceleration for specific columns from a federated source. Specify the desired columns directly in the Refresh SQL for more selective and efficient data acceleration.

Example spicepod.yaml configuration:

    datasets:
      - from: s3://spiceai-demo-datasets/taxi_trips/2024/
        name: taxi_trips
        params:
          file_format: parquet
        acceleration:
          refresh_sql: SELECT tpep_pickup_datetime, tpep_dropoff_datetime, trip_distance, total_amount FROM taxi_trips

Breaking changes

Sharepoint Authentication Parameters: now use access tokens instead of authorization codes, using the sharepoint_bearer_token parameter. The sharepoint_auth_code parameter has been removed.

Data Connector Delimiters: now support / and ://, in addition to : in the from parameter of the dataset configuration. The following examples are equivalent:

  • from: postgres://my_postgres_table
  • from: postgres/my_postgres_table
  • from: postgres:my_postgres_table

Some data connectors, such as s3 which only accepts ://, place further restrictions on the allowed delimiter.

The file data connector now interprets the :// delimiter the way most other URL parsers do. For file://my_file_path, the path was previously interpreted as the absolute path /my_file_path. It is now interpreted as the relative path my_file_path.
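To illustrate the three equivalent delimiters, the sketch below splits a `from` value into a connector name and a path. It is an illustration only, not Spice's parser: `split_from` and its regular expression are hypothetical, and it does not enforce connector-specific restrictions such as s3 accepting only `://`.

```python
import re

# "://" is tried before ":" so that "postgres://t" is not split at the bare colon.
_FROM_RE = re.compile(r"^(?P<connector>[A-Za-z0-9_]+)(?P<delim>://|:|/)(?P<path>.*)$")


def split_from(value):
    """Split a dataset `from` value into (connector, path)."""
    match = _FROM_RE.match(value)
    if match is None:
        raise ValueError(f"no connector delimiter found in {value!r}")
    return match.group("connector"), match.group("path")


# The three spellings from the list above resolve to the same pair.
forms = [
    "postgres://my_postgres_table",
    "postgres/my_postgres_table",
    "postgres:my_postgres_table",
]
parsed = {split_from(form) for form in forms}
print(parsed)

# With the new file connector behavior, the remainder is a relative path.
print(split_from("file://my_file_path"))
```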

Spice Search limit: The limit is now applied to the final search result. Previously, it was applied separately to each dataset involved in a search, before aggregation.
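The difference in limit handling can be sketched as follows. This is a simplified model, not the runtime's implementation: it assumes each result is a (id, score) pair with higher scores ranking first, and the function names and sample data are hypothetical.

```python
import heapq


def limit_per_dataset(results_by_dataset, limit):
    """Earlier behavior: keep the top `limit` rows of each dataset, then combine."""
    combined = []
    for rows in results_by_dataset.values():
        combined.extend(heapq.nlargest(limit, rows, key=lambda row: row[1]))
    return sorted(combined, key=lambda row: row[1], reverse=True)


def limit_final(results_by_dataset, limit):
    """Current behavior: combine all rows, then keep the top `limit` overall."""
    merged = [row for rows in results_by_dataset.values() for row in rows]
    return heapq.nlargest(limit, merged, key=lambda row: row[1])


# Hypothetical search results for two datasets.
results = {
    "docs": [("d1", 0.9), ("d2", 0.4)],
    "tickets": [("t1", 0.8), ("t2", 0.7)],
}

print(len(limit_per_dataset(results, 1)))  # one row per dataset
print(len(limit_final(results, 1)))        # one row in total
```

Under this model, a client that relied on receiving up to `limit` rows per dataset would now need to raise the limit to get the same number of rows back.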

Dependencies

  • Rust: Upgraded to 1.83

Contributors

  • @phillipleblanc
  • @ewgenius
  • @Jeadie
  • @sgrebnov
  • @peasee
  • @Sevenannn
  • @Advayp

New Contributors

What's Changed

Full Changelog: https://github.com/spiceai/spiceai/compare/v1.0.0-rc.1...v1.0.0-rc.2

- Rust
Published by peasee about 1 year ago

https://github.com/spiceai/spiceai - v1.0.0-rc.1

Spice v1.0-rc.1 (Nov 27, 2024)

Spice v1.0.0-rc.1 marks the release candidate for the first major version of Spice.ai OSS. This milestone includes key Connector and Accelerator graduations and bug fixes, positioning Spice for a stable and production-ready release.

Highlights in v1.0-rc.1

API Key Authentication: Spice now supports optional authentication for API endpoints via configurable API keys, for additional security and control over runtime access.

Example Spicepod.yml configuration:

    runtime:
      auth:
        api-key:
          enabled: true
          keys:
            - ${ secrets:api_key } # Load from a secret store
            - my-api-key # Or specify directly

Usage:

  • HTTP API: Include the API key in the X-API-Key header.
  • Flight SQL: Use the API key in the Authorization header as a Bearer token.
  • Spice CLI: Provide the --api-key flag for CLI commands.

For more details on using API Key auth, refer to the API Auth documentation.
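As a small sketch of the usage rules above, the helper below returns the header a client would attach for each transport. The function name `api_key_headers` and the transport labels are hypothetical. For Flight SQL, the pair would typically be sent as gRPC call metadata by the client library in use.

```python
def api_key_headers(api_key, transport):
    """Return the auth header for the given transport ("http" or "flight")."""
    if transport == "http":
        # HTTP API: the key goes in the X-API-Key header.
        return {"X-API-Key": api_key}
    if transport == "flight":
        # Flight SQL: the key is sent as a Bearer token.
        return {"Authorization": f"Bearer {api_key}"}
    raise ValueError(f"unsupported transport: {transport!r}")


print(api_key_headers("my-api-key", "http"))
print(api_key_headers("my-api-key", "flight"))
```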

DuckDB Data Connector: Has graduated from Beta to Release Candidate.

Arrow and DuckDB Data Accelerators: Both have graduated from Beta to Release Candidates.

Debezium Kafka Integration: Spice now supports secure authentication and encryption options for Kafka connections when using Debezium for Change Data Capture (CDC). The previous limitation of PLAINTEXT protocol-only connections has been lifted. Spice now supports the following Kafka security configurations:

  • Security protocol: PLAINTEXT, SSL, SASL_PLAINTEXT, SASL_SSL
  • SASL mechanisms: PLAIN, SCRAM-SHA-256, SCRAM-SHA-512

Example Spicepod.yml configuration:

    datasets:
      - from: debezium:my_kafka_topic_with_debezium_changes
        name: my_dataset
        params:
          kafka_security_protocol: SASL_SSL
          kafka_sasl_mechanism: SCRAM-SHA-512
          kafka_sasl_username: kafka
          kafka_sasl_password: ${secrets:kafka_sasl_password}
          kafka_ssl_ca_location: ./certs/kafka_ca_cert.pem
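A configuration like the one above can be checked against the supported values before deployment. The following is a minimal sketch: `check_kafka_params` is a hypothetical helper, the allowed values are the standard Kafka spellings of the options listed in this release, and the SASL_SSL fallback reflects the default described under "Breaking changes" for this release.

```python
SECURITY_PROTOCOLS = {"PLAINTEXT", "SSL", "SASL_PLAINTEXT", "SASL_SSL"}
SASL_MECHANISMS = {"PLAIN", "SCRAM-SHA-256", "SCRAM-SHA-512"}


def check_kafka_params(params):
    """Return a list of problems found in a Debezium dataset's `params` mapping."""
    problems = []
    # When the protocol is omitted, assume the release's stated default.
    protocol = params.get("kafka_security_protocol", "SASL_SSL")
    if protocol not in SECURITY_PROTOCOLS:
        problems.append(f"unknown kafka_security_protocol: {protocol!r}")
    mechanism = params.get("kafka_sasl_mechanism")
    if mechanism is not None and mechanism not in SASL_MECHANISMS:
        problems.append(f"unknown kafka_sasl_mechanism: {mechanism!r}")
    return problems


example = {
    "kafka_security_protocol": "SASL_SSL",
    "kafka_sasl_mechanism": "SCRAM-SHA-512",
}
print(check_kafka_params(example))
```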

Breaking changes

Model Parameters: The params.spice_tools parameter has been replaced by params.tools. Backward compatibility is maintained for existing configurations using params.spice_tools.

Dataset Accelerator State: The ready_state parameter has been moved to the dataset level.

Ready Handler Response: The response body of the /v1/ready handler has been changed from Ready (uppercase) to ready (lowercase) for consistency and adherence to standards.
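Health checks that compare the response body exactly against Ready will stop matching after this change. One hedged way to stay compatible with both versions is a case-insensitive comparison, sketched below with a hypothetical helper `is_ready`.

```python
def is_ready(body):
    """Return True for both the old ("Ready") and new ("ready") response bodies."""
    return body.strip().lower() == "ready"


print(is_ready("Ready"), is_ready("ready"), is_ready("not ready"))
```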

Default Kafka Security for Debezium: The default Kafka kafka_security_protocol parameter for Debezium datasets has changed from PLAINTEXT to SASL_SSL, improving security by default.

Metrics Name Updates: Adjustments have been made to specific metrics for improved observability and accuracy:

| Before | v1.0-rc.1 |
| --- | --- |
| catalogs_load_error | catalog_load_errors |
| catalogs_status | catalog_load_state |
| datasets_acceleration_append_duration_ms, datasets_acceleration_load_duration_ms | dataset_acceleration_refresh_duration_ms {mode: append/full} |
| datasets_acceleration_last_refresh_time | dataset_acceleration_last_refresh_time_ms |
| datasets_acceleration_refresh_error | dataset_acceleration_refresh_errors |
| datasets_count | dataset_active_count |
| datasets_load_error | dataset_load_errors |
| datasets_status | dataset_load_state |
| datasets_unavailable_time | dataset_unavailable_time_ms |
| embeddings_count | embeddings_active_count |
| embeddings_load_error | embeddings_load_errors |
| embeddings_status | embeddings_load_state |
| flight_do_action_duration_ms, flight_do_get_get_primary_keys_duration_ms, flight_do_get_get_catalogs_duration_ms, flight_do_get_get_schemas_duration_ms, flight_do_get_get_sql_info_duration_ms, flight_do_get_table_types_duration_ms, flight_do_get_get_tables_duration_ms, flight_do_get_prepared_statement_query_duration_ms, flight_do_get_simple_duration_ms, flight_do_get_statement_query_duration_ms, flight_do_put_duration_ms, flight_handshake_request_duration_ms, flight_list_actions_duration_ms, flight_get_flight_info_request_duration_ms | flight_request_duration_ms {method: method_name, command: command_name} |
| flight_do_action_requests, flight_do_exchange_data_updates_sent, flight_do_exchange_requests, flight_do_put_requests, flight_do_get_requests, flight_handshake_requests, flight_list_actions_requests, flight_list_flights_requests, flight_get_flight_info_requests, flight_get_schema_requests | flight_requests {method: method_name, command: command_name} |
| http_requests_duration_ms | http_request_duration_ms |
| models_count | model_active_count |
| models_load_duration_ms | model_load_duration_ms |
| models_load_error | model_load_errors |
| models_status | model_load_state |
| tool_count | tool_active_count |
| tool_load_error | tool_load_errors |
| tools_status | tool_load_state |
| query_count | query_executions |
| query_execution_duration | query_execution_duration_ms |
| results_cache_hit_count | results_cache_hits |
| results_cache_item_count | results_cache_items_count |
| results_cache_max_size | results_cache_max_size_bytes |
| results_cache_request_count | results_cache_requests |
| results_cache_size | results_cache_size_bytes |
| secrets_stores_load_duration_ms | secrets_store_load_duration_ms |
| bytes_processed | query_processed_bytes |
| bytes_returned | query_returned_bytes |
| spiced_runtime_flight_server_start | runtime_flight_server_started |
| spiced_runtime_http_server_start | runtime_http_server_started |
| views_load_error | view_load_errors |

Contributors

  • @phillipleblanc
  • @sgrebnov
  • @Jeadie
  • @Sevenannn
  • @peasee
  • @slyons
  • @barracudarin
  • @lukekim
  • @ewgenius

What's Changed

  • Update to next release version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3372
  • Update Helm chart to v0.20.0-beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3373
  • Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3375
  • E2E: Add a test to confirm refreshing with custom refresh-sql via CLI by @sgrebnov in https://github.com/spiceai/spiceai/pull/3374
  • Fix regression in inferring embedding model vector size for non-default models by @Jeadie in https://github.com/spiceai/spiceai/pull/3376
  • add AI quickstarts to endgame by @Jeadie in https://github.com/spiceai/spiceai/pull/3378
  • Remove need for params.model_type for most HF LLMs by @Jeadie in https://github.com/spiceai/spiceai/pull/3342
  • Replace query_duration_seconds and http_requests_duration_seconds with milliseconds metrics by @sgrebnov in https://github.com/spiceai/spiceai/pull/3251
  • Add Extension<Runtime> to HTTP routes to simplify tooling in NSQL. by @Jeadie in https://github.com/spiceai/spiceai/pull/3384
  • Update datafusion patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/3386
  • Ensure hyperparameters are obeyed in recursive chat/completion calls. by @Jeadie in https://github.com/spiceai/spiceai/pull/3395
  • fix: update odbc benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/3394
  • Implement traits & plumbing for pluggable HTTP Auth by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3397
  • Add allow_http parameter for S3 data connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3398
  • Add column field to dataset spicepod component by @Jeadie in https://github.com/spiceai/spiceai/pull/3336
  • feat: add duckdb connector benchmarks by @peasee in https://github.com/spiceai/spiceai/pull/3403
  • Add integration tests for OpenAI NSQL functionality by @sgrebnov in https://github.com/spiceai/spiceai/pull/3402
  • Implement optional api-key auth for the HTTP endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3405
  • Add integration tests for Search API (OpenAI and HF models) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3410
  • HTTP APIs: list tools, call tool by @Jeadie in https://github.com/spiceai/spiceai/pull/3404
  • Implement optional api-key auth for the Flight/FlightSQL endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3412
  • Adding semicolons to some TPCH queries to make sure they run on the CLI by @slyons in https://github.com/spiceai/spiceai/pull/3420
  • Add GrpcAuth to protect the OpenTelemetry endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3417
  • Support Kafka-native authentication and TLS connections for Debezium connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3419
  • Add integration tests for Embeddings API (OpenAI and HF models) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3416
  • Support base64 embedding format by @Jeadie in https://github.com/spiceai/spiceai/pull/3418
  • Give local models some love by @Jeadie in https://github.com/spiceai/spiceai/pull/3425
  • Have views update on --pods-watcher-enabled by @Jeadie in https://github.com/spiceai/spiceai/pull/3428
  • Simplify running models integration tests locally by @sgrebnov in https://github.com/spiceai/spiceai/pull/3424
  • Make Debezium connector MySQL compatible by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3432
  • Store + load memory tooling, enable by @Jeadie in https://github.com/spiceai/spiceai/pull/3413
  • Statically compile OpenSSL by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3434
  • Build macOS x64 on macos-14 (Sonoma) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3435
  • Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3443
  • Bump azure_core from 0.20.0 to 0.21.0 by @dependabot in https://github.com/spiceai/spiceai/pull/3436
  • Add integration tests for chat completion API (HF and OpenAI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3433
  • Run Clickbench with Spice Benchmark Binary by @Sevenannn in https://github.com/spiceai/spiceai/pull/3389
  • Use datatype_is_semantically_equal in verify_schema by @Sevenannn in https://github.com/spiceai/spiceai/pull/3423
  • Use spiceai-large-runners to build benchmark binary by @Sevenannn in https://github.com/spiceai/spiceai/pull/3446
  • Skip reqwest_retry::middleware tracing in non verbose configuration by @sgrebnov in https://github.com/spiceai/spiceai/pull/3445
  • feat: Add invalid type action handling for DuckDB by @peasee in https://github.com/spiceai/spiceai/pull/3430
  • Fix benchmark: Lock poisoning issue from INSTA by @Sevenannn in https://github.com/spiceai/spiceai/pull/3457
  • docs: Release DuckDB Connector RC by @peasee in https://github.com/spiceai/spiceai/pull/3459
  • DR: Code Pattern For Obtaining Milliseconds-Based Duration by @sgrebnov in https://github.com/spiceai/spiceai/pull/3460
  • Improve ClickBench setup script: avoid re-downloading test data every time by @sgrebnov in https://github.com/spiceai/spiceai/pull/3463
  • Fix TableReference quoting for MySQL by @Jeadie in https://github.com/spiceai/spiceai/pull/3461
  • Tool use and model name for local models by @Jeadie in https://github.com/spiceai/spiceai/pull/3458
  • params.tools, not params.spice_tools. Allow backwards compatibility to params.spice_tools. by @Jeadie in https://github.com/spiceai/spiceai/pull/3473
  • fix: Support DuckDB boolean list by @peasee in https://github.com/spiceai/spiceai/pull/3474
  • Upgrade to DataFusion 43 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3462
  • Build explicit ODBC Docker image by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3476
  • Promote Arrow acceleration to RC by @sgrebnov in https://github.com/spiceai/spiceai/pull/3478
  • Update benchmark workflow to create PR for updating snapshot by @Sevenannn in https://github.com/spiceai/spiceai/pull/3479
  • Update benchmark snapshots for spice.ai connector tpch by @github-actions in https://github.com/spiceai/spiceai/pull/3481
  • Update setup-make action by @Sevenannn in https://github.com/spiceai/spiceai/pull/3488
  • Option to return sql from v1/nsql by @Jeadie in https://github.com/spiceai/spiceai/pull/3487
  • Adding scripts to run and monitor TPC-H/-DS queries at larger scale factors by @slyons in https://github.com/spiceai/spiceai/pull/3483
  • Update Datafusion and Datafusion-Table-Providers patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/3489
  • docs: Update Accelerator RC to specify clickbench in all modes by @peasee in https://github.com/spiceai/spiceai/pull/3490
  • Add logos and marks by @lukekim in https://github.com/spiceai/spiceai/pull/3485
  • Updates to repo docs by @lukekim in https://github.com/spiceai/spiceai/pull/3486
  • Change document_similarity to return markdown, not JSON. by @Jeadie in https://github.com/spiceai/spiceai/pull/3477
  • Add support for creating embeddings for Utf8View type columns by @sgrebnov in https://github.com/spiceai/spiceai/pull/3498
  • Add vector search support for Utf8View type columns by @sgrebnov in https://github.com/spiceai/spiceai/pull/3500
  • Update datafusion-table-providers version by @Jeadie in https://github.com/spiceai/spiceai/pull/3503
  • Update text-embeddings-inference and mistral.rs from downstream. by @Jeadie in https://github.com/spiceai/spiceai/pull/3505
  • Fix snapshot update PR push in benchmark by @Sevenannn in https://github.com/spiceai/spiceai/pull/3484
  • Run FederationAnalyzerRule before ResolveGroupingFunction rule by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3508
  • Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3509
  • docs: Release DuckDB accelerator RC by @peasee in https://github.com/spiceai/spiceai/pull/3512
  • Upgrade datafusion-functions-json to 0.43 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3511
  • Update Datafusion Table Provider patch to fix MySQL refresh append mode by @Sevenannn in https://github.com/spiceai/spiceai/pull/3514
  • Handle panics in HF API calls by @Jeadie in https://github.com/spiceai/spiceai/pull/3521
  • Update Runtime metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3518
  • Update Flight metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3515
  • Update Results Cache metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3520
  • Move ready_state to dataset level by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3526
  • Add --force option to spice upgrade to force it to upgrade to the latest released version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3527
  • Refactor runtime initialization into separate modules by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3531
  • Update Anonymous telemetry metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3529
  • Add Metrics naming principles and guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3516
  • Update Dataset Acceleration metrics according to metrics naming guidelines by @sgrebnov in https://github.com/spiceai/spiceai/pull/3528
  • Improve localpod startup to register immediately after its parent is registered by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3532
  • AI/LLM integration tests: make tests more robust and verify more ai_tools by @sgrebnov in https://github.com/spiceai/spiceai/pull/3513
  • Update dashboards to match new metrics names by @sgrebnov in https://github.com/spiceai/spiceai/pull/3530
  • Clarify source of prefixes for data component parameters. by @Jeadie in https://github.com/spiceai/spiceai/pull/3541
  • Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3564
  • Update Spice release process to support release branches by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3525
  • fix: Validate the endpoint for ABFS and S3 by @peasee in https://github.com/spiceai/spiceai/pull/3565
  • Vector Search: Default to datasets with embeddings only when none are specified by @sgrebnov in https://github.com/spiceai/spiceai/pull/3575
  • Lowercase the ready handler response by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3577
  • Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3579
  • Improve spice search error handling by @sgrebnov in https://github.com/spiceai/spiceai/pull/3571
  • Load components in parallel, not concurrently by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3566
  • fix: Make S3 auth parameter validation more robust: by @peasee in https://github.com/spiceai/spiceai/pull/3578
  • fix: Infer if the specified file format is correct in object store by @peasee in https://github.com/spiceai/spiceai/pull/3580
  • Add ability to configure CORS on the HTTP server by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3581
  • fix: Handle invalid S3 auth and region better by @peasee in https://github.com/spiceai/spiceai/pull/3582
  • allow setting of replicaCount to a falsy-value by @barracudarin in https://github.com/spiceai/spiceai/pull/3586
  • spice search to default to only datasets with embeddings by @sgrebnov in https://github.com/spiceai/spiceai/pull/3588
  • Run AI integration tests as part of CI by @sgrebnov in https://github.com/spiceai/spiceai/pull/3572
  • Load datasets in parallel by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3585
  • Run integration test on smaller runners by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3583
  • Use folders for model component by @Jeadie in https://github.com/spiceai/spiceai/pull/3584
  • Improve models integration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/3592
  • Change default task_history captured_output to none by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3598
  • Add timeout to /v1/datasets APIs when app is locked by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3601
  • Properly drop the read lock on the runtime app in http.start by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3603
  • Make integration tests more robust on fewer cores by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3604
  • refactor: First pass data connector error messages update by @peasee in https://github.com/spiceai/spiceai/pull/3602
  • Add log if no datasets are configured by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3605
  • Upgrade to DuckDB 1.1.3 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3606
  • Add E2E test for spice search and chat functionality (OpenAI) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3599
  • Use spiceai-runners for TPCH / TPCDS benchmark by @Sevenannn in https://github.com/spiceai/spiceai/pull/3507
  • docs: Update error handling guide by @peasee in https://github.com/spiceai/spiceai/pull/3611
  • Improve default description for sql tool by @Jeadie in https://github.com/spiceai/spiceai/pull/3612
  • Update metric name from query_invocations to query_executions by @sgrebnov in https://github.com/spiceai/spiceai/pull/3613
  • Don't provide runtime tools to health check. by @Jeadie in https://github.com/spiceai/spiceai/pull/3615
  • Sort vector search results based on similarity score by @sgrebnov in https://github.com/spiceai/spiceai/pull/3620
  • Allow overriding runtime configuration with --set-runtime CLI flags by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3619
  • Some bugs by @Jeadie in https://github.com/spiceai/spiceai/pull/3621
  • Improve S3 errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/3640
  • Update Databricks, Delta Lake, DuckDB error messages by @Sevenannn in https://github.com/spiceai/spiceai/pull/3642
  • docs: Add error message UX to beta connector criteria by @peasee in https://github.com/spiceai/spiceai/pull/3639
  • feat: Make REPL identify it's waiting on a new line by @peasee in https://github.com/spiceai/spiceai/pull/3617
  • Wrap Server-Sent-Events chat errors as OpenAI error events by @sgrebnov in https://github.com/spiceai/spiceai/pull/3641
  • refactor: Update accelerated table errors, dataset health monitor errors by @peasee in https://github.com/spiceai/spiceai/pull/3614
  • Extend v1/datasets api to indicate if dataset can be used in vector search by @sgrebnov in https://github.com/spiceai/spiceai/pull/3644
  • feat: Unnest DataFusion errors by @peasee in https://github.com/spiceai/spiceai/pull/3646
  • feat: Add RateLimited DataConnectorError by @peasee in https://github.com/spiceai/spiceai/pull/3648
  • Setup nightly docker release workflow by @ewgenius in https://github.com/spiceai/spiceai/pull/3649
  • Make LLM integration tests more extensible. by @Jeadie in https://github.com/spiceai/spiceai/pull/3576
  • feat: Update ODBC error messages by @peasee in https://github.com/spiceai/spiceai/pull/3651
  • feat: Better tonic errors by @peasee in https://github.com/spiceai/spiceai/pull/3650
  • Nightly release workflow fixes by @ewgenius in https://github.com/spiceai/spiceai/pull/3652
  • Fix missing ARM64 image for nightly publish step by @ewgenius in https://github.com/spiceai/spiceai/pull/3653
  • Use GitHub GraphQL rate limiting responses to rate limit requests by @lukekim in https://github.com/spiceai/spiceai/pull/3610
  • Fix typo in nightly release publish step by @ewgenius in https://github.com/spiceai/spiceai/pull/3654
  • Handle GitHub rate-limiting for the Rest API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3656
  • Adding custom User-Agent parameters to chat, nsql and flightrepl by @slyons in https://github.com/spiceai/spiceai/pull/3609
  • Remove "nightly-" prefix from tag by @ewgenius in https://github.com/spiceai/spiceai/pull/3671
  • Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3670
  • spice search to warn if dataset is not ready and won't be included in search by @sgrebnov in https://github.com/spiceai/spiceai/pull/3590
  • Fix keyring secret store to try both prefixed & unprefixed secrets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3672
  • Handle empty embeds by allowing for nulls by @Jeadie in https://github.com/spiceai/spiceai/pull/3600
  • Improve github connector error by @Sevenannn in https://github.com/spiceai/spiceai/pull/3677
  • Update FlightSQL error messages by @sgrebnov in https://github.com/spiceai/spiceai/pull/3676
  • Update Datafusion Table Provider Patch to include error message improvements by @Sevenannn in https://github.com/spiceai/spiceai/pull/3678
  • Integration tests for llms crate, with basic Anthropic test. by @Jeadie in https://github.com/spiceai/spiceai/pull/3647
  • Allow E2E model tests to complete even if parallel platform tests failed by @sgrebnov in https://github.com/spiceai/spiceai/pull/3679
  • Add Openai to llms testing by @Jeadie in https://github.com/spiceai/spiceai/pull/3680
  • Fix .env in '.github/workflows/integration_llms.yml' by @Jeadie in https://github.com/spiceai/spiceai/pull/3686
  • Improve error messages for spice ai connector, separate errors to different lines for DuckDB, Delta Lake, Databricks connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/3643
  • Add microsoft/Phi-3-mini-4k-instruct to llms crate testing, with MODEL_SKIPLIST & MODEL_ALLOWLIST by @Jeadie in https://github.com/spiceai/spiceai/pull/3690
  • Add nightly label to spiced version in Cargo.toml by @ewgenius in https://github.com/spiceai/spiceai/pull/3691
  • Disable HF in models integration tests (not supported) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3693
  • Add log when CORS is enabled by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3695
  • Fix nightly release workflow by @ewgenius in https://github.com/spiceai/spiceai/pull/3698
  • Correctly set nightly labels for both release and pre-release versions by @ewgenius in https://github.com/spiceai/spiceai/pull/3699
  • Improve REPL error handling for multiline error messages by @sgrebnov in https://github.com/spiceai/spiceai/pull/3692
  • Determine support_filter_pushdown based on Accelerator federated reader & ZeroResultsAction by @Sevenannn in https://github.com/spiceai/spiceai/pull/3694
  • Fix rdfkafak duplicated version by @Sevenannn in https://github.com/spiceai/spiceai/pull/3707
  • feat: Render multiline errors better in REPL by @peasee in https://github.com/spiceai/spiceai/pull/3701
  • refactor: Update UnableToAttachDataConnector error message by @peasee in https://github.com/spiceai/spiceai/pull/3706
  • refactor: Update errors for Alpha connectors by @peasee in https://github.com/spiceai/spiceai/pull/3705
  • Update benchmark snapshots by @github-actions in https://github.com/spiceai/spiceai/pull/3704
  • Implement a RequestContext that automatically propagates request details to metric dimensions by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3709
  • Fix acceleration in append mode with refresh_sql specified by @sgrebnov in https://github.com/spiceai/spiceai/pull/3697
  • Bump github.com/stretchr/testify from 1.9.0 to 1.10.0 by @dependabot in https://github.com/spiceai/spiceai/pull/3655
  • Tokenizer for OpenAI embedding models for accurate chunking by @Jeadie in https://github.com/spiceai/spiceai/pull/3519
  • Update error message when dataset isn't configured with time_column in append refresh by @Sevenannn in https://github.com/spiceai/spiceai/pull/3703
  • Add the missing winver dependency in runtime crate by @Sevenannn in https://github.com/spiceai/spiceai/pull/3711
  • deps: Update table providers by @peasee in https://github.com/spiceai/spiceai/pull/3712
  • Add special tokens in chunk sizer by @Jeadie in https://github.com/spiceai/spiceai/pull/3713
  • Disable results cache for benchmark tests by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3715

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.20.0-beta...v1.0.0-rc.1

- Rust
Published by ewgenius over 1 year ago

https://github.com/spiceai/spiceai - v0.20.0-beta

Spice v0.20.0-beta (Nov 04, 2024)

Spice v0.20.0-beta improves federated query performance with column pruning and adds support for Metal (Apple Silicon) and CUDA (NVIDIA) accelerators. The S3, PostgreSQL, MySQL, and GitHub Data Connectors have graduated from Beta to Release Candidates. The Arrow, DuckDB, and SQLite Data Accelerators have graduated from Alpha to Beta.

Highlights in v0.20.0-beta

Data Connectors: The S3, PostgreSQL, MySQL, and GitHub Data Connectors have graduated from beta to release candidate.

Data Accelerators: The Arrow, DuckDB, and SQLite Data Accelerators have graduated from alpha to beta.

Metal and CUDA Support: Added support for Metal (Apple Silicon) and CUDA (NVIDIA) for AI/ML workloads including embeddings and local LLM inference.

For instructions on compiling a Metal or CUDA binary, see the Installation Docs.

Breaking Changes

  • The ODBC Data Connector now requires that ODBC drivers specified in connection strings be registered in the system ODBC driver manager.

Example invalid connection string:

    DRIVER={/path/to/driver.so};SERVER=localhost;DATABASE=master

Example valid connection string:

    DRIVER={My ODBC Driver};SERVER=localhost;DATABASE=master

Where My ODBC Driver is the name of an ODBC driver registered in the ODBC driver manager.
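As a rough illustration of where such a connection string might live in a spicepod, the sketch below assumes the connector reads it from an odbc_connection_string parameter. That parameter name, the table, and the dataset name are assumptions for illustration and are not stated in these notes; check the ODBC Data Connector documentation for the exact keys.

```yaml
datasets:
  # Hypothetical table and dataset names, for illustration only.
  - from: odbc:my_table
    name: my_table
    params:
      # Assumed parameter name. The DRIVER value is a driver name registered
      # in the system ODBC driver manager, not a path to a .so file.
      odbc_connection_string: 'DRIVER={My ODBC Driver};SERVER=localhost;DATABASE=master'
```

The value is single-quoted so the braces and semicolons are read as a plain string.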

Contributors

  • @ewgenius
  • @peasee
  • @phillipleblanc
  • @sgrebnov
  • @Jeadie
  • @barracudarin
  • @Sevenannn

What's Changed

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.19.4-beta...v0.20.0-beta

- Rust
Published by phillipleblanc over 1 year ago

https://github.com/spiceai/spiceai - v0.19.4-beta

Spice v0.19.4 (Oct 30, 2024)

Spice v0.19.4-beta introduces a new localpod Data Connector, improvements to accelerator resiliency and control, and a new configuration to control when accelerated datasets are considered ready.

Highlights in v0.19.4

localpod Connector: Implement a "tiered" acceleration strategy with a new localpod Data Connector that can be used to accelerate datasets from other datasets registered in Spice.

    datasets:
      - from: s3://my_bucket/my_dataset
        name: my_dataset
        acceleration:
          enabled: true
          engine: duckdb
          mode: file
          refresh_check_interval: 60s
      - from: localpod:my_dataset
        name: my_localpod_dataset
        acceleration:
          enabled: true

Refreshes on the localpod's parent dataset will automatically be synchronized with the localpod dataset.

Improved Accelerator Resiliency: When Spice is restarted, if the federated source for a dataset configured with a file-based accelerator is not available, the dataset will still load from the existing file data and will attempt to connect to the federated source in the background for future refreshes.

Accelerator Ready State: Control when an accelerated dataset is considered "ready" by the runtime with the new ready_state parameter.

    datasets:
      - from: s3://my_bucket/my_dataset
        name: my_dataset
        acceleration:
          enabled: true
          ready_state: on_load # or on_registration

  • ready_state: on_load: Default. The dataset is considered ready after the initial load of the accelerated data. For file-based accelerated datasets that have existing data, this means the dataset is ready immediately.
  • ready_state: on_registration: The dataset is considered ready when the dataset is registered in Spice. Queries against this dataset before the data is loaded will fall back to the federated source.

Breaking changes

Accelerated datasets configured with ready_state: on_load (the default behavior) that are not ready will return an error instead of returning zero results.

Contributors

  • @Sevenannn
  • @peasee
  • @phillipleblanc
  • @sgrebnov
  • @barracudarin
  • @Jeadie
  • @ewgenius

What's Changed

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.19.3-beta...v0.19.4-beta

- Rust
Published by phillipleblanc over 1 year ago

https://github.com/spiceai/spiceai - v0.19.3-beta

Spice v0.19.3 (Oct 28, 2024)

Spice v0.19.3-beta improves the performance and stability of data connectors and accelerators, including faster queries across multiple federated sources by optimizing how filters are applied. Anthropic has also been added as an LLM model provider.

Highlights in v0.19.3

DataFusion Fixes: Resolved bugs in DataFusion and DataFusion Table Providers, expanding TPC-DS coverage and correctness.

GitHub Data Connector Beta Milestone: The GitHub Data Connector has graduated to Beta after extensive testing, stability, and performance improvements.

Anthropic Models Provider: Anthropic has been added as an LLM provider, including support for streaming.

Example spicepod.yml:

    models:
      - from: anthropic:claude-3-5-sonnet-20240620
        name: claude_3_5_sonnet
        params:
          anthropic_api_key: ${ secrets:SPICE_ANTHROPIC_API_KEY }

Breaking changes

None.

Contributors

  • @Jeadie
  • @Sevenannn
  • @phillipleblanc
  • @peasee
  • @sgrebnov
  • @nlamirault
  • @barracudarin
  • @lukekim
  • @slyons

New Contributors

  • @nlamirault made their first contribution in https://github.com/spiceai/spiceai/pull/3207
  • @barracudarin made their first contribution in https://github.com/spiceai/spiceai/pull/3228

What's Changed

  • Make Anthropic OpenAI compatible. by @Jeadie in https://github.com/spiceai/spiceai/pull/3087
  • Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/3200
  • Bump version to 1.0.0-rc.1 by @Sevenannn in https://github.com/spiceai/spiceai/pull/3202
  • Fix clickhouse schema inference for non-default database by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3201
  • Update endgame template by @Sevenannn in https://github.com/spiceai/spiceai/pull/3198
  • Upgrade dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3197
  • fix: dataset refresh defaults properties to None by @peasee in https://github.com/spiceai/spiceai/pull/3205
  • Upgrade OTEL to v0.26 and make seconds based metrics reported precisely by @sgrebnov in https://github.com/spiceai/spiceai/pull/3203
  • use text_embedding_inference::Infer for more complete embedding solution by @Jeadie in https://github.com/spiceai/spiceai/pull/3199
  • Add S3 parquet file - arrow accelerator e2e test by @Sevenannn in https://github.com/spiceai/spiceai/pull/3154
  • feat: Add script to setup clickbench on mysql by @peasee in https://github.com/spiceai/spiceai/pull/3176
  • Update helm chart version to v0.19.2 by @Sevenannn in https://github.com/spiceai/spiceai/pull/3210
  • Add sample dataset option in v1/nsql. by @Jeadie in https://github.com/spiceai/spiceai/pull/3105
  • Split spiced_docker build across architectures by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3206
  • feat(helm): do not install demo dataset by default by @nlamirault in https://github.com/spiceai/spiceai/pull/3207
  • Split integration test across build/run steps by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3215
  • feat(helm): Refactoring Kubernetes labels by @nlamirault in https://github.com/spiceai/spiceai/pull/3208
  • Define 'tool_recursion_limit' for LLMs, and limit internal tool calling recursion. by @Jeadie in https://github.com/spiceai/spiceai/pull/3214
  • Improve filters pushdown for federated queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/3183
  • Implement native schema inference for PostgreSQL by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3209
  • docs: Update release criteria by @peasee in https://github.com/spiceai/spiceai/pull/3219
  • Run SQLite acceleration TPC-DS tests using smaller scale by @sgrebnov in https://github.com/spiceai/spiceai/pull/3227
  • bind the serviceAccount if a name is given or if we're creating one by @barracudarin in https://github.com/spiceai/spiceai/pull/3228
  • Only emit channel send error log when its not a closed channel error by @Jeadie in https://github.com/spiceai/spiceai/pull/3230
  • Enable Parquet Exec filter pushdown in Spice by @Sevenannn in https://github.com/spiceai/spiceai/pull/3216
  • Add snapshots for SQLite TPC-DS benchmark (file mode) by @sgrebnov in https://github.com/spiceai/spiceai/pull/3234
  • docs: Add SDK release checks to endgame by @peasee in https://github.com/spiceai/spiceai/pull/3256
  • Implement localpod Data Connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3249
  • Revert "Enable Parquet Exec filter pushdown in Spice (#3216)" by @Sevenannn in https://github.com/spiceai/spiceai/pull/3244
  • refactor: Use existing action for detecting changes by @peasee in https://github.com/spiceai/spiceai/pull/3255
  • feat: Add GitHub integration test by @peasee in https://github.com/spiceai/spiceai/pull/3226
  • Add get_readiness tool to retrieve status of all registered components by @lukekim in https://github.com/spiceai/spiceai/pull/3035
  • Improve CLI error output when REPL can't connect to the Flight endpoint by @slyons in https://github.com/spiceai/spiceai/pull/3188
  • Fixing FTP link in Endgame by @slyons in https://github.com/spiceai/spiceai/pull/3267
  • Update version to 0.19.3-beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/3269
  • add service type and annotation customizations in https://github.com/spiceai/spiceai/pull/3268

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.19.2-beta...v0.19.3-beta

- Rust
Published by sgrebnov over 1 year ago

https://github.com/spiceai/spiceai - v0.19.2-beta

Spice v0.19.2 (Oct 21, 2024)

Spice v0.19.2-beta continues to improve performance and stability of data connectors and data accelerators, further expands TPC-DS coverage, and includes several bug fixes.

Highlights in v0.19.2

DataFusion Fixes: Resolved bugs in DataFusion and DataFusion Table Providers, improving TPC-DS query support and correctness.

TPC-DS Snapshots: Extended support for TPC-DS benchmarks with added snapshot tests for validating query plans and result accuracy.

PostgreSQL Accelerator Beta: The PostgreSQL Data Accelerator has been promoted to Beta quality.

Breaking changes

  • The hive_infer_partitions parameter has been changed to hive_partitioning_enabled. It now defaults to false and must be explicitly enabled.
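
A minimal sketch of opting back in after this change, assuming an object-store dataset with Hive-style partitioned files. The bucket path, dataset name, and file_format value are hypothetical placeholders; only the hive_partitioning_enabled parameter name comes from these notes.

```yaml
datasets:
  # Hypothetical source path and dataset name, for illustration only.
  - from: s3://my_bucket/my_partitioned_dataset/
    name: my_partitioned_dataset
    params:
      file_format: parquet # assumed file format
      # Defaults to false as of this release, so it must be set explicitly.
      hive_partitioning_enabled: true
```

Datasets that previously relied on the old default would presumably need this parameter added to keep their partition columns.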

Contributors

  • @ewgenius
  • @sgrebnov
  • @slyons
  • @Jeadie
  • @Sevenannn
  • @phillipleblanc
  • @dependabot
  • @peasee

Dependencies

What's Changed

  • Update Helm chart for v0.19.1-beta by @ewgenius in https://github.com/spiceai/spiceai/pull/3106
  • Add more TPC-DS snapshots for Postgres acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/3107
  • Bumping version to 1.0.0-rc.1 by @slyons in https://github.com/spiceai/spiceai/pull/3109
  • New table sampling methods: sample_distinct_columns, random_sample, top_n_sample by @Jeadie in https://github.com/spiceai/spiceai/pull/3108
  • Add TPCDS snapshot tests for file-based and in-mem duckdb by @Sevenannn in https://github.com/spiceai/spiceai/pull/3115
  • Add Postgres acceleration E2E test for MySQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/3110
  • Update datafusion logical plan to avoid wrong group_by columns in aggregation by @Sevenannn in https://github.com/spiceai/spiceai/pull/3111
  • Warn if user tries to embed column that does not exist by @Jeadie in https://github.com/spiceai/spiceai/pull/3120
  • Changes for Rust version upgrade by @Sevenannn in https://github.com/spiceai/spiceai/pull/3134
  • Add unnest support for federated plans by @sgrebnov in https://github.com/spiceai/spiceai/pull/3133
  • Don't .clone() unnecessarily by @Jeadie in https://github.com/spiceai/spiceai/pull/3128
  • Fix Flight get_schema to construct logical plan and return that schema. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3131
  • Bump clap from 4.5.19 to 4.5.20 by @dependabot in https://github.com/spiceai/spiceai/pull/3099
  • Add GitHub Workflow to build spice-postgres-tpcds-bench image by @sgrebnov in https://github.com/spiceai/spiceai/pull/3140
  • test: Add basic MySQL integration test by @peasee in https://github.com/spiceai/spiceai/pull/3143
  • Bump datafusion-federation and datafusion-table-providers crates by @sgrebnov in https://github.com/spiceai/spiceai/pull/3148
  • docs: Add MySQL limitation for division by zero by @peasee in https://github.com/spiceai/spiceai/pull/3144
  • fix: Dataset refresh by @peasee in https://github.com/spiceai/spiceai/pull/3147
  • Update arrow, duckdb, postgres accelerator tpcds snapshots by @Sevenannn in https://github.com/spiceai/spiceai/pull/3145
  • Add TPC-DS benchmarks for Postgres data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/3149
  • Update E2E test ci to include tests for accelerating Postgres into accelerators by @Sevenannn in https://github.com/spiceai/spiceai/pull/3137
  • Add TPCDS Benchmark test and snapshots for S3 by @Sevenannn in https://github.com/spiceai/spiceai/pull/3152
  • [cli] Include 200 in acceptable response codes for doRuntimeApiRequest by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3157
  • Use -build.{GIT_SHA} for unreleased versions by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3159
  • Upgrade to Rust 1.82 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3158
  • Disable hive_infer_partitions by default by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3160
  • Upgrade to DuckDB 1.1.1 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3161
  • feat: Add MySQL TPCDS results snapshots and exclude workarounds by @peasee in https://github.com/spiceai/spiceai/pull/3165
  • Fix task_history output for sql, add output to table_schema & list_datasets tool by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3166
  • feat: Add ClickBench queries as separate files by @peasee in https://github.com/spiceai/spiceai/pull/3169
  • Calculate embeddings in a separate blocking thread by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3170
  • docs: Update ROADMAP.md and release criterias by @peasee in https://github.com/spiceai/spiceai/pull/3124
  • Handle OpenTelemetry errors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3173
  • Update version to 0.19.2-beta by @Sevenannn in https://github.com/spiceai/spiceai/pull/3182

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.19.1-beta...v0.19.2-beta

- Rust
Published by Sevenannn over 1 year ago

https://github.com/spiceai/spiceai - v0.19.1-beta

Spice v0.19.1 (Oct 14, 2024)

Spice v0.19.1 brings further performance and stability improvements to data connectors, including improved query push-down for file-based connectors (s3, abfs, file, ftp, sftp) that use Hive-style partitioning.

Highlights in v0.19.1

TPC-H and TPC-DS Coverage: Expanded coverage for TPC-H and TPC-DS benchmarking suites across accelerators and connectors.

GitHub Connector Array Filter: The GitHub connector now supports filter push down for the array_contains function in SQL queries using search query mode.

NSQL CLI Command: A new spice nsql CLI command has been added to easily query datasets with natural language from the command line.

Breaking changes

None

Contributors

  • @peasee
  • @Sevenannn
  • @sgrebnov
  • @karifabri
  • @phillipleblanc
  • @lukekim
  • @Jeadie
  • @slyons

Dependencies

What's Changed

  • release: Update helm chart for v0.19.0-beta by @peasee in https://github.com/spiceai/spiceai/pull/3024
  • Set fail-fast = true for benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2997
  • release: Update next version and ROADMAP by @peasee in https://github.com/spiceai/spiceai/pull/3033
  • Verify TPCH benchmark query results for Spark connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2993
  • feat: Add x-spice-user-agent header to Spice REPL by @peasee in https://github.com/spiceai/spiceai/pull/2979
  • Update to object store file formats documentation link by @karifabri in https://github.com/spiceai/spiceai/pull/3036
  • Use teraswitch-runners for Linux x64 workflows + builds by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3042
  • feat: Support array contains in GitHub pushdown by @peasee in https://github.com/spiceai/spiceai/pull/2983
  • Bump text-splitter from 0.16.1 to 0.17.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2987
  • Revert integration tests back to hosted runner by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3046
  • Tune Github runner resources to allow in memory TPCDS benchmark to run by @Sevenannn in https://github.com/spiceai/spiceai/pull/3025
  • fix: add winver by @peasee in https://github.com/spiceai/spiceai/pull/3054
  • refactor: Use is modifier for checking GitHub state filter by @peasee in https://github.com/spiceai/spiceai/pull/3056
  • Enable merge_group checks for PR workflows by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3058
  • Fix issues with merge group by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3059
  • Validate in-memory arrow accelertion TPCDS result correctness by @Sevenannn in https://github.com/spiceai/spiceai/pull/3044
  • Fix rev parsing for PR checks by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3060
  • Use 'Accept' header for /v1/sql/ and /v1/nsql by @Jeadie in https://github.com/spiceai/spiceai/pull/3032
  • Verify Postgres acceleration TPCDS result correctness by @Sevenannn in https://github.com/spiceai/spiceai/pull/3043
  • Add NSQL CLI REPL command by @lukekim in https://github.com/spiceai/spiceai/pull/2856
  • Preserve query results order and add TPCH benchmark results verification for duckdb:file mode by @sgrebnov in https://github.com/spiceai/spiceai/pull/3034
  • Refactor benchmark to include MySQL tpcds bench, tweaks to makefile target for generating mysql tpcds data by @Sevenannn in https://github.com/spiceai/spiceai/pull/2967
  • Support runtime parameter for sql_query_keep_partition_by_columns & enable by default by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3065
  • Document TPC-DS limitations: EXCEPT, INTERSECT, duplicate names by @sgrebnov in https://github.com/spiceai/spiceai/pull/3069
  • Adding ABFS benchmark by @slyons in https://github.com/spiceai/spiceai/pull/3062
  • Add support for GitHub app installation auth for GitHub connector by @ewgenius in https://github.com/spiceai/spiceai/pull/3063
  • docs: Document stack overflow workaround, add helper script by @peasee in https://github.com/spiceai/spiceai/pull/3070
  • Tune MySQL TPCDS image to allow for successful benchmark test run by @Sevenannn in https://github.com/spiceai/spiceai/pull/3067
  • Automatically infer partitions for hive-style partitioned files for object store based connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3073
  • Support hf_token from params/secrets by @Jeadie in https://github.com/spiceai/spiceai/pull/3071
  • Inherit embedding columns from source, when available. by @Jeadie in https://github.com/spiceai/spiceai/pull/3045
  • Validate identifiers for component names by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3079
  • docs: Add workaround for TPC-DS Q97 in MySQL by @peasee in https://github.com/spiceai/spiceai/pull/3080
  • Document TPC-DS Postgres column alias in a CASE statement limitation by @sgrebnov in https://github.com/spiceai/spiceai/pull/3083
  • Update plan snapshots for TPC-H bench queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/3088
  • Update Datafusion crate to include recent unparsing fixes by @sgrebnov in https://github.com/spiceai/spiceai/pull/3089
  • Sample SQL table data tool and API by @Jeadie in https://github.com/spiceai/spiceai/pull/3081
  • chore: Update datafusion-table-providers by @peasee in https://github.com/spiceai/spiceai/pull/3090
  • Add hive_infer_partitions to remaining object store connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/3086
  • deps: Update datafusion-table-providers by @peasee in https://github.com/spiceai/spiceai/pull/3093
  • For local embedding models, return usage input tokens. by @Jeadie in https://github.com/spiceai/spiceai/pull/3095
  • Update end_game.md with Accelerator/Connector criteria check by @slyons in https://github.com/spiceai/spiceai/pull/3092
  • Update TPC-DS Q90 by @sgrebnov in https://github.com/spiceai/spiceai/pull/3094
  • docs: Add RC connector criteria by @peasee in https://github.com/spiceai/spiceai/pull/3026
  • Update version to 0.19.1-beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/3101

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.19.0-beta...v0.19.1-beta

- Rust
Published by slyons over 1 year ago

https://github.com/spiceai/spiceai - v0.19.0-beta

Spice v0.19.0-beta (Oct 7, 2024)

Spice v0.19.0-beta brings performance improvements for accelerators and expanded TPC-DS coverage. A new Azure Blob Storage data connector has also been added.

Highlights in v0.19.0-beta

Improved TPC-DS Coverage: Enhanced support for TPC-DS derived queries.

CLI SQL REPL: The CLI SQL REPL (spice sql) now supports multi-line editing and tab indentation. Note: a terminating semi-colon ';' is now required for each executed SQL block.

Azure Storage Data Connector: A new Azure Blob Storage data connector (abfs://) has been added, enabling federated SQL queries on files stored in Azure Blob-compatible endpoints, including Azure BlobFS (abfss://) and Azure Data Lake (adl://). Supported file formats can be specified using the file_format parameter.

Example spicepod.yml:

```yaml
datasets:
  - from: abfs://foocontainer/taxi_sample.csv
    name: azure_test
    params:
      azure_account: spiceadls
      azure_access_key: abc123==
      file_format: csv
```

For a full list of supported file formats, see the Object Store File Formats documentation.

For more details, see the Azure Blob Storage Data Connector documentation.

Breaking Changes

  • Spice.ai Data Connector: The key for the Spice.ai Cloud Platform Data Connector has changed from spiceai to spice.ai. To upgrade, change uses of from: spiceai: to from: spice.ai:.

  • GitHub Data Connector: Pull Requests column login has been renamed to author.

  • CLI SQL REPL: A terminating semi-colon ';' is now required for each executed SQL block.

  • Spicepod Hot-Reload: When running spiced directly, hot-reload of spicepod.yml configuration is now disabled. Run with spice run to use hot-reload.
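
For the Spice.ai Data Connector rename listed above, the upgrade is a one-line change to each affected dataset. A minimal sketch, assuming a hypothetical dataset path (path.to.my_dataset) and dataset name that are not taken from these notes:

```yaml
datasets:
  # Before v0.19.0-beta the same dataset was declared as:
  # - from: spiceai:path.to.my_dataset
  - from: spice.ai:path.to.my_dataset # hypothetical dataset path
    name: my_dataset                  # hypothetical dataset name
```

Only the connector key in from: changes; the rest of the dataset definition should not need to be edited for this rename.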

Contributors

  • @sgrebnov
  • @Jeadie
  • @Sevenannn
  • @peasee
  • @ewgenius
  • @slyons
  • @phillipleblanc
  • @lukekim

Dependencies

What's Changed

  • Bump tonic from 0.12.2 to 0.12.3 by @dependabot in https://github.com/spiceai/spiceai/pull/2880
  • Verify benchmark query results using snapshot testing (s3 connector) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2902
  • Fix paths-ignore: by @Jeadie in https://github.com/spiceai/spiceai/pull/2906
  • Rename spiceai data connector to spice.ai by @sgrebnov in https://github.com/spiceai/spiceai/pull/2899
  • Update ROADMAP.md by @Jeadie in https://github.com/spiceai/spiceai/pull/2907
  • Helm update for helm for 0.18.3-beta by @Jeadie in https://github.com/spiceai/spiceai/pull/2910
  • Add tpcds queries by @Sevenannn in https://github.com/spiceai/spiceai/pull/2918
  • Fix paths-ignore for docs. by @Jeadie in https://github.com/spiceai/spiceai/pull/2911
  • feat: Support LIKE expressions in GitHub filter pushdown by @peasee in https://github.com/spiceai/spiceai/pull/2903
  • feat: Support date comparison pushdown in GitHub connector by @peasee in https://github.com/spiceai/spiceai/pull/2904
  • Improve aggregation and union queries unparsing by @sgrebnov in https://github.com/spiceai/spiceai/pull/2925
  • Initialize file based accelerators on dataset reload by @Sevenannn in https://github.com/spiceai/spiceai/pull/2923
  • Update spiceai/spiceai for next release by @Jeadie in https://github.com/spiceai/spiceai/pull/2928
  • Verify TPC-H benchmark query results for arrow acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/2927
  • Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2912
  • Use structured output for NSQL by @Jeadie in https://github.com/spiceai/spiceai/pull/2922
  • Update TPC-DS queries to use supported date addition format by @sgrebnov in https://github.com/spiceai/spiceai/pull/2930
  • Add busy_timeout accelerator param for Sqlite by @Sevenannn in https://github.com/spiceai/spiceai/pull/2855
  • Use Cosine Similarity in vector search by @Jeadie in https://github.com/spiceai/spiceai/pull/2932
  • Add support for passing x-spiceai-app-id metadata in spiceai data connector by @ewgenius in https://github.com/spiceai/spiceai/pull/2934
  • docs: update beta accelerator criteria by @peasee in https://github.com/spiceai/spiceai/pull/2905
  • Azure Connector implementation by @slyons in https://github.com/spiceai/spiceai/pull/2926
  • Local embedding model from relative paths by @Jeadie in https://github.com/spiceai/spiceai/pull/2908
  • Add Markdown aware chunker when params.file_format: md. by @Jeadie in https://github.com/spiceai/spiceai/pull/2943
  • 'spice version' without structured logging by @Jeadie in https://github.com/spiceai/spiceai/pull/2944
  • Bump tempfile from 3.12.0 to 3.13.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2878
  • feat: GraphQL commit query parameters by @peasee in https://github.com/spiceai/spiceai/pull/2945
  • Update OpenAI client and use new request fields by @Jeadie in https://github.com/spiceai/spiceai/pull/2951
  • refactor: Rename GitHub pulls login to author by @peasee in https://github.com/spiceai/spiceai/pull/2954
  • Run tpcds benchmarks for accelerators by @Sevenannn in https://github.com/spiceai/spiceai/pull/2853
  • Add spiced arg --pods-watcher-enabled. Watcher disabled by default for spiced. by @ewgenius in https://github.com/spiceai/spiceai/pull/2953
  • Add error message when spicepod has embeddings or models without '--features models' by @Jeadie in https://github.com/spiceai/spiceai/pull/2952
  • Adding multi-line editing and tab indentation to sql REPL by @slyons in https://github.com/spiceai/spiceai/pull/2949
  • Update MySQL ghcr image to include tpcds data by @Sevenannn in https://github.com/spiceai/spiceai/pull/2941
  • Document DataFusion limitation: The context only support single SQL Statement, Date Arithmetic like date + 3 not supported by @Sevenannn in https://github.com/spiceai/spiceai/pull/2970
  • Bump snafu from 0.8.4 to 0.8.5 by @dependabot in https://github.com/spiceai/spiceai/pull/2876
  • Bump async-trait from 0.1.82 to 0.1.83 by @dependabot in https://github.com/spiceai/spiceai/pull/2879
  • Bump async-graphql from 7.0.9 to 7.0.11 in the cargo group by @dependabot in https://github.com/spiceai/spiceai/pull/2950
  • Verify TPC-H benchmark query results for MySQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2972
  • Verify TPCH benchmark query results for Postgres by @sgrebnov in https://github.com/spiceai/spiceai/pull/2973
  • Verify TPCH benchmark query results for sqlite acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/2974
  • Verify TPCH benchmark query results for duckdb (in-memory) acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/2975
  • Support for mdx file extensions to apply a markdown splitter by @ewgenius in https://github.com/spiceai/spiceai/pull/2977
  • Don't assume first vector or content will be non-null/zero by @Jeadie in https://github.com/spiceai/spiceai/pull/2940
  • use custom chunk sizers for HF, local and OpenAI models by @Jeadie in https://github.com/spiceai/spiceai/pull/2971
  • Ensure we return N unique documents, not N unique chunks by @Jeadie in https://github.com/spiceai/spiceai/pull/2976
  • Fix issues parsing messages[*].tool_calls for local models by @Jeadie in https://github.com/spiceai/spiceai/pull/2957
  • text -> SQL trait to customise per model. by @Jeadie in https://github.com/spiceai/spiceai/pull/2942
  • Remove system message from ToolUsingChat. by @Jeadie in https://github.com/spiceai/spiceai/pull/2978
  • Make logical plan to sql more robust (improve ORDER BY; support round for Postgres) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2984
  • Add connection_pool_size parameter for Postgres accelerator by @Sevenannn in https://github.com/spiceai/spiceai/pull/2969
  • Fix dataset configure prompt by @sgrebnov in https://github.com/spiceai/spiceai/pull/2991
  • Verify TPCH benchmark query results for Databricks(odbc) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2989
  • Verify TPCH benchmark query results for Databricks (delta_lake) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2982
  • Set log level for anonymous telemetry traces to trace by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2995
  • Improvements to issue templates by @lukekim in https://github.com/spiceai/spiceai/pull/2992
  • spice login writes to .env.local if present by @slyons in https://github.com/spiceai/spiceai/pull/2996

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.18.3-beta...v0.19.0-beta

- Rust
Published by peasee over 1 year ago

https://github.com/spiceai/spiceai - v0.18.3-beta

Spice v0.18.3-beta (Sep 30, 2024)

The Spice v0.18.3-beta release includes several quality-of-life improvements including verbosity flags for spiced and the Spice CLI, vector search over larger documents with support for chunking dataset embeddings, and multiple performance enhancements. Additionally, the release includes several bug fixes, dependency updates, and optimizations, including updated table providers and significantly improved GitHub data connector performance for issues and pull requests.

Highlights in v0.18.3-beta

GitHub Query Mode: A new github_query_mode: search parameter has been added to the GitHub Data Connector. It uses the GitHub Search API to query issues and pull requests faster and more efficiently when filters are applied.

Example spicepod.yml:

```yaml
- from: github:github.com/spiceai/spiceai/issues/trunk
  name: spiceai.issues
  params:
    github_query_mode: search # Use GitHub Search API
    github_token: ${secrets:GITHUB_TOKEN}
```

Output Verbosity: Higher verbosity output levels can be specified through flags for both spiced and the Spice CLI.

Example command line:

```shell
spice -v
spice --very-verbose

spiced -vv
spiced --verbose
```

Embedding Chunking: Chunking can be enabled and configured to preprocess input data before generating dataset embeddings. This improves the relevance and precision for larger pieces of content.

Example spicepod.yml:

```yaml
- name: support_tickets
  embeddings:
    - column: conversation_history
      use: openai_embeddings
      chunking:
        enabled: true
        target_chunk_size: 128
        overlap_size: 16
        trim_whitespace: true
```

For details, see the Search Documentation.

Dependencies

Contributors

  • @Sevenannn
  • @peasee
  • @Jeadie
  • @sgrebnov
  • @phillipleblanc
  • @ewgenius
  • @slyons

What's Changed

  • Update datafusion table provider patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2817
  • refactor: Set max_rows_per_batch for ODBC to 4000 by @peasee in https://github.com/spiceai/spiceai/pull/2822
  • Use User message for health check by @Jeadie in https://github.com/spiceai/spiceai/pull/2823
  • Upgrade Helm chart (Spice v0.18.2-beta) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2820
  • Add verbosity flags for spiced, spice: -v, -vv, --verbose, --very-verbose. by @Jeadie in https://github.com/spiceai/spiceai/pull/2831
  • Rename spiceai data connector to spice.ai by @sgrebnov in https://github.com/spiceai/spiceai/pull/2680
  • Prepare for v0.19.0-beta release (version bump) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2821
  • Bump clap from 4.5.17 to 4.5.18 (#2801) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2848
  • Enable "rc" feature for serde in spicepod crate by @ewgenius in https://github.com/spiceai/spiceai/pull/2851
  • Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2852
  • chore: update table providers by @peasee in https://github.com/spiceai/spiceai/pull/2858
  • fix: Use GitHub search for issues in GraphQL by @peasee in https://github.com/spiceai/spiceai/pull/2845
  • fix: Use GitHub search for pull_requests by @peasee in https://github.com/spiceai/spiceai/pull/2847
  • Support chunking dataset embeddings by @Jeadie in https://github.com/spiceai/spiceai/pull/2854
  • refactor: Update GraphQL client to be more robust for filter push down by @peasee in https://github.com/spiceai/spiceai/pull/2864
  • docs: Update accelerator beta criteria by @peasee in https://github.com/spiceai/spiceai/pull/2865
  • Change BytesProcessedRule to be an optimizer rather than an analyzer rule by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2867
  • Don't run E2E or PR tests on documentation by @Jeadie in https://github.com/spiceai/spiceai/pull/2869
  • Verify benchmark query results using snapshot testing (spice.ai connector) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2866
  • feat: Add GraphQLOptimizer by @peasee in https://github.com/spiceai/spiceai/pull/2868
  • Update quickstarts for Endgame by @Jeadie in https://github.com/spiceai/spiceai/pull/2863
  • Update version to v0.18.3-beta by @sgrebnov in https://github.com/spiceai/spiceai/pull/2882
  • Update DataFusion: fix coalesce, Aggregation with Window functions unparsing support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2884
  • Revert "Rename spiceai data connector to spice.ai" by @sgrebnov in https://github.com/spiceai/spiceai/pull/2881
  • Adding integration test for DuckDB read functions by @slyons in https://github.com/spiceai/spiceai/pull/2857
  • Show more informative mysql error message by @Sevenannn in https://github.com/spiceai/spiceai/pull/2883
  • Fix no process-level CryptoProvider available when using REPL and TLS by @sgrebnov in https://github.com/spiceai/spiceai/pull/2887
  • Change UX for chunking and enable overlap_size in chunking by @Jeadie in https://github.com/spiceai/spiceai/pull/2890
  • Add log/slog to spice CLI tool by @Jeadie in https://github.com/spiceai/spiceai/pull/2859
  • feat: Add GitHub GraphQLOptimizer by @peasee in https://github.com/spiceai/spiceai/pull/2870
  • Fix mysql invalid tablename error message by @Sevenannn in https://github.com/spiceai/spiceai/pull/2896
  • fix: Remove login column rename in pulls and update Optimizer by @peasee in https://github.com/spiceai/spiceai/pull/2897
  • Fix require check checking. by @Jeadie in https://github.com/spiceai/spiceai/pull/2898

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.18.2-beta...v0.18.3-beta

- Rust
Published by Jeadie over 1 year ago

https://github.com/spiceai/spiceai - v0.18.2-beta

Spice v0.18.2-beta (Sep 24, 2024)

The v0.18.2-beta release improves the reliability of the sharepoint data connector and spice search functionality.

Contributors

  • @Jeadie
  • @sgrebnov

New Contributors

  • None

What's Changed

  • Issue with sharepoint Site by @Jeadie in https://github.com/spiceai/spiceai/pull/2810
  • Upgrade Helm chart (Spice v0.18.1-beta) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2812
  • Prepare for v0.18.2-beta release by @sgrebnov in https://github.com/spiceai/spiceai/pull/2811
  • Fix issues with spice search by @Jeadie in https://github.com/spiceai/spiceai/pull/2814

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.18.1-beta...0.18.2

- Rust
Published by github-actions[bot] over 1 year ago

https://github.com/spiceai/spiceai - v0.18.1-beta

Spice v0.18.1-beta (Sep 23, 2024)

The v0.18.1-beta release continues to improve runtime performance and reliability. Performance for accelerated queries joining multiple datasets has been significantly improved with join push-down support. GraphQL, MySQL, and SharePoint data connectors have better reliability and error handling, and a new Microsoft SQL Server data connector has been introduced. Task History now has fine-grained configuration, including the ability to disable the feature entirely. A new spice search CLI command has been added, enabling development-time embeddings-based searches across datasets.

Highlights in v0.18.1-beta

Join push-down for accelerations: Queries to the same accelerator now push down joins, significantly improving acceleration performance for queries joining multiple tables.

Microsoft SQL Server Data Connector: Use from: mssql: to access and accelerate Microsoft SQL Server datasets.

Example spicepod.yml:

```yaml
datasets:
  - from: mssql:path.to.my_dataset
    name: my_dataset
    params:
      mssql_connection_string: ${secrets:mssql_connection_string}
```

See the Microsoft SQL Server Data Connector documentation.

Task History: Task History can be configured in spicepod.yml, including whether captured outputs, such as the results of a SQL query, are included in full or truncated.

Example spicepod.yml:

```yaml
runtime:
  task_history:
    enabled: true
    captured_output: truncated
    retention_period: 8h
    retention_check_interval: 15m
```

See the Task History Spicepod reference for more information on possible values and behaviors.

Search CLI Command: Use the spice search CLI command to perform embeddings-based searches across datasets configured for search. Note: Search requires the ai feature to be installed.

Refresh on File Changes: File Data Connector refreshes can be triggered by a file system watcher whenever the source file is modified. Enable the watcher by adding file_watcher: enabled to the acceleration parameters.

Example spicepod.yml:

```yaml
datasets:
  - from: file://path/to/my_file.csv
    name: my_file
    acceleration:
      enabled: true
      refresh_mode: full
      params:
        file_watcher: enabled
```

Breaking Changes

The Query History table runtime.query_history has been deprecated and removed in favor of the Task History table runtime.task_history. The Task History table tracks tasks across all features such as SQL query, vector search, and AI completion in a unified table.

See the Task History documentation.

Dependencies

Contributors

  • @phillipleblanc
  • @Jeadie
  • @lukekim
  • @sgrebnov
  • @peasee
  • @Sevenannn
  • @ewgenius
  • @slyons

New Contributors

  • @slyons made their first contribution in https://github.com/spiceai/spiceai/pull/2724

What's Changed

  • Update Helm Chart for 0.18.0-beta release by @sgrebnov in https://github.com/spiceai/spiceai/pull/2711
  • Use a single instance for all DuckDB accelerated datasets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2669
  • Dependabot upgrades by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2715
  • Use a single instance for all SQLite accelerated datasets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2720
  • Prepare for v0.18.1-beta release by @sgrebnov in https://github.com/spiceai/spiceai/pull/2692
  • For GraphQL, remove necessity of json_pointer and improve error messaging. by @Jeadie in https://github.com/spiceai/spiceai/pull/2713
  • Postgres accelerator benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2652
  • Trace query result while running benchmark tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2684
  • Early check EmbeddingConnector if embedding models do not exist by @Jeadie in https://github.com/spiceai/spiceai/pull/2717
  • Move table creation for spice_sys_dataset_checkpoint to DatasetCheckpoint::try_new by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2732
  • Don't load tools immediately by @Jeadie in https://github.com/spiceai/spiceai/pull/2731
  • Re-enable accelerator federation on trunk by @Sevenannn in https://github.com/spiceai/spiceai/pull/2725
  • Fixing Data Connectors link in README.md by @slyons in https://github.com/spiceai/spiceai/pull/2724
  • Enable rehydration tests for DuckDB by @sgrebnov in https://github.com/spiceai/spiceai/pull/2729
  • Check pageInfo is correct at initialisation of GraphQL connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2730
  • Microsoft SQL Server data connector initial support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2741
  • Add spice search CLI command by @lukekim in https://github.com/spiceai/spiceai/pull/2739
  • Update threat model by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2738
  • Upgrade to Arrow 53, DataFusion 42 and DuckDB 1.1 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2744
  • Update datafusion table provider patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2747
  • feat: Add enabled config option for task_history by @peasee in https://github.com/spiceai/spiceai/pull/2758
  • Remove v0.18.0-beta from the Roadmap by @sgrebnov in https://github.com/spiceai/spiceai/pull/2748
  • Fix spark-connect to use native roots for TLS again by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2766
  • Fix benchmark test - Install default crypto provider by @Sevenannn in https://github.com/spiceai/spiceai/pull/2752
  • Resolve primary keys for datasets with catalog or schema by @Jeadie in https://github.com/spiceai/spiceai/pull/2749
  • MSSQL: include table name in schema retrieval error by @sgrebnov in https://github.com/spiceai/spiceai/pull/2746
  • File Format parsing for Document tables, support for docx + pdf by @Jeadie in https://github.com/spiceai/spiceai/pull/2740
  • Add Document parsing to Sharepoint connector. by @Jeadie in https://github.com/spiceai/spiceai/pull/2760
  • Execution plan with BinaryExpr predicates pushdown support for MS SQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2768
  • Update datafusion patch by @Sevenannn in https://github.com/spiceai/spiceai/pull/2772
  • Support for standalone config parameters for MS SQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2773
  • Utilize DataConnectorError for MySQL Data Connector Errors by @Sevenannn in https://github.com/spiceai/spiceai/pull/2759
  • Add Score to search results by @lukekim in https://github.com/spiceai/spiceai/pull/2774
  • Don't call GetComponentStatuses when --metrics not enabled by @Jeadie in https://github.com/spiceai/spiceai/pull/2779
  • Implement better error handling for spicepods by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2767
  • Make integration tests more robust by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2782
  • Query results streaming support for MS SQL by @sgrebnov in https://github.com/spiceai/spiceai/pull/2781
  • Update benchmark snapshots by @Sevenannn in https://github.com/spiceai/spiceai/pull/2778
  • For Sharepoint connector, if client_secret and auth_code are both provided, default to auth_code by @Jeadie in https://github.com/spiceai/spiceai/pull/2780
  • Add modified pk/indexes scenario to rehydration tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2743
  • Run benchmarks on Wed, Fri, Sat, and Sun. by @lukekim in https://github.com/spiceai/spiceai/pull/2786
  • Update PULL_REQUEST_TEMPLATE.md to include a section for Documentation by @slyons in https://github.com/spiceai/spiceai/pull/2785
  • Add E2E test for MS SQL data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2788
  • More types support for MS SQL data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2789
  • feat: Add captured_output option for task_history by @peasee in https://github.com/spiceai/spiceai/pull/2783
  • Add ability to refresh when file data connector detects changes by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2787
  • Propagate MySQL invalid table name error by @Sevenannn in https://github.com/spiceai/spiceai/pull/2776
  • feat: Add retention options for task_history config by @peasee in https://github.com/spiceai/spiceai/pull/2784
  • fix: Move task history check after query history creation by @peasee in https://github.com/spiceai/spiceai/pull/2793
  • MS SQL connector should ignore all unsupported types by @sgrebnov in https://github.com/spiceai/spiceai/pull/2795
  • Improve Sharepoint DX by @Jeadie in https://github.com/spiceai/spiceai/pull/2791
  • Replace query history with task history by @peasee in https://github.com/spiceai/spiceai/pull/2792
  • Fix datasets_health_monitor spice.runtime.task_history not found warning by @sgrebnov in https://github.com/spiceai/spiceai/pull/2805
  • Upgrade macOS x86_64 test runner to macOS 13.6.9 Ventura by @sgrebnov in https://github.com/spiceai/spiceai/pull/2803
  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2808
  • Add mssql to the list of supported data connectors by @sgrebnov in https://github.com/spiceai/spiceai/pull/28

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.18.0-beta...v0.18.1-beta

- Rust
Published by sgrebnov over 1 year ago

https://github.com/spiceai/spiceai - v0.18.0-beta

Spice v0.18.0-beta (Sep 16, 2024)

The v0.18.0-beta release adds new Sharepoint and File data connectors, introduces AWS Identity and Access Management (IAM) support for the S3 Data Connector, improves performance of the GitHub connector, and increases the overall reliability of all data accelerators. The /ready API endpoint was enhanced to report as ready only when all components, including loaded data, have successfully reported readiness.

Highlights in v0.18.0-beta

Sharepoint Data Connector: Use from: sharepoint: to access and accelerate documents stored in Microsoft 365 OneDrive for Business (Sharepoint).

Example spicepod.yml:

```yaml
datasets:
  - from: sharepoint:drive:Documents/path:/important_documents/
    name: important_documents
    params:
      sharepoint_client_id: ${secrets:SPICE_SHAREPOINT_CLIENT_ID}
      sharepoint_tenant_id: ${secrets:SPICE_SHAREPOINT_TENANT_ID}
      sharepoint_client_secret: ${secrets:SPICE_SHAREPOINT_CLIENT_SECRET}
```

See the Sharepoint Data Connector documentation.

AWS Identity and Access Management (IAM) for S3: A new s3_auth parameter for the S3 data connector configures the authentication method used when connecting to S3. Supported values are public, key, and iam_role. Use s3_auth: iam_role to assume the instance IAM role.

Example spicepod.yml:

```yaml
datasets:
  - from: s3://my-bucket
    name: bucket
    params:
      s3_auth: iam_role # Assume IAM role of instance
```

See the S3 Data Connector documentation.
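
For comparison with the iam_role example above, the key mode might look like the following. This is a sketch only: the s3_key and s3_secret parameter names and the secret names are assumptions not stated in these release notes, so check them against the S3 Data Connector documentation before use.

```yaml
datasets:
  - from: s3://my-bucket
    name: bucket
    params:
      s3_auth: key                     # one of: public, key, iam_role
      s3_key: ${secrets:S3_KEY}        # assumed parameter and secret name
      s3_secret: ${secrets:S3_SECRET}  # assumed parameter and secret name
```

With s3_auth: public, no credential parameters should be needed, since the bucket is read anonymously.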

File Data Connector: Use from: file: to query files stored on locally accessible filesystems.

Example spicepod.yml:

```yaml
datasets:
  - from: file://path/to/customer.parquet
    name: customer
    params:
      file_format: parquet
```

See the File Data Connector documentation.

Improved /ready API: The /ready endpoint now includes the initial data load for accelerated datasets in addition to component readiness, so that readiness is reported only when data has loaded and can be successfully queried.
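
Because /ready now waits for the initial data load, a Kubernetes readiness probe pointed at it may need more generous timing than before. A minimal sketch of a container-spec fragment, assuming the runtime serves HTTP on port 8090 (an assumption, not stated in these notes) and using illustrative timing values:

```yaml
# Fragment of a Kubernetes container spec (illustrative only).
readinessProbe:
  httpGet:
    path: /ready
    port: 8090            # assumed Spice HTTP port; adjust to your deployment
  initialDelaySeconds: 5  # illustrative value
  periodSeconds: 10       # illustrative value
  failureThreshold: 30    # tolerate a longer initial data load before failing
```

The periodSeconds × failureThreshold window should comfortably exceed the time the largest accelerated dataset takes to load; otherwise the pod may never be marked ready.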

Breaking Changes

  • GitHub Data Connector: The data type for time-related columns has changed from Utf8 to Timestamp. To upgrade, change data type references to timestamp. For example, if using time_format:, change uses of time_format: ISO8601 to time_format: timestamp.

  • Ready API: The /ready API reports ready only when all components have reported ready and data is fully loaded. To upgrade, evaluate uses of the Ready API (such as Kubernetes readiness probes) and consider how it might affect system behavior.
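
For the GitHub Data Connector change above, the upgrade amounts to updating time_format on affected datasets. A minimal sketch, assuming a dataset that uses created_at as its time column (the column choice is an assumption); the from: value and dataset name follow the GitHub example used elsewhere in these notes:

```yaml
datasets:
  - from: github:github.com/spiceai/spiceai/issues/trunk
    name: spiceai.issues
    time_column: created_at  # assumed time column
    time_format: timestamp   # previously: ISO8601
    params:
      github_token: ${secrets:GITHUB_TOKEN}
```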

Dependencies

No major dependency updates.

Contributors

  • @phillipleblanc
  • @Jeadie
  • @lukekim
  • @sgrebnov
  • @peasee
  • @eltociear
  • @Sevenannn
  • @ewgenius
  • @karifabri

New Contributors

  • @karifabri made their first contribution in https://github.com/spiceai/spiceai/pull/2601

What's Changed

  • Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2585
  • Set helm to v0.17.4-beta by @ewgenius in https://github.com/spiceai/spiceai/pull/2595
  • Bump to next v0.18.0-beta version by @ewgenius in https://github.com/spiceai/spiceai/pull/2596
  • Add snapshot test docs / Update beta criteria for data accelerators by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2594
  • Enable federation for accelerated queries (sqlite, duckdb, postgres) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2598
  • spelling updates on v0.17.4 release notes by @karifabri in https://github.com/spiceai/spiceai/pull/2601
  • Update endgame template by @ewgenius in https://github.com/spiceai/spiceai/pull/2591
  • fix: Re-attach DuckDB attachments on each query by @peasee in https://github.com/spiceai/spiceai/pull/2602
  • Speed up sqlite accelerator benchmark test with indexes by @Sevenannn in https://github.com/spiceai/spiceai/pull/2597
  • Fix refresh API using refresh_mode: append by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2609
  • Tweak /ready to only report ready when components have all reported Ready by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2600
  • Add s3_auth parameter to configure IAM role authentication by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2611
  • Bump fundu from 2.0.0 to 2.0.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2576
  • fix: Remove comments from SQL files by @peasee in https://github.com/spiceai/spiceai/pull/2627
  • Utilize runtime.status().is_ready() to check acceleration dataset readiness in benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2614
  • Allow for prefix to be kept in internal Parameters by @Jeadie in https://github.com/spiceai/spiceai/pull/2603
  • Bump itertools from 0.12.1 to 0.13.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2572
  • Bump golang.org/x/mod from 0.20.0 to 0.21.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2571
  • Add initial threat model using OWASP Threat Dragon by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2599
  • fix: Explicitly error for duplicate duckdb file accelerators by @peasee in https://github.com/spiceai/spiceai/pull/2628
  • Benchmark test binary can parse command line option by @Sevenannn in https://github.com/spiceai/spiceai/pull/2626
  • Snapshot tests shouldn't crash the Spice benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2613
  • Bump anyhow from 1.0.86 to 1.0.87 by @dependabot in https://github.com/spiceai/spiceai/pull/2573
  • Upgrade datafusion to improve SQLite subquery tables aliasing support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2634
  • Run benchmark separately using workflow by @Sevenannn in https://github.com/spiceai/spiceai/pull/2631
  • Sharepoint UX changes by @Jeadie in https://github.com/spiceai/spiceai/pull/2633
  • Improve /ready to only mark a dataset ready iff the initial refresh completed by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2630
  • Support relative paths for file connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2637
  • Fix error decoding response body GitHub file connector bug by @sgrebnov in https://github.com/spiceai/spiceai/pull/2645
  • GraphQL pagination and robustness. by @Jeadie in https://github.com/spiceai/spiceai/pull/2632
  • docs: Update bug template by @peasee in https://github.com/spiceai/spiceai/pull/2629
  • Define GitHub issues data connector schema upfront by @sgrebnov in https://github.com/spiceai/spiceai/pull/2646
  • Add support for loading from Sharepoint Group's default drive. by @Jeadie in https://github.com/spiceai/spiceai/pull/2642
  • Fix typo in workflow, fix the postgres connector container readiness check by @Sevenannn in https://github.com/spiceai/spiceai/pull/2654
  • Fix check all features by @Sevenannn in https://github.com/spiceai/spiceai/pull/2653
  • Enable Warn/Error traces from dependency components by @sgrebnov in https://github.com/spiceai/spiceai/pull/2655
  • Use lower case iso8601 for time_column by @Sevenannn in https://github.com/spiceai/spiceai/pull/2551
  • Add basic integration test for Spice spill-to-disk and re-hydration scenario by @sgrebnov in https://github.com/spiceai/spiceai/pull/2643
  • Add 'RefreshOverrides::max_jitter' to 'POST /v1/datasets/:name/acceleration/refresh' by @Jeadie in https://github.com/spiceai/spiceai/pull/2641
  • Bump rustls-pemfile from 1.0.4 to 2.1.3 by @dependabot in https://github.com/spiceai/spiceai/pull/2575
  • Update dependencies to support querying postgres enum types by @Sevenannn in https://github.com/spiceai/spiceai/pull/2657
  • Upgrade table-providers by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2659
  • Improve spill_to_disk_and_rehydration integration test by @sgrebnov in https://github.com/spiceai/spiceai/pull/2658
  • Enhance GitHub connector robustness with explicit table schema definitions by @sgrebnov in https://github.com/spiceai/spiceai/pull/2661
  • Rename sharepoint fields by @Jeadie in https://github.com/spiceai/spiceai/pull/2668
  • Disable dataset checkpoint for DuckDB acceleration by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2676
  • Revert "Enable federation for accelerated queries (sqlite, duckdb, postgres) (#2598)" by @Sevenannn in https://github.com/spiceai/spiceai/pull/2683

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.4-beta...v0.18.0-beta

- Rust
Published by sgrebnov over 1 year ago

https://github.com/spiceai/spiceai - v0.17.4-beta.1

This is the release candidate 0.17.4-beta.1

- Rust
Published by phillipleblanc over 1 year ago

https://github.com/spiceai/spiceai - v0.17.4-beta

- Rust
Published by ewgenius over 1 year ago

https://github.com/spiceai/spiceai - v0.17.3-beta

Spice v0.17.3-beta (Sep 2, 2024)

The v0.17.3-beta release further improves data accelerator robustness and adds a new github data connector that makes accelerating GitHub Issues, Pull Requests, Commits, and Blobs easy.

Highlights in v0.17.3-beta

Improved benchmarking, testing, and robustness of data accelerators: Continued improvements to benchmarking and testing of data accelerators, leading to more robust and reliable data accelerators.

GitHub Connector (alpha): Connect to GitHub and accelerate Issues, Pull Requests, Commits, and Blobs.

```yaml
datasets:
  # Fetch all rust and golang files from spiceai/spiceai
  - from: github:github.com/spiceai/spiceai/files/trunk
    name: spiceai.files
    params:
      include: '**/*.rs; **/*.go'
      github_token: ${secrets:GITHUB_TOKEN}

  # Fetch all issues from spiceai/spiceai. Similar for pull requests, commits, and more.
  - from: github:github.com/spiceai/spiceai/issues
    name: spiceai.issues
    params:
      github_token: ${secrets:GITHUB_TOKEN}
```

Breaking Changes

None.

Contributors

  • @phillipleblanc
  • @Jeadie
  • @peasee
  • @sgrebnov
  • @Sevenannn
  • @lukekim
  • @dependabot
  • @ewgenius

What's Changed

Dependencies

  • delta_kernel from 0.2.0 to 0.3.0.

Commits

  • Prepare version for v0.17.3-beta by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2388
  • Add a basic Github Connector by @Jeadie in https://github.com/spiceai/spiceai/pull/2365
  • task: Re-enable federation by @peasee in https://github.com/spiceai/spiceai/pull/2389
  • fix: Implement custom PartialEq for Dataset by @peasee in https://github.com/spiceai/spiceai/pull/2390
  • GitHub Data Connector files support (basic fields) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2393
  • Add a --force flag to spice install to force it to install the latest released version by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2395
  • Improve experience of using spice chat by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2396
  • Fix view loading on startup by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2398
  • Add include param support to GitHub Data Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2397
  • Postgres integration test to cover on-conflict behavior by @Sevenannn in https://github.com/spiceai/spiceai/pull/2359
  • Create dependabot.yml by @lukekim in https://github.com/spiceai/spiceai/pull/2399
  • Add content column to GitHub Connector when dataset is accelerated by @sgrebnov in https://github.com/spiceai/spiceai/pull/2400
  • Fix dependabot indentation by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2402
  • Bump docker/setup-buildx-action from 1 to 3 by @dependabot in https://github.com/spiceai/spiceai/pull/2403
  • Bump github/codeql-action from 2 to 3 by @dependabot in https://github.com/spiceai/spiceai/pull/2404
  • Bump docker/login-action from 1 to 3 by @dependabot in https://github.com/spiceai/spiceai/pull/2405
  • Bump yogevbd/enforce-label-action from 2.1.0 to 2.2.2 by @dependabot in https://github.com/spiceai/spiceai/pull/2406
  • Bump actions/checkout from 3 to 4 by @dependabot in https://github.com/spiceai/spiceai/pull/2407
  • Bump go.uber.org/zap from 1.21.0 to 1.27.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2408
  • Bump github.com/prometheus/client_model from 0.6.0 to 0.6.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2409
  • Bump github.com/spf13/cobra from 1.6.0 to 1.8.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2412
  • Bump chrono-tz from 0.8.6 to 0.9.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2413
  • Bump tokio from 1.39.2 to 1.39.3 by @dependabot in https://github.com/spiceai/spiceai/pull/2414
  • Bump tokenizers from 0.19.1 to 0.20.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2415
  • Bump serde from 1.0.207 to 1.0.209 by @dependabot in https://github.com/spiceai/spiceai/pull/2416
  • Bump gopkg.in/natefinch/lumberjack.v2 from 2.0.0 to 2.2.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2410
  • Bump ndarray from 0.15.6 to 0.16.1 by @dependabot in https://github.com/spiceai/spiceai/pull/2417
  • Bump golang.org/x/mod from 0.14.0 to 0.20.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2411
  • Add correct labels to dependabot.yml by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2418
  • Fix build break by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2430
  • Dependabot updates by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2431
  • Bump github.com/stretchr/testify from 1.8.1 to 1.9.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2422
  • Preserve timezone information in constructing expr by @Sevenannn in https://github.com/spiceai/spiceai/pull/2392
  • Bump github.com/spf13/viper from 1.12.0 to 1.19.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2420
  • Fix repeated base table data in acceleration with embeddings by @Sevenannn in https://github.com/spiceai/spiceai/pull/2401
  • Fix tool calling with Groq (and potentially other tool-enabled models) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2435
  • Remove candle from crates/llms/src/chat/ by @Jeadie in https://github.com/spiceai/spiceai/pull/2439
  • fix: Only attach successfully initialized accelerators by @peasee in https://github.com/spiceai/spiceai/pull/2433
  • Support overriding OpenAI default values in a model param; add token usage telemetry to task_history. by @Jeadie in https://github.com/spiceai/spiceai/pull/2434
  • Enable message chains and tool calls for local LLMs by @Jeadie in https://github.com/spiceai/spiceai/pull/2180
  • DuckDB on-conflict integration test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2437
  • Fix MySQL E2E tests and include MySQL acceleration testing by @sgrebnov in https://github.com/spiceai/spiceai/pull/2441
  • Use rtcontext for proper cloud/local context in spice chat by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2442
  • Fix MySQL connector to respect the source column's decimal precision by @sgrebnov in https://github.com/spiceai/spiceai/pull/2443
  • Improve Github Data Connector tables schema by @sgrebnov in https://github.com/spiceai/spiceai/pull/2448
  • Improve GitHub Connector error msg when invalid token or permissions by @sgrebnov in https://github.com/spiceai/spiceai/pull/2449
  • Proper error tracking across tracing spans by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2454
  • task: Disable and update federation by @peasee in https://github.com/spiceai/spiceai/pull/2457
  • GitHub connector: convert labels and hashes to primitive arrays by @sgrebnov in https://github.com/spiceai/spiceai/pull/2452
  • Bump datafusion version to the latest by @sgrebnov in https://github.com/spiceai/spiceai/pull/2456
  • Trim trailing / for S3 data connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2458
  • Add accelerated_refresh to task_history table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2459
  • Add assignees and labels fields to github issues and github pulls datasets by @ewgenius in https://github.com/spiceai/spiceai/pull/2467
  • Native clickhouse schema inference by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2466
  • List GitHub connector in readme by @ewgenius in https://github.com/spiceai/spiceai/pull/2468
  • Fix LLMs health check; Add updatedAt field to GitHub connector by @ewgenius in https://github.com/spiceai/spiceai/pull/2474
  • Remove non existing updated_at from github.pulls dataset by @ewgenius in https://github.com/spiceai/spiceai/pull/2475
  • GitHub connector: add pulls labels and rm duplicate milestoneId and milestoneTitle for issues by @sgrebnov in https://github.com/spiceai/spiceai/pull/2477
  • Bump delta_kernel from 0.2.0 to 0.3.0 by @dependabot in https://github.com/spiceai/spiceai/pull/2472
  • Add back GitHub connector Pull Request updated_at by @lukekim in https://github.com/spiceai/spiceai/pull/2479
  • Update ROADMAP Sep 2, 2024. by @lukekim in https://github.com/spiceai/spiceai/pull/2478

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.2-beta...v0.17.3-beta

- Rust
Published by Jeadie over 1 year ago

https://github.com/spiceai/spiceai - v0.17.2-beta.1

This is the release candidate 0.17.2-beta.1

- Rust
Published by phillipleblanc over 1 year ago

https://github.com/spiceai/spiceai - v0.17.2-beta

Spice v0.17.2-beta (Aug 26, 2024)

The v0.17.2-beta release focuses on improving data accelerator compatibility, stability, and performance. Expanded data type support for DuckDB, SQLite, and PostgreSQL data accelerators (and data connectors) enables significantly more data types to be accelerated. Error handling and logging have also been improved, and several bugs have been fixed.

Highlights in v0.17.2-beta

Expanded Data Type Support for Data Accelerators: DuckDB, SQLite, and PostgreSQL Data Accelerators now support a wider range of data types, enabling acceleration of more diverse datasets.

Enhanced Error Handling and Logging: Improvements have been made to aid in troubleshooting and debugging.

Anonymous Usage Telemetry: Optional, anonymous, aggregated telemetry has been added to help improve Spice. This feature can be disabled. For details about collected data, see the telemetry documentation.

To opt out of telemetry:

  1. Using the CLI flag:

```bash
spice run -- --telemetry-enabled false
```

  2. Add configuration to spicepod.yaml:

```yaml
runtime:
  telemetry:
    enabled: false
```

Improved Benchmarking: A suite of performance benchmarking tests has been added to the project, helping to maintain and improve runtime performance, a top priority for the project.

Breaking Changes

None.

Contributors

  • @Jeadie
  • @y-f-u
  • @phillipleblanc
  • @sgrebnov
  • @Sevenannn
  • @peasee
  • @ewgenius

What's Changed

Dependencies

Commits

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.1-beta...v0.17.2-beta

- Rust
Published by phillipleblanc over 1 year ago

https://github.com/spiceai/spiceai - v0.17.1-beta

Spice v0.17.1-beta (Aug 5, 2024)

The v0.17.1-beta minor release focuses on enhancing stability, performance, and usability. The Flight interface now supports the GetSchema API, and the s3, ftp, sftp, http, https, and databricks data connectors have added support for a client_timeout parameter.

Highlights in v0.17.1-beta

Flight API GetSchema: The GetSchema API is now supported by the Flight interface. The schema of a dataset can be retrieved using GetSchema with the PATH or CMD FlightDescriptor types. The CMD FlightDescriptor type is used to get the schema of an arbitrary SQL query as the CMD bytes. The PATH FlightDescriptor type is used to retrieve the schema of a dataset.

Client Timeout: A client_timeout parameter has been added for Data Connectors: ftp, sftp, http, https, and databricks. When defined, the client timeout configures Spice to stop waiting for a response from the data source after the specified duration. The default timeout is 30 seconds.

```yaml
datasets:
  - from: ftp://remote-ftp-server.com/path/to/folder/
    name: my_dataset
    params:
      file_format: csv
      # Example client timeout
      client_timeout: 30s
      ftp_user: my-ftp-user
      ftp_pass: ${secrets:my_ftp_password}
```

Breaking Changes

TLS is now required to be explicitly enabled. Enable TLS on the command line using --tls-enabled true:

```bash
spice run -- --tls-enabled true --tls-certificate-file /path/to/cert.pem --tls-key-file /path/to/key.pem
```

Or in the spicepod.yml with enabled: true:

```yaml
runtime:
  tls:
    # TLS explicitly enabled
    enabled: true
    certificate_file: /path/to/cert.pem
    key_file: /path/to/key.pem
```

Contributors

  • @Jeadie
  • @y-f-u
  • @phillipleblanc
  • @sgrebnov
  • @peasee
  • @Sevenannn

What's Changed

Dependencies

  • Rust: Upgraded from v1.79.0 to v1.80.0

Commits

  • Update README.md by @Jeadie in https://github.com/spiceai/spiceai/pull/2142
  • update helm chart to 0.17.0-beta by @y-f-u in https://github.com/spiceai/spiceai/pull/2144
  • Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2143
  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2141
  • Update Spice runtime to require explicit enablement for TLS by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2148
  • Update next version, ROADMAP, End Game template, move alpha release notes by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2145
  • Update EXTENSIBILITY to be correct, update README.md with Beta connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2146
  • Add benchmark tests for duckdb acceleration by @sgrebnov in https://github.com/spiceai/spiceai/pull/2151
  • fix: Increase benchmark dataset setup timeout for Databricks by @peasee in https://github.com/spiceai/spiceai/pull/2149
  • Add LLMs to v1/models by @Jeadie in https://github.com/spiceai/spiceai/pull/2152
  • Dataset with acceleration enabled = false shouldn't go through accelerated dataset hot reload by @Sevenannn in https://github.com/spiceai/spiceai/pull/2155
  • Show single error string in Spice SQL REPL command line by @Sevenannn in https://github.com/spiceai/spiceai/pull/2150
  • Add CI to build makefile install targets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2157
  • Make the FlightClient struct cheap to clone by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2162
  • Fix bugs with local Unity Catalog server by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2160
  • Benchmark: data connector tests should continue on query error (s3) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2161
  • fix hanging spiced when odbc loading data and received a cancel signal by @y-f-u in https://github.com/spiceai/spiceai/pull/2156
  • Improve MySql schema extraction and add InList and ScalarFunction expr support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2158
  • Fix issue with use of EmbeddingConnector by @Jeadie in https://github.com/spiceai/spiceai/pull/2165
  • add client timeout for all object store providers by @y-f-u in https://github.com/spiceai/spiceai/pull/2168
  • Benchmark: include sqlite acceleration and enable more tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2172
  • feat: Use datafusion SQLite streaming updates by @peasee in https://github.com/spiceai/spiceai/pull/2171
  • Benchmark: include arrow acceleration and enable more tests (tpch_q22) by @sgrebnov in https://github.com/spiceai/spiceai/pull/2173
  • Localhost -> Sink; Fix Sink connector to not require schema via CREATE TABLE... and infer on first write by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2167
  • Fix misspelled acceleration engine name in benchmark tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2175
  • update spark bench catalog by @y-f-u in https://github.com/spiceai/spiceai/pull/2178
  • Benchmark: Discard first measurement of sql query, disable result caching by @Sevenannn in https://github.com/spiceai/spiceai/pull/2179
  • clear message when invalid params configured for accelerator by @y-f-u in https://github.com/spiceai/spiceai/pull/2177
  • Implement the Flight GetSchema API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2169
  • Support AppendStream for SpiceAI data connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2181
  • Support MySQL BINARY, VARBINARY, Postgres BYTEA and improve MySQL auth error message by @sgrebnov in https://github.com/spiceai/spiceai/pull/2184
  • Benchmark: use SF1 for MySQL TPC-H tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2183
  • fix windows build broken by adding tokio unix signal by @y-f-u in https://github.com/spiceai/spiceai/pull/2193
  • Adds TLS support for flightsubscriber/flightpublisher tools by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2194
  • Update README output samples by @ewgenius in https://github.com/spiceai/spiceai/pull/2195
  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2197

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.17.0-beta...v0.17.1-beta

- Rust
Published by phillipleblanc over 1 year ago

https://github.com/spiceai/spiceai - v0.17.0-beta

Spice v0.17-beta (July 29, 2024)

Announcing the first beta release of Spice.ai OSS! 🎉

The core Spice runtime has graduated from alpha to beta! Components, such as Data Connectors and Models, follow independent release milestones. Data Connectors graduating from alpha to beta include databricks, spiceai, postgres, s3, odbc, and mysql. From beta to 1.0, the project's focus will be on improving performance and scaling to larger datasets.

This release also includes enhanced security with Transport Layer Security (TLS) secured APIs, a new spice install CLI command, and several performance and stability improvements.

Highlights in v0.17-beta

  • Encryption in transit with TLS: The HTTP, gRPC, Metrics, and OpenTelemetry (OTEL) API endpoints can be secured with TLS by specifying a certificate and private key in PEM format.

Enable TLS using the --tls-certificate-file and --tls-key-file command-line flags:

```bash
spice run -- --tls-certificate-file /path/to/cert.pem --tls-key-file /path/to/key.pem
```

Or configure in the spicepod.yml:

```yaml
runtime:
  tls:
    certificate_file: /path/to/cert.pem
    key_file: /path/to/key.pem
```

Get started with TLS by following the TLS Sample. For more details see the TLS Documentation.

  • spice install: Running the spice install CLI command will download and install the latest version of the runtime.

```bash
spice install
```

  • Improved SQLite and DuckDB compatibility: The SQLite and DuckDB accelerators support more complex queries and additional data types.

  • Pass through arguments from spice run to runtime: Arguments passed to spice run are now passed through to the runtime.

  • Secrets replacement within connection strings: Secrets are now replaced within connection strings:

```yaml
datasets:
  - from: mysql:my_table
    name: my_table
    params:
      mysql_connection_string: mysql://user:${secrets:mysql_pw}@localhost:3306/db
```

Breaking Changes

The odbc data connector is now optional and has been removed from the released binaries. To use the odbc data connector, use the official Spice Docker image or build the Spice runtime from source.

To build Spice from source with the odbc feature:

```bash
cargo build --release --features odbc
```

To use the official Spice Docker image from DockerHub:

```bash
# Pull the latest official Spice image
docker pull spiceai/spiceai:latest

# Pull the official v0.17-beta Spice image
docker pull spiceai/spiceai:0.17.0-beta
```

Contributors

  • @y-f-u
  • @peasee
  • @digadeesh
  • @phillipleblanc
  • @ewgenius
  • @sgrebnov
  • @Sevenannn
  • @lukekim

What's Changed

Dependencies

Commits

  • update helm chart versions for v0.16.0-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/2057
  • Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/2060
  • fix: Install unixodbc for E2E test release installation by @peasee in https://github.com/spiceai/spiceai/pull/2063
  • update next release to 0.16.1-beta by @digadeesh in https://github.com/spiceai/spiceai/pull/2065
  • update version to 0.17.0-beta by @digadeesh in https://github.com/spiceai/spiceai/pull/2068
  • Update ROADMAP.md - removing delivered features and updating Beta timeline. by @digadeesh in https://github.com/spiceai/spiceai/pull/2066
  • make bench works for more connectors by @y-f-u in https://github.com/spiceai/spiceai/pull/2042
  • enable spark benchmark by @y-f-u in https://github.com/spiceai/spiceai/pull/2069
  • Make the json_pointer param optional for the GraphQL connector by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2072
  • Fix secrets init to not bail if a secret store can't load by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2073
  • Update end_game.md by @ewgenius in https://github.com/spiceai/spiceai/pull/2059
  • Fix time predicate with timezone info casting for Dremio by @sgrebnov in https://github.com/spiceai/spiceai/pull/2058
  • Add benchmark tests for S3 data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2049
  • Add benchmark tests for MySQL data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/2048
  • fix: Add Athena dialect for ODBC by @peasee in https://github.com/spiceai/spiceai/pull/2084
  • Workflow to build MySQL image with TPCH benchmark data by @sgrebnov in https://github.com/spiceai/spiceai/pull/2070
  • Fix secrets replacement within connection strings by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2086
  • fix: Correctly prefix missing required parameters by @peasee in https://github.com/spiceai/spiceai/pull/2088
  • Add Postgres Data Connector TPCH Benchmark Tests by @Sevenannn in https://github.com/spiceai/spiceai/pull/2009
  • Add spice install CLI command by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2090
  • Use MySQL service container for benchmark tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/2089
  • Remove ODBC from default released binaries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2092
  • Add cfg flag to properly support build w / wo feature in benchmark tests by @Sevenannn in https://github.com/spiceai/spiceai/pull/2095
  • Move Prometheus metrics server to runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2093
  • fix: Remove unixodbc from test release install by @peasee in https://github.com/spiceai/spiceai/pull/2103
  • Upgrade delta_kernel to 0.2.0 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2102
  • Allow DuckDB to load extensions in Docker by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2104
  • Spawn the metrics server in the background. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2105
  • fix: suffix delta kernel table location with slash if none by @y-f-u in https://github.com/spiceai/spiceai/pull/2107
  • Bump object_store from 0.10.1 to 0.10.2 by @dependabot in https://github.com/spiceai/spiceai/pull/2094
  • Decision Record: Default HTTP and GRPC ports for Spice.ai OSS by @digadeesh in https://github.com/spiceai/spiceai/pull/2091
  • Enable TLS for metrics endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2108
  • Use Postgres container for tpch bench by @Sevenannn in https://github.com/spiceai/spiceai/pull/2112
  • Add workflow to build Postgres Docker image using tpch data by @Sevenannn in https://github.com/spiceai/spiceai/pull/2101
  • Enable TLS for HTTP endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2109
  • Enable TLS on the Flight GRPC endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2110
  • add timeout parameters for object store client options by @y-f-u in https://github.com/spiceai/spiceai/pull/2114
  • Enable TLS on the OpenTelemetry GRPC endpoint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2111
  • feat: Add ODBC Databricks Benches by @peasee in https://github.com/spiceai/spiceai/pull/2113
  • Support configuring TLS in the spicepod by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2118
  • add broken tpch simple queries by @y-f-u in https://github.com/spiceai/spiceai/pull/2116
  • Add integration test for TLS by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2121
  • Improve SQLite and DuckDB compatibility by @sgrebnov in https://github.com/spiceai/spiceai/pull/2122
  • Pass through arguments from spice run and spice sql to runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2123
  • Handle TLS in the spice CLI when connecting to the runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2124
  • Handle connecting over TLS for spice sql by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2125
  • Remove --tls flag by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2128
  • fix: Handle SQLResult error instead of unwrapping by @peasee in https://github.com/spiceai/spiceai/pull/2127
  • Add delta bench by @y-f-u in https://github.com/spiceai/spiceai/pull/2120
  • feat: Add Athena ODBC benches by @peasee in https://github.com/spiceai/spiceai/pull/2129
  • fix: Use odbc-api fork for decimal conversion fix by @peasee in https://github.com/spiceai/spiceai/pull/2133
  • Update benchmarks job env for delta testing by @y-f-u in https://github.com/spiceai/spiceai/pull/2134
  • Use forked dotenvy to disable variable substitution by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2135
  • Remove unnecessary memory allocations in the query path by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2136
  • upgrade spiceai df for tpch simple 6 and 7 by @y-f-u in https://github.com/spiceai/spiceai/pull/2137
  • Avoid more unnecessary allocations in the query path by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2138

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.16.0-alpha...v0.17-beta

- Rust
Published by phillipleblanc over 1 year ago

https://github.com/spiceai/spiceai - v0.16.0-alpha

Spice v0.16-alpha (July 22, 2024)

The v0.16-alpha release is the first candidate release for the beta milestone on a path to finalizing the v1.0 developer and user experience. Upgraders should be aware of several breaking changes designed to improve the Secrets configuration experience and to make authoring spicepod.yml files more consistent. See the Breaking Changes section below for details. Additionally, the Spice Java SDK was released, providing Java developers a simple but powerful native experience to query Spice.

Highlights in v0.16-alpha

```yaml
secrets:
  - from: env
    name: env
  - from: aws_secrets_manager:my_secret_name
    name: aws_secret
```

Secrets managed by configured Secret Stores can be referenced in component params using the syntax ${<store_name>:<key>}. E.g.

```yaml
datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_pass: ${ env:MY_PG_PASS }
```
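To make the `${<store_name>:<key>}` syntax concrete, here is a minimal Python sketch of how such references could be resolved. It is an illustration only, not the Spice runtime's implementation: the `resolve_secrets` function, the regular expression, and the in-memory `stores` dictionary are all assumptions made for this example.

```python
import re

# Matches ${store:key}, tolerating spaces inside the braces as in "${ env:MY_PG_PASS }".
SECRET_REF = re.compile(r"\$\{\s*([A-Za-z0-9_]+)\s*:\s*([A-Za-z0-9_]+)\s*\}")


def resolve_secrets(value: str, stores: dict) -> str:
    """Replace every ${store:key} reference in value with the secret it names."""

    def lookup(match: re.Match) -> str:
        store_name, key = match.group(1), match.group(2)
        if store_name not in stores:
            raise KeyError(f"unknown secret store: {store_name}")
        if key not in stores[store_name]:
            raise KeyError(f"secret {key!r} not found in store {store_name!r}")
        return stores[store_name][key]

    return SECRET_REF.sub(lookup, value)


# Hypothetical in-memory stand-in for a configured store named "env".
stores = {"env": {"MY_PG_PASS": "s3cret"}}
print(resolve_secrets("${ env:MY_PG_PASS }", stores))
```

Because the substitution works on any string, the same helper also handles a reference embedded in a longer value, such as a connection string.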

  • Java Client SDK: The Spice Java SDK has been released for JDK 17 or greater.

  • Federated SQL Query: Significant stability and reliability improvements have been made to federated SQL query support in most data connectors.

  • ODBC Data Connector: Providing a specific SQL dialect to query ODBC data sources is now supported using the sql_dialect param. For example, when querying Databricks using ODBC, the databricks dialect can be specified to ensure compatibility. Read the ODBC Data Connector documentation for more details.

Breaking Changes

  • Secret Stores: Secret Stores support has been overhauled including required changes to spicepod.yml schema. File based secrets stored in the ~/.spice/auth file are no longer supported. See Secret Stores Documentation for full reference.

To upgrade Secret Stores, rename any parameters ending in _key to remove the _key suffix and specify a secret inline via the secret replacement syntax (${<secret_store>:<key>}):

```yaml
datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_pass_key: my_pg_pass
```

to:

```yaml
datasets:
  - from: postgres:my_table
    name: my_table
    params:
      pg_host: localhost
      pg_port: 5432
      pg_pass: ${secrets:my_pg_pass}
```

And ensure the MY_PG_PASS environment variable is set.
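For spicepods with many datasets, the rename described above can be scripted. The following Python sketch applies it to a single `params` mapping. The `migrate_key_params` helper is hypothetical, and it assumes every parameter ending in `_key` held a secret name, which should be checked per connector before use.

```python
def migrate_key_params(params: dict, store: str = "secrets") -> dict:
    """Rewrite legacy `<name>_key: <secret>` params as `<name>: ${store:<secret>}`."""
    migrated = {}
    for name, value in params.items():
        if name.endswith("_key"):
            # Drop the suffix and reference the secret inline instead.
            migrated[name[: -len("_key")]] = f"${{{store}:{value}}}"
        else:
            migrated[name] = value
    return migrated


old_params = {"pg_host": "localhost", "pg_port": 5432, "pg_pass_key": "my_pg_pass"}
print(migrate_key_params(old_params))
```

Parameters without the `_key` suffix pass through unchanged, so the helper can be run over a params mapping that has already been migrated.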

  • Datasets: The default value of time_format has changed from unix_seconds to timestamp.

To upgrade:

```yaml
datasets:
  - from:
    name: my_dataset
    # Explicitly define format when not specified.
    time_format: unix_seconds
```
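As a rough illustration of the difference between the two formats, the sketch below converts between them in Python. It assumes `unix_seconds` means whole seconds since the Unix epoch and `timestamp` means a native timestamp value. The function names are hypothetical and not part of Spice.

```python
from datetime import datetime, timezone


def unix_seconds_to_timestamp(seconds: int) -> datetime:
    """Interpret an integer column value as seconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def timestamp_to_unix_seconds(ts: datetime) -> int:
    """Convert a timezone-aware timestamp back to whole seconds since the epoch."""
    return int(ts.timestamp())


ts = unix_seconds_to_timestamp(1721606400)
print(ts.isoformat())                  # 2024-07-22T00:00:00+00:00
print(timestamp_to_unix_seconds(ts))   # 1721606400
```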

  • HTTP Port: The default HTTP port has changed from port 3000 to port 8090 to avoid conflicting with frontend apps which typically use the 3000 range. If an SDK is used, upgrade it at the same time as the runtime.

To upgrade and continue using port 3000, run spiced with the --http command line argument:

```shell
# Using Dockerfile or spiced directly
spiced --http 127.0.0.1:3000
```

  • HTTP Metrics Port: The default HTTP Metrics port has changed from port 9000 to 9090 to avoid conflicting with other metrics protocols which typically use port 9000.

To upgrade and continue using port 9000, run spiced with the --metrics command line argument:

```shell
# Using Dockerfile or spiced directly
spiced --metrics 127.0.0.1:9000
```

To upgrade, change:

```yaml
json_path: my.json.path
```

To:

```yaml
json_pointer: /my/json/pointer
```
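The slash-separated form of the `json_pointer` value matches the JSON Pointer syntax defined in RFC 6901. Assuming that is the intended syntax, the following Python sketch shows how such a pointer selects nested data from a parsed JSON response. The `resolve_json_pointer` function and the sample `response` are illustrative and not taken from Spice.

```python
def resolve_json_pointer(document, pointer: str):
    """Resolve an RFC 6901 JSON Pointer such as "/my/json/pointer" against parsed JSON."""
    if pointer == "":
        return document  # The empty pointer refers to the whole document.
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for raw_token in pointer[1:].split("/"):
        # RFC 6901 escapes: "~1" means "/" and "~0" means "~" (decode "~1" first).
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


# Hypothetical response shape, chosen to match the pointer in the example above.
response = {"my": {"json": {"pointer": [{"id": 1}, {"id": 2}]}}}
print(resolve_json_pointer(response, "/my/json/pointer"))
```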

  • Data Connector Configuration: Consistent connector name prefixing has been applied to connector-specific params parameters. Prefixed parameter names help ensure parameters do not collide.

For example, the Databricks data connector specific params are now prefixed with databricks:

```yaml
datasets:
  - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks unity catalog
    name: my_delta_lake_table
    params:
      mode: spark_connect
      endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com
      token: MY_TOKEN
```

To upgrade:

```yaml
datasets:
  # Example for Spark Connect
  - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks unity catalog
    name: my_delta_lake_table
    params:
      mode: spark_connect
      databricks_endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com # Now prefixed with databricks
      databricks_token: ${secrets:my_token} # Now prefixed with databricks
```

Refer to the Data Connector documentation for parameter naming changes in this release.
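The renaming pattern in the Databricks example can be expressed as a small helper. This is only a sketch: `prefix_connector_params` is a hypothetical name, and the caller must supply the set of connector-specific keys, since which parameters are connector-specific (for example `endpoint` and `token`, but not `mode`) has to be taken from the connector documentation.

```python
def prefix_connector_params(params, connector, connector_keys):
    """Prefix connector-specific parameter names with '<connector>_'."""
    prefix = connector + "_"
    migrated = {}
    for key, value in params.items():
        if key in connector_keys and not key.startswith(prefix):
            migrated[prefix + key] = value
        else:
            migrated[key] = value  # already prefixed, or not connector-specific
    return migrated


if __name__ == "__main__":
    old = {"mode": "spark_connect", "endpoint": "example-host", "token": "MY_TOKEN"}
    print(prefix_connector_params(old, "databricks", {"endpoint", "token"}))
```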

Clickhouse Data Connector: The clickhouse_connection_timeout parameter has been renamed to connection_timeout as it applies to the client and is not Clickhouse configuration itself.

To upgrade, change:

```yaml
clickhouse_connection_timeout: time
```

To:

```yaml
connection_timeout: time
```

Contributors

  • @y-f-u
  • @phillipleblanc
  • @ewgenius
  • @github-actions
  • @sgrebnov
  • @lukekim
  • @digadeesh
  • @peasee
  • @Sevenannn

What's Changed

Dependencies

No major dependency updates.

Commits

  • bump helm chart versions to 0.15.2-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/1975
  • Remove unused Cargo.toml fields by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1981
  • Update version to 0.16.0-beta by @ewgenius in https://github.com/spiceai/spiceai/pull/1983
  • Update spicepod.schema.json by @github-actions in https://github.com/spiceai/spiceai/pull/1984
  • Enable sqlite acceleration testing in E2E by @sgrebnov in https://github.com/spiceai/spiceai/pull/1980
  • Revert "Revert "fix: validate time column and time format when constructing accelerated table refresh"" by @y-f-u in https://github.com/spiceai/spiceai/pull/1982
  • Add Datadog dashboard skeleton by @sgrebnov in https://github.com/spiceai/spiceai/pull/1971
  • Format Cargo.toml with taplo by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1988
  • Spice cli spice chat command, to interact with deployed spiced instance in spice.ai cloud by @ewgenius in https://github.com/spiceai/spiceai/pull/1990
  • Use platform api /v1/chat/completions with streaming in spice chat cli command by @ewgenius in https://github.com/spiceai/spiceai/pull/1998
  • update spiceai datafusion version to fix tpch queries by @y-f-u in https://github.com/spiceai/spiceai/pull/2001
  • Install a rustls default CryptoProvider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2003
  • Roadmap update July, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/2002
  • Add local spice runtime support for spice chat command, add --model flag by @ewgenius in https://github.com/spiceai/spiceai/pull/2007
  • fix: GraphQL Data Connector - Change json path to json pointer by @digadeesh in https://github.com/spiceai/spiceai/pull/1930
  • Update ROADMAP.md to include MySQL data connector in Beta by @digadeesh in https://github.com/spiceai/spiceai/pull/2016
  • Load secrets from multiple secret stores & secrets UX refresh by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2011
  • upgrade spiceai datafusion to fix tpch simple query 3 by @y-f-u in https://github.com/spiceai/spiceai/pull/2021
  • feat: Autodetect ODBC dialect by @peasee in https://github.com/spiceai/spiceai/pull/1997
  • feat: Use CustomDialectBuilder for Databricks ODBC dialect by @peasee in https://github.com/spiceai/spiceai/pull/2020
  • Switch the secret replacement syntax to ${ <secret>:<key> } by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2026
  • fix spiceai connector lengthy error by @y-f-u in https://github.com/spiceai/spiceai/pull/2024
  • Log parameter key instead of value when injecting secret by @Sevenannn in https://github.com/spiceai/spiceai/pull/2031
  • Update benchmark yml to support postgres benchmark test by @Sevenannn in https://github.com/spiceai/spiceai/pull/2032
  • Separate data connector parameters into connector and runtime categories by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2028
  • Fix spice chat prompt and spinner by @ewgenius in https://github.com/spiceai/spiceai/pull/2029
  • Build spiced with odbc for release binaries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2036
  • MySQL timestamp, int64 casting, date part extraction and intervals support by @sgrebnov in https://github.com/spiceai/spiceai/pull/2035
  • updating default http and metrics ports by @digadeesh in https://github.com/spiceai/spiceai/pull/2034
  • enable spark connect federated query by @y-f-u in https://github.com/spiceai/spiceai/pull/2041
  • fix: Use MySQL Interval for Databricks ODBC by @peasee in https://github.com/spiceai/spiceai/pull/2037
  • Re-enable test_quickstart_dremio E2E test by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2045
  • Fix ODBC build for release binaries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2046
  • chore: Remove unused dependencies by @peasee in https://github.com/spiceai/spiceai/pull/2044
  • fix: Change version to alpha breaking by @peasee in https://github.com/spiceai/spiceai/pull/2051
  • Add connector prefix for dataset configure endpoint param by @sgrebnov in https://github.com/spiceai/spiceai/pull/2052
  • Fix unprefixed runtime parameters by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2050
  • Fix make install-with-models by @phillipleblanc in https://github.com/spiceai/spiceai/pull/2054
  • Bump openssl from 0.10.64 to 0.10.66 by @dependabot in https://github.com/spiceai/spiceai/pull/2047
  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/2056
  • ignore empty constraints when creating accelerated table by @y-f-u in https://github.com/spiceai/spiceai/pull/2055

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.15.2-alpha...v0.16.0-alpha

- Rust
Published by digadeesh over 1 year ago

https://github.com/spiceai/spiceai - v0.15.2-alpha

Spice v0.15.2-alpha (July 15, 2024)

The v0.15.2-alpha minor release focuses on enhancing stability and performance, and introduces Catalog Providers for streamlined access to Data Catalog tables. Unity Catalog, Databricks Unity Catalog, and the Spice.ai Cloud Platform Catalog are supported in v0.15.2-alpha. The reliability of federated query push-down has also been improved for the MySQL, PostgreSQL, ODBC, S3, Databricks, and Spice.ai Cloud Platform data connectors.

Highlights in v0.15.2-alpha

Catalog Providers: Catalog Providers streamline access to Data Catalog tables. Initial catalog providers supported are Databricks Unity Catalog, Unity Catalog and Spice.ai Cloud Platform Catalog.

For example, to configure Spice to connect to tpch tables in the Spice.ai Cloud Platform Catalog use the new catalogs: section in the spicepod.yml:

```yaml
catalogs:
  - name: spiceai
    from: spiceai
    include:
      - tpch.*
```

```bash
sql> show tables
+---------------+--------------+---------------+------------+
| table_catalog | table_schema | table_name    | table_type |
+---------------+--------------+---------------+------------+
| spiceai       | tpch         | region        | BASE TABLE |
| spiceai       | tpch         | part          | BASE TABLE |
| spiceai       | tpch         | customer      | BASE TABLE |
| spiceai       | tpch         | lineitem      | BASE TABLE |
| spiceai       | tpch         | partsupp      | BASE TABLE |
| spiceai       | tpch         | supplier      | BASE TABLE |
| spiceai       | tpch         | nation        | BASE TABLE |
| spiceai       | tpch         | orders        | BASE TABLE |
| spice         | runtime      | query_history | BASE TABLE |
+---------------+--------------+---------------+------------+

Time: 0.001866958 seconds. 9 rows.
```
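The effect of an `include` pattern such as `tpch.*` can be approximated as follows. This sketch assumes shell-style glob matching against `schema.table` names and treats an empty include list as "include everything"; both are assumptions for illustration, and `filter_catalog_tables` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def filter_catalog_tables(tables, include):
    """Keep the tables whose 'schema.table' name matches any include pattern."""
    if not include:
        return list(tables)  # assumption: no patterns means no filtering
    return [t for t in tables if any(fnmatchcase(t, p) for p in include)]


if __name__ == "__main__":
    available = ["tpch.region", "tpch.orders", "eth.blocks"]
    print(filter_catalog_tables(available, ["tpch.*"]))
```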

ODBC Data Connector Push-Down: The ODBC Data Connector now supports query push-down for joins, improving performance for joined datasets configured with the same odbc_connection_string.
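One way to decide whether two datasets share a connection, without carrying the raw connection string (which may contain credentials) around in query plans, is to compare a digest of it. The sketch below illustrates that idea only; the use of SHA-256 and the function names are assumptions, not a description of the connector's internals.

```python
import hashlib


def join_push_down_context(connection_string):
    """Derive an opaque identifier for a connection string."""
    return hashlib.sha256(connection_string.encode("utf-8")).hexdigest()


def can_push_down_join(left_connection, right_connection):
    """A join is a push-down candidate when both sides use the same connection."""
    return join_push_down_context(left_connection) == join_push_down_context(right_connection)


if __name__ == "__main__":
    print(can_push_down_join("DSN=warehouse;UID=reader", "DSN=warehouse;UID=reader"))
```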

Improved Spicepod Validation: spicepod.yml validation has been improved and now includes warnings when loading resources with duplicate names (datasets, views, models, embeddings).
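A duplicate-name check of this kind is straightforward to reproduce when linting a spicepod before deployment. The helper below is a hypothetical sketch that operates on components already parsed into dictionaries with a `name` field.

```python
from collections import Counter


def find_duplicate_names(components):
    """Return the sorted names that appear more than once."""
    counts = Counter(component["name"] for component in components)
    return sorted(name for name, count in counts.items() if count > 1)


if __name__ == "__main__":
    datasets = [{"name": "blocks"}, {"name": "orders"}, {"name": "blocks"}]
    for name in find_duplicate_names(datasets):
        print(f"warning: duplicate dataset name '{name}'")
```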

Breaking Changes

None.

Contributors

  • @phillipleblanc
  • @peasee
  • @y-f-u
  • @ewgenius
  • @Sevenannn
  • @sgrebnov
  • @lukekim

What's Changed

Dependencies

Commits

  • Update to next release version v0.15.2-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1901
  • release: Update helm 0.15.1-alpha by @peasee in https://github.com/spiceai/spiceai/pull/1902
  • fix: Detect and error on duplicate component names on spiced (re)load by @peasee in https://github.com/spiceai/spiceai/pull/1905
  • fix: flaky test - test_refresh_status_change_to_ready by @y-f-u in https://github.com/spiceai/spiceai/pull/1908
  • Add support for parsing catalog from Spicepod. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1903
  • Add catalog component to Runtime by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1906
  • Adds a RuntimeBuilder and make most items on Runtime private by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1913
  • Bump zerovec-derive from 0.10.2 to 0.10.3 by @dependabot in https://github.com/spiceai/spiceai/pull/1914
  • Add separate tagged image with enabled models feature by @ewgenius in https://github.com/spiceai/spiceai/pull/1909
  • Update datafusion-table-providers to use newest head by @Sevenannn in https://github.com/spiceai/spiceai/pull/1927
  • Add MySQL support for TPC-H test data generation script by @sgrebnov in https://github.com/spiceai/spiceai/pull/1932
  • fix: Expose ODBC task errors if error is before data stream begins by @peasee in https://github.com/spiceai/spiceai/pull/1924
  • Use public.ecr.aws/docker/library/{postgres/mysql}:latest for integration test images by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1934
  • Implement spice.ai CatalogProvider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1925
  • fix: validate time column and time format when constructing accelerated table refresh by @y-f-u in https://github.com/spiceai/spiceai/pull/1926
  • Add support for filtering tables included by a catalog by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1933
  • Add UnityCatalog catalog provider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1940
  • Implement Databricks catalog provider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1941
  • Copy params into dataset_params by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1947
  • Make integration tests more stable by using logged-in registry during CI by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1955
  • Add integration test for Spice.ai catalog provider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1956
  • Add GET /v1/catalogs API and catalogs CMD by @lukekim in https://github.com/spiceai/spiceai/pull/1957
  • feat: Enable ODBC JoinPushDown with hashed connection string by @peasee in https://github.com/spiceai/spiceai/pull/1954
  • Fix bug: arrow acceleration reports zero results during refresh by @sgrebnov in https://github.com/spiceai/spiceai/pull/1962
  • Revert "fix: validate time column and time format when constructing accelerated table refresh" by @y-f-u in https://github.com/spiceai/spiceai/pull/1964
  • fix: Update arrow-odbc to use our fork for pending fixes by @peasee in https://github.com/spiceai/spiceai/pull/1965
  • Upgrade to DataFusion 40 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1963
  • Do exchange shouldn't require table to be writable by @Sevenannn in https://github.com/spiceai/spiceai/pull/1958
  • Use custom dialect rule for flight federated request by @y-f-u in https://github.com/spiceai/spiceai/pull/1946
  • upgrade datafusion federation to have the table rewrite fix for tpch-q9 by @y-f-u in https://github.com/spiceai/spiceai/pull/1970
  • Create v0.15.2-alpha.md Release notes by @digadeesh in https://github.com/spiceai/spiceai/pull/1969
  • Fix Unity Catalog API response for Azure Databricks by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1973
  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1976

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.15.1-alpha...v0.15.2-alpha

- Rust
Published by digadeesh over 1 year ago

https://github.com/spiceai/spiceai - v0.15.1-alpha

Spice v0.15.1-alpha (July 8, 2024)

The v0.15.1-alpha minor release focuses on enhancing stability, performance, and usability. Memory usage has been significantly improved for the postgres and duckdb acceleration engines which now use stream processing. A new Delta Lake Data Connector has been added, sharing a delta-kernel-rs based implementation with the Databricks Data Connector supporting deletion vectors.

Highlights

Improved memory usage for PostgreSQL and DuckDB acceleration engines: Large dataset acceleration with PostgreSQL and DuckDB engines has reduced memory consumption by streaming data directly to the accelerated table as it is read from the source.

Delta Lake Data Connector: A new Delta Lake Data Connector has been added for using Delta Lake outside of Databricks.

ODBC Data Connector Streaming: The ODBC Data Connector now streams results, reducing memory usage, and improving performance.

GraphQL Object Unnesting: The GraphQL Data Connector can automatically unnest objects from GraphQL queries using the unnest_depth parameter.
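To make the idea of depth-limited unnesting concrete, the sketch below flattens nested objects up to a given depth. The column naming (parent and child joined with an underscore) is an assumption chosen here to avoid name collisions; the connector's actual naming and duplicate handling may differ.

```python
def unnest(record, depth, sep="_"):
    """Flatten nested dictionaries in `record` up to `depth` levels."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, dict) and depth > 0:
            # Pull the child fields up one level, prefixed by the parent key.
            for child_key, child_value in unnest(value, depth - 1, sep).items():
                flat[key + sep + child_key] = child_value
        else:
            flat[key] = value  # scalars, lists, and objects beyond the depth limit
    return flat


if __name__ == "__main__":
    row = {"id": 1, "author": {"name": "a", "org": {"id": 2}}}
    print(unnest(row, depth=1))
```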

Breaking Changes

None.

New Contributors

None.

Contributors

What's Changed

Dependencies

The MySQL, PostgreSQL, SQLite and DuckDB DataFusion TableProviders developed by Spice AI have been donated to the datafusion-contrib/datafusion-table-providers community repository.

Commits

  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1842
  • Update ROADMAP.md - Remove v0.15.0-alpha roadmap items. by @digadeesh in https://github.com/spiceai/spiceai/pull/1843
  • update helm chart for v0.15.0-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/1845
  • update cargo.toml and version.txt to 0.15.1-alpha (for next release) by @digadeesh in https://github.com/spiceai/spiceai/pull/1844
  • Fix check for outdated Cargo.lock & update Cargo.lock by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1846
  • Add Debezium to README by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1847
  • use snmalloc as global allocator by @y-f-u in https://github.com/spiceai/spiceai/pull/1848
  • Various improvements for mistral.rs by @Jeadie in https://github.com/spiceai/spiceai/pull/1831
  • Enable streaming for accelerated tables refresh (common logic) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1863
  • Use in-memory DB pool for DuckDB functions by @Jeadie in https://github.com/spiceai/spiceai/pull/1849
  • Generate Spicepod JSON Schema by @ewgenius in https://github.com/spiceai/spiceai/pull/1865
  • Update http param names by @Jeadie in https://github.com/spiceai/spiceai/pull/1872
  • Replace DuckDB, PostgreSQL, Sqlite and MySQL providers with the datafusion-table-providers crate by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1873
  • Remove more dead code moved to datafusion-table-providers by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1874
  • feat: Optimize ODBC for streaming results by @peasee in https://github.com/spiceai/spiceai/pull/1862
  • Fix how models uses secrets by @Jeadie in https://github.com/spiceai/spiceai/pull/1875
  • fix: Add support for varying duplicate columns behavior in GraphQL unnesting by @peasee in https://github.com/spiceai/spiceai/pull/1876
  • fix: Remove GraphQL duplicate rename support by @peasee in https://github.com/spiceai/spiceai/pull/1877
  • fix: Remove Overwrite GraphQL duplicates behavior by @peasee in https://github.com/spiceai/spiceai/pull/1882
  • fix: Use tokio mpsc channels for ODBC streaming by @peasee in https://github.com/spiceai/spiceai/pull/1883
  • Upgrade table providers to enable DuckDB streaming write by @sgrebnov in https://github.com/spiceai/spiceai/pull/1884
  • Update ROADMAP.md - Add debezium (alpha) to connector list. by @digadeesh in https://github.com/spiceai/spiceai/pull/1880
  • Allow defining user for mysql data connector via secrets by @sgrebnov in https://github.com/spiceai/spiceai/pull/1886
  • Replace delta-rs with delta-kernel-rs and add new delta data connector. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1878
  • Update README images by @lukekim in https://github.com/spiceai/spiceai/pull/1890
  • Handle deletion vectors for delta tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1891
  • Rename delta to delta_lake by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1892
  • Add where is the AI to the FAQ. by @lukekim in https://github.com/spiceai/spiceai/pull/1885
  • update df table providers rev version by @y-f-u in https://github.com/spiceai/spiceai/pull/1889
  • Enable other cloud providers for delta_lake integration by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1893
  • Add CLI parameters for logging into Databricks with Azure/GCP cloud storage by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1894
  • Bump zerovec from 0.10.2 to 0.10.4 by @dependabot in https://github.com/spiceai/spiceai/pull/1896
  • Add 'Content-Type' to metrics exporter to be prometheus exposition format compliant by @sgrebnov in https://github.com/spiceai/spiceai/pull/1897
  • Update enforce-labels.yml so it accepts depdenabot updates with kind/… by @digadeesh in https://github.com/spiceai/spiceai/pull/1898

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.15.0-alpha...v0.15.1-alpha

- Rust
Published by digadeesh over 1 year ago

https://github.com/spiceai/spiceai - v0.15.0-alpha

Spice v0.15-alpha (July 1, 2024)

The v0.15-alpha release introduces support for streaming database changes with Change Data Capture (CDC) into accelerated tables via a new Debezium connector, configurable retry logic for data refresh, and a new C# SDK for building with Spice in Dotnet.

Highlights

  • Debezium data connector with Change Data Capture (CDC): Sync accelerated datasets with Debezium data sources over Kafka in real-time.

  • Data Refresh Retries: By default, accelerated datasets attempt to retry data refreshes on transient errors. This behavior can be configured using refresh_retry_enabled and refresh_retry_max_attempts.

  • C# Client SDK: A new C# Client SDK has been released for developing applications in Dotnet.

Debezium data connector with Change Data Capture (CDC)

Integrating Debezium CDC is straightforward. Get started with the Debezium CDC Sample, read more about CDC in Spice, and read the Debezium data connector documentation.

Example Spicepod using Debezium CDC:

```yaml
datasets:
  - from: debezium:cdc.public.customer_addresses
    name: customer_addresses_cdc
    params:
      debezium_transport: kafka
      debezium_message_format: json
      kafka_bootstrap_servers: localhost:19092
    acceleration:
      enabled: true
      engine: duckdb
      mode: file
      refresh_mode: changes
```
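Conceptually, `refresh_mode: changes` means that each change event is applied to the accelerated table instead of reloading it. The sketch below shows that idea with an in-memory dictionary standing in for the table. It relies on the `op`, `before`, and `after` fields of the Debezium change event payload; the `id` key column and the `apply_change` helper are assumptions for this example.

```python
def apply_change(table, event, key="id"):
    """Apply one Debezium-style change event to a dict keyed by primary key."""
    op = event["op"]
    if op in ("c", "r", "u"):  # create, snapshot read, update
        row = event["after"]
        table[row[key]] = row
    elif op == "d":  # delete
        table.pop(event["before"][key], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")
    return table


if __name__ == "__main__":
    table = {}
    apply_change(table, {"op": "c", "before": None, "after": {"id": 1, "city": "Seoul"}})
    apply_change(table, {"op": "u", "before": {"id": 1, "city": "Seoul"},
                         "after": {"id": 1, "city": "Busan"}})
    print(table)
```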

Data Refresh Retries

Example Spicepod configuration limiting refresh retries to a maximum of 10 attempts:

```yaml
datasets:
  - from: eth.blocks
    name: blocks
    acceleration:
      refresh_retry_enabled: true
      refresh_retry_max_attempts: 10
      refresh_check_interval: 30s
```
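The retry behavior can be pictured with a short sketch. It assumes that `refresh_retry_max_attempts` bounds the total number of attempts and that delays grow along the Fibonacci sequence (a commit title in the v0.14.1-alpha list mentions Fibonacci backoff); the base delay, the `is_transient` predicate, and the function names are hypothetical.

```python
import time


def fibonacci_delays(max_attempts, base_seconds=1.0):
    """Yield delays of 1, 1, 2, 3, 5, ... multiplied by base_seconds."""
    a, b = 1, 1
    for _ in range(max_attempts):
        yield a * base_seconds
        a, b = b, a + b


def refresh_with_retries(refresh, max_attempts, is_transient, sleep=time.sleep):
    """Call refresh() until it succeeds, retrying only on transient errors."""
    delays = fibonacci_delays(max_attempts)
    for attempt in range(1, max_attempts + 1):
        try:
            return refresh()
        except Exception as err:
            # Give up on the last attempt or on a non-transient failure.
            if attempt == max_attempts or not is_transient(err):
                raise
            sleep(next(delays))


if __name__ == "__main__":
    print(list(fibonacci_delays(5)))
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.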

Breaking Changes

None.

New Contributors

  • @rupurt made their first contribution in https://github.com/spiceai/spiceai/pull/1791

Contributors

What's Changed

Dependencies

No major dependency updates.

Commits

  • Update version to 0.15.0-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1784
  • Update helm for v0.14.1-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1786
  • Run PR checks on PRs merging into feature-- branches by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1788
  • Enable retries for accelerated table refresh by @sgrebnov in https://github.com/spiceai/spiceai/pull/1762
  • enable more tpch benchmark queries as a result of decimal unparsing by @y-f-u in https://github.com/spiceai/spiceai/pull/1790
  • add nix flake by @rupurt in https://github.com/spiceai/spiceai/pull/1791
  • Support local and HF embedding models by @Jeadie in https://github.com/spiceai/spiceai/pull/1789
  • fix(bin/spice): Implement custom Unmarshaller for DatasetOrReference by @peasee in https://github.com/spiceai/spiceai/pull/1787
  • For windows, move symlink -> symlink_file. by @Jeadie in https://github.com/spiceai/spiceai/pull/1793
  • docs: Add PULL_REQUEST_TEMPLATE.md by @peasee in https://github.com/spiceai/spiceai/pull/1794
  • Fix Unsupported DataType: conversion for time predicates by @sgrebnov in https://github.com/spiceai/spiceai/pull/1795
  • Use incremental backoff for initial dataset registration retries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1805
  • Basic HTTP/S connector by @Jeadie in https://github.com/spiceai/spiceai/pull/1792
  • Scale support for Snowflake fixed-point numbers by @sgrebnov in https://github.com/spiceai/spiceai/pull/1804
  • bump datafusion federation to resolve the join query failures by @y-f-u in https://github.com/spiceai/spiceai/pull/1806
  • fix: Stream PostgreSQL data in by @peasee in https://github.com/spiceai/spiceai/pull/1798
  • Remove clippy::module_name_repetitions lint by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1812
  • Improve Snowflake fixed-point numbers casting by @sgrebnov in https://github.com/spiceai/spiceai/pull/1809
  • Case insensitive secret getter by @ewgenius in https://github.com/spiceai/spiceai/pull/1813
  • refactor: Format TOML with Taplo by @peasee in https://github.com/spiceai/spiceai/pull/1808
  • feat: Update PR template, add label enforcement in PR by @peasee in https://github.com/spiceai/spiceai/pull/1815
  • fix bug that append may miss updates when the incremental changes are not able to be contained in one record batch by @y-f-u in https://github.com/spiceai/spiceai/pull/1817
  • add integration test for inner join across federated table and accelerated table by @y-f-u in https://github.com/spiceai/spiceai/pull/1811
  • Unify spicepod.llms into spicepod.models and refactor UX of spicepod.models by @Jeadie in https://github.com/spiceai/spiceai/pull/1818
  • Fix issue with querying accelerated tables where the dataset name has a schema by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1823
  • Fix schema support for refresh_sql and improve e2e tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/1826
  • feat: Add GraphQL unnesting by @peasee in https://github.com/spiceai/spiceai/pull/1822
  • fix: Allow kind/optimization labels, increase Postgres test timeout by @peasee in https://github.com/spiceai/spiceai/pull/1830
  • Implement Real-time acceleration updates via Debezium CDC by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1832
  • Remove println statement from PG Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/1835
  • Don't try to "hot reload" Debezium accelerated datasets by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1837
  • Create v1/search that performs vector search. by @Jeadie in https://github.com/spiceai/spiceai/pull/1836
  • Align spicepod UX of embeddings with models by @Jeadie in https://github.com/spiceai/spiceai/pull/1829
  • Add "cmake-build" feature to rdkafka for windows by @Jeadie in https://github.com/spiceai/spiceai/pull/1840
  • Add a better error message when trying to configure refresh_mode=changes on a data connector that doesn't support it. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1839

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.14.1-alpha...v0.15.0-alpha

- Rust
Published by digadeesh over 1 year ago

https://github.com/spiceai/spiceai - v0.14.1-alpha

Spice v0.14.1-alpha (Jun 24, 2024)

The v0.14.1-alpha release is focused on quality, stability, and type support with improvements in PostgreSQL, DuckDB, and GraphQL data connectors.

Highlights

  • PostgreSQL acceleration and data connector: Support for Composite Types and UUID data types.
  • DuckDB acceleration and data connector: Support for LargeUTF8 and DuckDB functions.
  • GraphQL data connector: Improved error handling on invalid query syntax.
  • Refresh SQL: Improved stability when overwriting STRUCT data types.

Breaking Changes

None.

New Contributors

  • @phungleson made their first contribution in https://github.com/spiceai/spiceai/pull/1750
  • @peasee made their first contribution in https://github.com/spiceai/spiceai/pull/1769

Contributors

  • @lukekim
  • @y-f-u
  • @ewgenius
  • @phillipleblanc
  • @Jeadie
  • @sgrebnov
  • @gloomweaver
  • @phungleson
  • @peasee
  • @digadeesh

What's Changed

Dependencies

No major dependency updates.

Commits

  • Update Helm to v0.14.0-alpha by @sgrebnov in https://github.com/spiceai/spiceai/pull/1720
  • Update version to 0.14.1-alpha by @sgrebnov in https://github.com/spiceai/spiceai/pull/1721
  • Use spiceai/async-openai to solve Deserialize issue in v1/embed by @Jeadie in https://github.com/spiceai/spiceai/pull/1707
  • Add greatest least user defined functions by @y-f-u in https://github.com/spiceai/spiceai/pull/1722
  • default timeunit to be seconds when time column is a numeric column by @y-f-u in https://github.com/spiceai/spiceai/pull/1727
  • use system conf to construct dns resolver by @y-f-u in https://github.com/spiceai/spiceai/pull/1728
  • fix a bug that dataset refresh api does not work for table with schema by @y-f-u in https://github.com/spiceai/spiceai/pull/1729
  • Move secret crate to runtime module by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1723
  • Return schema in get_flight_info_simple by @gloomweaver in https://github.com/spiceai/spiceai/pull/1724
  • Refactor vector search component of v1/assist into a VectorSearch struct by @Jeadie in https://github.com/spiceai/spiceai/pull/1699
  • Update ROADMAP.md. Fix a broken link for the "Get in touch" link. by @digadeesh in https://github.com/spiceai/spiceai/pull/1725
  • Secret keys in params should be case insensitive by @ewgenius in https://github.com/spiceai/spiceai/pull/1737
  • expose error log when refresh encountered some issue, also add more debug logs by @y-f-u in https://github.com/spiceai/spiceai/pull/1739
  • Support Struct in PostgreSQL accelerator by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1733
  • rewrite refresh append update dedup logic using arrow comparators by @y-f-u in https://github.com/spiceai/spiceai/pull/1743
  • Add health checks when loading {llms, embeddings} by @Jeadie in https://github.com/spiceai/spiceai/pull/1738
  • Support DuckDB function in DuckDB datasets by @Jeadie in https://github.com/spiceai/spiceai/pull/1742
  • Update version of spiceai/duckdb-rs, support LargeUTF8 by @Jeadie in https://github.com/spiceai/spiceai/pull/1746
  • Split refresh into coordination and execution layers by @sgrebnov in https://github.com/spiceai/spiceai/pull/1744
  • bump duckdb rs git sha to resolve duckdb incorrect null value issue by @y-f-u in https://github.com/spiceai/spiceai/pull/1747
  • cargo.lock file update with #1747 duckdb-rs sha by @y-f-u in https://github.com/spiceai/spiceai/pull/1748
  • Fix error when GraphQL error locations is missing by @phungleson in https://github.com/spiceai/spiceai/pull/1750
  • Tweak refresh scheduling logic by @sgrebnov in https://github.com/spiceai/spiceai/pull/1749
  • Ensure tonic package is in duckdb feature by @Jeadie in https://github.com/spiceai/spiceai/pull/1756
  • Change tonic::async_trait -> async_trait::async_trait by @Jeadie in https://github.com/spiceai/spiceai/pull/1757
  • Streaming in v1/chat/completion by @Jeadie in https://github.com/spiceai/spiceai/pull/1741
  • Add refresh_retry_enabled/max_attempts acceleration params by @sgrebnov in https://github.com/spiceai/spiceai/pull/1753
  • Implement refresh retry based on fibonacci backoff (not enabled) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1752
  • Add VSCode debug target to debug runtime benchmark test by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1760
  • update spiceai datafusion to include more unparser rules by @y-f-u in https://github.com/spiceai/spiceai/pull/1764
  • Show UUID types as String instead of base64 binary. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1767
  • docs: Add linux contributor guide for setup by @peasee in https://github.com/spiceai/spiceai/pull/1769
  • Do not expose connection url on object store error by @ewgenius in https://github.com/spiceai/spiceai/pull/1761
  • Support secrets in llm and embeddings params by @ewgenius in https://github.com/spiceai/spiceai/pull/1770
  • Bump github.com/hashicorp/go-retryablehttp from 0.7.1 to 0.7.7 by @dependabot in https://github.com/spiceai/spiceai/pull/1775
  • Update ROADMAP.md with latest roadmap changes for v0.15.0 by @digadeesh in https://github.com/spiceai/spiceai/pull/1773
  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1776
  • Strip kwarg '=' in DuckDB function parsing by @Jeadie in https://github.com/spiceai/spiceai/pull/1777

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.14.0-alpha...v0.14.1-alpha

- Rust
Published by digadeesh over 1 year ago

https://github.com/spiceai/spiceai - v0.14.0-alpha

Spice v0.14-alpha (June 17, 2024)

The v0.14-alpha release focuses on enhancing accelerated dataset performance and data integrity, with support for configuring primary keys and indexes. Additionally, the GraphQL data connector has been introduced, along with improved dataset registration and loading error information.

Highlights

  • Accelerated Datasets: Ensure data integrity using primary key and unique index constraints. Configure conflict handling to either upsert new data or drop it. Create indexes on frequently filtered columns for faster queries on larger datasets.

  • GraphQL Data Connector: Initial support for using GraphQL as a data source.

Example Spicepod showing how to use primary keys and indexes with accelerated datasets:

```yaml
datasets:
  - from: eth.blocks
    name: blocks
    acceleration:
      engine: duckdb # Use DuckDB acceleration engine
      primary_key: '(hash, timestamp)'
      indexes:
        number: enabled # same as `CREATE INDEX ON blocks (number);`
        '(number, hash)': unique # same as `CREATE UNIQUE INDEX ON blocks (number, hash);`
      on_conflict:
        '(hash, number)': drop # possible values: drop (default), upsert
        '(hash, timestamp)': upsert
```

Primary Keys, constraints, and indexes are currently supported when using SQLite, DuckDB, and PostgreSQL acceleration engines.
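The difference between the `drop` and `upsert` conflict behaviors can be tried out directly in SQLite. The sketch below uses Python's built-in `sqlite3` module with a simplified three-column `blocks` table; it illustrates the semantics only and is not how Spice issues its statements. The `ON CONFLICT` clause requires SQLite 3.24 or newer.

```python
import sqlite3


def load_blocks(rows, on_conflict="drop"):
    """Insert rows into a table with a composite primary key.

    on_conflict="drop" keeps the existing row; "upsert" overwrites it.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE blocks (hash TEXT, timestamp INTEGER, number INTEGER, "
        "PRIMARY KEY (hash, timestamp))"
    )
    insert = "INSERT INTO blocks (hash, timestamp, number) VALUES (?, ?, ?) "
    if on_conflict == "upsert":
        insert += "ON CONFLICT (hash, timestamp) DO UPDATE SET number = excluded.number"
    else:
        insert += "ON CONFLICT (hash, timestamp) DO NOTHING"
    conn.executemany(insert, rows)
    result = conn.execute(
        "SELECT hash, timestamp, number FROM blocks ORDER BY hash, timestamp"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    rows = [("0xa", 1, 10), ("0xa", 1, 11)]
    print(load_blocks(rows, "drop"))
    print(load_blocks(rows, "upsert"))
```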

Learn more with the indexing quickstart and the primary key sample.

Read the Local Acceleration documentation.

Breaking Changes

None.

Contributors

  • @phillipleblanc
  • @ewgenius
  • @sgrebnov
  • @Jeadie
  • @digadeesh
  • @gloomweaver
  • @y-f-u
  • @lukekim
  • @edmondop

What's Changed

Dependencies

  • Apache DataFusion: Upgraded from 38.0.0 to 39.0.0
  • Apache Arrow/Parquet: Upgraded from 51.0.0 to 52.0.0
  • Rust: Upgraded from 1.78.0 to 1.79.0

Commits

  • Update Helm chart for v0.13.3-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1671
  • Bump version to v0.14.0-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1673
  • Dependency upgrades: DataFusion 39, Arrow/Parquet 52, object_store 0.10.1, arrow-odbc 11.1.0 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1674
  • Generate unique runtime instance name and store in runtime.metrics table by @ewgenius in https://github.com/spiceai/spiceai/pull/1678
  • Proper support for Snowflake TIMESTAMP_NTZ by @sgrebnov in https://github.com/spiceai/spiceai/pull/1677
  • Enable tpchq2 and tpchq21 in the benchmark queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1679
  • Start runtime metrics recorder after loading secrets and extensions by @ewgenius in https://github.com/spiceai/spiceai/pull/1680
  • Validate table constraints (Primary Keys/Unique Index) on accelerated tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1658
  • Store labels as JSON string in runtime.metrics by @ewgenius in https://github.com/spiceai/spiceai/pull/1681
  • Atomic updates for DuckDB tables with constraints by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1682
  • Rename metrics column labels to properties and make it nullable by @ewgenius in https://github.com/spiceai/spiceai/pull/1686
  • Fix federationoptimizerrule schema error for tpch_q7, tpch_q8, tpch_q9, tpch_q14 by @sgrebnov in https://github.com/spiceai/spiceai/pull/1683
  • Better prompt for /v1/assist by @Jeadie in https://github.com/spiceai/spiceai/pull/1685
  • Support stream in v1/assist by @Jeadie in https://github.com/spiceai/spiceai/pull/1653
  • Fix cache hit rate chart loading for Grafana v9.5 by @sgrebnov in https://github.com/spiceai/spiceai/pull/1691
  • Update ROADMAP.md to include data connector statuses by @digadeesh in https://github.com/spiceai/spiceai/pull/1684
  • Support primary_key in Spicepod and create in accelerated table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1687
  • Datasets with schema support for availability monitoring by @sgrebnov in https://github.com/spiceai/spiceai/pull/1690
  • Improve dataset registration output by @sgrebnov in https://github.com/spiceai/spiceai/pull/1692
  • Readme: update dataset registration traces by @sgrebnov in https://github.com/spiceai/spiceai/pull/1694
  • Improved error logging for datasets load error by @edmondop in https://github.com/spiceai/spiceai/pull/1695
  • Improve ArrayDistance scalar UDF by @Jeadie in https://github.com/spiceai/spiceai/pull/1697
  • Implement on_conflict behavior for accelerated tables with constraints by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1688
  • Fix datasets live update (Spice file watcher) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1702
  • Grafana Dashboard: replace Quantile with Percentile filter by @sgrebnov in https://github.com/spiceai/spiceai/pull/1703
  • refresh with append overlap by @y-f-u in https://github.com/spiceai/spiceai/pull/1706
  • Fix error message on DuckDB constraint violation by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1709
  • Add warning when configuring indexes/primary_key/on_conflict for Arrow engine. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1710
  • ensure schema to be existing when query timestamp during refresh by @y-f-u in https://github.com/spiceai/spiceai/pull/1711
  • Improve README clarity and add comparison table by @lukekim in https://github.com/spiceai/spiceai/pull/1713
  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1716
  • Update README.md to include GraphQL data connector in supported table by @digadeesh in https://github.com/spiceai/spiceai/pull/1717
  • Fix quoting issue for databricks identifier by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1718

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.3-alpha...v0.14.0-alpha

- Rust
Published by github-actions[bot] over 1 year ago

https://github.com/spiceai/spiceai - v0.13.3-alpha

Spice v0.13.3-alpha (June 10, 2024)

The v0.13.3-alpha release is focused on quality and stability with improvements to metrics, telemetry, and operability.

Highlights

Ready API: Adds the /v1/ready API, which returns success once all datasets and models are loaded and ready.

Enhanced Grafana dashboard: The dashboard now includes charts for query duration and failures, the last update time of accelerated datasets, the count of refresh errors, and the last time the runtime successfully accessed federated datasets.
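A readiness endpoint like /v1/ready is typically polled by a deployment script or health check before traffic is sent. The sketch below shows one way to do that with the Python standard library. The host and port (localhost:3000) are assumptions, as are the helper names; the only behavior taken from the release is that the endpoint returns HTTP 200 once everything has loaded.

```python
import time
import urllib.error
import urllib.request

# Assumption: the runtime's HTTP endpoint listens on localhost:3000.
READY_URL = "http://localhost:3000/v1/ready"


def is_ready(url: str = READY_URL, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as "not ready".
        return False


def wait_until_ready(check=is_ready, attempts: int = 30, delay: float = 1.0) -> bool:
    """Call `check` until it succeeds or the attempts run out."""
    for _ in range(attempts):
        if check():
            return True
        time.sleep(delay)
    return False
```

Passing the check as a parameter keeps the polling loop testable without a running runtime.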

Contributors

  • @Jeadie
  • @ewgenius
  • @phillipleblanc
  • @sgrebnov
  • @gloomweaver
  • @y-f-u
  • @mach-kernel

What's Changed

Dependencies

  • DuckDB 1.0.0: Upgrades embedded DuckDB to 1.0.0.

Commits

  • Scalar UDF array_distance as euclidean distance between Float32[] by @Jeadie in https://github.com/spiceai/spiceai/pull/1601
  • Update version to v0.14.0-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1614
  • Update helm for v0.13.2-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1618
  • Upgrade duckdb-rs to DuckDB 1.0.0 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1615
  • initial idea for 'POST v1/assist' by @Jeadie in https://github.com/spiceai/spiceai/pull/1585
  • openai server trait and move HTTP endpoints to crates/runtime/src/http/v1/ by @Jeadie in https://github.com/spiceai/spiceai/pull/1619
  • Add branching policy & updated endgame instructions by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1620
  • Update Cargo.lock & add CI check for updated Cargo.lock by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1627
  • Add first-class support for views by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1622
  • Add /v1/ready API that returns 200 when all datasets have loaded by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1629
  • Separate NQL logic from LLM Chat messages, and add OpenAI compatibility per LLM trait. by @Jeadie in https://github.com/spiceai/spiceai/pull/1628
  • Log queries failing on getflightinfo step (Flight Api) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1626
  • Graphql Data Connector by @gloomweaver in https://github.com/spiceai/spiceai/pull/1624
  • GraphQL improved Error formatting, proper format request body by @gloomweaver in https://github.com/spiceai/spiceai/pull/1637
  • Fix v1/assist response and panic bug. Include primary keys in response too by @Jeadie in https://github.com/spiceai/spiceai/pull/1635
  • skip integration test if no secret by @y-f-u in https://github.com/spiceai/spiceai/pull/1638
  • [append] Refresher::get_latest_timestamp / get_df to add refresh_sql predicates to scan by @mach-kernel in https://github.com/spiceai/spiceai/pull/1636
  • GraphQL integration test by @gloomweaver in https://github.com/spiceai/spiceai/pull/1600
  • Add err_code to query_failures metric by @sgrebnov in https://github.com/spiceai/spiceai/pull/1639
  • use epoch_ms to replace epoch to work with timestamptz by @y-f-u in https://github.com/spiceai/spiceai/pull/1641
  • fix the schema mismatch issue on the fallback plan use schema casting by @y-f-u in https://github.com/spiceai/spiceai/pull/1642
  • bug report template update by @y-f-u in https://github.com/spiceai/spiceai/pull/1640
  • Add query duration, failures and accelerated dataset metrics to Grafana dashboard by @sgrebnov in https://github.com/spiceai/spiceai/pull/1598
  • Fix FTP/sftp support for ObjectStoreMetadataTable & ObjectStoreTextTable by @Jeadie in https://github.com/spiceai/spiceai/pull/1649
  • Support accelerated embedding tables in v1/assist by @Jeadie in https://github.com/spiceai/spiceai/pull/1648
  • GraphQL pagination, limit pushdown and refactor by @gloomweaver in https://github.com/spiceai/spiceai/pull/1643
  • Support indexes in accelerated tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1644
  • Federated datasets availability monitoring by @sgrebnov in https://github.com/spiceai/spiceai/pull/1650
  • Reset federated dataset availability during dataset registration by @sgrebnov in https://github.com/spiceai/spiceai/pull/1661
  • Change to v0.13.3-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1666
  • Add Time Since Offline chart to Grafana dashboard by @sgrebnov in https://github.com/spiceai/spiceai/pull/1664
  • readme fix to correct the number of rows for show tables by @y-f-u in https://github.com/spiceai/spiceai/pull/1667
  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1668
  • Add missing dependency on arrowsqlgen from duckdb data_component by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1669

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.2-alpha...v0.13.3-alpha

- Rust
Published by phillipleblanc over 1 year ago

https://github.com/spiceai/spiceai - v0.13.2-alpha

Spice v0.13.2-alpha (June 3, 2024)

The v0.13.2-alpha release is focused on quality and stability with improvements to federated query push-down, telemetry, and query history.

Highlights

  • Filesystem Data Connector: Adds the Filesystem Data Connector for directly using files as data sources.

  • Federated Query Push-Down: Improved stability and schema compatibility for federated queries.

  • Enhanced Telemetry: Runtime Metrics now include last update time for accelerated datasets, count of refresh errors, and new metrics for query duration and failures.

  • Query History: Enabled query history logging for Arrow Flight queries in addition to HTTP queries.

Contributors

  • @lukekim
  • @y-f-u
  • @ewgenius
  • @phillipleblanc
  • @Jeadie
  • @Sevenannn
  • @sgrebnov
  • @gloomweaver
  • @mach-kernel

What's Changed

  • Update ROADMAP.md May 27, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1535
  • update helm chart version and use v0.13.1-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/1536
  • version correction in v0.13.1 release note by @y-f-u in https://github.com/spiceai/spiceai/pull/1538
  • update version to v0.14.0-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/1539
  • Update spice_cloud - connect to cloud api by @ewgenius in https://github.com/spiceai/spiceai/pull/1523
  • Update spice_cloud extension params, and remove logging by @ewgenius in https://github.com/spiceai/spiceai/pull/1541
  • Update MSRV to 1.78 and remove unused Rust Version parameter in CI by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1540
  • Improve llm UX in spicepod.yaml by @Jeadie in https://github.com/spiceai/spiceai/pull/1545
  • Store local runtime metrics in Timestamp with nanoseconds precision and UTC time by @ewgenius in https://github.com/spiceai/spiceai/pull/1548
  • Object store metadata Table provider by @Jeadie in https://github.com/spiceai/spiceai/pull/1518
  • Remove clickhouse password requirement by @Sevenannn in https://github.com/spiceai/spiceai/pull/1547
  • pretty print loaded rows number by @y-f-u in https://github.com/spiceai/spiceai/pull/1553
  • Fix UNION ALL federated push down by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1550
  • Update mistral, fix bugs and improve local file DX by @Jeadie in https://github.com/spiceai/spiceai/pull/1552
  • Cast runtime.metrics schema, if remote (spiceai) data connector provided by @ewgenius in https://github.com/spiceai/spiceai/pull/1554
  • Use proper MySQL dialect during federation push-down by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1555
  • parallel load dataset when starting up by @y-f-u in https://github.com/spiceai/spiceai/pull/1551
  • fix linter warning on Scanf return value by @y-f-u in https://github.com/spiceai/spiceai/pull/1556
  • Update spice cloud connect api endpoint by @ewgenius in https://github.com/spiceai/spiceai/pull/1557
  • Create new HTTP endpoint to create embeddings. by @Jeadie in https://github.com/spiceai/spiceai/pull/1558
  • Query History support for Flight API by @sgrebnov in https://github.com/spiceai/spiceai/pull/1549
  • Don't cache queries for runtime tables by @sgrebnov in https://github.com/spiceai/spiceai/pull/1561
  • Fix schema incompatibility on federated push-down queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1560
  • move 'embeddings' to top-level concept in spicepod.yaml by @Jeadie in https://github.com/spiceai/spiceai/pull/1564
  • object_store table provider for UTF8 data formats by @Jeadie in https://github.com/spiceai/spiceai/pull/1562
  • Improve connectivity for JDBC clients, like Tableau by @sgrebnov in https://github.com/spiceai/spiceai/pull/1563
  • Enable datasets from local filesystem by @Jeadie in https://github.com/spiceai/spiceai/pull/1584
  • Adds benchmarking tests for Spice by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1577
  • Push down correct timestamp expr to SQLite, add binary type mapping by @mach-kernel in https://github.com/spiceai/spiceai/pull/1566
  • Add query_duration_seconds and query_failures metrics by @sgrebnov in https://github.com/spiceai/spiceai/pull/1575
  • Use /app as a default workdir in spiceai docker image by @ewgenius in https://github.com/spiceai/spiceai/pull/1586
  • Add support for both file:// and file:/ by @Jeadie in https://github.com/spiceai/spiceai/pull/1587
  • put loaddatasets as the latest step along with startservers by @y-f-u in https://github.com/spiceai/spiceai/pull/1559
  • Embedding columns (from embedding providers) are now run inside datafusion plans. by @Jeadie in https://github.com/spiceai/spiceai/pull/1576
  • Support BinaryArray in DuckDB accelerations by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1595
  • Add cache header to Flight API and Spice REPL indicator by @sgrebnov in https://github.com/spiceai/spiceai/pull/1591
  • Add accelerated datasets refresh metrics by @sgrebnov in https://github.com/spiceai/spiceai/pull/1589
  • update the error when starting spice sql with no runtime to be actionable by @digadeesh in https://github.com/spiceai/spiceai/pull/1597
  • add odbc integration test by @y-f-u in https://github.com/spiceai/spiceai/pull/1590
  • Fix bug in instantiating EmbeddingConnector by @Jeadie in https://github.com/spiceai/spiceai/pull/1592
  • readme change to reflect new cli output by @y-f-u in https://github.com/spiceai/spiceai/pull/1602
  • Update version v0.13.2 by @ewgenius in https://github.com/spiceai/spiceai/pull/1604
  • Roadmap changes Jun 3, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1609

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.1-alpha...v0.13.2

- Rust
Published by ewgenius over 1 year ago

https://github.com/spiceai/spiceai - v0.13.1-alpha

Spice v0.13.1-alpha (May 27, 2024)

The v0.13.1-alpha release of Spice is a minor update focused on stability, quality, and operability. Query result caching provides protection against bursts of queries, and schema support for datasets has been added for logical grouping. An issue where Refresh SQL predicates were not pushed down to underlying data sources has been resolved, and Acceleration Refresh logging has been improved.

Highlights in v0.13.1-alpha

  • Results Caching: Introduced query results caching to handle bursts of requests and support caching of non-accelerated results, such as refresh data returned on zero results. Results caching is enabled by default with a 1s item time-to-live (TTL). Learn more.

  • Query History Logging: Recent queries are now logged in the new spice.runtime.query_history dataset with a default retention of 24 hours. Query history is initially enabled for HTTP queries only (not Arrow Flight queries).

  • Dataset Schemas: Added support for dataset schemas, allowing logical grouping of datasets by separating the schema name from the table name with a dot (.). For example:

```yaml
datasets:
  - from: mysql:app1.identities
    name: app.users

  - from: postgres:app2.purchases
    name: app.purchases
```

In this example, queries against app.users will be federated to app1.identities, and queries against app.purchases will be federated to app2.purchases.
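The query history dataset and the results cache can both be observed over HTTP. The following sketch builds a SQL request and reads the X-Cache response header that this release adds for HTTP queries. It assumes the runtime exposes a /v1/sql route on localhost:3000 that accepts the SQL text as a plain-text POST body; the helper names are hypothetical.

```python
import urllib.request

# Assumptions: HTTP endpoint on localhost:3000 and a /v1/sql route
# that accepts the SQL statement as a plain-text request body.
SQL_URL = "http://localhost:3000/v1/sql"

HISTORY_QUERY = "SELECT * FROM spice.runtime.query_history LIMIT 10"


def build_sql_request(sql: str, url: str = SQL_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the SQL text."""
    return urllib.request.Request(
        url,
        data=sql.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def run_sql(sql: str, url: str = SQL_URL):
    """Send the query and return (response body, X-Cache header or None)."""
    with urllib.request.urlopen(build_sql_request(sql, url), timeout=10) as resp:
        return resp.read().decode("utf-8"), resp.headers.get("X-Cache")
```

Calling `run_sql` twice with the same statement within the cache TTL and comparing the second element of each result is one way to check whether a response was served from the cache.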

Contributors

  • @y-f-u
  • @Jeadie
  • @sgrebnov
  • @ewgenius
  • @phillipleblanc
  • @lukekim
  • @gloomweaver
  • @Sevenannn

New in this release

  • Add more type support on mysql connector by @y-f-u in https://github.com/spiceai/spiceai/pull/1449
  • Add in-memory caching support for Arrow Flight queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1450
  • Fix the table reference to use the full table reference, not just the table by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1460
  • Make file_format parameter required for S3/FTP/SFTP connector by @ewgenius in https://github.com/spiceai/spiceai/pull/1455
  • Add more verbose logging when acceleration refresh update is finished by @y-f-u in https://github.com/spiceai/spiceai/pull/1453
  • Fix snowflake dataset path when using federation query by @y-f-u in https://github.com/spiceai/spiceai/pull/1474
  • Update cargo to use spiceai datafusion fork by @y-f-u in https://github.com/spiceai/spiceai/pull/1475
  • Enable in-memory results caching by default by @sgrebnov in https://github.com/spiceai/spiceai/pull/1473
  • Add basic integration test for MySQL federation by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1477
  • Update results_cache config names per final spec by @sgrebnov in https://github.com/spiceai/spiceai/pull/1487
  • Add DuckDB quickstart to E2E tests by @lukekim in https://github.com/spiceai/spiceai/pull/1461
  • Add X-Cache header for http queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1472
  • Add telemetry for in-memory caching by @sgrebnov in https://github.com/spiceai/spiceai/pull/1456
  • Pin Git dependencies to a specific commit hash by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1490
  • Detect file_format from dataset path by @ewgenius in https://github.com/spiceai/spiceai/pull/1489
  • Add file_format to helm chart sample dataset by @ewgenius in https://github.com/spiceai/spiceai/pull/1493
  • Improve duckdb data connector error messages by @Sevenannn in https://github.com/spiceai/spiceai/pull/1486
  • Add file_format prompt for s3 and ftp datasets in Dataset Configure CLI if no extension detected by @ewgenius in https://github.com/spiceai/spiceai/pull/1494
  • Add llms to the spicepod definition and use throughout by @Jeadie in https://github.com/spiceai/spiceai/pull/1447
  • Fix duckdb acceleration converting null into default values. by @y-f-u in https://github.com/spiceai/spiceai/pull/1500
  • Separate runtime Dataset from spicepod Dataset by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1503
  • Duckdb e2e test OSX support by @y-f-u in https://github.com/spiceai/spiceai/pull/1505
  • Use TableReference for dataset name by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1506
  • Tweak Results Cache naming and output by @lukekim in https://github.com/spiceai/spiceai/pull/1509
  • Fix refresh_sql not properly passing down filters by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1510
  • Allow datasets to specify a schema by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1507
  • Cache invalidation for accelerated tables by @sgrebnov in https://github.com/spiceai/spiceai/pull/1498
  • Improve spark data connector error messages by @Sevenannn in https://github.com/spiceai/spiceai/pull/1497
  • Parse postgres table schema from prepare statement to support empty tables by @ewgenius in https://github.com/spiceai/spiceai/pull/1445
  • Improve clarity of README and add FAQ by @lukekim in https://github.com/spiceai/spiceai/pull/1512
  • Use binary data transfer for ftp by @gloomweaver in https://github.com/spiceai/spiceai/pull/1517
  • Add support for time64 for SQL insertion statement by @y-f-u in https://github.com/spiceai/spiceai/pull/1519
  • Add Spice Extensions PoC by @ewgenius in https://github.com/spiceai/spiceai/pull/1476
  • Add results cache metrics, pod and quantile filters to Grafana dashboard by @sgrebnov in https://github.com/spiceai/spiceai/pull/1513
  • Add unit tests for results caching utils by @sgrebnov in https://github.com/spiceai/spiceai/pull/1514
  • Add E2E tests for results caching by @sgrebnov in https://github.com/spiceai/spiceai/pull/1515
  • Pass tablereference full string into sparksession table so it can query across schemas or catalogs by @y-f-u in https://github.com/spiceai/spiceai/pull/1521
  • Trace on debug level for tables in runtime schema by @ewgenius in https://github.com/spiceai/spiceai/pull/1524
  • Update SparkSessionBuilder::remote and update spark fork hash by @Sevenannn in https://github.com/spiceai/spiceai/pull/1495
  • Fix federation push-down for datasets with schemas by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1526
  • Store history of queries in 'spice.runtime.query_history' by @Jeadie in https://github.com/spiceai/spiceai/pull/1501
  • Disable cache for system queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1528
  • Register runtime tables with runtime schema by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1532
  • Fix acknowledgments workflow to include all cargo features by @Jeadie in https://github.com/spiceai/spiceai/pull/1531

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.13.0-alpha...v0.13.1-alpha

- Rust
Published by y-f-u almost 2 years ago

https://github.com/spiceai/spiceai - v0.13.0-alpha

Spice v0.13-alpha (May 20, 2024)

The v0.13.0-alpha release significantly improves federated query performance and efficiency with Query Push-Down. Query push-down allows SQL queries to be executed directly by underlying data sources, for example when joining tables that use the same data connector. Query push-down is supported for all SQL-based and Arrow Flight data connectors. Additionally, runtime metrics, including query duration, are now collected and can be accessed in the spice.runtime.metrics table. This release also includes a new FTP/SFTP data connector and improved CSV support for the S3 data connector.

Highlights

  • Federated Query Push-Down (#1394): All SQL and Arrow Flight data connectors support federated query push-down.

  • Runtime Metrics (#1361): Runtime metric collection can be enabled using the --metrics flag, and the metrics can be accessed through the spice.runtime.metrics table.

  • FTP & SFTP data connector (#1355) (#1399): Added support for using FTP and SFTP as data sources.

  • Improved CSV support (#1411) (#1414): S3/FTP/SFTP data connectors support CSV files with expanded CSV options.

Contributors

  • @Jeadie
  • @digadeesh
  • @ewgenius
  • @gloomweaver
  • @lukekim
  • @phillipleblanc
  • @sgrebnov
  • @y-f-u

What's Changed

  • Remove milestones from Enhancement template by @lukekim in https://github.com/spiceai/spiceai/pull/1373
  • Update version.txt and Cargo.toml to 0.13.0-alpha by @sgrebnov in https://github.com/spiceai/spiceai/pull/1375
  • Helm chart for Spice v0.12.2-alpha by @sgrebnov in https://github.com/spiceai/spiceai/pull/1374
  • Add release cargo feature to docker builds by @ewgenius in https://github.com/spiceai/spiceai/pull/1377
  • FTP connector by @gloomweaver in https://github.com/spiceai/spiceai/pull/1355
  • Provide ability to specify timeout for s3 data connector by @gloomweaver in https://github.com/spiceai/spiceai/pull/1378
  • clickhouse-rs use tag instead of branch by @gloomweaver in https://github.com/spiceai/spiceai/pull/1313
  • Store runtime metrics in spice.runtime.metrics table by @ewgenius in https://github.com/spiceai/spiceai/pull/1361
  • Update bug_report.md to include the kind/bug label by @digadeesh in https://github.com/spiceai/spiceai/pull/1381
  • Remove redundant [refresh] in log by @lukekim in https://github.com/spiceai/spiceai/pull/1384
  • Implement federation for DuckDB Data Connector (POC) by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1380
  • Update wording for spice cloud connection by @ewgenius in https://github.com/spiceai/spiceai/pull/1386
  • fix dataset refreshing status by @y-f-u in https://github.com/spiceai/spiceai/pull/1387
  • clickhouse friendly error by @y-f-u in https://github.com/spiceai/spiceai/pull/1388
  • Initial work for NQL crate and API by @Jeadie in https://github.com/spiceai/spiceai/pull/1366
  • Fully implement federation for all SqlTable-based Data Connectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1394
  • use df logical plan to query latest timestamp when refreshing incrementally by @y-f-u in https://github.com/spiceai/spiceai/pull/1393
  • Refactor datafusion.write_data to use table reference by @ewgenius in https://github.com/spiceai/spiceai/pull/1402
  • Add federation to FlightTable based DataConnectors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1401
  • SFTP Data Connector by @gloomweaver in https://github.com/spiceai/spiceai/pull/1399
  • Use GPT3.5 for NSQL task by @Jeadie in https://github.com/spiceai/spiceai/pull/1400
  • Update ROADMAP May 16, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1405
  • Add ftp/sftp connector to readme by @gloomweaver in https://github.com/spiceai/spiceai/pull/1404
  • Add FlightSQL federation provider by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1403
  • Refactor runtime metrics to use localhost accelerated table by @ewgenius in https://github.com/spiceai/spiceai/pull/1395
  • Use JSON response in OpenAI, text -> SQL model by @Jeadie in https://github.com/spiceai/spiceai/pull/1407
  • support more common csv options by @y-f-u in https://github.com/spiceai/spiceai/pull/1411
  • add a TLS error message in data connector and implement it for clickhouse by @y-f-u in https://github.com/spiceai/spiceai/pull/1413
  • Add CSV to s3 data formats by @gloomweaver in https://github.com/spiceai/spiceai/pull/1414
  • fix up dependencies now 0.5.0 disappeared by @Jeadie in https://github.com/spiceai/spiceai/pull/1417
  • Add NSQL to FlightRepl by @Jeadie in https://github.com/spiceai/spiceai/pull/1409
  • Update Cargo.lock by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1418
  • Enable spice.ai replication for runtime.metrics table by @ewgenius in https://github.com/spiceai/spiceai/pull/1408
  • Restructure the runtime struct to make it easier to test by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1420
  • Make it easier to construct an App programmatically by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1421
  • Add an integration test for federation by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1426
  • wait 2 seconds for the status to turn ready in refreshing status test by @y-f-u in https://github.com/spiceai/spiceai/pull/1419
  • Add functional tests for federation push-down by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1428
  • Enable push-down federation by default by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1429
  • Add guides and examples about error handling by @ewgenius in https://github.com/spiceai/spiceai/pull/1427
  • Add LRU cache support for http-based queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1410
  • Update README.md - Remove bigquery from table of connectors by @digadeesh in https://github.com/spiceai/spiceai/pull/1434
  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1433
  • CLI wording and logs change reflected on readme by @y-f-u in https://github.com/spiceai/spiceai/pull/1435
  • Add databricks_use_ssl parameter by @Sevenannn in https://github.com/spiceai/spiceai/pull/1406
  • Update helm version and use v0.13.0-alpha by @Jeadie in https://github.com/spiceai/spiceai/pull/1436
  • Don't include feature 'llms/candles' by default by @Jeadie in https://github.com/spiceai/spiceai/pull/1437
  • Correctly map NullBuilder for Null arrow types by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1438
  • Propagate object store error by @gloomweaver in https://github.com/spiceai/spiceai/pull/1415

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.12.2-alpha...v0.13.0-alpha

- Rust
Published by Jeadie almost 2 years ago

https://github.com/spiceai/spiceai - v0.12.2-alpha

Spice v0.12.2-alpha (May 13, 2024)

The v0.12.2-alpha release introduces data streaming and key-pair authentication for the Snowflake data connector, enables general append mode data refreshes for time-series data, improves connectivity error messages, adds nested folders support for the S3 data connector, and exposes nodeSelector and affinity keys in the Helm chart for better Kubernetes management.

Highlights

  • Improved Connectivity Error Messages: Error messages provide clearer, actionable guidance for misconfigured settings or unreachable data connectors.

  • Snowflake Data Connector Improvements: Enables data streaming by default and adds support for key-pair authentication in addition to passwords.

  • API for Refresh SQL Updates: Update dataset Refresh SQL via API.

  • Append Data Refresh: Append mode data refreshes for time-series data are now supported for all data connectors. Specify a dataset time_column with refresh_mode: append to only fetch data more recent than the latest local data.

  • Docker Image Update: The spiceai/spiceai:latest Docker image now includes the ODBC data connector. For a smaller footprint, use spiceai/spiceai:latest-slim.

  • Helm Chart Improvements: nodeSelector and affinity keys are now supported in the Helm chart for improved Kubernetes deployment management.
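The append refresh behavior described above (fetch only data more recent than the latest local data) can be illustrated with a small, self-contained sketch. This is a conceptual model only, not Spice's implementation: rows are plain dictionaries, and the function name and sample column are hypothetical.

```python
def rows_to_append(local_rows, source_rows, time_column):
    """Return the source rows that are strictly newer than the latest local row.

    With no local data yet, every source row is returned (an initial full load).
    """
    if not local_rows:
        return list(source_rows)
    latest = max(row[time_column] for row in local_rows)
    return [row for row in source_rows if row[time_column] > latest]


# Usage: the local copy already holds rows up to ts=3.
local = [{"ts": 1}, {"ts": 3}]
source = [{"ts": 2}, {"ts": 3}, {"ts": 4}, {"ts": 5}]
new_rows = rows_to_append(local, source, "ts")
```

A real refresh would push the equivalent filter down to the data source rather than filtering in memory, so that only the newer rows are transferred.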

Breaking Changes

  • API to trigger accelerated dataset refreshes has changed from POST /v1/datasets/:name/refresh to POST /v1/datasets/:name/acceleration/refresh to be consistent with the Spicepod.yaml structure.
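Clients that trigger refreshes need to switch to the new path. The sketch below builds the request for the new endpoint using the Python standard library. The base URL (localhost:3000) and the helper name are assumptions; only the route itself comes from the note above.

```python
import urllib.parse
import urllib.request

# Assumption: the runtime's HTTP endpoint listens on localhost:3000.
BASE_URL = "http://localhost:3000"


def refresh_request(dataset: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request for the new refresh route."""
    name = urllib.parse.quote(dataset, safe="")
    return urllib.request.Request(
        f"{base_url}/v1/datasets/{name}/acceleration/refresh",
        method="POST",
    )


# Usage: send with urllib.request.urlopen(refresh_request("blocks")).
req = refresh_request("blocks")
```

Requests still sent to the old /v1/datasets/:name/refresh path should be expected to fail after upgrading.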

Contributors

  • @mach-kernel
  • @y-f-u
  • @sgrebnov
  • @ewgenius
  • @Jeadie
  • @Sevenannn
  • @digadeesh
  • @phillipleblanc
  • @lukekim

What's Changed

  • Fix list type support in spark connect by @y-f-u in https://github.com/spiceai/spiceai/pull/1341
  • Add nested folder support in S3 Parquet connector by @y-f-u in https://github.com/spiceai/spiceai/pull/1342
  • Improves S3 connector using DataFusion ListingTable table provider by @y-f-u in https://github.com/spiceai/spiceai/pull/1326
  • Update ROADMAP May 6, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1315
  • List flightsql and snowflake as supported connectors in README.md by @sgrebnov in https://github.com/spiceai/spiceai/pull/1317
  • Helm chart for v0.12.1-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1323
  • Read sqlite_file param and use it as path by @Sevenannn in https://github.com/spiceai/spiceai/pull/1309
  • Compile spiced with release feature in docker image by @ewgenius in https://github.com/spiceai/spiceai/pull/1324
  • Add support for Snowflake key-pair authentication by @sgrebnov in https://github.com/spiceai/spiceai/pull/1314
  • Wrap postgres errors in common DataConnectorError by @ewgenius in https://github.com/spiceai/spiceai/pull/1327
  • Fix TPCH tests runner by @sgrebnov in https://github.com/spiceai/spiceai/pull/1330
  • Spice CLI support for Snowflake key-pair auth by @sgrebnov in https://github.com/spiceai/spiceai/pull/1325
  • sqlproviderdatafusion: Support TimestampMicrosecond, Date32, Date64 by @mach-kernel in https://github.com/spiceai/spiceai/pull/1329
  • Resolve dangling reference for SQLite by @Sevenannn in https://github.com/spiceai/spiceai/pull/1312
  • Select columns from Spark Dataframe according to projected_schema by @Sevenannn in https://github.com/spiceai/spiceai/pull/1336
  • Expose nodeselector and affinity keys in Helm chart by @mach-kernel in https://github.com/spiceai/spiceai/pull/1338
  • Use streaming for Snowflake queries by @sgrebnov in https://github.com/spiceai/spiceai/pull/1337
  • Publish ODBC images by @mach-kernel in https://github.com/spiceai/spiceai/pull/1271
  • Include Postgres acceleration engine to types support tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/1343
  • Refactor dataconnector providers getters to return common DataConnectorResult and DataConnectorError by @ewgenius in https://github.com/spiceai/spiceai/pull/1339
  • s3 csv support to validate the listing table extensibility by @y-f-u in https://github.com/spiceai/spiceai/pull/1344
  • Move model code into separate, feature-flagged crate by @Jeadie in https://github.com/spiceai/spiceai/pull/1335
  • Initial setup for federated queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1350
  • Refactor dbconnection errors, and catch invalid postgres table name case by @ewgenius in https://github.com/spiceai/spiceai/pull/1353
  • Rename default datafusion catalog to "spice", add internal "spice.runtime" schema by @ewgenius in https://github.com/spiceai/spiceai/pull/1359
  • Add API to set Refresh SQL for accelerated table by @sgrebnov in https://github.com/spiceai/spiceai/pull/1356
  • Set next version to v0.12.2 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1367
  • Upgrade to DataFusion 38 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1368
  • Incremental append based on time column by @y-f-u in https://github.com/spiceai/spiceai/pull/1360
  • Update README.md to include correct output when running show tables from quickstart by @digadeesh in https://github.com/spiceai/spiceai/pull/1371

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.12.1-alpha...v0.12.2-alpha

- Rust
Published by github-actions[bot] almost 2 years ago

https://github.com/spiceai/spiceai - v0.12.1-alpha

Spice v0.12.1-alpha (May 6, 2024)

The v0.12.1-alpha release introduces a new Snowflake data connector, support for UUID and TimestampTZ types in the PostgreSQL connector, and improved error messages across all data connectors. The Clickhouse data connector enables data streaming by default. The public SQL interface now restricts DML and DDL queries. Additionally, accelerated tables now fully support NULL values, and issues with schema conversion in these tables have been resolved.

Highlights

  • Snowflake Data Connector: Initial support for Snowflake as a data source.

  • Clickhouse Data Streaming: Enables data streaming by default, eliminating in-memory result collection.

  • Read-only SQL Interface: Disables DML (INSERT/UPDATE/DELETE) and DDL (CREATE/ALTER TABLE) queries for improved data source security.

  • Error Message Improvements: Improved the error messages for commonly encountered issues with data connectors.

  • Accelerated Tables: Supports NULL values across all data types and fixes schema conversion errors for consistent type handling.
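
The read-only SQL interface described above can be pictured as a check on the leading keyword of each statement. The following is a minimal Python sketch of that idea, not the Spice implementation (the runtime is written in Rust and its actual check is not shown in these notes). The function name `is_read_only` and the keyword set are assumptions chosen to match the statement types listed above.

```python
# Illustrative only: a hypothetical helper, not part of Spice.
# Keywords mirror the statement types named in the release notes:
# DML (INSERT/UPDATE/DELETE) and DDL (CREATE/ALTER).
BLOCKED_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "CREATE", "ALTER"}


def is_read_only(sql: str) -> bool:
    """Return True if the statement does not start with a blocked keyword."""
    stripped = sql.strip()
    if not stripped:
        # Nothing to execute, so nothing can be modified.
        return True
    first_word = stripped.split(None, 1)[0].upper()
    return first_word not in BLOCKED_KEYWORDS
```

A keyword check like this is deliberately naive. A production implementation would more likely inspect the parsed statement rather than the raw text.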

Contributors

  • @ahirner
  • @y-f-u
  • @sgrebnov
  • @ewgenius
  • @Jeadie
  • @gloomweaver
  • @Sevenannn
  • @digadeesh
  • @phillipleblanc

What's Changed

  • Add schema types check for query result by @sgrebnov in https://github.com/spiceai/spiceai/pull/1212
  • helm chart for v0.12.0-alpha by @y-f-u in https://github.com/spiceai/spiceai/pull/1235
  • Update acknowledgements by @github-actions in https://github.com/spiceai/spiceai/pull/1232
  • Bump spiceai version to v0.12.1-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1239
  • Update ROADMAP.md - remove v0.12.0-alpha by @ewgenius in https://github.com/spiceai/spiceai/pull/1241
  • Raise errors in InsertBuilder by @Jeadie in https://github.com/spiceai/spiceai/pull/1242
  • Update endgame template by @ewgenius in https://github.com/spiceai/spiceai/pull/1240
  • Add E2E tests for acceleration engines types support by @sgrebnov in https://github.com/spiceai/spiceai/pull/1218
  • Stream blocks to arrow by @gloomweaver in https://github.com/spiceai/spiceai/pull/1203
  • Update enhancement.md to include a checklist item to have a release notes entry for each enhancement by @digadeesh in https://github.com/spiceai/spiceai/pull/1245
  • arrowsqlgen data column conversion by @Sevenannn in https://github.com/spiceai/spiceai/pull/1230
  • Implement the Localhost Data Connector & fix DoPut by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1266
  • Update postgres parameter check by @Sevenannn in https://github.com/spiceai/spiceai/pull/1244
  • Record batch casting to fix SQLite data type issues by @y-f-u in https://github.com/spiceai/spiceai/pull/1261
  • typo fix on Decimal in postgres arrowsqlgen by @y-f-u in https://github.com/spiceai/spiceai/pull/1277
  • Move verifyschema to arrowtools by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1284
  • Support UUID and TimestampTZ type for Postgres as Data Connector by @ahirner & @y-f-u in https://github.com/spiceai/spiceai/pull/1276
  • Fix linter warnings by @ewgenius in https://github.com/spiceai/spiceai/pull/1286
  • Add Snowflake data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/1278
  • Add Snowflake login support (username and password) by @sgrebnov in https://github.com/spiceai/spiceai/pull/1272
  • convert timestamp properly in sql gen by @y-f-u in https://github.com/spiceai/spiceai/pull/1291
  • Add if not exists clause to create statement when creating a table using duckdb acceleration by @digadeesh in https://github.com/spiceai/spiceai/pull/1290
  • Disable DML & DDL queries in the public SQL interface by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1294
  • Refactor duckdb to properly set access_mode for connection by @ewgenius in https://github.com/spiceai/spiceai/pull/1285
  • do not insert batch for sqlite and postgres if no records in the record batch by @y-f-u in https://github.com/spiceai/spiceai/pull/1293
  • Postgres - add custom error message for invalid error table by @ewgenius in https://github.com/spiceai/spiceai/pull/1295
  • SQLite/Accelerators handle null values by @gloomweaver in https://github.com/spiceai/spiceai/pull/1298
  • Add command to attach to running process by @gloomweaver in https://github.com/spiceai/spiceai/pull/1297
  • Use the GITHUB_TOKEN environment variable in the installation script, if available, to avoid rate limiting in CI workflows by @ewgenius in https://github.com/spiceai/spiceai/pull/1302
  • Fix unsupported SSL mode options for PostgreSQL connection string by @ewgenius in https://github.com/spiceai/spiceai/pull/1300
  • Add CLI cmd spice login spark by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1303
  • Check only the latest published release to avoid installing pre-release versions by @ewgenius in https://github.com/spiceai/spiceai/pull/1301
  • Postgres data connector - handle invalid host/port and username/password errors by @ewgenius in https://github.com/spiceai/spiceai/pull/1292
  • Fix the panic on bad clickhouse connection by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1306
  • Improve Snowflake Data Connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/1296

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.12.0-alpha...v0.12.1-alpha

- Rust
Published by phillipleblanc almost 2 years ago

https://github.com/spiceai/spiceai - v0.12-alpha

Spice v0.12-alpha (Apr 29, 2024)

The v0.12-alpha release introduces Clickhouse and Apache Spark data connectors, adds support for limiting refresh data periods for temporal datasets, and includes upgraded Spice Client SDKs compatible with Spice OSS.

Highlights

  • Clickhouse data connector: Use Clickhouse as a data source with the clickhouse: scheme.

  • Apache Spark Connect data connector: Use Apache Spark Connect connections as a data source using the spark: scheme.

  • Refresh data window: Limit accelerated dataset refreshes to a specified window, configured as a duration back from now, for faster and more efficient refreshes.

  • ODBC data connector: Use ODBC connections as a data source using the odbc: scheme. The ODBC data connector is currently optional and not included in default builds. It can be conditionally compiled using the odbc cargo feature when building from source.

  • Spice Client SDK Support: The official Spice SDKs have been upgraded with support for Spice OSS.

Breaking Changes

  • Refresh interval: The refresh_interval acceleration setting has been renamed to refresh_check_interval to make it clearer that it controls how often the refresh check runs, not the interval of data that is refreshed.
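
For configurations that still use the old key, the rename can be applied mechanically. The sketch below is a hypothetical Python migration helper operating on an already-parsed acceleration settings dictionary. The function name and the example value `"10s"` are assumptions for illustration; only the two key names come from the release notes.

```python
# Hypothetical helper, not part of Spice: renames the old acceleration key.
OLD_KEY = "refresh_interval"
NEW_KEY = "refresh_check_interval"


def migrate_acceleration_settings(acceleration: dict) -> dict:
    """Return a copy of the settings with refresh_interval renamed.

    If the new key is already present it is left untouched, and the old
    key is kept so the conflict stays visible to the caller.
    """
    migrated = dict(acceleration)
    if OLD_KEY in migrated and NEW_KEY not in migrated:
        migrated[NEW_KEY] = migrated.pop(OLD_KEY)
    return migrated


# Example with an assumed duration value:
print(migrate_acceleration_settings({"enabled": True, "refresh_interval": "10s"}))
```

The helper returns a copy rather than mutating its input, so the original settings remain available for comparison.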

Contributors

  • @phillipleblanc
  • @Jeadie
  • @ewgenius
  • @sgrebnov
  • @y-f-u
  • @lukekim
  • @digadeesh
  • @gloomweaver
  • @edmondop
  • @mach-kernel

New Contributors

  • Thanks to @mach-kernel who made their first contribution in https://github.com/spiceai/spiceai/pull/1204 by adding the ODBC data connector!

What's Changed

  • Update helm version by @Jeadie in https://github.com/spiceai/spiceai/pull/1167
  • Handle and trace errors in secret stores by @ewgenius in https://github.com/spiceai/spiceai/pull/1149
  • bump the release versions to 0.12.0 by @y-f-u in https://github.com/spiceai/spiceai/pull/1171
  • Don't fail acknowledgments flow if no changes detected by @ewgenius in https://github.com/spiceai/spiceai/pull/1170
  • Allow Spice CLI to control runtime installation on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1173
  • Allow SELECT count(*) for Sqlite Data Accelerator by @sgrebnov in https://github.com/spiceai/spiceai/pull/1166
  • add refresh_period param in acceleration by @y-f-u in https://github.com/spiceai/spiceai/pull/1180
  • Properly support Spark Connect filter pushdown by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1186
  • Avoid rate-limiting on arduino/setup-protoc@v3 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1189
  • Clickhouse DataConnector base implementation by @gloomweaver in https://github.com/spiceai/spiceai/pull/1168
  • rename refresh_interval to refresh_check_interval by @y-f-u in https://github.com/spiceai/spiceai/pull/1190
  • Fix timestamp & add support for Decimal to Databricks/Spark by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1194
  • Convert temporal column and refresh period to datafusion expr by @y-f-u in https://github.com/spiceai/spiceai/pull/1187
  • Hot reload accelerated table on dataset update by @ewgenius in https://github.com/spiceai/spiceai/pull/1195
  • Upgrade DataFusion to 37.1 & DuckDB to 10.2 by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1200
  • Update version.txt for 0.11.2 release by @digadeesh in https://github.com/spiceai/spiceai/pull/1199
  • Clickhouse E2E by @gloomweaver in https://github.com/spiceai/spiceai/pull/1193
  • Clickhouse: fix darwin ci pipeline by @gloomweaver in https://github.com/spiceai/spiceai/pull/1201
  • Add table_type to show tables in Spice SQL & update next version to v0.12.0-alpha by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1206
  • print WARN if time_column does not exists in federated schema by @y-f-u in https://github.com/spiceai/spiceai/pull/1207
  • Add FallbackOnZeroResultsScanExec for executing an input ExecutionPlan and optionally falling back to a TableProvider.scan() if the input has zero results by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1196
  • Clickhouse refactor connection code and set secure option by @gloomweaver in https://github.com/spiceai/spiceai/pull/1198
  • E2E: reusable Spice installation by @sgrebnov in https://github.com/spiceai/spiceai/pull/1205
  • Clickhouse blocktoarrow unit test by @gloomweaver in https://github.com/spiceai/spiceai/pull/1202
  • rename refresh_period to refresh_data_period by @y-f-u in https://github.com/spiceai/spiceai/pull/1210
  • Refactor E2E tests: dataset verification and PostgreSQL installation by @sgrebnov in https://github.com/spiceai/spiceai/pull/1211
  • Add BI dashboard acceleration video to README.md by @lukekim in https://github.com/spiceai/spiceai/pull/1219
  • Improve clarity and consistency of output messages by @lukekim in https://github.com/spiceai/spiceai/pull/1214
  • Update ROADMAP Apr 29, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1220
  • Stand-alone Spark Connect: Isolate Spark Connect from Databricks Connect to make it reusable by @edmondop in https://github.com/spiceai/spiceai/pull/1213
  • Optimize build time in dev mode by @gloomweaver in https://github.com/spiceai/spiceai/pull/1215
  • Feature: Support ODBC reads using unixodbc by @mach-kernel in https://github.com/spiceai/spiceai/pull/1204
  • Use non-fork deltalake by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1223
  • Support Date32 & Date64 in arrowsqlgen by @Jeadie in https://github.com/spiceai/spiceai/pull/1217
  • Update REPL output to be consistent with the latest Spice version by @sgrebnov in https://github.com/spiceai/spiceai/pull/1231
  • rename refresh_data_period to refresh_data_window by @y-f-u in https://github.com/spiceai/spiceai/pull/1233
  • Update README.md to include ODBC, Spark Connect, and Clickhouse data connectors in support data connector matrix. by @digadeesh in https://github.com/spiceai/spiceai/pull/1234
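
The FallbackOnZeroResultsScanExec entry in the list above describes a general pattern: run the primary input and switch to a fallback scan only if the input produced nothing. The following Python generator is a minimal sketch of that pattern under simplifying assumptions. It works on plain iterables rather than DataFusion execution plans, and the name `scan_with_fallback` is hypothetical.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def scan_with_fallback(
    primary: Iterable[T],
    fallback: Callable[[], Iterable[T]],
) -> Iterator[T]:
    """Yield every item from primary; if it is empty, yield from fallback()."""
    produced = False
    for item in primary:
        produced = True
        yield item
    if not produced:
        # The fallback is only invoked when the primary yielded nothing,
        # so its cost is avoided in the common case.
        yield from fallback()


print(list(scan_with_fallback([], lambda: ["from fallback"])))
```

Passing the fallback as a callable rather than as a ready-made iterable keeps it lazy: no fallback scan is started unless it is needed.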

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.11.1-alpha...v0.12.0-alpha

- Rust
Published by ewgenius almost 2 years ago

https://github.com/spiceai/spiceai - 0.11.1-alpha

Spice v0.11.1-alpha (Apr 22, 2024)

The v0.11.1-alpha release introduces retention policies for accelerated datasets, native Windows installation support, and integration of catalog and schema settings for the Databricks Spark connector. Several bugs have also been fixed for improved stability.

Highlights

  • Retention Policies for Accelerated Datasets: Automatic eviction of data from accelerated time-series datasets when the value in a specified temporal column is older than the retention period, optimizing resource utilization.

  • Windows Installation Support: Native Windows installation support, including upgrades.

  • Databricks Spark Connect Catalog and Schema Settings: Improved translation between DataFusion and Spark, providing better Spark Catalog support.
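
The retention policy highlight above amounts to a cutoff on a time column. As a rough illustration, the Python sketch below evicts rows older than a retention period from an in-memory list of dictionaries. It is not the Spice implementation; the function name, the row representation, and the `now` parameter are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def evict_expired(rows, time_column, retention_period, now=None):
    """Return (kept_rows, evicted_count) for rows older than the retention period.

    rows: list of dicts, each holding a timezone-aware datetime in time_column.
    retention_period: timedelta describing how long rows are retained.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - retention_period
    kept = [row for row in rows if row[time_column] >= cutoff]
    return kept, len(rows) - len(kept)
```

Accepting `now` as a parameter makes the cutoff deterministic, which is convenient when testing a retention rule against fixed timestamps.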

Contributors

  • @phillipleblanc
  • @Jeadie
  • @ewgenius
  • @sgrebnov
  • @y-f-u
  • @lukekim
  • @digadeesh
  • @Sevenannn
  • @gloomweaver

New in this release

  • PowerShell script to install Spice on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1128
  • Support catalog and schema in Databricks Spark Connect by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1137
  • Retention handlers by @y-f-u in https://github.com/spiceai/spiceai/pull/1096

What's Changed

  • Update CONTRIBUTING with new dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1121
  • Fix the Helm tag by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1122
  • Upgrade Spice version to 0.11.1 by @sgrebnov in https://github.com/spiceai/spiceai/pull/1123
  • Remove 0.11 from roadmap by @ewgenius in https://github.com/spiceai/spiceai/pull/1124
  • Include refresh_sql and manual refresh to e2e tests by @sgrebnov in https://github.com/spiceai/spiceai/pull/1125
  • Respect executables file extension on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1130
  • Use quoted strings when performing federated SQL queries by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1129
  • Make Windows artifact names consistent with other platforms by @sgrebnov in https://github.com/spiceai/spiceai/pull/1132
  • Make Windows installation less verbose by @sgrebnov in https://github.com/spiceai/spiceai/pull/1138
  • Document Windows installation and add test by @sgrebnov in https://github.com/spiceai/spiceai/pull/1134
  • Use transaction for DuckDB Table Writer by @Sevenannn in https://github.com/spiceai/spiceai/pull/1135
  • Update Windows installation script url by @sgrebnov in https://github.com/spiceai/spiceai/pull/1143
  • Update roadmap Apr 18, 2024 by @lukekim in https://github.com/spiceai/spiceai/pull/1142
  • Test connection when new connection pool created by @ewgenius in https://github.com/spiceai/spiceai/pull/1126
  • Enable clippy::clone_on_ref_ptr by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1146
  • Allow only alphanumeric dataset names when using spice dataset configure by @ewgenius in https://github.com/spiceai/spiceai/pull/1140
  • Extend PR check to build with no default features, and each individual feature by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1156
  • Bump rustls from 0.21.10 to 0.21.11 by @dependabot in https://github.com/spiceai/spiceai/pull/1150
  • Serde rule for ISO8601 time format by @y-f-u in https://github.com/spiceai/spiceai/pull/1151
  • Add static linking for vcruntime dependencies on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1152
  • Use clearer retention param key - retention_check_enabled instead by @y-f-u in https://github.com/spiceai/spiceai/pull/1158
  • spice upgrade on Windows by @sgrebnov in https://github.com/spiceai/spiceai/pull/1155

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.11.0-alpha...v0.11.1-alpha

- Rust
Published by y-f-u almost 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.11.0-alpha

The Spice v0.11.0-alpha release significantly improves the Databricks data connector with Databricks Connect (Spark Connect) support, adds the DuckDB data connector, and adds the AWS Secrets Manager secret store. In addition, enhanced control over accelerated dataset refreshes, improved SSL security for MySQL and PostgreSQL connections, and overall stability improvements have been added.

Highlights in v0.11.0-alpha

DuckDB data connector: Use DuckDB databases or connections as a data source.

AWS Secrets Manager Secret Store: Use AWS Secrets Manager as a secret store.

Custom Refresh SQL: Specify a custom SQL query for dataset refresh using refresh_sql.

Dataset Refresh API: Trigger a dataset refresh using the new CLI command spice refresh or via API.

Expanded SSL support for Postgres: SSL mode now supports disable, require, prefer, verify-ca, verify-full options with the default mode changed to require. Added pg_sslrootcert parameter for setting a custom root certificate and the pg_insecure parameter is no longer supported.

Databricks Connect: Choose between using Spark Connect or Delta Lake when using the Databricks data connector for improved performance.

Internal architecture refactor: The internal architecture of spiced was refactored to simplify the creation of data components and to improve alignment with DataFusion concepts.
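
To make the Postgres SSL changes in the highlights above concrete, the sketch below resolves an SSL mode from a dictionary of connector parameters. The list of supported modes, the `require` default, and the removal of `pg_insecure` come from these notes. The parameter name `pg_sslmode` and the function `resolve_ssl_mode` are assumptions for illustration, and this is not the connector's actual code.

```python
# Modes and default as listed in the v0.11.0-alpha release notes.
SUPPORTED_SSL_MODES = ("disable", "require", "prefer", "verify-ca", "verify-full")
DEFAULT_SSL_MODE = "require"


def resolve_ssl_mode(params: dict) -> str:
    """Return the SSL mode to use, validating it against the supported modes.

    `pg_sslmode` is an assumed parameter name used only for this sketch.
    """
    if "pg_insecure" in params:
        raise ValueError("pg_insecure is no longer supported; set an SSL mode instead")
    mode = params.get("pg_sslmode", DEFAULT_SSL_MODE)
    if mode not in SUPPORTED_SSL_MODES:
        raise ValueError(f"unsupported SSL mode: {mode!r}")
    return mode
```

Rejecting the removed parameter explicitly, rather than silently ignoring it, surfaces outdated configurations early.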

New Contributors

@edmondop made their first contribution in https://github.com/spiceai/spiceai/pull/1110!

Contributors

  • @phillipleblanc
  • @Jeadie
  • @ewgenius
  • @sgrebnov
  • @y-f-u
  • @lukekim
  • @digadeesh
  • @Sevenannn
  • @gloomweaver
  • @ahirner

New in this release

  • Fixes MySQL NULL values by @gloomweaver in https://github.com/spiceai/spiceai/pull/1067
  • Fixes PostgreSQL NULL values for NUMERIC by @gloomweaver in https://github.com/spiceai/spiceai/pull/1068
  • Adds Custom Refresh SQL support by @lukekim and @phillipleblanc in https://github.com/spiceai/spiceai/pull/1073
  • Adds DuckDB data connector by @Sevenannn in https://github.com/spiceai/spiceai/pull/1085
  • Adds AWS Secrets Manager secret store by @sgrebnov in https://github.com/spiceai/spiceai/pull/1063, https://github.com/spiceai/spiceai/pull/1064
  • Adds Dataset refresh API by @sgrebnov in https://github.com/spiceai/spiceai/pull/1075, https://github.com/spiceai/spiceai/pull/1078, https://github.com/spiceai/spiceai/pull/1083
  • Adds spice refresh CLI command for dataset refresh by @sgrebnov in https://github.com/spiceai/spiceai/pull/1112
  • Adds TEXT and DECIMAL type support and proper NULL handling for MySQL by @gloomweaver in https://github.com/spiceai/spiceai/pull/1067
  • Adds DATE and TINYINT type support for MySQL by @ewgenius in https://github.com/spiceai/spiceai/pull/1065
  • Adds ssl_rootcert_path parameter for MySql data connector by @ewgenius in https://github.com/spiceai/spiceai/pull/1079
  • Adds LargeUtf8 support and explicitly passing the schema to data accelerator SqlTable by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1077
  • Adds Ability to configure data retention for accelerated datasets by @y-f-u in https://github.com/spiceai/spiceai/issues/1086
  • Adds Custom SSL certificates for PostgreSQL data connector by @ewgenius in https://github.com/spiceai/spiceai/pull/1081
  • Adds Conditional compile for Dremio by @ahirner in https://github.com/spiceai/spiceai/pull/1100
  • Adds Ability for Databricks connector to use spark-connect-rs as the mechanism to execute queries against Databricks by @edmondop in https://github.com/spiceai/spiceai/pull/1110
  • Adds Ability to choose between Spark Connect and Delta Lake implementation for Databricks by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1115/files
  • Updates Databricks login parameters by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1113
  • Updates Architecture to simplify data components development by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1040
  • Updates Improved readability of GitHub Actions test job names by @lukekim in https://github.com/spiceai/spiceai/pull/1071
  • Updates Upgrade Arrow, DataFusion, Tonic dependencies by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1097
  • Updates Handling non-string spicepod params by @ewgenius in https://github.com/spiceai/spiceai/pull/1098
  • Updates Optional features compile: duckdb, databricks by @ahirner in https://github.com/spiceai/spiceai/pull/1100
  • Updates Helm version to 0.1.3 by @Jeadie in https://github.com/spiceai/spiceai/pull/1120
  • Removes pg_insecure parameter support from Postgres by @ewgenius in https://github.com/spiceai/spiceai/pull/1081

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.10.2-alpha...v0.11.0-alpha

- Rust
Published by sgrebnov almost 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.10.2-alpha

The v0.10.2-alpha release adds the MySQL data connector and makes external data connections more robust on initialization.

Highlights in v0.10.2-alpha

  • MySQL data connector: Connect to any MySQL server, including SSL support.

  • Data connections verified at initialization: Verify endpoints and authorization for external data connections (e.g. databricks, spice.ai) at initialization.

New Contributors

  • @rthomas made their first contribution in https://github.com/spiceai/spiceai/pull/1022
  • @ahirner made their first contribution in https://github.com/spiceai/spiceai/pull/1029
  • @gloomweaver made their first contribution in https://github.com/spiceai/spiceai/pull/1004

Contributors

  • @phillipleblanc
  • @y-f-u
  • @ewgenius
  • @sgrebnov
  • @lukekim
  • @digadeesh
  • @Jeadie

New in this release

  • Adds MySQL data connector by @gloomweaver in https://github.com/spiceai/spiceai/pull/1004
  • Fixes show tables; parsing in the Spice SQL repl.
  • Adds data connector verification at initialization
    • For Dremio by @sgrebnov in https://github.com/spiceai/spiceai/pull/1017
    • For Databricks by @sgrebnov in https://github.com/spiceai/spiceai/pull/1019
    • For Spice.ai by @sgrebnov in https://github.com/spiceai/spiceai/pull/1020
  • Fixes Ensures unit and doc tests compile and run by @rthomas in https://github.com/spiceai/spiceai/pull/1022
  • Improves Helm chart + Grafana dashboard by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1030
  • Fixes Makes data connectors optional features by @ahirner in https://github.com/spiceai/spiceai/pull/1029
  • Fixes Fixes SpiceAI E2E for external contributors in Github actions by @ewgenius in https://github.com/spiceai/spiceai/pull/1023
  • Fixes remove hardcoded lookback_size (& improve SpiceAI's ModelSource) by @Jeadie in https://github.com/spiceai/spiceai/pull/1016

Full Changelog: https://github.com/spiceai/spiceai/compare/v0.10.1-alpha...v0.10.2-alpha

- Rust
Published by Jeadie almost 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.10.1-alpha

The v0.10.1-alpha release focuses on stability, bug fixes, and usability by improving error messages when using SQLite data accelerators, improving the PostgreSQL support, and adding a basic Helm chart.

Highlights in v0.10.1-alpha

Improved PostgreSQL support for Data Connectors: TLS is now supported with PostgreSQL Data Connectors, and VARCHAR and BPCHAR conversions through Spice have been improved.

Improved Error messages: Simplified error messages from Spice when propagating errors from Data Connectors and Accelerator Engines.

Spice Pods Command: The spice pods command gives quick statistics about the models, dependencies, and datasets loaded by the Spice runtime.

Contributors

  • @phillipleblanc
  • @mitchdevenport
  • @ewgenius
  • @sgrebnov
  • @lukekim
  • @digadeesh

New in this release

  • Adds Basic Helm Chart for spiceai (https://github.com/spiceai/spiceai/pull/1002)
  • Adds Support for spice login in environments with no browser. (https://github.com/spiceai/spiceai/pull/994)
  • Adds TLS support in Postgres connector. (https://github.com/spiceai/spiceai/pull/970)
  • Fixes Improve Postgres VARCHAR and BPCHAR conversion. (https://github.com/spiceai/spiceai/pull/993)
  • Fixes spice pods Returns incorrect counts. (https://github.com/spiceai/spiceai/pull/998)
  • Fixes Return friendly error messages for unsupported types in sqlite. (https://github.com/spiceai/spiceai/pull/982)
  • Fixes Pass Tonic errors when receiving errors from dependencies. (https://github.com/spiceai/spiceai/pull/995)

- Rust
Published by digadeesh almost 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.10-alpha

Announcing the release of Spice.ai v0.10-alpha! 🎉

The Spice.ai v0.10-alpha release focused on additions and updates to improve stability, usability, and the overall Spice developer experience.

Highlights in v0.10-alpha

Public Bucket Support for S3 Data Connector: The S3 Data Connector now supports public buckets in addition to buckets requiring an access id and key.

JDBC-Client Connectivity: Improved connectivity for JDBC clients, like Tableau.

User Experience Improvements:

  • Friendlier error messages across the board to make debugging and development better.
  • Added a spice login postgres command, streamlining the process for connecting to PostgreSQL databases.
  • Added PostgreSQL connection verification and connection string support, enhancing usability for PostgreSQL users.

Grafana Dashboard: A standard Grafana dashboard is now available, improving the ability to monitor Spice deployments.

Contributors

  • @phillipleblanc
  • @mitchdevenport
  • @Jeadie
  • @ewgenius
  • @sgrebnov
  • @y-f-u
  • @lukekim
  • @digadeesh

New in this release

  • Fixes Gracefully handle Arrow Flight DoExchange connection resets
  • Adds Grafana Dashboard
  • Adds Flight SQL CommandGetTableTypes Command support (improves JDBC-client connectivity)
  • Adds Friendlier error messages
  • Adds spice login postgres command
  • Adds PostgreSQL connection verification
  • Adds PostgreSQL connection string support
  • Adds Linux aarch64 build
  • Updates Improves spice status with dataset metrics
  • Updates CLI REPL improved show tables output
  • Updates CLI REPL limit output to 500 rows
  • Updates Improved README.md with architecture diagram updates
  • Updates Improved CI run time.
  • Updates Use macOS hosted Actions runner

- Rust
Published by phillipleblanc almost 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.9.1-alpha

The v0.9.1 release focused on stability, bug fixes, and usability by adding spice CLI commands for listing Spicepods (spice pods), Models (spice models), Datasets (spice datasets), and improved status (spice status) details. In addition, the Arrow Flight SQL (flightsql) data connector and SQLite (sqlite) data store were added.

Highlights in v0.9.1-alpha

FlightSQL data connector: Arrow Flight SQL can now be used as a connector for federated SQL query.

SQLite data backend: SQLite can now be used as a data store for acceleration.

Contributors

  • @phillipleblanc
  • @mitchdevenport
  • @Jeadie
  • @ewgenius
  • @sgrebnov
  • @y-f-u
  • @lukekim

New in this release

  • Adds FlightSQL data connector (flightsql).
  • Adds SQLite data store, supports both in-memory and file based (sqlite).
  • Adds support for date, varchar, bpchar, and primitive list types for the PostgreSQL data connector and data store.
  • Adds spice pods, spice status, spice datasets, and spice models CLI commands.
  • Adds GET /v1/spicepods API for listing loaded Spicepods.
  • Adds spiced Docker CI build and release.
  • Adds E2E tests for release installation and local acceleration.
  • Adds E2E tests and instructions to run basic TPC-H benchmark tests.
  • Adds linux/arm64 binary build.
  • Fixes spice sql REPL panics when query result is too large. (https://github.com/spiceai/spiceai/pull/875)
  • Fixes --access-secret in spice s3 login. (https://github.com/spiceai/spiceai/pull/894)
  • Fixes version check upgrade logic.

- Rust
Published by y-f-u almost 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.9-alpha

The v0.9 release adds several data connectors including the Spice data connector for the ability to connect to other spiced instances. Improved observability for spiced has been added with the new /metrics endpoint for monitoring deployed instances.

Highlights in v0.9-alpha

Arrow Flight SQL endpoint: The Arrow Flight endpoint now supports Flight SQL, including JDBC, ODBC, and ADBC enabling database clients like DBeaver or BI applications like Tableau to connect to and query the Spice runtime.

Spice.ai data connector: Use other Spice runtime instances as data connectors for federated SQL query across Spice deployments and for chaining Spice runtimes.

Keyring secret store: Use the operating system's native credential store, such as macOS Keychain, for storing secrets used by spiced.

PostgreSQL data connector: PostgreSQL can now be used as both a data store for acceleration and as a connector for federated SQL query.

Databricks data connector: Databricks as a connector for federated SQL query across Delta Lake tables.

S3 data connector: S3 as a connector for federated SQL query across Parquet files stored in S3.

Metrics endpoint: Added new /metrics endpoint for spiced observability and monitoring with the following metrics:

  • spiced_runtime_http_server_start counter
  • spiced_runtime_flight_server_start counter
  • datasets_count gauge
  • load_dataset summary
  • load_secrets summary
  • datasets/load_error counter
  • datasets/count counter
  • models/load_error counter
  • models/count counter

Contributors

  • @phillipleblanc
  • @mitchdevenport
  • @Jeadie
  • @ewgenius
  • @sgrebnov
  • @Sevenannn
  • @y-f-u
  • @digadeesh
  • @lukekim

New in this release

  • Adds Keyring secret store (keyring).
  • Adds PostgreSQL data connector (postgres).
  • Adds Spice.ai data connector (spiceai).
  • Adds Arrow Flight SQL (JDBC/ODBC/ADBC) support.
  • Adds Databricks data connector (databricks) - Delta Lake support.
  • Adds S3 data connector (s3) - Parquet support.
  • Adds /v1/models API.
  • Adds /v1/status API.
  • Adds /metrics API.

- Rust
Published by sgrebnov almost 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.8-alpha

Announcing the release of Spice v0.8-alpha! 🏹

This is a minor release that builds on the new Rust-based runtime, adding stability and a preview of new features for the first major release.

Highlights in v0.8-alpha

Secrets management: Spice 0.8 runtime can now configure and retrieve secrets from local environment variables and in a Kubernetes cluster.

Data tables can be locally accelerated using PostgreSQL.

New in this release

  • Adds Secrets management in local environment variables and Kubernetes clusters.
  • Adds (Preview) PostgreSQL as a data table acceleration engine.

- Rust
Published by ewgenius almost 2 years ago

https://github.com/spiceai/spiceai - Spice v0.7-alpha

Announcing the release of Spice v0.7-alpha! 🏹

Spice v0.7-alpha is an all new implementation of Spice written in Rust. The Spice v0.7 runtime provides developers with a unified SQL query interface to locally accelerate and query data tables sourced from any database, data warehouse, or data lake.

Learn more and get started in minutes with the updated Quickstart in the repository README!

Highlights in v0.7-alpha

DataFusion SQL Query Engine: Spice v0.7 leverages the Apache DataFusion query engine to provide very fast, high quality SQL query across one or more local or remote data sources.

Data tables can be locally accelerated using Apache Arrow in-memory or by DuckDB.

New in this release

  • Adds runtime rewritten in Rust for high-performance.
  • Adds Apache DataFusion SQL query engine.
  • Adds The Spice.ai platform as a data source.
  • Adds Dremio as a data source.
  • Adds OpenTelemetry (OTEL) collector.
  • Adds local data table acceleration.
  • Adds DuckDB file or in-memory as a data table acceleration engine.
  • Adds In-memory Apache Arrow as a data table acceleration engine.
  • Removes the built-in AI training engine; now cloud-based and provided by the Spice.ai platform.
  • Removes the built-in dashboard and web-interface; now cloud-based and provided by the Spice.ai platform.

- Rust
Published by phillipleblanc about 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.6.2-alpha

Announcing the release of Spice.ai v0.6.2-alpha! 🐞

This release fixes a bug in the CLI that prevented users from adding Spicepods from spicerack.org

- Rust
Published by phillipleblanc over 2 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.6.1-alpha

Announcing the release of Spice.ai v0.6.1-alpha! 🌶

Building upon the Apache Arrow support in v0.6-alpha, Spice.ai now includes new Apache Arrow data processor and Apache Arrow Flight data connector components! Together, these create a high-performance bulk-data transport directly into the Spice.ai ML engine. Coupled with big data systems from the Apache Arrow ecosystem like Hive, Drill, Spark, Snowflake, and BigQuery, it's now easier than ever to combine big data with Spice.ai.

And we're also excited to announce the release of Spice.xyz! 🎉

Spice.xyz is data and AI infrastructure for web3. It’s web3 data made easy. Insanely fast and purpose designed for applications and ML.

Spice.xyz delivers data in Apache Arrow format, over high-performance Apache Arrow Flight APIs to your application, notebook, ML pipeline, and of course through these new data components, to the Spice.ai runtime.

Read the announcement post at blog.spice.ai.

New in this release

Now built with Go 1.18.

Dependency updates

  • Updates to React 18
  • Updates to CRA 5
  • Updates to Glide DataGrid 4
  • Updates to SWR 1.2
  • Updates to TypeScript 4.6

- Rust
Published by lukekim almost 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.6-alpha

Announcing the release of Spice.ai v0.6-alpha! 🏹

Spice.ai now scales to datasets 10-100x larger, enabling new classes of use cases and applications! 🚀 We've completely rebuilt Spice.ai's data processing and transport upon Apache Arrow, a high-performance platform that uses an in-memory columnar format. Spice.ai joins other major projects including Apache Spark, pandas, and InfluxDB in being powered by Apache Arrow. This also paves the way for high-performance data connections to the Spice.ai runtime using Apache Arrow Flight and import/export of data using Apache Parquet. We're incredibly excited about the potential this architecture has for building intelligent applications on top of a high-performance transport between application data sources and the Spice.ai AI engine.

Highlights in v0.6-alpha

Massive improvement in data loading performance and dataset scale

From data connectors, to REST API, to AI engine, we've now rebuilt Spice.ai's data processing and transport on the Apache Arrow project. Specifically, using the Apache Arrow for Go implementation. Many thanks to Matt Topol for his contributions to the project and guidance on using it.

This release changes the transport between the Spice.ai runtime and the AI engine from sending text CSV over gRPC to sending Apache Arrow Records over IPC (Unix sockets).

This is a breaking change to the Data Processor interface, as it now uses arrow.Record instead of Observation.

Benchmarking v0.6

Before v0.6, Spice.ai would not scale into the hundreds of thousands of rows.

| Format | Row Number | Data Size | Process Time | Load Time | Transport time | Memory Usage |
| ------ | ---------- | --------- | ------------ | --------- | -------------- | ------------ |
| csv | 2,000 | 163.15KiB | 3.0005s | 0.0000s | 0.0100s | 423.754MiB |
| csv | 20,000 | 1.61MiB | 2.9765s | 0.0000s | 0.0938s | 479.644MiB |
| csv | 200,000 | 16.31MiB | 0.2778s | 0.0000s | NA (error) | 0.000MiB |
| csv | 2,000,000 | 164.97MiB | 0.2573s | 0.0050s | NA (error) | 0.000MiB |
| json | 2,000 | 301.79KiB | 3.0261s | 0.0000s | 0.0282s | 422.135MiB |
| json | 20,000 | 2.97MiB | 2.9020s | 0.0000s | 0.2541s | 459.138MiB |
| json | 200,000 | 29.85MiB | 0.2782s | 0.0010s | NA (error) | 0.000MiB |
| json | 2,000,000 | 300.39MiB | 0.3353s | 0.0080s | NA (error) | 0.000MiB |

After building on Arrow, Spice.ai now easily scales beyond millions of rows.

| Format | Row Number | Data Size | Process Time | Load Time | Transport time | Memory Usage |
| ------ | ---------- | --------- | ------------ | --------- | -------------- | ------------ |
| csv | 2,000 | 163.14KiB | 2.8281s | 0.0000s | 0.0194s | 439.580MiB |
| csv | 20,000 | 1.61MiB | 2.7297s | 0.0000s | 0.0658s | 461.836MiB |
| csv | 200,000 | 16.30MiB | 2.8072s | 0.0020s | 0.4830s | 639.763MiB |
| csv | 2,000,000 | 164.97MiB | 2.8707s | 0.0400s | 4.2680s | 1897.738MiB |
| json | 2,000 | 301.80KiB | 2.7275s | 0.0000s | 0.0367s | 436.238MiB |
| json | 20,000 | 2.97MiB | 2.8284s | 0.0000s | 0.2334s | 473.550MiB |
| json | 200,000 | 29.85MiB | 2.8862s | 0.0100s | 1.7725s | 824.089MiB |
| json | 2,000,000 | 300.39MiB | 2.7437s | 0.0920s | 16.5743s | 4044.118MiB |
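
To put the transport times in perspective, the throughput they imply can be worked out directly. The calculation below is only an illustration: the row counts and transport times are copied from the post-Arrow benchmark table, while the `rows_per_second` helper is a made-up name, not part of Spice.ai.

```python
# Row counts and transport times (seconds) copied from the post-Arrow table.
ARROW_RESULTS = {
    ("csv", 200_000): 0.4830,
    ("csv", 2_000_000): 4.2680,
    ("json", 200_000): 1.7725,
    ("json", 2_000_000): 16.5743,
}


def rows_per_second(rows: int, transport_seconds: float) -> float:
    """Throughput implied by a row count and its transport time."""
    return rows / transport_seconds


for (fmt, rows), seconds in ARROW_RESULTS.items():
    print(f"{fmt:>4} {rows:>9,} rows: {rows_per_second(rows, seconds):,.0f} rows/s")
```

Transport time grows roughly linearly with row count in these figures, which is what makes the multi-million-row cases workable where the earlier transport returned errors.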

New in this release

Dependency updates

  • Updates to numpy 1.21.0
  • Updates to marked 3.0.8
  • Updates to follow-redirects 1.14.7
  • Updates nanoid to 3.2.0

- Rust
Published by phillipleblanc about 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.5.1-alpha

Announcing the release of Spice.ai v0.5.1-alpha! 📈

This minor release builds upon v0.5-alpha adding the ability to start training from the dashboard plus support for monitoring training runs with TensorBoard.

Highlights in v0.5.1-alpha

Start training from dashboard

A "Start Training" button has been added to the pod page on the dashboard so that you can easily start training runs from that context.

Training runs can now be started by:

  • Modifications to the Spicepod YAML file.
  • The spice train command.
  • The "Start Training" dashboard button.
  • POST API calls to /api/v0.1/pods/{pod name}/train
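
The last option in the list above can be sketched with the Python standard library. This is a minimal sketch under assumptions: the `localhost:8000` address is taken from other examples in these notes, the empty request body is a guess, and `build_train_request` is a hypothetical helper name.

```python
import urllib.request
from urllib.parse import quote

# Assumption: the runtime listens on localhost:8000, as in other examples here.
BASE_URL = "http://localhost:8000"


def build_train_request(pod_name: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts a training run."""
    url = f"{BASE_URL}/api/v0.1/pods/{quote(pod_name, safe='')}/train"
    # Assumption: an empty body is enough to trigger training.
    return urllib.request.Request(url, data=b"", method="POST")


request = build_train_request("trader")
print(request.get_method(), request.full_url)
# To send it against a running runtime:
#     urllib.request.urlopen(request)
```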

Video: https://user-images.githubusercontent.com/80174/146122241-f8073266-ead6-4628-8563-93e98d74e9f0.mov

TensorBoard monitoring

TensorBoard monitoring is now supported when using DQL (default) or the new SACD learning algorithm that was announced in v0.5-alpha.

When enabled, TensorBoard logs will automatically be collected and an "Open TensorBoard" button will be shown on the pod page in the dashboard.

Logging can be enabled at the pod level with the training_loggers pod param or per training run with the CLI --training-loggers argument.

Video: https://user-images.githubusercontent.com/80174/146382503-2bb2570b-5111-4de0-9b80-a1dc4a5dcc35.mov

Support for VPG will be added in v0.6-alpha. The design allows for additional loggers to be added in the future. Let us know what you'd like to see!

New in this release

  • Adds a start training button on the dashboard pod page.
  • Adds TensorBoard logging and monitoring when using DQL and SACD learning algorithms.

Dependency updates

  • Updates to Tailwind 3.0.6
  • Updates to Glide Data Grid 3.2.1

- Rust
Published by phillipleblanc about 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.5-alpha

We are excited to announce the release of Spice.ai v0.5-alpha! 🥇

Highlights include a new learning algorithm called "Soft Actor-Critic" (SAC), fixes to the behavior of spice upgrade, and a more consistent authoring experience for reward functions.

If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.

Highlights in v0.5-alpha

Soft Actor-Critic (Discrete) (SAC) Learning Algorithm

The addition of the Soft Actor-Critic (Discrete) (SAC) learning algorithm is a significant improvement to the power of the AI engine. It is not set as the default algorithm yet, so to start using it, pass the --learning-algorithm sacd parameter to spice train. We'd love to get your feedback on how it's working!

Consistent reward authoring experience

With the addition of the reward function files that allow you to edit your reward function in a Python file, the behavior of starting a new training session by editing the reward function code was lost. With this release, that behavior is restored.

In addition, there is a breaking change to the variables used to access the observation state and interpretations. This change was made to better reflect the purpose of the variables and make them easier to work with in Python.

| Previous (Type) | New (Type) |
| ----------------------------------- | -------------------------------------- |
| prev_state (SimpleNamespace) | current_state (dict) |
| prev_state.interpretations (list) | current_state_interpretations (list) |
| new_state (SimpleNamespace) | next_state (dict) |
| new_state.interpretations (list) | next_state_interpretations (list) |
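
For reward code written against the old variables, one possible migration aid is a small shim that splits a legacy SimpleNamespace state into the new dict and interpretations list. This is a sketch rather than anything shipped with Spice.ai: `split_legacy_state` and the `price` field are hypothetical.

```python
from types import SimpleNamespace


def split_legacy_state(state: SimpleNamespace):
    """Split an old-style state into (state dict, interpretations list)."""
    fields = dict(vars(state))  # copy so the original namespace is untouched
    interpretations = list(fields.pop("interpretations", []))
    return fields, interpretations


# Hypothetical legacy state with one measurement and no interpretations.
prev_state = SimpleNamespace(price=10.0, interpretations=[])
current_state, current_state_interpretations = split_legacy_state(prev_state)
print(current_state["price"])
```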

Improved spice upgrade behavior

The Spice.ai CLI will no longer recommend "upgrading" to an older version. An issue was also fixed where trying to upgrade the Spice.ai CLI using spice upgrade on Linux would return an error.
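
The "never recommend an older version" rule comes down to comparing versions numerically rather than as strings. The sketch below shows one way to do that; it is not the CLI's actual implementation, the function names are hypothetical, and it deliberately ignores pre-release suffixes such as -alpha.

```python
import re


def parse_version(tag: str) -> tuple:
    """Turn a tag such as 'v0.4.1-alpha' into a comparable tuple like (0, 4, 1)."""
    match = re.match(r"v?(\d+)\.(\d+)(?:\.(\d+))?", tag)
    if match is None:
        raise ValueError(f"unrecognized version tag: {tag!r}")
    # A missing patch component (as in 'v0.5-alpha') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def should_recommend_upgrade(current: str, latest: str) -> bool:
    """Recommend an upgrade only when the latest release is strictly newer."""
    return parse_version(latest) > parse_version(current)


print(should_recommend_upgrade("v0.4.1-alpha", "v0.5-alpha"))
print(should_recommend_upgrade("v0.5-alpha", "v0.4.1-alpha"))
```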

New in this release

  • Adds a new learning algorithm called "Soft Actor-Critic (Discrete)" (SAC).
  • Updates the reward function parameters for the YAML code blocks from prev_state and new_state to current_state and next_state to be consistent with the reward function files.
  • Fixes an issue where editing a reward functions file would not automatically trigger training.
  • Fixes the normalization of values for the Deep-Q Learning algorithm to handle larger values.
  • Fixes an issue where the Spice.ai CLI would not upgrade on Linux with the spice upgrade command.
  • Fixes an issue where the Spice.ai CLI would recommend an "upgrade" to an older version.

- Rust
Published by phillipleblanc about 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.4.1-alpha

Announcing the release of Spice.ai v0.4.1-alpha! ✅

This point release focuses on fixes and improvements to v0.4-alpha. Highlights include AI engine performance improvements, updates to the dashboard observations data grid, notification of new CLI versions, and several bug fixes.

A special acknowledgment to @Adm28, who added the CLI upgrade detection and prompt, which notifies users of new CLI versions and prompts to upgrade.

Highlights in v0.4.1-alpha

AI engine performance improvements

Overall training performance has been improved up to 13% by removing a lock in the AI engine.

In versions before v0.4.1-alpha, performance was especially impacted when streaming new data during a training run.

Dashboard Observations Datagrid

The dashboard observations datagrid now automatically resizes to the window width, and headers are easier to read, with automatic grouping into dataspaces. In addition, column widths are also resizable.

CLI version detection and upgrade prompt

When run, the Spice.ai CLI now automatically checks for new CLI versions, at most once a day.

If it detects a new version, it will print a notification to the console on spice version, spice run or spice add commands prompting the user to upgrade using the new spice upgrade command.

New in this release

  • Adds automatic resizing of the observations datagrid.
  • Adds header group by dataspace to the observations datagrid.
  • Adds CLI version detection and prompt for upgrade on version, run, and add commands.
  • Adds support for parsing hex-encoded times and measurements. Use the time_format of hex or prefix with 0x.
  • Updates AI engine with improved training performance.
  • Updates Go and NPM dependencies.
  • Fixes detection of Spicepods in the Spicepods directory, and a resulting error when loading a non-Spicepod file.
  • Fixes a potential "zip slip" security issue.
  • Fixes an issue where the AI engine may not gracefully shutdown.
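
The hex-parsing item in the list above can be sketched as follows. This is an assumed shape, not the processor's real code: `parse_number` is a hypothetical helper, and only the `hex` time_format value and the `0x` prefix come from the release notes.

```python
def parse_number(value: str, time_format: str = "") -> int:
    """Parse a decimal or hex-encoded integer string."""
    text = value.strip()
    if time_format == "hex" or text.lower().startswith("0x"):
        # int() with base 16 accepts the value with or without a 0x prefix.
        return int(text, 16)
    return int(text)


print(parse_number("0x5FB57D40"))               # hex, detected by prefix
print(parse_number("5FB57D40", time_format="hex"))  # hex, declared by format
print(parse_number("1605729600"))               # plain decimal
```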

- Rust
Published by lukekim over 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.4-alpha

We are excited to announce the release of Spice.ai v0.4-alpha! 🏄‍♂️

Highlights include support for authoring reward functions in a code file, the ability to specify the time of recommendation, and ingestion support for transaction/correlation ids. Authoring reward functions in a code file is a significant improvement to the developer experience over specifying functions inline in the YAML manifest, and we are looking forward to your feedback on it!

If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.

Highlights in v0.4-alpha

Upgrade using spice upgrade

The spice upgrade command was added in the v0.3.1-alpha release, so you can now upgrade from v0.3.1 to v0.4 by simply running spice upgrade in your terminal. Special thanks to community member @Adm28 for contributing this feature!

Reward Function Files

In addition to defining reward code inline, it is now possible to author reward code in functions in a separate Python file.

The reward function file path is defined by the reward_funcs property.

A function defined in the code file is mapped to an action by authoring its name in the with property of the relevant reward.

Example:

```yaml
training:
  reward_funcs: my_reward.py
  rewards:
    - reward: buy
      with: buy_reward
    - reward: sell
      with: sell_reward
    - reward: hold
      with: hold_reward
```

Learn more in the documentation: docs.spiceai.org/concepts/rewards/external
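
For illustration, a my_reward.py matching the manifest above might look like the sketch below. The function names come from the manifest's with properties; the four-parameter signature, the dict-typed states, and the `price` field are assumptions, so check the linked documentation for the exact contract.

```python
# my_reward.py (hypothetical contents)
# Assumption: each function receives the current and next state as dicts,
# plus their interpretation lists, and returns a numeric reward.


def buy_reward(current_state, current_state_interpretations,
               next_state, next_state_interpretations) -> float:
    """Reward buying when the price rises afterwards."""
    return next_state["price"] - current_state["price"]


def sell_reward(current_state, current_state_interpretations,
                next_state, next_state_interpretations) -> float:
    """Reward selling when the price falls afterwards."""
    return current_state["price"] - next_state["price"]


def hold_reward(current_state, current_state_interpretations,
                next_state, next_state_interpretations) -> float:
    """Holding is neutral in this toy example."""
    return 0.0
```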

Time Categories

Spice.ai can now learn from cyclical patterns, such as daily, weekly, or monthly cycles.

To enable automatic cyclical field generation from the observation time, specify one or more time categories in the pod manifest, such as a month or weekday in the time section.

For example, by specifying month, the Spice.ai engine automatically creates a field in the AI engine data stream called time_month_{month}, with the value calculated from the month to which that timestamp relates.

Example:

```yaml
time:
  categories:
    - month
    - dayofweek
```

Supported category values are: month, dayofmonth, dayofweek, and hour.
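
To make the generated fields concrete, here is a rough sketch of deriving time_month_{month} fields from an observation timestamp. Only the field name pattern comes from the text above; the one-hot 0/1 values, the numeric month suffix, the use of UTC, and the `month_fields` name are assumptions.

```python
from datetime import datetime, timezone


def month_fields(unix_timestamp: int) -> dict:
    """One-hot encode the month of a timestamp as time_month_{month} fields."""
    month = datetime.fromtimestamp(unix_timestamp, tz=timezone.utc).month
    return {f"time_month_{m}": int(m == month) for m in range(1, 13)}


fields = month_fields(1605729600)  # a timestamp in November 2020 (UTC)
print([name for name, value in fields.items() if value == 1])
```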

Learn more in the documentation: docs.spiceai.org/reference/pod/#time

Get recommendation for a specific time

It is now possible to specify the time of recommendations fetched from the /recommendation API.

Valid times are from pod epoch_time to epoch_time + period.

Previously the API only supported recommendations based on the time of the last ingested observation.

Requests are made in the following format: GET http://localhost:8000/api/v0.1/pods/{pod}/recommendation?time={unix_timestamp}

An example for quickstarts/trader:

GET http://localhost:8000/api/v0.1/pods/trader/recommendation?time=1605729600

Specifying {unix_timestamp} as 0 will return a recommendation based on the latest data. An invalid {unix_timestamp} will return a result that has the valid time range in the error message:

```json
{
  "response": {
    "result": "invalid_recommendation_time",
    "message": "The time specified (1610060201) is outside of the allowed range: (1610057600, 1610060200)",
    "error": true
  }
}
```
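
The range rule described above (valid times run from epoch_time to epoch_time + period, and 0 means "latest") can be sketched as below. This mimics the documented behavior and is not the runtime's code; `check_recommendation_time` and the "ok" result value are hypothetical.

```python
def check_recommendation_time(requested: int, epoch_time: int, period: int) -> dict:
    """Validate a requested recommendation time against the pod's time range."""
    start, end = epoch_time, epoch_time + period
    # 0 is the documented way to ask for a recommendation on the latest data.
    if requested == 0 or start <= requested <= end:
        return {"result": "ok", "error": False}  # "ok" is a placeholder value
    return {
        "result": "invalid_recommendation_time",
        "message": (
            f"The time specified ({requested}) is outside of the "
            f"allowed range: ({start}, {end})"
        ),
        "error": True,
    }


print(check_recommendation_time(1610060201, epoch_time=1610057600, period=2600))
```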

New in this release

  • Adds time categories configuration to the pod manifest to enable learning from cyclical patterns in data - e.g. hour, day of week, day of month, and month
  • Adds support for defining reward functions in a rewards functions code file.
  • Adds the ability to specify recommendation time making it possible to now see which action Spice.ai recommends at any time during the pod period.
  • Adds support for ingestion of transaction/correlation identifiers (e.g. order_id, trace_id) in the pod manifest.
  • Adds validation for invalid dataspace names in the pod manifest.
  • Adds the ability to resize columns to the dashboard observation data grid.
  • Updates to TensorFlow 2.7 and Keras 2.7
  • Fixes a bug where data processors were using data connector params
  • Fixes a dashboard issue in the pod observations data grid where a column might not be shown.
  • Fixes a crash on pod load if the training section is not included in the manifest.
  • Fixes an issue where data manager stats errors were incorrectly being printed to console.
  • Fixes an issue where selectors may not match due to surrounding whitespace.

- Rust
Published by phillipleblanc over 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.3.1-alpha

We are excited to announce the release of Spice.ai v0.3.1-alpha! 🎃

This point release focuses on fixes and improvements to v0.3-alpha. Highlights include the ability to specify both seed and runtime data, to select custom named fields for time and tags, a new spice upgrade command and several bug fixes.

A special acknowledgment to @Adm28, who added the new spice upgrade command, which enables the CLI to self-update, which in turn will auto-update the runtime.

Highlights in v0.3.1-alpha

Upgrade command

The CLI can now be updated using the new spice upgrade command. This command will check for, download, and install the latest Spice.ai CLI release, which will become active on its next run.

When run, the CLI will check for the matching version of the Spice.ai runtime, and will automatically download and install it as necessary.

The version of both the Spice.ai CLI and runtime can be checked with the spice version CLI command.

Seed data

When working with streaming data sources, like market prices, it's often also useful to seed the dataspace with historical data. Spice.ai enables this with the new seed_data node in the dataspace configuration. The syntax is exactly the same as the data syntax. For example:

```yaml
dataspaces:
  - from: coinbase
    name: btcusd
    seed_data:
      connector: file
      params:
        path: path/to/seed/data.csv
      processor:
        name: csv
    data:
      connector: coinbase
      params:
        product_ids: BTC-USD
      processor:
        name: json
```

The seed data will be fetched first, before the runtime data is initialized. Both sets of connectors and processors use the dataspace-scoped measurements, categories, and tags for processing, and both data sources are merged into the pod-scoped observation timeline.
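
Conceptually, merging seed and runtime data into one timeline is an ordered merge by observation time. The sketch below illustrates the idea only: the `time` and `price` keys and the `merge_observations` helper are hypothetical, and it assumes each input is already sorted by time.

```python
import heapq
from operator import itemgetter


def merge_observations(seed, runtime):
    """Merge two time-sorted observation lists into a single timeline."""
    return list(heapq.merge(seed, runtime, key=itemgetter("time")))


seed = [{"time": 100, "price": 1.0}, {"time": 200, "price": 1.1}]
runtime = [{"time": 150, "price": 1.05}, {"time": 300, "price": 1.2}]
timeline = merge_observations(seed, runtime)
print([obs["time"] for obs in timeline])
```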

Time field selectors

Before v0.3.1-alpha, data was required to include a specific time field. In v0.3.1-alpha, the JSON and CSV data processors now support the ability to select a specific field to populate the time field. An example selector to use the created_at column for time is:

```yaml
data:
  processor:
    name: csv
    params:
      time_selector: created_at
```
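
In effect, the selector names the source column that should be treated as the observation time. A rough sketch of that mapping for CSV input follows; `read_observations`, the default `time` field name, and the sample data are assumptions, not the processor's actual code.

```python
import csv
import io


def read_observations(csv_text: str, time_selector: str = "time"):
    """Read CSV rows, moving the selected column into a 'time' field."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        time_value = int(record.pop(time_selector))
        rows.append({"time": time_value, **record})
    return rows


sample = "created_at,likes\n1605729600,12\n1605729660,15\n"
observations = read_observations(sample, time_selector="created_at")
print(observations[0])
```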

Tag field selectors

Before v0.3.1-alpha, tags were required to be placed in a _tags field. In v0.3.1-alpha, any field can now be selected to populate tags. Tags are pod-unique string values, and the union of all selected fields will make up the resulting tag list. For example:

```yaml
dataspace:
  from: twitter
  name: tweets
  tags:
    selectors:
      - tags
      - author_id
    values:
      - spiceaihq
      - spicy
```
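
The "union of all selected fields" rule can be pictured with a short sketch. It assumes a selected field may hold either a single string or a list of strings; `collect_tags` and the sample records are hypothetical.

```python
def collect_tags(records, selectors):
    """Return the sorted union of tag values found in the selected fields."""
    tags = set()
    for record in records:
        for field in selectors:
            value = record.get(field)
            if value is None:
                continue
            if isinstance(value, str):
                value = [value]  # treat a single string as a one-item list
            tags.update(str(item) for item in value)
    return sorted(tags)


records = [
    {"tags": ["spicy"], "author_id": "spiceaihq"},
    {"tags": ["spicy", "hot"], "author_id": "spiceaihq"},
]
print(collect_tags(records, ["tags", "author_id"]))
```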

New in this release

  • Adds a new spice upgrade command for self-upgrade of the Spice.ai CLI.
  • Adds a new seed_data node to the dataspace configuration, enabling the dataspace to be seeded with an alternative source of data.
  • Adds the ability to select a custom time field in JSON and CSV data processors with the time_selector parameter.
  • Adds the ability to select custom tag fields in the dataspace configuration with selectors list.
  • Adds error reporting for AI engine crashes, where previously it would fail silently.
  • Fixes the dashboard pods list from "jumping" around due to being unsorted.
  • Fixes rare cases where categorical data might be sent to the AI engine in the wrong format.

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.3-alpha

Spice.ai v0.3-alpha

We are excited to announce the release of Spice.ai v0.3-alpha! 🎉

This release adds support for ingestion, automatic encoding, and training of categorical data, enabling more use-cases and datasets beyond just numerical measurements. For example, perhaps you want to learn from data that includes a category of t-shirt sizes, with discrete values, such as small, medium, and large. The v0.3 engine now supports this and automatically encodes the categorical string values into numerical values that the AI engine can use. Also included is a preview of data visualizations in the dashboard, which is helpful for developers as they author Spicepods and dataspaces.

A special acknowledgment to @sboorlagadda, who submitted the first Spice.ai feature contribution from the community ever! He added the ability to list pods from the CLI with the new spice pods list command. Thank you, @sboorlagadda!!!

If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.

Highlights in v0.3-alpha

Categorical data

In v0.1, the runtime and AI engine only supported ingesting numerical data. In v0.2, tagged data was accepted and automatically encoded into fields available for learning. In this release, v0.3, categorical data can now also be ingested and automatically encoded into fields available for learning. This is a breaking change: the manifest format now separates numerical measurements from categorical data.

Pre-v0.3, the manifest author specified numerical data using the fields node.

In v0.3, numerical data is now specified under measurements and categorical data under categories. E.g.

```yaml
dataspaces:
  - from: event
    name: stream
    measurements:
      - name: duration
        selector: length_of_time
        fill: none
      - name: guest_count
        selector: num_guests
        fill: none
    categories:
      - name: event_type
        values:
          - dinner
          - party
      - name: target_audience
        values:
          - employees
          - investors
    tags:
      - tagA
      - tagB
```
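
v0.3 uses one-hot encoding for categories, which turns each allowed value into its own 0/1 field. A minimal sketch using the event_type category from the manifest above; the `one_hot` helper and the `{category}_{value}` field naming are assumptions, not the engine's actual scheme.

```python
def one_hot(category: str, allowed_values, observed: str) -> dict:
    """Encode one categorical value as 0/1 fields, one per allowed value."""
    return {f"{category}_{value}": int(value == observed) for value in allowed_values}


encoded = one_hot("event_type", ["dinner", "party"], "party")
print(encoded)
```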

Data visualizations preview

A top piece of community feedback was the ability to visualize data. After first running Spice.ai, we'd often hear from developers, "how do I see the data?". A preview of data visualizations is now included in the dashboard on the pod page.

Listing pods

Once the Spice.ai runtime has started, you can view the loaded pods on the dashboard and fetch them via API call localhost:8000/api/v0.1/pods. To make it even easier, we've added the ability to list them via the CLI with the new spice pods list command, which shows the list of pods and their manifest paths.

Coinbase data connector

A new Coinbase data connector is included in v0.3, enabling the streaming of live market ticker prices from Coinbase Pro. Enable it by specifying the coinbase data connector and providing a list of Coinbase Pro product ids, e.g. "BTC-USD". A new sample that demonstrates the connector is also available, with its associated Spicepod available from the spicerack.org registry. Get it with spice add samples/trader.

Tweet Recommendation Quickstart

A new Tweet Recommendation Quickstart has been added. Given past tweet activity and metrics of a given account, this app can recommend when to tweet, comment, or retweet to maximize for like count, interaction rates, and outreach of said given Twitter account.

Trader Sample

A new Trader Sample has been added in addition to the Trader Quickstart. The sample uses the new Coinbase data connector to stream live Coinbase Pro ticker data for learning.

New in this release

  • Adds support for ingesting, encoding, and training on categorical data. v0.3 uses one-hot-encoding.
  • Changes the Spicepod manifest fields node to measurements and adds the categories node.
  • Adds the ability to select a field from the source data and map it to a different field name in the dataspace. See an example for measurements in docs.
  • Adds support for JSON content type when fetching from the /observations API. Previously, only CSV was supported.
  • Adds a preview version of data visualizations to the dashboard. The grid has several limitations, one of which is it currently cannot be resized.
  • Adds the ability to select which learning algorithm to use via the CLI, the API, and specified in the Spicepod manifest. Possible choices are currently "vpg", Vanilla Policy Gradient and "dql", Deep Q-Learning. Shout out to @corentin-pro, who added this feature on his second day on the team!
  • Adds the ability to list loaded pods with the CLI command spice pods list.
  • Adds a new coinbase data connector for Coinbase Pro market prices.
  • Adds a new Tweet Recommendation Quickstart.
  • Adds a new Trader Sample.
  • Fixes bug where the /observations endpoint was not providing fully qualified field names.
  • Fixes issue where debugging messages were printed when using spice add.

- Rust
Published by phillipleblanc over 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.2.1-alpha

Spice.ai v0.2.1-alpha

Announcing the release of Spice.ai v0.2.1-alpha! 🚚

This point release focuses on fixes and improvements to v0.2-alpha. Highlights include the ability to specify how missing data should be treated and a new production mode for spiced.

This release supports the ability to specify how the runtime should treat missing data. Previous releases filled missing data with the last value (or initial value) in the series. While this makes sense for some data, e.g., market prices of a stock or cryptocurrency, it does not make sense for discrete data, e.g., ratings. In v0.2.1, developers can now add the fill parameter on a dataspace field to specify the behavior. This release supports fill types previous and none. The default is previous.

Example in a manifest:

```yaml
dataspaces:
  - from: twitter
    name: tweets
    fields:
      - name: likes
        fill: none # The new fill parameter
```
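
The difference between the two fill types is easiest to see on a series with gaps. The sketch below represents missing values as None; `fill_missing` and its `initial` argument are illustrative names, not runtime APIs.

```python
def fill_missing(values, fill="previous", initial=None):
    """Apply the 'previous' or 'none' fill behavior to a series with gaps."""
    if fill not in ("previous", "none"):
        raise ValueError(f"unsupported fill type: {fill!r}")
    filled, last = [], initial
    for value in values:
        if value is None and fill == "previous":
            value = last  # carry the last seen (or initial) value forward
        if value is not None:
            last = value
        filled.append(value)
    return filled


likes = [1, None, 3, None]
print(fill_missing(likes, fill="previous"))
print(fill_missing(likes, fill="none"))
```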

spiced now defaults to a new production mode when run standalone (not via the CLI), with development mode now explicitly set with the --development flag. Production mode does not activate development time features, such as the Spicepod file watcher. The CLI always runs spiced in development mode as it is not expected to be used in production deployments.

New in this release

  • Adds a fill parameter to dataspace fields to specify how missing values should be treated.
  • Adds the ability to specify the fill behavior of empty values in a dataspace.
  • Simplifies releases with a single spiceai release instead of separate spice and spiced releases.
  • Adds an explicit development mode to spiced. Production mode does not activate the file watcher.
  • Fixes a bug when the pod parameter epoch_time was not set which would cause data not to be sent to the AI engine.
  • Fixes a bug where the User-Agent was not set correctly from CLI calls to api.spicerack.org

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.2-alpha

- Rust
Published by lukekim over 4 years ago

https://github.com/spiceai/spiceai - Spice.ai v0.2-alpha

Spice.ai v0.2-alpha

We are excited to announce the release of Spice.ai v0.2-alpha! 🎉

This release is the first major version since the initial v0.1 announcement and includes significant improvements based upon community and early customer feedback. If you are new to Spice.ai, check out the getting started guide and star spiceai/spiceai on GitHub.

Highlights in v0.2-alpha

Tagged data

In the first release, the runtime and AI engine could only ingest numerical data. In v0.2, tagged data is accepted and automatically encoded into fields available for learning. For example, it's now possible to include a "liked" tag when using tweet data, automatically encoded to a 0/1 field for training. Both CSV and the new JSON observation formats support tags. The v0.3 release will add additional support for sets of categorical data.

Streaming data

Previously, the runtime would trigger each data connector to fetch on a 15-second interval. In v0.2, we upgraded the interface for data connectors to a push/streaming model, which enables continuous streaming data into the environment and AI engine.

Interpreted data

Spice.ai works together with your application code and works best when it's provided continuous feedback. This feedback could be from the application itself, for example, ratings, likes, thumbs-up/down, profit from trades, or external expertise. The interpretations API was introduced in v0.1.1, and v0.2 adds AI engine support providing a way to give meaning or an interpretation of ranges of time-series data, which are then available within reward functions. For example, a time range of stock prices could be a "good time to buy," or perhaps Tuesday mornings is a "good time to tweet," and an application or expert can teach the AI engine this through interpretations, providing a shortcut to its learning.

New in this release

  • Adds core runtime and AI engine tagged data support
  • Adds tagged data support to the CSV processor
  • Adds streaming data support to the engine and data connectors
  • Adds a new JSON data processor for ingesting JSON data
  • Adds a new Twitter data connector with JSON processor support
  • Adds a new /pods//dataspaces API
  • Adds support for using interpretations in reward functions. Learn more.
  • Adds support for downloading zipped pods from the spicerack.org registry
  • Adds support for adding data along with the pod manifest when adding a pod from the spicerack.org registry
  • Adds basic /pods//diagnostics API
  • Fixes pod period, interval, and granularity not being correctly set when trying to use a "d" format
  • Fixes the color scheme of action counts in the dashboard to improve readability

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - v0.1.1-alpha

alpha

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.1.1-alpha

Spice.ai v0.1.1-alpha

Announcing the release of Spice.ai v0.1.1-alpha! 🙌

This is the first point release following the public launch of v0.1-alpha and is focused on fixes and improvements to v0.1-alpha before the bigger v0.2-alpha release.

Highlights include initial support for interpretations and the addition of a new JSON Data Processor, which enables observations to be posted in JSON to a new Dataspaces API. The ability to post observations directly to the Dataspace also now makes Data Connectors optional.

Interpretations will enable end-users and external systems to participate in training by providing expert interpretation of the data, ultimately creating smarter pods. v0.1.1-alpha includes the ability to add and get interpretations by API and through import/export of Spicepods. Reward function authors will be able to use interpretations in reward functions from the v0.2-alpha release.

Previously observations could only be added in CSV format. JSON is now supported by calling the new dataspace observations API that leverages the also new JSON processor located in the data-components-contrib repository. The JSON processor defaults to parsing the Spice.ai observation format and is extensible to other schemas.

The dashboard has also been improved to show action counts during a training run, making it easier to visualize the learning process.

New in this release

  • Adds visualization of action counts during a training run in the dashboard.
  • Adds a new interpretations API, along with support for importing and exporting interpretations to pods. Learn more.
  • Adds a new API for ingesting dataspace observations. Learn more.
  • Adds an official DockerHub repository for spiceai/spiceai.
  • Fixes bug where the dashboard would not load on browser refresh.

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.1.1-alpha-rc

This is the release candidate 0.1.1-alpha-rc

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.1.1-alpha-rc

This is the release candidate 0.1.1-alpha-rc

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.2.0-alpha-rc

This is the release candidate 0.2.0-alpha-rc

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.2.0-alpha-rc

This is the release candidate 0.2.0-alpha-rc

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.1.0-alpha

Spice.ai v0.1.0-alpha

Announcing the public release of Spice.ai v0.1.0-alpha! 🎉

See the blog post at blog.spiceai.org.

New in this release

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.1.0-alpha

Spice.ai v0.1.0-alpha

Announcing the public release of Spice.ai v0.1.0-alpha! 🎉

See the blog post at blog.spiceai.org.

New in this release

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.1.0-alpha-rc

This is the release candidate 0.1.0-alpha-rc

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.1.0-alpha-rc

This is the release candidate 0.1.0-alpha-rc

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.1.0-alpha.5

Spice.ai v0.1.0-alpha.5

Announcing the release of Spice.ai v0.1.0-alpha.5! 🎉

This release focused on preparation for the public launch of the project, including more comprehensive and easier-to-understand documentation, quickstarts and samples.

Data Connectors and Data Processors have now been moved to their own repository, spiceai/data-components-contrib.

To improve the developer experience, the following breaking changes have been made:

  • The pods directory .spice/pods (and thus manifests) and the config file .spice/config.yaml have been moved from the .spice directory to the app root ./. This allows the .spice directory to be added to .gitignore and manifest changes to be easily tracked in the project.
  • Flights have been renamed to the more understandable Training Runs in user interfaces.

New in this release

  • Adds open source acknowledgements to the dashboard
  • Adds improved error messages for several scenarios
  • Updates all Quickstarts and Samples to be clearer, easier to understand and better show the value of Spice.ai. The LogPruner sample has also been renamed ServerOps
  • Updates the dashboard to show a message when no pods have been trained
  • Updates all documentation links to docs.spiceai.org
  • Updates to use Python 3.8.12
  • Fixes bug where the dashboard showed an undefined episode number
  • Fixes issue where the manifest.json was not being served to the React app
  • Fixes the config.yaml being written when not required
  • Removes the ability to load a custom dashboard - this may come back in a future release

Breaking changes

  • Changes the location of .spice/pods to ./spicepods
  • Changes the location of .spice/config.yaml to .spice.config.yaml
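
For existing projects, the relocation above could be scripted along the following lines. This is a minimal sketch, not an official migration tool: the function name migrate_layout is hypothetical, and the source and destination paths are taken literally from the breaking-changes list.

```python
import shutil
from pathlib import Path


def migrate_layout(app_root):
    """Move pods and config out of .spice into the app root.

    Returns the list of destination paths that were created.
    Existing destinations are left untouched rather than overwritten.
    """
    root = Path(app_root)
    moves = [
        (root / ".spice" / "pods", root / "spicepods"),
        (root / ".spice" / "config.yaml", root / ".spice.config.yaml"),
    ]
    moved = []
    for src, dst in moves:
        # Only move when there is something to move and nothing to clobber.
        if src.exists() and not dst.exists():
            shutil.move(str(src), str(dst))
            moved.append(dst)
    return moved
```

Calling migrate_layout(".") from the app root a second time is a no-op, since the sources no longer exist.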

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.1.0-alpha.5

Spice.ai v0.1.0-alpha.5

Announcing the release of Spice.ai v0.1.0-alpha.5! 🎉

This release focused on preparation for the public launch of the project, including more comprehensive and easier-to-understand documentation, quickstarts and samples.

Data Connectors and Data Processors have now been moved to their own repository, spiceai/data-components-contrib.

To improve the developer experience, the following breaking changes have been made:

  • The pods directory .spice/pods (and thus manifests) and the config file .spice/config.yaml have been moved from the .spice directory to the app root ./. This allows the .spice directory to be added to .gitignore and manifest changes to be easily tracked in the project.
  • Flights have been renamed to the more understandable Training Runs in user interfaces.

New in this release

  • Adds open source acknowledgements to the dashboard
  • Adds improved error messages for several scenarios
  • Updates all Quickstarts and Samples to be clearer, easier to understand and better show the value of Spice.ai. The LogPruner sample has also been renamed ServerOps
  • Updates the dashboard to show a message when no pods have been trained
  • Updates all documentation links to docs.spiceai.org
  • Updates to use Python 3.8.12
  • Fixes bug where the dashboard showed an undefined episode number
  • Fixes issue where the manifest.json was not being served to the React app
  • Fixes the config.yaml being written when not required
  • Removes the ability to load a custom dashboard - this may come back in a future release

Breaking changes

  • Changes the location of .spice/pods to ./spicepods
  • Changes the location of .spice/config.yaml to .spice.config.yaml

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.1.0-alpha.5-rc

This is the release candidate 0.1.0-alpha.5-rc

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.1.0-alpha.5-rc

This is the release candidate 0.1.0-alpha.5-rc

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice Runtime v0.1.0-alpha.4

Spice.ai v0.1.0-alpha.4

Announcing the release of Spice.ai v0.1.0-alpha.4! 🎉

We have a project name update. The project will now be referred to as "Spice.ai" instead of "Spice AI" and the project website will be located at spiceai.org.

This release now uses the new spicerack.org AI package registry instead of fetching packages directly from GitHub.

Added support for importing and exporting Spice.ai pods with spice import and spice export commands.

The CLI has been streamlined by removing the pod command:

  • pod add changes from spice pod add <pod path> to just spice add <pod path>
  • pod train changes from spice pod train <pod name> to just spice train <pod name>
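
Scripts that still call the old subcommands can be updated mechanically. The helper below is a small sketch of that rewrite; the name modernize is hypothetical and it covers only the two renames listed above.

```python
import shlex

# Old "spice pod <verb>" subcommands that were flattened in this release.
FLATTENED = {"add", "train"}


def modernize(command):
    """Rewrite 'spice pod add|train ...' to 'spice add|train ...'.

    Commands that do not match are returned unchanged.
    """
    parts = shlex.split(command)
    if (
        len(parts) >= 3
        and parts[0] == "spice"
        and parts[1] == "pod"
        and parts[2] in FLATTENED
    ):
        # Drop the "pod" token and keep everything else in order.
        return shlex.join([parts[0]] + parts[2:])
    return command
```

For example, modernize("spice pod train trader") yields "spice train trader", while unrelated commands pass through untouched.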

We've also updated the names of some concepts:

  • "DataSources" are now "Dataspaces"
  • "Inference" is now "Recommendation"

New in this release

  • Adds a new Gardener to intelligently decide on the best time to water a simulated garden
  • Adds support for importing and exporting Spice.ai pods with spice import and spice export commands
  • Adds a complete end-to-end test suite
  • Adds installation by friendly URL: curl https://install.spiceai.org | /bin/bash
  • Adds the spice binary to PATH automatically by shell config (e.g. .bashrc, .zshrc)
  • Adds support for targeting hosting contexts (docker or metal) specifically with a --context command line flag
  • Removes the model downloader. This will return with better support in a later version
  • Updates Trader quickstart with demo Node.js application to better demonstrate its use
  • Updates LogPruner quickstart with demo PowerShell Core script to better demonstrate its use
  • Updates TensorFlow from 2.5.0 to 2.5.1
  • Fixes potential mismatch of CLI and runtime by only automatically upgrading to the same version
  • Fixes issue with .spice/config.yml creation in Docker due to incorrect permissions
  • Fixes dashboard title from React App to Spice.ai

Breaking changes

  • Changes datasources section in the pod manifest to dataspaces
  • Changes /api/v0.1/pods/<pod>/inference API to /api/v0.1/pods/<pod>/recommendation
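
Both renames are mechanical, so client code and manifests can be updated with a small shim. The sketch below assumes the manifest has already been parsed into a dictionary; the function names are hypothetical and only the two changes listed above are handled.

```python
def migrate_manifest(manifest):
    """Return a copy of a parsed pod manifest with 'datasources' renamed to 'dataspaces'."""
    migrated = dict(manifest)
    if "datasources" in migrated and "dataspaces" not in migrated:
        migrated["dataspaces"] = migrated.pop("datasources")
    return migrated


def migrate_api_path(path):
    """Rewrite a pod inference path to the recommendation path."""
    prefix = "/api/v0.1/pods/"
    suffix = "/inference"
    if path.startswith(prefix) and path.endswith(suffix):
        return path[: -len(suffix)] + "/recommendation"
    return path
```

Paths that do not end in /inference and manifests without a datasources section are returned unchanged.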

- Rust
Published by github-actions[bot] over 4 years ago

https://github.com/spiceai/spiceai - Spice CLI v0.1.0-alpha.4

Spice.ai v0.1.0-alpha.4

Announcing the release of Spice.ai v0.1.0-alpha.4! 🎉

We have a project name update. The project will now be referred to as "Spice.ai" instead of "Spice AI" and the project website will be located at spiceai.org.

This release now uses the new spicerack.org AI package registry instead of fetching packages directly from GitHub.

The CLI has been streamlined by removing the pod command:

  • pod add changes from spice pod add <pod path> to just spice add <pod path>
  • pod train changes from spice pod train <pod name> to just spice train <pod name>

We've also updated the names of some concepts:

  • "DataSources" are now "Dataspaces"
  • "Inference" is now "Recommendation"

New in this release

  • Adds a new Gardener to intelligently decide on the best time to water a simulated garden
  • Adds a complete end-to-end test suite
  • Adds installation by friendly URL: curl https://install.spiceai.org | /bin/bash
  • Adds the spice binary to PATH automatically by shell config (e.g. .bashrc, .zshrc)
  • Adds support for targeting hosting contexts (docker or metal) specifically with a --context command line flag
  • Removes the model downloader. This will return with better support in a later version
  • Updates the Trader quickstart (https://github.com/spiceai/quickstarts/tree/trunk/trader) with demo Node.js application to better demonstrate its use
  • Updates the LogPruner quickstart (https://github.com/spiceai/quickstarts/tree/trunk/logpruner) with demo PowerShell Core script to better demonstrate its use
  • Updates TensorFlow from 2.5.0 to 2.5.1
  • Fixes potential mismatch of CLI and runtime by only automatically upgrading to the same version
  • Fixes issue with .spice/config.yml creation in Docker due to incorrect permissions
  • Fixes dashboard title from React App to Spice.ai

Breaking changes

  • Changes datasources section in the pod manifest to dataspaces
  • Changes /api/v0.1/pods/<pod>/inference API to /api/v0.1/pods/<pod>/recommendation

- Rust
Published by github-actions[bot] over 4 years ago