https://github.com/spiceai/spiceai

A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.0%) to scientific vocabulary

Keywords

artificial-intelligence data data-federation developers infrastructure machine-learning sql

Keywords from Contributors

autograd agent transformers tensor sequencers hacking distributed parallel interactive optimism
Last synced: 5 months ago

Repository

A portable accelerated SQL query, search, and LLM-inference engine, written in Rust, for data-grounded AI apps and agents.

Basic Info
  • Host: GitHub
  • Owner: spiceai
  • License: apache-2.0
  • Language: Rust
  • Default Branch: trunk
  • Homepage: https://docs.spiceai.org
  • Size: 41.2 MB
Statistics
  • Stars: 2,562
  • Watchers: 27
  • Forks: 143
  • Open Issues: 324
  • Releases: 106
Topics
artificial-intelligence data data-federation developers infrastructure machine-learning sql
Created over 4 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Code of conduct Codeowners Security Roadmap

README.md


📄 Docs | ⚡️ Quickstart | 🧑‍🍳 Cookbook

Spice is a SQL query, search, and LLM-inference engine, written in Rust, for data apps and agents.

Spice.ai Open Source accelerated data query and LLM-inference engine

Spice provides four industry-standard APIs in a lightweight, portable runtime (single binary/container):

  1. SQL Query & Search: HTTP, Arrow Flight, Arrow Flight SQL, ODBC, JDBC, and ADBC APIs; vector_search and text_search UDTFs.
  2. OpenAI-Compatible APIs: HTTP APIs for OpenAI SDK compatibility, local model serving (CUDA/Metal accelerated), and hosted model gateway.
  3. Iceberg Catalog REST APIs: A unified Iceberg REST Catalog API.
  4. MCP HTTP+SSE APIs: Integration with external tools via Model Context Protocol (MCP) using HTTP and Server-Sent Events (SSE).
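
As a sketch of the first API, the runtime can be queried over plain HTTP. This assumes a locally running runtime on the quickstart's default HTTP port 8090 and the `POST /v1/sql` endpoint from the Spice docs; verify both against your version:

```shell
# Query the runtime's HTTP SQL API (port 8090 is the quickstart default).
# The plain-text SQL body follows the Spice HTTP API convention.
curl -X POST http://localhost:8090/v1/sql \
  -H "Content-Type: text/plain" \
  -d "SELECT 1 AS one"
```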

🎯 Goal: Developers can focus on building data apps and AI agents confidently, knowing they are grounded in data.

Spice is primarily used for:

  • Data Federation: SQL query across any database, data warehouse, or data lake. Learn More.
  • Data Materialization and Acceleration: Materialize, accelerate, and cache database queries. Read the MaterializedView interview - Building a CDN for Databases
  • Enterprise Search: Keyword, vector, and full-text search with Tantivy-powered BM25 and vector similarity search for structured and unstructured data.
  • AI apps and agents: An AI-database powering retrieval-augmented generation (RAG) and intelligent agents. Learn More.
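
The federation and acceleration use-cases above are configured declaratively in `spicepod.yaml`. A minimal sketch, assuming a hypothetical PostgreSQL source table; field names follow the Spice docs, so verify against your runtime version:

```yaml
version: v1
kind: Spicepod
name: my_app
datasets:
  # Hypothetical source table, federated from PostgreSQL
  - from: postgres:public.orders
    name: orders
    acceleration:
      enabled: true              # materialize a local working set
      engine: arrow              # in-memory Arrow acceleration
      refresh_check_interval: 10s
```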

If you want to build with DataFusion or use DuckDB, Spice provides a simple, flexible, production-ready engine you can use right away.

📣 Read the Spice.ai 1.0-stable announcement.

Spice is built on industry-leading technologies including Apache DataFusion, Apache Arrow, Arrow Flight, SQLite, and DuckDB.

How Spice works.

🎥 Watch the CMU Database Group talk, Accelerating Data and AI with Spice.ai Open-Source

🎥 Watch How to Query Data using Spice, OpenAI, and MCP

🎥 Watch How to search with Amazon S3 Vectors

Why Spice?

Spice.ai

Spice simplifies building data-driven AI applications and agents by making it fast and easy to query, federate, and accelerate data from one or more sources using SQL, while grounding AI in real-time, reliable data. Co-locate datasets with apps and AI models to power AI feedback loops, enable RAG and search, and deliver fast, low-latency data-query and AI-inference with full control over cost and performance.

How is Spice different?

  1. AI-Native Runtime: Spice combines data query and AI inference in a single engine, for data-grounded, accurate AI.

  2. Application-Focused: Designed to run distributed at the application and agent level, often as a 1:1 or 1:N mapping between app and Spice instance, unlike traditional data systems built for many apps on one centralized database. It’s common to spin up multiple Spice instances—even one per tenant or customer.

  3. Dual-Engine Acceleration: Supports both OLAP (Arrow/DuckDB) and OLTP (SQLite/PostgreSQL) engines at the dataset level, providing flexible performance across analytical and transactional workloads.

  4. Disaggregated Storage: Compute is separated from storage, co-locating local, materialized working sets of data with applications, dashboards, or ML pipelines while source data remains in its original storage.

  5. Edge to Cloud Native: Deploy as a standalone instance, Kubernetes sidecar, microservice, or cluster—across edge/POP, on-prem, and public clouds. Chain multiple Spice instances for tier-optimized, distributed deployments.

How does Spice compare?

Data Query and Analytics

| Feature | Spice | Trino / Presto | Dremio | ClickHouse | Materialize |
| -------------------------------- | -------------------------------------- | -------------------- | --------------------- | ------------------- | -------------------- |
| Primary Use-Case | Data & AI apps/agents | Big data analytics | Interactive analytics | Real-time analytics | Real-time analytics |
| Primary deployment model | Sidecar | Cluster | Cluster | Cluster | Cluster |
| Federated Query Support | ✅ | ✅ | ✅ | ― | ― |
| Acceleration/Materialization | ✅ (Arrow, SQLite, DuckDB, PostgreSQL) | Intermediate storage | Reflections (Iceberg) | Materialized views | ✅ (Real-time views) |
| Catalog Support | ✅ (Iceberg, Unity Catalog, AWS Glue) | ✅ | ✅ | ― | ― |
| Query Result Caching | ✅ | ✅ | ✅ | ✅ | Limited |
| Multi-Modal Acceleration | ✅ (OLAP + OLTP) | ― | ― | ― | ― |
| Change Data Capture (CDC) | ✅ (Debezium) | ― | ― | ― | ✅ (Debezium) |

AI Apps and Agents

| Feature | Spice | LangChain | LlamaIndex | AgentOps.ai | Ollama |
| ----------------------------- | ---------------------------------------- | ------------------ | ---------- | ---------------- | ----------------------------- |
| Primary Use-Case | Data & AI apps | Agentic workflows | RAG apps | Agent operations | LLM apps |
| Programming Language | Any language (HTTP interface) | JavaScript, Python | Python | Python | Any language (HTTP interface) |
| Unified Data + AI Runtime | ✅ | ― | ― | ― | ― |
| Federated Data Query | ✅ | ― | ― | ― | ― |
| Accelerated Data Access | ✅ | ― | ― | ― | ― |
| Tools/Functions | ✅ (MCP HTTP+SSE) | ✅ | ✅ | Limited | Limited |
| LLM Memory | ✅ | ✅ | ― | ✅ | ― |
| Evaluations (Evals) | ✅ | Limited | ― | Limited | ― |
| Search | ✅ (Keyword, Vector, & Full-Text-Search) | ✅ | ✅ | Limited | Limited |
| Caching | ✅ (Query and results caching) | Limited | ― | ― | ― |
| Embeddings | ✅ (Built-in & pluggable models/DBs) | ✅ | ✅ | Limited | ― |

✅ = Fully supported · ❌ = Not supported · Limited = Partial or restricted support

Example Use-Cases

Data-grounded Agentic AI Applications

  • OpenAI-compatible API: Connect to hosted models (OpenAI, Anthropic, xAI) or deploy locally (Llama, NVIDIA NIM). AI Gateway Recipe
  • Federated Data Access: Query using SQL and NSQL (text-to-SQL) across databases, data warehouses, and data lakes with advanced query push-down for fast retrieval across disparate data sources. Federated SQL Query Recipe
  • Search and RAG: Search and retrieve context with accelerated embeddings for retrieval-augmented generation (RAG) workflows, including full-text search (FTS) via Tantivy-powered BM25 scoring and vector similarity search (VSS) integrated into SQL queries. Use SQL functions like vector_search for semantic search and text_search for keyword-based search. Supports multi-column vector search with reciprocal rank fusion for aggregated results. Amazon S3 Vectors Cookbook Recipe
  • LLM Memory and Observability: Store and retrieve history and context for AI agents while gaining deep visibility into data flows, model performance, and traces. LLM Memory Recipe | Observability & Monitoring Features Documentation
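
The reciprocal rank fusion (RRF) named above for aggregating multi-column search results can be illustrated with a short, generic sketch. This is not Spice's internal implementation, just the standard technique; `k=60` is the conventional damping constant:

```python
# Reciprocal rank fusion: merge several ranked result lists into one.
# Each document's fused score is the sum over lists of 1 / (k + rank),
# where rank is 1-based and k damps the influence of top-ranked items.

def reciprocal_rank_fusion(rankings, k=60):
    """rankings: list of ranked lists of document ids (best first)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# A document ranked well by both keyword and vector search rises to the top.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```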

Database CDN and Query Mesh

  • Data Acceleration: Co-locate materialized datasets in Arrow, SQLite, and DuckDB with applications for sub-second query. DuckDB Data Accelerator Recipe
  • Resiliency and Local Dataset Replication: Maintain application availability with local replicas of critical datasets. Local Dataset Replication Recipe
  • Responsive Dashboards: Enable fast, real-time analytics by accelerating data for frontends and BI tools. Sales BI Dashboard Demo
  • Simplified Legacy Migration: Use a single endpoint to unify legacy systems with modern infrastructure, including federated SQL querying across multiple sources. Federated SQL Query Recipe

Retrieval-Augmented Generation (RAG)

  • Unified Search with Vector Similarity: Perform efficient vector similarity search across structured and unstructured data sources, now with native support for Amazon S3 Vectors for petabyte-scale vector storage and querying. The Spice runtime manages the vector lifecycle: ingesting data from disparate sources, embedding it using models like Amazon Titan Embeddings or Cohere Embeddings via AWS Bedrock, or MiniLM L6 from HuggingFace, and storing the embeddings in S3 Vector buckets. Supports distance metrics like cosine similarity, Euclidean distance, and dot product. Example SQL: `SELECT * FROM vector_search(my_table, 'search query', 10) WHERE condition ORDER BY score;`. Amazon S3 Vectors Cookbook Recipe
  • Semantic Knowledge Layer: Define a semantic context model to enrich data for AI. Semantic Model Feature Documentation
  • Text-to-SQL: Convert natural language queries into SQL using built-in NSQL and sampling tools for accurate query generation. Text-to-SQL Recipe
  • Model and Data Evaluations: Assess model performance and data quality with integrated evaluation tools. Language Model Evaluations Recipe
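
The distance metrics listed above are simple to state directly. A generic sketch (independent of Spice) of cosine similarity, Euclidean distance, and dot product:

```python
import math

def dot(a, b):
    # Dot product: large when vectors agree in direction and magnitude
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # 1.0 when vectors point the same way, 0.0 when orthogonal
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    # 0.0 for identical vectors; grows with separation
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

query = [1.0, 0.0]
doc = [0.6, 0.8]

cos = cosine_similarity(query, doc)    # ≈ 0.6
dist = euclidean_distance(query, doc)  # ≈ 0.894
dp = dot(query, doc)                   # ≈ 0.6
```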

FAQ

  • Is Spice a cache? Not specifically; you can think of Spice data acceleration as an active cache, materialization, or data prefetcher. A cache fetches data on a cache-miss, while Spice prefetches and materializes filtered data on an interval, on a trigger, or as data changes using CDC. In addition to acceleration, Spice supports results caching.

  • Is Spice a CDN for databases? Yes, a common use-case for Spice is as a CDN for different data sources. Using CDN concepts, Spice enables you to ship (load) a working set of your database (or data lake, or data warehouse) where it's most frequently accessed, like from a data-intensive application or for AI context.
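
The cache-miss vs. prefetch distinction in the FAQ can be sketched abstractly. These are hypothetical classes for illustration, not Spice APIs:

```python
# A read-through cache fetches from the source only on a miss; a prefetcher
# (as the FAQ describes Spice acceleration) materializes the working set
# ahead of time, so reads never pay the source round-trip.

class ReadThroughCache:
    def __init__(self, source):
        self.source = source
        self.store = {}
        self.misses = 0

    def get(self, key):
        if key not in self.store:      # cache miss: fetch lazily
            self.misses += 1
            self.store[key] = self.source[key]
        return self.store[key]

class Prefetcher:
    def __init__(self, source):
        self.source = source
        self.store = {}

    def refresh(self):                 # run on an interval, trigger, or CDC event
        self.store = dict(self.source) # materialize the working set locally

    def get(self, key):
        return self.store[key]         # always served locally

source = {"q1": "res1", "q2": "res2"}
cache = ReadThroughCache(source)
first = cache.get("q1")                # first read pays a miss
pre = Prefetcher(source)
pre.refresh()
local = pre.get("q1")                  # served without touching the source
```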

➡️ Docs FAQ

Watch a 30-sec BI dashboard acceleration demo

https://github.com/spiceai/spiceai/assets/80174/7735ee94-3f4a-4983-a98e-fe766e79e03a

See more demos on YouTube.

Supported Data Connectors

| Name | Description | Status | Protocol/Format |
| ---------------------------------- | ------------------------------------- | ----------------- | ---------------------------- |
| databricks (mode: delta_lake) | Databricks | Stable | S3/Delta Lake |
| delta_lake | Delta Lake | Stable | Delta Lake |
| dremio | Dremio | Stable | Arrow Flight |
| duckdb | DuckDB | Stable | Embedded |
| file | File | Stable | Parquet, CSV |
| github | GitHub | Stable | GitHub API |
| postgres | PostgreSQL | Stable | |
| s3 | S3 | Stable | Parquet, CSV |
| mysql | MySQL | Stable | |
| spice.ai | Spice.ai | Stable | Arrow Flight |
| graphql | GraphQL | Release Candidate | JSON |
| databricks (mode: spark_connect) | Databricks | Beta | Spark Connect |
| flightsql | FlightSQL | Beta | Arrow Flight SQL |
| iceberg | Apache Iceberg | Beta | Parquet |
| mssql | Microsoft SQL Server | Beta | Tabular Data Stream (TDS) |
| odbc | ODBC | Beta | ODBC |
| snowflake | Snowflake | Beta | Arrow |
| spark | Spark | Beta | Spark Connect |
| oracle | Oracle | Alpha | Oracle ODPI-C |
| abfs | Azure BlobFS | Alpha | Parquet, CSV |
| clickhouse | Clickhouse | Alpha | |
| debezium | Debezium CDC | Alpha | Kafka + JSON |
| kafka | Kafka | Alpha | Kafka + JSON |
| dynamodb | Amazon DynamoDB | Alpha | |
| ftp, sftp | FTP/SFTP | Alpha | Parquet, CSV |
| glue | AWS Glue | Alpha | Iceberg, Parquet, CSV |
| http, https | HTTP(s) | Alpha | Parquet, CSV |
| imap | IMAP | Alpha | IMAP Emails |
| localpod | Local dataset replication | Alpha | |
| sharepoint | Microsoft SharePoint | Alpha | Unstructured UTF-8 documents |
| mongodb | MongoDB | Coming Soon | |
| elasticsearch | ElasticSearch | Roadmap | |

Supported Data Accelerators

| Name | Description | Status | Engine Modes |
| ---------- | -------------------------------- | ----------------- | ---------------- |
| arrow | In-Memory Arrow Records | Stable | memory |
| duckdb | Embedded DuckDB | Stable | memory, file |
| postgres | Attached PostgreSQL | Release Candidate | N/A |
| sqlite | Embedded SQLite | Release Candidate | memory, file |
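
The engine and mode are chosen per dataset. A hedged sketch using the demo S3 path that appears in the quickstart logs; the `acceleration` field names follow the Spice docs, so verify against your version:

```yaml
datasets:
  - from: s3://spiceai-demo-datasets/taxi_trips/2024/
    name: taxi_trips
    acceleration:
      enabled: true
      engine: duckdb   # one of the accelerators in the table above
      mode: file       # persist the accelerated data to disk
```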

Supported Model Providers

| Name | Description | Status | ML Format(s) | LLM Format(s) |
| ------------- | -------------------------------------------- | ----------------- | ------------ | ------------------------------- |
| openai | OpenAI (or compatible) LLM endpoint | Release Candidate | - | OpenAI-compatible HTTP endpoint |
| file | Local filesystem | Release Candidate | ONNX | GGUF, GGML, SafeTensor |
| huggingface | Models hosted on HuggingFace | Release Candidate | ONNX | GGUF, GGML, SafeTensor |
| spice.ai | Models hosted on the Spice.ai Cloud Platform | | ONNX | OpenAI-compatible HTTP endpoint |
| azure | Azure OpenAI | | - | OpenAI-compatible HTTP endpoint |
| anthropic | Models hosted on Anthropic | Alpha | - | OpenAI-compatible HTTP endpoint |
| xai | Models hosted on xAI | Alpha | - | OpenAI-compatible HTTP endpoint |

Supported Embeddings Providers

| Name | Description | Status | ML Format(s) | LLM Format(s)* |
| ------------- | ----------------------------------- | ----------------- | ------------ | ------------------------------- |
| openai | OpenAI (or compatible) LLM endpoint | Release Candidate | - | OpenAI-compatible HTTP endpoint |
| file | Local filesystem | Release Candidate | ONNX | GGUF, GGML, SafeTensor |
| huggingface | Models hosted on HuggingFace | Release Candidate | ONNX | GGUF, GGML, SafeTensor |
| azure | Azure OpenAI | Alpha | - | OpenAI-compatible HTTP endpoint |
| bedrock | AWS Bedrock (e.g., Titan, Cohere) | Alpha | - | OpenAI-compatible HTTP endpoint |

Supported Vector Stores

| Name | Description | Status |
| --------------- | -------------------------------------------------------------------- | ------ |
| s3_vectors | Amazon S3 Vectors for petabyte-scale vector storage and querying | Alpha |
| pgvector | PostgreSQL with pgvector extension | Alpha |
| duckdb_vector | DuckDB with vector extension for efficient vector storage and search | Alpha |
| sqlite_vec | SQLite with sqlite-vec extension for lightweight vector operations | Alpha |

Supported Catalogs

Catalog Connectors connect to external catalog providers and make their tables available for federated SQL query in Spice. Configuring accelerations for tables in external catalogs is not supported. The schema hierarchy of the external catalog is preserved in Spice.

| Name | Description | Status | Protocol/Format |
| --------------- | ----------------------- | ------ | ---------------------------- |
| spice.ai | Spice.ai Cloud Platform | Stable | Arrow Flight |
| unity_catalog | Unity Catalog | Stable | Delta Lake |
| databricks | Databricks | Beta | Spark Connect, S3/Delta Lake |
| iceberg | Apache Iceberg | Beta | Parquet |
| glue | AWS Glue | Alpha | CSV, Parquet, Iceberg |
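
A catalog is attached in `spicepod.yaml` much like a dataset. A minimal sketch for the spice.ai catalog connector; the `from`/`name` fields follow the Spicepod conventions used elsewhere in this README, and per-provider `from` syntax is in the docs:

```yaml
catalogs:
  - from: spice.ai
    name: spiceai
```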

⚡️ Quickstart (Local Machine)

https://github.com/spiceai/spiceai/assets/88671039/85cf9a69-46e7-412e-8b68-22617dcbd4e0

Installation

Install the Spice CLI:

On macOS, Linux, and WSL:

```bash
curl https://install.spiceai.org | /bin/bash
```

Or using brew:

```bash
brew install spiceai/spiceai/spice
```

On Windows using PowerShell:

```powershell
iex ((New-Object System.Net.WebClient).DownloadString("https://install.spiceai.org/Install.ps1"))
```

Usage

Step 1. Initialize a new Spice app with the spice init command:

```bash
spice init spice_qs
```

A spicepod.yaml file is created in the spice_qs directory. Change to that directory:

```bash
cd spice_qs
```

Step 2. Start the Spice runtime:

```bash
spice run
```

Example output:

```bash
2025/01/20 11:26:10 INFO Spice.ai runtime starting...
2025-01-20T19:26:10.679068Z INFO runtime::init::dataset: No datasets were configured. If this is unexpected, check the Spicepod configuration.
2025-01-20T19:26:10.679716Z INFO runtime::flight: Spice Runtime Flight listening on 127.0.0.1:50051
2025-01-20T19:26:10.679786Z INFO runtime::metrics_server: Spice Runtime Metrics listening on 127.0.0.1:9090
2025-01-20T19:26:10.680140Z INFO runtime::http: Spice Runtime HTTP listening on 127.0.0.1:8090
2025-01-20T19:26:10.682080Z INFO runtime::opentelemetry: Spice Runtime OpenTelemetry listening on 127.0.0.1:50052
2025-01-20T19:26:10.879126Z INFO runtime::init::results_cache: Initialized results cache; max size: 128.00 MiB, item ttl: 1s
```

The runtime is now started and ready for queries.

Step 3. In a new terminal window, add the spiceai/quickstart Spicepod. A Spicepod is a package of configuration defining datasets and ML models.

```bash
spice add spiceai/quickstart
```

The spicepod.yaml file will be updated with the spiceai/quickstart dependency.

```yaml
version: v1
kind: Spicepod
name: spice_qs
dependencies:
  - spiceai/quickstart
```

The spiceai/quickstart Spicepod will add a taxi_trips data table to the runtime, which is now available to query with SQL.

```bash
2025-01-20T19:26:30.011633Z INFO runtime::init::dataset: Dataset taxi_trips registered (s3://spiceai-demo-datasets/taxi_trips/2024/), acceleration (arrow), results cache enabled.
2025-01-20T19:26:30.013002Z INFO runtime::accelerated_table::refresh_task: Loading data for dataset taxi_trips
2025-01-20T19:26:40.312839Z INFO runtime::accelerated_table::refresh_task: Loaded 2,964,624 rows (399.41 MiB) for dataset taxi_trips in 10s 299ms
```

Step 4. Start the Spice SQL REPL:

```bash
spice sql
```

The SQL REPL interface will be shown:

```bash
Welcome to the Spice.ai SQL REPL! Type 'help' for help.

show tables; -- list available tables
sql>
```

Enter show tables; to display the available tables for query:

```bash
sql> show tables;
+---------------+--------------+---------------+------------+
| table_catalog | table_schema | table_name    | table_type |
+---------------+--------------+---------------+------------+
| spice         | public       | taxi_trips    | BASE TABLE |
| spice         | runtime      | query_history | BASE TABLE |
| spice         | runtime      | metrics       | BASE TABLE |
+---------------+--------------+---------------+------------+

Time: 0.022671708 seconds. 3 rows.
```

Enter a query to display the longest taxi trips:

```sql
SELECT trip_distance, total_amount
FROM taxi_trips
ORDER BY trip_distance DESC
LIMIT 10;
```

Output:

```bash
+---------------+--------------+
| trip_distance | total_amount |
+---------------+--------------+
| 312722.3      | 22.15        |
| 97793.92      | 36.31        |
| 82015.45      | 21.56        |
| 72975.97      | 20.04        |
| 71752.26      | 49.57        |
| 59282.45      | 33.52        |
| 59076.43      | 23.17        |
| 58298.51      | 18.63        |
| 51619.36      | 24.2         |
| 44018.64      | 52.43        |
+---------------+--------------+

Time: 0.045150667 seconds. 10 rows.
```

⚙️ Runtime Container Deployment

Using the Docker image locally:

```bash
docker pull spiceai/spiceai
```

In a Dockerfile:

```dockerfile
FROM spiceai/spiceai:latest
```

Using Helm:

```bash
helm repo add spiceai https://helm.spiceai.org
helm install spiceai spiceai/spiceai
```

🏎️ Next Steps

Explore the Spice.ai Cookbook

The Spice.ai Cookbook is a collection of recipes and examples for using Spice. Find it at https://github.com/spiceai/cookbook.

Using Spice.ai Cloud Platform

Access ready-to-use Spicepods and datasets hosted on the Spice.ai Cloud Platform using the Spice runtime. A list of public Spicepods is available on Spicerack: https://spicerack.org/.

To use public datasets, create a free account on Spice.ai:

  1. Visit spice.ai and click Try for Free.

  2. After creating an account, create an app to generate an API key.

Once set up, you can access ready-to-use Spicepods including datasets. For this demonstration, use the taxi_trips dataset from the Spice.ai Quickstart.

Step 1. Initialize a new project.

```bash
# Initialize a new Spice app
spice init spice_app

# Change to app directory
cd spice_app
```

Step 2. Log in and authenticate from the command line using the spice login command. A browser window will pop up prompting you to authenticate:

```bash
spice login
```

Step 3. Start the runtime:

```bash
# Start the runtime
spice run
```

Step 4. Configure the dataset:

In a new terminal window, configure a new dataset using the spice dataset configure command:

```bash
spice dataset configure
```

Enter a dataset name that will be used to reference the dataset in queries. This name does not need to match the name in the dataset source.

```bash
dataset name: (spice_app) taxi_trips
```

Enter the description of the dataset:

```bash
description: Taxi trips dataset
```

Enter the location of the dataset:

```bash
from: spice.ai/spiceai/quickstart/datasets/taxi_trips
```

Select y when prompted whether to accelerate the data:

```bash
Locally accelerate (y/n)? y
```

You should see the following output from your runtime terminal:

```bash
2024-12-16T05:12:45.803694Z INFO runtime::init::dataset: Dataset taxi_trips registered (spice.ai/spiceai/quickstart/datasets/taxi_trips), acceleration (arrow, 10s refresh), results cache enabled.
2024-12-16T05:12:45.805494Z INFO runtime::accelerated_table::refresh_task: Loading data for dataset taxi_trips
2024-12-16T05:13:24.218345Z INFO runtime::accelerated_table::refresh_task: Loaded 2,964,624 rows (8.41 GiB) for dataset taxi_trips in 38s 412ms.
```

Step 5. In a new terminal window, use the Spice SQL REPL to query the dataset

```bash
spice sql
```

```sql
SELECT tpep_pickup_datetime, passenger_count, trip_distance FROM taxi_trips LIMIT 10;
```

The output displays the results of the query along with the query execution time:

```bash
+----------------------+-----------------+---------------+
| tpep_pickup_datetime | passenger_count | trip_distance |
+----------------------+-----------------+---------------+
| 2024-01-11T12:55:12  | 1               | 0.0           |
| 2024-01-11T12:55:12  | 1               | 0.0           |
| 2024-01-11T12:04:56  | 1               | 0.63          |
| 2024-01-11T12:18:31  | 1               | 1.38          |
| 2024-01-11T12:39:26  | 1               | 1.01          |
| 2024-01-11T12:18:58  | 1               | 5.13          |
| 2024-01-11T12:43:13  | 1               | 2.9           |
| 2024-01-11T12:05:41  | 1               | 1.36          |
| 2024-01-11T12:20:41  | 1               | 1.11          |
| 2024-01-11T12:37:25  | 1               | 2.04          |
+----------------------+-----------------+---------------+

Time: 0.00538925 seconds. 10 rows.
```

You can compare how long queries take against non-accelerated datasets by changing the acceleration setting from true to false in the datasets.yaml file.
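
For reference, the toggle looks roughly like this in the dataset configuration (the `from` path comes from the steps above; verify against the file generated by spice dataset configure):

```yaml
datasets:
  - from: spice.ai/spiceai/quickstart/datasets/taxi_trips
    name: taxi_trips
    acceleration:
      enabled: false   # set back to true to re-enable local acceleration
```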

📄 Documentation

Comprehensive documentation is available at spiceai.org/docs.

Over 45 quickstarts and samples available in the Spice Cookbook.

🔌 Extensibility

Spice.ai is designed to be extensible with extension points documented at EXTENSIBILITY.md. Build custom Data Connectors, Data Accelerators, Catalog Connectors, Secret Stores, Models, or Embeddings.

🔨 Upcoming Features

🚀 See the Roadmap for upcoming features.

🤝 Connect with us

We greatly appreciate and value your support! You can help Spice in a number of ways:

⭐️ star this repo! Thank you for your support! 🙏

Owner

  • Name: Spice.ai OSS
  • Login: spiceai
  • Kind: organization
  • Email: hey@spiceai.org
  • Location: Seattle, Washington

Time series AI designed for developers.

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 3,305
  • Total Committers: 41
  • Avg Commits per committer: 80.61
  • Development Distribution Score (DDS): 0.787
Past Year
  • Commits: 2,301
  • Committers: 29
  • Avg Commits per committer: 79.345
  • Development Distribution Score (DDS): 0.794
Top Committers
Name Email Commits
Phillip LeBlanc p****p@l****h 705
Sergei Grebnov s****v@g****m 476
Jack Eadie j****k@s****i 364
Luke Kim 8****m 345
peasee 9****e 342
Evgenii Khramkov e****i@s****i 242
Qianqian 1****n 235
yfu f****6@g****m 142
github-actions[bot] 4****] 130
dependabot[bot] 4****] 60
Mitch 8****t 55
Aurash Behbahani a****b@g****m 42
Lane Harris l****e@s****o 32
Kirill Khramkov h****r@g****m 27
Kevin Zimmerman 4****m 24
Corentin c****o@m****m 17
Scott Lyons s****t@s****i 15
Roy Rotstein 1****7 6
David Stancu d****u@b****m 6
karifabri 1****i 5
Darin Douglass 1****n 4
Advayp 6****p 3
Alexander Hirner 6****r 3
Edmondo Porcu e****u@g****m 3
Adarsh Mishra a****8@g****m 2
Garam Choi S****2@g****m 2
Michael Ilie m****g@g****m 2
Nicolas Lamirault n****t@g****m 2
Sai Boorlagadda s****a@g****m 2
Spice Schema Bot s****t@s****i 1
and 11 more...
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 2,015
  • Total pull requests: 5,223
  • Average time to close issues: 28 days
  • Average time to close pull requests: 1 day
  • Total issue authors: 38
  • Total pull request authors: 44
  • Average comments per issue: 0.53
  • Average comments per pull request: 0.27
  • Merged pull requests: 4,039
  • Bot issues: 5
  • Bot pull requests: 935
Past Year
  • Issues: 1,417
  • Pull requests: 3,824
  • Average time to close issues: 10 days
  • Average time to close pull requests: about 22 hours
  • Issue authors: 28
  • Pull request authors: 29
  • Average comments per issue: 0.46
  • Average comments per pull request: 0.34
  • Merged pull requests: 2,892
  • Bot issues: 3
  • Bot pull requests: 829
Top Authors
Issue Authors
  • peasee (349)
  • phillipleblanc (348)
  • sgrebnov (348)
  • Sevenannn (270)
  • lukekim (157)
  • Jeadie (141)
  • ewgenius (95)
  • y-f-u (67)
  • digadeesh (66)
  • kczimm (58)
  • slyons (33)
  • Advayp (25)
  • boer0924 (7)
  • mitchdevenport (5)
  • Corentin-pro (5)
Pull Request Authors
  • phillipleblanc (849)
  • sgrebnov (772)
  • Jeadie (633)
  • peasee (587)
  • dependabot[bot] (499)
  • github-actions[bot] (435)
  • Sevenannn (411)
  • ewgenius (324)
  • lukekim (184)
  • y-f-u (149)
  • kczimm (72)
  • mitchdevenport (57)
  • Advayp (53)
  • digadeesh (51)
  • gloomweaver (29)
Top Labels
Issue Labels
kind/bug (335) kind/enhancement (193) kind/task (89) area/data-accelerators (55) endgame (50) area/ai (45) area/data-connectors (36) enhancement (22) good first issue (19) area/observability (15) kind/documentation (14) area/datafusion (11) bug (9) area/cli (7) upstream (7) kind/optimization (5) kind/dependencies (5) area/quality (5) data-connector/spice.ai (4) area/ci (4) rust (4) performance (4) kind/refactor (3) data-connector/duckdb (3) bugbash (3) documentation (2) needs testing (2) data-connector/spark (2) data-connector/s3 (2) breaking change (2)
Pull Request Labels
kind/enhancement (1,396) kind/bug (1,244) kind/dependencies (541) kind/documentation (426) rust (380) area/ai (316) kind/optimization (236) kind/refactor (88) go (81) area/ci (71) kind/task (61) dependencies (55) area/data-connectors (52) breaking change (39) enhancement (33) area/cli (33) github_actions (28) area/quality (21) nomerge (18) javascript (14) endgame (14) python (13) documentation (11) area/data-accelerators (10) bug (8) area/observability (7) area/build (6) kind/performance (5) data-connector/graphql (4) data-connector/github (3)

Packages

  • Total packages: 1
  • Total downloads: unknown
  • Total dependent packages: 1
  • Total dependent repositories: 1
  • Total versions: 103
proxy.golang.org: github.com/spiceai/spiceai
  • Versions: 103
  • Dependent Packages: 1
  • Dependent Repositories: 1
Rankings
Stargazers count: 2.2%
Forks count: 3.9%
Average: 4.2%
Dependent repos count: 4.7%
Dependent packages count: 5.8%
Last synced: 6 months ago

Dependencies

go.mod go
  • cloud.google.com/go/iam v0.3.0
  • github.com/andreyvit/diff v0.0.0-20170406064948-c7f18ee00883
  • github.com/andybalholm/brotli v1.0.4
  • github.com/apache/arrow/go/v7 v7.0.0
  • github.com/benbjohnson/immutable v0.3.0
  • github.com/bradleyjkemp/cupaloy v2.3.0+incompatible
  • github.com/cenkalti/backoff/v4 v4.1.3
  • github.com/cespare/xxhash/v2 v2.1.2
  • github.com/davecgh/go-spew v1.1.1
  • github.com/deepmap/oapi-codegen v1.11.0
  • github.com/dghubble/go-twitter v0.0.0-20220608135633-47eb18e5aab5
  • github.com/dghubble/oauth1 v0.7.1
  • github.com/dghubble/sling v1.4.0
  • github.com/emirpasic/gods v1.12.0
  • github.com/fasthttp/router v1.4.10
  • github.com/fsnotify/fsnotify v1.5.4
  • github.com/gocarina/gocsv v0.0.0-20220531201732-5f969b02b902
  • github.com/goccy/go-json v0.9.7
  • github.com/gofrs/uuid v4.2.0+incompatible
  • github.com/golang-jwt/jwt/v4 v4.4.1
  • github.com/golang/glog v0.0.0-20160126235308-23def4e6c14b
  • github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da
  • github.com/golang/protobuf v1.5.2
  • github.com/google/flatbuffers v2.0.6+incompatible
  • github.com/google/go-cmp v0.5.8
  • github.com/google/go-licenses v1.2.1
  • github.com/google/go-querystring v1.1.0
  • github.com/google/licenseclassifier v0.0.0-20210722185704-3043a050f148
  • github.com/google/uuid v1.3.0
  • github.com/gorilla/websocket v1.5.0
  • github.com/hashicorp/go-cleanhttp v0.5.2
  • github.com/hashicorp/go-retryablehttp v0.7.1
  • github.com/hashicorp/hcl v1.0.0
  • github.com/inconshreveable/mousetrap v1.0.0
  • github.com/influxdata/flux v0.171.0
  • github.com/influxdata/influxdb-client-go v1.4.0
  • github.com/influxdata/line-protocol v0.0.0-20210922203350-b1ad95c89adf
  • github.com/jbenet/go-context v0.0.0-20150711004518-d14ea06fba99
  • github.com/kevinburke/ssh_config v0.0.0-20190725054713-01f96b0aa0cd
  • github.com/klauspost/compress v1.15.6
  • github.com/logrusorgru/aurora v2.0.3+incompatible
  • github.com/magiconair/properties v1.8.6
  • github.com/mattn/go-runewidth v0.0.13
  • github.com/mitchellh/go-homedir v1.1.0
  • github.com/mitchellh/mapstructure v1.5.0
  • github.com/olekukonko/tablewriter v0.0.5
  • github.com/opentracing/opentracing-go v1.2.0
  • github.com/otiai10/copy v1.6.0
  • github.com/pelletier/go-toml v1.9.5
  • github.com/pelletier/go-toml/v2 v2.0.2
  • github.com/pierrec/lz4/v4 v4.1.15
  • github.com/pkg/errors v0.9.1
  • github.com/pmezard/go-difflib v1.0.0
  • github.com/rivo/uniseg v0.2.0
  • github.com/savsgio/gotils v0.0.0-20220530130905-52f3993e8d6d
  • github.com/sergi/go-diff v1.2.0
  • github.com/spf13/afero v1.8.2
  • github.com/spf13/cast v1.5.0
  • github.com/spf13/cobra v1.4.0
  • github.com/spf13/jwalterweatherman v1.1.0
  • github.com/spf13/pflag v1.0.5
  • github.com/spf13/viper v1.12.0
  • github.com/spiceai/data-components-contrib v0.0.0-20220421015910-24f18a850757
  • github.com/src-d/gcfg v1.4.0
  • github.com/stretchr/testify v1.7.2
  • github.com/subosito/gotenv v1.4.0
  • github.com/uber/jaeger-client-go v2.30.0+incompatible
  • github.com/uber/jaeger-lib v2.4.1+incompatible
  • github.com/valyala/bytebufferpool v1.0.0
  • github.com/valyala/fasthttp v1.37.0
  • github.com/xanzy/ssh-agent v0.2.1
  • go.opencensus.io v0.23.0
  • go.uber.org/atomic v1.9.0
  • go.uber.org/multierr v1.8.0
  • go.uber.org/zap v1.21.0
  • golang.org/x/crypto v0.0.0-20220513210258-46612604a0f9
  • golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4
  • golang.org/x/net v0.0.0-20220617184016-355a448f1bc9
  • golang.org/x/sync v0.0.0-20220601150217-0de741cfad7f
  • golang.org/x/sys v0.0.0-20220615213510-4f61da869c0c
  • golang.org/x/text v0.3.7
  • golang.org/x/tools v0.1.11
  • golang.org/x/xerrors v0.0.0-20220609144429-65e65417b02f
  • google.golang.org/genproto v0.0.0-20220617124728-180714bec0ad
  • google.golang.org/grpc v1.47.0
  • google.golang.org/protobuf v1.28.0
  • gopkg.in/ini.v1 v1.66.6
  • gopkg.in/natefinch/lumberjack.v2 v2.0.0
  • gopkg.in/src-d/go-billy.v4 v4.3.2
  • gopkg.in/src-d/go-git.v4 v4.13.1
  • gopkg.in/warnings.v0 v0.1.2
  • gopkg.in/yaml.v2 v2.4.0
  • gopkg.in/yaml.v3 v3.0.1
go.sum go
  • 1221 dependencies
ai/src/requirements/common.txt pypi
  • Keras-Preprocessing ==1.1.2
  • Markdown ==3.3.4
  • Werkzeug ==2.0.1
  • absl-py ==1.1.0
  • astunparse ==1.6.3
  • attrs ==21.2.0
  • cachetools ==4.2.2
  • certifi ==2021.5.30
  • charset-normalizer ==2.0.4
  • clang ==5.0
  • cloudpickle ==2.0.0
  • decorator ==5.1.1
  • dm-tree ==0.1.6
  • flatbuffers ==1.12
  • gast ==0.4.0
  • google-auth ==1.35.0
  • google-auth-oauthlib ==0.4.5
  • google-pasta ==0.2.0
  • grpcio ==1.44.0
  • h5py ==3.1.0
  • humanize ==3.11.0
  • idna ==3.2
  • iniconfig ==1.1.1
  • keras ==2.9.0
  • libclang ==14.0.1
  • numpy ==1.22.4
  • oauthlib ==3.1.1
  • opt-einsum ==3.3.0
  • packaging ==21.0
  • pandas ==1.4.2
  • pluggy ==0.13.1
  • protobuf ==3.19.4
  • psutil ==5.8.0
  • py ==1.11.0
  • pyarrow ==8.0.0
  • pyasn1 ==0.4.8
  • pyasn1-modules ==0.2.8
  • pyparsing ==2.4.7
  • python-dateutil ==2.8.2
  • pytz ==2021.1
  • pyzmq ==22.3.0
  • requests ==2.26.0
  • requests-oauthlib ==1.3.0
  • rsa ==4.7.2
  • six ==1.15.0
  • tensorboard ==2.9.1
  • tensorboard-data-server ==0.6.1
  • tensorboard-plugin-wit ==1.8.1
  • tensorflow ==2.9.1
  • tensorflow-estimator ==2.9.0
  • tensorflow-io-gcs-filesystem ==0.24.0
  • tensorflow-probability ==0.15.0
  • termcolor ==1.1.0
  • tf-estimator-nightly ==2.8.0.dev2021122109
  • toml ==0.10.2
  • tornado ==6.1
  • traitlets ==5.1.0
  • typing_extensions ==4.0.1
  • urllib3 ==1.26.6
  • wrapt ==1.12.1
ai/src/requirements/development.txt pypi
  • grpcio-tools *
  • jupyter-client ==6.1.12
  • jupyter-core ==4.7.1
  • pylint ==2.12.2
  • pytest ==6.2.3
  • pytest-timeout ==1.4.2
.github/workflows/build_and_release.yml actions
  • actions/checkout v3 composite
  • actions/download-artifact master composite
  • actions/download-artifact v3 composite
  • actions/setup-go v3 composite
  • actions/upload-artifact v3 composite
.github/workflows/codeql-analysis.yml actions
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/autobuild v2 composite
  • github/codeql-action/init v2 composite
.github/workflows/e2e_test.yml actions
  • actions/cache v2 composite
  • actions/checkout v3 composite
  • actions/download-artifact v3 composite
  • actions/setup-go v3 composite
  • actions/setup-python v1 composite
  • actions/upload-artifact v3 composite
  • docker/build-push-action v2 composite
.github/workflows/license_check.yml actions
  • actions/checkout v3 composite
  • actions/setup-go v3 composite
  • actions/setup-python v1 composite
.github/workflows/linter.yml actions
  • actions/checkout v3 composite
  • actions/setup-go v3 composite
  • actions/setup-python v1 composite
  • golangci/golangci-lint-action v2 composite
.github/workflows/pr_test.yml actions
  • actions/cache v2 composite
  • actions/checkout v3 composite
  • actions/setup-go v3 composite
  • actions/setup-python v1 composite
.github/workflows/spiced_docker.yml actions
  • actions/checkout v3 composite
  • docker/build-push-action v2.5.0 composite
  • docker/login-action v1 composite
  • docker/setup-buildx-action v1 composite
docker/Dockerfile docker
  • golang latest build
  • node 16 build
  • python 3.8.12-slim build
ai/src/requirements/production.txt pypi