Spice AI – Open Source Data & AI Inference Engine
Spice AI is an open-source data and AI inference engine designed to help developers build AI-powered applications that are grounded in enterprise data. It enables SQL query federation, data acceleration, search, and retrieval, integrating AI models and data across modern and legacy sources. The platform emphasizes speed, scalability, and a developer-friendly experience, with a focus on deploying AI capabilities close to data sources and in production-ready environments.
Key capabilities include:
- SQL Query Federation: Join data across databases, data warehouses, data lakes, and APIs using familiar SQL syntax in a single query.
- Data Acceleration: Fast, low-latency query, search, and AI retrieval with real-time or near-real-time performance.
- AI Inference & Training: Load local models (e.g., Llama3) or connect to hosted AI services (OpenAI, xAI, NVIDIA) and run ML/AI workloads.
- Compute Engine: A portable, high-performance Rust-based compute engine built on Apache Arrow and DataFusion for efficient in-memory processing.
- Building Blocks: Composable components for data access, acceleration, search, retrieval, and AI inference that can be stitched together to power apps and agents grounded in data.
- Connectors: 30+ connectors to sources ranging from Databricks and MySQL to CSV files on FTP, accessible over standard protocols (ODBC, JDBC, ADBC, HTTP, Apache Arrow Flight).
- Real-Time & CDC: Change Data Capture (CDC) support to keep accelerations and indexes up to date with live data.
- Developer Experience: Minimal setup and code requirements (three lines of code to get started on the Spice Cloud Platform), backed by robust SDKs.
- Ecosystem & SDKs: Node.js, Python, Go, Rust, with additional SDKs and ecosystem libraries (Pandas, PyTorch, TensorFlow, etc.).
- Private Data & Security: Enterprise-grade infrastructure with SOC 2 Type II options for secure data handling and governance.
- Data & AI Store: Support for stores such as DuckDB and SQLite, plus private datasets and views for SQL querying and sharing.
- Private & Public Deployments: Flexible deployment options to fit cloud, on-premises, or hybrid environments.
Core advantages include rapid access to real-time data, integrated AI capabilities, and a modular architecture that enables teams to quickly build data-driven AI applications and agents grounded in actual enterprise data.
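As a rough illustration of what "joining data across sources in a single query" means, the sketch below uses only Python's standard sqlite3 module. It is not Spice AI code: the two in-memory databases, the table names (customers, orders), and the helper customer_totals are hypothetical stand-ins for independent sources that a federation engine would reach through its connectors.

```python
import sqlite3

# Two separate in-memory databases stand in for two independent sources.
# ATTACH lets one connection address both, similar in spirit to federation.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS warehouse")

conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE warehouse.orders (customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO main.customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
conn.executemany(
    "INSERT INTO warehouse.orders VALUES (?, ?)",
    [(1, 40.0), (1, 2.5), (2, 10.0)],
)


def customer_totals(connection):
    """Join rows from both 'sources' in one SQL statement."""
    return connection.execute(
        """
        SELECT c.name, SUM(o.total) AS total
        FROM main.customers AS c
        JOIN warehouse.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


print(customer_totals(conn))  # [('Ada', 42.5), ('Grace', 10.0)]
```

In a real deployment the two sides of the join would typically live in different systems (for example a database and a data lake), and the engine, not the application, would decide how to fetch and combine them.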
How Spice AI Works
- Query across diverse data sources with SQL, leveraging data federation to join datasets in a single query.
- Use the built-in compute engine to execute AI-intensive workloads close to data, supported by local or hosted models.
- Apply data acceleration and AI retrieval to power fast, interactive experiences and AI-enabled applications.
- Orchestrate data access, acceleration, search, retrieval, and AI inference via composable building blocks to create apps and agents.
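The acceleration step above can be pictured as materializing a working copy of remote data locally, so that repeated queries do not hit the original source. The following is a minimal sketch of that idea under stated assumptions, not the Spice AI implementation: AcceleratedTable, slow_source, and the items table are hypothetical names, and an in-memory SQLite database plays the role of the local store.

```python
import sqlite3
from typing import Callable, Iterable, List, Tuple

Row = Tuple[int, str]


class AcceleratedTable:
    """Hypothetical helper: keeps a local in-memory copy of a remote dataset."""

    def __init__(self, fetch_source: Callable[[], Iterable[Row]]) -> None:
        self._fetch_source = fetch_source
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, label TEXT)")
        self.refresh()

    def refresh(self) -> None:
        """Re-materialize the local copy from the source (a full refresh)."""
        rows = list(self._fetch_source())
        with self._db:  # commit the delete and reload as one transaction
            self._db.execute("DELETE FROM items")
            self._db.executemany("INSERT INTO items VALUES (?, ?)", rows)

    def query(self, sql: str, params: tuple = ()) -> List[tuple]:
        """Answer a query from the local copy without touching the source."""
        return self._db.execute(sql, params).fetchall()


# Count how often the (pretend) remote source is actually contacted.
source_calls = {"count": 0}


def slow_source() -> List[Row]:
    source_calls["count"] += 1
    return [(1, "alpha"), (2, "beta")]


table = AcceleratedTable(slow_source)
first = table.query("SELECT label FROM items WHERE id = ?", (2,))
second = table.query("SELECT COUNT(*) FROM items")
```

Both queries are served from the local copy, so the source is contacted once, at construction time. A full refresh is the simplest policy; keeping the copy current incrementally is what change data capture is for.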
Core Features
- Open-source data and AI inference engine
- SQL Query Federation across databases, warehouses, lakes, and APIs
- Data Acceleration for fast, low-latency queries and AI retrieval
- Load and serve local AI models (e.g., Llama3) or connect to hosted platforms (OpenAI, xAI, NVIDIA)
- Rust-based Compute Engine on Apache Arrow/DataFusion for performance
- Building blocks that can be composed to build data-grounded AI apps and agents
- 30+ Connectors to modern and legacy sources (Databricks, MySQL, CSV/FTP, etc.)
- Industry-standard protocols: ODBC, JDBC, ADBC, HTTP, Arrow Flight (gRPC)
- Change Data Capture (CDC) for real-time updates to accelerations
- Three-line-code onboarding in Spice Cloud Platform
- SDKs across Node.js, Python, Go, Rust; interoperability with Pandas, PyTorch, TensorFlow
- Private datasets and views accessible via SQL; optional sharing
- Petabyte-scale data access for applications and ML use cases
- Enterprise-grade infrastructure with SOC 2 Type II compliance options
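To make the CDC item in the list above more concrete, here is a small sketch of how a stream of change events could keep a local, accelerated copy in step with its source. It is an assumption-laden illustration rather than Spice AI's actual mechanism: the ChangeEvent shape, the operation names ("insert", "update", "delete"), and apply_changes are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change record emitted by a source database."""

    op: str  # "insert", "update", or "delete"
    key: int
    value: Optional[dict] = None


def apply_changes(table: Dict[int, dict], events: Iterable[ChangeEvent]) -> Dict[int, dict]:
    """Apply events in order so the local copy mirrors the source."""
    for event in events:
        if event.op in ("insert", "update"):
            if event.value is None:
                raise ValueError(f"{event.op} event for key {event.key} has no value")
            table[event.key] = dict(event.value)  # copy to avoid aliasing
        elif event.op == "delete":
            table.pop(event.key, None)  # deleting a missing key is a no-op
        else:
            raise ValueError(f"unknown operation: {event.op!r}")
    return table


# Local copy that starts with one row, then receives three changes.
accelerated = {1: {"status": "new"}}
apply_changes(
    accelerated,
    [
        ChangeEvent("update", 1, {"status": "paid"}),
        ChangeEvent("insert", 2, {"status": "new"}),
        ChangeEvent("delete", 1),
    ],
)
print(accelerated)  # {2: {'status': 'new'}}
```

Applying only the changes, instead of reloading the whole dataset, is what lets an acceleration or index stay close to live data at low cost.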
How to Get Started
- Get Started with Just Three Lines of Code: query data and run AI workloads quickly via the Spice Cloud Platform.
- Sign up for the Spice Cloud Platform to access its full planet-scale SQL query and ML capabilities, or run the self-hosted open-source components.
- Explore examples and ecosystem libraries to accelerate development.
Safety, Privacy & Compliance
- Enterprise-grade security and governance features with SOC 2 Type II options.
- Data stays within controlled environments according to deployment choice and organizational policies.
Related Resources
- Documentation: Get Started, Read the Docs
- Case Studies: Real-world deployments and use cases
- Community & Tutorials: Example libraries and community contributions
Core Features Summary
- Open Source Data & AI Inference Engine
- SQL Query Federation across data sources
- Data Acceleration with fast in-memory processing
- Local and Hosted AI model support
- Rust-based, Arrow/DataFusion-powered Compute Engine
- Composable Building Blocks for data and AI apps
- 30+ Data Source Connectors with standard protocols
- Real-time Data indexing with CDC
- Three-line Code Onboarding in Spice Cloud Platform
- Rich SDKs (Node.js, Python, Go, Rust) and ecosystem compatibility
- Private Datasets, Views, and SQL-based access
- Enterprise-grade infrastructure with SOC 2 Type II options