Cloud Independent Data Platform

Own your data.
Run anywhere.

A sovereign, open-source data platform built on Apache Iceberg, Polaris, SQE, and OPA. Enterprise-grade security, AI-assisted analytics, and total cloud independence — zero proprietary lock-in. Runs on Kubernetes: hyperscaler, sovereign cloud, private cloud, or on-prem.

Open standards end to end: Apache Iceberg · Polaris · OPA · Keycloak · S3. No vendor IAM, no proprietary format.

[Screenshot: workspace home with Discover, Build, Govern, and Admin]

The problem

Your data infrastructure serves the vendor — not the business.

Vendor lock-in

Cloud-native warehouses create deep dependency on a single provider. Migrating away costs millions and takes years.

Data sovereignty

GDPR, DORA, and NIS2 demand control over where data lives and who can access it. Proprietary platforms make compliance a moving target.

Fragmented tooling

Catalogs, query engines, access control, and storage — each with its own API, auth model, and upgrade cycle. Rising cost, audit nightmares.

The solution

One platform. Zero proprietary dependencies.

A fully integrated data platform built exclusively on open standards — every layer governed by a vendor-neutral body, swappable at will.

Capability | Technology | Lock-in risk
Data format | Apache Iceberg | none (open table format)
Data catalog | Apache Polaris | none (open REST catalog)
Query engine | SQE (Flight SQL) · Trino · Spark Connect | none (ANSI SQL, JDBC/ODBC)
Identity | Keycloak / OpenID Connect | none (any OIDC provider)
Authorization | OPA + Rego policies | none (open policy standard)
Storage | Any S3-compatible backend | none (standard S3 API)

What it does

Catalog, query, dbt, AI, and governance — in one place.

Data Explorer

Browse catalogs, namespaces, and tables with schema inspection and Databricks-style column profiling.
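
Because the catalog is an open Iceberg REST catalog, the same browsing can be scripted outside the UI. The sketch below only builds the request URLs and reads a table-listing response; it sends nothing. The base URL and the catalog name are assumptions for illustration, so check the values your own Polaris installation exposes.

```python
from urllib.parse import quote

# Assumed base path: Polaris deployments commonly serve the Iceberg REST API
# under /api/catalog, but the host and path here are illustrative only.
BASE_URL = "https://polaris.example.internal/api/catalog"

# The Iceberg REST spec joins multi-level namespace parts with the
# unit separator (0x1F) before URL-encoding them into the path.
NAMESPACE_SEPARATOR = "\x1f"


def namespaces_url(base_url: str, catalog: str) -> str:
    """URL that lists the top-level namespaces of one catalog."""
    return f"{base_url}/v1/{quote(catalog, safe='')}/namespaces"


def tables_url(base_url: str, catalog: str, namespace: list[str]) -> str:
    """URL that lists the tables inside a (possibly nested) namespace."""
    encoded = quote(NAMESPACE_SEPARATOR.join(namespace), safe="")
    return f"{namespaces_url(base_url, catalog)}/{encoded}/tables"


def table_names(list_tables_response: dict) -> list[str]:
    """Turn a ListTablesResponse body into dotted table names."""
    return [
        ".".join(item["namespace"] + [item["name"]])
        for item in list_tables_response.get("identifiers", [])
    ]
```

A GET on `tables_url(BASE_URL, "lake", ["sales", "eu"])` with a valid bearer token would return a JSON body that `table_names` can flatten; `lake` and the namespace are made-up names.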

Query Editor

Execute SQL via SQE (Arrow Flight SQL), Trino, or Spark Connect with inline results, history, and row counts.

dbt Workspace

Browser-based IDE: file editor, git operations, build/test/run, and dbt model lineage visualization.

AI Flows

Context-aware assistant powered by a local LLM with MCP tool integration. Schema-aware natural-language SQL.

Data Contracts

Visual ODCS v3.1.0 editor with quality checks, import from catalog, and dbt export.

Fine-grained RBAC

Table-level access control via OPA Rego policies, with identity from Keycloak. Roles range from service_admin to table_reader.

Observability

OpenTelemetry traces, OCSF audit logging, Jaeger UI — see every query and access decision.

Column-level lineage

dbt model lineage with auto-layout, column-level tracking, and search across the warehouse.

Admin overview: catalogs, namespaces, tables, principals, and roles at a glance.
Data contracts: ODCS, versioned, with quality status per table.
dbt workspace: bring a git repo, build and test in the browser.
Per-team workspaces, one governed platform.

Flagship views — query editor, catalog explorer, and lineage — join the gallery in a later beta drop.

The query engine

SQE first — Trino-compatible.

SQE — sovereign, fast, no JVM

The sqe deployment mode runs the Sovereign Query Engine in place of Trino: Arrow Flight SQL on the wire, per-query identity, and broadly compatible reads and writes for Iceberg V2 and V3 tables. Every query runs as the authenticated user; the bearer token flows through to the catalog and storage.

Trino-compatible, Spark for engineering

SQE speaks the Trino HTTP protocol too, so existing clients connect unchanged — or swap in Trino itself. For heavy data engineering, Spark Connect runs against the same Iceberg tables, no copies. Either way it's ANSI SQL over standard JDBC/ODBC — no proprietary drivers.
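
To make "speaks the Trino HTTP protocol" concrete, here is a minimal sketch of that protocol from the client side: one POST to /v1/statement, then GETs on each nextUri until none is returned. The host name is hypothetical, and sending the user's token as an Authorization: Bearer header is an assumption about how the engine is configured.

```python
import json
import urllib.request
from typing import Callable, Iterator, Optional


def statement_request(base_url: str, sql: str, user: str,
                      token: Optional[str] = None) -> urllib.request.Request:
    """Build the initial POST /v1/statement request of the Trino protocol."""
    headers = {"X-Trino-User": user}
    if token:
        # Assumption: the engine accepts the user's OIDC token as a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/statement",
        data=sql.encode("utf-8"),
        headers=headers,
        method="POST",
    )


def http_fetch(request: urllib.request.Request) -> dict:
    """Send one request and decode the JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_rows(first: urllib.request.Request,
              fetch: Callable[[urllib.request.Request], dict]) -> Iterator[list]:
    """Yield result rows, following nextUri until the query is finished."""
    page = fetch(first)
    while True:
        if "error" in page:
            raise RuntimeError(page["error"].get("message", "query failed"))
        yield from page.get("data", [])
        next_uri = page.get("nextUri")
        if not next_uri:
            return
        # Follow-up pages are plain GETs that carry the same headers.
        page = fetch(urllib.request.Request(
            next_uri, headers=dict(first.header_items()), method="GET"))
```

With a reachable endpoint, `list(iter_rows(statement_request("https://sqe.example.internal", "SELECT 1", "alice", token), http_fetch))` would collect all rows. Passing the fetch function in keeps the paging logic testable without a server.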

Connect anything that speaks SQL

Tableau · Power BI · DBeaver · DataGrip · dbt · Python/pandas · Spark Connect · the built-in Query Editor — all via Flight SQL or standard JDBC/ODBC. No data copying; queries run against data in place.
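
For the Python/pandas path, one plausible route is the open-source ADBC Flight SQL driver. This is a sketch under assumptions: the endpoint URI is hypothetical, and passing the user's token as a bearer Authorization header is assumed rather than documented here. The driver, pyarrow, and pandas must be installed for the query function to run.

```python
from typing import Dict

# Option key the ADBC Flight SQL driver uses for a raw Authorization header.
AUTHORIZATION_HEADER = "adbc.flight.sql.authorization_header"


def flight_sql_options(access_token: str) -> Dict[str, str]:
    """Database options that send the user's token with every call."""
    if not access_token:
        raise ValueError("an access token is required")
    return {AUTHORIZATION_HEADER: f"Bearer {access_token}"}


def query_to_dataframe(uri: str, access_token: str, sql: str):
    """Run one query over Flight SQL and return a pandas DataFrame."""
    # Imported lazily so the helper above works without the driver installed.
    import adbc_driver_flightsql.dbapi as flight_sql

    with flight_sql.connect(uri, db_kwargs=flight_sql_options(access_token)) as conn:
        with conn.cursor() as cursor:
            cursor.execute(sql)
            # Results arrive as Arrow data; convert only at the edge.
            return cursor.fetch_arrow_table().to_pandas()
```

A call such as `query_to_dataframe("grpc+tls://sqe.example.internal:443", token, "SELECT 1")` would then return the result as a DataFrame, with the query running against the data in place.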

Enterprise security

Two layers, both open: Keycloak + OPA.

Authentication — Keycloak (OIDC)

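Built on the OpenID Connect standard, so Keycloak can be swapped for Okta, Microsoft Entra ID (formerly Azure AD), or any other OIDC provider. Supports SSO, MFA, and user federation. Backend-for-frontend (BFF) auth keeps tokens out of the browser. Group and role claims are minted into every token.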
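
To show what those claims look like to a backend service, the sketch below reads roles and groups out of a token payload. It deliberately skips signature verification, so it is for inspection only; a real service must validate the token against the provider's keys first. Keycloak puts realm roles under realm_access.roles by default, while the groups claim name is an assumption that depends on a group-membership mapper being configured.

```python
import base64
import json


def decode_claims_unverified(jwt: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT checking its signature."""
    parts = jwt.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # base64url segments drop their '=' padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def roles_and_groups(claims: dict) -> tuple[set[str], set[str]]:
    """Return (realm roles, groups) found in a decoded claims dictionary."""
    # Keycloak's default location for realm-level roles.
    roles = set(claims.get("realm_access", {}).get("roles", []))
    # Assumed claim name; only present if a group mapper adds it.
    groups = set(claims.get("groups", []))
    return roles, groups
```

These two sets are the natural input to a policy decision: the identity layer states who the caller is, and the authorization layer decides what that caller may do.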

Authorization — OPA (Rego)

One policy engine for the whole platform, covering catalog-level rules (catalogs, namespaces, tables) and query-level rules (execution, visibility, row filtering). Policy-as-code: version, review, and audit every access rule. Roles range from service_admin to table_reader.
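
As an illustration of how a service might ask OPA for a decision, the sketch below builds a request for OPA's Data API (POST /v1/data/<path> with an input document) and interprets the reply. The policy path and the shape of the input document are hypothetical; the platform's actual Rego packages and input schema are not described here.

```python
import json
import urllib.request

# Hypothetical policy path; substitute the package and rule your policies define.
POLICY_PATH = "platform/authz/allow"


def opa_request(opa_url: str, user: str, roles: set, action: str,
                table: str) -> urllib.request.Request:
    """Build a Data API query asking whether `user` may do `action` on `table`."""
    # Hypothetical input shape, chosen only for this example.
    body = {"input": {"user": user, "roles": sorted(roles),
                      "action": action, "table": table}}
    return urllib.request.Request(
        f"{opa_url.rstrip('/')}/v1/data/{POLICY_PATH}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_allowed(opa_response: dict) -> bool:
    """Fail closed: only an explicit boolean true counts as an allow."""
    # OPA omits "result" when the rule is undefined; treat that as a deny.
    return opa_response.get("result") is True
```

Treating a missing or non-boolean result as a deny means a typo in the policy path or an undefined rule blocks access instead of silently granting it.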

Catalog grants — on top of Apache Polaris

GRANT and REVOKE privileges on catalogs, namespaces, and tables using Polaris's native RBAC security model — then layer OPA policies on top for row- and column-level rules. Standard, portable grants; no vendor-specific access model to migrate away from.

Deploy anywhere

Kubernetes-native. Your cloud, sovereign cloud, or on-prem.

Runs where your data must live

  • Hyperscalers — AWS, Azure, GCP.
  • Sovereign clouds — STACKIT and other EU sovereign providers.
  • Private cloud / on-prem — your datacenter, your hardware.

The same open stack everywhere — no managed-service lock-in, no per-query fees.

Kubernetes + Helm, GitOps-ready

Production deploys via Helm charts on any conformant Kubernetes: Polaris, the query engine (SQE or Trino), Spark Connect, Keycloak, OPA, and the portals — with horizontal scaling and Argo CD GitOps delivery. Storage is any S3-compatible backend.

Beta: public Helm charts and a hosted demo are on the way.

The receipts

Cloud Independent Data Platform vs typical cloud DW.

Criterion | This platform | Typical cloud DW
Vendor lock-in | zero (open source) | deep (proprietary)
Data format | Apache Iceberg | proprietary internal
Data location | any cloud / on-prem | vendor cloud only
Identity provider | any OIDC | vendor IAM only
Authorization | OPA (policy-as-code) | built-in, non-portable
Query engine | SQE / Trino (standard SQL) | proprietary drivers
AI analytics | local or cloud LLM (your choice) | vendor's AI service
Exit strategy | walk away (data is Iceberg/Parquet) | multi-year migration
Cost model | infrastructure only | per-query + markup