Apache XTable
Apache XTable (formerly OneTable, originated at Onehouse) is an open-source interoperability layer that translates table metadata between Apache Hudi, Apache Iceberg, and Delta Lake. The data files (Parquet) stay in place; XTable generates the target format's metadata alongside the source's, so the same physical dataset can be read as any of the three formats by any compatible engine. It sidesteps the “format wars” by letting teams pick formats per consumer rather than per producer.
Key Features:
- Metadata-Only Translation. No data is rewritten — XTable generates Iceberg / Hudi / Delta metadata pointing at the same Parquet files.
- Bidirectional Sync. Writes performed via one format are visible to readers using another after a sync run.
- Engine-Agnostic. Works with Spark, Flink, Trino, Presto, Snowflake, BigQuery — whichever engines you already have.
- CLI & Library. Run as a one-shot job, a scheduled batch sync, or embedded in your pipeline.
- Apache Software Foundation Project. Donated to the ASF in 2024 (incubating); community governance and a broad contributor base.
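As a sketch of the CLI path above: a sync run is driven by a small YAML dataset config naming the source format, the target formats to generate, and the tables to sync. The keys below follow the XTable documentation, but the bucket, path, and table name are made-up examples:

```yaml
# Hypothetical dataset config (my_config.yaml): expose one Hudi table
# as both Iceberg and Delta. Paths and names are illustrative only.
sourceFormat: HUDI
targetFormats:
  - ICEBERG
  - DELTA
datasets:
  - tableBasePath: s3://example-bucket/warehouse/orders  # example path
    tableName: orders
```

You then point the bundled utilities jar at the config, along the lines of `java -jar xtable-utilities-<version>-bundled.jar --datasetConfig my_config.yaml` (the exact jar name varies by release and build). Re-running the command on a schedule keeps the target-format metadata in step with new writes.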
Why It Matters:
Many organizations end up multi-format by accident: data engineers write Hudi from Spark Structured Streaming, while analysts query through Snowflake or BigQuery (which have stronger Iceberg support), and data scientists train on Databricks (Delta-native). XTable avoids forcing a re-platform — the producer keeps its preferred format and downstream consumers see the format they prefer.
Use Cases:
- Migrating gradually from one open table format to another without downtime.
- Letting Snowflake read Delta or Hudi tables natively as Iceberg.
- Multi-engine architectures where each engine’s native format is different.
- Vendor-neutral data lakehouse strategy — avoid lock-in to one format.