Apache Iceberg Table Format Versions

Jun 25, 2023

In this blog we will mainly explore the following:

What the different Iceberg table format versions are and how they differ.
How an Iceberg table handles updates and deletes.
What copy-on-write and merge-on-read are and how they differ.

Iceberg tables have two format versions: v1 and v2.

v1 format: copy-on-write only.
v2 format: copy-on-write or merge-on-read.

Iceberg tables support table properties to configure table behavior. There are different types of properties, e.g. read properties, write properties, and other table behavior properties.

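To define the table's format version (the default of 1 reflects the Iceberg release current when this was written; newer releases create v2 tables by default):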

+----------------+----------+----------------------------------------+
|   Property     | Default  |     Description                        |
+----------------+----------+----------------------------------------+
| format-version | 1        | Table’s format version (can be 1 or 2) |
+----------------+----------+----------------------------------------+

To choose the write strategy, copy-on-write or merge-on-read, for each kind of operation:

+-------------------+---------------+------------------------------------------+
|   Property        | Default       |     Description                          |
+-------------------+---------------+------------------------------------------+
| write.update.mode | copy-on-write | copy-on-write or merge-on-read (v2 only) |
| write.delete.mode | copy-on-write | copy-on-write or merge-on-read (v2 only) |
| write.merge.mode  | copy-on-write | copy-on-write or merge-on-read (v2 only) |
+-------------------+---------------+------------------------------------------+
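The properties above are usually set through SQL. Below is a minimal sketch of a helper that builds such a statement as a string. The helper name `set_properties_sql` and its checks are hypothetical (they only encode the rules from the two tables above, not Iceberg's own validation), and the table name `prod.db.table` is borrowed from the time travel query later in this blog.

```python
# Hypothetical helper: builds a Spark SQL statement that sets the properties above.
# It only encodes the rules from the two tables; it does not talk to Iceberg.
VALID_MODES = {"copy-on-write", "merge-on-read"}
MODE_KEYS = ("write.update.mode", "write.delete.mode", "write.merge.mode")


def set_properties_sql(table, format_version, modes):
    """Return an ALTER TABLE ... SET TBLPROPERTIES statement as a string."""
    if format_version not in (1, 2):
        raise ValueError("format-version can be 1 or 2")
    props = {"format-version": str(format_version)}
    for key, mode in modes.items():
        if key not in MODE_KEYS:
            raise ValueError(f"unknown property: {key}")
        if mode not in VALID_MODES:
            raise ValueError(f"unknown mode: {mode}")
        # Rule from the table above: merge-on-read needs a v2 table.
        if mode == "merge-on-read" and format_version != 2:
            raise ValueError(f"{key}=merge-on-read requires format-version 2")
        props[key] = mode
    body = ", ".join(f"'{k}'='{v}'" for k, v in props.items())
    return f"ALTER TABLE {table} SET TBLPROPERTIES ({body})"


sql = set_properties_sql(
    "prod.db.table",
    2,
    {"write.update.mode": "merge-on-read", "write.delete.mode": "merge-on-read"},
)
print(sql)
```

The resulting string can be passed to `spark.sql(...)` in a Spark session that has an Iceberg catalog configured. Any property left out (here `write.merge.mode`) keeps its copy-on-write default.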

Iceberg tables support updates and deletes through two techniques: copy-on-write and merge-on-read.

Copy-on-Write

Even if we update or delete just a few rows in a table, Iceberg rewrites every data file that contains those rows.
At write time, Iceberg identifies which data files contain affected rows and writes new copies of those files with the changes (update/delete) applied.

Example: Let's say you have two data files in the data directory of an Iceberg table, data-file-1 and data-file-2. You update a record that is present only in data-file-2. Iceberg creates a new copy of data-file-2 only, with the change applied.

At the end you have three files: data-file-1, data-file-2, and data-file-2-new. But your latest snapshot refers only to data-file-1 and data-file-2-new.

The old file (data-file-2) is kept only for time travel. If you don't need time travel, you can go ahead and expire the old snapshots, which clears the unused files. A time travel query looks like this:

SELECT * FROM prod.db.table TIMESTAMP AS OF '1986-10-26 01:21:00';
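To make the example concrete, here is a toy model of copy-on-write in plain Python. It is only a sketch under simplifying assumptions: files are in-memory lists of rows, a snapshot is just a list of file names, and the function names (`cow_update`, `read`, `expire_old_snapshots`) are made up for illustration. It is not how Iceberg is implemented.

```python
# Toy model: "files" maps a file name to its rows; a snapshot is the list
# of file names it refers to. The latest snapshot is snapshots[-1].
files = {
    "data-file-1": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}],
    "data-file-2": [{"id": 3, "name": "c"}, {"id": 4, "name": "d"}],
}
snapshots = [["data-file-1", "data-file-2"]]


def cow_update(files, snapshots, row_id, new_name):
    """Rewrite only the files that contain row_id, then add a new snapshot."""
    new_snapshot = []
    for name in snapshots[-1]:
        rows = files[name]
        if any(r["id"] == row_id for r in rows):
            # Copy the whole file and apply the change in the copy.
            new_file = f"{name}-new"
            files[new_file] = [
                {**r, "name": new_name} if r["id"] == row_id else dict(r)
                for r in rows
            ]
            new_snapshot.append(new_file)
        else:
            new_snapshot.append(name)  # untouched file is reused as-is
    snapshots.append(new_snapshot)


def read(files, snapshot):
    """Reading needs no merge step: just concatenate the snapshot's files."""
    return [row for name in snapshot for row in files[name]]


def expire_old_snapshots(files, snapshots):
    """Keep only the latest snapshot and drop files nothing refers to."""
    del snapshots[:-1]
    for name in list(files):
        if name not in snapshots[-1]:
            del files[name]


cow_update(files, snapshots, row_id=3, new_name="c2")
files_after_update = sorted(files)      # three files exist now
latest = read(files, snapshots[-1])     # sees the updated record
old = read(files, snapshots[0])         # "time travel" to the first snapshot

expire_old_snapshots(files, snapshots)
files_after_expiry = sorted(files)      # data-file-2 is gone
```

Note that a one-row change copied every row of data-file-2, which is the cost discussed in the pros and cons below.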

Pros & Cons:

copy-on-write is expensive: with frequent updates/deletes, every change rewrites whole data files, so it is a poor fit for streaming pipelines.
copy-on-write is ideal: for bulk updates in batch mode, where most of the rows in a file change.
Writes are slower: the writer has to copy the data file and apply the changes.
Reads are faster: no extra processing is required on the reader's side.

Merge-on-Read

If we update or delete just a few rows in a table, Iceberg does not rewrite the entire data file. Instead, the changes are written to new files.
At write time, Iceberg identifies which data files contain affected rows and the positions of those rows, then writes the file path and the row positions to a positional delete file.

Positional delete file: holds the positions of deleted and updated records.

+------------------------------+----------+
|   file_path                  | pos      |
+------------------------------+----------+
| .../00191-1676-00001.parquet | 11       |
| .../00191-1676-00001.parquet | 21       |
+------------------------------+----------+

The updated records themselves are stored in a separate, new data file.

Example: Let's say you have two data files in the data directory of an Iceberg table, data-file-1 and data-file-2. You update a record that is present only in data-file-2. Iceberg creates a positional delete file that holds the position of the updated record, plus a new data file with the updated record.

At the end you have four files: data-file-1, data-file-2, positional-delete-file, and data-file-with-change-records. At read time, Iceberg merges these files and shows you the latest data.
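The same example can be sketched as a toy model in plain Python. As before, this is an illustration under simplifying assumptions, not Iceberg's implementation: files are in-memory lists, the positional delete file is a list of `(file_path, pos)` pairs, and the names `mor_update` and `mor_read` are hypothetical.

```python
# Toy model: data files are lists of rows; the positional delete file is a
# list of (file_path, pos) pairs, like the table shown earlier.
data_files = {
    "data-file-1": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}],
    "data-file-2": [{"id": 3, "name": "c"}, {"id": 4, "name": "d"}],
}
positional_deletes = []


def mor_update(data_files, positional_deletes, row_id, new_name,
               new_file="data-file-with-change-records"):
    """Record where the old row lives and write the new row to a new file."""
    already_deleted = set(positional_deletes)
    changed = []
    for path, rows in list(data_files.items()):
        for pos, row in enumerate(rows):
            if row["id"] == row_id and (path, pos) not in already_deleted:
                positional_deletes.append((path, pos))  # old row is now dead
                changed.append({**row, "name": new_name})
    if changed:
        # Existing data files are never rewritten; new rows are only appended.
        data_files.setdefault(new_file, []).extend(changed)


def mor_read(data_files, positional_deletes):
    """Merge at read time: skip every row whose position is marked deleted."""
    deleted = set(positional_deletes)
    return [
        row
        for path, rows in data_files.items()
        for pos, row in enumerate(rows)
        if (path, pos) not in deleted
    ]


mor_update(data_files, positional_deletes, row_id=3, new_name="c2")
result = mor_read(data_files, positional_deletes)
```

The write touched only one position and one new row, while every read now has to filter against the delete list. That trade-off is what the pros and cons below describe.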

Pros & Cons:

merge-on-read is ideal: for small, frequent updates.
Writes are quick: no data file needs to be rewritten. The only work is writing the positional delete file and a new data file with the changes.
Reads are slower: the merge has to happen while reading the data.

Table maintenance (compaction, rewriting data files, rewriting positional delete files, etc.) is required once these small files accumulate.

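As of now, Apache Spark writes only positional deletes. Iceberg also has equality deletes, where the delete file stores the actual column values of the deleted records (ID etc.) instead of their positions; this is a separate equality delete file, not a positional delete file. But as of now Spark does not support writing equality deletes.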
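The difference between the two kinds of delete can be sketched in a few lines of Python. This is a toy comparison with made-up names (`read_with_positional`, `read_with_equality`); it ignores details of the real format, such as how Iceberg decides which data files a delete file applies to.

```python
rows_by_file = {
    "data-file-1": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}],
    "data-file-2": [{"id": 3, "name": "c"}, {"id": 4, "name": "d"}],
}

# Positional delete: the writer had to locate the row first.
positional_delete_file = [("data-file-2", 0)]
# Equality delete: the writer only states the value to match.
equality_delete_file = [{"id": 3}]


def read_with_positional(rows_by_file, deletes):
    """Drop rows whose (file_path, pos) appears in the delete file."""
    deleted = set(deletes)
    return [
        row
        for path, rows in rows_by_file.items()
        for pos, row in enumerate(rows)
        if (path, pos) not in deleted
    ]


def read_with_equality(rows_by_file, deletes):
    """Drop rows whose column values match any entry in the delete file."""
    def is_deleted(row):
        return any(
            all(row.get(col) == val for col, val in d.items()) for d in deletes
        )

    return [
        row for rows in rows_by_file.values() for row in rows if not is_deleted(row)
    ]


by_position = read_with_positional(rows_by_file, positional_delete_file)
by_value = read_with_equality(rows_by_file, equality_delete_file)
```

Both reads return the same three rows. The positional version is cheaper for the reader, while the equality version is cheaper for the writer because it never has to look up where the row is stored.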

Refer to this blog for the internals of Iceberg tables.

Internals of Apache Iceberg

In this blog we are going to explore the architectural components of Apache Iceberg.

bigdataenthusiast.medium.com

Refer to the blogs below:

Apache Iceberg Table — v1 Format — copy-on-write

In previous blog we have already seen different types of iceberg table format & write mode supported. Please refer…

bigdataenthusiast.medium.com

Apache Iceberg Table — v2 Format — merge-on-read

In previous blog we have already seen different types of iceberg table format & write mode supported. Please refer…

bigdataenthusiast.medium.com


Written by BigDataEnthusiast
