Behind the Scenes: How do deletion vectors actually work?

Deletion Vectors in Delta Lake mark deleted rows in a highly compressed bitmap format (RoaringBitmap) instead of rewriting data files immediately. This is known as a “soft delete”: the physical rewrite of the data files is deferred, and readers filter out the marked rows at read time. When a delete operation is performed, three key steps occur:

  1. A Deletion Vector binary file is created to record the positions of deleted rows across the affected data files

  2. A "remove" action is added to the Delta log for each affected file, marking its previous metadata entry invalid

  3. A new "add" action updates file metadata to reference the relevant Deletion Vector, ensuring readers merge the bitmap information at read time for correct results
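
The three steps above can be sketched in a few lines of Python. This is a toy model, not Delta Lake's implementation: a sorted list stands in for the RoaringBitmap, the log is a plain list of dicts, and the function names (`delete_rows`, `read_rows`) and the field names inside the actions (`dv_path`, `cardinality`) are hypothetical and simplified for illustration.

```python
def delete_rows(log, data_path, positions, dv_path):
    """Soft-delete rows of one data file and return the deletion vector contents."""
    # Step 1: record the deleted row positions. A sorted list stands in for
    # the compressed RoaringBitmap that would be written to a binary file.
    dv_contents = sorted(set(positions))

    # Step 2: a "remove" action invalidates the file's previous metadata entry.
    log.append({"remove": {"path": data_path}})

    # Step 3: an "add" action re-registers the same data file, now with a
    # reference to its deletion vector (field names are illustrative only).
    log.append({
        "add": {
            "path": data_path,
            "deletionVector": {
                "dv_path": dv_path,
                "cardinality": len(dv_contents),
            },
        }
    })
    return dv_contents


def read_rows(rows, dv_contents):
    """Merge the deletion vector at read time: skip rows marked as deleted."""
    deleted = set(dv_contents)
    return [row for position, row in enumerate(rows) if position not in deleted]


log = []
dv = delete_rows(log, "part-0001.parquet", [3, 1, 3], "dv-0001.bin")
visible = read_rows(["a", "b", "c", "d"], dv)
```

Note that the data file itself is never touched: the rows at positions 1 and 3 are still physically present and are only hidden when `read_rows` applies the deletion vector.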

Sep 15 at 9:09 PM