Behind the Scenes: How do deletion vectors actually work?
Deletion vectors in Delta Lake mark deleted rows in a highly compressed bitmap format (RoaringBitmap) instead of rewriting the affected data files immediately. This is known as a “soft delete”: the physical rewrite is deferred, readers filter out the marked rows at read time, and the data files stay untouched until they are rewritten later (for example, during compaction). When a delete operation is performed, three key steps occur:
1. A deletion vector binary file is created to record the positions of the deleted rows in each affected data file.
2. A "remove" action is added to the Delta log for each affected file, marking its previous metadata entry as invalid. The data file itself stays in place.
3. A new "add" action re-registers the same file with metadata that references the relevant deletion vector, so readers apply the bitmap at read time and return correct results.
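The three steps above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions, not Delta Lake's implementation: `ToyTable`, `AddAction`, `RemoveAction`, and `dv_id` are hypothetical names, a plain Python `set` of row positions stands in for the RoaringBitmap, and the "log" is a simple list rather than JSON commit files.

```python
from dataclasses import dataclass, field
from typing import Optional

# Toy stand-ins for illustration only; none of these names come from Delta Lake.

@dataclass
class AddAction:
    path: str                      # data file this entry describes
    dv_id: Optional[str] = None    # hypothetical pointer to a deletion vector

@dataclass
class RemoveAction:
    path: str
    dv_id: Optional[str] = None

@dataclass
class ToyTable:
    files: dict                                  # path -> list of rows (never rewritten here)
    log: list = field(default_factory=list)      # ordered add/remove actions
    dvs: dict = field(default_factory=dict)      # dv_id -> set of deleted row positions

    def active_files(self):
        """Replay the log: the latest add per path is active unless removed afterwards."""
        active = {}
        for action in self.log:
            if isinstance(action, AddAction):
                active[action.path] = action
            else:
                active.pop(action.path, None)
        return active

    def delete_where(self, predicate):
        for path, entry in list(self.active_files().items()):
            old = self.dvs.get(entry.dv_id, set())
            hits = {i for i, row in enumerate(self.files[path])
                    if i not in old and predicate(row)}
            if not hits:
                continue  # file not affected, nothing to log
            # Step 1: write a new deletion vector (a set stands in for the bitmap).
            dv_id = f"dv-{len(self.dvs)}"
            self.dvs[dv_id] = old | hits
            # Step 2: invalidate the previous metadata entry for this file.
            self.log.append(RemoveAction(path, entry.dv_id))
            # Step 3: re-add the same file, now referencing the deletion vector.
            self.log.append(AddAction(path, dv_id))

    def read(self):
        """Merge deletion vectors at read time: skip row positions marked as deleted."""
        rows = []
        for path, entry in self.active_files().items():
            deleted = self.dvs.get(entry.dv_id, set())
            rows.extend(r for i, r in enumerate(self.files[path]) if i not in deleted)
        return rows


table = ToyTable(files={"part-0": [10, 20, 30, 40]})
table.log.append(AddAction("part-0"))      # initial write of the file
table.delete_where(lambda v: v >= 30)      # soft delete: no data file is rewritten
print(table.read())                        # [10, 20]
```

Note that `table.files["part-0"]` still holds all four rows after the delete; only the log and the deletion vector changed, which is the point of deferring the rewrite.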
Sep 15 at 9:09 PM