As a first step, only a primary key on a single column is supported. When a row is inserted, we check by primary key whether it already exists in the table. If a row with the same primary key already exists, the insertion is skipped.
The initial plan is to achieve deduplication by maintaining a deduplication container for each table. When the database restarts, the primary key column is read from disk and the in-memory container is rebuilt.
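A minimal sketch of this plan, assuming a hypothetical `DedupContainer` type backed by a standard `HashSet<u64>`. The type name, the method names, and the iterator standing in for the primary key column read from disk are all illustrative assumptions, not part of the existing codebase:

```rust
use std::collections::HashSet;

/// Hypothetical per-table deduplication container (name is illustrative).
struct DedupContainer {
    keys: HashSet<u64>,
}

impl DedupContainer {
    /// Create an empty container for a new table.
    fn new() -> Self {
        Self { keys: HashSet::new() }
    }

    /// Rebuild the container after a restart.
    /// `column` stands in for the primary key values read back from disk.
    fn rebuild(column: impl IntoIterator<Item = u64>) -> Self {
        Self { keys: column.into_iter().collect() }
    }

    /// Record `key` and report whether the row should be inserted.
    /// Returns `false` when the key is already present, so the caller
    /// skips the insertion.
    fn try_insert(&mut self, key: u64) -> bool {
        self.keys.insert(key)
    }
}
```

With this shape, the insert path would call `try_insert` for each incoming row and write the row only when it returns `true`, while recovery would call `rebuild` once per table.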
After some investigation, roaring bitmap looks like a good fit for the container: it is a compressed bitmap index with excellent performance and low memory usage.
We can use RoaringBitmap (32-bit keys) and RoaringTreemap (64-bit keys) from roaring-rs to store ordinary integer primary keys. For string types, which roaring bitmaps cannot store, we can fall back to a HashSet.
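The choice of container by key type could be sketched as follows. This is only an illustration under stated assumptions: `Key` and `KeySet` are hypothetical names, and a standard `BTreeSet<u64>` stands in for the roaring types so the example compiles with the standard library alone. In roaring-rs, `RoaringTreemap::insert` takes a `u64` and `RoaringBitmap::insert` takes a `u32`, and both return a `bool` like `BTreeSet::insert`, so the swap should be mechanical.

```rust
use std::collections::{BTreeSet, HashSet};

/// Hypothetical primary key value; only the two cases discussed here.
enum Key {
    Int(u64),
    Str(String),
}

/// Hypothetical container, chosen per table from the primary key column type.
enum KeySet {
    /// Stand-in for roaring's RoaringTreemap (or RoaringBitmap for u32 keys).
    Int(BTreeSet<u64>),
    /// Strings cannot be stored in a roaring bitmap, so use a hash set.
    Str(HashSet<String>),
}

impl KeySet {
    /// Ok(true): the key is new and the row should be inserted.
    /// Ok(false): the key already exists and the row is skipped.
    /// Err: the key type does not match the table's primary key type.
    fn insert(&mut self, key: Key) -> Result<bool, &'static str> {
        match (self, key) {
            (KeySet::Int(set), Key::Int(k)) => Ok(set.insert(k)),
            (KeySet::Str(set), Key::Str(k)) => Ok(set.insert(k)),
            _ => Err("key type does not match the primary key column"),
        }
    }
}
```

Keeping both variants behind one type means the insert path does not need to know which storage a given table uses.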
Also, where should each table's deduplication container live? Could it be placed in the MetaStore?
SQL parsing
deduplication by primary key on insert
recovery
performance testing
The primary key plan supports data deduplication.