Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HashTreeCollections] Remove subtree count augmentation #236

Open
lorentey opened this issue Nov 2, 2022 · 1 comment
Open

[HashTreeCollections] Remove subtree count augmentation #236

lorentey opened this issue Nov 2, 2022 · 1 comment
Labels
enhancement New feature or request HashTreeCollections Hash-array mapped prefix tree implementations

Comments

@lorentey
Copy link
Member

lorentey commented Nov 2, 2022

Node references in persistent collections currently include the count of all items in the corresponding subtree. This makes index(_:offsetBy:) an operation with logarithmic complexity, but it wastes a bunch of memory on storing these counts. This is very unlikely to be a worthwhile trade off, as these data structures are unordered, and it seem improbable that someone would want to randomly jump around within them.

Remove the count augmentation, to boost memory performance and to make these prefix trees even more competitive in memory use vs the standard Set and Dictionary.

@lorentey lorentey added enhancement New feature or request HashTreeCollections Hash-array mapped prefix tree implementations labels Nov 2, 2022
@msteindorfer
Copy link
Contributor

Note that the CHAMP data structure, as discussed in the OOPSLA '15 paper, does not have sub-tree count augmentation. This can indeed be highly beneficial in terms of reducing memory footprints, depending on the memory alignment of the nodes. See Section 6 of the paper for a performance evaluation on the JVM platform.

For PersistentSet there would be a performance implication when performing structural set-algebra operation, e.g., when pruning or adding complete sub-trees. Without knowing the size of the sub-tree, one would have to iterate over the sub-tree (i.e., O(n) in terms of sub-tree size) to update the collections count accordingly.

In the end it's a trade-off between memory footprint and performance of some operations.

@lorentey lorentey changed the title [PersistentCollections] Remove subtree count augmentation [HashTreeCollections] Remove subtree count augmentation Nov 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request HashTreeCollections Hash-array mapped prefix tree implementations
Projects
None yet
Development

No branches or pull requests

2 participants