Summary
Some value aggregators perform resource-intensive pairwise merges during index creation.
Enhance the ValueAggregator interface so that implementors can keep intermediate state while merging aggregated records with raw values.
Why it is important
Some operations involving sketches perform poorly when intermediate book-keeping structures are re-created for every merge operation. Instead, using a Merge/Union object as state will permit more efficient merging of raw values before yielding a final result to store in an index such as the StarTree. This will speed up resource-intensive segment merges and index creation operations in large-scale production clusters.
Proposal
Extend the existing interface to include a State type S:
public interface ValueAggregator<R, S, A>
A new method is added to realise the aggregate value A from S:
A getFinalAggregatedValue(S stateValue);
All existing methods that merge records should accept and return the state S instead of the aggregated value A.
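A minimal sketch of how the proposed shape could look, using a toy distinct-values aggregator in place of a real sketch library. Everything here other than the `getFinalAggregatedValue` signature is an assumption for illustration: the interface name `StatefulValueAggregator`, the other method names, and the `DistinctValuesAggregator` class are hypothetical and do not reproduce the actual interface. The mutable `Set` plays the role of a Merge/Union object, and the sorted `int[]` stands in for the compact aggregate that would be stored in the index.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical interface: R = raw value, S = intermediate state, A = final aggregate.
interface StatefulValueAggregator<R, S, A> {
    // Create the state from the first raw value (method name is an assumption).
    S getInitialState(R rawValue);

    // Fold a raw value into the existing state without rebuilding it.
    S applyRawValue(S state, R rawValue);

    // Fold an already-aggregated value into the existing state.
    S applyAggregatedValue(S state, A aggregatedValue);

    // Realise the final aggregate from the state, as in the proposal.
    A getFinalAggregatedValue(S stateValue);
}

// Toy implementation: the state is a mutable set, the aggregate a sorted array.
final class DistinctValuesAggregator
        implements StatefulValueAggregator<Integer, Set<Integer>, int[]> {

    @Override
    public Set<Integer> getInitialState(Integer rawValue) {
        Set<Integer> state = new HashSet<>();
        state.add(rawValue);
        return state;
    }

    @Override
    public Set<Integer> applyRawValue(Set<Integer> state, Integer rawValue) {
        state.add(rawValue); // mutate in place; no per-merge allocation
        return state;
    }

    @Override
    public Set<Integer> applyAggregatedValue(Set<Integer> state, int[] aggregatedValue) {
        for (int v : aggregatedValue) {
            state.add(v);
        }
        return state;
    }

    @Override
    public int[] getFinalAggregatedValue(Set<Integer> stateValue) {
        // Conversion to the stored form happens once, after all merges.
        return stateValue.stream().mapToInt(Integer::intValue).sorted().toArray();
    }
}
```

The point of the extra type parameter is that the expensive conversion to the stored form runs once per output record rather than once per merge.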
Alternatives
I have preserved the current interface structures in my branch but have instead changed the aggregated type to an opaque Object. This means that I can dynamically switch between state and final aggregates for sketches such as the ThetaSketch - see this file. This yields significant performance improvements over the current implementation.
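The opaque-`Object` alternative might look roughly like the sketch below. It does not reproduce the code in the branch; the class `OpaqueDistinctAggregator` and its method names are hypothetical, and a plain `Set` again stands in for a sketch union. The value passed around is either a final aggregate (`int[]`) or a mutable state (`Set<Integer>`), and each method checks at runtime which one it received.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of switching between state and final aggregate
// behind an opaque Object, keeping a two-type-parameter style interface.
@SuppressWarnings("unchecked")
final class OpaqueDistinctAggregator {

    // Promote a value to the mutable state form if it is not one already.
    private static Set<Integer> toState(Object value) {
        if (value instanceof Set) {
            return (Set<Integer>) value;
        }
        Set<Integer> state = new HashSet<>();
        for (int v : (int[]) value) {
            state.add(v);
        }
        return state;
    }

    Object applyRawValue(Object value, Integer rawValue) {
        Set<Integer> state = toState(value);
        state.add(rawValue);
        return state; // stays in state form for subsequent merges
    }

    Object applyAggregatedValue(Object value, Object other) {
        Set<Integer> state = toState(value);
        if (other instanceof Set) {
            state.addAll((Set<Integer>) other);
        } else {
            for (int v : (int[]) other) {
                state.add(v);
            }
        }
        return state;
    }

    // Convert to the final aggregate only when the value is about to be stored.
    int[] finalise(Object value) {
        if (value instanceof int[]) {
            return (int[]) value;
        }
        return ((Set<Integer>) value).stream()
                .mapToInt(Integer::intValue).sorted().toArray();
    }
}
```

The trade-off is visible in the casts: the interface signature stays unchanged, but type safety moves from the compiler to runtime checks.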