optimize serialization size for primitive arrays #912

Open
jhsenjaliya opened this issue Sep 7, 2022 · 3 comments


@jhsenjaliya

Is your feature request related to a problem? Please describe.

Primitive arrays are serialized at their full allocated size, but it's perfectly possible that most of that space is unused (still holding default values) at the time of serialization.

Describe the solution you'd like
Instead of writing the size followed by the data from 0 to size,
the proposal is to serialize the array as follows:

size -- the defined size of the array to be serialized, written as it is today
if (primitive_array_optimization_configured) {
    actual_size -- the size of the array up to the last element where the user has set a value
    data -- from 0 to actual_size
} else {
    data -- from 0 to size
}

where actual_size is calculated by reducing size from the end of the array until a non-default value is detected (e.g., the default is 0 for an int array).
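
To make the proposed layout concrete, here is a minimal sketch for int arrays. It is only an illustration under stated assumptions: the class and method names (TrimmedIntArrayCodec, actualSize, write, read) are hypothetical, the boolean flag stands in for the proposed configuration option, and it uses fixed 4-byte ints via java.nio.ByteBuffer rather than Kryo's own Output/Input encoding, so the byte counts will not match what Kryo would produce.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of the proposed format; not Kryo code.
final class TrimmedIntArrayCodec {

    /** Length of the array once trailing zeros (the int default) are dropped. */
    static int actualSize(int[] array) {
        int n = array.length;
        while (n > 0 && array[n - 1] == 0) {
            n--;
        }
        return n;
    }

    /** Writes size, then (if optimize) actual_size, then the data prefix. */
    static byte[] write(int[] array, boolean optimize) {
        int count = optimize ? actualSize(array) : array.length;
        int headerInts = optimize ? 2 : 1;
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES * (headerInts + count));
        buf.putInt(array.length);          // size, as today
        if (optimize) {
            buf.putInt(count);             // actual_size
        }
        for (int i = 0; i < count; i++) {  // data from 0 to count
            buf.putInt(array[i]);
        }
        return buf.array();
    }

    /** Reads the format back; elements past actual_size keep the default 0. */
    static int[] read(byte[] data, boolean optimize) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int size = buf.getInt();
        int count = optimize ? buf.getInt() : size;
        int[] array = new int[size];       // Java zero-initializes the array
        for (int i = 0; i < count; i++) {
            array[i] = buf.getInt();
        }
        return array;
    }
}
```

The reader must be told whether the optimization was enabled when the data was written, since the extra actual_size field changes the layout. For an array that is full up to its last element, the optimized form is one int larger than the plain form.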

Describe alternatives you've considered
None available to extend the array serializer

Additional context
This feature could greatly reduce the size of serialized objects that use primitive arrays.

Happy to hear comments and suggestions.

@jhsenjaliya jhsenjaliya changed the title optimize serialization for primitive arrays [draft] optimize serialization size for primitive arrays Sep 7, 2022
@theigl
Collaborator

theigl commented Sep 8, 2022

The disadvantage is that you will have to traverse the array multiple times, or write the data in reverse order. I'm not sure this makes sense as a general addition to Kryo, but you should have no problem writing such custom SparseArraySerializers yourself and using them instead of the default implementation.

@jhsenjaliya
Author

jhsenjaliya commented Sep 8, 2022

Actually, you would only need to traverse the array once (from the end until you find a non-default value), and this would only be activated through configuration, so the default behavior won't change.
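
As a rough illustration of the traversal cost being discussed, the sketch below scans a long array from the end and reports how many elements it inspected. The names (BackwardScan, scan) are hypothetical and this is not Kryo code. Under this approach the scan reads the trailing default values plus the last non-default element, and the subsequent write reads the prefix from 0 to actual_size, so the two passes together read each element about once, with the last non-default element read twice.

```java
// Hypothetical helper; counts how many elements the backward scan touches.
final class BackwardScan {

    /** Returns {actualSize, elementsInspected} for a long array (default 0L). */
    static int[] scan(long[] array) {
        int inspected = 0;
        int n = array.length;
        while (n > 0) {
            inspected++;
            if (array[n - 1] != 0L) {
                break;  // found the last non-default element; stop scanning
            }
            n--;
        }
        return new int[] {n, inspected};
    }
}
```

For an array of length 8 with values set only at indices 0 to 2, the scan inspects 6 elements (five trailing zeros plus index 2) and the write then reads 3. An array containing only default values is scanned in full, and nothing is written after the header.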

@theigl
Collaborator

theigl commented Sep 8, 2022

Please create a PR with an implementation for one of the default array serializers. I'm still not sure this should be provided by Kryo out of the box, but we can discuss it further.
