Support for coded streams #482

a-khabarov · 2023-05-02T09:56:34Z

C++ and Java implementations of protocol buffers have support for writing and reading multiple messages via CodedOutputStream and CodedInputStream. This is also space-efficient because the size of each message can be represented as a varint.

I think having something similar to CodedOutputStream and CodedInputStream in betterproto would be useful. One use case: a Python tool uses betterproto to emit a binary stream of size-prefixed messages, and a C++/Java tool then reads that stream with CodedInputStream.
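To make the request concrete, here is a minimal pure-Python sketch of the framing described above: each message is written as a varint length prefix followed by its bytes. The helper names (`write_varint`, `read_varint`, `write_delimited`, `read_delimited`) are hypothetical and are not betterproto API, and the payloads are opaque `bytes` standing in for serialized messages.

```python
import io
from typing import BinaryIO, Iterator, Optional


def write_varint(stream: BinaryIO, value: int) -> None:
    """Write a non-negative integer as a base-128 varint (hypothetical helper)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            stream.write(bytes([low7 | 0x80]))  # high bit set: more bytes follow
        else:
            stream.write(bytes([low7]))  # high bit clear: last byte
            return


def read_varint(stream: BinaryIO) -> Optional[int]:
    """Read one varint; return None if the stream is already at its end."""
    result = 0
    shift = 0
    while True:
        byte = stream.read(1)
        if not byte:
            if shift == 0:
                return None  # clean end of stream, no partial varint
            raise EOFError("stream ended inside a varint")
        result |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            return result
        shift += 7


def write_delimited(stream: BinaryIO, payload: bytes) -> None:
    """Write a varint length prefix, then the payload itself."""
    write_varint(stream, len(payload))
    stream.write(payload)


def read_delimited(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each length-prefixed payload until the stream is exhausted."""
    while True:
        size = read_varint(stream)
        if size is None:
            return
        payload = stream.read(size)
        if len(payload) != size:
            raise EOFError("stream ended inside a payload")
        yield payload


buf = io.BytesIO()
write_delimited(buf, b"abc")        # 1-byte length prefix
write_delimited(buf, b"x" * 300)    # 2-byte length prefix
buf.seek(0)
messages = list(read_delimited(buf))
```

A payload shorter than 128 bytes costs one byte of framing overhead, which is where the space efficiency comes from.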

Related Java documentation:


Adds support for serializing/deserializing messages and their components to/from streams. Where possible, existing methods now use this functionality internally, minimising code size. This is useful as a standalone feature, and is also a step towards the functionality requested in danielgtaylor#482.

As part of this, the following functions have been created/changed:

- `betterproto.dump_varint` - this function encodes a value as a varint and dumps it into the provided stream. It is mostly the same as the existing `betterproto.encode_varint` was, but based around streams and with some additional error checking.
- `betterproto.encode_varint` - this existing function has the same effects, but now uses `betterproto.dump_varint` internally, keeping code size and complexity down.
- `betterproto.size_varint` - this function calculates the size of the varint for a given value without actually serializing it. This may be useful to some, and similar functions exist in the official C++ and other implementations. It is also used internally by other new functionality to reduce memory and time usage compared to simply running `len(...)` after serializing a varint.
- `betterproto._len_preprocessed_single` - calculates the size of the value that would be returned by `_preprocess_single` without fully serializing it. Used internally by new functionality to reduce memory and time usage over simply serializing and then checking the size of it.
- `betterproto._len_single` - similar to above, but for `_serialize_single`.
- `betterproto.load_varint` - loads a varint from a stream and decodes its value. Mostly the same as `decode_varint` already was, but based around streams.
- `betterproto.decode_varint` - existing function, has the same functionality as it did previously, but now uses `load_varint` internally to keep code size and complexity down.
- `betterproto.load_fields` - does the same as `parse_fields`, but loads the fields from the provided stream, rather than a `bytes` object. Used internally by `Message.load`.
- `betterproto.Message.dump` - does the same as `Message.__bytes__` already did, but dumps the results to a stream rather than a `bytes` object.
- `betterproto.Message.__bytes__` - does as it already did, but now uses `Message.dump` internally to reduce code size and complexity.
- `betterproto.Message.__len__` - returns the size of the encoded message - i.e. does the same as `len(bytes(message))` without fully serializing the message, reducing time and memory usage.
- `betterproto.Message.load` - loads and parses a binary encoded message from a stream. Similar to `Message.parse`, but retrieving the data from a stream rather than a `bytes` object.
- `betterproto.Message.parse` - does as it already did, but now uses `Message.load` internally to reduce code size and complexity.
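The idea of computing a varint's size without serializing it, as described for `size_varint`, can be sketched in pure Python. This is an illustrative re-implementation under stated assumptions, not betterproto's code: the names `sketch_size_varint` and `reference_encode_varint` are hypothetical, and only non-negative values are handled.

```python
def sketch_size_varint(value: int) -> int:
    """Return how many bytes a varint for `value` would occupy.

    Each varint byte carries 7 payload bits, so the size is the bit length
    divided by 7, rounded up, with a minimum of one byte for zero.
    """
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    return max(1, (value.bit_length() + 6) // 7)


def reference_encode_varint(value: int) -> bytes:
    """Serialize `value` as a varint; used here only to check the size."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit set
        else:
            out.append(low7)
            return bytes(out)


# The computed size matches the length of the serialized form,
# but no intermediate bytes object has to be allocated to get it.
samples = [0, 1, 127, 128, 300, 16383, 16384, 2**32, 2**63 - 1]
sizes = [sketch_size_varint(v) for v in samples]
```

The same principle is what lets a length prefix be written before a message body without first serializing that body into memory.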

Gobot1234 added the "enhancement" (New feature or request) and "medium" (Medium effort issue, can fit in a single PR) labels on May 25, 2023

JoshuaLeivers mentioned this issue Aug 16, 2023

Add message streaming support #518

Merged

JoshuaLeivers mentioned this issue Sep 27, 2023

Add support for streaming delimited messages #529

Merged
