Support for coded streams #482

Open
a-khabarov opened this issue May 2, 2023 · 0 comments
Labels
enhancement (New feature or request), medium (Medium effort issue, can fit in a single PR)

Comments

@a-khabarov
Contributor

The C++ and Java implementations of protocol buffers support writing and reading multiple messages via CodedOutputStream and CodedInputStream. This is also space-efficient because the size of each message can be represented as a varint.

I think having something similar to CodedOutputStream and CodedInputStream in betterproto would be useful. One use case: a Python tool uses betterproto to emit a binary stream of messages, and a C++ or Java tool then reads that stream with CodedInputStream.
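To make the request concrete, here is a minimal sketch of the framing described above: each message is written as a varint length prefix followed by the message bytes. This is plain Python and not betterproto API. The names `write_varint`, `read_varint`, `write_delimited`, and `read_delimited` are hypothetical, and the payloads stand in for serialized messages such as `bytes(message)`.

```python
import io
from typing import BinaryIO, Iterator, Optional


def write_varint(stream: BinaryIO, value: int) -> None:
    """Write a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("a length prefix cannot be negative")
    while value > 0x7F:
        # Low 7 bits go out first, with the continuation bit set.
        stream.write(bytes([(value & 0x7F) | 0x80]))
        value >>= 7
    stream.write(bytes([value]))


def read_varint(stream: BinaryIO) -> Optional[int]:
    """Read one varint, or return None if the stream is already exhausted."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            if shift == 0:
                return None  # clean end of stream, no partial varint
            raise EOFError("stream ended inside a varint")
        result |= (chunk[0] & 0x7F) << shift
        if (chunk[0] & 0x80) == 0:
            return result
        shift += 7


def write_delimited(stream: BinaryIO, payload: bytes) -> None:
    """Write one frame: varint length, then the payload itself."""
    write_varint(stream, len(payload))
    stream.write(payload)


def read_delimited(stream: BinaryIO) -> Iterator[bytes]:
    """Yield payloads until the stream ends on a frame boundary."""
    while True:
        size = read_varint(stream)
        if size is None:
            return
        payload = stream.read(size)
        if len(payload) != size:
            raise EOFError("stream ended inside a message")
        yield payload


# Usage: three frames written back to back, then read in order.
buffer = io.BytesIO()
for payload in (b"first", b"", b"x" * 300):
    write_delimited(buffer, payload)
buffer.seek(0)
frames = list(read_delimited(buffer))
```

A payload shorter than 128 bytes costs one byte of framing, which is the space saving mentioned above. A fixed 4-byte length prefix would cost four bytes for every message.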

Related Java documentation:

@Gobot1234 added the enhancement and medium labels May 25, 2023
JoshuaLeivers added a commit to JoshuaLeivers/python-betterproto that referenced this issue Aug 15, 2023
Adds support for serializing/deserializing messages and their
components to/from streams. Where possible, existing methods now
use this functionality internally, minimising code size.

This is useful as a standalone feature, and is also a step towards
the functionality requested in danielgtaylor#482

As part of this, the following functions have been created/changed:
- `betterproto.dump_varint` - this function encodes a value as a
    varint and dumps it into the provided stream. It is mostly the
    same as the existing `betterproto.encode_varint` was, but based
    around streams and with some additional error checking.

- `betterproto.encode_varint` - this existing function has the same
    effects, but now uses `betterproto.dump_varint` internally,
    keeping code size and complexity down.

- `betterproto.size_varint` - this function calculates the size of
    the varint for a given value without actually serializing
    it. This may be useful to some, and similar functions exist in
    the official C++ and other implementations. It is also used
    internally by other new functionality to reduce memory and time
    usage compared to simply running `len(...)` after serializing a
    varint.

- `betterproto._len_preprocessed_single` - calculates the size of
    the value that would be returned by `_preprocess_single`
    without fully serializing it. Used internally by new
    functionality to reduce memory and time usage over simply
    serializing and then checking the size of it.

- `betterproto._len_single` - similar to above, but for
    `_serialize_single`.

- `betterproto.load_varint` - loads a varint from a stream and
    decodes its value. Mostly the same as `decode_varint`
    already was, but based around streams.

- `betterproto.decode_varint` - existing function, has the same
    functionality as it did previously, but now uses `load_varint`
    internally to keep code size and complexity down.

- `betterproto.load_fields` - does the same as `parse_fields`, but
    loads the fields from the provided stream, rather than a
    `bytes` object. Used internally by `Message.load`.

- `betterproto.Message.dump` - does the same as `Message.__bytes__`
    already did, but dumps the results to a stream rather than a
    `bytes` object.

- `betterproto.Message.__bytes__` - does as it already did, but now
    uses `Message.dump` internally to reduce code size and
    complexity.

- `betterproto.Message.__len__` - returns the size of the encoded
    message - i.e. does the same as `len(bytes(message))` without
    fully serializing the message, reducing time and memory usage.

- `betterproto.Message.load` - loads and parses a binary encoded
    message from a stream. Similar to `Message.parse`, but
    retrieving the data from a stream rather than a `bytes` object.

- `betterproto.Message.parse` - does as it already did, but now
    uses `Message.load` internally to reduce code size and
    complexity.
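The relationship between the stream-based and bytes-based varint helpers listed above can be sketched in plain Python. This is an illustration under assumptions, not the code from the commit: every name carries a `_sketch` suffix to mark it as hypothetical, the signatures are guesses, and negative values are rejected for simplicity.

```python
import io
from typing import BinaryIO, Tuple


def dump_varint_sketch(value: int, stream: BinaryIO) -> None:
    """Encode a non-negative integer as a varint and write it to a stream."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    while value > 0x7F:
        stream.write(bytes([(value & 0x7F) | 0x80]))
        value >>= 7
    stream.write(bytes([value]))


def encode_varint_sketch(value: int) -> bytes:
    """Bytes-based variant that delegates to the stream-based one."""
    with io.BytesIO() as stream:
        dump_varint_sketch(value, stream)
        return stream.getvalue()


def size_varint_sketch(value: int) -> int:
    """Number of bytes the varint would take, without encoding it."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    # Each output byte carries 7 payload bits; zero still needs one byte.
    return max(1, (value.bit_length() + 6) // 7)


def load_varint_sketch(stream: BinaryIO) -> int:
    """Read one varint from a stream and return its decoded value."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended before the varint was complete")
        result |= (chunk[0] & 0x7F) << shift
        if (chunk[0] & 0x80) == 0:
            return result
        shift += 7


def decode_varint_sketch(buffer: bytes, pos: int) -> Tuple[int, int]:
    """Bytes-based variant: returns (value, position after the varint)."""
    with io.BytesIO(buffer) as stream:
        stream.seek(pos)
        value = load_varint_sketch(stream)
        return value, stream.tell()


# Usage: the size helper agrees with the length of the encoded bytes.
encoded = encode_varint_sketch(300)
predicted = size_varint_sketch(300)
```

Computing the size from `bit_length()` avoids allocating the encoded bytes just to measure them, which is the saving the commit message attributes to `size_varint`.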
JoshuaLeivers added a commit to JoshuaLeivers/python-betterproto that referenced this issue Aug 16, 2023
JoshuaLeivers added a commit to JoshuaLeivers/python-betterproto that referenced this issue Aug 30, 2023