Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New codec setting API #389

Open
1st1 opened this issue Nov 1, 2022 · 4 comments
Open

New codec setting API #389

1st1 opened this issue Nov 1, 2022 · 4 comments

Comments

@1st1
Copy link
Member

1st1 commented Nov 1, 2022

I'd like to continue the discussion started in #388.

Paul shared a number of great thoughts in his comment #388 (comment) and later Fantix and I had a quick 1:1 call and naturally discussed the API a bit.

Here's what I think we should do:

  1. I don't like with_client_options API that I proposed.
  2. I don't like an option to somehow configure existing codecs.
  3. I do think that we have to have a way to assign custom codecs to base scalar types. One use-case is redefining the behavior for std::json in edgedb-python. Another - supporting various decimal libraries in edgedb-js.

I propose the following API:

A new method called client.with_codec(codec: Codec). Enables the passed codec for the new client instance the method returns. Usage:

client2 = client.with_codec(JsonUnpackingCodec)

A new method called client.with_default_codecs(). Returns a new client instance with all codecs reset to the default mapping. We need this in the codegen - to reset all codecs before we run the query.

A new Codec interface:

class Codec:

    @classmethod
    def get_scalar_type_name(cls) -> str:
        """Return the name of a scalar type the codec is designed for"""
        # e.g. return 'std::int16` or `acme::my_int`

    @classmethod
    def get_codegen_type(cls) -> str:
        """Return the Python type name for the return value type"""
        # E.g. for JsonUnpackingCodec it would be `typing.Any`; for a hypothetical
        # "MyDecimalLib" codec it would be `MyDecimalLib.Decimal`
 
        # Perhaps this method should return a tuple[str, str]:
        #   first element: module to import
        #   second element: type name defined in that module

    def encode(...)
    def decode(...)

New command-line option for codegen: --with-codec that would accept a fully qualified name
to a Python factory for the codec, e.g. mypackage.mytype.EdgeDBCodec.

cc @fantix @tailhook

@tailhook
Copy link
Contributor

tailhook commented Nov 2, 2022

I'm not sure why you want a factory in --with-codec and to export JsonUnpackingCodec as a factory rather than as a const, but other than that looks good.

My thinking is that you should have very few codecs even if more than one. And instantiating one on every query (more specifically every with_codec call) makes very little sense. Overriding __new__ for singleton is quite cumbersome. So importing const is what makes the most sense.

@1st1
Copy link
Member Author

1st1 commented Nov 2, 2022

And instantiating one on every query (more specifically every with_codec call) makes very little sense.

We do that though for every type descriptor for every query. Saying that I see now that we should clarify the relationship between a odec, a factory, etc.

Sounds like client.with_codec(JsonUnpackingCodec) should receive a type, not an instance, and that fixes the proposal.

@tailhook
Copy link
Contributor

tailhook commented Nov 3, 2022

The alternative would be to set the whole codec collection:

connection.with_codecs(edgedb.codecs.DEFAULT)
connection.with_codecs(edgedb.codecs.DEFAULT_WITH_UNPACKED_JSON)

The collection itself has interface like this:

class Codecs:
    def get_type_name(self, descriptor) -> str:
        # roughly this:
        return self.__type_names[descriptor.type_name]
    def get_codec(self, descriptor) -> str:
        # roughly this:
        return self.__codecs[descriptor.type_name]

Then you can either inherit the collection type:

class MyCodecs(Codecs):
    def get_type_name(self, descriptor) -> str:
        if descriptor.type_name.starts_with('myapp'):
            return self._custom_type(descriptor)
        else:
            return super().get_type_name(descriptor)

Or use it as a mutable collection:

codecs = edgedb.codecs.DEFAULT.copy()
codecs.add_codec('myapp::MyScalar', MyScalarCodec, 'myapp.MyType')
conn.with_codecs(codecs)

This is kinda a bit more cumbersome if you need to change just one type. But is more extensible if you want to change decoding all enum types or all integer types. Also it will allow us adding more information to the Descriptor type (we can start using type ids, then add type names, then add inheritance information, etc.) without breaking compatibility.

Few variations are possible. Like maybe get_type_name should be got from the codec itself rather than collection?

@tailhook
Copy link
Contributor

tailhook commented Nov 3, 2022

To simplify single type case, we can also probably allow both ways:

connection.with_codec(MyDecimalCodec)
# being equivalent to
connection.with_codecs(
    connection.get_codecs().with_codec(MyDecimalCodec)
)

But I think this means copying the whole codec dictionary, which means it's better to do it at the global level:

DECIMAL_CODECS = edgedb.codecs.DEFAULT.with_codec(MyDecimalCodec)
de request():
    conn = conn.with_codecs(DECIMAL_CODECS)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants