Support calling methods that doesn't match a rule name #1076

plannigan · 2022-01-09T18:42:16Z

A working version that supports using a decorator to register a method to be called for a rule that does match the method name. It also supports things like a A inherits from B, which inherits from Transformer, and both A & B define overrides.

However, things get complicated with merge_transformers(). Since it will mutate base_transformer when given (explicitly documented behavior), copying over registered overrides needs to be done at the instance level instead of the class level. That in itself, is not a problem as an instance variable could be created in __init__(). The issue is that, nothing in the Transformer or Visitor inheritance chain currently defines an __init__(). This means that classes that do have an __init__(), likely don't call super().__init__() (example in test cases). This means the straightforward implementation would be a breaking change as super().__init__() would now be a requirement.

This is Python, so we can do a dynamic non-straightforward implementation, but I wanted to get some thought before I headed down that path.

Addresses #1066

erezsh · 2022-01-09T19:35:59Z

This is a creative approach. But I think the code will be much simpler if the decorator merely tagged the method, and the tree-builder used the tagging when scanning the class.

Something like:

def call_for(rule_name: str) -> Callable:
    def _call_for(func: Callable) -> Callable:
        func.__rule_name__ = rule_name
        return func
    return _call_for

Then the visitors/transformers will only need to scan the instance and build a mapping.

Maybe we'll need a few extra lines to make sure it plays nice with v_args (i.e. the attribute doesn't get removed), not sure, but either way it feels much simpler.

plannigan · 2022-01-09T22:55:46Z

Putting a marker attribute on the function could also work.

Then the visitors/transformers will only need to scan the instance and build a mapping.

I guess the key question is when should the scan happen to build the mapping? The answer to this is likely related to: What is the tolerance for how frequently does this happen? What I have now is a single scan each time a subclass is created, likely the best that can done as it never needs to be done again after that. The other extreme, would be to not maintain a mapping and scan the class every time. The middle ground dynamic option I was thinking about was lazily building the mapping the first time transform() or visit() are called. (Actually, putting some code at the start of the traversal would be an option catch cases where super().__init__() was not called and do a gradual deprecation instead of a direct breaking change.)

erezsh · 2022-01-10T08:20:39Z

Well, I can't say for sure if it's better to do it during __init__ or __init_subclass, but having to scan the MRO for that is a bit of an over-complication. Python resolves the names for us, and everything we need is in the local namespace.

It might seem nitpicky but every line we add, is a line we need to maintain. Every behavior we choose, needs to play nice with future changes. So keeping it simple is really crucial for the long run.

(sure, putting it at the beginning of transform() might also be a good option.)

erezsh · 2022-01-10T08:24:11Z

Btw, maybe we're over-engineering this. Maybe the only form of renaming we need to support is to add _, so we can do something like

try:
    return getattr(self, rule_name)
except AttributeError:
    return getattr(self, rule_name + '_')

And we memoize the lookup.

Maybe a little too magical, actually, but definitely simpler than the alternatives.

MegaIng · 2022-01-10T13:33:33Z

I would suggest allowing the user to define a method get_callback or something like that that allows a custom name resolution. This would allow the user to do what @erezsh proposed themselves. This function could also be cached in some form

We could also consider to amend that with a default search that tries to find methods having an attribute set by some call_for decorator, but IMO that is some tangential.

erezsh · 2022-01-10T15:13:08Z

Since we're brainstorming here, why not just rstrip underscores from methods? Something like

@rstrip_underscores
class Foo(Transformer):
    def from_(self, args):   # renamed to from
        ...

    def not_a_keyword_(self, args):    # also renamed, because why not
        ...

MegaIng · 2022-01-10T15:19:04Z

My suggestion of providing a generic interface also has the benefit that it works even if the grammar rules are named very different than python identifiers, for example what might happen if you have ABNF compatible identifiers with hyphens instead of underscores.

erezsh · 2022-01-10T15:24:18Z

Or we can rename hyphens to underscores and lose nothing in the process :)

MegaIng · 2022-01-10T15:27:53Z

and the spaces, and any other special characters that might appear in the name

plannigan · 2022-01-10T18:55:03Z

My preference would be to avoid a name coercion scheme where certain characters are collapsed into underscores. It can lead to whack-a-mole situation where the implementation initially seems sufficient, but then a user comes with an unexpected edge case. Additionally, it can cause situations where the grammar rule names are unique, but the converted names collide.

In general, it seems like were attempting to figure out where we want to be in the triangle of:

easy for the framework to implement
nice API for the user to work with
low performance cost at runtime

get_callback() is a generic approach that would be a bit more explicit and would support caching, which is an improvement over users needing to implement everything in __default__(). Though, I think it does index a little too heavily on "easy for the framework to implement" at the cost of "nice API for the user to work with".

The decorator approach is also generic and likely the best option for "nice API for the user to work with". In general, it also scores well on "low performance cost at runtime", but some existing implementation decisions hurt that some as well as making "easy for the framework to implement" a bit worse. I can take another shot at using a marker instead of the wrapper object using a cache instead of identifying everything at class creation.

erezsh · 2022-01-10T19:26:23Z

@plannigan I think that's a good analysis of the issue here. I'll just note that "nice API for the user to work with" is the main driver here, because otherwise everyone can do

setattr(my_transformer, 'from', from_)

And problem solved. So this is not really about adding new abilities, but about making it easy and idiomatic.

(okay, it won't solve dashes or dollar signs in the rule name, and for that get_callback() might be better. But it's still just a nice API for something the __default__() can already do)

erezsh · 2022-01-10T20:11:27Z

Actually, doesn't __getattr__() solve the need for get_callback() ? I feel like we're being silly and forgetting the basics.

Use a marker on the function instead of a wrapper class Look up overrides only when needed and cache result

plannigan · 2022-01-18T01:17:46Z

Was able to get a complete solution working that:

Uses a marker attribute instead of a wrapping class (though that did introduce other complexities)
Has an internal cache for overrides that is lazily filled
The normal getattr() behavior happens first, so there is no performance change for matching on defined user functions. However, rules that don't have matching user functions will be slower.
Is backwards compatible for users that aren't calling super().__init__().

Open questions:

Since marker attributes don't work on methods, is that actually "simpler" than a wrapping class.
If we are adding a cache for overrides, do we want to cache everything? getattr() should always be fast, but rules without user functions currently pay a heavy cost every time. ~~@lru_cache() would make this easy. Though there is a potential issue with merge_transformers() because base_transformer could have seen a prefixed rule that only now has a user function.~~ ( An @lru_cache style cache makes it cache per class, not instance. Caching everything is still possible, just not easily tucked behind a decorator.)

Support calling methods that doesn't match a rule name

1b4fd0d

plannigan added 3 commits January 17, 2022 19:41

Pivot implmentaiton

fa00eb6

Use a marker on the function instead of a wrapper class Look up overrides only when needed and cache result

Add super init calls that are now expected

fb0b46c

Merge branch 'master' into call_for_decorator

689fc20

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support calling methods that doesn't match a rule name #1076

Support calling methods that doesn't match a rule name #1076

plannigan commented Jan 9, 2022

erezsh commented Jan 9, 2022 •

edited

plannigan commented Jan 9, 2022

erezsh commented Jan 10, 2022 •

edited

erezsh commented Jan 10, 2022

MegaIng commented Jan 10, 2022

erezsh commented Jan 10, 2022 •

edited

MegaIng commented Jan 10, 2022

erezsh commented Jan 10, 2022

MegaIng commented Jan 10, 2022

plannigan commented Jan 10, 2022

erezsh commented Jan 10, 2022 •

edited

erezsh commented Jan 10, 2022

plannigan commented Jan 18, 2022 •

edited

Support calling methods that doesn't match a rule name #1076

Are you sure you want to change the base?

Support calling methods that doesn't match a rule name #1076

Conversation

plannigan commented Jan 9, 2022

erezsh commented Jan 9, 2022 • edited

plannigan commented Jan 9, 2022

erezsh commented Jan 10, 2022 • edited

erezsh commented Jan 10, 2022

MegaIng commented Jan 10, 2022

erezsh commented Jan 10, 2022 • edited

MegaIng commented Jan 10, 2022

erezsh commented Jan 10, 2022

MegaIng commented Jan 10, 2022

plannigan commented Jan 10, 2022

erezsh commented Jan 10, 2022 • edited

erezsh commented Jan 10, 2022

plannigan commented Jan 18, 2022 • edited

erezsh commented Jan 9, 2022 •

edited

erezsh commented Jan 10, 2022 •

edited

erezsh commented Jan 10, 2022 •

edited

erezsh commented Jan 10, 2022 •

edited

plannigan commented Jan 18, 2022 •

edited