Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix error on instance property and init-only variable with the same name in a dataclass #17219

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

roberfi
Copy link
Contributor

@roberfi roberfi commented May 6, 2024

Fixes #12046

It does not raise "already defined" fail if already existing node is InitVar, so name of the property will be redefined. Then, when getting the method of that property, it looks for a possible redefinition.

Copy link
Contributor

github-actions bot commented May 6, 2024

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@roberfi roberfi marked this pull request as ready for review May 6, 2024 19:07
@stroxler
Copy link
Contributor

The actual fix for the reported bug looks like it works to me, and I think it's probably a good thing for get_method to return the last definition when there are redefinitions of a method (this could come up in library code that isn't type checked, and using the last method seems right).

But I'm a little concerned that the example reported in the bug is actually relying on undefined behavior that just sort of happens to work.

@dataclass
class Test:
    foo: InitVar[str]
    _foo: str = field(init=False)

    def __post_init__(self, foo: str) -> None:
        self._foo = foo

    @property
    def foo(self) -> str:
        return self._foo

    @foo.setter
    def foo(self, value: str) -> None:
        self._foo = value

does indeed work at runtime, because dataclasses relies on inspect.get_annotations and Test.__annotations__ winds up recording the attribute annotation.

But the dataclasses docs do not attempt to specify behavior here, nor do the unit tests at https://github.com/python/cpython/blob/main/Lib/test/test_dataclasses/__init__.py validate anything.

And we can very easily run into poorly-defined behavior; for example if I tweak the first two lines in the example valid to provide a default argument but this code

@dataclass
class Test:
    foo: InitVar[str] = 6
    ... etc ...

then it will not work as expected, trying to use the default value will give

>>> Test()
Test(_foo=<property object at 0x108aa04a0>)

because even though dataclasses is interpreting the annotation using __annotations__ which the runtime is setting based on assignments in the class body, dataclasses in interpreting the value using the actual name bound to Test.foo which is no longer 6 by the time the decorator executes.

cc @JelleZijlstra / @hauntsaninja - I think this question is close to being either a typing spec or a CPython standard library question.

Comment on lines 3166 to +3171
if name in cls.names:
node = cls.names[name].node
if isinstance(node, FuncBase):
return node
elif isinstance(node, Decorator): # Two `if`s make `mypyc` happy
return node
else:
return None
elif possible_redefinitions := sorted(
[n for n in cls.names.keys() if n.startswith(f"{name}-redefinition")]
):
node = cls.names[possible_redefinitions[-1]].node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless of what we decide to do about dataclasses that redefine InitVar fields, this seems like it might be a good change.

In general, validating that there aren't redefinitions (which is happening in semanal) seems like it should be independent of getting the appropriate value for a name; when dealing with dependencies, using the last-defined value for a name seems like a good thing.

I wonder if it would be worth pulling into a helper function, and using not just for methods but other attributes as well?

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented May 25, 2024

Oh, nice catch! I didn't fully realise that InitVar could have a default.

From the CPython side, redefinition in general is problematic, even with normal fields. In my similar-to-dataclasses library, I have some heuristic based runtime checks against this case. Basically I complain if something in __annotations__ has a value matching the following:

if _is_property_like(value) or (
    isinstance(value, types.FunctionType)
    and value.__name__ != "<lambda>"
    and value.__qualname__.startswith(cls.__qualname__)
):

(my similar-to-dataclasses lib is sometimes used in ways that make it more likely that people will try to clobber things)

CPython is probably stuck with its behaviour given that folks are clearly relying on it. Either way, it should definitely have tests for this case. Want to open a PR?

From the mypy side, it seems like a half dozen folks have run into this, so it would be nice to support. The ideal behaviour would match the runtime behaviour, i.e. we allow redefinition when there isn't a value and we disallow redefinition when there is a value. @roberfi are you interested in adding the warning back when there's a default value? :-)

@roberfi
Copy link
Contributor Author

roberfi commented May 25, 2024

Oh, that's crazy! This kind of implementation could never have a default value.

Thank you so much for the catch and the explanation. I was a very good one!

@roberfi are you interested in adding the warning back when there's a default value? :-)

Yes, sure, I will add the check/warning and a test to verify that it's working. Let's do it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error on instance property and init-only variable with the same name in a dataclass
3 participants