Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-74929: Implement PEP 667 #115153

Merged
merged 44 commits into from May 4, 2024
Merged

gh-74929: Implement PEP 667 #115153

merged 44 commits into from May 4, 2024

Conversation

gaogaotiantian
Copy link
Member

@gaogaotiantian gaogaotiantian commented Feb 8, 2024

This is a prototype implementation of PEP 667.
All the examples given in PEP 667 passed.

The code only serves as a prototype, the proxy object is not even tracked by GC. However, the most essential features are implemented.

frame.f_locals is now a real-time proxy for the actual local variables in the frame (there's no mapping yet but that's an implementation detail).

locals() behaves basically like before.

All tests passed except for 3 - all expected

  • test_sys for frame size - yes it changed
  • test_frame - The pop method is not implemented yet
  • test_listcomps - the behavior of frame.f_locals changed so the corresponding test is not valid anymore

Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for doing this.

I've a few comments, but nothing major.

Include/internal/pycore_frame.h Outdated Show resolved Hide resolved
Include/internal/pycore_frame.h Outdated Show resolved Hide resolved
Objects/frameobject.c Outdated Show resolved Hide resolved
Objects/frameobject.c Outdated Show resolved Hide resolved
Objects/frameobject.c Show resolved Hide resolved
@carljm
Copy link
Member

carljm commented Feb 9, 2024

Thanks for doing this! I was hoping this would get done for 3.13 but hadn't found time to do it yet.

I don't agree that test_frame_locals in test_listcomps should be expected to fail under PEP 667. f_locals is supposed to be a real-time proxy for the live variables in the frame. Within a comprehension, comprehension-scoped variables are live and should be present in f_locals. The fact that this test is failing suggests that the PR does not yet correctly handle "fast-hidden" locals.

EDIT: on second thought, I realized the problem is likely that f_locals is now a live view instead of a snapshot, so by the time we look for the key, the comprehension is no longer running, and a is no longer a live name. So you are right that the test as written should be expected to fail; it should be updated to take a copy of f_locals within the comprehension.

@gaogaotiantian
Copy link
Member Author

EDIT: on second thought, I realized the problem is likely that f_locals is now a live view instead of a snapshot, so by the time we look for the key, the comprehension is no longer running, and a is no longer a live name. So you are right that the test as written should be expected to fail; it should be updated to take a copy of f_locals within the comprehension.

Yes, that's why the test fails. The test is val = [sys._getframe().f_locals for a in [0]][0]["a"] so when it tries to access ["a"], it's already out of the comprehension scope. This actually represents one of the changes we made to f_locals.

@gaogaotiantian
Copy link
Member Author

I cleaned up the code a bit to address the review and added the GC part in. This is still in draft state, which only aims to demostrate the possibility of PEP 667. There's plenty of work to be done before this can be merged in. The current goal is to make it enough for the PSF to decide whether to accept PEP 667. Let me know if there's still missing pieces (like do we want to actually remove all the FAST/LOCAL conversion calls to prove the point).

@markshannon
Copy link
Member

Removing the fast locals <--> slow locals conversion is key to fixing #74929. I think that is worth doing before submitting PEP 667 to the SC.

@gaogaotiantian since you are doing most of the work on this, would you like to be added as co-author of PEP 667?

@gaogaotiantian
Copy link
Member Author

Sure, I'm happy to be listed as the co-author. I'll work on the fast vs local thing and check if #74929 is fixed.

@gvanrossum
Copy link
Member

FWIW the discussion has been started at https://discuss.python.org/t/pep-667-consistent-views-of-namespaces/46631.

We now need at least the following updates to the PEP:

  • Add Discussion-To header pointing to that Discourse thread
  • Add Tian Gao to list of Authors
  • Link to implementation prototype PR from the Implementation section

@gaogaotiantian
Copy link
Member Author

I've already added myself to the list of authors with Mark's guidance. Let me know if you want me to make the PR to update the PEP.

@gvanrossum
Copy link
Member

Go right ahead.

@gaogaotiantian
Copy link
Member Author

@markshannon okay I made all fast/local related functions no-op. It did not break anything new and #74929 (comment) is not an issue anymore. The original test case in #74929 has been accidentally fixed in 3.11, is there any repro case that I can test against?

@carljm
Copy link
Member

carljm commented Mar 1, 2024

Is there a reason you're leaving the failing tests as failing? I think it is easier for reviewers to evaluate the compatibility impact on tests by seeing the required changes to tests in the PR, rather than by having to go look at the CI test results.

@gaogaotiantian
Copy link
Member Author

The current implementation is just a prototype and far from being done. I can make some modifications to the test so they can pass with the change if that's preferred.

@gaogaotiantian
Copy link
Member Author

We have two more months until feature freeze. If we want to land this on 3.13, we probably should push it forward soon.

@gvanrossum
Copy link
Member

How is the Discourse discussion on this topic going? Is it substantial enough to send the PEP to the SC? Do we need to make any PEP updates first?

@gaogaotiantian
Copy link
Member Author

I made a PR to the PEP and it's approved but not merged (I think Mark has to review and approve it?), if you meant the link changes mentioned before. I'm not sure if the discussion thread attracts enough attention. What point should we get for the Disclosure discussion?

Python/ceval.c Outdated Show resolved Hide resolved
@markshannon
Copy link
Member

markshannon commented Mar 5, 2024

We need to decide which mapping methods we want FrameLocalsProxy to support.
Listing all the methods on UserDict:

Dunder methods

  • Comparison operators, __le__, etc.
  • __or__, __ior__, __ror__
  • __contains__
  • __copy__
  • __getitem__
  • __setitem__
  • __delitem__ (see comment below under normal methods)
  • __format__
  • __len__
  • __iter__
  • __getitem__
  • __reversed__
  • __sizeof__
  • __repr__, __str__
  • Pickling support, __reduce__, etc

I think we need to support all the above, except pickling, and that __copy__ should return a normal __dict__, not a shallow copy.

Normal methods

  • clear,
  • copy,
  • get,
  • items,
  • keys,
  • pop,
  • popitem,
  • setdefault,
  • update,
  • values

Methods that remove items are tricky as we cannot properly delete fast locals; we just set them to None and emit a warning.
So, I think it best not to implement clear, pop, or popitem.

@markshannon
Copy link
Member

We don't need to implement them all before submitting the PEP though, just decide on which methods to support.

@gaogaotiantian
Copy link
Member Author

Should we follow some standard for the methods we should implement? The original PEP mentioned that we should do a MutableMapping, which probably lists all the methods that need to be implement.

For the remove related methods - should the debugger be able to decref the local variable? Just like del val. I think making that equivalent to f.locals.pop("val") is reasonable. Should we even emit a warning for that? Did we do that now? I think we emit a warning when we have to assign None to unassigned local variables.

@@ -622,10 +622,10 @@ def test_exception_in_post_comp_call(self):

def test_frame_locals(self):
code = """
val = [sys._getframe().f_locals for a in [0]][0]["a"]
val = "a" in [sys._getframe().f_locals for a in [0]][0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we've changed the intent of the test here. It was previously checking that a was added to the f_locals snapshot during the listcomp, and now it's checking that it is not visible through the proxy afterwards.

To cover both aspects, I think we need two test cases (first one shows that "a" is accessible in the f_locals while the listcomp is running, second shows that it is gone afterwards):

    def test_frame_locals_in_listcomp(self):
        # Iteration variable is accessible via f_locals proxy while the listcomp is running
        code = """
            val = [sys._getframe().f_locals["a"] for a in [0]][0]
        """
        import sys
        self._check_in_scopes(code, {"val": 0}, ns={"sys": sys})

    def test_frame_locals_after_listcomp(self):
        # Iteration variable is no longer accessible via f_locals proxy after listcomp finishes
        code = """
            val = "a" in [sys._getframe().f_locals for a in [0]][0]
        """
        import sys
        self._check_in_scopes(code, {"val": False}, ns={"sys": sys})

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very interesting point because it introduced a new issue - should we include the hidden fast local in f_locals when it's module/class scope?

val = [sys._getframe().f_locals["a"] for a in [0]][0]

What should the sys._getframe().f_locals inside the list comprehension return? It's it a dict or a proxy? Do we consider that "within" the function scope and return a proxy? Will that include the module-level variables as well? Do we return a dict? Should we include variable a? If so, writing to the key 'a' won't have any effects because it's a fast variable, how do we deal with that inconsistency?

@markshannon

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hidden fast-locals were designed to mimic the prior behavior when comprehensions were previously implemented as one-shot functions. So as much as is feasible, the behavior of f_locals with hidden fast-locals should mirror how it would behave if the comprehension were a function.

Copy link
Member

@carljm carljm Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second proposed test in @ncoghlan 's comment is not correct. Nor is the test as currently modified in the PR.

If you access f_locals on a frame you got while inside the comprehension, it should include the hidden fast-local (i.e. the comprehension iteration variable), and that variable should still be present (in the f_locals of the frame you got while inside the comprehension) even after the comprehension is done. This behavior is the same as getting a frame (or its f_locals) inside a function and returning it from the function. This is the current behavior in main, and this behavior is important to keep in PEP 667.

Of course if you access the frame outside the comprehension, the comprehension's hidden fast locals should not be present in its f_locals.

Currently in main, module globals are present in the f_locals of a frame accessed inside a module-level comprehension. This differs from a function, but this difference was called out in PEP 709 and determined to be OK, and I think it's still OK.

writing to the key 'a' won't have any effects because it's a fast variable

I suspect this is also not a problem in practice, though if we can reasonably return a proxy that can support it, it would be even better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you access f_locals on a frame you got while inside the comprehension, it should include the hidden fast-local (i.e. the comprehension iteration variable), and that variable should still be present (in the f_locals of the frame you got while inside the comprehension) even after the comprehension is done.

I don't think this is correct. f_locals is a proxy, so as the frame changes, so will the proxy. Inside the comprehension, the comprehension variables will be visible. After the comprehension has executed, the comprehension variables no longer exist.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the key here is "as if it's a one-shot function". That's infeasible with the current compiler because it's not a function anymore. For a function, we have a dedicated frame object that keeps the local variables, which can last longer than the intepreter frame as long as there are references on it. In that way, we can always access the variables on that frame even when the actual interpreter frame is gone.

However, with inline comprehension, it fakes the "function call" and clear the hidden variable - there's no mechanism currently to prevent the variable from being cleared. If we want this, we'll probably need to change the compiler and the interpreter which will definitely postpone this PEP to 3.14 (and would probably make the compiler code more complicated).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For class and module level comprehensions, can we return a proxy if f_locals is accessed inside the comprehension and a dict if it's accessed outside which does not have the hidden variable? I feel like that's more consistent.

Copy link
Member

@carljm carljm Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct. f_locals is a proxy, so as the frame changes, so will the proxy. Inside the comprehension, the comprehension variables will be visible. After the comprehension has executed, the comprehension variables no longer exist.

It depends whether we are aiming for "correctness" for the actual situation (that the comprehension doesn't have its own frame), or for the backwards-compatible fiction (which PEP 709 only partially maintained, but did jump through hoops to maintain in terms of visibility of class-scoped variables) that comprehensions still behave as if they have their own frame. But as @gaogaotiantian points out, it may be very hard or impossible to maintain that fiction in this case, while still having f_locals be a proxy.

TBH in the long term I think it would be better to move away from that fiction anyway, but it will have backwards-compatibility implications. Today in main, sys._getframe.f_locals() inside a module-level comprehension returns a dictionary that includes the comprehension iteration variable, and that variable is still visible in that dictionary after the comprehension is done.

I don't know why someone relying on that couldn't just use locals() instead, though, and locals() should be fine for backwards-compatibility; it remains a dict. So I don't know how significant that backwards-incompatibility will be in practice.

For class and module level comprehensions, can we return a proxy if f_locals is accessed inside the comprehension and a dict if it's accessed outside which does not have the hidden variable? I feel like that's more consistent.

Certainly the dict returned when f_locals is accessed outside the comprehension should not have the hidden local in it.

I think returning a proxy inside the module- or class-scoped comprehension is more internally consistent for PEP 667; returning a dict (that includes the comprehension iteration variable) would be a bit more backward-compatible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow the details, but I agree that people who access f_locals are better off being shown implementation details that may vary by Python version than a costly fiction.

@gaogaotiantian
Copy link
Member Author

I don't like the idea of support all types of keys. It has no practical use except for using f_locals as a pure dict which is something we are steering away from. I think this is something similar to the copy() issue - I don't believe backwards compatibility is the priority here because we will break something. We won't support pop() for example.

I think the most important thing here is "what makes sense for this proxy", rather than "how could we make it less incompatible with 3.12". If we are destined to break the backwards compatibility, break it correctly.

@markshannon
Copy link
Member

I don't like the idea of support all types of keys. It has no practical use except for using f_locals as a pure dict which is something we are steering away from. I think this is something similar to the copy() issue - I don't believe backwards compatibility is the priority here because we will break something. We won't support pop() for example.

I see your point, but the PEP as written allows arbitrary keys to be stored in the f_locals object.

@gaogaotiantian
Copy link
Member Author

I see your point, but the PEP as written allows arbitrary keys to be stored in the f_locals object.

I missed the line and I re-read it. Then sure we can support that. I think it's just putting it into f_extra_locals which would not affect much if users are only using unicodes.

@gaogaotiantian
Copy link
Member Author

gaogaotiantian commented May 2, 2024

Okay I resolved most of the comments, and I believe there's one thing left here - comprehensions.

I'm just going to jump to my proposal and see if that makes sense. If not, we can discuss it more.

import sys
def f():
    x = 1
    [print(sys._getframe().f_locals) for a in [0]]
    print([sys._getframe().f_locals for a in [0]][0])
    print(sys._getframe().f_locals)
f()

will print

# Inside comprehension, show hidden variable a and locals in the outside frame
{'x': 1, 'a': 0}
# f_locals from inside but read outside, no hidden variable, only locals.
# This does not match the fiction because the variable should "exist" if it's a frame, but in reality it's gone
{'x': 1}
# f_locals outside, just local variables
{'x': 1}

Notice all the values above are proxies.

import sys
class C:
    x = 1
    [print(sys._getframe().f_locals) for a in [0]]
    print([sys._getframe().f_locals for a in [0]][0])
    print(sys._getframe().f_locals)

will print (ignore __module__ and __qualname__)

# This is a proxy without class-level variable
{'a': 0}
# Empty proxy
{}
# A dict with class level variable
{'x': 1}

An alternative would be:

# A dict, but you can't write to variable a to change the local value
{'a': 0, 'x': 1}
# A dict
{'x': 1}
# Still a dict
{'x': 1}

The latter seems more consistent with the function level result, and it makes locals() backwards compatible, but it creates a horrible exemption where the f_locals is not "write through".

Maybe we could stick to a simple rule here - f_locals will always return something that's write through. locals() always return dict(proxy) or the dict itself (in class/module level, outside of the comprehension).

However, this will break the backwards compatibility for locals() in comprehension in class/module level.

@carljm
Copy link
Member

carljm commented May 2, 2024

I think that proposal looks reasonable.

I think your first proposal for the class behavior is better. Comprehensions in class scope cannot see class-level variables (as if the comprehension were still a function, since functions defined in a class scope can't see variables in the enclosing class scope either), so those class variables should not show up when f_locals is accessed inside the comprehension.

@gaogaotiantian
Copy link
Member Author

gaogaotiantian commented May 2, 2024

I think your first proposal for the class behavior is better. Comprehensions in class scope cannot see class-level variables (as if the comprehension were still a function, since functions defined in a class scope can't see variables in the enclosing class scope either), so those class variables should not show up when f_locals is accessed inside the comprehension.

But we will probably treat module-level the same as class-level. Do you think it's reasonable to have a module-level comprehension to return a proxy without the global variables?

edit: I guess that makes sense as well. Even in a function in a module, f_locals should not return any global variables.

@gaogaotiantian
Copy link
Member Author

Okay I finished the last piece and cleaned up the code a bit.

Now we have some very consistent behavior for f_locals and locals().

locals() will be either

  • dict(f_locals) for function scope frames and all comprehensions
  • or f_locals for class and module frames

f_locals will be

  • a proxy for function scope frames and all comprehensions (comprehension variables are not accessible outside of the comprehension)
  • or a dict for class and module frames

f_locals will always be write through, no exceptions.

This breaks the behavior introduced by pep 709 that [locals() for a in [0]] will include outside variables if it's under a module or class level scope, but I think this specific behavior is not desired nor intentional, it's a compromise to begin with. If we consider the comprehension as a "function", the locals() should never include the outside variables.

locals() in a function scope will include the cell variables as expected.

Let me know if there are missing pieces that I don't know of.

@markshannon markshannon self-requested a review May 3, 2024 20:21
Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Thanks for your perseverance.

last chance for comments.
I'm going to merge this in 12 hours or so, if no one objects.

@markshannon markshannon merged commit b034f14 into python:main May 4, 2024
36 checks passed
@gaogaotiantian gaogaotiantian deleted the pep667 branch May 6, 2024 19:42
SonicField pushed a commit to SonicField/cpython that referenced this pull request May 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants