Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-117139: Convert the evaluation stack to stack refs #118450

Open
wants to merge 48 commits into
base: main
Choose a base branch
from

Conversation

Fidget-Spinner
Copy link
Member

@Fidget-Spinner Fidget-Spinner commented Apr 30, 2024

Only tags all pointers 0b11 and NULL and immortal stuff as deferred for now.

@gvanrossum
Copy link
Member

Could we hold off on this until 3.14? It's only a week until feature freeze for 3.13 (at which point main becomes 3.14), and this looks like a lot of churn in a time where we all would like stability to merge things that are actually needed in 3.13.

@Fidget-Spinner
Copy link
Member Author

Could we hold off on this until 3.14? It's only a week until feature freeze for 3.13 (at which point main becomes 3.14), and this looks like a lot of churn in a time where we all would like stability to merge things that are actually needed in 3.13.

Alright. I always forget that ten other people are rushing in things at the same time as me. I don't think this will add any value for CPython 3.13 anyways at the moment.

@colesbury
Copy link
Contributor

Could we hold off on this until 3.14?

That makes sense. I'll start providing feedback and reviewing this now, but it won't be merged in 3.13.

Copy link
Contributor

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments below, mostly minor. I'll take a closer look at bytecodes.c and the generated cases next.

Python/gc_free_threading.c Outdated Show resolved Hide resolved
Include/internal/pycore_call.h Show resolved Hide resolved
Include/internal/pycore_ceval.h Outdated Show resolved Hide resolved
Include/internal/pycore_dict.h Outdated Show resolved Hide resolved
Include/internal/pycore_frame.h Outdated Show resolved Hide resolved
Objects/dictobject.c Show resolved Hide resolved
Objects/frameobject.c Outdated Show resolved Hide resolved
Python/bytecodes.c Outdated Show resolved Hide resolved
Python/bytecodes.c Outdated Show resolved Hide resolved
Python/bytecodes.c Show resolved Hide resolved
@Fidget-Spinner
Copy link
Member Author

Fidget-Spinner commented May 10, 2024

I have made the requested changes; please review again

Namely:

  1. Manually convert most of the bytecodes.c to stackref, instead of making it magically done by the cases generator.
  2. Fix ownership problems when using steal or borrow.
  3. Tag all pointers 0b11.

To be conservative, this PR only defers immortal objects at the moment and NULL.

@bedevere-app
Copy link

bedevere-app bot commented May 10, 2024

Thanks for making the requested changes!

@markshannon: please review the changes made to this pull request.

@bedevere-app bedevere-app bot requested a review from markshannon May 10, 2024 18:18
static inline PyObject *
PyStackRef_Get(_PyStackRef tagged)
PyStackRef_To_PyObject_Borrow(_PyStackRef tagged)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These names do not look like other CPython function naming conventions to me. Can we do something like:

  • PyStackRef_AsPyObjectBorrow(_PyStackRef) -- instead of this funcion

And, for example:

  • PyStackRef_AsPyObjectSteal
  • PyStackRef_FromPyObjectBorrow
  • PyStackRef_FromPyObjectSteal

@Fidget-Spinner
Copy link
Member Author

Note: I'm aware that certain decref options don't follow the semantics laid out in the issue. For now they're still safe because they will never operate on deferred objects. In PEP 703's plan, (non-immortal) floats and ints will not be deferred. So we're safe for now. However, in the future if we want to defer everything, then those need to be removed #118930.

@Fidget-Spinner
Copy link
Member Author

5% performance hit on single-threaded default build: https://github.com/faster-cpython/benchmarking-public/tree/main/results/bm-20240511-3.14.0a0-144a6fa

@markshannon
Copy link
Member

I really think we need a design document (in an issue) that we can discuss before doing more work on this.
I will attempt to summarize what I believe to be the design at the moment:

  • Values on "localsplus", that is the fast locals and evaluation stack, of frames may be deferred. This includes suspended generators, so that deferred references may occur in the heap without being present in the frame stack
  • A reference can only be deferred if it references an object whose class is in a fixed set of "deferrable" classes.
  • Because deferred references can exist in the heap alone, objects that can have deferred references to them can only collected by scanning the heap.
  • Calls to Py_Dealloc can cause a garbage collection and updating stacktop across all calls to Py_Dealloc and any other call that would escape the interpreter is deemed too expensive.
  • This has the following consequences:
    • Since the stack pointer may not visible to the GC it must scan the whole stack including the dead part.
    • Unused stack items must be initialized to NULL, so that they don't point to invalid memory.
    • It is OK to scan dead pointers to deferred objects, as only the GC can free those objects.
    • I don't understand how non-deferred objects are freed, as we cannot safely look at non-deferred pointers.

@colesbury
Copy link
Contributor

I really think we need a design document (in an issue)...

I'll add a more detailed summary of the GC changes to:

I don't understand how non-deferred objects are freed, as we cannot safely look at non-deferred pointers

We don't look at non-deferred pointers above stacktop (beyond checking the tag).

To be clear, what we are describing here is an extra step in the GC that scans each thread's stack and ensures that deferred referenced objects from thread's stacks are kept alive. These frames are mostly not tracked and would not otherwise be considered by the GC.

We don't need to do anything special in this step for non-deferred references because those objects are kept alive by their reference count (i.e., computed gc_refs > 0), just like they are today.

@colesbury
Copy link
Contributor

I've updated #117376.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants