-
-
Notifications
You must be signed in to change notification settings - Fork 29.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set up tagged pointers in the evaluation stack #117139
Comments
So the mentions of unboxed ints are speculative and not part of the initial implementation, right? And the tagged pointers just have their tag bit(s) removed when used in the "blob of C code" that's the instruction definition's body. For example,
would expand its body to SETLOCAL(oparg, (PyObject *)(value.bits & ~TAGS)); How do pointers receive their tag bit? Does every stack push just "or in" the tag bit(s)? I recall trying to do this manually (before we had the generator) and it was a lot of work -- the code generator should make this much simpler. It also was quite a bit slower -- have you thought about this? My biggest question is what does the tag bit mean? Is there a specific section of PEP 703 I should read to learn more about this? |
PEP 703 doesn't go into detail about the tag bits. I'd expect it to match the description in Mark's issue, except with unboxed ints not part of the initial implementation:
We should only enable this in the free-threaded build to start. We do not want to introduce any performance regressions in the default build. If it turns out to be a net win for the default build, then we can consider enabling it there. Otherwise, we can revisit that in 3.14, if we pursue tagged integers.
Yeah, most pushes just "or in" the tag bit, which is an immortal check. So for, example cpython/Python/generated_cases.c.h Line 2922 in 40d75c2
That might be generated instead as: #define PACK(obj) ((PyTaggedObject){.bits = (uintptr_t)obj | _Py_IsImmortal(obj)})
stack_pointer[-1] = PACK(iter);
I think we want to use tagged pointers in the locals as well as the evaluation stack, but it may be easier to start with just using tagged pointers the evaluation stack. In that case
If we use tagged pointers for the locals as well, then we could skip the extra work in MotivationThe primary motivation is to avoid reference count contention in multithreaded programs on things like functions/methods (in the free-threaded build). In particular, the benefit (from avoid reference counting contention) is going to be in For example, currently PyObject *attr = ...;
Py_INCREF(attr);
stack_pointer[-1] = attr; We want the generated code to instead look something like the following, but probably refactored with helper macros/functions: PyObject *attr = ...;
PyTaggedObject attr_tag;
if (_Py_IsImmortalOrDeferred(attr)) {
attr_tag.bits = (uintptr_t)attr | IS_DEFERRED;
}
else {
attr_tag.obj = attr;
}
stack_pointer[-1] = attr_tag; |
Okay... So from this definition #define PACK(obj) ((PyTaggedObject){.bits = (uintptr_t)obj | _Py_IsImmortal(obj)}) I gather that the tag bit is (by default) only set if the object is immortal. From your Motivation here (and also from the section on deferred refcounting in PEP 703) I take it that the tag bit would also be set for selected objects like functions and methods. Mark's original issue describes variants of INCREF/DECREF to check for the tag bit; presumably these variants should only be used in the interpreter. @Fidget-Spinner Good luck! |
--------- Co-authored-by: Sam Gross <655866+colesbury@users.noreply.github.com>
--------- Co-authored-by: Sam Gross <655866+colesbury@users.noreply.github.com>
Feature or enhancement
Proposal:
This issue mostly mirrors Mark's original one at faster-cpython/ideas#632.
As part of PEP 703, tagged pointers will help achieve deferred reference counting.
Tagged pointers will also help the default GIL builds in 3.14 and future with unboxed ints.
Here's the initial design (open for comments):
We change the evaluation stack from
PyObject *
toPyTaggedObject
.PyTaggedObject
looks like this.Note: I am limiting the tagged pointers to just the evaluation stack (this is up for debate as well). The main reason is to keep the change as minimally invasive as possible for 3.13 and not leak out any implementation details for now. For example, I don't want to support tagged pointers in C API functions. This may/will change in 3.14.
Thanks to the cases generator, we can then autogenerate all the proper accessors to said object. For example, a stack effect of
PyObject *
will generate the codeeffect.obj
, while a stack effect ofPyTaggedObject
(the new default) will generate the code(PyObject *)(effect.bits & ~TAGS)
or something similar.Unboxed ints will also help both free-threaded and non-free-threaded builds by reducing refcounting even more for simple arithmetic operations (and making integer operations native). However, I am aiming for deferred tags in 3.13, and anything else is a plus.
Further note: due to wanting to respect immediate finalizers, we are not deferring that many objects. So this may not be a win on default builds. On free-threaded builds, I see a slight (10%) speedup on fibonacci microbenchmarks. Will benchmark on the default bulid as well and see the results.
Has this already been discussed elsewhere?
No response given
Links to previous discussion of this feature:
No response
Linked PRs
Tasks
The text was updated successfully, but these errors were encountered: