Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce critical section boilerplate in type slot implementations #114258

Open
colesbury opened this issue Jan 18, 2024 · 8 comments
Open

Reduce critical section boilerplate in type slot implementations #114258

colesbury opened this issue Jan 18, 2024 · 8 comments

Comments

@colesbury
Copy link
Contributor

colesbury commented Jan 18, 2024

Feature or enhancement

PEP 703 in large part relies on replacing the GIL with fine grained per-object locks. The primary way to acquire and release these locks is through the critical section API (#111569). In #111903, we added support for generating the necessary critical section calls in Argument Clinic wrapped methods. In #112205, this was extended to getters and setters as well.

However, some object behavior is implemented using "tp slots", which are not supported by Argument Clinic. These functions often need critical sections for thread-safety; that means more boilerplate code.

Here are some possible ideas on how to address this:

Add support for "tp slots" to Argument Clinic

We could add support for certain tp slots functions to Argument Clinic. This would provide a consistent way to add critical sections across both slots, getters/setters, and "regular" methods. The disadvantage is that "tp slots" don't need much argument parsing, so they may otherwise not be a great fit for Argument Clinic

Use C macros to wrap slot implementations with critical sections

We could implement an internal-only header that provides macros to wrap a given "tp slot" function with a critical section. For example, something like:

static PyObject *
list_richcompare_impl(PyObject *v, PyObject *w, int op)
{
   // implement richcompare unchanged
}

// generates a list_richcompare() function with the correct Py_BEGIN/END_CRITICAL_SECTION()
_Py_CRITICAL_SECTION_TP_RICHCOMPARE(list_richcompare, list_richcompare_impl);

I think this would not be too difficult to implement -- probably easier than modifying Argument Clinic. The disadvantage is the lack of uniformity around how we add critical sections to functions.

Keep the status quo

We can always just keep adding the Py_BEGIN/END_CRITICAL_SECTION() calls manually for tp slots.

cc @erlend-aasland, @corona10

Linked PRs

@erlend-aasland
Copy link
Contributor

The disadvantage is that "tp slots" don't need much argument parsing, so they may otherwise not be a great fit for Argument Clinic

OTOH, we may want to expand Argument Clinic's feature set in the future12, and we already added stuff that were not a great fit3: getters and setters.

Footnotes

  1. https://github.com/faster-cpython/ideas/issues/546

  2. https://github.com/faster-cpython/ideas/issues/565

  3. wrt. the original goals of Argument Clinic, that is

@corona10
Copy link
Member

corona10 commented Jan 19, 2024

I would like to suggest delaying the decision until we notice how many cases will be affected to be thread-safe.

The following scenario can be imagined from now on,
A. If every tp_xxx should be thread-safe (AFAIK, We don't want to declare crirical_sections for all tp_compare):
Declare critical_section implicitly before calling tp_richcompare

res = (*f)(v, w, op);

B. Not all of them, some of them:
Provide wrapping macros that can be easily used.

_Py_CRITICAL_SECTION_TP_RICHCOMPARE_IMPL(list_richcompare_impl)
// ...
.tp_richcompare = _Py_CRITICAL_SECTION_TP_RICHCOMPARE(list_richcompare_impl)

C. Very few of them:
Use our hands.. :)

@erlend-aasland
Copy link
Contributor

erlend-aasland commented Jan 19, 2024

I would like to suggest delaying the decision until we notice how many cases will be affected to be thread-safe.

I'd expect (correct me if I'm wrong) most GC related slots to be critical sections: clear, traverse, and free. With more than a hundred heap types in Modules/, that means more than three hundred cases already. I also expect (correct me if I'm wrong) can throw in rich-compare, iter, next, repr. ISTM the best way forward is through a carefully designed feature set in Argument Clinic. That will make sure the code base is more maintainable. The consistency will make it easier to reason about the code. It will also be easier to make changes to the boiler-plate code.

+1 for Argument Clinic.

@corona10
Copy link
Member

corona10 commented Jan 19, 2024

I'd expect (correct me if I'm wrong) most GC related slots to be critical sections: clear, traverse, and free.

@colesbury cc @erlend-aasland
There will be the stop the world phase(Objects can not be touched at this phase) in free threading CPython, should we care about it?

@colesbury
Copy link
Contributor Author

I am mostly thinking about the non-GC related slots, although critical sections on certain tp_finalize slots would make sense. The tp_traverse slot is only called during stop-the-world (no locking necessary). The tp_clear() is only called on unreachable objects and doesn't support resurrection, so I'm inclined not bother with locking there either, but we might end up revisiting that.

I counted about 14 slots on list that probably need locking or some other thread-safety. There's probably about as many on dict and some on set as well. And then there are other classes that provide tp_richcompare or tp_repr and need locking for those.

I think it's worth doing, but I'm not sure how much work it is to get Argument Clinic to support these slots. It would be less overall work if it's implemented sooner, but it's also not the most urgent thing right now, so I'm not sure how we will prioritize it.

@erlend-aasland
Copy link
Contributor

I'll try to cook up a list of needed AC changes.

@corona10
Copy link
Member

corona10 commented Feb 1, 2024

For the design,
If we decide to support this feature in AC.

The design should be decided first and my rough idea is using the reserved dunder method convention.

  • object.__tp_repr__ => tp_repr
  • object.__tp_richcompare__ ==> tp_richcompare
  • object.__tp_clear__ => tp_clear
  • object.__tp_finalize__ => tp_finalize

@erlend-aasland
Copy link
Contributor

[...] my rough idea is using the reserved dunder method convention.

Currently, the denylist is made up of the Python equivalents; see:

cpython/Tools/clinic/clinic.py

Lines 2573 to 2642 in 1390796

unsupported_special_methods: set[str] = set("""
__abs__
__add__
__and__
__call__
__delitem__
__divmod__
__eq__
__float__
__floordiv__
__ge__
__getattr__
__getattribute__
__getitem__
__gt__
__hash__
__iadd__
__iand__
__ifloordiv__
__ilshift__
__imatmul__
__imod__
__imul__
__index__
__int__
__invert__
__ior__
__ipow__
__irshift__
__isub__
__iter__
__itruediv__
__ixor__
__le__
__len__
__lshift__
__lt__
__matmul__
__mod__
__mul__
__neg__
__next__
__or__
__pos__
__pow__
__radd__
__rand__
__rdivmod__
__repr__
__rfloordiv__
__rlshift__
__rmatmul__
__rmod__
__rmul__
__ror__
__rpow__
__rrshift__
__rshift__
__rsub__
__rtruediv__
__rxor__
__setattr__
__setitem__
__str__
__sub__
__truediv__
__xor__
""".strip().split())

Since we will be allowing only a subset of this list, I think it would be beneficial to follow that established convention.

As for the feature, I've stated hashing out some refactorings1.

Footnotes

  1. breaking up state_modulename_name(), breaking up render_function(), breaking up output_templates())...

erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 2, 2024
Yak-shaving in preparation for adding support for slots.

- factor out return converter resolver
- factor out cloning
erlend-aasland added a commit that referenced this issue Feb 15, 2024
Refactor state_modulename_name() of the parsing state machine, by
adding helpers for the sections that deal with ...:

1. parsing the function name
2. normalizing "function kind"
3. dealing with cloned functions
4. resolving return converters
5. adding the function to the DSL parser
erlend-aasland added a commit that referenced this issue Mar 4, 2024
* Move param guard to param state machine
* Override return converter during parsing
* Don't use a custom type slot return converter; instead
  special case type slot functions during generation.
woodruffw pushed a commit to woodruffw-forks/cpython that referenced this issue Mar 4, 2024
…hon#116170)

* Move param guard to param state machine
* Override return converter during parsing
* Don't use a custom type slot return converter; instead
  special case type slot functions during generation.
adorilson pushed a commit to adorilson/cpython that referenced this issue Mar 25, 2024
…hon#116170)

* Move param guard to param state machine
* Override return converter during parsing
* Don't use a custom type slot return converter; instead
  special case type slot functions during generation.
diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
…hon#116170)

* Move param guard to param state machine
* Override return converter during parsing
* Don't use a custom type slot return converter; instead
  special case type slot functions during generation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants