New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-111997: C-API for signalling monitoring events #116413
base: main
Are you sure you want to change the base?
Conversation
iritkatriel
commented
Mar 6, 2024
•
edited by bedevere-app
bot
edited by bedevere-app
bot
- Issue: C-API for signalling monitoring events #111997
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I understand correctly that the CodeLike
type is expected to be provided by user code? How do we guarantee its struct layout then? Why not define a public object struct for it that users can use C-struct inheritance with? (I.e. if they need more fields on their side, they can put the public struct into the first field of their own struct.)
I think we should provide CodeLike in some form, but for now I put it in the text file so we can talk about how and where we want it to be. The offset to location mapping should also be part of this.
@markshannon did you intend for these to be private? |
My thinking was that it is better to keep any function private unless we need it to be public. It sounds like they need to be public. How about this?
Something like this: #ifndef Py_LIMITED_API
static inline int
PyMonitoring_FirePyStartEvent(PyMonitoringState *state, PyObject *codelike, uint32_t offset)
{
if (state->active)
return _PyMonitoring_FirePyStartEvent(state, codelike, offset);
else
return 0;
}
#else
extern int
PyMonitoring_FirePyStartEvent(PyMonitoringState *state, PyObject *codelike, uint32_t offset);
#endif |
The "code like" object is entirely opaque to the monitoring machinery. As far as it is concerned, a location is a pair of an object and an integer index. So, no C struct, but a Python abstract base class. |
Ah, right. I hadn't looked into generator/async functions yet for the implementation. I can see the version being helpful there. They keep their execution state, so there is a place for them to also store away their profiling state. |
I'm not sure what others mean by "scope", but what I mean in this context is lexical scope. Take a simple function: def f(a, b):
return a + b With the enter and exit inserted it would look something like: def f(a, b):
ENTER(&f.version, f.state)
tmp = a + b
EXIT(f.state)
return tmp Generator and async functionsdef f(a, b):
yield a
yield b
return a+b becomes: def f(a, b):
ENTER(&f.version, f.state)
tmp = a
EXIT()
yield tmp
ENTER(&f.version, f.state)
tmp = b
EXIT()
yield tmp
ENTER(&f.version, f.state)
tmp = a + b
EXIT()
return tmp |
Sorry for not thinking of this earlier, but rather than passing typedef struct _monitoring_state_array {
const uint8_t *event_types;
Py_ssize_t length;
PyMonitoringState states[1];
} PyMonitoringStateArray; That way |
But you're not expecting the RETURN and YIELD events to come outside of the BEGIN/EXIT scope events, are you? I was expecting an order like |
The VM disables tracing, no need for you to do it. Thanks for pointing out exception state. |
But wouldn't users end up with casting all over the place, due to the varying struct length? |
Yes, assuming |
I meant the expected order of C-API calls to trigger events. Sorry for being unclear. |
Add
Used as follows:
Does that help? |
True, but it is probably better to have fewer arguments as it causes fewer spills around the calls. Since we have no users, and you are the only prospective user at the moment, feel free to suggest any ergonomic or efficiency improvements you like 🙂 |
If I move the version there, then I also need to move the monitoring state array there. Both are intended to stay in sync. Does it matter if the monitoring state gets updated concurrently while a function is running? If I tie it to the function/generator/code object, then it will be used by all executions of the function, including concurrent ones. But in the end, it's only restricting the events that get created to the latest set of events for which there are listeners. Those events that get reported to the listeners will be selected based on the listener's current interest anyway. That's not really different from a listener changing its interest while another function is running. So, should we expect the version+statearray pair to be tied globally to the lexical scope, not to a specific execution? |
|
Yes, they should be kept together.
Updating the array, and synchronizing updates are the VMs problem, not yours.
Mostly yes. Not a specific execution, but it depends what you mean by "globally". If you intend to support multiple interpreters, then it will need to be tied to a specific heap object. |
I think this is in a "good enough" shape for merging it into beta-1. From my side, feel free to merge it. I'll continue with the implementation in Cython to make sure we have something proven to be usable for the release. |
I've updated the docs with this. Does it look right now? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! The usage is clear now :)
The thread about exception state still looks unresolved, however.
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Thank for pointing this out. I've added it now. |
It is expected that monitoring functions are not called with an exception set, | ||
except for those which are firing exception-related events. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Py_THROW
is exception-related, but with this PR, calling PyMonitoring_FirePyThrowEvent
with an exception set triggers an assertion error.
What's the expected behaviour -- should the user wrap the calls in PyErr_GetRaisedException
/PyErr_SetRaisedException
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should probably say that the exception should not be set even for exception-related events (the exception should be an arg). @markshannon do you agree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That means users typically need to call PyErr_GetRaisedException
& PyErr_SetRaisedException
themselves, outside the internal _PYMONITORING_IF_ACTIVE
check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could do it for them.
@encukou Please see the latest commit re exception state. |