Add a `ms_threshold` flag to profiler: filter blocks slower than a ms value, fix recursive profiling (#2281, base: `master`)
Conversation
pyqtgraph/debug.py (outdated)

```diff
@@ -535,7 +535,7 @@ def mark(self, msg=None):
         pass

 _disabledProfiler = DisabledProfiler()

-    def __new__(cls, msg=None, disabled='env', delayed=True):
+    def __new__(cls, msg=None, disabled='env', delayed=True, gt: float = 0):
```
Probably should make the default value be `0.` or `0.0` if it's going to be type-annotated to a float.

```python
>>> a = 0
>>> isinstance(a, float)
False
```
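To make the reviewer's point concrete: Python annotations are hints, not runtime checks, so `gt: float = 0` runs fine, but `0.0` keeps the default consistent with the declared type. A minimal sketch (the function name and parameter are hypothetical, not pyqtgraph's API):

```python
def profile_block(ms_threshold: float = 0.0) -> bool:
    # The annotation is not enforced at runtime: callers may still pass an
    # int, but a float default keeps the declared type and the value in sync.
    return isinstance(ms_threshold, float)

print(profile_block())   # True: the default 0.0 is a real float
print(profile_block(5))  # False: an int slips through, annotations are hints only
```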
good idea, are we cool with introducing type annots now?
Hi @goodboy, thanks for the PR. The feature looks good to me; I agree with its usefulness. I would suggest renaming the keyword argument from `gt` to something more descriptive.
@j9ac9k sounds good! I have a couple more commits to address issues with nested profilers that I'm going to append here. I will also make the suggested changes 🏄🏼
This is a more explicit input argument name, which we assign to an internal `._mt` instance var. This patch also includes some improvements for recursive profiling, see below.

Further, this commit makes some larger alterations:
- drops the class variable `._msgs` since it was causing issues on nested profiler use. My presumption is that this code was written with the expectation that stacks could unwind (per thread) out of order? I don't see why else you'd have `.flush()` only print on a `._depth < 1` condition..
- drops `.flush()`, since `.finish()` is the only place this method is called (even when scanning the rest of the code base), thus dropping a call frame.
- formatting adjustments around this `class` as per the `flake8` linter.

Details, in order from the bullets above:
- when a top-level (aka `._depth == 0`) profiler was deleted by the GC, all messages from all underlying (aka recursive) profilers were also wiped, meaning you'd get a weird case where only the "Exit" msg was showing. Now we instead expect each instance to clear its own msg set. There should also be no further reason for the `._depth < 1` check, since each instance unwinds and emits its own messages, presumably as the call stack unwinds via the GC.
- I see no reason to keep `.flush()` when it was a trivial method that can be inlined into `.finish()`, and I doubt there's much of a use case for "incrementally flushing" unless the user is using one "god profiler" throughout their entire stack..
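The threshold mechanics described in the commit message above can be sketched roughly like this. This is a minimal, self-contained illustration under stated assumptions: `MiniProfiler` and its internals are hypothetical stand-ins, not pyqtgraph's actual `debug.Profiler`.

```python
import time


class MiniProfiler:
    """Sketch of threshold-based filtering: only marks slower than
    ``ms_threshold`` milliseconds are recorded and reported."""

    def __init__(self, msg: str = "profile", ms_threshold: float = 0.0):
        self._mt = ms_threshold          # mirrors the `._mt` instance var mentioned above
        self._msgs = [f"{msg} - start"]  # per-instance list, not a class var
        self._start = self._last = time.perf_counter()

    def __call__(self, msg: str) -> None:
        now = time.perf_counter()
        elapsed_ms = (now - self._last) * 1000
        # the filter: record only marks slower than the threshold
        if elapsed_ms > self._mt:
            self._msgs.append(f"  {msg}: {elapsed_ms:.4f} ms")
        self._last = now

    def finish(self) -> None:
        total_ms = (time.perf_counter() - self._start) * 1000
        # len - 1 excludes the start line, so this counts *kept* marks only
        self._msgs.append(f"{len(self._msgs) - 1} marks, total {total_ms:.4f} ms")
        print("\n".join(self._msgs))
        self._msgs.clear()               # each instance clears its own messages
```

Usage would look like `p = MiniProfiler("render", ms_threshold=10.0)`, calling `p("draw curves")` after each stage, then `p.finish()`; sub-threshold stages simply never appear in the printout.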
@j9ac9k changed the name and added more changes that you'll likely want to audit 😂
```python
        msg=None,
        disabled='env',
        delayed=True,
        ms_threshold: float = 0.0,
```
pijyoi mentioned some methods/args are snake-cased, but only to indicate they are not part of the explicit API. Not to bikeshed too much, but would it be better to call this `msThreshold` (or `thresholdMs`, if unit suffixes typically go at the end of a number)?
I honestly am indifferent but personally don't love the camel case stuff.
It's up to y'all.
@goodboy I tried the following test:

```python
def test_profiler():
    """
    Test the profiler by creating a profiler and calling it.
    """
    import time
    logger.setLevel(logging.DEBUG)
    profiler = Profiler("test", disabled=False)

    def nested_func():
        profiler = Profiler("nested", disabled=False)
        time.sleep(0.1)
        profiler("after nested sleep")
        profiler()

    nested_func()
    profiler()
    profiler()
```

Produces:
I'm also a little confused by the issue you reported with recursive profiling; can you give me a small code example to understand? I tried the current branch with:

```python
def fib_bad_recursive(n):
    """
    A bad recursive implementation of fibonacci.
    """
    if n < 2:
        return n
    profiler = Profiler("fib_bad_recursive", disabled=False)
    return fib_bad_recursive(n-1) + fib_bad_recursive(n-2)


def fib_memo(n, memo={}):
    """
    A memoized implementation of fibonacci.
    """
    if n < 2:
        return n
    profiler = Profiler("fib_memo", disabled=False)
    if n not in memo:
        memo[n] = fib_memo(n-1) + fib_memo(n-2)
    return memo[n]


if __name__ == '__main__':
    from pyqtgraph.debug import Profiler
    n = 10
    fib_bad_recursive(n)
    print('\n=====================\n')
    fib_memo(n)
```

Produced:
Yeah, it probably is confusing, since the client system we're using to call this code is all async-driven, and it seems the whole unwinding-via-garbage-collection thing isn't playing great; I couldn't even originally see half of the profile stack levels until I made this change. Not entirely sure why. I can probably whip up something, but honestly, for such a small change as this we'll probably just end up copying this code out, since it's getting resistance for merge yet again 😂 It looks like you already have something more extensive in #2322 anyway 😎
I wouldn't be opposed to the simpler solution you presented here (as I mentioned in my PR). The only thing giving me pause on this PR is the reverse-order printout for the MWE I made. The hesitation is certainly not against this PR; I think it's a good idea 👍
Do you mean …?
See my comment in #2322, but I think we might just not worry too much about keeping this and go with a more "serious" solution in the longer run for our code base: https://github.com/plasma-umass/scalene. I really just was using this profiler for graphics stuff, but honestly it would make more sense to be profiling things properly (even …)
Yes, we have a single-threaded graphics loop entirely driven by async data flows from multiple processes.
Sorry, missed this. Yeah, I mean, if it's something obvious, can you just revert what you think is causing it, PR it to this branch, and then I'll give it a shot? Sorry I'm not in graphics land at the moment, but will definitely come back to this! Also thanks for all the feedback 😸
@ntjess just went back to this again and I think you're right; it's the change of …. Not sure why I couldn't get it working before (maybe there were other issues in the threshold logic before it was working properly?), but it seems to be doing about what I'd expect now. I'll push up this last little change and you can tell me if it matches your expectations 😎
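For context on the nesting discussion, here is a hedged sketch of the class-level depth-counter pattern the thread keeps referring to (`._depth`). Names and structure are illustrative only, not pyqtgraph's actual implementation: a single counter shared by all instances is bumped on creation and decremented on finish, so each nested profiler knows its own indentation level.

```python
class DepthProfiler:
    _depth = 0  # class-level counter shared by all instances

    def __init__(self, msg: str):
        self.msg = msg
        self.depth = DepthProfiler._depth  # capture this instance's nesting level
        DepthProfiler._depth += 1          # any profiler created inside us is one deeper

    def finish(self) -> str:
        DepthProfiler._depth -= 1          # unwind one level on exit
        return "  " * self.depth + f"{self.msg} done"


outer = DepthProfiler("outer")
inner = DepthProfiler("inner")  # created while `outer` is live -> depth 1
print(inner.finish())           # prints "  inner done" (indented one level)
print(outer.finish())           # prints "outer done"
```

The pitfall debated in this PR is what happens when finish/teardown is driven by garbage collection in async code: if the unwind order is not strictly LIFO, a shared counter (or shared message list) can misattribute or wipe output, which is why per-instance message storage was proposed.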
Oh, and yes, for our purposes we're trying to get under 10ms per full graphics render cycle on interactions with >= 1M datums.. so yeah, it kinda does matter. I'm already dealing with a bit much skew due to the overhead latency this original impl contains 😂
Details of the original patch to upstream are in: pyqtgraph/pyqtgraph#2281 Instead of trying to land this we've opted to just copy out that version of `.debug.Profiler` into our own internals (luckily the class is entirely self-contained) until such a time when we choose to find a better dependency as per #337
I found this super useful for narrowing down latency issues (from slowest to fastest), much easier than having a wholesale or "tag based" selection of sections enabled.
If no one is interested in this, I'll stick with our ongoing approach of pulling it into our dependent project, and y'all can come ask for it later if wanted 🖖🏼
Update: new changes to aid with recursive profiling; the details are in the commit message quoted earlier in this thread.
Some notes (old):
- ~~found the `.__del__()` finishing to be highly unreliable (particularly in async code driving sections with profiling)~~ now solved by making `._msgs` an instance var as per above
- ~~"nested profiling" seems to be something really handy but I'm still not entirely sure how it works; do you need to pass in the caller's instance to make it work or should it work with new instances always?~~ seems to work great now, even with using the new `ms_threshold` param after the latest patch.

TODO: