New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce profiler interface to benchmark framework. #1533
base: main
Are you sure you want to change the base?
Conversation
- Users of benchmark library provides implementations (e.g., defines how counters are collected and reported) and registers profiler once. - The profiler is init'ed once per 'RunSpecifiedBenchmarks' and injected in a way to skip capturing states for 'Benchmark::Setup' or 'Benchmark::TearDown'. The profiler calls 'Finalize' to do post-processing.
|
Hi Roman, |
Then this doesn't do the right thing still - how does this deal with |
{Pause,Resume}Timing and profilers are added to solve relatively orthogonal problems.
Another reason to de-couple timer and profiler is generability. Time and perf counters are measured per-thread; while profilers may need to have view across threads (e.g., contention profilers, or if the profiler rely on global states) -> if each thread invokes profiler in a synchronized way, synchronization overhead could change the performance data themselves.
Regarding complexity, the interface and its injection is not adding much maintenance overhead (e.g., doesn't mess up the codebase or make future contributions hard) or benchmark execution overhead (optional and off if users don't need a profiler) |
My point is that it may be fine for your use-case to ignore |
The current way of injection (based on MemoryManager but not exactly the same) aims to support
This reply above means to elaborate why skipping Setup or TearnDown could improve profile quality for profile collection purpose, and (stepping back on how to handle PauseTiming) the implications on generality (State and timers are per-thread, while a profiler might need to update data across concurrent threads) Given that calling {Pause,Resume}Timing per iteration should generally be avoided (consider it not common), I wonder if it makes sense to prioritize the use case for this PR. If it helps avoid confusion I could add comments in class description (and PR description) to scope the intended use case. If there is a profiler use case that needs to handle per-thread timers in the future, this interface class won't make contributions harder by then or mess up the code base in terms of maintenance -> support could be added for per-thread purposes (likely new classes, not extending this one). |
i don't understand the use-case. before working on such a large and invasive change it's generally a good idea to open an issue (feature request) with some examples of what can't be done today but would be possible with this change. i'm pretty sure if there's a consistent issue with timing expensive setup/teardown this is going to be an issue for all users and something we should resolve within the library without some external hooks. |
RunSpecifiedBenchmarks
and injected in a way to skip capturing states forBenchmark::Setup
orBenchmark::TearDown
. The profiler callsFinalize
to do post-processing.