Function order, timing units, percentage #47

Open
Scraft opened this issue Jun 25, 2015 · 5 comments
Comments

@Scraft

Scraft commented Jun 25, 2015

I would find it very useful if the profiler breakdown was closer to other profilers I have used in the past, specifically:

[image]

It would be nice if the browser view could (either by default, or as an option) display like:

[image]

So:

  • An expandable tree view for the function stack
  • Functions ordered by descending percentage
  • Overall percentage next to each function (so 87.2% of the frame is spent in PcEngine::Tick and 61.4% is spent in PcEngine::Render; looking further down, 23.8% is spent in OpenGLSprite::Render). Some profilers allow you to toggle whether the percentages are relative (to parent) or absolute. I believe absolute is the more useful overall.
  • Specify (somewhere!) what the units are (microseconds?). Perhaps milliseconds would be better (so 16565 might become 16.565); in games at least, we typically talk about timing in ms.

This is of course on top of:

  • Combining samples from the same function (which I have hacked in)
  • Displaying the hit count of each function

I appreciate there is quite a lot to this enhancement request, and perhaps parts of it would be better suited to subtasks, but I'll put it all here initially; maybe you'll be able to comment on whether the above fits with your vision for this project. From my perspective it is a neat little profiler, and I'd like to be able to make more use of it (I would also like my dev team to use it). If it worked somewhat similarly to existing projects it would help a lot.
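
The ordering and absolute-percentage parts of this request can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Node` struct and the functions `sort_tree`, `absolute_percent` and `print_tree` are hypothetical names invented here, not part of Remotery, and the times are assumed to be inclusive microseconds per function.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical call-tree node; not Remotery's actual data layout. */
typedef struct Node {
    const char *name;
    double time_us;          /* inclusive time in microseconds (assumed) */
    struct Node **children;  /* array of pointers to child nodes */
    int child_count;
} Node;

/* qsort comparator: larger time first. */
static int by_time_desc(const void *a, const void *b) {
    const Node *x = *(const Node *const *)a;
    const Node *y = *(const Node *const *)b;
    return (x->time_us < y->time_us) - (x->time_us > y->time_us);
}

/* Recursively order every node's children by descending time. */
static void sort_tree(Node *n) {
    if (n->child_count > 0)
        qsort(n->children, (size_t)n->child_count,
              sizeof n->children[0], by_time_desc);
    for (int i = 0; i < n->child_count; i++)
        sort_tree(n->children[i]);
}

/* Absolute percentage: share of the whole frame, not of the parent. */
static double absolute_percent(const Node *n, double frame_time_us) {
    assert(frame_time_us > 0.0);
    return 100.0 * n->time_us / frame_time_us;
}

/* Print the tree with two spaces of indentation per call depth. */
static void print_tree(const Node *n, double frame_time_us, int depth) {
    printf("%*s%s  %.1f%%\n", depth * 2, "", n->name,
           absolute_percent(n, frame_time_us));
    for (int i = 0; i < n->child_count; i++)
        print_tree(n->children[i], frame_time_us, depth + 1);
}
```

Calling `sort_tree` on the root and then `print_tree` with the total frame time lists the most expensive child first at every level. Switching to relative percentages would only mean passing the parent's time instead of the frame time when recursing.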

@jswigart

If you do this, please make it an option. Visualizing the ordering of the calls made in code is important to see dependencies between threads and such, and ordering them by their time is not useful enough to lose the order context of the capture data.

@dwilliamson
Collaborator

Thanks for the comments - it's all a bit too big to be one issue that can be closed off as achieved but I'll leave it open in the hopes that others might pitch in ideas.

Here are some points on all that:

  • Visualising a sample tree by anything other than call hierarchy will likely not be done for the real-time view as it will be constantly swapping around as percentages change. When paused, I might get a chance to add that mode but I'm hoping by then that the UI issues for Remotery will have been solved such that others can provide those modes as a patch later.
  • Timing should be in milliseconds but they are in microseconds because as a low level engine coder, I have spoken in microseconds for many years now. If a system's timing differences are measured in milliseconds it's either not been optimised by me yet or it's not important enough to receive my attention yet :)
  • I will be adding inclusive/exclusive timings so that you can see how much time is spent in the parent and children. And I will be adding percentage equivalents of these, too. Hit count will also go in once sample merging has been achieved.
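
For readers unfamiliar with the terms, inclusive time covers a sample and everything it calls, while exclusive time is what remains after subtracting the children. A minimal sketch, assuming a hypothetical `Sample` struct (not Remotery's own type) that stores inclusive microseconds:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sample record; field names are invented for this sketch. */
typedef struct Sample {
    const char *name;
    unsigned long long inclusive_us;  /* time in this sample and its children */
    const struct Sample *children;    /* array of direct children */
    size_t child_count;
} Sample;

/* Exclusive time: inclusive time minus the inclusive time of direct children. */
static unsigned long long exclusive_us(const Sample *s) {
    unsigned long long in_children = 0;
    for (size_t i = 0; i < s->child_count; i++)
        in_children += s->children[i].inclusive_us;
    assert(in_children <= s->inclusive_us);  /* children cannot outlast the parent */
    return s->inclusive_us - in_children;
}

/* Percentage equivalent of either timing; returns 0 for an empty whole. */
static double percent_of(unsigned long long part_us, unsigned long long whole_us) {
    return whole_us ? 100.0 * (double)part_us / (double)whole_us : 0.0;
}
```

For example, a parent with 1000 µs inclusive and two children of 300 µs and 200 µs has 500 µs exclusive, which is 50% of the parent.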

To be clear, Remotery development is stalled slightly right now because of two issues:

  • The public interface does not allow easy adding of new sample types (e.g. D3D12) or sample behaviours (e.g. Merge samples) without significantly ballooning the number of entry-points, making things considerably more confusing.
  • The UI for Remotery does not scale in terms of a larger application. Window sizes are manually specified, control regions can't resize and a bunch of other stuff makes it really difficult to add the new windows/controls I want. Not to mention CSS is awful for this use-case. I am mid-way to a solution to this but it's not in a public branch.

Once those are solved, we should see a snowball of new features so please keep ideas coming in.

@dwilliamson
Collaborator

This is vaguely what I'm chasing and intend to improve on. This is a remote profiler I built for Splinter Cell some 8 years ago:

[image: debug]

@Scraft
Author

Scraft commented Jun 29, 2015

I'll throw some comments back:

> Visualising a sample tree by anything other than call hierarchy will likely not be done for the real-time view as it will be constantly swapping around as percentages change.

Certainly I have used profilers that do this; some entries will jump around, but a lot will remain in order. As a reference point, the OSX/iOS default profiler 'Instruments' works this way. Hitting 'pause' to get a correct ordering isn't a huge inconvenience, but personally I find it useful to be able to just see in real time what the bottlenecks are. Unless there is a strong reason not to do it, I think it'd be worth doing (or have a checkbox to specify whether it live updates or not).

> Timing should be in milliseconds but they are in microseconds because as a low level engine coder

I don't mind too much what the units are, but right now there isn't anything in the UI that specifies what the units are, so a column heading saying 'μs' or whatever would be nice. Also, having thousand separators would be useful (especially if it is staying in microseconds). For your reference, I profile at both the low level (where microseconds are useful) and the high level, as in I will 'profile' a game loading: the level may take 30 seconds, and I just expand out the profile to see where those 30 seconds are going. Right now that means numbers like 30000000, 28750000, which I find hard to read at a glance. One reason I thought milliseconds may be better is that with three decimal places you get microsecond precision anyway; however, if you are profiling at fractions of a microsecond I appreciate there is an issue here.
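
Both readability suggestions (thousand separators and a millisecond display with three decimals) are simple formatting steps. The sketch below shows them in C for illustration only; the helper names `format_thousands` and `format_ms` are hypothetical, and the actual viewer change would live in the JavaScript UI rather than in code like this.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write value into out with ',' thousand separators, e.g. 30000000 -> "30,000,000". */
static char *format_thousands(unsigned long long value, char *out, size_t out_size) {
    char digits[32];
    snprintf(digits, sizeof digits, "%llu", value);
    size_t len = strlen(digits);
    size_t needed = len + (len - 1) / 3 + 1;  /* digits + separators + NUL */
    assert(out_size >= needed);
    size_t o = 0;
    for (size_t i = 0; i < len; i++) {
        /* Insert a separator whenever a multiple of three digits remains. */
        if (i > 0 && (len - i) % 3 == 0)
            out[o++] = ',';
        out[o++] = digits[i];
    }
    out[o] = '\0';
    return out;
}

/* Write microseconds as milliseconds with three decimals, e.g. 16565 -> "16.565 ms".
   Integer arithmetic keeps full microsecond precision. */
static char *format_ms(unsigned long long us, char *out, size_t out_size) {
    snprintf(out, out_size, "%llu.%03llu ms", us / 1000, us % 1000);
    return out;
}
```

With these, `format_thousands(30000000, buf, sizeof buf)` yields `30,000,000` and `format_ms(16565, buf, sizeof buf)` yields `16.565 ms`, matching the 16565 to 16.565 example from the original request.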

> The public interface does not allow easy adding of new sample types

I am happy to try and help out a little; is there something specific I can do to help with this issue? I have taken a little look at the code and understand the problem. I write nearly all C++ (vs C) so my initial solutions to some of the problems aren't going to be suitable for this project. However, if there is anything very specific you need someone to do, let me know. For the issue about the UI I don't feel I can offer any assistance as I've done very little web stuff. I appreciate that you doing everything yourself will give the best results as you know exactly what you want, but I'd like to accelerate the development of this project if I can help at all.

Finally, the Splinter Cell remote profiler looks nice. Where I am at is working at a studio with several developers, making games using an in-house engine. Currently I do most of the profiling, and I am a little frustrated by the lack of decent libraries that expose a nice web-interface for games (doing things like logging, live adjusting variables, profiling, crash reporting, etc.), but rather than starting anything from scratch I'd rather contribute to promising projects (which this is, obviously). Prior to this studio, I worked at Travellers Tales, in the tech team, working across all platforms, using various profilers. Since leaving and going to a small studio, working on mobile, I have certainly missed some of the tools you get when working on the consoles (but on a positive point, things are getting better: Xcode now has a nice GPU Frame Capture utility for iOS projects along with a reasonable profiler, and there are initiatives like Remotery).

@dwilliamson
Collaborator

> Unless there is a strong reason not to do it, I think it'd be worth doing (or have a checkbox to specify whether it live updates or not).

Time is the reason at the moment so Remotery is primarily built to serve my immediate needs. I will add various sorts and real-time updates "may" fall out of it for free but we shall have to see.

> I don't mind too much what the units are, but right now there isn't anything in the UI that specifies what the units are, so a column heading saying 'μs' or whatever would be nice. Also, having thousand separators would be useful (especially if it is staying in microseconds).

Stuff such as this is very easy to add (see SampleWindow.js). However I am not on that branch right now but will happily accept pull requests.

> I am happy to try and help out a little; is there something specific I can do to help with this issue? I have taken a little look at the code and understand the problem.

I need to remove the copy/paste API (rmt_BeginCPUSample, rmt_BeginD3D11Sample, etc.) and replace it with something that's easier to extend without defining a host of new macros. So something like:

rmt_BeginSample(CPU, name);

I then need to add optional flags that modify the behaviour of a sample: does it collapse to parent, collapse to neighbour, collapse to sibling, etc? All this bearing in mind that I want to add arbitrary variable watchers to this and part of that task may cross-over with the sample tasks.
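
One possible shape for such an entry-point, purely as a sketch: a single macro pastes the sample type onto an enum prefix and forwards optional behaviour flags to one function. Every name below (`demo_BeginSample`, `demo_begin_sample`, `DemoSample`, the `DEMO_SF_*` flags) is hypothetical and invented for illustration; none of it is Remotery's actual API, and the function only records its arguments rather than timing anything.

```c
#include <assert.h>

/* Hypothetical behaviour flags; combine with bitwise OR. */
enum {
    DEMO_SF_NONE                  = 0,
    DEMO_SF_COLLAPSE_TO_PARENT    = 1 << 0,
    DEMO_SF_COLLAPSE_TO_NEIGHBOUR = 1 << 1,
    DEMO_SF_COLLAPSE_TO_SIBLING   = 1 << 2
};

/* Adding a sample type means adding one enum value, not a new macro family. */
typedef enum {
    DEMO_SAMPLE_CPU,
    DEMO_SAMPLE_D3D11
} DemoSampleType;

typedef struct {
    DemoSampleType type;
    const char *name;
    unsigned flags;
} DemoSample;

/* Single function behind the macro; a real profiler would start timing here. */
static DemoSample demo_begin_sample(DemoSampleType type, const char *name,
                                    unsigned flags) {
    DemoSample s = { type, name, flags };
    return s;
}

/* Token pasting turns demo_BeginSample(CPU, ...) into DEMO_SAMPLE_CPU. */
#define demo_BeginSample(type, name, flags) \
    demo_begin_sample(DEMO_SAMPLE_##type, (name), (flags))
```

A call such as `demo_BeginSample(CPU, "Tick", DEMO_SF_COLLAPSE_TO_SIBLING)` then covers both the sample type and its behaviour with one macro. Whether this fits alongside the planned variable watchers is an open question that this sketch does not address.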

I've not thought much of this through in any detail of late. We may find that the goal is neither possible nor reasonable.
