Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Attach & detach to a running process #47

Open
iddan opened this issue Aug 20, 2018 · 26 comments
Open

Feature Request: Attach & detach to a running process #47

iddan opened this issue Aug 20, 2018 · 26 comments

Comments

@iddan
Copy link
Contributor

iddan commented Aug 20, 2018

I'd like to be able to run a long time process (for instance a server), attach to it, trace the calls, detach from it and get the frame timings.

@joerick
Copy link
Owner

joerick commented Aug 22, 2018

Cool idea. Could perhaps be possible by using https://github.com/lmacken/pyrasite?

@iddan
Copy link
Contributor Author

iddan commented Aug 22, 2018

Yes, but it requires gdb which I think is the opposite of what this project aims for.

@iddan
Copy link
Contributor Author

iddan commented Aug 22, 2018

For my use case a specific solution using https://github.com/Microsoft/ptvsd would do but if there's a more general way to do it I prefer so.

@joerick
Copy link
Owner

joerick commented Aug 22, 2018

true. I didn't actually manage to get pyrasite working on my mac just now. how are you imagining this working then? there's a dormant bit of pyinstrument that's already loaded into your server, and another process can message it to start/stop profiling and send output?

@iddan
Copy link
Contributor Author

iddan commented Aug 22, 2018

That is definitely one option. I'd try to look for a standard way to attach to debug process and get back with my findings.

@itamarst
Copy link

itamarst commented Jul 27, 2019

I had similar idea, and here's how I would design it:

  1. Program does pyinstrument.enable_background_mode("/tmp/profile/") on startup.
  2. This installs a signal handler.

Later:

  1. User sends SIGUSR2 to process. This does Profiler().start()
  2. User waits a minute.
  3. user sends SIGUSR2 to the process. This stops the profiler and dumps to /tmp/profile/<current timestamp>

@itamarst
Copy link

Would you like the above as a PR? I could also do it as separate package.

@iddan
Copy link
Contributor Author

iddan commented Jul 28, 2019

I think it can be useful as part of this package

@joerick
Copy link
Owner

joerick commented Jul 28, 2019

Yes, I think this is a good idea! I'm a little wary of using signals for this, it's probably okay, and they'll get us the ability to drop a callback onto the main thread. But I guess profiling other threads would not be possible using this, or adding options to the profile request.

For that reason, and to make it a little more user friendly, I'd suggest changing the interface to this a little - instead of documenting the SIGUSR2 mechanism, we'd have a pyinstrument --attach <pid>, which would wait for ctrl-c and then print output. Internally, it would use the mechanism you describe. We'd also provide a default profile location so the program could just write pyinstrument.enable_background_mode().

By documenting pyinstrument --attach <pid> instead of SIGUSR2, we're not locked in and can change the underlying mechanism later, to add options like which thread to profile, or if we want to stream output.

Would you like the above as a PR?

Yes! Happy to:

  • get a PR for the mechanism you describe, and I can implement my interface on top
  • or, a PR for the whole lot would be amazing.

@itamarst
Copy link

Ah, threads :( It looks you can only inject exceptions in other threads with PyThreadState_SetAsyncExc, not arbitrary code? That's annoying. I wonder if there's another way...

@joerick
Copy link
Owner

joerick commented Jul 28, 2019

Would main-thread-only be a deal-breaker for your use case?

@itamarst
Copy link

Just thinking about common use cases like threaded WSGI servers.

@joerick
Copy link
Owner

joerick commented Jul 28, 2019

yeah..... it's a pain for that use case. I'm wondering if we could use threading.setprofile to cover that off? But that would require the profiling to be always active, even if nothing was listening. Sounds like a performance hit.

@itamarst
Copy link

itamarst commented Jul 28, 2019

If you did that on C level it would be C-level if statement on every traceable event (if !actively_profiling { return}. Which is... still a performance hit, though one wonders how much.

@joerick
Copy link
Owner

joerick commented Jul 28, 2019

I've been looking around at other profilers, there's one called 'yappi' that can do this - it can make itself the profiler for all active threads, by editing a couple of fields on the PyThreadState object... check it out

@itamarst
Copy link

That makes me a little twitchy, but I guess if the setup API had strict Python version checks...

@itamarst
Copy link

@itamarst
Copy link

For sys.setprofile it says "The function is thread-specific, but there is no way for the profiler to know about context switches between threads, so it does not make sense to use this in the presence of multiple threads." Is that still the case, or can pyinstrument reasonably get meaningful results when installed in multiple threads?

@joerick
Copy link
Owner

joerick commented Jul 28, 2019

The function is thread-specific, but there is no way for the profiler to know about context switches between threads, so it does not make sense to use this in the presence of multiple threads.

That confuses me, tbh. I guess it's partially true, in the sense that when looking at one thread's results you can't easily correlate which actions causes which CPU usage? But for the case of the multithreaded WSGI server, there aren't context switches that we care about because each request is handled on one thread only. Am I reading that right?

Original implementation is presumably based on https://github.com/python/cpython/blob/b9a0376b0dedf16a2f82fa43d851119d1f7a2707/Python/ceval.c#L4700

pyinstrument_cext uses PyEval_SetProfile, which is one function up in that file.

That makes me a little twitchy, but I guess if the setup API had strict Python version checks...

Yeah, it's a little hacky. But I'd expect if a new version of Python changes their implementation, to see that in CI failures. It could be a fun thing to implement! Starting place would be an all-threads version of pyinstrument_cext.setprofile.

As a feature/PR, it should probably be separated from the 'attach' functionality too.

@itamarst
Copy link

The issue with context switches on further thought is for tracing profilers only:

  1. Function A in thread T1 STARTED
  2. Context switch to T2
  3. Later...
  4. Context switch to T1
  5. Function A in thread T1 finished

So now profiler thinks that function A took longer than it did. Whereas sampling profiler shouldn't have that issue, I think?

Anyway, for the feature, there's three parts:

  1. API to install profiler in all threads.
  2. enable_background_profiler() or some better name low-level implementation.
  3. pyinstrument attach (which is a very good idea).

Do you want to do any of those? I can probably over time do all of them, might just take longer.

@iddan
Copy link
Contributor Author

iddan commented Jul 29, 2019

Keep in mind the common use case for servers is one thread locally, multiple threads deployed.

@pjz
Copy link

pjz commented Aug 16, 2019

https://github.com/benfred/py-spy does essentially this; you might see what method it uses.

@itamarst
Copy link

py-spy needs ptrace, so back to requiring root/privileged processes, which is what I'm trying to avoid.

@DanielJanzon
Copy link

DanielJanzon commented Jan 10, 2020

A very simple solution that could go quite a long way would be to add a --max-duration=SECONDS option that runs the python program for the given amount of time (unless it quits by itself before), then halts and displays the instrumentation report. Then you can start a server with e.g. 2 minutes duration, apply your load and get the result.

@joerick
Copy link
Owner

joerick commented Jan 10, 2020

@DanielJanzon in that case, my normal pattern is to run the server with pyinstrument <file.py>, apply load, and then hit ctrl-c to kill it - pyinstrument prints the profile even if the program exits with failure.

@DanielJanzon
Copy link

Ok, that didn't work for me due to other reasons (ctrl-c/SIGINT was not enough to stop the server). But I agree your proposed way is the way to go. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants