Event streaming #404

apangin · 2021-03-24T21:41:13Z

Allow events to be processed while profiling is active.

PaulBGD · 2021-09-23T19:48:15Z

Does this exist now with jfr sync?

apangin · 2021-09-23T22:10:35Z

@PaulBGD No, jfrsync combines events from async-profiler and Flight Recorder in a file.

jhalliday · 2022-06-27T12:02:38Z

Is this being worked on? As OpenTelemetry starts to look at adding continuous profiling support (issue, slack) I'm considering how it may be implemented in their Java SDK. My current prototype uses the JFR's new(ish) streaming API, but I'd like to be able to abstract over JFR and async-profiler as data sources. Whilst it's possible to read the log file and transform it to an event stream, eliding that and getting events more directly through a Java API would be a welcome enhancement.
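For readers unfamiliar with the approach described above, here is a minimal sketch of what abstracting over profiler data sources might look like, with a JFR-backed implementation built on the JEP 349 streaming API (`jdk.jfr.consumer.RecordingStream`, JDK 14+). The `StackSampleSource` interface, the `JfrStackSampleSource` class, and the 20 ms sampling period are assumptions made for illustration; they are not part of OpenTelemetry, JFR, or async-profiler.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import jdk.jfr.consumer.RecordedFrame;
import jdk.jfr.consumer.RecordedStackTrace;
import jdk.jfr.consumer.RecordingStream;

/** Hypothetical abstraction: a source of stack samples, whichever profiler produces them. */
interface StackSampleSource extends AutoCloseable {
    /** Starts delivering samples; each sample is a list of "Class.method" names, top frame first. */
    void start(Consumer<List<String>> onSample);

    @Override
    void close();
}

/** JFR-backed source using the streaming API. An async-profiler source would implement the same interface. */
final class JfrStackSampleSource implements StackSampleSource {
    private final RecordingStream stream = new RecordingStream();

    @Override
    public void start(Consumer<List<String>> onSample) {
        // The sampling period is an arbitrary illustrative value.
        stream.enable("jdk.ExecutionSample").withPeriod(Duration.ofMillis(20));
        stream.onEvent("jdk.ExecutionSample", event -> {
            RecordedStackTrace trace = event.getStackTrace();
            if (trace != null) {
                onSample.accept(toFrameNames(trace));
            }
        });
        stream.startAsync(); // events are delivered on a JFR-owned thread
    }

    /** Converts the Java frames of a recorded stack trace to readable names. */
    static List<String> toFrameNames(RecordedStackTrace trace) {
        List<String> names = new ArrayList<>();
        for (RecordedFrame frame : trace.getFrames()) {
            if (frame.isJavaFrame()) {
                names.add(describe(frame.getMethod().getType().getName(),
                                   frame.getMethod().getName()));
            }
        }
        return names;
    }

    static String describe(String typeName, String methodName) {
        return typeName + "." + methodName;
    }

    @Override
    public void close() {
        stream.close();
    }
}
```

A consumer would call `start(sample -> ...)` once and `close()` on shutdown; the callback never needs to know which profiler sits behind the interface.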

apangin · 2022-06-27T21:28:26Z

@jhalliday I'm not working on this feature right now, since there are no customers for it. This may change though, if there is enough interest.

I doubt that event streaming in async-profiler will look anything like JEP 349. Despite its name, JFR Event Streaming is not the right tool for real-time streaming of profile data, since it ruins the main advantage of production profiling: low overhead. ¹
Instead, async-profiler will likely transmit data in a compact binary representation that may, but need not, be parsed into a Java object model on the receiver side.

¹ http://hirt.se/blog/?p=1239

jhalliday · 2022-06-27T22:12:04Z

Thanks for the update.

The OTel work is specifying an interop wire format for profiling data across many platforms, looking for a good balance between bandwidth use and data handling overhead. Marcus' DataDog colleagues and other observability vendors are part of that discussion and bring their experience with JFR and similar tools to the table.

For the JVM OTel SDK, there is the possibility to handle the encoding in C rather than Java before passing it out through the Java network stack, but one way or another we'll need to transcode the various profilers' 'native' recording formats to the OTel transport one, whatever that ends up looking like, either at the client or perhaps at an intermediate gateway ('collector', in OTel terms).

I'm not particularly enchanted with the JFR streaming API for this use case, but it's what we have right now if reading files back from disk doesn't appeal. Longer term there may be the possibility of better aligning it with emerging observability needs, but perhaps async-profiler has the opportunity to be more agile here, given the JDK API change cadence. The async-profiler current context id PR for example would give OTel a way to correlate tracing and profiling signals by labelling a thread with the trace spanId, which is a gap in the functionality offered by JFR.

Anyhow, we'd welcome your thoughts if you have time to participate in the OTel process.

farmerworking · 2022-08-11T01:40:54Z

Much needed!

Continuous profiling is useless without analysis. Currently I can only do that by reading the output files produced by async-profiler, which is not convenient.

I noticed that an issue related to "Publish Jar" is ongoing and that AsyncProfilerMXBean is already defined.

So it would be nice if I could just supply a callback function and be notified of the profile results periodically, so that I could do magical things like interrupting a thread that allocates too much memory.
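Until something like that exists, a periodic callback can be approximated by polling. The sketch below is only an illustration: `PeriodicProfilePoller` is a hypothetical class, and the `Supplier<String>` stands in for whatever produces the current profile text (for example, a dump call on the async-profiler Java API, which is assumed here rather than shown).

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper: polls a profile dump at a fixed period and hands it to a callback. */
final class PeriodicProfilePoller implements AutoCloseable {
    private final Supplier<String> dumpSource;   // assumed: returns the current profile as text
    private final Consumer<String> callback;     // user code that analyzes each dump
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "profile-poller");
                thread.setDaemon(true); // do not keep the JVM alive just for polling
                return thread;
            });

    PeriodicProfilePoller(Supplier<String> dumpSource, Consumer<String> callback) {
        this.dumpSource = dumpSource;
        this.callback = callback;
    }

    /** Schedules pollOnce() to run repeatedly with the given period. */
    void start(Duration period) {
        long millis = period.toMillis();
        scheduler.scheduleAtFixedRate(this::pollOnce, millis, millis, TimeUnit.MILLISECONDS);
    }

    /** Fetches one dump and passes it to the callback. */
    void pollOnce() {
        try {
            callback.accept(dumpSource.get());
        } catch (RuntimeException e) {
            // An exception escaping a scheduled task would cancel all future runs,
            // so failures are reported and swallowed here.
            System.err.println("profile poll failed: " + e);
        }
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Polling still pays the cost of producing a full dump on every tick, which is the overhead that true event streaming would avoid.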

By the way, async-profiler is really amazing.

JonasKunz · 2024-03-28T11:35:11Z

We (Elastic) would also be very interested in this feature!
We recently ported the inferred spans feature from our elastic-apm-agent to a standalone OpenTelemetry extension.

This feature enables async-profiler in wall-clock profiling mode to fetch stack traces for threads that have active OpenTelemetry spans.
These stack traces are then used to generate synthetic spans for areas where the application spends time but which are not covered by instrumentation.

One of the main pain points of this implementation is that at the moment the processing needs to happen after the profiling session has ended:

1. When a span is started, a profiling session with a fixed duration is started if none is already running.
2. For the duration of this session, the extension also needs to spill to disk a log of which span was active on which thread, and when.
3. After the profiling session is over, both the profiling data and the span log are read back and the synthetic spans are reconstructed.
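The reconstruction step amounts to joining two logs on thread and time. The following sketch shows that join in its simplest form; the record types, field names, and the assumption that span activations on a single thread do not overlap are all made up for illustration and do not reflect the extension's actual data model.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: attribute profiler samples to the span active on the sampled thread. */
final class SpanSampleCorrelator {
    /** One entry of the span log: a span was active on a thread during [startNanos, endNanos). */
    record SpanActivation(long threadId, String spanId, long startNanos, long endNanos) {}

    /** One profiler sample: the stack of a thread at a point in time. */
    record Sample(long threadId, long timestampNanos, List<String> frames) {}

    /**
     * Groups samples by the span that was active when they were taken.
     * Samples that fall outside every activation window are dropped.
     * Assumes activations on the same thread do not overlap (no nested spans).
     */
    static Map<String, List<Sample>> correlate(List<SpanActivation> activations,
                                               List<Sample> samples) {
        Map<String, List<Sample>> samplesBySpan = new LinkedHashMap<>();
        for (Sample sample : samples) {
            for (SpanActivation activation : activations) {
                boolean sameThread = activation.threadId() == sample.threadId();
                boolean inWindow = sample.timestampNanos() >= activation.startNanos()
                        && sample.timestampNanos() < activation.endNanos();
                if (sameThread && inWindow) {
                    samplesBySpan.computeIfAbsent(activation.spanId(), id -> new ArrayList<>())
                                 .add(sample);
                    break; // at most one activation matches under the no-overlap assumption
                }
            }
        }
        return samplesBySpan;
    }
}
```

With streamed samples, the same join could run incrementally against the in-memory set of currently active spans, so neither log would need to be written to disk.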

We would like to contribute this extension to the upstream OpenTelemetry project.
However, the current approach of having to spill tracing data to disk is quite complex and requires a lot of code, making this harder to contribute and maintain.
In addition, the current approach has the downside that it doesn't work properly with spans that overlap multiple profiling sessions. While this is doable in theory, it would further increase the complexity.

A way of directly streaming the profiling samples back to the application with reasonably low latency would greatly simplify and improve this feature.

I initially thought of proposing to extend the existing async-profiler Java API to:

- Allow streaming of profiling stacktraces in the form of jmethodId-arrays (ignoring native stackframes) back to the Java application
- Extend the API to allow resolving classes and method-names from jmethodIds
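To make the two points above concrete, here is one possible shape for such an API. Everything in it is hypothetical: `StackTraceListener`, `MethodResolver`, and `CachingMethodResolver` do not exist in async-profiler, and method IDs are modeled as plain `long` values purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical callback: receives each sampled stack as an array of method IDs, top frame first. */
interface StackTraceListener {
    void onStackTrace(long threadId, long timestampNanos, long[] methodIds);
}

/** Hypothetical lookup: turns a method ID into a readable "Class.method" name. */
interface MethodResolver {
    String resolve(long methodId);
}

/** Caches lookups so that each ID is resolved once, however often it appears in samples. */
final class CachingMethodResolver implements MethodResolver {
    private final MethodResolver delegate;
    private final Map<Long, String> cache = new HashMap<>();

    CachingMethodResolver(MethodResolver delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized String resolve(long methodId) {
        return cache.computeIfAbsent(methodId, delegate::resolve);
    }

    /** Renders a sampled stack as semicolon-separated method names. */
    String render(long[] methodIds) {
        StringBuilder text = new StringBuilder();
        for (long methodId : methodIds) {
            if (text.length() > 0) {
                text.append(';');
            }
            text.append(resolve(methodId));
        }
        return text.toString();
    }
}
```

Splitting delivery from resolution keeps the streaming path cheap: the listener only copies IDs, and names are resolved lazily for the stacks the application actually inspects.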

I would also be willing to contribute here, though I'd likely need a good amount of guidance for the first part.

However, I figured that this proposal might be just kind of a special case of your intentions of this issue, so I decided to comment here instead.

apangin · 2024-03-31T23:48:19Z

@JonasKunz Thank you for your interest in the feature.
Currently, event streaming is not in the near-term project plans because other features have priority. That said, if you have a specific, detailed proposal and are ready to contribute, we can discuss that.

pnf · 2024-05-04T15:32:44Z

We have similar requirements for streaming, always-on, massively distributed profiling with stacks representing native, Java and asynchronous continuations. Our approach uses an async-profiler fork that supports injection of await/continuation frames and a compact ASCII format for exporting multi-event stacks (CPU, lock, heap and native allocation, plus arbitrary custom events). Stacks are aggregated in situ on every engine over chunks of (say) a minute, then processed, published out of band over Kafka, and finally loaded into Pyroscope. The chunking is key to making this practical across thousands of hosts; bandwidth aside, Pyroscope can barely sustain assimilating a few dozen JFR files simultaneously, much less the magnitude we require.
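The chunked aggregation described above can be sketched roughly as follows. This is not the fork's code: the class name, the collapsed-stack style line format (`event;frame;frame count`), and the publisher callback standing in for the Kafka producer are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

/** Hypothetical sketch: aggregates stacks per time chunk and publishes one compact batch per chunk. */
final class ChunkedStackAggregator {
    private final long chunkMillis;
    private final Consumer<List<String>> publisher; // stands in for an out-of-band publisher
    private final Map<String, Long> counts = new TreeMap<>(); // sorted for stable output
    private long currentChunk = Long.MIN_VALUE;

    ChunkedStackAggregator(long chunkMillis, Consumer<List<String>> publisher) {
        this.chunkMillis = chunkMillis;
        this.publisher = publisher;
    }

    /** Records one sampled stack; publishes the previous chunk when the timestamp enters a new one. */
    void record(long timestampMillis, String eventType, List<String> frames) {
        long chunk = Math.floorDiv(timestampMillis, chunkMillis);
        if (chunk != currentChunk) {
            flush();
            currentChunk = chunk;
        }
        String key = eventType + ";" + String.join(";", frames);
        counts.merge(key, 1L, Long::sum);
    }

    /** Publishes the aggregated lines of the current chunk, if any, and resets the counters. */
    void flush() {
        if (counts.isEmpty()) {
            return;
        }
        List<String> lines = new ArrayList<>();
        counts.forEach((stack, count) -> lines.add(stack + " " + count));
        counts.clear();
        publisher.accept(lines);
    }
}
```

Because identical stacks collapse into a single counted line, the published volume grows with the number of distinct stacks per chunk rather than with the number of samples.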

apangin added the enhancement label Mar 24, 2021

This was referenced Mar 24, 2021

Immediately dump traces, rather than waiting for collection #191

Closed

Java Callback API #403

Closed

Jongy mentioned this issue Apr 3, 2021

Profile low-privileged processes with perf_events #411

Merged


JonasKunz mentioned this issue Apr 4, 2024

Add Java-API as output option for wall-clock profiling #913

Open
