Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exposing more data on IProfiledCommand to enrich traces #1448

Open
CodeBlanch opened this issue Apr 28, 2020 · 6 comments · May be fixed by #1454
Open

Exposing more data on IProfiledCommand to enrich traces #1448

CodeBlanch opened this issue Apr 28, 2020 · 6 comments · May be fixed by #1454

Comments

@CodeBlanch
Copy link

Greetings!

I'm using Open Telemetry and its Redis adapater to instrument an application using a Redis cluster for caching.

There are a couple of issues that have been discussed in this area already. Ideally, we would use DiagnosticSource (#833) because the profiling interface is a bit awkward for the integration (#1044) but let's set that aside for a second.

The profiling mechanism works once you get it registered with the connection, but it's missing some key data, primarily the cache key and whether or not the retrieval was a hit or a miss. That stuff isn't exposed on the IProfiledCommand interface so there's no way for OT to write it on its spans. We can solve this by wrapping our calls into StackExchange.Redis with a class that spins up tracing around it, but it would be nice to have an OOB solution.

Are we open to exposing more data on IProfiledCommand? I'm happy to work on it, and throw up a proposal, I just thought it would be a good idea to ask first.

/cc @SergeyKanzhelev @MikeGoldsmith @cijothomas

@SergeyKanzhelev
Copy link

CC: @davidfowl and @tarekgh

@NickCraver
Copy link
Collaborator

What would be the structure of the property on the command if we did this? A few things to keep in mind:

  • Not everything is a key, a command can be zero, one, or a thousand keys
  • Some commands are full scripts, etc.
  • We don't allocate the command today - it's bytes to shove to the wire so we don't necessarily every have a string version of it (which I'm guessing is the ask here). If we did so, there'd be a lot of allocation overhead to satisfy this profiling.

On the result end of things, the questions go towards: what is "success"? Hit or miss is called out here, but that's a very narrow view of Redis in general. We could be getting zero, one, or a thousand items. It could be a key or a set or nothing at all. We could be issuing a config set or a pub/sub to send to replication, etc. None of those abstract to a hit or a miss really - the concept itself is far too narrow for the usage. I think to have a chance at a facade here, we'd have to define "success", but I'm no sure that's any easier to do or less opinionated though.

The above is a quick brain dump to get the conversation going - thoughts on...stuff?

@SergeyKanzhelev
Copy link

@tarekgh this would be a great use case to try the improved DiagnosticSource design. Would you be available to make an example on how this instrumentation may look like so we can estimate the number of allocations needed?

@SergeyKanzhelev
Copy link

As of user scenarios and what is missing - the most critical piece of information that is missing (I heard of) is an information on failed calls. Like network issue or such.

After the basic details like Span with name and duration and failure indication, things like hit or miss is a nice addition for cases we can identify. Yes, those attributes would be useful for the narrow set of commands, but would work great for people using those. I like the suggestion to add number of bytes sent/received. If there is a way to obtain this information easily - it may be very informative.

@CodeBlanch CodeBlanch linked a pull request May 8, 2020 that will close this issue
@CodeBlanch
Copy link
Author

I just opened draft PR showing what I was thinking. Basically add more of the raw data to IProfiledCommand. StackExchange.Redis doesn't really need to worry about what OT will do with it :) At the very least having Script when Command=EVAL will be helpful. But I think we can also roughly determine cache hit/miss? For example, if Script contains redis.call('HMGET' (that's what the dotnet lib does) we can check if the Keys requested have matching Values. Or am I being too optimistic?

As far as adding the key requested on the span, easy to do if one is present. If multiple, we could write how many were requested? Or the top X number? Or make it configurable?

@CodeBlanch
Copy link
Author

CodeBlanch commented May 9, 2020

I updated the PR to fix the failed calls not being captured (@SergeyKanzhelev brought that up). Couldn't find anything obvious for bytes sent/received though.

One thing...

The exception that gets exposed is very verbose:

if (command.IsFaulted)
{
	string ErrorMessage = command.Exception.Message;
	// "No connection is active/available to service this operation: EVAL; SocketClosed (ReadEndOfStream, last-recv: 0) on 127.0.0.1:6379/Subscription, Idle/MarkProcessed, last: SUBSCRIBE, origin: ReadFromPipe, outstanding: 0, last-read: 0s ago, last-write: 17s ago, keep-alive: 60s, state: ConnectedEstablished, mgr: 8 of 10 available, in: 0, in-pipe: 0, out-pipe: 0, last-heartbeat: 0s ago, last-mbeat: 0s ago, global: 0s ago, v: 2.1.41.3040, mc: 1/1/0, mgr: 10 of 10 available, clientName: DotNetHashMemberGetAsync, IOCP: (Busy=1,Free=999,Min=16,Max=1000), WORKER: (Busy=4,Free=2043,Min=16,Max=2047), v: 2.1.41.3040"
}

May or may not be a good thing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants