-
Notifications
You must be signed in to change notification settings - Fork 896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Enhance DefaultUnhandledExceptionsReporter for better exception repor… #5601
base: main
Are you sure you want to change the base?
Conversation
@@ -106,6 +109,7 @@ private void reportException() { | |||
"detailed error logs. One of the thrown exceptions:", | |||
newExceptionsCount, | |||
TextFormatter.elapsed(intervalMillis, TimeUnit.MILLISECONDS), exception); | |||
thrownExceptionCounters.computeIfAbsent(exception, k -> new LongAdder()).increment(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@injae-kim nim, I've implemented recording the exception in the thrownExceptionCounters
when it's not null.
Did I understand the intention correctly?
Please let me know if I've misunderstood anything.
Thank you. 🙇
@@ -59,7 +62,7 @@ final class DefaultUnhandledExceptionsReporter implements UnhandledExceptionsRep | |||
} | |||
|
|||
@Override | |||
public void report(Throwable cause) { | |||
public void report(ServiceRequestContext ctx, Throwable cause) { | |||
if (reportingTaskFuture == null && scheduledUpdater.compareAndSet(this, 0, 1)) { | |||
reportingTaskFuture = CommonPools.workerGroup().next().scheduleAtFixedRate( | |||
this::reportException, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question about this code.
Why is this::reportException
called repeatedly at regular intervals?
I'm wondering why we don't call this::reportException
only once when the report
method is invoked.
Thank you. 🙇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, we lazily start the periodic timer task that logs and clears the exceptions and their count mapping every few seconds. The timer is started only once since it's guarded by the compareAndSet()
call.
The reason we do this way is because otherwise the recently reported exceptions will not be logged until the 'next' exception is reported, which is not always the case.
🔍 Build Scan® (commit: 1bfbfb0) |
.isEqualTo(1); | ||
assertThat(defaultUnhandledExceptionsReporter.getThrownExceptions().get(cause3).first().longValue()) | ||
.isEqualTo(1); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@injae-kim nim, I have a question 🙇
When referring to this test, even though it's the same IllegalArgumentException
, it's being recognized as a new key because they are different objects.
Is this the direction we want? I thought that even if they are different objects, they should be recognized as the same key if they have the same type. 😄
Thank you~!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's wait maintainer's comment here~! 🙇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If stack traces and exception messages differ, the same exception type should be corrected separately. Exceptions.traceText(cause)
can be used as keys instead of instances themselves.
@@ -50,7 +55,7 @@ final class DefaultUnhandledExceptionsReporter implements UnhandledExceptionsRep | |||
@Nullable | |||
private ScheduledFuture<?> reportingTaskFuture; | |||
@Nullable | |||
private Throwable thrownException; | |||
private Map<Throwable, Pair<LongAdder, ServiceRequestContext>> thrownExceptions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// akka
implementation libs.akka.actor.typed.v213
private static class ThrowableEntry {
LongAdder adder;
ServiceRequestContext ctx;
}
private Map<Throwable, ThrowableEntry> thrownExceptions;
oh instead of import Pair
on akka
, can we add some class like ThrowableEntry
?
then we don't need to import akka 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want to add an additional dependency for this. Should we use Map.Entry
or a custom class as suggested by @injae-kim instead?
@@ -37,6 +39,7 @@ final class DefaultUnhandledExceptionsReporter implements UnhandledExceptionsRep | |||
private static final AtomicIntegerFieldUpdater<DefaultUnhandledExceptionsReporter> scheduledUpdater = | |||
AtomicIntegerFieldUpdater.newUpdater(DefaultUnhandledExceptionsReporter.class, | |||
"scheduled"); | |||
private Map<Throwable, LongAdder> thrownExceptionCounters = new ConcurrentHashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private Map<Throwable, LongAdder> thrownExceptionCounters = new ConcurrentHashMap<>(); | |
private final Map<Throwable, LongAdder> thrownExceptionCounters = new ConcurrentHashMap<>(); |
@@ -50,7 +55,7 @@ final class DefaultUnhandledExceptionsReporter implements UnhandledExceptionsRep | |||
@Nullable | |||
private ScheduledFuture<?> reportingTaskFuture; | |||
@Nullable | |||
private Throwable thrownException; | |||
private Map<Throwable, Pair<LongAdder, ServiceRequestContext>> thrownExceptions; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want to add an additional dependency for this. Should we use Map.Entry
or a custom class as suggested by @injae-kim instead?
if (thrownException == null) { | ||
thrownException = cause; | ||
if (thrownExceptions == null) { | ||
thrownExceptions = new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is called by multiple threads for which HashMap
can' be used.
TextFormatter.elapsed(intervalMillis, TimeUnit.MILLISECONDS), exception); | ||
thrownException = null; | ||
TextFormatter.elapsed(intervalMillis, TimeUnit.MILLISECONDS), exceptions.toString()); | ||
thrownExceptions = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think of a dedicated class to collect exceptions and encapsulate the logic?
final class ExceptionStats {
private Map<String, ExceptionContext> exceptions = ...;
// A lock is required to thread-safely update `exceptions` field
private final ReentrantShortLock lock = ...
void record(ServiceRequestContext ctx, Throwable cause) {
lock.lock();
try {
exceptions.compute(Exceptions.traceText(cause), (key, value) -> {
...
value.increase();
...
});
} finally {
lock.unlock();
}
}
Collection<ExceptionContext> dump() {
lock.lock();
try {
final Map<String, ExceptionContext> oldExceptions = exceptions;
// Allocate a new map
exceptions = new HashMap<>();
return oldExceptions.values();
} finally {
lock.unlock();
}
}
private static class ExceptionContext {
private final Throwable exception;
// Capture the first context when the exception is raised for better debugging
private final ServiceRequestContext ctx;
private long counter = 1;
...
}
}
When reportException()
is invoked, we may dump the counted exceptions and log all of them.
private void reportException() {
for (exceptionContext: exceptionStats.dump()) {
logger.warn("Observed {} exception(s) that ...", exceptionContext.count(), ...);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Introduce ExceptionStats
class looks nice to me too here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ikhoon nim, thank you so much for providing detailed explanations along with the example code.
It has been really helpful in enhancing my understanding.
I would like to verify if I have correctly understood your instructions first and then proceed to write test codes.
Would you please review it again? 🙇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any progress since the last review, @coodercl?
core/build.gradle
Outdated
@@ -162,6 +162,9 @@ dependencies { | |||
} | |||
testImplementation libs.brave6.context.slf4j | |||
testImplementation libs.brave6.instrumentation.http.tests | |||
|
|||
// akka | |||
implementation libs.akka.actor.typed.v213 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As others pointed out, we can go without this dependency.
@@ -59,7 +62,7 @@ final class DefaultUnhandledExceptionsReporter implements UnhandledExceptionsRep | |||
} | |||
|
|||
@Override | |||
public void report(Throwable cause) { | |||
public void report(ServiceRequestContext ctx, Throwable cause) { | |||
if (reportingTaskFuture == null && scheduledUpdater.compareAndSet(this, 0, 1)) { | |||
reportingTaskFuture = CommonPools.workerGroup().next().scheduleAtFixedRate( | |||
this::reportException, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, we lazily start the periodic timer task that logs and clears the exceptions and their count mapping every few seconds. The timer is started only once since it's guarded by the compareAndSet()
call.
The reason we do this way is because otherwise the recently reported exceptions will not be logged until the 'next' exception is reported, which is not always the case.
Oh, @trustin nim, sorry for the delay in updating. 😢 🙏 |
Let me ping when I'm ready for review~! 🙇 |
…ting
Motivation: #5567
Modifications:
Result: