Elevated lock wait time pointing to PrivateMaxEntriesMap.put #4371
Comments
what does your model look like? some Object field? maybe you should try removing that
The LRUMap is one implementation of LookupCache. You could write your own implementation that doesn't have thread safety (eg unguarded read/writes to a HashMap). I have https://github.com/pjfanning/jackson-caffeine-cache (a Caffeine based LookupCache), which is another alternative.
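For illustration, a Caffeine-backed LookupCache could look roughly like this; a minimal sketch, assuming the 2.x LookupCache interface (size/get/put/putIfAbsent/clear), with CaffeineLookupCache as a hypothetical name (pjfanning's library above is a maintained version of the same idea):

```java
import com.fasterxml.jackson.databind.util.LookupCache;
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

// Hypothetical adapter from Caffeine's Cache to Jackson's LookupCache.
public class CaffeineLookupCache<K, V> implements LookupCache<K, V> {
    private final Cache<K, V> cache;

    public CaffeineLookupCache(long maximumSize) {
        this.cache = Caffeine.newBuilder().maximumSize(maximumSize).build();
    }

    @Override
    public int size() {
        return (int) cache.estimatedSize();
    }

    @Override
    @SuppressWarnings("unchecked")
    public V get(Object key) {
        return cache.getIfPresent((K) key); // lock-free read on the hot path
    }

    @Override
    public V put(K key, V value) {
        V old = cache.getIfPresent(key); // LookupCache.put returns the previous value
        cache.put(key, value);
        return old;
    }

    @Override
    public V putIfAbsent(K key, V value) {
        return cache.asMap().putIfAbsent(key, value);
    }

    @Override
    public void clear() {
        cache.invalidateAll();
    }
}
```

It could then be wired in the same way as the LRUMap line quoted later in this issue, e.g. objectMapper.setTypeFactory(TypeFactory.defaultInstance().withCache(new CaffeineLookupCache<>(10_000)));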
The LruMap's putIfAbsent might do an optimistic get first to avoid locking. That might be enough; otherwise swap the implementations as @pjfanning suggested (Caffeine includes that optimization). edit: It looks like TypeFactory does a …
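For illustration, the optimistic-read idea looks roughly like this; a sketch of the pattern only, not LRUMap's or Caffeine's actual code:

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch: a cheap lock-free read first, so a warm cache almost never
// reaches the (potentially contended) insert path.
class OptimisticPutIfAbsent<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();

    V putIfAbsent(K key, V value) {
        V existing = map.get(key); // lock-free on the common hit path
        if (existing != null) {
            return existing;
        }
        return map.putIfAbsent(key, value); // write path only on a true miss
    }
}
```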
I don't think it makes sense to try to use a custom cache implementation; rather, we should see why type resolution becomes a bottleneck and whether it can be avoided.
thanks everyone for replying! @yawkat @ben-manes @pjfanning @cowtowncoder
do you mean doing …, or more of doing … once on service startup?
@Enigo latter; former wouldn't help. This assumes you know the types in advance. But looking at the flame graph a bit more I realized it might not be that simple. Instead, it looks like some serializers (…) are not fully cached. Increasing the cache size or using a different cache for …
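A startup warm-up of the kind endorsed above ("latter") could look roughly like this; a sketch assuming the payload types are known in advance and have no-arg constructors (MapperWarmup is a hypothetical helper, not part of Jackson):

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;

class MapperWarmup {
    // Round-trip one sample instance per known payload type so that
    // serializers, deserializers and resolved types are cached before
    // real traffic arrives.
    static void warmUp(ObjectMapper mapper, List<Class<?>> payloadTypes) throws Exception {
        for (Class<?> type : payloadTypes) {
            Object sample = type.getDeclaredConstructor().newInstance(); // assumes a no-arg ctor
            byte[] json = mapper.writeValueAsBytes(sample);
            mapper.readValue(json, type);
        }
    }
}
```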
what I'm confused by is why you spend so much time in _findAndAddDynamic. Ideally that call should only happen a few times before the result is cached. You mention you already tried a fixed ObjectReader/Writer, but the flame graph doesn't show that, so please move back to that. The only reason I can think of to repeatedly hit _findAndAddDynamic with a proper ObjectWriter would be to have weird dynamic field situations.
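For reference, a "fixed ObjectReader/Writer" setup looks something like this; a sketch where PayloadCodec and MyPayload are stand-in names, not from this thread:

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.ObjectReader;
import com.fasterxml.jackson.databind.ObjectWriter;

public class PayloadCodec {
    // Stand-in for a concrete request/response class.
    public static class MyPayload {
        public String id;
    }

    private static final ObjectMapper MAPPER = new ObjectMapper();
    // Built once for a fixed type and reused; the root serializer is resolved
    // up front, so the hot path should not need _findAndAddDynamic unless a
    // property's runtime type can't be known statically (e.g. Object fields).
    private static final ObjectWriter WRITER = MAPPER.writerFor(MyPayload.class);
    private static final ObjectReader READER = MAPPER.readerFor(MyPayload.class);

    public static byte[] encode(MyPayload payload) throws Exception {
        return WRITER.writeValueAsBytes(payload);
    }

    public static MyPayload decode(byte[] bytes) throws Exception {
        return READER.readValue(bytes);
    }
}
```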
Agree with what @yawkat said. I guess there is a possibility that some Collection (or possibly …) is involved. So the problem is not as much type resolution (although there are generic types being resolved, as per the stack traces) but re-creating value serializers, I think.
Interesting! Calling …
I'm still confused by the new flame graphs. The bulk of the time is spent in _findAndAddDynamic. As I understand it, this method is supposed to populate a cache; it's not going to be on the hot path because the hot path is going to use the existing cached value. Your flame graph looks like that is not what's happening, so you should investigate why that is: maybe the writer is not reused properly?

Another possibility is that your flame graph is misleading. You say the graph shows lock wait time. But if that's really all it shows, this result is entirely expected: the hot path should have zero lock wait time, so it won't show up, but the warmup may wait on locks as it populates caches. In theory this should not have a perf impact after the warmup; have you verified with async-profiler or another normal CPU sampler?

The warmup may also have a significant perf impact before it settles down, if you have many types hitting the same cache concurrently. The question is how much it matters for your app.
@yawkat I was guessing that the time taken is actually from the call to find (and, as necessary, construct, including all introspection) the actual serializer. This would be a heavy-weight operation. But then again the flame graph does not seem to suggest that... which makes me wonder if it could be an artifact of some kind.
lemme try to answer some of the questions: …
but the good news is: with …, the issue appears resolved.
Thank you @Enigo. It is quite peculiar that there is this contention, but I'm glad that a combination of changes can resolve it. I would still be interested in how much of the solution is the use of full Caffeine classes for the cache, vs. warm-up with …
ah, now there is somewhat of a different issue. I created the cache like … and then use it like this: …
I see this warning in the logs: …
and the stacktrace points to …
I must admit I'm kind of clueless about what to do with this, so any hints would be greatly appreciated, thanks!
It's strange, as that usually occurs due to long computations which might block an eviction from a neighboring entry in the hash map. It could be that you flood ForkJoinPool.commonPool() so it's not evicting regularly, though writes should barge in. You can try …
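The concrete suggestion was cut off above, so as an assumption about the intended advice: one Caffeine builder knob that matches this description is a same-thread executor, which makes maintenance piggyback on calling threads instead of ForkJoinPool.commonPool():

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

class SameThreadMaintenance {
    static Cache<Object, Object> build() {
        return Caffeine.newBuilder()
                .maximumSize(10_000)
                // Runnable::run is a same-thread Executor: eviction and other
                // maintenance run on the reading/writing threads, so a saturated
                // common pool can no longer delay cleanup. (Assumption: this is
                // the kind of tweak the truncated comment was pointing at.)
                .executor(Runnable::run)
                .build();
    }
}
```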
sorry for the late reply, didn't have time to properly look at the issue.
I use Caffeine for the Jackson ObjectMapper TypeFactory cache. It appears to be happening only on instance startup, for some time: the longest I've seen was ~40 minutes of warnings in the logs, around 9k log entries. Then it stops and never reappears. So something must be overloading the cache at instance startup.
@Enigo overload during startup makes a lot of sense, since that is also when (de)serializers get constructed, a big part of which is introspection of classes, methods, fields, constructors and all relevant types. And this tends not to happen later on, when all/most (de)serializers are cached and reused. Although the earlier discussion did suggest that not all serializers are fully cached (possibly due to "wider" polymorphism for the values being serialized than what the standard …
Search before asking
Describe the bug
Hello everyone!
There is a curious issue I want to share regarding TypeFactory::_typeCache and its underlying map. But first, a bit of context.
We run high-load servers processing lots of requests (300k+ per second), which translates to 5-6k rps per machine in the cluster. Those requests are all JSON, so we do lots of serialization and deserialization operations; here is a very simplified flow: …
We have two separate singleton ObjectMappers - one for serialization and one for deserialization - created once and shared across the entire application.

Now onto the issue. We are running in AWS on 4xlarge machines and recently decided to migrate to the bigger 16xlarge ones. That of course means that every single instance will be processing 4-5x more rps. With that, our app couldn't scale, and after running a few flame graphs everything points to LRUMap.

I attached two images - they are from the same flame graph, one for incoming and another for outgoing request processing,
and lastly the lock time
I've been looking around the internals of the Jackson lib and it seems to me that in our particular case the cache used in TypeFactory is not scaling with the instance size. Basically, when running bigger instances, the newly launched instances have all available threads busy for like 2-5 minutes, which pretty much makes the entire cluster sort of "dead".

I've tried a couple of optimizations that seemed to mitigate the issue, but it is still spending quite a lot of time waiting:

1. using ObjectWriter and ObjectReader instead of ObjectMapper
2. a separate ObjectMapper for each of those converters from point (3)
3. increasing the cache size: objectMapper.setTypeFactory(TypeFactory.defaultInstance().withCache(new LRUMap<>(10000, 10000)));
From the code POV, what we run are simple objectWriter.writeValueAsBytes and objectReader.readValue operations, and the ObjectMapper is configured like this: …

Everywhere it is stated that ObjectMapper is thread-safe, which is true; but if this is indeed the expected behavior, that thread-safety comes with quite a price tag, it seems.

Appreciate any feedback and hints on how to solve this, thanks!
Version Information
2.14.2
Reproduction
Expected behavior
No response
Additional context
No response