IndicesRequestCache uncancellably blocks search threads while result is pending #108703

Open
DaveCTurner opened this issue May 16, 2024 · 1 comment
Labels
>bug :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team

Comments

@DaveCTurner
Contributor

A user reported to me that they had inadvertently run a very expensive collection of queries which put their cluster under stress, so they cancelled them. However, some indices:data/read/search[phase/query] tasks continued to run for a very long time after being cancelled, and eventually they had to restart nodes to restore the cluster to a working state. They shared a thread dump which shows various places where we appear to be missing cancellation detection today (see #108701). In the same thread dump I also noticed quite a few search threads blocked within IndicesRequestCache.getOrCompute, apparently waiting for the result of a query that other threads were computing.

I think we should avoid filling up the search pool with these blocking tasks so that the threads can do other, more meaningful work, but at the very least we should make these cache interactions react to cancellation properly.

    0.0% [cpu=0.0%, other=0.0%] (0s out of 500ms) cpu usage by thread 'elasticsearch[REDACTED][search][T#19]'
     10/10 snapshots sharing following 28 elements
       java.base@21.0.1/jdk.internal.misc.Unsafe.park(Native Method)
       java.base@21.0.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:221)
       java.base@21.0.1/java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1864)
       java.base@21.0.1/java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3780)
       java.base@21.0.1/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3725)
       java.base@21.0.1/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1898)
       java.base@21.0.1/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2072)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.common.cache.Cache$CacheSegment.get(Cache.java:205)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.common.cache.Cache.get(Cache.java:350)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.common.cache.Cache.computeIfAbsent(Cache.java:376)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.indices.IndicesRequestCache.getOrCompute(IndicesRequestCache.java:120)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.indices.IndicesService.cacheShardLevelResult(IndicesService.java:1637)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.indices.IndicesService.loadIntoContext(IndicesService.java:1559)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:516)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:671)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.search.SearchService.lambda$executeQueryPhase$2(SearchService.java:543)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.search.SearchService$$Lambda/0x00007f8aa18338c8.get(Unknown Source)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.action.ActionRunnable$2.accept(ActionRunnable.java:51)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.action.ActionRunnable$2.accept(ActionRunnable.java:48)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.action.ActionRunnable$3.doRun(ActionRunnable.java:73)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:983)
       app/org.elasticsearch.server@8.11.1/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
       java.base@21.0.1/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
       java.base@21.0.1/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
       java.base@21.0.1/java.lang.Thread.runWith(Thread.java:1596)
       java.base@21.0.1/java.lang.Thread.run(Thread.java:1583)
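
To illustrate the "at the very least" option above, here is a minimal sketch of a wait that reacts to cancellation. It is not the Elasticsearch implementation: `CancellableWait`, its `get` method, the `isCancelled` supplier and the `pollMillis` parameter are all hypothetical names introduced for illustration. The only real API assumed is the JDK's `CompletableFuture.get(long, TimeUnit)`. Instead of parking indefinitely in `CompletableFuture.get()` as in the trace above, the waiter wakes up periodically and gives up once the task has been cancelled.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration only; not part of Elasticsearch. */
final class CancellableWait {
    private CancellableWait() {}

    /**
     * Waits for {@code future} to complete, waking up every {@code pollMillis}
     * milliseconds to check {@code isCancelled}. Throws CancellationException
     * if the flag is set before a result has been returned.
     */
    static <V> V get(CompletableFuture<V> future, BooleanSupplier isCancelled, long pollMillis)
            throws ExecutionException, InterruptedException {
        while (true) {
            if (isCancelled.getAsBoolean()) {
                throw new CancellationException("task cancelled while waiting for cached result");
            }
            try {
                // Bounded wait instead of an unbounded get().
                return future.get(pollMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Result not ready yet: loop and re-check the cancellation flag.
            }
        }
    }
}
```

This only frees the waiting thread once cancellation is noticed. It still occupies a search thread while the result is pending, so it does not address the first point about keeping blocked waiters out of the search pool.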
@DaveCTurner DaveCTurner added >bug :Search/Search Search-related issues that do not fall into other categories labels May 16, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label May 16, 2024