[CI] CohereServiceUpgradeIT classMethod failing #107887

Open
jfreden opened this issue Apr 25, 2024 · 2 comments
Labels
:ml Machine learning needs:risk Requires assignment of a risk label (low, medium, blocker) Team:ML Meta label for the ML team >test-failure Triaged test failures from CI


@jfreden
Contributor

jfreden commented Apr 25, 2024

Build scan:
https://gradle-enterprise.elastic.co/s/mab4dlj6reji4/tests/:x-pack:plugin:inference:qa:rolling-upgrade:v8.14.0%23bwcTest/org.elasticsearch.xpack.application.CohereServiceUpgradeIT

Reproduction line:

null

Applicable branches:
main

Reproduces locally?:
Didn't try

Failure history:
Failure dashboard for org.elasticsearch.xpack.application.CohereServiceUpgradeIT#classMethod

Failure excerpt:

java.lang.RuntimeException: An error occurred orchestrating test cluster.

  at __randomizedtesting.SeedInfo.seed([83CBC4274485658A]:0)
  at org.elasticsearch.test.cluster.local.DefaultLocalClusterHandle.execute(DefaultLocalClusterHandle.java:264)
  at org.elasticsearch.test.cluster.local.DefaultLocalClusterHandle.writeUnicastHostsFile(DefaultLocalClusterHandle.java:245)
  at org.elasticsearch.test.cluster.local.DefaultLocalClusterHandle.waitUntilReady(DefaultLocalClusterHandle.java:188)
  at org.elasticsearch.test.cluster.local.DefaultLocalClusterHandle.start(DefaultLocalClusterHandle.java:74)
  at org.elasticsearch.test.cluster.local.DefaultLocalElasticsearchCluster$1.evaluate(DefaultLocalElasticsearchCluster.java:45)
  at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
  at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
  at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
  at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
  at java.lang.Thread.run(Thread.java:1583)

  Caused by: java.lang.RuntimeException: Timed out after PT2M waiting for ports files for: { cluster: 'test-cluster', node: 'test-cluster-0' }

    at org.elasticsearch.test.cluster.local.AbstractLocalClusterFactory$Node.waitUntilReady(AbstractLocalClusterFactory.java:285)
    at org.elasticsearch.test.cluster.local.AbstractLocalClusterFactory$Node.getTransportEndpoint(AbstractLocalClusterFactory.java:204)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
    at java.util.AbstractList$RandomAccessSpliterator.forEachRemaining(AbstractList.java:722)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
    at java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:960)
    at java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:934)
    at java.util.stream.AbstractTask.compute(AbstractTask.java:327)
    at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:754)
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
    at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
    at java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
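
Note on reading the root cause: `PT2M` is the ISO-8601 form that `java.time.Duration` prints for two minutes, so the node never produced its ports files within the two-minute startup window. The snippet below is only a minimal sketch of a "wait for a file with a timeout" loop to illustrate that kind of check. It is not the actual `AbstractLocalClusterFactory` code; the class `PortsFileWait`, the method `waitForFile`, the poll interval, and the temp file name are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

public class PortsFileWait {

    // Hypothetical helper (not the Elasticsearch test framework's implementation):
    // polls until `file` exists and is non-empty, or throws once `timeout` has elapsed.
    static void waitForFile(Path file, Duration timeout, Duration pollInterval)
            throws IOException, InterruptedException {
        Instant deadline = Instant.now().plus(timeout);
        while (true) {
            if (Files.exists(file) && Files.size(file) > 0) {
                return; // the file has been written
            }
            if (Instant.now().isAfter(deadline)) {
                // Duration.toString() yields the ISO-8601 form, e.g. PT2M for two minutes
                throw new RuntimeException("Timed out after " + timeout + " waiting for " + file);
            }
            Thread.sleep(pollInterval.toMillis());
        }
    }

    public static void main(String[] args) throws Exception {
        // Two minutes prints as PT2M, matching the message in the failure excerpt
        System.out.println(Duration.ofMinutes(2));

        // Stand-in for a ports file; the name and contents are made up for this sketch
        Path ports = Files.createTempFile("example", ".ports");
        Files.writeString(ports, "127.0.0.1:9300\n");
        waitForFile(ports, Duration.ofSeconds(5), Duration.ofMillis(100));
        System.out.println("ports file ready: " + ports);
        Files.delete(ports);
    }
}
```

With a loop of this shape, a timeout only says that the file did not appear in time. It does not say why the node failed to start, so the node's own logs in the build scan are the place to look for the underlying cause.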

@jfreden jfreden added :ml Machine learning >test-failure Triaged test failures from CI labels Apr 25, 2024
@elasticsearchmachine elasticsearchmachine added Team:ML Meta label for the ML team needs:risk Requires assignment of a risk label (low, medium, blocker) labels Apr 25, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@alex-spies
Contributor

Are we sure this is specific to ML?

We seem to be getting this kind of failure across many unrelated integration tests.

I got one here, for PkiRealmAuthIT and two other integration tests: https://gradle-enterprise.elastic.co/s/53chtsv74jzf4

#107879, which we thought was fixed, looked the same as well.
