Pod requests queue is empty => leak of emulator deployment #1260

Open
materkey opened this issue Oct 27, 2021 · 1 comment
Labels
bug Something isn't working

Comments

Contributor

materkey commented Oct 27, 2021

Describe the bug
The run fails with "Pod requests queue is empty", and the emulator deployment leaks.

How to reproduce

  1. Start the instrumentationUi task.
  2. 8 of 14 emulator pods reach Running; the other 6 stay unscheduled because of insufficient CPU.
  3. About 20 minutes later CPU becomes available, new pods are created in the deployment, and the run crashes with IllegalStateException: Pod requests queue is empty.

Expected behavior
In this scenario the emulator deployment should not leak, and instrumentationUi should not fail.
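
One possible reading of the stack trace, sketched under assumptions: the metrics bookkeeping keeps a queue of pending pod requests and throws when a pod is acquired with no matching request. The classes below are hypothetical stand-ins, not the actual `KubernetesReservationState`, and the idea that late-scheduled pods outnumber the recorded requests is a guess based on the reproduction steps. The second class shows a tolerant variant in which a missing request skips the metric instead of failing the run.

```kotlin
// Hypothetical model of the bookkeeping; names and logic are assumptions,
// not the real avito-android implementation.
class StrictReservationState {
    // Timestamps (ms) at which pods were requested, oldest first.
    private val podRequests = ArrayDeque<Long>()

    fun podRequested(nowMs: Long) {
        podRequests.addLast(nowMs)
    }

    // Returns how long the oldest request waited; throws if nothing is pending.
    fun podAcquired(nowMs: Long): Long {
        val requestedAt = podRequests.removeFirstOrNull()
            ?: throw IllegalStateException("Pod requests queue is empty")
        return nowMs - requestedAt
    }
}

// Tolerant variant: an unexpected pod yields null, so the caller can skip
// the metric instead of cancelling the whole test run.
class TolerantReservationState {
    private val podRequests = ArrayDeque<Long>()

    fun podRequested(nowMs: Long) {
        podRequests.addLast(nowMs)
    }

    fun podAcquired(nowMs: Long): Long? {
        val requestedAt = podRequests.removeFirstOrNull() ?: return null
        return nowMs - requestedAt
    }
}

fun main() {
    // Two requests are recorded, but three pods show up (assumed scenario).
    val strict = StrictReservationState()
    repeat(2) { strict.podRequested(nowMs = 0L) }
    repeat(2) { strict.podAcquired(nowMs = 10L) }
    val failure = runCatching { strict.podAcquired(nowMs = 20L) }.exceptionOrNull()
    println("strict: ${failure?.message}")

    val tolerant = TolerantReservationState()
    repeat(2) { tolerant.podRequested(nowMs = 0L) }
    repeat(2) { tolerant.podAcquired(nowMs = 10L) }
    println("tolerant: ${tolerant.podAcquired(nowMs = 20L)}")
}
```

If the real cause is different (for example, a race between recording a request and observing the pod), the tolerant variant would only hide the symptom; the deployment cleanup on failure would still need a separate look.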

Environment
Version: 2021.36 (fork)
2 worker nodes (OpenStack VMs), 12 CPUs each

Additional context
Logs:

[StatsDSender@:app:instrumentationUi] time:consumerapp.testrunner.app.ui.reservation.pod.queue:1653
[RemoteDeviceProvider@:app:instrumentationUi] Found new pod: default-462db226-8a78-4421-91ab-f3b0af6152fa-6d8768786f-t527c
2021-10-27T16:52:42.650782489+03:00
[StatsDSender@:app:instrumentationUi] time:consumerapp.testrunner.app.ui.reservation.pod.queue:1653
[RemoteDeviceProvider@:app:instrumentationUi] Found new pod: default-462db226-8a78-4421-91ab-f3b0af6152fa-6d8768786f-v9596
kotlinx.coroutines.JobCancellationException: ScopeCoroutine is cancelling; job=ScopeCoroutine{Cancelling}@16296bdd
Caused by: java.lang.IllegalStateException: Pod requests queue is empty
	at com.avito.android.runner.devices.internal.kubernetes.KubernetesReservationState.podAcquired(KubernetesReservationState.kt:42)
	at com.avito.android.runner.devices.internal.kubernetes.StatsDKubernetesReservationMetricsSender.onPodAcquired(StatsDKubernetesReservationMetricsSender.kt:25)
	at com.avito.android.runner.devices.internal.kubernetes.KubernetesReservationClaimer$initializeDevices$2$1.invokeSuspend(KubernetesReservationClaimer.kt:101)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:56)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:738)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)
2021-10-27T16:52:42.650893088+03:00 [TestRunner@:app:instrumentationUi] Test run finished with error

kotlinx.coroutines.JobCancellationException: Parent job is Cancelling; job=StandaloneCoroutine{Cancelling}@11922229
Caused by: java.lang.IllegalStateException: Pod requests queue is empty
	at com.avito.android.runner.devices.internal.kubernetes.KubernetesReservationState.podAcquired(KubernetesReservationState.kt:42)
	at com.avito.android.runner.devices.internal.kubernetes.StatsDKubernetesReservationMetricsSender.onPodAcquired(StatsDKubernetesReservationMetricsSender.kt:25)2021-10-27T16:52:42.650935967+03:00 
	at com.avito.android.runner.devices.internal.kubernetes.KubernetesReservationClaimer$initializeDevices$2$1.invokeSuspend(KubernetesReservationClaimer.kt:101)2021-10-27T16:52:42.650947013+03:00 
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:56)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)2021-10-27T16:52:42.650969371+03:00 
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:738)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)2021-10-27T16:52:42.650991556+03:00 
[RemoteDeviceProvider@:app:instrumentationUi] Pod default-462db226-8a78-4421-91ab-f3b0af6152fa-6d8768786f-j8hk8 can't load device. Disconnect and delete.
Check device logs in artifacts: /job/app/app/build/test-runner/4bdd0cc9fa0288878524b47b3e7574a3d2cdb4d9.local-root/ui/devices/10.0.3.134.txt
[StatsDSender@:app:instrumentationUi] time:consumerapp.service.kubernetes.pods_delete.202:34
[RemoteDeviceProvider@:app:instrumentationUi] Pod default-462db226-8a78-4421-91ab-f3b0af6152fa-6d8768786f-j8hk8 is deleted: true
[AbstractDevice@:app:instrumentationUi] Wait device with serial: 10.0.3.137:5555 succeed in 10008 at attempt=1
[AbstractDevice@:app:instrumentationUi] Wait device with serial: 10.0.3.136:5555 succeed in 10010 at attempt=1
kotlinx.coroutines.JobCancellationException: Parent job is Cancelling; job=StandaloneCoroutine{Cancelling}@11922229
Caused by: java.lang.IllegalStateException: Pod requests queue is empty
	at com.avito.android.runner.devices.internal.kubernetes.KubernetesReservationState.podAcquired(KubernetesReservationState.kt:42)
	at com.avito.android.runner.devices.internal.kubernetes.StatsDKubernetesReservationMetricsSender.onPodAcquired(StatsDKubernetesReservationMetricsSender.kt:25)
	at com.avito.android.runner.devices.internal.kubernetes.KubernetesReservationClaimer$initializeDevices$2$1.invokeSuspend(KubernetesReservationClaimer.kt:101)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:56)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:738)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)
kotlinx.coroutines.JobCancellationException: Parent job is Cancelling; job=StandaloneCoroutine{Cancelling}@11922229
Caused by: java.lang.IllegalStateException: Pod requests queue is empty
	at com.avito.android.runner.devices.internal.kubernetes.KubernetesReservationState.podAcquired(KubernetesReservationState.kt:42)
	at com.avito.android.runner.devices.internal.kubernetes.StatsDKubernetesReservationMetricsSender.onPodAcquired(StatsDKubernetesReservationMetricsSender.kt:25)
	at com.avito.android.runner.devices.internal.kubernetes.KubernetesReservationClaimer$initializeDevices$2$1.invokeSuspend(KubernetesReservationClaimer.kt:101)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:56)
	at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:738)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
	at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)
2021-10-27T16:52:52.650765459+03:00 [RemoteDeviceProvider@:app:instrumentationUi] Pod default-462db226-8a78-4421-91ab-f3b0af6152fa-6d8768786f-t527c can't load device. Disconnect and delete.
Check device logs in artifacts: /job/app/app/build/test-runner/4bdd0cc9fa0288878524b47b3e7574a3d2cdb4d9.local-root/ui/devices/10.0.3.137.txt
[RemoteDeviceProvider@:app:instrumentationUi] Pod default-462db226-8a78-4421-91ab-f3b0af6152fa-6d8768786f-v9596 can't load device. Disconnect and delete.

This exception appears 4 times in one run, and then the build fails:

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':app:instrumentationUi'.
> A failure occurred while executing com.avito.gradle.worker.NonSerializableWork
   > Pod requests queue is empty

materkey added the bug label Oct 27, 2021
@dsvoronin
Contributor

Please check whether 2021.37 fixes this issue.
