Hit an interesting issue, luckily only on staging, but could be disastrous on production...
Started seeing many review workflows failing at the build stage with messages like:
error: failed to solve: ruby:3.2.2-buster: failed to copy: httpReadSeeker: failed open: unexpected status code https://registry-1.docker.io/v2/library/ruby/manifests/sha256:38f85fe6580dade01906f3b20381668250731d8a49cff715784a614ba0ffd815: 429 Too Many Requests - Server message: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit
So thought "fair enough, it's been a busy afternoon, I'll add some Docker Hub authentication credentials to at least double our limit".
Added docker_hub_username and docker_hub_password to the rack params (side note: these should really be documented), so the Rack goes to update itself, but then the rack update ends up failing with:
Error: Waiting for rollout to finish: 1 old replicas are pending termination...
on .terraform/modules/system/terraform/api/k8s/main.tf line 45, in resource "kubernetes_deployment" "api":
45: resource "kubernetes_deployment" "api" {
ERROR: exit status 1
ERROR: we have been notified about a system error: (d1b29934916543ed8ccfd9841d3dbaf1)
And the Rack has now become completely unresponsive. I'm guessing that when the new API services are trying to start up, they aren't able to do so successfully because of the aforementioned rate limit, so everything is just dead.
I'm just waiting to see if k8s will recover automatically, but if it doesn't, I'll have to rebuild this rack from scratch 😭
As an addendum, I still have kubectl access, but that is failing as well...
╰─ kubectl get ns
E1113 18:08:00.252126 73233 memcache.go:265] couldn't get current server API group list: the server is currently unable to handle the request
E1113 18:08:00.605552 73233 memcache.go:265] couldn't get current server API group list: the server is currently unable to handle the request
E1113 18:08:00.975894 73233 memcache.go:265] couldn't get current server API group list: the server is currently unable to handle the request
E1113 18:08:01.302964 73233 memcache.go:265] couldn't get current server API group list: the server is currently unable to handle the request
E1113 18:08:01.670614 73233 memcache.go:265] couldn't get current server API group list: the server is currently unable to handle the request
Error from server (ServiceUnavailable): the server is currently unable to handle the request
Or does that proxy through the API server as well?
Yes, it also proxies through the API server.
BTW, make sure your Docker Hub username and password are correct.
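One way to sanity-check the credentials, and to see which pull limit you're actually getting, is Docker's documented rate-limit check against the ratelimitpreview/test image. Roughly (replace username/password with your Docker Hub credentials):
# request a pull token for the test image using the Hub credentials
TOKEN=$(curl -s --user "username:password" "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)
# fetch the manifest headers; ratelimit-limit / ratelimit-remaining show the current quota
curl -s --head -H "Authorization: Bearer $TOKEN" https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest | grep -i ratelimit
If the token request itself fails, the credentials are wrong; if it succeeds, the headers tell you whether you're on the authenticated (higher) limit.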