
Retries exceeded causes pod to wrongly restart #53

Open
OranShuster opened this issue Oct 12, 2022 · 4 comments

Comments

@OranShuster

When launching a new pod, the beats-exporter container shuts down with the following error message:

2022-10-12 09:30:48,767 - __main__ - ERROR - Error connecting Beat at port 5066:
HTTPConnectionPool(host='localhost', port=5066): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f149d86b0d0>: Failed to establish a new connection: [Errno 111] Connection refused'))

However, after a couple of restarts (usually no more than 2) the pod is alive and ready. I assume the filebeat container is not ready fast enough, so the exporter exhausts its retries before filebeat starts answering.

Is there some k8s way to handle this? I can add a delay by overriding the container CMD, but that looks like kind of a hack to me.
Maybe we could set a startup delay using arguments, or increase the interval between retries?
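
To illustrate the "increase interval between retries" idea, here is a minimal sketch of a wait loop that polls the Beat HTTP endpoint before giving up. It is not the exporter's actual code: the function names `beat_is_up` and `wait_for_beat`, the `attempts` and `delay` parameters, and their defaults are all hypothetical, and it uses only the Python standard library rather than whatever HTTP client the exporter uses.

```python
import http.client
import time
import urllib.error
import urllib.request


def beat_is_up(url, timeout=2.0):
    """Return True if the Beat HTTP endpoint answers with a 2xx response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, bad response, or non-2xx status:
        # treat all of these as "not ready yet".
        return False


def wait_for_beat(url, attempts=10, delay=3.0, probe=beat_is_up, sleep=time.sleep):
    """Poll `url` up to `attempts` times, sleeping `delay` seconds between tries.

    `probe` and `sleep` are injectable so the loop can be exercised
    without a real Beat or real waiting (hypothetical design choice).
    """
    for attempt in range(1, attempts + 1):
        if probe(url):
            return True
        if attempt < attempts:
            sleep(delay)
    return False
```

Calling something like `wait_for_beat("http://localhost:5066/")` before the first scrape would give filebeat up to roughly 30 seconds to come up with these assumed defaults, instead of failing on the first refused connection.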

@paketb0te

@OranShuster this is probably way too late, but anyways:

If you are running this in k8s, you could add a startupProbe to the exporter container which checks if the filebeat http server has started / is answering requests
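
As a rough sketch of what such a check could look like, the snippet below tests whether something is listening on the Beat port and maps the result to an exit code, which is how an exec-style probe reports success or failure. The names `port_is_open` and `probe_exit_code` are made up for illustration, and it assumes a Python interpreter is available in the exporter image (the log output suggests the exporter is a Python program, but that is an assumption).

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused or timed out: the Beat HTTP server is not up yet.
        return False


def probe_exit_code(host="localhost", port=5066):
    """Map the check to a process exit code: 0 means ready, 1 means not ready."""
    return 0 if port_is_open(host, port) else 1
```

A probe command would end with `sys.exit(probe_exit_code())` after `import sys`. This only checks that the port accepts connections, not that the metrics endpoint returns valid data.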

@OranShuster
Author

@paketb0te this was solved with a sleep 10 && ....
Also, my cluster at the time was too old to support startup probes.

@paketb0te

sleep 10, the poor man's startupProbe 😅
Are you currently still using the beat exporter? If so, is it still working?
(just wondering because the last commit is 4 years old)

@OranShuster
Author

@paketb0te when I was laid off in March it was still working. It was on filebeat 7, but I don't think the metrics endpoint changed that much in 8.
