[Metaflow UI] stdout and stderr logs timeout/fail to load #393

Open
martinbattentive opened this issue Oct 17, 2023 · 2 comments

martinbattentive commented Oct 17, 2023

When using the Metaflow UI, the stdout/stderr panes no longer load; the requests for them return a 504 Gateway Timeout.

Screenshot 2023-10-17 at 2 36 28 PM

Example url being requested by UI for stderr logs:
/api/flows/<flow_name>/runs/59510/steps/start/tasks/539228/logs/err?attempt_id=0&_limit=500&_page=1&_order=-row

I believe the issue is caused by a very expensive join query in async def get_task_by_request(self, request) in ui_backend_service/api/log.py. Looking at the code, this function call and its underlying join query seem unnecessary, since the UI already passes every parameter needed to uniquely identify the task in the Task table, including the attempt.
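
To illustrate the point about the request already carrying the task identifiers, here is a minimal sketch that pulls them out of a log URL like the one above. It is not code from ui_backend_service: the function name task_key_from_log_url, the field names (flow_id, run_number, step_name, task_id) and the flow name MyFlow are assumptions made for the example.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Path layout copied from the example URL in this issue; the group names
# are hypothetical and not taken from the ui_backend_service source.
LOG_PATH = re.compile(
    r"^/api/flows/(?P<flow_id>[^/]+)/runs/(?P<run_number>[^/]+)"
    r"/steps/(?P<step_name>[^/]+)/tasks/(?P<task_id>[^/]+)"
    r"/logs/(?P<stream>out|err)$"
)


def task_key_from_log_url(url):
    """Return the task identifiers carried by a log request URL."""
    parts = urlsplit(url)
    match = LOG_PATH.match(parts.path)
    if match is None:
        raise ValueError(f"not a task log URL: {url!r}")
    key = match.groupdict()
    # attempt_id travels in the query string; keep None when it is absent
    # rather than guessing a default.
    attempt = parse_qs(parts.query).get("attempt_id")
    key["attempt_id"] = int(attempt[0]) if attempt else None
    return key


if __name__ == "__main__":
    example = (
        "/api/flows/MyFlow/runs/59510/steps/start/tasks/539228/logs/err"
        "?attempt_id=0&_limit=500&_page=1&_order=-row"
    )
    print(task_key_from_log_url(example))
```

Whether these fields alone are enough to skip the lookup depends on how the backend resolves tasks, which this sketch does not model.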

@martinbattentive
Author

@saikonen @savingoyal My initial belief was incorrect. The problem was actually caused by the log CacheAsyncClient and/or CacheAsyncServer getting into a bad state: it would fetch the logs internally but never return them, so the list of pending streams kept growing. Restarting the ui_backend service resolved the issue.
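
For anyone hitting the same symptom, here is a generic sketch of one way to keep a stuck fetch from piling up pending work: bound each wait with a timeout and always drop the entry afterwards. This is not the CacheAsyncClient API. The PendingStreams class, its wait_for_result method and the stuck_fetch coroutine are hypothetical names that only model the "fetched but never returned" behaviour described above.

```python
import asyncio


class PendingStreams:
    """Hypothetical registry of in-flight log fetches (not Metaflow code)."""

    def __init__(self):
        self._pending = {}

    def __len__(self):
        return len(self._pending)

    async def wait_for_result(self, stream_key, fetch, timeout_s):
        # Track the fetch while it is in flight.
        task = asyncio.ensure_future(fetch())
        self._pending[stream_key] = task
        try:
            # wait_for cancels the task and raises a timeout error
            # if no result arrives within timeout_s seconds.
            return await asyncio.wait_for(task, timeout=timeout_s)
        finally:
            # Remove the entry on success, timeout, or error so the
            # registry cannot grow without bound.
            self._pending.pop(stream_key, None)


async def stuck_fetch():
    # Models a fetch that never returns: the event is never set.
    await asyncio.Event().wait()


async def demo():
    streams = PendingStreams()
    try:
        await streams.wait_for_result("logs/err", stuck_fetch, timeout_s=0.05)
        outcome = "returned"
    except asyncio.TimeoutError:
        outcome = "timed out"
    return outcome, len(streams)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With a guard like this the caller gets a fast, explicit failure instead of waiting for the gateway to return a 504, and the pending count drops back to zero after each request. It does not fix whatever puts the cache into the bad state in the first place.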

2 participants