Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: API reset can cause evaluations to get stuck with RUNNING status. #541

Open
1 task done
czaloom opened this issue Apr 16, 2024 · 1 comment
Open
1 task done
Assignees
Labels
bug Something isn't working

Comments

@czaloom
Copy link
Collaborator

czaloom commented Apr 16, 2024

valor version checks

  • I have confirmed this bug exists on the latest version of valor.

Reproducible Example

Restart server mid-computation but retain the database.

Issue Description

Evaluations in Valor are performed using a combination of python code in the backend and some postgresql queries. If the API is restarted mid-computation a desynchronization occurs between the API and the database.

Expected Behavior

Any evaluation stuck in a RUNNING status after a timeout threshold should be set to the FAILED status.

@czaloom czaloom added the bug Something isn't working label Apr 16, 2024
@czaloom czaloom self-assigned this Apr 16, 2024
@ntlind ntlind assigned ntlind and unassigned czaloom Apr 22, 2024
@ntlind
Copy link
Collaborator

ntlind commented Apr 26, 2024

Moving this one to backlog. The best fix here might be implementing a message broker.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants