shuffle workflow in executing state and pods are in error state #1384

Open
anjanaradhakrishnan opened this issue May 10, 2024 · 2 comments

@anjanaradhakrishnan

Hi @frikky, I'm facing an issue with a Shuffle workflow. The workflow has a webhook that is integrated with Wazuh and, based on a condition, creates a case in TheHive. Even though the case is created in TheHive, the workflow stays in the EXECUTING state.
[image]
The Shuffle worker pod is also in an error state. These are the logs I'm getting:

2024/05/10 11:23:29 [DEBUG] Got kubernetes client
2024/05/10 11:23:29 [DEBUG] Created pod "aws-s3-93ced6c7-8638-4178-988e-123456" in namespace "shuffle"
2024/05/10 11:23:30 [DEBUG][cd7b9999-bffe-4bc6-b31a-123456] Action: Received, Label: 'Upload_file_to_S3', Action: 'AWS S3', Status: SUCCESS, Run status: EXECUTING, Extra=Retry:0
2024/05/10 11:23:30 [WARNING][cd7b9999-bffe-4bc6-b31a-123456] Execution is not executing, but FINISHED. Stopping Transaction update.
2024/05/10 11:23:30 [DEBUG][cd7b9999-bffe-4bc6-b31a-123456] Shutting down (35)
2024/05/10 11:23:31 [DEBUG][cd7b9999-bffe-4bc6-b31a-123456] NEWRESP (from backend): {"success": true, "reason": "success"}
2024/05/10 11:23:31 [DEBUG][cd7b9999-bffe-4bc6-b31a-123456] Shutdown (FINISHED) started with reason "". Result amount: 4. ResultsSent: 0, Send result:false, Parent: ""
2024/05/10 11:23:31 [DEBUG][cd7b9999-bffe-4bc6-b31a-123456] Finished shutdown (after 1 seconds).
2024/05/10 11:23:31 [DEBUG][cd7b9999-bffe-4bc6-b31a-123456] Setting execution to finished because all results are in and it was still in EXECUTING mode. Should set subflow parent result as well (not implemented).
2024/05/10 11:23:31 [INFO][cd7b9999-bffe-4bc6-b31a-123456] Validation. Status: FINISHED, Actions: 4, Extra: 0, Results: 4
2024/05/10 11:23:31 [INFO][cd7b9999-bffe-4bc6-b31a-123456] Already finished (validate)! Stopping the rest of the request for execution.
2024/05/10 11:23:31 [DEBUG][cd7b9999-bffe-4bc6-b31a-123456] Sending result (set)

The log shows Label: 'Upload_file_to_S3', Action: 'AWS S3', Status: SUCCESS, but that action is skipped in the workflow based on the condition.

If the case is created successfully, shouldn't the workflow be in a success state and the pods in a completed state?
I need your help with this, @frikky.

@frikky
Member

frikky commented May 10, 2024

Hey,

Could you find the logs for cd7b9999-bffe-4bc6-b31a-123456 on the shuffle-backend container from around the same time? It looks like the worker that handled the action worked, but the backend may not have updated the state in the database, or the result may not have been forwarded properly at the point where the log says Sending result (set).
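
If it helps with pulling those lines out, here is a minimal sketch of a filter that keeps only the log lines mentioning one execution ID within a time window. It assumes the backend logs use the same `YYYY/MM/DD HH:MM:SS` prefix as the worker logs above, which may not hold for every deployment; the function name `filter_execution_logs` and the 60-second default window are made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Timestamp prefix as seen in the worker logs above, e.g. "2024/05/10 11:23:31".
# Assumption: the backend container logs use the same prefix.
TIMESTAMP = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})")


def filter_execution_logs(lines, execution_id, around, window_seconds=60):
    """Return lines that mention execution_id and are within window_seconds of `around`."""
    window = timedelta(seconds=window_seconds)
    matches = []
    for line in lines:
        if execution_id not in line:
            continue  # line belongs to another execution or has no ID
        found = TIMESTAMP.match(line)
        if found is None:
            continue  # no parsable timestamp, so it cannot be placed in time
        stamp = datetime.strptime(found.group(1), "%Y/%m/%d %H:%M:%S")
        if abs(stamp - around) <= window:
            matches.append(line.rstrip("\n"))
    return matches
```

One way to use it would be to save the backend container's output to a file, then call `filter_execution_logs(open(path), "cd7b9999-bffe-4bc6-b31a-123456", datetime(2024, 5, 10, 11, 23, 30))` and look at what the backend did with the result right after the worker logged Sending result (set).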

This can be prevented entirely if a cache between the Workers and the Backend is in place, which it isn't by default. Docker reference for it:

@frikky
Member

frikky commented May 10, 2024

#memcached:
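
Once a cache like the one mentioned above is deployed, a quick reachability check can confirm that a memcached server actually answers from where the worker or backend runs. This is only a sketch: it sends the `version` command from the memcached text protocol over a plain TCP socket. The function name is hypothetical, 11211 is memcached's default port, and the host you pass depends entirely on your own deployment. It does not show how Shuffle itself is pointed at the cache.

```python
import socket


def memcached_version(host, port=11211, timeout=2.0):
    """Return the memcached version string, or None if the server cannot be reached."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            # "version" is a command in memcached's plain-text protocol.
            conn.sendall(b"version\r\n")
            reply = conn.recv(128).decode("ascii", errors="replace").strip()
    except OSError:
        # Covers refused connections, DNS failures and timeouts.
        return None
    if reply.startswith("VERSION "):
        return reply[len("VERSION "):]
    return None
```

A return value of None from inside the cluster would suggest the cache is not reachable at that host and port, which is worth ruling out before digging further into the backend logs.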
