Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix schedule_action_delay metric for buffered actions #5884

Merged
merged 6 commits into from May 20, 2024
Merged

Conversation

dnr
Copy link
Member

@dnr dnr commented May 9, 2024

What changed?

  • Calculate "delay" for buffered actions from previous action close.
  • Update a few logs.

Why?

If an action is held up by waiting for a previous action to finish (e.g. BufferOne, BufferAll, CancelOther overlap policies), it's not fair to count the waiting time as the "delay" for that action.

How did you test it?

Tested locally by building up a backlog and checking that the metric didn't increase (and did before this change). Also tested upgrade and downgrade.

Potential risks

This is touching workflow code, but is compatible because:

  • If it sees nil for the close time or desired time, that's fine, it just falls back to actual time.
  • All the behavior changes are in logs and metrics only.

@dnr dnr requested a review from a team as a code owner May 9, 2024 00:44
@dnr dnr enabled auto-merge (squash) May 18, 2024 00:44
@dnr dnr merged commit 0bf37a0 into temporalio:main May 20, 2024
42 checks passed
@dnr dnr deleted the sched86 branch May 20, 2024 19:37
ychebotarev pushed a commit to ychebotarev/temporal that referenced this pull request May 20, 2024
## What changed?
- Calculate "delay" for buffered actions from previous action close.
- Update a few logs.

## Why?
If an action is held up by waiting for a previous action to finish (e.g.
BufferOne, BufferAll, CancelOther overlap policies), it's not fair to
count the waiting time as the "delay" for that action.

## How did you test it?
Tested locally by building up a backlog and checking that the metric
didn't increase (and did before this change). Also tested upgrade and
downgrade.

## Potential risks
This is touching workflow code, but is compatible because:
- If it sees nil for the close time or desired time, that's fine, it
just falls back to actual time.
- All the behavior changes are in logs and metrics only.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants