Add data integrity check for cron task #3769

Merged
merged 7 commits into from
Aug 29, 2023

tintinthong
Contributor

@tintinthong tintinthong commented Aug 25, 2023

This adds a data integrity check that looks at when each graphile worker cron task last fired. If a task runs every 5 minutes, we give it an allowance of 5*3 = 15 minutes before flagging it. Right now every task uses a 3x multiplier, but this tolerance can be adjusted for a specific task if needed.
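
A minimal sketch of the tolerance rule described above, for illustration only. The names `isTaskOverdue` and `DEFAULT_TOLERANCE_MULTIPLIER` are hypothetical and not taken from this PR; treating a task that has never run as overdue is also an assumption.

```typescript
// Hypothetical helper illustrating the "interval x multiplier" allowance.
// Names and the never-ran behaviour are assumptions, not the PR's code.
const DEFAULT_TOLERANCE_MULTIPLIER = 3;

function isTaskOverdue(
  lastExecution: Date | null,
  intervalMinutes: number,
  now: Date = new Date(),
  multiplier: number = DEFAULT_TOLERANCE_MULTIPLIER
): boolean {
  // Assumption: a task with no recorded execution counts as overdue.
  if (lastExecution === null) return true;
  // Allowed gap in milliseconds, e.g. 5 min * 3 = 15 min.
  const allowedMs = intervalMinutes * multiplier * 60 * 1000;
  return now.getTime() - lastExecution.getTime() > allowedMs;
}
```

With a 5 minute task, a last run 10 minutes ago is still within the 15 minute allowance, while a last run 16 minutes ago is not.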

I didn't use Prisma, because the graphile_worker tables live in the graphile_worker namespace (Postgres schema), and a Prisma schema file typically only allows a single namespace to be specified.
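
A rough sketch of what reading the last execution times with raw SQL instead of Prisma could look like. It assumes a table `graphile_worker.known_crontabs` with `identifier` and `last_execution` columns; that layout may differ between graphile-worker versions, so check it against the installed one. `Queryable`, `CrontabRow`, `toLastExecutionMap` and `fetchLastExecutions` are hypothetical names, and a real SQL client would need to be adapted to the `Queryable` shape.

```typescript
// Assumed row shape of graphile_worker.known_crontabs (verify for your version).
interface CrontabRow {
  identifier: string;
  last_execution: Date | null;
}

// Hypothetical minimal client interface, so the sketch has no dependencies.
interface Queryable {
  query(text: string): Promise<{ rows: CrontabRow[] }>;
}

// The schema is named explicitly because the table lives outside the default schema.
const LAST_EXECUTION_SQL =
  'SELECT identifier, last_execution FROM graphile_worker.known_crontabs';

// Index rows by task identifier for quick lookup during the check.
function toLastExecutionMap(rows: CrontabRow[]): Map<string, Date | null> {
  const result = new Map<string, Date | null>();
  for (const row of rows) {
    result.set(row.identifier, row.last_execution);
  }
  return result;
}

async function fetchLastExecutions(db: Queryable): Promise<Map<string, Date | null>> {
  const { rows } = await db.query(LAST_EXECUTION_SQL);
  return toLastExecutionMap(rows);
}
```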

Contributor

@jurgenwerk jurgenwerk left a comment

Overall looks good, but it would be really good to explore getting the cron job frequency dynamically, so that we reduce the chance of the checks getting out of sync if someone adjusts a task's frequency.

Also, is there a better word for a task that was supposed to run but didn't? You used "stopped", but that can also indicate someone stopped it intentionally. I'd prefer maybe "lagging".

@tintinthong
Contributor Author

tintinthong commented Aug 28, 2023

@jurgenwerk this was perhaps more involved than I wanted it to be. I had to parse the way graphile worker encodes ParsedCronItem, which is its object encoding of the original crontab configuration. In any case, I created classes that correspond to a conservative understanding of what each ParsedCronItem means.

export const CRON_TAB_STRING: string = [
  '0 5 * * * remove-old-sent-notifications ?max=5',
  '*/5 * * * * print-queued-jobs',
  '*/5 * * * * execute-scheduled-payments',
  ...(config.get('rewardsIndexer.enabled') ? ['*/10 * * * * check-reward-roots'] : []),
].join('\n');

-> 

ParsedCronItem[]

This exercise also made me realise that I was perceiving remove-old-sent-notifications wrongly. The task actually runs daily at 5am UTC. Previously, I thought it was an interval-type configuration, which is wrong, so I had to make adjustments.

return r;
}
isType(item: ParsedCronItem): item is MinuteIntervalLessThanHour {
if (item.hours.length !== 24 || item.dows.length !== 7 || item.months.length !== 12 || item.dates.length !== 31) {
Contributor

All the logic in this function is a bit hard to follow for me. I think it went a bit too far in complexity.

Have you considered parsing the cron string, getting the interval for each entry, and simply checking whether it has been executed within the last interval plus some padding? Or am I missing something?

Looks like ChatGPT can help spit out a function for that:

function calculateCronInterval(expression) {
    const fields = expression.split(' ');
    const [minute, hour, dayOfMonth, month, dayOfWeek] = fields;

    function processField(field, min, max) {
        if (field === '*') {
            return max - min + 1;
        }
        const values = field.split(',').map(value => parseInt(value));
        return Math.min(...values.map(value => max - min + 1));
    }

    const minutesInterval = processField(minute, 0, 59);
    const hoursInterval = processField(hour, 0, 23);
    const daysInterval = processField(dayOfMonth, 1, 31);
    const monthsInterval = processField(month, 1, 12);
    const weekdaysInterval = processField(dayOfWeek, 0, 6);

    // Calculate the overall interval in minutes
    const overallInterval = minutesInterval * hoursInterval * daysInterval * monthsInterval * weekdaysInterval;

    return overallInterval;
}

const cronExpression = "*/15 * * * *"; // Example cron expression
const intervalInMinutes = calculateCronInterval(cronExpression);
console.log(`Interval in minutes: ${intervalInMinutes}`);

Contributor Author

Noted. I have over-abstracted at the expense of clarity. I can just parse the crontab string and simplify.

Contributor Author

I have simplified everything. The function above is wrong.

I think we probably don't want to model the ENTIRE graphile worker crontab syntax.

The main case we consider is just the difference between */15 * * * * and 0 5 * * *. The former is every 15 minutes; the latter is every day at 5am UTC. For our purposes, we treat these two cases as every 15 minutes and every 24*60 minutes (1 day) since the last execution.

That means the code is written CONSERVATIVELY: if you write something like */15 15 * * *, which means every 15 minutes within the 15th hour of the day, I expect it to not parse and to error out. This keeps our code exact but incomplete.
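
To make the conservative rule concrete, here is a sketch of a parser that accepts only the two shapes discussed and throws on everything else. It is not the implementation merged in this PR; it only borrows the function name, the `minuteInterval` field, and the error message that appear in the PR's test, and `CronInterval` and its `task` field are hypothetical.

```typescript
// Illustrative only: handles "*/N * * * * task" and "M H * * * task", rejects the rest.
interface CronInterval {
  task: string; // hypothetical field, not confirmed by the PR
  minuteInterval: number;
}

const MINUTES_PER_DAY = 24 * 60;

function calculateMinuteInterval(crontabLine: string): CronInterval {
  const fail = (): never => {
    throw new Error(`Cannot parse the provided cron expression: ${crontabLine}`);
  };
  const fields = crontabLine.trim().split(/\s+/);
  // Five time fields plus a task identifier are required.
  if (fields.length < 6) return fail();
  const [minute, hour, date, month, dow, task] = fields;
  // Anything restricting date, month, or day of week is out of scope.
  if (date !== '*' || month !== '*' || dow !== '*') return fail();

  // Case 1: every N minutes, all day long.
  const every = /^\*\/(\d+)$/.exec(minute);
  if (every && hour === '*') {
    const n = Number(every[1]);
    return n >= 1 && n <= 59 ? { task, minuteInterval: n } : fail();
  }
  // Case 2: one fixed time of day, treated as a 24 hour interval.
  if (/^\d+$/.test(minute) && /^\d+$/.test(hour)) {
    if (Number(minute) <= 59 && Number(hour) <= 23) {
      return { task, minuteInterval: MINUTES_PER_DAY };
    }
  }
  return fail();
}
```

Rejecting unknown shapes means a new crontab entry fails loudly at parse time instead of being checked against a wrong interval.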

Comment on lines 449 to 461
it('calculateMinuteInterval', async function () {
  expect(calculateMinuteInterval('0 5 * * * remove-old-sent-notifications ?max=5').minuteInterval).to.equal(
    60 * 24
  );
  expect(calculateMinuteInterval('*/5 * * * * print-queued-jobs').minuteInterval).to.equal(5);
  expect(() => calculateMinuteInterval('0 5 2 * * remove-old-sent-notifications ?max=5')).to.throw(
    'Cannot parse the provided cron expression: 0 5 2 * * remove-old-sent-notifications ?max=5'
  );
  expect(() => calculateMinuteInterval('*/5 */3 * * * print-queued-jobs')).to.throw(
    'Cannot parse the provided cron expression: */5 */3 * * * print-queued-jobs'
  );
});
});
Contributor Author

I wrote a test here.

@tintinthong tintinthong merged commit 27bdccb into main Aug 29, 2023
2 checks passed
@delete-merged-branch delete-merged-branch bot deleted the task-data-integrity-check branch August 29, 2023 09:02
2 participants