Some recurrent tasks are crucial for data integrity (e.g. getting weather forecasts from an external API), and we are not only interested in actual errors (which we could catch with sentry, for example) but also in the absence of regular, frequent success. For instance, we know this task has to happen every hour. If it isn't being run for some reason, we need to know.
We can already mark CLI tasks as tasks (using `@task_with_status_report`), so that their status is reported in the `latest_task_run` table.
We also made an API endpoint called `getLatestTaskRun` so that an external monitoring tool could check if certain tasks are being run successfully and as frequently as we hope (using this tool for instance).
This issue is about a CLI task which can monitor the state of tasks. This approach is a little less safe than an external tool that queries our API (single point of failure), but way more convenient (teams can decide which route to take according to their project).
The monitoring task should accept a set of task name and frequency pairs (or we keep using the config for the frequency demands: `MONITOR_FREQUENCY_<task name>`). We can use some logic from py-pinger to implement the core logic (decide if the task ran successfully and recently enough). We probably also want to configure actions to take if a task is found to be failing. Here I'm open to email, Slack and/or Sentry.
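To make the core check concrete, here is a minimal sketch of deciding whether each task ran successfully and recently enough. It is not py-pinger's logic and not existing code in this project: `TaskRun`, its field names and `find_failing_tasks` are hypothetical stand-ins, and I'm assuming each row in `latest_task_run` gives us a task name, a timezone-aware time of the last run and a success flag. The actions to take (email, Slack, Sentry) are left out on purpose.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional


@dataclass
class TaskRun:
    """Hypothetical stand-in for one row of the latest_task_run table."""

    name: str
    last_run: datetime  # assumed to be timezone-aware
    succeeded: bool


def find_failing_tasks(
    latest_runs: Dict[str, TaskRun],
    max_ages: Dict[str, timedelta],
    now: Optional[datetime] = None,
) -> Dict[str, str]:
    """Return a mapping of task name to the reason it counts as failing.

    latest_runs: latest known run per task name.
    max_ages: frequency demand per task name, e.g. built from CLI
        arguments or from MONITOR_FREQUENCY_<task name> config values.
    """
    if now is None:
        now = datetime.now(timezone.utc)

    failing: Dict[str, str] = {}
    for name, max_age in max_ages.items():
        run = latest_runs.get(name)
        if run is None:
            # The task never reported a status at all.
            failing[name] = "no run recorded"
        elif not run.succeeded:
            # The most recent run reported an error.
            failing[name] = "latest run failed"
        elif now - run.last_run > max_age:
            # The last success is older than the frequency demand allows.
            failing[name] = "latest successful run is too old"
    return failing
```

The monitoring CLI task could call something like this with the configured demands and hand any non-empty result to whichever alerting channel the team configured. Passing `now` explicitly keeps the check easy to test.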
Also, we should add documentation about this topic.