Audit task failed-deps for later investigation #39712

Open
1 task done
eladkal opened this issue May 20, 2024 · 2 comments
Labels
area:Scheduler (Scheduler or dag parsing Issues), kind:feature (Feature Requests)

Comments

@eladkal
Contributor

eladkal commented May 20, 2024

Body

Currently, Airflow offers failed-deps to investigate why a task isn't being scheduled. This is a very helpful tool, but it works only in real time, against the current entries in the metadata DB. Investigating past anomalies isn't supported.

Sometimes scheduling problems "solve" themselves. A pool may be full or a concurrency limit may have been reached, but eventually the load drops and the tasks get scheduled. By the time you notice the delay and want to investigate what caused it, your options are limited: there are many possible causes, and failed-deps can only report on the current state.

The needed solution:
We should investigate options for auditing the failed-deps information, or alternatively offer an easy way to export it in real time to external audit storage for later investigation.
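
To make the export idea concrete, here is a minimal sketch, not a proposed implementation. It assumes the existing `airflow tasks failed-deps` CLI is on the PATH and accepts a run ID as its third positional argument (as in Airflow 2.x). The function names, the JSON Lines file format, and the record fields are all hypothetical choices made for illustration.

```python
import json
import subprocess
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable


def run_failed_deps(dag_id: str, task_id: str, run_id: str) -> str:
    """Return the raw text printed by `airflow tasks failed-deps`.

    ASSUMPTION: the Airflow CLI is installed and configured on this host.
    """
    result = subprocess.run(
        ["airflow", "tasks", "failed-deps", dag_id, task_id, run_id],
        capture_output=True,
        text=True,
        check=False,  # keep whatever was printed even on a non-zero exit
    )
    return result.stdout


def snapshot_failed_deps(
    dag_id: str,
    task_id: str,
    run_id: str,
    audit_file: Path,
    runner: Callable[[str, str, str], str] = run_failed_deps,
) -> dict:
    """Append one timestamped failed-deps snapshot to a JSON Lines file.

    `runner` is injectable so the storage logic can be exercised
    without a running Airflow installation.
    """
    record = {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "dag_id": dag_id,
        "task_id": task_id,
        "run_id": run_id,
        # Stored verbatim; parsing the text is left out of this sketch.
        "failed_deps_output": runner(dag_id, task_id, run_id).strip(),
    }
    with audit_file.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

Calling `snapshot_failed_deps` on a schedule for the task instances of interest would leave a history that can be searched after the scheduling problem has cleared. The file could be swapped for any external audit store.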

Committer

  • I acknowledge that I am a maintainer/committer of the Apache Airflow project.
@eladkal eladkal added area:Scheduler Scheduler or dag parsing Issues kind:feature Feature Requests labels May 20, 2024
@tirkarthi
Contributor

Recently I worked on this, and the information is available in the UI and API for tasks in the scheduled or None state. Perhaps the API could be used for export, and also enriched with additional checks that provide useful information.

Ref : #38449
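
As a rough illustration of using the API for export, the sketch below fetches the unmet dependencies of one task instance over HTTP. The endpoint path and the response shape (`{"dependencies": [{"name": ..., "reason": ...}]}`) are assumptions based on my understanding of the referenced change and should be checked against the REST API reference for your Airflow version. Basic auth, the function names, and the constant are likewise illustrative only.

```python
import base64
import json
import urllib.request
from typing import List, Tuple
from urllib.parse import quote

# ASSUMPTION: path of the task instance dependencies endpoint; verify it
# against the REST API reference of the Airflow version you run.
DEPENDENCIES_PATH = (
    "/api/v1/dags/{dag_id}/dagRuns/{run_id}/taskInstances/{task_id}/dependencies"
)


def dependencies_url(base_url: str, dag_id: str, run_id: str, task_id: str) -> str:
    """Build the request URL, percent-encoding each path segment."""
    path = DEPENDENCIES_PATH.format(
        dag_id=quote(dag_id, safe=""),
        run_id=quote(run_id, safe=""),
        task_id=quote(task_id, safe=""),
    )
    return base_url.rstrip("/") + path


def parse_dependencies(payload: str) -> List[Tuple[str, str]]:
    """Extract (name, reason) pairs from the JSON body.

    ASSUMPTION: the body looks like
    {"dependencies": [{"name": "...", "reason": "..."}]}.
    """
    data = json.loads(payload)
    return [
        (dep.get("name", ""), dep.get("reason", ""))
        for dep in data.get("dependencies", [])
    ]


def fetch_dependencies(
    base_url: str, dag_id: str, run_id: str, task_id: str, user: str, password: str
) -> List[Tuple[str, str]]:
    """Call the endpoint with HTTP basic auth and return the parsed pairs."""
    request = urllib.request.Request(dependencies_url(base_url, dag_id, run_id, task_id))
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_dependencies(response.read().decode("utf-8"))
```

The returned pairs could then be written, with a timestamp, to whatever audit storage is in use.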

@eladkal
Contributor Author

eladkal commented May 20, 2024

> Recently I worked on this, and the information is available in the UI and API for tasks in the scheduled or None state. Perhaps the API could be used for export, and also enriched with additional checks that provide useful information.
>
> Ref : #38449

The UI part exposes failed-deps as is. It doesn't have a mechanism to export or store the information.
There is also the question of the export interval.
