New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set parallelism log messages to warning level for better visiblity #39298
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-0.9 for this change, reaching the parallelism limits is a normal and expected thing and we should not warn the user when it happens.
Why should we not warn the user? |
I think the question is why should we warn the user when the parallel limits are reached? |
To inform the user that a software limit has been reached. For example, unless you enable debug logging there's no way to tell from the logs that |
I think this is more sutaible for cluster activity dashboard to expose this in actionable manner. Logs are a bit problematic because what you get here is notification that parallesim is reach but you have no information if user set this specific value by design or not. If user chose the specific value then logging warnings is not very user freidnly and in fact not actionable. This can result in further user request to silent the messages. In my point of view If we can find a way to present the information in cluster activity dashboard that would be prefered. |
This is only true for users whose Airflow environment is somewhat optimized. There are a lot of configs to consider (parallelism, worker concurrency, pools) in order to limit task throughput -- surfacing more info to the user could only be helpful IMO.
Why not both? I personally spend a lot more time debugging Airflow logs than I do using the cluster activity dashboard. If parallelism is reached unexpectedly such that it causes a problem, I'd definitely find it faster in the logs. |
My 2c, I think info is appropriate. |
Exactly, in this case:
|
Yeah that's fair. Do you agree that at least an info log is reasonable? |
Okay I reverted a few logging statements so that now leaves an info log when @hussein-awala Regarding:
Disagree with this statement. Airflow has lots of knobs to tweak for parallelism which can be a blessing and a curse. You might have a perfect setup with OTEL integration, dashboards, and alerts, but I'm sure that knowledge and tooling wasn't developed overnight. My experience is that Airflow's default settings quite often limit the user before they start reaching hardware limits. This is a frustrating problem because as of today visibility (via UI/logs) on such limits is low. I'll investigate alternative options such as dashboards, but stating that it's "useless to warn the user" is not my experience since 9 out of 10 times I see people hitting a default limit instead of a user-defined limit. Sidenote: we currently have one warning-level log regarding pool limits, should we downgrade that to INFO level? https://github.com/apache/airflow/blob/main/airflow/jobs/scheduler_job_runner.py#L450 |
This PR raises the log level of several log messages related to parallelism to warning level. IMO it's currently not evident when Airflow reaches a parallelism limit so the purpose of this PR is to make it more obvious that a scalability limit is reached.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.