Releases: apache/airflow
Apache Airflow 2.9.2
Significant Changes
No significant changes.
Bug Fixes
- Fix bug that makes AirflowSecurityManagerV2 leave transactions in the idle in transaction state (#39935)
- Fix alembic auto-generation and rename mismatching constraints (#39032)
- Add the existing_nullable to the downgrade side of the migration (#39374)
- Fix Mark Instance state buttons stay disabled if user lacks permission (#37451). (#38732)
- Use SKIP LOCKED instead of NOWAIT in mini scheduler (#39745)
- Remove DAG Run Add option from FAB view (#39881)
- Add max_consecutive_failed_dag_runs in API spec (#39830)
- Fix example_branch_operator failing in python 3.12 (#39783)
- Fetch served logs also when task attempt is up for retry and no remote logs available (#39496)
- Change dataset URI validation to raise warning instead of error in Airflow 2.9 (#39670)
- Visible DAG RUN doesn't point to the same dag run id (#38365)
- Refactor SafeDogStatsdLogger to use get_validator to enable pattern matching (#39370)
- Fix custom actions in security manager has_access (#39421)
- Fix HTTP 500 Internal Server Error if DAG is triggered with bad params (#39409)
- Fix static file caching is disabled in Airflow Webserver. (#39345)
- Fix TaskHandlerWithCustomFormatter now adds prefix only once (#38502)
- Do not provide deprecated execution_date in @apply_lineage (#39327)
- Add missing conn_id to string representation of ObjectStoragePath (#39313)
- Fix sql_alchemy_engine_args config example (#38971)
- Add Cache-Control "no-store" to all dynamically generated content (#39550)
Miscellaneous
- Limit yandex provider to avoid mypy errors (#39990)
- Warn on mini scheduler failures instead of debug (#39760)
- Change type definition for provider_info_cache decorator (#39750)
- Better typing for BaseOperator defer (#39742)
- More typing in TimeSensor and TimeSensorAsync (#39696)
- Re-raise exception from strict dataset URI checks (#39719)
- Fix stacklevel for _log_state helper (#39596)
- Resolve SA warnings in migrations scripts (#39418)
- Remove unused index idx_last_scheduling_decision on dag_run table (#39275)
Doc Only Changes
- Provide extra tip on labeling DynamicTaskMapping (#39977)
- Improve visibility of links / variables / other configs in Configuration Reference (#39916)
- Remove 'legacy' definition for CronDataIntervalTimetable (#39780)
- Update plugins.rst examples to use pyproject.toml over setup.py (#39665)
- Fix nit in pg set-up doc (#39628)
- Add Matomo to Tracking User Activity docs (#39611)
- Fix Connection.get -> Connection.get_connection_from_secrets (#39560)
- Adding note for provider dependencies (#39512)
- Update docker-compose command (#39504)
- Update note about restarting triggerer process (#39436)
- Updating S3LogLink with an invalid bucket link (#39424)
- Update testing_packages.rst (#38996)
- Add multi-team diagrams (#38861)
Apache Airflow 2.9.1
Significant Changes
Stackdriver logging bugfix requires Google provider 10.17.0 or later (#38071)
If you use Stackdriver logging, you must use Google provider version 10.17.0 or later. Airflow 2.9.1 now passes gcp_log_name to the StackdriverTaskHandler instead of name, and this will fail on earlier provider versions.
This fixes a bug where the log name configured in [logging] remote_base_log_folder was overridden when Airflow configured logging, resulting in task logs going to the wrong destination.
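Before upgrading, it may help to confirm that the installed Google provider meets the 10.17.0 minimum. The following is a minimal sketch, not part of Airflow: the helper names parse_release and google_provider_ok are hypothetical, and the version parsing is deliberately simplified (a pre-release such as 10.17.0rc1 is treated like 10.17.0).

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile

MINIMUM_GOOGLE_PROVIDER = (10, 17, 0)  # minimum stated in the note above


def parse_release(text):
    """Turn '10.17.0' or '10.18.0rc1' into a 3-tuple of ints such as (10, 17, 0)."""
    numbers = []
    for piece in text.split(".")[:3]:
        # Keep only the leading digits of each component ("0rc1" -> "0").
        digits = "".join(takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def google_provider_ok(installed=None):
    """Return True if the Google provider version is at least 10.17.0.

    When `installed` is None, the version is looked up from the installed
    distribution metadata; a missing provider counts as "not ok".
    """
    if installed is None:
        try:
            installed = version("apache-airflow-providers-google")
        except PackageNotFoundError:
            return False
    return parse_release(installed) >= MINIMUM_GOOGLE_PROVIDER
```

Calling google_provider_ok() with no argument checks the current environment; passing a version string makes the check easy to try without the provider installed.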
Bug Fixes
- Make task log messages include run_id (#39280)
- Copy menu_item href for nav bar (#39282)
- Fix trigger kwarg encryption migration (#39246, #39361, #39374)
- Add workaround for datetime-local input in firefox (#39261)
- Add Grid button to Task Instance view (#39223)
- Get served logs when remote or executor logs not available for non-running task try (#39177)
- Fixed side effect of menu filtering causing disappearing menus (#39229)
- Use grid view for Task Instance's log_url (#39183)
- Improve task filtering UX (#39119)
- Improve rendered_template ux in react dag page (#39122)
- Graph view improvements (#38940)
- Check that the dataset<>task exists before trying to render graph (#39069)
- Hostname was "redacted", not "redact"; remove it when there is no context (#39037)
- Check whether AUTH_ROLE_PUBLIC is set in check_authentication (#39012)
- Move rendering of map_index_template so it renders for failed tasks as long as it was defined before the point of failure (#38902)
- Undeprecate BaseXCom.get_one method for now (#38991)
- Add inherit_cache attribute for CreateTableAs custom SA Clause (#38985)
- Don't wait for DagRun lock in mini scheduler (#38914)
- Fix calendar view with no DAG Run (#38964)
- Changed the background color of external task in graph (#38969)
- Fix dag run selection (#38941)
- Fix SAWarning 'Coercing Subquery object into a select() for use in IN()' (#38926)
- Fix implicit cartesian product in AirflowSecurityManagerV2 (#38913)
- Fix problem that links in legacy log view can not be clicked (#38882)
- Fix dag run link params (#38873)
- Use async db calls in WorkflowTrigger (#38689)
- Fix audit log events filter (#38719)
- Use methodtools.lru_cache instead of functools.lru_cache in class methods (#37757)
- Raise deprecated warning in airflow dags backfill only if -I / --ignore-first-depends-on-past provided (#38676)
Miscellaneous
- TriggerDagRunOperator deprecate execution_date in favor of logical_date (#39285)
- Force to use Airflow Deprecation warnings categories on @deprecated decorator (#39205)
- Add warning about run/import Airflow under the Windows (#39196)
- Update is_authorized_custom_view from auth manager to handle custom actions (#39167)
- Add in Trove classifiers Python 3.12 support (#39004)
- Use debug level for minischeduler skip (#38976)
- Bump undici from 5.28.3 to 5.28.4 in /airflow/www (#38751)
Doc Only Changes
- Fix supported k8s version in docs (#39172)
- Dynamic task mapping PythonOperator op_kwargs (#39242)
- Add link to user and role commands (#39224)
- Add k8s 1.29 to supported version in docs (#39168)
- Data aware scheduling docs edits (#38687)
- Update DagBag class docstring to include all params (#38814)
- Correcting an example taskflow example (#39015)
- Remove decorator from rendering fields example (#38827)
Apache Airflow 2.9.0
Significant Changes
The following Listener API methods are considered stable and can be used in production systems (they were an experimental feature in older Airflow versions) (#36376):
Lifecycle events:
- on_starting
- before_stopping
DagRun State Change Events:
- on_dag_run_running
- on_dag_run_success
- on_dag_run_failed
TaskInstance State Change Events:
- on_task_instance_running
- on_task_instance_success
- on_task_instance_failed
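As an illustration of how these hooks are typically implemented, here is a minimal sketch of a listener module. It assumes the hook argument names used by Airflow 2.9 (dag_run and msg for DagRun hooks; previous_state, task_instance and session for TaskInstance hooks). The failures list and the fallback decorator are illustrative only and are not part of Airflow; the fallback exists solely so the sketch can be imported without Airflow installed. Such a module would normally be registered through the listeners attribute of a plugin class.

```python
"""Hypothetical listener module (for example my_listeners.py)."""
import logging

try:
    from airflow.listeners import hookimpl
except ImportError:  # fallback so the sketch runs without Airflow installed
    def hookimpl(func):
        return func

log = logging.getLogger(__name__)

# Illustrative in-memory record of failures; not an Airflow feature.
failures = []


@hookimpl
def on_dag_run_failed(dag_run, msg):
    """Record and log a DagRun that moved to the failed state."""
    failures.append(("dag_run", dag_run.dag_id, dag_run.run_id))
    log.warning("DAG run %s/%s failed: %s", dag_run.dag_id, dag_run.run_id, msg)


@hookimpl
def on_task_instance_failed(previous_state, task_instance, session):
    """Record and log a TaskInstance that moved to the failed state."""
    failures.append(("task", task_instance.dag_id, task_instance.task_id))
    log.warning(
        "Task %s.%s failed (previous state: %s)",
        task_instance.dag_id,
        task_instance.task_id,
        previous_state,
    )
```

Listener hooks run inside Airflow components, so keeping them fast and side-effect-light (as here, logging only) is a sensible default.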
Support for Microsoft SQL-Server for Airflow Meta Database has been removed (#36514)
After discussion <https://lists.apache.org/thread/r06j306hldg03g2my1pd4nyjxg78b3h4>__ and a voting process <https://lists.apache.org/thread/pgcgmhf6560k8jbsmz8nlyoxosvltph2>__, the Airflow's PMC and Committers have reached a resolution to no longer maintain MsSQL as a supported Database Backend.
As of Airflow 2.9.0 support of MsSQL has been removed for Airflow Database Backend.
A migration script which can help migrating the database before upgrading to Airflow 2.9.0 is available in airflow-mssql-migration repo on Github <https://github.com/apache/airflow-mssql-migration>_.
Note that the migration script is provided without support and warranty.
This does not affect the existing provider packages (operators and hooks), DAGs can still access and process data from MsSQL.
Dataset URIs are now validated on input (#37005)
Datasets must use a URI that conforms to the rules laid down in AIP-60, and the value will be automatically normalized when the DAG file is parsed. See documentation on Datasets <https://airflow.apache.org/docs/apache-airflow/stable/authoring-and-scheduling/datasets.html>_ for a more detailed description of the rules.
You may need to change your Dataset identifiers if they look like a URI, but are used in a less mainstream way, such as relying on the URI's auth section, or have a case-sensitive protocol name.
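To find Dataset identifiers that may be affected, a rough pre-upgrade check along the following lines could help. This is a sketch under stated assumptions, not Airflow's validation logic: the function dataset_uri_concerns is hypothetical, and it only looks for the two cases named above (an auth section and a protocol name that is not lowercase).

```python
from urllib.parse import urlsplit


def dataset_uri_concerns(uri):
    """Return a list of reasons why a Dataset identifier may need attention."""
    concerns = []
    scheme, separator, _ = uri.partition("://")
    if not separator:
        # Does not look like a URI with an authority part; nothing to flag here.
        return concerns
    if scheme != scheme.lower():
        concerns.append("scheme is not lowercase")
    parts = urlsplit(uri)
    if parts.username is not None or parts.password is not None:
        concerns.append("URI contains an auth section")
    return concerns
```

For example, dataset_uri_concerns("S3://user:pw@bucket/key") reports both concerns, while "s3://bucket/key" reports none.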
The method get_permitted_menu_items in BaseAuthManager has been renamed filter_permitted_menu_items (#37627)
Add REST API actions to Audit Log events (#37734)
The Audit Log event name for REST API events will be prepended with api. or ui., depending on whether it came from the Airflow UI or externally.
Official support for Python 3.12 (#38025)
There are a few caveats though:
- Pendulum2 does not support Python 3.12. For Python 3.12 you need to use Pendulum 3 <https://pendulum.eustace.io/blog/announcing-pendulum-3-0-0.html>_
- Minimum SQLAlchemy version supported when Pandas is installed for Python 3.12 is 1.4.36 released in April 2022. Airflow 2.9.0 increases the minimum supported version of SQLAlchemy to 1.4.36 for all Python versions.
Not all Providers support Python 3.12. At the initial release of Airflow 2.9.0 the following providers are released without support for Python 3.12:
- apache.beam - pending on Apache Beam support for 3.12 <https://github.com/apache/beam/issues/29149>_
- papermill - pending on Releasing Python 3.12 compatible papermill client version including this merged issue <https://github.com/nteract/papermill/pull/771>_
Prevent large string objects from being stored in the Rendered Template Fields (#38094)
There's now a limit to the length of data that can be stored in the Rendered Template Fields.
The limit is set to 4096 characters. If the data exceeds this limit, it will be truncated. You can change this limit by setting the [core]max_template_field_length configuration option in your airflow config.
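The effect of the limit can be pictured with a small sketch. It assumes the standard AIRFLOW__{SECTION}__{KEY} environment variable convention for overriding the option; the helper names are hypothetical, and the truncation below only cuts the text, whereas the value Airflow actually stores may differ in its details.

```python
import os

DEFAULT_LIMIT = 4096  # default stated in the note above


def template_field_limit(environ=os.environ):
    """Read the limit from the environment, falling back to the default."""
    return int(environ.get("AIRFLOW__CORE__MAX_TEMPLATE_FIELD_LENGTH", DEFAULT_LIMIT))


def truncate_rendered(value, limit):
    """Illustrative truncation: keep at most `limit` characters of the value."""
    text = str(value)
    return text if len(text) <= limit else text[:limit]
```

A rendered field of 5000 characters would be cut to 4096 with the default, or kept whole if the limit were raised to 8192.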
Change xcom table column value type to longblob for MySQL backend (#38401)
Xcom table column value type has changed from blob to longblob. This will allow you to store relatively big data in Xcom, but the process can take a significant amount of time if you have a lot of large data stored in Xcom.
To downgrade from revision: b4078ac230a1, ensure that you don't have Xcom values larger than 65,535 bytes. Otherwise, you'll need to clean those rows or run airflow db clean xcom to clean the Xcom table.
New Features
- Allow users to write dag_id and task_id in their national characters, added display name for dag / task (v2) (#38446)
- Prevent large objects from being stored in the RTIF (#38094)
- Use current time to calculate duration when end date is not present. (#38375)
- Add average duration mark line in task and dagrun duration charts. (#38214, #38434)
- Add button to manually create dataset events (#38305)
- Add Matomo as an option for analytics_tool. (#38221)
- Experimental: Support custom weight_rule implementation to calculate the TI priority_weight (#38222)
- Adding ability to automatically set DAG to off after X times it failed sequentially (#36935)
- Add dataset conditions to next run datasets modal (#38123)
- Add task log grouping to UI (#38021)
- Add dataset_expression to grid dag details (#38121)
- Introduce mechanism to support multiple executor configuration (#37635)
- Add color formatting for ANSI chars in logs from task executions (#37985)
- Add the dataset_expression as part of DagModel and DAGDetailSchema (#37826)
- Add TaskFail entries to Gantt chart (#37918)
- Allow longer rendered_map_index (#37798)
- Inherit the run_ordering from DatasetTriggeredTimetable for DatasetOrTimeSchedule (#37775)
- Implement AIP-60 Dataset URI formats (#37005)
- Introducing Logical Operators for dataset conditional logic (#37101)
- Add post endpoint for dataset events (#37570)
- Show custom instance names for a mapped task in UI (#36797)
- Add excluded/included events to get_event_logs api (#37641)
- Add datasets to dag graph (#37604)
- Show dataset events above task/run details in grid view (#37603)
- Introduce new config variable to control whether DAG processor outputs to stdout (#37439)
- Make Datasets hashable (#37465)
- Add conditional logic for dataset triggering (#37016)
- Implement task duration page in react. (#35863)
- Add queuedEvent endpoint to get/delete DatasetDagRunQueue (#37176)
- Support multiple XCom output in the BaseOperator (#37297)
- AIP-58: Add object storage backend for xcom (#37058)
- Introduce DatasetOrTimeSchedule (#36710)
- Add on_skipped_callback to BaseOperator (#36374)
- Allow override of hovered navbar colors (#36631)
- Create new Metrics with Tagging (#36528)
- Add support for openlineage to AFS and common.io (#36410)
- Introduce @task.bash TaskFlow decorator (#30176, #37875)
- Added functionality to automatically ingest custom airflow.cfg file upon startup (#36289)
Improvements
- More human friendly "show tables" output for db cleanup (#38654)
- Improve trigger assign_unassigned by merging alive_triggerer_ids and get_sorted_triggers queries (#38664)
- Add exclude/include events filters to audit log (#38506)
- Clean up unused triggers in a single query for all dialects except MySQL (#38663)
- Update Confirmation Logic for Config Changes on Sensitive Environments Like Production (#38299)
- Improve datasets graph UX (#38476)
- Only show latest dataset event timestamp after last run (#38340)
- Add button to clear only failed tasks in a dagrun. (#38217)
- Delete all old dag pages and redirect to grid view (#37988)
- Check task attribute before use in sentry.add_tagging() (#37143)
- Mysql change xcom value col type for MySQL backend (#38401)
- ExternalPythonOperator use version from sys.version_info (#38377)
- Replace too broad exceptions into the Core (#38344)
- Add CLI support for bulk pause and resume of DAGs (#38265)
- Implement methods on TaskInstancePydantic and DagRunPydantic (#38295, #38302, #38303, #38297)
- Made filters bar collapsible and add a full screen toggle (#38296)
- Encrypt all trigger attributes (#38233, #38358, #38743)
- Upgrade react-table package. Use with Audit Log table (#38092)
- Show if dag page filters are active (#38080)
- Add try number to mapped instance (#38097)
- Add retries to job heartbeat (#37541)
- Add REST API events to Audit Log (#37734)
- Make current working directory as templated field in BashOperator (#37968)
- Add calendar view to react (#37909)
- Add run_id column to log table (#37731)
- Add tryNumber to grid task instance tooltip (#37911)
- Session is not used in _do_render_template_fields (#37856)
- Improve MappedOperator property types (#37870)
- Remove provide_session decorator from TaskInstancePydantic methods (#37853)
- Ensure the "airflow.task" logger used for TaskInstancePydantic and TaskInstance (#37857)
- Better error message for internal api call error (#37852)
- Increase tooltip size of dag grid view (#37782) (#37805)
- Use named loggers instead of root logger (#37801)
- Add Run Duration in React (#37735)
- Avoid non-recommended usage of logging (#37792)
- Improve DateTimeTrigger typing (#37694)
- Make sure all unique run_ids render a task duration bar (#37717)
- Add Dag Audit Log to React (#37682)
- Add log event for auto pause (#38243)
- Better message for exception for templated base operator fields (#37668)
- Clean up webserver endpoints adding to audit log (#37580)
- Filter datasets graph by dag_id (#37464)
- Use new exception type inheriting BaseException for SIGTERMs (#37613)
- Refactor dataset class inheritance (#37590)
- Simplify checks for package versions (#37585)
- Filter Datasets by associated dag_ids (GET /datasets) (#37512)
- Enable "airflow tasks test" to run deferrable operator (#37542)
- Make datasets list/graph width adjustable (#37425)
- Speedup determine installed airflow version in ExternalPythonOperator (#37409)
- Add more task details from rest api (#37394)
- Add confirmation dialog box for DAG run actions (#35393)
- Added shutdown color to the STATE_COLORS (#37295)
- Remove legacy dag details page and redirect to grid (#37232)
- Order XCom entries by map index in API (#37086...
Apache Airflow 2.8.4
Significant Changes
No significant changes.
Bug Fixes
- Fix incorrect serialization of FixedTimezone (#38139)
- Fix excessive permission changing for log task handler (#38164)
- Fix task instances list link (#38096)
- Fix a bug where scheduler heartrate parameter was not used (#37992)
- Add padding to prevent grid horizontal scroll overlapping tasks (#37942)
- Fix hash caching in ObjectStoragePath (#37769)
Miscellaneous
- Limit importlib_resources as it breaks pytest_rewrites (#38095, #38139)
- Limit pandas to <2.2 (#37748)
- Bump croniter to fix an issue with 29 Feb cron expressions (#38198)
Doc Only Changes
Apache Airflow Helm Chart 1.13.1
Significant Changes
Default Airflow image is updated to 2.8.3 (#38036)
The default Airflow image that is used with the Chart is now 2.8.3, previously it was 2.8.2.
Bug Fixes
- Don't overwrite .Values.airflowPodAnnotations (#37917)
- Fix cluster-wide RBAC naming clash when using multiple multiNamespace releases with the same name (#37197)
Misc
- Chart: Default airflow version to 2.8.3 (#38036)
Apache Airflow 2.8.3
Significant Changes
The smtp provider is now pre-installed when you install Airflow. (#37713)
Bug Fixes
- Add "MENU" permission in auth manager (#37881)
- Fix external_executor_id being overwritten (#37784)
- Make more MappedOperator members modifiable (#37828)
- Set parsing context dag_id in dag test command (#37606)
Miscellaneous
- Remove useless methods from security manager (#37889)
- Improve code coverage for TriggerRuleDep (#37680)
- The SMTP provider is now preinstalled when installing Airflow (#37713)
- Bump min versions of openapi validators (#37691)
- Properly include airflow_pre_installed_providers.txt artifact (#37679)
Doc Only Changes
- Clarify lack of sync between workers and scheduler (#37913)
- Simplify some docs around airflow_local_settings (#37835)
- Add section about local settings configuration (#37829)
- Fix docs of BranchDayOfWeekOperator (#37813)
- Write to secrets store is not supported by design (#37814)
- ERD generating doc improvement (#37808)
- Update incorrect config value (#37706)
- Update security model to clarify Connection Editing user's capabilities (#37688)
- Fix ImportError on examples dags (#37571)
Apache Airflow Helm Chart 1.13.0
Significant Changes
Default Airflow image is updated to 2.8.2 (#37704)
The default Airflow image that is used with the Chart is now 2.8.2, previously it was 2.8.1.
New Features
- Support labels specific to the database migration objects and pods (#37490)
Improvements
- Flower K8s Probe config (#37528)
Bug Fixes
- Remove duplicate ports key in webserver service (#37356)
- Add AIRFLOW_HOME env var to log groomer sidecar (#37588)
- Skip . path when preparing reproducible packages (#37402)
Misc
- Default airflow version to 2.8.2 (#37704)
Apache Airflow 2.8.2
Significant Changes
The allowed_deserialization_classes flag now follows a glob pattern (#36147).
For example if one wants to add the class airflow.tests.custom_class to the allowed_deserialization_classes list, it can be done by writing the full class name (airflow.tests.custom_class) or a pattern such as the ones used in glob search (e.g., airflow.*, airflow.tests.*).
If you currently use a custom regexp path make sure to rewrite it as a glob pattern.
Alternatively, if you still wish to match it as a regexp pattern, add it under the new list allowed_deserialization_classes_regexp instead.
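The difference between the two matching styles can be sketched with the standard library. This is an illustration only: it uses fnmatch as a stand-in for glob matching and does not reproduce Airflow's own implementation, and both helper functions are hypothetical.

```python
import re
from fnmatch import fnmatchcase


def allowed_by_glob(classname, patterns):
    """True if any glob-style pattern matches the full class name."""
    return any(fnmatchcase(classname, pattern) for pattern in patterns)


def allowed_by_regexp(classname, patterns):
    """True if any regular expression matches from the start of the class name."""
    return any(re.match(pattern, classname) for pattern in patterns)


name = "airflow.tests.custom_class"

# Glob patterns such as the ones in the note above match as expected.
glob_hit = allowed_by_glob(name, ["airflow.tests.*"])

# A value written as a regexp stops matching once it is read as a glob,
# because the backslash is then treated as a literal character.
old_regexp = r"airflow\..*"
regexp_hit = allowed_by_regexp(name, [old_regexp])
regexp_read_as_glob = allowed_by_glob(name, [old_regexp])
```

Here glob_hit and regexp_hit are true while regexp_read_as_glob is false, which is why an existing regexp value either needs rewriting as a glob or moving to the regexp list.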
The audit_logs permissions have been updated for heightened security (#37501).
This was done under the policy that we do not want users like Viewer, Ops,
and other users apart from Admin to have access to audit_logs. The intention behind
this change is to restrict users with less permissions from viewing user details
like First Name, Email etc. from the audit_logs when they are not permitted to.
The impact of this change is that the existing users with non admin rights won't be able
to view or access the audit_logs, both from the Browse tab or from the DAG run.
AirflowTimeoutError is no longer caught by default through except Exception (#35653).
The AirflowTimeoutError is now inheriting BaseException instead of AirflowException -> Exception.
See https://docs.python.org/3/library/exceptions.html#exception-hierarchy
This prevents code catching Exception from accidentally catching AirflowTimeoutError and continuing to run. AirflowTimeoutError is an explicit intent to cancel the task, and should not be caught in attempts to handle the error and return some default value.
Catching AirflowTimeoutError is still possible by explicitly excepting AirflowTimeoutError or BaseException. This is discouraged, as it may allow the code to continue running even after such cancellation requests.
Code that previously depended on performing strict cleanup in every situation after catching Exception is advised to use finally blocks or context managers, which perform only the cleanup and then automatically re-raise the exception.
See similar considerations about catching KeyboardInterrupt in https://docs.python.org/3/library/exceptions.html#KeyboardInterrupt
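The behaviour can be demonstrated without Airflow. In this sketch TaskTimeout is a hypothetical stand-in for a timeout exception that derives from BaseException; it is not the Airflow class itself.

```python
class TaskTimeout(BaseException):
    """Stand-in for a timeout exception that derives from BaseException."""


def timed_out():
    """Simulate work that is cancelled by a timeout."""
    raise TaskTimeout("task exceeded its execution timeout")


def run_with_cleanup(work, events):
    """Run `work`, handling ordinary errors but always performing cleanup."""
    try:
        work()
    except Exception:
        # Ordinary errors are handled here; TaskTimeout is NOT caught,
        # because it does not inherit from Exception.
        events.append("handled")
        return "default"
    finally:
        # Cleanup runs in every case, then the timeout keeps propagating.
        events.append("cleanup")
    return "ok"


events = []
try:
    run_with_cleanup(timed_out, events)
except TaskTimeout:
    events.append("propagated")
```

After this runs, events records the cleanup followed by the propagation, and the "handled" branch is never reached.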
Bug Fixes
- Sort dag processing stats by last_runtime (#37302)
- Allow pre-population of trigger form values via URL parameters (#37497)
- Base date for fetching dag grid view must include selected run_id (#34887)
- Check permissions for ImportError (#37468)
- Move IMPORT_ERROR from DAG related permissions to view related permissions (#37292)
- Change AirflowTaskTimeout to inherit BaseException (#35653)
- Revert "Fix future DagRun rarely triggered by race conditions when max_active_runs reached its upper limit. (#31414)" (#37596)
- Change margin to padding so first task can be selected (#37527)
- Fix Airflow serialization for namedtuple (#37168)
- Fix bug with clicking url-unsafe tags (#37395)
- Set deterministic and new getter for Treeview function (#37162)
- Fix permissions of parent folders for log file handler (#37310)
- Fix permission check on DAGs when access_entity is specified (#37290)
- Fix the value of dateTimeAttrFormat constant (#37285)
- Resolve handler close race condition at triggerer shutdown (#37206)
- Fixing status icon alignment for various views (#36804)
- Remove superfluous @Sentry.enrich_errors (#37002)
- Use execution_date= param as a backup to base date for grid view (#37018)
- Handle SystemExit raised in the task. (#36986)
- Revoking audit_log permission from all users except admin (#37501)
- Fix broken regex for allowed_deserialization_classes (#36147)
- Fix the bug that affected the DAG end date. (#36144)
- Adjust node width based on task name length (#37254)
- fix: PythonVirtualenvOperator crashes if any python_callable function is defined in the same source as DAG (#37165)
- Fix collapsed grid width, line up selected bar with gantt (#37205)
- Adjust graph node layout (#37207)
- Revert the sequence of initializing configuration defaults (#37155)
- Displaying "actual" try number in TaskInstance view (#34635)
- Bugfix Triggering DAG with parameters is mandatory when show_trigger_form_if_no_params is enabled (#37063)
- Secret masker ignores passwords with special chars (#36692)
- Fix DagRuns with UPSTREAM_FAILED tasks get stuck in the backfill. (#36954)
- Disable dryrun auto-fetch (#36941)
- Fix copy button on a DAG run's config (#36855)
- Fix bug introduced by replacing spaces by + in run_id (#36877)
- Fix webserver always redirecting to home page if user was not logged in (#36833)
- REST API set description on POST to /variables endpoint (#36820)
- Sanitize the conn_id to disallow potential script execution (#32867)
- Fix task id copy button copying wrong id (#34904)
- Fix security manager inheritance in fab provider (#36538)
- Avoid pendulum.from_timestamp usage (#37160)
Miscellaneous
- Install latest docker CLI instead of specific one (#37651)
- Bump undici from 5.26.3 to 5.28.3 in /airflow/www (#37493)
- Add Python 3.12 exclusions in providers/pyproject.toml (#37404)
- Remove markdown from core dependencies (#37396)
- Remove unused pageSize method. (#37319)
- Add more-itertools as dependency of common-sql (#37359)
- Replace other Python 3.11 and 3.12 deprecations (#37478)
- Include airflow_pre_installed_providers.txt into sdist distribution (#37388)
- Turn Pydantic into an optional dependency (#37320)
- Limit universal-pathlib to < 0.2.0 (#37311)
- Allow running airflow against sqlite in-memory DB for tests (#37144)
- Add description to queue_when (#36997)
- Updated config.yml for environment variable sql_alchemy_connect_args (#36526)
- Bump min version of Alembic to 1.13.1 (#36928)
- Limit flask-session to <0.6 (#36895)
Doc Only Changes
- Fix upgrade docs to reflect true CLI flags available (#37231)
- Fix a bug in fundamentals doc (#37440)
- Add redirect for deprecated page (#37384)
- Fix the otel config descriptions (#37229)
- Update Objectstore tutorial with prereqs section (#36983)
- Add more precise description on avoiding generic package/module names (#36927)
- Add airflow version substitution into Docker Compose Howto (#37177)
- Add clarification about DAG author capabilities to security model (#37141)
- Move docs for cron basics to Authoring and Scheduling section (#37049)
- Link to release notes in the upgrade docs (#36923)
- Prevent templated field logic checks in __init__ of operators automatically (#33786)
Apache Airflow Helm Chart 1.12.0
Significant Changes
The helm chart is now using a newer version of bitnami/postgresql dependency (#34817)
The version of bitnami/postgresql subchart upgraded from 12.10.0 to 13.2.24.
The version of PostgreSQL binaries upgraded from 11 to 16.1.0.
The change requires existing bitnami/postgresql subchart users to perform manual major version upgrade using pg_dumpall or pg_upgrade.
As a reminder, it is recommended to set up an external database <https://airflow.apache.org/docs/helm-chart/stable/production-guide.html#database>_ in production.
Default Airflow image is updated to 2.8.1 (#36907)
The default Airflow image that is used with the Chart is now 2.8.1, previously it was 2.7.1.
Default PgBouncer and PgBouncer Exporter images have been updated (#36898)
The PgBouncer and PgBouncer Exporter images are based on newer software/os.
- pgbouncer: 1.21.0 based on alpine 3.14 (airflow-pgbouncer-2024.01.19-1.21.0)
- pgbouncer-exporter: 0.16.0 based on alpine 3.19 (apache/airflow:airflow-pgbouncer-exporter-2024.01.19-0.16.0)
Default StatsD image is updated to v0.26.0 (#37187)
The default StatsD image that is used with the Chart is now v0.26.0, previously it was v0.22.8.
Default Redis image is updated to 7-bookworm (#37187)
The default Redis image that is used with the Chart is now 7-bookworm, previously it was 7-bullseye.
New Features
- Enable native HPA for Airflow Workers (#36174)
- Add init container + sidecar support for Airflow Kerberos (#35548)
- Support MySQL backend as KEDA trigger (#36167)
Improvements
- Improve PriorityClass to improve debuggability (#36365)
- Add securityContexts in dag processors log groomer sidecar (#34499)
- Add support for securityContexts in dag processors wait-for-migrations container (#35593)
- Add templating for PVC storageClassName (#35581)
- Add volumeClaimTemplate for worker (#34986)
- Add support for priorityClassName on Redis pods (#34879)
- Configurable mount path for DAGs volume (#35083)
- Add support for custom emptyDir config (#34837)
- Added ability to enable/disable scheduler and webserver (#36991)
Bug Fixes
- Fix StatsD host in Airflow config (#35679)
- Set AIRFLOW_HOME env var with airflowHome value (#34839)
- Safer worker pod annotations (#35309)
- Set worker safeToEvict properly (#35130)
- Fix Redis broker URL with useStandardNaming (#34825)
- Fix metadata DB & port in KEDA connection when usePgbouncer is false (#34741)
- Fix PgBouncer connection with useStandardNaming (#34787)
Doc only changes
- Add docs about extending the Airflow Helm chart (#36331)
- Add comment for Elasticsearch connection scheme (#35588)
- Add notes about Virtualenvs preventing the need for custom images (#35306)
Misc
- Default Airflow version to 2.8.1 (#36907)
- Support git-sync v4 (#34731)
- Upgrade bitnami/postgresql subchart to 13.2.24 (#36156)
- Change git sync container indent to 4 (#35824)
- Remove K8S 1.24 support (#35214)
- Rebuild pgbouncer and pgbouncer-exporter images with newer versions (#36898)
- Update statsd and redis chart images (#37187)
Apache Airflow 2.8.1
Significant Changes
Target version for core dependency pendulum package set to 3 (#36281).
Support for pendulum 2.1.2 will be kept for a while, presumably until the next feature version of Airflow.
It is advised to upgrade user code to use pendulum 3 as soon as possible.
Airflow packaging specification follows modern Python packaging standards (#36537).
We standardized Airflow dependency configuration to follow the latest developments in Python packaging by using pyproject.toml. Airflow is now compliant with these accepted PEPs:
- PEP-440 Version Identification and Dependency Specification <https://www.python.org/dev/peps/pep-0440/>__
- PEP-517 A build-system independent format for source trees <https://www.python.org/dev/peps/pep-0517/>__
- PEP-518 Specifying Minimum Build System Requirements for Python Projects <https://www.python.org/dev/peps/pep-0518/>__
- PEP-561 Distributing and Packaging Type Information <https://www.python.org/dev/peps/pep-0561/>__
- PEP-621 Storing project metadata in pyproject.toml <https://www.python.org/dev/peps/pep-0621/>__
- PEP-660 Editable installs for pyproject.toml based builds (wheel based) <https://www.python.org/dev/peps/pep-0660/>__
- PEP-685 Comparison of extra names for optional distribution dependencies <https://www.python.org/dev/peps/pep-0685/>__
We also implement multiple license files support coming from a draft, not yet accepted (but supported by hatchling) PEP:
- PEP 639 Improving License Clarity with Better Package Metadata <https://peps.python.org/pep-0639/>__
This has almost no noticeable impact on users if they are using modern Python packaging and development tools. Generally speaking, Airflow should behave as it did before when installing it from PyPI, and it should be much easier to install it for development purposes using pip install -e ".[devel]".
The differences from the user side are:
- Airflow extras are now normalized to "-" (following PEP-685) instead of "_" and "." (as it was before in some extras). When you install airflow with such extras (for example dbt.core or all_dbs) you should use "-" instead of "_" and ".". In most modern tools this will work in a backwards-compatible way, but in some old versions of those tools you might need to replace "_" and "." with "-". You can also get warnings that the extra you are installing does not exist, but usually this warning is harmless and the extra is installed anyway. It is, however, recommended to change to use "-" in extras in your dependency specifications for all Airflow extras.
- Released airflow package does not contain devel, devel-*, doc and doc-gen extras. Those extras are only available when you install Airflow from sources in --editable mode. This is because those extras are only used for development and documentation building purposes and are not needed when you install Airflow for production use. Those dependencies had unspecified and varying behaviour for released packages anyway and you were not supposed to use them in released packages.
- The all and all-* extras were not always working correctly when installing Airflow using constraints because they were also considered as development-only dependencies. With this change, those dependencies are now properly handling constraints and they will install properly with constraints, pulling the right set of providers and dependencies when constraints are used.
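The extras normalization described in the first item above follows the rule given in PEP 685: runs of "-", "_" and "." are replaced by a single "-" and the name is lowercased. The following sketch applies that rule; normalize_requirement is a hypothetical convenience helper for simple requirement strings and is not a full requirement parser.

```python
import re


def normalize_extra(name):
    """Normalize an extra name as described in PEP 685."""
    return re.sub(r"[-_.]+", "-", name).lower()


def normalize_requirement(requirement):
    """Normalize the extras in a string such as 'apache-airflow[dbt.core]==2.8.1'.

    Strings without an extras section are returned unchanged.
    """
    match = re.fullmatch(r"([^\[\]]+)\[([^\[\]]*)\](.*)", requirement)
    if match is None:
        return requirement
    package, extras, rest = match.groups()
    normalized = ",".join(
        normalize_extra(extra.strip()) for extra in extras.split(",") if extra.strip()
    )
    return f"{package}[{normalized}]{rest}"
```

With this, dbt.core becomes dbt-core and all_dbs becomes all-dbs, which are the spellings to prefer in dependency specifications.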
Graphviz dependency is now an optional one, not a required one (#36647).
The `graphviz` dependency has been problematic as a required Airflow dependency, especially for ARM-based installations. Graphviz packages require binary graphviz libraries, which is already a limitation, but they also require the graphviz Python bindings to be built and installed. This does not work for older Linux installations but, more importantly, when you try to install Graphviz libraries for Python 3.8 or 3.9 on ARM M1 MacBooks, the packages fail to install because Python bindings compilation for M1 can only work for Python 3.10+.
This is not technically a breaking change: the CLI commands to render DAGs are still there, and if you already have graphviz installed, they will continue working as before. The only case where rendering does not work is when graphviz is not installed; Airflow then raises an error and informs you that you need it.
Graphviz will remain installed for most users:
- the Airflow image will still contain the graphviz library, because it is added there as an extra
- when a previous version of Airflow has already been installed, the graphviz library is already installed there and Airflow will continue working as it did
The only change affects a fresh installation of a new version of Airflow from scratch, where graphviz needs to be specified as an extra or installed separately in order to enable the DAG rendering option.
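Since graphviz is now optional, a deployment script may want to check for it before relying on DAG rendering. The following is a minimal sketch, not Airflow's own check: the function names are hypothetical, and it only detects the `graphviz` Python package, not the binary graphviz libraries that the package also needs.

```python
import importlib.util


def graphviz_available() -> bool:
    """Return True if the `graphviz` Python package is importable.

    Note: this does not verify that the binary graphviz libraries
    (for example the `dot` executable) are installed on the system.
    """
    return importlib.util.find_spec("graphviz") is not None


def rendering_hint() -> str:
    """Return a short, human-readable status message (hypothetical helper)."""
    if graphviz_available():
        return "graphviz package found: DAG rendering should be available"
    return "graphviz package missing: install it separately or as an Airflow extra"


if __name__ == "__main__":
    print(rendering_hint())
```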
Bug Fixes
- Fix airflow-scheduler exiting with code 0 on exceptions (#36800)
- Fix Callback exception when a removed task is the last one in the `taskinstance` list (#36693)
- Allow anonymous user edit/show resource when set `AUTH_ROLE_PUBLIC=admin` (#36750)
- Better error message when sqlite URL uses relative path (#36774)
- Explicit string cast required to force integer-type run_ids to be passed as strings instead of integers (#36756)
- Add log lookup exception for empty `op` subtypes (#35536)
- Remove unused index on task instance (#36737)
- Fix check on subclass for `typing.Union` in `_infer_multiple_outputs` for Python 3.10+ (#36728)
- Make sure `multiple_outputs` is inferred correctly even when using `TypedDict` (#36652)
- Add back FAB constant in legacy security manager (#36719)
- Fix AttributeError when using `Dagrun.update_state` (#36712)
- Do not let `EventsTimetable` schedule past events if `catchup=False` (#36134)
- Support encryption for triggers parameters (#36492)
- Fix the type hint for `tis_query` in `_process_executor_events` (#36655)
- Redirect to index when user does not have permission to access a page (#36623)
- Avoid using dict as default value in `call_regular_interval` (#36608)
- Remove option to set a task instance to running state in UI (#36518)
- Fix details tab not showing when using dynamic task mapping (#36522)
- Raise error when `DagRun` fails while running `dag test` (#36517)
- Refactor `_manage_executor_state` by refreshing TIs in batch (#36502)
- Add flask config: `MAX_CONTENT_LENGTH` (#36401)
- Fix get_leaves calculation for teardown in nested group (#36456)
- Stop serializing timezone-naive datetime to timezone-aware datetime with UTC tz (#36379)
- Make `kubernetes` decorator type annotation consistent with operator (#36405)
- Fix Webserver returning 500 for POST requests to `api/dag/*/dagrun` from anonymous user (#36275)
- Fix the required access for get_variable endpoint (#36396)
- Fix datetime reference in `DAG.is_fixed_time_schedule` (#36370)
- Fix AirflowSkipException message raised by BashOperator (#36354)
- Allow PythonVirtualenvOperator.skip_on_exit_code to be zero (#36361)
- Increase width of execution_date input in trigger.html (#36278)
- Fix logging for pausing DAG (#36182)
- Stop deserializing pickle when enable_xcom_pickling is False (#36255)
- Check DAG read permission before accessing DAG code (#36257)
- Enable mark task as failed/success always (#36254)
- Create latest log dir symlink as relative link (#36019)
- Fix Python-based decorators templating (#36103)
Miscellaneous
- Rename concurrency label to max active tasks (#36691)
- Restore function scoped `httpx` import in file_task_handler for performance (#36753)
- Add support of Pendulum 3 (#36281)
- Standardize airflow build process and switch to Hatchling build backend (#36537)
- Get rid of `pyarrow-hotfix` for `CVE-2023-47248` (#36697)
- Make `graphviz` dependency optional (#36647)
- Announce MSSQL support end in Airflow 2.9.0, add migration script hints (#36509)
- Set min `pandas` dependency to 1.2.5 for all providers and airflow (#36698)
- Bump follow-redirects from 1.15.3 to 1.15.4 in `/airflow/www` (#36700)
- Provide the logger_name param to base hook in order to override the logger name (#36674)
- Fix run type icon alignment with run type text (#36616)
- Follow BaseHook connection fields method signature in FSHook (#36444)
- Remove redundant `docker` decorator type annotations (#36406)
- Straighten typing in workday timetable (#36296)
- Use `batch_is_authorized_dag` to check if user has permission to read DAGs (#36279)
- Replace deprecated get_accessible_dag_ids and use get_readable_dags in get_dag_warnings (#36256)
Doc Only Changes
- Metrics tagging documentation (#36627)
- In docs use logical_date instead of deprecated execution_date (#36654)
- Add section about live-upgrading Airflow (#36637)
- Replace `numpy` example with practical exercise demonstrating top-level code (#35097)
- Improve and add more complete description in the architecture diagrams (#36513)
- Improve the error message displayed when there is a webserver error (#36570)
- Update `dags.rst` with information on DAG pausing (#36540)
- Update installation prerequisites after upgrading to Debian Bookworm (#36521)
- Add description on the ways how users should approach DB monitoring (#36483)
- Add branching based on mapped task group example to dynamic-task-mapping.rst (#36480)
- Add further details to replacement documentation (#36485)
- Use cards when describing priority weighting methods (#36411)
- Update `metrics.rst` for param `dagrun.schedule_delay` (#36404)
- Update admonitions in Python operator doc to reflect sentiment (#36340)
- Improve audit_logs.rst (#36213)
- Remove Redshift mention from the list of managed Postgres backends (#36217)