Releases: apache/airflow

Apache Airflow 2.1.2

14 Jul 20:14
d25854d

Bug Fixes

  • Only allow the webserver to request from the worker log server (#16754)
  • Fix "Invalid JSON configuration, must be a dict" bug (#16648)
  • Fix CeleryKubernetesExecutor (#16700)
  • Mask value if the key is token (#16474)
  • Fix impersonation issue with LocalTaskJob (#16852)

Misc

  • Add Python 3.9 support (#15515)

Apache Airflow 2.1.1

02 Jul 20:12
2.1.1

Bug Fixes

  • Don't crash attempting to mask secrets in dict with non-string keys (#16601)
  • Always install sphinx_airflow_theme from PyPI (#16594)
  • Remove limitation for elasticsearch library (#16553)
  • Adding extra requirements for build and runtime of the PROD image. (#16170)
  • Cattrs 1.7.0, released at the end of May 2021, breaks lineage usage (#16173)
  • Removes unnecessary packages from setup_requires (#16139)
  • Pins docutils to <0.17 until breaking behaviour is fixed (#16133)
  • Improvements for Docker Image docs (#14843)
  • Ensure that dag_run.conf is a dict (#15057)
  • Fix CLI connections import and migrate logic from secrets to Connection model (#15425)
  • Fix Dag Details start date bug (#16206)
  • Fix DAG run state not updated while DAG is paused (#16343)
  • Allow null value for operator field in task_instance schema (REST API) (#16516)
  • Avoid recursion going too deep when redacting logs (#16491)
  • Backfill: Don't create a DagRun if no tasks match task regex (#16461)
  • Tree View UI for larger DAGs & more consistent spacing in Tree View (#16522)
  • Correctly handle None returns from Query.scalar() (#16345)
  • Adding only_active parameter to /dags endpoint (#14306)
  • Don't show stale Serialized DAGs if they are deleted in DB (#16368)
  • Make REST API List DAGs endpoint consistent with UI/CLI behaviour (#16318)
  • Support remote logging in elasticsearch with filebeat 7 (#14625)
  • Queue tasks with higher priority and earlier execution_date first. (#15210)
  • Make task ID on legend have enough width and width of line chart to be 100%. (#15915)
  • Fix normalize-url vulnerability (#16375)
  • Validate retries value on init for better errors (#16415)
  • Add num_runs query param for tree refresh (#16437)
  • Fix templated default/example values in config ref docs (#16442)
  • Add passphrase and private_key to default sensitive field names (#16392)
  • Fix tasks in an infinite slots pool were never scheduled (#15247)
  • Fix Orphaned tasks stuck in CeleryExecutor as running (#16550)
  • Don't fail to log if we can't redact something (#16118)
  • Set max tree width to 1200 pixels (#16067)
  • Fill the "job_id" field for airflow task run without --local/--raw for KubeExecutor (#16108)
  • Fixes problem where conf variable was used before initialization (#16088)
  • Fix apply defaults for task decorator (#16085)
  • Parse recently modified files even if just parsed (#16075)
  • Ensure that we don't try to mask empty string in logs (#16057)
  • Don't die when masking log.exception when there is no exception (#16047)
  • Restores apply_defaults import in base_sensor_operator (#16040)
  • Fix auto-refresh in tree view when webserver UI is not in / (#16018)
  • Fix dag.clear() to set multiple dags to running when necessary (#15382)
  • Fix Celery executor getting stuck randomly because of reset_signals in multiprocessing (#15989)
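Several of the fixes above concern log redaction: dict keys that are not strings, recursion that goes too deep, and extra sensitive field names such as passphrase and private_key. The following is a minimal standalone sketch of that kind of redaction, not Airflow's implementation. The names `SENSITIVE_NAMES`, `MAX_DEPTH`, `MASK`, `is_sensitive` and `redact` are hypothetical and chosen only for illustration.

```python
from typing import Any

# Hypothetical defaults for illustration only; not Airflow's actual list.
SENSITIVE_NAMES = {"password", "passphrase", "private_key", "secret", "token"}
MAX_DEPTH = 5
MASK = "***"


def is_sensitive(key: Any) -> bool:
    """Return True if a dict key looks like it names a secret."""
    # Non-string keys (ints, tuples, ...) cannot match a field name, so they
    # are skipped instead of being treated as strings.
    if not isinstance(key, str):
        return False
    lowered = key.lower()
    return any(name in lowered for name in SENSITIVE_NAMES)


def redact(value: Any, key: Any = None, depth: int = 0) -> Any:
    """Return a copy of ``value`` with values under sensitive keys masked."""
    if depth > MAX_DEPTH:
        # Stop descending rather than recurse without bound.
        return value
    if is_sensitive(key):
        return MASK
    if isinstance(value, dict):
        return {k: redact(v, k, depth + 1) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v, None, depth + 1) for v in value]
    if isinstance(value, tuple):
        return tuple(redact(v, None, depth + 1) for v in value)
    return value


if __name__ == "__main__":
    conf = {"token": "abc", 1: "x", "nested": {"private_key": "k", "ok": [1, 2]}}
    print(redact(conf))
```

The depth cap in this sketch leaves anything nested deeper than `MAX_DEPTH` unmasked, which trades completeness for a guarantee that redaction terminates. A real implementation may choose differently.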

Apache Airflow Upgrade Check 1.4.0

26 Jun 14:38
  • Add conf not importable from airflow rule (#14400)
  • Upgrade rule to suggest rename [scheduler] max_threads to [scheduler] parsing_processes (#14913)
  • Fix running "upgrade_check" command in a PTY. (#14977)
  • Skip DatabaseVersionCheckRule check if invalid version is detected (#15122)
  • Fix too specific parsing of False in LegacyUIDeprecated (#14967)
  • Fix false positives when inheriting classes that inherit DbApiHook (#16543)

Apache Airflow Helm Chart 1.0.0

20 May 01:22
helm-chart/1.0.0
3919ee6

This is the first release of the Official Helm Chart.

Docs: https://airflow.apache.org/docs/helm-chart/1.0.0/

Apache Airflow 2.1.0rc2

18 May 11:19
2.1.0rc2
Pre-release

New Features

- Add ``PythonVirtualenvDecorator`` to Taskflow API (#14761)
- Add ``Taskgroup`` decorator (#15034)
- Create a DAG Calendar View (#15423)
- Create cross-DAG dependencies view (#13199)
- Add rest API to query for providers (#13394)
- Mask passwords and sensitive info in task logs and UI (#15599)
- Add ``SubprocessHook`` for running commands from operators (#13423)
- Add DAG Timeout in UI page "DAG Details" (#14165)
- Add ``WeekDayBranchOperator`` (#13997)
- Add JSON linter to DAG Trigger UI (#13551)
- Add DAG Description Doc to Trigger UI Page (#13365)
- Add airflow webserver URL into SLA miss email. (#13249)
- Add read only REST API endpoints for users (#14735)
- Add files to generate Airflow's Python SDK (#14739)
- Add dynamic fields to snowflake connection (#14724)
- Add read only REST API endpoint for roles and permissions (#14664)
- Add new datetime branch operator (#11964)
- Add Google leveldb hook and operator (#13109) (#14105)
- Add plugins endpoint to the REST API (#14280)
- Add ``worker_pod_pending_timeout`` support (#15263)
- Add support for labeling DAG edges (#15142)
- Add CUD REST API endpoints for Roles (#14840)
- Import connections from a file (#15177)
- A bunch of ``template_fields_renderers`` additions (#15130)
- Add REST API query sort and order to some endpoints (#14895)
- Add timezone context in new ui (#15096)
- Add query mutations to new UI (#15068)
- Add different modes to sort dag files for parsing (#15046)
- Auto refresh on Tree View (#15474)
- BashOperator to raise ``AirflowSkipException`` on exit code 99 (by default, configurable) (#13421) (#14963)
- Clear tasks by task ids in REST API (#14500)
- Support jinja2 native Python types (#14603)
- Allow celery workers without gossip or mingle modes (#13880)
- Add ``airflow jobs check`` CLI command to check health of jobs (Scheduler etc) (#14519)
- Rename ``DateTimeBranchOperator`` to ``BranchDateTimeOperator`` (#14720)

Improvements

- Add optional result handler callback to ``DbApiHook`` (#15581)
- Update Flask App Builder limit to recently released 3.3 (#15792)
- Prevent creating flask sessions on REST API requests (#15295)
- Sync DAG specific permissions when parsing (#15311)
- Increase maximum length of pool name on Tasks to 256 characters (#15203)
- Enforce READ COMMITTED isolation when using mysql (#15714)
- Auto-apply ``apply_default`` to subclasses of ``BaseOperator`` (#15667)
- Emit error on duplicated DAG ID (#15302)
- Update ``KubernetesExecutor`` pod templates to allow access to IAM permissions (#15669)
- More verbose logs when running ``airflow db check-migrations`` (#15662)
- When one_success mark task as failed if no success (#15467)
- Add an option to trigger a dag w/o changing conf (#15591)
- Add Airflow UI instance_name configuration option (#10162)
- Add a decorator to retry functions with DB transactions (#14109)
- Add return to PythonVirtualenvOperator's execute method (#14061)
- Add verify_ssl config for kubernetes (#13516)
- Add description about ``secret_key`` when Webserver > 1 (#15546)
- Add Traceback in LogRecord in ``JSONFormatter`` (#15414)
- Add support for arbitrary json in conn uri format (#15100)
- Adds description field in variable (#12413) (#15194)
- Add logs to show last modified in SFTP, FTP and Filesystem sensor (#15134)
- Execute ``on_failure_callback`` when SIGTERM is received (#15172)
- Allow hiding of all edges when highlighting states (#15281)
- Display explicit error in case UID has no actual username (#15212)
- Serve logs with Scheduler when using Local or Sequential Executor (#15557)
- Deactivate trigger, refresh, and delete controls on dag detail view. (#14144)
- Turn off autocomplete for connection forms (#15073)
- Increase default ``worker_refresh_interval`` to ``6000`` seconds (#14970)
- Only show User's local timezone if it's not UTC (#13904)
- Suppress LOG/WARNING for a few tasks CLI for better CLI experience (#14567)
- Configurable API response (CORS) headers (#13620)
- Allow viewers to see all docs links (#14197)
- Update Tree View date ticks (#14141)
- Make the tooltip to Pause / Unpause a DAG clearer (#13642)
- Warn about precedence of env var when getting variables (#13501)
- Move ``[celery] default_queue`` config to ``[operators] default_queue`` to re-use between executors (#14699)

Bug Fixes

- Fix 500 error from ``updateTaskInstancesState`` API endpoint when ``dry_run`` not passed (#15889)
- Ensure that task preceding a PythonVirtualenvOperator doesn't fail (#15822)
- Prevent mixed case env vars from crashing processes like worker (#14380)
- Fixed type annotations in DAG decorator (#15778)
- Fix on_failure_callback when task receives SIGKILL (#15537)
- Fix dags table overflow (#15660)
- Fix changing the parent dag state on subdag clear (#15562)
- Fix reading from zip package to default to text (#13962)
- Fix wrong parameter for ``drawDagStatsForDag`` in dags.html (#13884)
- Fix QueuedLocalWorker crashing with EOFError (#13215)
- Fix typo in ``NotPreviouslySkippedDep`` (#13933)
- Fix parallelism after KubeExecutor pod adoption (#15555)
- Fix kube client on mac with keepalive enabled (#15551)
- Fixes wrong limit for dask for python>3.7 (should be <3.7) (#15545)
- Fix Task Adoption in ``KubernetesExecutor`` (#14795)
- Fix timeout when using XCom with ``KubernetesPodOperator`` (#15388)
- Fix deprecated provider aliases in "extras" not working (#15465)
- Fixed default XCom deserialization. (#14827)
- Fix used_group_ids in ``dag.partial_subset`` (#13700) (#15308)
- Further fix trimmed ``pod_id`` for ``KubernetesPodOperator`` (#15445)
- Bugfix: Invalid name when trimmed `pod_id` ends with hyphen in ``KubernetesPodOperator`` (#15443)
- Fix incorrect slots stats when TI ``pool_slots > 1`` (#15426)
- Fix DAG last run link (#15327)
- Fix ``sync-perm`` to work correctly when update_fab_perms = False (#14847)
- Fixes limits on Arrow for plexus test (#14781)
- Fix UI bugs in tree view (#14566)
- Fix AzureDataFactoryHook failing to instantiate its connection (#14565)
- Fix permission error on non-POSIX filesystem (#13121)
- Fix spelling in "ignorable" (#14348)
- Fix get_context_data doctest import (#14288)
- Correct typo in ``GCSObjectsWtihPrefixExistenceSensor``  (#14179)
- Fix order of failed deps (#14036)
- Fix critical ``CeleryKubernetesExecutor`` bug (#13247)
- Fix four bugs in ``StackdriverTaskHandler`` (#13784)
- ``func.sum`` may return ``Decimal`` that break rest APIs (#15585)
- Persist tags params in pagination (#15411)
- API: Raise ``AlreadyExists`` exception when the ``execution_date`` is same (#15174)
- Remove duplicate call to ``sync_metadata`` inside ``DagFileProcessorManager`` (#15121)
- Extra ``docker-py`` update to resolve docker op issues (#15731)
- Ensure executors end method is called (#14085)
- Remove ``user_id`` from API schema (#15117)
- Prevent clickable bad links on disabled pagination (#15074)
- Acquire lock on db for the time of migration (#10151)
- Skip SLA check only if SLA is None (#14064)
- Print right version in airflow info command (#14560)
- Make ``airflow info`` work with pipes (#14528)
- Rework client-side script for connection form. (#14052)
- API: Add ``CollectionInfo`` in all Collections that have ``total_entries`` (#14366)
- Fix ``task_instance_mutation_hook`` when importing airflow.models.dagrun (#15851)

Doc only changes

- Fix docstring of SqlSensor (#15466)
- Small changes on "DAGs and Tasks documentation" (#14853)
- Add note on changes to configuration options (#15696)
- Add docs to the ``markdownlint`` and ``yamllint`` config files (#15682)
- Rename old "Experimental" API to deprecated in the docs. (#15653)
- Fix documentation error in `git_sync_template.yaml` (#13197)
- Fix doc link permission name (#14972)
- Fix link to Helm chart docs (#14652)
- Fix docstrings for Kubernetes code (#14605)
- docs: Capitalize & minor fixes (#14283) (#14534)
- Fixed reading from zip package to default to text. (#13984)
- An initial rework of the "Concepts" docs (#15444)
- Improve docstrings for various modules (#15047)
- Add documentation on database connection URI (#14124)
- Add Helm Chart logo to docs index (#14762)
- Create a new documentation package for Helm Chart (#14643)
- Add docs about supported logging levels (#14507)
- Update docs about tableau and salesforce provider (#14495)
- Replace deprecated doc links to the correct one (#14429)
- Refactor redundant doc url logic to use utility (#14080)
- docs: NOTICE: Updated 2016-2019 to 2016-now (#14248)
- Skip DAG perm sync during parsing if possible (#15464)
- Add picture and examples for Edge Labels (#15310)
- Add example DAG & how-to guide for sqlite (#13196)
- Add links to new modules for deprecated modules (#15316)
- Add note in Updating.md about FAB data model change (#14478)

Misc/Internal

- Fix ``logging.exception`` redundancy (#14823)
- Bump ``stylelint`` to remove vulnerable sub-dependency (#15784)
- Add resolution to force dependencies to use patched version of lodash (#15777)
- Update croniter to 1.0.x series (#15769)
- Get rid of Airflow 1.10 in Breeze (#15712)
- Run helm chart tests in parallel (#15706)
- Bump ``ssri`` from 6.0.1 to 6.0.2 in /airflow/www (#15437)
- Remove the limit on Gunicorn dependency (#15611)
- Better "dependency already registered" warning message for tasks #14613 (#14860)
- Pin pandas-gbq to <0.15.0 (#15114)
- Use Pip 21.* to install airflow officially (#15513)
- Bump mysqlclient to support the 1.4.x and 2.x series (#14978)
- Finish refactor of DAG resource name helper (#15511)
- Refactor/Cleanup Presentation of Graph Task and Path Highlighting (#15257)
- Standardize default fab perms (#14946)
- Remove ``datepicker`` for task instance detail view (#15284)
- Turn provider's import warnings into debug logs (#14903)
- Remove left-over fields from required in provider_info schema. (#14119)
- Deprecate ``tableau`` extra (#13595)
- Use built-in `cached_property` on Python 3.8 where possible (#14606)
- Clean-up JS code in UI templates (#14019)
- Bump elliptic from 6.5.3 to 6.5.4 in /airflow/www (#14668)
- Switch to f-strings using ``flynt``. (#13732)
- use ``jquery`` ready instead of vanilla js (#15258)
- Migrate task instance log (ti_log) js (#15309)
- Migrate graph js (#15307)
- Migrate dags.html javascript (#14692)
- Removes unnecessary AzureContainerInstance connection type (#15514)
- Separate Kubernetes pod_launcher from core airflow (#15165)
- update remaining old import paths of operators (#15127)
- Remove broken and undocumented "demo mode" feature (#14601)
- Simplify configuration/legibility of ``Webpack`` entries (#14551)
- remove inline tree js (#14552)
- Js linting and inline migration for simple scripts (#14215)
- Remove use of repeated constant in AirflowConfigParser (#14023)
- Deprecate email credentials from environment variables. (#13601)
- Remove unused 'context' variable in task_instance.py (#14049)
- Disable suppress_logs_and_warning in cli when debugging (#13180)

Apache Airflow 2.1.0rc1

17 May 23:24
2.1.0rc1
Pre-release

New Features

  • Add PythonVirtualenvDecorator to Taskflow API (#14761)
  • Add Taskgroup decorator (#15034)
  • Create a DAG Calendar View (#15423)
  • Create cross-DAG dependencies view (#13199)
  • Add rest API to query for providers (#13394)
  • Mask passwords and sensitive info in task logs and UI (#15599)
  • Add SubprocessHook for running commands from operators (#13423)
  • Add DAG Timeout in UI page "DAG Details" (#14165)
  • Add WeekDayBranchOperator (#13997)
  • Add JSON linter to DAG Trigger UI (#13551)
  • Add DAG Description Doc to Trigger UI Page (#13365)
  • Add airflow webserver URL into SLA miss email. (#13249)
  • Add read only REST API endpoints for users (#14735)
  • Add files to generate Airflow's Python SDK (#14739)
  • Add dynamic fields to snowflake connection (#14724)
  • Add read only REST API endpoint for roles and permissions (#14664)
  • Add new datetime branch operator (#11964)
  • Add Google leveldb hook and operator (#13109) (#14105)
  • Add plugins endpoint to the REST API (#14280)
  • Add worker_pod_pending_timeout support (#15263)
  • Add support for labeling DAG edges (#15142)
  • Add CUD REST API endpoints for Roles (#14840)
  • Import connections from a file (#15177)
  • A bunch of template_fields_renderers additions (#15130)
  • Add REST API query sort and order to some endpoints (#14895)
  • Add timezone context in new ui (#15096)
  • Add query mutations to new UI (#15068)
  • Add different modes to sort dag files for parsing (#15046)
  • Auto refresh on Tree View (#15474)
  • BashOperator to raise AirflowSkipException on exit code 99 (by default, configurable) (#13421) (#14963)
  • Clear tasks by task ids in REST API (#14500)
  • Support jinja2 native Python types (#14603)
  • Allow celery workers without gossip or mingle modes (#13880)
  • Add airflow jobs check CLI command to check health of jobs (Scheduler etc) (#14519)
  • Rename DateTimeBranchOperator to BranchDateTimeOperator (#14720)
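The BashOperator entry above says only that exit code 99 raises AirflowSkipException by default and that the code is configurable. The sketch below shows the general idea of mapping one exit code to "skipped" rather than "failed" without importing Airflow. `SkipTask`, `run_command` and the `skip_exit_code` parameter are hypothetical names used for illustration and are not presented as the operator's actual API.

```python
import subprocess
import sys
from typing import Optional, Sequence


class SkipTask(Exception):
    """Hypothetical stand-in for a 'skip this task' signal."""


def run_command(args: Sequence[str], skip_exit_code: Optional[int] = 99) -> int:
    """Run a command; treat one chosen exit code as 'skip', others as failure."""
    completed = subprocess.run(list(args))
    code = completed.returncode
    # A matching exit code means "nothing to do", not "something broke".
    if skip_exit_code is not None and code == skip_exit_code:
        raise SkipTask(f"command exited with {code}; skipping")
    if code != 0:
        raise RuntimeError(f"command failed with exit code {code}")
    return code


if __name__ == "__main__":
    try:
        run_command([sys.executable, "-c", "import sys; sys.exit(99)"])
    except SkipTask as exc:
        print(f"skipped: {exc}")
```

Passing `skip_exit_code=None` in this sketch turns the special case off, so exit code 99 is then reported as an ordinary failure.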

Improvements

  • Add optional result handler callback to DbApiHook (#15581)
  • Update Flask App Builder limit to recently released 3.3 (#15792)
  • Prevent creating flask sessions on REST API requests (#15295)
  • Sync DAG specific permissions when parsing (#15311)
  • Increase maximum length of pool name on Tasks to 256 characters (#15203)
  • Enforce READ COMMITTED isolation when using mysql (#15714)
  • Auto-apply apply_default to subclasses of BaseOperator (#15667)
  • Emit error on duplicated DAG ID (#15302)
  • Update KubernetesExecutor pod templates to allow access to IAM permissions (#15669)
  • More verbose logs when running airflow db check-migrations (#15662)
  • When one_success mark task as failed if no success (#15467)
  • Add an option to trigger a dag w/o changing conf (#15591)
  • Add Airflow UI instance_name configuration option (#10162)
  • Add a decorator to retry functions with DB transactions (#14109)
  • Add return to PythonVirtualenvOperator's execute method (#14061)
  • Add verify_ssl config for kubernetes (#13516)
  • Add description about secret_key when Webserver > 1 (#15546)
  • Add Traceback in LogRecord in JSONFormatter (#15414)
  • Add support for arbitrary json in conn uri format (#15100)
  • Adds description field in variable (#12413) (#15194)
  • Add logs to show last modified in SFTP, FTP and Filesystem sensor (#15134)
  • Execute on_failure_callback when SIGTERM is received (#15172)
  • Allow hiding of all edges when highlighting states (#15281)
  • Display explicit error in case UID has no actual username (#15212)
  • Serve logs with Scheduler when using Local or Sequential Executor (#15557)
  • Deactivate trigger, refresh, and delete controls on dag detail view. (#14144)
  • Turn off autocomplete for connection forms (#15073)
  • Increase default worker_refresh_interval to 6000 seconds (#14970)
  • Only show User's local timezone if it's not UTC (#13904)
  • Suppress LOG/WARNING for a few tasks CLI for better CLI experience (#14567)
  • Configurable API response (CORS) headers (#13620)
  • Allow viewers to see all docs links (#14197)
  • Update Tree View date ticks (#14141)
  • Make the tooltip to Pause / Unpause a DAG clearer (#13642)
  • Warn about precedence of env var when getting variables (#13501)
  • Move [celery] default_queue config to [operators] default_queue to re-use between executors (#14699)

Bug Fixes

  • Fix 500 error from updateTaskInstancesState API endpoint when dry_run not passed (#15889)
  • Ensure that task preceding a PythonVirtualenvOperator doesn't fail (#15822)
  • Prevent mixed case env vars from crashing processes like worker (#14380)
  • Fixed type annotations in DAG decorator (#15778)
  • Fix on_failure_callback when task receives SIGKILL (#15537)
  • Fix dags table overflow (#15660)
  • Fix changing the parent dag state on subdag clear (#15562)
  • Fix reading from zip package to default to text (#13962)
  • Fix wrong parameter for drawDagStatsForDag in dags.html (#13884)
  • Fix QueuedLocalWorker crashing with EOFError (#13215)
  • Fix typo in NotPreviouslySkippedDep (#13933)
  • Fix parallelism after KubeExecutor pod adoption (#15555)
  • Fix kube client on mac with keepalive enabled (#15551)
  • Fixes wrong limit for dask for python>3.7 (should be <3.7) (#15545)
  • Fix Task Adoption in KubernetesExecutor (#14795)
  • Fix timeout when using XCom with KubernetesPodOperator (#15388)
  • Fix deprecated provider aliases in "extras" not working (#15465)
  • Fixed default XCom deserialization. (#14827)
  • Fix used_group_ids in dag.partial_subset (#13700) (#15308)
  • Further fix trimmed pod_id for KubernetesPodOperator (#15445)
  • Bugfix: Invalid name when trimmed pod_id ends with hyphen in KubernetesPodOperator (#15443)
  • Fix incorrect slots stats when TI pool_slots > 1 (#15426)
  • Fix DAG last run link (#15327)
  • Fix sync-perm to work correctly when update_fab_perms = False (#14847)
  • Fixes limits on Arrow for plexus test (#14781)
  • Fix UI bugs in tree view (#14566)
  • Fix AzureDataFactoryHook failing to instantiate its connection (#14565)
  • Fix permission error on non-POSIX filesystem (#13121)
  • Fix spelling in "ignorable" (#14348)
  • Fix get_context_data doctest import (#14288)
  • Correct typo in GCSObjectsWtihPrefixExistenceSensor (#14179)
  • Fix order of failed deps (#14036)
  • Fix critical CeleryKubernetesExecutor bug (#13247)
  • Fix four bugs in StackdriverTaskHandler (#13784)
  • func.sum may return Decimal that break rest APIs (#15585)
  • Persist tags params in pagination (#15411)
  • API: Raise AlreadyExists exception when the execution_date is same (#15174)
  • Remove duplicate call to sync_metadata inside DagFileProcessorManager (#15121)
  • Extra docker-py update to resolve docker op issues (#15731)
  • Ensure executors end method is called (#14085)
  • Remove user_id from API schema (#15117)
  • Prevent clickable bad links on disabled pagination (#15074)
  • Acquire lock on db for the time of migration (#10151)
  • Skip SLA check only if SLA is None (#14064)
  • Print right version in airflow info command (#14560)
  • Make airflow info work with pipes (#14528)
  • Rework client-side script for connection form. (#14052)
  • API: Add CollectionInfo in all Collections that have total_entries (#14366)
  • Fix task_instance_mutation_hook when importing airflow.models.dagrun (#15851)

Doc only changes

  • Fix docstring of SqlSensor (#15466)
  • Small changes on "DAGs and Tasks documentation" (#14853)
  • Add note on changes to configuration options (#15696)
  • Add docs to the markdownlint and yamllint config files (#15682)
  • Rename old "Experimental" API to deprecated in the docs. (#15653)
  • Fix documentation error in git_sync_template.yaml (#13197)
  • Fix doc link permission name (#14972)
  • Fix link to Helm chart docs (#14652)
  • Fix docstrings for Kubernetes code (#14605)
  • docs: Capitalize & minor fixes (#14283) (#14534)
  • Fixed reading from zip package to default to text. (#13984)
  • An initial rework of the "Concepts" docs (#15444)
  • Improve docstrings for various modules (#15047)
  • Add documentation on database connection URI (#14124)
  • Add Helm Chart logo to docs index (#14762)
  • Create a new documentation package for Helm Chart (#14643)
  • Add docs about supported logging levels (#14507)
  • Update docs about tableau and salesforce provider (#14495)
  • Replace deprecated doc links to the correct one (#14429)
  • Refactor redundant doc url logic to use utility (#14080)
  • docs: NOTICE: Updated 2016-2019 to 2016-now (#14248)
  • Skip DAG perm sync during parsing if possible (#15464)
  • Add picture and examples for Edge Labels (#15310)
  • Add example DAG & how-to guide for sqlite (#13196)
  • Add links to new modules for deprecated modules (#15316)
  • Add note in Updating.md about FAB data model change (#14478)

Misc/Internal

  • Fix logging.exception redundancy (#14823)
  • Bump stylelint to remove vulnerable sub-dependency (#15784)
  • Add resolution to force dependencies to use patched version of lodash (#15777)
  • Update croniter to 1.0.x series (#15769)
  • Get rid of Airflow 1.10 in Breeze (#15712)
  • Run helm chart tests in parallel (#15706)
  • Bump ssri from 6.0.1 to 6.0.2 in /airflow/www (#15437)
  • Remove the limit on Gunicorn dependency (#15611)
  • Better "dependency already registered" warning message for tasks #14613 (#14860)
  • Pin pandas-gbq to <0.15.0 (#15114)
  • Use Pip 21.* to install airflow officially (#15513)
  • Bump mysqlclient to support the 1.4.x and 2.x series (#14978)
  • Finish refactor of DAG resource name helper (#15511)
  • Refactor/Cleanup Presentation of Graph Task and Path Highlighting (#15257)
  • Standardize default fab perms (#14946)
  • Remove datepicker for task instance detail view (#15284)
  • Turn provider's import warnings into debug logs (#14903)
    ...

Apache Airflow v2.0.2

19 Apr 21:11
2.0.2

Bug Fixes

  • Bugfix: TypeError when Serializing & sorting iterable properties of DAGs (#15395)
  • Fix missing on_load trigger for folder-based plugins (#15208)
  • kubernetes cleanup-pods subcommand will only clean up Airflow-created Pods (#15204)
  • Fix password masking in CLI action_logging (#15143)
  • Fix url generation for TriggerDagRunOperatorLink (#14990)
  • Restore base lineage backend (#14146)
  • Unable to trigger backfill or manual jobs with Kubernetes executor. (#14160)
  • Bugfix: Task docs are not shown in the Task Instance Detail View (#15191)
  • Bugfix: Fix overriding pod_template_file in KubernetesExecutor (#15197)
  • Bugfix: resources in executor_config breaks Graph View in UI (#15199)
  • Fix celery executor bug trying to call len on map (#14883)
  • Fix bug in airflow.stats timing that broke dogstatsd mode (#15132)
  • Avoid scheduler/parser manager deadlock by using non-blocking IO (#15112)
  • Re-introduce dagrun.schedule_delay metric (#15105)
  • Compare string values, not if strings are the same object in Kube executor (#14942)
  • Pass queue to BaseExecutor.execute_async like in airflow 1.10 (#14861)
  • Scheduler: Remove TIs from starved pools from the critical path. (#14476)
  • Remove extra/needless deprecation warnings from airflow.contrib module (#15065)
  • Fix support for long dag_id and task_id in KubernetesExecutor (#14703)
  • Sort lists, sets and tuples in Serialized DAGs (#14909)
  • Simplify cleaning string passed to origin param (#14738) (#14905)
  • Fix error when running tasks with Sentry integration enabled. (#13929)
  • Webserver: Sanitize string passed to origin param (#14738)
  • Fix losing duration < 1 secs in tree (#13537)
  • Pin SQLAlchemy to <1.4 due to breakage of sqlalchemy-utils (#14812)
  • Fix KubernetesExecutor issue with deleted pending pods (#14810)
  • Default to Celery Task model when backend model does not exist (#14612)
  • Bugfix: Plugins endpoint was unauthenticated (#14570)
  • BugFix: fix DAG doc display (especially for TaskFlow DAGs) (#14564)
  • BugFix: TypeError in airflow.kubernetes.pod_launcher's monitor_pod (#14513)
  • Bugfix: Fix wrong output of tags and owners in dag detail API endpoint (#14490)
  • Fix logging error with task error when JSON logging is enabled (#14456)
  • Fix statsd metrics not sending when using daemon mode (#14454)
  • Gracefully handle missing start_date and end_date for DagRun (#14452)
  • BugFix: Serialize max_retry_delay as a timedelta (#14436)
  • Fix crash when user clicks on "Task Instance Details" caused by start_date being None (#14416)
  • BugFix: Fix TaskInstance API call fails if a task is removed from running DAG (#14381)
  • Scheduler should not fail when invalid executor_config is passed (#14323)
  • Fix bug allowing task instances to survive when dagrun_timeout is exceeded (#14321)
  • Fix bug where DAG timezone was not always shown correctly in UI tooltips (#14204)
  • Use Lax for cookie_samesite when empty string is passed (#14183)
  • [AIRFLOW-6076] fix dag.cli() KeyError (#13647)
  • Fix running child tasks in a subdag after clearing a successful subdag (#14776)

Improvements

  • Remove unused JS packages causing false security alerts (#15383)
  • Change default of [kubernetes] enable_tcp_keepalive for new installs to True (#15338)
  • Fixed #14270: Add error message in OOM situations (#15207)
  • Better compatibility/diagnostics for arbitrary UID in docker image (#15162)
  • Updates 3.6 limits for latest versions of a few libraries (#15209)
  • Adds Blinker dependency which is missing after recent changes (#15182)
  • Remove 'conf' from search_columns in DagRun View (#15099)
  • More proper default value for namespace in K8S cleanup-pods CLI (#15060)
  • Faster default role syncing during webserver start (#15017)
  • Speed up webserver start when there are many DAGs (#14993)
  • Much easier to use and better documented Docker image (#14911)
  • Use libyaml C library when available. (#14577)
  • Don't create unittest.cfg when not running in unit test mode (#14420)
  • Webserver: Allow Filtering TaskInstances by queued_dttm (#14708)
  • Update Flask-AppBuilder dependency to allow 3.2 (and all 3.x series) (#14665)
  • Remember expanded task groups in browser local storage (#14661)
  • Add plain format output to cli tables (#14546)
  • Make airflow dags show command display TaskGroups (#14269)
  • Increase maximum size of extra connection field. (#12944)
  • Speed up clear_task_instances by doing a single sql delete for TaskReschedule (#14048)
  • Add more flexibility with FAB menu links (#13903)
  • Add better description and guidance in case of sqlite version mismatch (#14209)

Doc only changes

  • Add documentation create/update community providers (#15061)
  • Fix mistake and typos in airflow.utils.timezone docstrings (#15180)
  • Replace new url for Stable Airflow Docs (#15169)
  • Docs: Clarify behavior of delete_worker_pods_on_failure (#14958)
  • Create a documentation package for Docker image (#14846)
  • Multiple minor doc (OpenAPI) fixes (#14917)
  • Replace Graph View Screenshot to show Auto-refresh (#14571)

Misc/Internal

  • Import Connection lazily in hooks to avoid cycles (#15361)
  • Rename last_scheduler_run into last_parsed_time, and ensure it's updated in DB (#14581)
  • Make TaskInstance.pool_slots not nullable with a default of 1 (#14406)
  • Log migrations info in consistent way (#14158)

Apache Airflow 1.10.15, 2021-03-17

17 Mar 17:29

Bug Fixes

  • Fix airflow db upgrade to upgrade db as intended (#13267)
  • Moved boto3 limitation to snowflake (#13286)
  • KubernetesExecutor should accept images from executor_config (#13074)
  • Scheduler should acknowledge active runs properly (#13803)
  • Bugfix: Unable to import Airflow plugins on Python 3.8 (#12859)
  • Include airflow/contrib/executors in the dist package
  • Pin Click version for Python 2.7 users
  • Ensure all statsd timers use millisecond values. (#10633)
  • [kubernetes_generate_dag_yaml] - Fix dag yaml generate function (#13816)
  • Fix airflow tasks clear cli command with --yes (#14188)
  • Fix permission error on non-POSIX filesystem (#13121) (#14383)
  • Fixed deprecation message for "variables" command (#14457)
  • BugFix: fix the delete_dag function of json_client (#14441)
  • Fix merging of secrets and configmaps for KubernetesExecutor (#14090)
  • Fix webserver exiting when gunicorn master crashes (#13470)
  • Bump ini from 1.3.5 to 1.3.8 in airflow/www_rbac
  • Bump datatables.net from 1.10.21 to 1.10.23 in airflow/www_rbac
  • Webserver: Sanitize string passed to origin param (#14738)
  • Make rbac_app's db.session use the same timezone with @provide_session (#14025)

Improvements

  • Adds airflow as viable docker command in official image (#12878)
  • StreamLogWriter: Provide (no-op) close method (#10885)
  • Add 'airflow variables list' command for 1.10.x transition version (#14462)

Doc only changes

  • Update URL for Airflow docs (#13561)
  • Clarifies version args for installing 1.10 in Docker (#12875)

Airflow 2.0.1, 2021-02-08

08 Feb 22:50
2.0.1

Bug Fixes

  • Bugfix: Return XCom Value in the XCom Endpoint API (#13684)
  • Bugfix: Import error when using custom backend and sql_alchemy_conn_secret (#13260)
  • Allow PID file path to be relative when daemonizing a process (scheduler, kerberos, etc) (#13232)
  • Bugfix: no generic DROP CONSTRAINT in MySQL during airflow db upgrade (#13239)
  • Bugfix: Sync Access Control defined in DAGs when running sync-perm (#13377)
  • Stop sending Callback Requests if no callbacks are defined on DAG (#13163)
  • BugFix: Dag-level Callback Requests were not run (#13651)
  • Stop creating duplicate Dag File Processors (#13662)
  • Filter DagRuns with Task Instances in removed State while Scheduling (#13165)
  • Bump datatables.net from 1.10.21 to 1.10.22 in /airflow/www (#13143)
  • Bump datatables.net JS to 1.10.23 (#13253)
  • Bump dompurify from 2.0.12 to 2.2.6 in /airflow/www (#13164)
  • Update minimum cattrs version (#13223)
  • Remove inapplicable arg 'output' for CLI pools import/export (#13071)
  • Webserver: Fix the behavior to deactivate the authentication option and add docs (#13191)
  • Fix: add support for no-menu plugin views (#11742)
  • Add python-daemon limit for python 3.8+ to fix daemon crash (#13540)
  • Change the default celery worker_concurrency to 16 (#13612)
  • Audit Log records View should not contain link if dag_id is None (#13619)
  • Fix invalid continue_token for cleanup list pods (#13563)
  • Switches to latest version of snowflake connector (#13654)
  • Fix backfill crash on task retry or reschedule (#13712)
  • Setting max_tis_per_query to 0 now correctly removes the limit (#13512)
  • Fix race conditions in task callback invocations (#10917)
  • Fix webserver exiting when gunicorn master crashes (#13518)(#13780)
  • Fix SQL syntax to check duplicate connections (#13783)
  • BaseBranchOperator will push to xcom by default (#13704) (#13763)
  • Fix Deprecation for configuration.getsection (#13804)
  • Fix TaskNotFound in log endpoint (#13872)
  • Fix race condition when using Dynamic DAGs (#13893)
  • Fix: Linux/Chrome window bouncing in Webserver
  • Fix db shell for sqlite (#13907)
  • Only compare updated time when Serialized DAG exists (#13899)
  • Fix dag run type enum query for mysqldb driver (#13278)
  • Add authentication to lineage endpoint for experimental API (#13870)
  • Do not add User role perms to custom roles. (#13856)
  • Do not add Website.can_read access to default roles. (#13923)
  • Fix invalid value error caused by long Kubernetes pod name (#13299)
  • Fix DB Migration for SQLite to upgrade to 2.0 (#13921)
  • Bugfix: Manual DagRun trigger should not skip scheduled runs (#13963)
  • Stop loading Extra Operator links in Scheduler (#13932)
  • Added missing return parameter in read function of FileTaskHandler (#14001)
  • Bugfix: Do not try to create a duplicate Dag Run in Scheduler (#13920)
  • Make v1/config endpoint respect webserver expose_config setting (#14020)
  • Disable row level locking for Mariadb and MySQL <8 (#14031)
  • Bugfix: Fix permissions to triggering only specific DAGs (#13922)
  • Fix broken SLA Mechanism (#14056)
  • Bugfix: Scheduler fails if task is removed at runtime (#14057)
  • Remove permissions to read Configurations for User and Viewer roles (#14067)
  • Fix DB Migration from 2.0.1rc1

Improvements

  • Increase the default min_file_process_interval to decrease CPU Usage (#13664)
  • Dispose connections when running tasks with os.fork & CeleryExecutor (#13265)
  • Make function purpose clearer in example_kubernetes_executor example dag (#13216)
  • Remove unused libraries - flask-swagger, funcsigs (#13178)
  • Display alternative tooltip when a Task has yet to run (no TI) (#13162)
  • Use werkzeug's own type conversion for request args (#13184)
  • UI: Add queued_by_job_id & external_executor_id Columns to TI View (#13266)
  • Make json-merge-patch an optional library and unpin it (#13175)
  • Adds missing LDAP "extra" dependencies to ldap provider. (#13308)
  • Refactor setup.py to better reflect changes in providers (#13314)
  • Pin pyjwt and Add integration tests for Apache Pinot (#13195)
  • Removes provider-imposed requirements from setup.cfg (#13409)
  • Replace deprecated decorator (#13443)
  • Streamline & simplify __eq__ methods in models Dag and BaseOperator (#13449)
  • Additional properties should be allowed in provider schema (#13440)
  • Remove unused dependency - contextdecorator (#13455)
  • Remove 'typing' dependency (#13472)
  • Log migrations info in consistent way (#13458)
  • Unpin mysql-connector-python to allow 8.0.22 (#13370)
  • Remove thrift as a core dependency (#13471)
  • Add NotFound response for DELETE methods in OpenAPI YAML (#13550)
  • Stop Log Spamming when [core] lazy_load_plugins is False (#13578)
  • Display message and docs link when no plugins are loaded (#13599)
  • Unpin restriction for colorlog dependency (#13176)
  • Add missing Dag Tag for Example DAGs (#13665)
  • Support tables in DAG docs (#13533)
  • Add python3-openid dependency (#13714)
  • Add __repr__ for Executors (#13753)
  • Add description to hint if conn_type is missing (#13778)
  • Upgrade azure blob to v12 (#12188)
  • Add extra field to get_connection REST endpoint (#13885)
  • Make Smart Sensors DB Migration idempotent (#13892)
  • Improve the error when DAG does not exist when running dag pause command (#13900)
  • Update airflow_local_settings.py to fix an error message (#13927)
  • Only allow passing JSON Serializable conf to TriggerDagRunOperator (#13964)
  • Bugfix: Allow getting details of a DAG with null start_date (REST API) (#13959)
  • Add params to the DAG details endpoint (#13790)
  • Make the role assigned to anonymous users customizable (#14042)
  • Retry critical methods in Scheduler loop in case of OperationalError (#14032)

Doc only changes

  • Add Missing Statsd Metrics in Docs (#13708)
  • Add Missing Email configs in Configuration doc (#13709)
  • Add quick start for Airflow on Docker (#13660)
  • Describe which Python versions are supported (#13259)
  • Add note block to 2.x migration docs (#13094)
  • Add documentation about webserver_config.py (#13155)
  • Add missing version information to recently added configs (#13161)
  • API: Use generic information in UpdateMask component (#13146)
  • Add Airflow 2.0.0 to requirements table (#13140)
  • Avoid confusion in doc for CeleryKubernetesExecutor (#13116)
  • Update docs link in REST API spec (#13107)
  • Add link to PyPI Repository to provider docs (#13064)
  • Fix link to Airflow master branch documentation (#13179)
  • Minor enhancements to Sensors docs (#13381)
  • Use 2.0.0 in Airflow docs & Breeze (#13379)
  • Improves documentation regarding providers and custom connections (#13375)(#13410)
  • Fix malformed table in production-deployment.rst (#13395)
  • Update celery.rst to fix broken links (#13400)
  • Remove reference to scheduler run_duration param in docs (#13346)
  • Set minimum SQLite version supported (#13412)
  • Fix installation doc (#13462)
  • Add docs about mocking variables and connections (#13502)
  • Add docs about Flask CLI (#13500)
  • Fix Upgrading to 2 guide to use rbac UI (#13569)
  • Make docs clear that Auth can not be disabled for Stable API (#13568)
  • Remove archived links from docs & add link for AIPs (#13580)
  • Minor fixes in upgrading-to-2.rst (#13583)
  • Fix Link in Upgrading to 2.0 guide (#13584)
  • Fix heading for Mocking section in best-practices.rst (#13658)
  • Add docs on how to use custom operators within plugins folder (#13186)
  • Update docs to register Operator Extra Links (#13683)
  • Improvements for database setup docs (#13696)
  • Replace module path to Class with just Class Name (#13719)
  • Update DAG Serialization docs (#13722)
  • Fix link to Apache Airflow docs in webserver (#13250)
  • Clarifies differences between extras and provider packages (#13810)
  • Add information about all access methods to the environment (#13940)
  • Docs: Fix FAQ on scheduler latency (#13969)
  • Updated taskflow api doc to show dependency with sensor (#13968)
  • Add deprecated config options to docs (#13883)
  • Added a FAQ section to the Upgrading to 2 doc (#13979)

Airflow 2.0.0, 2020-12-17

18 Dec 13:21
2.0.0
ab5f770

The full changelog is about 3,000 lines long (already excluding everything backported to 1.10), so for now I’ll simply share some of the major features in 2.0.0 compared to 1.10.14:

A new way of writing dags: the TaskFlow API (AIP-31)

(Known in 2.0.0alphas as Functional DAGs.)

DAGs are now much nicer to author, especially when using PythonOperator. Dependencies are handled more clearly, and XCom is nicer to use.

A quick teaser of what DAGs can now look like:

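from airflow.decorators import dag, task
from airflow.utils.dates import days_ago

# schedule_interval=None means the DAG only runs when triggered manually.
@dag(default_args={'owner': 'airflow'}, schedule_interval=None, start_date=days_ago(2))
def tutorial_taskflow_api_etl():
    @task
    def extract():
        # The return value is passed to downstream tasks through XCom.
        return {"1001": 301.27, "1002": 433.21, "1003": 502.22}

    # multiple_outputs=True pushes each dict key as its own XCom,
    # which is what makes order_summary["total_order_value"] work below.
    @task(multiple_outputs=True)
    def transform(order_data_dict: dict) -> dict:
        total_order_value = 0

        for value in order_data_dict.values():
            total_order_value += value

        return {"total_order_value": total_order_value}

    @task
    def load(total_order_value: float):
        print("Total order value is: %.2f" % total_order_value)

    # Calling the tasks wires up the dependencies: extract, then transform, then load.
    order_data = extract()
    order_summary = transform(order_data)
    load(order_summary["total_order_value"])

# Calling the decorated function creates the DAG object.
tutorial_etl_dag = tutorial_taskflow_api_etl()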

Fully specified REST API (AIP-32)

We now have a fully supported, no-longer-experimental API with a comprehensive OpenAPI specification.

Read more here:

REST API Documentation.
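
As a rough illustration of what calling the stable API can look like, here is a minimal sketch that builds (but does not send) a request for the DAG list endpoint, /api/v1/dags. It assumes the webserver has the basic auth API backend enabled; the helper name build_list_dags_request, the localhost URL and the admin/admin credentials are placeholders for illustration only.

```python
import base64
import urllib.request


def build_list_dags_request(
    base_url: str, username: str, password: str, limit: int = 10
) -> urllib.request.Request:
    """Build a GET request for the stable REST API's DAG list (not sent here)."""
    # HTTP basic auth: base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    url = f"{base_url.rstrip('/')}/api/v1/dags?limit={limit}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )


if __name__ == "__main__":
    # Placeholder URL and credentials; adjust for your own deployment.
    request = build_list_dags_request("http://localhost:8080", "admin", "admin")
    print(request.get_method(), request.full_url)
```

To actually send it against a running webserver, pass the request to urllib.request.urlopen and parse the JSON body of the response.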

Massive Scheduler performance improvements

As part of AIP-15 (Scheduler HA+performance) and other work Kamil did, we significantly improved the performance of the Airflow Scheduler. It now starts tasks much, MUCH quicker.

Over at Astronomer.io we’ve benchmarked the scheduler, and it’s fast (we had to triple-check the numbers because we didn’t quite believe them at first!).

Scheduler is now HA compatible (AIP-15)

It’s now possible and supported to run more than a single scheduler instance. This is super useful for both resiliency (in case a scheduler goes down) and scheduling performance.

To fully use this feature you need Postgres 9.6+ or MySQL 8+ (MySQL 5 and MariaDB won’t work with more than one scheduler, I’m afraid).

There’s no config or other set up required to run more than one scheduler—just start up a scheduler somewhere else (ensuring it has access to the DAG files) and it will cooperate with your existing schedulers through the database.

For more information, read the Scheduler HA documentation.

Task Groups (AIP-34)

SubDAGs were commonly used for grouping tasks in the UI, but they had many drawbacks in their execution behaviour (primarily that they only executed a single task at a time!). To improve this experience, we’ve introduced “Task Groups”: a method for organizing tasks which provides the same grouping behaviour as a SubDAG without any of the execution-time drawbacks.

SubDAGs will still work for now, but we think that any previous use of SubDAGs can now be replaced with Task Groups. If you find an example where this isn’t the case, please let us know by opening an issue on GitHub.

For more information, check out the Task Group documentation.

Refreshed UI

We’ve given the Airflow UI a visual refresh and updated some of the styling. Check out the UI section of the docs for screenshots.

We have also added an option to auto-refresh task states in Graph View so you no longer need to continuously press the refresh button :).

Smart Sensors for reduced load from sensors (AIP-17)

If you make heavy use of sensors in your Airflow cluster, you might find that sensor execution takes up a significant proportion of your cluster even with “reschedule” mode. To improve this, we’ve added a new mode called “Smart Sensors”.

This feature is in “early-access”: it’s been well-tested by Airbnb and is “stable”/usable, but we reserve the right to make backwards-incompatible changes to it in a future release (if we have to; we’ll try very hard not to!).

Simplified KubernetesExecutor

For Airflow 2.0, we have re-architected the KubernetesExecutor in a fashion that is simultaneously faster, easier to understand, and more flexible for Airflow users. Users will now be able to access the full Kubernetes API to create a .yaml pod_template_file instead of specifying parameters in their airflow.cfg.

We have also replaced the executor_config dictionary with the pod_override parameter, which takes a Kubernetes V1Pod object for a 1:1 setting override. These changes have removed over three thousand lines of code from the KubernetesExecutor, which makes it run faster and creates fewer potential errors.

Airflow core and providers: Splitting Airflow into 60+ packages

Airflow 2.0 is not a monolithic “one to rule them all” package. We’ve split Airflow into core and 61 (for now) provider packages. Each provider package is for either a particular external service (Google, Amazon, Microsoft, Snowflake), a database (Postgres, MySQL), or a protocol (HTTP/FTP). Now you can create a custom Airflow installation from “building” blocks and choose only what you need, plus add whatever other requirements you might have. Some of the common providers are installed automatically (ftp, http, imap, sqlite) as they are commonly used. Other providers are automatically installed when you choose appropriate extras when installing Airflow.

The provider architecture should make it much easier to get a fully customized, yet consistent runtime with the right set of Python dependencies.

But that’s not all: you can write your own custom providers and add things like custom connection types, customizations of the Connection Forms, and extra links to your operators in a manageable way. You can build your own provider and install it as a Python package and have your customizations visible right in the Airflow UI.
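
If you want to see which provider packages ended up in a given environment, something like the following sketch can help. It relies only on the apache-airflow-providers- naming prefix used by the provider distributions on PyPI; the helper names provider_name and installed_providers are made up for this example and are not part of Airflow.

```python
from importlib import metadata
from typing import Dict, Optional

# Provider distributions are published as apache-airflow-providers-<name>.
PROVIDER_PREFIX = "apache-airflow-providers-"


def provider_name(distribution_name: str) -> Optional[str]:
    """Return the short provider name, or None if this is not a provider package."""
    normalized = distribution_name.lower().replace("_", "-")
    if normalized.startswith(PROVIDER_PREFIX):
        return normalized[len(PROVIDER_PREFIX):]
    return None


def installed_providers() -> Dict[str, str]:
    """Map each installed provider's short name to its installed version."""
    found = {}
    for dist in metadata.distributions():
        short = provider_name(dist.metadata.get("Name") or "")
        if short:
            found[short] = dist.version
    return found


if __name__ == "__main__":
    # Prints nothing if no provider packages are installed.
    for name, version in sorted(installed_providers().items()):
        print(f"{name}=={version}")
```

In an environment without any provider packages, installed_providers() simply returns an empty dict.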

Security

As part of the Airflow 2.0 effort, there has been a conscious focus on security and reducing areas of exposure. This is represented across different functional areas in different forms. For example, in the new REST API, all operations now require authorization. Similarly, in the configuration settings, the Fernet key is now required to be specified.
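
If you don’t have a Fernet key yet, the usual route is Fernet.generate_key() from the cryptography package. The sketch below produces a key in the same format (32 random bytes, URL-safe base64 encoded) using only the standard library; the function name generate_fernet_key is just for illustration. The key can then be set as fernet_key in the [core] section of airflow.cfg or through the AIRFLOW__CORE__FERNET_KEY environment variable.

```python
import base64
import os


def generate_fernet_key() -> str:
    """Return a new Fernet-format key: 32 random bytes, URL-safe base64 encoded."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


if __name__ == "__main__":
    # Keep the generated key secret; it is used to encrypt connection passwords.
    print(f"AIRFLOW__CORE__FERNET_KEY={generate_fernet_key()}")
```

Generate the key once and reuse it across all Airflow components; values encrypted with one key cannot be read with another.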

Configuration

Configuration in the form of the airflow.cfg file has been rationalized further into distinct sections, specifically around “core”. Additionally, a significant number of configuration options have been deprecated or moved to individual component-specific configuration files, such as the pod-template-file for Kubernetes execution-related configuration.

We’ve tried to make as few breaking changes as possible and to provide a deprecation path in the code, especially in the case of anything called in the DAG. That said, please read through UPDATING.md to check what might affect you. For example, we re-organized the layout of operators (they now all live under airflow.providers.*) but the old names should continue to work; you’ll just notice a lot of DeprecationWarnings that need to be fixed up.
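
To give a feel for what fixing up those DeprecationWarnings involves, here is a small sketch that rewrites a few old 1.10 import paths to their 2.0 provider locations. The mapping is a tiny, non-exhaustive sample, and both MOVED_MODULES and modernize_imports are hypothetical names for this example, not an Airflow tool. It only handles lines that start with "from <module> import".

```python
# Old 1.10 module path -> 2.0 module path (small, non-exhaustive sample).
MOVED_MODULES = {
    "airflow.operators.postgres_operator": "airflow.providers.postgres.operators.postgres",
    "airflow.operators.mysql_operator": "airflow.providers.mysql.operators.mysql",
    "airflow.contrib.operators.ssh_operator": "airflow.providers.ssh.operators.ssh",
}


def modernize_imports(source: str) -> str:
    """Rewrite 'from <old module> import ...' lines to use the new module path."""
    rewritten = []
    for line in source.splitlines():
        for old, new in MOVED_MODULES.items():
            prefix = f"from {old} import "
            if line.startswith(prefix):
                # Keep the imported names, swap only the module path.
                line = f"from {new} import " + line[len(prefix):]
                break
        rewritten.append(line)
    # Note: a trailing newline in the input is not preserved.
    return "\n".join(rewritten)


if __name__ == "__main__":
    old_code = "from airflow.operators.postgres_operator import PostgresOperator"
    print(modernize_imports(old_code))
```

The new paths only import cleanly once the matching provider package is installed, so check UPDATING.md for the full list of moves before relying on a mapping like this.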