version 2.4.1 migration job "run-airflow-migrations" failed once, but it is not getting restarted #27992
-
The airflow-run-airflow-migrations job was initially showing 1 failed run
-
Did you use flux/argocd/terraform? Or did you use kubectl to install/upgrade? On this page at the bottom it says
...
-
@minnieshi Second, I tried this ... then this ..
kubectl apply -f airflow_run_migration.yaml
and checked that it ran OK with
kubectl get job -o wide | grep migration
-
I have deployed the below job:
# Airflow Run Migrations
apiVersion: batch/v1
kind: Job
metadata:
  name: dev-3pp-airflow-run-airflow-migrations
  namespace: dev-3pp-airflow
  labels:
    tier: airflow
    component: run-airflow-migrations
    release: dev-3pp-airflow
    chart: "airflow-1.13.1"
    heritage: Helm
  annotations:
    helm.sh/hook: post-install,post-upgrade
    helm.sh/hook-delete-policy: before-hook-creation,hook-succeeded
    helm.sh/hook-weight: "1"
spec:
  template:
    metadata:
      labels:
        tier: airflow
        component: run-airflow-migrations
        release: dev-3pp-airflow
    spec:
      securityContext:
        runAsUser: 50000
        fsGroup: 0
      restartPolicy: OnFailure
I am also getting the error below in my PostgreSQL pod:
Traceback (most recent call last):
It seems my issue is due to "There are still unapplied migrations after 60 seconds." Can you help me out here?
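To check which migrations are actually missing, the two sets of revision heads can be pulled out of the TimeoutError text. The snippet below is only a sketch: parse_heads is a hypothetical helper (not part of Airflow), and the sample message is the one from the scheduler log quoted in the original question.

```python
import re
from typing import Set, Tuple

# Matches the tail of the TimeoutError message raised by the migration check.
_HEADS = re.compile(
    r"Head\(s\) in DB: (?P<db>.*?) \| Migration Head\(s\) in Source Code: (?P<src>.*)$"
)


def parse_heads(message: str) -> Tuple[Set[str], Set[str]]:
    """Return (heads found in the DB, heads expected by the source code)."""
    match = _HEADS.search(message)
    if match is None:
        raise ValueError("message does not contain migration heads")

    def to_set(text: str) -> Set[str]:
        # "set()" contains no quoted items, so it becomes an empty set.
        return set(re.findall(r"'([^']+)'", text))

    return to_set(match["db"]), to_set(match["src"])


message = (
    "TimeoutError: There are still unapplied migrations after 60 seconds. "
    "MigrationHead(s) in DB: set() | Migration Head(s) in Source Code: {'ecb43d2a1842'}"
)
db_heads, source_heads = parse_heads(message)
print("missing revisions:", source_heads - db_heads)
```

An empty set on the DB side, as in this message, suggests that no migration has been recorded in the database at all, rather than the database being only partly upgraded.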
-
We are using the Apache Airflow Helm chart with Airflow version 2.4.1.
The wait-for-airflow-migrations container of the airflow-scheduler pod was showing these errors and restarting every 60 seconds or so.
kubectl logs -f airflow-scheduler-6f6d976b8-p29t8 -c wait-for-airflow-migrations
/home/airflow/.local/lib/python3.7/site-packages/airflow/configuration.py:367: FutureWarning: The auth_backends setting in [api] has had airflow.api.auth.backend.session added in the running config, which is needed by the UI. Please update your config before Apache Airflow 3.0.
FutureWarning,
[2022-11-29T15:42:30.517+0000] {db.py:737} INFO - Waiting for migrations... 0 second(s)
[2022-11-29T15:42:31.522+0000] {db.py:737} INFO - Waiting for migrations... 1 second(s)
..
[2022-11-29T16:08:36.566+0000] {db.py:737} INFO - Waiting for migrations... 59 second(s)
Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 8, in <module>
    sys.exit(main())
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/__main__.py", line 39, in main
    args.func(args)
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/cli/cli_parser.py", line 52, in command
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/cli/commands/db_command.py", line 138, in check_migrations
    db.check_migrations(timeout=args.migration_wait_timeout)
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/utils/db.py", line 739, in check_migrations
    f"There are still unapplied migrations after {timeout} seconds. Migration"
TimeoutError: There are still unapplied migrations after 60 seconds. MigrationHead(s) in DB: set() | Migration Head(s) in Source Code: {'ecb43d2a1842'}
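The log pattern above (one "Waiting for migrations" line per second, then a TimeoutError at 60) is what a simple polling loop would produce. The following is a simplified sketch of that behaviour, not Airflow's actual code: wait_for_migrations, get_db_heads and the sleep parameter are names made up for illustration.

```python
import time
from typing import Callable, Set


def wait_for_migrations(
    get_db_heads: Callable[[], Set[str]],
    source_heads: Set[str],
    timeout: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll once per second until the DB heads match the source heads.

    Returns the number of seconds waited, or raises TimeoutError.
    """
    for waited in range(timeout):
        if get_db_heads() == source_heads:
            return waited
        print(f"Waiting for migrations... {waited} second(s)")
        sleep(1)
    raise TimeoutError(
        f"There are still unapplied migrations after {timeout} seconds. "
        f"Head(s) in DB: {get_db_heads()} | Head(s) in source code: {source_heads}"
    )
```

Under this reading, the init container can never succeed on its own: it only waits, so it keeps timing out and restarting until something else (here, the run-airflow-migrations job) actually applies the migrations.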
The output of kubectl describe job airflow-run-airflow-migrations did not show any detailed error message.
I had to manually restart the job by doing the following to stabilize things and get the Airflow scheduler and webserver to start OK:
kubectl get job airflow-run-airflow-migrations -o json | jq 'del(.spec.selector)' | jq 'del(.spec.template.metadata.labels)' | kubectl replace --force -f -
kubectl get job airflow-run-airflow-migrations -o yaml > airflow_run_migration.yaml
kubectl apply -f airflow_run_migration.yaml
Warning: resource jobs/airflow-run-airflow-migrations is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
job.batch/airflow-run-airflow-migrations configured
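For anyone without jq, the two del(...) filters in the first command can be reproduced in a few lines of Python. This is a sketch only: strip_job_for_recreate is a hypothetical helper, and the sample Job below is a trimmed, made-up object rather than real cluster output.

```python
import copy


def strip_job_for_recreate(job: dict) -> dict:
    """Return a copy of a Job object without .spec.selector and
    .spec.template.metadata.labels, mirroring the two jq del(...) filters."""
    cleaned = copy.deepcopy(job)
    spec = cleaned.get("spec", {})
    spec.pop("selector", None)
    spec.get("template", {}).get("metadata", {}).pop("labels", None)
    return cleaned


# Trimmed, hypothetical Job object for illustration.
sample_job = {
    "kind": "Job",
    "metadata": {"name": "airflow-run-airflow-migrations"},
    "spec": {
        "selector": {"matchLabels": {"controller-uid": "example-uid"}},
        "template": {
            "metadata": {"labels": {"controller-uid": "example-uid"}},
            "spec": {"restartPolicy": "OnFailure"},
        },
    },
}
cleaned_job = strip_job_for_recreate(sample_job)
```

The cleaned object could then be serialized with json.dumps and piped to kubectl replace --force -f - in the same way as the jq output. The likely reason the fields have to go is that they carry values generated for the old Job, which would not match a freshly created one.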
The question is: why did the job not get restarted on failure?