Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: sqa deprecations for airflow tests #39299

Merged
merged 1 commit into from May 4, 2024

Conversation

dondaum
Copy link
Contributor

@dondaum dondaum commented Apr 28, 2024

related: #28723

fix deprecations for SQLAlchemy 2.0 for Airflow tests.

@Taragolis I went through all test warnings. IMO for the majority of SQA 2.0 warnings there isn't something to do. That is often the case with the warnngs 'object is being merged into a Session along the backref cascade... ' SQA is warning the a object is pulled into the session with a backref cascade and not explicit. But often we compensate for that by adding the object a few lines later in the session which should be fine. In such cases we still get a warning but there won't be an issue later. Here it is also explained.

Fixed reported in tests


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:serialization labels Apr 28, 2024
@dondaum dondaum changed the title fix: sqa deprecations for airflow test fix: sqa deprecations for airflow tests Apr 28, 2024
@dirrao dirrao requested review from Taragolis and potiuk April 29, 2024 03:25
@Taragolis
Copy link
Contributor

Fixed but still raise a warning
tests/serialization/test_pydantic_models.py:162
tests/serialization/test_pydantic_models.py:163

Could you provide what kind of error you tried to fix?

@dondaum
Copy link
Contributor Author

dondaum commented May 1, 2024

Fixed but still raise a warning
tests/serialization/test_pydantic_models.py:162
tests/serialization/test_pydantic_models.py:163

Could you provide what kind of error you tried to fix?

Both raise the following warnings

{"category": "sqlalchemy.exc.RemovedIn20Warning", "message": "\"DagScheduleDatasetReference\" object is being merged into a Session along the backref cascade path for relationship \"DatasetModel.consuming_dags\"; in SQLAlchemy 2.0, this reverse cascade will not take place.  Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior; or to set globally for the whole Session, set the future=True flag (Background on this error at: https://sqlalche.me/e/14/s9r1) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)", "filename": "tests/serialization/test_pydantic_models.py", "lineno": 162, "when": "runtest", "node_id": "tests/serialization/test_pydantic_models.py::test_serializing_pydantic_dataset_event", "group": "tests", "count": 1}
{"category": "sqlalchemy.exc.RemovedIn20Warning", "message": "\"TaskOutletDatasetReference\" object is being merged into a Session along the backref cascade path for relationship \"DatasetModel.producing_tasks

We can fix it by first intializing the object, then adding it into the Session and then setting the dataset which is where the backref is happening.

@pytest.mark.skipif(not _ENABLE_AIP_44, reason="AIP-44 is disabled")
def test_serializing_pydantic_dataset_event(session, create_task_instance, create_dummy_dag):
    ...
    dag_ds_ref = DagScheduleDatasetReference(dag_id=dag.dag_id)
    session.add(dag_ds_ref)
    dag_ds_ref.dataset = ds1
    task_ds_ref = TaskOutletDatasetReference(task_id=task1.task_id, dag_id=dag.dag_id)
    session.add(task_ds_ref)
    task_ds_ref.dataset = ds1

Yet IMO the code would looks less readable than intializing the whole object first and add it into the Session.

@pytest.mark.skipif(not _ENABLE_AIP_44, reason="AIP-44 is disabled")
def test_serializing_pydantic_dataset_event(session, create_task_instance, create_dummy_dag):
    ...
    dag_ds_ref = DagScheduleDatasetReference(dag_id=dag.dag_id, dataset=ds1)
    task_ds_ref = TaskOutletDatasetReference(task_id=task1.task_id, dag_id=dag.dag_id, dataset=ds1)
    session.add_all([dag_ds_ref, task_ds_ref])

@Taragolis
Copy link
Contributor

Set cascade_backrefs to False in either the relationship() or backref() function for the 2.0 behavior

I guess it points to change in models relationships, maybe it is simmilar: sqlalchemy/sqlalchemy#6148

@dondaum
Copy link
Contributor Author

dondaum commented May 2, 2024

Oh interesting. Due to your comment I looked again in the SQL Alchemy docs

I thought we have enabled already SQA 2.0 style by setting future=True in the engine here but it seems that this enables 2.0 style of engines and connections only. Yet for the ORM, to anble future behavior in the ORM Sessions we also need to pass future=True in the sessionmaker object.

    Session = scoped_session(
        sessionmaker(
            autocommit=False,
            autoflush=False,
            bind=engine,
            expire_on_commit=False,
            future=True
        )
    )

If I do the warnings disapear. If I remove my fixes the tests fail. So it seems that is pretty usefull to test if the changes / fixes work and it might makes in general sense to set future=True also for the ORM session ?

@Taragolis
Copy link
Contributor

There is one small things, according to the docs it is recommended switch to Future=True after we resolve all RemovedIn20 warnings. https://docs.sqlalchemy.org/en/20/changelog/migration_20.html#migration-to-2-0-step-five-use-the-future-flag-on-session

So maybe we need to start replace legacy backrefs by back_populates: https://docs.sqlalchemy.org/en/14/orm/relationships.html because this warning related only to backref

@dondaum
Copy link
Contributor Author

dondaum commented May 3, 2024

Yes legacy backrefs should be replaced anyway by back_populates and explicit relationships.

Yet we will still see those warnings as also with back_populates and an explicit relationship SQLAlchlemy 1.4 still uses the old way of merging an object into the Session along the cascade path.

Just to make sure, we can fix all object is being merged into a Session along the backref cascade path with the current setup (NOT setting future=True in the Session object) when we

  • create the object first,
  • adding it then to the session and
  • adding afterwards the relationship values where backref/back_populates is happening.

Similar to here. IMO it will still be less readable and unstandable. It is also just a warning that does not lead to false behavior when handled correctly. And you can confirm it by using future=True in the Session

WDYT? Shall I change it in a way that all warnings are fixed or shall we acknowledge some warnings as False positives ?

@Taragolis
Copy link
Contributor

I would rather fix one by one rather than apply fix which might hide this

We still have third party components, which strictly depends on SA14 and might not work correctly with new sessions

@Taragolis
Copy link
Contributor

Particular this one described in https://docs.sqlalchemy.org/en/14/orm/cascades.html#save-update
And there is description about cascade_backrefs which is True by default in 1.4, and False in 2.0

@dondaum dondaum force-pushed the fix/sqa-deprecations-airflow-tests branch from a4aaaa9 to 39d7fe4 Compare May 3, 2024 19:58
@dondaum
Copy link
Contributor Author

dondaum commented May 3, 2024

Ok. Changed.
All RemovedIn20Warning should be fixed.

@Taragolis Taragolis added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label May 4, 2024
@Taragolis Taragolis merged commit 8d29a96 into apache:main May 4, 2024
41 checks passed
RodrigoGanancia pushed a commit to RodrigoGanancia/airflow that referenced this pull request May 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:API Airflow's REST/HTTP API area:serialization changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants