Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Raise error for unknown keywords #632

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

MRVermeulenDeltares
Copy link
Contributor

@MRVermeulenDeltares MRVermeulenDeltares commented Apr 24, 2024

#622

  • Throw error when unknown keyword is located in the mdu
  • When multiple unknown keywords are located in the mdu, have the error contain multiple unknown keywords.

@MRVermeulenDeltares MRVermeulenDeltares changed the title feat: give-warning-for-unknown-keyword feat: give-error-for-unknown-keyword May 10, 2024
Copy link

sonarcloud bot commented May 10, 2024

Quality Gate Passed Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

@MRVermeulenDeltares
Copy link
Contributor Author

MRVermeulenDeltares commented May 10, 2024

After the update to throw an error on an unknown key multiple test started to fail.
I have done some reasearch to try and figure out why.
I came to a few conclusions:

  1. Unknown keywords in the testfiles
  2. Outdated keywords in the testfiles
  3. Not defined keywords in the models (those keywords are in the manual)
  4. Problems with subfiles being validatated and not containing properties in their classes that are defined in their files. (e.g. .bc files and crosssection .ini related files, might be more. )

This is what I have found in regard to the keywords in relation with the tests, I tried to tackle all keywords but I am not 100% sure I have them all.
Manual I used for verification is: content.oss.deltares.nl/dhydro/D-Flow_FM_User_Manual.pdf

Section Keyword defined in manual Mentioned in #634
General guiversion Yes, Table A.1
Geometry pipefile No Yes
Geometry branchfile No
Geometry onednetworkfile No
Geometry shipdeffile No Yes
Geometry bedlevelfile Yes, Table A.4, deprecated and removed
VolumeTables usevolumetablesfile No
Numerics jasfer3d No Yes
Numerics jarhoxu No
Numerics vertadvtypmom3onbnd No
Numerics jposhchk No Yes
Numerics newcorio Yes, Table A.3, research keyword Yes
Numerics jaorgsethu Yes, but only in examples, not in a table defining the keyword
Numerics jaupwindsrc No Yes
Numerics eddyviscositybedfacmax No Yes
Numerics icoriolistype Yes, Table A.3, research keyword Yes
Numerics zlayercenterbedvel Yes, but only once in a text: "10.3 Z-layer modelling" Yes
Numerics epshstem No Yes
Numerics zwsbtol No Yes
Numerics horadvtypzlayer Yes, Table A.3, research keyword Yes
Numerics corioadamsbashfordfac Yes, Table A.3, research keyword Yes
Numerics drop3d Yes, Table A.3, research keyword Yes
Numerics transporttimestepping Yes, Table A.4, deprecated and removed
Numerics transportmethod Yes, Table A.4, deprecated and removed
Numerics noderivedtypes No
Numerics fixedweirfrictscheme No Yes
Numerics jbasqbnddownwindhs Yes, Table A.3, research keyword Yes
Numerics logprofkepsbndin Yes, Table A.3, research keyword Yes
Physics selfattractionloading Yes, But only mentions in Table F.4 Yes
Physics soiltempthick No Yes
Physics jadelvappos No
Physics uniffrictcoef1d2d No Yes
Physics umodlin No
Physics effectspiral Yes, but only in examples, not in a table defining the keyword
Time timestepanalysis No Yes
Time dtfacmax No Yes
Time autotimestepdiff No
Wind windhuorzwsbased Yes, Table A.3, research keyword Yes
Waves wavenikuradse No
Output writepart_domain Yes, but only once in a text: "6.4.2 Partitioning a model" Yes
Output wrimap_salinity ~Yes, Table F.5
Output velocitydirectionclassesinterval No Yes
Output timesplitinterval No Yes
Output wrimap_temperature ~Yes, Table F.5
Output wrimap_input_dt No
Output writebalancefile Yes, Table A.4, deprecated and removed
Output wrihis_heatflux No
Output wrirst_bnd No Yes
Output writedfminterpretedvalues No Yes
Output s1incinterval No
Output velocitymagnitudeclasses No Yes

Action points for issue

  • Wait on implementation of Add mdu keywords (not listed in appendix A of the manual) #634
  • Determine if this list ^ contains keywords that need to be added which aren't defined in another issue
  • Remove unknown keywords from testfiles
  • Further investigate and resolve the issue with .bc files and crosssection .ini related files

@tim-vd-aardweg tim-vd-aardweg changed the title feat: give-error-for-unknown-keyword feat: Raise error for unknown keywords May 13, 2024
@tim-vd-aardweg
Copy link
Contributor

In #634 I wanted to add this test:

    def test_load_model_with_research_keywords_as_fmmodel_raises_error(self):
        input_mdu = (
                test_input_dir / "research" / "mdu_with_research_keywords_from_dia_file_2024.03_release.mdu"
        )

        with pytest.raises(ValueError) as e:
            _ = FMModel(filepath=input_mdu)

        expected_error = "Unknown keywords are detected in section"
        assert expected_error in str(e.value)

But it currently fails, since we have Extra.ignore. So I can't yet add that test. That test can only be added after this issue is implemented.

@@ -55,6 +56,15 @@ class Config:
extra = Extra.ignore
arbitrary_types_allowed = False

def __init__(self, **data):
super().__init__(**data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be more efficient to first check if there are unknown keywords and then call super().__init__()? It seems we have all the data we need to determine if there are unknown keywords. And if there are unknown keywords there is no reason to let pydantic work all its magic, since we are raising an error anyway?


from pydantic import Extra
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used. There are a couple other imports that are not used. Please remove those as well.

Notify the user of unknown keywords.

Args:
data (Dict[str, Any]) : Input data containing all set properties which are checked on unknown keywords.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are set properties? It's a dict, not a set? 😊

def __init__(self, **data):
super().__init__(**data)
self._unknown_keyword_error_manager.raise_error_for_unknown_keywords(
data,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we are only using the keys of this dict in the raise_error_for_unknown_keywords() method. So just pass data.keys() and update the types expected by the method as you will now pass a list of strings instead of a dict.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, ignore this comment. It seems perfectly fine to just pass the dict :)

self, data: Dict[str, Any], fields: Dict[str, Any], excluded_fields: Set
) -> List[str]:
list_of_unknown_keywords = []
for name, _ in data.items():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use: for name in data. There is no need to get the keys and values!

def _is_unknown_keyword(
self, name: str, fields: Dict[str, Any], excluded_fields: Set
):
return name not in fields and name not in excluded_fields
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In HYDROLIB-core we use pydantic to turn dictionaries into valid objects. Pydantic can map the key in the dictionary to the appropriate attribute by either looking at the field name or the field alias. This behaviour can be set in the config (config.allow_population_by_field_name). In our BaseModel we set config.allow_population_by_field_name to True. This means that we want to be able to map the data via either the name or the alias.

In your implementation, we say that a keyword is unknown if there is no attribute with the same name as the key. However, this will, for example, fail for the research keywords that are implemented in #642. All research keyword attribute names are prefixed with research. This will fail the mapping and they will be considered an unknown keyword. I therefore think that it is important to not only check the field names, but also their alias!

@@ -55,6 +56,15 @@ class Config:
extra = Extra.ignore
arbitrary_types_allowed = False

def __init__(self, **data):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there a reason we could not put this in a root_validator?

@tim-vd-aardweg
Copy link
Contributor

This doesn't work for unknown sections, right? Should you create a follow-up issue for this?

@@ -388,3 +392,115 @@ def test_loading_fmmodel_model_with_both_ini_and_xyn_obsfiles(self):
assert actual_point.x == expected_point.x
assert actual_point.y == expected_point.y
assert actual_point.name == expected_point.name

def test_mdu_unknown_keywords_loading_gives_message_for_missing_keyword(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def test_mdu_unknown_keywords_loading_gives_message_for_missing_keyword(
def test_mdu_unknown_keywords_loading_gives_message_for_unknown_keyword(

tmp_mdu_path.write_text(tmp_mdu)

with patch(
"hydrolib.core.dflowfm.ini.models.INIBasedModel.Config"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this mock?

)
assert name in captured.out

def test_mdu_unknown_keywords_loading_gives_message_for_missing_keyword2(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Give it an appropriate name please

FMModel(filepath=tmp_mdu_path)
captured = capsys.readouterr()

excluded_fields = ["comments", "datablock", "_header"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test will still pass if we ever change the _exclude_fields of the model. So maybe use: excluded_fields = model._exclude_fields instead? Maybe even assert that the list is not empty?

for excluded_field in excluded_fields:
assert excluded_field not in captured.out

def test_mdu_unknown_keywords_allow_extra_setting_field_gives_message(self, capsys):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion there is no need to check all the different options for Extra.<allow/forbid/ignore>. The user should not edit the values we have set, since it's not a public setting. If they decide to change it anyway, we can't offer support/validation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

When there are unknown keywords in the mdu, a warning should be given instead of them being silently dropped.
2 participants