refactor: excel to json list (DEV-431) #155

jnussbaum · 2022-02-11T14:28:17Z

resolves DEV-431
also resolves DEV-137

…om_excel()

…mputed

- improve docstrings and annotations

- remove side-effect of get_values_from_excel by making a copy of parentnode before returning it

- improve error message output

…on-list' into wip/dev-431-refactor-excel-to-json-list

jnussbaum · 2022-02-14T16:56:09Z

The architecture of excel_to_json_lists.py seemed to me insufficiently documented, and in some places in need of improvement.

So I tried to bring more order into it, which worked reasonably well. There is just one thing I could not solve satisfactorily: the recursive call of get_values_from_excel() serves (on the surface) only to determine the maximum value of row. The second return value, parentnode, is discarded. In reality, however, there is a side effect here that I would like to eliminate:

Inside get_values_from_excel(), the parameter parentnode that is passed in gets modified and then returned. In the recursive call, currentnode is therefore modified, but this is obscured by the fact that the second return value is discarded.

To make this behaviour more explicit without actually changing what the program does, I would like to make the following change:

A copy of parentnode is modified and returned, and the return value is then assigned to currentnode instead of being discarded.

This should be semantically the same as before. Unfortunately, it then no longer works. The last commit (693eb11) shows that I had to revert this change, which made things worse instead of better. What did I do wrong? How could it be done better?

irinaschubert

It's hard to tell what exactly went wrong in your attempt to remove the side effect as there is quite a lot of change in the whole code. I would need to invest more time to investigate the issue. It might have something to do with your method returning two values now instead of just row. This probably changes the whole behaviour of the method.
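One possible explanation, sketched under assumptions and not verified against the actual commits: once the function works on a copy, rebinding the local name `currentnode` to the returned copy is not enough, because the parent's `nodes` list still holds the old object. The three functions below are hypothetical stand-ins that only mimic the mutate-and-return pattern described above; they are not the real `get_values_from_excel()`.

```python
import copy
from typing import Any, Dict

Node = Dict[str, Any]


def fill_in_place(parentnode: Node, depth: int) -> Node:
    """Hypothetical stand-in: mutates parentnode and returns the same object."""
    if depth > 0:
        currentnode: Node = {'name': f'level-{depth}'}
        parentnode.setdefault('nodes', []).append(currentnode)
        # The return value is discarded; this only works because
        # currentnode is mutated in place by the recursive call.
        fill_in_place(currentnode, depth - 1)
    return parentnode


def fill_copy_broken(parentnode: Node, depth: int) -> Node:
    """Works on a copy, but loses the children of every nested node."""
    result = copy.deepcopy(parentnode)
    if depth > 0:
        currentnode: Node = {'name': f'level-{depth}'}
        result.setdefault('nodes', []).append(currentnode)
        # This rebinds the local name only. The list inside result
        # still references the old, unfilled currentnode object.
        currentnode = fill_copy_broken(currentnode, depth - 1)
    return result


def fill_copy_fixed(parentnode: Node, depth: int) -> Node:
    """Works on a copy and stores the returned child in the parent."""
    result = copy.deepcopy(parentnode)
    if depth > 0:
        currentnode: Node = {'name': f'level-{depth}'}
        # Recurse first, then append the object that was actually returned.
        result.setdefault('nodes', []).append(fill_copy_fixed(currentnode, depth - 1))
    return result
```

In this sketch, `fill_copy_broken({'name': 'root'}, 2)` keeps only the first level of children, while `fill_copy_fixed` produces the same tree as `fill_in_place` without touching its argument. Whether this is what happened in the PR would need to be checked against the diff.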

irinaschubert · 2022-02-16T07:36:02Z

knora/dsplib/utils/expand_all_lists.py


-    if not lists:
+    if 'project' not in data_model or 'lists' not in data_model['project']:


Why do you need to check if project is in data_model?

Because I cannot be sure that data_model is a valid JSON data model, I want to do all checks beforehand. Afterwards, I can go on.

OK, I don't really see why this is needed here. Shouldn't this be checked somewhere else as it has nothing to do with the lists?

irinaschubert · 2022-02-16T07:38:04Z

knora/dsplib/utils/expand_all_lists.py

    """
-    Gets all list definitions from a data model and expands them to JSON if they are only referenced via an Excel file
+    Get all list definitions from a data model and expand them to JSON if they are only referenced via an Excel file


"gets" and "expands" were correct; the docstring should be descriptive, not directive.

ok, changed it back

irinaschubert · 2022-02-16T07:40:28Z

knora/dsplib/utils/expand_all_lists.py

        return []

+    lists = data_model['project']['lists']


I like the construction with .get() much better because it either returns None or something. If someone ever removes line 19f., line 22 could fail if lists is not present. This would cause an uncaught exception and crash the program.

I see, but I like to have a clear separation between all tests (the extensive testing on line 19 you wondered about) and the straight access (line 22).

If I use the construction with .get() thoroughly, it would need to be lists = data_model.get('project').get('lists'). But then, the first get could return None.

Yes that's true, you would have to take it apart into two operations.
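A minimal sketch of the two-step lookup discussed in this thread. The helper name `get_lists` is hypothetical and not part of the PR; it only illustrates how splitting the chained `.get()` avoids calling `.get()` on `None`.

```python
from typing import Any, Dict, List, Optional


def get_lists(data_model: Dict[str, Any]) -> Optional[List[Any]]:
    """Return data_model['project']['lists'], or None if it cannot be reached."""
    # Step 1: 'project' may be missing, or may not be a dict at all.
    project = data_model.get('project')
    if not isinstance(project, dict):
        return None
    # Step 2: only now is it safe to call .get() on project.
    lists = project.get('lists')
    return lists if isinstance(lists, list) else None
```

With this shape, the validation and the access stay in one place, so removing an earlier check cannot turn the access into an uncaught `KeyError` or `AttributeError`.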

irinaschubert · 2022-02-16T07:46:14Z

knora/dsplib/utils/expand_all_lists.py

        # check if the folder parameter is used
-        if rootnode.get("nodes") and isinstance(nodes, dict) and nodes.get("folder"):
+        if nodes and isinstance(nodes, dict) and nodes.get("folder"):


If you want to check whether nodes exists, you would have to use rootnode.get('nodes') on line 26, because rootnode['nodes'] never returns None; it raises an exception if nodes is not present. The current version gets this wrong: if nodes were ever missing, execution would never reach that line...

A detail, but it would be nice to use 'folder' instead of "folder" here as well :)

thanks for the hint, it was really inconsistent. I improved it.
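The difference between the two access styles can be shown with a short sketch. The function `uses_folder` is a hypothetical helper written for illustration, not code from this PR.

```python
from typing import Any, Dict


def uses_folder(rootnode: Dict[str, Any]) -> bool:
    """Return True if rootnode['nodes'] is a dict with a non-empty 'folder' entry."""
    # .get() returns None when 'nodes' is absent,
    # whereas rootnode['nodes'] would raise a KeyError.
    nodes = rootnode.get('nodes')
    return isinstance(nodes, dict) and bool(nodes.get('folder'))


print(uses_folder({'name': 'colors'}))                                    # False
print(uses_folder({'name': 'colors', 'nodes': {'folder': 'excel_lists'}}))  # True
print(uses_folder({'name': 'colors', 'nodes': [{'name': 'red'}]}))        # False
```

Since `isinstance(None, dict)` is already false, the separate truthiness check on `nodes` becomes redundant in this form.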

irinaschubert · 2022-02-16T07:50:04Z

knora/dsplib/utils/expand_all_lists.py

-            rootnode, excel_files = prepare_list_creation(excel_folder, rootnode.get("name"), rootnode.get("comments"))
-
+            prepared_rootnode, excel_files = prepare_list_creation(
+                nodes['folder'], str(rootnode['name']), rootnode['comments']


I'm not quite sure, but we should probably also check whether rootnode['name'] and rootnode['comments'] are available. Why do we need the explicit str() for rootnode['name']?

Yes, I expanded the checks.

I had used str() to make Mypy happy. But I didn't do it in a good way. I think it's better now, after I implemented your feedback.
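One way to satisfy a type checker without a blanket `str()` is to narrow the type with `isinstance`. The sketch below assumes a hypothetical helper `require_str`; it is not the code that was committed in response to this review.

```python
from typing import Any, Dict


def require_str(node: Dict[str, Any], key: str) -> str:
    """Return node[key] if it is a string, otherwise raise a clear error."""
    value = node.get(key)
    # After this check, a type checker such as Mypy treats value as str.
    if not isinstance(value, str):
        raise ValueError(f"'{key}' must be a string, got {type(value).__name__}")
    return value
```

Unlike `str(rootnode['name'])`, which would silently turn a missing value of `None` into the string `'None'`, this fails early with a message that names the offending key.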

knora/dsplib/utils/excel_to_json_lists.py

Co-authored-by: irinaschubert <irina.schubert@dasch.swiss>

knora/dsplib/utils/excel_to_json_lists.py

knora/dsplib/utils/expand_all_lists.py

Co-authored-by: irinaschubert <irina.schubert@dasch.swiss>

sonarcloud · 2022-02-17T15:06:59Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
1 Code Smell

No Coverage information
0.0% Duplication

irinaschubert

LGTM

jnussbaum added 5 commits January 28, 2022 13:47

Don't open the file each time in the recursive function get_values_fr…

92d6d4f

…om_excel()

instead of side effects, all methods should return the result they co…

47ed29b

…mputed

- make code more understandable by improving the logic

927d6bf

- improve docstrings and annotations

improve docstrings and annotations

2201182

- various style/annotation improvements

cdec1c4

- remove side-effect of get_values_from_excel by making a copy of parentnode before returning it

jnussbaum self-assigned this Feb 11, 2022

jnussbaum and others added 8 commits February 11, 2022 16:09

- correct mistake in logic

76bb885

- improve error message output

- reconstruct the side-effect from before

897a4f5

strip strings

367e3bf

adapt 'expand_all_lists.py' to the refactorings

9fb5972

Merge branch 'main' into wip/dev-431-refactor-excel-to-json-list

64aa62d

elminate little bug

b4afc09

Merge remote-tracking branch 'origin/wip/dev-431-refactor-excel-to-js…

1f61853

…on-list' into wip/dev-431-refactor-excel-to-json-list

introduce the side effect again

693eb11

jnussbaum requested a review from irinaschubert February 14, 2022 16:56

jnussbaum marked this pull request as ready for review February 15, 2022 15:28

irinaschubert suggested changes Feb 16, 2022

jnussbaum commented Feb 17, 2022

knora/dsplib/utils/excel_to_json_lists.py (outdated, resolved)

jnussbaum commented Feb 17, 2022

knora/dsplib/utils/excel_to_json_lists.py (outdated, resolved)

jnussbaum and others added 2 commits February 17, 2022 09:24

rename variable

88bca19

Co-authored-by: irinaschubert <irina.schubert@dasch.swiss>

correct previous error

adb66c2

jnussbaum commented Feb 17, 2022

knora/dsplib/utils/excel_to_json_lists.py (outdated, resolved)

jnussbaum commented Feb 17, 2022

knora/dsplib/utils/expand_all_lists.py (outdated, resolved)

jnussbaum and others added 2 commits February 17, 2022 15:06

improvements suggested by reviewer

15e97a6

Co-authored-by: irinaschubert <irina.schubert@dasch.swiss>

implement reviewer's feedback

555ebe8

jnussbaum requested a review from irinaschubert February 17, 2022 14:51

Merge branch 'main' into wip/dev-431-refactor-excel-to-json-list

0480f16

irinaschubert approved these changes Feb 17, 2022

jnussbaum merged commit 8a8c9d0 into main Feb 17, 2022

jnussbaum deleted the wip/dev-431-refactor-excel-to-json-list branch February 17, 2022 17:41
