Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: catch network interruptions during onto creation (DEV-1073) #210

Conversation

jnussbaum
Copy link
Collaborator

resolves DEV-1073

 - improve console output during the onto creation
 - thorough refactoring of the architecture
 - it is now more clear what the methods do
 - no more doubling, e.g. the onto isn't validated two times any more
@jnussbaum jnussbaum self-assigned this Aug 5, 2022
Comment on lines +38 to +39
- `-V` | `--validate-only`: If set, only the validation of the JSON file is performed.
- `-l` | `--lists-only`: If set, only the lists are created. Please note that in this case the project must already exist.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small change to make it more clear

- `-V` | `--validate-only`: If set, only the validation of the JSON file is performed.
- `-l` | `--lists-only`: If set, only the lists are created. Please note that in this case the project must already exist.
- `-v` | `--verbose`: If set, more information about the progress is printed to the console.
- `-d` | `--dump`: If set, dump test files for DSP-API requests.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option already existed in the code, but was not documented

Comment on lines +154 to -161
if validate_project(args.datamodelfile):
print('Data model is syntactically correct and passed validation.')
exit(0)
else:
exit(1)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The behaviour of validate_project() changed: It returns True or raises an Error with detailed explanation. So, it's not necessary any more to exit().

Comment on lines -247 to -249
except BaseError:
# return None if no groups are found or an error happened
return None
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want the error to escalate, so that I can handle it in the higher-level methods


def expand_lists_from_excel(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this method here, because it thematically belongs here. Before, it was in the file expand_all_lists.py, which contained nothing else than just this method. I think it makes more sense to move it here and delete the other file.

excelfiles: dict[str, Worksheet],
base_file: dict[str, Worksheet],
parentnode: dict[str, Any],
row: int,
col: int,
preval: list[str]
preval: list[str],
verbose: bool = False
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is used

  • by the create command, where it shouldn't print too much by default
  • during the excel command, where it should print every created node by default

The verbose switch seemed to be a good solution for this.

Comment on lines -327 to -329
print('Found the following files:')
for file in excel_files:
print(file)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be printed in the higher-level methods, because depending on the caller, a different print is needed.



def login(server: str, user: str, password: str) -> Connection:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the method "login()" to "shared_methods.py"

Comment on lines -135 to -146
# check if status is defined, set default value if not
group_status: Optional[str] = group.get("status")
group_status_bool = True
if isinstance(group_status, str):
group_status_bool = json.loads(group_status.lower()) # lower() converts string to boolean

# check if selfjoin is defined, set default value if not
group_selfjoin: Optional[str] = group.get("selfjoin")
group_selfjoin_bool = False
if isinstance(group_selfjoin, str):
group_selfjoin_bool = json.loads(group_selfjoin.lower()) # lower() converts string to boolean

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simplified these two radically: They are now directly in the constructor call below, as one-liners


sysadmin = False
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From here on, GitHub's diff renderling is a bit hard to understand. In PyCharm, it was more understandable. Basically, there are 4 things done inside the big loop for user in users:

  • skip the user if he already exists
  • if "groups" is provided, add user to the group(s)
  • if "projects" is provided, add user to the project(s)
  • create the user

I tried to simplify these four steps, to minimize the indentation depth, and to improve code understandability.

The diff might be hard to understand, but the code should be better understandable than before.

Comment on lines 436 to 440
project_remote: Project = try_network_action(
object=project_local,
method="read",
error_message_on_failure=""
)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the project can't be read, it raises a BaseError without message --> create the project.
If the reading suceeds without error --> update project

Comment on lines +81 to +84
if re.search(r'try again later', err.message) or re.search(r'status code=5\d\d', err.message):
print(f'{datetime.now().isoformat()}: Try reconnecting to DSP server, next attempt in {2 ** i} seconds...')
time.sleep(2 ** i)
continue
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines are new: If the API's response says that it is a temporary error, it is worth retrying. All other base errors are permanent and will escalate.

@@ -5,14 +5,12 @@
import json
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did I change this file?

  • I moved _try_network_action() to shared_methods.py
  • I changed the behaviour of _try_network_action(): If the connectivity issues remain, it won't return None anymore, but it will escalate the error that it received from the method that failed. For this reasons, I have to embed all usages of _try_network_action() into a try-except block

@@ -21,8 +20,8 @@ class TestTools(unittest.TestCase):
password = 'test'
imgdir = '.'
sipi = 'http://0.0.0.0:1024'
test_onto_file = 'testdata/test-onto.json'
test_list_file = 'testdata/test-list.json'
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got rid of the lists.json output file

@@ -68,23 +67,23 @@ def test_get(self) -> None:
group_descriptions_received = []
group_selfjoin_received = []
group_status_received = []
for group in groups_expected:
for group in sorted(groups_expected, key=lambda x: x["name"]):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some groups with more diverse characteristics. It is now necessary to sort the groups by name

Comment on lines +23 to +24
test_project_file = 'testdata/test-project-systematic.json'
test_project_minimal_file = 'testdata/test-project-minimal.json'
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not only important to have a systematic test project, but also to have a minimal one. During my work, I accidentally spotted a bug which was never detected. With the new minimal test project, it would have been detected.

Copy link
Collaborator

@BalduinLandolt BalduinLandolt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not an easy one to review... but I think you improved a lot. I hope I didn't miss anything important.

knora/dsplib/utils/excel_to_json_lists.py Show resolved Hide resolved
knora/dsplib/utils/onto_create_lists.py Outdated Show resolved Hide resolved
knora/dsplib/utils/onto_create_ontology.py Outdated Show resolved Hide resolved
knora/dsplib/utils/shared_methods.py Outdated Show resolved Hide resolved
knora/dsplib/utils/shared_methods.py Show resolved Hide resolved
@sonarcloud
Copy link

sonarcloud bot commented Aug 9, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 25 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@jnussbaum jnussbaum merged commit ab0e3b2 into main Aug 9, 2022
@jnussbaum jnussbaum deleted the wip/dev-1073-catch-network-interruption-while-creating-ontology branch August 9, 2022 09:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants