New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
docs: improve docs and example data for excel2xml (DEV-1370) #233
Conversation
This reverts commit 6596369.
…t XML will be identical
@@ -58,7 +57,7 @@ def make_xsd_id_compatible(string: str) -> str: | |||
return res | |||
|
|||
|
|||
def find_date_in_string(string: str, calling_resource: str = "") -> Optional[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are 2 changes in this method:
- param
calling_resource
was a relict from earlier times, which was not used any more. - param
string
can actually be None, this case needs to be covered
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is good. If the method does what it says it does, namely "find a date in a string", it should not be possible to hand None for the string...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I changed the signature back to str
, but I still sanitize the input. That means that the code demands string to be a string, but it is still ready for the case that string is not a string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are some things that struck me during review.
For me, two main questions remain open after looking over it:
- what is the target audience of this example repo? I have the feeling that parts of it address a user with very little knowledge, whereas other parts are too difficult for such a user.
- wouldn't it make more sense to have this example repository live in its separate repository? and if so, wouldn't it serve the same purpose, if there was a sample python script in the rosetta repo? and in any case, is it necessary to have automated tests for such a script?
@@ -355,7 +356,7 @@ def _upload_resources( | |||
""" | |||
|
|||
# If there are multimedia files: calculate their total size | |||
bitstream_all_sizes_mb = [os.path.getsize(res.bitstream.value) / 1000000 for res in resources if res.bitstream] | |||
bitstream_all_sizes_mb = [os.path.getsize(os.path.join(imgdir, res.bitstream.value)) / 1000000 for res in resources if res.bitstream] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pathlib.Path
tends to offer a nice, higher-level API for most use cases of the os.path
module.
test/e2e/test_0123_import_scripts.py
Outdated
shutil.rmtree("docs/assets/0123-import-scripts-extracted", ignore_errors=True) | ||
if os.path.isfile("docs/assets/0123-import-scripts/data-processed.xml"): | ||
os.remove("docs/assets/0123-import-scripts/data-processed.xml") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could be done much nicer using Path
The potentials for the everyday work of the Client Services at DaSCH are twofold: | ||
1. Data cleaning (recommended): For this purpose, you can think of OpenRefine as a much better version of Excel. You | ||
can perform operations which would be very tiresome in Excel. | ||
2. Conversion to our dps-customised xml format for bulk upload (not recommended) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does dps-customised mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ups, that's a weird wording. In my tutorial on the new homepage, I have "DSP-specific XML format". That's better, I guess.
@@ -58,7 +57,7 @@ def make_xsd_id_compatible(string: str) -> str: | |||
return res | |||
|
|||
|
|||
def find_date_in_string(string: str, calling_resource: str = "") -> Optional[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is good. If the method does what it says it does, namely "find a date in a string", it should not be possible to hand None for the string...
prevent e2e test from making the submodule's head detached
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I have to admit that I lost the overview of the PR. Maye you could give it to someone who would have to work with it in the end to check if everything makes sense to a potential user..? Just an idea though, maybe this doesn't make sense at this stage.
the DSP server. | ||
""" | ||
# pull the latest state of the git submodule | ||
os.system("git submodule update --init --recursive") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it maybe possible to move this into a beforeAll
method instead of inside the test case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, but then I have no place where I can put my import statement (from knora.dsplib.import_scripts import import_script
). Python doesn't accept it inside a method, but I have to make the import after the OS call. So I think I just leave it as it is.
resolves DEV-1370