New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tracking to data file type column names 2. #553
base: issue-511
Are you sure you want to change the base?
Conversation
Testing that the changes fix what was raised in ISA-tools#509.
as discussed today:
|
Changed the added test to be more realistic. Had to track name columns like Data Transformation Name so values would be handled correctly.
I did an example more like what was discussed in the meeting. I had to also add tracking for the columns that are added for known protocol types. The problem that this PR solves is that for any ".* File" column, if there is more than 1 then the values don't get written out correctly. Solving this wasn't as easy as it was for a similar issue with Protocol REF columns. Due to how graph nodes were previously handled. Old Table Output:
New Table Output:
|
PRS to test this situation, i.e., several files generated by one data acquisition:
|
@@ -260,7 +263,7 @@ def get_node_by_label_and_key(labl, this_key): | |||
fv_set.add(fv) | |||
material.factor_values = list(fv_set) | |||
|
|||
elif object_label in _LABELS_DATA_NODES: | |||
elif object_label in _LABELS_DATA_NODES or ' File' in object_label: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
object_label="foo File" would cause issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I don't think "foo File" would validate.
- There are other instances of patterns like
' File' in
orendswith('File')
where "foo File" would cause an issue that were already present. I just made things consistent. - When this change was originally made not every File column was in _LABELS_DATA_NODES.
- Having a list of specific acceptable file names is pretty fragile anyway and I would have generalized to columns ending in " File" a while ago, or a pattern like what "Protocol REF" does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing in meeting. I will remove this and make the code always just look in _LABELS_DATA_NODES.
def _build_paths_and_indexes(process_sequence=None): | ||
"""Returns the paths from source/sample to end points and a mapping of sequence_identifier to object.""" | ||
|
||
def _compute_combinations(identifier_list, identifiers_to_objects): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rebasing the original branch for this did not go well, so I manually created another one. This PR should replace #510. I was able to get this to pass all of the tests and now multiple ".* File" columns won't cause an issue. We don't need this capability like I thought we did originally, but I figured I would go on and make it work anyway.