-
-
Notifications
You must be signed in to change notification settings - Fork 471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] [RFC] Framework for extending Dataset #989
Open
darcymason
wants to merge
6
commits into
main
Choose a base branch
from
enh/dataset-extension
base: main
Could not load branches
Branch not found: {{ refName }}
Could not load tags
Nothing to show
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
6 commits
Select commit
Hold shift + click to select a range
ab65986
Add keyword property to BaseTag
darcymason ccd769d
Dataset tag handler framework
darcymason 4e6a472
Add ability to clear dataset tag handlers
darcymason 0d07564
Add test for handlers by keyword
darcymason 0ce3a65
pep8 fixes
darcymason 20d9249
Merge branch 'master' into enh/dataset-extension
darcymason File filter
Filter by extension
Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
from pydicom.dataset import Dataset | ||
from pydicom.tag import Tag | ||
import warnings | ||
|
||
|
||
def handle_dataset_tags(tags): | ||
"""Define a decorator to use with classes which handle tags""" | ||
def decorator(handler_class): | ||
existing_tags = set(Dataset._handler_classes.keys()) & set(tags) | ||
if existing_tags: | ||
warnings.warn( | ||
"Registration of handler {!r} is overriding" | ||
"existing handling of tags {}".format(handler_class, | ||
existing_tags) | ||
) | ||
Dataset._handler_classes.update({Tag(tag): handler_class | ||
for tag in tags}) | ||
return handler_class | ||
|
||
return decorator |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
# Copyright 2008-2018 pydicom authors. See LICENSE file for details. | ||
"""Test for encaps.py""" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. copy/paste? |
||
|
||
import pytest | ||
|
||
from pydicom.tag import Tag | ||
from pydicom.dataset import Dataset | ||
from pydicom.dataelem import DataElement | ||
from pydicom.extensions import handle_dataset_tags | ||
|
||
|
||
class TestDatasetExtension(object): | ||
"""Test handlers which extend Dataset""" | ||
def setup(self): | ||
@handle_dataset_tags(["ImagePositionPatient", "PatientName"]) | ||
class MyHandler: | ||
def __init__(self, dataset): | ||
self.ds = dataset | ||
|
||
def get_item(self, tag): | ||
"""Return the data element instance""" | ||
if tag.keyword == "ImagePositionPatient": | ||
x, y, z = self.ds.ImagePositionPatient | ||
# This one updates underlying dataset | ||
self.ds.ImagePositionPatient = [x + 1, y + 1, z + 1] | ||
return self.ds["ImagePositionPatient"] | ||
elif tag.keyword == "PatientName": | ||
# This one leaves underlying dataset alone | ||
return DataElement(tag, "PN", "anonymous") | ||
else: | ||
NotImplementedError("Unhandled tag") | ||
self.ds = Dataset() | ||
self.ds.PatientName = "Patient^Joe" | ||
self.ds.ImagePositionPatient = [0.0, 0.0, 0.0] | ||
self.ds.PatientID = "123" | ||
|
||
def teardown(self): | ||
Dataset.clear_handler_classes() | ||
|
||
def test_get_item(self): | ||
"""Test handler handles specified tags when retrieved by tag.""" | ||
assert [1.0, 1.0, 1.0] == self.ds["ImagePositionPatient"].value | ||
assert "anonymous" == self.ds["PatientName"].value | ||
# Check that one we haven't changed is okay | ||
assert "123" == self.ds["PatientID"].value | ||
|
||
def test_get_attr(self): | ||
"""Test handler handles specified tags when retrieved by keyword.""" | ||
assert [1.0, 1.0, 1.0] == self.ds.ImagePositionPatient | ||
assert "anonymous" == self.ds.PatientName | ||
|
||
# assert that unchanged on still okay | ||
assert "123" == self.ds.PatientID |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it make sense to allow more than one handler? I'm thinking of handlers for different SOP classes that may handle the same tag differently. Not sure if that is an intended use case - maybe that would need another mechanism that takes the SOP class into account (similar to the supported TransferSyntax in the image data handlers).
Or I'm just misunderstanding the use case here. I had been thinking about these handler classes as a replacement for the proposed SR classes (
SRDocumentStorage
,WaveformStorage
etc) - maybe you could elaborate a bit about the intended use cases?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good points ... I had planned to extend this to tag groups, or slices of tag numbers, which I think could handle most special cases. For example, SR code elements seem to be largely group 0040.
The idea of handling SOP classes is interesting, and we could put handlers in for that as well, but I don't know how to extend that to nested datasets within Sequences, which wouldn't have an SOP class data element. Maybe there is some way in analogy to what was done with SpecificCharacterSet, keeping track of parents and using their handlers. Would take some thought to figure that out. But if someone were building something up from scratch, they might start with some SR content items which they then assemble into an overall dataset. Along the way, there would only be the tag numbers to identify what each data element was.
In terms of more than one handler, I'm not sure how that would work - generally speaking only one operation can actually occur. E.g. for 'get' operations, some data element has to come back, and we would have to allow only one handler to actually provide it. Similarly with setting items, it would be very confusing if multiple handlers could be doing different things to the dataset contents.
In the end, I'd like to see something here that can handle some things with relatively clearly defined conditions. There will no doubt be some cases where subclassing would be the only thing flexible enough to handle all cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add that I need to go back and review the classes in this PR again, and see how/if these ideas can fit. I started with something really general, but should check it in the specifics of this SR code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I thought about calling handlers until some of them is able to handle it. That would imply that the handlers should know if they want to handle the dataset, and also, that there may be conflicts if two handlers want to handle the same data for the same tag or tag range, so that the first would win... which is probably not a good thing, as it would depend on the order of the handlers. It would actually be similar to the pixel data handlers in this aspect. Also, you are right - if you want to extend this to nested sequences, it would be more complicated.
What I had in mind was actually a generalization of the pixel data handler concept for other tags beside
PixelData
. So, if this is not really a use case, it may just be a bad idea.I think your idea is an interesting one. I'm not sure yet how to apply it for SR code, but there are other use cases anyway. I can think for example about handling vendor-specific and/or non-standard-conform tags, or a kind of cheap anonymization - actually there are lots of possible use cases for pydicom users now I think about it. I guess I have been thinking more about using it inside pydicom so far...