Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mapping of private into standard attributes as part of legacy enhanced MF generation #19

Open
fedorov opened this issue Feb 25, 2020 · 17 comments
Assignees
Labels
question Further information is requested

Comments

@fedorov
Copy link
Member

fedorov commented Feb 25, 2020

Would this task be considered "in scope" for highdicom?

@fedorov fedorov added the question Further information is requested label Feb 25, 2020
@hackermd
Copy link
Collaborator

I would say that depends on the complexity of the task. If it's as easy as creating a dict object that maps private attribute to standard attribute than I think we can take it on. However, I defer to you and @pieper.

@pieper
Copy link
Member

pieper commented Feb 25, 2020

Some cases will be very simple, but when you get into sophisticated imaging scenarios like diffusion MRI the task is very complex (see for example dcm2niix and related efforts like DWIConvert). Perfusion, PET, fMRI, 4D Ultrasound, microCT, CBCT, etc all have similar complexities when you scratch the surface a bit.

For highdicom I would suggest making a framework that is extensible so that people working these domains can specialize the general solution for their own needs. Ideally there should be contributions from the community to cover a broad range of real world use cases.

@hackermd
Copy link
Collaborator

It seems that the task can get quite messy. Therefore, we may need to put some more thought into the design of the highdicom.legacy API, in case we would like to take on that task.

@fedorov could you elaborate on the use cases that you had in mind?

@fedorov
Copy link
Member Author

fedorov commented Feb 25, 2020

For me personally, the most common use case is Diffusion weighted MRI. B-values are most often stored in private attributes. For prostate MRI, just the trace image is stored, so it is a 4d acquisition. To me, it would be most appropriate if the private attributes containing the b-value are instead placed into the appropriate standard attribute.

@seandoyle
Copy link
Contributor

If we did move the B-values to a standard location - would new SOPInstances need to be created? I'm guessing that the old private values would be left in place so that existing routines that depended on them would still function?

@pieper
Copy link
Member

pieper commented Feb 26, 2020

I would vote for always creating new SOPInstanceUIDs for any derived instance (the original UIDs can be referenced if someone needs to track them down). Also as a general rule I would vote against leaving any private tags in the normalized form of the instances.

@hackermd
Copy link
Collaborator

Also as a general rule I would vote against leaving any private tags in the normalized form of the instances.

Are you arguing in favor of skipping all private attributes and not including them in the enhanced data sets at all?

@pieper
Copy link
Member

pieper commented Feb 26, 2020

Are you arguing in favor of skipping all private attributes and not including them in the enhanced data sets at all?

Yes, my preference would be that the normalized form would not include any lingering private tags. So if we have a diffusion scan in normalized form, there would be only one standard place to find the b-values and gradients and all instances would have the same schema independent of vendor (in much the same way that a nrrd or nii file doesn't keep the private tags). If there's something vendor-specific about the acquisition that can only be learned from private tags then people can look at the original vendor-generated instances.

@fedorov
Copy link
Member Author

fedorov commented Feb 26, 2020

Steve, but what about private tags that cannot be mapped into standard ones? I think those should definitely be carried forward into MF.

@pieper
Copy link
Member

pieper commented Feb 26, 2020

Steve, but what about private tags that cannot be mapped into standard ones?

I don't think it makes a lot of sense to convert instances that don't map to a standard acquisition description.

@fedorov
Copy link
Member Author

fedorov commented Feb 26, 2020

I don't agree. I think there will be situations where the data may be useful even if there is no mapping for all private attributes, or where semantics of standard attributes is different from that of private attributes. I don't have specific examples handy. I think that if there is a private attribute that does not map, or if it is not absolutely clear it maps directly to a standard attribute, it should be carried forward and stored the same way as highdicom does it right now, I believe.

@hackermd
Copy link
Collaborator

To me there are two main motivations for legacy conversion:

  1. representing a series of single-frame images as one pydicom.Dataset (such that the pixel_data property returns a 3D numpy.ndarray) for use in the Python runtime, e.g. to pass the data set to an ML model
  2. storing converted enhanced images for use by other applications/systems

For both use cases, it would be ideal if we could harmonize data sets across vendors for improved interoperability. Therefore, I agree with @pieper that we should try to map private attributes to the corresponding standard attributes. However, I share @fedorov concerns that omitting all private attributes entirely may cause problems.

@fedorov
Copy link
Member Author

fedorov commented Feb 26, 2020

we should try to map private attributes to the corresponding standard attributes

To be clear, I completely agree with this as well.

@hackermd
Copy link
Collaborator

From an API perspective, do you think it would be sufficient to implement this in form of a simple dict, where keys would be private attributes and values would be the corresponding standard attributes, or would we need to implement this in form of code?

@fedorov
Copy link
Member Author

fedorov commented Feb 26, 2020

I think it is likely that at least in some cases there will be a need to do some formatting or conversion, or cleanup. Difference in units, or change of VR are probably the simplest examples where this will help.

@hackermd
Copy link
Collaborator

Ok, maybe we can start by creating a document for flashing out the mapping and then delve into the implementation. It seems that the simple approach using a dict won't cut it. I further assume that most of the private attributes are modality specific, so we will probably have to do this exercise separately for CT/MR/PET.

@pieper
Copy link
Member

pieper commented Feb 26, 2020

My main point is that these "non-mappable private tags" are essentially opaque to us, and in my experience they probably have very little meaning if they are removed from their original context as part of a vendor-specific instance. Even within a vendor instance, the interpretation may vary based on machine or software version.

It's okay to bring them along in a dict or other structure for completeness, but I think it would be hard to justify ever writing an analysis program that relied on this data - existing legacy tools that understand these private tags are going to need the original instances and are unlikely to ever be ported to use a converted dictionary.

On the plus side, my experience is that for any meaningful type of analysis there will be a standard representation with well defined standard tags. We have ample evidence for this, given, for example, the large body of neuroimaging studies in nifti where vendor differences are not meaningful enough to prevent useful science from happening. Mapping instead to a standardized multiframe would allow preservation of additional useful acquisition information in a non-ad-hoc format.

Anyway, I agree with @hackermd that a we need to work though some specific examples before finalizing our approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

4 participants