Finding relocated private elements using "PrivateCreator" #205

Open
moloney opened this issue Mar 18, 2022 · 10 comments

Comments

@moloney

moloney commented Mar 18, 2022

Private elements can be relocated (the block number within the group will change), and some popular PACS will absolutely do this to data passing through them. Instead of looking up an absolute (group_number, element_number) in rules, we should allow looking up private elements with a PrivateCreator plus an element_number. Then, when processing a dataset, you look for a matching PrivateCreator string (e.g. "SIEMENS CSA HEADER") to find the private block dynamically.

I'm not sure how this capability should be exposed in the recipes, any thoughts?

@vsoch
Member

vsoch commented Mar 18, 2022

Can you give me a little example in pseudocode? E.g., is the idea that the PrivateCreator is a string and it's used in place of the group number?

@wetzelj
Contributor

wetzelj commented Mar 20, 2022

@moloney - Would a %values section work for you?

With a values section, you can define a field (or list of fields) that you want to use to define a set of values. This values list would then be used in conjunction with a REMOVE or other command to change the identified tags.

%values manufacturer_values
FIELD Manufacturer

%header
REMOVE values:manufacturer_values

Header Before
PatientName - John Doe
Manufacturer - Siemens
(0009,0009) - SIEMENS CSA HEADER

Header After
PatientName - John Doe

If that doesn't work, a custom function may be another option for you. The custom function could be written to grab the value of the tag being acted on, look for "SIEMENS CSA HEADER", and then return the appropriate value for the action that you want to take.

@jstorrs
Contributor

jstorrs commented Aug 29, 2022

I don't know if this background information is helpful or not, but private elements are tricky in DICOM because the standard is designed to allow multiple vendors to add private tags without stepping on each other's toes (if they do it correctly). pydicom is great because it understands this. People who get their data directly from modalities often never encounter relocated private tags.

We generally think of private tags, like standard DICOM tags, as having fully specified (group, element) pairs. But for private tags there's a layer of indirection via the Private Creator.

For example here is a Siemens CSA group (used a lot in MRI research):

(0029,0010) LO [SIEMENS CSA HEADER]                     #  18, 1 PrivateCreator
(0029,0011) LO [SIEMENS MEDCOM HEADER2]                 #  22, 1 PrivateCreator
(0029,1008) CS [IMAGE NUM 4]                            #  12, 1 Unknown Tag & Data
(0029,1009) LO [20211010]                               #   8, 1 Unknown Tag & Data
(0029,1010) OB 53\56\31\30\04\03\02\01\65\00\00\00\4d\00\00\00\45\63\68\6f\4c\69... # 11008, 1 Unknown Tag & Data
(0029,1018) CS [MR]                                     #   2, 1 Unknown Tag & Data
(0029,1019) LO [20211010]                               #   8, 1 Unknown Tag & Data
(0029,1020) OB 53\56\31\30\04\03\02\01\4f\00\00\00\4d\00\00\00\55\73\65\64\50\61... # 93880, 1 Unknown Tag & Data
(0029,1160) LO [com]                                    #   4, 1 Unknown Tag & Data

The things to pay attention to with private tags are these patterns

(GGGG,00XX) LO [NAME]  # PrivateCreator
(GGGG,XX01) item
(GGGG,XX02) item
...

The first two lines in the above example are "reservations" that declare:

  • 0x0029,0x10... will be the "SIEMENS CSA HEADER" private block
  • 0x0029,0x11... will be the "SIEMENS MEDCOM HEADER2" private block

Those specific numbers 0x10 and 0x11 can be remapped by DICOM processors. So for example it is entirely valid to move the entire "SIEMENS CSA HEADER" private block from 0x0029,0x10... to, say, 0x0029,0x27... as long as the group is preserved.

(0029,0011) LO [SIEMENS MEDCOM HEADER2]                 #  22, 1 PrivateCreator
(0029,0027) LO [SIEMENS CSA HEADER]                     #  18, 1 PrivateCreator
(0029,1160) LO [com]                                    #   4, 1 Unknown Tag & Data
(0029,2708) CS [IMAGE NUM 4]                            #  12, 1 Unknown Tag & Data
(0029,2709) LO [20211010]                               #   8, 1 Unknown Tag & Data
(0029,2710) OB 53\56\31\30\04\03\02\01\65\00\00\00\4d\00\00\00\45\63\68\6f\4c\69... # 11008, 1 Unknown Tag & Data
(0029,2718) CS [MR]                                     #   2, 1 Unknown Tag & Data
(0029,2719) LO [20211010]                               #   8, 1 Unknown Tag & Data
(0029,2720) OB 53\56\31\30\04\03\02\01\4f\00\00\00\4d\00\00\00\55\73\65\64\50\61... # 93880, 1 Unknown Tag & Data

The general pattern for this is:

(GGGG,00XX) LO [NAME OF PRIVATE BLOCK]
(GGGG,XXYY) .... items within the block 

Since the location of the block within the group is "arbitrary", one of the nice things about pydicom's dataset is that you can select private items like this:

ds.get_private_item(0x0029, 0x10, "SIEMENS CSA HEADER")
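
To make the indirection concrete, here is a minimal sketch in plain Python (no pydicom needed) of how a (group, creator, offset) triple could be resolved to an absolute tag. The header is modelled as a simple dict keyed by (group, element), and the helper names `find_creator_slot` and `resolve_private_tag` are made up for this illustration. On a real dataset, pydicom already does this lookup (for example via `Dataset.private_block`).

```python
from typing import Dict, Optional, Tuple

Tag = Tuple[int, int]  # (group, element)


def find_creator_slot(header: Dict[Tag, object], group: int, creator: str) -> Optional[int]:
    """Return the block number XX whose reservation (group,00XX) holds `creator`."""
    # Private creator reservations live in elements 0x0010 through 0x00FF.
    for slot in range(0x10, 0x100):
        if header.get((group, slot)) == creator:
            return slot
    return None


def resolve_private_tag(header: Dict[Tag, object], group: int, creator: str,
                        offset: int) -> Optional[Tag]:
    """Map (group, creator, offset YY) to the absolute tag (group, XXYY)."""
    slot = find_creator_slot(header, group, creator)
    if slot is None:
        return None
    return (group, (slot << 8) | offset)


# Toy headers mirroring the two dumps above (values abbreviated).
original = {
    (0x0029, 0x0010): "SIEMENS CSA HEADER",
    (0x0029, 0x0011): "SIEMENS MEDCOM HEADER2",
    (0x0029, 0x1008): "IMAGE NUM 4",
    (0x0029, 0x1160): "com",
}
relocated = {
    (0x0029, 0x0011): "SIEMENS MEDCOM HEADER2",
    (0x0029, 0x0027): "SIEMENS CSA HEADER",
    (0x0029, 0x1160): "com",
    (0x0029, 0x2708): "IMAGE NUM 4",
}

for header in (original, relocated):
    tag = resolve_private_tag(header, 0x0029, "SIEMENS CSA HEADER", 0x08)
    print("(%04x,%04x)" % tag, header[tag])
```

The same creator and offset resolve to a different element number in each header, which is why a rule keyed on the creator string survives relocation while a rule keyed on the absolute tag does not.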

@vsoch
Member

vsoch commented Aug 29, 2022

Ah that's handy! @moloney can you give me an example file and some set of actions you want to do and I can try this out?

@jstorrs
Contributor

jstorrs commented Aug 29, 2022

Not the OP, but one thing I'm working on is a custom function to clean the Siemens CSA headers, at least to the point that the deidentifier can vouch for their contents and that dcm2niix, gdcm, etc. work as expected on the output. CSA headers don't seem to be terribly complex, but they are big and scary. The usual approach is to throw them out because they cannot be vouched for (a big challenge, because they're used to build NIfTI files and CROs have demanded that they be maintained, which is a different story). In reality, CSA data blobs contain things like null-terminated strings stored within larger fixed-length fields that were not initialized prior to use, so even if the CSA dumps look good, the "dead space" contains who-knows-what. But the locations of dead space in the CSA are knowable and can simply be overwritten with nulls.
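
As a rough illustration of the "overwrite the dead space" idea, the sketch below zeroes everything after the first null terminator inside one fixed-length field of a byte blob. It does not model the actual CSA layout: the function name `scrub_fixed_field`, the offsets, and the sample bytes are all assumptions made up for this example.

```python
def scrub_fixed_field(blob: bytes, offset: int, length: int) -> bytes:
    """Zero every byte after the first NUL inside blob[offset:offset+length].

    The string before the terminator is kept; the total size is unchanged.
    """
    field = blob[offset:offset + length]
    end = field.find(b"\x00")
    if end == -1:
        # No terminator: the string fills the whole field, nothing to scrub.
        return blob
    cleaned = field[:end] + b"\x00" * (len(field) - end)
    return blob[:offset] + cleaned + blob[offset + length:]


# Hypothetical 16-byte field holding "MR" followed by uninitialized leftovers.
raw = b"HDR!" + b"MR\x00Doe^John\x00ABCD" + b"TAIL"
clean = scrub_fixed_field(raw, offset=4, length=16)
print(clean)
```

Because the field keeps its length, any parser that relies on fixed offsets should still be able to read the blob after scrubbing.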

@vsoch
Member

vsoch commented Aug 29, 2022

@jstorrs if you get something working and would like to contribute here, it would be hugely welcome!

@jstorrs
Contributor

jstorrs commented Sep 9, 2022

I've been thinking about a couple approaches to this and familiarizing myself with the deid codebase. The solution I'd like to try is to add a new dictionary: section to DICOM deid recipes. It would contain typical DICOM dictionary definition lines that define keywords and can be used to track the private creator. Basically the section would feed pydicom.datadict.add_dict_entry() and pydicom.datadict.add_private_dict_entry(). I'll start working on this over the next few days.

@vsoch
Member

vsoch commented Sep 10, 2022

That's a cool idea! Can you spec out an example for discussion?

@jstorrs
Contributor

jstorrs commented Oct 17, 2022

Odd question, and I don't know if anyone here has an answer: is it valid for the same private creator string to be reused for multiple reservations in the same group? I.e., suppose you have:

(0021,0010) LO [MY PRIVATE CREATOR]
(0021,0011) LO [MY PRIVATE CREATOR]
(0021,0012) LO [MY PRIVATE CREATOR]
(0021,0013) LO [MY PRIVATE CREATOR]
(0021,1001) LO [VALUE 1]
(0021,1101) LO [VALUE 2]
(0021,1201) DS [3]
(0021,1301) LO [VALUE 4]

I haven't been able to find anything in the standard that forbids this. Still, if I encountered a file like this, I'd flag it, because something is obviously amiss. This is going to be a challenge for this sort of lookup. I'll see whether pydicom or dcmtk has considered this odd case.

I have not encountered this in the wild, I'm just pondering how to handle this. Previously the thought was that (0x0021,"MY PRIVATE CREATOR",0x01) must obviously be unique and duplicate private creators are forbidden, but I can't find where/if that's specified in the standard.

https://dicom.nema.org/dicom/2013/output/chtml/part05/sect_7.8.html

@jstorrs
Contributor

jstorrs commented Oct 17, 2022

Edit: never mind. I realized after posting the URL that I was looking at the 2013 version (which somehow comes to the top of Google for me). The latest version forbids it and says that if there's a need for some reason, the implementation should use sequences.

https://dicom.nema.org/medical/dicom/current/output/chtml/part05/sect_7.8.html
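
A file with repeated reservations could be flagged with a check along these lines. This is only a sketch in plain Python: the header is modelled as a dict keyed by (group, element), and `duplicate_private_creators` is a hypothetical helper name, not something from deid or pydicom.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Tag = Tuple[int, int]  # (group, element)


def duplicate_private_creators(header: Dict[Tag, object]) -> Dict[Tuple[int, str], List[int]]:
    """Return {(group, creator): [slots]} for creators reserved more than once."""
    seen = defaultdict(list)
    for (group, element), value in header.items():
        # Private groups are odd; reservations sit in elements 0x0010-0x00FF.
        if group % 2 == 1 and 0x0010 <= element <= 0x00FF:
            seen[(group, str(value).strip())].append(element)
    return {key: sorted(slots) for key, slots in seen.items() if len(slots) > 1}


# The questionable header from the example above.
suspicious = {
    (0x0021, 0x0010): "MY PRIVATE CREATOR",
    (0x0021, 0x0011): "MY PRIVATE CREATOR",
    (0x0021, 0x0012): "MY PRIVATE CREATOR",
    (0x0021, 0x0013): "MY PRIVATE CREATOR",
    (0x0021, 0x1001): "VALUE 1",
    (0x0021, 0x1101): "VALUE 2",
    (0x0021, 0x1201): "3",
    (0x0021, 0x1301): "VALUE 4",
}

for (group, creator), slots in duplicate_private_creators(suspicious).items():
    print("group %04x: %r reserved at %s" % (group, creator, [hex(s) for s in slots]))
```

When this returns a non-empty result, a lookup by (group, creator, offset) is ambiguous, so rejecting or quarantining the file is probably safer than guessing which block was meant.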

4 participants