Missing documentation : the dataset object #257

Open
Ragnar-Oock opened this issue Jul 27, 2023 · 3 comments

@Ragnar-Oock

Documentation for the dataset object returned by parseDicom would be great. The types provided in index.d.ts are not very precise about the contents of most of this object's sub-objects. Most notably, the Element object is really hard to understand because all possible configurations are folded into a single interface.

interface Element

export interface Element {
    tag: string;
    vr?: string;
    length: number;
    dataOffset: number;
    items?: Element[];
    dataSet?: DataSet;
    parser?: ByteArrayParser;
    hadUndefinedLength?: boolean;

    encapsulatedPixelData?: boolean;
    basicOffsetTable?: number[];
    fragments?: Fragment[];
  }

I have wasted a lot of time trying to work out which combinations of properties occur when (I have found three: simple elements, sequences, and elements with encapsulated pixel data), and some misunderstandings led to hard-to-track-down bugs.

Documentation of the data returned by parseDicom, with a breakdown of the meaning of each property, would be appreciated. Updating the interface quoted above so that it is broken down into the actual shapes that exist at run time would be nice too (you can't have both items and basicOffsetTable, for example).

I can draft documents for both if needed.
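
To make the three shapes concrete, here is a minimal sketch of how they might be told apart at run time with the current flat interface. `FlatElement`, `ElementKind`, and `kindOf` are hypothetical names used only for illustration, and the sketch assumes that a sequence always carries `items` and that encapsulated pixel data always carries `encapsulatedPixelData === true`; neither assumption has been checked against every code path in the library.

```typescript
// Minimal stand-in for the current flat `Element` interface from index.d.ts;
// only the fields needed to tell the three shapes apart are reproduced here.
interface FlatElement {
  tag: string;
  vr?: string;
  length: number;
  dataOffset: number;
  items?: FlatElement[];
  encapsulatedPixelData?: boolean;
  basicOffsetTable?: number[];
  fragments?: { offset: number; position: number; length: number }[];
}

// Hypothetical labels for the three shapes described in this issue.
type ElementKind = 'simple' | 'sequence' | 'encapsulatedPixelData';

// Assumption: the presence of `encapsulatedPixelData` or `items` is enough
// to discriminate; everything else is treated as a simple element.
function kindOf(element: FlatElement): ElementKind {
  if (element.encapsulatedPixelData === true) {
    return 'encapsulatedPixelData';
  }
  if (Array.isArray(element.items)) {
    return 'sequence';
  }
  return 'simple';
}
```

Having to write a helper like this by hand is the kind of guesswork that separate interfaces would remove.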

@yagni
Collaborator

yagni commented Jul 27, 2023

@Ragnar-Oock Drafts would be great! I agree that this would be very useful, it's just not too high on the list right now unfortunately given other priorities.

@Ragnar-Oock
Author

Here is a first draft of the type declarations for the `Element` object:

/**
 * The tag of an element in the format `xGGGGEEEE` where :
 * - `GGGG` stands for the big endian hexadecimal representation of the element's group number
 * - `EEEE` stands for the big endian hexadecimal representation of the element's element number
 * 
 * The `x` at the start is here to allow property access without requiring the use of the index access syntax.
 */
export type Tag = `x${string}`;

export enum VR {
  /** Application Entity */ 
  AE = 'AE',
  /** Age String */ 
  AS = 'AS',
  /** Attribute Tag */ 
  AT = 'AT',
  /** Code String */ 
  CS = 'CS',
  /** Date */ 
  DA = 'DA',
  /** Decimal String */ 
  DS = 'DS',
  /** Date Time */ 
  DT = 'DT',
  /** Floating Point Single */ 
  FL = 'FL',
  /** Floating Point Double */ 
  FD = 'FD',
  /** Integer String */ 
  IS = 'IS',
  /** Long String */ 
  LO = 'LO',
  /** Long Text */ 
  LT = 'LT',
  /** Other Byte String */ 
  OB = 'OB',
  /** Other Double String */ 
  OD = 'OD',
  /** Other Float String */ 
  OF = 'OF',
  /** Other Word String */ 
  OW = 'OW',
  /** Person Name */ 
  PN = 'PN',
  /** Short String */ 
  SH = 'SH',
  /** Signed Long */ 
  SL = 'SL',
  /** Sequence of Items */ 
  SQ = 'SQ',
  /** Signed Short */ 
  SS = 'SS',
  /** Short Text */ 
  ST = 'ST',
  /** Time */ 
  TM = 'TM',
  /** Unique Identifier (UID) */ 
  UI = 'UI',
  /** Unsigned Long */ 
  UL = 'UL',
  /** Unknown */ 
  UN = 'UN',
  /** Unsigned Short */ 
  US = 'US',
  /** Unlimited Text */ 
  UT = 'UT'
}

export type ByteArray = Uint8Array;

export interface ByteArrayParser {
  readUint16: (byteArray: ByteArray, position: number) => number;
  readInt16: (byteArray: ByteArray, position: number) => number;
  readUint32: (byteArray: ByteArray, position: number) => number;
  readInt32: (byteArray: ByteArray, position: number) => number;
  readFloat: (byteArray: ByteArray, position: number) => number;
  readDouble: (byteArray: ByteArray, position: number) => number;
}

/**
 * Base interface from which all elements in a dataset inherit.
 */
export interface DataElement {
  /**
   * The tag of the element in the format `xGGGGEEEE`.
   * @see {@link Tag} for details on the formatting.
   */
  tag: Tag;
  /**
   * The value representation of the element.  
   * This property will be undefined for files encoded with an implicit VR transfer syntax,
   * such as "Implicit VR Little Endian" (UID=1.2.840.10008.1.2), which is the default transfer
   * syntax defined by the spec and is thus widely used.
   */
  vr?: VR;
  /**
   * The number of bytes in the Value field of the element.  
   * This property will be populated even if the element is defined with an undefined VL (0xFFFFFFFF).
   */
  length: number;
  /**
   * Byte offset, from the start of the byte stream, where the Value field of the element starts
   */
  dataOffset: number;
  /**
   * Is the element defined with an undefined VL (0xFFFFFFFF)?  
   * This property will be absent from elements with a defined Value Length, `true` otherwise.
   */
  hadUndefinedLength?: true;
}

/**
 * Used exclusively for Elements in the File Meta Information Group.
 */
export interface FileMetaInformationElement extends DataElement {
  /**
   * A little-endian parser provided as all elements of this group are required to be
   * encoded in the "Explicit VR Little Endian Transfer Syntax" (UID=1.2.840.10008.1.2.1)
   * as defined in [DICOM PS3.5](https://dicom.nema.org/medical/dicom/current/output/html/part05.html#PS3.5)
   */
  parser: ByteArrayParser;
}

/**
 * Used exclusively for `(7fe0,0010) Pixel Data Attribute` elements that are encoded using
 * dataset encapsulation as defined in [DICOM PS3.5 A.4](https://dicom.nema.org/medical/dicom/2016b/output/chtml/part05/sect_A.4.html).
 */
export interface EncapsulatedPixelDataElement extends DataElement {
  /**
   * Pixel Data Attributes encoded using encapsulation are required to have a 
   * Value Length field set to undefined (0xFFFFFFFF).
   */
  hadUndefinedLength: true;

  /**
   * This field discriminates between a Pixel Data Attribute element simply encoded with 
   * an undefined length and one actually using data set encapsulation.
   */
  encapsulatedPixelData: true;
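  /**
   * Holds the byte offsets read from the Basic Offset Table item, one per frame, each pointing to
   * the first fragment of that frame. Per DICOM PS3.5 A.4, these offsets are measured from the first
   * byte of the first item tag following the Basic Offset Table, so the first entry is always 0.
   * This property will always be present but might be empty depending on how the file was encoded.
   */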
  basicOffsetTable: number[];
  /**
   * Holds the information needed to extract the fragments from the byte stream.
   * This property will always be populated with as many fragments as there are in the file.
   */
  fragments: Fragment[];
}

/**
 * Used inside an {@link EncapsulatedPixelDataElement} to define the position of each encapsulated element.
 */
export interface Fragment {
  /**
   * Byte offset, indexing from the start of the Pixel Data Attribute element, where the fragment starts.
   */
  offset: number;
  /**
   * Byte offset, indexing from the start of the byte stream, where the fragment data starts.
   */
  position: number;
  /**
   * Number of bytes contained by the fragment.
   */
  length: number;
}

/**
 * Basically an array of sub-datasets.
 */
export interface Sequence extends DataElement {
  /**
   * This field will always be populated for Sequence elements, as they could not have been
   * parsed correctly otherwise. Its presence therefore does not indicate whether the transfer
   * syntax uses implicit or explicit VR.
   */
  vr: VR.SQ;

  /**
   * Holds the items making up the sequence, can be empty if the sequence doesn't have any item.
   */
  items: SequenceItem[];
}

/**
 * Used inside a {@link Sequence} to define the content of an item of the sequence
 */
export interface SequenceItem {
  /**
   * Holds the elements contained by the sequence item.
   */
  dataSet: DataSet;
}

/**
 * Union of all possible kinds of elements.
 */
export type Element = DataElement | Sequence | EncapsulatedPixelDataElement | FileMetaInformationElement;

export interface DataSet {
  /**
   * The buffer view of the entire file
   */
  byteArray: ByteArray;
  /**
   * The parser to use for reading the dataset's elements. This parser is automatically selected
   * to match the requirements of the transfer syntax.
   */
  byteArrayParser: ByteArrayParser;
  /**
   * The record of the elements making up the dataset, accessible via the element's tag in the format xGGGGEEEE.
   * @see {@link Tag} for details on the encoding.
   */
  elements: {
    [tag: Tag]: Element;
  };
  warnings: string[];

  /**
   * Finds the element for tag and returns an unsigned int 16 if it exists and has data. Use this function for VR type US.
   */
  uint16: (tag: string, index?: number) => number | undefined;

  /**
   * Finds the element for tag and returns a signed int 16 if it exists and has data. Use this function for VR type SS.
   */
  int16: (tag: string, index?: number) => number | undefined;

  /**
   * Finds the element for tag and returns an unsigned int 32 if it exists and has data. Use this function for VR type UL.
   */
  uint32: (tag: string, index?: number) => number | undefined;

  /**
   * Finds the element for tag and returns a signed int 32 if it exists and has data. Use this function for VR type SL.
   */
  int32: (tag: string, index?: number) => number | undefined;

  /**
   * Finds the element for tag and returns a 32 bit floating point number if it exists and has data. Use this function for VR type FL.
   */
  float: (tag: string, index?: number) => number | undefined;

  /**
   * Finds the element for tag and returns a 64 bit floating point number if it exists and has data. Use this function for VR type FD.
   */
  double: (tag: string, index?: number) => number | undefined;

  /**
   * Returns the actual Value Multiplicity of an element - the number of values in a multi-valued element.
   */
  numStringValues: (tag: string) => number | undefined;

  /**
   * Finds the element for tag and returns a string if it exists and has data. Use this function for VR types AE, CS, SH, and LO.
   */
  string: (tag: string, index?: number) => string | undefined;

  /**
   * Finds the element for tag and returns a string with the leading spaces preserved and trailing spaces removed if it exists and has data. Use this function for VR types UT, ST, and LT.
   */
  text: (tag: string, index?: number) => string | undefined;

  /**
   * Finds the element for tag and parses a string to a float if it exists and has data. Use this function for VR type DS.
   */
  floatString: (tag: string, index?: number) => number | undefined;

  /**
   * Finds the element for tag and parses a string to an integer if it exists and has data. Use this function for VR type IS.
   */
  intString: (tag: string, index?: number) => number | undefined;

  /**
   * Finds the element for tag and parses an element tag according to the 'AT' VR definition if it exists and has data. Use this function for VR type AT.
   */
  attributeTag: (tag: string) => string | undefined;
}
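
As a rough illustration of how the split types could be consumed, the sketch below builds a tag in the `xGGGGEEEE` format and walks a dataset recursively through its sequences. The helpers `formatTag`, `isSequence`, and `collectTags` are hypothetical and not part of the library, and the type declarations are trimmed-down local copies of the draft above so that the snippet type-checks on its own. It assumes that tags use lowercase hexadecimal digits and that a sequence can be recognised by the presence of `items`.

```typescript
// Trimmed-down local copies of the draft types (assumption: simplified for the sketch).
type Tag = `x${string}`;
interface DataElement { tag: Tag; vr?: string; length: number; dataOffset: number; }
interface SequenceItem { dataSet: DataSet; }
interface Sequence extends DataElement { vr: 'SQ'; items: SequenceItem[]; }
interface DataSet { elements: { [tag: Tag]: DataElement | Sequence }; }

// Formats a group/element pair as `xGGGGEEEE`, zero-padded to four hex digits each.
// Assumption: lowercase hex digits, e.g. (0x7fe0, 0x0010) becomes 'x7fe00010'.
function formatTag(group: number, element: number): Tag {
  const hex = (n: number): string => ('0000' + n.toString(16)).slice(-4);
  return `x${hex(group)}${hex(element)}`;
}

// Type guard: narrows an element to `Sequence` when it carries an `items` array.
function isSequence(element: DataElement | Sequence): element is Sequence {
  return Array.isArray((element as Sequence).items);
}

// Collects the tags of every element, descending depth-first into sequence items.
function collectTags(dataSet: DataSet): Tag[] {
  const tags: Tag[] = [];
  for (const key of Object.keys(dataSet.elements) as Tag[]) {
    const element = dataSet.elements[key];
    tags.push(element.tag);
    if (isSequence(element)) {
      for (const item of element.items) {
        tags.push(...collectTags(item.dataSet));
      }
    }
  }
  return tags;
}
```

With a union type, the `isSequence` guard lets the compiler know that `items` is safe to access, instead of every caller checking an optional property.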

Notes:

  • The declarations for DataSet and ByteArrayParser have been taken more or less as is; I just updated the elements property and added the missing JSDoc on <DataSet>.byteArray and <DataSet>.byteArrayParser.
  • I can open a PR if modifications or validation are needed.
  • I wrote the JSDoc from my knowledge of the DICOM spec, from how the lib tends to parse the files I throw at it, and from some cross-referencing with the code, but I can't guarantee its exactness and would appreciate it if someone with more knowledge took a look to validate it.
  • The VR enum is extracted from the table in DICOM Part 5 Section 6.2. I considered adding links to each VR's definition in this table, but there are no ids to anchor to in the page from NEMA.
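
To illustrate the intended meaning of the `Fragment` fields in the draft, here is a minimal sketch of reading fragment bytes out of the file buffer. `fragmentBytes` and `concatFragments` are hypothetical helpers, not library functions. The sketch assumes that `position` is an absolute offset into `DataSet.byteArray`, as the draft JSDoc states, and that all the fragments passed in belong to a single frame.

```typescript
// Local copy of the draft `Fragment` interface so the sketch stands alone.
interface Fragment {
  offset: number;
  position: number;
  length: number;
}

// Returns a view (no copy) over the bytes of one fragment.
// Assumption: `position` is measured from the start of the whole byte stream.
function fragmentBytes(byteArray: Uint8Array, fragment: Fragment): Uint8Array {
  return byteArray.subarray(fragment.position, fragment.position + fragment.length);
}

// Copies the given fragments, in order, into one contiguous buffer.
// Assumption: the caller has already selected the fragments of a single frame.
function concatFragments(byteArray: Uint8Array, fragments: Fragment[]): Uint8Array {
  let total = 0;
  for (const fragment of fragments) {
    total += fragment.length;
  }
  const out = new Uint8Array(total);
  let cursor = 0;
  for (const fragment of fragments) {
    out.set(fragmentBytes(byteArray, fragment), cursor);
    cursor += fragment.length;
  }
  return out;
}
```

If the description of `position` in the draft turns out to be wrong, this is the kind of code that would break, which is one reason a review of the JSDoc would help.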

@yagni
Collaborator

yagni commented Jul 31, 2023

Thanks! Please move this into a PR so we can discuss there.
