Amir R. Sadri edited this page Nov 14, 2023 · 96 revisions

Introduction

A key step in using MR imaging scans obtained from different institutions or scanners for reliable model development and optimization of computational imaging tools is to curate datasets with minimal to no artifacts, so that inherent image quality issues do not impact model performance. Evaluating variations and relative image quality between cohorts can provide critical insight into whether a machine learning model or image analysis algorithm developed on one cohort will perform reproducibly on a different cohort.

This wiki describes the usage of MRQy, a new open-source quality control and assurance tool for MR imaging data, which can be used as a pre-analytical step when developing computational pipelines including radiomics, image analysis, and machine learning.

MRQy

Properties

MRQy leverages a Python-JavaScript framework and has been specialized for analyzing large-scale MRI cohorts through the following modules: (i) automatic foreground detection for any MR image from any body region, from which it will (ii) extract a series of imaging-specific metadata and quality measures generalized to work with any structural MR sequence, in order to (iii) compute representations that capture relevant MR image quality trends in a data cohort.

These are presented within a specialized HTML5-based front-end that the end-user can easily interrogate to identify batch effects and imaging artifacts, supporting the curation of MR imaging cohorts of acceptable quality for model development, as we will demonstrate using a representative large-scale MRI cohort from TCIA.

MRQy works in an unsupervised standalone setting, can be run efficiently on a standard computer, and has a modular design to allow for easy incorporation of additional algorithms and metrics as plugins in the future. MRQy has been developed in an organ-agnostic fashion, i.e., it can be used to evaluate the image quality of MR scans obtained from any body organ or region. MRQy provides quantitative metrics for benchmarking the quality and consistency of MRI data and for identifying batch effects, i.e., systematic technical variations such as differences in voxel resolution or field-of-view. By using MRQy actively during cohort curation, one could develop a more homogeneous cohort of MRI data that is of consistent quality, which in turn helps computational pipelines and models generalize.

Format and Usage

After the installation of all the prerequisite Python packages (specified in the installation instructions), MRQy can be run on a directory of files via the command:

python QC.py output_folder_name "input directory address" 

No additional configuration files need to be specified. This results in the following steps being executed:

  • Thumbnail images are generated for all 2D sections in each MRI dataset and saved as .png files within the UserInterface/Data folder.
  • Each dataset is processed to detect the foreground and background region.
  • Metadata are extracted from file headers for each dataset. Measurements are computed based on the detected foreground region for each dataset.
  • Both metadata and measurements are saved for each dataset within a tab-separated file (results.tsv) that is stored within the UserInterface/Data folder.
  • For a given cohort, a single UMAP and a single t-SNE embedding are computed for all the datasets based on extracted measures. The embedding coordinates are also saved into the results.tsv file.
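
Because both metadata and measurements end up in a single results.tsv, the file can also be loaded from a script. The following is a minimal sketch, not part of MRQy itself. It assumes that lines beginning with `#` (such as the `#outdir` line) are metadata and that the first remaining line holds the column names. The function name `parse_results`, the `header_prefix` parameter, and the sample column names are hypothetical, so adapt them to the layout of your own file.

```python
import csv
import io


def parse_results(text, header_prefix=None):
    """Split results.tsv text into '#' metadata lines and a list of row dicts.

    Assumption: lines starting with '#' are metadata and the first remaining
    line holds the column names. If the column names instead sit on a
    '#'-prefixed line, pass that line's prefix as header_prefix.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    metadata = [ln for ln in lines if ln.startswith("#")]
    table = [ln for ln in lines if not ln.startswith("#")]
    if header_prefix is not None:
        # Move the prefixed header line from the metadata into the table.
        header = next(ln for ln in metadata if ln.startswith(header_prefix))
        metadata.remove(header)
        table.insert(0, header[len(header_prefix):])
    reader = csv.DictReader(io.StringIO("\n".join(table)), delimiter="\t")
    return metadata, list(reader)


# Hypothetical miniature file; real column names and metadata will differ.
sample = (
    "#outdir\t/path/to/output\n"
    "patient\tmeasure_a\tmeasure_b\n"
    "P001\t1.5\t20\n"
    "P002\t2.5\t40\n"
)
metadata, rows = parse_results(sample)
print(metadata)
print([row["patient"] for row in rows])
```

For a real run, the text would come from the results.tsv file in the UserInterface/Data folder, for example via `pathlib.Path(...).read_text()`. All values are returned as strings, so convert numeric columns with `float()` before analysis.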

Further interrogation of cohort variations and artifact trends may be done by reading results.tsv into any common data analysis tool (e.g. MATLAB or R). A specialized front-end HTML interface (index.html), designed for real-time manipulation and visualization, is available within the UserInterface folder. Quality control can be performed via multiple pathways:

  • Using sorting arrows available on each table column to re-order measures and examine numeric trends. Users can further annotate rows or remove non-informative patients.
  • As the different interface components are synchronized, highlighting a patient row in either table also highlights the corresponding line in the parallel coordinate (PC) chart, the corresponding bar in the bar chart, and the patient-specific bubble in the embedding plots. Thumbnail images for this patient volume are shown in the interface.
  • Using the PC and bar charts to directly compare a specific measure across all the subject scans. This can help quickly determine which of the metadata or measures are consistent across the entire cohort as well as identify outliers. The PC chart can also be used to evaluate positive or negative relationships between different measures and thus determine the trade-off in processing for specific artifacts.
  • Using embedding plots (t-SNE and UMAP) to track specific site- or scanner-specific trends within the cohort. By visualizing the 2D space into which the entire cohort has been mapped, any clusters that can be identified typically correspond to site- and scanner-specific variations. The overall distribution of points in space also provides an indication of the variability within the entire cohort.
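
The outlier check that the bar and PC charts support visually can be approximated in a script once a measure has been read from results.tsv. The sketch below uses the interquartile range (IQR) rule, which is a generic statistical heuristic chosen here for illustration and not something the MRQy interface is documented to compute. The function name `flag_outliers`, the multiplier `k`, and the sample values are assumptions.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical values of one quality measure across seven scans.
measure = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 25.0]
print(flag_outliers(measure))  # index 6 (the 25.0 scan) is flagged
```

A flagged scan is only a candidate for review. Whether it reflects an artifact, a different scanner, or a legitimate anatomical difference still needs to be confirmed in the interface using the thumbnails and embedding plots.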

Each of the above functionalities is explained in more detail under How MRQy Works.

Troubleshooting

MRQy Image Loading Path

Issue

The MRQy tool is not loading images correctly due to path misconfigurations.

Resolution Steps

Modify the Output Directory in results.tsv

The results.tsv file contains a crucial setting that dictates where MRQy searches for output images. If the path is incorrect, MRQy will not be able to locate and load the images. Follow these steps to correct the path:

  1. Open the results.tsv file located in your MRQy output directory.
  2. Locate the #outdir line.
  3. Change the path specified next to #outdir to the actual directory where the MRQy output is stored.
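
The three steps above can also be scripted. This is a minimal sketch under stated assumptions: the `#outdir` line is taken to be `#outdir` followed by a separator (colon, tab, or spaces) and then the path, and the helper name `set_outdir` and the sample text are hypothetical. Check the actual line in your results.tsv before relying on it, and keep a backup of the file.

```python
import re


def set_outdir(tsv_text, new_outdir):
    """Replace the path on the '#outdir' line, keeping its prefix and separator.

    Assumption: the line is '#outdir' + separator (colon, tab, spaces) + path.
    """
    pattern = re.compile(r"^(#outdir[:\t ]*)[^\r\n]*", flags=re.MULTILINE)
    # A function replacement avoids backslash-escape issues with Windows paths.
    updated, count = pattern.subn(
        lambda m: m.group(1) + new_outdir, tsv_text, count=1
    )
    if count == 0:
        raise ValueError("no #outdir line found")
    return updated


# Hypothetical file content for illustration only.
sample_tsv = "#outdir:\t/old/location\npatient\tmeasure_a\nP001\t1.5\n"
fixed_tsv = set_outdir(sample_tsv, "/new/location")
print(fixed_tsv.splitlines()[0])
```

To apply this to a real file, read results.tsv with `pathlib.Path.read_text()`, pass the text through `set_outdir`, and write the result back with `write_text()`.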

Update the Data Load Path in the User Interface

To ensure MRQy consistently uses the correct image path:

  1. Navigate to the UserInterface/scripts/ directory in your MRQy installation.
  2. Open the data_load.js file in a text editor.
  3. Look for the DATA_PATH variable within the script.
  4. Update DATA_PATH to the absolute path of your MRQy output directory.

Here's an example of what the change might look like in data_load.js:

// Before
let DATA_PATH = 'relative/path/to/data';

// After
let DATA_PATH = 'C:/absolute/path/to/data';

Make sure to replace 'C:/absolute/path/to/data' with the actual path to your MRQy output data.

With DATA_PATH set to an absolute directory, MRQy references the output data directly, so there is no need to copy index.html and the css, libs, and scripts folders from the UserInterface folder for each QC.py execution.
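
If the path needs to change often, the edit to data_load.js can be automated. The sketch below is an illustration only: it assumes DATA_PATH is declared on a single line as a quoted string with `let`, `var`, or `const`, as in the example above, and the helper name `set_data_path` is hypothetical. Backslashes are converted to forward slashes because a backslash inside a JavaScript string literal starts an escape sequence.

```python
import re


def set_data_path(js_source, new_path):
    """Point the DATA_PATH declaration in data_load.js text at new_path.

    Assumption: the declaration is a single line such as
    let DATA_PATH = 'some/path';
    """
    new_path = new_path.replace("\\", "/")  # avoid JS escape sequences
    pattern = re.compile(
        r"""^([ \t]*(?:let|var|const)\s+DATA_PATH\s*=\s*)(['"]).*?\2""",
        flags=re.MULTILINE,
    )
    updated, count = pattern.subn(
        lambda m: m.group(1) + m.group(2) + new_path + m.group(2),
        js_source,
        count=1,
    )
    if count == 0:
        raise ValueError("no DATA_PATH declaration found")
    return updated


before = "let DATA_PATH = 'relative/path/to/data';\n"
after = set_data_path(before, "C:/absolute/path/to/data")
print(after)
```

As with the manual edit, replace the example path with the actual location of your MRQy output data.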