Skip to content
johncolby edited this page Dec 15, 2011 · 20 revisions

trk_compile_data.m

This tool is the main workhorse for performing along-tract processing of tract groups. We have seen the basic steps that are involved in Basic workflow, and then we saw how they can be linked together in Linking it all together. trk_compile_data.m simply takes these ideas and builds them into a flexible wrapper function.

Directory structure

This function assumes you have some base directory subsDir that contains subject folders with names subIDs. Inside of these subID folders are assumed to be the .trk track group files, named <tract>_<hemisphere>.trk, as well as an FA map named dti_fa.nii.gz. If your setup differs, it should be a simple tweak to adapt this function to your specific needs. Here is the directory tree of the included example data to make this clear:

`-- example
    `-- subject1
        |-- CST_L.trk
        `-- dti_fa.nii.gz

tract_info dataset

The tract_info dataset specifies which tracts and hemispheres we want to process. It will typically be loaded in from a tab delimited tract_info.txt file. It also specifies an optional startPt if you want to use automatic streamline reorienting, view settings for the QC figures (type help view for more info), as well as an optional nPts if you want to hard code the number of interpolation points. Here is what the included tract_info.txt looks like for the example data:

Tract Hemisphere startPt view nPts
CST L [NaN NaN NaN] [90 0]

This means we want to process the left corticospinal tract files only, we don’t want to prescribe an automatic origin for all the files (although with the CST this would probably work out fine), we want the QC figure to be a sagittal view from the left side, and we want the number of interpolation points to be determined adaptively.

Number of interpolation points

The number of interpolation points nPts for each tract can either be prescribed in the tract_info dataset, or, if that field is left empty, determined adaptively based on the voxel dimensions and the length of the streamlines for each subject processed. As an example, if the voxel dimension is 2.5 mm, and the mean streamline length for a subject is 100 mm, the adaptive setting would choose to interpolate the tract with 100/2.5 = 40 vertices.

The number of interpolation points that are used for each tract/hemisphere is output in the variable nPts. A useful approach is to first do one pass of trk_compile_data to determine all of the nPts adaptively. Then for each tract, find the mean nPts across all subjects. Then finally rerun trk_compile_data with these predetermined nPts prescribed in tract_data. Here is how the nPts were determined for the [[between_group|between-group]] example (4 tracts, 20 subjects):

>> [track_means,starting_pts_out,nPts] = trk_compile_data(subsDir,subIDs,tract_info,[],starting_pts_in);

>> nPts(nPts==0) = NaN

nPts =

  Columns 1 through 15

    34    40   NaN    35    40    29   NaN   NaN    33    28    34    38    33    28    35
   NaN    43   NaN   NaN    40   NaN   NaN   NaN    32    27    29    35    38   NaN   NaN
    27    34    43    34    39    30    29    35    32    24    34    35    35    32    37
    30    32    34    34    40    29    37    30    32    28    35    27    38    26    42

  Columns 16 through 20

    26    35    30   NaN    36
   NaN   NaN    40   NaN    40
    36    37    37    22    31
    42    36    31    28    33
    
>> round(nanmean(nPts,2))

ans =

    33
    36
    33
    33

trk_flip – Interactive or automated

For some tracts that are relatively long and linear (ex: corticospinal tract, inferior fronto-occipital fasciculus), prescribing a single point in space is sufficient to properly reorient all of the streamlines across subjects. For other tracts, however (ex: arcuate fasciculus), variability in the position of the endpoints of the streamlines means that if a single pt_start was used for all subjects, it would fail to properly reorient all of the streamlines.

The order that we look for which pt_start to use for trk_flip is:

  1. tract_info.txt – If there is a starting point given, it automatically will be used for all subjects.
  2. starting_pts_in – If given, these pre-recorded subject-specific points automatically will be re-used. This list is what was actually used in a previous analysis, so it could be a combination of some originally interactive and some originally automatic points.
  3. Interactively – If neither of these are specified, MATLAB will open up a plot window for the user to interactively choose the pt_start for each subject. [See video]

QC figures

When doing batch processing like this, it is important to perform some type of quality control on the results. If you are determining the tract origins interactively, you will already get to see the raw streamlines and prescribe exactly where you want the tract origin to be. If you are doing a more automatic mode, however, you won’t get to see the output. This is why it’s nice to make some automatic QC figures. Here we plot the mean streamline geometry on top of a slice from the FA map. This makes it easy to see if the streamline reorienting was successful, and additionally lets you rapidly scan for other mistakes like misnamed files, etc..

Outputs

  • starting_pts_out.txt – The pt_start list that was used for trk_flip. This can be re-used in the future to save time.
  • Tracking_QC_<trkName>.pdf – A set of QC images. These are particularly useful with the automated approach to quickly eyeball the quality of the along-tract processing results.
  • trk_data.txt – This is the main output of the along-tract processing. It is an Excel-type tab delimited data table that stores all of the along-tract data for subsequent statistical analysis.
ID	Point	Hemisphere	Tract	FA	SD
subject1	1	L	CST	0.4375	0.1185
subject1	2	L	CST	0.4809	0.1118
subject1	3	L	CST	0.5061	0.1192
subject1	4	L	CST	0.5418	0.1080
subject1	5	L	CST	0.5502	0.1132
subject1	6	L	CST	0.5739	0.1217
subject1	7	L	CST	0.5776	0.1265
subject1	8	L	CST	0.6134	0.1101
subject1	9	L	CST	0.6169	0.0992
subject1	10	L	CST	0.6339	0.0966
...
  • trk_props_long.txt – Whole tract properties (for now, just the number of streamlines) are output to a separate file. This makes it easier to skim through, since there will only be one line for each tract group.
  • <subID>_<trkName>.txt – (Optionally) Raw streamlines from the original tract group in a plain text ASCII format for easy plotting in R.
  • <subID>_<trkName>_mean.trk – (Optionally) Mean tract geometry with attached along-tract cross-sectional mean scalar metric (e.g. FA) for display in TrackVis.

Example

If you look in the MATLAB help for trk_compile_data, you’ll see that there is a simple example described:

Example:
     subsDir         = exDir;
     subIDs          = {'subject1'};
     tract_info      = dataset('file', fullfile(exDir, 'tract_info.txt'));
     starting_pts_in = dataset('file', fullfile(exDir, 'starting_pts_out.txt'));
     [track_means,starting_pts_out,nPts] = trk_compile_data(subsDir,subIDs,tract_info,[],starting_pts_in,1,1);

In words, this tells the program to go to the example directory and process the subject1 folder. In the tract_info for this demo (see above), we have already seen that there is no automatic pt_start given. Since we are passing in a starting_pts_in list, though, these previously identified points will be re-used. (If we didn’t pass in the starting_pts_in argument, pt_start would be determined interactively.) Also, since the number of interpolation points nPts is not specified in tract_info, this will be determined adaptively. Finally, since we have added the flags for saveTrk and saveASCII, the tract mean geometry will be saved to a .trk file, and the raw streamlines will be saved to an ASCII .txt file.

All of the outputs from this example are included in the example directory.

Alternative correspondence schemes

With the default usage of trk_interp, each streamline is resampled so that it has the same number of points, spread evenly along its length. This prescribes 2 points of correspondence – one at either end. However, since the ends of the streamlines typically have higher variability (in FA estimates, geometric position, etc.), it can be desirable to prescribe an extra tie-down point somewhere in the middle of the tract. This is achieved using the tie_at_center flag in trk_interp. As of v1.1, this is the default in trk_compile_data.

See tie-at-center for more information and a comparison between these methods on the example data.
See Alternative-correspondence-schemes for more information on alternative correspondence schemes that don’t use spline-based resampling.