Skip to content

nolanlab/REDSEA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

REDSEA

We present REinforcement Dynamic Spillover EliminAtion (REDSEA) as a solution for spillover compensation without loss of signal. This is the step-by-step guidance of how to use REDSEA to produce a FCS file with compensated channel signals. This FCS file can then be read for your downstream spatial analysis. A detailed manuscript that describes the data and method can be found here.

How it works

In brief, the REDSEA algorithm identifies the boundary region for each cell, based on the segmentation mask. Subsequently, for each channel from a single cell, signals were subtracted based on its shared boundary with neighboring cells, and their corresponding signal in the same channel. Moreover, the removed signal from adjacent cells can be reinforced back to the cell. For detailed information look at the Materials and Methods section of the paper.

Table of content

Example Data

Here we provide example datasets from two different multiplexed imaging modalities, MIBI and CyCIF.

MIBI data

Multiplexed Ion Beam Imaging (MIBI) is a method that uses secondary ion mass spectrometry to image antibodies tagged with isotopically pure elemental metal reporters. Here we will use MIBI data generated from non-human primate lymph nodes as part of the REDSEA manuscript Now Published.

CyCIF data

CyCIF is a method for highly multiplexed immuno-fluorescence imaging of formalin-fixed, paraffin-embedded (FFPE) specimens mounted on glass slides. Here we will use human tonsil data acquired with t-CyCIF method. The acquisition of this data is described in this paper (Rashid, Rumana, et al, Scientific Data 2019)

Required Inputs

MATLAB scripts

The MATLAB scripts include the REDSEA algorithm, along with other scripts created by Leeat Keren for processing multiplexed images, and a fcs processing script writeFCS created by Jakub Nedbal (https://www.mathworks.com/matlabcentral/fileexchange/42603-writefcs-fname-data-text-other).

tiff images and channel information

There should be a CSV file containing the channel name information.

And also a folder containing .tiff images for each channel, where the names should be same as in the CSV file.

Segmentation Mask

This method requires you to provide a cell segmentation mask. A cell nuclei probability matrix can be produced by your own chose (popular options include ilastik or deepcell). In our case for the MIBI data we implemented a trained-in-house deepcell CNN model from DeepCell.

DeepCell is also easy to implement on different imaging modalities. Here is a prediction model we trained with ~ 1500 cells in the CyCIF dataset:

In the example data folder, we have provided a nucleus prediction matrix for the MIBI and CyCIF data, with the name feature_1_frame_1_p1_.tif, in the deepCell sub folder.

After producing a nuclei probablity mask, we will then use the script MibiSegmentByDeepProbWithPerim3.m to implement a watershed algorithm for whole cell segmentation. This will produce something like this:

See MibiSegmentByDeepProbWithPerim3.m Script

%% Nuclear segmentation using pixel probabilities from deepCell
% Modified from original script from Leeat Keren
% Changed the pipeline to use the probablities from the deepcell instead of 
% the raw nuclear intensities.
% 25July2020, by Yunhao Bai and Sizun Jiang

% Main path for the all the data
mainPath = 'sampleData_MIBI'; %for MIBI
%mainPath = 'sampleData_cycIF'; %for CyCIF
resultsPath = [mainPath,'/segmentResults/']; % for the next step, put the 
% output segmentationParams.mat and PureSegmentation.tif to the
% originalTiff/Point? folder

for p=1:1
for segmentThres=0.01 % change from 0.01 to 0.1, segmentThres defines the 
% local maximum, higher segmentThres leads to fewer cells and more merged
% regions
for probNucThres=0.35 % change from 0.05 to 0.5, probNucThres defines how 
% board the cells will expand
    pointNumber=p;disp(['point',num2str(p)]);
    
    % read .tiff image of nucleus marker to matrix for image  
    pathNucleusMarker = [mainPath,'/originalTIFF/Point',num2str(p),'/dsDNA.tiff']; %for MIBI
    %pathNucleusMarker = [mainPath,'/originalTIFF/Point',num2str(p),'/x7500y3500_1700_DAPI.tif']; %for CyCIF
    t = Tiff(pathNucleusMarker,'r');
    nucIm = read(t);
    %the max value of nucleus channel .tiff
    maxv=25; %for MIBI
    %maxv=3000; %for CyCIF
    rgb_image = MibiGetRGBimageFromMat(nucIm,maxv);

    % %% Get maxima from deepCell probabilities
    % read possibility map from deepCell/ilastik/other segmentation methods
    probNuc = double(imread([mainPath,'/deepCell/feature_1_frame_1_p',num2str(p),'_dsDNA.tif'])); %for MIBI
    %probNuc = double(imread([mainPath,'/deepCell/feature_1_frame_1_p',num2str(p),'_DAPI.tif'])); %for CyCIF
    figure;imagesc(probNuc);

    % find local maxima in probability map
    maxs = imextendedmax(probNuc,segmentThres); % change from 0.1 to 0.02
    rgb_image_perim_extMax = imoverlay(rgb_image , maxs, [1 0 0]);   
    figure;
    imagesc(rgb_image_perim_extMax);
    
    %% watershed over the deep results
    bw1 = zeros(size(probNuc));
    bw1(probNuc>probNucThres) = 1; %change from 0.05 to 0.5
    figure; imagesc(bw1);
    bw2 = bwareaopen(bw1,40);
    SE = strel('disk',4);
    bw=imdilate(bw2,SE);
    
    [B,L] = bwboundaries(maxs,4,'noholes');
    maxsFix = bw & maxs;

    % modify the image so that the background pixels and the extended maxima 
    % pixels are forced to be the only local minima in the image.
    Jc = imcomplement(probNuc);
    I_mod = imimposemin(Jc, ~bw | maxsFix);
    L = watershed(I_mod);
    labeledImage = label2rgb(L);
    
    cellPerimNewMod= L;
    cellPerimNewMod(L>0) = 100;
    cellPerimNewMod(cellPerimNewMod==0)=1;
    cellPerimNewMod(cellPerimNewMod==100)=0;
    
    rgb_image_cellPerim = imoverlay(rgb_image, cellPerimNewMod, [1 0 0]);
    figure;
    imagesc(rgb_image_cellPerim);

    %% 1. For each label, decide whether it is of a nucleus/ background
    t=40;
    labelNum = length(unique(L(:)));
    labelIdentity = zeros (labelNum,1);
    labelPixelsPercentInNucleiMask = zeros(labelNum,1);
    labelSize = zeros(labelNum,1);
    
    %spend a lot of time during this loop 
    for i=1:labelNum
        [r c] = find(L==i);
        labelSize(i) = length(r); 
        labelMask = (L==i);
        labelPixelsNumInNucleiMask = sum(bw(labelMask));
        labelPixelsPercentInNucleiMask(i) = labelPixelsNumInNucleiMask / labelSize(i);
        if (labelPixelsPercentInNucleiMask(i) > 0.7)
            labelIdentity(i) = 1;
        end
    end

    % 2. Merge small regions within the nuclei mask with their neighbours
    keepVec = ones(labelNum,1);
    newL = L;
    for i=1:labelNum
        if (labelIdentity(i) == 1) && (labelSize(i) < t)
            disp(['Removing label ',num2str(i),'. Size: ',num2str(labelSize(i))]);
            % get neighbour with largest border that is also in nuclear region
            [neighbourLabels , neighbouringRegionSize] = MibiGetNeighbourLabels (newL, i);
            found = 0;
            [neighbouringRegionSizeS , neighbouringRegionSizeSInd] = sort(neighbouringRegionSize,'descend');
            neighbourLabelsS = neighbourLabels(neighbouringRegionSizeSInd);
            maxInd = 1;
            while ~found
                mergeLabelId = neighbourLabelsS(maxInd);
                if (~(mergeLabelId == 0) && (labelIdentity(mergeLabelId) == 1))
                    found = 1;
                else
                    maxInd = maxInd+1;
                end
                if (maxInd >length(neighbourLabelsS)) % reached end of neighbours with no good merging candidate
                    disp (['Warning: no good merging target found for label', num2str(i), '. Keeping it.']);
                    break;
                end
            end
            % update
            if (maxInd <= length(neighbourLabelsS))
                [newL] = MibiMergeLabels (newL, i, mergeLabelId);
                keepVec(i) = 0;
            end
        end
    end

    % Update label numbers to account for deleted labels
    allLabels = [1:labelNum];
    currLabels =  allLabels(keepVec == 1);
    labelIdentityNew = zeros(length(currLabels),1); 
    newLmod = newL;
    for i = 1:length(currLabels)
        newLmod(newLmod == currLabels(i)) = i;
        labelIdentityNew(i) = labelIdentity(currLabels(i));
    end

    cellPerimNewMod = bwperim(newLmod);
    cellPerimNewMod = newLmod;
    cellPerimNewMod(newL>0) = 100;
    cellPerimNewMod(cellPerimNewMod==0)=1;
    cellPerimNewMod(cellPerimNewMod==100)=0;
    rgb_image_cellPerimNewMod = imoverlay(rgb_image, cellPerimNewMod, [1 0 0]);
    % add ilastik segmentation
    % ilsegParams = load([path,'/SegmentPerim/Point',num2str(pointNumber),'/segmentationParams.mat']);
    % rgb_image_cellPerimNewModIl = imoverlay(rgb_image, ilsegParams.cellPerimNewMod, [0 0 0]);
    % rgb_image_cellPerimNewModBoth = imoverlay(rgb_image_cellPerimNewModIl, cellPerimNewMod, [1 0 0]);
    
    figure;
    imagesc(rgb_image_cellPerimNewMod);
    mkdir([resultsPath,'/Point',num2str(pointNumber),'_',num2str(segmentThres),'_',num2str(probNucThres)]);
    imwrite(rgb_image_cellPerimNewMod,[resultsPath,'/Point',num2str(pointNumber),'_',num2str(segmentThres),'_',num2str(probNucThres),'/compareIlastikDeep.tif'],'tif');
    imwrite(cellPerimNewMod,[resultsPath,'/Point',num2str(pointNumber),'_',num2str(segmentThres),'_',num2str(probNucThres),'/PureSegmentation.tif'],'tif');
    save([resultsPath,'/Point',num2str(pointNumber),'_',num2str(segmentThres),'_',num2str(probNucThres),'/segmentationParams.mat'],'newLmod','cellPerimNewMod','labelIdentityNew');
end
end
end

disp('Please put the PureSegmentation.tif and segmentationParams.mat to the corresponding originalTiff folder for next step.');

This process produced a file called segmentationParams.mat and stored in a subfolder. This file will be used automatically by the REDSEA algorithm.

REDSEA

Parameters

With the .tiff images, .mat segmentation file and .csv channel information, we are now ready to implement REDSEA for boundary compensation.

There are two methods for boundary compensation in REDSEA: Sudoku and Cross. The algorithm walks through the boundaries of each cell, and decides the area to extract signal. You would need to choose one of the two methods and decide how many pixels to expand from the boundary pixel:

The pixel number for expansion should be proportional to the cell size: in our MIBI data, the average cell size is 107 pixels, and we used 2 pixels for expansion; In the CyCIF tonsil dataset the average cell size is 325 pixels, so 3-4 pixels is recommended.

Also, you need to supply a list of channel names to perform the compensation process: normally you should only compensate for the surface markers, like in our case: 'CD4';'CD56';'CD21 (CR2)';'CD163';'CD68';'CD3';'CD20';'CD8a'

Take a look at the annotated code in the block under:

See MibiExtractSingleCellDataFromSegmentationAndTiff_REDSEA.m Script

%% Extract single cell data and do the REDSEA compensation
% Based on the original script from Leeat Keren
% This now reads in any folder of TIFFs with individual channels from the 
% same field of view, and uses that to recreate a countsNoNoise based on 
% the massDS order.
% Then performs REDSEA compensation as implemented by Yunhao Bai
% The outputs will be REDSEA compensated and non-compensated FCS files
% 4May2020, Yunhao Bai, Sizun Jiang


% Main path for the all the data
mainPath = 'sampleData_MIBI'; %for MIBI
%mainPath = 'sampleData_cycIF'; %for CyCIF

% This is a csv file for your channel labels within
massDS = dataset('File',[mainPath,'/sampleData.csv'],'Delimiter',',');

% This assumes the path points to a folder containing all the Points from 
% the run. Your segmentationParams.mat from each point should be in the 
% each Point's folder
pathTiff = [mainPath,'/originalTiff']; 

% This is where the FCS file output will go to
pathResults = [mainPath,'/FCS_output'];

% Select the channels that are expected to be expressed. Cells with minimal
% expression of at least one of these channels will be removed
clusterChannels = massDS.Label; % exclude elemental channels (here the 
% sample dataset does not contain 
[~, clusterChannelsInds] = ismember(clusterChannels,massDS.Label);

% boundaryMod determines the type of compensation done.
% 1:whole cell compensation
% 2:boundary compensation (default)
boundaryMod = 2;
% REDSEAChecker determines the type of compensation done.
% 0:only subtraction; 
% 1:subtraction and reinforcement (default)
REDSEAChecker = 1;

% for boundary compensation, needs to specify elementShape. 
% 1:Sudoku style, 2:Cross style
elementShape = 2;
% elementSize. How many pixels around the center to be considered for the
% elementShape, can be selected from 1-4.
% As a default, keep elementShape and elementSize as 2.
elementSize = 2;

% Select channels for REDSEA compensation. Surface markers are recommended
% boundary compensation codes
% selected channels to do the boundary compensation
normChannels = {'CD4';'CD56';'CD21 (CR2)';'CD163';'CD68';'CD3';'CD20';'CD8a'}; %for MIBI
%normChannels = {'x7500y3500_1700_DAPI';'x7500y3500_1700_CD3';'x7500y3500_1700_CD4';'x7500y3500_1700_CD8a';'x7500y3500_1700_CD11b';'x7500y3500_1700_CD20';'x7500y3500_1700_CD45';'x7500y3500_1700_CD68'}; %for CyCIF
[~, normChannelsInds] = ismember(normChannels,massDS.Label);
channelNormIdentity = zeros(length(massDS.Label),1);
% Getting an array of flags for whether to compensate or not
for i = 1:length(normChannelsInds)
    channelNormIdentity(normChannelsInds(i)) = 1;
end

% Whether what to plot scatter to check the REDSEA result and effect,
% default=0 for not, 1 for plotting.
% Note that if multiple channels selected (in normChannels), to plot out all
% the sanity plots need long time.
plotSanityPlots = 0;

%%
mkdir(pathResults);

for p=1:1
    disp(['point',num2str(p)]);
    pointNumber = p;
    % load tiffs to recreate countsNoNoise
    for i=1:length(massDS.Label)
        t = imread([pathTiff, '/Point', num2str(pointNumber), '/', massDS.Label{i}, '.tiff']); %mind the .tif and .tiff
        d = double(t);
        % imshow(d)
        countsNoNoise(:,:,i) = d;
    end
        
    % load segmentation file
    load([pathTiff, '/Point', num2str(pointNumber), '/segmentationParams.mat']);
    labelNum = max(max(newLmod));
    channelNum = length(massDS);
    stats = regionprops(newLmod,'Area','PixelIdxList'); % Stats on cell size. Region props is DF with cell location by count
    countsReshape= reshape(countsNoNoise,size(countsNoNoise,1)*size(countsNoNoise,2),channelNum);
    
    % make a data matrix the size of the number of labels x the number of markers
    % Include one more marker for cell size
    data = zeros(labelNum,channelNum);
    dataScaleSize = zeros(labelNum,channelNum);
    cellSizes = zeros(labelNum,1);
    
    % for each label extract information
    for i=1:labelNum
        currData = countsReshape(stats(i).PixelIdxList,:);
        data(i,1:channelNum) = sum(currData,1);
        dataScaleSize(i,1:channelNum) = sum(currData,1) / stats(i).Area;
        cellSizes(i) = stats(i).Area;
    end
    
    %% do cell boundary compensation
    if boundaryMod == 1
        dataCompen = MIBIboundary_compensation_wholeCellSA(newLmod,data,channelNormIdentity,REDSEAChecker);
    elseif boundaryMod == 2
        dataCompen = MIBIboundary_compensation_boundarySA(newLmod,data,countsNoNoise,channelNormIdentity,elementShape,elementSize,REDSEAChecker);
    end
    dataCompenScaleSize = dataCompen./repmat(cellSizes,[1 channelNum]);

    % %% Add point number 
    % pointnum = double(repmat(p, 1, length(data))); 
    % pointnum = repelem(p, length(data),[1]);
    % data = [data, pointnum];
    % dataScaleSize = [dataScaleSize, pointnum];
    % dataCompen = [dataCompen, pointnum];
    % dataCompenScaleSize = [dataCompenScaleSize, pointnum];

    %%
    % get the final information only for the labels with 
    % 1.positive nuclear identity (cells)
    % 2.that have enough information in the clustering channels to be
    % clustered
    labelIdentityNew2 = labelIdentityNew([1:end-1]); % fix bug resulting from previous script
    sumDataScaleSizeInClusterChannels = sum(dataScaleSize(:,clusterChannelsInds),2);
    labelIdentityNew2(sumDataScaleSizeInClusterChannels<0.1) = 2;
    
    dataCells = data(labelIdentityNew2==1,:);
    dataScaleSizeCells = dataScaleSize(labelIdentityNew2==1,:);
    dataCompenCells = dataCompen(labelIdentityNew2==1,:);
    dataCompenScaleSizeCells = dataCompenScaleSize(labelIdentityNew2==1,:);

    labelVec=find(labelIdentityNew2==1);
    
    % get cell sizes only for cells
    cellSizesVec = cellSizes(labelIdentityNew2==1);

    dataL = [labelVec,cellSizesVec,dataCells,repmat(p,[length(labelVec) 1])];
    dataScaleSizeL = [labelVec,cellSizesVec,dataScaleSizeCells,repmat(p,[length(labelVec) 1])];
    dataCompenL = [labelVec,cellSizesVec,dataCompenCells,repmat(p,[length(labelVec) 1])];
    dataCompenScaleSizeL = [labelVec,cellSizesVec,dataCompenScaleSizeCells,repmat(p,[length(labelVec) 1])];

    % dataTransStdL = [labelVec,dataCellsTransStd];
    % dataScaleSizeTransStdL = [labelVec,dataScaleSizeCellsTransStd];
    % dataStdL = [labelVec,dataCellsStd];
    % dataScaleSizeStdL = [labelVec,dataScaleSizeCellsStd];

    % channelLabelsForFCS = ['cellLabelInImage';'cellSize';massDS.Label];
    channelLabelsForFCS = ['cellLabelInImage';'cellSize';massDS.Label;'PointNum'];
    
    %% output
    outputPath = [pathResults,'/Point',num2str(pointNumber),'/BM=',num2str(boundaryMod),'_RC=',num2str(REDSEAChecker),'_Shape=',num2str(elementShape),'_Size=',num2str(elementSize)];
    mkdir(outputPath);

    % plot sanity scatter images
    if plotSanityPlots == 1
        pathSanityPlots = [outputPath,'/sanityPlots/'];
        mkdir(pathSanityPlots);
        MIBIboundary_compensation_plotting(dataScaleSizeCells,dataCompenScaleSizeCells,normChannels,normChannelsInds,pathSanityPlots);
    end    
    
    % save fcs
    TEXT.PnS = channelLabelsForFCS;
    TEXT.PnN = channelLabelsForFCS;
    
    save([outputPath,'/cellData.mat'],'labelIdentityNew2','labelVec','cellSizesVec','dataCells','dataScaleSizeCells','dataCompenCells','dataCompenScaleSizeCells','channelLabelsForFCS');
    writeFCS([outputPath,'/dataFCS.fcs'],dataL,TEXT);
    writeFCS([outputPath,'/dataScaleSizeFCS.fcs'],dataScaleSizeL,TEXT);
    writeFCS([outputPath,'/dataRedSeaFCS.fcs'],dataCompenL,TEXT);
    writeFCS([outputPath,'/dataRedSeaScaleSizeFCS.fcs'],dataCompenScaleSizeL,TEXT);

    % writeFCS([outputPath,'/dataTransFCS.fcs'],dataTransL,TEXT);
    % writeFCS([outputPath,'/dataScaleSizeTransFCS.fcs'],dataScaleSizeTransL,TEXT);
    % writeFCS([outputPath,'/dataStdFCS.fcs'],dataStdL,TEXT);
    % writeFCS([outputPath,'/dataScaleSizeStdFCS.fcs'],dataScaleSizeStdL,TEXT);
    % writeFCS([outputPath,'/dataTransStdFCS.fcs'],dataTransStdL,TEXT);
    % writeFCS([outputPath,'/dataScaleSizeTransStdFCS.fcs'],dataScaleSizeTransStdL,TEXT);
end

Sanity plots

To visually inspect if the parameters described in the previous section are optimal for your data, we provided with a flag in the script plotSanityPlots = 1. Once set to 1, it will produce the pairwise combination scatter plots of all the compensated channels. User can use these plots to evaluate if the compensation is optimal (for example looking at CD4-CD8, CD3-CD20 etc.) It is suggested to run this first with less channels, find the optimal parameters, then run the full list.

Here we showed a sainity plot of CD3-CD20 for the CyCIF data:

The x and y axis are raw intensity/counts of the channels, scaled by each individual cell's size. In the plot we can see a reduction of CD3-CD20 double positive cells (less orange dots than blue dots in the middle of the plot), and at the same time retention of single-positive signals.

The sanity plots will be produced in the sanityPlots sub-folder, along with the output of fcs files.

Output

REDSEA will produce the 4 FCS files for downstream analysis:

dataFCS.fcs = raw counts for each single cell without compensation

dataRedSeaFCS = raw counts for each single cell with REDSEA compensation

dataScaleSizeFCS.fcs = counts for each single cell as scaled by cell size (counts/cellSize) without compensation

dataRedSeaScaleSizeFCS.fcs = counts for each single cell as scaled by cell size (counts/cellSize) with REDSEA compensation

It is recommended to use the dataRedSeaScaleSizeFCS.fcs file. We can see by using REDSEA compensation, the boundary signal spillover is dynamically eliminated and reinforced.

For the MIBI dataset:

And REDSEA also works comparably on the CyCIF data:

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •