Exported channel values do not agree with what was shown in FlowJo Biex? #163

Closed
denvercal1234GitHub opened this issue May 9, 2023 · 8 comments

Comments

@denvercal1234GitHub

denvercal1234GitHub commented May 9, 2023

Hi there,

Thanks again for the tool and your help so far!

I hope to get clarification on why the exported channel values do not agree with the numerical values I saw on the FlowJo plots. For example, in FlowJo I saw a clear population that follows the diagonal line (double-positive), but this population does not look as prominent when I plotted the channel values in R.

I thought that if we exported the FCS from FlowJo as channel values, then the transformation done by FlowJo (in my case, bi-ex) would be preserved in the exported values, and that this is why in the Spectre workflow we would not need to perform any transformation.

As viewed in FlowJo (values go up to 10^5 on both axes):

Screenshot 2023-05-09 at 21 03 02

Reading in the exported channel values:

channel_data.list <- Spectre::read.files(file.loc = "....../FlowJoBiExED_ChannelValues", file.type = ".csv", do.embed.file.names = TRUE)

Exported channel values have values below 10^3 (some rows are shown as example):

Screenshot 2023-05-09 at 21 08 10 Screenshot 2023-05-09 at 21 08 45

The same parameters plotted in R look different than in FlowJo:

Screenshot 2023-05-09 at 21 18 24
@ghar1821
Member

ghar1821 commented Jun 2, 2023

I believe the channel values serve as an alternative to the bi-ex transformation, with both methods aiming to preserve the distribution of marker expression. Given their distinct way of transforming data, it's reasonable not to expect identical numerical values from both methods. After all, if they yielded the same transformed values, there wouldn't be a need for two separate methods, would there?

When you export the channel values from FlowJo, it performs a linear binning transformation on the data, which negates the need for further bi-ex, logicle, or arc-sinh transformations in downstream analysis.

If you find that the channel values don't quite meet your analysis needs, you could consider exporting as CSV scale values. From there, you can apply either a logicle or arc-sinh transformation using the do.logicle or do.asinh functions. Unfortunately, we don't currently offer functions to do bi-ex transformation.
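As a rough illustration of the second route (export scale values, then transform in R), the sketch below applies an arcsinh transformation using base R only. The marker names, the simulated values, and the cofactor of 500 are placeholders for the example, not recommendations; in Spectre this step would go through the do.asinh function mentioned above.

```r
# Minimal sketch: arcsinh-transform exported *scale* values in base R.
# The data, marker names and cofactor below are made up for illustration.
set.seed(1)
scale_values <- data.frame(
  CD4 = c(rnorm(5, mean = 0, sd = 50), rnorm(5, mean = 20000, sd = 3000)),
  CD8 = c(rnorm(5, mean = 0, sd = 50), rnorm(5, mean = 500, sd = 100))
)

cofactor <- 500  # assumption: needs tuning per instrument/marker

# asinh is roughly linear near zero and roughly logarithmic for large values
asinh_values <- as.data.frame(
  lapply(scale_values, function(x) asinh(x / cofactor))
)
names(asinh_values) <- paste0(names(scale_values), "_asinh")

# A raw value of zero stays at zero after the transformation
zero_after <- asinh(0 / cofactor)
zero_after

summary(asinh_values)
```

Unlike exported channel values, data transformed this way keeps negative cells at zero or below, at the cost of having to choose a cofactor.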

I'd also recommend reading the following two guides on data transformation. They might provide some additional insight:

@tomashhurst
Member

@denvercal1234GitHub is there any chance the X and Y are flipped between the FlowJo and R examples? There is a string of cells on the bottom right that looks similar to those on the top left in the FlowJo example.

@denvercal1234GitHub
Author

denvercal1234GitHub commented Jul 22, 2023

Hi @tomashhurst and @ghar1821 -- Thank you for the input. The reason I wanted to use channel values was to eliminate the need to decide on a cofactor for the transformation in R.

Below is another example of an FCS file after I transformed it in FlowJo with the bi-exponential transformation. I then exported it as channel values (.csv) and imported it into R.

Screenshot 2023-07-22 at 18 04 57

Once the channel values are imported into R, the values are in the hundreds and none of the cells is at 0 any more (even though, looking at the FlowJo plots, it seems some cells should be at 0?). As a result, when I clustered and then plotted the expression levels across clusters, the baseline is not 0; all are around ~200 (as mentioned in HelenaLC/CATALYST#358).

Is this normal? It is a bit strange for most cells to have a baseline expression in the hundreds... or is that just how bi-exponential transformed data look? I want to make sure that the exported channel values are compatible with FlowSOM clustering without any additional steps I am not aware of.

Screenshot 2023-07-22 at 18 08 12 Screenshot 2023-07-22 at 18 06 51

Other markers do have some cells at 0, however:

Screenshot 2023-07-22 at 18 07 11

Thank you for your help!

@denvercal1234GitHub
Author

Also @ghar1821 @tomashhurst --- Should we even use channel values exported from FlowJo (after visually transforming the data using biexponential in FlowJo) for clustering purposes? In other posts, it was mentioned that data transformed and exported from FlowJo are not reliable (HelenaLC/CATALYST#358 (comment)). Thank you again for your input.

@SamGG

SamGG commented Jul 28, 2023

To make my point clearer, I consider FJ results as correct, but I don't know how to reproduce FJ scaling.
Good to have feedback from Spectre team. I will read the links you pointed above when I have time.
Best.

@denvercal1234GitHub
Author

Thank you @SamGG. My biggest concern is whether we can use channel values (transformed by FlowJo) for clustering, because the clustering result shown above has a baseline of around ~200-300 rather than 0 for all markers, which is a bit odd to interpret.

@SamGG

SamGG commented Jul 28, 2023

Using your figure, here is what I think is happening. I added a pseudo scale ranging from 0 to 1000 (or should it be 1024?). I think this is the mapping that FJ is applying to any transformed channel. This shows that the zero is around 250.
I added 1 green box and 2 grey boxes. Those boxes represent 50% (green) and 25% (grey) of the full scale. The green box shows the range of intensity that is really used. The grey boxes show the ranges with no cell.
FJ_scaling
I think you should scale each channel so that the intensity covers the full 0..1000 pseudo range.
I didn't test yet whether scaling to full range is important or not, but it is on my long todo list. I think it should not be important if dimension reduction is conducted in FJ. If transformation is carried out in Spectre, FlowSOM, CATALYST, R... then zero will be at zero.
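A toy version of that mapping, to make the arithmetic concrete. This is not FlowJo's actual bi-exponential implementation: it uses arcsinh as a stand-in, and the display limits, cofactor, and 1024-channel scale are assumptions chosen for illustration. It only shows why zero lands partway up the channel scale when the displayed axis extends below zero.

```r
# Toy model only: NOT FlowJo's algorithm. arcsinh stands in for bi-ex,
# and the display limits, cofactor and channel count are invented.
to_channel <- function(x, display_min, display_max,
                       cofactor = 150, n_channels = 1024) {
  t_min <- asinh(display_min / cofactor)
  t_max <- asinh(display_max / cofactor)
  # Linear position of the transformed value along the displayed axis (0..1)
  frac <- (asinh(x / cofactor) - t_min) / (t_max - t_min)
  frac <- pmin(pmax(frac, 0), 1)  # clamp values that fall off the axis
  frac * (n_channels - 1)         # place that position on the channel scale
}

# With an axis running from -1e3 to 1e5, zero is not at channel 0
zero_channel <- to_channel(0, display_min = -1e3, display_max = 1e5)
zero_channel

# Cells scattered around zero therefore sit at a few hundred channels
near_zero <- to_channel(c(-200, 0, 200, 1e4),
                        display_min = -1e3, display_max = 1e5)
near_zero
```

In this toy model, moving display_min closer to zero pushes the zero channel down, which is the same effect as adjusting the axis so that the populated range fills more of the scale.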
Hope this helps.
@tomashhurst @ghar1821 what is your opinion/experience?

@tomashhurst
Member

@denvercal1234GitHub just looking back over some of these issues -- @SamGG's image summarises it well, and this is also described in our transformation tutorial (https://immunedynamics.io/spectre/cytometry/#tutorials). Personally, I have run clustering on both channel value data and arcsinh-transformed data. In theory the channel data has less overall 'sensitivity' (i.e. the range is something like 650) compared to arcsinh-transformed data (which potentially has ~10^5 distinct values, counting decimal places after scaling). However, I have not found huge differences between the two. If you run clustering/tSNE/UMAP etc. in FlowJo, it actually uses the channel values behind the scenes.

@SamGG is right that in theory it would be best to scale each parameter individually such that the maximum range is utilised. We found this tedious to do in FlowJo, but easy to do in R with arcsinh transformations.
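For reference, scaling each parameter individually can be sketched in a few lines of base R. The helper name rescale_marker, the quantile limits (0.1% and 99.9%), and the simulated data are assumptions made for this example, not settings or functions taken from Spectre.

```r
# Sketch: rescale each marker to 0..1 using its own lower/upper quantiles.
# The helper, the quantile cut-offs and the example data are illustrative.
rescale_marker <- function(x, lower = 0.001, upper = 0.999) {
  lims <- quantile(x, probs = c(lower, upper), na.rm = TRUE, names = FALSE)
  if (lims[2] == lims[1]) return(rep(0, length(x)))  # constant column
  scaled <- (x - lims[1]) / (lims[2] - lims[1])
  pmin(pmax(scaled, 0), 1)  # clip outliers beyond the chosen quantiles
}

set.seed(42)
channel_values <- data.frame(
  MarkerA = runif(1000, min = 250, max = 750),  # uses about half the scale
  MarkerB = runif(1000, min = 200, max = 400)   # uses a narrow slice
)

rescaled <- as.data.frame(lapply(channel_values, rescale_marker))
sapply(rescaled, range)  # each marker now spans its own 0..1 range
```

Using quantiles rather than the raw minimum and maximum keeps a handful of extreme events from compressing the rest of the distribution.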
