Question about --collapse #94

Antonia-Chalka · 2021-07-19T10:02:04Z

I have a very basic question about how the --collapse flag determines grouping. Does it collapse genotypes that have the exact same distribution across all the samples, or is some other type of correlation statistic used to determine that (and if so, what is it and what is the threshold)?

Both the README and the paper note the following:

For each phenotype supplied via columns in the traits file, Scoary does the following: first, correlated genotype variants are collapsed. Plasmid genes, for example, are typically inherited together rather than as individual units and Scoary will collapse these genes into a single unit.


Antonia-Chalka · 2021-07-19T10:04:34Z

From a quick look at the code in the methods script, it seems the correlation has to be perfect, but there's also a mention of a 'softer' threshold, so I'm not 100% sure 😅

AdmiralenOla · 2021-10-21T09:54:38Z

Thanks for your question, and sorry about the wait.

As you have already figured out, the genotypes need to be 100% correlated to be collapsed. You may also have seen from the code that I thought about using a softer threshold, but I have never gotten around to implementing that.
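To make the "100% correlated" rule concrete, here is a minimal sketch of the idea, not Scoary's actual code. It assumes perfect correlation means an identical presence/absence pattern in every isolate. The function name `collapse_identical`, the input layout (gene name mapped to a tuple of 0/1 values) and the `--` naming of collapsed units are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def collapse_identical(presence):
    """Group genes whose presence/absence pattern is identical in every isolate.

    `presence` maps gene name -> sequence of 0/1 values, one per isolate.
    This layout is an assumption for the example, not Scoary's internal one.
    """
    groups = defaultdict(list)
    for gene, pattern in presence.items():
        groups[tuple(pattern)].append(gene)
    # One collapsed unit per distinct pattern, named by joining its members.
    return {"--".join(sorted(genes)): pattern for pattern, genes in groups.items()}


example = {
    "geneA": (1, 1, 0, 0),
    "geneB": (1, 1, 0, 0),  # same pattern as geneA, so the two are collapsed
    "geneC": (1, 0, 0, 0),  # differs in one isolate, so it stays separate
}
collapsed = collapse_identical(example)
```

In this toy input, geneA and geneB end up in one unit while geneC stays on its own, because a single differing isolate is enough to break a perfect match.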

I'm also a bit uncertain how the distribution of the collapsed variant should be counted, i.e. should it be present in all isolates with either of the original variants? I'm uncertain how that would impact other assumptions that are made.

Another thing I'm not sure about is whether the collapsed genes should then go through subsequent rounds of correlation -> collapse. That is, when we collapse two genes into one, this will have a new distribution pattern, and there is a chance that this new pattern will fall within the correlation threshold of being collapsed with yet another gene.
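The chaining concern can be illustrated with a small sketch. Nothing here is implemented in Scoary; it is a hypothetical design under loud assumptions: similarity is measured as the number of isolates in which two patterns disagree (a stand-in for whatever correlation statistic would really be used), a collapsed unit is counted as present wherever any of its members is present, and rounds repeat until no pair is within the threshold. The names `collapse_soft`, `mismatches` and `max_mismatches` are made up for the example.

```python
from itertools import combinations


def mismatches(p, q):
    """Number of isolates in which two presence/absence patterns disagree."""
    return sum(a != b for a, b in zip(p, q))


def collapse_soft(presence, max_mismatches):
    """Repeatedly merge units whose patterns differ in at most `max_mismatches` isolates.

    Assumption: a merged unit takes the union of its members' patterns.
    Each unit is keyed by the sorted tuple of the gene names it contains.
    """
    units = {(gene,): tuple(pattern) for gene, pattern in presence.items()}
    merged = True
    while merged:
        merged = False
        # sorted() returns a snapshot list, so editing `units` before the break is safe.
        for left, right in combinations(sorted(units), 2):
            if mismatches(units[left], units[right]) <= max_mismatches:
                union = tuple(a | b for a, b in zip(units[left], units[right]))
                del units[left], units[right]
                units[tuple(sorted(left + right))] = union
                merged = True
                break  # restart: the new union pattern may now match other units
    return units


example = {
    "geneA": (1, 1, 0, 0, 0, 0, 0, 0, 0, 0),
    "geneB": (1, 0, 1, 0, 0, 0, 0, 0, 0, 0),
    "geneC": (1, 1, 1, 1, 1, 0, 0, 0, 0, 0),
}
result = collapse_soft(example, max_mismatches=2)
```

In this toy input, geneC differs from geneA and from geneB in three isolates each, so it would not be collapsed with either on its own. Once geneA and geneB are merged, their union differs from geneC in only two isolates, so a second round pulls geneC in as well. Whether that behaviour is desirable is exactly the open question.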

AdmiralenOla added the enhancement label Oct 21, 2021
