Marker detection in multimodal data #112

Open
PeteHaitch opened this issue Aug 10, 2023 · 1 comment

Comments

@PeteHaitch
Contributor

Somewhat thinking out loud here, but I'm interested in your ideas.

For multimodal data (e.g., GEX and ADT), we might be interested in using both modalities (simultaneously) to define markers.
I've been doing this by rbind()-ing the logcounts() of each modality (along with some tidying up of the rownames by prefixing the ADT feature names with ADT), and then running scoreMarkers() on that, but this requires allocating another (potentially large) matrix.
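
For concreteness, here is a minimal sketch of the rbind() approach, using small simulated matrices in place of real logcounts(). The object names (gex, adt, clusters) and the "ADT_" prefix are assumptions for illustration only; with a real SingleCellExperiment the ADT matrix would presumably come from something like logcounts(altExp(sce, "ADT")). The scoreMarkers() call is only attempted if scran is installed.

```r
# Simulated stand-ins for the log-expression matrices of each modality
# (features in rows, cells in columns). These are not real data.
set.seed(1)
n_cells <- 60
gex <- matrix(rnorm(20 * n_cells), nrow = 20,
              dimnames = list(paste0("Gene", 1:20), NULL))
adt <- matrix(rnorm(5 * n_cells), nrow = 5,
              dimnames = list(paste0("CD", 1:5), NULL))

# Hypothetical cluster labels, one per cell.
clusters <- rep(c("A", "B", "C"), each = n_cells / 3)

# Prefix the ADT feature names so they cannot clash with gene names.
rownames(adt) <- paste0("ADT_", rownames(adt))

# This step allocates a new matrix holding both modalities.
combined <- rbind(gex, adt)

# Score markers on the stacked matrix so that the rank.* statistics
# are computed across both modalities at once.
if (requireNamespace("scran", quietly = TRUE)) {
    markers <- scran::scoreMarkers(combined, groups = clusters)
}
```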

I guess I've got a few questions:

  1. I suppose that rbind() could be a delayed op, but I'm not sure when this would get realised by the scran machinery and so I'm unsure if this is worthwhile?
  2. Am I missing a better/simpler way of achieving this? Something using applySCE(sce, scoreMarkers()) gets very close, but the rank.* statistics are then computed separately for each modality and so won't be the same as if they were computed jointly on all modalities (the other statistics yield identical results whether computed separately or jointly on all modalities). Perhaps running scoreMarkers(full.stats = TRUE) and then re-computing the rank.* statistics with computeMinRank() applied to the full.* columns would work?
  3. What might a scoreMarkers()/findMarkers() interface for multimodal data look like?
  4. Would this be easy to achieve with the existing code, or would it require some re-design?
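
To illustrate the concern in question 2, here is a rough base-R sketch of why min-ranks computed per modality differ from those computed jointly. The min_rank() helper is a hypothetical stand-in written for this example (it is not scran's computeMinRank(), though it is meant to mimic the same idea), and the effect sizes are made up. With real output, the analogous step would presumably be to stack the full.* columns from each modality before re-ranking.

```r
# Hypothetical helper: rank each pairwise comparison (column) in decreasing
# order of effect size, then take the best (smallest) rank for each feature.
min_rank <- function(effects) {
    ranks <- apply(effects, 2, function(x) rank(-x, ties.method = "min"))
    apply(ranks, 1, min)
}

# Made-up pairwise effect sizes for one cluster versus two other clusters.
gex_effects <- matrix(c(2.0, 0.5, 1.0,
                        0.1, 1.5, 0.3),
                      ncol = 2,
                      dimnames = list(c("GeneA", "GeneB", "GeneC"),
                                      c("vs_B", "vs_C")))
adt_effects <- matrix(c(3.0, 0.2,
                        0.4, 0.9),
                      ncol = 2,
                      dimnames = list(c("ADT_CD3", "ADT_CD19"),
                                      c("vs_B", "vs_C")))

# Ranks computed within each modality, as when modalities are scored separately.
separate <- c(min_rank(gex_effects), min_rank(adt_effects))

# Ranks recomputed after stacking the per-modality effect sizes.
joint <- min_rank(rbind(gex_effects, adt_effects))

# Side-by-side comparison, one row per feature.
print(cbind(separate, joint))
```

In this toy example the effect sizes themselves are unchanged by stacking, but the ranks are not: GeneC has a min-rank of 2 within GEX alone and 3 once the ADT features are ranked alongside it.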
@LTLA
Collaborator

LTLA commented Aug 14, 2023

About to sleep, but the rbind approach seems reasonable if you want the ranks to be comparable. It does come with some performance loss, because the current scran falls back to block processing (though this would not be a problem if it were refactored to use libscran). Otherwise, option 2 is also fine, but it also requires recomputing the ranks.
