Additional metadata in results dataframe #70

Open
dnjst opened this issue Nov 8, 2023 · 2 comments
Assignees
Labels
enhancement New feature or request

Comments

@dnjst

dnjst commented Nov 8, 2023

Hi, I would really like a way to include the percent expression and CPM values for the (minimally expressed) ligand and receptor in the results dataframe for li.mt.rank_aggregate() or any of the individual methods. I know the individual methods also have the supp_cols argument, so maybe it is already possible there, but rank_aggregate() seems to lose supp_cols.

The ones of interest would be something like:

ligand_percent
receptor_percent
ligand_cpm
receptor_cpm

It'd improve the interpretability of the results a lot. For instance, when return_all_lrs=True and an interaction fails the tests, these columns would show whether it failed because of the ligand or the receptor. They would also allow for simple expression-threshold-based (e.g. TPM or CPM) interaction calling, the most rudimentary method discussed by Armingol et al. 2021 (https://doi.org/10.1038/s41576-020-00292-x), but still a useful one. I don't know if you think that CPM thresholding would work as a "method" in liana, but either way, at least returning this info in the dataframe would be nice.

Thanks for any input!

@dnjst dnjst added the enhancement New feature or request label Nov 8, 2023
@dbdimitrov
Collaborator

dbdimitrov commented Nov 9, 2023

Hey @dnjst,

Indeed, in a past version I decided not to include supp columns in rank_aggregate due to (long story short) the way that I used to deal with complexes. I can have a go at implementing supp cols for it in an upcoming version.

In the meantime, you should be able to get those columns for the individual methods. If I understood correctly, the percentage should be the *_prop columns, while CPM is just the *_means columns if your data is CPM-normalized.
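
As a rough sketch of that workaround, the snippet below derives the four requested columns from a results table and adds a simple CPM-threshold call. It uses a toy pandas DataFrame rather than real liana output, and the column names `ligand_means`, `receptor_means`, `ligand_props` and `receptor_props`, the helper `add_expression_metadata`, and the threshold of 1.0 are all assumptions for illustration; check the columns that your liana version actually returns.

```python
import pandas as pd

# Toy stand-in for the table returned by an individual method.
# Column names are assumed here, not taken from liana itself.
res = pd.DataFrame({
    "source": ["B", "B", "T"],
    "target": ["T", "NK", "NK"],
    "ligand": ["L1", "L2", "L3"],
    "receptor": ["R1", "R2", "R3"],
    "ligand_means": [12.0, 0.4, 30.0],    # mean expression (CPM if the data is CPM-normalized)
    "receptor_means": [8.0, 15.0, 0.2],
    "ligand_props": [0.45, 0.03, 0.80],   # fraction of cells expressing the gene
    "receptor_props": [0.30, 0.50, 0.02],
})


def add_expression_metadata(df, cpm_threshold=1.0):
    """Hypothetical helper: add percent/CPM columns and a threshold-based call."""
    out = df.copy()
    # Proportions (0-1) expressed as percentages.
    out["ligand_percent"] = out["ligand_props"] * 100
    out["receptor_percent"] = out["receptor_props"] * 100
    # Means are only CPM values if the input matrix was CPM-normalized.
    out["ligand_cpm"] = out["ligand_means"]
    out["receptor_cpm"] = out["receptor_means"]
    # Rudimentary calling: both partners must reach the threshold.
    out["cpm_call"] = (out["ligand_cpm"] >= cpm_threshold) & (
        out["receptor_cpm"] >= cpm_threshold
    )
    return out


annotated = add_expression_metadata(res, cpm_threshold=1.0)
print(annotated[["ligand", "receptor", "ligand_percent", "receptor_percent",
                 "ligand_cpm", "receptor_cpm", "cpm_call"]])
```

The threshold itself is arbitrary and would need to be chosen per dataset.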

@dbdimitrov
Collaborator

Thresholding is already done implicitly via expr_prop, but I agree that having the option to include these columns would make it more interpretable 🙂
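
To illustrate why the extra columns help with interpretation, here is a minimal sketch that labels which partner falls below an expression-proportion cutoff. It runs on a toy DataFrame, not on liana output; the column names, the helper `flag_failing_partner`, the `>=` comparison, and the cutoff of 0.1 are assumptions and may not match how liana applies expr_prop internally.

```python
import pandas as pd

# Toy table with assumed column names for the per-partner expression proportions.
toy = pd.DataFrame({
    "ligand": ["L1", "L2", "L3", "L4"],
    "receptor": ["R1", "R2", "R3", "R4"],
    "ligand_props": [0.45, 0.03, 0.80, 0.01],
    "receptor_props": [0.30, 0.50, 0.02, 0.05],
})


def flag_failing_partner(df, expr_prop=0.1):
    """Hypothetical helper: report which partner is below the proportion cutoff."""
    out = df.copy()
    ligand_ok = out["ligand_props"] >= expr_prop
    receptor_ok = out["receptor_props"] >= expr_prop
    out["failed_by"] = "none"
    out.loc[~ligand_ok & receptor_ok, "failed_by"] = "ligand"
    out.loc[ligand_ok & ~receptor_ok, "failed_by"] = "receptor"
    out.loc[~ligand_ok & ~receptor_ok, "failed_by"] = "both"
    return out


flagged = flag_failing_partner(toy, expr_prop=0.1)
print(flagged[["ligand", "receptor", "failed_by"]])
```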
