Add functions for integrating CITE-seq data #114

HelloWorldLTY · 2024-02-04T18:18:49Z

Dear Authors,

Hi all, I have added functions for integrating gene expression and protein information from CITE-seq data. I offer two options for modeling the protein data: (1) an NB mixture, or (2) a normal distribution after preprocessing with CLR. It seems that the latter option works better. I also provide a method to create the guidance graph between genes and proteins.

Sincerely,
Tianyu
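
For readers unfamiliar with the second option, here is a minimal sketch of CLR preprocessing followed by a normal log-likelihood. It is not the code from this PR: the function names `clr` and `normal_log_prob` are hypothetical, the pseudocount of 1 and the per-cell axis are assumptions, and library implementations such as muon's may differ in detail.

```python
import numpy as np


def clr(counts, axis=1):
    """Centered log-ratio with a pseudocount of 1 (an assumption).

    With cells as rows and proteins as columns, axis=1 centers each cell
    by the mean of its log-transformed protein counts.
    """
    logged = np.log1p(np.asarray(counts, dtype=float))
    return logged - logged.mean(axis=axis, keepdims=True)


def normal_log_prob(x, loc, scale):
    """Elementwise log-density of a normal distribution N(loc, scale^2)."""
    z = (x - loc) / scale
    return -0.5 * np.log(2.0 * np.pi) - np.log(scale) - 0.5 * z ** 2


# Toy protein counts: 2 cells x 3 proteins.
counts = np.array([[0, 5, 20], [3, 3, 3]])
transformed = clr(counts)

# Score the transformed values under a (hypothetical) decoder output.
log_lik = normal_log_prob(transformed, loc=0.0, scale=1.0).sum()
```

After this transform each cell's values sum to zero, so a decoder with a normal likelihood models relative rather than absolute protein abundance.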

Jeff1995

Hi @HelloWorldLTY, thank you for the contribution!

I have reviewed the changes and understood the new functionalities and approaches. Below are some further changes that need to be addressed before we can merge.

The unit tests are failing because of a new argument init_fea_emb in scglue.models.sc.GraphEncoder.__init__(). It needs a default value of None, otherwise the standard GLUE model would stop working. Meanwhile, I suppose the forward function does not need this argument?
The scglue.utils.generate_prot_guidance_graph() function should be moved to the scglue.genomics module, just like other guidance graph functions.
The scglue.utils.clr() function is a duplicate of muon.prot.pp.clr, so I'd suggest removing it and using muon directly.
The NBMixtureDataDecoder returns one of the NB components randomly per Bernoulli sample rather than a genuine NB mixture. While this might also work in practice, a cleaner approach would be to define an NB mixture distribution and implement a mixture log_prob. I suppose totalVI did implement one, but we may also achieve the same goal using torch.distributions.MixtureSameFamily and NegativeBinomial.
The new functions need type hints and documentation in a style consistent with the current code base.
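
The MixtureSameFamily suggestion above could be sketched roughly as follows. This is an illustration under stated assumptions, not the decoder from this PR: the helper name `nb_mixture`, the argument names, the shared dispersion `theta`, and the two-component parameterization are all hypothetical.

```python
import torch
from torch.distributions import Categorical, MixtureSameFamily, NegativeBinomial

EPS = 1e-8  # numerical guard for the logs below


def nb_mixture(mu1, mu2, theta, mix_logits):
    """Two-component NB mixture; every argument has shape (cells, proteins).

    mu1, mu2   : component means (hypothetical background / foreground)
    theta      : dispersion shared by both components (an assumption)
    mix_logits : logit of the weight given to the first component
    """
    # Stack components on a trailing axis: (cells, proteins, 2).
    mu = torch.stack([mu1, mu2], dim=-1)
    theta = theta.unsqueeze(-1).expand_as(mu)
    # torch's NegativeBinomial has mean total_count * exp(logits),
    # so logits = log(mu) - log(theta) yields mean mu.
    components = NegativeBinomial(
        total_count=theta,
        logits=(mu + EPS).log() - (theta + EPS).log(),
    )
    # Second component gets logit 0, so the weights are
    # sigmoid(mix_logits) and 1 - sigmoid(mix_logits).
    weights = Categorical(
        logits=torch.stack([mix_logits, torch.zeros_like(mix_logits)], dim=-1)
    )
    return MixtureSameFamily(weights, components)


# Toy example: 3 cells x 4 proteins with equal mixing weights.
shape = (3, 4)
dist = nb_mixture(
    mu1=torch.full(shape, 2.0),
    mu2=torch.full(shape, 10.0),
    theta=torch.full(shape, 5.0),
    mix_logits=torch.zeros(shape),
)
x = torch.tensor([[0.0, 1.0, 5.0, 12.0]]).expand(3, 4)
log_prob = dist.log_prob(x)  # marginal over both components
```

Because `log_prob` marginalizes over the component assignment, the training loss no longer depends on a random Bernoulli draw per entry.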

Please let me know if you'd like me to address any of these problems or if I've misunderstood some of the code. I would be glad to help!

HelloWorldLTY · 2024-02-23T13:40:57Z

Hi, sorry for the late reply. I have made the modifications based on your comments. For the NB mixture model, I used the categorical and negative binomial distributions from torch. Please let me know if you have other questions.

HelloWorldLTY added 3 commits February 4, 2024 12:15

integrate citeseq

6d7d841

update p

55ad68d

update computation of nbmixture

927d77b

Jeff1995 requested changes Feb 8, 2024

modify the codes

45cada0

update utils

3923df9
