Query mapping #73

Open
ccruizm opened this issue May 30, 2023 · 10 comments

Comments

@ccruizm

ccruizm commented May 30, 2023

Hello!

I would like to know whether GLUE has an implementation to perform query mapping of unlabeled data. I want to create a reference map with GLUE and be able later on to either map new data onto it or re-integrate the reference with the new query.

Thanks in advance!

@Jeff1995
Collaborator

Jeff1995 commented Jun 3, 2023

Hi @ccruizm. Thanks for your interest in GLUE!

As GLUE is a multi-omics integration method, I suppose you would be mapping a query dataset in one modality onto a reference dataset in a different modality? It's indeed an interesting use case, but unfortunately there is no such implementation in GLUE right now. You would have to integrate all datasets in one step, with no distinction between reference and query.

The most obvious solution to this use case would be to fix the pretrained autoencoder of the reference modality and train only the autoencoder of the query modality. That shouldn't be too difficult to implement though. We'll see if we can add this feature in the future. I'll let you know if that becomes available. Of course pull requests are also welcome : )
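The idea of fixing the reference autoencoder and training only the query autoencoder can be illustrated with a minimal PyTorch sketch. This is not GLUE code and does not use any scglue API: `ref_encoder`, `query_encoder`, the random input tensors, and the toy alignment loss are all hypothetical stand-ins, chosen only to show the general pattern of freezing one module while optimizing another.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins, NOT scglue classes: two tiny "encoders"
# mapping 8 input features to a 2-dimensional embedding.
ref_encoder = nn.Linear(8, 2)    # pretend this was pretrained on the reference
query_encoder = nn.Linear(8, 2)  # the only part that will be trained

# Freeze the reference encoder so no gradients are computed for it.
for p in ref_encoder.parameters():
    p.requires_grad_(False)
ref_encoder.eval()

# Only the query encoder's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(query_encoder.parameters(), lr=1e-2)

# Random toy data in place of real reference and query cells.
ref_x = torch.randn(16, 8)
query_x = torch.randn(16, 8)

# Snapshots used to check which weights changed after training.
ref_before = [p.detach().clone() for p in ref_encoder.parameters()]
query_before = query_encoder.weight.detach().clone()

for _ in range(5):
    optimizer.zero_grad()
    with torch.no_grad():
        ref_center = ref_encoder(ref_x).mean(dim=0)
    # Toy alignment loss (an assumption for illustration only):
    # pull the mean query embedding toward the mean reference embedding.
    loss = (query_encoder(query_x).mean(dim=0) - ref_center).pow(2).sum()
    loss.backward()
    optimizer.step()
```

After the loop, the reference encoder's weights are unchanged while the query encoder's weights have moved. A real implementation would have to apply the same pattern to GLUE's actual model components and training objective, which this sketch does not attempt to reproduce.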

@ccruizm
Author

ccruizm commented Jun 7, 2023

Hello @Jeff1995, that would make GLUE even more powerful than it already is. I was not necessarily thinking of mapping across modalities, but of reference mapping within the same modality and performing label transfer (RNA -> RNA or ATAC -> ATAC). Since I will build a multimodal reference (RNA+ATAC), I wanted to know whether it is possible to map new data as it is generated, instead of re-training the whole reference from scratch with the new data.

Thanks for developing this great tool!

@Jeff1995
Collaborator

Thanks for the clarification! That should be easier to do. We will be testing both kinds of mapping then : )

@kanyulongkkk

Dear Dr. Cao, when I want to perform cell type label transfer from RNA to ATAC, how should I configure the datasets in your code? For example:

scglue.models.configure_dataset(
    rna, "NB", use_highly_variable=True,
    use_layer="counts", use_rep="X_pca"
)
scglue.models.configure_dataset(
    atac, "NB", use_highly_variable=True,
    use_rep="X_lsi"
)

@kanyulongkkk

@Jeff1995

@Jeff1995
Collaborator

Hi @kanyulongkkk, thanks for your interest in GLUE! The current dataset configuration should work fine. You would need to use the transfer_labels function to perform the cell type transfer after model training.

@kanyulongkkk

Thanks, Dr. Cao, but after running the transfer_labels function I get labels like 0.315746. My reference labels are integers, so I need integer labels for the query as well. What should I do next? Please help.

@kanyulongkkk

@Jeff1995

@Jeff1995
Collaborator

Jeff1995 commented Apr 25, 2024

Oh I see. Is your integer label something like a cluster index? In that case you could try converting it to string or category type first, and the predicted labels should remain "integers".
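The conversion suggested above can be sketched with pandas. The `ref_obs` table and its `cluster` column are hypothetical stand-ins for the `.obs` table of the reference AnnData object, and `as_categorical_labels` is a helper introduced here for illustration only:

```python
import pandas as pd

# Hypothetical stand-in for `rna.obs`; the column name "cluster" is assumed.
ref_obs = pd.DataFrame(
    {"cluster": [0, 1, 2, 1, 0]},
    index=["cell_a", "cell_b", "cell_c", "cell_d", "cell_e"],
)


def as_categorical_labels(labels: pd.Series) -> pd.Series:
    """Turn numeric cluster indices into string-valued categories."""
    return labels.astype(str).astype("category")


# Replace the integer column with a categorical one before label transfer.
ref_obs["cluster"] = as_categorical_labels(ref_obs["cluster"])

print(ref_obs["cluster"].dtype)                  # category
print(list(ref_obs["cluster"].cat.categories))   # ['0', '1', '2']
```

With a real AnnData object, the same one-line conversion would presumably be applied to the label column of the reference `.obs` (for example `rna.obs["cluster"]`, a column name assumed here) before calling transfer_labels. The exact arguments of that function should be checked against the scglue documentation.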

@kanyulongkkk

Yes, Dr. Cao, my labels are cluster indices such as "0,1,2,3,4". I will convert them to string or category type and try again later.

3 participants