Some questions about general wrapper for datasets #38

Open
HelloWorldLTY opened this issue Oct 16, 2022 · 8 comments
Labels
question Further information is requested

Comments

@HelloWorldLTY
Contributor

Hi, I intend to apply this model to datasets other than the competition datasets, and I wonder whether you have a general data-loading structure for public datasets. Moreover, is it possible for me to use a lighter structure than the jointembedding structure if I have already processed the given dataset? Thanks.

@RemyLau
Collaborator

RemyLau commented Oct 16, 2022

Hi @HelloWorldLTY, thanks for your interest in the dance package! We are currently doing some heavy refactoring to clean up several interfaces, including the datasets interface, and make them more user-friendly, so that users can, e.g., apply methods to their own datasets and benchmark their own methods on the datasets provided by the package. For now, there isn't an easy way to work with custom datasets. We expect to fix this within the next month or so.

Is your primary interest in using your custom dataset on joint-embedding tasks? If so, I can make that a priority so that you can play with the models soon.

@HelloWorldLTY
Contributor Author

Thanks a lot. I am now working on JAE, and since my dataset is very large, this tool is not very efficient.

@RemyLau
Collaborator

RemyLau commented Oct 16, 2022

@HelloWorldLTY Currently, most datasets are loaded from an AnnData object, which is one of the standard data objects for single-cell omics data. So long as your processed data structure can be interfaced with AnnData easily, it shouldn't be a big deal.

Could you briefly describe the type of data structure you are working with and what libraries you currently use to process them? We can also consider adding interfaces for this particular type of data structure if it is somewhat standard as well.

@HelloWorldLTY
Contributor Author

Hi, I prefer the AnnData object as used by scanpy, and that is the type of data I am currently using.

@RemyLau
Collaborator

RemyLau commented Oct 16, 2022

Ok, sounds good! This should be supported natively soon. I'll keep you posted on that.

@RemyLau RemyLau added the question Further information is requested label Dec 31, 2022
@RemyLau
Collaborator

RemyLau commented Dec 31, 2022

This is related to an ongoing refactoring task #49

@gabumon0

> This is related to an ongoing refactoring task #49

Yeah, I also ran into trouble when trying to apply the jointembedding scmogcn model to my own GEX+ATAC data. My data is stored as AnnData. Is there any tutorial that I can learn from?

@htumlc

htumlc commented Dec 26, 2023

Where can the datasets be downloaded?
