foldseek easy-cluster iteratively with different batches at different times #249

Open
josephhughes opened this issue Mar 5, 2024 · 1 comment

@josephhughes

Hi,

Is it possible to run foldseek easy-cluster on different batches at different points in time without reprocessing everything? For example, I have 10,000 PDB files that I clustered today. Then, in 3 weeks' time, I add another 10,000 PDB files to the same folder.
When I run foldseek easy-cluster again, is there a way to tell it to reuse the results from the first 10,000 files to minimise compute?

@CRC63

CRC63 commented Mar 27, 2024

Hi,
I am also interested in this possibility. Thanks in advance.
