Backbone only search returns no hit. #256

Open
realfenston opened this issue Mar 20, 2024 · 4 comments

Comments

@realfenston

Hello, I have a naive question about Foldseek search. The paper notes that only alpha-carbon relations are extracted and used to predict 3Di tokens. Based on this, I manually curated a backbone-only dataset containing only N, alpha-C, and C atoms, constructed the corresponding PDB files from it, and then ran Foldseek search.

However, the results always say no hits are found.

I am not sure what is going wrong. Any help would be highly appreciated.

@milot-mirdita
Member

Our check to rebuild the backbone with pulchra seems to be incomplete.

If both N and C are present, it will not try to rebuild the backbone, resulting in broken 3Di-tokens.
If you want Foldseek to reconstruct the backbone, please only pass C-alpha and not other atoms.

Otherwise make sure you pass all of N, C, C-alpha and C-beta.
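For anyone checking their own input files, here is a minimal Python sketch of the distinction described above. It is not Foldseek's actual check; the function names and the three labels are hypothetical, and it assumes standard fixed-column PDB `ATOM` records. Treating glycine as complete without C-beta is also an assumption of this sketch, since glycine has no C-beta atom.

```python
from collections import defaultdict

# Atoms the comment above asks for when the backbone is passed explicitly.
FULL_BACKBONE = {"N", "CA", "C", "CB"}


def backbone_atoms_by_residue(pdb_text):
    """Return {(chain, res_seq, res_name): set of backbone atom names present}."""
    residues = defaultdict(set)
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        atom_name = line[12:16].strip()  # PDB columns 13-16
        res_name = line[17:20].strip()   # PDB columns 18-20
        chain = line[21:22]              # PDB column 22
        res_seq = line[22:26].strip()    # PDB columns 23-26
        if atom_name in FULL_BACKBONE:
            residues[(chain, res_seq, res_name)].add(atom_name)
    return dict(residues)


def classify_backbone(pdb_text):
    """Label a structure as 'empty', 'ca-only', 'full', or 'partial'."""
    residues = backbone_atoms_by_residue(pdb_text)
    if not residues:
        return "empty"
    # Only C-alpha everywhere: the case where reconstruction is attempted.
    if all(atoms == {"CA"} for atoms in residues.values()):
        return "ca-only"

    def is_complete(key, atoms):
        # Assumption of this sketch: glycine is complete without CB.
        required = FULL_BACKBONE - {"CB"} if key[2] == "GLY" else FULL_BACKBONE
        return required <= atoms

    if all(is_complete(key, atoms) for key, atoms in residues.items()):
        return "full"
    # Anything else, e.g. N, CA, and C without CB, is the problematic case.
    return "partial"
```

A file that comes back as `partial` (for example N, C-alpha, and C without C-beta) matches the situation described above, where the backbone is not rebuilt.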

@realfenston
Author

Hello, many thanks for your quick response. I followed your suggestion and passed only the alpha-C atoms in my backbone, but the search still returns no hits. I have attached a sample PDB file to this message, and I would appreciate it a lot if you could help me check it.

HEADER HYDROLASE 19-JUL-00 1FCV
ATOM 1 CA XAA A 1 8.306 -88.396 56.274 1.00 0.00 C
ATOM 2 CA XAA A 2 11.628 -86.719 57.108 1.00 0.00 C
ATOM 3 CA XAA A 3 13.334 -87.879 53.911 1.00 0.00 C
ATOM 4 CA XAA A 4 10.647 -86.303 51.710 1.00 0.00 C
ATOM 5 CA XAA A 5 10.812 -83.125 53.810 1.00 0.00 C
ATOM 6 CA XAA A 6 14.529 -82.658 53.081 1.00 0.00 C
ATOM 7 CA XAA A 7 13.838 -83.207 49.393 1.00 0.00 C
ATOM 8 CA XAA A 8 11.092 -80.560 49.435 1.00 0.00 C
ATOM 9 CA XAA A 9 13.354 -78.095 51.254 1.00 0.00 C
ATOM 10 CA XAA A 10 15.850 -78.536 48.447 1.00 0.00 C
ATOM 11 CA XAA A 11 13.292 -77.487 45.823 1.00 0.00 C
ATOM 12 CA XAA A 12 12.116 -74.575 47.987 1.00 0.00 C
ATOM 13 CA XAA A 13 15.711 -73.421 48.333 1.00 0.00 C
ATOM 14 CA XAA A 14 16.222 -73.474 44.547 1.00 0.00 C
ATOM 15 CA XAA A 15 12.973 -71.500 44.290 1.00 0.00 C
ATOM 16 CA XAA A 16 14.097 -68.951 46.883 1.00 0.00 C
ATOM 17 CA XAA A 17 17.465 -68.518 45.192 1.00 0.00 C
ATOM 18 CA XAA A 18 15.914 -67.992 41.767 1.00 0.00 C
ATOM 19 CA XAA A 19 13.578 -65.299 43.159 1.00 0.00 C
ATOM 20 CA XAA A 20 16.473 -63.621 44.975 1.00 0.00 C
ATOM 21 CA XAA A 21 18.342 -63.575 41.670 1.00 0.00 C
ATOM 22 CA XAA A 22 15.317 -61.965 39.941 1.00 0.00 C
ATOM 23 CA XAA A 23 15.084 -59.368 42.724 1.00 0.00 C
ATOM 24 CA XAA A 24 18.804 -58.518 42.717 1.00 0.00 C
ATOM 25 CA XAA A 25 18.816 -58.156 38.932 1.00 0.00 C
ATOM 26 CA XAA A 26 15.666 -56.022 38.911 1.00 0.00 C
ATOM 27 CA XAA A 27 17.119 -53.803 41.668 1.00 0.00 C
ATOM 28 CA XAA A 28 20.386 -53.535 39.763 1.00 0.00 C
ATOM 29 CA XAA A 29 18.376 -52.372 36.736 1.00 0.00 C
ATOM 30 CA XAA A 30 16.562 -49.904 38.920 1.00 0.00 C
ATOM 31 CA XAA A 31 19.815 -48.452 40.310 1.00 0.00 C
ATOM 32 CA XAA A 32 20.883 -47.809 36.746 1.00 0.00 C
ATOM 33 CA XAA A 33 17.530 -46.270 35.836 1.00 0.00 C
ATOM 34 CA XAA A 34 17.415 -44.089 38.947 1.00 0.00 C
ATOM 35 CA XAA A 35 20.874 -42.742 38.154 1.00 0.00 C
ATOM 36 CA XAA A 36 19.685 -41.896 34.637 1.00 0.00 C
ATOM 37 CA XAA A 37 16.546 -40.194 36.011 1.00 0.00 C
ATOM 38 CA XAA A 38 18.814 -38.096 38.250 1.00 0.00 C
ATOM 39 CA XAA A 39 21.120 -37.120 35.396 1.00 0.00 C
ATOM 40 CA XAA A 40 18.242 -36.137 33.143 1.00 0.00 C
ATOM 41 CA XAA A 41 16.381 -34.277 35.882 1.00 0.00 C
ATOM 42 CA XAA A 42 19.550 -32.228 36.482 1.00 0.00 C
ATOM 43 CA XAA A 43 19.717 -31.280 32.774 1.00 0.00 C
ATOM 44 CA XAA A 44 16.077 -30.274 32.660 1.00 0.00 C
ATOM 45 CA XAA A 45 15.873 -28.567 36.066 1.00 0.00 C
ATOM 46 CA XAA A 46 13.367 -30.836 37.771 1.00 0.00 C
ATOM 47 CA XAA A 47 13.049 -31.838 41.400 1.00 0.00 C
END

You may ignore the PDB name, which comes from my favorite protein entry. This backbone is from a CATH dataset, so it should mostly be a reasonable structure.

@realfenston
Author

My bad. It seems I was running in 3Di+AA mode, which corrupts everything here. I could try the local alignment mode with structure only. Anyway, your response is really appreciated, and I will report the result after giving it a try.

@milot-mirdita
Member

milot-mirdita commented Mar 21, 2024

You can try the 3Di-only mode. Foldseek doesn't work as well with 3Di only, though; normally you'd need to pass both C-alpha and the AA letter.
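As a rough illustration of switching modes, the sketch below builds an `easy-search` command line in Python. It assumes the `--alignment-type` values listed in Foldseek's help text at the time of writing (0 for 3Di-only Gotoh-Smith-Waterman, 2 for the default 3Di+AA); please confirm with `foldseek easy-search -h` for your version. The helper name and the file paths are hypothetical.

```python
import shlex


def foldseek_3di_only_cmd(query, target_db, out_file, tmp_dir):
    """Build an easy-search command that requests 3Di-only alignment.

    Assumption: --alignment-type 0 selects 3Di-only alignment; verify this
    against the help output of the installed Foldseek version.
    """
    return [
        "foldseek", "easy-search",
        query, target_db, out_file, tmp_dir,
        "--alignment-type", "0",
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = foldseek_3di_only_cmd("query_ca.pdb", "targetDB", "result.m8", "tmp")
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```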

Additionally, I don't think the pulchra backbone reconstruction works if you don't tell it what AA letter it is.
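The sample file earlier in this thread uses the residue name `XAA` on every line, so no AA letter is available. A small Python sketch for spotting this in a file before searching is shown below; the function name is hypothetical, it assumes fixed-column PDB `ATOM` records, and it only flags names outside the 20 standard amino acids without making any claim about how pulchra or Foldseek handle them.

```python
# Three-letter codes of the 20 standard amino acids.
STANDARD_RESIDUES = frozenset(
    "ALA ARG ASN ASP CYS GLN GLU GLY HIS ILE "
    "LEU LYS MET PHE PRO SER THR TRP TYR VAL".split()
)


def nonstandard_residues(pdb_text):
    """Return {(chain, res_seq): res_name} for residues with a non-standard name."""
    found = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        res_name = line[17:20].strip()  # PDB columns 18-20
        if res_name not in STANDARD_RESIDUES:
            chain = line[21:22]           # PDB column 22
            res_seq = line[22:26].strip()  # PDB columns 23-26
            found[(chain, res_seq)] = res_name
    return found
```

If this returns an entry for every residue, as it would for an all-`XAA` file, the input carries no sequence information.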
