easy way to get features #53

Open
rohanvarm opened this issue Mar 9, 2022 · 2 comments

@rohanvarm

Is there an easy way to access the feature descriptor of the SAS points?

@rdk
Owner

rdk commented Apr 12, 2022

Hi, and sorry for the late reply. Currently, there is no easy or straightforward way to do this.

If you are still interested, could you say something more about your use case?

In a future release I plan to add:

  1. an easy way to export all feature vectors for all SAS points of an individual protein to a CSV file
  2. the ability to visualize individual features mapped to SAS points in PyMOL

For now, you can export the feature vectors to an ARFF file (basically a CSV with a header), but it is a hassle: it is only possible during the training phase, and only for the whole dataset at once. As a workaround, you can start a fake training run on a single-protein dataset with -delete_vectors 0:

prank traineval -train test.ds -eval test.ds -delete_vectors 0 -extra_features xyz

# Notes:
#  * test.ds should contain a path to only a single pdb/cif file 
#  * xyz feature adds (x,y,z) coordinates of the SAS point to the feature vector (optional)
#  * vectorsTrain.arff.gz file will be produced in the output folder
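Since the goal is a downstream app, here is a minimal sketch of turning the resulting file into a plain CSV with only the Python standard library. It relies on the general ARFF layout (`%` comments, `@attribute` declarations, then `@data` followed by comma-separated rows) and assumes dense rows and attribute names without spaces. The function names `read_arff` and `arff_to_csv` are hypothetical helpers, not part of P2Rank, and the actual attribute names in the P2Rank output are not assumed here.

```python
import csv
import gzip


def read_arff(path):
    """Return (attribute_names, rows) from a dense ARFF file, gzipped or not.

    Assumption: attribute names contain no spaces and rows are not sparse.
    Values are returned as strings; convert them as needed downstream.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    names, rows, in_data = [], [], False
    with opener(path, "rt", encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("%"):
                continue  # skip blank lines and ARFF comments
            if in_data:
                rows.append(next(csv.reader([line])))
            elif line.lower().startswith("@attribute"):
                # "@attribute <name> <type>": keep the name, drop any quotes
                names.append(line.split(None, 2)[1].strip("'\""))
            elif line.lower().startswith("@data"):
                in_data = True
    return names, rows


def arff_to_csv(arff_path, csv_path):
    """Write the ARFF contents as CSV with a header row; return the row count."""
    names, rows = read_arff(arff_path)
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(names)
        writer.writerows(rows)
    return len(rows)


# Example call (the path depends on your output folder):
# arff_to_csv("vectorsTrain.arff.gz", "vectors.csv")
```

Each CSV row then corresponds to one feature vector, and if the xyz feature was enabled its coordinate columns appear among the attributes.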

@rdk rdk added the question label Apr 12, 2022
@rohanvarm
Author

Thanks! That is useful. I intended to use this for another downstream app.
