Vectorization of hydrophobicity #6

Open
wants to merge 10 commits into
base: master

richierocks

hydrophobicity is now vectorized. I also tweaked the data loading code in that function to avoid NOTEs in R CMD check.

Removing spaces from sequences is now delegated to .remove_spaces (the leading dot marks it as internal). I only used it in the hydrophobicity function, but there are a dozen other places where it could be used. I kept the existing logic as is, but you might want to change the regular expression to "[[:space:]]" so that it also removes other whitespace characters such as tabs and non-breaking spaces.
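
For illustration, here is a minimal sketch of what such a helper might look like. The function body and the all_whitespace argument are assumptions made for this example; the helper's actual code is not shown in this conversation.

```r
# Hypothetical re-creation of the internal helper described above.
# `all_whitespace` is an illustrative argument, not part of the package.
.remove_spaces <- function(x, all_whitespace = FALSE) {
  if (all_whitespace) {
    # POSIX character class: matches spaces, tabs, newlines, and so on
    gsub("[[:space:]]", "", x)
  } else {
    # Existing behaviour as described: strip literal space characters only
    gsub(" ", "", x, fixed = TRUE)
  }
}

seqs <- c("A C D", "K\tL M")
.remove_spaces(seqs)                        # the tab in the second element survives
.remove_spaces(seqs, all_whitespace = TRUE) # the tab is removed as well
```

Because gsub is vectorized over its input, the helper works on a whole character vector of sequences without an explicit loop.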

There is also a small problem with the package licence that I haven't fixed. If you declare GPL-2, you can't also include a license file. See section 1.1.2 of Writing R Extensions.

@richierocks
Author

I've had a change of heart on the best way to write this function. Using stri_count_fixed from the stringi package is faster for long sequences and long vectors. (I'm getting a better than 3x speedup with 1e6 sequences.)
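
A rough sketch of the counting approach, assuming the Kyte-Doolittle scale and a mean-per-residue score. The name hydrophobicity_sketch and the kd vector are made up for this example; this is not the code from the commits.

```r
library(stringi)

# Kyte-Doolittle hydropathy values, keyed by one-letter amino acid code
kd <- c(A = 1.8, R = -4.5, N = -3.5, D = -3.5, C = 2.5,
        Q = -3.5, E = -3.5, G = -0.4, H = -3.2, I = 4.5,
        L = 3.8, K = -3.9, M = 1.9, F = 2.8, P = -1.6,
        S = -0.8, T = -0.7, W = -0.9, Y = -1.3, V = 4.2)

# Hypothetical stand-in for the vectorized function
hydrophobicity_sketch <- function(seqs) {
  seqs <- gsub(" ", "", seqs, fixed = TRUE)
  # One row per sequence, one column per amino acid: occurrence counts
  counts <- outer(seqs, names(kd), stri_count_fixed)
  # Weighted sum of the scale values divided by the number of scored residues
  # (an empty sequence therefore yields NaN)
  as.vector(counts %*% kd) / rowSums(counts)
}

hydrophobicity_sketch(c("AAA", "A R", "ILVKD"))
```

The idea is that counting each residue once per sequence avoids splitting every sequence into individual characters, which is where the time goes for long inputs.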

I also rewrote aindex using the same technique; the code is faster and clearer (to me at least).
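
The same counting idea applied to the aliphatic index might look roughly like this. It assumes the standard formula, AI = X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)) with X as mole percent; aindex_sketch is a made-up name, not the rewritten function itself.

```r
library(stringi)

# Hypothetical stand-in for the rewritten aindex
aindex_sketch <- function(seqs) {
  seqs <- gsub(" ", "", seqs, fixed = TRUE)
  n <- stri_length(seqs)
  # Mole fraction of one residue in every sequence at once
  frac <- function(aa) stri_count_fixed(seqs, aa) / n
  # Multiply by 100 to turn mole fractions into mole percent
  100 * (frac("A") + 2.9 * frac("V") + 3.9 * (frac("I") + frac("L")))
}

aindex_sketch(c("AAAA", "AVIL", "KKDE"))
```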
