Memory error in AssignTaxonomy when working with very small sequences #1932

Open
nearinj opened this issue Apr 22, 2024 · 1 comment

nearinj commented Apr 22, 2024

Hi Ben and team,

I just wanted to note that the assignTaxonomy function currently has a bug that causes a memory overflow if you input a very short sequence. I noticed this because I was generating synthetic reads and accidentally included a sequence that was only 4 base pairs long.

When I ran the command I kept getting out-of-memory errors, and I eventually tracked them down to not having filtered out very short reads. Afterward I ran the command with just the 4 base pair sequence and was able to reproduce the bug.

While I know this isn't a common issue and probably isn't a priority, I wanted to highlight it in case others run into it, or in case there is a simple fix for it in the future.

Thanks for continuing to support this software!

Cheers,
Jacob Nearing

@benjjneb benjjneb added the bug label Apr 23, 2024
@benjjneb
Owner

Thanks. Any sequence shorter than the kmer size (8) will break the code. There is a check for short sequences on the reference database side, but I guess there isn't one on the query sequence side. That should be added.
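
For anyone hitting this before a check is added, here is a rough sketch of the kind of length guard described above. It is a language-neutral illustration written in Python, not dada2's actual code. The helper names `kmer_count` and `split_by_length` are made up for this example, and the only value taken from the thread is the kmer size of 8. The arithmetic shows why a 4 bp query is a problem: a sequence of length L has L - k + 1 overlapping kmers, which is negative when L is less than k. Whether that is the exact cause of the memory blow-up inside assignTaxonomy is an assumption.

```python
KMER_SIZE = 8  # kmer size quoted in the reply above


def kmer_count(seq, k=KMER_SIZE):
    """Number of overlapping k-mers in seq (hypothetical helper).

    A result of zero or less means no k-mer fits in the sequence.
    """
    return len(seq) - k + 1


def split_by_length(seqs, k=KMER_SIZE):
    """Split query sequences into (long enough to classify, too short)."""
    keep = [s for s in seqs if len(s) >= k]
    too_short = [s for s in seqs if len(s) < k]
    return keep, too_short


if __name__ == "__main__":
    queries = ["ACGT", "ACGTACGTAC", "ACGTACG", "ACGTACGT"]
    keep, too_short = split_by_length(queries)
    print("4 bp query k-mer count:", kmer_count("ACGT"))
    print("pass to the classifier:", keep)
    print("set aside as too short:", too_short)
```

In practice the same filter can be applied to the query sequences before calling assignTaxonomy, so that only sequences of at least 8 bases are passed in.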
