BR: HMMRATAC fails to run on large genomes needing .csi index #96

Open
TeiturAK opened this issue May 1, 2022 · 4 comments

@TeiturAK

TeiturAK commented May 1, 2022

Describe the bug
I'm running HMMRATAC on several plant genomes which vary greatly in size. HMMRATAC fails when running on the largest genomes that require a .csi index. It produces the following error:

Exception in thread "main" java.lang.RuntimeException: Invalid file header in BAM index spruce.sorted.unique_mapped.MT_CP_removed.bam.csi: ^_^D

It works fine on the smaller genomes for which I can generate a .bai index.

System:

  • OS: Linux
  • HMMRATAC Version 1.2.10

Additional context
I'm working with a ~20GB genome that requires a .csi index. I did not use multithreading when creating the index, and simply renaming the index to have a .bai extension does not help.

Any help would be much appreciated.
Teitur

@Mouwrice

Mouwrice commented May 3, 2022

@TeiturAK I see no reference to HMMRATAC being able to process .csi files. Why did you expect this to work?

@Mikxox

Mikxox commented May 12, 2022

The internal samtools dependency uses a reader implementation that has been deprecated for years and does not seem to support .csi index files. The dependency should be updated to the latest release and the code refactored to use the new reader implementation.
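
For anyone wondering why renaming the index to .bai does not help: the two index formats differ in their file headers, not just their extensions. As far as I understand the format specs, a .bai file starts with the uncompressed magic bytes `BAI\1`, while a .csi file is BGZF (gzip-compatible) compressed and its decompressed content starts with `CSI\1`. The `^_` in the error message is consistent with 0x1f, the first byte of a gzip header. Below is a minimal sketch that tells the two apart. The class name `IndexMagic` and method `detect` are made up for illustration and are not part of HMMRATAC or its dependencies.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of HMMRATAC: guesses the index format
// from the leading magic bytes instead of trusting the file extension.
class IndexMagic {

    /** Returns "BAI", "CSI", or "UNKNOWN". */
    static String detect(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);

        // Peek at the first two bytes to check for the gzip magic (0x1f 0x8b).
        in.mark(2);
        int b0 = in.read();
        int b1 = in.read();
        in.reset();

        // BGZF blocks are valid gzip members, so GZIPInputStream can
        // decompress enough of the stream to read the inner magic.
        InputStream src = (b0 == 0x1f && b1 == 0x8b) ? new GZIPInputStream(in) : in;

        // Read up to four bytes; a single read() call may return fewer.
        byte[] magic = new byte[4];
        int n = 0;
        while (n < 4) {
            int r = src.read(magic, n, 4 - n);
            if (r < 0) {
                break;
            }
            n += r;
        }

        // Both formats use a three-letter ASCII tag followed by the byte 0x01.
        if (n == 4 && magic[3] == 1) {
            String tag = new String(magic, 0, 3, StandardCharsets.US_ASCII);
            if (tag.equals("BAI")) {
                return "BAI";
            }
            if (tag.equals("CSI")) {
                return "CSI";
            }
        }
        return "UNKNOWN";
    }
}
```

Calling `IndexMagic.detect(new FileInputStream(path))` on the renamed index should still report it as CSI, which would explain why a reader that only understands the .bai layout rejects the header.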

@jitsedesmet

@TeiturAK is it possible to share a .csi index file? I would like to implement this feature / fix the bug and test whether it works. I'm but a noble computer science student and have no idea where to find a .csi file to test this feature.
After this project is done, I will of course contact you again so I can share the implementation 🙂.

@jitsedesmet

I have been able to generate a .csi file myself, and my implementation seems to work for me. I will link my implementation here once it can be made public. You are still welcome to provide me with your data so I can make sure it works. (Although I understand that sharing data in some fields is not trivial, in which case I hope it'll work for you.) 😄
