Using HMMRATAC with CSI indexes and large chromosome sizes #88

Open
diego-rt opened this issue Aug 29, 2021 · 2 comments

Comments

@diego-rt

I'm working with an organism with a very large genome, so I need to use a CSI index. However, when I pass my CSI index to HMMRATAC, it complains that the index has an invalid file header. I've tried renaming it to .bam.bai, but this has not helped.

Do you know of any workarounds, or are there plans to support CSI indexes? I believe htsjdk supports CSI indexes, but only from version 2.19.0. I wonder whether this might be the problem?

Thanks a lot!

The exception:

Exception in thread "main" java.lang.RuntimeException: Invalid file header in BAM index 136160.unique.sorted.dedup.bam.bai: ?
	at net.sf.samtools.AbstractBAMFileIndex.<init>(AbstractBAMFileIndex.java:90)
	at net.sf.samtools.DiskBasedBAMFileIndex.<init>(DiskBasedBAMFileIndex.java:46)
	at net.sf.samtools.BAMFileReader.getIndex(BAMFileReader.java:232)
	at net.sf.samtools.BAMFileReader.createIndexIterator(BAMFileReader.java:592)
	at net.sf.samtools.BAMFileReader.query(BAMFileReader.java:352)
	at net.sf.samtools.SAMFileReader.query(SAMFileReader.java:363)
	at HMMR_ATAC.pullLargeLengths.read(pullLargeLengths.java:112)
	at HMMR_ATAC.pullLargeLengths.<init>(pullLargeLengths.java:61)
	at HMMR_ATAC.Main_HMMR_Driver.main(Main_HMMR_Driver.java:219)
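
The "Invalid file header" message is consistent with the index being identified by its leading bytes rather than its file name, which would explain why renaming the file does not help. A minimal Python sketch of that check, assuming the magic values from the SAM and CSI specifications (`BAI\1` at the start of an uncompressed BAI file, `CSI\1` inside a BGZF-compressed CSI file); the function name `sniff_index_format` is hypothetical and is not part of HMMRATAC or samtools:

```python
import gzip

BAI_MAGIC = b"BAI\x01"    # BAI files are uncompressed and start with these bytes
CSI_MAGIC = b"CSI\x01"    # CSI files carry this magic inside a BGZF (gzip) stream
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip/BGZF file


def sniff_index_format(path):
    """Return 'BAI', 'CSI' or 'unknown' based on the file's leading bytes."""
    with open(path, "rb") as fh:
        head = fh.read(4)
    if head == BAI_MAGIC:
        return "BAI"
    if head[:2] == GZIP_MAGIC:
        # BGZF is a series of gzip members, so the gzip module can read it.
        with gzip.open(path, "rb") as fh:
            if fh.read(4) == CSI_MAGIC:
                return "CSI"
    return "unknown"
```

Running this on a CSI index that was renamed to .bam.bai should still report `CSI`, so a reader that only understands the BAI layout would reject the file whatever it is called.
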
@diego-rt
Author

Hello, I was wondering whether you have any updates, workarounds or suggestions on how to address this? @taoliu @EvanTarbell

Just to clarify, the problem is that when working with chromosomes larger than 512 Mbp, one needs to use a CSI index (i.e. one created with samtools index -c: https://www.htslib.org/doc/samtools-index.html) as opposed to a BAI index.
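
For anyone wondering where the 512 Mbp figure comes from: the BAI binning scheme is fixed at a smallest bin of 2^14 bp and 5 levels of bins, each level 8 times wider than the one below, which gives 2^(14 + 3*5) = 2^29 bp. A rough Python sketch of that arithmetic follows; the parameter names `min_shift` and `depth` follow the CSI specification, while the function names and the contig lengths are made up for illustration:

```python
def max_indexable_length(min_shift=14, depth=5):
    """Largest contig length a binning index with these parameters can cover.

    The defaults are the fixed BAI values; CSI stores both in its header.
    """
    return 1 << (min_shift + 3 * depth)


def contigs_needing_csi(contig_lengths):
    """Return names of contigs that are too long for a BAI index."""
    limit = max_indexable_length()
    return [name for name, length in contig_lengths.items() if length > limit]


# Hypothetical contig lengths, in bp.
lengths = {"chrA": 300_000_000, "chrB": 1_200_000_000}
print(max_indexable_length())        # 536870912, i.e. 512 Mbp
print(contigs_needing_csi(lengths))  # ['chrB']
```

In practice the lengths would come from the `@SQ` lines of the BAM header, for example via `samtools view -H`.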

Thanks a lot!

@jitsedesmet

I think the comments on #96 can help you find a solution.
