Errors on idxdepth #60

Open
Yexin-Zhang opened this issue Mar 1, 2021 · 3 comments

@Yexin-Zhang

Hi,

I am trying to use idxdepth to calculate the depth for the manifest file, but it always gives me a warning:

[warning] BAM header only has a subset of the reference chromosomes -- please make sure they match!

The issue occurs on many of the datasets I have tried. I use bwa for alignment, and gatk for adding read groups and removing duplicates.

Any hint as to how this might have happened?

Best,
Monica

@traxexx
Contributor

traxexx commented Apr 16, 2021

This is a bam header error. Did you use the same reference fasta for generating your bam, and for running idxdepth?
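
One way to check this is to compare the sequence names and lengths in the BAM header against the reference index. The following is a minimal sketch, not part of idxdepth: it assumes the header text comes from `samtools view -H` and the index text from the `.fai` file written by `samtools faidx`, and all function names are made up for illustration.

```python
def parse_sam_header_contigs(header_text):
    """Return {name: length} from the @SQ lines of a SAM/BAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "SN" in fields and "LN" in fields:
            contigs[fields["SN"]] = int(fields["LN"])
    return contigs


def parse_fai_contigs(fai_text):
    """Return {name: length} from the first two columns of a .fai index."""
    contigs = {}
    for line in fai_text.splitlines():
        if line.strip():
            name, length = line.split("\t")[:2]
            contigs[name] = int(length)
    return contigs


def compare_contigs(bam_contigs, ref_contigs):
    """Return (missing_in_bam, missing_in_ref, length_mismatch) as sorted lists."""
    bam_names, ref_names = set(bam_contigs), set(ref_contigs)
    missing_in_bam = sorted(ref_names - bam_names)
    missing_in_ref = sorted(bam_names - ref_names)
    length_mismatch = sorted(
        n for n in bam_names & ref_names if bam_contigs[n] != ref_contigs[n]
    )
    return missing_in_bam, missing_in_ref, length_mismatch


if __name__ == "__main__":
    # Toy data only; real input would be read from the two files.
    header = "@HD\tVN:1.3\tSO:coordinate\n@SQ\tSN:ctgA\tLN:100\n"
    fai = "ctgA\t100\t6\t60\t61\nctgB\t50\t120\t60\t61\n"
    print(compare_contigs(parse_sam_header_contigs(header), parse_fai_contigs(fai)))
```

If all three lists come back empty, the header and the reference agree on names and lengths, and the cause of the warning would have to be looked for elsewhere.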

@biozzq

biozzq commented Oct 25, 2021

Hi @traxexx

Yes, even when I use the same reference fasta for generating the BAM files, it sometimes gives me this error.

Best regards,
Zheng zhuqing

@karen916

karen916 commented Nov 5, 2023

Hello, @traxexx

I have encountered the same issue. Using the same reference genome, I came across the following warning message:

[2023-11-05 16:25:41.063] [idxdepth] [48575] [warning] BAM header only has a subset of the reference chromosomes -- please make sure they match!

The command I used is as follows:

(py37) [chenzhaojin@lfpara idxdepth]$ idxdepth -b /home/chenzhaojin/expansion/sv/results_1-6_supdata1/1-12/delly/alignment_1_sorted.bam \
    -o /home/chenzhaojin/expansion/sv/results_1-6_supdata1/paragraph/idxdepth/depth_1-12.txt \
    -O /home/chenzhaojin/expansion/sv/results_1-6_supdata1/paragraph/idxdepth/depth_1-12_tsv.txt \
    -r /home/chenzhaojin/sv_test/GCF_000003025.6_Sscrofa11.1_genomic.fna \
    --threads 48

My BAM file was generated by bwa mem. Upon checking the logs, I noticed many chromosome segments being skipped, with parts of the log as follows:

[2023-11-05 16:25:41.741] [idxdepth] [48591] [info] Thread 140667879675648 skipping NC_010443.5
[2023-11-05 16:25:41.741] [idxdepth] [48591] [info] Thread 140667879675648 skipping NC_010445.4
... (and so on for the rest of the chromosomes)

When I ran samtools view again to inspect the file header, these chromosome identifiers do appear to exist:

(py37) [chenzhaojin@lfpara delly]$ samtools view -H alignment_1_sorted.bam
@HD VN:1.3 SO:coordinate
@SQ SN:NC_010443.5 LN:274330532
... (and so on for the rest of the chromosomes)

Could you please help me understand why idxdepth is skipping these chromosome segments even though they are present in the BAM header? What could be causing this discrepancy?
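
To make the discrepancy concrete, the skipped names can be pulled out of the log and intersected with the names in the BAM header. This is only a rough sketch: the regular expression is modelled on the two log lines quoted above and may not match other idxdepth versions, and the function names are hypothetical.

```python
import re

# Pattern modelled on the quoted log lines; an assumption, not a documented format.
SKIP_RE = re.compile(r"\[info\] Thread \d+ skipping (\S+)")


def skipped_contigs(log_text):
    """Return contig names reported as skipped, in first-seen order, deduplicated."""
    names = (m.group(1) for m in SKIP_RE.finditer(log_text))
    return list(dict.fromkeys(names))


def skipped_but_in_header(log_text, header_names):
    """Return skipped contigs that are nevertheless listed in the BAM header."""
    header_names = set(header_names)
    return [n for n in skipped_contigs(log_text) if n in header_names]


if __name__ == "__main__":
    log = (
        "[2023-11-05 16:25:41.741] [idxdepth] [48591] [info] "
        "Thread 140667879675648 skipping NC_010443.5\n"
        "[2023-11-05 16:25:41.741] [idxdepth] [48591] [info] "
        "Thread 140667879675648 skipping NC_010445.4\n"
    )
    print(skipped_but_in_header(log, ["NC_010443.5"]))
```

A non-empty result lists exactly the contigs that are in the header yet were skipped, which may be useful to include in a bug report.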

In addition, I encountered a similar issue when running the command:

python3 /home/chenzhaojin/tools/bin/multigrmpy.py -i /home/chenzhaojin/expansion/sv/results_1-6_supdata1/survivor/filter1_DEL.vcf \
    -m /home/chenzhaojin/expansion/sv/results_1-6_supdata1/paragraph/samples.txt \
    -t 48 \
    -r /home/chenzhaojin/expansion/sv/mid75/paragraph/GCF_000003025.6_Sscrofa11.1_genomic_uppercase.fna \
    -o test1

The error message was:

Exception: NC_010443.5:75954757 fail to retrieve genome REF. Are you using the correct ref genome?

Could this issue be related to the previous one, and how can I ensure that the reference genome is correctly recognized in both cases?
Thank you for your assistance.
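
Note that the idxdepth command and the multigrmpy.py command above point at different FASTA files (`..._genomic.fna` and `..._genomic_uppercase.fna`). One possible check, offered as a sketch rather than a confirmed diagnosis, is whether every VCF record refers to a contig and position that exist in the reference passed with `-r`. The code below assumes a `{name: length}` mapping built from that reference's `.fai` index; the function name and the reason strings are made up for illustration.

```python
def find_unresolvable_records(vcf_text, ref_lengths):
    """Return (chrom, pos, reason) for VCF records the reference cannot cover.

    ref_lengths is a {contig_name: length} dict, e.g. taken from the first
    two columns of the .fai index of the reference FASTA.
    """
    problems = []
    for line in vcf_text.splitlines():
        # Skip blank lines and VCF meta/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos = line.split("\t")[:2]
        pos = int(pos)  # VCF positions are 1-based
        if chrom not in ref_lengths:
            problems.append((chrom, pos, "contig not in reference"))
        elif pos < 1 or pos > ref_lengths[chrom]:
            problems.append((chrom, pos, "position outside contig"))
    return problems


if __name__ == "__main__":
    # Toy data only; contig names and lengths here are invented.
    lengths = {"ctgA": 100}
    vcf = (
        "##fileformat=VCFv4.2\n"
        "#CHROM\tPOS\tID\tREF\tALT\n"
        "ctgA\t10\t.\tA\t<DEL>\n"
        "ctgB\t5\t.\tA\t<DEL>\n"
    )
    for record in find_unresolvable_records(vcf, lengths):
        print(record)
```

An empty result would suggest that names and coordinates are consistent with that reference, so the failure to retrieve REF would need another explanation, such as a stale or missing index file.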
