get_io_reads() fails with "Move table discordant with signal" on some reads #158
Comments
Hi. I have also worked a lot with move tables. Check whether there are duplicated read IDs in the pod5 files you use for basecalling, or in the basecalled BAM. In my experience, duplicates can cause this kind of issue.
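The duplicate check suggested above can be sketched as follows. This is a minimal illustration, not part of remora or dorado: `find_duplicate_ids` is a hypothetical helper, and the read IDs are assumed to have already been collected from the pod5 files or the BAM by whatever reader you use.

```python
from collections import Counter


def find_duplicate_ids(read_ids):
    """Return {read_id: count} for every read ID seen more than once.

    `read_ids` is any iterable of strings. How the IDs are collected
    (pod5 reader, BAM parser, ...) is left to the caller.
    """
    counts = Counter(read_ids)
    return {read_id: n for read_id, n in counts.items() if n > 1}


# Hypothetical IDs for illustration only.
example_ids = ["read-a", "read-b", "read-a", "read-c"]
print(find_duplicate_ids(example_ids))
```

When applying this to a BAM, keep in mind that secondary and supplementary alignments legitimately repeat a read ID, so it is probably best to count primary records only.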
I can reproduce the bug with the BAM file you sent, but I am not able to replicate the issue with dorado 0.5.3 and without reference mapping. It is possible that the issue is due to a bug in dorado concerning adapter trimming (related to the splitting bug). Could you upgrade to dorado 0.5.3 and let me know if this resolves your issue?
Hi marcus, I'm getting a similar issue with the newest dorado version (0.6.1) while trying to run
Getting a similar output:
I'm unsure about the first error. However, I did check the BAMs and pod5s, and some read IDs in the BAM are not present in the pod5 and vice versa. I'm unsure how this happens. Trying this with unmapped reads gives me the same outcome. It might be a read splitting issue, but I'm not able to resolve it.
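A comparison like the one described above can be made explicit with plain set operations. The sketch below is only an illustration: `compare_read_ids` is a hypothetical helper, and both ID collections are assumed to have been extracted beforehand.

```python
def compare_read_ids(pod5_ids, bam_ids):
    """Summarise which read IDs are missing on either side.

    Both arguments are iterables of read-ID strings gathered by the caller.
    """
    pod5_set = set(pod5_ids)
    bam_set = set(bam_ids)
    return {
        "only_in_pod5": sorted(pod5_set - bam_set),
        "only_in_bam": sorted(bam_set - pod5_set),
        "shared": len(pod5_set & bam_set),
    }


# Hypothetical IDs for illustration only.
report = compare_read_ids(
    pod5_ids=["read-a", "read-b", "read-c"],
    bam_ids=["read-b", "read-c", "read-d"],
)
print(report)
```

IDs that appear only in the BAM would be consistent with the read-splitting explanation raised in this thread, though that remains an assumption until the individual records are inspected.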
This should be resolved with the v3.2 release yesterday. Please reopen this issue if the problem persists.
@marcus1487 Could you please link to the changes in the code for this issue and, if you have the time, elaborate on what caused this and how it was solved? I have been trying to debug multiple instances of this issue over time and I am curious what changes were made to the logic. Thanks!
The changes are buried in this rather large merge request. Specifically note the changes in io.py regarding the ts, sp, and ns tags.
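To make the role of these tags more concrete, here is a rough sketch of the kind of consistency check involved. It is not remora's actual code. It assumes, based on my reading of dorado's BAM tag conventions, that ns is the number of signal samples before trimming, ts is the number of samples trimmed from the start, sp is the start offset of a split read within its parent signal, and that the move table holds one entry per stride of signal. Both function names are hypothetical.

```python
def signal_window(num_samples, trimmed_start=0, split_start=0):
    """Return the (start, end) sample range that was basecalled.

    Assumed meanings: num_samples ~ ns, trimmed_start ~ ts, split_start ~ sp.
    The range is expressed in coordinates of the parent signal.
    """
    return split_start + trimmed_start, split_start + num_samples


def move_table_matches_signal(n_moves, stride, num_samples, trimmed_start=0):
    """Check that a move table of n_moves entries covers the trimmed signal."""
    usable = num_samples - trimmed_start
    if stride <= 0 or usable < 0:
        return False
    # One move-table entry per full stride of usable signal.
    return n_moves == usable // stride


# Hypothetical numbers: 510 samples, 10 trimmed, stride 5, 100 moves.
print(signal_window(510, trimmed_start=10, split_start=2000))
print(move_table_matches_signal(100, 5, 510, trimmed_start=10))
# Ignoring the trimmed samples makes the same read look discordant.
print(move_table_matches_signal(100, 5, 510, trimmed_start=0))
```

Under these assumptions, a mismatch in how any one of the three tags is interpreted is enough to make an otherwise valid read fail the check.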
I can confirm that this issue is solved with v3.2. On a dataset of 60000 reads mapping onto a plasmid, prior versions failed on 50000 of them. Now, all are processed correctly. Thank you very much for this fix @marcus1487!
Hi @marcus1487,
I am running into an issue that I was hoping you could provide some insight into:
I have a dataset from a P2 sequencer that was basecalled and aligned using dorado v0.5.1. When working with these data using remora v3.1.0, a small portion of reads (about 1 in 5000) fail when calling get_io_reads with the following error:
I have attached the corresponding read to replicate the issue. I am curious as to what could cause these cases, and whether this can be fixed programmatically or whether these reads should simply be filtered out.
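If filtering turns out to be the pragmatic option, one way to do it is to build reads one at a time and set aside those that raise. The sketch below is generic and makes no claim about remora's API: `build_reads_skipping_failures` and `fake_builder` are hypothetical, and the real read-construction call would be passed in as `build_read`.

```python
def build_reads_skipping_failures(records, build_read):
    """Apply build_read to each record, collecting failures instead of aborting.

    `records` yields (record_id, record) pairs and `build_read` is whatever
    callable constructs a read object. Returns (built, failed).
    """
    built, failed = [], []
    for record_id, record in records:
        try:
            built.append(build_read(record))
        except Exception as err:  # in real code, catch the library's error type
            failed.append((record_id, str(err)))
    return built, failed


def fake_builder(record):
    """Stand-in for the real constructor; fails on records marked as bad."""
    if record["bad"]:
        raise RuntimeError("Move table discordant with signal")
    return record["name"].upper()


# Hypothetical records for illustration only.
demo_records = [
    ("r1", {"name": "r1", "bad": False}),
    ("r2", {"name": "r2", "bad": True}),
    ("r3", {"name": "r3", "bad": False}),
]
built, failed = build_reads_skipping_failures(demo_records, fake_builder)
print(built)
print(failed)
```

Keeping the failed IDs and messages, rather than silently dropping them, makes it easier to report how many reads were lost and to share reproducing examples.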
In the only other issue raising this problem, you hinted that this could be related to read splitting. In this instance, however, the read does not stem from a split read:
Any insight into this matter would be greatly appreciated.
Thank you in advance!