Incorrect read position for single end reads #102

Open
Ivarz opened this issue Jun 7, 2021 · 1 comment
Ivarz commented Jun 7, 2021

Describe the bug
After trimming single-end reads, the reported position of some reads is incorrect.
Example SAM record:
Before primer trimming:

NB501725:46:HV7TYBGXH:3:21611:1357:10891 16 MN908947.3 650 42 68M * 0 0 AAAGGAGCTGGTGGCCATAGTTACGGCGCCGATCTAAAGTCATTTGACTTAGGCGACGAGCTTGGCAC EE//<EEEE/EE/EEAE6EEEE6EEA/E/E<EAAEA//AEEE/EEEE///E/E/EEAEEE//EAAAAA AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:68 YT:Z:UU

After primer trimming:

NB501725:46:HV7TYBGXH:3:21611:1357:10891 16 MN908947.3 672 42 15S31M22S * 0 0 AAAGGAGCTGGTGGCCATAGTTACGGCGCCGATCTAAAGTCATTTGACTTAGGCGACGAGCTTGGCAC EE//<EEEE/EE/EEAE6EEEE6EEA/E/E<EAAEA//AEEE/EEEE///E/E/EEAEEE//EAAAAA AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:68 YT:Z:UU XA:i:3

To Reproduce
Steps to reproduce the behavior:

  1. Map single end reads to reference (I used bowtie2)
  2. ivar trim -i $BAM -b $BED -p PREFIX
  3. Identify reads for which old position + softclipped bases at the beginning of reads != new position
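
The check in step 3 can be sketched in a few lines of Python. This is only an illustration, not part of iVar: the helper names (`leading_softclip`, `expected_pos`, `is_misplaced`) are made up here, and the sketch assumes the untrimmed alignment has no soft clipping or indels within the trimmed prefix (true for the `68M` record above), so that each newly soft-clipped base at the start shifts POS by exactly one.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string such as '15S31M22S' into (length, op) pairs."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def leading_softclip(cigar):
    """Soft-clipped bases at the start of the CIGAR (hard clips are skipped)."""
    for length, op in parse_cigar(cigar):
        if op == "H":
            continue
        return length if op == "S" else 0
    return 0


def expected_pos(old_pos, new_cigar):
    """POS expected after trimming, under the assumptions stated above."""
    return old_pos + leading_softclip(new_cigar)


def is_misplaced(old_pos, new_pos, new_cigar):
    """True when the trimmed record's POS differs from the expected one."""
    return new_pos != expected_pos(old_pos, new_cigar)


# Values taken from the SAM records in this report.
print(expected_pos(650, "15S31M22S"))       # expected POS after trimming
print(is_misplaced(650, 672, "15S31M22S"))  # reported POS is 672
```

To scan a whole file, the same functions could be applied to records paired by read name from the BAM before and after trimming.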

Interestingly, when I repeat the alignment and primer trimming separately on just these reads, the trimming is performed correctly.

Expected behavior
In this case the read should map to position 665 (650 + 15), not 672.

I can provide you with more cases to help debugging.

@gkarthik gkarthik self-assigned this Jun 8, 2021
@AdmiralenOla

I too am facing this issue. It seems this will occur with single-end reads even if I don't supply any primer BED file, i.e. quality-based trimming only.

Example
Before trimming:
5XHX7:02978:00991 16 AJ277461.2 526 60 257M * 0 0 AAGGATGATGTCTCTAGAACCAACTGCAGACTGTGGAGTGGAAAAAGGCTTTACAACGGAAAGGATTAAGACTGGAAAGGTGGACTTGGATAGCTGTTGCACTCAGCATGGATGTACAAAAGGGATTAGGGTGGAGGTTCCATCGCCTGTACTGGTATCGGCCAAATGCAATGAAATTTCATTCAGAGTAGTGCCGTTCCATTCTGTACCAGACAGGTTAGGGTTCGCTAGAACTAGTTCTTTTACACTAAGAGCCG 4:4AAAABCCA@A@AA=:95;5;;AABBCCBCCC@CBBC>;*;;<<5BB=EBBC?BA>A=BB?AA@B?BB@@@=@:AA=CD@CAA?B@BCCCBBDE?EAAA9:::99@B@FCBBCAA9FDB=BBB@C@<AA@ACC?D@C@BCBBBADCBBAA@>BBB<;7A=;7;;<BA=A@:/99/=<<?<@?>?CCCCBCB@BB@DACC@CBBBD@>?ADDECACACC=EE@DDBCCAA?CCCCFABB:CCBBBBBB?BBCDIFF NM:i:4 MD:Z:63A116G36C38A0 AS:i:482 XS:i:0

After trimming:
5XHX7:02978:00991 16 AJ277461.2 610 60 24S149M84S * 0 0 AAGGATGATGTCTCTAGAACCAACTGCAGACTGTGGAGTGGAAAAAGGCTTTACAACGGAAAGGATTAAGACTGGAAAGGTGGACTTGGATAGCTGTTGCACTCAGCATGGATGTACAAAAGGGATTAGGGTGGAGGTTCCATCGCCTGTACTGGTATCGGCCAAATGCAATGAAATTTCATTCAGAGTAGTGCCGTTCCATTCTGTACCAGACAGGTTAGGGTTCGCTAGAACTAGTTCTTTTACACTAAGAGCCG 4:4AAAABCCA@A@AA=:95;5;;AABBCCBCCC@CBBC>;*;;<<5BB=EBBC?BA>A=BB?AA@B?BB@@@=@:AA=CD@CAA?B@BCCCBBDE?EAAA9:::99@B@FCBBCAA9FDB=BBB@C@<AA@ACC?D@C@BCBBBADCBBAA@>BBB<;7A=;7;;<BA=A@:/99/=<<?<@?>?CCCCBCB@BB@DACC@CBBBD@>?ADDECACACC=EE@DDBCCAA?CCCCFABB:CCBBBBBB?BBCDIFF NM:i:4 MD:Z:63A116G36C38A0 AS:i:482 XS:i:0 XA:i:7

I think ivar trim is getting the placement wrong after soft clipping. It seems to mix up the soft clipping at the left and right ends. In this example, the trimmed read is placed at position 610, i.e. 526 + 84 (the soft clipping at the right end). It should be placed at position 550 (526 + 24).

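The same logic applies to the example of @Ivarz above: the reported position 672 equals 650 + 22 (the right-end soft clip), whereas the expected position is 650 + 15 = 665.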
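
A small Python sketch can test this hypothesis against both records in the thread. It is not iVar code: `clip_lengths` and `classify_shift` are hypothetical helper names, and it assumes the untrimmed alignments are plain matches in the clipped regions (the originals are `68M` and `257M`). It simply asks whether the observed shift in POS equals the leading or the trailing soft clip of the trimmed CIGAR.

```python
import re


def clip_lengths(cigar):
    """Return (leading, trailing) soft-clip lengths of a CIGAR string."""
    ops = [(int(n), op)
           for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)
           if op != "H"]  # hard clips do not affect this comparison
    lead = ops[0][0] if ops and ops[0][1] == "S" else 0
    trail = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return lead, trail


def classify_shift(old_pos, new_pos, new_cigar):
    """Say which soft clip, if any, matches the observed change in POS.

    If both clips have the same length the result is ambiguous and
    'leading' is returned.
    """
    lead, trail = clip_lengths(new_cigar)
    shift = new_pos - old_pos
    if shift == lead:
        return "leading"
    if shift == trail:
        return "trailing"
    return "other"


# (reporter, POS before, POS after, CIGAR after) from the records above.
examples = [
    ("Ivarz", 650, 672, "15S31M22S"),
    ("AdmiralenOla", 526, 610, "24S149M84S"),
]

for who, old_pos, new_pos, cigar in examples:
    kind = classify_shift(old_pos, new_pos, cigar)
    print(f"{who}: shift {new_pos - old_pos} matches the {kind} soft clip")
```

Both example records carry flag 16 (reverse strand), which may or may not be relevant to where the two ends get swapped.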
