Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

banded alignment gives spurious gaps for 1bp sequences #2323

Open
mchaisso opened this issue Jul 8, 2018 · 3 comments
Open

banded alignment gives spurious gaps for 1bp sequences #2323

mchaisso opened this issue Jul 8, 2018 · 3 comments
Assignees

Comments

@mchaisso
Copy link

mchaisso commented Jul 8, 2018

This isn't a common case, but it's worth bringing up as a base-case issue. Passing a single nucleotide to banded global alignment can give spurious alignments.

#include "seqan/align.h"
#include <string>
#include <iostream>
using namespace std;
typedef seqan::String<seqan::Dna> TSequence;                 // sequence type
typedef seqan::Align<TSequence, seqan::ArrayGaps> TSeqAlign;   
typedef seqan::Row<TSeqAlign>::Type TSeqRow;                 // gapped sequence typ



int main(int argc, char* argv[]) {
	string read="T";
	string ref="T";
	TSeqAlign align;
	seqan::resize(seqan::rows(align), 2);
	seqan::assignSource(seqan::row(align, 0), read.c_str());
	seqan::assignSource(seqan::row(align, 1), ref.c_str());
	int k=7;
	int score=0;
// affine banded alignment
	score = seqan::globalAlignment(align, seqan::Score<int, seqan::Simple>(4, -4, -20, 0), -k, k);
	cout << score<< endl;
	cout << align << endl;
// standard affine
	score = seqan::globalAlignment(align, seqan::Score<int, seqan::Simple>(4, -4, -20, 0));
	cout << score<< endl;
	cout << align << endl;
// standard banded
	score = seqan::globalAlignment(align, seqan::Score<int, seqan::Simple>(4, -4, -4), -k, k);
	cout << score<< endl;
	cout << align << endl;
	return 0;
}
Output:
1073741824
      0    
        T-
          
        -T



4
      0   
        T
        |
        T



-1073741824
      0    
        T-
          
        -T
@marehr
Copy link
Member

marehr commented Jul 17, 2018

Hi @mchaisso,

sorry for coming back to you so late, but it is currently vacation time and our group kinda focuses on version 3 of SeqAn.

You seem to have found a neat little edge case. I imagine that the 2bp case works correctly?
Could you try to use seqan::String<char> directly for read and ref without the whole .c_str() conversion?

Could you clarify what you mean by spurious?

Do you mean that affine, banded, affine-banded behave inconsistently (and wrong)? Or do you mean that multiple executions of the same binary produces different output?

Anyway, I put @rrahn which is the maintainer of the alignment code and should fairly easy find the issue. (Who is on vacation 😄 )

@mchaisso
Copy link
Author

mchaisso commented Jul 20, 2018 via email

@rrahn
Copy link
Contributor

rrahn commented Jul 25, 2018

hi, many thanks for reporting this.
I will look into it, but at the moment I am stacked with things that need to be done before we have our workshops in September. So I can't guarantee to fix this by then, but probably when porting the alignment code to SeqAn3 I will fix this as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants