Convert vector<dna5> to std::string #3218

Open
nhhaidee opened this issue Dec 17, 2023 · 1 comment
Labels
question a user question how to do certain things

Comments

nhhaidee commented Dec 17, 2023

Platform

  • SeqAn version: 3.2.0
  • Operating system: Ubuntu
  • Compiler: 11.4

Question

With the following code, is there any way to convert the vector<dna5> sequence to a std::string? I want to work with standard C++ strings.

Thanks,
Hai

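#include <sstream>

#include <seqan3/core/debug_stream.hpp>
#include <seqan3/io/sequence_file/all.hpp>

auto input = R"(>TEST1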
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";

int main(int argc, const char *argv[]) {

    using sequence_file_input_type =
            seqan3::sequence_file_input<seqan3::sequence_file_input_default_traits_dna,
                    seqan3::fields<seqan3::field::seq, seqan3::field::id>,
                    seqan3::type_list<seqan3::format_fasta>>;
    sequence_file_input_type fin{std::istringstream{input}, seqan3::format_fasta{}};
    // Retrieve the sequences and ids.
    for (auto &[seq, id]: fin) {
        seqan3::debug_stream << "ID:  " << id << '\n';
        seqan3::debug_stream << "SEQ: " << seq << '\n';
        // Only seq and id are selected via seqan3::fields, so no quality field is read.
    }

    return 0;
}

nhhaidee added the question label Dec 17, 2023

smehringer (Member) commented

Hi @nhhaidee,

thanks for reaching out!

This is indeed a common use case that our library does not handle well. The solution is a bit unintuitive:

You can adapt seqan3::sequence_file_input_default_traits_dna by deriving your own traits struct from it:

struct my_traits : seqan3::sequence_file_input_default_traits_dna
{
    using sequence_alphabet = char; // instead of dna5
 
    template <typename alph>
    using sequence_container = std::basic_string<alph>; // must be defined as a template!
};

With these traits, the sequences are read directly as std::string (std::string is std::basic_string<char>).
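
To illustrate why sequence_container has to be an alias template rather than a plain type: the reader combines the two traits members itself, plugging sequence_alphabet into sequence_container. The sketch below imitates that step in plain C++ without SeqAn3. The names default_traits, string_traits and sequence_type_of are hypothetical and only serve the illustration; this is a simplified mock-up, not SeqAn3's actual implementation.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a library's default traits (not SeqAn3 code).
struct default_traits
{
    using sequence_alphabet = int;

    template <typename alph>
    using sequence_container = std::vector<alph>;
};

// Same override pattern as my_traits: replace both alphabet and container.
struct string_traits : default_traits
{
    using sequence_alphabet = char;

    template <typename alph>
    using sequence_container = std::basic_string<alph>;
};

// Simplified mock of how a reader could derive its sequence type:
// it instantiates the container template with the alphabet.
template <typename traits>
using sequence_type_of =
    typename traits::template sequence_container<typename traits::sequence_alphabet>;

// Checked at compile time: the overridden traits yield std::string.
static_assert(std::is_same_v<sequence_type_of<default_traits>, std::vector<int>>);
static_assert(std::is_same_v<sequence_type_of<string_traits>, std::string>);
```

If sequence_container were a plain alias such as `using sequence_container = std::string;`, the `template sequence_container<...>` lookup in this mock-up would fail to compile, which mirrors the "must be defined as a template" remark.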

Full Solution:

#include <iostream>
#include <sstream>

#include <seqan3/io/sequence_file/all.hpp>
#include <seqan3/core/debug_stream.hpp>

auto input = R"(>TEST1
ACGT
>Test2
AGGCTGA
>Test3
GGAGTATAATATATATATATATAT)";

struct my_traits : seqan3::sequence_file_input_default_traits_dna
{
    using sequence_alphabet = char; // instead of dna5
 
    template <typename alph>
    using sequence_container = std::basic_string<alph>; // must be defined as a template!
};

int main(int argc, const char *argv[]) {

    using sequence_file_input_type =
            seqan3::sequence_file_input<my_traits,
                    seqan3::fields<seqan3::field::seq, seqan3::field::id>,
                    seqan3::type_list<seqan3::format_fasta>>;

    sequence_file_input_type fin{std::istringstream{input}, seqan3::format_fasta{}};
    // Retrieve the sequences and ids.
    for (auto &[seq, id]: fin) {
        std::cout << "ID:  " << id << '\n';
        std::cout << "SEQ: " << seq << '\n';
        // Only seq and id are selected via seqan3::fields, so no quality field is read.
    }

    return 0;
}

Working example on Compiler Explorer: https://godbolt.org/z/PrrooYzTK

As you can see, the sequence can now also be printed with std::cout, since it is a std::string.
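
If you already hold a std::vector<seqan3::dna5> and only want a std::string copy of it, an element-wise conversion is another option. The sketch below is a minimal illustration under assumptions: mock_dna is a hypothetical stand-in for seqan3::dna5 so that the code compiles without SeqAn3, and to_std_string is a helper name made up here. It relies only on each letter offering a to_char() member, which SeqAn3 alphabets provide; the library's own seqan3::views::to_char adaptor covers the same ground.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for seqan3::dna5 so the sketch compiles without SeqAn3.
struct mock_dna
{
    char symbol;

    char to_char() const { return symbol; }
};

// Convert any range of letters that offer a to_char() member into a std::string.
template <typename letter_range>
std::string to_std_string(letter_range const & letters)
{
    std::string result;
    result.reserve(std::size(letters));
    std::transform(std::begin(letters), std::end(letters), std::back_inserter(result),
                   [](auto const & letter) { return letter.to_char(); });
    return result;
}
```

Inside the original loop, this would be called as `std::string s = to_std_string(seq);`. Unlike the traits approach, it makes a copy of every sequence, so it fits best when only a few sequences need converting.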
