Consequences of using vsearch on NovaSeq data #549

Closed
slambrechts opened this issue Dec 11, 2023 · 4 comments

@slambrechts

Hi,

I know there are consequences of using dada2 on NovaSeq data (e.g. benjjneb/dada2#791), but do you know if there are similar problems when using vsearch on NovaSeq data?

Best,
Sam

@frederic-mahe
Collaborator

@slambrechts you are probably referring to NovaSeq's simplified quality encoding.

The short answer is: no known adverse effect yet.

Only marginal effects are known. For instance, vsearch may report average or median fastq quality values that do not belong to the reduced set of quality values. vsearch commands such as --fastq_mergepairs recompute quality values, and thus may be more affected. Nothing has shown up in our tests so far.
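
To illustrate the point about averages and medians, here is a minimal Python sketch. It is not vsearch code and makes no claim about how vsearch computes its statistics; the read and its quality scores are made up for the example, and the reduced set is the NovaSeq one quoted later in this thread.

```python
import statistics

# Reduced NovaSeq quality set quoted in this thread (RTA3, 2021).
NOVASEQ_QUALITIES = {2, 12, 23, 37}

# Hypothetical per-base quality scores for one short read;
# every value belongs to the reduced set.
read_qualities = [37, 37, 23, 12]

mean_q = statistics.mean(read_qualities)
median_q = statistics.median(read_qualities)

# Summary statistics can fall between the binned values.
print(mean_q, mean_q in NOVASEQ_QUALITIES)
print(median_q, median_q in NOVASEQ_QUALITIES)
```

Both summary values fall between the binned values, which is harmless but can look surprising in a report.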

@slambrechts
Author

slambrechts commented Dec 11, 2023

@frederic-mahe ok great, thank you for the info. If I understand correctly, there is also no need to adjust maxee for fastq filtering?

@frederic-mahe
Collaborator

Earlier this year, I've listed the following reduced sets of quality values (see issue #474):

  • NovaSeq and RTA3 (2021) quality values are: 2, 12, 23, and 37
  • NextSeq and RTA3 (2023) quality values are: 2, 14, 21, 27, 32, and 36

These are subsets of usual quality sets, so I do not expect any particular difficulties for vsearch.
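
For anyone who wants to check which quality values their own reads contain, a rough Python sketch follows. It assumes an offset of 33 (Phred+33) and unwrapped four-line fastq records; the function name `observed_qualities` and the example read are hypothetical, not part of vsearch.

```python
# Reduced NovaSeq quality set quoted above (RTA3, 2021).
NOVASEQ_QUALITIES = {2, 12, 23, 37}


def observed_qualities(fastq_text, offset=33):
    """Return the set of Phred scores seen in fastq text.

    Assumes unwrapped records of exactly four lines each
    (header, sequence, '+', quality) and the given ASCII offset.
    """
    lines = fastq_text.strip().splitlines()
    scores = set()
    # The quality string is the fourth line of each record.
    for quality_line in lines[3::4]:
        scores.update(ord(symbol) - offset for symbol in quality_line)
    return scores


# Made-up record: 'F' = 37, '8' = 23, '-' = 12, '#' = 2 with offset 33.
example = "@read1\nACGTACGT\n+\nFFFF88-#\n"
seen = observed_qualities(example)

print(sorted(seen))
print(seen <= NOVASEQ_QUALITIES)  # True when only reduced-set values occur
```

On real data, the same function could be applied to the text of a fastq file, as long as the file is small enough to read into memory.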

> Also no need to adjust maxee for fastq filtering?

When using --fastq_filter, --fastq_mergepairs or --fastx_filter, option --fastq_maxee discards sequences with an expected error greater than the specified value. There is no default value for --fastq_maxee, so there is no adjustment to be made at the code level. Also, the way the expected error is computed (sum of 10^-(Q/10) over all positions of a read) should not be affected if a reduced set of quality values is used.
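
The expected-error formula can be sketched in a few lines of Python. This is only an illustration of the sum of 10^-(Q/10), not vsearch's implementation; the function names, the example reads, and the threshold of 1.0 are assumptions chosen for the example.

```python
def expected_error(qualities):
    """Sum of per-base error probabilities, 10^-(Q/10) for each score Q."""
    return sum(10 ** (-q / 10) for q in qualities)


def passes_maxee(qualities, maxee):
    """Keep a read unless its expected error is greater than maxee."""
    return expected_error(qualities) <= maxee


# Hypothetical reads using only reduced-set NovaSeq values.
good_read = [37] * 10            # ten high-quality bases
bad_read = [37] * 10 + [2] * 2   # same read plus two Q2 bases

for name, read in (("good", good_read), ("bad", bad_read)):
    print(name, round(expected_error(read), 4), passes_maxee(read, 1.0))
```

The formula only needs each Q to map to an error probability, so it works the same whether a read uses four distinct quality values or forty.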

I could be wrong though, please feel free to suggest tests or configurations.

@frederic-mahe
Collaborator

Basic tests have been added to our test suite (frederic-mahe/vsearch-tests@bd064e7).
