fastq-multx not identifying barcodes #60

Open
SJRussell opened this issue May 10, 2017 · 8 comments

@SJRussell

I have a fastq file with 1.7 million reads of 75 bps, and 28 different barcodes. My command:
`fastq-multx -l barcodes.txt all_samples.fastq -e -o %.fq -x`

Part of my barcode file, barcodes.txt:
[image]

Head of my fastq, with some barcodes highlighted:
[image]

My output, which leaves most reads multiplexed:
[image: picture2]

Suggestions? What does (shifted) mean?

Thanks,

Stewart
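A quick way to check where the barcodes actually sit in the reads is to tally, per barcode, how many reads start with it, end with it, or only contain it somewhere in the middle. This is a minimal Python sketch, not part of fastq-multx; the function name `tally_barcode_positions` and the toy barcodes and reads are made up for illustration, and it only looks for exact matches.

```python
from collections import Counter

def tally_barcode_positions(reads, barcodes):
    """Count, per barcode id, reads with the barcode at the start, at the end, or internally."""
    tally = {name: Counter() for name in barcodes}
    for seq in reads:
        for name, bc in barcodes.items():
            if seq.startswith(bc):
                tally[name]["start"] += 1
            elif seq.endswith(bc):
                tally[name]["end"] += 1
            elif bc in seq:
                tally[name]["internal"] += 1
    return tally

# Hypothetical toy data, not taken from the files in this issue
barcodes = {"s1": "ACGT", "s2": "TTGA"}
reads = ["ACGTCCCCCC", "CCCCCCTTGA", "CCACGTCCCC", "GGGGGGGGGG"]

result = tally_barcode_positions(reads, barcodes)
for name, counts in result.items():
    print(name, dict(counts))
```

If most hits land in a different position than the one the demultiplexer is told to search, that mismatch would be the first thing to look at.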

@benligan

I am having this problem as well. For some reason fastq-multx is not recognizing the barcodes correctly.

@drsuuzzz

Can you give some more information about the type of assay and sequencing equipment used?

@benligan

Sure: Illumina MiSeq, paired-end. R1 holds the forward sequences and R3 the reverse sequences.

/Users/###/miniconda2/bin/fastq-multx -B /Volumes/homes/###/SRA/Columbia.Gut.Murine/map/reversecomplement/reversecomplement.txt /Volumes/homes/###/SRA/Columbia.Gut.Murine/2_fastq/lane1_NoIndex_L001_R1_001.fastq /Volumes/homes/###/SRA/Columbia.Gut.Murine/2_fastq/lane1_NoIndex_L001_R3_001.fastq -o /Volumes/homes/###/SRA/Columbia.Gut.Murine/SRA/R3/R3.%.fastq /Volumes/homes/###/SRA/Columbia.Gut.Murine/SRA/R3/R1.%.fastq
/Users/###/miniconda2/bin/fastq-multx -B /Volumes/homes/###/SRA/Columbia.Gut.Murine/map/forward/forward2.txt /Volumes/homes/###/SRA/Columbia.Gut.Murine/2_fastq/lane1_NoIndex_L001_R1_001.fastq /Volumes/homes/###/SRA/Columbia.Gut.Murine/2_fastq/lane1_NoIndex_L001_R3_001.fastq -o /Volumes/homes/###/SRA/Columbia.Gut.Murine/SRA/R1/R1.%.fastq -o /Volumes/homes/###/SRA/Columbia.Gut.Murine/SRA/R1/R3.%.fastq
Using Barcode File: /Volumes/homes/###/SRA/Columbia.Gut.Murine/map/forward/forward2.txt

Returns:

  1. Empty fastq files

[image: screen shot 2017-07-18 at 8 21 48 am]

I know each of these samples should have 5000-10000 amplicons.

The behavior is also very erratic: with some fastq files it demultiplexes correctly, and on other runs it doesn't demultiplex at all and returns empty files.

Primer barcodes:
[image: screen shot 2017-07-18 at 8 23 02 am]
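Since one of the barcode files above is a reverse-complemented version, it may help to check which orientation of a barcode actually occurs in the reads. The following is a rough Python sketch under that assumption; `reverse_complement` and `orientation_hits` are hypothetical helper names, the barcode and reads are toy values, and only exact substring matches are counted.

```python
# Translation table for DNA bases; N is kept as N
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def orientation_hits(reads, barcode):
    """Count reads containing the barcode as given and as its reverse complement."""
    rc = reverse_complement(barcode)
    return {
        "as_given": sum(barcode in r for r in reads),
        "reverse_complement": sum(rc in r for r in reads),
    }

# Hypothetical toy data, not taken from the files in this issue
reads = ["AAGCGGGGGG", "CCCCCCGCTT", "CCGCTTCCCC"]
hits = orientation_hits(reads, "AAGC")
print(hits)
```

Running this over a few thousand reads from a file that demultiplexes and one that does not could show whether the two runs differ in barcode orientation.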

@benligan

I am also using version 1.3.1.

Here is the `cat` output of the fastq file:

+
AAAAAFFAA11>AEGGGFGCGGHGGGAEHHFHHGHHEGGGHH1B01BF/BFEGCEE@/B2222BFFFF1BEFACEHF1B@/EEGEE<FHH1F00?0<B11<//?>///1?/C////1?F0<0>FCGFF=<GEGG<-..<<E00:CGHHHH:
@MISEQ01:40:000000000-ART6C:1:1101:16675:2021 1:N:0:
GAAATATCCTTTGCAGTAGCGCCAATATGAGAAGAGCCATACCGCTGATTCTGCGTTTGCTGATGAACTAAGTCAACCTCAGCACTAACCTTGCGAGTCATTTCTTTGATTTGGTCATTGGTAAAATACTGACCAGCCGTTTGAGCTTGAG
+
AAA3AFFFFFFDGG54BDEGGCGGFFHHGGFCFHFHGHHDHHHGGGGDHHHHHHGGGGGHHFFGHHHHHHHGHHHFHHHHHHHFGGHGHHHHHHGGGGHGFHGHHHHGGHHHHHFHHGHGGHHGHHHHHHGHHFHHHGGGGHFHGFHHHEG
@MISEQ01:40:000000000-ART6C:1:1101:19362:2021 1:N:0:
TACGTAGGGGGCAAGCGTTATCCGGATTTACTGGGTGTAAAGGGGGCGCAGACGGCAATGCAAGCCAGGAGTGAAAGCCCGGGGCCCAACCCCGGGACTGCTCTTGGAACTGCATGGCTGGAGTACAGGCGGGGCAGGCGGAATTCCTAAT
+

@ExpressionAnalysis
Owner

Would you be able to attach some sample data that we can work with to recreate the problem?

@J-Sabino

I am having a similar problem.
I have the same samples in 2 lanes (Illumina HiSeq), 992 samples in total. Lane 1 returns 983 samples and lane 2 returns 985 samples. This is strange because I checked and the barcode sequences are there, but fastq-multx is not returning all of them.

@wltrimbl
Contributor

João, can you prepare subsets of the data files (perhaps 20 lines of all of the relevant fastqs?) and the sample mapping file that will produce the bad output? (Short, self-contained example of the bug)
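For anyone preparing such a subset, here is a minimal Python sketch that keeps the first few records of a fastq file. It assumes plain four-line records (no wrapped sequences) and that the paired files list reads in the same order, so using the same record count on each file keeps the pairs in sync. The names `first_records` and `subset_fastq` are hypothetical, and 20 lines corresponds to 5 records.

```python
from itertools import islice

def first_records(lines, n_records):
    """Return the first n_records FASTQ records (4 lines each) from an iterable of lines."""
    return list(islice(lines, 4 * n_records))

def subset_fastq(src_path, dst_path, n_records=5):
    """Write the first n_records records of src_path to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(first_records(src, n_records))

# Hypothetical in-memory example with two records
demo = ["@r1\n", "ACGT\n", "+\n", "FFFF\n", "@r2\n", "TTGA\n", "+\n", "FFFF\n"]
subset = first_records(demo, 1)
print("".join(subset), end="")
```

Calling `subset_fastq` once per input file with the same `n_records` gives matching small files to attach alongside the sample mapping file.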

@J-Sabino

J-Sabino commented Jun 7, 2019

Hi, I had trouble generating this data subset, and I am not allowed to share the full dataset.

In the meantime, I used QIIME2 to demultiplex the samples and it worked fine. I realized that the low-read samples were the ones not being picked up by fastq-multx. I should say that I have always used fastq-multx with MiSeq data and it worked fine, so maybe the problem is related to HiSeq or to the size of the dataset.

Again, I am sorry for not being able to provide a self-contained example of the bug.
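To make the comparison above concrete, one could list the samples that a second demultiplexer reports but that are absent from the fastq-multx output, sorted by read count. This is only a sketch with made-up numbers; `missing_samples` is a hypothetical helper, and the per-sample counts are assumed to come from whatever tool did produce all samples.

```python
def missing_samples(expected_counts, found_ids):
    """Return (sample_id, reads) for expected samples absent from found_ids, lowest read count first."""
    missing = [(sid, n) for sid, n in expected_counts.items() if sid not in found_ids]
    return sorted(missing, key=lambda item: item[1])

# Hypothetical per-sample read counts and the set of samples that were returned
expected_counts = {"A": 12000, "B": 35, "C": 9800, "D": 4}
found_ids = {"A", "C"}

report = missing_samples(expected_counts, found_ids)
for sample_id, reads in report:
    print(sample_id, reads)
```

If the missing samples all sit at the bottom of the read-count range, that would support the low-read explanation.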
