
popscle/tsv_reader.cpp:93 double_field_at] Cannot access field at 5 >= 5 #37

Open
kyleac opened this issue Dec 7, 2020 · 12 comments

Comments

@kyleac

kyleac commented Dec 7, 2020

Hi, I've managed to run the tutorial pileup and freemuxlet successfully, as well as a pileup with the 1000g_ref.vcf and barcodes.tsv on a 10x Chromium scRNA-seq sample, after some troubleshooting (using popscle_helper_tools to sort 1000g_ref.vcf according to the chromosome order output for 10x, downgrading htslib to 1.10.2 for the popscle installation, and making a change to the CMake file as recommended in issue #21):

We will fix this, but I believe that you can fix it yourself by changing the line in CMakeLists.txt from
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O3 -pthread")
to
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O3 -pthread --std=c++11")
I would appreciate it if you could let us know whether the change worked.
Hyun

On Thu, Dec 19, 2019 at 4:08 PM Griffan (Fan Zhang) wrote:
It's likely a problem related to the gcc version; you can try the solutions mentioned in this link: https://stackoverflow.com/questions/10033373/c-error-nullptr-was-not-declared-in-this-scope-in-eclipse-ide

Now, when I attempt to run freemuxlet on the dsc_pileup output for my sample, I get the following error:

NOTICE [2020/12/07 11:09:55] - Loading pileup information with prefix dir/popscle_out/pr_478/pr_478_pooled_dsc_pileup
NOTICE [2020/12/07 11:09:55] - Reading barcode information from dir/popscle_out/pr_478/pr_478_pooled_dsc_pileup.cel.gz..
NOTICE [2020/12/07 11:09:55] - Finished loading 1907 droplets, skipping 0
NOTICE [2020/12/07 11:09:55] - Reading variant information from dir/popscle_out/pr_478/pr_478_pooled_dsc_pileup.var.gz..
FATAL ERROR -
[dir/popscle/tsv_reader.cpp:93 double_field_at] Cannot access field at 5 >= 5

I'm unsure how to troubleshoot further. Thank you.

@Griffan
Collaborator

Griffan commented Dec 7, 2020

It looks like your var.gz file is not in the expected format. Could you post the first 2 lines?

@kyleac
Author

kyleac commented Dec 7, 2020

[kyleac@gl-login1 pr_478]$ gzip -cd pr_478_pooled_dsc_pileup.var.gz | head
#SNP_ID CHROM POS REF ALT AF
0 10177 A A 0.42532
1 10235 T T 0.00120
2 10352 T T 0.43750
3 10505 A T 0.00020
4 10506 C G 0.00020
5 10511 G A 0.00020
6 10539 C A 0.00060
7 10542 C T 0.00020
8 10579 C A 0.00020

Looks like the CHROM column is empty?

@hyunminkang
Contributor

hyunminkang commented Dec 8, 2020 via email

@kyleac
Author

kyleac commented Dec 12, 2020

Thank you for your help in troubleshooting the pileup output. I was able to solve the issue.

I started with a fresh 1000 Genomes reference .vcf and compared it directly to my Cell Ranger .bam output to identify and correct the differences between them (chromosome order and contig naming conventions).
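
A comparison of this kind can be sketched in Python. This is only an illustration, assuming the contig names have already been extracted into two lists of unique names in file order (for example from the @SQ lines of the BAM header and from the contig lines or CHROM column of the VCF). The function name compare_contigs and the example lists are hypothetical and are not part of popscle.

```python
def compare_contigs(bam_contigs, vcf_contigs):
    """Report naming and ordering differences between two contig lists.

    Both arguments are assumed to be lists of unique contig names,
    given in the order in which they appear in each file.
    """
    bam_set = set(bam_contigs)
    vcf_set = set(vcf_contigs)
    # Restrict each list to the names both files share, keeping file order.
    shared_in_bam_order = [c for c in bam_contigs if c in vcf_set]
    shared_in_vcf_order = [c for c in vcf_contigs if c in bam_set]
    return {
        "only_in_bam": [c for c in bam_contigs if c not in vcf_set],
        "only_in_vcf": [c for c in vcf_contigs if c not in bam_set],
        # True when the shared contigs appear in the same relative order.
        "same_order": shared_in_bam_order == shared_in_vcf_order,
    }


# Hypothetical example: the BAM uses "chr"-prefixed names, the VCF does not.
report = compare_contigs(["chr1", "chr2", "chrX"], ["1", "2", "X"])
print(report["only_in_bam"])
print(report["only_in_vcf"])
```

If every name ends up in only_in_bam or only_in_vcf, the two files most likely follow different naming conventions. In that case same_order is trivially true and says nothing about the ordering.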

kyleac closed this as completed Dec 12, 2020
@bwheel12

I seem to be having a similar issue. However, I cannot see how my plp.var file is malformed. Below are the first few lines of the file:

#SNP_ID CHROM POS REF ALT AF
0 chr1 69496 G A 0.01400
1 chr1 69761 A T 0.02100
2 chr1 786344 A A 0.01100
3 chr1 827105 C A 0.18800
4 chr1 930165 G A 0.00040
5 chr1 930204 G A 0.00841
6 chr1 930245 G A 0.00040
7 chr1 930248 G A 0.00360
8 chr1 930314 C T 0.05900
9 chr1 930320 C T 0.00241
10 chr1 935779 G A 0.00080
11 chr1 935849 G C 0.00120

@valentinaOpazo

Were you able to solve it? I'm having the same issue.

@hyunminkang
Contributor

I believe that it has to be tab-delimited.

@valentinaOpazo

valentinaOpazo commented Mar 21, 2024

My .var.gz file is tab-delimited and has the CHROM column, so I don't know why I get the error:

NOTICE [2024/03/21 16:00:26] - Loading pileup information with prefix result_demuxlet_SUBSET_updTSV
NOTICE [2024/03/21 16:00:26] - Reading barcode information from result_demuxlet_SUBSET_updTSV.cel.gz..
NOTICE [2024/03/21 16:00:26] - Finished loading 16218 droplets, skipping 0
NOTICE [2024/03/21 16:00:26] - Reading variant information from result_demuxlet_SUBSET_updTSV.var.gz..
FATAL ERROR - [E:/home/xlopez/popscle/tsv_reader.cpp:93 double_field_at] Cannot access field at 5 >= 5

The head of the .var.gz file looks like this:

[image: head of the .var.gz file]

hyunminkang reopened this Mar 21, 2024
@hyunminkang
Contributor

@valentinaOpazo Sometimes this can happen due to strange quirks related to htslib. If you share your input files with me, I will see whether the issue reproduces on my end and suggest a possible fix.

@valentinaOpazo

I'm sharing this Google Drive folder with the 4 dsc-pileup output files, the .tsv file, and the Demuxlet log containing the error:
https://drive.google.com/drive/folders/1XlRu9eQCnQcB48BLZIVhpC9lgba_rvHb?usp=drive_link

@valentinaOpazo

@hyunminkang Any news? I tried running it on different machines (Linux and Mac) and got the same issue on both.

@hyunminkang
Contributor

Your input file is malformed. The .var.gz file contains "empty" chromosome names in the middle of the file and it is causing errors. I don't think this is a bug.
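
For anyone who wants to locate such rows, a minimal Python sketch follows. It assumes the six-column layout shown in the headers posted in this thread (#SNP_ID, CHROM, POS, REF, ALT, AF) and tab-delimited rows. The function name find_malformed_rows is hypothetical and not part of popscle, and this thread does not show how popscle's own reader tokenizes a line, so the check here is a plain tab split.

```python
import gzip

EXPECTED_FIELDS = 6  # assumption: #SNP_ID, CHROM, POS, REF, ALT, AF


def find_malformed_rows(path, limit=20):
    """Return up to `limit` (line_number, reason) pairs for suspicious rows."""
    problems = []
    with gzip.open(path, "rt") as handle:
        for line_number, line in enumerate(handle, start=1):
            row = line.rstrip("\n")
            if not row or row.startswith("#"):
                continue  # skip blank lines and the header line
            tab_fields = row.split("\t")
            if len(tab_fields) != EXPECTED_FIELDS:
                # Also catches space-delimited rows, which yield one field.
                reason = f"{len(tab_fields)} tab-separated fields"
            elif any(field.strip() == "" for field in tab_fields):
                empty = [i for i, f in enumerate(tab_fields) if f.strip() == ""]
                reason = f"empty field(s) at column index {empty}"
            else:
                continue  # row looks fine
            problems.append((line_number, reason))
            if len(problems) >= limit:
                break
    return problems
```

Calling find_malformed_rows on the .var.gz path returns 1-based line numbers, so a row with an empty CHROM would be reported with column index 1. Looking only at the head of the file would miss rows like this when they occur further down.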
