BGI data was not suitable for version 2.0.7 #276

Open
biozhao77 opened this issue Mar 5, 2024 · 7 comments

@biozhao77

STAR
--genomeDir ./celescope/reference
--readFilesIn read_2.fq.gz read_1.fq.gz
--readFilesCommand zcat
--soloCBwhitelist ./scopeV2.2.1/bclist ./scopeV2.2.1/bclist ./scopeV2.2.1/bclist
--soloCellFilter EmptyDrops_CR 3000 0.99 10 45000 90000 500 0.01 20000 0.001 10000
--outFileNamePrefix .
--runThreadN 10
--clip3pAdapterSeq AAAAAAAAAAAA
--outFilterMatchNmin 50
--soloFeatures Gene GeneFull_Ex50pAS
--outSAMattributes NH HI nM AS CR UR CB UB GX GN
--soloType CB_UMI_Complex
--soloCBposition 0_0_0_7 0_24_0_31 0_48_0_55
--soloUMIposition 0_57_0_68 --soloUMIlen 12
--soloCBmatchWLtype 1MM
--outSAMtype BAM SortedByCoordinate
--soloCellReadStats Standard
--soloBarcodeReadLength 0

When I used version 2.0.7 to analyze BGI data (the STARsolo step is shown above), mapping and counting finished normally. However, the error "IndexError: sequence length is not enough in R1 read" was raised when executing 'celescope.tools.starsolo.get_Q30_cb_UMI'.

I think it is related to the parameters 'soloCBposition' and 'soloUMIposition' (the barcode read is 68 bp, and the library was built with Singleron kit v1), but I'm not sure, so I need your help.

Thank you!

@zhouyiqi91
Collaborator

https://github.com/singleron-RD/CeleScope/blob/master/doc/chemistry.md
Singleron kit v1 corresponds to chemistry scopeV2.2.1, which requires an R1 of at least 69 bp. If your R1 is 68 bp, the last 1 bp of the UMI will be thrown away.
Changing --soloUMIposition 0_57_0_68 --soloUMIlen 12 to --soloUMIposition 0_57_0_67 --soloUMIlen 11 should work.
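
The arithmetic behind this change can be checked with a short sketch. It assumes each position string reads as startAnchor_startOffset_endAnchor_endOffset, with anchor 0 meaning the read start and offsets being 0-based and inclusive. The helpers `span` and `required_read_length` are hypothetical names used only for illustration; they are not part of STAR or CeleScope.

```python
def span(position: str) -> tuple[int, int]:
    """Return 0-based inclusive (start, end) offsets for a read-start-anchored position."""
    start_anchor, start, end_anchor, end = (int(x) for x in position.split("_"))
    if start_anchor != 0 or end_anchor != 0:
        raise ValueError("this sketch only handles positions anchored at the read start")
    return start, end


def required_read_length(positions: list[str]) -> int:
    """Smallest read length that covers every listed position."""
    return max(span(p)[1] for p in positions) + 1


cb = ["0_0_0_7", "0_24_0_31", "0_48_0_55"]  # barcode positions from the command above

original = required_read_length(cb + ["0_57_0_68"])  # 12-base UMI
adjusted = required_read_length(cb + ["0_57_0_67"])  # 11-base UMI
umi_start, umi_end = span("0_57_0_67")
umi_len_adjusted = umi_end - umi_start + 1

print(original, adjusted, umi_len_adjusted)  # 69 68 11
```

Under these assumptions the original UMI position needs a 69 bp R1, while the adjusted one fits a 68 bp R1 with an 11-base UMI.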

@zhouyiqi91
Collaborator

In multi_rna, use '--chemistry customized --pattern C8L16C8L16C8L1U11T18 --whitelist "./scopeV2.2.1/bclist ./scopeV2.2.1/bclist ./scopeV2.2.1/bclist"'
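
To see how this pattern lines up with the STARsolo positions, here is a minimal sketch that expands a pattern string into offsets. It assumes the letters mean C = cell barcode segment, L = linker, U = UMI and T = poly T, and that each number is a segment length in bases. `parse_pattern` is a hypothetical helper and is not CeleScope's own parser.

```python
import re


def parse_pattern(pattern: str) -> list[tuple[str, int, int]]:
    """Split a pattern such as 'C8L16U12' into (letter, start, end) tuples.

    Offsets are 0-based and end-exclusive.
    """
    segments = []
    offset = 0
    for letter, length in re.findall(r"([A-Z])(\d+)", pattern):
        segments.append((letter, offset, offset + int(length)))
        offset += int(length)
    return segments


segments = parse_pattern("C8L16C8L16C8L1U11T18")
barcodes = [(start, end) for letter, start, end in segments if letter == "C"]
umi = [(start, end) for letter, start, end in segments if letter == "U"]

print(barcodes)  # [(0, 8), (24, 32), (48, 56)]
print(umi)       # [(57, 68)]
```

In inclusive terms these are bases 0-7, 24-31 and 48-55 for the barcode and 57-67 for the UMI, which matches --soloCBposition 0_0_0_7 0_24_0_31 0_48_0_55 and --soloUMIposition 0_57_0_67.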

@biozhao77
Author

Thanks for your advice! I found that the linker in scopeV2.2.1 is "TCGGTGACAGCCATATCGTAGTCAGAAGCTGAC", and its last base "C" is not sequenced in the BGI data. So I think the "L1" in '--pattern C8L16C8L16C8L1U11T18' is not exact.

@zhouyiqi91
Collaborator

zhouyiqi91 commented Mar 5, 2024 via email

If so, just remove the L1 and use U12.

@biozhao77
Author

I'm trying it. But I have another question: will the last base "C" be thrown away when L1 is removed? I ask because I noticed that the linker is 33 bp including the "C".

@zhouyiqi91
Collaborator

The latest pipeline actually does NOT use the sequence of the linker. It just uses the --pattern to locate the barcode segments and checks whether the barcode segments match the whitelist.
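
A rough sketch of what position-only extraction means in practice is shown below. The read, the linker placeholder and the three-entry whitelist are made-up illustration data, and `segments_of` and `extract` are hypothetical helpers, not CeleScope functions. The pattern C8L16C8L16C8L1U12 is used here only as a stand-in for a layout that needs 69 bases.

```python
import re


def segments_of(pattern: str, letter: str) -> list[tuple[int, int]]:
    """0-based, end-exclusive (start, end) offsets of every segment tagged with `letter`."""
    found, offset = [], 0
    for symbol, length in re.findall(r"([A-Z])(\d+)", pattern):
        if symbol == letter:
            found.append((offset, offset + int(length)))
        offset += int(length)
    return found


def extract(read: str, pattern: str) -> tuple[list[str], str]:
    """Slice barcode parts and the UMI out of a read by position only."""
    barcode_spans = segments_of(pattern, "C")
    umi_spans = segments_of(pattern, "U")
    needed = max(end for _, end in barcode_spans + umi_spans)
    if len(read) < needed:
        raise IndexError(f"read has {len(read)} bases, pattern needs {needed}")
    barcode_parts = [read[start:end] for start, end in barcode_spans]
    umi = "".join(read[start:end] for start, end in umi_spans)
    return barcode_parts, umi


# Made-up 68-base read: three 8-base barcode parts, two 16-base linkers, 12-base UMI.
whitelist = {"AAAACCCC", "GGGGTTTT", "ACGTACGT"}
linker = "N" * 16  # linker content is never inspected
read = "AAAACCCC" + linker + "GGGGTTTT" + linker + "ACGTACGT" + "TTGCATTGCATG"

parts, umi = extract(read, "C8L16C8L16C8U12")
valid = all(part in whitelist for part in parts)
print(parts, umi, valid)

# A pattern that needs 69 bases fails on the same 68-base read.
try:
    extract(read, "C8L16C8L16C8L1U12")
    too_short = False
except IndexError:
    too_short = True
print(too_short)
```

Because the linker bases are skipped rather than compared, dropping L1 from the pattern only shifts where the UMI is read from; no linker base is matched or discarded on its own.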
