demux_isoseq_with_genome.py Issue - expecting input not generated by get_abundance_post_collapse.py #223

KrRi4 · 2022-05-19T01:46:28Z

Hi,

I was hoping someone could help me solve/clarify the issue I'm having below.

First, I am running cDNA_Cupcake v28.0.0, taken from README:

cDNA_Cupcake

Last Updated: 09.15.2021
Current version: 28.0.0

I have 2 IsoSeq CCS files that I ran through steps 1-7 outlined here: #167. I would like to obtain the sample specific counts which I believe can be generated using the demux_isoseq_with_genome.py script.

In order to obtain the read stat file for the above script, I ran get_abundance_post_collapse.py.

Here are the first few lines of the required input files for demux_isoseq_with_genome.py script:

collapsed.mapped.read_stat.txt
id is_fl stat pbid
1_m54328U_191216_220556/90243745/ccs Y unique PB.7953.2
1_m54328U_191216_220556/64751804/ccs Y unique PB.7953.2
1_m54328U_191216_220556/83690194/ccs Y unique PB.7953.2
1_m54328U_191216_220556/53478253/ccs Y unique PB.5007.3
2_m54328U_191216_220556/115672586/ccs Y unique PB.642.8

all_report_flnc.csv
id,strand,fivelen,threelen,polyAlen,insertlen,primer
1_m54328U_191216_220556/4/ccs,+,22,24,33,4386,1_NEB_5p--1_NEB_Clontech_3p
1_m54328U_191216_220556/11/ccs,+,21,24,30,5525,1_NEB_5p--1_NEB_Clontech_3p
1_m54328U_191216_220556/12/ccs,+,22,25,32,4805,1_NEB_5p--1_NEB_Clontech_3p
1_m54328U_191216_220556/14/ccs,+,25,22,30,7448,1_NEB_5p--1_NEB_Clontech_3p
2_m54328U_191216_220556/180554526/ccs,+,22,25,29,3578,2_NEB_5p--2_NEB_Clontech_3p

collapsed_mapped.collapsed.rep.fq
@PB.1.1|chr1:14401-29364(-)|transcript/77472 transcript/77472
GAGAGCGGAAGC....
@PB.1.2|chr1:14401-29359(-)|transcript/121124 transcript/121124
GGAAGCGG....

I get the following warning when I run the script... and the resulting output file is empty.

demux_isoseq_with_genome.py --mapped_fafq collapsed_mapped.collapsed.rep.fq --read_stat collapsed_mapped.collapsed.read_stat.txt --classify_csv all_report_flnc.csv -o test
Reading all_report_flnc.csv....
Reading collapsed_mapped.collapsed.read_stat.txt....
Reading collapsed_mapped.collapsed.rep.fq....
WARNING: Could not find PB.1.1|chr1:14401-29363(-)|transcript/77472 in collapsed_mapped.collapsed.read_stat.txt. No demux output for this sequence!
WARNING: Could not find PB.1.2|chr1:14401-29355(-)|transcript/121124 in collapsed_mapped.collapsed.read_stat.txt. No demux output for this sequence!

It appears the script is looking for the entire P.B.x.y.|chr... string in the read.stat file.
Now, if I remove everything after the P.B.x.y with sed 's/|[a-z].*//g' collapsed_mapped.collapsed.rep.fq > testingSyntax.fq and use testingSnytax.fq for the --mapped file in demux_isoseq_with_genome.py it works!

I wanted to make sure I haven't done something off with this process up until this point, or if this is an issue with the script? So any clarification would be helpful.

Thank you!
Kr

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

demux_isoseq_with_genome.py Issue - expecting input not generated by get_abundance_post_collapse.py #223

demux_isoseq_with_genome.py Issue - expecting input not generated by get_abundance_post_collapse.py #223

KrRi4 commented May 19, 2022 •

edited

demux_isoseq_with_genome.py Issue - expecting input not generated by get_abundance_post_collapse.py #223

demux_isoseq_with_genome.py Issue - expecting input not generated by get_abundance_post_collapse.py #223

Comments

KrRi4 commented May 19, 2022 • edited

cDNA_Cupcake

KrRi4 commented May 19, 2022 •

edited