This repository has been archived by the owner on Mar 16, 2022. It is now read-only.

Does Falcon work with .bas.h5 and .bax.h5 files? #232

Closed
JohnYu-yuzhongyi opened this issue Oct 27, 2015 · 5 comments

@JohnYu-yuzhongyi

Hi,

I'm new to bioinformatics, so I apologise if my question is stupid.

I'm trying to use Falcon to assemble my PacBio reads. In the 'input.fofn' file, should I list all my .bax.h5 and .bas.h5 files? (I have 2 .bas.h5 files and 6 .bax.h5 files.) Or should I convert them into .fasta files before I use Falcon?

I tried to run falcon with those .h5 files but an error occurred.

[INFO](local) '/home/zhongyi/pacbio/0-rawreads/prepare_rdb.sh'
[WARNING]Call 'bash /home/zhongyi/pacbio/0-rawreads/prepare_rdb.sh 1> /home/zhongyi/pacbio/0-rawreads/prepare_rdb.sh.log 2>&1' returned 256.
[ERROR]Contents of '/home/zhongyi/pacbio/0-rawreads/prepare_rdb.sh.log':
trap 'touch /home/zhongyi/pacbio/0-rawreads/rdb_build_done.exit' EXIT
+ trap 'touch /home/zhongyi/pacbio/0-rawreads/rdb_build_done.exit' EXIT
cd /home/zhongyi/pacbio/0-rawreads
+ cd /home/zhongyi/pacbio/0-rawreads
hostname
+ hostname
vinculin
date
+ date
Tue Oct 27 16:38:55 GMT 2015
fasta2DB -v raw_reads -f/home/zhongyi/pacbio/0-rawreads/input.fofn
+ fasta2DB -v raw_reads -f/home/zhongyi/pacbio/0-rawreads/input.fofn
fasta2DB: Cannot open /home/zhongyi/pacbio/data/1_p0.bas.h5.fasta for 'r'

I noticed that a program called fasta2DB was trying to open a .bas.h5.fasta file, which doesn't exist. Does that mean the program assumes all input files are FASTA files?
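
A quick pre-flight check on the fofn can surface this kind of problem before the pipeline starts. The sketch below is only an illustration: the function name `check_fofn` is hypothetical, and the assumption that every entry should end in `.fasta` is inferred from the log above rather than taken from the fasta2DB documentation.

```python
from pathlib import Path


def check_fofn(fofn_path, suffix=".fasta"):
    """Return (entry, problem) pairs for fofn entries that look unusable.

    Assumption: every entry should be an existing file whose name ends
    in `suffix`, as the fasta2DB error in the log suggests.
    """
    problems = []
    for line in Path(fofn_path).read_text().splitlines():
        entry = line.strip()
        if not entry:
            continue  # ignore blank lines
        if not entry.endswith(suffix):
            problems.append((entry, "does not end in " + suffix))
        elif not Path(entry).is_file():
            problems.append((entry, "file not found"))
    return problems
```

An empty list means every entry passed both checks; a path such as `1_p0.bas.h5` would be reported as not ending in `.fasta`.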

@mseetin

mseetin commented Oct 27, 2015

Hi John,

You need to run an RS_Subreads job either in SMRT Portal or SMRT Pipe to generate subreads.fasta files for each of your SMRT cells. Then list the full paths to those in your fofn.
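
Once the subreads FASTA files exist, the fofn can be generated rather than typed by hand. This is a minimal sketch, assuming the files sit somewhere under one directory; the function name `write_fofn` and the glob pattern `*subreads.fasta` are placeholders, so adjust the pattern to whatever your RS_Subreads jobs actually produce.

```python
from pathlib import Path


def write_fofn(search_dir, fofn_path, pattern="*subreads.fasta"):
    """Write the absolute path of every matching file, one per line.

    `pattern` is an assumed naming convention, not something the
    pipeline guarantees; change it to match your own output files.
    """
    paths = sorted(
        str(p.resolve())
        for p in Path(search_dir).rglob(pattern)
        if p.is_file()
    )
    Path(fofn_path).write_text("".join(p + "\n" for p in paths))
    return paths
```

Calling `write_fofn("/path/to/jobs", "input.fofn")` (both paths hypothetical) returns the list of paths it wrote, which makes it easy to confirm that one file per SMRT cell was found.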

@JohnYu-yuzhongyi
Author

Thanks @mseetin !

But if I only use fasta files, does that mean I'm losing all the other information like sequence quality etc.?

@mseetin

mseetin commented Oct 27, 2015

Hi John,

Falcon doesn't use that information. Quality values aren't as useful for the overall assembly as they are for the final sequence polishing with Quiver, and Falcon doesn't perform the polishing step. Instead, what you'll need to do is concatenate the p_ctg.fa and a_ctg.fa files you get from Falcon into one fasta file, upload that as a reference into SMRT Analysis, and then run an RS_Resequencing job using your SMRT cells as input and your new file as the reference. This will produce a polished consensus sequence that is approximately QV 50. The resequencing job takes the full .bax.h5 files as input and uses all the sequence quality information to produce the consensus.
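
The concatenation step can be sketched as follows. The function name `concat_fasta` and the output file name in the usage note are hypothetical; the only logic beyond plain concatenation is a guard that adds a newline when an input file does not end with one, so the last sequence of one file never runs into the first header of the next.

```python
def concat_fasta(inputs, output, chunk_size=1 << 20):
    """Concatenate FASTA files in order, keeping records on separate lines."""
    with open(output, "wb") as out:
        for path in inputs:
            last = b"\n"  # an empty input needs no separator
            with open(path, "rb") as src:
                for chunk in iter(lambda: src.read(chunk_size), b""):
                    out.write(chunk)
                    last = chunk[-1:]
            if last != b"\n":
                out.write(b"\n")  # keep the next '>' header on its own line
```

For example, `concat_fasta(["p_ctg.fa", "a_ctg.fa"], "falcon_contigs.fasta")` would write primary contigs followed by associated contigs, assuming both files are in the current directory.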

@JohnYu-yuzhongyi
Author

Thank you so much @mseetin! I'll give it a try

@pb-cdunn

Linked from FAQ. Thanks, @mseetin.
