paf file-size estimate #8

Open
MichelMoser opened this issue Feb 26, 2019 · 2 comments

Comments

@MichelMoser

Hi,

I am excited to test CONSENT with a nanopore dataset of about 60x coverage of a 600 Mb genome. It's about 2.8 million reads (41 Gb total length). Unfortunately, the all-vs-all alignment output grows very fast, and I had to terminate the run after the PAF file reached 2.1 terabytes.
Is there an estimate of how much temporary storage is needed for such a dataset?

In your bioRxiv publication, you ran CONSENT on 30x human data. What was the file size of the all-vs-all alignments there?

Cheers,
Michel

@morispi
Owner

morispi commented Feb 26, 2019

Hi,

I'm afraid I can't precisely answer the question about the estimated size of the PAF file for your dataset.

The 30x human dataset consisted only of reads from chr1.
The reads file was therefore 7 GB, and it resulted in a 171 GB PAF file.
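
As a rough sanity check on these figures, the short Python sketch below turns them into a back-of-the-envelope estimate. It is not a CONSENT feature, and it rests on loudly stated assumptions: that 41 Gb of bases corresponds to roughly 41 GB of sequence on disk, that PAF size grows in proportion to input size, and (in the second estimate) that it also grows in proportion to coverage, since each read overlaps more reads at higher depth. Repeat content and read-length distribution are ignored, so the result is an order of magnitude at best. The function name and its parameters are hypothetical.

```python
# Figures quoted in this thread.
CHR1_READS_GB = 7.0    # reads file of the 30x chr1 dataset
CHR1_PAF_GB = 171.0    # PAF file produced from it
CHR1_COVERAGE = 30.0

# Assumption: 41 Gb of bases is roughly 41 GB of sequence on disk.
NEW_READS_GB = 41.0
NEW_COVERAGE = 60.0


def estimate_paf_gb(reads_gb, coverage=CHR1_COVERAGE):
    """Scale the chr1 PAF/reads ratio by input size and by relative coverage.

    With the default coverage this is a purely linear extrapolation.
    """
    ratio = CHR1_PAF_GB / CHR1_READS_GB
    return reads_gb * ratio * (coverage / CHR1_COVERAGE)


ratio = CHR1_PAF_GB / CHR1_READS_GB
linear_gb = estimate_paf_gb(NEW_READS_GB)
coverage_gb = estimate_paf_gb(NEW_READS_GB, NEW_COVERAGE)

print(f"PAF/reads ratio on chr1: {ratio:.1f}x")
print(f"linear estimate: {linear_gb:.0f} GB")
print(f"coverage-adjusted estimate: {coverage_gb:.0f} GB")
```

The linear extrapolation gives about 1 TB, which the run reported above had already exceeded when it was stopped at 2.1 TB. The coverage-adjusted figure of about 2 TB is closer, but the actual final size is unknown, so neither number should be read as a storage requirement.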

Cheers,
Pierre

@harish0201

This reply might be a bit late, but @MichelMoser @morispi, an easy way out might be to gzip the PAF file and then decompress and stream it into the next step?
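
The compress-and-stream idea can be sketched in Python as follows. This is a minimal illustration, not part of CONSENT: whether CONSENT itself can read compressed or streamed input is not stated in this thread. The `Overlap` type and the `stream_paf` function are hypothetical names; the only things taken as given are the standard `gzip` module and the PAF layout of 12 mandatory tab-separated columns, of which the first nine are parsed here.

```python
import gzip
from typing import Iterator, NamedTuple


class Overlap(NamedTuple):
    """The first nine mandatory PAF columns (hypothetical helper type)."""
    query: str
    query_len: int
    query_start: int
    query_end: int
    strand: str
    target: str
    target_len: int
    target_start: int
    target_end: int


def stream_paf(path: str) -> Iterator[Overlap]:
    """Yield overlaps one at a time from a gzip-compressed PAF file.

    The file is decompressed on the fly, so neither the uncompressed
    file nor the full list of overlaps has to exist on disk or in memory.
    """
    with gzip.open(path, "rt") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 12:
                continue  # skip blank or truncated lines
            yield Overlap(
                fields[0], int(fields[1]), int(fields[2]), int(fields[3]),
                fields[4],
                fields[5], int(fields[6]), int(fields[7]), int(fields[8]),
            )
```

A consumer would iterate with `for overlap in stream_paf("overlaps.paf.gz"):` (the file name is a placeholder). The trade-off is extra CPU time for compression and decompression in exchange for less temporary storage, and how much space is saved depends on the data.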
