supplying multiple BAM files #37

Open
rajitz opened this issue Mar 31, 2020 · 5 comments

Comments

@rajitz

rajitz commented Mar 31, 2020

Hi,

I want to supply multiple BAM files to the cnv tool with the -I argument. This works with *bam, but I now want to limit it to certain BAM files in the folder without moving them to a separate folder. I tried giving it a text file listing one BAM file per line, but that did not work. Please let me know the correct format for supplying multiple BAM files, i.e., how they have to be separated within the list of files given.

Thanks very much,
Rajat

@heuermh
Member

heuermh commented Mar 31, 2020

Hello @rajitz!

Spark uses Hadoop file system glob syntax for multiple files; see e.g.
https://hadoop.apache.org/docs/r2.4.1/api/org/apache/hadoop/fs/FileSystem.html#globStatus(org.apache.hadoop.fs.Path)
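
For illustration, here is a minimal sketch of building such a glob for a hand-picked subset of files, using the `{a,b}` alternation from the Hadoop glob syntax linked above. The helper name `brace_glob` and the example directory are hypothetical and not part of DECA or Hadoop; the helper only assembles the string you would pass to -I.

```python
def brace_glob(directory, names):
    """Build a Hadoop-style brace glob such as /data/{a.bam,b.bam}.

    Hypothetical helper for illustration; it only formats a string.
    """
    if not names:
        raise ValueError("need at least one file name")
    for name in names:
        # Braces and commas are glob metacharacters, so this sketch
        # refuses names that contain them instead of trying to escape them.
        if any(ch in name for ch in "{},"):
            raise ValueError(f"unsupported character in file name: {name!r}")
    return directory.rstrip("/") + "/{" + ",".join(names) + "}"
```

Calling `brace_glob("/data/bams", ["file1.bam", "file2.bam"])` gives `/data/bams/{file1.bam,file2.bam}`. Quote the result on the command line so that your shell does not expand the braces before Spark sees them.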

@rajitz
Author

rajitz commented Apr 3, 2020

That worked well, thank you!

Is it also possible to supply the BAMs in a separate file where they are all listed in a compatible format, instead of doing {file1.bam,file2.bam,file3.bam} ?

Thanks.

@heuermh
Member

heuermh commented Apr 3, 2020

If you're using a file with a list of BAMs, it looks like the format should be one BAM file per line.

See
https://github.com/bigdatagenomics/deca/blob/master/deca-cli/src/main/scala/org/bdgenomics/deca/cli/Coverager.scala#L81

so your args should include -I list_of_bams.txt -l.

If you can't easily tell the difference between -{uppercase 'I'} and -{lowercase 'l'}, neither can I. Whoever came up with those argument names ought to be shot. 😉
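
As a rough sketch of the list-file approach described in this comment, the snippet below writes one BAM path per line and returns the two arguments mentioned above (-I with the list file, plus -l). The function name `write_bam_list` and the paths are hypothetical; only the -I and -l arguments come from this thread, and the rest of the command line is left out because it depends on your setup.

```python
from pathlib import Path


def write_bam_list(bam_paths, list_path):
    """Write one BAM path per line and return the matching arguments.

    Hypothetical helper for illustration; it does not run the tool.
    """
    lines = [str(p).strip() for p in bam_paths]
    if not lines or any(not line for line in lines):
        raise ValueError("every entry must be a non-empty path")
    # One path per line, with a trailing newline at the end of the file.
    Path(list_path).write_text("\n".join(lines) + "\n")
    # Uppercase -I takes the list file; lowercase -l marks it as a list.
    return ["-I", str(list_path), "-l"]
```

For example, `write_bam_list(["/data/a.bam", "/data/b.bam"], "list_of_bams.txt")` creates the file and returns `["-I", "list_of_bams.txt", "-l"]`, which you can append to the rest of your command.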

@rajitz
Author

rajitz commented Apr 3, 2020

Haha. That worked like a charm! Have a good weekend. Rajat

@rajitz
Author

rajitz commented Apr 15, 2020

Hi again, is there a way to call CNVs on only the first few samples listed, instead of all of them? We don't want to spend time calling CNVs on the reference set of samples. Thank you.
