Specify output folder name #11

Open
alezanalp opened this issue Mar 17, 2017 · 10 comments

Comments

@alezanalp

Is it possible to add an option for specifying the name of the output folder that holds the report files, rather than deriving it from the input file names?

Yours faithfully,
Katerina

@sfchen
Member

sfchen commented Mar 17, 2017

AfterQC is designed to run in batch mode, so it normally creates a QC folder, and within the QC folder there is one folder per input fastq file.

You can change the name QC to report by specifying -r report on the command line.

Then the dir tree will be like:

report/
└── R1.fq
    ├── report.html
    └── report.json

So your requirement is to leave out the 'R1.fq' folder inside the report folder, so that the dir tree looks like:

report/
├── report.html
└── report.json

Am I right?
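
For illustration, the two layouts being compared can be sketched in Python as below. The helper name `report_paths` and its `nested` parameter are hypothetical and not part of AfterQC; only the folder and file names are taken from the two trees above.

```python
import os

def report_paths(report_folder, read1_name, nested=True):
    """Return (html_path, json_path) for one input file.

    nested=True  -> report/R1.fq/report.html (per-input folder layout)
    nested=False -> report/report.html       (the layout being asked about)
    Hypothetical helper for illustration only; this is not AfterQC code.
    """
    base = os.path.join(report_folder, read1_name) if nested else report_folder
    return os.path.join(base, "report.html"), os.path.join(base, "report.json")

nested_paths = report_paths("report", "R1.fq", nested=True)
flat_paths = report_paths("report", "R1.fq", nested=False)
print(nested_paths)
print(flat_paths)
```

Note that the flat variant only works for a single sample per report folder, since a second input would overwrite report.html and report.json.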

@alezanalp
Author

Yes, so that the user can specify the "report" folder for each pair manually, for example when running in -1 -2 mode.

@serge2016

I think it would be perfect if the user could specify a prefix for the reports, e.g. --report-prefix=/path/to/dir/filename, and then get:

/path/to/dir/
├── filename.html
└── filename.json
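
A minimal sketch of what such a prefix option could do, assuming it simply appends the two extensions to whatever the user passes. The function name `prefixed_report_paths` is hypothetical, and --report-prefix is only the option proposed in this comment, not an existing AfterQC flag.

```python
def prefixed_report_paths(report_prefix):
    """Build both report file names from one user-supplied prefix.

    Hypothetical helper: the prefix carries both the target directory
    and the base file name, so no input file name is involved.
    """
    return report_prefix + ".html", report_prefix + ".json"

html_path, json_path = prefixed_report_paths("/path/to/dir/filename")
print(html_path)
print(json_path)
```

With this design the caller, not the tool, decides the output names, which is what makes it convenient for workflow engines that need to know output paths in advance.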

@sfchen
Member

sfchen commented Mar 17, 2017

@alezanalp do you agree with @serge2016 ?

sfchen added a commit that referenced this issue Mar 17, 2017
@sfchen
Member

sfchen commented Mar 17, 2017

I have submitted a commit to implement @serge2016's idea. You can pull or download the latest master to give it a try.

Now, you will get

QC
├── filename1.fq.html
├── filename1.fq.json
├── filename2.fq.html
└── filename2.fq.json
...

You can change the folder name from QC to report by specifying -r report, and you can also specify an absolute path with -r /path/to/dir/
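
As a rough sketch of the flat layout described here: each report is named after its input file and placed directly in the report folder. The function `flat_report_paths` and its `cwd` parameter are hypothetical, and resolving a relative -r value against the working directory is an assumption rather than confirmed AfterQC behavior. POSIX-style paths are assumed.

```python
import os

def flat_report_paths(input_fastq, report_folder="QC", cwd=None):
    """Return (html_path, json_path) for the flat layout.

    Hypothetical helper, not AfterQC code. A relative report_folder is
    resolved against cwd (an assumption); an absolute one is used as-is,
    because os.path.join discards earlier parts when a later part is absolute.
    """
    cwd = cwd or os.getcwd()
    folder = os.path.join(cwd, report_folder)
    name = os.path.basename(input_fastq)
    return os.path.join(folder, name + ".html"), os.path.join(folder, name + ".json")

default_paths = flat_report_paths("data/filename1.fq", cwd="/work")
renamed_paths = flat_report_paths("data/filename1.fq", "report", cwd="/work")
absolute_paths = flat_report_paths("data/filename1.fq", "/path/to/dir/", cwd="/work")
print(default_paths)
print(renamed_paths)
print(absolute_paths)
```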

@alezanalp
Author

@sfchen Yes, I agree with @serge2016. Thank you for the prompt reply. I will try it.

@serge2016

serge2016 commented Mar 17, 2017

There is one more "issue" or bug with this in v0.9.0:
If I run after.py --read1_file=SRR3184279_1.fastq.gz --read2_file=SRR3184279_2.fastq.gz --read1_flag=_1 --read2_flag=_2 --qc_only then everything is OK:

$(pwd)/QC/
└── SRR3184279_1.fastq.gz
    ├── report.html
    └── report.json

But if I run after.py --read1_file=SRR3184279_1.fastq.gz --read2_file=SRR3184279_2.fastq.gz --read1_flag=_1 --read2_flag=_2 --qc_only --report_output_folder=$(pwd) then I get:

SRR3184279_1.fastq.gz options:
{'qc_only': True, 'version': '0.9.0', 'seq_len_req': 35, 'index1_file': None, 'trim_tail': 0, 'report_output_folder': '/home/bg/kate/AfterQC/PE_reads/', 'trim_pair_same': True, 'no_correction': False, 'debubble_dir': 'debubble', 'barcode_flag': 'barcode', 'read2_file': 'SRR3184279_2.fastq.gz', 'barcode_length': 12, 'trim_tail2': 0, 'unqualified_base_limit': 60, 'allow_mismatch_in_poly': 2, 'read2_flag': '_2', 'store_overlap': False, 'debubble': False, 'read1_flag': '_1', 'index2_flag': 'I2', 'draw': True, 'index1_flag': 'I1', 'mask_mismatch': False, 'barcode': False, 'overlap_output_folder': None, 'barcode_verify': 'CAGTA', 'index2_file': None, 'qualified_quality_phred': 15, 'trim_front': 9, 'good_output_folder': 'good', 'poly_size_limit': 35, 'n_base_limit': 5, 'qc_sample': 200000, 'trim_front2': 9, 'no_overlap': False, 'input_dir': None, 'read1_file': 'SRR3184279_1.fastq.gz', 'qc_kmer': 8, 'bad_output_folder': None}

Traceback (most recent call last):
  File "/home/bg/soft/AfterQC-0.9.0/after.py", line 221, in <module>
    main()
  File "/home/bg/soft/AfterQC-0.9.0/after.py", line 215, in main
    processOptions(options)
  File "/home/bg/soft/AfterQC-0.9.0/after.py", line 171, in processOptions
    filter.run()
  File "/home/bg/soft/AfterQC-0.9.0/preprocesser.py", line 709, in run
    stat_file = open(os.path.join(qc_dir, "report.json"), "w")
IOError: [Errno 20] Not a directory: '/home/bg/kate/AfterQC/PE_reads/SRR3184279_1.fastq.gz/report.json'

This error occurs if I set the -r dir to the same dir that I run AfterQC from.

@sfchen
Member

sfchen commented Mar 17, 2017

@serge2016 this issue occurs because v0.9.0 needs to create a folder with the same name as the R1 fastq file, so that folder conflicts with the fastq file itself if $(pwd) is specified as report_output_folder.

I believe this issue is gone with the last commit.
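
The conflict can be reproduced outside AfterQC with a small sketch like the one below. It only mimics the path construction visible in the traceback (report folder, then the R1 file name, then report.json); the helper names `per_sample_dir` and `collides_with_input` are hypothetical, and the empty file is a stand-in for a real fastq.

```python
import os
import tempfile

def per_sample_dir(report_output_folder, read1_file):
    # v0.9.0-style layout as seen in the traceback:
    # <report folder>/<R1 file name>/report.json
    return os.path.join(report_output_folder, os.path.basename(read1_file))

def collides_with_input(report_output_folder, read1_file):
    # True when the folder that would be created is already a regular file.
    return os.path.isfile(per_sample_dir(report_output_folder, read1_file))

with tempfile.TemporaryDirectory() as workdir:
    fastq = os.path.join(workdir, "SRR3184279_1.fastq.gz")
    open(fastq, "w").close()  # empty stand-in for the real input file

    # Report folder == folder containing the input: the names clash.
    same_dir_collides = collides_with_input(workdir, fastq)
    # Default-style QC subfolder: no clash.
    default_qc_collides = collides_with_input(os.path.join(workdir, "QC"), fastq)

    try:
        with open(os.path.join(per_sample_dir(workdir, fastq), "report.json"), "w"):
            pass
        open_failed = False
    except OSError as exc:
        # The fastq file sits where a directory is expected.
        open_failed = True
        print(type(exc).__name__)

print(same_dir_collides, default_qc_collides, open_failed)
```

A check like `collides_with_input` before writing would let a tool fail with a clear message instead of an IOError deep in the report code.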

@sfchen
Member

sfchen commented Mar 17, 2017

I just released v0.9.1. You can try the new feature described above.

@serge2016

The behavior is now more predictable :) Thank you!
But I am still thinking about the case where we specify the -1 and -2 options: in this mode we have only one sample, so we could specify the full output name for the report.

I simply want to use your tool inside a CWL environment, and that is easier to do if output filenames can be specified independently of the input filenames.


3 participants