Specify output folder name #11

alezanalp · 2017-03-17T08:38:29Z

Is it possible to add an option for specifying the output folder name for the report files, rather than deriving it from the input file names?

Yours faithfully,
Katerina

sfchen · 2017-03-17T09:35:01Z

AfterQC is designed to run in batch mode. So normally AfterQC creates a QC folder, and inside the QC folder there is one folder per input fastq file.

You can change the name QC to report by specifying -r report in the command line.

Then the dir tree will be like:

report/
└── R1.fq
    ├── report.html
    └── report.json

So, your requirement is to drop the 'R1.fq' folder inside the report folder, so that the dir tree looks like:

report/
├── report.html
└── report.json

Am I right?

alezanalp · 2017-03-17T09:41:08Z

Yeah, so the user can specify the "report" folder for each pair manually when running in -1 -2 mode, for example.

serge2016 · 2017-03-17T09:43:11Z

I think it would be perfect if the user could specify a prefix for the reports, e.g. --report-prefix=/path/to/dir/filename and then:

/path/to/dir/
├── filename.html
└── filename.json
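
The prefix idea can be sketched as below. This is only an illustration of the proposal: `report_paths` is a hypothetical helper, not a function in AfterQC, and `--report-prefix` is the option suggested in this comment rather than an existing flag.

```python
def report_paths(prefix):
    """Return (html_path, json_path) for a report prefix.

    Hypothetical helper for illustration only; AfterQC does not provide it.
    """
    # The prefix already carries the directory and the base name,
    # so the two report files differ only in their extension.
    return prefix + ".html", prefix + ".json"


html_path, json_path = report_paths("/path/to/dir/filename")
print(html_path)
print(json_path)
```

With a prefix, the report names no longer depend on the input file names at all.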

sfchen · 2017-03-17T10:13:29Z

@alezanalp do you agree with @serge2016 ?

sfchen · 2017-03-17T10:34:42Z

I have submitted a commit to implement @serge2016 's idea. You can pull or download the latest master to have a try.

Now, you will get

QC
├── filename1.fq.html
├── filename1.fq.json
├── filename2.fq.html
└── filename2.fq.json
...

And you can change the folder name from QC to report by specifying -r report. You can also specify an absolute path with -r /path/to/dir/
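
The flat layout described above can be sketched as follows. This is an assumption about how the names are derived, based only on the tree shown in this comment. `flat_report_paths` is a hypothetical name and is not taken from the AfterQC source.

```python
import os


def flat_report_paths(read1_file, report_dir="QC"):
    """Return (html_path, json_path) for the flat report layout.

    Hypothetical helper: the report files sit directly in report_dir and are
    named after the input fastq file, as in the tree shown in the thread.
    """
    name = os.path.basename(read1_file)
    return (
        os.path.join(report_dir, name + ".html"),
        os.path.join(report_dir, name + ".json"),
    )


# Default folder name, then the folder changed the way -r report would.
print(flat_report_paths("filename1.fq"))
print(flat_report_paths("filename1.fq", report_dir="report"))
```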

alezanalp · 2017-03-17T10:44:10Z

@sfchen Yes, I agree with @serge2016. Thank you for the prompt reply. I will try it.

serge2016 · 2017-03-17T13:13:30Z

There is one more "issue" or bug with this in v0.9.0:
If I run after.py --read1_file=SRR3184279_1.fastq.gz --read2_file=SRR3184279_2.fastq.gz --read1_flag=_1 --read2_flag=_2 --qc_only then I get everything ok:

$(pwd)/QC/
└── SRR3184279_1.fastq.gz
    ├── report.html
    └── report.json

But if I run after.py --read1_file=SRR3184279_1.fastq.gz --read2_file=SRR3184279_2.fastq.gz --read1_flag=_1 --read2_flag=_2 --qc_only --report_output_folder=$(pwd) then I get:

SRR3184279_1.fastq.gz options:
{'qc_only': True, 'version': '0.9.0', 'seq_len_req': 35, 'index1_file': None, 'trim_tail': 0, 'report_output_folder': '/home/bg/kate/AfterQC/PE_reads/', 'trim_pair_same': True, 'no_correction': False, 'debubble_dir': 'debubble', 'barcode_flag': 'barcode', 'read2_file': 'SRR3184279_2.fastq.gz', 'barcode_length': 12, 'trim_tail2': 0, 'unqualified_base_limit': 60, 'allow_mismatch_in_poly': 2, 'read2_flag': '_2', 'store_overlap': False, 'debubble': False, 'read1_flag': '_1', 'index2_flag': 'I2', 'draw': True, 'index1_flag': 'I1', 'mask_mismatch': False, 'barcode': False, 'overlap_output_folder': None, 'barcode_verify': 'CAGTA', 'index2_file': None, 'qualified_quality_phred': 15, 'trim_front': 9, 'good_output_folder': 'good', 'poly_size_limit': 35, 'n_base_limit': 5, 'qc_sample': 200000, 'trim_front2': 9, 'no_overlap': False, 'input_dir': None, 'read1_file': 'SRR3184279_1.fastq.gz', 'qc_kmer': 8, 'bad_output_folder': None}

Traceback (most recent call last):
  File "/home/bg/soft/AfterQC-0.9.0/after.py", line 221, in <module>
    main()
  File "/home/bg/soft/AfterQC-0.9.0/after.py", line 215, in main
    processOptions(options)
  File "/home/bg/soft/AfterQC-0.9.0/after.py", line 171, in processOptions
    filter.run()
  File "/home/bg/soft/AfterQC-0.9.0/preprocesser.py", line 709, in run
    stat_file = open(os.path.join(qc_dir, "report.json"), "w")
IOError: [Errno 20] Not a directory: '/home/bg/kate/AfterQC/PE_reads/SRR3184279_1.fastq.gz/report.json'

This error occurs if I set the -r dir to the same directory that I run AfterQC from.

sfchen · 2017-03-17T14:13:15Z

@serge2016 this issue occurs because v0.9.0 needs to create a folder with the same name as the R1 fastq file, so it conflicts with the fastq file itself if $(pwd) is specified as report_output_folder.

I believe that with the last commit, this issue is gone.
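
The name collision can be reproduced in isolation. The sketch below is not AfterQC code: `old_style_report_path` and `try_write_report` are hypothetical helpers that mimic the v0.9.0 layout as described in this thread (a folder named after the R1 file, with report.json inside it). On Linux the failing case typically raises errno 20 (ENOTDIR), as in the traceback above; other platforms may report a different OSError.

```python
import os
import tempfile


def old_style_report_path(report_dir, read1_file):
    # Assumed v0.9.0-style layout: <report_dir>/<R1 file name>/report.json
    return os.path.join(report_dir, os.path.basename(read1_file), "report.json")


def try_write_report(report_dir, read1_file):
    """Return None if the report could be written, else the OSError raised."""
    path = old_style_report_path(report_dir, read1_file)
    qc_dir = os.path.dirname(path)
    try:
        # If qc_dir already exists as a regular file (the fastq itself),
        # no folder is created and the open() below fails.
        if not os.path.exists(qc_dir):
            os.makedirs(qc_dir)
        with open(path, "w") as handle:
            handle.write("{}")
        return None
    except OSError as err:
        return err


with tempfile.TemporaryDirectory() as workdir:
    read1 = os.path.join(workdir, "SRR3184279_1.fastq.gz")
    open(read1, "w").close()  # empty stand-in for the real fastq file

    # Report folder equals the input folder: the path collides with the file.
    collision = try_write_report(workdir, read1)
    # Separate report folder (the default QC case): no collision.
    separate = try_write_report(os.path.join(workdir, "QC"), read1)

print("same folder:", collision)
print("separate folder:", separate)
```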

sfchen · 2017-03-17T14:18:21Z

I just released v0.9.1. You can try the new feature described above.

serge2016 · 2017-03-20T12:58:59Z

Now the previous behavior has changed to something more predictable :) Thank you!
But I am still thinking about the case when we specify the -1 and -2 options: in this mode we have only one sample, so we could specify the full output name for the report.

I simply want to use your tool inside a CWL environment, and that is easier if the output filenames can be specified independently of the input filenames.

sfchen added a commit that referenced this issue Mar 17, 2017

refine QC output folders and file names

7b00e2b
