Extract jinja_partials and fix CRISPRessoPooled fastp errors #425

Merged
merged 3 commits into from
Apr 24, 2024
4 changes: 1 addition & 3 deletions .github/envs/test_env.yml
@@ -5,8 +5,7 @@ channels:
- defaults
dependencies:
- pip
- trimmomatic
- flash
- fastp
- numpy
- cython
- jinja2
@@ -15,4 +14,3 @@ dependencies:
- scipy
- matplotlib
- pandas
- plotly=5.18.0
7 changes: 6 additions & 1 deletion .github/workflows/integration_tests.yml
@@ -46,7 +46,7 @@ jobs:
with:
repository: edilytics/CRISPResso2_tests
token: ${{ secrets.ACCESS_CRISPRESSO2_TESTS }}
# ref: '<BRANCH_NAME>' // Use this to specify a branch other than master
# ref: '<BRANCH-NAME>' # Use this to specify a branch other than master

- name: Run Basic
run: |
@@ -72,6 +72,11 @@ jobs:
run: |
make pooled test

- name: Run Pooled Paired Sim
if: success() || failure()
run: |
make pooled-paired-sim test

- name: Run WGS
if: success() || failure()
run: |
24 changes: 10 additions & 14 deletions CRISPResso2/CRISPRessoPooledCORE.py
@@ -516,6 +516,10 @@ def main():

if fastp_status:
raise CRISPRessoShared.FastpException('FASTP failed to run, please check the log file.')

if not args.keep_intermediate:
files_to_remove += [output_forward_filename]

info('Done!', {'percent_complete': 7})

processed_output_filename = output_forward_filename
@@ -550,6 +554,9 @@ def main():
if args.debug:
info('Fastp command: {0}'.format(fastp_cmd))

if not args.keep_intermediate:
files_to_remove += [processed_output_filename, not_combined_1_filename, not_combined_2_filename]

if fastp_status:
raise CRISPRessoShared.FastpException('Fastp failed to run, please check the log file.')
crispresso2_info['running_info']['fastp_command'] = fastp_cmd
@@ -566,6 +573,9 @@ def main():
info(f'Forced {num_reads_force_merged} read pairs together.')
processed_output_filename = new_output_filename

if not args.keep_intermediate:
files_to_remove += [new_merged_filename, new_output_filename]

info('Done!', {'percent_complete': 7})

if can_finish_incomplete_run and 'count_input_reads' in crispresso2_info['running_info']['finished_steps']:
@@ -1554,20 +1564,6 @@ def default_sigpipe():
if not args.keep_intermediate:
info('Removing Intermediate files...')

if not args.aligned_pooled_bam:
if args.fastq_r2!='':
files_to_remove+=[processed_output_filename, flash_hist_filename, flash_histogram_filename,\
flash_not_combined_1_filename, flash_not_combined_2_filename]
if args.force_merge_pairs:
files_to_remove.append(new_merged_filename)
files_to_remove.append(old_flashed_filename)
else:
files_to_remove+=[processed_output_filename]

if args.trim_sequences and args.fastq_r2!='':
files_to_remove+=[output_forward_paired_filename, output_reverse_paired_filename,\
output_forward_unpaired_filename, output_reverse_unpaired_filename]

if RUNNING_MODE=='ONLY_GENOME' or RUNNING_MODE=='AMPLICONS_AND_GENOME':
if args.aligned_pooled_bam is None:
files_to_remove+=[bam_filename_genome]
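
The hunks above move cleanup bookkeeping next to the step that creates each intermediate file, instead of reconstructing the list at the end of the run. A minimal sketch of that pattern follows. The helper names `write_intermediate` and `remove_intermediate` and the file names are hypothetical and are not part of CRISPResso2; only `files_to_remove` and `keep_intermediate` mirror names from the diff.

```python
import os
import tempfile


def write_intermediate(path, files_to_remove, keep_intermediate):
    """Create an intermediate file and register it for cleanup right away."""
    with open(path, 'w') as handle:
        handle.write('placeholder\n')
    # Register at creation time, mirroring `if not args.keep_intermediate:`.
    if not keep_intermediate:
        files_to_remove.append(path)


def remove_intermediate(files_to_remove):
    """Delete every registered file that exists; return the paths removed."""
    removed = []
    for path in files_to_remove:
        if os.path.exists(path):
            os.remove(path)
            removed.append(path)
    return removed


workdir = tempfile.mkdtemp()
files_to_remove = []
trimmed = os.path.join(workdir, 'reads.trimmed.fastq')  # hypothetical name
merged = os.path.join(workdir, 'reads.merged.fastq')    # hypothetical name

write_intermediate(trimmed, files_to_remove, keep_intermediate=False)
write_intermediate(merged, files_to_remove, keep_intermediate=False)
removed = remove_intermediate(files_to_remove)
```

Registering each file where it is produced means the final cleanup block does not need to know which trimming or merging branch was taken, which appears to be why the branch-specific block in `default_sigpipe`'s neighbourhood could be deleted.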
2 changes: 1 addition & 1 deletion CRISPResso2/CRISPRessoReports/CRISPRessoReport.py
@@ -6,7 +6,7 @@

import os
from jinja2 import Environment, FileSystemLoader, ChoiceLoader, make_logging_undefined
from jinja_partials import generate_render_partial, render_partial
from CRISPResso2.CRISPRessoReports.jinja_partials import generate_render_partial, render_partial
from CRISPResso2 import CRISPRessoShared

if CRISPRessoShared.is_C2Pro_installed():
46 changes: 46 additions & 0 deletions CRISPResso2/CRISPRessoReports/jinja_partials.py
@@ -0,0 +1,46 @@
'''
This file is derived from https://github.com/mikeckennedy/jinja_partials and is subject to the following license:

MIT License

Copyright (c) 2021 Michael Kennedy

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
'''

from functools import partial

from markupsafe import Markup


def render_partial(template_name, renderer=None, markup=True, **data):
    if renderer is None:
        if flask is None:
            raise PartialsException('No renderer specified')
        else:
            renderer = flask.render_template

    if markup:
        return Markup(renderer(template_name, **data))

    return renderer(template_name, **data)


def generate_render_partial(renderer, markup=True):
    return partial(render_partial, renderer=renderer, markup=markup)
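
Note that the vendored module, as shown, references `flask` and `PartialsException` without importing or defining them, so calling `render_partial` with `renderer=None` would raise a `NameError`. This is probably harmless if callers always go through `generate_render_partial`, which supplies a renderer. The following is a self-contained sketch of the same pattern, not the code in this PR: the `PartialsException` class, the `toy_renderer` function, the in-memory `TEMPLATES` dict, and the `str` fallback for `Markup` are all assumptions made so the example runs without Flask or Jinja.

```python
from functools import partial
from string import Template

try:
    from markupsafe import Markup
except ImportError:
    # Assumption: fall back to plain str so the sketch runs without markupsafe.
    Markup = str


class PartialsException(Exception):
    """Hypothetical definition; raised when no renderer is supplied."""


def render_partial(template_name, renderer=None, markup=True, **data):
    # Unlike the vendored file, this sketch has no Flask fallback.
    if renderer is None:
        raise PartialsException('No renderer specified')
    rendered = renderer(template_name, **data)
    return Markup(rendered) if markup else rendered


def generate_render_partial(renderer, markup=True):
    # Bind the renderer once so templates can call render_partial(name, **data).
    return partial(render_partial, renderer=renderer, markup=markup)


# Hypothetical in-memory templates standing in for Jinja template files.
TEMPLATES = {'greeting.html': Template('<p>Hello, $name!</p>')}


def toy_renderer(template_name, **data):
    return TEMPLATES[template_name].substitute(**data)


render = generate_render_partial(toy_renderer)
html = render('greeting.html', name='CRISPResso')
```

In a Jinja setup the bound function would typically be exposed to templates as a global, with the environment's own template rendering passed as `renderer`.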
122 changes: 62 additions & 60 deletions README.md
@@ -1,5 +1,5 @@
[![Docker Cloud Automated build](https://img.shields.io/docker/cloud/automated/pinellolab/crispresso2.svg)](https://hub.docker.com/r/pinellolab/crispresso2)
[![Docker Cloud Build Status](https://img.shields.io/docker/cloud/build/pinellolab/crispresso2.svg)](https://hub.docker.com/r/pinellolab/crispresso2)
[![Docker Image Version (tag)](https://img.shields.io/docker/v/pinellolab/crispresso2/latest?logo=docker&label=Docker)
](https://hub.docker.com/r/pinellolab/crispresso2/tags)
[![CircleCI branch](https://img.shields.io/circleci/project/github/pinellolab/CRISPResso2/master.svg)](https://circleci.com/gh/pinellolab/CRISPResso2)
[![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/crispresso2/README.html)

@@ -167,64 +167,15 @@ docker run -v ${PWD}:/DATA -w /DATA -i pinellolab/crispresso2 CRISPResso --fastq

Guardrails automatically check the inputs and results of experiments against standardized values. Any guardrail warnings that are triggered are printed on the command line and at the top of generated reports. To turn off the guardrails, add the `--disable_guardrails` argument.

TotalReadsGuardrail : Checks if the number of reads is lower than expected. (Default: 10000)
OverallReadsAlignedGuardrail : Checks if the number of aligned reads is lower than expected. (Default: 90% of the total reads)
DisproportionateReadsAlignedGuardrail : Checks if the number of reads aligned to an amplicon is higher or lower than expected proportionally. (Default: 30% more or less than expected)
LowRatioOfModsInWindowToOutGuardrail : Checks if the ratio of modifications inside to outside the quantification window is lower than expected. (Default: 0.01)
HighRateOfModificationAtEndsGuardrail : Checks if there is a high rate of modifications at the ends of the read. (Default: 0.01)
HighRateOfSubstitutionsOutsideWindowGuardrail : Checks if there is a high rate of substitutions outside of the quantification windows. (Default: 0.002)
HighRateOfSubstitutionsGuardrail : Checks if the proportion of substitutions to other modifications is higher than expected. (Default: 0.3)
ShortSequenceGuardrail : Checks if the provided sequences (both Amplicons and Guides) are shorter than expected. (Amplicon Default: 50, Guide Default: 19)
LongAmpliconShortReadsGuardrail : Checks if the rovided amplicon is more than <value> times the average length of read. (Default: 1.5)

### CRISPRessoPro

CRISPResso is an open source tool for free use by academics. However, for-profit organizations are required to purchase a license to use CRISPResso. As a part of this license, organizations gain access to the CRISPRessoPro package which supplements CRISPResso
with several useful features:
- Interactive and improved plots using D3 and Plotly
- Customizable colors
- Customizable warnings based on potential issues in results (guardrails)

#### Installation

To add CRISPRessoPro to CRISPResso contact Edilytics - support@edilytics.com

#### D3 and Plotly

If CRISPRessoPro is installed, by default reports will include interactive plots. To use matplotlib for figures add the `--use_matplotlib` argument.

#### Customizable Colors and Guardrails

If CRISPRessoPro is installed, by default the colors and guardrails will remain the same as CRISPResso. To alter this, use the `--custom_config` argument and a filepath to a `.json` file with the following format:

'''
"colors": {
'Substitution': '#0000FF',
'Insertion': '#008000',
'Deletion': '#FF0000',
'A': '#7FC97F',
'T': '#BEAED4',
'C': '#FDC086',
'G': '#FFFF99',
'N': '#C8C8C8',
'-': '#1E1E1E',
},
"guardrails": {
'min_total_reads': 10000,
'aligned_cutoff': 0.9,
'alternate_alignment': 0.3,
'min_ratio_of_mods_in_to_out': 0.01,
'modifications_at_ends': 0.01,
'outside_window_max_sub_rate': 0.002,
'max_rate_of_subs': 0.3,
'guide_len': 19,
'amplicon_len': 50,
'amplicon_to_read_length': 1.5
}
'''
(These are the default values as an example).

Change the values as desired to any color or guardrail specification.
- `TotalReadsGuardrail` Checks if the number of reads is lower than expected. (Default: 10000)
- `OverallReadsAlignedGuardrail` Checks if the number of aligned reads is lower than expected. (Default: 90% of the total reads)
- `DisproportionateReadsAlignedGuardrail` Checks if the number of reads aligned to an amplicon is higher or lower than expected proportionally. (Default: 30% more or less than expected)
- `LowRatioOfModsInWindowToOutGuardrail` Checks if the ratio of modifications inside to outside the quantification window is lower than expected. (Default: 0.01)
- `HighRateOfModificationAtEndsGuardrail` Checks if there is a high rate of modifications at the ends of the read. (Default: 0.01)
- `HighRateOfSubstitutionsOutsideWindowGuardrail` Checks if there is a high rate of substitutions outside of the quantification windows. (Default: 0.002)
- `HighRateOfSubstitutionsGuardrail` Checks if the proportion of substitutions to other modifications is higher than expected. (Default: 0.3)
- `ShortSequenceGuardrail` Checks if the provided sequences (both Amplicons and Guides) are shorter than expected. (Amplicon Default: 50, Guide Default: 19)
- `LongAmpliconShortReadsGuardrail` Checks if the provided amplicon is more than `<value>` times the average length of read. (Default: 1.5)
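
To make the threshold semantics concrete, here is a minimal sketch of what the first two checks in the list might look like. The function names, the warning strings, and the strict `<` comparison are assumptions for illustration; they are not CRISPResso's implementation. Only the default values (10000 reads, 90% aligned) come from the list above.

```python
def check_total_reads(total_reads, minimum=10000):
    """Return a warning string if there are fewer reads than expected."""
    if total_reads < minimum:
        return f'Low number of total reads: {total_reads} < {minimum}'
    return None


def check_overall_reads_aligned(total_reads, aligned_reads, cutoff=0.9):
    """Return a warning string if too small a fraction of reads aligned."""
    if total_reads == 0:
        return 'No reads to align'
    fraction = aligned_reads / total_reads
    if fraction < cutoff:
        return f'Only {fraction:.0%} of reads aligned (expected at least {cutoff:.0%})'
    return None


# Collect whichever warnings fire for a hypothetical run.
warnings = [
    message
    for message in (
        check_total_reads(8000),
        check_overall_reads_aligned(8000, 7600),
    )
    if message is not None
]
```

In this hypothetical run only the total-reads check fires, since 7600 of 8000 reads (95%) is above the 90% cutoff.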

### Example run: Non-homologous end joining (NHEJ)

@@ -1235,3 +1186,54 @@ The output will consist of:
3. CRISPRessoAggregate_mapping_statistics.txt: A tab-separated file showing the number of reads sequenced and mapped for each run.
4. CRISPRessoAggregate_quantification_of_editing_frequency.txt: A tab-separated file with the number of reads and edits for each run folder. Data from run folders with multiple amplicons show the sum totals for all amplicons.
5. CRISPRessoAggregate_quantification_of_editing_frequency_by_amplicon.txt: A tab-separated file showing the number of reads and edits for each amplicon for each run folder. Data from run folders with multiple amplicons will appear on multiple lines, with one line per amplicon.

### CRISPRessoPro

CRISPResso is an open source tool for free use by academics. However, for-profit organizations are required to purchase a license to use CRISPResso. As a part of this license, organizations gain access to the CRISPRessoPro package which supplements CRISPResso
with several useful features:

- Interactive and improved plots using D3 and Plotly
- Customizable colors
- Customizable warnings based on potential issues in results (guardrails)

#### Installation

To add CRISPRessoPro to CRISPResso contact Edilytics - licensing@edilytics.com

#### D3 and Plotly

If CRISPRessoPro is installed, by default reports will include interactive plots. To use matplotlib for figures add the `--use_matplotlib` argument.

#### Customizable Colors and Guardrails

If CRISPRessoPro is installed, by default the colors and guardrails will remain the same as CRISPResso. To alter this, use the `--config_file` argument and a filepath to a `.json` file with the following format:

``` json
{
"colors": {
"Substitution": "#0000FF",
"Insertion": "#008000",
"Deletion": "#FF0000",
"A": "#7FC97F",
"T": "#BEAED4",
"C": "#FDC086",
"G": "#FFFF99",
"N": "#C8C8C8",
"-": "#1E1E1E"
},
"guardrails": {
"min_total_reads": 10000,
"aligned_cutoff": 0.9,
"alternate_alignment": 0.3,
"min_ratio_of_mods_in_to_out": 0.01,
"modifications_at_ends": 0.01,
"outside_window_max_sub_rate": 0.002,
"max_rate_of_subs": 0.3,
"guide_len": 19,
"amplicon_len": 50,
"amplicon_to_read_length": 1.5
}
}
```

The values above are the defaults, shown as an example. Change them as desired to any color or guardrail specification.
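
A config file in this format can be generated and sanity-checked before it is passed to `--config_file`. The sketch below copies the defaults, overrides one color and one guardrail, writes the JSON, and flags keys that do not appear in the defaults (a likely sign of a typo). The `unknown_keys` helper, the file name, and the override values are hypothetical, and whether CRISPRessoPro accepts a partial config or rejects unknown keys is not stated here, so this check is only an assumption about what is useful.

```python
import copy
import json
import tempfile
from pathlib import Path

# Default values copied from the README example above.
DEFAULTS = {
    "colors": {
        "Substitution": "#0000FF", "Insertion": "#008000", "Deletion": "#FF0000",
        "A": "#7FC97F", "T": "#BEAED4", "C": "#FDC086", "G": "#FFFF99",
        "N": "#C8C8C8", "-": "#1E1E1E",
    },
    "guardrails": {
        "min_total_reads": 10000, "aligned_cutoff": 0.9,
        "alternate_alignment": 0.3, "min_ratio_of_mods_in_to_out": 0.01,
        "modifications_at_ends": 0.01, "outside_window_max_sub_rate": 0.002,
        "max_rate_of_subs": 0.3, "guide_len": 19, "amplicon_len": 50,
        "amplicon_to_read_length": 1.5,
    },
}


def unknown_keys(config, defaults=DEFAULTS):
    """List sections or keys in config that are absent from the defaults."""
    unknown = []
    for section, values in config.items():
        if section not in defaults:
            unknown.append(section)
            continue
        unknown.extend(
            f'{section}.{key}' for key in values if key not in defaults[section]
        )
    return unknown


config = copy.deepcopy(DEFAULTS)
config["colors"]["Deletion"] = "#AA0000"         # hypothetical override
config["guardrails"]["min_total_reads"] = 5000   # hypothetical override

config_path = Path(tempfile.mkdtemp()) / "crispresso_config.json"
config_path.write_text(json.dumps(config, indent=2))

# Reload to confirm the file is valid JSON and contains only known keys.
reloaded = json.loads(config_path.read_text())
problems = unknown_keys(reloaded)
```

An empty `problems` list means every key matches one of the documented defaults; `config_path` is then the path to supply to `--config_file`.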
2 changes: 0 additions & 2 deletions setup.py
@@ -89,10 +89,8 @@ def main():
'matplotlib', # '>=1.3.1,<=2.2.3',
'seaborn', # '>0.7.1,<0.10',
'jinja2',
'jinja_partials',
'scipy',
'numpy',
'plotly',
],
cmdclass = command_classes,
ext_modules = ext_modules