NucleaSeq

A package for the processing of NucleaSeq data, as described in the manuscript:

Massively parallel kinetic profiling of natural and engineered CRISPR nucleases

Stephen K. Jones Jr, John A. Hawkins, Nicole V. Johnson, Cheulhee Jung, Kuang Hu, James R. Rybarski, Janice S. Chen, Jennifer A. Doudna, William H. Press, and Ilya J. Finkelstein

bioRxiv

Installation

The following instructions should work across platforms, except that installing virtualenv with apt-get is Ubuntu specific. For other platforms, install virtualenv appropriately if desired.

First, clone the repository to a local directory:

git clone https://github.com/finkelsteinlab/nucleaseq.git

Optionally, you can install into a virtual environment (recommended):

sudo apt-get install -y virtualenv
cd nucleaseq
virtualenv envnucleaseq
. envnucleaseq/bin/activate

Now install dependencies:

pip install numpy==1.13.3 && pip install cython==0.24 && pip install -r requirements.txt

Note that some versions of pip require each line of the requirements.txt file be installed in a separate call.

Next install the freebarcodes package in the same virtual environment, found at https://github.com/hawkjo/freebarcodes.

Finally, install nucleaseq with

python setup.py install

A note on memory

NucleaSeq data processing and analysis is memory intensive. All work for the original paper was performed on a server with 256GB of RAM. Less is possible with reduced the parallelization, but results in correspondingly slower runtimes. Memory usage also depends on library design, and analysis of the original experiments required at least 64GB of RAM. For the first run, monitor memory use carefully, and select the machine and parallelization accordingly.

Usage

The NucleaSeq pipeline is a combination of command line invocations and Jupyter notebooks, typically performed in the order below. The notebooks are located in the notebooks folder and contain instructions for their use.

Library design

Notebook: NucleaSeq Oligo Design.ipynb

Raw data preprocessing

Command line:

  nucleaseq setup_run ...
  nucleaseq preprocess ...

Type nucleaseq --help for details.

Analysis

Notebooks, in the following order:

NucleaSeq Full Data Tabulation and Normalization Factors.ipynb
NucleaSeq Full Data Analysis.ipynb
NucleaSeq Cut Data Processing.ipynb

CRISPR Model

The CRISPR Model described in the paper is available in this software package. Its use is explained in the notebook CRISPR Model Calculator Tutorial.ipynb.

Name		Name	Last commit message	Last commit date
Latest commit History 103 Commits
notebooks		notebooks
nucleaseq		nucleaseq
params		params
resources		resources
.gitignore		.gitignore
README.md		README.md
Supplement.docx		Supplement.docx
Supplemental_File_1.xlsx		Supplemental_File_1.xlsx
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

notebooks

notebooks

nucleaseq

nucleaseq

params

params

resources

resources

.gitignore

.gitignore

README.md

README.md

Supplement.docx

Supplement.docx

Supplemental_File_1.xlsx

Supplemental_File_1.xlsx

requirements.txt

requirements.txt

setup.py

setup.py

Repository files navigation

NucleaSeq

Massively parallel kinetic profiling of natural and engineered CRISPR nucleases

Installation

A note on memory

Usage

Library design

Raw data preprocessing

Analysis

CRISPR Model

About

Releases

Packages

Languages

finkelsteinlab/nucleaseq

Folders and files

Latest commit

History

Repository files navigation

NucleaSeq

Massively parallel kinetic profiling of natural and engineered CRISPR nucleases

Installation

A note on memory

Usage

Library design

Raw data preprocessing

Analysis

CRISPR Model

About

Resources

Stars

Watchers

Forks

Languages