Envision Screen data cleaning

Author: Sam Lee, `samleenz@me.com`

Date: 2017-12-05

Initial analysis pipeline for a yeast screen. From a set of paired 384-well plates, it finds the wells that show the greatest growth defects across plate pairs, for further analysis and potential follow-up.

The code was written with a particular Envision screen setup in mind. It is generalisable, but some changes will likely be required, e.g. to exclude control wells and to ensure that well numbers map correctly to alphanumeric coordinates.
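As an illustration of the well-number mapping mentioned above, here is a minimal sketch. It assumes a 384-well plate laid out as 16 rows (A to P) by 24 columns, with wells numbered from 1 in row-major order. The numbering order and the function name `well_to_coord` are assumptions, not taken from the pipeline code, so check them against your own plate reader export.

```python
import string

N_ROWS, N_COLS = 16, 24  # 384-well plate: rows A-P, columns 1-24


def well_to_coord(well_number):
    """Convert a 1-based well number to a coordinate such as 'B7'.

    Assumes row-major numbering (well 1 = A1, well 24 = A24, well 25 = B1).
    """
    if not 1 <= well_number <= N_ROWS * N_COLS:
        raise ValueError(f"well number out of range: {well_number}")
    row, col = divmod(well_number - 1, N_COLS)
    return f"{string.ascii_uppercase[row]}{col + 1}"


print(well_to_coord(25))
```

If your instrument numbers wells column by column instead, swap the roles of rows and columns in the `divmod` call.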

Directory structure

code/ contains the processing scripts
data/ contains the raw data files
results/deltas contains the delta files for each plate in data/
results/z-score contains a ranked z-score list for each plate, plus a plot of each plate's z-score distribution in z-score/plots
results/paired-hits contains the lists of plate coordinates where the z-score is < -2 for the duplicate plates
results/summary.txt is the summary file for the pipeline. It lists each pair and its number of "hits", in descending order.
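The paired-hits step can be sketched roughly as follows. This assumes a well counts as a hit only when its z-score is below -2 on both plates of a pair; the function name `paired_hits` and the example values are hypothetical and are not taken from the pipeline scripts.

```python
def paired_hits(z_first, z_second, threshold=-2.0):
    """Return the coordinates whose z-score is below threshold on both plates.

    z_first and z_second map well coordinates (e.g. 'A3') to z-scores.
    """
    shared = z_first.keys() & z_second.keys()
    return sorted(
        well for well in shared
        if z_first[well] < threshold and z_second[well] < threshold
    )


# Made-up z-scores for two duplicate plates
z_a = {"A3": -2.5, "B7": -1.0, "C12": -3.1}
z_b = {"A3": -2.1, "B7": -2.4, "C12": -2.9}
print(paired_hits(z_a, z_b))  # ['A3', 'C12']
```

B7 is excluded here because it passes the threshold on only one of the two plates.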

Deltas

Deltas (differences) are calculated as the difference between each well ij, for rows i and columns j, and the arithmetic mean of columns 23 and 24.
The delta filenames match the raw filenames, with _delta appended to the end of each.
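A minimal sketch of the delta calculation, under two assumptions that the text above does not settle: the control mean is taken over all wells in columns 23 and 24 of the plate (rather than per row), and the delta is the well value minus that mean. The function name `compute_deltas` and the toy plate are illustrative only.

```python
from statistics import mean


def compute_deltas(plate, control_cols=(23, 24)):
    """Subtract the mean of the control columns from every well.

    plate is a list of rows, each a list of readings.
    Column numbers in control_cols are 1-based.
    """
    control_values = [row[c - 1] for row in plate for c in control_cols]
    control_mean = mean(control_values)
    return [[value - control_mean for value in row] for row in plate]


# Toy plate with 2 rows and 24 columns (a real plate has 16 rows)
plate = [
    [10.0] * 22 + [2.0, 4.0],
    [8.0] * 22 + [3.0, 3.0],
]
deltas = compute_deltas(plate)
print(deltas[0][0], deltas[1][0])  # 7.0 5.0
```

In this toy plate the control mean is 3.0, so the first well of each row yields deltas of 7.0 and 5.0.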

Z-score

The Z-score is a centred transformation of the delta values that moves them from an absolute scale to a standard-deviation scale.
Columns [1,2,23,24] (controls) are removed prior to the z-score transformation, so they do not influence the distribution that the z-scores are computed from.
The z-score filenames match the raw filenames, with _zscore appended to the end of each.
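The transformation described above can be sketched as below. It assumes the usual definition z = (delta - mean) / sd, computed over the non-control wells of a single plate and using the sample standard deviation; the text does not say which standard deviation the pipeline uses, so treat that choice as an assumption. The name `zscores` is hypothetical.

```python
from statistics import mean, stdev

CONTROL_COLS = {1, 2, 23, 24}  # 1-based control columns, as listed above


def zscores(deltas, control_cols=CONTROL_COLS):
    """Return {(row, col): z-score} for all non-control wells.

    deltas is a list of rows; row and column numbers are 1-based.
    Uses the sample standard deviation (an assumption).
    """
    wells = {
        (r, c): value
        for r, row in enumerate(deltas, start=1)
        for c, value in enumerate(row, start=1)
        if c not in control_cols
    }
    mu = mean(wells.values())
    sigma = stdev(wells.values())
    return {well: (value - mu) / sigma for well, value in wells.items()}


# Toy input: one row of 24 delta values
z = zscores([[float(c) for c in range(1, 25)]])
print(len(z))  # 20 wells remain after dropping the four control columns
```

Wells with a z-score below -2 would then be the candidate hits for that plate.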

Configuring the Snakemake pipeline

To configure the pipeline for execution the plate name pairs must be specified in the config.yaml file as follows:

pairs:
    A: SCW0137251_084229_003___SCW0137252_084229_002
    B: SCW0137253_084229_001___SCW0137254_084229_004

The triple underscore ___ between the plate names is important.
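To illustrate why the separator matters: the plate names themselves contain single underscores, so a pair can only be split unambiguously on the triple underscore. The sketch below shows one way such a value could be split. The function `split_pair` is hypothetical and is not necessarily how the Snakefile does it.

```python
def split_pair(pair, sep="___"):
    """Split a config value such as 'PLATE_1___PLATE_2' into two plate names."""
    first, found, second = pair.partition(sep)
    if not found or not first or not second:
        raise ValueError(f"expected two plate names joined by {sep!r}: {pair}")
    return first, second


first, second = split_pair("SCW0137251_084229_003___SCW0137252_084229_002")
print(first)   # SCW0137251_084229_003
print(second)  # SCW0137252_084229_002
```

A value written with a single or double underscore between the names would raise a `ValueError` here instead of being silently mis-split.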

To test the pipeline once the config file is edited, use:

snakemake --dag --forceall | dot -Tpng > dag.png
# or
snakemake -npr

To execute:

snakemake

If further plate pairs are added, only those that have not yet been run will be calculated when snakemake is called again.

misc

To regenerate this readme from the markdown doc:

pandoc -s README.md -o README.html
