Scripts and what they eat

Ram RS edited this page Aug 29, 2013 · 3 revisions

I am recording my walkthrough of the khmer workflows here. I will work through the following tutorials in order:

  1. 2013 MSU Assembly Workshop
  2. Eel-pond Tutorial
  3. Kalamazoo Metagenome Tutorial

For each tutorial, I will go step by step, pick out the Python scripts used, and follow their dependencies until I reach the core logic. For each script's interface, I will list the input parameters it takes and my proposed error-handling messages.

2013 MSU Assembly Workshop

Step 1: Getting the Data

split-pe.py 1 inPrm; accepts it and appends '.1' and '.2' to the value. No error checking right now. Need to handle: no input file (maybe nest the read in a try-except block?)
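The try-except idea could look roughly like the sketch below. This is not the code in split-pe.py; `open_input` is a hypothetical helper name, and the message wording and exit status are my own assumptions.

```python
import sys


def open_input(path):
    """Open `path` for reading, or exit with a readable message if that fails.

    Hypothetical helper: not part of split-pe.py or khmer.
    """
    try:
        return open(path)
    except IOError as err:  # on Python 3, IOError is an alias of OSError
        sys.stderr.write("ERROR: cannot read input file '%s': %s\n"
                         % (path, err.strerror))
        sys.exit(1)
```

The script would then call `open_input(sys.argv[1])` in place of a bare `open`, so a missing file produces a one-line error instead of a traceback.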

Step 2: Trimming and Qual

interleave.py 2 inPrms, error handling + argparse implemented, moving on!

Step 3: Assembling with Velvet

strip-and-split-for-assembly.py 1 mandatory inPrm, 1 optional outPrm, no error checking done for inPrm, basic file existence check needed.
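A basic existence check for one mandatory input and one optional output parameter might be sketched with argparse as follows. The argument names `infile` and `outprefix` and the error text are assumptions for illustration, not the script's actual interface.

```python
import argparse
import os


def parse_args(argv=None):
    # Hypothetical interface: one required input file, one optional output name.
    parser = argparse.ArgumentParser(description="sketch of input checking")
    parser.add_argument("infile", help="input sequence file (must exist)")
    parser.add_argument("outprefix", nargs="?", default=None,
                        help="optional output prefix")
    args = parser.parse_args(argv)
    if not os.path.isfile(args.infile):
        # parser.error() prints the usage line plus this message, then exits
        # with status 2.
        parser.error("input file '%s' does not exist" % args.infile)
    return args
```

Using `parser.error` keeps the failure message next to the usage line, which matches how argparse reports its own argument errors.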

assemstats3.py Argument-count check done, first-arg int check done; a file existence check on sys.argv[2:] is advisable.
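The existence check over sys.argv[2:] could be as small as the sketch below. `missing_files` and `check_inputs` are hypothetical names, and the sketch assumes the existing argument-count and integer checks have already run.

```python
import os
import sys


def missing_files(paths):
    """Return the entries of `paths` that are not existing regular files."""
    return [p for p in paths if not os.path.isfile(p)]


def check_inputs(argv):
    """Exit with an error message if any file named in argv[2:] is missing."""
    missing = missing_files(argv[2:])
    if missing:
        sys.stderr.write("ERROR: input file(s) not found: %s\n"
                         % ", ".join(missing))
        sys.exit(1)
```

Calling `check_inputs(sys.argv)` before any processing reports every missing file at once, rather than failing on the first one partway through the run.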

Step 4: Mapping reads to ref

No new scripts

Eel Pond mRNAseq Tutorial

Step 0: Downloading Data

No khmer scripts involved

Step 1: Q-Trim + Filter

interleave-reads.py

extract-paired-reads.py

Step 2: DigiNorm

normalize-by-median.py

filter-abund.py

Step 3: Running Assembly

split-paired-reads.py

Step 4: BLAST

No khmer scripts involved

Step 5: Transcript families

do-partition.py

Kalamazoo Metagenome Assembly Tutorial

Step 1: Q-trim + filter

No new khmer scripts used

Step 2: DigiNorm

readstats.py

Step 3: Partitioning

extract-partitions.py

sweep-reads3.py

Step 4: Assembling

extract-long-sequences.py

Step 5: Mapping + Abundance

make-coverage.py