TODO.md

Current development goals and outstanding tasks for bcbio-nextgen development. These are roughly ordered by current priority and we welcome contributors.

  • Improved deployment experience using docker containers to provide a fully isolated bcbio-nextgen installation. Requires re-working the installation into a two-step process: download the docker container, then add external biological data. Also requires adjusting the pipeline and distributed processing to start and use code isolated inside a docker container. Work in progress is at bcbio-nextgen-vm.
  • Enable processing on Amazon EC2 using spot instances and no shared filesystem. Store intermediate files in S3 object storage instead of a globally shared filesystem, and make use of high-speed local ephemeral storage.

  • Integrated structural variant analysis, including CNV prediction. Current targets are lumpy, delly and cn.mops.

  • Improved support for cancer tumor/normal paired callers. Suggested callers include SomaticSniper (#66, #109), LoFreq and others. A comprehensive discussion is at #112. FreeBayes supports tumor/normal calling: see this mailing list discussion for the suggested parameters. Requires an improved framework for evaluating callers, and approaches for handling Ensemble calling with multiple inputs (#67).
  • Improve analysis of coverage, especially in targeted sequencing experiments. Plan to integrate with chanjo. See #249 for more discussion.
  • Explore options for accumulating and displaying summary information from multiple runs. Prioritize options which allow accumulation across multiple analysis machines and already handle query and visualization.

  • Once initial structural variation analysis and evaluation is in place, incorporate and evaluate additional CNV and structural variant callers. Some current targets are the VarScan2 CNV caller and Control-FREEC.

  • Add in methylation analysis approaches. See #618 for discussion.

  • Handle inputs split across multiple sequencing lanes: merge multiple fastq/BAM inputs while correctly maintaining lane information in BAM read group headers.

  • Test whether less strict quality trimming gives better RNA-seq differential expression (DE) results.

  • Evaluate RNA-seq fusion analysis callers and implement support for one if we can find one with reliable results (#210).
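
For the item above on inputs split across multiple sequencing lanes, one way to keep lane information through a merge is to give each lane its own `@RG` line in the BAM header. The following is only a rough sketch of that idea, not existing bcbio-nextgen code: `LaneInput`, `read_group_line` and `merged_read_groups` are hypothetical names, and using `flowcell.lane` for both the `ID` and `PU` tags is an assumption about naming, not a project convention.

```python
from collections import namedtuple

# Hypothetical description of one per-lane input; not a bcbio-nextgen structure.
LaneInput = namedtuple("LaneInput", ["path", "sample", "flowcell", "lane"])


def read_group_line(inp, platform="ILLUMINA"):
    """Build a SAM @RG header line that records the lane for one input."""
    # Assumed naming scheme: flowcell and lane joined by a dot.
    rg_id = "{0}.{1}".format(inp.flowcell, inp.lane)
    fields = [
        "@RG",
        "ID:" + rg_id,        # unique read group identifier
        "SM:" + inp.sample,   # sample name shared by all lanes of a sample
        "PU:" + rg_id,        # platform unit, carrying the lane
        "PL:" + platform,     # sequencing platform
    ]
    return "\t".join(fields)


def merged_read_groups(inputs):
    """Return one @RG line per distinct flowcell/lane, in input order."""
    seen = set()
    lines = []
    for inp in inputs:
        line = read_group_line(inp)
        if line not in seen:
            seen.add(line)
            lines.append(line)
    return lines


inputs = [
    LaneInput("sample1_L001.fastq.gz", "sample1", "FC1", 1),
    LaneInput("sample1_L002.fastq.gz", "sample1", "FC1", 2),
]
for line in merged_read_groups(inputs):
    print(line)
```

Reads from each lane would then be tagged with the matching read group ID at alignment or merge time, so that lane-level information stays recoverable from the merged BAM.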