Skip to content

raivivek/makebio

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

28 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

makebio

Manage computational biology projects with ease.

Installation

pip install makebio

For autocompletion, add the following lines to your login file (BASH): :: eval "$(_MAKEBIO_COMPLETE=source makebio)"

Or for ZSH, :: eval "$(_MAKEBIO_COMPLETE=source_zsh makebio)"

Philosophy

makebio is inspired by project management practices in @theparkerlab and "A Quick Guide to Organizing Computational Biology Projects" by WS Noble1. Using makebio is easy and intuitive, and it is flexible enough to be used in a collaborative environment.

The core idea of makebio is to create and manage a specific directory layout for your project. Using makebio init, for example, will setup the following files and directories:

This structure allows us to keep:

  • analysis files in one place --- control directory
  • intermediate files in one place --- work directory. You may symlink this directory to /tmp, /scratch space, or a destination with a large enough space.
  • results (produced from workflows or scripts) in one place --- results directory
  • raw data in one place --- data directory
  • essential project files to reproduce your project results

Workflow

  1. Setup an empty project directory with makebio init.
  2. Create a new analysis (<NAME>) with makebio add-analysis. This will create a (1) new directory in the control/, and (2) a new directory in the results/ directory with the name of the analysis.
  3. Add your analysis files in the control/<NAME> directory. Make sure all the output is written to the results/<NAME> directory.
  4. Save your project using makebio save.

Usage

See makebio --help.

Usage: makebio [OPTIONS] COMMAND [ARGS]...

  Manage computational biology research projects.

  Computational biology or Bioinformatics projects utilize HPC systems that
  typically limit the amount of space on your home directory and provide an
  external mounted space for scratch work. The idea is that your code and
  necessary files sit within the home directory and all intermediate files
  which could take up a lot of space are kept on the scratch.

  This necessitates organizing your work in such a way that even though the
  underlying data is fragmented, it all should transparently appear in one
  place for the user.

  makebio is a simple utility to create and manage such projects. While
  still in development, it is actively used by at least one person in their
  daily bioinformatics work.

  NOTES

  `analysis` and `data` directories are prefixed with the date of creation.
  For example, `2019-04-20_createTracks` or `2019-05-01_fastq`.

  `freeze` marks the target directory and all files within to be read-only.

  CONTACT

  Please leave suggestions on and bugs reports on GitHub [1].

  [1]: https://github.com/raivivek/makebio

  @raivivek

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  add-analysis     Add new analysis.
  add-data         Add new data.
  freeze           Mark a directory/file read only (for the user/group).
  init             Initialize a new project.
  rename-analysis  Rename existing analysis.
  save             Save a (Git) snapshot.
  show             Show current configuration.
  update           Refresh configuration with new changes.

Footnotes


  1. Noble W. S., 2009 A Quick Guide to Organizing Computational Biology Projects. PLOS Computational Biology 5: e1000424. https://doi.org/10.1371/journal.pcbi.1000424