This repository has been archived by the owner on Oct 11, 2023. It is now read-only.

libra

Compute the Similarity between Metagenomic Samples

DOWNLOAD BINARY

Download a pre-built binary (compiled with Java 7, including dependencies):

For older releases, see the releases page:

BUILD FROM SOURCE

Most users do not need to build from source; the pre-built binaries are sufficient.

To build, use the Apache Ant build system.

Run the following to build without bundled dependencies:

ant

Run the following to build with dependencies included (recommended):

ant allinone

The built jar package is written to the /dist directory.

RUN

Preprocessing FASTA/FASTQ files

hadoop jar libra-all.jar preprocess -k 20 -t 8 -o /index_dir /source_dir

Preprocessing Options

  • k : k-mer size
  • t : number of tasks (reducers). 1 by default.
  • s : minimum size of a file group in bytes. 10GB by default. A separate index file is created for each file group.
  • g : maximum number of groups. 20 by default. If the "-s" option would produce more groups than this value, groups are combined.
  • f : k-mer filter algorithm. NONE | STDDEV (one standard deviation) | STDDEV2 (two standard deviations) | NOTUNIQUE (default)
  • o : output directory
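
To make the "k" option concrete, here is a minimal Python sketch of k-mer counting on a single sequence. It is an illustration only, not Libra's implementation (Libra runs as Hadoop jobs over FASTA/FASTQ files); the function name `count_kmers` is hypothetical, and the sketch ignores details such as reverse complements and k-mer filtering.

```python
from collections import Counter

def count_kmers(sequence, k):
    """Count every overlapping substring of length k in a sequence.

    Hypothetical helper for illustration; not part of Libra.
    """
    sequence = sequence.upper()
    # A sequence of length n contains n - k + 1 overlapping k-mers
    # (none if the sequence is shorter than k).
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

counts = count_kmers("ACGTACGT", 4)
print(counts["ACGT"])  # "ACGT" starts at positions 0 and 4
```

A larger k makes each k-mer more specific to its source genome but increases the number of distinct k-mers that must be indexed.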

Scoring

hadoop jar libra-all.jar core -w LOGARITHM -o /score_dir /index_dir

Scoring Options

  • s : scoring algorithm. COSINESIMILARITY (default) | BRAYCURTIS | JENSENSHANNON
  • m : run mode. MAP (default) | REDUCE
  • w : weighting algorithm. LOGARITHM (default) | BOOLEAN | NATURAL
  • o : output directory
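
The weighting and scoring options can be pictured with a small Python sketch that operates on per-sample k-mer counts. This is a sketch under assumptions, not Libra's code: the function names are hypothetical, the LOGARITHM formula `log(1 + count)` is an assumed form, and the cosine and Bray-Curtis formulas are the textbook definitions, which may differ in detail from what Libra computes (for example, whether Bray-Curtis is reported as a similarity or a dissimilarity).

```python
import math

def weight_vector(counts, scheme="LOGARITHM"):
    """Map raw k-mer counts to weights. Formulas are assumptions."""
    if scheme == "BOOLEAN":      # presence/absence only
        return {kmer: 1.0 for kmer, c in counts.items() if c > 0}
    if scheme == "NATURAL":      # raw counts
        return {kmer: float(c) for kmer, c in counts.items() if c > 0}
    if scheme == "LOGARITHM":    # assumed form: log(1 + count)
        return {kmer: math.log1p(c) for kmer, c in counts.items() if c > 0}
    raise ValueError("unknown weighting scheme: " + scheme)

def cosine_similarity(a, b):
    """Textbook cosine similarity between two sparse weight vectors."""
    dot = sum(w * b.get(kmer, 0.0) for kmer, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def bray_curtis_dissimilarity(a, b):
    """Textbook Bray-Curtis dissimilarity: sum|a - b| / sum(a + b)."""
    kmers = set(a) | set(b)
    diff = sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in kmers)
    total = sum(a.values()) + sum(b.values())
    return diff / total if total else 0.0

# Made-up counts for two samples, for illustration only.
sample_a = {"ACGT": 4, "CGTA": 1}
sample_b = {"ACGT": 2, "TTTT": 3}

a = weight_vector(sample_a, "BOOLEAN")
b = weight_vector(sample_b, "BOOLEAN")
print(round(cosine_similarity(a, b), 3))  # one shared k-mer out of two each
```

With BOOLEAN weighting only the shared presence of a k-mer matters, while NATURAL and LOGARITHM let abundant k-mers contribute more; the logarithm dampens the influence of very high counts.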