csi stuff
bump version. document how to use CSI and adjust env vars to take
advantage
brentp committed Mar 26, 2018
1 parent 3b57f81 commit 703ae58
Showing 6 changed files with 30 additions and 6 deletions.
4 changes: 2 additions & 2 deletions .travis.yml
@@ -5,10 +5,10 @@ os:
- osx

go:
- 1.6
- 1.7
- 1.8
- tip
- 1.9
- 1.10

before_install:
- go get github.com/axw/gocov/gocov
3 changes: 2 additions & 1 deletion README.md
@@ -2,7 +2,8 @@ vcfanno
=======
<!--
build:
-VERSION=0.1.0; goxc -build-ldflags "-X main.VERSION=$VERSION" -include docs/,example/,README.md -d /tmp/vcfanno/ -pv=$VERSION -bc='linux,darwin,windows,!arm'
+CGO_ENABLED=0 GOARCH=amd64 go build -o vcfanno_linux64 --ldflags '-extldflags "-static"' vcfanno.go
+GOOS=darwin GOARCH=amd64 CGO_ENABLED=0 go build -o vcfanno_osx --ldflags '-extldflags "-static"' vcfanno.go
-->


4 changes: 4 additions & 0 deletions docs/CHANGES.md
@@ -1,3 +1,7 @@
v0.2.9
------
+ support for CSI indexes. If present, a .csi file will be preferred over a .tbi.

v0.2.8
------
+ fix for #71 for out of bounds warning followed by panic.
20 changes: 19 additions & 1 deletion docs/performance_tips.md
@@ -23,6 +23,23 @@ default value is 100. For example:
GOGC=2000 vcfanno -p 12 a.conf a.vcf
```

CSI
---

For very dense files such as CADD, or even gnomAD or ExAC, it is recommended to index
with csi, which allows finer resolution in the index. When a .csi file is present, `vcfanno`
will prefer it over a .tbi. For example, using:

```
tabix -m 12 --csi $file
```

will work for most cases. When a csi is present, it seems to be best to lower
`IRELATE_MAX_GAP` (see below) to 1000 or less. Doing this, we can see a **50% speed improvement** when
using a csi-indexed ExAC file to annotate a clinvar file.

Experiment with what works best for each scenario.

Max Gap / Chunk Size
--------------------

@@ -37,7 +54,8 @@ sets, it is best to have this value be large so that each annotation worker
gets enough work to keep it busy.

The default gap size is `5000` bases. Users can alter this using the
-environment variable `IRELATE_MAX_GAP`.
+environment variable `IRELATE_MAX_GAP`. When using a csi index this can
+be much lower, for example `1000`.

The default chunk size is `8000` query intervals. Users can alter this using the
environment variable `IRELATE_MAX_CHUNK`.
3 changes: 2 additions & 1 deletion tests/release-tests.sh
@@ -14,10 +14,11 @@ BASE=/data/gemini_install/data/gemini_data/
go build -a
V=./vcfanno

-IRELATE_MAX_GAP=500 run clinvar_common_pathogenic $V -lua docs/examples/clinvar_exac.lua -p 4 -base-path $BASE docs/examples/clinvar_exac.conf $BASE/clinvar_20170130.tidy.vcf.gz
+GOGC=900 IRELATE_MAX_GAP=600 run clinvar_common_pathogenic $V -lua docs/examples/clinvar_exac.lua -p 4 -base-path $BASE docs/examples/clinvar_exac.conf $BASE/clinvar_20170130.tidy.vcf.gz
assert_equal 577 $(zgrep -wc common_pathogenic $STDOUT_FILE)
assert_equal $(zgrep -cv ^# $STDOUT_FILE) $(zgrep -cv ^# $BASE/clinvar_20170130.tidy.vcf.gz)
exit 0
#tail -1 $STDERR_FILE
run exac_combine vcfanno -base-path $BASE docs/examples/exac_combine/exac_combine.conf $BASE/ExAC.r0.3.sites.vep.tidy.vcf.gz
2 changes: 1 addition & 1 deletion vcfanno.go
@@ -25,7 +25,7 @@ import (
"github.com/brentp/xopen"
)

-var VERSION = "0.2.8"
+var VERSION = "0.2.9"

func envGet(name string, vdefault int) int {
sval := os.Getenv(name)