Releases: EBIvariation/vcf-validator

VCF version detected automatically and checks on duplicate fields

04 May 15:48

This version simplifies integrating the validation tool into automated pipelines by detecting the version of the VCF file before running the validation. This also prevents errors caused by accidental mismatches between the version given as a command line argument and the one declared in the file.
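As an illustration only, version detection can be sketched in Python by reading the `##fileformat` line that the VCF specification requires as the first line of a file. The function name `detect_vcf_version` is hypothetical and this is not the validator's own implementation.

```python
import re

# The VCF specification requires the first line to look like "##fileformat=VCFv4.3".
_FILEFORMAT = re.compile(r"^##fileformat=VCFv(\d+\.\d+)\s*$")

def detect_vcf_version(first_line):
    """Return the declared version (e.g. '4.3'), or None if the line is not a fileformat header."""
    match = _FILEFORMAT.match(first_line)
    return match.group(1) if match else None

print(detect_vcf_version("##fileformat=VCFv4.3\n"))  # → 4.3
```

A pipeline could use the returned value to pick the matching rule set instead of relying on a user-supplied argument.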

New checks have also been included to guarantee that no duplicate values are present in the ID and FORMAT columns of a single line. These checks apply only to version 4.3 of the specification.
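A minimal sketch of such a duplicate check, assuming a tab-separated data line in which ID is the third column (semicolon-separated) and FORMAT is the ninth (colon-separated). The helper names are hypothetical and the validator itself may implement the check differently.

```python
def find_duplicates(values):
    """Return the values that occur more than once, in order of first repetition."""
    seen, duplicates = set(), []
    for value in values:
        if value in seen and value not in duplicates:
            duplicates.append(value)
        seen.add(value)
    return duplicates

def check_duplicates(line):
    """Report duplicated entries in the ID and FORMAT columns of one VCF data line."""
    columns = line.rstrip("\n").split("\t")
    # "." marks a missing ID, so there is nothing to compare.
    ids = [] if columns[2] == "." else columns[2].split(";")
    # FORMAT is only present when the file carries genotype data.
    formats = columns[8].split(":") if len(columns) > 8 else []
    return {"ID": find_duplicates(ids), "FORMAT": find_duplicates(formats)}

line = "1\t100\trs1;rs1\tA\tT\t50\tPASS\t.\tGT:DP:DP\t0/1:3:3"
print(check_duplicates(line))  # → {'ID': ['rs1'], 'FORMAT': ['DP']}
```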

The binaries can be downloaded using the links below.

Fixed bug when GT field is not listed in FORMAT column

22 Mar 11:57

The VCF specification allows the GT field to be omitted from the FORMAT column, but if present it must be the first field. This release solves an issue that made the validator raise a misleading error if GT was not present.
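The rule can be expressed in a few lines of Python. This is a sketch with a hypothetical function name, not the validator's code.

```python
def gt_position_ok(format_column):
    """GT may be absent from FORMAT, but when present it must be the first key."""
    keys = format_column.split(":")
    return "GT" not in keys or keys[0] == "GT"

print(gt_position_ok("GT:DP"))  # → True
print(gt_position_ok("DP:GQ"))  # → True (GT is optional)
print(gt_position_ok("DP:GT"))  # → False
```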

INFO CIGAR field and newline at end of file issues solved

05 Dec 10:03

This maintenance release solves a couple of issues reported for version 0.4.1:

  • Only a single value was considered valid for the CIGAR field in the INFO column, when it should be a list with one entry per alternate allele. Thanks @sambrightman for your pull request!
  • Errors due to the lack of a newline character at the end of the file were not properly reported.
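Both conditions can be sketched as follows, assuming the ALT column lists at least one allele and that INFO entries are `key=value` pairs separated by semicolons. The function names are hypothetical.

```python
def cigar_count_ok(alt_column, info_column):
    """CIGAR, when present in INFO, needs one comma-separated entry per ALT allele."""
    n_alt = len(alt_column.split(","))
    for entry in info_column.split(";"):
        key, _, value = entry.partition("=")
        if key == "CIGAR":
            return len(value.split(",")) == n_alt
    return True  # no CIGAR field, nothing to check

def ends_with_newline(data):
    """True when the raw file content (bytes) finishes with a newline character."""
    return data.endswith(b"\n")

print(cigar_count_ok("T,G", "DP=10;CIGAR=1M,2M1D"))  # → True
print(cigar_count_ok("T,G", "CIGAR=1M"))             # → False
```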

Memory usage issues solved

03 Nov 10:17

This maintenance release solves memory issues reported for version 0.4.

New dependencies were added to make it possible to detect more complex errors, but the amount of memory consumed grew without bound. This has been solved and memory usage now remains constant at less than 10 MB of RAM.

The new executables, compatible with any Linux version, can be downloaded using the links below.

Fixing VCF files, fixing bugs...

18 Oct 14:01

In addition to the removal of duplicate variants introduced in the previous release, errors in the INFO and samples columns can now be fixed by removing the faulty field from the column. For instance, if an INFO value looks like AN=123;AF=not_a_frequency;DP=345, the fix would transform it into AN=123;DP=345.
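The INFO fix described above can be sketched in Python. The `validators` mapping and both function names are hypothetical, and writing "." when every field is dropped is an assumption of this sketch rather than documented behaviour of the tool.

```python
def is_float(text):
    """True when the text parses as a floating point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False

def drop_invalid_info(info, validators):
    """Remove INFO entries whose value fails the check registered for their key."""
    kept = []
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        check = validators.get(key)
        if check is None or check(value):
            kept.append(entry)
    # Assumption: an INFO column left empty is written as the missing value ".".
    return ";".join(kept) if kept else "."

validators = {"AF": is_float}  # hypothetical rule: AF must be numeric
print(drop_invalid_info("AN=123;AF=not_a_frequency;DP=345", validators))  # → AN=123;DP=345
```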

Other improvements included in this version are:

  • Support for genomic ploidy different from 2
  • Ensuring all the variants that don't require fixing are written after running the vcf-debugulator
  • Simplified build process using a Docker image (recommended for developers only)

You can download the executables using the links below.

VCF v4.3 support and automatic error fixing

27 Jul 20:14

This release brings many exciting new features! VCF v4.3 is now supported and has been tested against more than 150 VCF files, so you can be sure it will catch a lot of pesky errors.

To make error solving a bit easier, vcf-validator now contains 2 different tools:

  • The validator, which can write reports to plain text and now also to a portable database (SQLite). The user doesn't need to fix every error by hand, because this database can later be processed by an automated tool such as...
  • The "debugulator", which reads the validator reports and automatically corrects as many errors as possible. This version can remove duplicated variants, and we will add more fixes in the future. In this release, the debugulator support is experimental and has some important bugs that were fixed in newer versions.
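To give an idea of what removing duplicated variants involves, here is a small sketch. It assumes well-formed, tab-separated lines and treats two records as duplicates when they share CHROM, POS, REF and ALT; that definition and the function name are assumptions of this example, and it works directly on lines rather than on a validator report.

```python
def remove_duplicate_variants(lines):
    """Yield header lines unchanged and only the first occurrence of each variant."""
    seen = set()
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        key = (chrom, pos, ref, alt)
        if key not in seen:
            seen.add(key)
            yield line

records = [
    "#CHROM\tPOS\tID\tREF\tALT",
    "1\t100\t.\tA\tT",
    "1\t100\trs1\tA\tT",  # same position and alleles as the previous record
    "1\t200\t.\tG\tC",
]
print(len(list(remove_duplicate_variants(records))))  # → 3
```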

Compiler compatibility improved

03 Nov 13:31

Support for multiple compilers has been improved and is now checked automatically whenever changes are committed to the repository. The list of fully supported compilers is:

  • Clang 3.5 to 3.7
  • GCC 4.8 to 5.0

Static linking supported during build

24 Sep 14:47

Static linking is now supported during the build process, benefiting those who can't install the dependencies on the machine that will run the validator.

If that is your case, please run the build on a system where you have root permissions, adding the -DBUILD_STATIC=1 option to the cmake command.

Multiple validation levels and more errors/warnings handled

23 Sep 13:25

This version introduces support for different validation levels: check only errors, errors and warnings, or stop after the first error is found.
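One way to picture the three levels is as a filter over the findings of a validation run. In this sketch the `Level` names, the `(severity, message)` tuples and the choice to hide warnings in stop mode are all assumptions made for illustration, not the validator's actual interface.

```python
from enum import Enum

class Level(Enum):
    ERROR = "errors only"
    WARNING = "errors and warnings"
    STOP = "stop after the first error"

def report(findings, level):
    """Filter (severity, message) pairs according to the chosen validation level."""
    reported = []
    for severity, message in findings:
        # Warnings are only shown at the WARNING level (an assumption of this sketch).
        if severity == "warning" and level is not Level.WARNING:
            continue
        reported.append((severity, message))
        if severity == "error" and level is Level.STOP:
            break
    return reported

findings = [("warning", "w1"), ("error", "e1"), ("error", "e2")]
print(len(report(findings, Level.ERROR)))    # → 2
print(len(report(findings, Level.WARNING)))  # → 3
print(len(report(findings, Level.STOP)))     # → 1
```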

The previous version didn't properly handle all the corner cases in meta-data descriptions. Empty descriptions and those containing escaped quotes are now considered valid.

Warnings are now raised when:

  • A comma is found in the ID column
  • The POS column value is zero
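The two warning conditions listed above can be sketched like this, assuming a tab-separated data line with POS in the second column and ID in the third. The function name and message texts are hypothetical.

```python
def collect_warnings(line):
    """Return warning messages for one VCF data line."""
    columns = line.rstrip("\n").split("\t")
    warnings = []
    if "," in columns[2]:
        warnings.append("ID contains a comma")
    if columns[1] == "0":
        warnings.append("POS is zero")
    return warnings

print(collect_warnings("1\t0\trs1,rs2\tA\tT"))  # → ['ID contains a comma', 'POS is zero']
```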

Please check the milestone information for a detailed description of resolved issues.

Working version for VCF 4.1 and 4.2

09 Sep 09:46

First working version of the validator, supporting VCF 4.1 and 4.2 via a formal grammar implemented using the Ragel State Machine Compiler (http://www.colm.net/open-source/ragel/).

It has been tested against all the files submitted to the European Variation Archive (www.ebi.ac.uk/eva) up to this date. More stand-alone tests are being implemented and will be included in future releases.