Update Bioconductor converters to handle multi-allelic data and data with non-variant segments. #46

Open
deflaux opened this issue Jul 28, 2015 · 0 comments


deflaux commented Jul 28, 2015

The variant converters currently have trouble with multi-allelic data and non-variant segments. We can work around this by filtering and reshaping the data before sending it to the converters (example below), but it would be better to push correct handling of both cases into the package.

See also #32 for another example of how non-variant segments can be expressed in the data.

  variants <- getVariants(datasetId="10473108253681171589", chromosome="22",
              start=50300077, end=50301500)

  # Remove non-variant segments
  only_variants <- Filter(function(v) { 1 <= length(v$alternateBases)}, variants)

  # Convert to biallelic data by truncating alternateBases to the first allele
  # (this isn't how it should be fixed, it's just an example)
  biallelic_variants <- lapply(only_variants, function(v) {
    if (1 < length(v$alternateBases)) {
      v$alternateBases <- v$alternateBases[[1]]
    }
    v
  })

  granges <- variantsToGRanges(biallelic_variants)
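
As a possible starting point for handling this inside the package, here is a minimal sketch that expands each multi-allelic variant into one record per alternate allele instead of discarding the extra alleles. `splitMultiAllelic` is a hypothetical helper, not an existing package function, and the toy records only mimic the `referenceBases` and `alternateBases` fields used above. It does not remap genotype calls, which index into `alternateBases`, so a real fix would need to handle those as well.

```r
# Hypothetical helper (not part of the package): expand each variant into one
# record per alternate allele. Non-variant segments (no alternate alleles)
# contribute no records, matching the Filter() step in the workaround.
splitMultiAllelic <- function(variants) {
  expanded <- lapply(variants, function(v) {
    lapply(v$alternateBases, function(alt) {
      record <- v
      record$alternateBases <- alt  # a single allele per record
      record
    })
  })
  # Flatten one level so the result is a plain list of variant records.
  unlist(expanded, recursive = FALSE)
}

# Toy records for illustration only; real records come from getVariants().
toy_variants <- list(
  list(referenceBases = "A", alternateBases = c("C", "G")),   # multi-allelic
  list(referenceBases = "T", alternateBases = character(0)),  # non-variant segment
  list(referenceBases = "G", alternateBases = "A")            # already biallelic
)

split_variants <- splitMultiAllelic(toy_variants)
alt_alleles <- vapply(split_variants, function(v) v$alternateBases, character(1))
```

With real data, the result could be passed to `variantsToGRanges` in place of `biallelic_variants`. Whether the converters accept the duplicated positions this produces is an assumption that would need checking.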