how to give ploidy a non missing value in a dosage data set #86

nickychapman · 2023-10-09T16:09:08Z

Background: I've used imputedDosageFile to get my Beagle output in a format I can use with SeqArray. Then I used seqSNP2GDS to make a file for use with SeqVarTools. This object only includes dosage data, no genotypes. I am running into problems using GENESIS because it doesn't like that ploidy is NA. How can I tweak the object so ploidy is non missing? I think I probably need to put some dummy data in the genotype node but I can't figure that out.

Here is what my object looks like.

genodata
Object of class "SeqVarGDSClass"
File::panel_01_chr22.dose.array.gds (267.7K)

[ ] *
|--+ description [ ] *
|--+ sample.id { Str8 628 LZMA_ra(22.3%), 1.2K } *
|--+ variant.id { Int32 429 LZMA_ra(24.4%), 425B } *
|--+ position { Int32 429 LZMA_ra(79.4%), 1.3K } *
|--+ chromosome { Str8 429 LZMA_ra(7.30%), 101B } *
|--+ allele { Str8 429 LZMA_ra(22.0%), 385B } *
|--+ genotype [ ] *
|--+ phase [ ]
|--+ annotation [ ]
| |--+ id { Str8 429 LZMA_ra(24.0%), 393B } *
| |--+ qual { Float32 429 LZMA_ra(5.71%), 105B } *
| |--+ filter { Int32,factor 429 LZMA_ra(5.71%), 105B } *
| |--+ info [ ]
| --+ format [ ]
| --+ DS [ ] *
| --+ data { Float64 628x429 LZMA_ra(12.4%), 260.1K } *
--+ sample.annotation [ ]

smgogarten · 2023-10-10T06:14:52Z

Currently SeqVarTools determines ploidy of a GDS file using the first dimension of the genotype node: https://github.com/smgogarten/SeqVarTools/blob/devel/R/AllUtilities.R#L20

Is there an alternate way to get the ploidy if that value is missing?

nickychapman · 2023-10-10T16:25:36Z

Here's some background to this predicament, in case that is where the problem lies:

I've been using imputedDosageFile (from GWASTools) and then seqSNP2GDS (from SeqArray) to go from a Beagle output file of dosages to a SeqVarGDS object.

In an older version of R (4.1.2), using GWASTools 1.40.0 and seqArray 1.34.0 this works fine and in the resulting object the ploidy is 2.

In a newer version of R (4.3.1), using GWASTools 1.46.6 and SeqArray 1.40.1, the resulting object has ploidy NA. This trips me up when I go to use assocTestSingle in GENESIS.

For now I am going to use the old version of R and proceed as I was before.

zhengxwen · 2023-10-12T03:16:30Z

Currently SeqVarTools determines ploidy of a GDS file using the first dimension of the genotype node: https://github.com/smgogarten/SeqVarTools/blob/devel/R/AllUtilities.R#L20

Is there an alternate way to get the ploidy if that value is missing?

In seqAlleleFreq(), if ploidy is NA (no genotype node, but a numeric dosage node), it will be set to be

ploidy <- getOption("seqarray.ploidy", 2L)

If the environment variable seqarray.ploidy is not defined, then just use 2 as a default value.

This might help.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

how to give ploidy a non missing value in a dosage data set #86

how to give ploidy a non missing value in a dosage data set #86

nickychapman commented Oct 9, 2023 •

edited

smgogarten commented Oct 10, 2023

nickychapman commented Oct 10, 2023

zhengxwen commented Oct 12, 2023

how to give ploidy a non missing value in a dosage data set #86

how to give ploidy a non missing value in a dosage data set #86

Comments

nickychapman commented Oct 9, 2023 • edited

smgogarten commented Oct 10, 2023

nickychapman commented Oct 10, 2023

zhengxwen commented Oct 12, 2023

nickychapman commented Oct 9, 2023 •

edited