Merging UKB SV Files #204

Open
GHawkes93 opened this issue Dec 20, 2023 · 1 comment


Hi,

In the recent release of 500,000 genomes, the UKB has provided SV calls, but only as bgzipped, sample-level VCF files.

I've tried merging these files in groups to create a pVCF. I unzipped each VCF first, since SURVIVOR doesn't seem to accept .gz files. However, the merged files grow so large that I can't merge the groups with each other (I get a "Killed" error). I also tried using bcftools to trim the VCF files so that the FORMAT field contains only genotypes, but then the merge behaved oddly: when merging two files with 9,000 people each, I got only 2 individuals in the output.
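
To make the "only 2 individuals" symptom easy to check, here is a minimal Python sketch that lists the sample columns in a VCF header. It is an illustration only, not part of the original workflow: the function name `vcf_sample_names` and the example file names are hypothetical. It relies on the standard VCF layout, where the `#CHROM` header line has nine fixed columns followed by one column per sample, and on bgzip output being readable by Python's `gzip` module.

```python
import gzip


def vcf_sample_names(path):
    """Return the sample names from the #CHROM header line of a VCF.

    Hypothetical helper for illustration; works on plain .vcf files and
    on gzip/bgzip-compressed .vcf.gz files.
    """
    # bgzip writes gzip-compatible blocks, so gzip.open can read .vcf.gz files.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # Columns 1-9 are fixed (CHROM through FORMAT); the rest are samples.
                return line.rstrip("\n").split("\t")[9:]
            if not line.startswith("#"):
                break  # reached data records without seeing the header line
    raise ValueError(f"no #CHROM header line found in {path}")
```

Running it on each input group and on the merged output (for example `len(vcf_sample_names("group_01.vcf"))`, where the file name is a placeholder) shows at which step the sample columns are lost.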

Do you have any suggestions for how I could perform this analysis?

Cheers,
Gareth


GHawkes93 commented Dec 20, 2023

I should add: I'm using a 72-core machine. Each group file (approx. 9k people) is ~270 GB and contains ~0.5M SVs.
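
For a sense of scale, the figures above can be turned into a rough per-genotype cost. This is a back-of-the-envelope sketch, not a measurement: it assumes 270 GB means decimal gigabytes, spreads the whole file size (fixed columns and INFO included) evenly over all sample-by-SV cells, and, for the extrapolation, assumes the SV count stays at ~0.5M for the full cohort, which probably underestimates it.

```python
# Figures quoted in the comment above; the byte conversion is an assumption.
GROUP_FILE_BYTES = 270e9      # ~270 GB, taken as decimal gigabytes
SAMPLES_PER_GROUP = 9_000     # approx. 9k people per group file
SVS_PER_GROUP = 500_000       # ~0.5M SVs per group file
COHORT_SAMPLES = 500_000      # size of the full release


def bytes_per_cell(file_bytes, n_samples, n_variants):
    """Average uncompressed bytes per sample-by-variant cell."""
    return file_bytes / (n_samples * n_variants)


per_cell = bytes_per_cell(GROUP_FILE_BYTES, SAMPLES_PER_GROUP, SVS_PER_GROUP)

# Naive extrapolation: same per-cell cost and same SV count for all samples.
cohort_terabytes = per_cell * COHORT_SAMPLES * SVS_PER_GROUP / 1e12

print(f"{per_cell:.0f} bytes per genotype cell")
print(f"{cohort_terabytes:.0f} TB for a full uncompressed pVCF")
```

Under these assumptions each genotype cell costs about 60 bytes, and a single uncompressed pVCF for the whole cohort would be on the order of 15 TB.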
