Error running vcf2xg.sh on human data #42
Comments
Please let me know if you can help me debug this. |
Thank you for your report. I will get back to you soon. |
I improved the error message in the latest commit. Is it possible for you to try it again? |
Hi, I processed my Sequel II human CCS data (PacBio) with SMRT Tools CCS, pbmm2, and pbsv.
|
Thank you for trying MoMI-G.
For the last three errors, the path to the vg binary may be incorrect.
The third argument of the command needs to be the path to the vg binary installed in your environment. I updated the error message for the last three error lines. |
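A quick way to confirm the third argument before running the script is to check that it resolves to an executable. This is only a minimal sketch: `resolve_vg_binary` is a hypothetical helper name, not part of MoMI-G or vg, and it only checks that the file exists and is executable, not that it really is vg.

```python
import os
import shutil


def resolve_vg_binary(path_or_name):
    """Return an absolute path to an executable, or raise FileNotFoundError.

    Hypothetical helper for illustration; not part of MoMI-G.
    """
    # shutil.which accepts a bare name (searched on PATH) or an explicit path,
    # and returns None when nothing executable is found.
    resolved = shutil.which(path_or_name)
    if resolved is None:
        raise FileNotFoundError(f"no executable found for {path_or_name!r}")
    return os.path.abspath(resolved)
```

Calling `resolve_vg_binary("/path/to/vg")` with whatever is passed as the third argument should either return the absolute path or fail early with a clear message.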
Thank you so much. I re-checked and changed the input file and the vg path, then re-ran it.
|
I improved the error message. Is it possible for you to try it with the latest master? |
Thank you for your instructions.
|
It seems to be a corner-case bug, and I fixed it in the latest master. Could you try it again? |
Thank you for your help. I git pulled again and retried. Then I got another error.
|
I updated the script so that it no longer raises the error above. |
Thanks @KokoroO7 for reporting this error also. I've git pulled again but I get an error after
|
The error message seems to come from vg. Could you try it again with the following command? vg convert -g -p [ggf file] > output.vg |
Sure.
Stack trace
|
I'd like to know if
If both answers are yes, then the error seems to be related to vg. |
Yes, the input file
Not sure what the memory requirements would be for this?
I'm going to try to run this on a cluster with lots of RAM to see what happens (prior to opening a |
With
|
I updated the script to raise an error if there is an edge record with a missing source/target name. |
I agree that the size is odd.... Also, the
What do you make of this? |
I updated the script to generate a GFA file instead of a GGF file. I also added an error message for lines that are malformed when generating the GFA format. |
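To locate a malformed line independently of the script, the generated file can be scanned directly. The sketch below assumes the standard GFA 1 layout (tab-separated `S name sequence` and `L from fromOrient to toOrient overlap` records); `check_gfa_line` is a hypothetical name, and these checks are not necessarily the ones the MoMI-G script performs.

```python
def check_gfa_line(line, lineno):
    """Return an error string for a malformed S or L record, else None.

    Hypothetical validator assuming the GFA 1 tab-separated layout.
    """
    fields = line.rstrip("\n").split("\t")
    kind = fields[0]
    if kind == "S":
        # Segment: S <name> <sequence> [tags...]
        if len(fields) < 3 or not fields[1]:
            return f"line {lineno}: segment needs a name and a sequence"
    elif kind == "L":
        # Link: L <from> <fromOrient> <to> <toOrient> <overlap> [tags...]
        if len(fields) < 6:
            return f"line {lineno}: link needs 6 fields, got {len(fields)}"
        src, src_orient, dst, dst_orient = fields[1:5]
        if not src or not dst:
            return f"line {lineno}: link has a missing source/target name"
        if src_orient not in {"+", "-"} or dst_orient not in {"+", "-"}:
            return f"line {lineno}: link orientation must be + or -"
    # Other record types (H, P, comments, ...) are not checked in this sketch.
    return None


def check_gfa(lines):
    """Collect error strings for every malformed line (1-based numbering)."""
    errors = []
    for lineno, line in enumerate(lines, start=1):
        problem = check_gfa_line(line, lineno)
        if problem is not None:
            errors.append(problem)
    return errors
```

Passing an open file handle to `check_gfa` returns a list of messages, so an empty list means no S or L record tripped these particular checks.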
This is the new error message I get when running the updated script:
This time the output size of the
Trying the command on
|
I improved the error message again. However, I suspect the input VCF is not supported by the MoMI-G tools. If the current master still raises the same error message, I recommend confirming that the input VCF file contains only SVs and then running SURVIVOR on it. |
The current master gives errors for two RUFUS
Manta
Looks like it's a problem with the VCF files. What's the proper way to re-format with SURVIVOR?
Where But I see From the number of SV calling tools we've tried so far (for short-read callers) Could you please add support for RUFUS? |
Actually, these VCF variants are not currently supported. For example, the output of RUFUS seems to include SNP records, which the MoMI-G tools do not currently support because MoMI-G is not suited to visualizing SNPs. I plan to look into it if time allows. |
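One way to check whether a callset mixes SNP and SV records is to keep only the records that carry an `SVTYPE` key in the INFO column. This is a rough sketch under that assumption (it follows the VCF convention for structural variants); `keep_sv_records` is a hypothetical name, and this is not the filter MoMI-G or SURVIVOR applies.

```python
def keep_sv_records(lines):
    """Yield header lines plus records whose INFO column has an SVTYPE key.

    Hypothetical filter; assumes SV records are marked with SVTYPE in INFO.
    """
    for line in lines:
        if line.startswith("#"):
            yield line  # keep meta-information and the column header as-is
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # too few columns to have an INFO field; skip
        info_keys = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
        if "SVTYPE" in info_keys:
            yield line
```

Comparing the number of records before and after the filter gives a quick indication of how many non-SV records the caller emitted.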
Hi @6br okay good to know that Could you also please provide a list of SV calling tools which you've tried on |
I've run Manta on public data which only reports
Removing the first record:
Removing that record:
and so on. |
The list of software is here https://github.com/MoMI-G/MoMI-G#adapt-your-own-dataset. |
As I don't have any calling data from non-Illumina data it would be useful if you could provide some test data. Certainly it should be possible to transform a callset generated by SV callers for Illumina into a format the I'm wondering if any SV calling data can be coerced into a format that For example, I subset
Next, I ran
It's odd because this was the same reference genome which I aligned the data to...
|
Looking at the reference index (
So, indeed, the reference genome does not contain Since the records from
Not sure why Could you please fix this or let me know which reference genome(s) are tested on |
For historical reasons, there are two naming variants of the human reference genome: GRCh and hg. The hg references use chr-prefixed contig names, while earlier versions of GRCh do not. |
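A mismatch of this kind can be spotted by comparing the contig names in the VCF against those in the reference index. The sketch below is only illustrative: the function names are hypothetical, it assumes a tab-separated `.fai` index with the contig name in the first column, and it only tries adding or stripping the `chr` prefix.

```python
def fai_contigs(fai_lines):
    """Contig names from a .fai index (first tab-separated column)."""
    return {line.split("\t", 1)[0] for line in fai_lines if line.strip()}


def vcf_contigs(vcf_lines):
    """Contig names used by VCF records (first column, header lines skipped)."""
    return {
        line.split("\t", 1)[0]
        for line in vcf_lines
        if line.strip() and not line.startswith("#")
    }


def toggle_chr_prefix(name):
    """Strip a leading 'chr' if present, otherwise add one."""
    return name[3:] if name.startswith("chr") else "chr" + name


def missing_contigs(vcf_lines, fai_lines):
    """Map each VCF contig absent from the reference to its prefix-toggled
    name if that one exists in the reference, else to None."""
    reference = fai_contigs(fai_lines)
    report = {}
    for name in vcf_contigs(vcf_lines) - reference:
        candidate = toggle_chr_prefix(name)
        # Note: a plain toggle will not map the mitochondrial contig between
        # conventions that call it chrM and MT.
        report[name] = candidate if candidate in reference else None
    return report
```

An empty result means every contig in the VCF is present in the reference; a non-`None` value suggests the two files differ only in the `chr` prefix for that contig.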
I've returned to give MoMI-G another shot, this time with human data, but I get an error trying to generate the xg file. These are the output files: