comparing --call output bed files with gfa file #102

Open
suyeonwy opened this issue Aug 11, 2023 · 0 comments
suyeonwy commented Aug 11, 2023

Hi,

I'm building a pangenome graph from 19 genome assemblies with minigraph.
After building the graph, I generated a bed file for each sample with the following command, which gave me 19 bed files.
minigraph -cxasm --call pangenome.gfa [sample*.fa] > sample*.bed

I noticed that several bed files contain a path (+s447539 -> +s489808) that doesn't exist in the gfa file.
Here is the path information from two of the output bed files:

# from sample1.bed
NC_010460.4     49562573        49562960        >s447539        >s447541        >s489808:847:+:LUXR01077659.1:1025219:1026573

# from sample2.bed
NC_010460.4       49562573        49562960        >s447539        >s447541        >s489808:847:-:AORO02009665.1:479046:480559

In this example, samples 1 and 2 both have the path +s447539 -> +s489808 (equivalent to -s489808 -> -s447539), but the input gfa file has no link for that path.
Here is the only line in the input gfa file that contains both segment names, which I found with grep:

$ grep "s447539" pangenome.gfa | grep "s489808"
L       s489808 -       s447539 +       0M      SR:i:18 L1:i:847        L2:i:1
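To rule out a strand-notation mix-up on my side, the comparison can be sketched in Python. This is only a minimal sketch: the helper names (`revcomp`, `canonical`, `load_links`, `has_link`) are hypothetical and not part of minigraph, and it assumes the standard GFA L-line layout (from segment, from strand, to segment, to strand). It treats a link and its reverse complement as the same edge, then asks whether the link seen in the bed files is present.

```python
FLIP = {"+": "-", "-": "+"}

def revcomp(link):
    """Return the same link read from the opposite strand."""
    (a, a_strand), (b, b_strand) = link
    return ((b, FLIP[b_strand]), (a, FLIP[a_strand]))

def canonical(link):
    """Pick one fixed representative of a link and its reverse complement."""
    return min(link, revcomp(link))

def load_links(gfa_lines):
    """Collect canonical links from GFA L-lines (L, from, strand, to, strand, ...)."""
    links = set()
    for line in gfa_lines:
        f = line.split()
        if f and f[0] == "L":
            links.add(canonical(((f[1], f[2]), (f[3], f[4]))))
    return links

def has_link(links, a, a_strand, b, b_strand):
    """True if the link exists in the graph in either orientation."""
    return canonical(((a, a_strand), (b, b_strand))) in links

# The single L-line returned by grep above.
gfa = ["L       s489808 -       s447539 +       0M      SR:i:18 L1:i:847        L2:i:1"]
links = load_links(gfa)

print(has_link(links, "s447539", "+", "s489808", "+"))  # False: link implied by the bed files
print(has_link(links, "s447539", "-", "s489808", "+"))  # True: reverse complement of the L-line
```

With this check, the link from the bed files is still absent from the graph even after accounting for reverse complements, so the mismatch is not just a difference in notation.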

Conversely, I also found paths that are in the gfa file but not in any of the bed files.
For example, the gfa file contains the path +s590861 -> -s349685:

$ grep "s590861" pangenome.gfa | grep "s349685"
L       s590861 +       s349685 -       0M      SR:i:8  L1:i:61 L2:i:16327

but I cannot find the corresponding path in any of the 19 bed files that I generated.
(I searched them with grep and got no matches.)
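For reference, this is roughly how I read the links out of a bed line, written as a small Python sketch. The function names (`bubble_walk`, `oriented_links`) are hypothetical, and the column layout is my assumption: column 4 is the bubble source, column 5 the sink, and column 6 starts with the path through the bubble, followed by colon-separated fields. I also assume that `*` means a path with no segments between source and sink and that `.` means no call for that sample.

```python
import re

def bubble_walk(bed_line):
    """Return [source, inner segments..., sink] for one --call line, or [] if uncalled."""
    f = bed_line.split()
    source, sink = f[3], f[4]
    path = f[5].split(":")[0]
    if path == ".":  # assumption: "." marks a bubble with no call for this sample
        return []
    # Assumption: "*" is an empty path, so findall yields no inner segments.
    inner = re.findall(r"[<>][^<>]+", path)
    return [source, *inner, sink]

def oriented_links(walk):
    """Turn a walk such as ['>s1', '<s2'] into consecutive (segment, strand) pairs."""
    to_strand = {">": "+", "<": "-"}
    steps = [(s[1:], to_strand[s[0]]) for s in walk]
    return list(zip(steps, steps[1:]))

# The line from sample1.bed shown earlier.
line = ("NC_010460.4     49562573        49562960        >s447539        "
        ">s447541        >s489808:847:+:LUXR01077659.1:1025219:1026573")

for (a, sa), (b, sb) in oriented_links(bubble_walk(line)):
    print(f"{sa}{a} -> {sb}{b}")
# +s447539 -> +s489808
# +s489808 -> +s447541
```

Writing the links out this way puts the bed files in the same +/- notation as the L-lines in the gfa file, so the two can be compared directly.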

I'm not sure how to interpret these results.

Thanks,
Suyeon Wy
