KeyError during "Propagating labels to unlabelled vertices" #4

nick-youngblut opened this issue Dec 6, 2020 · 2 comments

@nick-youngblut

The error:

GraphBin2 started
-------------------
Total number of contigs available: 276680
Total number of edges in the assembly graph: 23569
Number of bins available in binning result: 13
Number of binned contigs: 2261
Total number of unbinned contigs: 274419
Number of isolated contigs: 270459

Removing labels of unsupported vertices...
Iteration: 1
100%|███████████████████████████████████████████████████████████| 2261/2261 [00:03<00:00, 669.23it/s]
Iteration: 2
100%|███████████████████████████████████████████████████████████| 2178/2178 [00:02<00:00, 731.72it/s]
Iteration: 3
100%|███████████████████████████████████████████████████████████| 2177/2177 [00:02<00:00, 734.18it/s]
Iteration: 4
100%|███████████████████████████████████████████████████████████| 2176/2176 [00:02<00:00, 734.44it/s]

Refining labels of inconsistent vertices...
Iteration: 1
100%|███████████████████████████████████████████████████████████| 2176/2176 [00:02<00:00, 733.30it/s]
Iteration: 2
100%|███████████████████████████████████████████████████████████| 2176/2176 [00:02<00:00, 770.52it/s]
Iteration: 3
100%|███████████████████████████████████████████████████████████| 2176/2176 [00:02<00:00, 771.00it/s]

Obtaining non isolated contigs...
100%|██████████████████████████████████████████████████████| 276680/276680 [00:29<00:00, 9521.30it/s]

Number of non-isolated contigs: 5095
Number of non-isolated unbinned contigs: 2919

Propagating labels to unlabelled vertices...
  0%|                                                                       | 0/2919 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/ebio/abt3_projects/software/dev/ll_pipelines/llmga/bin/scripts/GraphBin2/src/graphbin2_SPAdes.py", line 617, in <module>
    sorted_node_list_ = [list(runBFS(x, threhold=depth)) for x in contigs_to_bin]
  File "/ebio/abt3_projects/software/dev/ll_pipelines/llmga/bin/scripts/GraphBin2/src/graphbin2_SPAdes.py", line 617, in <listcomp>
    sorted_node_list_ = [list(runBFS(x, threhold=depth)) for x in contigs_to_bin]
  File "/ebio/abt3_projects/software/dev/ll_pipelines/llmga/bin/scripts/GraphBin2/src/graphbin2_SPAdes.py", line 350, in runBFS
    labelled_nodes.add((node, active_node, contig_bin, depth[active_node], abs(coverages[contigs_map[node]]-coverages[contigs_map[active_node]])))
KeyError: 276488
  0%|

What is the KeyError referring to? Which key is not found?
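
On the mechanics: the value printed after `KeyError:` is the key that was looked up and not found, so one of the dictionary lookups on line 350 has no entry for `276488`. The sketch below uses made-up stand-ins for `contigs_map` and `coverages` (their contents, and the meaning given to them in the comments, are assumptions for illustration, not taken from GraphBin2) to show how one might tell which of two nested lookups is the one that fails.

```python
# Hypothetical stand-ins for two dictionaries named in the traceback.
# Assumed meaning: contigs_map maps a graph vertex to a contig number,
# and coverages maps a contig number to its coverage.
contigs_map = {0: 11, 1: 12}
coverages = {11: 35.0}  # contig 12 has no entry, as if it had been filtered out


def locate_missing_key(node, contigs_map, coverages):
    """Return (mapping name, missing key), or None if both lookups succeed."""
    if node not in contigs_map:
        return ("contigs_map", node)
    contig = contigs_map[node]
    if contig not in coverages:
        return ("coverages", contig)
    return None


# The nested lookup raises KeyError, and the exception carries the missing key.
try:
    coverages[contigs_map[1]]
except KeyError as err:
    missing_key = err.args[0]

print(missing_key)
print(locate_missing_key(1, contigs_map, coverages))
```

Running a check like this just before the failing line would show whether the vertex itself or the contig it maps to is the entry that is absent.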

conda list output:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
biopython                 1.78             py39hbd71b63_1    conda-forge
ca-certificates           2020.12.5            ha878542_0    conda-forge
cairo                     1.16.0            h488836b_1006    conda-forge
certifi                   2020.12.5        py39hf3d152e_0    conda-forge
fontconfig                2.13.1            h1056068_1002    conda-forge
freetype                  2.10.4               h5ab3b9f_0
gettext                   0.19.8.1             h9b4dc7a_1
gmp                       6.2.1                h58526e2_0    conda-forge
icu                       67.1                 he1b5a44_0    conda-forge
ld_impl_linux-64          2.35.1               hed1e6ac_0    conda-forge
libblas                   3.9.0                3_openblas    conda-forge
libcblas                  3.9.0                3_openblas    conda-forge
libffi                    3.3                  he6710b0_2
libgcc-ng                 9.3.0               h5dbcf3e_17    conda-forge
libgfortran-ng            9.3.0               he4bcb1c_17    conda-forge
libgfortran5              9.3.0               he4bcb1c_17    conda-forge
libglib                   2.66.3               h1f3bc88_1    conda-forge
libgomp                   9.3.0               h5dbcf3e_17    conda-forge
libiconv                  1.16                 h516909a_0    conda-forge
liblapack                 3.9.0                3_openblas    conda-forge
libopenblas               0.3.12          pthreads_h4812303_1    conda-forge
libpng                    1.6.37               hbc83047_0
libstdcxx-ng              9.3.0               h2ae2ef3_17    conda-forge
libuuid                   2.32.1            h14c3975_1000    conda-forge
libxcb                    1.14                 h7b6447c_0
libxml2                   2.9.10               h68273f3_2    conda-forge
ncurses                   6.2                  he6710b0_1
numpy                     1.19.4           py39h57d35e7_1    conda-forge
openssl                   1.1.1h               h7b6447c_0
pcre                      8.44                 he6710b0_0
pip                       20.3.1             pyhd8ed1ab_0    conda-forge
pixman                    0.38.0               h7b6447c_0
pycairo                   1.20.0           py39h08627d8_1    conda-forge
python                    3.9.0                hdb3f193_2
python-igraph             0.8.3            py39hd24af65_2    conda-forge
python_abi                3.9                      1_cp39    conda-forge
readline                  8.0                  h7b6447c_0
setuptools                50.3.2           py39h06a4308_2
sqlite                    3.34.0               h74cdb3f_0    conda-forge
texttable                 1.6.3              pyh9f0ad1d_0    conda-forge
tk                        8.6.10               hbc83047_0
tqdm                      4.54.1             pyhd8ed1ab_0    conda-forge
tzdata                    2020d                h52ac0ba_0
wheel                     0.36.1             pyhd3deb0d_0    conda-forge
xorg-kbproto              1.0.7             h14c3975_1002    conda-forge
xorg-libice               1.0.10               h516909a_0    conda-forge
xorg-libsm                1.2.3             h84519dc_1000    conda-forge
xorg-libx11               1.6.12               h516909a_0    conda-forge
xorg-libxext              1.3.4                h516909a_0    conda-forge
xorg-libxrender           0.9.10            h516909a_1002    conda-forge
xorg-renderproto          0.11.1            h14c3975_1002    conda-forge
xorg-xextproto            7.3.0             h14c3975_1002    conda-forge
xorg-xproto               7.0.31            h14c3975_1007    conda-forge
xz                        5.2.5                h7b6447c_0
zlib                      1.2.11               h7b6447c_3
@nick-youngblut
Author

I think the error is caused by my using a SPAdes contig FASTA from which all sequences <2000 bp had been removed. I'm guessing that GraphBin2 expects every contig in the *.gfa and *.paths files to also be present in the FASTA file. A warning instead of a KeyError would help, since many users filter the contig FASTA generated by metaSPAdes, which has no minimum contig length.
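
A pre-flight check along these lines could produce such a warning. This is only a sketch and not GraphBin2's code: the function names are hypothetical, and it assumes that each entry in the *.paths file starts with a contig name line beginning with `NODE_`, with the reverse-complement entry marked by a trailing single quote.

```python
import io
import warnings


def fasta_contig_names(handle):
    """Collect the first token after '>' from every FASTA header line."""
    names = set()
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def paths_contig_names(handle):
    """Collect contig names from a paths file (assumed layout, see above)."""
    names = set()
    for line in handle:
        name = line.strip()
        # Skip segment lines and the reverse-complement entries (trailing quote).
        if name.startswith("NODE_") and not name.endswith("'"):
            names.add(name)
    return names


def warn_on_missing_contigs(fasta_names, paths_names):
    """Warn, rather than fail later, about contigs absent from the FASTA."""
    missing = sorted(paths_names - fasta_names)
    if missing:
        warnings.warn(
            f"{len(missing)} contig(s) in the paths file are absent from "
            f"the FASTA file, e.g. {missing[0]}"
        )
    return missing


# Toy inputs: the FASTA was filtered, so NODE_2 survives only in the paths file.
fasta_text = ">NODE_1_length_5000_cov_10.5\nACGT\n"
paths_text = (
    "NODE_1_length_5000_cov_10.5\n1+,2-\n"
    "NODE_1_length_5000_cov_10.5'\n2+,1-\n"
    "NODE_2_length_1500_cov_3.2\n3+\n"
    "NODE_2_length_1500_cov_3.2'\n3-\n"
)
missing = warn_on_missing_contigs(
    fasta_contig_names(io.StringIO(fasta_text)),
    paths_contig_names(io.StringIO(paths_text)),
)
print(missing)
```

With real data the two handles would be the opened contig FASTA and *.paths files; the returned list could then be used to skip those contigs or to stop with a clear message.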

@Vini2
Collaborator

Vini2 commented Dec 6, 2020

Hi @nick-youngblut,

You are correct. GraphBin2 expects all the contigs listed in the *.paths file to be provided for binning. I will add a fix so that users can filter out contigs and still use the original graph. Thank you for pointing this out. I will leave this issue open until I fix it.

@Vini2 Vini2 added bug Something isn't working enhancement New feature or request labels Dec 6, 2020
@Vini2 Vini2 self-assigned this Dec 6, 2020