AttributeError: 'NoneType' object has no attribute 'group' #8

Open
wn835166087 opened this issue Feb 13, 2022 · 4 comments

@wn835166087

Hi,
Thanks for developing this tool.
I ran into a problem when running GraphBin2. Below are my pipeline and the error I got:

flye --meta --nano-hq barcode05-trimmed-2000bp.fastq --genome-size 4.3m --out-dir flye05 --threads 16
perl /programs/MaxBin-2.2.4/run_MaxBin.pl -contig flye05/assembly.fasta -abund flye05/assembly_info.txt -thread 16 -out Sample05
mkdir Sample05
mv Sample05.* Sample05
conda activate graphbin2
python GraphBin2/support/prepResult.py --binned flye05/MaxBin2 --output flye05/MaxBin2
python GraphBin2/graphbin2 --assembler flye --contigs flye05/assembly.fasta --abundance flye05/assembly_info.txt --graph flye05/assembly_graph.gfa --binned flye05/Sample05/initial_contig_bins.csv --output flye05/graphbin2 --nthreads 8

Flye and MaxBin2 run fine.
The GraphBin2 log is:

2022-02-13 11:36:59,497 - INFO - Existing binning output file: flye05/Sample05/initial_contig_bins.csv
2022-02-13 11:36:59,497 - INFO - Final binning output file: flye05/graphbin2
2022-02-13 11:36:59,498 - INFO - Depth: 5
2022-02-13 11:36:59,498 - INFO - Threshold: 1.5
2022-02-13 11:36:59,498 - INFO - Number of threads: 8
2022-02-13 11:36:59,498 - INFO - GraphBin2 started
Traceback (most recent call last):
  File "GraphBin2/src/graphbin2_Flye.py", line 97, in <module>
    contig_num = int(re.search('%s(.*)%s' % (start_n, end_n), record.id).group(1))-1
AttributeError: 'NoneType' object has no attribute 'group'

Any hint on solving this problem?
Thank you very much.
Best,
Nan

@Vini2
Collaborator

Vini2 commented Feb 16, 2022

Hello @wn835166087!

Thanks for posting this issue. From your pipeline, it seems that you are using the original contig file assembly.fasta from the Flye output, which is not supported at the moment. As described in the section Before using Flye assemblies for binning, you need to extract the edge sequences from the assembly graph and use those as the input for binning.
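To illustrate why the traceback ends in 'NoneType' object has no attribute 'group': re.search returns None when the pattern does not occur in the sequence ID, and calling .group(1) on None raises that AttributeError. The sketch below is only an illustration. It assumes the prefix is edge_ and the suffix is empty (the actual values of start_n and end_n are not shown in this thread), and parse_edge_number is a hypothetical helper, not part of GraphBin2.

```python
import re

# Assumed values; this thread does not show how GraphBin2 defines them.
start_n = "edge_"
end_n = ""


def parse_edge_number(record_id):
    """Return the zero-based edge index, or None if the ID is not an edge name."""
    match = re.search("%s(.*)%s" % (start_n, end_n), record_id)
    if match is None:
        # e.g. 'contig_1' from assembly.fasta: there is no 'edge_' prefix to match
        return None
    return int(match.group(1)) - 1


print(parse_edge_number("edge_961"))  # 960
print(parse_edge_number("contig_1"))  # None
```

Under these assumptions, IDs such as contig_1 from assembly.fasta never match, which would be consistent with the error at line 97.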

I'm working on adding support to bin the original assembly.fasta file. I will release an updated version soon.

Let me know if you have any further questions.

Thank you!

@wn835166087
Author

Thank you so much for your reply! However, I'm wondering how to get the corresponding abundance file. The direct output from Flye has the format

#seq_name length cov. circ. repeat mult. alt_group graph_path
contig_143 187871 33 N N 3 * -195,143,994

I assume I need a similar file, but with seq_name values like edge_1.
(If I use the Flye output assembly_info.txt directly, the error occurs on line 113 of graphbin2_Flye.py.)
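For reference, the assembly_info.txt row quoted above can be split into its columns as in the sketch below. It assumes whitespace-separated fields and reads the integers in graph_path as signed edge IDs, skipping non-numeric markers such as *; that reading of the column is an assumption here, and parse_assembly_info_row is a hypothetical helper.

```python
def parse_assembly_info_row(line):
    """Split one assembly_info.txt row into named fields."""
    fields = line.split()
    name, length, cov, circ, repeat, mult, alt_group, graph_path = fields
    edges = []
    for token in graph_path.split(","):
        try:
            # Assumption: the sign encodes orientation, the number is the edge ID.
            edges.append("edge_%d" % abs(int(token)))
        except ValueError:
            continue  # skip non-numeric markers such as '*'
    return {"name": name, "length": int(length), "coverage": float(cov), "edges": edges}


row = parse_assembly_info_row("contig_143 187871 33 N N 3 * -195,143,994")
print(row["name"], row["coverage"])  # contig_143 33.0
print(row["edges"])                  # ['edge_195', 'edge_143', 'edge_994']
```

If that reading holds, one contig can span several edges, so the per-contig coverage in this file would not map one-to-one onto edge names.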

@Vini2
Collaborator

Vini2 commented Feb 27, 2022

Hello @wn835166087,

You can refer to Before using Flye/Miniasm assemblies for binning from the GraphBin documentation. I have provided a script named flye_miniasm_gfa2fasta.py which can generate the edges.fasta file.
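As a rough idea of what extracting edge sequences from a GFA file involves, here is a minimal sketch. It is not the flye_miniasm_gfa2fasta.py script itself; it only assumes that segment lines in the GFA start with S and carry the name and sequence as tab-separated fields, and gfa_segments_to_fasta is a hypothetical name.

```python
import io


def gfa_segments_to_fasta(gfa_handle, fasta_handle):
    """Write each GFA segment (S line) as a FASTA record; return the count."""
    count = 0
    for line in gfa_handle:
        parts = line.rstrip("\n").split("\t")
        # Segment lines look like: S <name> <sequence> [optional tags...]
        if len(parts) >= 3 and parts[0] == "S":
            fasta_handle.write(">%s\n%s\n" % (parts[1], parts[2]))
            count += 1
    return count


# Tiny in-memory example; real use would open assembly_graph.gfa instead.
gfa = io.StringIO(
    "H\tVN:Z:1.0\n"
    "S\tedge_1\tACGT\tdp:i:33\n"
    "L\tedge_1\t+\tedge_2\t+\t0M\n"
    "S\tedge_2\tGGCC\n"
)
fasta = io.StringIO()
print(gfa_segments_to_fasta(gfa, fasta))  # 2
print(fasta.getvalue())
```

For real data, the provided script is the safer choice, since it is the one the documentation refers to.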

Once you get the edges.fasta file, you have to map the reads back to the sequences in the edges.fasta file and calculate the coverage values. You can use the tool CoverM to get the coverage values.

Let me know if you have any further questions.

Thank you!

@wn835166087
Author

Thanks. I used CoverM to get the coverage.
I got abundance.txt in the format
edge_961 11.233458\n
So in graphbin2_Flye.py, on line 105, I changed line = my_file.readline() to line = my_file.readline().rstrip('\n'),
and on line 115 I changed int(strings[1]) to int(float(strings[1])).
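The two changes described above amount to dropping the trailing newline and accepting a decimal coverage value. A minimal sketch of that parsing, assuming a two-column whitespace-separated file with lines like edge_961 11.233458; read_abundance is a hypothetical helper, not GraphBin2 code:

```python
import io


def read_abundance(handle):
    """Map sequence name -> coverage (float) from a two-column file."""
    coverages = {}
    for line in handle:
        strings = line.split()  # split() also discards the trailing '\n'
        if len(strings) < 2:
            continue  # skip blank or malformed lines
        coverages[strings[0]] = float(strings[1])
    return coverages


cov = read_abundance(io.StringIO("edge_961 11.233458\nedge_962\t7.5\n"))
print(cov["edge_961"])       # 11.233458
print(int(cov["edge_961"]))  # 11, i.e. the decimal part is dropped
```

Note that int(float(...)) truncates the coverage, so 11.233458 becomes 11.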

But now I'm totally confused. I still get an error on line 249 (I commented out the try ... except ... there, since it exits the program):
Traceback (most recent call last):
  File "/home/nw323/GraphBin2/src/graphbin2_Flye.py", line 249, in <module>
    contig_num = contig_names_rev[row[0]]
KeyError: 'contig_1'
I prepared the binned input using prepResult.py, which gave me a file like
contig_1,1
contig_10,1
These are the original contig names from the Flye output, but line 249 seems to expect edge_XXX names.

Any suggestions?
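One way to confirm the name mismatch described above is to compare the IDs in the binning CSV with the IDs in the FASTA file passed via --contigs. The sketch below uses only the standard library and small in-memory examples; fasta_ids and bins_missing_from_fasta are hypothetical helpers, not part of GraphBin2.

```python
import csv
import io


def fasta_ids(handle):
    """Collect record IDs (text after '>' up to the first whitespace)."""
    ids = set()
    for line in handle:
        if line.startswith(">"):
            header = line[1:].split()
            if header:
                ids.add(header[0])
    return ids


def bins_missing_from_fasta(csv_handle, known_ids):
    """Return binned sequence names that are absent from the FASTA."""
    return [row[0] for row in csv.reader(csv_handle) if row and row[0] not in known_ids]


fasta = io.StringIO(">edge_1\nACGT\n>edge_2\nGGCC\n")
bins = io.StringIO("contig_1,1\ncontig_10,1\n")
print(bins_missing_from_fasta(bins, fasta_ids(fasta)))  # ['contig_1', 'contig_10']
```

If every binned name is reported as missing, the binning result was probably produced from a different FASTA file (for example assembly.fasta rather than edges.fasta) than the one given to GraphBin2.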
