Encountering an issue when running AlleleCall #182

Open
kamivain opened this issue Aug 14, 2023 · 4 comments
Assignees
Labels
Status: In Progress Has been assigned and is being worked on.

Comments

@kamivain

Hello, I encountered an issue when running AlleleCall on my genomes. It fails with "AttributeError: 'NoneType' object has no attribute 'seq'". What could be causing this? Thank you!

$ chewBBACA.py AlleleCall -i bu_genome -g bu_schema/schema_seed/ --gl bu_result_wgMLST/cgMLST/cgMLSTschema99.txt -o bu_result251_cgMLST --cpu 2

chewBBACA version: 3.2.0
Authors: Rafael Mamede, Pedro Cerqueira, Mickael Silva, João Carriço, Mário Ramirez
Github: https://github.com/B-UMMI/chewBBACA
Documentation: https://chewbbaca.readthedocs.io/en/latest/index.html
Contacts: imm-bioinfo@medicina.ulisboa.pt

==========================
chewBBACA - AlleleCall

Started at: 2023-08-13T22:39:06

Minimum sequence length: 0
Size threshold: 0.2
Translation table: 11
BLAST Score Ratio: 0.6
Word size: 5
Window size: 5
Clustering similarity: 0.2
Prodigal training file: bu_schema/schema_seed/bu_train.trn
CPU cores: 2
BLAST path: /usr/bin
CDS input: False
Prodigal mode: single
Mode: 4
Number of inputs: 251
Number of loci: 971

== CDS prediction ==

Predicting CDS for 251 inputs...
[====================] 100%

== CDS extraction ==

Extracting predicted CDS for 251 inputs...
[====================] 100%
Extracted a total of 1694809 CDS from 251 inputs.

== CDS deduplication ==

Identifying distinct CDS...identified 603928 distinct CDS.

== CDS exact matches ==

Searching for DNA exact matches...found 194185 exact matches (matching 38271 distinct alleles).
Unclassified CDS: 565657

== CDS translation ==

Translating 565657 CDS...
[====================] 100%
Identified 3633 CDS that could not be translated.
Information about untranslatable and small sequences stored in bu_result251_cgMLST/temp/invalid_cds.txt
Unclassified CDS: 562024

== Protein deduplication ==

Identifying distinct proteins...identified 296723 distinct proteins.

== Protein exact matches ==

Searching for Protein exact matches...found 5906 exact matches (22513 distinct CDS, 30655 total CDS).
Unclassified proteins: 290823

== Clustering ==

Translating schema's representative alleles...done.
Creating minimizer index for representative alleles...done.
Created index with 81137 distinct minimizers for 971 loci.
Clustering proteins...
[====================] 100%
Clustered 290823 proteins into 984 clusters.
Clusters to BLAST: 984
[====================] 100%
Classifying clustered proteins...
[====================] 100%
Classified 11856 distinct proteins.
Unclassified proteins: 278967

== Representative determination ==

Iteration 1

Loci: 971
BLASTing loci representatives against unclassified proteins...done.
Traceback (most recent call last):
  File "/home/yao/.local/bin/chewBBACA.py", line 8, in <module>
    sys.exit(main())
  File "/home/yao/.local/lib/python3.10/site-packages/CHEWBBACA/chewBBACA.py", line 1545, in main
    functions_info[process][1]()
  File "/home/yao/.local/lib/python3.10/site-packages/CHEWBBACA/utils/process_datetime.py", line 146, in wrapper
    func(*args, **kwargs)
  File "/home/yao/.local/lib/python3.10/site-packages/CHEWBBACA/chewBBACA.py", line 528, in allele_call
    AlleleCall.main(genome_list, loci_list, args.schema_directory,
  File "/home/yao/.local/lib/python3.10/site-packages/CHEWBBACA/AlleleCall/AlleleCall.py", line 2718, in main
    results = allele_calling(input_files, schema_directory, temp_directory,
  File "/home/yao/.local/lib/python3.10/site-packages/CHEWBBACA/AlleleCall/AlleleCall.py", line 2510, in allele_calling
    locus_results = expand_matches(match_info, prot_index, dna_index,
  File "/home/yao/.local/lib/python3.10/site-packages/CHEWBBACA/AlleleCall/AlleleCall.py", line 1389, in expand_matches
    target_protein = str(pfasta_index.get(target_id).seq)
AttributeError: 'NoneType' object has no attribute 'seq'
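
The last frame of the traceback shows the pattern that fails: `pfasta_index.get(target_id)` returned `None`, which suggests `target_id` was not found in the protein index, and `.seq` was then read from that `None`. Below is a minimal sketch of the same failure mode. It assumes a plain dictionary in place of chewBBACA's index; the `Record` class, the identifiers, and `lookup_protein` are hypothetical and only for illustration, not chewBBACA code.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stand-in for an indexed FASTA record."""
    seq: str


# Hypothetical index; the real pfasta_index is built by chewBBACA.
pfasta_index = {"genome1-protein1": Record("MKV")}


def lookup_protein(index, target_id):
    """Return the protein sequence, or raise a descriptive error if the ID is absent."""
    record = index.get(target_id)
    if record is None:
        # Without this check, record.seq would raise
        # AttributeError: 'NoneType' object has no attribute 'seq'
        raise KeyError(f"{target_id!r} not found in protein index")
    return str(record.seq)
```

An explicit check like this does not fix the underlying mismatch, but it would name the identifier that is missing, which is usually the fastest route to the offending input.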

@ramirma
Member

ramirma commented Aug 14, 2023

Dear @kamivain,

Thank you for your interest in chewBBACA. Please have a look at issue #176. I note that you are using Python 3.10. Although this should not be a problem, we advise using Python 3.9, which may also result in clearer error reporting. Another potential cause is a BLAST version above 2.9; please downgrade if necessary, because there are known incompatibilities. If downgrading BLAST does not solve the problem, the file names or contig names may be the cause. Please look into the previously reported issues on this.

Best Regards,

Mario
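
One way to check the BLAST version mentioned above is to parse the output of `blastp -version`. The following is a rough sketch, assuming `blastp` is on the `PATH` and that "BLAST>2.9" means any release with a major.minor number above 2.9; the function names are hypothetical and not part of chewBBACA.

```python
import re
import shutil
import subprocess


def parse_blast_version(version_text):
    """Extract (major, minor, patch) from text such as 'blastp: 2.9.0+'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def installed_blast_version(executable="blastp"):
    """Return the version of the BLAST executable on PATH, or None if unavailable."""
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        [executable, "-version"], capture_output=True, text=True, check=False
    )
    return parse_blast_version(result.stdout)


if __name__ == "__main__":
    version = installed_blast_version()
    if version is None:
        print("blastp not found on PATH")
    else:
        label = ".".join(str(part) for part in version)
        if version[:2] > (2, 9):
            # Assumption: anything above 2.9 is treated as potentially incompatible.
            print(f"BLAST {label} is newer than 2.9; consider downgrading")
        else:
            print(f"BLAST {label}")
```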

@kamivain
Author

kamivain commented Aug 14, 2023 via email

rfm-targa added the "Status: In Progress" label Sep 6, 2023
@Fla1487

Fla1487 commented Dec 15, 2023

I have a similar problem, but if I run the command on a subset of the genomes, it completes without the error. Conversely, when I run it on the remaining genomes, the problem appears again.
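
The observation that one subset succeeds and the other fails suggests the batch can be narrowed down by repeated halving. Here is a minimal sketch of that idea, assuming a single input is responsible and that any subset containing it reproduces the failure; `find_failing_input` and the `fails` callback are hypothetical names. In practice `fails` would run AlleleCall on the subset and report whether it crashed.

```python
def find_failing_input(inputs, fails):
    """Narrow a failing batch down to one input by repeated halving.

    Assumes fails(subset) is True whenever the subset contains the
    problematic input, and that a single input is responsible.
    """
    candidates = list(inputs)
    if not fails(candidates):
        return None
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        # Keep whichever half still reproduces the failure.
        candidates = first_half if fails(first_half) else candidates[middle:]
    return candidates[0]
```

For 251 inputs this needs roughly eight AlleleCall runs on progressively smaller subsets. If the failure comes from an interaction between files, such as identifiers shared across genomes, halving may not isolate it.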

@rfm-targa
Contributor

Greetings @Fla1487,

Thank you for your interest in chewBBACA. Based on what you report, the problem might be related to one or more of the input files (badly formatted files, special characters in the filename or sequence headers, etc.). Updating to the latest version may also help, as it fixes several issues present in older versions. If you cannot find the cause, please share what is printed to stdout, as it might include enough information to determine the type of issue.

Kind regards,

Rafael
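
A quick pre-check of the input FASTA files can surface the kinds of problems listed above before a long run. The sketch below is only a heuristic: the thread does not say which characters chewBBACA rejects, so the `SAFE_NAME` pattern and the specific checks (unusual filename characters, empty, duplicate, or non-ASCII sequence IDs) are assumptions, and `check_fasta` is a hypothetical helper.

```python
import re
from pathlib import Path

# Assumption: anything outside letters, digits, '.', '_' and '-' is suspicious.
SAFE_NAME = re.compile(r"^[A-Za-z0-9._-]+$")


def check_fasta(path):
    """Return a list of human-readable warnings for one FASTA file."""
    path = Path(path)
    warnings = []
    if not SAFE_NAME.match(path.name):
        warnings.append(f"{path.name}: unusual characters in filename")
    seen_ids = set()
    with path.open(encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if not line.startswith(">"):
                continue
            # The sequence ID is the first whitespace-delimited token of the header.
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            if not seq_id:
                warnings.append(f"{path.name}: empty sequence header")
            elif seq_id in seen_ids:
                warnings.append(f"{path.name}: duplicate sequence ID {seq_id!r}")
            elif not seq_id.isascii():
                warnings.append(f"{path.name}: non-ASCII sequence ID {seq_id!r}")
            seen_ids.add(seq_id)
    if not seen_ids:
        warnings.append(f"{path.name}: no sequence headers found")
    return warnings
```

Running `check_fasta` over every file in the input directory and printing any non-empty result gives a short list of candidates to inspect by hand.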

@rfm-targa rfm-targa self-assigned this Dec 18, 2023