REF_AA not consistent with actual AA at the position from reference protein fasta file #172

Open
Shawn-X-Zhang opened this issue Jan 16, 2024 · 2 comments

@Shawn-X-Zhang

Hello,
I used samtools mpileup and ivar variants to identify codon and amino acid changes in assembled genomes, using a reference genome and a .gff3 file. It turned out that the codon and amino acid listed in the .tsv file don't match the actual codon and amino acid in the reference CDS and protein FASTA files.
Below is the command I used:
mpi_cmd_str = f'samtools mpileup -aa -A -d 20000 -B -Q 0 {sample}.sorted.bam '
ivar_cmd_str = f'ivar variants -p mutations -q 30 -t 0.03 -r {ref_file} -g {gff_file}'
cmd_str = mpi_cmd_str + " | " + ivar_cmd_str
os.system(cmd_str)

As an example, the Excel screenshot below shows the sequence validation for SARS-CoV-2 ORF1ab.
screenshot
Any suggestions?
Thank you very much!

@cmaceves
Collaborator

Hi, would you mind supplying a sample bam, reference, and gff file so I can take a look?

@Shawn-X-Zhang
Author

Thanks for your quick reply.
GitHub does not allow uploading files over 25 MB, so I uploaded the files to Google Drive:
https://drive.google.com/drive/folders/1yytG0_DnAr_mvBTCTZKdlMKa2iHOT_4p?usp=sharing

For Staphylococcus aureus, I compared REF_AA with the actual amino acid at the same position for many proteins.
Some are consistent and some are not.
I also uploaded those files; could you please take a look as well?
https://drive.google.com/drive/folders/1z8ag7921A5s6Bw9AxNESLCANcMeWtSd0?usp=sharing
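
For anyone who wants to reproduce the comparison by hand, here is a minimal sketch of how the reference codon and amino acid at a genomic position can be derived from the reference sequence and a CDS start coordinate. It is not iVar's implementation. The function name `ref_codon_at` and its parameters are hypothetical, coordinates are assumed to be 1-based as in GFF3, and it only covers a single-segment CDS on the forward strand (reverse-strand and multi-segment CDS features are not handled).

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def ref_codon_at(ref_seq, cds_start, pos, phase=0):
    """Return (codon_number, codon, amino_acid) for a 1-based genomic position.

    Hypothetical helper: assumes a forward-strand, single-segment CDS whose
    first base is at 1-based coordinate cds_start (as in a GFF3 file).
    """
    offset = pos - cds_start - phase
    if offset < 0:
        raise ValueError("position lies before the first full codon of the CDS")
    codon_index = offset // 3                             # 0-based codon index
    codon_start = cds_start + phase + 3 * codon_index     # 1-based genomic start
    codon = ref_seq[codon_start - 1:codon_start + 2].upper()
    if len(codon) < 3:
        raise ValueError("position lies beyond the end of the reference")
    # Codons with ambiguous bases (e.g. N) are reported as X.
    return codon_index + 1, codon, CODON_TABLE.get(codon, "X")


# Toy reference: the CDS starts at position 3 (ATG AAA CCC TAG).
toy_ref = "GGATGAAACCCTAG"
print(ref_codon_at(toy_ref, cds_start=3, pos=7))
```

The returned amino acid can then be compared with the REF_AA column of the .tsv file and with the residue at the same codon number in the reference protein FASTA.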
