Suppressed Sequences included in RefSeq_viral_genomes_v2.4.0.fa.gz #224

Open
selkamand opened this issue Nov 13, 2023 · 1 comment

Hi, thanks so much for your work on this tool!

Just wanted to flag that RefSeq_viral_genomes_v2.4.0.fa.gz includes several sequences that NCBI has since marked as 'suppressed'. Suppression suggests the sequences were of poor quality or potentially contaminated, so they could negatively affect Arriba's viral detection. Would it be possible to remove these sequences from RefSeq_viral_genomes_v2.4.0.fa.gz in future versions?

A complete list of the problematic sequences is below:

NC_027359.1_Propionibacterium_phage_PHL082M00-complete_genome
NC_027991.1_Staphylococcus_phage_SA1-complete_genome
NC_029050.1_Salmonella_phage_21-complete_genome
NC_029072.1_Salmonella_phage_19-complete_genome
NC_035203.1_Grapevine_virus_T_isolate_Cho_replicase_ORF1-TGB1_ORF2-TGB2_ORF3-TGB3_ORF4-and_CP_ORF5_genes-complete_cds
NC_023591.1_Mycobacterium_phage_Adler-complete_genome
NC_024711.1_Uncultured_crAssphage-complete_genome
NC_026813.1_Fusarium_graminearum_hypovirus_2_isolate_FgHV2_JS16-complete_genome
NC_002669.1_Lactococcus_prophage_bIL310-complete_genome
NC_002671.1_Lactococcus_prophage_bIL312-complete_genome
NC_002670.1_Lactococcus_prophage_bIL311-complete_genome
NC_001847.1_Bovine_herpesvirus_1-complete_genome
NC_007045.1_Staphylococcus_phage_PT1028-complete_genome
NC_041920.1_UNVERIFIED_Escherichia_phage_HP3-complete_genome
NC_042059.1_Halobacterium_phage_phiH_T4-T4-and_T_down_LX1_down_genes-complete_sequence_and_orf75_T_down_LX3_down_gene-complete_cds
NC_043055.1_Caprine_herpesvirus_1_strain_E_CH_glycoprotein_B_gene-complete_cds
NC_043057.1_Cervid_herpesvirus_2_strain_Salla_82_glycoprotein_E_US8_gene-partial_cds
NC_043229.1_Johnston_Atoll_virus_isolate_LBJ_polymerase_PB1_PB1_gene-complete_cds
NC_043230.1_Johnston_Atoll_virus_isolate_LBJ_hemagglutinin_HA_gene-complete_cds
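
Until a cleaned file is available, the listed records could be dropped locally with something like the sketch below. This is not Arriba's own tooling, just a minimal stdlib example. It assumes each FASTA header is `>` followed by a name formatted like the ones listed above, so that the accession is the first two underscore-separated fields (e.g. `NC_027359.1`). The function names and the output filename in the usage note are hypothetical.

```python
import gzip
from typing import Iterable, Iterator, Set

# Accessions taken from the list of suppressed sequences above.
SUPPRESSED: Set[str] = {
    "NC_027359.1", "NC_027991.1", "NC_029050.1", "NC_029072.1",
    "NC_035203.1", "NC_023591.1", "NC_024711.1", "NC_026813.1",
    "NC_002669.1", "NC_002671.1", "NC_002670.1", "NC_001847.1",
    "NC_007045.1", "NC_041920.1", "NC_042059.1", "NC_043055.1",
    "NC_043057.1", "NC_043229.1", "NC_043230.1",
}


def filter_fasta(lines: Iterable[str], suppressed: Set[str]) -> Iterator[str]:
    """Yield FASTA lines, skipping every record whose accession is suppressed."""
    keep = True
    for line in lines:
        if line.startswith(">"):
            # Assumed header layout: ">NC_027359.1_Propionibacterium_phage_..."
            # The accession is rebuilt from the first two "_"-separated fields.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            accession = "_".join(name.split("_")[:2])
            keep = accession not in suppressed
        if keep:
            yield line


def filter_fasta_gz(src: str, dst: str, suppressed: Set[str] = SUPPRESSED) -> None:
    """Stream a gzipped FASTA from src to dst, dropping suppressed records."""
    with gzip.open(src, "rt") as fin, gzip.open(dst, "wt") as fout:
        fout.writelines(filter_fasta(fin, suppressed))
```

Usage would look like `filter_fasta_gz("RefSeq_viral_genomes_v2.4.0.fa.gz", "viral_genomes.filtered.fa.gz")`, where the second filename is only a placeholder. Matching on the accession rather than the full header keeps the filter working even if the descriptive part of a header differs slightly from the names listed here.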

suhrig (Owner) commented Nov 13, 2023

Thank you for making me aware of this! This is very useful feedback. I will exclude them in the next release.
