Consensus sequences for U.S. H5N1 clade 2.3.4.4b

andersen-lab/avian-influenza

BioProject: PRJNA1102327

This repository aims to provide consensus sequences, variant calls and depth information for the SRA data associated with BioProject PRJNA1102327. The repository checks for new data every 24 hours and updates the consensus sequences, variant calls and depth information accordingly.

All data generated from 23rd May 2023 onward use the GenBank genome A/cattle/Texas/24-008749-002/2024(H5N1) as the reference. The reference genome is stored in ./reference/. Minimum depth was set at 1, minimum quality at 20, and the consensus threshold at 50%.

Note

Prior to 23rd May 2023, consensus genomes for the 8 segments were generated with EPI_ISL_19032063 (source: GISAID) as the reference using iVar v1.4.2. Minimum depth was set at 1, minimum quality at 20, and the consensus threshold at 50%.
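To give a rough sense of how the three settings above (minimum depth 1, minimum quality 20, consensus threshold 50%) interact at a single position, here is a minimal Python sketch. It is a simplified illustration only and is not the algorithm used by iVar or by the pipeline in this repository. In particular, returning `N` when no base reaches the threshold is an assumption made for simplicity, and the function name and input format are hypothetical.

```python
from collections import Counter

# Thresholds as stated in this README.
MIN_DEPTH = 1
MIN_QUALITY = 20
CONSENSUS_THRESHOLD = 0.5


def call_consensus_base(pileup, min_depth=MIN_DEPTH, min_quality=MIN_QUALITY,
                        threshold=CONSENSUS_THRESHOLD):
    """Call one consensus base from (base, quality) pairs at a single position.

    Simplified illustration: bases below `min_quality` are ignored, positions
    with too few remaining bases become "N", and the most frequent base is
    reported only if its frequency reaches `threshold` (assumption: otherwise
    "N" is returned, whereas real tools may emit an IUPAC ambiguity code).
    """
    counts = Counter(base for base, qual in pileup if qual >= min_quality)
    depth = sum(counts.values())
    if depth == 0 or depth < min_depth:
        return "N"
    base, count = counts.most_common(1)[0]
    return base if count / depth >= threshold else "N"


if __name__ == "__main__":
    # Two high-quality A reads and one low-quality G read at this position.
    print(call_consensus_base([("A", 30), ("A", 35), ("G", 10)]))
```

With a minimum depth of 1, a single read passing the quality filter is enough to call a base, so the files in ./depth/ are worth checking before relying on any individual position.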

The consensus genomes are in ./fasta/.

The SRA metadata is stored in ./metadata/SraRunTable_PRJNA1102327_automated.csv

The variant calls are in ./variants/.

The depth information is in ./depth/.

The pipeline used to generate the consensus genomes is in gp201/flusra

For a Nextstrain-formatted version of the genomes and associated metadata, please see https://github.com/moncla-lab/avian-flu-USDA-cattle/.
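As a starting point for working with the files listed above, the following Python sketch parses a FASTA file and a CSV metadata table using only the standard library. It assumes the consensus files are plain multi-record FASTA and that the metadata file is a comma-separated table with a header row. The function names are hypothetical, and the record names and column names in the example are made up rather than taken from this repository.

```python
import csv
import io


def parse_fasta(handle):
    """Return {record name: sequence} from an open FASTA file handle."""
    records = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(chunks) for key, chunks in records.items()}


def fraction_ambiguous(sequence):
    """Fraction of positions called as N (0.0 for an empty sequence)."""
    if not sequence:
        return 0.0
    return sequence.upper().count("N") / len(sequence)


def read_metadata(handle):
    """Return the metadata rows as a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle))


if __name__ == "__main__":
    # In-memory stand-ins; real files would be opened with open(path).
    fasta = io.StringIO(">segment_1\nACGTNN\nACGT\n>segment_2\nGGCC\n")
    for name, seq in parse_fasta(fasta).items():
        print(name, len(seq), round(fraction_ambiguous(seq), 2))

    metadata = io.StringIO("Run,Host\nEXAMPLE_RUN,cattle\n")  # made-up columns
    print(read_metadata(metadata))
```

To use it on the repository data, pass a handle from `open(...)` on a file under ./fasta/ or on ./metadata/SraRunTable_PRJNA1102327_automated.csv, and inspect the header names before relying on any particular column.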

Data usage

We have shared this data with the hope that people will download and use it, as well as scrutinize it so we can improve the data quality. Please contact us if you have any questions or comments.

Please refer to the NCBI usage policies for more details.


We gratefully acknowledge the authors and the originating and submitting laboratories of the sequences from GISAID's EpiFlu™ Database that we used as references for our genome assemblies. The list is provided in ./acknowledgements.