1. Introduction
Wild migratory waterfowl are well-known reservoir hosts for low pathogenic avian influenza (LPAI) viruses, either carrying the virus asymptomatically or exhibiting mild disease [
1,
2,
3]. When H5 or H7 subtype LPAI viruses spill over from the reservoir host into non-host species such as gallinaceous birds (e.g., chickens, turkeys, pheasants), the virus has the potential to mutate to a highly pathogenic form (HPAI); however, spillback into the natural host is rare. The A/goose/Guangdong/1/1996 (Gs/GD) HPAI H5N1 lineage, which emerged in poultry and spread to poultry in other countries, has generated two clades (most recently H5 clade 2.3.4.4) that have spilled back into wild migratory birds, resulting in intercontinental transmission across five continents and causing poultry outbreaks, wild bird mortality events, and spillovers into mammalian species [
3,
4]. This global expansion of the Gs/GD lineage facilitated increased diversification of the HA gene [
5].
During the 2020-2021 epidemic wave in Europe, the first report of HPAI H5N1 clade 2.3.4.4b viruses in migratory waterfowl was in the Netherlands during October 2020 [
6,
7]. Phylogenetic analyses revealed that the newly detected H5N1 virus was the result of reassortment between a circulating H5N8 2.3.4.4b virus and Eurasian avian lineage LPAI viruses [
6]. Since their initial detection, H5N1 2.3.4.4b viruses have been detected in numerous countries across Africa, Asia, Europe, and North and South America [
5,
8]. H5N1 2.3.4.4b viruses, closely related to those in Europe, were detected in North America in December 2021 on the Atlantic coast of Canada [
9,
10,
11]. Time-scaled phylogenetic analyses revealed that the spread of H5N1 to North America was likely due to migratory waterfowl movement from Iceland to Canada [
11]. The H5N1 clade 2.3.4.4b virus has since disseminated across all four North American flyways with detections in a myriad of wild and domestic bird populations and mammals [
12,
13,
14].
The characterization of genotypic diversity in H5N1 viruses typically relies on phylogenetic analysis and tree construction by gene segment to assess existing and potential novel genotypes. These analyses and data visualization methods provide useful tools for the study of viral spatiotemporal evolution [
9,
13]. However, phylogenetic tree construction for large datasets can be time-consuming, and the resulting trees can be difficult to interpret [
15,
16]. Ordination analyses are powerful tools for the rapid visualization of evolutionary distances in low-dimensional space [
16,
17]. Ordination approaches are not a replacement for phylogenetic analysis but, when the two are used in tandem, they provide a useful means for efficient and targeted data analysis. Indeed, ordination and phylogenetic analyses have been used in parallel in numerous instances, including the study of the evolution of HA and NA genes of influenza A viruses [
18], human virome characterization [
19], and fungal pathogenesis [
20].
The overall goal of this study was to characterize the genotypic expansion of H5N1 2.3.4.4b virus in North America using rapid ordination approaches in concert with phylogenetics. To achieve this goal, we applied ordination analyses to distance matrices generated from sequence alignments of over four thousand H5N1 clade 2.3.4.4b viruses detected in North America between January 2020 and the end of 2023. Our ordination analyses paralleled the results observed with phylogenetic analyses, revealing the expansion of the A1 genotype into numerous reassortant genotypes upon its circulation in North America.
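As a minimal sketch of the first step described above, a pairwise distance matrix can be computed from an alignment before any ordination is applied. The distance measure used in the study is not stated in this section, so the uncorrected p-distance below (the proportion of differing sites) is an assumption, as are the function names and the toy sequences.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap ('-') or an ambiguous 'N'
    are skipped (an assumption; other treatments of gaps are possible).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        differing += x != y
    return differing / compared if compared else 0.0


def distance_matrix(alignment):
    """Return (names, symmetric matrix of pairwise p-distances)."""
    names = list(alignment)
    n = len(names)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = p_distance(alignment[names[i]], alignment[names[j]])
        dist[i][j] = dist[j][i] = d
    return names, dist


# Hypothetical toy alignment (not real H5N1 sequences).
toy = {
    "virus_A": "ATGGAGAAAATAGTGC",
    "virus_B": "ATGGAGAAAATAGTGC",
    "virus_C": "ATGGAAAAGATAGTAC",
    "virus_D": "ATGGAAAAGAT-GTAC",
}
names, dist = distance_matrix(toy)
```

In this sketch one matrix would be built per gene segment, so that a segment acquired by reassortment shows up as a shift in that segment's distances only.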
4. Discussion
Multiple genotyping schemes have been developed for A/goose/Guangdong/1/1996 (Gs/GD) H5N1 lineage viruses. Because of the complex circulation of these viruses within flyways, the distinct genotyping schemes utilized in Asia [
29], Europe [
7], and North America [
14] are useful for understanding and monitoring the change and movement of these viruses within those geographic areas. The GenoFLU tool used in this study was developed to characterize introductions of the 2.3.4.4b clade in the USA as well as subsequent reassortment of that virus with low pathogenic North American viruses [
14]. This study aimed to implement strategies for the efficient detection and visualization of H5N1 2.3.4.4b virus genotypic diversity across North America. To this end, we used multidimensional scaling (MDS) analysis as an alternative to phylogenetic analysis for the identification of H5N1 virus reassortants across North America between January 2020 and December 2023. Our results revealed that the MDS approach readily paralleled the results observed with phylogenetic analysis while reducing time and computational demand. To our knowledge, this is the first study that uses a clustering approach in tandem with phylogenetic analyses for the genotypic assignment of H5N1 viruses in North America. Ordination-based approaches are used in addition to phylogenetic analyses in numerous host-virus systems, including the study of coronavirus evolution through the analysis of the spike protein [
30]. Similarly, studies that assess the virome associated with human blood and plasma utilize cluster-based methods to show the evolutionary space of anelloviruses in relation to viruses within reference databases [
19,
31]. This approach is not limited to the study of viral evolution in mammalian systems but is also used to study tick evolution across geographic ranges [
32]. While ordination analysis effectively detects clustering patterns in a dataset, it has some drawbacks, including sensitivity to outliers, potential for overfitting, and loss of information. Thus, using MDS approaches in addition to phylogenetic analyses can provide a powerful set of tools to study evolutionary changes in targeted populations within and across host systems.
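To make the ordination step concrete, the following is a minimal sketch of classical (Torgerson) MDS applied to a symmetric distance matrix. The MDS variant and software used in the study are not specified in this section, so this choice is an assumption. Power iteration stands in here for a library eigensolver, and the function names and toy input are hypothetical.

```python
import math
import random


def double_center(dist):
    """Return B = -1/2 * J * D^2 * J for a symmetric distance matrix D."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
             for j in range(n)] for i in range(n)]


def matvec(mat, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]


def top_eigenpair(mat, iters=500, seed=0):
    """Dominant eigenpair by power iteration (assumes it is non-negative)."""
    rng = random.Random(seed)
    vec = [rng.uniform(-1.0, 1.0) for _ in mat]
    for _ in range(iters):
        nxt = matvec(mat, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-15:
            return 0.0, vec
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the unit-length vector gives the eigenvalue.
    value = sum(v * w for v, w in zip(vec, matvec(mat, vec)))
    return value, vec


def classical_mds(dist, dims=2):
    """Return (coordinates, eigenvalues) for the leading `dims` axes."""
    b = double_center(dist)
    n = len(b)
    coords = [[0.0] * dims for _ in range(n)]
    eigenvalues = []
    for k in range(dims):
        value, vec = top_eigenpair(b)
        if value <= 1e-12:
            break  # no positive variance left to represent
        eigenvalues.append(value)
        for i in range(n):
            coords[i][k] = math.sqrt(value) * vec[i]
        # Deflate so the next pass finds the next-largest eigenpair.
        b = [[b[i][j] - value * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords, eigenvalues


# Hypothetical toy input: distances among the corners of a 3 x 4 rectangle.
corners = [(0, 0), (3, 0), (0, 4), (3, 4)]
toy_dist = [[math.dist(p, q) for q in corners] for p in corners]
embedding, eigenvalues = classical_mds(toy_dist, dims=2)
```

The share of the summed positive eigenvalues captured by the retained axes gives a rough indication of how much information a two-dimensional plot discards, which relates to the loss-of-information drawback noted above.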
The dataset used in this study identified five distinct H5N1 2.3.4.4b virus introductions into North America [
14], three of which arrived through the Atlantic flyway (A1, A2, and A5) [
9,
10] and two through the Pacific flyway (A3 and A4) [
33]. Interestingly, our analysis of viruses not assigned genotypes by GenoFLU revealed viruses that clustered with the A1 genotype and were first detected in Canada in mid-December 2021; the A1 genotype was then detected in the USA in late December 2021. Although some viruses collected in Canada were not assigned a genotype, the clustering pattern supports the circulation of the A1 genotype in Canada prior to the USA [
34,
35]. Similarly, the A3 genotype was first detected in a sample collected from a bald eagle in British Columbia, Canada, in February 2022, followed by detection in Alaska, USA, in April 2022. The remaining A genotypes were only detected in the USA across 2022 and 2023 [
14,
36]. We acknowledge that virus sequences from Canada were under-represented in our study; for example, H5N5 viruses (provisionally the A6 genotype) were found in Atlantic Canada at the beginning of 2023 [
37]. As such, we cannot exclude the possibility that genotypes A2, A4, A5 or A1 reassortants were circulating in Canada prior to their detection in the USA. Waterfowl migratory flyways span North America in a north-south orientation [
38], making it likely that common H5N1 genotypes in the United States also occur in Canada. Indeed, a previous study reported a reassortant H5N1 genotype containing gene segments closely related to the North American A1 genotype in broiler chickens in Canada [
39], indicating the potential circulation of reassortant H5N1 genotypes in Canada. Thus, future efforts to obtain historical H5N1 virus sequences from Canada would help clarify the introduction and emergence of reassortant H5N1 2.3.4.4b viruses in North America. In addition, while over a thousand H5N1 genomes from the 2022 introduction into North America have been made available, further sequencing of as yet untested samples may reveal additional introductions, such as that of H5N5 [
40], and help elucidate the evolutionary history of these viruses following their introduction.
Reassortment events within H5N1 2.3.4.4b viruses are not surprising and have been reported across numerous continents [
8,
29]. Indeed, studies provide evidence of numerous reassortment events involving one or more gene segments of the virus. For example, a recent study characterized 16 distinct reassortant virus genotypes identified in 233 H5N1 2.3.4.4b virus sequences across countries within Africa, North America, Asia, and Europe [
29,
41]. While one genotype classified as G1, based on the Asian classification scheme [
29], was widely spread across countries from all four continents, numerous other genotypes were associated with specific geographic locales. Furthermore, a recent study of the genotypic diversity of H5N1 virus in the USA revealed 21 distinct clusters, of which six reassortant genotypes accounted for 92% of all viruses identified in the USA [
14]. Our results parallel those described by Youk et al. [
14] for North America, supporting the emergence of reassortant genotypes from wild bird LPAI viruses.
Our study found that reassortment events involved five gene segments of H5N1 viruses. These findings are supported by studies that report reassortment events associated with the same gene segments across the USA and other countries [
14,
42,
43]. Reassortment events associated with avian influenza viruses have been shown to generate substantial viral diversity and to influence host specificity and virulence [
44] and may have contributed to the rapid spread of influenza viruses and their emergence in novel hosts [
5]. Our data curation from public repositories revealed numerous examples of H5N1 viruses detected in North America from mammalian hosts, such as bobcats (
Lynx rufus), red foxes (
Vulpes vulpes), bottlenose dolphins (
Tursiops truncatus), skunks, and seals. Similar host expansion is increasingly reported across North America, with novel detections in marine mammals including dolphins [
45], and the recent spillover into dairy cows [
46,
47,
48] with an associated increase in human detections [
46]. Interestingly, in most cases, viruses detected in mammals were associated with reassortant genotypes, especially genotypes carrying a reassorted PB2 segment. In addition, reassortant H5N1 2.3.4.4b viruses previously detected in North America were also detected in marine mammals in Peru [
42].
The circulation of LPAI H5N1 in poultry populations allowed for the initial emergence of the HPAI H5N1 goose/Guangdong lineage viruses and their spillover to wild bird populations [
49,
50]. Following introduction into the United States, wide distribution of HPAI viruses in wild waterfowl species that are natural hosts for North American LPAI viruses created the potential for reassortment events and the emergence of novel genotypes [
51]. Although some viruses were detected in mammalian hosts, our study found that most H5N1 HPAI viruses in North America during this period were associated with avian hosts, particularly waterfowl and other wild birds. In addition, we showed that H5N1 virus clustering patterns varied by host type. For example, genotype B4.1 was mostly detected in waterfowl, followed by other wild birds. In addition, viruses detected in mammals belonged to several reassortant genotypes but not to the A1 genotype. This dataset suggests an association of viral genotypic diversity with host species; however, we acknowledge that genotype success or frequency, as well as potential sampling biases in the analyzed dataset, may contribute to this pattern. For example: (1) although genotype B1.1 was associated with other wild birds, it was largely associated with black vulture (
Coragyps atratus) mortality events (n=350/456); these animals roost together and are known to feed on members of the roost, perpetuating the infection [
52]; (2) bald eagles are large birds with high visibility, potentially increasing their sampling within raptor mortality events; and (3) surveillance in mammals is limited.
We also show that avian migratory flyways and subregions within a flyway may further contribute to H5N1 genotypic diversity in North America. Four introductions of H5N1 to North America were limited to the Atlantic flyway, while two introductions were limited to the Pacific flyway. However, reassortant genotypes had a higher rate of detection within subregions of the Central and Mississippi flyways. Our results agree with previous reports of genotypic diversity associated with geographic ranges, including bird flyways [
14,
53]. This result may reveal a geographic signature of H5N1 virus detection; however, some degree of sampling bias and data heterogeneity may also affect the effect size and the observed distribution of genotypes. While surveillance mechanisms exist for monitoring avian influenza in wild migratory waterfowl, the absence of a targeted and consistent surveillance strategy in other wild bird species and mammals leaves an incomplete understanding of the full transmission dynamics and distribution of different genotypes.
In conclusion, our study reveals that ordination and cluster-based approaches can complement traditional phylogenetic analyses, specifically for the preliminary assignment of H5N1 viruses to genotypic groups or the identification of novel genotypes. Rapid genotypic assignment of newly detected H5N1 viruses was achieved using a reference dataset that recapitulates the genotypic clusters, followed by K-means clustering. Using this approach, we provide fundamental insight into the genotypic diversity and spread of H5N1 2.3.4.4b viruses across the USA, which expands current knowledge of the genotype diversity of H5N1 viruses in North America. This work also highlights the need for more widespread and standardized surveillance strategies for the accurate depiction of reassortant genotype circulation within non-target host populations, including mammalian hosts.
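The reference-based assignment summarized above can be sketched as follows. This is only an illustration under stated assumptions: the study's initialization, number of clusters, and rule for naming clusters are not given in this section, so the sketch starts K-means from user-supplied centroids, names each cluster by majority vote of its reference genotypes, and assigns a new virus to the nearest centroid. All function names and coordinates are hypothetical.

```python
import math
from collections import Counter


def kmeans(points, centroids, iters=100):
    """Lloyd's algorithm from given starting centroids.

    Returns (centroids, labels), where labels[i] is the cluster of points[i].
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iters):
        labels = [min(range(len(centroids)),
                      key=lambda k: math.dist(p, centroids[k]))
                  for p in points]
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(axis) / len(members)
                                     for axis in zip(*members)))
            else:
                updated.append(old)  # leave an empty cluster's centroid in place
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return centroids, labels


def name_clusters(labels, genotypes):
    """Name each cluster after its most common reference genotype."""
    votes = {}
    for lab, genotype in zip(labels, genotypes):
        votes.setdefault(lab, Counter())[genotype] += 1
    return {lab: counts.most_common(1)[0][0] for lab, counts in votes.items()}


def assign(point, centroids, cluster_names):
    """Assign a new ordination point to the genotype of its nearest centroid."""
    nearest = min(range(len(centroids)),
                  key=lambda k: math.dist(point, centroids[k]))
    return cluster_names.get(nearest, "unassigned")


# Hypothetical reference ordination coordinates with known genotypes.
ref_coords = [(0.0, 0.1), (0.2, 0.0), (0.1, -0.1),
              (2.0, 2.1), (2.2, 1.9), (1.9, 2.0)]
ref_genotypes = ["A1", "A1", "A1", "B1.1", "B1.1", "B1.1"]

centroids, labels = kmeans(ref_coords, [ref_coords[0], ref_coords[3]])
cluster_names = name_clusters(labels, ref_genotypes)
new_call = assign((1.8, 2.2), centroids, cluster_names)
```

A point that lies far from every centroid may represent a novel genotype rather than a member of the nearest cluster, so in practice a distance threshold and phylogenetic confirmation would likely be needed before accepting a call.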
Author Contributions
Conceptualization, Patil Tawidian and Hon Ip; Data curation, Patil Tawidian, Krista Dilione and Jourdan Ringenberg; Formal analysis, Patil Tawidian and Kristina Lantz; Funding acquisition, Mia Torchetti and Hon Ip; Investigation, Patil Tawidian; Methodology, Patil Tawidian, Mia Torchetti, Mary Killian, Sarah Bevins and Julianna Lenoch; Project administration, Hon Ip; Software, Patil Tawidian; Supervision, Hon Ip; Validation, Patil Tawidian; Visualization, Patil Tawidian; Writing – original draft, Patil Tawidian; Writing – review & editing, Mia Torchetti, Mary Killian, Kristina Lantz, Sarah Bevins and Julianna Lenoch.