1. Introduction
Histoplasma capsulatum is a thermally dimorphic fungus that can cause histoplasmosis when inhaled. It is non-contagious and affects humans and animals, including dogs and cats [1]. The fungus lives predominantly in soil, and infections are often linked to activities that disturb the environment, particularly where bird or bat droppings are present [2,3,4]. Clinical presentation ranges from mild, self-resolving illness to moderate pneumonia-like symptoms to severe, life-threatening disseminated disease. Histoplasmosis can affect both healthy individuals and those with compromised immune systems. In disseminated histoplasmosis, the infection can affect several organs, including the lungs, bone marrow, skin, brain, and gastrointestinal tract [5,6].
In the United States, H. capsulatum is endemic to central and eastern states around the Ohio River Valley and the Mississippi River Valley [7]. An estimated 60–90% of the population living in this area has been exposed to the fungus [7]. However, disease surveillance is limited, with histoplasmosis reportable to public health authorities in only 12 states [8]. Among the more than 1,000 cases reported in 2019, the high rates of hospitalization (54%) and death (5%) suggest that the actual number of cases is likely higher [9]. Furthermore, systematic environmental surveillance of H. capsulatum is not conducted. Because of this under-detection of infections and limited surveillance, the true geographic distribution of H. capsulatum in the United States is poorly understood [10].
Based on morphology and pathogenic characteristics, the genus Histoplasma was thought to consist of three distinct varieties: H. capsulatum, H. duboisii, and H. farciminosum [11]. In 2003, Kasuga et al. used the genealogical concordance–phylogenetic species concept (GC–PSC) to classify H. capsulatum into eight clades: North American clades 1 and 2 (NAm 1 and NAm 2), Latin American clades A and B (LAm A and LAm B), Eurasian, Netherlands, Australian, and African, as well as a distinct lineage (H81) composed of Panamanian isolates [12]. The LAm A and LAm B clades, which comprised isolates from Mexico, Suriname, Guatemala, Brazil, and Argentina [12], exhibited the highest genetic diversity. Furthermore, a few cases of LAm A, NAm 1, and NAm 2 clades co-occurred in the endemic areas of North America with different population dynamics [12,13]. More recently, Sepulveda et al. used genomic sequencing to classify H. capsulatum into five genetically distinct clades, of which four could be considered species: NAm 1 (also referred to as H. mississippiense), NAm 2 (also referred to as H. ohiense), LAm A (also referred to as H. suramericanum), Panama lineage H81 (also referred to as H. capsulatum sensu stricto), and Africa [14]. In 2022, a new Indian lineage was reported [15]. However, these clades defined by genomic sequencing have not yet been accepted as valid species [16].
Whole-genome sequencing (WGS) has proven superior to more traditional molecular typing methods for molecular surveillance and epidemiology of infectious diseases [17]. Specifically, it allows detection of genome-wide polymorphisms that can be highly correlated with epidemiologic data and spatio-temporal spread [18]. WGS can also help trace transmission, identify the source of an outbreak, and elucidate the evolution of a pathogen. In the case of H. capsulatum, WGS resolved five distinct major clades within a group previously divided into three varieties on phenotypic grounds, demonstrating its high resolution and its ability to refine our understanding of pathogen diversity.
Here, we used WGS to better describe H. capsulatum in the United States. We present the phylogeographic structure of H. capsulatum within the United States, using clinical isolates obtained from a previous enhanced surveillance study of histoplasmosis patients from eight U.S. states [8].
3. Results
Among the case patients associated with the 48 isolates, 39 (81%) were male, 25 (52%) were aged 21–64 years, and 19 (40%) were immunosuppressed (Table 1). The most common specimen source was bronchial specimens (n = 17, 35%). Cases resided in eight U.S. states (Indiana [n = 2], Kentucky [n = 4], Louisiana [n = 1], Michigan [n = 21], Minnesota [n = 13], Nebraska [n = 1], Pennsylvania [n = 1], and Wisconsin [n = 5]); most were from Michigan (44%), Minnesota (27%), and Wisconsin (10%).
Table 1.
Associated case characteristics for Histoplasma capsulatum isolates collected from eight U.S. states.
| Characteristic | Count | Percentage (%) |
| --- | --- | --- |
| Sex | | |
| Male | 39 | 81 |
| Female | 9 | 19 |
| Total | 48 | 100 |
| Age Group | | |
| <21 | 6 | 13 |
| ≥21 and <65 | 25 | 52 |
| ≥65 | 17 | 35 |
| Total | 48 | 100 |
| Immunosuppressed cases | | |
| Indiana | 1 | 5 |
| Kentucky | 3 | 16 |
| Michigan | 6 | 32 |
| Minnesota | 8 | 42 |
| Wisconsin | 1 | 5 |
| Total | 19 | 100 |
| Specimen Source | | |
| Sputum | 4 | 8 |
| Lymph node | 4 | 8 |
| Bronchial specimen | 17 | 35 |
| Lung tissue | 3 | 6 |
| Blood | 12 | 25 |
| Bone marrow | 3 | 6 |
| Others* | 5 | 10 |
| Total | 48 | 100 |
| States | | |
| Indiana | 2 | 4 |
| Kentucky | 4 | 8 |
| Louisiana | 1 | 2 |
| Michigan | 21 | 44 |
| Minnesota | 13 | 27 |
| Nebraska | 1 | 2 |
| Pennsylvania | 1 | 2 |
| Wisconsin | 5 | 10 |
| Total | 48 | 100 |
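The percentages in Table 1 can be reproduced from the counts with simple arithmetic. The following is a minimal sketch, assuming whole-number rounding of each count over its group total; the function name `percentages` is illustrative and not part of the study's analysis.

```python
# Counts per state of residence, copied from Table 1.
state_counts = {
    "Indiana": 2, "Kentucky": 4, "Louisiana": 1, "Michigan": 21,
    "Minnesota": 13, "Nebraska": 1, "Pennsylvania": 1, "Wisconsin": 5,
}

# Counts per specimen source, copied from Table 1.
specimen_counts = {
    "Sputum": 4, "Lymph node": 4, "Bronchial specimen": 17, "Lung tissue": 3,
    "Blood": 12, "Bone marrow": 3, "Others": 5,
}


def percentages(counts):
    """Return each count as a whole-number percentage of the group total."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


state_pct = percentages(state_counts)
specimen_pct = percentages(specimen_counts)
```

Because each value is rounded independently, the rounded percentages within a group need not sum to exactly 100.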
Genomic sequencing and SNP analysis identified 1,969,979 variant sites, which were used to construct NJ and ML phylogenetic trees. Phylogenetic analysis revealed that the isolates formed clades as previously described for H. capsulatum (Figure 1). Specifically, 44 (92%) isolates clustered with the NAm 2 clade sample SRR6243656 and the reference genome (Figure 1). Three (6%) isolates clustered with the LAm A clade, and one (2%) isolate clustered with the NAm 1 clade. Isolates from the LAm A and NAm 2 clades were separated by ≤346,613 SNPs, and isolates from the NAm 1 and NAm 2 clades were separated by ≤600,854 SNPs.
Figure 1.
Phylogenetic analysis of H. capsulatum samples collected from eight U.S. states. The ML tree includes 54 isolates. Node color indicates the U.S. state where the patient resided.
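The between-clade SNP separations reported above are upper bounds on pairwise differences. As a hedged illustration only (the text here does not describe the exact software or filtering used), the sketch below counts differing variant sites between two genotype strings and takes the maximum over all cross-clade pairs. The isolate names, genotype strings, and function names are hypothetical toy values, not study data.

```python
def snp_distance(a, b, missing="N"):
    """Count variant sites where two genotype strings differ.

    Sites with a missing call in either isolate are skipped (an assumption
    of this sketch, not a documented step of the study).
    """
    if len(a) != len(b):
        raise ValueError("genotype strings must cover the same variant sites")
    return sum(1 for x, y in zip(a, b) if x != y and missing not in (x, y))


def max_between_group_distance(group_a, group_b, genotypes):
    """Largest pairwise SNP distance between members of two groups."""
    return max(
        snp_distance(genotypes[x], genotypes[y])
        for x in group_a
        for y in group_b
    )


# Hypothetical toy genotypes at six variant sites.
genotypes = {
    "iso1": "ACGTAC",
    "iso2": "ACGTAT",
    "iso3": "TCGAAT",
    "iso4": "TNGAGT",
}
clade_x = ["iso1", "iso2"]
clade_y = ["iso3", "iso4"]

max_sep = max_between_group_distance(clade_x, clade_y, genotypes)
```

Applying the same function to pairs within a single group would give a within-clade maximum instead.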
Additionally, MDS analysis agreed with the phylogenetic tree, showing distinct clusters for the NAm 1, NAm 2, and LAm A clades (Figure 2A). Within the NAm 2 clade, two clusters grouped primarily by state. One cluster contained 11 isolates: 10 from cases in Minnesota and one from a case in Wisconsin (Figure 2B). A second cluster contained 10 isolates from cases in Michigan. The remaining 11 cases from Michigan, four from Kentucky, two from Indiana, four from Wisconsin, two from Minnesota, one from Nebraska, and one from Pennsylvania grouped together in a third, mixed-state cluster. The reference sample was the most distant isolate within the NAm 2 clade (not included in the MDS plot).
Figure 2.
Multidimensional scaling (MDS) plot of Histoplasma isolates. (A) MDS plot of the patristic distances, with dimension 1 on the x axis and dimension 2 on the y axis. The plot, which includes all isolates, revealed four distinct clades. Most (92%) of the samples clustered with the NAm 2 clade, three with the LAm A clade, and one with the NAm 1 clade; none clustered with the Panama control sample. (B) MDS plot of isolates from the NAm 2 clade. Two separate clusters, of isolates from Michigan (MI) and Minnesota (MN), were observed.
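To illustrate how an MDS plot such as Figure 2 can be derived from a matrix of pairwise distances, the sketch below implements classical (Torgerson) MDS in plain Python. It is an illustration under stated assumptions: the text here does not say which MDS variant or software was used, the toy distance matrix is hypothetical rather than patristic distances from the study, and the power-iteration eigen-solver assumes that the leading eigenvalues of the centered matrix are positive and distinct.

```python
import math
import random


def double_center(dist):
    """Return B = -0.5 * J * D^2 * J, the centered Gram matrix of classical MDS."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # D^2 is symmetric, so the row means also serve as column means.
    return [
        [-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
         for j in range(n)]
        for i in range(n)
    ]


def top_eigenpair(mat, iters=500, seed=0):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    rng = random.Random(seed)
    n = len(mat)
    vec = [rng.random() + 0.1 for _ in range(n)]
    for _ in range(iters):
        nxt = [sum(mat[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return 0.0, vec
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the unit-length vector gives the eigenvalue.
    value = sum(
        vec[i] * sum(mat[i][j] * vec[j] for j in range(n)) for i in range(n)
    )
    return value, vec


def classical_mds(dist, n_components=2):
    """Embed items so that Euclidean distances approximate the input distances."""
    n = len(dist)
    b = double_center(dist)
    coords = [[0.0] * n_components for _ in range(n)]
    for k in range(n_components):
        value, vec = top_eigenpair(b)
        scale = math.sqrt(max(value, 0.0))
        for i in range(n):
            coords[i][k] = scale * vec[i]
        # Deflate: remove the component just extracted before the next pass.
        b = [[b[i][j] - value * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords


# Hypothetical toy distances between four items (corners of a 3 x 4 rectangle).
toy = [
    [0.0, 3.0, 4.0, 5.0],
    [3.0, 0.0, 5.0, 4.0],
    [4.0, 5.0, 0.0, 3.0],
    [5.0, 4.0, 3.0, 0.0],
]
coords = classical_mds(toy)
```

Each row of `coords` holds the dimension 1 and dimension 2 coordinates of one item. Tree-derived distances are not guaranteed to be Euclidean, so in practice a two-dimensional embedding only approximates them.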
4. Discussion
The phylogeographic structure of H. capsulatum in the United States is poorly understood. Modeling studies have predicted potential shifts in the geographic distribution of H. capsulatum and other environmental fungal pathogens within the United States [29]. As previously observed, the expanding presence of fungal pathogens such as H. capsulatum in new geographic areas could be attributed to changes in their ecological niche and to alterations in the behavior of their natural reservoirs and dispersers [30]. Additionally, soil contamination via the guano of birds and bats is believed to play a crucial role in the dispersal of H. capsulatum [13]. To improve our understanding of the lineages and clade-specific genetic variation of H. capsulatum in the United States, we evaluated 48 histoplasmosis cases from eight U.S. states. By performing whole-genome analysis on the associated isolates, we leveraged WGS, which is recognized as a highly effective molecular epidemiologic tool that provides much greater epidemiologically relevant resolution than classical genotyping methods such as MLST.
Our analysis revealed a single NAm 1 clade isolate from Wisconsin, which is not unexpected given that the NAm 1 clade has previously been reported in North America, in both the United States and Canada [12,31]. Likewise, we found three isolates, from Minnesota, Michigan, and Louisiana, that grouped with the LAm A clade, which has also been reported previously in the United States [11].
Most isolates in this study belonged to the NAm 2 clade, a finding consistent with previous work [12]. Despite belonging to the same clade, isolates differed by up to 60,000 SNPs, highlighting high within-clade genetic diversity (Supplementary Figure 1). NAm 2 is among the oldest H. capsulatum clades, is distant from other clades, and is hypothesized to have emerged 3.2–13 million years ago [12]. H. capsulatum is well known for its genetic complexity and for the role of geographic expansion in creating new lineages with notable phenotypic and virulence differences [12]; for example, extensive genetic variation of H. capsulatum in Latin America has been documented previously [32]. The large intra-clade diversity within the NAm 2 clade could be due to recombination and to changes in selection pressures as geographic boundaries expand, forcing the fungus to adapt rapidly to varying environments [30,33].
Within NAm 2, we identified two genetically distinct clusters, with isolates grouping primarily by state. Most cases from Minnesota were found in one cluster, while those from Michigan were in another, with a few exceptions where isolates from Michigan clustered with those from other states. Moreover, one isolate from Wisconsin was found in the Minnesota cluster. This could reflect an exposure that occurred in Minnesota while the patient resided in Wisconsin. However, it was not possible to determine whether cases were locally acquired or travel-related, for example acquired in a neighboring or visited state. For future studies, interdisciplinary approaches that draw on environmental samples or on veterinary surveillance samples with robust epidemiologic data may prove useful.
Regarding future applications of WGS for genomic surveillance and epidemiology of histoplasmosis, this technology may have a role similar to the one it plays for Coccidioides immitis. WGS has proven to be an effective method for identifying locally acquired Valley fever because of the well-defined phylogeographic structure of C. immitis. Specifically, it is possible to identify cases of locally acquired Valley fever in Washington and to distinguish between exposures in Washington and California [34]. Histoplasmosis, like Valley fever, has endemic and non-endemic areas. However, it is unknown whether H. capsulatum has a strong phylogeographic structure like that described for C. immitis. Therefore, to understand whether WGS can help identify locally acquired cases of histoplasmosis, a more robust characterization of the phylogeographic population structure is needed. Here, we show evidence that this may be possible for some endemic states such as Minnesota and Michigan, but further studies that incorporate environmental sampling and comprehensive travel history are needed to confirm this.
Overall, we employed WGS to investigate the prevalence of Histoplasma lineages in the United States. Our findings shed light on the phylogeographic structure of this significant pathogen and raise questions regarding the potential utility of WGS for genomic epidemiology of histoplasmosis.