4. Discussion
In 59 studies of symptoms lasting more than 3 weeks after acute COVID-19 infection, we found wide ranges of symptom frequencies, variable categories of symptoms included, and marked variation in study design. Most studies to date have focused largely on individual symptom prevalence but are limited by lack of controls and reliance on retrospective data. The current review examined the most prevalent PASC symptoms in 59 scientific articles, as fitted to a 21-symptom phenotype grid. Individual symptom prevalence varied widely with study design: it was lowest in studies based on EHR data; moderate in surveys conducted by outreach to general population samples, to infected inpatients after hospital discharge, or to mixed samples of hospitalized patients and non-hospitalized outpatients; and highest in surveys or EHR records of individuals with suspected Long COVID. After excluding studies with high risk of bias, meta-analytic symptom prevalence across the 21 symptom categories ranged widely, from 2.6% to 28.7% in studies based on surveys and from 0.3% to 7.1% in studies based on EHRs. The challenges posed by varied study design range from potential biases that result in under-ascertainment of symptoms in EHR studies to the reporting of symptom frequencies only among individuals with suspected Long COVID, which do not reflect general population prevalence.
EHR studies demonstrated the lowest prevalence rates for PASC symptoms, including shortness of breath, fatigue, pain, cognitive problems, and changes in smell and/or taste, compared with other methods, and some symptoms were not available in EHR data. The challenges in survey design include ensuring a representative sample of the general population. Many of the studies that surveyed hospitalized patients, or mixed samples of hospitalized patients and non-hospitalized outpatients, used smaller populations and were classified as high risk of bias. Some large studies that surveyed general population samples had low response rates, which raises the issue of representativeness of responders, and were also rated as high risk of bias. For studies that assessed symptoms only among individuals with suspected Long COVID (Long COVID clinics, diagnosis codes, Long COVID support groups/social media), prevalence data demonstrate much higher symptom rates. The most striking of these are post-exertional malaise (88%) and fatigue (80%), followed by shortness of breath (55%), cognitive problems (53%), joint/muscle pain (42%), fever/chills/sweats (36%), and changes in smell and/or taste (26%), illustrating the burden of symptoms among people with Long COVID.
Differences in data collection methods across the studies can affect how prevalence is reported. Potential biases in symptom assessment in EHR data include absence of diagnosis codes for many symptoms, incomplete symptom assessment or documentation by providers, potentially limited access to care for vulnerable patients, and lack of full data on individuals who receive care outside the EHR system. Many of these limitations could result in underestimates of symptom prevalence, which may contribute to the lower frequencies reported in EHR studies. Conversely, survey studies could be affected by response bias, whereby symptomatic individuals are more likely to respond than asymptomatic individuals. Surveys among individuals seeking resources for Long COVID, such as in specialty clinics or social media support groups, are most likely to exhibit selection bias. The very high prevalence among support groups and Long COVID clinics suggests that these participants' symptoms may have been more severe or more disruptive to their quality of life. Nevertheless, symptom prevalence data among the moderate and low risk of bias studies support a substantial prevalence of symptoms across a wide array of organ systems and suggest that PASC/Long COVID is a multi-system disorder.
While EHR and survey studies each have their own biases, this meta-analysis identifies which PASC symptoms were measured similarly across study designs. For example, chest pain, GI pain, and persistent cough were all detected with similar prevalence in the EHR and survey studies. This suggests that future PASC studies that use EHR data can expect prevalence estimates for these specific symptoms to be more accurate. However, the prevalence of fatigue, post-exertional malaise, and weakness in arms and legs differed significantly between the EHR and the survey studies, so these symptoms may not be as reliably measured. Given that collecting prospective or retrospective data from EHRs can be easier and less expensive than setting up a survey study, future PASC studies could rely on the EHR to track symptoms such as chest pain and gastrointestinal discomfort. However, they should consider alternative ways to measure symptoms like fatigue and post-exertional malaise.
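As a rough illustration of what "similar" versus "different" prevalence between two study designs can mean in practice, the sketch below computes Wilson score intervals for two proportions and checks whether they overlap. This is a simplified stand-in and not necessarily the procedure used in this meta-analysis. The function names and all counts are hypothetical, chosen only for illustration. Non-overlapping intervals are a conservative indicator of a difference, not a formal hypothesis test.

```python
import math


def wilson_interval(events: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95% coverage)."""
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical counts (NOT taken from the reviewed studies): a symptom recorded
# for 30 of 1000 patients in an EHR cohort and 200 of 1000 survey respondents.
ehr = wilson_interval(events=30, n=1000)
survey = wilson_interval(events=200, n=1000)

print(f"EHR:    {ehr[0]:.3f} to {ehr[1]:.3f}")
print(f"Survey: {survey[0]:.3f} to {survey[1]:.3f}")
print("Intervals overlap:", intervals_overlap(ehr, survey))
```

A pooled analysis would normally also account for between-study heterogeneity, which this per-cohort comparison ignores.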
In addition, it has already been observed that a longer recovery course is expected in patients requiring hospitalization or prolonged stays in the hospital [41, 45, 68]. While this meta-analysis confirms the increased prevalence of PASC among hospitalized patients, it provides further insight into exactly how these differences manifest across the various PASC symptoms. Interestingly, for certain symptoms such as headaches, nerve problems/seizures, and changes in smell, no statistically significant difference was observed between hospitalized and non-hospitalized patients. While these data suggest the severity of the acute COVID-19 episode may directly lead to increased prevalence of fatigue, weakness, and shortness of breath in the subsequent recovery phase, an alternative mechanism (auto-immune response, persistent viral reservoir) may explain why that difference is not as significant for PASC symptoms like headache and changes in smell. These findings may also prepare health care providers for what to expect with patients depending on the severity of their acute COVID-19 episode.
Other limitations to the published studies include incomplete symptom lists in some studies and lack of control groups to assess frequency of symptoms in a representative uninfected group. Many of these PASC symptoms, taken individually, are non-specific, prevalent in the population, vary widely, and overlap with many other conditions. The heterogeneity of symptoms suggests that PASC is a set of syndromes with variable etiologies [6, 7]. Because of the substantial short- and long-term effects of PASC, including impacts on quality of life, healthcare costs, and economic productivity [69], it is imperative to better characterize PASC, and PASC sub-phenotypes, in a prospective design with uniform data collection from a diverse group of uninfected and infected participants as the RECOVER Initiative intends to do.
We acknowledge that PASC symptoms are difficult to fully capture and describe, especially considering the introduction of vaccinations, virus variants, and pre-existing conditions. This review attempted a comprehensive overview of symptom phenotypes; however, more than 200 symptoms have been reported post-COVID, and knowledge of PASC is still evolving. Another challenge in this review was the reliability of self-reports relative to EHR data, along with wide variation in the quality of symptom reporting. Occasionally, values had to be visually estimated from graphs and figures when exact numbers were not provided. We have attempted to summarize extant research in a way that is helpful for clinicians, the patient community, and researchers alike while keeping these caveats in mind.
The striking variability in symptom prevalence across studies of PASC/Long COVID illustrates the challenge of defining criteria for this novel, multi-system condition. Studies to date have focused largely on individual symptom prevalence, but some are limited by lack of controls, small sample size, and potentially biased study designs. Symptom prevalence ranges widely, with the lowest prevalence in EHR studies and higher prevalence in survey studies, as illustrated by a range of 1-8% in EHR studies versus 2-28% in survey studies among the low to moderate risk of bias studies. When the study population is limited to participants who report Long COVID, the prevalence is much higher, reflecting a change in the denominator for prevalence calculations and a different study question, namely: “what is the prevalence of symptoms among participants who have any long term symptoms after SARS-CoV-2 infection?” rather than “what is the prevalence among a population infected with SARS-CoV-2?” A further challenge is that many of these symptoms, taken individually, can be common in the population regardless of COVID infection. PASC symptoms are also heterogeneous and overlap with many other conditions. Major questions remain about the effects of vaccination status, viral variants, COVID therapies, comorbidities, social determinants of health, and clinical risk factors on the development of PASC/Long COVID; answering them would enable studies on the underlying pathobiology of individual symptoms and clusters of symptoms. Further research into PASC phenotypes is needed to effectively cluster symptoms in meaningful ways that enable focused pathobiology studies and clinical trials.
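The denominator effect described above can be made concrete with a small arithmetic sketch. All counts below are hypothetical and chosen only for illustration; they are not drawn from any of the reviewed studies. The helper name `prevalence` is likewise an assumption of this example.

```python
def prevalence(cases: int, denominator: int) -> float:
    """Return prevalence as a fraction of the chosen denominator population."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    if not 0 <= cases <= denominator:
        raise ValueError("cases must lie between 0 and the denominator")
    return cases / denominator


# Hypothetical cohort (illustrative numbers only):
infected = 1000          # everyone with documented SARS-CoV-2 infection
with_long_covid = 200    # subset reporting any long-term symptoms
with_symptom = 100       # subset of the Long COVID group reporting one symptom

# Same numerator, two different study questions:
among_infected = prevalence(with_symptom, infected)
among_long_covid = prevalence(with_symptom, with_long_covid)

print(f"Among all infected:     {among_infected:.1%}")
print(f"Among Long COVID group: {among_long_covid:.1%}")
```

The same 100 symptomatic participants yield a fivefold difference in reported prevalence depending solely on which population forms the denominator, which is why estimates from Long COVID clinics or support groups cannot be compared directly with estimates from infected general population samples.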