Real-Time Epidemiology and Acute Care Need Monitoring and Forecasting for COVID-19 via Bayesian Sequential Monte Carlo-leveraged Transmission Models

Xiaoyan Li; Vyom Patel; Lujie Duan; Jalen Mikuliak; Jenny Basran; Nathaniel Osgood

doi:10.20944/preprints202302.0078.v2

Submitted:

06 September 2023

Posted:

07 September 2023

You are already at the latest version

Part of the Following Collection

Preprints on COVID-19 and SARS-CoV-2

Abstract

COVID-19 transmission models have conferred great value in informing public health understanding, planning, and response. However, the pandemic also demonstrated the infeasibility of basing public health decision-making on transmission models with pre-set assumptions. No matter how favourably evidenced when built, a model with fixed assumptions is challenged by numerous factors that are difficult to predict. Ongoing planning associated with rolling back and re-instituting measures, initiating surge planning, and issuing public health advisories can benefit from approaches that allow state estimates for transmission models to be continuously updated in light of unfolding time series. A model being continuously regrounded by empirical data in this way can provide a consistent, integrated depiction of the evolving underlying epidemiology and acute care demand, offer the ability to project forward such a depiction in a fashion suitable for triggering the deployment of acute care surge capacity or public health measures, support quantative evaluation of tradeoffs associated with prospective interventions in light of the latest estimates of the underlying epidemiology. We describe here the design, implementation and multi-year daily use for public health and clinical support decision-making of a particle filtered COVID-19 compartmental model, which served Canadian federal and provincial governments via regular reporting starting in June 2020. The use of the Bayesian Sequential Monte Carlo algorithm of Particle Filtering allows the model to be re-grounded daily and adapt to new trends within daily incoming data – including test volumes and positivity rates, endogenous and travel-related cases, hospital census and admissions flows, daily counts dose-specific vaccinations administered, measured concentration of SARS-CoV-2 in wastewater, and mortality. Important model outputs include estimates (via sampling) of the count of undiagnosed infectives, the count of individuals at different stages of the natural history of frankly and pauci-symptomatic infection, the current force of infection, effective reproductive number, and current and cumulative infection prevalence. Following a brief description of model design, we describe how the machine learning algorithm of particle filtering is used to continually reground estimates of dynamic model state, support probabilistic model projection of epidemiology and health system capacity utilization and service demand and probabilistically evaluate trade-offs between potential intervention scenarios. We further note aspects of model use in practice as an effective reporting tool in a manner that is parameterized by jurisdiction, including support of a scripting pipeline that permits a fully automated reporting pipeline other than security-restricted new data retrieval, including automated model deployment, data validity checks, and automatic post-scenario scripting and reporting. As demonstrated by this multi-year deployment of Bayesian machine learning algorithm of particle filtering to provide industrial-strength reporting to inform public health decision making across Canada, such methods offer strong support for evidence-based public health decision making informed by ever-current articulated transmission models whose probabilistic state and parameter estimates are continually regrounded by diverse data streams.

Keywords:

COVID-19

;

Particle Filtering

;

Machine Learning

;

Epidemiologic Modeling

;

Compartmental Model

;

Projection and Intervention

Subject:

Computer Science and Mathematics - Information Systems

1. Introduction

A novel coronavirus and accompanying infectious disease was reported to the World Health Organization (WHO) in Wuhan, China in December of 2019 [1]. The WHO declared this outbreak a Public Health Emergency of International Concern in January of 2020, designating this new coronavirus disease COVID-19 [1]. Global travel and endogenous spread across hundreds of countries have yielded a worldwide pandemic, with rapidly rising totals of over 752 million confirmed cases, and over 6.8 million confirmed deaths through January 30, 2023 [2].

During the COVID-19 pandemic, ongoing public health order planning, and replanning associated with rolling back and reinstituting measures and conducting timely messaging has benefited from the availability of empirical time series — often holding evidence of shifts in epidemiology, availability of acute care resources, and changes in behaviour with regards to risk, testing, vaccination uptake, and clinical presentation. At the same time, decision-making has relied heavily on a variety of types of dynamic models.

Several previous works [3,4,5,6,7,8] showed success in monitoring, estimating and predicting the transmission of infectious diseases by stochastic filtering of mathematical epidemiology models using with observed datasets via Sequential Monte Carlo (SMC) machine learning algorithms. SMC methods were introduced in the early 2000s and [9,10], and commonly go by the name of particle filtering (PF). Such work has demonstrated that projections forward from dynamic models in health and health care offer substantial additional value if they can be informed by up-to-date, grounded estimates of the current situation. The particle filtering method – together with several variants – has also been used for COVID-19 [11,12,13,14,15] in the last two years since this new infectious disease emerged. Most of these works used public health surveillance data – such as daily reported cases and daily hospitalized admission patients – to track the transmission dynamics. After the SARS-CoV-2 virus was confirmed detected in untreated wastewater [16], several researchers [15] used wastewater surveillance data to ground the mathematical epidemiology models via a partially observed Markov processes (POMP) model. These methods use Markov chain Monte Carlo and sequential Monte Carlo (particle filtering) methods.

This paper presents the use of PF with a model deployed by the health system and used internally for routine provincial-level reporting and decision-making since the fourth month of the pandemic. Such PF incorporated a COVID-19 compartmental transmission model, and a wide variety of observed daily datasets from both public health surveillance data and wastewater surveillance data. Within this context, the COVID-19 model provides an integrated characterization of disease transmission, a natural history of infection including both frankly symptomatic and oligo-/pauci-symptomatic pathways, and distinct passive and active case-finding systems the occurrence of travelling cases, basic COVID-19 related acute care flows and occupancy, characterization of two dose-specific vaccination stages, and mortality. Important model outputs include estimates (via sampling) of the effective reproductive number, the count of undiagnosed infectives, and the count of individuals at different stages of the natural history of infection along both pathways. Since July 2020, the model further incorporated a representation of SARS-CoV-2 fecal viral shedding; and when wastewater evidence is available, the PF framework makes use of a likelihood term, comparing the empirical viral concentration of SARS-CoV-2 in wastewater with model expectations for that concentration.

The model was built in concert with the Saskatchewan Health Authority and has been in production use for regular health system reporting since June 2020, with some model findings informing understanding of the evolving epidemiological context as early as April 2020. Since that time, and beyond its use for reporting to the Saskatchewan Health Authority and Saskatchewan Ministry of Health, the model has been used to deliver reporting contracts with the Public Health Agency of Canada (for each Canadian province), First Nations and Inuit Health Branch (FNIHB). The resulting reports have proven particularly key in day-to-day instituted health system reporting and informing planning for the Canadian Midwestern province of Saskatchewan. In this paper, we characterize the structure of the model and present the results of applying the model to the population of the city of Saskatoon in the province of Saskatchewan of Canada, during the period of wild type SARS-CoV-2 and the alpha variant [17] from February 22, 2020, to July 31, 2021.

2. Methods

2.1. Deterministic Compartmental Model

We describe here the compartmental model used within this system, which characterizes the total population as divided into different compartments distinguished by different pathways of natural progression, severity of illness, diagnosis, and acute care use. For simplicity, our description of the model omits discussion of the evolution of that model, pausing only to note that the vast majority of the model as described here was in use at the start of regular reporting in June 2020. We also exclude from this section a characterization of variants of that model, differing particularly in the levels of stratification involved. We further exclude discussion of variant-specific adjustment of values of some parameters otherwise treated as constant and the model structure adjusted to accommodate further variants in the application.

The structure of the COVID-19 compartmental model is shown in Figure 1, and employs a time unit of days. The compartments of the model are introduced as follows. The model contains a largely orthogonal characterization of progression along two possible natural histories of infection (on one hand) and diagnosis status (on the other). Specifically, the model dichotomizes both the infective (compartments denoted by names prefixed by I or H) and recovered (compartments prefixed by R) populations into diagnosed (sub-scripted by

_{D}

) and undiagnosed (sub-scripted by

_{U}

) status, depending on whether an individual has been diagnosed via lab-confirmed PCR testing. The infective population in the model was further divided into two groups: hospitalized individuals (compartments

H_{N I C U}

and

H_{I C U}

) and those in the community (subcompartments of the supercompartment I). Supercompartment I of infectives in the community is characterized by dividing it into three groups based on the stage of the natural history of infection – presymptomatics (compartments

I_{A U}

and

I_{A D}

), and those at a later stage along each of the two parallel pathways of infection distinguished by degree of symptomaticity. Specifically, the model treats infected individuals as proceeding from the (infectious) presymptomatic phase to one of two possible natural histories of infection: A frankly symptomatic pathway and an oligosymptomatic route of progression, which accept fractions

1 - f_{P A}

and

f_{P A}

of undiagnosed individuals proceeding from presymptomatic compartment

I_{A U}

, respectively. The frankly symptomatic pathway starts at an early stage in which individuals have not yet had the opportunity to exhibit complications (compartments

I_{Y U}

and

I_{Y D}

), and symptoms are assumed to be mild. The progression of an individual from the first to the second symptomatic stage marks the point where any complications emerge, with a specified fraction (denoted as

f_{H}

) of progressing individuals (regardless of erstwhile diagnosis status) developing severe or critical complications. Such individuals suffering complications are presumed to lead to presentation for care and hospitalization. Frankly symptomatic individuals absent complications proceed on to a stage involving symptomatic individuals beyond the risk of complications (compartments

I_{Y N U}

and

I_{Y N D}

). By contrast to the frankly symptomatic pathway, the oligosymptomatic pathway proceeds from the presymptomatic stage through a natural history of infection in which infected individuals remain infective but never develop symptoms sufficient to motivate care-seeking; compartments along this pathway are denoted by an

_{A}

subscript. Like their symptomatic counterparts, oligosymptomatic infectives are characterized as proceeding through two subsequent compartments of

I_{A}

, with the timing of progression identical to the frankly symptomatic stages – oligosymptomatic stage 1 (compartments

I_{A 2 U}

and

I_{A 2 D}

) and oligosymptomatic stage 2 (compartments of

I_{A 3 U}

and

I_{A 3 D}

). The model also considers the vaccinated population, where only Susceptible individuals are assumed to be administered vaccines. Compartment

V 1

represents the persons who have only received one dose of a COVID-19 vaccine, and

V 2

represents the persons who have received two vaccine doses. As is detailed further below, vaccinated individuals are treated as remaining subject to some vaccine-efficacy-moderated risk of infection (denoted

e_{1}

for only having one dose and

e_{2}

for having two doses).

2.1.1. Diagnosis and Case Finding

In this COVID-19 compartmental model, infected patients can be diagnosed both by passive case finding via presentation for care and (separately) via active case finding, such as through contact tracing, screening, and mass testing [18]. Passive case finding is treated as diagnosing symptomatic infectives who present for care, and is treated as endogenously driven within the model. Such presentation-driven diagnosis is represented by red flows in Figure 1, and proceeds from compartments of undiagnosed symptomatic infectives that have not yet exhibited complications

I_{Y U}

to the next stage compartment of diagnosed individuals

I_{Y N D}

. By contrast, reflecting the fact that active case finding can identify individuals not yet exhibiting symptoms, active case finding within the model is represented by flows (the orange arrows in Figure 1) from a broader set of compartments of undiagnosed exposed and infective individuals to the corresponding next stage diagnosed compartments of the model. It is to be noted that because of the multi-day time lag commonly associated with test results in the province, for both passive and active-case finding, we let the flows of undiagnosed infectives proceed to the next stage diagnosed compartments instead of the directly corresponding diagnosed stages; thus, for example, those diagnosed from stage

I_{Y U}

flow into the next stage compartments of

I_{Y N D}

, rather than into

I_{Y D}

.

The daily flow of cases being diagnosed by passive testing, but not leading to hospitalization, is mainly governed according to the endogenous model calculations

I_{Y U} (f_{Y}) / t_{I_{Y}}

, where

f_{Y}

is the fraction of undiagnosed symptomatic infected individuals with complications that do not require hospitalization during their course of infection, and

t_{I_{Y}}

is the mean days to develop or avoid complications; this is bounded by the empirical data (denoted as

E_{m}

) of total test volume presenting other than due to hospitalization or international travel.

E_{m}

can be calculated by the difference between the daily total test volume (denoted as

V_{t}

) and the three-way sum of daily admitted COVID-19 patients to ICU and non-ICU hospitalizations (denoted as

V_{H I C U}

and

V_{H N I C U}

, respectively) and new likely exogenous cases (denoted as

E_{x_{D}}

). This difference reflects the known use of tests for hospitalization, and the fact that out-of-province cases were carefully estimated for the opening weeks of the pandemic, and each required tests.

The model characterization of daily diagnosed cases identified specifically by active case finding – conducted via activities such as contact tracing, screening, and drive-through testing – is designed to capture the fact that in such forms of case finding, testing tends to drive the count of individuals diagnosed, and identifies infected individuals at all stages of the natural history of infection. To represent the fact that test count drives the count of cases diagnosed with an efficiency limited by the number of infected individuals, we made use of a previously published testing model [19]. Within this model, the count of infectives identified by testing is characterized as

I_{U} β_{T} (1 - e^{- α \frac{V}{I_{U}}})

, where V is the total test volume,

I_{U}

is the total count of undiagnosed infectives,

α

is a measure of test efficiency, and

β_{T} \in [0, 1]

represents an upper limit on the fraction of infectives that could be identified via active case finding. In this test model, the term

β_{T} (1 - e^{- α \frac{V}{I_{U}}})

characterizes the fraction of all infectives that are diagnosed. Reflecting the fact that active case-finding efforts are incomplete in their reach,

β_{T}

represents the fraction of infectives that would be diagnosed via active-case finding if the total test volumes V were to be arbitrarily large (i.e., the asymptomatic fraction of infected individuals who would be located as the ratio of test volume to infections approaches infinity); given the broad reach of contact tracing within the province, this work treated

β_{T}

as 1.

α

is a measure of testing efficiency. When

β_{T}

is 1 (as it is here), for a small active test volume V, this can be seen roughly corresponding to the product of the test positivity rate and test specificity: For every test performed,

α

infectives will on average be discovered. The saturating exponential term (

1 - e^{- α \frac{V}{I_{U}}}

) assumes that as the volume of tests performed for active case finding rises, a greater number of tests are needed to find a given infective. Thus, while more tests will identify additional infectives, doubling the count of tests performed will not double the count of infectives identified. By employing this test model to calculate the cases diagnosed by active case finding in this project, and recognizing the priority placed on presentation-driven tests that drive passive case finding, the model assumes that the total volume of tests performed for active finding is given by the difference between the total testing volume (

E_{m}

) and the volume of tests performed for passive case finding

(min (\frac{I_{Y U} f_{Y}}{t_{I_{Y}}}, E_{m}))

, and thus

V_{a c t i v e} = E_{m} - m i n (\frac{I_{Y U} f_{Y}}{t_{I_{Y}}}, E_{m})

. At any time, the total count of undiagnosed infectives can be calculated by summing all of the undiagnosed compartments, which is

I_{U} = E_{U} + I_{A U} + I_{A 2 U} + I_{A 3 U} + I_{Y U} + I_{Y N U}

. Thus, the model gives the diagnosed cases found by active testing as

V_{p} = I_{U} β_{T} (1 - e^{- α \frac{V_{active}}{I_{U}}})

. And the daily count of diagnoses from active case finding for different compartments (e.g.,

E_{U}

,

I_{A U}

,

I_{A 2 U}

,

I_{A 3 U}

,

I_{Y U}

, and

I_{Y N U}

) – depicted as orange arrows in Figure 1 – is treated as simply being split proportionally according to the count of people in each undiagnosed infective compartment.

2.1.2. Acute Care Utilization

Undiagnosed or diagnosed symptomatic individuals who develop severe or critical COVID-19 complications [20] at the time of transitioning from the early stage symptomatic period (leaving

I_{Y U}

and

I_{Y D}

) are presumed to present for care and enter into the hospitalization stocks either for acute but non-critical care (compartment

H_{N I C U}

) or for critical care (

H_{I C U}

) – the purple flows in Figure 1. The fraction of all individuals progressing from diagnosed early- to diagnosed late-stage symptomatic state who are treated as not developing severe or critical COVID-19 complications is treated as

1 - f_{H}

. The fraction of all individuals progressing from undiagnosed early- to diagnosed late-stage symptomatic state diagnosed by passive testing is

f_{Y}

. And the fraction of individuals progressing from undiagnosed early- to undiagnosed late-stage symptomatic state diagnosed by passive testing is

1 - f_{H} - f_{Y}

. Of the fraction

f_{H}

of such progressors requiring hospitalization, the fractions that transition to the ICU (

H_{I C U}

) and non-ICU (

H_{N I C U}

) are given by parameter

f_{I C U}

and

1 - f_{I C U}

, respectively. Individuals in both such hospitalization compartments are further subject to mortality, with deceased individuals transitioning to compartment D at the time of passing, as indicated by the grey flows in Figure 1. Given the overall COVID-19 case fatality rate for hospitalized patients requiring ICU care or not in need of such care (denoted by

ϕ_{I C U}

and

ϕ_{N I C U}

, respectively), the model characterizes the corresponding daily mortality rates as

- l n (1 - ϕ_{I C U}) / t_{I C U}

and

- l n (1 - ϕ_{N I C U}) / t_{N I C U}

, where

t_{I C U}

and

t_{N I C U}

are the mean durations of ICU hospitalized and non-ICU hospitalized patient stays before death, respectively. As a simplifying assumption and to lower the count of compartments required and the resulting size of the state space, the model does not seek to explicitly model continued hospital residence amongst some patients prior to or following ICU discharge.

2.1.3. Exogenous/Endogenous infections

The model considers infectives as originating from both endogenous sources (via infection through contact with other infectives in the modeled population) and exogenous sources (where infectives arrive in the population via out-of-province (and particularly international) arrivals), which are flows represented by the magenta arrows in Figure 1. This exogenous flow is driven by the empirical time series of daily travellers infected outside of the population and was of strong importance for accounting for patterns in the opening two to three months of the pandemic, on account of the importance of international arrivals in driving subsequent endogenous transmission. Endogenous infections are calculated by the transmission system of the model.

2.1.4. Vaccination System

The model considers two levels of vaccination-induced protection for the population [21]. This characterization reflects the fact that Saskatchewan’s vaccination campaign employed only two-dose vaccines, namely Pfizer/BioNTech BNT-162b2, Moderna mRNA-1273, and AstraZeneca ChAdOx1. With the BNT-162b2 vaccine being responsible for approximately 74.86% of all vaccines delivered within the province, and conscious of the adverse impact on model state space size and – by extension – machine-learning inference accuracy, we made the simplifying assumption of characterizing vaccinated individuals by two levels of vaccine protection, rather than with further levels and/or via stratification with respect to each vaccine product. Two flows of daily vaccinated cases from the susceptible (compartment S) to the first level of vaccination-induced protection (compartment

V_{1}

) and from the first dose vaccinated to a higher level of protection (compartment

V_{2}

) (represented by green arrows in Figure 1) are driven by the empirical time series of daily receiving first dose vaccines and second dose vaccines. Because of limited evidence concerning the duration of vaccine protection [21], this model currently assumes the vaccines confer permanent protection. Individuals with both one and two doses of vaccines remain subject to the risk of infection, with the relative risk of infection in each dose-count-specific compartment compared to an unvaccinated symptomatic being given by the one minus an estimate of vaccine efficacy against infection with that dose count. The vaccine efficacy against infection of the vaccines used within Saskatchewan is reported based on clinical trial data [22] that differs from vaccine to vaccine, notably against different COVID-19 variants of concern (VoCs). Reflecting the mixed vaccination regime and the presence of multiple VoCs over the timeframe of the study, the vaccine efficacy against infection considered in this project for a single dose (denoted as

e_{1}

) and two doses (denoted as

e_{2}

) are

0.8

and

0.95

based on the vaccines used in Canada – Pfizer, Moderna, and Astra-Zeneca [22]. While COVID-19 vaccines routinely offer greater efficacy against hospitalization and mortality than against infection, motivated in part by the desire to avoid the adverse effects on model inference of enlarging the state space of the model and lacking ready empirical data on breakthrough infections at time of formulation, the model treats breakthrough infection as placing an individual into the same pathways of infection as are used for an infected unvaccinated individual.

2.1.5. Infectious Transmission System

The force of infection parameter

λ

characterizes the hazard rate of infection – the probability density with which a fully susceptible (e.g., a person in the stocks of S) is subject to infection from an infective, and is governed by mass action principles [23]. The force of infection parameter

λ

is calculated by

c β p_{e}

, where c is the contact rate among the population per unit time,

β

is the probability of transmitting COVID-19 per discordant contact and

p_{e}

is the effective prevalence of infectives in the mixing community. The construct of the effective prevalence of infectives in the mixing community,

p_{e}

, is designed to take into account the mixing implications of the symptoms, diagnosis, and acute care status of infective individuals; we refer to a relative-mixing-level-adjusted size of a subpopulation as the “effective” size of that subpopulation. The effective prevalence of infectives in the mixing community

p_{e}

is represented by the fraction of the effective infectives among the effective population in the community. We assume that undiagnosed oligosymptomatic individuals (in the compartments of

I_{A U}

,

I_{A 2 U}

and

I_{A 3 U}

) have full social contacts and undiagnosed symptomatic individuals (in the compartments of

I_{Y U}

and

I_{Y N U}

) exhibit a relative reduction in the level of social mixing as given by fraction

ρ_{U}

as measured relative to full social contacts (themselves as assumed to be associated with a relative mixing rate of 1), and non-hospitalized diagnosed patients (in the compartments of

I_{A D}

,

I_{A 2 D}

,

I_{A 3 D}

,

I_{Y D}

and

I_{Y N D}

) in the community have a similar proportional reduction in mixing denoted

ρ_{D}

. Hospitalized patients are treated as not engaging in mixing, and thus do not contribute to the size of the effective mixing populations, and thus carry a relative mixing rate of 0. It is important to emphasize that such values represent relative mixing rate characterizations; secular changes in contact rate across the population (such as those that might be caused by public health orders) are characterized by another element of the formulation detailed below. There are three flows in the model reflecting the force of the infection process – the infection from stocks S,

V_{1}

and

V_{2}

, which are associated with rose-coloured flows in Figure 1.

2.1.6. Municipal Wastewater Surveillance Characterization

Municipal wastewater refers to sewage containing waste from households, workplaces and other sources served by municipal infrastructure [24]. In a public health context, wastewater surveillance (WWS) describes the process of sampling and analyzing wastewater to monitor phenomena such as the prevalence of conditions, use of pharmaceuticals, and occurrence of viral outbreaks in communities [24]. Medema et al. [25] demonstrated a significant correlation between COVID-19 virus SARS-CoV-2 concentrations in wastewater and the prevalence of COVID-19. This finding suggested that wastewater surveillance of SARS-CoV-2 could offer a tool to monitor the trends of COVID-19 prevalence in cities. Moreover, wastewater surveillance offers a significant advantage since the concentration of SARS-CoV-2 in the wastewater sampling is representative of the entire population served by the sewage shed, regardless of health status, propensity care-seeking behavior or reported infection status [24]. Moreover, because of the high shedding levels seen in the early stages of infection by SARS-CoV-2, wastewater assays can often identify pre-symptomatic or oligosymptomatic populations.

This project involved the design, implementation, deployment, and routinized use of a particle-filtered compartmental model to estimate epidemiological and health system state using time series including wastewater concentrations of SARS-CoV-2. Due to the dynamics of viral load, fecal shedding in an SARS-CoV-2-infected individual varies across natural histories of infection, such as between symptomatic/asymptomatic, and over stages of progression [26,27,28]. We made use of a weighted shedding model reflecting the fact that individuals in the early stages of infection shed at far higher rates than do those at later stages of infection. Hoffman et al. [28] estimated shedding profile modulates viral concentrations in faecal samples over time; the estimated weights of viral concentration of different stages based on this research are shown in Table 1. In light of that weighted shedding profile for individuals and the larger shedding populations of interest, we treat there as being a constant of proportionality

γ

that relates the (weighted) value of the shedding population to the daily concentration of SARS-CoV-2. Reflecting the fact that the focus of the wastewater monitoring within Saskatchewan was on cities exhibiting separated storm-water and wastewater infrastructure marked by short (

\leq 8

hours) toilet-to-municipal wastewater treatment plant transit times, and use of autosampling from the primary inflow into the treatment plant, we treated the concentration of COVID-19 wastewater samples for a given city as indicative of the current – rather than the lagged – epidemiology for that city.

2.1.7. Model Parameters

Table 1 gives the value and units for constant parameters of the deterministic COVID-19 model; readers interested in further detail regarding the formulations involving these parameters are referred to Appendix A.

2.2. Calculation of Variables of Interest from the COVID-19 Model

Figure 1 shows the system of ODEs governing the behaviour of the deterministic COVID-19 model. As detailed in Section 2.3.1, the stochastic version of this model serves as the state space model for particle filtering. We detail here a set of derived quantities whose formulation is identical for both forms of the model.

A variety of COVID-19 outcomes of interest can be derived from the ODEs shown in Equation A1 in the Appendix A, including those relevant for epidemiological and acute care decision-making. From the standpoint of public health planning and epidemiology, important quantities include a dynamic characterization of the effective reproductive number (denoted as

R_{t}

), the count of undiagnosed infectives in the community with time (denoted as

N_{U}

), and the force of infection (

λ

). Each of these quantities provides information important for understanding the evolution of the current pandemic situation and played a central role in the reporting undertaken from the model. Such measures are especially useful in indicating the evolution of the epidemiological situation, anticipating incipient outbreaks, assessing the performance of current intervention strategies, and informing decisions to be made in the near future, such as those involving relaxation or re-imposition of public health orders.

Some of the model-derived values are of foremost value in the sphere of projection, rather than in the historic time horizon. From the standpoint of acute care and surge planning, the model offers particular value by virtue of its capacity to project forward acute care demand, both in the form of new admissions for COVID-19 Intensive Care Unit (ICU) and non-ICU hospital needs and in terms of census counts for both of those levels of acute care services. Particularly when the stochastic version of the model is used with particle filtering, such information can aid in decisions involving triggering of surge capacity.

2.2.1. Calculation of the Evolving Effective Reproductive Number

The basic reproductive number (denoted as

R_{0}

) and effective reproductive number (denoted here as

R_{t}

) are widely used concepts in mathematical epidemiological models. The basic reproductive number (

R_{0}

) is the average number of secondary infections transmitted by a typical infective individual in a completely susceptible surrounding population [34]. While an understanding of this quantity is of great value, in the context of an evolving outbreak, with a population of evolving susceptibility, behavioural & public-health measure-induced changes in the contact rate, changing variant ecology, and greater day-to-day attention typically rests on the effective reproductive number (

R_{t}

).

R_{t}

is the average number of secondary infections transmitted by a typical infective individual in a population composed of both susceptible and non-susceptible persons and reflective of the current epidemiology, including mixing patterns and public health, institutional and personal protective practices at present, vaccine effectiveness, population turnover, and currently circulating variants. As a general rule, if

R_{t} (t) > 1

, the count of infected individuals will increase over time; if

R_{t} (t) = 1

, the count of infected patients will remain roughly constant; if

R_{t} (t) < 1

, the number of individuals infected will decline over time.

The model detailed here used two methods to calculate the effective reproductive number (

R_{t}

): A simplified original method, and a method that takes into account the differential mixing rates between undiagnosed and diagnosed individuals, and the case-finding process that leads individuals to transition from the former to the latter. Both methods played prominent roles in daily reporting using the model throughout different stages of the pandemic. The original method is based on such an assumption that all infectives exhibit full – not reduced – mixing with the susceptibles throughout their full duration of infectivity (i.e., until recovery). Recalling that

e_{1}

and

e_{2}

represent vaccine effectiveness given one or two administered doses, respectively, and that

ρ_{U}

and

ρ_{D}

denote relative rates of mixing amongst symptomatic-but-undiagnosed individuals and diagnosed individuals, respectively, the original values of

R_{0}

and

R_{t} (t)

in this COVID-19 model are characterized as follows:

\begin{matrix} R_{0} = C β (t_{I} + t_{I_{Y}} + t_{I_{Y N}}) \\ R_{t} (t) = R_{0} \cdot f_{S u s c} (t) \\ f_{S u s c} (t) = \frac{S (t) + (1 - e_{1}) V_{1} (t) + (1 - e_{2}) V_{2} (t)}{N (t)} \\ N (t) = (S + E_{U} + I_{A U} + I_{A 2 U} + I_{A 3 U} + R_{U} + V_{1} + V_{2}) + ρ_{U} (I_{Y U} + I_{Y N U}) + \\ ρ_{D} (I_{A D} + I_{A 2 D} + I_{A 3 D} + I_{Y D} + I_{Y N D} + R_{D}) \end{matrix}

(1)

However, in real-world scenarios (and in this model), infection spread is governed by other factors besides those captured in the equations above. Specifically, the degree of infection spread from an infective is affected by the relative mixing levels between undiagnosed symptomatics and diagnosed infectives. Whilst the characterization in Equation (1) considers those factors inasmuch as they affect the fraction of contacts that are made with susceptibles, it fails to consider them in terms of the behaviour of the infective individual over the course of their illness. Considering the effective duration infectives spend in different infected stages leads to a new formulation for each of the basic and effective reproductive numbers, denoted

R_{0}^{'}

and

R_{t}^{'}

, respectively:

\begin{matrix} R_{0}^{'} = C β t_{E f f e c t i v e} \\ t_{E f f e c t i v e} = \frac{1}{I_{E}} [(I_{A} + ρ_{D} I_{A D}) (t_{I} + t_{I_{Y}} + t_{I_{Y N}}) + \\ (I_{A 2} + ρ_{D} I_{A 2 D} + ρ_{u} I_{Y U} + ρ_{D} I_{Y U D}) (t_{I_{Y}} + t_{I_{Y N}}) + \\ (I_{A 3} + ρ_{D} I_{A 3 D} + ρ_{u} I_{Y D} + ρ_{D} I_{Y N D}) t_{I_{Y N}}] \\ I_{E} = (I_{A} + I_{A 2} + I_{A 3}) + ρ_{u} * (I_{Y U} + I_{Y N}) + ρ_{D} * (I_{A D} + I_{A 2 D} + I_{A 3 D} + I_{Y U D} + I_{Y N D}) \\ R_{t}^{'} (t) = R_{0}^{'} \cdot f_{S u s c} (t) \end{matrix}

(2)

In this contribution, we employ the latter method, which considers the effective time of infectives to estimate and predict the effective reproductive number

R_{t}

.

2.2.2. Count of Undiagnosed Infectives in the Community over Time

Given the underlying structure of the model, the count of undiagnosed infectives in the community

N_{U} (t)

can be calculated by summing the count of undiagnosed persons in each infective compartment as follows:

N_{U} (t) = I_{A U} + I_{A 2 U} + I_{A 3 U} + I_{Y U} + I_{Y N U}

(3)

2.2.3. Daily Effective Prevalence of Infectives in the Mixing Community

The point prevalence of COVID-19 is the proportion of individuals in a population who have COVID-19 at a specified point in time [35]. Thus, the equation of the standard prevalence is as follows:

\begin{matrix} p_{s t} = \frac{I_{A U} + I_{A 2 U} + I_{A 3 U} + I_{Y U} + I_{Y N U} + I_{A D} + I_{A 2 D} + I_{A 3 D} + I_{Y D} + I_{Y N D}}{S + E_{U} + I_{A U} + I_{A 2 U} + I_{A 3 U} + V_{1} + V_{2} + I_{Y U} + I_{Y N U} + I_{A D} + I_{A 2 D} + I_{A 3 D} + I_{Y D} + I_{Y N D}} \end{matrix}

(4)

In the model, we use the effective prevalence instead of the standard prevalence. The effective prevalence considers the weight of contact coefficients of the undiagnosed infectives (

ρ_{U}

) and the weight of contact coefficients of the diagnosed infectives (

ρ_{D}

). Thus, the daily effective prevalence of infectives in the mixing community can be calculated by the fraction of the effective infectives in the total effective population in the community. The formulation is as follows:

\begin{matrix} p_{t} = \frac{(I_{A U} + I_{A 2 U} + I_{A 3 U}) + ρ_{U} (I_{Y U} + I_{Y N U}) + ρ_{D} (I_{A D} + I_{A 2 D} + I_{A 3 D} + I_{Y D} + I_{Y N D})}{(S + E_{U} + I_{A U} + I_{A 2 U} + I_{A 3 U} + V_{1} + V_{2}) + ρ_{U} (I_{Y U} + I_{Y N U}) + ρ_{D} (I_{A D} + I_{A 2 D} + I_{A 3 D} + I_{Y D} + I_{Y N D})} \end{matrix}

(5)

2.2.4. Force of Infection

Section 2.1.5 introduced the model’s use of the force of infection (

λ

). This quantity can be calculated as the product of what we term the transmission rate – itself the product of the contact rate and probability of transmission per discordant contact – and the fraction of the mixing population that is infectious:

\begin{matrix} λ = c β p_{t} \end{matrix}

(6)

where

p_{t}

is the daily effective prevalence of infectives, as characterized by equation 5.

2.2.5. Cumulative Prevalence of Infections

Period prevalence is the proportion of individuals in a population who have had COVID-19 over a specified period of time [35]. Thus, the cumulative prevalence of COVID-19 infections can be calculated by the fraction of the initial population who have ever been infected by COVID-19. The formulation of the cumulative prevalence of infections at time T is as follows:

p_{c} = \frac{\int_{0}^{T} λ [S + (1 - e_{1}) V_{1} + (1 - e_{2}) V_{2}] d t}{N_{0}}

(7)

2.2.6. New Hospital Admissions and Census Count for non-ICU and ICU Needs

A key motivator for the construction of the COVID-19 model characterized in this project is to estimate and predict acute care demand and capacity utilization. This includes considering hospital admissions – including ICU admission and non-ICU admission cases – and the daily number (census) of ICU and non-ICU hospital patients. The daily hospitalized census for ICU and non-ICU at any given time is simply characterized by the values of the compartments

H_{I C U}

and

H_{N I C U}

, respectively. Recalling that the time unit of the model is days, the per-day rate (daily count) of new admissions of ICU patients is given by the sum of two flows into the

H_{I C U}

compartment, representing the development of critical symptoms by both previously diagnosed (

I Y D

) and (separately) previously undiagnosed (

I Y U

) symptomatic infectives. Similarly, the daily new hospital admissions of patients not requiring ICU care is the sum of two flows into the

H_{N I C U}

compartment, representing the development of severe symptoms by both previously diagnosed (

I Y D

) and undiagnosed (

I Y U

) individuals. Thus the daily new admissions of ICU and non-ICU patients are as follows:

\begin{matrix} d H_{I C U} = \frac{(I_{Y U} + I_{Y D}) f_{H} f_{H I C U}}{t_{I_{Y}}} & d H_{N I C U} = \frac{(I_{Y U} + I_{Y D}) f_{H} (1 - f_{H I C U})}{t_{I_{Y}}} \end{matrix}

(8)

2.3. SMC Algorithm Incorporation of the Stochastic COVID-19 Model

The prominent sequential Monte Carlo (SMC) method of particle filtering is a contemporary state inference and identification methodology that supports filtering of general non-Gaussian and non-linear state space models in light of time series of empirical observations [3,5,9]. This approach estimates, via sampling, the time-evolving internal state of a dynamic system (here, the COVID-19 model) in which random perturbations are present, and where information about the state is obtained via noisy measurements made at each observation time. The state space model characterizes the processes governing the time evolution of the internal state of the system with stochastics consisting of random perturbations. The state of the state space model is assumed in general to be latent and unobservable. Information concerning the latent state is obtained periodically via a noisy observation vector. The particle filtering method can be viewed as undertaking a “survival of the fittest" of varying hypotheses as to the current location of the system in state space, with each such hypothesis being represented by a particle, the fitness of which is determined by the consistency between what is observed empirically at each observation time point and what would be expected given the state of the particle (the hypothesized state) at that time point. Interested readers are referred to a more detailed treatment in [5,9,10].

2.3.1. State Space Model

The state space model depicts the processes governing the time evolution of the state – both latent and observable – of a noisy system. In this paper, the state space model consists of a stochastically embellished variant of the deterministic COVID-19 model depicted in Figure 1 and whose equations are given in Equation A1 of the Appendix A. Reflecting the fact that effective use of particle filtering requires an underlying state equation model exhibiting stochastic variability, we characterize here an extension of the deterministic model that incorporates random perturbations in dynamic processes — including several stochastically evolving parameters — so as to reflect stochastic time evolution in the external world. The extended, stochastic model introduced below then serves as the basis for an accompanying particle filter.

The state vector of the particle filtering model is given by:

\begin{matrix} [S, E_{U}, I_{A U}, I_{A D}, I_{A 2 U}, I_{A 2 D}, I_{A 3 U}, I_{A 3 D}, I_{A U}, I_{A D}, I_{Y U}, I_{Y D}, I_{Y N U}, I_{Y N D}, H_{I C U}, H_{N I C U}, \\ R_{U}, R_{D}, D, \begin{matrix} n a m e d - c o n t e n t c o l o r t y p e r g b c o n t e n t - t y p e 0, 0, 1 l o g i t (C_{β}), l o g i t (α), l o g i t (f_{H}), l o g i t (f_{Y}) \end{matrix} & ]^{T} . \end{matrix}

Dynamic Processes

We consider stochastic processes to characterize the arrival of undiagnosed travel-imported symptomatic cases, contact and care-seeing behaviour, and test positivity rates associated with active screening. Moreover, Poisson processes are used to reflect the stochastics associated with the occurrence of a small number of cases over a small unit of time – denoted as

Δ t

(carrying the value of 0.001 days in the COVID-19 model) [3,5]. The stochastic process characterizing undiagnosed travel-based importation of symptomatic infectives is given by

\frac{P o i s s o n (E x_{D} Δ t \frac{1 - f_{S}}{f_{S}})}{Δ t}

.

Dynamic Parameters

There are a set of quantities that might commonly be regarded as parameters, but whose values evolved in notable ways over the course of the COVID-19 pandemic, particularly with the evolution of human behaviour, variant ecology, due to changes in active case-finding efforts, and arrival of the pathogen in vulnerable demographics and communities. Such quantities are termed “dynamic parameters” herein. The dynamic parameters of the deterministic COVID-19 model are listed in Table 2.

Figure 2. The model structure of the stochastic particle filtering model

Likelihood Function

Our formulation of the overall likelihood function and sub-likelihood functions for this work drew inspiration from our past success in employing negative binomial-based likelihood functions in a diverse set of particle filtering applications in communicable disease [3,4,5,6,7,8], and by others in MCMC-based approaches for H1N1 influenza [36]. Moreover, for simplicity and in line with formulations used successfully in multiple of our past contributions [3,4,6], the current model characterized the overall likelihood function for the particle filtering model as the product of sub-likelihood functions, each considering a distinct subset of the empirical datasets employed to ground the model:

\begin{matrix} L = & L_{N e w R e p o r t e d E n d o g e n o u s C a s e s} \times L_{C u m u l a t i v e R e p o r t e d E n d o g e n o u s C a s e s} \\ \times L_{C u m u l a t i v e I C U A d m i s s i o n s} \times L_{C u m u l a t i v e N I C U A d m i s s i o n s} \\ \times L_{I C U C e n s u s} \times L_{N I C U C e n s u s} \\ \times L_{C u m u l a t i v e D e a t h s} \times L_{V i r a l C o n c e n t r a t i o n s i n w a s t e w a t e r} \end{matrix}

(9)

Each sub-likelihood function is characterized by one of two distinct parametric statistical distributions – a negative binomial distribution or gamma distribution. Such sub-likelihood functions characterize the likelihood of observing the empirical datum, given an underlying model state specified by the particle state. Those two forms of sub-likelihood functions are introduced as follows:

The value of each sub-likelihood function based on a negative binomial distribution is given as follows:

$L_{N e g a t i v e B i n o m i a l} = (\binom{y + r - 1}{r - 1}) p^{r} {(1 - p)}^{y}$

(10)

where y is the observed datum, x is the model value corresponding to that datum (integer rounded), r is the dispersion parameter associated with the negative binomial distribution, and $p = \frac{x}{x + r}$ . In this project, the value of dispersion parameter r was chosen to be 5.
The value of the sub-likelihood function based on a gamma distribution is given as follows:

$L_{G a m m a} = \frac{β^{α} y^{(α - 1)} e^{- β y}}{\int_{0}^{\infty} z^{α - 1} e^{- z} d z}$

(11)

where y is the observed datum, x is the model value corresponding to that datum, k is the shape parameter, $α = \frac{x}{k - 1}$ , and $β = \frac{k}{x}$ . Such likelihood functions within this project assumed a value of $k = 5$ .

It is important to note that while the likelihood function employed here is designed to be used with each of the types of data shown in Table 3, the likelihood formulation is moreover designed to be robust in the context of missing data for several of those types of data. Data that can be accommodated as missing includes hospitalized admission data – ICU and non-ICU, hospitalized census data – ICU and non-ICU, and viral concentration in wastewater data. When a datum is not available for these types of observations, the corresponding sub-likelihood function will be treated as holding a value of unity (1.0). Thus, given missing data of this sort, the overall likelihood function will still carry the value of the product of the sub-likelihood functions for which data is available.

2.4. Data Sources

Since June 2020 and through early 2022 – and with prior episodic use – this model has served in a production capacity for the whole of Saskatchewan, and for varying periods for particular regions, municipalities, and small-area geographies within Saskatchewan. For the period October 2020 - October 2021, via a contract with the Public Health Agency of Canada (PHAC), it was further used for reporting and projections multiple times a week for all provinces of Canada. Beyond that, via a contract for reporting to the First Nations and Inuit Health Branch of Health Canada (FNIHB), the model was used in the period November 2020 - March 2022 for biweekly reporting and projections for First Nations Reserves in six Canadian provinces. Most such uses have exercised subsets of the likelihood functions considered, with hospital census data and wastewater data being restricted to subsets of jurisdictions.

For a given jurisdiction, empirical datasets are fed into the particle filtering model to estimate and predict the evolution of the epidemiological and acute care state of that jurisdiction. The empirical datasets employed in the model can be divided into two categories: A set incorporated in the likelihood function for training the particle filtering model, and another that serves as an exogenous input to the differential equation model.

The following empirical datasets were considered in the likelihood function:

Daily count of new reported incident confirmed or suspected cases.
Cumulative reported incident confirmed or suspected cases from the inception of the pandemic.
Cumulative reported deaths from COVID-19.
Daily count of COVID-19 patients admitted into the ICU.
Daily COVID-19 patients admitted into hospital for non-ICU care.
Daily midnight census (count) of COVID-19 patients in the ICU.
Daily midnight census of COVID-19 patients in the hospital for non-ICU care.
Weekly average virus SARS-CoV-2 concentration in wastewater.

The following empirical datasets were incorporated as exogenous inputs directly into the dynamic model:

Daily new likely exogenous cases, which represent arrivals into the jurisdiction believed to be infected while outside the jurisdiction, with an emphasis on international travel.
Daily count of persons undergoing PCR (nasopharyngeal swab)-based testing.
Daily count of COVID-19 patients admitted into the ICU.
Daily count of COVID-19 patients admitted into hospital for non-ICU care.
Daily count of persons who received the 1st dose vaccination.
Daily count of persons who received their second vaccinate dose.

It is important to note that the empirical datasets of “Daily count of COVID-19 patients admitted into the ICU" and “Daily COVID-19 patients admitted into hospital for non-ICU care" are used in both the likelihood function and in driving the model directly.

2.5. Characterizing Model Fidelity to Empirical Data

A key driver for the evolution of the particle filtering approach applied here was the ongoing critical assessment of the fidelity between outputs from the particle-filtered model and the above empirical data sources. As a primary metric for assessing such fidelity in this project, we employed a discrepancy function. The discrepancy between the particle filtering model results and each empirical dataset specified here is the mean of the normalized-RMSE (root mean square error) across the whole time frame of the model incorporating the empirical data; as such, smaller discrepancies are considered favourable. To accommodate the different scales of multiple empirical datasets, we employ the normalized-RMSE to measure the difference between the model estimated/predicted values and the observed data of each empirical dataset on each day having observed data. The mathematical formulation for the normalized-RMSE is specified in Appendix D.

3. Results

This section characterizes COVID-19 particle filtering model results and (empirical data availability permitting) associated discrepancies for both day-to-day estimates of the epidemiological state and projection of quantities such as the future daily infected cases, force of infection, and ICU and non-ICU admissions and census.

3.1. Particle Filtering Model Results with Incorporating Empirical Datasets

Although the particle filter model characterized here at various intervals provided reporting for 17 different jurisdictions; for the sake of simplicity, we focus here on results for a jurisdiction offering wastewater data and served by amongst the longest spans of data – Saskatoon, Saskatchewan. In the application examined here – which is emblematic of simulations conducted on this jurisdiction over long periods of time – the COVID-19 particle filtering model takes in daily incoming empirical data to produce daily reporting. The model runs start on February 22, 2020, when the empirical data became available from the appropriate public health agency (here, the Saskatchewan Health Authority); testing for COVID-19 began on February 25, 2020, and the first reported infected cases occurred on March 11, 2020. The simulation here proceeds to July 31st, 2021, prior to the widespread appearance of the Delta variant of concern. Particle filtering was conducted with a particle count of 150,000.

Table 4 presents the mean discrepancy, with 5 runs of the model. For comparison in scale, the table further provides the 95% confidence interval of each empirical dataset incorporated in the likelihood function to ground the model. As a reminder, the lower the discrepancy, the better the results sampled from the particle filtered model reproduce the empirical dataset.

Figure 3 to Figure 10 show the particle filtering COVID-19 model’s (the results of the minimum discrepancy among those 5 runs) estimated results, compared with empirical data. The comparison between model results and empirical data indicates that the particle filtering COVID-19 model can estimate the daily COVID-19 transmission and hospitalization status. As an important caveat, for data confidentiality reasons, precise empirical data is only provided here for empirical data publicly available through the Saskatchewan Health Authority COVID-19 dashboard [37]. For depiction of the two types of data not publicly available (ICU and non-ICU hospital admissions) in those figures, we ensure data confidentiality by showing synthetic data in the figure instead of actual data. Specifically, for each of ICU and (separately) non-ICU hospital admissions, the data shown in the figures for a given day is poisson-distributed pseudo-empirical data. That is, for day t with an actual count of

n_{t}

hospital admissions, the synthetic datapoint is drawn from

p o i s s o n (\max (n_{t}, 0.05))

.

Figure 3. Daily new reported confirmed or suspected infective cases between particle filtering model results (boxplot) and empirical data (superimposed red scatterplot)

Figure 4. Cumulative reported infective cases in the community between particle filtering model results (boxplot) and empirical data (superimposed red scatterplot)

Figure 5. Daily count of COVID-19 patients in the hospitalized ICU between particle filtering model results (boxplot) and empirical data (superimposed red scatterplot)

Figure 6. Daily count of COVID-19 patients in the hospitalized non-ICU between particle filtering model results (boxplot) and empirical data (superimposed red scatterplot)

Figure 7. Daily count of COVID-19 hospitalized deaths between particle filtering model results (boxplot) and empirical data (red scatter plot)

Figure 8. Daily wastewater viral concentration of SARS-CoV-2 (N2 copies per 100mL) between particle filtering model results (boxplot) and empirical data with missing days (red scatter plot)

Figure 9. Daily count of Hospitalized ICU admitted patients between particle filtering model results (boxplot) and pseudo-empirical synthetic data (superimposed red scatterplot)

Figure 10. Daily count of hospitalized non-ICU admitted patients between particle filtering model results (boxplot) and pseudo-empirical synthetic data (superimposed red scatterplot)

3.2. Estimation of Latent Dynamic Variables

Through ongoing incorporation of the empirical datasets to ground the COVID-19 dynamic model, the particle filtering process estimates the latent states and dynamic variables to inform the COVID-19 transmission. Figure 11 shows the model-estimated daily effective reproductive number (the method and underlying mathematical formulation can be found in Section 2.2.1). Figure 12 shows the model-estimated daily undiagnosed infectives (with details on formulation found in Section 2.2.2). Figure 13 shows the model-estimated force of infection (

λ

) (with details on formulation found in Section 2.2.4). Figure 14 shows the model-estimated daily effective prevalence of infectives in the mixing community (with details on formulation found in Section 2.2.3). And Figure 15 shows the cumulative prevalence of infections for each day (with details on formulation found in Section 2.2.5). Readers interested in the estimated latent state of the COVID-19 particle filtering model are referred to Appendix D.

3.3. Projection Results

While the COVID-19 particle filtering model offers strong performance in monitoring and estimating COVID-19 transmission, throughout its use across jurisdictions, such estimates of current epidemiological state have routinely been accompanied by 14-day projections of COVID-19 transmission and hospitalization.

To assess the predictive capacity of the COVID-19 particle filtering model herein, we employed it to perform 1-day, 7-day and 14-day predictions for each day starting from day 100 in Saskatoon and continuing for the remainder of the time horizon considered here. Within the projection period (e.g., 7 days) conducted from a given, no further particle filtering is performed, the model is simply run forward, without any incorporation of the observed data. Mirroring the process seen in de-facto use of the model, such a projection is performed from each successive day. It is essential to recognize that while the projection itself doesn’t incorporate any new data, between each such day on which the projection is launched, new data arrives and is incorporated by the particle filtering. The updated estimate of the latent state of the system afforded by this incorporation of the new data by the particle filtering allows the next projection (looking into the “future” as obtained from the standpoint of that day) to be made on the basis of the refined and updated understanding of the current epidemiological context.

Table 5 shows the predictive discrepancy between the model-predicted results and the empirical data. By comparing the average discrepancy of the three projection runs – 1-day, 7-day, and 14-day ahead – we can see that the relative accuracy of the projections decreases with longer prediction timeframes. It bears emphasis that while discrepancies are computed here to compare model results against empirical data, during the projection timeframes launched from each day, empirical data are only used for comparison with the model-projected results. However, as noted above, new data is taken into account before undertaking each successive projection.

Figure 16 through Figure 20 depict a comparison between the model-predicted results with the empirical datasets (or, for confidentiality of hospital admissions, the empirically inspired synthetic data noted above). To understand the results, it is to be emphasized that for every day, we perform three predictive runs (1-day, 7-day and 14-day). When shown in boxplots in the figures, the values for (one) day-ahead predictions of the model will be shown directly in comparison with the corresponding day-ahead empirical data. By contrast, for the 7-day projection results, for each day the figures visually compare the average model-predicted value over that 7-day interval with the corresponding average of the empirical data over that same 7-day interval. As above, it is to be emphasized that within each such projection from a given day, no particle filtering is occurring, and the empirical data are only compared with the model results, not incorporated in the model. As noted above, as would and did occur in day-to-day practice of a deployed system such as this, with the passage of each successive day, new data is incorporated by the particle filtering mechanism to update the estimate of system state, allowing the next projection to be made on the basis of that updated state estimate.

Those figures show that the preponderance of observed data (blue points in the diagrams) fall within the 50% inter-quartile range of the boxplot, demonstrating relatively accurate model predictions for up to 14 days in advance to inform public health agencies and governments, as in informed by data up to the point of projection. For example, the prediction of daily new reported cases can produce a picture of the trends in future transmission – as informed by system state estimated by the particle filtering. In a similar manner, the prediction of daily hospitalized ICU, non-ICU admission, and census patients can inform public health agencies’ mobilization of surge capacity, anticipation of capacity utilization and service demands, and more broadly supporting judicious allocation of hospital resources over the multi-week timeframe.

Figure 16. The projection results (boxplot) of the daily reported cases compared with empirical data (the blue points, and not incorporated in the model). (a) The next day projection results. (b) The 7-day projection time-window-averaged projection results versus corresponding empirical data window-averages. (c) The 14-day projection window-averaged projection results versus corresponding empirical data window-averages.

Figure 17. The projection results (boxplot) of the daily count of patients in the non-ICU compared with the empirical data (the blue points, and not incorporated in the model). (a) the next day projection results. (b) The 7-day projection time-window-averaged projection results versus corresponding empirical data window-averages. (c) The 14-day projection time-window-averaged projection results versus corresponding empirical data window-averages.

Figure 18. The projection results (boxplot) of the daily count of patients in the hospitalized ICU compared with the empirical data (the blue points, and not incorporated in the model). (a) The next day projection results. (b) The 7-day projection time-window-averaged projection results versus corresponding empirical data window-averages. (c) The 14-day projection time-window-averaged projection results versus corresponding empirical data window-averages.

Figure 19. The projection results (boxplot) of the daily count of deaths compared with the empirical data (the blue points, and not incorporated in the model). (a) The next day projection results. (b) The 7-day projection time-window-averaged projection results versus corresponding empirical data window-averages. (c) The 14-day projection time-window-averaged projection results versus corresponding empirical data window-averages.

Figure 20. The projection results (boxplot) of the daily count of patients admitted to the non-ICU admitted patients with empirically mimicking synthetic data (the blue points, and not incorporated in the model), with the latter being employed to preserve confidentiality. (a) The next day projection results. (b) The 7-day projection time-window-averaged projection results versus corresponding synthetic data window-averages. (c) The 14-day projection time-window-averaged projection results versus corresponding synthetic data window-averages.

3.4. Intervention Results

In the previous sections, we showed that the particle filtering algorithm can estimate the state space of the COVID-19 model. Beyond supporting the projection methods examined in the previous session, the capacity to perform such state estimation also confers benefits for conducting simulations of tradeoffs between intervention strategies, despite their counterfactual character. As for projection scenarios discussed in the previous section, during the time horizon of a given intervention run (e.g., 14 days), the particle filtering is disabled, and the dynamic model is run forward with no empirical data being incorporated. But, as for the projection methods above, with each successive day of operation, particle filtering incorporates a new day worth of data. The accuracy of the intervention runs conducted forward into the future from a given day following the particle filter update from the previous day benefits from the updated state estimated made possible by particle filtering’s incorporation of a new day’s worth of data.

In this section, we show two intervention experiments to simulate stylized public health intervention policies. The stylized intervention strategies are characterized abstractly for demonstration purposes, but are emblematic of the sort of more textured interventions examined during use of the model, and provide a flavour of what could be achieved with other interventions. In each case, the scenarios are run forward for 14 days from each successive day with the intervention mechanisms in place. Such a scenario run undertaken from a given day depicts the posited result of undertaking the associated intervention starting on that day.

For a given such day, a single boxplot for that day depicts, on the basis of the state estimate as of that particular day as updated by particle filtering for that day, the average outcome over the 14-day intervention time horizon. It bears emphasis that, in each case, the intervention scenario is counter-factual — the intervention is not put into effect from day to day; rather, each such 14-day intervention scenario projects what the impact of such an intervention would be, were it to be undertaken starting on the current day.

The two interventions considered here are predominantly focused on actions undertaken during two outbreak waves:

The first stylized intervention exhibited here focuses on elevating hygiene-oriented personal protective measures, such as might be exemplified by a regional mask mandate. For simplicity, the examination here characterizes such interventions as multiplying the effective contact rate by a coefficient in the range $(0, 1)$ . Figure 22 depicts the results of such counterfactual scenario occurring focused on the first outbreak wave in Saskatoon. For simplicity, this scenario posits an aggressive such hygiene-enhancing intervention which reduces the contact rate by 50% specifically for the window between day 220 to day 310 (inclusive).
In a second intervention type, we examine the outcomes from a stylized outbreak-response immunization campaign elevating vaccination rates for the 14-day defined period. This effect is achieved by using a coefficient to increase the effective vaccination rate in the model over that timeframe. As an example, Figure 23 shows the results of elevating the effective vaccination rate by 50% rate during the third outbreak wave in Saskatoon, with those elevated rates being in place from day 390 to day 510, inclusive.

The baseline comparator for the intervention runs – no intervention policies performed – can be found as the normal projection runs from Figure 16 to Figure 19.

4. Discussion and Limitations

As demonstrated in its use to support reporting for 17 jurisdictions across Canada for a period of a year or more, the COVID-19 particle filtering model can monitor COVID-19 transmission and hospitalization, estimate the daily latent states and important dynamic variables, and predict future daily transmission and hospitalization status over the multi-week timeframe, all in light of daily updates to estimated system state. With running the model daily, the daily estimates for COVID-19 transmission, hospitalization status, projection and intervention results reflect the latest sets of empirical data – including both health system and wastewater data – to provide current understanding to inform public health and healthcare system decision-making.

This work suffers from a number of limitations. A key one concerns changes in variant ecology. Reflective of the high amounts of transmission experienced globally, the virus SARS-CoV-2 causing COVID-19 has exhibited marked evolution. For most of the period for which data is considered in this paper (early 2020 to the end of July 2021) the wild type of SARS-CoV-2 of the uniform lineage in place in Canada, with the Alpha variant [17] appearing in Canada in the final week of 2021 (and in Saskatchewan by February 2021), followed by Beta and Gamma and Delta. With respect to our example jurisdictions considered in example runs here (Saskatoon), the highly distinctive Delta variant [17] became the dominant variant of SARS-CoV-2 following July 2021 driving the next wave of outbreaks. Our COVID-19 particle filtering model is capable of simulating changes in the virus ecology by adjusting characteristic parameter values (e.g., to reflect virulence, transmissivity, fraction of cases that are symptomatic, or vaccine effectiveness). For example, to change from simulating Alpha to the Delta variant, we increase the maximum value of the “transmission contact rate" (denoted as

c_{β}

in this paper), and decreased both of the two doses’ vaccine efficacy (denoted as

e_{1}

and

e_{2}

). However, the dynamic model assumes the presence of a single variant at a time, and is not suitable for characterizing processes requiring representation of multi-variant ecologies, such as those involving competition between multiple lineages. It is also not well-suited for capturing variant cross-reactivity with respect to immunological protection.

Although the structure of the COVID-19 model has demonstrated effectiveness in simulating COVID-19 transmission and hospitalization across diverse jurisdictions from the beginning of the first infected individual occurrence until the end of 2021, there are a number of key shortcomings in the existing structure of the model. Likely the single most important such limitation relates to the failure of the model to adequately characterize the differential impact of vaccination on protection from infection vs. protection from severe disease and death. The model’s existing characterization of COVID-19 vaccination characterizes its impact of vaccination only as mediated by an impact on transmission. While individuals in the model can be infected regardless of vaccination status, once a breakthrough infection of a vaccinated individual occurs, the model lacks existing mechanisms for retaining information on that individual’s vaccination status. As a result, conditional on infection, the model grossly unrealistically characterizes a vaccinated individual as having an identical risk of hospitalization, ICU admission and death as a non-vaccinated individual. While model parameters associated with such outcomes can be modified to reflect a high prevalence of vaccination uptake, the model urgently needs a means of characterizing different types of protection conferred by vaccines. Such a representation is particularly urgent in light of the need to capture the evolution in variant ecology emphasized above. Beyond this foundational modification, the model requires the capacity to represent the impact of successive booster vaccines.

Beyond the key change required for characterization of vaccine-induced protection, the model depicted in this paper exhibits a needs to adapt to the updated epidemiological context and evolved understanding of SARS-CoV-2. Most importantly are a need to take into account the extensively evidenced phenomenon of waning of both natural immunity (acquired from exposure to the disease through infection) and vaccine-induced immunity.

Finally, the COVID-19 model structure characterized here is only applied to an aggregate population. Important gains in insight can be secured by incorporation of key elements of heterogeneity via stratification. Given the marked differences in risk of severe disease and hospitalization, vaccine uptake, assortive mixing, and risk behaviour, stratification by age group is a key priority. Particularly in light of the pronounced rural-urban disparities in vaccination and risk behaviour, and opportunities for incorporation of data drawn from SARS-CoV-2 wastewater concentration assays across varying municipalities, stratified by multiple regions can also confer notable benefits.

Most of the needs covered in this section have subsequently been successfully incorporated into newer versions of the particle filtered dynamic model than those presented here, but coverage of this expanded model and particle filtering framework lies outside this presentation.

Also left for separate coverage are our refinement and expansion of the model covered here into a Particle Markov Chain Monte Carlo model offering additional capabilities and sophistication in sampling of static parameters [33], closer examination and evaluation of the incorporation, and support for drawing insight from wastewater data, and details of the extensive and articulated distributed computation framework used to provide nearly fully automated day-to-day running and reporting of model results across diverse jurisdictions and data sources at scale.

5. Conclusion

The work here characterizes the design and multi-year deployment of a production-quality particle filter model that played a central role in informing public health decision-making starting in the opening months of the pandemic. By cross-leveraging particle filtering, dynamic (transmission) modeling, and diverse health system and wastewater data sources, the system presented here and close variants offered important initial findings by April 2020, and served to deliver daily-updated COVID-19 situational analyses and short-term forecasts for Saskatchewan for the period of June 2020 through December 2021, multiple times a week for each Canadian province for Public Health Agency of Canada until November 2021 and weekly to First Nations across six Canadian provinces via FNIHB through March 2022.

Particle filtered dynamic models confer strong benefits by virtue of their ability to incorporate diverse incoming empirical data streams — here including both regularly reported health system data and episodically sampled wastewater data — to perform day-to-day probabilistic estimation and reporting of latent epidemiological and health system quantities of interest. Quantities routinely reported from the model described here include — but are not limited to — COVID-19 cases, testing volumes, hospitalization admissions and census, and deaths, force of infection, undiagnosed individuals and other factors. This further includes a more sophisticated estimate of the effective reproductive number taking into account incomplete reporting, asymptomatic transmission, diagnosis and isolation, and other considerations. Beyond supporting updated estimation of such quantities and other elements of system state whenever new data arrives, our extensively deployed particle filtered framework uses each new system state estimate as the basis for probabilistically projecting forward the evolution of epidemiology and acute-care demand, such as can readily support triggering surge capacity mobilization, motivate the institution of public health measures, or prepare for higher health capacity utilization. Similar methods can and were used to support reporting of results from prospective counterfactual intervention scenarios, with each undertaken in light of the latest empirical observations.

As demonstrated by its widespread adoption for continually regrounded reporting and scenario analysis for diverse Canadian jurisdictions, the sequential Monte Carlo approach of particle filtering offers a compelling tool for evidence-based public health decision-making. The capacity of particle filtering to keep transmission models and resulting probabilistic state estimates and scenarios projections continually updated with the latest data offers compelling advantages over earlier generations of techniques such as the Extended Kalman Filter, and the computational demands of this technique are well-balanced with the velocity of contemporary data streams of relevance. Systems employing particle filtering offer strong advantages well-matched to the urgent need for public health surveillance and decision-making in coming years.

Author Contributions

Conceptualization, N.O., X.L. and J.B.; methodology, N.O., X.L. and L.D.; software, N.O., X.L., L.D. and V.P.; validation, N.O. and J.B. and X.L.; formal analysis, N.O. and X.L.; investigation, N.O., X.L., L.D., V.P. and J.M.; resources, J.B., L.D., N.O., V.P., and J.M.; data curation, V.P., X.L., N.O., L.D. and J.M; writing—original draft preparation, X.L.; writing—review and editing, N.O., X.L., V.P. and J.M.; visualization, X.L., V.P., J.M. and L.D.; supervision, N.O. and J.B.; project administration, N.O. and J.B.; funding acquisition, N.O. and J.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research benefited foundational from salary support provided by the Saskatchewan Health Authority and Saskatchewan Ministry of Health. It further received contract funding from the Public Health Agency of Canada (PHAC), and First Nations and Inuit Health Branch of Indigenous Services Canada. Later evolution of this work were supported by PHAC and NSERC through the Mathematics for Public Health Network via the Emerging Infectious Disease Modeling (EIDM) Initiative. Finally, the work here immediately extended work and infrastructure that was supported through the Natural Sciences and Engineering Research Council (NSERC) of Canada Discovery Grants program (RGPIN 2017-04647).

Acknowledgments

The team gratefully acknowledges the support of diverse partners in this work. We wish to extend particular gratitude to the Saskatchewan Health Authority, Saskatchewan Ministry of Health for provision of health system data, funding the successive development of the particle filtering model and framework, and the supporting infrastructure. We particularly wish to offer special thanks to Jennifer Zerff and Michelle Mula of the SHA, as well as many other university students who helped support this system, especially Eric Redekopp, Aaron Toderash, and Long Pham. We thank PHAC for the wastewater contract that accelerated this work, helped provide trainee support, as well as PHAC and FNIHB for further supporting trainees and development and maintenance of the particle filtering infrastructure. We further wish to extend our gratitude to Dr. John Giesy, Dr. Yuwei Xie, Dr. Markus Brinkmann, and Dr. Kerry McPhedran of the U Saskatchewan Toxicology Centre for providing the wastewater used, key background understanding concerning wastewater analysis, lab processes, wastewater infrastructure, shedding, and related areas; their generosity is so appreciated, and the wastewater side of this model and derivatives will form the basis for a future joint paper. Through his work with the associated the PMCMC model, Osgood postdoctoral fellow Dr. Jeremy Eng helped estimate a key shedding-related scaling constant to enhance the integration the wastewater data into the particle filter. We further want to think Dr. Juxin Liu for her esteemed and deep expertise in sequential and PMCMC methods over the years, without which this work would not have been possible. Co-author Osgood further wishes to express his appreciation of support for elements of the work that contributed to the success of these efforts via the NSERC Discovery Grants program, from the Mathematics for Public Health Network, and from SYK & XZO.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

SMC	Sequential Monte Carlo
PF	Particle Filtering
WHO	World Health Organization
PMCMC	Particle Markov Chain Monte Carlo
PHAC	Public Health Agency of Canada
SHA	Saskatchewan Health Authority
FNIHB	First Nations and Inuit Health Branch
PCR	Polymerase Chain Reaction
ICU	Intensive Care Unit
WWS	Wastewater Surveillance
ODE	Ordinary Differential Equation
MCMC	Markov Chain Monte Carlo
RMSE	Root Mean Square Error
NRMSE	Normalized Root Mean Square Error

Appendix A. The ODEs of the COVID-19 Mathematical Model

The mathematical equations of the compartmental model are listed as follows:

\begin{matrix} \frac{d S}{d t} = - λ S - E_{V a c c 1} \\ \frac{d V_{1}}{d t} = E_{V a c c 1} - E_{V a c c 2} - λ (1 - e_{1}) V_{1} \\ \frac{d V_{2}}{d t} = E_{V a c c 2} - λ (1 - e_{2}) V_{2} \\ \frac{d E_{U}}{d t} = λ S + λ (1 - e_{1}) V_{1} + λ (1 - e_{2}) V_{2} - \frac{E_{U}}{t_{E}} - V_{p} \frac{E_{U}}{I_{U}} \\ \frac{d I_{A U}}{d t} = \frac{E_{U}}{t_{E}} - \frac{I_{A U}}{t_{I}} - V_{p} \frac{I_{A U}}{I_{U}} \\ \frac{I_{A D}}{d t} = V_{p} \frac{E_{U}}{I_{U}} - \frac{I_{A D}}{t_{I}} \\ \frac{d I_{A 2 U}}{d t} = f_{p A} \frac{I_{A U}}{t_{I}} - \frac{I_{A 2 U}}{t_{I_{Y}}} - V_{p} \frac{I_{A 2 U}}{I_{U}} \\ \frac{d I_{A 2 D}}{d t} = f_{p A} \frac{I_{A D}}{t_{I}} - \frac{I_{A 2 D}}{t_{I_{Y}}} \\ \frac{d I_{A 3 U}}{d t} = \frac{I_{A 2 U}}{t_{I_{Y}}} - V_{p} \frac{I_{A 3 U}}{I_{U}} - \frac{I_{A 3 U}}{t_{I_{Y N}}} \\ \frac{d I_{A 3 D}}{d t} = \frac{I_{A 2 D}}{t_{I_{Y}}} + V_{p} \frac{I_{A 2 U}}{I_{U}} - \frac{I_{A 3 D}}{t_{I_{Y N}}} \\ \frac{d I_{Y U}}{d t} = E x_{D} \frac{1 - f_{S}}{f_{S}} + (1 - f_{p A}) \frac{I_{A U}}{t_{I}} - \frac{I_{Y U} (1 - f_{Y})}{t_{I_{Y}}} - V_{p} \frac{I_{Y U}}{I_{U}} - min (\frac{I_{Y U} f_{Y}}{t_{I_{Y}}}, E_{m}) \\ \frac{d I_{Y D}}{d t} = E x_{D} + (1 - f_{p A}) \frac{I_{A D}}{t_{I}} + V_{p} \frac{I_{A U}}{I_{U}} - \frac{I_{Y D}}{t_{I_{Y}}} \\ \frac{d H_{I C U}}{d t} = I_{Y U} \frac{f_{H} f_{H I C U}}{t_{I_{Y}}} + I_{Y D} \frac{f_{H} f_{H I C U}}{t_{I_{Y}}} - \frac{H_{I C U}}{t_{I C U}} - (H_{I C U} \frac{- l n (1 - ϕ_{I C U})}{t_{I C U}}) \\ \frac{d H_{N I C U}}{d t} = I_{Y U} \frac{f_{H} (1 - f_{H I C U})}{t_{I_{Y}}} + I_{Y D} \frac{f_{H} (1 - f_{H I C U})}{t_{I_{Y}}} + \frac{H_{I C U}}{t_{I C U}} - (H_{N I C U} \frac{- l n (1 - ϕ_{N I C U})}{t_{N I C U}}) - \frac{H_{N I C U}}{t_{H}} \\ \frac{d I_{Y N U}}{d t} = I_{Y U} \frac{1 - f_{H} - f_{Y}}{t_{I_{Y}}} - min (\frac{I_{Y U} f_{Y}}{t_{I_{Y}}}, E_{m}) - \frac{I_{Y N U}}{t_{I_{Y N}}} - V_{p} \frac{I_{Y N U}}{I_{U}} \\ \frac{d I_{Y N D}}{d t} = V_{p} \frac{I_{Y U}}{I_{U}} + \frac{I_{Y D} (1 - f_{H})}{t_{I_{Y}}} + min (\frac{I_{Y U} f_{Y}}{t_{I_{Y}}}, E_{m}) - \frac{I_{Y N D}}{t_{I_{Y N}}} \\ \frac{d R_{U}}{d t} = \frac{I_{Y N U}}{t_{I_{Y N}}} + \frac{I_{A 3 U}}{t_{I_{Y N}}} \\ \frac{d R_{D}}{d t} = \frac{I_{Y N D}}{t_{I_{Y N}}} + \frac{H_{I C U}}{t_{H}} + \frac{I_{A 3 D}}{t_{I_{Y N}}} + V_{p} \frac{I_{Y N U}}{I_{U}} + V_{p} \frac{I_{A 3 U}}{I_{U}} \\ \frac{d D}{d t} = (H_{N I C U} \frac{- l n (1 - ϕ_{N I C U})}{t_{N I C U}}) + (H_{I C U} \frac{- l n (1 - ϕ_{I C U})}{t_{I C U}}) \\ I_{U} = E_{U} + I_{A U} + I_{A 2 U} + I_{A 3 U} + I_{Y U} + I_{Y N U} \\ t_{I_{Y N}} = t_{R} - t_{I_{Y}} \\ N = S + V_{1} + V_{2} + E_{U} + E_{D} + I_{A U} + I_{A D} + I_{A 2 U} + I_{A 2 D} + I_{A 3 U} + I_{A 3 D} + I_{Y U} + I_{Y D} + I_{Y N U} + I_{Y N D} + R_{U} + R_{D} \\ + H_{I C U} + H_{N I C U} + D \\ λ = c β \frac{(I_{A U} + I_{A 2 U} + I_{A 3 U}) + ρ_{U} (I_{Y U} + I_{Y N U}) + ρ_{D} (I_{A D} + I_{A 2 D} + I_{A 3 D} + I_{Y D} + I_{Y N D})}{(S + V_{1} + V_{2} + E_{U} + I_{A U} + I_{A 2 U} + I_{A 3 U}) + ρ_{U} (I_{Y U} + I_{Y N U}) + ρ_{D} (E_{D} + I_{A D} + I_{A 2 D} + I_{A 3 D} + I_{Y D} + I_{Y N D})} \\ E_{m} = max (0, V_{t} - V_{H I C U} - V_{H N I C U} - E_{x_{D}}) \\ V_{r e m a i n} = E_{m} - min (\frac{I_{Y U} f_{Y}}{t_{I_{Y}}}, E_{m}) \\ V_{p} = V_{r e m a i n} β (1 - e^{- α \frac{V_{r e m a i n}}{I_{U}}}) \end{matrix}

(A1)

The meaning of each stock:

S: The number of susceptible individuals

V_{1}

: The number of individuals who have received first vaccination dose

V_{2}

: The number of individuals who have received two vaccination doses

E_{U}

: The number of undiagnosed susceptible individuals

I_{A U}

: The number of undiagnosed temporary asymptomatic infected individuals

I_{A D}

: The number of diagnosed temporary asymptomatic infected individuals

I_{A 2 U}

: The number of undiagnosed persistent asymptomatic infected individuals

I_{A 2 D}

: The number of diagnosed persistent asymptomatic infected individuals

I_{A 3 U}

: The number of undiagnosed persistent asymptomatic infected individuals with progression from

I_{A 2 U}

I_{A 3 D}

: The number of diagnosed persistent asymptomatic infected individuals with progression from

I_{A 2 D}

I_{Y U}

: The number of undiagnosed symptomatic infected individuals with complications

I_{Y D}

: The number of diagnosed symptomatic infected individuals with complications

I_{Y N U}

: The number of undiagnosed symptomatic infected individuals without complications

I_{Y N D}

: The number of diagnosed symptomatic infected individuals without complications

H_{I C U}

: The number of hospitalized critical infected individuals

H_{N I C U}

: The number of hospitalized acute infected individuals

R_{U}

: The number of undiagnosed recovered individuals

R_{D}

: The number of diagnosed recovered individuals

D: The number of died individuals from COVID-19

Appendix B. The Mathematical Deduction of The Dynamic Parameters

If parameter

β

varies over the range

[0, 1]

, we characterize the logit of

β

(transfer from interval

[0, 1]

to

(- \infty, + \infty)

) as undergoing Brownian Motion according to Stratonovich notation as:

d (l o g i t (β)) = d (l n (\frac{β}{1 - β})) = s_{β} d W_{t}

(A2)

In a more general situation, where

β

varies in the interval

[a, b]

, we scale

β \in [a, b]

to

β^{'} \in [0, 1]

, where we have:

β^{'} = \frac{β - a}{b - a}

(A3)

Finally, if we substitute equation (A3) to (A2), we can get the logit of

β

(transfer from interval

[a, b]

to

(- \infty, + \infty)

) as undergoing Brownian Motion according to Stratonovich notation as:

d (l o g i t (β)) = d (l n (\frac{β - a}{b - β})) = s_{β} d W_{t}

(A4)

Appendix C. Boxplots of the COVID-19 Particle Filtering Model Estimated Latent State

Figure A1. Boxplot of the latent state of stock S

Figure A2. Boxplot of the latent state of stock

E_{U}

Figure A2. Boxplot of the latent state of stock

E_{U}

Figure A3. Boxplot of the latent state of stock

I_{A U}

Figure A3. Boxplot of the latent state of stock

I_{A U}

Figure A4. Boxplot of the latent state of stock

I_{A D}

Figure A4. Boxplot of the latent state of stock

I_{A D}

Figure A5. Boxplot of the latent state of stock

I_{A 2 U}

Figure A5. Boxplot of the latent state of stock

I_{A 2 U}

Figure A6. Boxplot of the latent state of stock

I_{A 2 D}

Figure A6. Boxplot of the latent state of stock

I_{A 2 D}

Figure A7. Boxplot of the latent state of stock

I_{A 3 U}

Figure A7. Boxplot of the latent state of stock

I_{A 3 U}

Figure A8. Boxplot of the latent state of stock

I_{A 3 D}

Figure A8. Boxplot of the latent state of stock

I_{A 3 D}

Figure A9. Boxplot of the latent state of stock

I_{Y N U}

Figure A9. Boxplot of the latent state of stock

I_{Y N U}

Figure A10. Boxplot of the latent state of stock

I_{Y N D}

Figure A10. Boxplot of the latent state of stock

I_{Y N D}

Figure A11. Boxplot of the latent state of stock

I_{Y U}

Figure A11. Boxplot of the latent state of stock

I_{Y U}

Figure A12. Boxplot of the latent state of stock

I_{Y D}

Figure A12. Boxplot of the latent state of stock

I_{Y D}

Figure A13. Boxplot of the latent state of stock

R_{U}

Figure A13. Boxplot of the latent state of stock

R_{U}

Figure A14. Boxplot of the latent state of stock

R_{D}

Figure A14. Boxplot of the latent state of stock

R_{D}

Figure A15. Boxplot of the latent state of stock

V_{1}

Figure A15. Boxplot of the latent state of stock

V_{1}

Figure A16. Boxplot of the latent state of stock

V_{2}

Figure A16. Boxplot of the latent state of stock

V_{2}

Figure A17. Boxplot of the latent state of stock

l o g i t (c_{β})

Figure A17. Boxplot of the latent state of stock

l o g i t (c_{β})

Figure A18. Boxplot of the latent state of stock

l o g i t (f_{H})

Figure A18. Boxplot of the latent state of stock

l o g i t (f_{H})

Figure A19. Boxplot of the latent state of stock

l o g i t (f_{Y})

Figure A19. Boxplot of the latent state of stock

l o g i t (f_{Y})

Figure A20. Boxplot of the latent state of stock

l o g i t (α)

Figure A20. Boxplot of the latent state of stock

l o g i t (α)

Appendix D. Mathematical Equations of Calculating the Normalized RMSE and Discrepancy

The normalized-RMSE between the COVID-19 particle filtering model estimated/predicted values and the observed data on each day is calculated as follows:

NRMSE = \sqrt{\frac{\sum_{i = 1}^{n} {(\frac{2 (\hat{y} - y_{i})}{\hat{y} + y_{i}})}^{2}}{n}}

(A5)

where y is the model estimated/predicted value,

\hat{y}

is the observed value, and n is the total number of particles sampled by weight to measure. Then, the discrepancy of each empirical dataset is simply the average of the NRMSE across the whole time frame. The result of the NRMSE lies in the interval

[0, 2]

.

Appendix E. Calculation of Relative Mixing Rate amongst Undiagnosed Symptomatics ρ U and Diagnosed in Community ρ D

[29] shows “About 35 of the 160 confirmed cases did not minimise social contact. More than a fifth continued to work or carried on with their daily routine despite being sick". We then assume 80% of diagnosed mostly isolated themselves by reducing their contacts to 20% of normal, and 50% undiagnosed reduce their contacts to 20% of normal. Then we have

ρ_{D} = 0.8 * 0.2 + (1 - 0.8) * 1 = 0.36

,

ρ_{U} = 0.5 * 0.2 + (1 - 0.5) * 1 = 0.6

\begin{matrix} N_{I C} = I_{A} + I_{Y U} + I_{Y N} + I_{A 2} + I_{A 3} \\ N = S + E + V_{P} + V_{F} + I_{A} + I_{Y U} + I_{Y N} + I_{A 2} + I_{A 3} + H_{I C U} + H_{N I C U} + R \\ N_{I} = I_{A} + I_{Y U} + I_{Y N} + I_{A 2} + I_{A 3} + H_{I C U} + H_{N I C U} \end{matrix}

References

World Health Organisation (WHO). Rolling updates on coronavirus disease (COVID-19), 2020. https://www.who.int/emergencies/diseases/novel-coronavirus-2019/events-as-they-happen, Last accessed on 2020-04-15.
World Health Organisation (WHO). Coronavirus disease (COVID-19) Pandemic, 2020. https://www.who.int/emergencies/diseases/novel-coronavirus-2019, Last accessed on 2023-01-30.
Li, X.; Doroshenko, A.; Osgood, N.D. Applying Particle Filtering in Both Aggregated and Age-structured Population Compartmental Models of Pre-vaccination Measles. PLoS ONE 2018, 13. [CrossRef]
Safarishahrbijari A, O.N. Social Media Surveillance Improves Outbreak Projection via Transmission Models. JMIR Public Health and Surveillance 2019.
Osgood, N.; Liu, J. Towards closed loop modeling: Evaluating the prospects for creating recurrently regrounded aggregate simulation models using particle filtering. In Proceedings of the Winter Simulation Conference 2014. IEEE, 2014, pp. 829–841. [CrossRef]
Oraji, R.; Hoeppner, V.H.; Safarishahrbijari, A.; Osgood, N.D. Combining Particle Filtering and Transmission Modeling for TB Control. In Proceedings of the 2016 IEEE International Conference on Healthcare Informatics (ICHI). ieeexplore.ieee.org, 2016, pp. 392–398.
Kreuger, K.; Osgood, N. Particle filtering using agent-based transmission models. In Proceedings of the 2015 Winter Simulation Conference (WSC), 2015, pp. 737–747.
Safarishahrbijari, A.; Lawrence, T.; Lomotey, R.; Liu, J.; Waldner, C.; Osgood, N. Particle filtering in a SEIRV simulation model of H1N1 influenza. In Proceedings of the 2015 Winter Simulation Conference (WSC). ieeexplore.ieee.org, 2015, pp. 1240–1251. [CrossRef]
Arulampalam, M.S.; Maskell, S.; Gordon, N.; Clapp, T. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE transactions on signal processing: a publication of the IEEE Signal Processing Society 2002, 50, 174–188.
Doucet, A.; Johansen, A.M. A tutorial on particle filtering and smoothing: Fifteen years later. Handbook of nonlinear filtering 2009, 12, 3.
Storvik, G.; Palomares, A.D.L.; Engebretsen, S.; Rø, G.ø.I.; Engø-Monsen, K.; Kristoffersen, A.B.; de Blasio, B.F.; Frigessi, A. A sequential Monte Carlo approach to estimate a time varying reproduction number in infectious disease models: the Covid-19 case. arXiv preprint arXiv:2201.07590 2022. [CrossRef]
Sheinson, D.; Meng, Y.; Elsea, D. PIN67 Real-Time Analysis of COVID-19 DATA Using Sequential Monte Carlo Methods. Value in Health 2021, 24, S118. [CrossRef]
Peak, C.M.; Kahn, R.; Grad, Y.H.; Childs, L.M.; Li, R.; Lipsitch, M.; Buckee, C.O. Individual quarantine versus active monitoring of contacts for the mitigation of COVID-19: a modelling study. The Lancet Infectious Diseases 2020, 20, 1025–1033. [CrossRef]
Koyama, S.; Horie, T.; Shinomoto, S. Estimating the time-varying reproduction number of COVID-19 with a state-space method. PLoS computational biology 2021, 17, e1008679. [CrossRef]
Jiang, S.; Maggard, K.; Shakeri, H.; Porter, M.D. An Application of the Partially Observed Markov Process in the Analysis of Transmission Dynamics of COVID-19 via Wastewater. In Proceedings of the 2021 Systems and Information Engineering Design Symposium (SIEDS). IEEE, 2021, pp. 1–6. [CrossRef]
Ahmed, W.; Angel, N.; Edson, J.; Bibby, K.; Bivins, A.; O’Brien, J.W.; Choi, P.M.; Kitajima, M.; Simpson, S.L.; Li, J.; et al. First confirmed detection of SARS-CoV-2 in untreated wastewater in Australia: a proof of concept for the wastewater surveillance of COVID-19 in the community. Science of the Total Environment 2020, 728, 138764.
World Health Organisation (WHO). Tracking SARS-CoV-2 variants, 2021. https://www.who.int/en/activities/tracking-SARS-CoV-2-variants/, Last updated on 2021-10-29.
Centers for Disease Control (CDC). Overview of Testing for SARS-CoV-2 (COVID-19), 2021. https://www.cdc.gov/coronavirus/2019-ncov/hcp/testing-overview.html, Last updated on 2021-06-14.
Vickers, D.M.; Osgood, N.D. Current crisis or artifact of surveillance: insights into rebound chlamydia rates from dynamic modelling. BMC infectious diseases 2010, 10, 1–10. [CrossRef]
Shang, Y.; Pan, C.; Yang, X.; Zhong, M.; Shang, X.; Wu, Z.; Yu, Z.; Zhang, W.; Zhong, Q.; Zheng, X.; et al. Management of critically ill patients with COVID-19 in ICU: statement from front-line intensive care experts in Wuhan, China. Annals of intensive care 2020, 10, 1–24. [CrossRef]
World Health Organisation (WHO). COVID-19 vaccines, 2021. https://www.who.int/emergencies/diseases/novel-coronavirus-2019/covid-19-vaccines.
BC Center for Disease Control. Public health statement on extension of the interval between first and second doses of COVID-19 vaccines in BC, 2021. http://www.bccdc.ca/Health-Info-Site/Documents/COVID-19_vaccine/Public_health_statement_deferred_second_dose.pdf,Last updated on 2021-06-24.
Jong, M.; Diekmann, O.; Heesterbeek, H. How does transmission of infection depend on population size? Epidemic models. Publication of the Newton Institute 1995, pp. 84–94.
Public Health Ontario. Wastewater Surveillance of COVID-19, 2021. https://www.publichealthontario.ca/-/media/documents/ncov/phm/2021/04/public-health-measures-wastewater-surveillance.pdf?la=en,Last updated on 2021-04.
Medema, G.; Heijnen, L.; Elsinga, G.; Italiaander, R.; Brouwer, A. Presence of SARS-Coronavirus-2 RNA in sewage and correlation with reported COVID-19 prevalence in the early stage of the epidemic in the Netherlands. Environmental Science & Technology Letters 2020, 7, 511–516. [CrossRef]
Cevik, M.; Tate, M.; Lloyd, O.; Maraolo, A.E.; Schafers, J.; Ho, A. SARS-CoV-2, SARS-CoV, and MERS-CoV viral load dynamics, duration of viral shedding, and infectiousness: a systematic review and meta-analysis. The lancet microbe 2020. [CrossRef]
Miura, F.; Kitajima, M.; Omori, R. Duration of SARS-CoV-2 viral shedding in faeces as a parameter for wastewater-based epidemiology: Re-analysis of patient data using a shedding dynamics model. Science of The Total Environment 2021, 769, 144549.
Hoffmann, T.; Alsing, J. Faecal shedding models for SARS-CoV-2 RNA amongst hospitalised patients and implications for wastewater-based epidemiology. medRxiv 2021.
Zhang, Lim Min and Han Goh Yan. Coronavirus: Govt agencies to suspend activities for seniors for 14 days to cut risk of transmission, 2020. https://www.straitstimes.com/singapore/health/govt-agencies-to-suspend-activities-for-seniors-for-14-days-starting-march-11-to, Published on 2020-03-10 in The Straits Times - SINGAPORE.
Tindale, L.; Coombe, M.; Stockdale, J.E.; Garlock, E.; Lau, W.Y.V.; Saraswat, M.; Lee, Y.H.B.; Zhang, L.; Chen, D.; Wallinga, J.; et al. Transmission interval estimates suggest pre-symptomatic spread of COVID-19. MedRxiv 2020.
Zhou, F.; Yu, T.; Du, R.; Fan, G.; Liu, Y.; Liu, Z.; Xiang, J.; Wang, Y.; Song, B.; Gu, X.; et al. Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study. The Lancet 2020. [CrossRef]
Oran, D.P.; Topol, E.J. Prevalence of asymptomatic SARS-CoV-2 infection: a narrative review. Annals of internal medicine 2020, 173, 362–367.
Osgood, N.; Jeremy, E. Effective Use Of PMCMC For Daily Epidemiological Monitoring And Reporting: Methodological Lessons. In Proceedings of the Annual Meeting of Statistical Society of Canada, 2022. Abstract and Conference Publication.
Rothman, K.J.; Greenland, S.; Lash, T.L. Modern epidemiology; Lippincott Williams & Wilkins, 2008.
Dicker, R.C.; Coronado, F.; Koo, D.; Parrish, R.G. Principles of epidemiology in public health practice; an introduction to applied epidemiology and biostatistics, 2006.
Dorigatti, I.; Cauchemez, S.; Pugliese, A.; Ferguson, N.M. A new approach to characterising infectious disease transmission dynamics from sentinel surveillance: Application to the Italian 2009–2010 A/H1N1 influenza pandemic. Epidemics 2012, 4, 9–21. [CrossRef]
Saskatchewan Dashboard. Available at https://dashboard.saskatchewan.ca/, 2015. Accessed: 2023/01/31.

Figure 1. Transmission model structure

Figure 11. The daily estimated effective reproductive number

Figure 12. The daily estimated count of undiagnosed infectives

Figure 13. The daily estimated force of infection

Figure 14. The daily estimated effective prevalence of infectives in the mixing community

Figure 15. The daily estimated prevalence of cumulative infection

Figure 21. The prediction results (boxplot) of the daily count of patients admitted to the hospitalized ICU compared with the empirically mimicking synthetic data (the blue points, and not incorporated in the model), with the latter being employed to preserve confidentiality. (a) The next day projection results. (b) The 7-day projection time-window-averaged projection results versus corresponding synthetic data window-averages. (c) The 14-day projection time-window-averaged projection results versus corresponding synthetic data window-averages.

Figure 22. Model-based projections of the effects, for each successive day, of the average outcome over an intervention reducing the effective contact rate over the next 14 days, starting at that day.

Figure 23. Model-based projections, for of COVID-19 the next 14 days average results when simulating an outbreak-response immunization campaign. This is realized by characterizing a stylized elevated vaccine-induced protection level among 50% of the population.

Table 1. Table of constant parameters

Parameters	Description	Value	Source	Unit
$ρ_{U}$	Relative mixing rate amongst undiagnosed symptomatics	0.6	[29] $^{(1)}$	1
$ρ_{D}$	Relative mixing rate amongst diagnosed in community	0.36	[29] $^{(1)}$	1
$E x_{D}$	Daily travel imported case count of diagnosed	Surveillance data	SHA primary data	Persons/Day
$E_{V a c c 1}$	Daily count of persons administered the 1st dose vaccination	Surveillance data	SHA primary data	Persons/Day
$E_{V a c c 2}$	Daily count of persons administered the 2 doses vaccination	Surveillance data	SHA primary data	Persons/Day
$V_{t}$	Daily count of persons undergoing PCR (nasopharyngeal swab)-based testing	Surveillance data	SHA primary data	Persons/Day
$V_{H I C U}$	Daily count of COVID-19 patients admitted into the ICU	Surveillance data	SHA primary data	Persons/Day
$V_{H N I C U}$	Daily count of COVID-19 patients admitted into the non-ICU	Surveillance data	SHA primary data	Persons/Day
$f_{S}$	Fraction of arriving symptomatics identified upon arrival	1/3	expert estimation	1
$f_{H I C U}$	Fraction of admitting ICU among hospitalized patients	0.23	SHA primary data	1
$t_{E}$	Mean latent period	2.9	PHAC data	Day
$t_{I}$	Mean incubation period following infectivity	2.72	[30]	Day
$t_{I_{Y}}$	Mean time to develop or avoid complications	6.0	[31]	Day
$t_{R}$	Mean recovery time following symptoms	9.5	PHAC data	Day
$t_{H}$	Mean duration of hospital stay for non-ICU patients before recovery	12.0	SHA primary data	Day
$t_{I C U}$	Mean duration of ICU stay before to hospital wards, discharge or death	6.0	SHA primary data	Day
$t_{N I C U}$	Mean duration of non-ICU stay before death	4.57	SHA primary data	Day
$f_{p A}$	Fraction of persistent asymptomatics	0.4	[32]	1
$ϕ_{I C U}$	Case fatality rate amongst ICU patients	0.45	SHA primary data	1
$ϕ_{N I C U}$	Case fatality rate for cases not requiring ICU care	0.08	SHA primary data	1
$e_{1}$	Vaccine efficacy for dose 1	0.8	[22]	1
$e_{2}$	Vaccine efficacy for those completing (2 doses) primary series	0.95	[22]	1
$γ$	Ratio of model shedding measure to viral concentration in wastewater	10.374	PMCMC model [33]	copies/100ml/ Person
$w_{E}$	Viral shedding weight in exposed stage ( $E_{U}$ )	0.2	[28]	1
$w_{I A}$	Viral shedding weight in presymptomatic stage ( $I_{A U}$ , $I_{A D}$ )	0.5	[28]	1
$w_{I Y}$	Viral shedding weight in symptomatic stage with complications and cotemporal stages of oligosymptomatic infection ( $I_{A 2 U}$ , $I_{A 2 D}$ , $I_{Y U}$ , $I_{Y U D}$ , $H_{I C U}$ , $H_{N I C U}$ )	0.2	[28]	1
$w_{I N}$	Viral shedding weight in early symptomatic stage (absent complications) and cotemporal stage of oligosymptomatic infectives ( $I_{A 3 U}$ , $I_{A 3 D}$ , $I_{Y N U}$ , $I_{Y N D}$ )	0.1	[28]	1
$β_{T}$	Upper limit on fraction of infectives found by active testing	1.0	Reflective of full extent of unit range	1

Table 2. Table of dynamic parameters

Parameters	Meaning	Min(a)	Max(b)	STD	Unit
$C_{β}$	Transmission contact rate	0 $^{(1)}$	0.4918¹	10.0	Persons/Day
$f_{H}$	Fraction of symptomatic individuals who proceed on to require hospitalization	0.04	0.06	0.1	1
$f_{Y}$	Fraction of undiagnosed symptomatics who proceed on to seek care but who are not hospitalized	0.1	0.821	0.5	1
$α_{t}$	A measure of test efficiency	0.01	0.25	5	1

Table 3. Table of sub-likelihood functions

Likelihood Name	Empirical Dataset	Model Value	Mathematical Form
$L_{N e w R e p o r t e d E n d o g e n o u s C a s e s}$	New Reported Endogenous COVID-19 Cases	$V_{P} + \frac{I_{Y U} (f_{H} + f_{Y})}{t_{I_{Y}}}$	Negative Binomial
$L_{C u m u l a t i v e R e p o r t e d E n d o g e n o u s C a s e s}$	Cumulative Reported Endogenous COVID-19 Cases	$\int (V_{P} + \frac{I_{Y U} (f_{H} + f_{Y})}{t_{I_{Y}}})$	Negative Binomial
$L_{C u m u l a t i v e I C U A d m i s s i o n}$	Cumulative Hospitalized ICU Admission patients	$\int d H_{I C U}$	Negative Binomial
$L_{C u m u l a t i v e N I C U A d m i s s i o n}$	Cumulative Hospitalized non-ICU Admission patients	$\int d H_{N I C U}$	Negative Binomial
$L_{I C U C e n s u s}$	Daily Hospitalized ICU Census patients	$H_{I C U}$	Negative Binomial
$L_{N I C U C e n s u s}$	Daily Hospitalized non-ICU Census patients	$H_{N I C U}$	Negative Binomial
$L_{C u m u l a t i v e D e a t h s}$	Cumulative COVID-19 Deaths	D	Negative Binomial
$L_{V i r a l C o n c e n t r a t i o n}$	Measured concentration of SARS-CoV-2 virus in wastewater	$γ [w_{E} E_{U} + w_{I A} (I_{A U} + I_{A D}) + w_{I Y} (I_{A 2 U} + I_{A 2 D} + I_{Y U} + I_{Y U D} + H_{I C U} + H_{N I C U}) + w_{I N} (I_{A 3 U} + I_{A 3 D} + I_{Y N U} + I_{Y N D})]$	Gamma Distribution

Table 4. Table of discrepancies (Normalized Root Mean Square Error (NRMSE)) of all empirical datasets compared with model estimated results (with 5 realizations)

Dataset	Mean	95% Confidence Interval
Count of daily reported cases	0.8429	(0.8343, 0.8515)
Cumulative reported cases	0.2750	(0.2558, 0.2943)
Cumulative death cases	0.4981	(0.4842, 0.5121)
Daily virus concentration in wastewater	0.5734	(0.5511, 0.5957)
Cumulative hospitalized non-ICU admissions	0.1306	(0.1223, 0.1389)
Cumulative hospitalized ICU admissions	0.5372	(0.5313, 0.5431)
Daily hospitalized non-ICU census	0.4181	(0.4117, 0.4245)
Daily hospitalized ICU census	0.6545	(0.6492, 0.6598)
Sum of total	3.9300	(3.8980, 3.9617)

Table 5. Discrepancies for 1-day, 7-day and 14-day projection run projected data against empirical datasets

Dataset	Mean Projection Discrepancy
	1-day	7-day	14-day
Count of daily reported cases	0.7051	0.8301	0.9433
Cumulative reported cases	0.1591	0.1636	0.1769
Cumulative death cases	0.4098	0.4164	0.4293
Cumulative hospitalized non-ICU admissions	0.1617	0.1582	0.1734
Cumulative hospitalized ICU admissions	0.8705	0.8734	0.8838
Daily hospitalized non-ICU census	0.7131	0.7506	0.8308
Daily hospitalized ICU census	1.1541	1.1846	1.2364
Sum of total	4.1734	4.3767	4.6738

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.