1. Introduction
For about 3 years, beginning in September 2019, the SARS-CoV-2/COVID-19 pandemic shook the world in an almost unprecedented way ([
1]). Three years after the outbreak, it is still not clear whether the pandemic entered an endemic phase and how long a crisis-like situation will persist or re-emerge ([
2]). There are still too many unknowns to be able to give clear prognoses, although the flood of COVID-19 related publications is without example ([
3,
4]). Equally unprecedented is the fact that a high proportion of the literature dealing with the pandemic is meta-scientific and/or meta-bibliographic in nature or belongs to the sociology of behaviour or the sociology of science. Topics include discussions of malicious or accidental miscommunication even within the scientific context ([
5]). Contributions address misconduct, but are themselves often characterized by sheer polemic, if not denunciation. This fact may best be summarised that we face a harsh "COVID-19 infodemic" ([
6]). No surprise that a number of surveys and studies clearly points to a polarisation and radicalisation in public attitudes and behaviour driven by a polarisation in elite rhetoric that hinders effective responses to the COVID-19 crisis ([
7]) and gives rise to an increasing social Darwinism ([
8]). Thus, health behaviour is increasingly driven by political ideology ([
9,
10,
11]) such that the resulting epidemic dynamics have become almost unpredictable and uncontrollable.
In this paper, we take a closer look at the spatio-temporal dynamics of the epidemic in Germany. Due to the aforementioned socio-behavioural imponderabilities, the problem is inherently systemic and adaptive in that sense, that preventive measures including their associated compliances and epidemic activity are bidirectionally related via nonlinear feedback loops. It follows, even if containment measures were precisely datable, one could not rule out the possibility that they would ultimately be counteracted and even be changed to the opposite. The notation of a "self-disorganisation" suggests itself. We here take the stance that the recorded, more or less objective incidences, should speak for themselves since most of the aforementioned socio-behavioural determinants defy quantifiability. We aim at presenting suitably quantified spatio-temporal patterns of the German epidemic activity in terms of features of weekly recorded incidence time series at the relatively fine-grained spatial resolution of German districts. In essence, our approach is explorative in that we invert the search direction: striking spatio-temporal incidence patterns provide timestamps and spatial clues that point to external determinants that trigger changes and allow for associations with prevailing social conditions.
"Systems thinking" is a collective term for very different conceptions of complex self-(dis)organised systems ([
12]), even if, strictly speaking, the term self-organisation was only used extensively in the context of synergetics ([
13]). The scientific nature of systems thinking has been questioned and it has, apparently, been conceived as a hermeneutic process ([
14,
15]), thus belonging to the context of discovery rather than to the context of justification in terms of Reichenbach’s partition of the epistemic process ([
16]). The pressing need to better understand the COVID-19 pandemic, and the sheer lack of convincing and actionable evaluations, encouraged us to place greater emphasis on the benefits of systems thinking. It is due to the very nature of complex systems that no unique optimal tool exists for their analysis. Each analysis tool allows to take a particular look at the system and the combination of such analyses reveals the different facets of the given complexity. Specifically, we derive diversity measures based on entropy calculations, but also apply more recently introduced methods from the field of network analysis. These include methods of dimensionality reduction, multidimensional scaling, as well as cluster analyses. Taken together, we try to provide as comprehensive a picture of the complexity of the pandemic as possible, using only available incidence data. Some facets of the complexity of COVID-19 epidemic have been published recently ([
17,
18]) and we ask the readers to combine these previous results mainly focusing on temporal features with the new findings presented here, which put more emphasis on spatial patterns.
2. Materials and Methods
Throughout the paper, statistical calculations and creation of graphs were done with R ([
19]). Used R program packages are mentioned at the appropriate places.
2.1. General Settings and Nomenclature
In the following, contextually either the 400 German rural/urban districts (counties) or the 16 federal states are labelled by index
or
, respectively. In graphs, however, federal states are represented by the official 2-letter abbreviation (see the list of abbreviations in the appendix). The age-specific incidences,
given by registered counts,
per sub-population size,
, at time point
t of counties
(or, depending on context, federal states
) have been calculated from the counts retrieved from the Robert Koch-Institute (RKI) database ([
20]) and from the respective age-specific sub-population sizes retrieved from [
21]. Depending on the context, age
a either refers to the age classes (in years)
or to
,
,
, respectively, and time
t is given by calendar date in weekly steps from January 2020 through end of August 2022, i.e. we have
time steps. Of note, Berlin is sub-divided into 12 administrative districts. Although COVID-19 counts are separately listed in the RKI database with respect to these districts, we use aggregated data for Berlin constrained by the database structure of the Census Bureau. Berlin as well as Hamburg thus function both as a single county and as a federal state.
2.2. t-sne
The "t-distributed stochastic neighbour embedding," in short t-sne, is a commonly applied dimensionality reduction algorithm ([
22]) with a precursor, called "SNE", introduce by [
23]. For a brief outline, assume that each of the 400 incidence time series corresponding to the 400 German districts is represented as a point in a 135-dimensional vector space with 135 being the length of the weekly sampled time series. SNE, and likewise t-sne, as applied here, reduces the 135-dimensional to a 2-dimensional vector space while preserving proximity of data points based on a distance or similarity measure. Such a reduction in dimensionality needs to specify a so-called
parameter, which, in a nutshell, reflects the users taste of how quickly similarity should fade out with increasing distance, i.e., which area is conceived as neighbourhood. This trick allows for a visual inspection of clustering patterns in two dimensions, if any. Calculations are based on algorithms provided by [
24].
2.3. UMAP and PCA
"Uniform manifold approximation and projection," in short UMAP, has been introduced by [
25] as a dimensionality reduction algorithm that out-performs t-sne by means of preserving global structures, at least such is the claim. However, comparative studies do not allow a definitive conclusion. Analogously to t-sne, UMAP necessitates a free parameter, called
, to be set, which defines a custom range of neighbourhood. Both algorithms compete with the well-known method of principal component decomposition/analysis (PCA) ([
26]), which seeks for an optimal explanation of variability after the decomposition while preserving the overall variance in the data. We present the results from applications to the COVID-19 time series of all three algorithms side by side and conceive this as a sensitivity analysis. Used algorithms are provided by [
27,
28]. The PCA algorithm is endowed with the possibility to calculate normal data ellipses around a set of predefined data points, which are assumed to constitute clusters. Therefore, an approximate (pseudo) confidence measure for the separability of clusters is available.
2.4. Correlation Matrix and Hierarchical Clustering
Calculating pairwise correlation coefficients of the 16 federal state-specific incidence time series can be conceived as a reduction to a 1-dimensional space since correlation is just a specific similarity measure. Along with hierarchical clustering, a visualisation of the correlation matrix should essentially yield the same information as a two-dimensional reduction, as long as specific structures that can only be recognised by a specific algorithm are absent. Both Pearson as well as Kendall correlations are calculated since it is common place, that Pearson is good in recognising linear correlations whereas Kendall can also be applied to nonlinearly correlated data vectors. Thus, a comparison of the results after the application of all suggested methods will decidedly bring added value in interpreting the dynamic hallmarks of the epidemic. The correlation plots are created using the R-package provided by [
29]. Hierarchical cluster analysis is based on Ward’s minimum variance method, which aims at finding compact, spherical clusters. Specifically, we use the "ward.D2" method, which means that dissimilarities are squared before cluster updating, according the package manual.
2.5. Multidimensional Scaling and Network Graphs
Multidimensional scaling (MDS) is the umbrella term of a family of dimensionality reduction algorithms. MDS aims at preserving distances or, conversely, proximities between data points. Note that this differs from the t-sne neighborhood embedding, which clusters neighboured points tightly in order to clearly visually separate the clusters. Here, we exclusively use non-metric MDS based on spline transformations. In other words, the spline function
f transforms dissimilarities
to disparities
via
. Corresponding dissimilarities in the low-dimensional space are then found by minimising a so called stress function, which in essence is a function of the difference between disparities in the original and the reduced space. For details cf. [
30].
Recently, MDS is found to be combined with graph visualisation with increasing popularity. Based on a scalar measure of proximity between any pair of data points (here time series), as e.g. correlation coefficients, the data points can still be represented in two dimensions in the form of nodes (or vertices), where the scalar similarities determine the strength of the connecting edges. The spatial arrangement of the nodes can then be based on similarities calculated by minimising the corresponding stress function. Other, usually ambiguous arrangements are in use, which can be constrained by the requirement of having non-overlapping nodes. Also popular is the simple arrangement of nodes into a circle, whereby nodes assumed to share a cluster can be arranged such that they are adjacently located on the circumference.
Although this type of presentation in form of a graph visualisation is suggestive to the eye of the beholder and therefore prone to misinterpretation, it can flank the exploratory approach if one is aware of the pitfalls. We present graphs which are based on correlation coefficients or on distances supplied by a PCA, respectively. Moreover, after the application of a Gaussian graphical model using LASSO, a partial correlation network can be derived, which reduces the number and strengths of the edges of the graph to a relevant magnitude. Please cf. [
30,
31] for a detailed description of the graph visualisation methods used here, including the Gaussian graphical model, referred to as "graphical LASSO."
Network graphs are commonly published together with so called centrality indicators. Network strength, sometimes also referred to as degree centrality, assigns an importance score to each node/vertex, which is, in our application, the sum of pairwise absolute values of correlations of the given to all other nodes. Betweenness centrality measures quantify how strong given nodes build bridges between other pairs of nodes. Closeness centrality scores each node based on their strengths to all other nodes in the network. It is worth of note, that both betweenness and closeness do not differ strongly from degree centrality, if the similarity of the nodes is measured in terms of correlations. However, we will report these centrality measures for the sake of completeness. Expected influence, occasionally called eigen centrality, measures a node’s influence onto the entire network. If correlations are used for the quantification of similarities, the expected influence differs from network strength only if positive and negative correlations are simultaneously present in the network, since expected influence does not use absolute values when summing up the correlations.
2.6. Spatial Heterogeneity
Spatial Shannon entropy at time
t is given by
from which a measure of diversity (or spatial heterogeneity), given by
can be calculated. The upper limit of 400 in the summation refers to the number of districts. However, in order to detect possible differences between East and West Germany, the summation will also be restricted to either the 75 East German or to the 325 West German districts. For details on interpreting the diversity function confer [
32], and for an analogous application within an epidemiological context see [
33]. Briefly, for a given age class
a at time point
t, equal incidences over all counties gives maximum entropy hence maximum diversity
, however, generally
since a synchronisation of the epidemic activity across districts appears to be unlikely. Particularly in the beginning of the epidemic with one or a few number of early index cases located within one or a few number of districts,
will be close to 0. Over the course of time, intervals with a more or less homogeneous distribution of incidences across counties will probably alternate with asynchronous epidemic activities, as a consequence of spatially unequally distributed index cases of new epidemic waves and differences in socio-behavioural conditions. Here, we focus on trends and abstain from presenting statistical significance, i.e., confidence intervals for
are ignored. In this regard, the reported incidence data may substantially deviate from true incidences such that confidence intervals would give rise to spurious certainty.
The following measure serves to capture changes in the dynamics of the epidemic in district/federal state
i,
i.e. fold changes of weekly fold changes in incidence, which has also been used in [
34]. Equation
4 goes to show the similarity to the weekly fold change in the effective reproduction number
of area
i. Taking the logarithm yields
which is used in the following due to its favourable symmetry with respect to zero. For convenience,
and
is used interchangeably, whereby the first version can straightforwardly be extended to comparisons of two areas
i and
j at a given point in time, i.e.
(for details cf. Equation
7 below).
With the exception of taking logarithms, Equation
5 bears resemblance to the so called difference-in-difference method, which has frequently been used to identify causal effects of COVID-19 non-pharmaceutical interventions (cf. [
35] for a review of the method and [
36] for a systematic review of applications within the scope of COVID-19). Within the latter context, the counterfactual difference-in-difference method is applied to a setting which is assumed to be quasi-experimental in nature. If
and
denote average incidences taken over a period before or after a containment measure has been mandated in district/federal state
i, respectively, then
measures the effect of the containment action when
j refers to a district/federal state without a corresponding mandate. Of note, Equation
6 does yield a reliable result if and only if areas
i and
j are "structurally" comparable, i.e., if a common trend assumption (constant underlying differences) holds (for details cf. [
35]).
Supposedly, the individual counties exhibit individual epidemic dynamics, in particular as a consequence of different (starting and stopping of) containment strategies, but also due to inherent socio-structural conditions, hence creating and amplifying spatial heterogeneity. Equation
5, therefore, serves as an auto-difference-in-difference method to detect dynamical change points. It appears plausible that a district remains structurally "self-similar" over time such that the auto-difference-in-difference is even more valid than the between-counties counterpart.
Specifically, as a consequence of the previous remarks, a special application of Equation
5 reads
which goes to show the difference in the dynamics of two distinct areas at a given time point, hence defining a "cross-difference-in-difference." If, e.g., two structurally similar districts
i and
j both follow exactly the same non-pharmaceutic intervention schedule, Equation
7 should then yield a time series constantly close to zero.
To complete, we use cross-correlation analyses based on Kendall’s correlation coefficient in order to quantify mutual associations of the dynamical patterns of areas (counties or federal states, respectively) expressed via the associated auto-difference-in-difference time series. Hierarchical clustering is applied in "Ward.D2" mode.
4. Discussion
Only a couple of months after the pandemic outbreak of SARS-CoV-2/COVID-19, RJ Klement presented a systemic picture including close to hundred components to build complex functional relations ([
40]). Although Klement’s network of "causal" relations is of rather wide scope and included interactions within and between levels of organisation (i.e. macro-micro interactions), it is still preliminary and far from exhaustive, as the author confirms. Klement thus contributed to a hermeneutic discourse on the nature of the pandemic, i.e., to its better understanding, but the question remains how such a complex picture can be operationalised. In practice, science is thrown back on ambiguous reductionist views of a few interacting components. Unprecedentedly, the conception of countermeasures based on both political as well as scientific criteria, indeed the entire culture of communication, has been overshadowed by an intra-scientific scramble for interpretive sovereignty as a consequence of this ambiguity ([
41]). This completely derailed communication culture undoubtedly itself added quite considerably to the pandemic-influencing mechanisms: an infodemic ([
6]). At least this important cluster of nodes should be added to Klement’s "causal" network.
The strategy followed here can be described as an attempt to largely dispense with model assumptions and simply let the data speak for themselves. In this sense, the approach is non-parametric and largely descriptive. The limitation of this approach is at the same time its strength. No causal relationships, or even functional dependencies, are shown. An important note on this: causality is an
a priori, i.e., a fundamental principle that cannot be derived from empiricism - not even from experimental empiricism. At best, a randomised controlled trial (RCT) provides evidence for functional dependencies under very specific conditions. Less evidence is attributed to results of regressions in the context of observational studies, even if the result thus obtained is highly self-evident. But be that as it may, at the end of the day, the conclusion and, in our opinion, already the choice of study design, including RCTs, is always already value-based (cf. [
42]).
The semi-quantitative and broadly descriptive way of proceeding to assess age- and county-specific COVID-19 incidence time series recorded in Germany reveal informative patterns. Most striking is the fact, that the time series can be allocated to two geographic similarity classes or clusters which coincide with the geopolitical division in East and West Germany. This confirms findings based on regression analysis ([
11]). The dependence of this clustering on the chosen measure of similarity and reduction algorithm could give rise to a possible critique. However, this criticism can be countered by the fact that 4 different methods produce the same results.
Spatial heterogeneity of COVID-19 incidence is waxing and waning in the course of time for each age class, however, remains lowest for the youngest age class (children, adolescents) during the entire time course. To interpret this phenomenon as a spatially homogeneous background incidence of the youngest age group may be going a bit too far, but a tendency in this direction is indicated. This view is supported by findings of increased seroprevalence among school-aged children (cf. e.g. [
43,
44,
45,
46] and references therein). Note, however, that a comparably lower virus load in children may lead to underdetection which in turn might also bias the calculation of spatial diversity. Furthermore, in East Germany we find a considerably larger spatial heterogeneity when being compared to West Germany. Seen in the light of marked geopolitical differences between East and West (cf. [
10]), this result may appear less puzzling. In addition, it is in agreement with other spatio-temporal characteristics as discussed in the following.
Temporal acceleration patterns do differ both geographically as well as age-related. Generally, for all federal states the dynamical changes in terms of fold changes of reproduction number gradually fade out in the course of time, although some intermittent bursts can be observed. Particularly striking is the fact that the dynamics for Baden-Wuerttemberg (BW) are relatively flat over the entire observation period. In sharp contrast, Saarland (SL), a very small federal state, does show the largest variability in fold-changes of the reproduction number. The cross-difference-in-difference method applied to all pairs of state-dependent incidence time series reveals patters of similarities and dissimilarites that have to be discussed from the perspective of geopolitical differences, which is beyond the scope of this contribution. However, we point to the observation that North Rhine-Westphalia (NW) and Hesse (HE) do not only exhibit similarity both in terms of cross-difference-in-difference and Kendall correlation, but these two federal states are also located in the center of the Western German cluster derived from a network analysis (
Figure 3). In this sense, NW and HE are "average states" with respect to the SARS-CoV-2/COVID-19 epidemic. The reader is encouraged to throw a glance on the series of figures in appendix S1. Among many other interesting patterns, Saarland (SL) exclusively shows very pronounced mutual cross-difference-in-difference curves and, therefore, makes SL unique in a certain sense.
In West Germany, children and juveniles as well as seniors do contribute more intensively to de-/acceleration of the epidemic spread when being compared to the middle-aged class. Such a difference cannot be observed in East Germany where all age classes equally strongly contribute to fold-changes of the reproduction number. This result is in agreement with what has been found in [
18] using a different analytical approach which gives rise to the interpretation that children have a significantly greater influence as a driver of the pandemic than previously suspected.
A possible explanation is the relatively unsteady dynamics of introduction and withdrawal of containment measures for kids, particularly in schools and daycare facilities. The pronounced impact of adults in the East on dynamical changes when being compared with the West German population may be due to carelessness or differing socio-political attitudes. And, indeed, evidence is mounting on how influential sociocultural aspects and personal beliefs are in relation to epidemic activity. For a more profound discussion of this issue cf. [
10,
11,
47,
48,
49,
50,
51]. Thus, with due caution we conclude that varying containment measures and their compliance, as well as regular occurrences like school vacations are much more instrumental to change the behaviour of non-adult people relevant for the control of the epidemic reproduction number. We hypothesise, however, that the effects of school openings are strongly confounded by the current local epidemic conditions, i.e. the current effective reproduction numbers, and, even more important, by prevailing preparedness, facilities and reliance at the schools and accountable local authorities. Thus, on the one hand, school opening can contribute to combat the epidemic in case of a quick detection of an infection and a proximate shelter-in-place order. On the other hand, school opening can worsen the situation in case of overwhelming infectivity and concurrent lack of preparedness. The pronounced irregularity observed for the epidemic dynamics corresponding to the young population teaches to shift the focus regarding control measures toward children and adolescents. Further research is needed to better understand the causes behind the observed irregularities.
Whether the observed local differences can be attributed to real differences in incidences or rather to locally different frequencies of testing, i.e. different numbers of undetected cases, is unclear (cf. [
52]). This is an unavoidable and probably the strongest limitation of evaluations that refer to publicly available registered case numbers. The observed spatio-temporal intermittency, i.e. the irregular alternation of phases, are presumably to some extent caused by the time-dependent detection ratio. Moreover, the existence of a depensation effect leading to a "detection threshold" cannot be ruled out. In other words, during a low incidence period, disposition to test might be particularly low, which might in turn entail unrealistically many "zero events." To put this limitation in a slightly better light, we point out that the interpretations given here are only suggestions anyway. The observed dynamic patterns are objective, but the associated potential explanations, including changing vaccination rates and the like, are not at all.
Finally, the lack of reliable data on all types of contact regulations does definitely limit the explanatory power of our analysis. However, we are able to report intrinsic spatio-temporal patterns of the epidemic that now can be linked to all types of socio-cultural occurrences that are suspected to influence the transmission dynamics. We presented the results from an explorative study and want to conclude by highlighting the captivating advantage of such a study: we let the data speak and did not use contestable model assumptions.