Introduction
Faced with a flood of information in real-life situations, we must unify inputs arriving from separate sensory events. Multisensory integration (MSI) thus protects us from cognitive overload and scaffolds our cognitive responses to life events. Schizophrenia is characterized by dysfunctional MSI capability, such as a widened temporal binding window (TBW) [
1], reduced temporal acuity [
2] and reduced illusory perception. [
3] These changes in multisensory processing usually influence cognitive and social functioning of people with schizophrenia in real-life scenarios, [
4] such as attention, [
5,
6] memory, [
7] speech-related communication, [
8,
9] and emotion interpretation. [
10,
11,
12] In contrast, individuals with schizophrenia have also been found to retain the ability to benefit from multisensory presentations in certain MSI tasks. [
13,
14] Considering that MSI is a complex concept encompassing several components beyond the integration of information from different sensory modalities, [
15] is it possible that individuals with schizophrenia show only partial abnormality in certain subordinate procedures during MSI processing?
Of various sensory modalities (e.g., visual, auditory, tactile and olfactory), the present study focused on audiovisual integration, one particular type of MSI, for a few reasons. Audiovisual integration is crucial for face-to-face communications and the exchange of emotional messages, and difficulties in audiovisual integration may lead to socio-communicative problems, especially for people with mental disorders, thus negatively influencing their quality of life. [
16] Specifically for schizophrenia patients, perceptual dysfunction in visual and auditory modalities has been regarded as a hallmark symptom, [
17,
18] and the integration of these two modalities has been substantially investigated in this population, [
1,
19,
20,
21] which seemed to be correlated with schizophrenic symptoms like hallucination. [
22] Therefore, because the widely studied audiovisual form of MSI could deepen our understanding of schizophrenia and potentially improve patients’ quality of life, the present study synthesized MSI findings primarily from the audiovisual domain.
Existing review articles on MSI ability in populations with neuropsychiatric disorders suggest that this ability can be examined through three essential components. [
23,
24,
25] First, the multisensory gain [
26] denotes the benefits or performance enhancement obtained from multisensory stimuli compared to unisensory stimuli, which is also known as the redundant signals effect [
27] or congruent facilitation effect. [
28] A few paradigms, including simultaneity judgement [
2] and temporal order judgement [
29], have been commonly used to explore the multisensory gain. A second MSI component is simply the direct processing of multisensory inputs. For example, in the McGurk [
30] and sound-induced flash illusion (SIFI) [
3] paradigms, participants need to fuse information concurrently from different sensory modalities into one percept, which reflects the outcome of MSI processing. Third, the processing discrepancy between different multisensory conditions, as a reflection of the MSI adaptation ability, [
31] has also been treated as part of the MSI capability. For example, in a typical emotion identification task, the identification of emotions in one modality can be influenced differently by different emotions in the other modality. [
11] Recent meta-analyses suggested that neural correlates of audiovisual integration in the human brain are highly context-dependent, with both commonly and uniquely engaged brain regions as a function of analytical contrast, stimulus complexity and attention. [
15,
32] Supporting the rationale for classifying audiovisual MSI into different components, unique activation patterns of brain regions were identified when the bimodal-unimodal contrasts (e.g., bilateral temporal cortices), audiovisual processing (e.g., right insula, right middle temporal gyrus) and bimodal-bimodal contrasts (e.g., right inferior frontal gyrus, right superior temporal gyrus, left occipital gyrus) were analyzed separately. [
15] Given such findings, it is necessary to analyze the behavioral and neurological changes of schizophrenia patients across different audiovisual MSI components, as unique patterns may be identified when distinct neural correlates are involved.
Considering the distinct neural mechanisms underlying these subordinate components of audiovisual MSI, the current study also aggregated extant results separately for each MSI component (for more detailed illustrations, see eIntroduction in Supplementary File S1; Supplementary Tables S1 and S2). By definition, the multisensory-unisensory (MU) contrast denotes the performance enhancement or processing benefit of audiovisual information, compared to unisensory stimuli. The audiovisual processing (AVP) component refers to the direct processing of audiovisual stimuli, without a contrast against other unisensory or multisensory stimuli. The multisensory-multisensory (MM) contrast denotes the differences in processing the different types of multisensory stimuli involved in particular MSI tasks, which reflects the MSI adaptation ability between various multisensory conditions.
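The three definitions above amount to a simple decision rule over the conditions that an effect size compares. The following is a minimal sketch of that rule, not the coding procedure used in this study; the function name `classify_effect` and the condition labels "AV" and "uni" are hypothetical and introduced only for illustration.

```python
from enum import Enum
from typing import Optional


class Component(Enum):
    """The three subordinate components defined in the text."""
    MU = "multisensory-unisensory contrast"
    AVP = "audiovisual processing"
    MM = "multisensory-multisensory contrast"


def classify_effect(condition: str, reference: Optional[str] = None) -> Component:
    """Map the conditions behind one effect size to a component.

    `condition` and `reference` use hypothetical labels: "AV" for an
    audiovisual condition and "uni" for a unisensory one. `reference`
    is None when the measure involves no contrast at all.
    """
    if reference is None:
        if condition != "AV":
            raise ValueError("a measure without a contrast must be audiovisual")
        return Component.AVP
    pair = {condition, reference}
    if pair == {"AV", "uni"}:
        return Component.MU   # audiovisual versus unisensory
    if pair == {"AV"}:
        return Component.MM   # one audiovisual condition versus another
    raise ValueError(f"unsupported contrast: {condition} vs {reference}")


print(classify_effect("AV", "uni").name)  # MU
print(classify_effect("AV").name)         # AVP
print(classify_effect("AV", "AV").name)   # MM
```

In practice, assigning an effect to a component also depends on the task and the outcome measure, so a rule of this kind would only be a starting point for manual coding.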
Another interesting question is the degree to which the clinical symptoms of individuals with schizophrenia predict their audiovisual MSI performance. Most prior studies reported non-significant correlations between MSI performance and schizophrenia symptom severity, [
5,
29,
33,
34,
35,
36,
37] but some still identified significant clinical relevance. [
21,
38,
39,
40,
41] These inconsistencies encourage us to reconsider the clinical implications of such correlations. For example, are such correlations indicative of the state or trait characteristics of schizophrenia? [
42] Can neurological and behavioral changes during MSI tasks serve as potential endophenotypes of schizophrenia? [
8,
19]
To our knowledge, no study to date has derived meta-analytical estimates for the subordinate components of audiovisual MSI processing or the clinical correlations between MSI and symptomatic measures in schizophrenia. Most extant review articles only qualitatively focused on specific cognitive domains [
25,
43,
44] or neural connectivity [
24] during MSI processing in schizophrenia, and very few have conducted meta-analyses to examine certain measurement metrics. [
45,
46] Therefore, this systematic review and meta-analysis had two primary objectives. First, we aimed to quantitatively synthesize the behavioral and neurological statistics about the differences between schizophrenia patients and healthy controls in audiovisual MSI tasks, and also examine the effects of various moderators through subgroup analyses and meta-regression. Second, we also aggregated correlational coefficients between symptom severity and audiovisual MSI performance in the schizophrenia population.
Primary Outcomes
Group Differences
Primary outcomes and important subgroup analyses are displayed in Table 2. Aggregated statistics suggested that schizophrenia patients showed overall impairments at a moderate magnitude in audiovisual MSI tasks (g = -0.50, SE = 0.07; Figure 2). There existed high heterogeneity among these studies at both the between-cluster (τ²(between-cluster) = 0.053, I² = 10.87%) and within-cluster (τ²(within-cluster) = 0.334, I² = 69.00%) levels. However, the impairment of audiovisual MSI in schizophrenia seemed to be ascribed to the AVP (g = -0.53, SE = 0.09, p < .001; Supplementary Figure 1) and MM (g = -0.71, SE = 0.15, p < .001; Supplementary Figure 2) components, rather than the MU (g = -0.23, SE = 0.17, p = .225; Supplementary Figure 3) component. Subgroup analyses further suggested that both behavioral (g = -0.52, SE = 0.07, p < .001; Supplementary Figure 4) and neural (g = -0.51, SE = 0.12, p < .001; Supplementary Figure 5) measures of the group differences in audiovisual MSI obtained similar results.
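To make the component-level pattern concrete, the sketch below derives approximate 95% confidence intervals from the pooled g and SE values reported above. It assumes a normal (Wald) approximation with a critical value of 1.96; the multilevel models behind the reported p-values may rely on a t-distribution, so the intervals are indicative only. The helper name `wald_interval` is hypothetical.

```python
def wald_interval(estimate, se, z_crit=1.96):
    """Approximate 95% interval under a normal approximation (assumption)."""
    half_width = z_crit * se
    return estimate - half_width, estimate + half_width


# Pooled Hedges' g and SE for each component, as reported in the text.
components = {"AVP": (-0.53, 0.09), "MM": (-0.71, 0.15), "MU": (-0.23, 0.17)}

intervals = {}
for name, (g, se) in components.items():
    lower, upper = wald_interval(g, se)
    intervals[name] = (lower, upper)
    excludes_zero = upper < 0 or lower > 0
    print(f"{name}: g = {g:.2f}, 95% CI [{lower:.2f}, {upper:.2f}], "
          f"excludes zero: {excludes_zero}")
```

Under this approximation, the AVP and MM intervals lie entirely below zero, whereas the MU interval spans zero, which is consistent with the non-significant MU estimate.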
Our fMRI meta-analysis was conducted based on 35 available peak coordinates and
t-values of clusters extracted from five studies. [
19,
31,
68,
69,
70] There was no significant residual heterogeneity between studies (
τ = 0.00,
Q = 6.41,
p = .17). The aggregated fMRI data reflected decreased activation in the left supramarginal gyrus (Brodmann area 48, cluster size = 161 voxels, MNI coordinates = [
19,
31,
68,
69,
70],
z = -2.07,
p(uncorrected) = .02,
I² = 10.69%) among schizophrenia patients during audiovisual MSI tasks. However, this result did not survive family-wise error rate (FWER) correction, so this aggregated evidence should be considered weak.
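The reported heterogeneity test can be checked by hand. The sketch below assumes that Q is referred to a chi-square distribution with df = k - 1 = 4 for the five included studies (the text does not state the degrees of freedom) and uses the closed-form survival function that holds for even degrees of freedom. The function name `chi2_sf_even_df` is hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable for even df.

    For df = 2m, P(X > x) = exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i      # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Q = 6.41 from the text; df = 4 is an assumption (five studies minus one).
p_value = chi2_sf_even_df(6.41, 4)
print(f"p = {p_value:.3f}")
```

The result rounds to the reported p = .17, which suggests that the assumed degrees of freedom are at least plausible.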
Correlations
Existing evidence only suggested weak correlations between the symptomatic measures and audiovisual MSI performance in schizophrenia (z = 0.16, SE = 0.07, p = .030; Figure 3). Moderate heterogeneity was identified at both the between-cluster (τ²(between-cluster) = 0.030, I² = 28.47%) and within-cluster (τ²(within-cluster) = 0.041, I² = 38.27%) levels. Neither the behavioral (z = 0.10, SE = 0.10, p = .348; Supplementary Figure 6) nor neural (z = 0.25, SE = 0.11, p = .077; Supplementary Figure 7) measures in MSI tasks were significantly associated with patients’ symptoms.
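For readers who prefer the correlation metric, the pooled values above can be back-transformed. The sketch assumes that the reported z values are Fisher z-transformed correlations, which this section does not state explicitly; under that assumption, r = tanh(z). The helper name `fisher_z_to_r` is hypothetical.

```python
import math


def fisher_z_to_r(z):
    """Back-transform a Fisher z value to a Pearson correlation."""
    return math.tanh(z)


# Pooled estimates as reported in the text.
pooled_z = {"overall": 0.16, "behavioral": 0.10, "neural": 0.25}

pooled_r = {name: fisher_z_to_r(z) for name, z in pooled_z.items()}
for name, r in pooled_r.items():
    print(f"{name}: z = {pooled_z[name]:.2f} -> r = {r:.3f}")
```

At this magnitude the transformation changes the values very little, so the pooled correlations remain small on either scale.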
Moderator Analyses
Detailed statistics of all meta-regression models are displayed in Supplementary Tables S7 (continuous moderators for group difference), S8 (categorical moderators for group difference) and S9 (all moderators for correlational coefficients). Significant moderators associated with the group difference and correlation effects are presented in Supplementary Table S10. Detailed descriptions of the moderator analyses are provided in eResults in Supplementary File S1.
For MSI in general, schizophrenia patients with higher scores on the Brief Psychiatric Rating Scale (BPRS) showed increased audiovisual MSI capability (b = 0.038, p = .035). In addition, various measuring metrics were predictive of different group effects (F(8, 200) = 2.358, p = .019), with priming effects (PE; g = 0.55, 95% CI [0.11, 1.00]) eliciting the best MSI performance in schizophrenia and d’-value (g = -1.04, 95% CI [-1.23, -0.86]) resulting in the largest between-group difference. Neither social (F(1, 207) = 0.066, p = .798) nor linguistic (F(1, 207) = 0.181, p = .671) complexity was predictive of group effect sizes. Of the three components, AVP was associated with clinical characteristics, while MM and MU were significantly associated with experimental stimuli. For AVP, higher BPRS scores (b = 0.033, p = .013) were associated with smaller between-group differences, and various measuring metrics (F(8, 132) = 3.396, p = .001) could also differentiate the discrepancy in MSI performance between patients and healthy controls (HCs). As for MM, a higher proportion of male participants (b = 0.027, p = .011) was predictive of better MSI adaptation within the schizophrenia group. As for the measuring indices for MM, the neural response (g = -1.04, 95% CI [-1.66, -0.41]) detected the largest negative effect sizes. Last, the between-group differences in integration ability, as reflected by the MU component, were predicted only by different stimuli (F(7, 23) = 2.685, p = .035), with soccer-beep stimuli (g = 0.64, 95% CI [-0.29, 1.56]) predicting the most integration benefits for schizophrenia patients and number-noise stimuli the least (g = -1.01, 95% CI [-1.86, -0.15]).
Interestingly, the more socially (F(1, 114) = 6.470, p = .012) or linguistically (F(1, 114) = 13.131, p = .0004) complex the stimuli were, the more likely correlations between more severe symptoms and worse MSI capability were to be identified in schizophrenia. Of the different measures, eye movements (z = 0.63, 95% CI [-0.16, 1.42]) predicted the highest positive correlations, while PE (z = -0.17, 95% CI [-2.82, 2.48]) and illusion rates (z = -0.02, 95% CI [-0.94, 0.89]) were predictive of negative correlations.
Quality Assessment, Publication Bias and Sensitivity Analyses
Quality assessment (QA) scores (see eResults in supplementary File S1; Supplementary Table S11) in our meta-analysis ranged from moderate (score = 8) to high (score = 17) for individual studies, with an average of 13.67 (SD = 2.15, median = 14). The QA scores were not significantly associated with either group (b = 0.02, SE = 0.03, 95% CI [-0.05, 0.09], p = .46) or correlation (b = 0.07, SE = 0.03, 95% CI [-0.08, 0.22], p = .21) effects.
Egger’s test indicated no significant funnel plot asymmetry (Supplementary Figure 8) for either the between-group (b = -0.814, 95% CI [-2.05, 0.42], p = .20) or the correlation (b = 0.182, 95% CI [-0.76, 1.13], p = .71) effect sizes. The trim-and-fill analyses added 24 studies for the group effects, resulting in a corrected medium pooled effect size (k = 233; g = -0.57, SE = 0.08, 95% CI [-0.73, -0.42], prediction interval [-1.88, 0.73], p < .001), and added five studies for the correlation effects, leaving a corrected small pooled effect size (k = 121; z = 0.18, SE = 0.07, 95% CI [0.05, 0.31], prediction interval [-0.37, 0.73], p = .018).
Sensitivity analyses by the leave-one-out method identified 54 outliers for the group effects (Supplementary Figure 9), and the pooled effect size without outliers remained moderately negative (g = -0.54, SE = 0.05, 95% CI [-0.63, -0.44], prediction interval [-1.04, -0.04], p < .001) with decreased overall heterogeneity (τ² = 0.063, I² = 39.78%). For the correlation effects, 16 outliers were eliminated (Supplementary Figure 10) and the pooled effect size remained small (z = 0.25, 95% CI [0.15, 0.34], prediction interval [0.02, 0.47], p < .001) with low heterogeneity (τ² = 0.011, I² = 24.81%). The fail-safe N analysis suggested that 30,131 and 1,776 additional unpublished studies, respectively, would be needed to negate the two pooled effect sizes, which corroborated the robustness of the current findings.
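The leave-one-out logic can be illustrated with a deliberately simplified sketch. It pools effects with fixed-effect inverse-variance weights, which is an assumption made for brevity and not the multilevel model used in this study, and the four effect sizes are hypothetical values chosen so that one of them behaves as an outlier.

```python
def pooled_effect(effects, ses):
    """Inverse-variance weighted mean (fixed-effect simplification)."""
    weights = [1.0 / se ** 2 for se in ses]
    return sum(w * g for w, g in zip(weights, effects)) / sum(weights)


def leave_one_out(effects, ses):
    """Re-pool the estimate with each effect size removed in turn."""
    results = []
    for i in range(len(effects)):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_ses = ses[:i] + ses[i + 1:]
        results.append(pooled_effect(kept_effects, kept_ses))
    return results


# Hypothetical data: three similar effects and one extreme value.
effects = [-0.5, -0.6, -0.4, -2.0]
ses = [0.2, 0.2, 0.2, 0.2]

full_estimate = pooled_effect(effects, ses)
loo_estimates = leave_one_out(effects, ses)
print(f"all effects: {full_estimate:.3f}")
for i, estimate in enumerate(loo_estimates):
    print(f"without effect {i}: {estimate:.3f} "
          f"(shift {estimate - full_estimate:+.3f})")
```

An effect whose removal shifts the pooled estimate far more than the others is a candidate outlier; the threshold for flagging one is an analytic choice that this sketch does not prescribe.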
Discussion
This systematic review and multilevel meta-analysis synthesized extant evidence on the audiovisual MSI capability of schizophrenia patients, calculating pooled between-group effect sizes and aggregating the correlational effect sizes between schizophrenia symptoms and patients’ MSI performance. For the first time, we analyzed patients’ MSI capability separately in three subordinate MSI components, and inspected the clinical correlations and significant moderators. Our findings challenge the conventional view that schizophrenia patients show systematic impairments in MSI processing, and may help reconcile a few inconsistent previous findings by providing a new research perspective. We highlight two noteworthy results here, and provide supplementary discussions in eDiscussion (Supplementary File S1).
First, our findings indicated that the moderately impaired audiovisual MSI capability in schizophrenia was primarily ascribed to audiovisual processing (AVP) and adaptation (MM), rather than the integration (MU) itself. The partially deficient audiovisual MSI capability that we identified (overall, AVP and MM) in schizophrenia supported previous review articles, which indicated overall dysfunctional MSI in schizophrenia in the temporal, [
45,
46] (non-)emotional [
25] and neurological [
24] domains. Based on our meta-regression results (Supplementary Table S8), the typically used temporal tasks (e.g., SJ) and illusion-based tasks (e.g., McGurk) consistently revealed worse MSI performance of patients in the AVP and MM components. Surprisingly, our aggregated data identified only small between-group differences (
g = -0.23) concerning the benefit brought by the multisensory integration (MU) itself. This corroborated sporadic evidence claiming that schizophrenia patients showed comparable [
14] or even greater [
71,
72] performance enhancement than HCs while processing audiovisual stimuli, compared to the unisensory information. As displayed in Supplementary Table S8, paradigms involving spatial distance judgement, lexical decision, long-term memory and speech-in-noise recognition distinguished the impaired AVP and MM components from the less impaired or even enhanced MU component. Pertaining to different measuring indices, the TBW and d’-difference detected the largest deficits of patients in the AVP and MM components, while the RT difference between audiovisual and unisensory conditions was found to best reflect the relatively preserved integration ability of patients. However, since only 31 MU effect sizes were pooled, our meta-regression findings are currently tentative and require further examinations.
In addition to the traditional meta-analysis evaluating the between-group effect sizes, our additional fMRI meta-analysis identified weak evidence for decreased activation in the left supramarginal gyrus in schizophrenia patients during audiovisual MSI tasks, which supplemented one recent systematic review reporting patients’ abnormal EEG oscillations and amplitudes in temporal-parietal regions during MSI processing. [
24] Empirical studies have also observed reduced activation in the left temporoparietal junction, [
69] the general parietal areas (dorsal visual stream), and delayed peak in the intraparietal sulcus [
73] in schizophrenia during audiovisual MSI tasks. However, since only five studies reported sufficient statistics for this meta-analysis and 88.57% (31 out of 35) of our extracted raw data reflected the AVP and MM components, our current fMRI meta-analysis provides only preliminary evidence and requires further investigation. More qualitative discussions on the neurological findings are provided in eDiscussion (Supplementary File S1).
A few mechanisms have been proposed to interpret the between-group differences. First, impairments in audiovisual MSI performance could be attributed to the deficits in unisensory representations. [
3] Since schizophrenia patients showed diminished decoding of the auditory speech signals [
14,
74] and abnormally disturbed visual perception, [
17] the between-group differences in audiovisual MSI might possibly result from the combination of the weakness and instability of unisensory perception in schizophrenia. [
2] Moreover, during audiovisual MSI tasks, the mutual influences between the two modalities may either facilitate or inhibit information processing, [
12,
75] and unisensory deficits can thus sway the MSI outcomes, which would even be amplified when participants were asked to selectively attend to one modality and inhibit distractions from the other modality. [
6,
76] However, recent research has reported that schizophrenia patients could still display reduced audiovisual temporal acuity even if their unisensory temporal processing was normal, [
77] so patients’ altered audiovisual MSI could not be satisfactorily explained by unisensory dysfunction alone. [
21] To determine the mechanisms underlying MSI changes in schizophrenia, the Bayesian modelling method has shown its merits in a few atypical populations (e.g., children, older adults and autism populations). [
78] Since MSI Bayesian models involve the computational process itself (i.e., the integration process) and the weighted unisensory estimates (i.e., unisensory processing), such models may supplement previous experimental evidence and help reveal the specific procedures resulting in the changes of MSI performance in schizophrenia. In addition to these interpretations, a few studies have also ascribed MSI changes in schizophrenia to impaired top-down modulation [
79] or inaccurate cognitive event structures (e.g., temporal, spatial), [
30] which might be independent of sensory binding operations. Although an investigation of the mechanisms underlying audiovisual MSI changes in schizophrenia is beyond the current scope, we encourage more sophisticated investigations of the pathological causes of MSI changes in schizophrenia that go beyond describing the phenomenon.
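The distinction drawn above between the integration computation and the weighted unisensory estimates can be illustrated with the standard reliability-weighted (maximum-likelihood) cue combination rule. This is a minimal sketch under the assumption of independent Gaussian noise in each modality; it is not necessarily the model used in the cited work, and the function name and the numerical values are hypothetical.

```python
import math


def combine_cues(est_a, sd_a, est_v, sd_v):
    """Reliability-weighted combination of an auditory and a visual estimate.

    Each cue is weighted by its reliability (inverse variance), assuming
    independent Gaussian noise in the two modalities.
    """
    rel_a = 1.0 / sd_a ** 2
    rel_v = 1.0 / sd_v ** 2
    w_a = rel_a / (rel_a + rel_v)
    w_v = 1.0 - w_a
    combined = w_a * est_a + w_v * est_v
    combined_sd = math.sqrt(1.0 / (rel_a + rel_v))
    return combined, combined_sd, w_a, w_v


# Hypothetical example: a noisy auditory cue and a more precise visual cue.
combined, combined_sd, w_a, w_v = combine_cues(est_a=10.0, sd_a=2.0,
                                               est_v=4.0, sd_v=1.0)
print(f"weights: auditory = {w_a:.2f}, visual = {w_v:.2f}")
print(f"combined estimate = {combined:.2f}, SD = {combined_sd:.3f}")
```

In a model of this kind, a noisier unisensory estimate (a larger `sd_a` or `sd_v`) and a deviation from the weighting rule itself are separable sources of altered multisensory performance, which is why such models may help attribute group differences to one stage or the other.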
Our second important finding was the weak correlation between schizophrenic symptoms and MSI performance, together with the finding that higher social or linguistic complexity predicted stronger correlations. During data extraction, we found that most included studies did not report the correlation coefficients because of non-significant results. [
5,
29,
33,
34,
35,
36,
37] Our analyses synthesized 116 coefficients from a subset of 17 studies, and the small pooled effect size in our meta-analysis in turn corroborated the widely acknowledged weak or null correlations in individual studies. One reason for the absence of clinical correlations with MSI capability was the relatively small sample size in previous studies. [
14] Another possibility could be the inclusion of patients with only moderate to low levels of symptoms in most studies. Of all included samples in our study, the score ranges for different symptom measures were similarly restricted (PANSSP: 8.21 to 17.4; PANSSN: 10.13 to 23.9; PANSSG: 18.3 to 38.29; SAPS: 5.1 to 37.9; SANS: 5.6 to 46; BPRS: 15.4 to 46.93). Hence, with the restricted range of symptoms, the performance of symptomatically severe patients has rarely been observed, thus reducing the likelihood of obtaining significant overall correlations between patients’ cognitive performance (e.g., audiovisual MSI) and clinical symptoms. Although our meta-regression analyses (Supplementary Table S9) suggested that the weak correlation was consistent across different symptom measures (e.g., BPRS, PANSS, SANS and SAPS) and symptom types (e.g., positive and negative), these findings concerning correlational effects might not reflect the whole schizophrenia spectrum and should be interpreted with extra caution.
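The attenuating effect of a restricted symptom range can be illustrated with the classical formula for direct range restriction. This is a sketch under stated assumptions (a linear, homoscedastic relationship and selection on the symptom measure only); the correlation and the SD ratio below are hypothetical and are not estimates from the included studies.

```python
import math


def restricted_r(r_full, sd_ratio):
    """Correlation expected in a range-restricted sample.

    r_full   : correlation in the unrestricted population (hypothetical)
    sd_ratio : SD of the symptom score in the restricted sample divided
               by its SD in the unrestricted population (0 < ratio <= 1)
    """
    numerator = r_full * sd_ratio
    return numerator / math.sqrt(1.0 - r_full ** 2 + numerator ** 2)


# Hypothetical example: a true correlation of .50 observed in samples
# whose symptom scores cover only half of the population spread.
attenuated = restricted_r(r_full=0.50, sd_ratio=0.5)
print(f"expected correlation under restriction: {attenuated:.3f}")
```

Under these hypothetical values, a moderate population correlation would appear noticeably smaller in the restricted samples, which is the pattern the paragraph above describes.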
Besides the two primary findings, our meta-regressive results could also inspire future investigations. Although the conclusiveness of our findings was limited by the original studies, such as the small number of particular types of effect size, the restricted symptom coverage, and the insufficient reporting of raw data, a few feasible directions can still be identified. Interestingly, we identified that the social and linguistic complexity of stimuli was predictive of different correlational strengths, but did not seem to influence the overall group differences or any subordinate components. Specifically, stimuli with complex social (e.g., face, body, gesture) or linguistic (e.g., word, phrase, sentence) characteristics may help detect severe symptoms when patients perform poorly in audiovisual MSI tasks. One possible explanation is that MSI processing of complex stimuli imposes extra difficulty by introducing higher cognitive demands and flexibility engagement, so such stimuli can detect the abnormalities in schizophrenia patients more sensitively without potential ceiling effects. [11, 70] However, the number of stimuli with high or low complexity for the correlation estimate was unbalanced, with many more socially complex (69.83%) but linguistically simple (86.21%) stimuli used in previous studies. As such, certain bias might have been introduced by the design of the original studies, thus weakening the explanatory and predictive value of our findings on stimulus complexity.
As for other significant meta-regressive findings, the unexpected relationship between higher BPRS score and smaller between-group difference requires further exploration, because higher scores in other scales of symptom measures did not consistently predict better MSI performance of patients (Supplementary Table S7). It should be noted that the significantly positive correlations here did not indicate better MSI performance of patients than healthy controls, but instead reflected mitigated impairments (i.e., smaller negative between-group effect sizes) of patients’ MSI capability. Here, we tentatively propose that the coexistence of hyposensitivity and hypersensitivity to sensory stimuli in schizophrenia may account for this correlation. [
80,
81,
82] Specifically, patients tend to omit relevant sensory information and are inactive in seeking sensory input (hyposensitivity), and they, at the same time, perceive all concurrent sensory stimuli as similarly salient without effectively filtering irrelevant information because of the defective inhibitory gating (hypersensitivity). [
18,
80] Recent evidence found that higher sensory hypersensitivity was correlated with higher schizotypal traits and symptom severity. [
82] Therefore, it could be possible that the patient group showed overall deficits in MSI relative to HCs partially due to sensory hyposensitivity, but patients with more severe or typical schizophrenic symptoms integrated multisensory inputs better than those with fewer symptoms because of their higher hypersensitivity level. In this case, the clinical implications of correlations between clinical characteristics (e.g., symptoms and illness duration) and patients’ performance still need clarification. For example, prior studies proposed that non-significant correlations might suggest the MSI performance as a trait marker of schizophrenia, [
1] which was independent of the current status and illness duration. [
42] It has also been recently pointed out that neurological changes (e.g., cerebellar activation and N1 amplitude reduction) during sensory integration may serve as a possible endophenotype of schizophrenia. [
8,
19] Therefore, both significant and non-significant correlational statistics are clinically indispensable for our understanding about schizophrenia, and their exact clinical meanings are still open for more substantial investigations.
Our findings have certain clinical implications, as the relatively preserved audiovisual integration ability in schizophrenia may promote novel designs of cognitive training programs. Traditional cognitive training for schizophrenia patients was mainly carried out through single sensory modalities over periods of two weeks to 12 months. [
83] Although such training improved certain higher-order cognitive functions in patients, questions remain about the sustainability of improvements, the efficacy for symptom mitigation and the time course of training. [
83,
84,
85] Our meta-analytical findings identified less impaired audiovisual integration capability (MU) in schizophrenia, so multisensory training programs are predicted to be more effective for patients, as they could comparably benefit from the information presented in different modalities. In support of our predictions, recent studies comparing the outcomes of audiovisual and unisensory training programs found that a short-term audiovisual training led to significantly stronger cognitive improvement (e.g., emotion identification, TBW) in schizophrenia patients, and the improvements lasted a week or even longer. [
86,
87] Based on our meta-regression findings, those MSI tasks that sensitively elicited patients’ less impaired MU procedure, such as distance judgement, lexical decision and speech-in-noise recognition, may also lead to satisfactory training outcomes. Another promising topic for training program design is to examine the transfer effects of MSI tasks, [
88] that is, whether the MSI avenue is cost-efficient in enhancing any different but malleable cognitive domains. Our other reported statistics are also informative for clinical practice. For example, stimuli with high social or linguistic complexity were found to predict significantly stronger symptom-performance correlations, indicating that such complex stimuli may be used to modulate the outcomes of audiovisual training [
89] and to evaluate symptom severity during early screening and prognosis of schizophrenia. Since research concerning the clinical interventions of MSI training on the recovery of patients’ cognitive functioning is still emerging, substantially more studies are urgently needed.
Our study also has certain implications for academic research. Future studies on this topic should evaluate the MSI components in greater detail, including the audiovisual processing (AVP), adaptation (MM) and integration benefits (MU). Of the three principles in MSI, [
90] the spatial and causal inference principles have attracted much less attention than the temporal principle in existing schizophrenia-oriented MSI research. Additionally, the clinical relevance (e.g., symptom measures, illness duration, medication) and experimental manipulation (e.g., stimuli complexity, MSI tasks, measuring indexes) to each of these components also need to be explored. For example, the influence of medication on patients’ MSI performance has been mentioned as a limitation in most studies, [
12,
40,
68,
71] but few studies have specifically controlled for and reported the effects of medication. Of all samples included in this study, almost all patients were under medication, some of whom were even taking multiple antipsychotic drugs. It is highly possible that behavior-symptom correlations in MSI tasks differ between patients who are and are not taking medication, and even between patients taking different types of medication. More limitations and implications for primary studies are provided in eDiscussion (Supplementary File S1).
There were important limitations to the current meta-analysis. First, we synthesized both behavioral and neural effect sizes of audiovisual MSI performance in schizophrenia, which introduced high heterogeneity in the aggregated analyses. Although we addressed the heterogeneity through multilevel models, subgroup analyses and moderator meta-regression, there may still be unexplained heterogeneity. Second, our correlation analyses were highly constrained by the data reporting in the original articles. Most included studies only reported the significant correlational coefficients and omitted the non-significant ones. Due to the noticeably high variance in patients’ performance, future MSI studies in the schizophrenia population are encouraged to enroll more participants and provide the exact correlational statistics. Third, our current meta-analysis emphasized the clinical relevance of MSI ability to schizophrenic symptoms, but some other dimensions, such as cognitive ability and social functioning, may provide more direct implications on how MSI capability influences the daily lives of schizophrenia patients. Of all 46 quantitatively reviewed studies, however, only 15 (32.61%) measured these dimensions. Moreover, different studies used various scales or test batteries, [
5,
33,
40,
68,
73] and few have established correlations between MSI performance and the score of such measures. In this case, it was infeasible to quantitatively synthesize correlational findings between MSI ability and scores other than symptoms in the current study. We encourage future studies to continue establishing the critical links between MSI and higher-order abilities, especially for schizophrenia patients. Finally, our meta-analysis provided less information for neural activation in passive viewing paradigms, and for individuals with first psychotic episodes or early-stage psychoses, due to the lack of inclusion.
In conclusion, the current study, for the first time, systematically and quantitatively synthesized effect sizes on audiovisual MSI in schizophrenia, which not only revealed between-group differences in a series of behavioral and neural metrics, but also characterized the strengths of the clinical correlations in detail. Moderate-level dysfunctions were identified in the overall audiovisual MSI capability in schizophrenia, but such impairments mainly existed in audiovisual processing (AVP) and adaptation (MM), rather than the integration itself (MU). Furthermore, weak clinical correlations were observed between schizophrenic symptoms and patients’ MSI performance. Clinical characteristics (e.g., BPRS score) were predictive of overall between-group differences, and stimuli with complex social or linguistic characteristics could help observe stronger clinical correlations. Our aggregated findings on audiovisual MSI, which involved a less impaired subordinate integration procedure, warrant more sophisticated investigations in future studies and provide important clinical insights towards the design of cognitive intervention programs, which will hopefully improve the quality of life for this population.