2.1. Context.
For ASD and comorbid MI, the prevalence of the comorbid condition, P(MI), is generally known, as is one of the conditional probabilities. This is usually P(MI|ASD), because female ASD is cryptic, although here the ASD must have been diagnosed. If there is a representative diagnosed ASD population then P(MI|ASD) may be determined. Because female ASD is cryptic, the proportion of ASD as a comorbidity in a mental illness has usually not been established and P(ASD|MI) is unknown. The key to the management of intractable mental illnesses in females due to comorbid ASD is therefore finding the crucial P(ASD|MI). P(ASD) must first be established so that Bayes’ theorem can be used. While the overall theme of this paper is supported by the qualitative information, P(ASD) is needed to assess the true extent of the problem and focus the clinical and bureaucratic mind. Three methods to do so were employed using the 3 reliable recognition methods listed in Section 1.6.
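Method (1) [2] calculated biases and adjusted published biased estimates of P(ASD).
Method (2) [2, 13] used P(ASD|MI), P(MI) and P(MI|ASD) in Bayes’ theorem:
$$P(\mathrm{ASD}) = \frac{P(\mathrm{ASD} \mid \mathrm{MI})\, P(\mathrm{MI})}{P(\mathrm{MI} \mid \mathrm{ASD})}$$
Method (3) [13] used the mathematical equivalence of a hazard ratio and a likelihood ratio (LR) [13], and employed a more visible comorbid mental illness (anorexia nervosa) as a “test” for ASD using the odds version of Bayes’ theorem. We can then generalize to other mental illnesses:
$$\frac{P(\mathrm{ASD} \mid \mathrm{MI})}{1 - P(\mathrm{ASD} \mid \mathrm{MI})} = \mathrm{LR} \times \frac{P(\mathrm{ASD})}{1 - P(\mathrm{ASD})}$$
where the LR is P(MI|ASD)/P(MI|not ASD).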
These methods yielded a median P(ASD) of 0.060 as detailed in
Section 1.6. With this prevalence value of 6.0% for female ASD we can now reverse the process and depending on the reported variables, calculate the proportion of females, P(ASD|MI), who have comorbid ASD with their diagnosed mental illness. There are 3 possible ways to find the result:
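(1) P(ASD|MI) may have been directly measured. For the reasons described this is uncommon.
(2) It is relatively common to find values for P(MI|ASD) because in the adolescent literature it has long been known that comorbid mental illness is common in this ASD population. In the adult literature researchers with a primary ASD focus are now assessing for comorbidities. Bayes’ theorem can be used if we know P(MI|ASD), P(ASD) and P(MI):
$$P(\mathrm{ASD} \mid \mathrm{MI}) = \frac{P(\mathrm{MI} \mid \mathrm{ASD})\, P(\mathrm{ASD})}{P(\mathrm{MI})}$$
(3) If a hazard ratio is available, it can be used as the likelihood ratio (LR) in the odds version of Bayes’ theorem:
$$\frac{P(\mathrm{ASD} \mid \mathrm{MI})}{1 - P(\mathrm{ASD} \mid \mathrm{MI})} = \mathrm{LR} \times \frac{P(\mathrm{ASD})}{1 - P(\mathrm{ASD})}$$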
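The second and third routes can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' own code: the function names are hypothetical, and the inputs are values quoted in Section 2.2 (adolescent depression with P(DP|ASD) of 0.202 and P(DP) of 0.084 [59], and the bipolar disorder hazard ratio of 5.85 [58]).

```python
def posterior_from_bayes(p_mi_given_asd, p_asd, p_mi):
    """Probability form: P(ASD|MI) = P(MI|ASD) * P(ASD) / P(MI)."""
    return p_mi_given_asd * p_asd / p_mi


def posterior_from_hazard_ratio(hazard_ratio, p_asd):
    """Odds form, treating the hazard ratio as the likelihood ratio."""
    prior_odds = p_asd / (1.0 - p_asd)
    posterior_odds = hazard_ratio * prior_odds
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)


P_ASD = 0.060  # median female ASD prevalence used in the text

print(round(posterior_from_bayes(0.202, P_ASD, 0.084), 3))  # 0.144
print(round(posterior_from_hazard_ratio(5.85, P_ASD), 3))   # 0.272
```

Both outputs agree with the values reported for depression and bipolar disorder in Section 2.2.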
2.2. P(ASD|MI) for selected mental illnesses in adolescents and young adults.
With these methods we can now calculate the proportion of ASD given different mental illnesses. The range of values for some of these conditions is wide, and recent and reasonably conservative data have been used where available. Precise values are not critical. The aim of this paper is to set a frame of reference to establish the clinical importance of the relationship P(ASD|MI).
Figure 5 shows indicative timing of onset of comorbid illnesses in females to put the results in context. Half of all individuals with a mental health disorder have their onset by 18 years, and 62.5% by age 25 [
47]. Mental illness is a problem for the young, emphasizing the importance of an informed transition to adult services.
- 1.
Anorexia nervosa (AN).
There is reasonable current agreement [
48,
49] on a range of about 0.20 to 0.30 for P(ASD|AN). A value of 0.25 was used to calculate P(ASD) [
13].
Margari et al. [
50] give P(AN|ASD) of 0.068 compared to the published value [
13] of 0.083. With the same values [
13] for P(ASD|AN) of 0.25 and P(AN) of 0.02 this gives P(ASD) of 0.074. This is the third quartile value of the published median estimate of 0.060 and independently corroborates the estimated P(ASD).
- 2.
Schizophrenia spectrum disorder (SSD).
Published values for P(ASD|SSD) vary widely, likely due to significant diagnostic overshadowing, but the upper estimates are about 0.5 [
51,
52,
53]. Overall P(SSD) is about 1% with a male:female ratio of 1.4:1 [
52,
53,
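54] giving a female P(SSD) of 0.0083. If we take P(SSD|ASD) to be 0.06 [55] then by Bayes’ theorem with P(ASD) of 0.06:
$$P(\mathrm{ASD} \mid \mathrm{SSD}) = \frac{P(\mathrm{SSD} \mid \mathrm{ASD})\, P(\mathrm{ASD})}{P(\mathrm{SSD})} = \frac{0.06 \times 0.06}{0.0083} \approx 0.43$$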
This suggests values of up to 0.5 or 50% are plausible. This is supported by the shared genetics of SSD and ASD [
56].
- 3.
Borderline personality disorder (BPD).
The value is uncertain with many overlapping features and uncertainty about the age when the diagnosis should be considered but 14.6% [
57] is widely referenced giving P(ASD|BPD) of 0.146. This value was used to calculate a P(ASD) of 0.060 [
2].
- 4.
Bipolar disorder (BP).
In a cohort study, 267 young adult females with ASD were matched with 534 referents, and the hazard ratio for BP in ASD, ie P(BP|ASD)/P(BP|not ASD), was 5.85 [58]. Given P(ASD) of 0.060, by method (3) this gives P(ASD|BP) of 0.272 or 27.2%.
- 5.
Depression (DP).
Kirsch et al. [
58] gave a hazard ratio for depression-related diagnoses of 2.28 which gives P(ASD|DP) of 12.7%. Pezzimenti et al. [
59] for adolescents gave P(DP|ASD) of 0.202 and P(DP) of 0.084. With P(ASD) of 0.060 this gives P(ASD|DP) of 0.144 or 14.4%. Hudson [
60] gave an effective lifetime hazard ratio of ~4 giving P(ASD|DP) of 0.203 or 20.3% consistent with the cumulative incidence of depression increasing with age.
- 6.
Anxiety disorders (ANX)
Kirsch et al. [
58] gave a hazard ratio for anxiety-related diagnoses of 2.91 which gives P(ASD|ANX) of 15.7%. Croen et al. [
61] gave data for anxiety-related diagnoses in adults, allowing calculation of a hazard ratio of 3.22 which gives a P(ASD|ANX) of 0.170 or 17.0%.
- 7.
Obsessive compulsive disorder (OCD)
From a UK mental health trust study [
62] of females aged 4-17: females with OCD + ASD (121), OCD (522), ASD (1625). Then P(ASD|OCD) is 121/522 ie 0.2318 or 23.2%. This paper also gives P(OCD|ASD) of 0.0745. With a female OCD prevalence of 0.015 [
63] Bayes’ theorem gives a P(ASD) of 0.047, another independent measure of the high prevalence of female ASD. The relatively young age range may explain the value at the low end of the overall calculated range of P(ASD).
- 8.
Social anxiety disorder (SA)
Values vary for P(SA|ASD) but are generally high. If we use an indicative value of 0.45 with a lifetime P(SA) of 0.103 [
64] then with P(ASD) of 0.060 we find a P(ASD|SA) of 0.262 or 26.2%. This is higher than the estimate for anxiety-related diagnoses but this is not surprising given that social communication is the major problem in ASD. The conclusion overall is that ASD is common in anxiety disorders.
- 9.
Perinatal depression (PND).
Pohl et al. [
65] found a postnatal depression rate P(PND|ASD) in autistic women of 60%. This was potentially signaled by an antenatal rate of 40%. Luca et al. in a 2017 study [
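66] found a postnatal rate in pregnancy, P(PND), of 11.5%. Then we find:
$$P(\mathrm{ASD} \mid \mathrm{PND}) = \frac{P(\mathrm{PND} \mid \mathrm{ASD})\, P(\mathrm{ASD})}{P(\mathrm{PND})} = \frac{0.60 \times 0.060}{0.115} \approx 0.313$$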
The likelihood of autism in postnatal depression may be a shocking 31.3%. Two thirds of these mothers will have antenatal depression [
65] and it is essential to make neurodiversity friendly arrangements to ease the stress of birthing for autistic mothers [
35].
- 10.
Post traumatic stress disorder (PTSD)
PTSD is experienced by 10-12% of women [
67]. There is not a precise figure for female P(PTSD|ASD) but a lifetime figure may be as high as 60% [
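68]. Then we find, taking P(PTSD) as 0.12:
$$P(\mathrm{ASD} \mid \mathrm{PTSD}) = \frac{P(\mathrm{PTSD} \mid \mathrm{ASD})\, P(\mathrm{ASD})}{P(\mathrm{PTSD})} = \frac{0.60 \times 0.060}{0.12} = 0.30$$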
Considering the stresses that autistic women have to cope with [
17] it is not surprising that up to 30% of women with PTSD may be autistic and will need to be diagnosed so the PTSD is not perpetuated by inappropriate therapy.
- 11.
Any mental health disorder (MI)
Nyrenius et al. [
69] give P(ASD|MI) as 18.9%. Studies of P(MI|ASD) are quite heterogeneous. LeCavalier et al. [
70] gives any externalizing disorder as 80.9% and any internalizing disorder as 43.6%. Lever et al. [
71] give 79% with no gender breakdown but female rates are typically higher than male rates. If we take 80% as representative then with the National Institute of Mental Health [
72] 2021 value for female P(MI) of 0.272 and P(ASD) of 0.060 we obtain P(ASD|MI) of 0.176 or 17.6%. These values are less than for some individual conditions because it is common with ASD to have multiple mental health comorbidities [
73]. The overall magnitude of P(ASD|MI) however reinforces the primary thesis that female ASD is being routinely missed in female mental illness. Between one in six and one in five women with mental illness are autistic.
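The probability-form calculations in the numbered list above can be reproduced with a short script. This is a sketch under stated assumptions: the helper names are hypothetical, and the conversion of the overall SSD prevalence into a female prevalence assumes equal numbers of males and females in the population, which the text does not state explicitly. PTSD is evaluated at both ends of the quoted 10-12% range.

```python
P_ASD = 0.060  # median female ASD prevalence from the text


def bayes(p_mi_given_asd, p_mi, p_asd=P_ASD):
    """P(ASD|MI) = P(MI|ASD) * P(ASD) / P(MI)."""
    return p_mi_given_asd * p_asd / p_mi


def female_prevalence(overall, male_to_female):
    """Female prevalence from an overall prevalence and a male:female ratio.

    Assumes a population with equal numbers of males and females
    (an assumption of this sketch).
    """
    return 2.0 * overall / (1.0 + male_to_female)


p_ssd_female = female_prevalence(0.01, 1.4)

# (label, P(MI|ASD), P(MI)) as quoted in the numbered list
inputs = [
    ("SSD", 0.06, p_ssd_female),
    ("SA", 0.45, 0.103),
    ("PND", 0.60, 0.115),
    ("PTSD, P(PTSD) = 0.10", 0.60, 0.10),
    ("PTSD, P(PTSD) = 0.12", 0.60, 0.12),
    ("Any MI", 0.80, 0.272),
]

results = {label: bayes(p_mi_asd, p_mi) for label, p_mi_asd, p_mi in inputs}
for label, value in results.items():
    print(f"{label}: P(ASD|MI) = {value:.3f}")
```

The script reproduces the female P(SSD) of 0.0083 and the reported values for SA (0.262), PND (0.313) and any mental illness (0.176). For PTSD, a prevalence of 0.12 reproduces the 30% quoted in the text, while a prevalence of 0.10 would give a higher value.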
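The outcomes of the calculations of the proportion of women with comorbid ASD in mental illness are summarized in
Table 1. The Bayes’ calculation can be done by the probability equation:
$$P(\mathrm{ASD} \mid \mathrm{MI}) = \frac{P(\mathrm{MI} \mid \mathrm{ASD})\, P(\mathrm{ASD})}{P(\mathrm{MI})}$$
or the odds equation:
$$\frac{P(\mathrm{ASD} \mid \mathrm{MI})}{1 - P(\mathrm{ASD} \mid \mathrm{MI})} = \mathrm{LR} \times \frac{P(\mathrm{ASD})}{1 - P(\mathrm{ASD})}$$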
where the likelihood ratio is P(MI|ASD)/P(MI|not ASD). This is mathematically equivalent to the hazard ratio which is commonly reported as the ratio of the proportion of a mental illness comorbid with ASD to the prevalence of the mental illness in the rest of the relevant population without ASD [
58,
60,
61]. This is providing the presence of comorbid ASD does not alter the proportion of the MI diagnosed. If it is not altered the accuracy of the proportion of the MI diagnosis in each population (ASD and not ASD) is not critical since a systematic error in diagnosis will be cancelled when deriving the ratio.
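The cancellation argument can be illustrated numerically. In the sketch below the "true" rates and detection fractions are hypothetical values chosen only for illustration, and the function names are likewise assumptions. The hazard ratios at the end are those quoted in Section 2.2, used as likelihood ratios with P(ASD) of 0.060.

```python
import math


def likelihood_ratio(p_mi_given_asd, p_mi_given_not_asd):
    """LR = P(MI|ASD) / P(MI|not ASD)."""
    return p_mi_given_asd / p_mi_given_not_asd


def posterior_from_lr(lr, p_asd=0.060):
    """Odds form of Bayes' theorem, returned as a probability."""
    posterior_odds = lr * p_asd / (1.0 - p_asd)
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical true rates, chosen only to illustrate the cancellation.
true_asd, true_not_asd = 0.30, 0.10
detected = 0.5  # same fraction of MI cases diagnosed in both groups

lr_true = likelihood_ratio(true_asd, true_not_asd)
lr_observed = likelihood_ratio(detected * true_asd, detected * true_not_asd)
print(math.isclose(lr_true, lr_observed))  # the systematic error cancels

# If detection differs between the groups, the ratio is not preserved.
lr_biased = likelihood_ratio(0.5 * true_asd, 0.8 * true_not_asd)
print(round(lr_biased, 3))

# Hazard ratios quoted in Section 2.2, used as likelihood ratios.
hazard_ratios = {"BP [58]": 5.85, "DP [58]": 2.28, "DP [60]": 4.0,
                 "ANX [58]": 2.91, "ANX [61]": 3.22}
posteriors = {k: round(posterior_from_lr(hr), 3)
              for k, hr in hazard_ratios.items()}
print(posteriors)
```

The posteriors match the values reported in Section 2.2 (0.272, 0.127, 0.203, 0.157 and 0.170). The biased case shows why the proviso matters: the equivalence holds only if comorbid ASD does not change the proportion of the MI that is diagnosed.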
2.4. Validation of the female prevalence value for P(ASD).
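We can use the data from Figure 6, which summarizes the relation of autism and mental illness in the female population, to obtain another independent value for P(ASD) using Bayes’ theorem:
$$P(\mathrm{ASD}) = \frac{P(\mathrm{ASD} \mid \mathrm{MI})\, P(\mathrm{MI})}{P(\mathrm{MI} \mid \mathrm{ASD})} = \frac{0.189 \times 0.272}{0.80} \approx 0.064$$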
Figure 6 summarizes the relation of autism and mental illness in the female population.
In addition to the 14 values already determined [
13] we can add 0.047, 0.064 and 0.074 found in
Section 2.2. The median value of P(ASD) is then 0.060, with Q1 0.057, Q3 0.074, IQR 0.017, range 0.047-0.094, number of values 17.
For individual mental illnesses, values for P(MI|ASD) and P(MI) are quite variable due to heterogeneous populations, different diagnostic methods and diagnostic overshadowing. Average values for individual comorbid illness P(MI|ASD) and mental illness P(MI) are going to be very different due to the natural variation in prevalence of the different mental illnesses. The degree to which they are comorbid with ASD is going to vary, and there are different degrees and combinations of multiple comorbidities in individual patients. All these variables are integrated in the values of P(MI|ASD), P(ASD|MI) and P(MI) for overall female mental illness shown in
Figure 6. The value of P(ASD) derived from the data for overall mental illness was 0.064. The mean for the other 16 measurements of P(ASD) was 0.064. This suggests the median of 6.0% and mean of 6.4% for the prevalence of female ASD are plausible.
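The three additional estimates of P(ASD) can be recomputed from their inputs with the rearranged form of Bayes’ theorem. This is an illustrative sketch with a hypothetical function name. It assumes the overall mental illness inputs are the values quoted in Section 2.2 (0.189, 0.272 and 0.80). The 14 previously published values [13] are not reproduced here, so the median and quartiles are not recalculated.

```python
def prevalence_from_bayes(p_asd_given_mi, p_mi, p_mi_given_asd):
    """Rearranged Bayes: P(ASD) = P(ASD|MI) * P(MI) / P(MI|ASD)."""
    return p_asd_given_mi * p_mi / p_mi_given_asd


# OCD counts from the UK mental health trust study [62]
ocd_and_asd, ocd, asd = 121, 522, 1625
p_asd_given_ocd = ocd_and_asd / ocd  # about 0.2318
p_ocd_given_asd = ocd_and_asd / asd  # about 0.0745

estimates = {
    "OCD": prevalence_from_bayes(p_asd_given_ocd, 0.015, p_ocd_given_asd),
    "Any MI": prevalence_from_bayes(0.189, 0.272, 0.80),
    "AN": prevalence_from_bayes(0.25, 0.02, 0.068),
}

for label, value in estimates.items():
    print(f"{label}: P(ASD) = {value:.3f}")
```

The outputs round to 0.047, 0.064 and 0.074, the three values added to the published set.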
2.5. Degree of benefit: Pareto calculations.
2.5.1. The Pareto principle in health.
The Pareto principle is commonly called the 80/20 rule where 20% of the cause leads to 80% of the effect. While the variable strength of a particular cause will be a continuous power function, the division into 2 parts comparing the effect of the average of the strongest 20% of the cause to the average of the remaining 80% is a useful simplification. There may be multiple causes depending on the scenario but for the effect of ASD on female mental health we will consider ASD as a single cause. In mental health financial cost is a useful measure but is not necessarily the limiting variable. This is often a lack of trained staff poorly distributed in the population. A useful analogy for the drivers is the first law of thermodynamics. The amount of energy remains constant but can be moved around to do work. We will define a unit of energy as what is necessary to do the work to deliver a unit of service to an average patient in the less difficult group. This could equate to funding or performing a service. The two proportions do not have to be 80/20 and do not even have to add up to 100%. The Pareto principle is a useful simplification which seems to have wide application. We will now analyse why this might be so and apply it to health management.
2.5.2. Derivation of Pareto formulae.
The aim is to derive formulae to estimate the gain in system efficiency of management by treating the ASD of comorbid mental health patients in a manner empathetic to neurodiversity.
To examine the general case of the relation of 2 causal proportions let the higher patient proportion with the weaker average effect be p and the lower patient proportion with the stronger average effect be q. Proportion p of the patients requires proportion q of the energy so an average patient in group p uses q/p energy. An average patient in group q uses proportion p/q energy. Then the energy cost ratio of the average high work patient in group q to the average low work patient in group p is (p/q)/(q/p) or (p/q)².
For the 80/20 rule the relative effect will be 16/1 (
Figure 7). The Pareto principle is said to be an empirical rule but in biology there is a plausible reason why it is rarely higher, say 90/10. Due to the exponential nature of cause and effect for delivery of a costly labor intensive biological service there will be a logistical upper limit to the ratio. The absolute ceiling will be the most intensive inpatient care possible or its clinical equivalent in the system of interest. There is evidence intense limited medical services do obey the 80/20 ratio [
39,
76,
77,
78] and the p²/q² ratio of 16/1 for the difficult 20% to the easier 80% reflects the logistical ceiling. The likely patient work distribution for a ratio of 80/20 is shown in
Figure 8. Focusing on the needs of the high risk group is intuitively sensible but can be seen as inequitable if large numbers of patients get no care at all. We will show how to quantify its dramatic positive effect on the whole health system.
If we assume the energy cost of the service is the sum of the costs of individual consultations then the energy cost of the visits of the high work group q is:
$$q \times \frac{p^2}{q^2} = \frac{p^2}{q}$$
The energy cost of the visits of the low work group p is p × 1.
Total energy cost in groups p and q in units of the work per average patient in group p is:
$$p + \frac{p^2}{q} \qquad (1)$$
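The per-patient ratio and the group totals can be checked for the 80/20 case. The function below is a hypothetical helper written for this illustration. It takes the total energy cost p + p²/q to be what the text later calls formula (1), which is an inference from the surrounding derivation.

```python
def pareto_costs(p, q):
    """Work per patient and per group, in units of one average group p patient.

    p: number (or proportion) of patients in the low work group
    q: number (or proportion) of patients in the high work group
    """
    ratio = (p / q) ** 2   # work per average group q patient
    cost_q = q * ratio     # equals p**2 / q
    cost_p = p * 1.0       # one unit per group p patient
    total = cost_p + cost_q
    return ratio, cost_p, cost_q, total


ratio, cost_p, cost_q, total = pareto_costs(80, 20)
print(ratio)           # 16.0
print(cost_q / total)  # 0.8
```

With an 80/20 split the 20% of patients in group q account for 80% of the total work, which is the Pareto statement itself.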
In a typical current reasonably well functioning health system the limiting variable will be staff availability. Group q will have been triaged for treatment but there will be a long waiting list for inclusion in the less severely affected group p. Since ASD makes MI more difficult to treat, the comorbid patients will be distributed preferentially into the higher work group q. In calculating a formula for the distribution of the comorbid ASD patients in the entire p + q patients with a mental illness we will assume that on average the patients without ASD in each group, p and q, will require the same range of energy input as those with ASD. With overall P(ASD|MI) of nearly 20% it is likely that most of the patients in group q will have ASD. ASD should however be sought when assessing all MI patients since the patients in group p will also benefit. ASD does have continuous traits [
79] which will be distributed over a number of further patients in both groups. These patients will also benefit from an empathetic approach to neurodiversity and contribute an extra unmeasured benefit to the final outcome.
We then start with p + q patients with a mental illness (MI):
Each group q patient needs p²/q² units of work. Each group p patient needs one unit of work.
Let the proportion P(ASD|MI) with comorbid ASD be m.
Then the number of patients with ASD is m(p + q).
Let the proportions of patients with ASD in groups p and q be s/(r + s) and r/(r + s).
Then the total work required by ASD patients in group q is p²/q² × m(p + q) × r/(r + s).
Then the total work required by ASD patients in group p is 1 × m(p + q) × s/(r + s).
Total work required for all comorbid ASD patients is:
$$\frac{m(p+q)\,(p^2 r + q^2 s)}{q^2 (r+s)}$$
Let the fractional efficiency gain by treating ASD patients in an empathetic neurodiverse manner be e and assume this fraction applies at all degrees of severity. Then reduction in work needed (= energy saved) in units required to treat group p patients is:
$$\frac{e\,m(p+q)\,(p^2 r + q^2 s)}{q^2 (r+s)} \qquad (2)$$
This energy can be used to service an extra em(p + q)(p²r + q²s)/(q²(r + s)) patients in the low work category p providing there is no removal of energy from the system. The rule should be a guide to the redistribution of resources, not a way to reduce them. The exercise demonstrates that identifying ASD, especially in group q, gives a considerable positive opportunity dividend for therapy, where the clinical need far exceeds the resources.
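As a consistency check on the derivation (added here for illustration and not part of the original argument), consider the limiting case in which every patient has ASD (m = 1) and the ASD patients are spread in proportion to group size (r = q, s = p). The total work for ASD patients should then reduce to the total energy cost of all p + q patients, p + p²/q. The symbol W_ASD is a label introduced only for this sketch.

```latex
\begin{align*}
W_{\mathrm{ASD}}
  &= \frac{p^2}{q^2}\, m(p+q)\,\frac{r}{r+s} + m(p+q)\,\frac{s}{r+s}
   = \frac{m(p+q)\,(p^2 r + q^2 s)}{q^2 (r+s)} \\[4pt]
W_{\mathrm{ASD}}\Big|_{m=1,\; r=q,\; s=p}
  &= \frac{(p+q)\,(p^2 q + q^2 p)}{q^2 (p+q)}
   = \frac{p^2}{q} + p
\end{align*}
```

The two expressions agree, so the ASD work formula is consistent with the total energy cost when the whole population is counted.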
2.5.3. A worked example.
We can use data from anorexia nervosa and overall female mental illness to derive an indicative overall energy benefit. The 80/20 rule does appear to approximately apply for AN. About 20% are refractory to treatment [
80]. P(ASD|AN) is about 25% [
13]. It is known that this group is more difficult to treat than AN alone [
6,
7,
9,
37,
81]. Successful treatments for this group are being found (
Section 3.2). A study [
82] has shown managing diagnosed inpatient AN/ASD patients accounting for their neurodiversity saves 32.2% cost per patient. Then an indicative e is 0.322. We have 2 values for P(ASD|MI), 0.189 [69] and 0.176 (Section 2.2). The mean 0.1825 for m is also quite close to 20% and most comorbid patients are likely to be in Pareto group q. A reasonable starting point then for an indicative efficiency dividend would be with both p/q and r/s being 80/20.
Then energy saved in group p patient equivalents by formula (2), for every 100 patients (p = 80, q = 20, r = 80, s = 20), is:
$$0.322 \times 0.1825 \times 100 \times \frac{80^2 \times 80 + 20^2 \times 20}{20^2 \times 100} \approx 76.4$$
We have nearly doubled the number of low work patients treatable in female mental illness, where a proportion is comorbid with ASD, by recognizing the ASD and managing it appropriately. This is the substantial result of diagnosing a common unrecognized but treatable comorbidity (ASD) which caused significant management problems in the initial diagnosed mental illness. We might stretch the energy analogy to the second law of thermodynamics. We have reduced the entropy of the open system by improving efficiency and increasing order in the disordered minds of the patients by transforming ASD to ASC.
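The worked example can be reproduced numerically. The sketch below implements formula (2) as the expression em(p + q)(p²r + q²s)/(q²(r + s)) given in Section 2.5.2. The function name is hypothetical, and the alternative r/s splits at the end are illustrative values that do not come from the text.

```python
def energy_saved(e, m, p, q, r, s):
    """Formula (2): work saved, in group p patient equivalents."""
    n_asd = m * (p + q)                          # patients with comorbid ASD
    work_q = (p / q) ** 2 * n_asd * r / (r + s)  # ASD patients in group q
    work_p = n_asd * s / (r + s)                 # ASD patients in group p
    return e * (work_q + work_p)


e = 0.322                # saving per AN/ASD inpatient [82]
m = (0.189 + 0.176) / 2  # mean of the two P(ASD|MI) values

base = energy_saved(e, m, p=80, q=20, r=80, s=20)
print(round(base, 1))  # 76.4

# Illustrative alternatives for r/s (not from the text)
for r, s in [(50, 50), (90, 10)]:
    print(r, s, round(energy_saved(e, m, 80, 20, r, s), 1))
```

For every 100 patients the saving is about 76 group p patient equivalents, which is the basis for the statement that the number of treatable low work patients is nearly doubled (from 80 to about 156).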
2.5.4. Downstream effects.
If a treatment gain is achieved there will be a major positive ripple effect due to significant improvement in personal, family and societal problems associated with unrecognized and undertreated mental illness. The overall societal cost of mental illness has been estimated to be about 4.7 times the direct treatment cost [83]. We then find the total useful societal energy gain (“heat” loss foregone), taking the energy saved by formula (2) as 76.4 group p patient equivalents per 100 patients, is:
$$4.7 \times 76.4 \approx 359$$
For every 100 patients (p + q) we would divert enough energy to treat an extra 359 group p patients, though in practice most of the energy would be potential, appropriately preventing adverse consequences in the downstream community.
As an example of downstream savings Luca [
66] found the average cost of an untreated postnatal depression mother/child dyad from conception to 5 years post-partum was $31,800. P(ASD|PND) from Section 2.2 is 0.313 so 31.3% of these dyads would be complicated by ASD. The societal ripple effect would give an extended average cost of 4.7 × $31,800 or $149,460 per dyad. We can calculate the overall savings achieved by screening for maternal ASD in the antenatal period. Assume the costs of no treatment have a Pareto distribution and let the low initial average cost for the mother/child dyad in group p over 5 years be z. The total cost of the patients p + q by formula (1) is:
$$z\left(p + \frac{p^2}{q}\right)$$
The average dyad cost including ripple effect over 5 years is:
$$\frac{z\left(p + p^2/q\right)}{p + q} = \$149{,}460$$
The proportion of patients with ASD is high at 31.3%. A realistic Pareto distribution would be p/q of 70/30. Then the low average extended cost z is $64,054. The therapeutic effect of a diagnosis for mother is probably going to be quite high. The child would also be assessed as having an increased probability of ASD, and if diagnosed would receive early intervention, so aiming for a 50% saving (e = 0.5) for comorbid dyads would probably be conservative. The proportion of ASD patients in group q is likely to be very high, say r/s in groups q and p respectively of 90/10 then by formula (2) cost saving as multiples of z is:
Then total 5 year cost saving is $64,054 × 76.685 ie $4,911,981 for every p + q (100) patients, giving % saving:
$$\frac{\$4{,}911{,}981}{100 \times \$149{,}460} \times 100\% \approx 32.9\%$$
The substantial overall saving across all dyads is because most of the ASD/PND dyads are in the high cost Pareto group q.
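The dyad calculation can be traced step by step. This sketch uses only figures quoted in this section, with hypothetical variable names, and takes p + q as 100 dyads. It splits formula (2) into its group q and group p terms, because the multiple of 76.685 quoted in the text appears to correspond to the group q term alone. Including the smaller group p term would give a slightly larger saving.

```python
RIPPLE = 4.7          # societal cost relative to direct cost [83]
DIRECT_COST = 31_800  # untreated dyad, conception to 5 years post-partum [66]

p, q = 70, 30         # Pareto split used in the text
r, s = 90, 10         # split of ASD dyads between groups q and p
e, m = 0.5, 0.313     # assumed saving fraction and P(ASD|PND)

avg_cost = RIPPLE * DIRECT_COST          # extended average cost per dyad
total_units = p + p ** 2 / q             # formula (1), in multiples of z
z = avg_cost * (p + q) / total_units     # low average extended cost

n_asd = m * (p + q)
saving_q = e * n_asd * (p / q) ** 2 * r / (r + s)  # group q term
saving_p = e * n_asd * s / (r + s)                 # group p term

pct_q_only = 100 * saving_q / total_units
pct_full = 100 * (saving_q + saving_p) / total_units

print(round(avg_cost), round(z))
print(round(saving_q, 3), round(saving_q + saving_p, 3))
print(round(pct_q_only, 1), round(pct_full, 1))
```

The script reproduces the $149,460 average cost, the z of $64,054 and the 76.685 multiple. The percentage saving does not depend on z, since z cancels between the saving and the total cost.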
A 2023 paper on new screening recommendations for PND [
84] describes PND as the leading cause of overall and preventable maternal mortality. This will obviously cause severe downstream effects. The paper lists several comorbid mental illnesses but does not mention autism.