2.1. Role in the Formulation of Clinical and Radiological Diagnoses
Artificial intelligence technologies may represent a novel opportunity to decrease the diagnostic delay that notoriously characterizes both endometriosis and adenomyosis [
19]. By providing up-to-date medical knowledge that is accessible both to physicians and to the public, they may help ensure that clinicians consider all differential diagnoses that could explain patients’ symptoms, and may increase patients’ awareness of the disease, deconstructing the common belief that menses are painful by definition [
20]. Patients who question chatbots about their symptoms may receive an alert regarding the need to consult a specialist, and may seek a second opinion in the event that their symptoms are normalized or psychologized by health professionals. On the other hand, it is of the utmost importance that algorithms are accurate and are characterized by high sensitivity and specificity, in order to minimize the risk of generating inappropriate referrals and unjustified anxiety.
Minimization of pain is a major determinant of diagnostic delay in endometriosis and adenomyosis, and is especially common among teenagers and individuals with superficial endometriosis [
21,
22]. In the former case, minimization often arises from a low awareness of the disease in young women, among clinicians and among patients and their families [
23]. In the latter, both insufficient medical knowledge and radiological challenges (superficial endometriosis is not visible on ultrasonographic examination) play a role [
22]. In both cases, AI may represent a game-changer.
Moreover, by encouraging women’s referral to specialists and by increasing physicians’ medical knowledge, AI may help redefine the epidemiology of endometriosis and adenomyosis. This may be particularly true for adenomyosis, which is rarely diagnosed in adolescents, although it appears to be more frequent in this population than commonly believed [
24].
The accuracy of chatbots in resolving medical cases and in providing medical knowledge has been assessed in various studies, although their performance is still far from that expected in clinical settings. When submitting a series of diagnostically challenging medical cases to ChatGPT4, Kanjee and colleagues found that the chatbot gave the right answer in 39% of cases, and included the correct answer among its differential diagnoses in 64% of cases [
25]. When tested on United States Medical Licensing Examination questions, chatbots with no specialized training passed or nearly passed the test [
26]. Astonishingly, in another study comparing the performance of ChatGPT4 with that of medical journal readers in resolving real-life medical cases, the chatbot correctly diagnosed 57% of cases, compared with 36% for the journal readers [
27]. It must be pointed out that in the latter study, the population of human journal readers was poorly characterized, and that their level of medical skill, as well as the effort they put into answering the questions, was unknown. With specific regard to gynecology, Ozgor and Simavi recently analyzed the accuracy of ChatGPT in answering questions about endometriosis. As many as 91% of questions were answered accurately, although among questions based on the ESHRE endometriosis guidelines [
28], accuracy was lower (67.5%) [
29]. Considering the speed at which LLM technologies are expanding, and consequently how fast their precision and accuracy may improve, these results are encouraging.
As well as representing a source of medical knowledge, AI may provide algorithms that predict the likelihood of endometriosis in patients with chronic pelvic pain or infertility. This will be made possible by the ability of AI to detect patterns in large volumes of data. Such algorithms will have to be validated on large populations from multiple centers in order for their results to be applicable on a vast scale.
Heterogeneous approaches have been suggested by various study groups to build such algorithms, most of which achieved sensitivity and specificity above 85%. Data used so far to build prediction or diagnostic models include: clinical features (age, presence and severity of symptoms, comorbidities, infertility, previous surgery) [
30,
31,
32,
33,
34]; serum and salivary biomarkers [
35,
36,
37,
38]; genomics, transcriptomics, metabolomics, proteomics and methylomics data [
39,
40,
41,
42,
43,
44]; lipidomic data from endometrial fluid [
45]; gene, mRNA, proteomic and transcriptomic expression in the endometrium [
46,
47,
48]; mixed data [
49,
50] and radiologic images [
51,
52]. However, the majority of these studies, which have been comprehensively analyzed in Sivajohan and co-workers’ recent review [
19], were retrospective, meaning that the models were trained and validated on existing patient datasets rather than prospectively in clinical practice. Moreover, the efficacy of AI in predicting and/or diagnosing endometriosis and adenomyosis was not compared with that of existing decision algorithms and clinical diagnostic tools. Further research is needed in this regard.
With specific regard to the radiological diagnosis of endometriosis and adenomyosis, AI has great potential to improve its quality by learning to detect anomalies in ultrasonographic and MRI images. This is made possible by the ability of AI to match imaging findings with previously registered data [
53,
54,
55].
Computer-assisted interpretation of radiological data is already in use in other medical fields and in some cases appears to be as accurate as experienced radiologists [
56,
57]. Moreover, DL methods may improve diagnostic accuracy by eliminating subjectivity, and may provide diagnoses in a few seconds [
53]. Various studies have been published on this topic so far; however, most algorithms lack adequate validation and generalizability and are currently limited to research purposes [
58,
59,
60].
In the management of endometriosis and adenomyosis, this kind of technology may be of particular aid in distinguishing patients with and without the disease, especially in complex cases with atypical presentations or in settings in which expert radiologists or ultrasound examiners are not available. In their recent pilot study on 50 individuals with a surgical diagnosis of endometriosis and an equal number of individuals with at least one symptom of endometriosis but without a diagnosis of endometriosis, Balica and co-workers used five different DL methods to aid the sonographic diagnosis of the disease. AI-assisted diagnosis was feasible and efficacious and was able to predict endometriosis with 90% probability and 80% accuracy [
61].
AI may also prove useful in the differential diagnosis of endometriosis from other benign conditions. In Hu and colleagues’ retrospective study, DL was used to distinguish ovarian endometriosis from tubo-ovarian abscesses (TOA) on ultrasonographic images [
58]. Like endometriomas, TOA may in fact present as hypoechoic avascular cystic masses within a pelvis distorted by adhesions. Astonishingly, when DL’s performance was compared with that of three ultrasound examiners and with that of plasma concentrations of carbohydrate antigen 125 (CA 125), DL proved superior [
58].
A particular aspect of the ultrasonographic diagnosis of endometriosis is represented by the evaluation of the Pouch of Douglas (POD), which may be obliterated by adhesions between the retrocervix, the anterior wall of the rectum and the uterosacral ligaments [
62]. POD obliteration is particularly important to recognize pre-operatively as it increases surgical complexity and the risk of complications, as well as playing a role in the disease’s prognosis [
63,
64,
65]. A sonographic marker of POD obliteration has been described (the “sliding sign”); however, it relies on the operator’s expertise and is subject to inter-observer variability [
66,
67]. To overcome such limitations, Maicas and colleagues analyzed the ability of DL in defining the state of the POD by classifying transvaginal ultrasound videos depicting positive and negative “sliding signs”. The accuracy, sensitivity and specificity of such a model were all just short of 90%, indicating a high diagnostic performance [
65].
However, not all studies on this topic have confirmed these encouraging results. When comparing DL’s ability to recognize adenomyosis on uterine ultrasonographic images with that of intermediate-skilled trainees, Raimondo and co-workers found that the trainees’ accuracy was higher than that of DL (70% versus 51%). However, DL’s specificity, i.e. the ability to correctly identify healthy uteruses, was higher than that of the trainees (82% versus 69%). The authors concluded that the DL model could prove useful in limiting the over-diagnosis of adenomyosis, although the literature appears to suggest that we are facing the opposite problem (under-diagnosis), especially in some categories of patients, first and foremost adolescents [
68].
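The difference between accuracy and specificity can be made concrete with a minimal Python sketch that derives both from the four cells of a confusion matrix. The counts below are hypothetical and were chosen only to reproduce the qualitative pattern discussed above; they are not the data of Raimondo and co-workers, and the names `ConfusionMatrix`, `model` and `reader` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary diagnostic test (hypothetical, for illustration)."""
    tp: int  # diseased, test positive
    fn: int  # diseased, test negative
    tn: int  # healthy, test negative
    fp: int  # healthy, test positive

    @property
    def sensitivity(self) -> float:
        # Share of diseased cases that the test correctly flags
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of healthy cases that the test correctly clears
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        # Share of all cases classified correctly
        total = self.tp + self.fn + self.tn + self.fp
        return (self.tp + self.tn) / total


# Hypothetical cohort of 50 affected and 50 healthy uteruses
model = ConfusionMatrix(tp=12, fn=38, tn=42, fp=8)
reader = ConfusionMatrix(tp=36, fn=14, tn=34, fp=16)

if __name__ == "__main__":
    for name, cm in (("model", model), ("reader", reader)):
        print(f"{name}: accuracy={cm.accuracy:.2f}, "
              f"sensitivity={cm.sensitivity:.2f}, specificity={cm.specificity:.2f}")
```

In this illustration the model rarely labels a healthy uterus as diseased (high specificity) but misses most affected ones (low sensitivity), so its overall accuracy is lower than that of the reader. This is why a single accuracy figure is not sufficient to judge a diagnostic classifier.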
2.2. Role in the Choice of Medical Treatments and in the Customized Management of Patients
It has been estimated that the third leading cause of death in the United States is represented by medical errors [
69]. Fortunately, human error in medical practice usually leads to less serious consequences than death; however, even mistakes with less catastrophic consequences require attention.
Errors are an inevitable limitation of human actions, which may arise from distraction, work overload, or lack of knowledge. Prescribing medical treatments for which an individual presents contraindications, overlooking the presence of interactions between medications in patients with comorbidities or simply not choosing the most adequate molecule for a given patient are extremely frequent events [
69]. Integrating human activity with AI-driven control systems may represent an innovative solution to mitigate the frequency of such errors and limit their consequences in clinical practice.
Monophasic low-dose hormonal contraceptives and progestins are considered first-line options for the treatment of endometriosis, as they are those with the most favorable safety/efficacy/tolerability/cost profile [
28,
70]. As regards adenomyosis, no guidelines have been approved to date, although levonorgestrel-releasing intrauterine devices (LNG-IUD) appear to be an effective first-line treatment [
71,
72,
73]. Oral progestins, dienogest in particular, and combined oral contraceptives (COCs) have also proven to be effective in these patients [
74,
75,
76]. Second-line treatment consists of GnRH analogues, both for endometriosis and for adenomyosis [
74]. Surgery is the only therapeutic option in specific cases of endometriosis, including obstructive uropathy; bowel occlusion or subocclusion; ovarian cysts with a diameter greater than 5 cm or suspicious for malignancy; and cases in which hormonal therapies are not tolerated or contraindicated [
28]. Conversely, for women with adenomyosis, especially during childbearing age, surgery is rarely an option, and is usually limited to hysterectomy in patients in perimenopause [
74].
However, one size does not fit all. Patients’ age, ongoing treatments for other conditions, comorbidities, response to treatment and life plans all strongly influence the choice of medical and surgical treatment. AI may facilitate this choice, starting from the decision whether to prescribe treatment at all. In fact, not all patients with endometriosis or adenomyosis are promptly prescribed an adequate therapy. This applies especially to adolescents, for whom hormonal treatment does not appear to be the standard of care [
75], although it could prove particularly beneficial in improving painful symptoms and in reducing the risk of disease progression [
71].
The World Health Organization has provided recommendations for contraceptive use in women with medical conditions or medically relevant characteristics, which should be routinely applied by gynecologists to avoid drug-disease interactions. Some, but not all, of these recommendations are routinely addressed in medical practice. Moreover, some authors suggest that within the same class of molecules, some hormonal therapies are more adequate than others for specific populations.
AI algorithms based on such recommendations and on the most recent literature may accurately guide physicians towards the adoption of customized therapies, while also providing alerts for drug-drug interactions. For example, AI algorithms may advise and remind clinicians to avoid molecules which have a greater effect on bone mineral density (BMD) (dienogest monotherapies, GnRH analogues) in younger women; those with greater androgenic effects (NETA) in women with hyperlipidemia, hypercholesterolemia, or signs of hyperandrogenism; and those associated with a higher risk of thromboembolic events (COCs containing third- and fourth-generation progestins, COCs with ≥ 30 mcg ethinyl estradiol, transdermal patches, vaginal rings) in women with known risk factors. Conversely, they may suggest the adoption of therapies which have no or only minor adverse effects on BMD (LNG-IUD, continuous use of COCs, estrogen-progestin transdermal patches, vaginal rings) in adolescents or in women with known risk factors for osteoporosis; those which are less likely to induce adverse serum lipid changes (COCs containing micronized 17β-estradiol (E2), or E2 valerate, or estetrol) in women with hyperlipidemia or hypercholesterolemia; those associated with a reduced risk of venous thromboembolism (second-generation progestins, LNG-IUD, subdermal implant progestins, COCs containing micronized 17β-E2, or E2 valerate, or estetrol) in those with known risk factors for cardiovascular accidents; and those which are approved as contraceptives (COCs, LNG-IUDs, desogestrel monotherapies and etonogestrel subdermal implants) in women desiring contraception [
71,
76,
77,
78,
79,
80,
81,
82,
83].
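The kind of rule-based alert described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a validated clinical tool: the `Patient` fields, the therapy identifiers, the `alerts_for` function and the `young_age_cutoff` parameter are all hypothetical, and the three rules merely restate examples given in the preceding paragraph.

```python
from dataclasses import dataclass

# Therapy groups restating the examples in the text (identifiers are hypothetical).
LOWERS_BMD = {"dienogest", "gnrh_analogue"}
ANDROGENIC = {"neta"}
HIGHER_VTE_RISK = {
    "coc_third_or_fourth_gen_progestin",
    "coc_ee_30mcg_or_more",
    "transdermal_patch",
    "vaginal_ring",
}


@dataclass
class Patient:
    age: int
    hyperlipidemia: bool = False  # assumed to also cover hypercholesterolemia
    hyperandrogenism: bool = False
    vte_risk_factors: bool = False


def alerts_for(patient: Patient, therapy: str, young_age_cutoff: int = 20) -> list:
    """Return the alerts raised by a proposed therapy (empty list if none).

    young_age_cutoff is an assumed placeholder; the text says only "younger women".
    """
    alerts = []
    if therapy in LOWERS_BMD and patient.age < young_age_cutoff:
        alerts.append("greater effect on bone mineral density in a younger patient")
    if therapy in ANDROGENIC and (patient.hyperlipidemia or patient.hyperandrogenism):
        alerts.append("androgenic progestin with hyperlipidemia or hyperandrogenism")
    if therapy in HIGHER_VTE_RISK and patient.vte_risk_factors:
        alerts.append("higher thromboembolic risk with known risk factors")
    return alerts


if __name__ == "__main__":
    adolescent = Patient(age=16, vte_risk_factors=True)
    for therapy in ("gnrh_analogue", "vaginal_ring", "lng_iud"):
        print(therapy, alerts_for(adolescent, therapy))
```

Keeping each rule as an explicit, readable condition makes it possible to trace every alert back to the recommendation that motivated it, which would matter if such a system were ever to be reviewed by clinicians. A real system would need thresholds and therapy lists taken from the cited guidelines rather than the placeholders used here.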
AI technologies may not only assist clinicians in the choice of the most adequate treatment for a given patient, they may also guide clinical decisions by predicting outcomes such as reproductive prognosis and cancer risk. Knowledge regarding their reproductive prognosis may empower patients, enabling them to adjust their life projects around their condition and increasing their perception of being taken care of, ultimately improving their satisfaction and their adherence to treatment [
22,
84,
85]. Risk models helping clinicians predict which patients are more likely to encounter a malignant transformation of endometriosis may help identify who requires a timely surgical treatment [
86]. Chao and co-workers recently developed a risk model through ML that can predict the risk of endometriosis-associated ovarian cancer with sensitivity and specificity both short of 90%. The model was built using clinical characteristics including, among others, age, age at menopause and size of the ovarian cysts. Although it was created within a pilot study, and certainly requires further validation, it is a promising example of how AI may facilitate the early identification of malignant transformation, helping clinicians recognize those patients in whom risk-reducing medical or surgical interventions should be carried out [
86].
Back in 2005, Awaysheh and co-workers reviewed 97 studies analyzing the effects of computer-based clinical decision support systems in various medical fields (cardiology, general surgery and psychiatry) and found that such systems improved practitioners’ performance in 64% of studies, as well as improving patients’ outcomes in 13% of studies. Although dated, this study provides encouraging results, which may be further improved by the advancing application of AI technology [
87].
2.4. Role in Reducing the Burden Linked to Administrative Work
The tight link between productivity pressure and burnout is clear. It has been estimated that physicians now spend more than 50% of their time updating electronic health records. Since the advent of COVID-19, they have also been spending an increasing amount of time in their out-of-office hours taking care of the exorbitant volume of electronic communication with patients [
22,
92,
93]. This comes at the expense of efficacious communication, empathy and clinicians’ psychological wellbeing [
22], which all reflect on patients’ empowerment; satisfaction with treatment; adherence to treatment; symptom perception; and ability to remain integrated in society despite suffering from a chronic condition [
94,
95,
96]. However, the solution may be around the corner. AI-powered technologies may not only help by providing summaries of large medical records, filtering and drafting medical notes and e-mails, generating laboratory and prescription orders, cataloguing diseases according to their ICD codes and scheduling appointments; they may also do so at greater speed and with greater accuracy than humans [
59,
97]. This would enable physicians to have more time actively interacting with their patients, while reducing working hours, and ultimately reducing their risk of burnout [
98]. Machines can actually augment our humanity [
99], as suggested by Ayers and colleagues’ study, in which ChatGPT was found to give higher-quality and more empathetic answers to patients’ health care questions than physicians [
13].
Further administrative work which may be taken over by AI includes the management of staff rotations and that of operating room slots. In the latter case, optimizing slots by using systems which are able to predict operating room use time may considerably decrease waiting lists, improving health quality at a national level [
5]. Also, by creating large datasets including electronic health records from all medical institutions in a given country, AI may aid the transition from a fee-for-service reimbursement model to a value-based care model. In fact, by comparing the indications for medical and surgical treatments with available protocols and guidelines, AI technologies may help identify and reduce low-value health care, where value is considered as the relation between the potential benefits, harms and costs of a given medical intervention [
100,
101].