1. Introduction
Endometriosis, characterized by the aberrant growth of endometrial-like tissue outside the uterine cavity, poses significant diagnostic and therapeutic challenges. Its heterogeneous manifestations, including severe pain, infertility, and diverse lesion morphologies, necessitate a multifaceted diagnostic and management strategy.
A critical examination of current diagnostic and classification methods for endometriosis highlights a range of shortcomings. The revised American Society for Reproductive Medicine (r-ASRM) classification [1,2,3], which was revised in 1996 and is now widely used, employs a scoring system based on the size of endometriosis lesions and the extent of adhesions. This system classifies the disease into stages I to IV by accumulating points, with a maximum score of 150 points. Although this staging is useful, the r-ASRM classification has significant limitations: despite its widespread use, it does not accurately assess deep endometriosis (DE) and cannot be applied as a preoperative diagnostic tool.
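As a minimal illustration of how accumulated points translate into a stage, the following Python sketch maps an r-ASRM total to stages I to IV. The cut-offs (1-5, 6-15, 16-40, and above 40 points) are those commonly cited for the r-ASRM form and are not stated in this paper, so they should be treated as an assumption; the function name `rasrm_stage` is hypothetical.

```python
def rasrm_stage(total_points: int) -> str:
    """Map an r-ASRM total score to a stage (I-IV).

    Assumption: cut-offs 1-5 / 6-15 / 16-40 / >40 are the commonly
    cited r-ASRM thresholds; they are not given in the text above.
    The 150-point ceiling is the maximum stated in the text.
    """
    if not 1 <= total_points <= 150:
        raise ValueError("r-ASRM total must be between 1 and 150 points")
    if total_points <= 5:
        return "I"    # minimal
    if total_points <= 15:
        return "II"   # mild
    if total_points <= 40:
        return "III"  # moderate
    return "IV"       # severe


if __name__ == "__main__":
    for points in (3, 12, 40, 41, 150):
        print(points, rasrm_stage(points))
```

The sketch only reproduces the staging arithmetic; it says nothing about deep endometriosis, which is the limitation discussed above.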
The Enzian score was first published in 2005 as an independent postoperative assessment for DE [4]. It has since evolved, notably through modifications in 2010 and 2011, to complement the r-ASRM classification by filling the gaps related to DE [5]. The Enzian score, modeled after the TNM classification for cervical cancer, treats DE as a tumor-like lesion and categorizes lesions around the pouch of Douglas into compartments A, B, and C based on location and size. It also describes adenomyosis and bladder endometriosis as 'F' (far). Nonetheless, neither the r-ASRM nor the original Enzian classification was a preoperative diagnostic method. Recently, the updated #Enzian classification has emerged as a preoperative diagnostic method that can be performed with transvaginal ultrasonography [6,7] and is practiced primarily in Europe. This method, however, has its own challenges, including the technical difficulty of detecting sub-centimeter deep lesions by ultrasonography and an unclear association between lesions and pain.
In 2023, the AAGL 2021 Endometriosis Classification [8] was put forward. It extends the original intraoperative classification [9] by adopting transvaginal ultrasonography [7,8], as the #Enzian classification does, which enables its use as a preoperative diagnostic method. Based on expert surveys, this approach assigns a surgical complexity score to each lesion and provides a single indicator of the severity of endometriosis [8,9]. Despite its convenience and the availability of a supporting app, this method still faces technical difficulties in assessing peritoneal, tubal, and ureteral lesions by ultrasonography, and it does not clarify the relationship between lesions and pain.
Other diagnostic methods with unique features have been proposed, such as EFI [10,11], the Ultrasound mapping system [12], EBDRECT [13], UBESS [14], and others [15,16]. Still, none has become established as a convenient first-line diagnostic tool, because they require complex evaluation or MRI, are not preoperative, or have other drawbacks.
Pelvic examination remains superior for detecting pelvic pain. The Beecham classification [17] for endometriosis, which is adept at capturing early lesions, is now rarely used, which highlights the lack of integration of pelvic examination findings into current diagnostics.
Many requirements can be listed for an ideal first-line diagnostic method for endometriosis [15], but we believe the following four are crucial:
A simple, objective, noninvasive method that captures early lesions and the diverse states of endometriosis, including their localization and spread.
A scoring system to stratify severe cases and guide referrals to specialized facilities.
An anatomically intuitive and easily shareable format, akin to the TNM classification, that facilitates information exchange between physicians and patients.
A method capable of capturing temporal changes, useful as an indicator for surgical, medicinal, recurrent, and infertility interventions.
To meet the first condition, the foundational examinations should be pelvic examination and transvaginal ultrasonography. The second condition requires that the findings be expressed as scores. The third calls for an easily shareable, anatomically illustrative format. Lastly, to capture temporal changes as the fourth condition requires, the method must be non-invasive and quick enough to be repeated by anyone, anywhere.
The Numerical Multi-Scoring System of Endometriosis (NMS-E) was designed as a comprehensive new assessment tool for endometriosis that combines findings from pelvic examination and transvaginal ultrasonography. The full details of this system were published in the journal of the Japanese Society of Endometriosis in 2015 [18,19,20]. We have already reported outcomes for two of the main scores of NMS-E, namely the adhesion score in 2020 [21] and the pain score in 2023 [22]. In the present study, we therefore retrospectively investigated whether the E-score, the severity indicator in NMS-E, actually correlates with the severity of endometriosis.
This study aims to address the gap in endometriosis diagnosis by evaluating the feasibility and efficacy of the NMS-E in predicting surgical duration and outcomes. By leveraging a retrospective analysis of patients treated for endometriosis at our institution, we seek to validate the NMS-E against traditional scoring systems and assess its potential to enhance surgical planning and patient management. Our hypothesis posits that the NMS-E can provide a more accurate reflection of disease severity, thereby improving preoperative predictions and surgical outcomes for patients with endometriosis.
4. Discussion
This study established that the E-score of NMS-E, a preoperative indicator of the severity of endometriosis, correlates strongly with the widely used r-ASRM score derived from surgical findings, with a correlation coefficient of 0.700. The E-score also showed a stronger association with the duration of surgery than the r-ASRM score did (0.703 vs. 0.642) and a higher correlation with blood loss (0.407 vs. 0.348). These findings imply that the E-score, obtained before surgery, could predict the severity of endometriosis as precisely as, or more precisely than, the r-ASRM score.
Regarding the predictive power of the E-score for surgical duration, this study found that the expected surgery time can be estimated from the E-score by the regression equation y = 6.4974x + 50.345, where x is the E-score in points and y is the surgical time in minutes. For instance, an E-score of 19 points predicts a surgical time of approximately 174 minutes. This equation may differ between facilities and between surgeons, and even for the same surgeon it could change with increasing experience. Nonetheless, as sufficient data accumulate, the precision of these predictions is expected to improve, which will facilitate surgical planning and preparation within the institution.
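The regression above can be applied directly. The following Python sketch is a minimal illustration that uses only the slope and intercept reported in this study; the function and constant names are hypothetical, and, as noted, the coefficients would need to be re-estimated for another facility or surgeon.

```python
# Coefficients reported in this study (single operator, single institution).
SLOPE_MIN_PER_POINT = 6.4974   # additional minutes per E-score point
INTERCEPT_MIN = 50.345         # predicted minutes at an E-score of 0


def predicted_surgery_minutes(e_score: float) -> float:
    """Predict surgical time in minutes from the NMS-E E-score.

    Implements y = 6.4974x + 50.345; names are illustrative only.
    """
    if e_score < 0:
        raise ValueError("E-score cannot be negative")
    return SLOPE_MIN_PER_POINT * e_score + INTERCEPT_MIN


if __name__ == "__main__":
    # Worked example from the text: an E-score of 19 points.
    print(round(predicted_surgery_minutes(19)))  # 174
```

For an E-score of 19, the calculation is 6.4974 × 19 + 50.345 = 173.80, which rounds to the 174 minutes quoted in the text.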
Looking at individual scores, the adhesion score had the highest correlation with surgical duration (0.596), even though its average value of 3.88 was not the highest among the scores. This implies that the strength of adhesions is the most significant factor determining surgical time or complexity. The uterine score came next (0.417), consistent with the perception that excision of deep endometriosis is time-consuming. The Pain score and the Cyst score followed, in that order. Unexpectedly, the time taken for cystectomy had only a minor impact on the overall surgical time in this study. One reason could be that the majority of the subjects (71.9%) had severe, stage IV disease, so that adhesions and deep endometriosis, which are more common in severe cases, may have dictated the surgical duration. The Cyst score might become a more significant determinant in less severe disease.
On the other hand, the E-score showed low to almost no correlation with the four major clinical symptoms of endometriosis (0.151-0.222). When examining the correlation with clinical symptoms, it seems better to focus on the individual scores. As previously reported, the Pain score had a comparatively high correlation with dyspareunia (0.412), and the present analysis found that the uterine score showed a weak correlation with dyschezia (0.226). Interestingly, when the Uterine score was broken down, the group with a retroverted uterus (R) had significantly higher dyschezia scores than the group without it (4.26 vs. 2.23, p = 0.0038). This may indicate a disease mechanism in which bowel adhesions and the consequent uterine retroversion lead to dyschezia, a possibility that requires further investigation. No correlation was observed between the Cyst score or the adhesion score and clinical symptoms.
The significant correlation between the NMS-E E-score and surgical duration underscores the utility of the NMS-E as a predictive tool in clinical settings. This finding is particularly relevant for the surgical management of endometriosis, where estimating procedure length can aid in resource allocation and patient counseling. Unexpectedly, the correlation with blood loss was less pronounced than anticipated, suggesting that while the NMS-E score is a reliable predictor of surgical duration, it may be less applicable for predicting intraoperative blood loss. This discrepancy highlights the complex nature of endometriosis. It suggests that factors beyond the scope of the NMS-E, such as patient-specific anatomical variations or surgical technique, may influence blood loss. These results encourage further refinement of the NMS-E and suggest additional variables for inclusion to enhance its predictive accuracy.
In this study, the E-score of NMS-E showed a high correlation with the r-ASRM score, and the reason deserves consideration. NMS-E was created to enable a non-invasive implementation of the r-ASRM, an intraoperative diagnostic method. Moreover, it incorporates elements of deep endometriosis, such as pain assessment, which are weaknesses of the r-ASRM [15]. The correlation between the two diagnostic methods is therefore not coincidental but by design, as detailed below. The r-ASRM score is graded out of a total of 150 points: peritoneal lesions, 6 points; endometriomas, 40 points; posterior cul-de-sac obliteration, 40 points; ovarian adhesions, 32 points; and tubal adhesions, 32 points [2]. NMS-E, on the other hand, is graded out of approximately 40 points: endometriomas, 10 points; ovarian adhesions, 10 points; pain, 10 points; and uterine lesions, 9 points (3 points × 3). In practice, the upper limit is not fixed, because additional points are given for tubal disease and rare-site endometriosis. Comparing the elements of the two methods, the evaluation of the ovaries is nearly the same, and the adhesion assessment is shared for the ovaries, with some overlap for posterior cul-de-sac obliteration (adhesion). Regarding pain, the areas of strong pain are known to lie around the cul-de-sac, and cases with posterior cul-de-sac obliteration have significantly higher Pain scores. The posterior cul-de-sac obliteration item of the r-ASRM and the Pain score of NMS-E can therefore be said to share common elements. This means that the three main components of the two methods are almost identical. One significant difference is the evaluation of the fallopian tubes. Normal fallopian tubes are rarely visible on transvaginal ultrasonography, so their adhesions cannot be assessed. In NMS-E, therefore, 3 points are assigned only when tubal enlargement is observed. Another difference is that NMS-E has a Uterine score, which evaluates deep lesions such as endometriotic nodules (in the r-ASRM, deep lesions are usually rated under peritoneal lesions, with a maximum of 6 points, which is low compared with other items). The score of each element in NMS-E is set to about 1/4 of the corresponding item's score in the r-ASRM. For these reasons, the r-ASRM and NMS-E could show a high correlation.
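The point allocations quoted above can be checked with simple arithmetic. The following Python sketch is illustrative only: it tabulates the maximum points per item as stated in the text, sums them, and computes the NMS-E to r-ASRM ratio for the items the text treats as counterparts. The dictionary names and the pairing of the NMS-E pain score with r-ASRM posterior cul-de-sac obliteration are assumptions made for this illustration, following the argument in the paragraph.

```python
# Maximum points per item, as quoted in the text.
RASRM_MAX = {
    "peritoneal lesions": 6,
    "endometriomas": 40,
    "posterior cul-de-sac obliteration": 40,
    "ovarian adhesions": 32,
    "tubal adhesions": 32,
}
NMSE_CORE_MAX = {
    "endometriomas": 10,
    "ovarian adhesions": 10,
    "pain": 10,
    "uterine lesions": 9,  # 3 points x 3
}

# Counterpart items (NMS-E item -> r-ASRM item); the pain pairing is an
# illustrative assumption based on the argument in the paragraph above.
COUNTERPARTS = {
    "endometriomas": "endometriomas",
    "ovarian adhesions": "ovarian adhesions",
    "pain": "posterior cul-de-sac obliteration",
}

rasrm_total = sum(RASRM_MAX.values())
nmse_core_total = sum(NMSE_CORE_MAX.values())
ratios = {
    item: NMSE_CORE_MAX[item] / RASRM_MAX[rasrm_item]
    for item, rasrm_item in COUNTERPARTS.items()
}

if __name__ == "__main__":
    print("r-ASRM maximum:", rasrm_total)
    print("NMS-E core maximum:", nmse_core_total)
    for item, ratio in ratios.items():
        print(f"{item}: NMS-E / r-ASRM = {ratio:.2f}")
```

The r-ASRM items sum to the stated 150 points and the four core NMS-E items sum to 39, in line with "approximately 40 points"; the ratios of 0.25 to about 0.31 match the statement that each NMS-E element is set to about 1/4 of its r-ASRM counterpart.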
Among preoperative diagnostic methods other than NMS-E, the #Enzian [6] and the 2021 AAGL classifications [8,9] have recently gained attention [25]. These scores might also be used to predict surgery times. However, such attempts are uncommon, and the only reports concern prediction of surgery times with the traditional Enzian classification [26]. It is therefore unclear how accurately they can predict surgery times. Even if they could, we believe NMS-E has several advantages. One is the adhesion score mentioned above. The adhesion score, which has already been shown to track temporal changes in the strength of postoperative adhesions and can serve as an indicator of infertility [21], is unique to NMS-E: it quantifies adhesion strength on a 10-point scale, and no other system offers a comparable measure. Moreover, it has been demonstrated to be a significant factor in determining surgery times. This is why we believe NMS-E is better able to predict surgery time than other preoperative diagnostic methods. The adhesion score measurement is also based on the r-ASRM. In the r-ASRM score, the extent of dense adhesions around the ovary is classified as no adhesion, <1/3, 1/3 to 2/3, or >2/3, and 0, 4, 8, and 16 points are allocated to these states, respectively [2].
In NMS-E, it is assumed that each enlarged ovary is placed within an inverted tetrahedron, and the presence of adhesions is evaluated on four surfaces: the ovarian surface (Inter O-O), the uterine surface (Lt O-Ut), the sidewall surface (Lt O-Side), and the upper surface (usually without adhesions). Loss of mobility on a surface is regarded as the presence of adhesion. Thus, adhesions on one surface represent 1/4 coverage, corresponding to less than 1/3 in the r-ASRM; adhesions on two surfaces represent 2/4 coverage, corresponding to between 1/3 and 2/3; and adhesions on three surfaces represent 3/4 coverage, corresponding to more than 2/3. In the adhesion score, 1, 2, or 3 points are assigned, respectively. As a result of this design, the adhesion score of NMS-E correlates not only with the adhesion score of the r-ASRM but also with surgical duration.
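The surface-counting rule described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the function and parameter names are hypothetical, only the three surfaces that the text scores are taken as inputs (the upper surface is described as usually free of adhesions), and the r-ASRM points are the 0, 4, 8, and 16 points for dense ovarian adhesions quoted in the text.

```python
from typing import Tuple

# r-ASRM extent category and dense ovarian adhesion points, indexed by the
# number of adherent surfaces (0-3), following the mapping in the text.
RASRM_EXTENT = {0: "no adhesion", 1: "<1/3", 2: "1/3 to 2/3", 3: ">2/3"}
RASRM_DENSE_POINTS = {0: 0, 1: 4, 2: 8, 3: 16}


def ovarian_adhesion(inter_o_o: bool, o_ut: bool, o_side: bool) -> Tuple[int, str, int]:
    """Score one ovary from loss of mobility on three surfaces.

    Returns (NMS-E adhesion points for this ovary,
             corresponding r-ASRM extent category,
             corresponding r-ASRM dense-adhesion points).
    Each argument is True when mobility is lost on that surface.
    """
    adherent_surfaces = sum([inter_o_o, o_ut, o_side])
    return (
        adherent_surfaces,  # 1, 2, or 3 points (0 if fully mobile)
        RASRM_EXTENT[adherent_surfaces],
        RASRM_DENSE_POINTS[adherent_surfaces],
    )


if __name__ == "__main__":
    # Example: ovary fixed to the contralateral ovary and to the sidewall.
    print(ovarian_adhesion(inter_o_o=True, o_ut=False, o_side=True))
```

In the example, two adherent surfaces give 2 NMS-E points, which corresponds to the 1/3 to 2/3 category and 8 points in the r-ASRM.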
Another significant advantage of NMS-E is the Pain score derived from pelvic examination. Most diagnostic methods for endometriosis do not include pelvic examination findings. Dyspareunia is one of the critical indicators for deciding whether to perform surgery for endometriosis, and there is no better method than pelvic examination for detecting such localized pain. Transvaginal ultrasonography accurately diagnoses deep lesions, but reports on strategies that simultaneously assess the pain they induce are scarce [27], and those strategies are not comprehensive preoperative diagnostic methods for endometriosis. Through the Pain score, NMS-E integrates pelvic examination findings with transvaginal ultrasonography imaging. The Pain score has been shown to correlate most strongly with dyspareunia. These features make NMS-E a uniquely valuable diagnostic method, capable of predicting not only surgical duration but also the activity of deep lesions.
One limitation of this study is that a single examiner performed all assessments. This yielded consistent data, but the possibility of bias cannot be excluded. To show that the method is generalizable, its reproducibility and effectiveness must be confirmed across many examiners and facilities.
Another problem is the small number of study cases. The current study used data from 111 cases by a single operator for data standardization. Since endometriosis is a disease showing various pathologies, many confounding factors exist. Therefore, in the future, it is necessary to increase the number of cases further and make adjustments through matching and stratification.
Another significant issue is the difficulty of determining the optimal weighting for each lesion type. In addition to the four central lesion categories, NMS-E contains many parameters, and a large amount of data and effort would be needed to find optimal values for all of them. However, as mentioned above, because NMS-E is partly based on the r-ASRM score weighting, the empirical discrepancy may not be large. To address this problem relatively quickly, the scoring approach of the 2021 AAGL classification is a good reference [9]. In that classification, approximately 30 expert endometriosis physicians were surveyed, the complexity of each lesion was scored, and points were allocated to each lesion based on the results. For example, complete cul-de-sac obliteration scores 9 points, endometriomas over 3 cm score 7 points, ureteral endometriosis scores 6-9 points, and intestinal endometriosis over 3 cm scores 8 points. Coincidentally, this scoring is close to that of NMS-E, in which each element is expressed on a scale of roughly 10 points. In addition, in both this and the #Enzian classification, the score jumps when the lesion size exceeds 3 cm. In our current data, the correlation with surgical time improved when we also increased the score for endometriotic nodules according to size (Table 5). This information will need to be consolidated and used to fine-tune the scoring in NMS-E.
The clinical significance of NMS-E is fourfold. First, the E-score enables preoperative identification of patients with severe endometriosis. Surgery for severe endometriosis often requires special procedures such as complete opening of the obliterated cul-de-sac or shaving of intestinal endometriosis [28]. Patients with severe endometriosis should therefore be operated on by experienced surgeons or at specialized facilities. Until now, however, the severity of endometriosis has not been apparent preoperatively, which may have led to inadequate triage and lost opportunities for adequate surgery for some patients. Use of the E-score can help avoid such situations. Second, it enables accurate prediction of surgery time, allowing efficient operating room management, which is essential for hospital management and medical economics. Predicting surgery time for endometriosis, which can present complex conditions, has been particularly challenging. As a result, a complex case could be scheduled late in the afternoon with a short expected surgery time and then develop complications, such as bowel injury, that require cooperation from other departments. Conversely, if a case is judged mild by NMS-E preoperatively, it might be possible to plan more than three surgeries in one day, including that case. Third, the NMS-E summary facilitates sharing of information about the patient's condition among physicians, beyond severity alone. Although it may seem confusing at first, as seen in some examples in the Appendix A data, one can grasp the overall picture of endometriosis at a glance once accustomed to it. Finally, the Physical Finding Map allows an understanding of both the local conditions and the whole picture of endometriosis, which is particularly important as preoperative information. Based on this information, decisions about whether and how extensively to remove a lesion can be made according to the location of the causative lesion and its activity (Pain score). The Physical Finding Map thus serves as a guide when planning surgical strategy. NMS-E is a non-invasive preoperative diagnostic method that can be performed easily with pelvic examination and transvaginal ultrasonography. With the above features, it fulfills the requirements for the ideal diagnostic method for endometriosis set out in the Introduction.
As a future research direction, NMS-E, a comprehensive preoperative diagnostic method that incorporates numerous variables, needs to be validated in larger and more diverse populations. Collaborative research with multiple physicians and facilities is planned, and individual parameters, such as score limits and allocations, will also be evaluated. Furthermore, developing non-invasive infertility prediction using NMS-E is a crucial issue. For this purpose, critical information that is missing from NMS-E must be added, namely the patency of the fallopian tubes. Tubal patency cannot be diagnosed from transvaginal ultrasonography imaging alone. Combining NMS-E with tests such as the Tubal Insufflation Test (Rubin's Test), Saline-infused hysterosonogram (SIH), Hysterosalpingo-Contrast Sonography (HyCoSy) [29], or Hysterosalpingography (HSG) may therefore make this objective achievable.