Application of Artificial Intelligence and Machine Learning for the Prediction of Interbody Cage Height and Postoperative Alignment in Transforaminal Lumbar Interbody Fusion

Preprint

Article

Submitted: 30 November 2023

Posted: 01 December 2023

Abstract

Transforaminal lumbar interbody fusion (TLIF) is a commonly used technique for treating lumbar degenerative diseases. Here, we developed a fully automated pipeline to predict the interbody cage height and the postoperative mismatch between pelvic incidence and lumbar lordosis (PI-LL, i.e., PI minus LL) after TLIF surgery from preoperative X-ray images. The pipeline included two primary stages. First, a deep learning model was used to extract essential features from X-ray images. Second, five machine learning (ML) algorithms were trained to identify the optimal models for predicting the interbody cage height and postoperative PI-LL. Lasso regression and support vector regression exhibited superior performance for predicting the interbody cage height and postoperative PI-LL, respectively. For cage height prediction, the root mean square error (RMSE) was 1.01 mm, and the model achieved the highest accuracy at a height of 12 mm, with exact prediction achieved in 54.43% (43/79) of cases. In most of the remaining cases, the prediction error of the model was within 1 mm. In addition, the model demonstrated adequate performance for predicting PI-LL, with an RMSE of 5.19° and an accuracy of 0.81 for PI-LL stratification. In conclusion, the interbody cage height and postoperative PI-LL can be reliably predicted using artificial intelligence and ML models.

Keywords:

Subject: Medicine and Pharmacology - Surgery

1. Introduction

Over the past few decades, transforaminal lumbar interbody fusion (TLIF) has been commonly used to treat lumbar degenerative diseases, demonstrating the benefits of achieving satisfactory arthrodesis through a unilateral approach with minimal impingement on neural components [1,2]. In addition to relieving spinal nerve compression, the primary objective of TLIF is to restore sagittal balance and the intervertebral body height [3,4,5].

In terms of sagittal alignment, several studies have reported a close relationship between postoperative sagittal malalignment and postoperative residual symptoms in patients with lumbar fusion [5,6]. Among the parameters of spinal alignment, subtraction of lumbar lordosis (LL) from the pelvic incidence (PI) is a crucial indicator of postoperative outcomes after short-segment lumbar interbody fusion for lumbar pathologies. Patients with PI-LL (PI minus LL) mismatch have increased risks of adjacent segment disease (ASD), late surgical complications, and revision surgery [7,8,9]. Therefore, postoperative alignment prognosis, especially for critical parameters such as PI-LL, is required for optimal preoperative planning for lumbar fusion. However, predicting postoperative alignment in patients is challenging. Ailon et al. [10] reported that only 42% of cases were accurately predicted by 17 experienced surgeons specializing in treating spinal deformity. Although various methods exist for predicting postoperative parameters in patients with adult spinal deformity [11,12], a method for predicting the value of PI-LL in TLIF procedures still needs to be developed.

Selecting an interbody cage with the correct height is a crucial aspect of lumbar interbody fusion. Utilizing an undersized cage may result in the inability to restore the intervertebral height and segmental lordosis, as well as in complications such as pseudarthrosis and cage migration [13,14,15]. By contrast, utilizing an oversized cage may increase the likelihood of nerve root compression, ASD, or cage subsidence [15]. In clinical practice, the cage height has long been selected subjectively by surgeons depending on their operational experience. Few studies have predicted the height of fusion cages on the basis of the intervertebral height of the pathological segment [16] or the anterior and posterior disc height on a preoperative computed tomography (CT) image [17]. However, in severe degenerative diseases, such as spondylolisthesis and spinal deformity, when the disc height is greatly reduced, these methods are often inaccurate. Thus, estimating the height of interbody cages remains a challenge.

The choice of the cage height affects sagittal balance (and vice versa), and preoperative spinal parameters play a key role in determining the appropriate size of the implanted device for achieving favorable parameters after surgery [16,18]. Therefore, regression models should be developed to predict the interbody cage height and postoperative parameters from preoperative parameters. However, manual measurements are time-consuming for obtaining all parameters and are prone to rater-dependent errors. Currently, automated tools involving artificial intelligence (AI) are used to increase the accuracy and speed of measuring spinal alignment parameters from radiographic images [12,19]. The purpose of this study was to develop a dedicated pipeline based on AI and machine learning (ML) that can reliably predict the interbody cage height and postoperative PI-LL in TLIF surgery from preoperative X-ray images.

2. Materials and Methods

2.1. Patient selection

A total of 311 patients who underwent L4-L5 TLIF surgery between January 2019 and December 2021 at our institution were included in this retrospective study. The following patients were included: (1) patients with lumbar degenerative diseases, such as lumbar disc herniation, lumbar spinal stenosis, and spondylolisthesis; (2) patients who underwent TLIF surgery to implant a single interbody cage; and (3) patients who did not experience any complications, such as cage migration, pseudarthrosis, or fusion failure, and did not require revision surgery because of cage problems or ASD during the follow-up period (at least 6 months). The following patients were excluded: (1) patients with a history of lumbar fractures or patients who received a diagnosis of one-segment lumbar degenerative disease at other levels, multiple lumbar degenerative diseases, lumbar scoliosis, spinal tumors, or severe osteoporosis; (2) patients who received two interbody cage implants; (3) patients with unstandardized sagittal radiographs with low image quality for segmentation or radiographs lacking a femoral head; and (4) patients who experienced neurological or neuromuscular episodes during the follow-up period.

In addition to preoperative and postoperative X-ray images and the size of the surgically implanted interbody fusion cage, the demographics of each patient were obtained. Standing lateral X-ray images were used because they offer higher quality and standardization than intraoperative X-ray images, thereby minimizing segmentation bias and error in parameter measurements. Imaging data were obtained using a Radnext 50 X-ray machine from Hitachi Global (Tokyo, Japan).

2.2. X-ray segmentation and feature extraction

A pretrained BiLuNet model was used to segment each input X-ray image into various semantic regions, including L1, L2, L3, L4, and L5 regions, a sacrum region, and two femoral head regions (Figure 1) [20]. After the size of the original image was changed to 512 × 512 pixels, the model generated an output image with four labels: background, lumbar vertebral regions, sacrum, and two femoral heads. Nearest-neighbor interpolation was then used to resize the segmented image to its original size. Depending on the contours of the segmented areas, multiple corner points were obtained to measure the spinal parameters on the preoperative X-ray images through a computer vision algorithm. Subsequently, these features were combined with four demographic features, namely age, gender, body mass index (BMI), and fusion indication, to obtain the input features for ML algorithms. Lastly, the PI-LL value was measured from the postoperative X-ray image by two experienced surgeons (C.-Y.L. and M.-H.W.) for use as a validation standard for ML models.
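The resizing step above can be illustrated with a minimal sketch. Nearest-neighbor interpolation is the natural choice for a segmentation mask because every output pixel copies an existing label, so no blended, meaningless label values appear. The function name `resize_nearest`, the integer label coding, and the tiny 2 × 2 mask are assumptions for illustration only; they are not taken from the BiLuNet implementation.

```python
from typing import List

Mask = List[List[int]]


def resize_nearest(mask: Mask, out_h: int, out_w: int) -> Mask:
    """Resize a 2-D label mask by nearest-neighbor sampling.

    Each output pixel copies the label of one source pixel (floor index
    mapping, one common nearest-neighbor convention), so no new label
    values are created.
    """
    in_h, in_w = len(mask), len(mask[0])
    resized = []
    for r in range(out_h):
        # Map the output row back to a source row.
        src_r = min(in_h - 1, r * in_h // out_h)
        row = []
        for c in range(out_w):
            # Map the output column back to a source column.
            src_c = min(in_w - 1, c * in_w // out_w)
            row.append(mask[src_r][src_c])
        resized.append(row)
    return resized


# Hypothetical 2 x 2 mask: 0 = background, 2 = sacrum (label codes assumed).
small = [[0, 2],
         [2, 2]]
large = resize_nearest(small, 4, 4)
for line in large:
    print(line)
```

In the pipeline described above, the same idea would map the 512 × 512 model output back to the original radiograph dimensions before corner points are extracted.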

To verify the measurement precision of the BiLuNet model, two authors (A.T.B. and G.M.T.) independently measured the aforementioned parameters by using magnetic resonance imaging (MRI) and compared their results with those of the model. Because the MRI angle parameters in the supine position differ from those obtained from standing X-ray images, we selected only bone distance features to evaluate interobserver reliability.

2.3. ML implementation

We divided our ML pipeline into three steps: data extraction, model building, and validation (Figure 1). All steps were performed using Python 3.7 and the scikit-learn 1.1.2 package [21].

2.3.1. Data preprocessing

Missing values in the feature set were identified and replaced with the mean of the corresponding parameter. Because the features had different units and widely differing ranges, z-score normalization was applied [22].
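The two preprocessing operations can be sketched as follows. This is a minimal standard-library illustration, not the authors' code: the helper names `impute_mean` and `z_score` and the sample BMI column are hypothetical, and the population standard deviation is assumed for the z-score.

```python
from statistics import mean, pstdev
from typing import List, Optional


def impute_mean(values: List[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def z_score(values: List[float]) -> List[float]:
    """Standardize to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


bmi = [22.0, None, 26.0, 24.0]  # hypothetical BMI column with one gap
filled = impute_mean(bmi)
scaled = z_score(filled)
print(filled)
print([round(z, 3) for z in scaled])
```

In a cross-validated setting, the mean and standard deviation would usually be estimated on the training folds only and then applied to the held-out fold; the text does not state how this was handled.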

2.3.2. Regression models

Various ML models were evaluated to identify those that performed well on the aforementioned features. Five regression algorithms were compared: decision tree (DT), lasso regression (LR), support vector regression (SVR), K-nearest neighbor (KNN), and multilayer perceptron (MLP). The hyperparameters of each algorithm were optimized using the GridSearchCV method. The algorithm with the highest performance was selected as the baseline model for constructing the final ML model. After baseline ML models were obtained for cage height and postoperative PI-LL prediction, recursive feature elimination (RFE) was used to remove the least important features and rebuild the models with the remaining features. To determine the optimal number of features, an RFE loop was performed with cross-validation (RFECV function). The mean absolute error (MAE) of the model was then calculated across all repetitions and folds of the RFECV function. Because scikit-learn scorers follow a higher-is-better convention, the library reports the MAE as a negative value (negated MAE). Therefore, in the RFE visualization, the model whose negated MAE is highest (i.e., closest to zero) is regarded as superior. After the RFE process, the final model was built using the optimal subset of features, with the SHapley Additive exPlanations (SHAP) value indicating the importance of each feature in model prediction [23].
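The sign convention for the MAE can be made concrete with a short sketch. It negates the MAE so that a larger score means a better model, then picks the feature count with the largest score. The function names and the cross-validated scores below are hypothetical values chosen for illustration; they are not results from this study, and the sketch does not reproduce the RFECV procedure itself.

```python
from typing import Dict, List


def neg_mae(y_true: List[float], y_pred: List[float]) -> float:
    """Return the negated mean absolute error (higher is better)."""
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return -sum(errors) / len(errors)


def best_feature_count(scores: Dict[int, float]) -> int:
    """Pick the feature count whose negated MAE is largest (closest to 0)."""
    return max(scores, key=lambda n: scores[n])


# Hypothetical mean cross-validated scores keyed by number of features kept.
cv_scores = {5: -0.90, 10: -0.72, 20: -0.80}
print(best_feature_count(cv_scores))  # 10

# Negated MAE for three hypothetical cage heights (mm).
print(neg_mae([12, 11, 13], [12, 12, 12]))
```

A score of −0.72 therefore beats −0.90, which corresponds to an MAE of 0.72 versus 0.90.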

2.4. Statistical analysis and measurement metrics

A five-fold cross-validation (k=5) was performed to assess the efficacy of the ML regression algorithms. The model was then trained on k − 1 data splits, and the trained model was tested on the remaining held-out split. Subsequently, the performance of each model was averaged across all data splits for comparison. This cross-validation scheme provided a more reliable test result than that derived using a single fixed testing data split, especially when training data were limited. It also guaranteed that each data point was tested exactly once. To compare the performance of all ML algorithms, both the root mean square error (RMSE) and the MAE of each model were calculated. The testing error in each case was then visualized to evaluate the accuracy of prediction. To examine the reliability of features in the deep learning model, the intraclass correlation coefficient (ICC) was calculated using SPSS version 18.0 (SPSS, Chicago, IL, USA). The 95% confidence interval of the ICC estimate suggests poor reliability for values below 0.5, moderate reliability for values between 0.5 and 0.75, adequate reliability for values between 0.75 and 0.9, and excellent reliability for values greater than 0.9 [24]. Schwab classification was then performed with three levels of PI-LL, and the final model was evaluated in terms of its ability to stratify postoperative PI-LL based on the accuracy index and F1-score. Generally, a PI-LL value below 10° yields a modifier of 0, a value between 10° and 20° yields a modifier of 1, and a value greater than 20° yields a modifier of 2 [25].
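The splitting scheme and the two error metrics can be sketched in a few lines. The sketch assumes contiguous, unshuffled folds, which the text does not specify, and the helper names `k_fold_indices`, `rmse`, and `mae` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def k_fold_indices(n: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k (train, test) pairs without shuffling."""
    # The first n % k folds receive one extra sample.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


folds = k_fold_indices(311, k=5)
print([len(test) for _, test in folds])  # [63, 62, 62, 62, 62]
```

Because the test folds partition the index range, every one of the 311 cases is held out exactly once, which is the property the paragraph above relies on.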

3. Results

3.1. Patient characteristics

This study included 126 men and 185 women, with a mean age of 64.08 years (standard deviation: 11.19) and a mean BMI of 25.4 kg/m² (standard deviation: 3.62). In total, 88 patients had lumbar disc herniation, 154 patients had lumbar spinal stenosis, and 69 patients had lumbar spondylolisthesis. Figure 2 depicts the ground truth distribution of the two predicted parameters. Most of the cases (149/311 cases) had cage heights of 12–13 mm, with only a few cases having fusion cage heights of 8, 9, and 15 mm. A similarly uneven distribution was observed in PI-LL values after surgery, with the majority of patients having PI-LL values ranging from 0° to 20°. These unbalanced proportions posed a challenge for the optimization of the ML algorithms.

3.2. Performance of ML algorithms

A total of 53 features were extracted from preoperative X-ray images by using a deep learning model (Supplementary Table S1). These features were determined to be highly reliable, as indicated by the adequate to excellent interobserver reliability within an ICC range of 0.78–0.947 (Supplementary Table S2). These results indicated that the deep learning model had adequate performance for measuring the spinal parameters. After four clinical features were added, 57 features were input into the regression models.

Subsequent experiments were conducted to determine the optimal parameters of each ML algorithm for predicting the cage height and postoperative PI-LL. Table 1 lists the ranges of all searched hyperparameters and their optimal values. Comparison of the five algorithms with optimal parameters revealed that LR outperformed the other models in terms of predicting the cage height, with an RMSE of 1.06 and an MAE of 0.76. With the lowest RMSE and MAE (5.4 and 4.15, respectively), SVR was regarded as the optimal model for predicting postoperative PI-LL, followed by LR, MLP, KNN, and DT (Table 2). Therefore, LR was selected as the baseline model for predicting the cage height, and SVR was selected for predicting PI-LL.

3.3. Final model

3.3.1. Feature selection

Figure 3 depicts the RFECV results of the two baseline models. In the LR model for predicting the interbody cage height, the RFE curve identified 23 features as the optimal input, with a negated MAE of −0.693 at the optimum cut-off. Similarly, the SVR model for predicting postoperative PI-LL identified 24 features as the optimal number of features, with a negated MAE of −4.096 at the cut-off. The two subsets of features were therefore used to retrain the models (Supplementary Table S3), and the final models were validated using the testing set.

3.3.2. Optimal model performance

As shown in Table 3, the final lasso algorithm for cage height prediction exhibited an RMSE of 1.01 and an MAE of 0.7, which are more favorable than the values obtained before feature reduction (i.e., 1.06 and 0.76, respectively). Figure 4 depicts the accuracy of cage height prediction using the testing set, with 42.12% (131/311) of cases predicted exactly. Our model exhibited adequate results for interbody cage heights of 10–13 mm. The most accurate prediction was obtained for a height of 12 mm, with 54.43% (43/79) of cases predicted exactly. Meanwhile, the accuracy ratios for 10, 11, and 13 mm sizes were 52.63% (20 of 38 cases), 51.02% (25 of 49 cases), and 42.86% (30 of 70 cases), respectively. In the majority of the remaining cases, the model exhibited a 1 mm prediction error, resulting in an accuracy rate of 88.75% (276 out of 311 cases) within the acceptable margin of 1 mm.
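The percentages in this paragraph follow directly from the reported counts, as the short check below shows. The second helper, `hit_rate`, is a hypothetical illustration of how exact and within-tolerance rates could be computed from continuous regression outputs; it assumes predictions are rounded to the nearest millimeter, which the text does not state explicitly, and the sample heights are invented.

```python
from typing import Sequence


def pct(hits: int, total: int) -> float:
    """Percentage rounded to two decimals, as reported in the text."""
    return round(100.0 * hits / total, 2)


# Counts taken from the paragraph above.
reported = {
    "exact, all heights": (131, 311),
    "exact, 12 mm": (43, 79),
    "exact, 10 mm": (20, 38),
    "exact, 11 mm": (25, 49),
    "exact, 13 mm": (30, 70),
    "within 1 mm": (276, 311),
}
for label, (hits, total) in reported.items():
    print(f"{label}: {pct(hits, total)}%")


def hit_rate(actual: Sequence[int], predicted: Sequence[float], tol: int = 0) -> float:
    """Share of cases whose rounded prediction is within tol mm of the truth."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - round(p)) <= tol)
    return hits / len(actual)


# Hypothetical cage heights (mm) and raw model outputs.
actual = [12, 11, 13, 15]
predicted = [12.2, 11.8, 12.9, 13.4]
print(hit_rate(actual, predicted), hit_rate(actual, predicted, tol=1))
```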

Because of the limited number of samples in the 8, 9, and 15 mm fusion cage groups, the model exhibited high prediction errors. Specifically, four of six cases with an actual cage height of 8 mm were predicted to have a height of 9 mm. In the 9 mm group, the predicted values were 8 mm in three cases and 9 mm in two cases. In the 15 mm group, the model tended to predict interbody cage heights in the range of 13 to 14 mm in 10 of 14 cases.

For postoperative PI-LL prediction, the final SVR model achieved lower RMSE and MAE values on the testing set than the baseline model (5.19 and 3.86 versus 5.4 and 4.15; Table 3). Figure 5A depicts the performance of the model on both the training and testing data, indicating a well-calibrated model in which most points are clustered around the regression line; this observation suggested that the predicted PI-LL values were close to the actual values. However, in cases with PI-LL values greater than 20, more considerable errors were observed. The model also exhibited high precision in stratifying postoperative PI-LL, achieving an accuracy of 0.81 and a high F1-score for the 0 group (Figure 5B).
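The stratification used for this evaluation can be sketched as follows, using the PI-LL thresholds given in Section 2.4 (below 10°, 10° to 20°, above 20°). Treating exactly 10° and 20° as modifier 1 is an assumption, and the function names and sample angles are hypothetical rather than study data.

```python
from typing import Sequence


def pi_ll_modifier(pi_ll_deg: float) -> int:
    """Map a PI-LL value in degrees to a modifier of 0, 1, or 2."""
    if pi_ll_deg < 10:
        return 0
    if pi_ll_deg <= 20:  # boundary handling at 10 and 20 is an assumption
        return 1
    return 2


def accuracy(true: Sequence[int], pred: Sequence[int]) -> float:
    """Fraction of cases assigned to the correct group."""
    return sum(1 for t, p in zip(true, pred) if t == p) / len(true)


def f1_for_class(true: Sequence[int], pred: Sequence[int], cls: int) -> float:
    """F1-score of one group, treating it as the positive class."""
    tp = sum(1 for t, p in zip(true, pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(true, pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(true, pred) if t == cls and p != cls)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical measured and predicted postoperative PI-LL values (degrees).
actual_deg = [4.0, 8.5, 12.0, 25.0, 9.0]
predicted_deg = [6.0, 11.0, 14.0, 19.0, 7.5]
true_groups = [pi_ll_modifier(v) for v in actual_deg]
pred_groups = [pi_ll_modifier(v) for v in predicted_deg]
print(true_groups, pred_groups)
print(accuracy(true_groups, pred_groups))
print(round(f1_for_class(true_groups, pred_groups, 0), 2))
```

The example also shows why a regression error of a few degrees near a threshold can change the assigned group even when the predicted angle is close to the measured one.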

3.3.3. Feature importance

Figure 6 visualizes the 10 most important features of the two final models. For interbody cage height prediction, the intervertebral height at the midpoint of L4-L5 (L4L5_mid) was the most crucial factor in our model. This prediction was influenced by three angles: LL, PI, and the L4-L5 intervertebral disc angle (L4L5_angle). Additionally, the intervertebral heights of lumbar segments from L3 to S1 served as crucial parameters, including the intervertebral height at the midpoint of L3L4 and L5S1 (L3L4_mid and L5S1_mid), the posterior intervertebral height of L3L4 (L3L4_post), and the anterior intervertebral height of L3L4 and L4L5 (L3L4_ant and L4L5_ant). Only one factor related to the size of the vertebral body, namely the upper vertebral width of L3 (L3Width_up), was included in this list.

Preoperative LL, relative LL (RLL), and PI played crucial roles in predicting the postoperative PI-LL. The essential features associated with PI-LL after surgery were primarily angles involved in preoperative sagittal alignment, such as the sacral slope (SS), pelvic tilt (PT), and L5S1 intervertebral disc angle (L5S1_angle). Other factors influencing PI-LL prediction were related to the height of the vertebral body, such as the anterior height of the L5 vertebra and the posterior height of the L2 and L3 vertebrae (L2Height_Post and L3Height_Post).

4. Discussion

Spinopelvic alignment restoration is essential for both adult spinal deformity surgery and short-segment lumbar interbody fusion [8,26,27]. However, determining the influence of each factor on sagittal alignment is difficult because the normal standing posture is jointly determined by multiple lumbosacral factors [28,29]. As shown in Figure 6 in the present study, the postoperative value of PI-LL is substantially influenced by the preoperative values of LL, RLL, and PI. However, because the PI value is regarded as a constant anatomic feature with slight variation in pathologic disorders or lumbar spine interventions [30], determining the postoperative LL is typically necessary for predicting the optimal PI-LL. According to previous research, LL restoration after surgery is closely linked to preoperative LL and PI [31,32,33,34]. Therefore, LL and PI can be used to predict the LL and PI-LL values after surgery, as in our model.

Appropriate parameters must be obtained for enhancing surgical quality, and surgeons must develop effective strategies to achieve harmonious sagittal alignment. Our model demonstrated a strong capacity to generate a satisfactory PI-LL value while being able to forecast the potential range of this value. Because patients without ASD were selected for the dataset, the algorithm trained on these data was able to generate a favorable PI-LL value, which can be used to reduce the incidence of ASD in patients [7]. Our PI-LL prediction model can also support surgical planning by informing the selection of the appropriate surgical technique and instruments. The optimal PI-LL value, however, remains a subject of debate. Satoshi et al. [35] reported that this value is inconsistent. Meanwhile, multiple studies have suggested that surgeons must strive to reduce PI-LL to 10° or less whenever possible [8,36,37]. According to our model, if unsatisfactory PI-LL prediction values are obtained before surgery, surgeons could consider implementing additional intraoperative techniques. To achieve an adequate LL value, strong fixation with a curved rod system can be implemented. In some cases of severe hypolordosis, osteotomy techniques such as pedicle subtraction osteotomy are also a viable option [38]. Furthermore, the predictive results of postoperative PI-LL from our algorithm may aid in rod bending or in the determination of the number of spinal levels requiring fixation when a surgeon receives intraoperative fluoroscopic images. However, previous studies have revealed substantial discrepancies between standing and prone angle measurements [39,40]. Therefore, these models must be further developed to ensure their seamless integration from preoperative planning to actual surgery.

Size, shape, and position play a crucial role in the insertion of an intervertebral cage. However, findings regarding the importance of the implant shape and placement have been inconsistent. Cage lordosis and final LL after surgery are strongly correlated, with more anterior placement resulting in greater intervertebral lordosis [18]. Conversely, some in vitro biomechanical and clinical studies have reported that the cage position and geometry do not affect sagittal alignment after lumbar interbody fusion [41,42,43]. The cage height typically serves as a key factor applied by surgeons for improving lordosis [44,45], and our research has primarily focused on predicting this index. Most of our cage height values were between 12 and 13 mm, which are consistent with the recommended cage heights of 11, 12, or 13 mm for the L3-4 and L4-5 levels in a previous study conducted in the Chinese population [16]. In addition, our model performed well for cases within this range, indicating its potential clinical applicability for the Asian population. Overall, predicting the appropriate interbody cage size can assist surgeons in decision-making and improve postoperative outcomes, particularly for inexperienced surgeons. Prediction using our model can also provide the cage height with an error of approximately 1 mm only (Figure 4). Consequently, fewer cages need to be sterilized, thus reducing the costs of surgery. In addition, the costs of treatment decrease due to the reduced operation duration and complication rates. Therefore, patients evidently benefit from the development of these models.

Our results indicated that the disc height of the pathological segment and the two adjacent levels plays a crucial role in predicting the height of the interbody cage (Figure 6). To predict this value, Wang et al. [16] developed a regression model that emphasizes the importance of the intervertebral height at the midpoint of the pathological segment (MIVH): interbody cage height = 11.123 − 0.563*gender + 0.149*MIVH. In our study, gender was one of the final 23 features used to build the optimal model, but its influence was not as evident as that of the other parameters. With the exception of the parameters associated with the intervertebral disc height, PI and LL contributed to the prediction of the interbody cage height. These two parameters also contributed to the aforementioned prediction of postoperative PI-LL. Lafage et al. [11] discovered that pelvic retroversion and global sagittal balance in adult patients with spinal deformities were primarily influenced by the PI and LL values. Here, we emphasized that PI and LL are among the most crucial parameters for both long- and short-segment fusion surgeries.
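For reference, the regression formula of Wang et al. quoted above can be evaluated directly. The sketch below is only an illustration of the arithmetic: the numeric coding of gender is defined in the original publication [16] and is not given here, so `gender_code` is left as a caller-supplied value, and the 8.0 mm MIVH input is a hypothetical example.

```python
def wang_cage_height(mivh_mm: float, gender_code: int) -> float:
    """Evaluate: cage height = 11.123 - 0.563 * gender + 0.149 * MIVH.

    gender_code must follow the coding used by Wang et al. [16]; which
    integer corresponds to which sex is not stated in this text.
    """
    return 11.123 - 0.563 * gender_code + 0.149 * mivh_mm


# Hypothetical midpoint intervertebral height of 8.0 mm.
for code in (0, 1):
    print(code, round(wang_cage_height(8.0, code), 3))
```

With these inputs, the two gender codes differ by the fixed coefficient of 0.563 mm, and each additional millimeter of MIVH adds 0.149 mm to the estimate, which shows how weakly the formula responds to disc height compared with the multi-feature models discussed here.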

Multiple researchers have attempted to develop algorithms for predicting postoperative sagittal parameters and the interbody cage height, with the primary goal of improving the accuracy and usability of these algorithms in daily practice. Previous formulas typically included a limited number of significant variables to simplify computations. Lafage et al. [11,46] developed one of the most accurate formulas for predicting the sagittal vertical axis (SVA). They used only four variables in their formula: PI, LL, thoracic kyphosis, and age. Legaye and Duval-Beaupère [30,47] proposed multilinear regression models for calculating LL by using only basic parameters, such as thoracic kyphosis, SS, PI, PT, and T9 spinopelvic inclination. In contrast to previous research, our goal was to incorporate all significant parameters that can be measured in the lumbar region for developing an algorithm. Because our prediction models (LR for the interbody cage height and SVR for postoperative PI-LL) and previous models share the same characteristic of utilizing multiple linear algorithms, we took advantage of the current technological advancements and incorporated as many variables as possible. However, certain factors, such as the width and length of the vertebral body, were identified as crucial features in our model, a finding that has not been reported in the medical literature; this is likely a chance result of training on our dataset and requires verification in future research. Previously, the use of multiple parameters may have been inconvenient for daily practice. With the help of computers, the accuracy of predictions can now be improved, and these predictions can be applied in clinical settings. According to Langella et al. [48], computer-assisted methods are associated with a failure rate below 20% for predicting PI and SVA. To the best of our knowledge, this is the first study to provide a pipeline and various models for predicting PI-LL and the cage height from preoperative X-ray images by using AI.

This study has several limitations. First, this was a retrospective, single-center study with a small sample size. As a result, the optimal interbody cage height or postoperative PI-LL may be influenced by subjective factors such as the surgeon’s technique and patient demographics. Nevertheless, by using multiple algorithms, this study introduced a novel concept and laid the foundation to ensure that these predictions are more accurate in future multicenter studies. Second, this study was limited to patients with monosegmental TLIF at the L4L5 level, and only one sagittal parameter, PI-LL, was predicted. However, using our algorithms, a large number of postoperative parameters can be predicted not only for single-level fusion surgery but also for surgeries involving multiple levels. Third, sagittal balance is associated with factors such as SVA, T1 spinopelvic inclination, and C7 plumb line, which are evaluated using full-length spine radiographs [37,49,50]. Because we focused only on short-segment fusion, we examined only the lumbar region. Therefore, global sagittal balance factors must be examined for TLIF surgery in the future. Finally, our model comprised many steps, thereby increasing the likelihood of errors. To increase the accuracy of predictions, a synthetic model must be developed using radiographic parameters from X-ray, MRI, and CT scans.

5. Conclusion

In this study, we developed end-to-end AI models capable of accurately predicting the interbody cage height and postoperative PI-LL in TLIF surgery. Our results confirmed that powerful computer-assisted models are a valuable tool in spinal morphometry, with ML models exhibiting high levels of accuracy and the potential to aid surgeons in preoperative planning and postoperative evaluation. Our results also indicated that the incorporation of multiple crucial parameters, particularly preoperative PI and LL, into multilinear regression equations is a promising method for predicting the outcomes of spinal fusion surgery. However, to ensure model reliability and generalizability, further validation and refinement with larger datasets and multicenter studies are required.

Supplementary Materials

The following supporting information can be downloaded at the website of this paper posted on Preprints.org. Table S1: Spinal parameter features extracted using a deep learning model; Table S2: ICCs validating the reliability of the deep learning model in measuring bone distance parameters compared with the MRI results; Table S3: Two subsets of crucial features for two baseline ML models.

Author Contributions

Conceptualization, A.T.B., T.-J.H. and M.-H.W.; Methodology, A.T.B. and M.-H.W.; Software, H.L., P.-I.T., and K.-J.C.; Validation, H.-C.S., E.-W.H. and C.-C.H.; Formal Analysis, H.L. and K.L.-C.H.; Data Curation, K.L.-C.H., C.-Y.L. and P.-Y.W.; Writing – Original Draft Preparation, A.T.B., H.L., and G.M.T.; Writing – Review & Editing, T.T.H., H.-C.S., M.M., C.-Y.L., T.-J.H., and M.-H.W.; Visualization, A.T.B. and H.L.; Supervision, T.T.H., H.-C.S., T.-J.H., and M.-H.W.; Project Administration, M.-H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This study was financially supported by the Higher Education Sprout Project of the Ministry of Education of Taiwan.

Institutional Review Board Statement

This study was approved by the Joint Institutional Review Board of Taipei Medical University (N201807084).

Informed Consent Statement

Patient consent was waived for this retrospective study using a clinical database, in accordance with the IRB’s statement and regulations.

Data Availability Statement

Access to the dataset will be provided by the corresponding authors upon reasonable request and in accordance with the policies of the relevant institution.

Acknowledgments

The authors express their gratitude to Mr. Po-Yu Hsieh and Mr. Chen-Wei Lai from the Industrial Technology Research Institute, Taiwan, for their invaluable assistance in constructing the AI models utilized in this study.

Conflicts of Interest

The authors declare no conflict of interest.

References

Mummaneni PV, Dhall SS, Eck JC; et al. Guideline update for the performance of fusion procedures for degenerative disease of the lumbar spine. Part 11: Interbody techniques for lumbar fusion. J Neurosurg Spine. 2014, 21, 67–74. [Google Scholar] [CrossRef]
Noshchenko, A.; Hoffecker, L.; Lindley, E.M.; Burger, E.L.; Cain, C.M.; Patel, V.V. Perioperative and long-term clinical outcomes for bone morphogenetic protein versus iliac crest bone graft for lumbar fusion in degenerative disk disease: Systematic review with meta-analysis. J Spinal Disord Tech. May 2014, 27, 117–135. [Google Scholar] [CrossRef] [PubMed]
Xiao, Y.; Li, F.; Chen, Q. Transforaminal lumbar interbody fusion with one cage and excised local bone. Arch Orthop Trauma Surg. May 2010, 130, 591–597. [Google Scholar] [CrossRef] [PubMed]
Ould-Slimane M, Lenoir T, Dauzac C; et al. Influence of transforaminal lumbar interbody fusion procedures on spinal and pelvic parameters of sagittal balance. Eur Spine J. Jun 2012, 21, 1200–1206. [Google Scholar] [CrossRef] [PubMed]
Watkins RGt, Hanna R, Chang D, Watkins RG, 3rd. Sagittal alignment after lumbar interbody fusion: Comparing anterior, lateral, and transforaminal approaches. J Spinal Disord Tech. Jul 2014, 27, 253–256. [Google Scholar] [CrossRef] [PubMed]
Yamasaki K, Hoshino M, Omori K; et al. Risk Factors of Adjacent Segment Disease After Transforaminal Inter-Body Fusion for Degenerative Lumbar Disease. Spine (Phila Pa 1976). Jan 15 2017, 42, E86–e92. [Google Scholar] [CrossRef] [PubMed]
Rothenfluh, D.A.; Mueller, D.A.; Rothenfluh, E.; Min, K. Pelvic incidence-lumbar lordosis mismatch predisposes to adjacent segment disease after lumbar spinal fusion. Eur Spine J. Jun 2015, 24, 1251–1258. [Google Scholar] [CrossRef] [PubMed]
8. Aoki Y, Nakajima A, Takahashi H.; et al. Influence of pelvic incidence-lumbar lordosis mismatch on surgical outcomes of short-segment transforaminal lumbar interbody fusion. BMC Musculoskelet Disord. [CrossRef]
Senteler, M.; Weisse, B.; Snedeker, J.G.; Rothenfluh, D.A. Pelvic incidence-lumbar lordosis mismatch results in increased segmental joint loads in the unfused and fused lumbar spine. Eur Spine J. Jul 2014, 23, 1384–1393. [Google Scholar] [CrossRef] [PubMed]
Ailon T, Scheer JK, Lafage V; et al. Adult Spinal Deformity Surgeons Are Unable to Accurately Predict Postoperative Spinal Alignment Using Clinical Judgment Alone. Spine Deform. Jul 2016, 4, 323–329. [Google Scholar] [CrossRef]
Lafage, V.; Schwab, F.; Vira, S.; Patel, A.; Ungar, B.; Farcy, J.P. Spino-pelvic parameters after surgery can be predicted: A preliminary formula and validation of standing alignment. Spine (Phila Pa 1976). Jun 2011, 36, 1037–1045. [Google Scholar] [CrossRef]
Lafage, R.; Pesenti, S.; Lafage, V.; Schwab, F.J. Self-learning computers for surgical planning and prediction of postoperative alignment. Eur Spine J, 1: 27(Suppl 1). [CrossRef]
Abbushi, A.; Cabraja, M.; Thomale, U.W.; Woiciechowsky, C.; Kroppenstedt, S.N. The influence of cage positioning and cage type on cage migration and fusion rates in patients with monosegmental posterior lumbar interbody fusion and posterior fixation. Eur Spine J. Nov 2009, 18, 1621–1628. [Google Scholar] [CrossRef] [PubMed]
Li, H.; Wang, H.; Zhu, Y.; Ding, W.; Wang, Q. Incidence and risk factors of posterior cage migration following decompression and instrumented fusion for degenerative lumbar disorders. Medicine (Baltimore). Aug 2017, 96, e7804. [Google Scholar] [CrossRef] [PubMed]
Aoki Y, Yamagata M, Nakajima F; et al. Examining risk factors for posterior migration of fusion cages following transforaminal lumbar interbody fusion: A possible limitation of unilateral pedicle screw fixation. J Neurosurg Spine. Sep 2010, 13, 381–387. [Google Scholar] [CrossRef] [PubMed]
Wang, H.; Chen, W.; Jiang, J.; Lu, F.; Ma, X.; Xia, X. Analysis of the correlative factors in the selection of interbody fusion cage height in transforaminal lumbar interbody fusion. BMC Musculoskelet Disord, 9: 2016, 17, 2016. [Google Scholar] [CrossRef]
Makino, T.; Honda, H.; Fujiwara, H.; Yoshikawa, H.; Yonenobu, K.; Kaito, T. Low incidence of adjacent segment disease after posterior lumbar interbody fusion with minimum disc distraction: A preliminary report. Medicine (Baltimore). Jan 2018, 97, e9631. [Google Scholar] [CrossRef] [PubMed]
Landham, P.R.; Don, A.S.; Robertson, P.A. Do position and size matter? An analysis of cage and placement variables for optimum lordosis in PLIF reconstruction. Eur Spine J. Nov 2017, 26, 2843–2850. [Google Scholar] [CrossRef]
Cho BH, Kaji D, Cheung ZB; et al. Automated Measurement of Lumbar Lordosis on Radiographs Using Machine Learning and Computer Vision. Global Spine J. Aug 2020, 10, 611–618. [Google Scholar] [CrossRef]
Tran V, Lin H-Y, Liu H-W, Jang F-J, Tseng C-H. BiLuNet: A Multi-path Network for Semantic Segmentation on X-ray Images, 1004.
Pedregosa F, Varoquaux G, Gramfort A; et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 2830.
Shalabi, L.A.; Shaaban, Z.; Kasasbeh, B. Data Mining: A Preprocessing Engine. Journal of Computer Science. [CrossRef]
Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. Advances in neural information processing systems. 2017, 30. [Google Scholar]
Koo, T.K.; Li, M.Y. A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research. J Chiropr Med. Jun 2016, 15, 155–163. [Google Scholar] [CrossRef]
Schwab F, Ungar B, Blondel B; et al. Scoliosis Research Society-Schwab adult spinal deformity classification: A validation study. Spine (Phila Pa 1976). May 20 2012, 37, 1077–1082. [Google Scholar] [CrossRef]
Kong, L.D.; Zhang, Y.Z.; Wang, F.; Kong, F.L.; Ding, W.Y.; Shen, Y. Radiographic Restoration of Sagittal Spinopelvic Alignment After Posterior Lumbar Interbody Fusion in Degenerative Spondylolisthesis. Clin Spine Surg. Mar 2016, 29, E87–E92. [Google Scholar] [CrossRef]
Glassman, S.D.; Berven, S.; Bridwell, K.; Horton, W.; Dimar, J.R. Correlation of radiographic parameters and clinical symptoms in adult scoliosis. Spine (Phila Pa 1976). Mar 15 2005, 30, 682–688. [Google Scholar] [CrossRef] [PubMed]
Weisz, G.; Houang, M. Classification of the normal variation in the sagittal alignment of the human lumbar spine and pelvis in the standing position. Spine (Phila Pa 1976). Jul 1 2005, 30, 1558–1559. [Google Scholar] [CrossRef]
Lafage V, Schwab F, Skalli W; et al. Standing balance and sagittal plane spinal deformity: Analysis of spinopelvic and gravity line parameters. Spine (Phila Pa 1976). Jun 15 2008, 33, 1572–1578. [Google Scholar] [CrossRef]
Legaye, J.; Duval-Beaupère, G.; Hecquet, J.; Marty, C. Pelvic incidence: A fundamental pelvic parameter for three-dimensional regulation of spinal sagittal curves. Eur Spine J. 1998, 7, 99–103. [Google Scholar] [CrossRef]
Chou, D. Commentary: Retrospective Review of Immediate Restoration of Lordosis in Single-Level Minimally Invasive Transforaminal Lumbar Interbody Fusion: A Comparison of Static and Expandable Interbody Cages. Operative Neurosurgery. 2020, 18, E153–E154. [Google Scholar] [CrossRef] [PubMed]
McMordie, J.H.; Schmidt, K.P.; Gard, A.P.; Gillis, C.C. Clinical and Short-Term Radiographic Outcomes of Minimally Invasive Transforaminal Lumbar Interbody Fusion With Expandable Lordotic Devices. Neurosurgery. Feb 1 2020, 86, E147–E155. [Google Scholar] [CrossRef] [PubMed]
Porche, K.; Dru, A.; Moor, R.; Kubilis, P.; Vaziri, S.; Hoh, D.J. Preoperative Radiographic Prediction Tool for Early Postoperative Segmental and Lumbar Lordosis Alignment After Transforaminal Lumbar Interbody Fusion. Cureus. Sep 2021, 13, e18175. [Google Scholar] [CrossRef]
Schwab, F.; Lafage, V.; Patel, A.; Farcy, J.P. Sagittal plane considerations and the pelvis in the adult patient. Spine (Phila Pa 1976). Aug 1 2009, 34, 1828–1833. [Google Scholar] [CrossRef] [PubMed]
Inami, S.; Moridaira, H.; Takeuchi, D.; Shiba, Y.; Nohara, Y.; Taneichi, H. Optimum pelvic incidence minus lumbar lordosis value can be determined by individual pelvic incidence. Eur Spine J. Nov 2016, 25, 3638–3643. [Google Scholar] [CrossRef]
Schwab, F.; Patel, A.; Ungar, B.; Farcy, J.P.; Lafage, V. Adult spinal deformity-postoperative standing imbalance: How much can you tolerate? An overview of key parameters in assessing alignment and planning corrective surgery. Spine (Phila Pa 1976). Dec 1 2010, 35, 2224–2231. [Google Scholar] [CrossRef]
Schwab FJ, Blondel B, Bess S; et al. Radiographical spinopelvic parameters and disability in the setting of adult spinal deformity: A prospective multicenter analysis. Spine (Phila Pa 1976). Jun 1 2013, 38, E803–E812. [Google Scholar] [CrossRef] [PubMed]
Berjano, P.; Aebi, M. Pedicle subtraction osteotomies (PSO) in the lumbar spine for sagittal deformities. Eur Spine J, S: 24 Suppl 1. [CrossRef]
Brink RC, Colo D, Schlösser TPC; et al. Upright, prone, and supine spinal morphology and alignment in adolescent idiopathic scoliosis. Scoliosis Spinal Disord. [CrossRef]
Salem, W.; Coomans, Y.; Brismée, J.M.; Klein, P.; Sobczak, S.; Dugailly, P.M. Sagittal Thoracic and Lumbar Spine Profiles in Upright Standing and Lying Prone Positions Among Healthy Subjects: Influence of Various Biometric Features. Spine (Phila Pa 1976). Aug 1 2015, 40, E900–E908. [Google Scholar] [CrossRef]
Takahashi, H.; Suguro, T.; Yokoyama, Y.; Iida, Y.; Terashima, F.; Wada, A. Effect of cage geometry on sagittal alignment after posterior lumbar interbody fusion for degenerative disc disease. J Orthop Surg (Hong Kong). Aug 2010, 18, 139–142. [Google Scholar] [CrossRef] [PubMed]
Kepler CK, Rihn JA, Radcliff KE; et al. Restoration of lordosis and disk height after single-level transforaminal lumbar interbody fusion. Orthop Surg. Feb 2012, 4, 15–20. [Google Scholar] [CrossRef]
Faundez, A.A.; Mehbod, A.A.; Wu, C.; Wu, W.; Ploumis, A.; Transfeldt, E.E. Position of interbody spacer in transforaminal lumbar interbody fusion: Effect on 3-dimensional stability and sagittal lumbar contour. J Spinal Disord Tech. May 2008, 21, 175–180. [Google Scholar] [CrossRef] [PubMed]
Gambhir, S.; Wang, T.; Pelletier, M.H.; Walsh, W.R.; Ball, J.R. How Does Cage Lordosis Influence Postoperative Segmental Lordosis in Lumbar Interbody Fusion. World Neurosurg, e: 126. [CrossRef]
Uribe, J.S.; Harris, J.E.; Beckman, J.M.; Turner, A.W.; Mundis, G.M.; Akbarnia, B.A. Finite element analysis of lordosis restoration with anterior longitudinal ligament release and lateral hyperlordotic cage placement. Eur Spine J, 4: 24 Suppl 3. [CrossRef]
Smith JS, Bess S, Shaffrey CI; et al. Dynamic changes of the pelvis and spine are key to predicting postoperative sagittal alignment after pedicle subtraction osteotomy: A critical analysis of preoperative planning techniques. Spine (Phila Pa 1976). May 1 2012, 37, 845–853. [Google Scholar] [CrossRef] [PubMed]
Legaye, J.; Duval-Beaupère, G. Sagittal plane alignment of the spine and gravity: A radiological and clinical evaluation. Acta Orthop Belg. Apr 2005, 71, 213–220. [Google Scholar]
Langella F, Villafañe JH, Damilano M; et al. Predictive Accuracy of Surgimap Surgical Planning for Sagittal Imbalance: A Cohort Study. Spine (Phila Pa 1976). Nov 15 2017, 42, E1297–E1304. [Google Scholar] [CrossRef]
Glassman, S.D.; Bridwell, K.; Dimar, J.R.; Horton, W.; Berven, S.; Schwab, F. The impact of positive sagittal balance in adult spinal deformity. Spine (Phila Pa 1976). Sep 15 2005, 30, 2024–2029. [Google Scholar] [CrossRef]
Lafage, V.; Schwab, F.; Patel, A.; Hawkinson, N.; Farcy, J.P. Pelvic tilt and truncal inclination: Two key radiographic parameters in the setting of adults with spinal deformity. Spine (Phila Pa 1976). Aug 1 2009, 34, E599–E606. [Google Scholar] [CrossRef]

Figure 1. Study flowchart depicting four subprocesses: data cohort collection, feature extraction, feature validation, and ML model construction and validation. ML: machine learning; SVR: support vector regression; LR: lasso regression; DT: decision tree; KNN: K-nearest neighbor; MLP: multilayer perceptron; RFE: recursive feature elimination; RMSE: root mean square error; MAE: mean absolute error.

Figure 2. Distribution of actual interbody cage heights and postoperative PI-LL values.

Figure 3. RFECV curves of two baseline models with negative MAEs for different numbers of features: (A) a lasso model for interbody cage height prediction and (B) an SVR model for postoperative PI-LL prediction.

Figure 4. Confusion matrix for final model performance in the prediction of interbody cage height.

Figure 5. Performance of SVR. (A) Calibration plot (actual and predicted values) for predicting postoperative PI-LL on both training and testing data. (B) Confusion matrix for stratifying postoperative PI-LL on the testing set into three groups: 0 (<10), 1 (10–20), and 2 (>20).
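The three-group stratification in panel (B) can be illustrated with a short sketch. The thresholds follow the caption (group 0 for values below 10, group 1 for 10 to 20, group 2 for values above 20). Treating both 10 and 20 as belonging to group 1 is an assumption, as are the function names and the example values, which are invented for illustration and are not study data.

```python
def stratify_pi_ll(value):
    """Map a PI-LL value (degrees) to group 0 (<10), 1 (10-20), or 2 (>20)."""
    if value < 10:
        return 0
    if value <= 20:  # assumption: both boundaries fall in group 1
        return 1
    return 2


def confusion_matrix_3x3(actual, predicted):
    """Rows index the actual group, columns index the predicted group."""
    matrix = [[0] * 3 for _ in range(3)]
    for a, p in zip(actual, predicted):
        matrix[stratify_pi_ll(a)][stratify_pi_ll(p)] += 1
    return matrix


def accuracy(matrix):
    """Fraction of cases on the diagonal (same actual and predicted group)."""
    correct = sum(matrix[i][i] for i in range(3))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical actual and predicted postoperative PI-LL values in degrees.
actual_pi_ll = [5.0, 12.0, 25.0, 9.0, 18.0]
predicted_pi_ll = [7.0, 9.5, 22.0, 11.0, 15.0]

matrix = confusion_matrix_3x3(actual_pi_ll, predicted_pi_ll)
print(matrix)            # [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
print(accuracy(matrix))  # 0.6
```

In this toy example, two predictions fall on the wrong side of the 10-degree boundary, so three of five cases are stratified correctly.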

Figure 6. (A) Most crucial features for the model of interbody cage height prediction. (B) Most crucial features for the model of postoperative PI-LL prediction. *Note: The explanations of feature abbreviations are provided in Supplementary Table S1.

Table 1. Hyperparameter optimization for ML algorithms for the prediction of interbody cage height and postoperative PI-LL.

ML algorithm	Hyperparameter ranges	Optimal values for cage height prediction	Optimal values for PI-LL prediction
LR	alpha = [0, 1], interval = 0.001	alpha = 0.001	alpha = 0.01
DT	criterion = [squared_error, friedman_mse, absolute_error, poisson]; min_samples_split = [10, 20, 30, 40, 50]; min_samples_leaf = [5, 10, 20, 30, 40]	criterion = poisson; min_samples_split = 30; min_samples_leaf = 20	criterion = squared_error; min_samples_split = 50; min_samples_leaf = 5
SVR	kernel = [poly, linear, rbf, sigmoid]; C = [0.1, 1, 10, 100]; gamma = [0.001, 0.01, 0.1, 1]	kernel = sigmoid; C = 10; gamma = 0.001	kernel = linear; C = 0.1; gamma = 1
MLP	hidden_layer_sizes = [(50, 50, 50), (100, 100, 100), (200, 200, 200)]; activation = [tanh, relu]; solver = [sgd, adam, lbfgs]; alpha = [0.0001, 0.001, 0.05]	hidden_layer_sizes = (200, 200, 200); activation = relu; solver = lbfgs; alpha = 0.05	hidden_layer_sizes = (200, 200, 200); activation = tanh; solver = sgd; alpha = 0.0001
KNN	n_neighbors = [5, 10, 20, 30, 40, 50]; metric = [euclidean, manhattan, minkowski]; weights = [uniform, distance]	n_neighbors = 20; metric = euclidean; weights = uniform	n_neighbors = 5; metric = euclidean; weights = distance
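As an illustration of how a grid such as those in Table 1 can be searched, the following sketch runs scikit-learn's GridSearchCV over the KNN ranges listed above. The synthetic data, the 5-fold cross-validation, and the negative MAE scoring are assumptions made for this example. The search procedure actually used in the study may differ.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the extracted features (NOT study data).
X, y = make_regression(n_samples=200, n_features=8, noise=5.0, random_state=0)

# KNN hyperparameter ranges as listed in Table 1.
param_grid = {
    "n_neighbors": [5, 10, 20, 30, 40, 50],
    "metric": ["euclidean", "manhattan", "minkowski"],
    "weights": ["uniform", "distance"],
}

# Exhaustive search over all 6 * 3 * 2 = 36 combinations.
# cv=5 and the scoring metric are illustrative assumptions.
search = GridSearchCV(
    KNeighborsRegressor(),
    param_grid,
    cv=5,
    scoring="neg_mean_absolute_error",
)
search.fit(X, y)

print("Best hyperparameters:", search.best_params_)
print("Best mean negative MAE:", search.best_score_)
```

The same pattern applies to the other algorithms by swapping the estimator and the parameter grid.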

Table 2. Performance of ML algorithms in the prediction of interbody cage height and postoperative PI-LL. RMSE: root mean square error; MAE: mean absolute error.

Algorithm	Cage height RMSE (mm)	Cage height MAE (mm)	Postoperative PI-LL RMSE (°)	Postoperative PI-LL MAE (°)
DT	1.12	0.85	7.05	5.39
LR	1.06	0.76	5.42	4.2
SVR	1.09	0.77	5.4	4.15
MLP	1.16	0.87	6.36	4.84
KNN	1.25	0.498	6.98	5.21
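For reference, the two metrics reported in Table 2 can be computed as in the following minimal sketch. The function names and the example values are hypothetical and are not taken from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared difference."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute differences."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Hypothetical cage heights in mm (actual vs. predicted).
actual_heights = [10, 12, 12, 14]
predicted_heights = [11, 12, 13, 14]

print(mae(actual_heights, predicted_heights))             # 0.5
print(round(rmse(actual_heights, predicted_heights), 4))  # 0.7071
```

Because squaring weights large errors more heavily, RMSE is never smaller than MAE on the same set of predictions.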

Table 3. Performance of two final models. RMSE: root mean square error; MAE: mean absolute error.

Model	Baseline RMSE	Baseline MAE	Optimal RMSE	Optimal MAE
Cage height prediction	1.06	0.76	1.01	0.7
Postoperative PI-LL prediction	5.5	4.15	5.19	3.86

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
