1. Introduction
Hip and groin problems remain prevalent in team sports that require high-intensity tasks [1,2], including sprinting, sudden changes in direction, and kicking [3]. Groin injury incidence rates (IR) range from 1.5 to 1.9 injuries per 1000 hours of total exposure at the elite [4] and amateur [5] levels, with a recurrence rate of 18% within two months [6]. These findings suggest the need to implement effective injury prevention programs to reduce the number and severity of groin injuries. To establish effective injury prevention strategies, it is essential that clinicians identify players at high risk of groin injury.
Deficits in muscle strength are often considered an important risk factor for hip and groin injuries [1,7]. Strength-related variables may include the maximum absolute strength of the muscles involved [8], the strength difference between the two limbs [9,10], and the ratio of agonist to antagonist muscle strength [11]. Some studies have found that low adductor strength increased the risk of sustaining a groin injury by 72% [8] or 80% [10] in soccer players, while others have reported no association between hip adductor or abductor strength and hip/groin injuries [12]. The association between hip/groin injury and strength differences in both the adductor and antagonist muscles is not well understood. To the best of our knowledge, only one study has reported that soccer players with hip abduction imbalances favoring the preferred kicking limb were 42% more likely to sustain a future hip/groin injury [9]. In addition, there is evidence that hockey players with weaker adductors relative to abductors were much more likely to sustain an adductor muscle strain [13]. Hence, the relationship between strength and groin injury in soccer remains unclear, which highlights the need for more research in this area.
Multiple factors may be responsible for the inconsistent results regarding the association between strength and hip/groin injury. These factors interact in a rather complex pattern that is difficult to capture using traditional statistical models, such as logistic regression (LR). Artificial Intelligence (AI) and Machine Learning (ML) have been proposed for analyzing problems that involve multiple risk factors and complexity [14,15]. ML is an advanced tool for data analysis that uses algorithms that automatically learn from data to predict events [16]. ML algorithms have recently been applied to predict muscle injuries with high accuracy [17] or to identify players at risk of sustaining a hamstring injury [18]. Moreover, pre-season measurements demonstrated good to excellent accuracy in predicting acute or overuse injury among young elite soccer players [19], or whether a previously reported injury is likely to occur in the next season among professional NHL players [20]. In contrast to these promising studies, other researchers reported low predictive accuracy in injury prediction [21] and in identifying athletes at high risk of hamstring strain [22] or ACL injury [23]. To the best of our knowledge, the application of ML algorithms to describe the complex relationship between strength-related variables and hip and groin injuries has not been previously reported.
Muscle strength improvement is an important element of pre-season and in-season exercise programs for enhancing performance as well as for preventing injuries [24]. From a clinical point of view, screening to identify players' injury risk profiles would assist practitioners in applying effective preventive interventions. This requires an understanding not only of the maximum strength capacity of the muscle that sustains injury, but also of the strength of the same muscle in one leg relative to the other and of the strength of the antagonist relative to the agonist muscle. The primary objective of this study was to investigate the relationship between various factors and the incidence of groin injury in soccer players using ML algorithms. The use of advanced statistical algorithms, such as ML, may provide new insights into the identification of contributors to groin injuries. We hypothesized that players with deficits in maximum strength and strength imbalances between hip adductors and hip abductors would be more likely to sustain a groin injury.
3. Results
Of the 120 participants (mean age: 20.0 ± 6.96 years; BMI: 22.53 ± 2.28 kg/m²; height: 1.77 ± 0.07 m; body mass: 70.66 ± 10.08 kg), 22 (18.33%) experienced 25 groin injuries. Two players sustained a reinjury. The mechanisms of injury are presented in Table 1.
The performance of the k-NN model in predicting the players’ chances of sustaining a groin injury is summarised in Table 2. The predictive model achieved a mean accuracy score of 55% and an Area Under the Curve (AUC) of 0.43, indicating a reasonable injury prediction. The Precision and Recall scores indicated that the model predicted more than 60% of positive cases and correctly identified 80% of the actual positive classes.
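As an aid to reading the AUC value, the following is a minimal sketch of how an AUC can be computed from predicted scores: it is the probability that a randomly chosen injured player receives a higher score than a randomly chosen non-injured player, so 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. The function name `auc_from_scores` and the example scores are hypothetical and are not taken from the study.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (injured, non-injured) pairs in which the
    injured player has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = injured) and model scores, for illustration only.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
print(round(auc_from_scores(labels, scores), 3))  # 5 of 6 pairs ranked correctly
```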
The confusion matrix of the model developed after cross-validation is presented in Figure 2. This technique was employed to evaluate the performance of the classifier in predicting groin injury using the training and test data. During the training stage, the model correctly predicted 68 out of 69 non-injured players (1 misclassification), and all 14 injured players were correctly classified. On the test data, the model correctly predicted 4 out of 8 injured players, whereas 5 out of 29 non-injured players were misclassified. Overall, the model performed reasonably well in the classification task against the test data, despite the relatively low number of observations and the class imbalance within the data.
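The standard metrics can be derived directly from the cells of a confusion matrix. The sketch below uses the test-set counts described above (4 of 8 injured players correctly predicted, 5 of 29 non-injured players misclassified); the function name `classification_metrics` is hypothetical. These single-split values need not coincide with the cross-validated means reported in Table 2.

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall and specificity from confusion-matrix cells."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else float("nan"),
        "recall": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Test-set counts as described in the text:
# injured: 4 correct (TP), 4 missed (FN); non-injured: 24 correct (TN), 5 wrong (FP).
test_metrics = classification_metrics(tp=4, fn=4, fp=5, tn=24)
for name, value in test_metrics.items():
    print(f"{name}: {value:.3f}")
```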
Figure 3 shows the contribution of each variable to the performance of the model pipeline as a feature importance plot. Seven of the 20 variables contributed more than 8% each to the model's prediction of the probability of sustaining a groin injury. These seven variables were further analysed using multivariate logistic regression to determine their contribution to the probability of a player being injured, based on odds analysis.
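This section does not state how the importances in Figure 3 were obtained. A plain k-NN classifier has no coefficients or built-in feature weights, so one model-agnostic option is permutation importance: shuffle one feature at a time in held-out data and record the drop in accuracy. The sketch below is an assumption-laden illustration of that idea on synthetic data (one informative feature, one noise feature); the helper names, k = 3, and the data are all hypothetical and do not reproduce the study's pipeline.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k nearest training rows (Euclidean distance)."""
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    hits = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return hits / len(test_y)


def permutation_importance(train_X, train_y, test_X, test_y, n_repeats=20, seed=0):
    """Mean drop in test accuracy when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(train_X, train_y, test_X, test_y)
    importances = []
    for j in range(len(test_X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in test_X]
            rng.shuffle(column)
            shuffled = [
                row[:j] + (value,) + row[j + 1:]
                for row, value in zip(test_X, column)
            ]
            drops.append(baseline - accuracy(train_X, train_y, shuffled, test_y))
        importances.append(sum(drops) / n_repeats)
    return importances


def make_rows(n, rng):
    """Synthetic rows: feature 0 tracks the label, feature 1 is pure noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append((label + rng.uniform(-0.1, 0.1), rng.uniform(0.0, 0.2)))
        y.append(label)
    return X, y


data_rng = random.Random(1)
train_X, train_y = make_rows(40, data_rng)
test_X, test_y = make_rows(20, data_rng)
importances = permutation_importance(train_X, train_y, test_X, test_y)
print(importances)  # the informative feature should show the larger drop
```

Because the importance of each variable is measured with all other variables left in place, the ranking depends on which variables are included in the model.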
The results of the multivariate regression model are presented in Table 3. The results showed that players with a history of previous injury had a 67% higher risk of sustaining a groin injury (OR = 0.333, 95% CI = [0.1068, 1.038]). Additionally, players with a higher adductor/abductor isometric strength ratio in the non-dominant limb were 76% less likely to sustain a groin injury (OR = 0.238, 95% CI = [0.098, 0.572]). No other significant contributor variables were found (p > 0.05). Overall, the model showed a good fit (Hosmer-Lemeshow p > 0.05), good overall correct classification (87%), and notable discriminant capacity, with an AUC of 77% at a 95% confidence level. The model accounts for 22% of the variation in whether players sustained a groin injury or not (Nagelkerke R² = 0.22).
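The percentage quoted alongside an odds ratio follows from a simple conversion: an OR below 1 corresponds to a (1 − OR) × 100% reduction in the odds, and the OR itself is the exponential of the logistic regression coefficient. The sketch below applies this to the strength-ratio result reported above; the function names are hypothetical, and the reference level of the predictor is defined in Table 3, not here.

```python
import math


def odds_change_percent(odds_ratio):
    """Percentage change in the odds implied by an odds ratio (negative = reduction)."""
    return (odds_ratio - 1.0) * 100.0


def log_odds_coefficient(odds_ratio):
    """Logistic regression coefficient (log-odds scale) corresponding to an OR."""
    return math.log(odds_ratio)


def ci_excludes_one(lower, upper):
    """True when a confidence interval for an OR does not contain 1."""
    return upper < 1.0 or lower > 1.0


# Adductor/abductor ratio of the non-dominant limb, values as reported in the text.
odds_ratio, ci_low, ci_high = 0.238, 0.098, 0.572
print(f"change in odds: {odds_change_percent(odds_ratio):.1f}%")
print(f"coefficient on the log-odds scale: {log_odds_coefficient(odds_ratio):.3f}")
print(f"95% CI excludes 1: {ci_excludes_one(ci_low, ci_high)}")
```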
4. Discussion
The main findings of this study are that (a) the adductor/abductor isometric strength ratio of the non-dominant limb was a significant risk factor for groin injury; (b) soccer players with a history of groin injury were at a higher risk of sustaining a groin injury; and (c) neither the isometric strength of either limb nor the adductor/abductor ratio of the dominant limb was a significant injury risk factor. To the best of our knowledge, this study is the first to analyze the interrelationships among 20 variables through ML applications to predict groin injury in amateur soccer players.
Hip adductor isometric strength was not a significant contributor to the injury prediction model, which is in line with previous research [12,41]. In contrast, our results contradict previous findings of an increased risk of injury for athletes with lower hip adductor strength in the dominant limb [2,8,9,10,13]. The discrepancies between studies may be attributed to differences in methodology. First, in the present study, an ML algorithm was used to examine potential contributors to groin injury, while previous studies used logistic regression [42,43]. ML algorithms have the advantage that they can model highly non-linear relationships, while logistic regression emphasizes inference [44]. Second, unlike previous studies, we incorporated agonist/antagonist strength ratios (in addition to absolute strength values) as well as hip flexor and knee flexor torque values and ratios into the model (Figure 3). The results of any statistical algorithm depend on the number of input variables, their interactions, and their relationship with the occurrence of the injury. Hence, the results of this study are not directly comparable to those reported by previous studies. Interestingly, and in contrast to our expectations, when the absolute strength of the adductors or the abductors was entered into the model together with various relative strength measures and other potential risk factors, it was not a significant predictor of groin injury (Figure 3). Therefore, it is doubtful that players with lower absolute strength values would have a higher risk of groin injury.
The results revealed that players with a lower adductor/abductor isometric strength ratio of the non-dominant limb had a 76% greater chance of sustaining a groin injury (Table 3). This finding is interesting for two reasons. First, it was not the absolute strength of the injured muscle or its antagonist that showed a high predictive capacity, but the relative strength between the two antagonistic muscle groups. A previous study has reported similar findings in ice hockey players [13]. Second, it is consistent with a cross-sectional study, which found that professional soccer players with previous groin injuries had a lower adductor/abductor strength ratio compared to asymptomatic players [11]. An imbalance between the adductors and abductors may contribute to altered motion of the lumbo-pelvic system, especially when players perform demanding tasks that include acceleration [45], high-speed running [45], and change of direction (CoD) [45,46] (Table 1). In addition to the hip adductors and abductors, pelvic movement can be affected by other muscles, including the hip and knee flexors, which were also analyzed in this study. Interestingly, while the regression analysis did not demonstrate significant predictive value for the ratios between the hip and knee flexors (Table 3), the machine learning algorithm indicated that the knee flexor/hip flexor strength ratio contributed substantially to the model’s performance (Figure 3). This suggests that changes in the coordination or strength of the muscles around the hip and pelvic region might play a role in generating excessive forces in the adductor muscle-tendon units, consequently resulting in injury. However, further investigation is required to validate this suggestion.
Our findings indicated that the balance ratio of the non-injured limb, typically the non-dominant limb, is of primary importance in predicting groin injuries (Figure 3). Previous research reported that athletes with lower abductor strength in their dominant/preferred limb, relative to the other limb, were at a higher risk of sustaining a groin injury [9], but other studies failed to confirm a similar association [10]. However, none of these studies examined strength imbalances in both limbs and their association with injury. A recent study observed a deficit in the hip adductor/abductor strength ratio during the middle and end of the season compared to the preseason, which was more pronounced in the non-dominant limb [47]. This may help explain the present findings, suggesting that strength levels may change during the competitive season, and that these changes may differ not only between the hip adductors and abductors but also between the two limbs [47].
Consistent with previous research [3], the leading injury mechanisms were changes in direction (CoD) and acceleration (Table 1). These tasks are characterized by high loads on the adductor longus and gracilis, as well as on the surrounding passive structures of the groin area [45]. Furthermore, sprint accelerations show kinematics, kinetics, and adductor muscle forces similar to those observed during change-of-direction maneuvers, implying that the acceleration phase at the end of the change-of-direction movement might be responsible for the development of groin injury [45]. Recent studies have identified two main mechanisms responsible for the development of groin pain: (1) high amounts of movement with eccentric contractions [48] and (2) rapid transitions between flexion and extension [49]. Both mechanisms are present during changes in direction and side kicking (passing of the ball), which occur repeatedly during training sessions or games [46,48,49]. Consequently, accumulated high muscle stress during eccentric adductor contractions in these accelerations results in high loads and increases the risk of groin injury [50]. The role of the non-dominant limb in highly demanding soccer tasks, such as acceleration [45], CoD [46], and kicking [48,51], has been previously documented. During these movements, the non-dominant limb must support the body and stabilize the pelvis through closed kinetic chains. For example, during the first ground contact of sprint acceleration, the largest hip adductor forces were observed when there was a fast transition from hip abduction to adduction with the hip in extension [46]. Similarly, in cutting maneuvers and inside passing, the largest adductor muscle activity was found during rapid muscle lengthening [46]. Speculatively, lower adductor strength in the non-dominant limb indicates a lower capacity of these muscles to withstand high forces when players change direction, especially when the muscles experience a large stretch while stabilizing the hip of the non-dominant limb during the last phase of the change in direction. Further research is necessary to explain the relationship between the non-dominant lower limb and the development of groin injuries.
The k-NN algorithm implemented in this study represents a novel approach to predicting groin injuries in soccer, reinforcing previous efforts to predict injuries in professional adult [18] or junior players [19]. However, comparison between studies is difficult due to differences in the algorithm used, the injury type, or the level of play of the study sample. ML analyses the variables and identifies those with a high predictive impact on the outcome. The algorithm considers both linear and non-linear relationships in the data during these analyses. In contrast, the multivariate logistic regression approach is employed for an odds analysis, predicting the likelihood of injury occurrence. However, it is important to note that the LR algorithm has a limitation: it models only linear relationships between the predictors and the log-odds of the outcome. Consequently, when the relationship between variables is non-linear, LR may not be able to identify its importance. Therefore, it is imperative to carefully consider the type of relationship between variables when choosing the appropriate statistical approach for injury prediction. Initially, we attempted to fit the LR model using all variables, but the performance was subpar. Surprisingly, only one variable, namely “Previous injury,” was found to be significant. This finding emphasizes the limitations of the LR model in accurately capturing non-linear data patterns. However, we successfully employed the LR model for the likelihood analysis, as demonstrated in the odds analysis.
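The contrast between the two approaches can be illustrated with a minimal k-NN sketch on a toy exclusive-or (XOR) pattern, in which the outcome depends on the combination of two predictors and no single straight-line boundary separates the classes. The data, the two unnamed features, k = 3, and the Euclidean distance are all assumptions made for illustration; they are not the study's data or settings.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]


# Toy XOR layout (hypothetical): class 1 when both features are low or both
# are high, class 0 when exactly one of them is high.
train_X = [
    (0.0, 0.0), (0.1, 0.1), (0.0, 0.1),   # both low   -> 1
    (1.0, 1.0), (0.9, 0.9), (1.0, 0.9),   # both high  -> 1
    (0.0, 1.0), (0.1, 0.9), (0.0, 0.9),   # low, high  -> 0
    (1.0, 0.0), (0.9, 0.1), (1.0, 0.1),   # high, low  -> 0
]
train_y = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

queries = [(0.05, 0.05), (0.95, 0.95), (0.05, 0.95), (0.95, 0.05)]
predictions = [knn_predict(train_X, train_y, q) for q in queries]
print(predictions)  # each query takes the label of its own corner
```

Because k-NN decides from local neighbourhoods, it recovers this pattern without any interaction term, whereas a logistic regression on the two raw features would need an explicit product term to represent it.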
One notable strength of our study lies in the incorporation of the ratios of the isometric strength variables into the predictive model (Table 3). This approach allows for a comprehensive analysis of the variables pertinent to injury prediction. It is important to note that removing these added ratios may yield different results, as our analysis captures the intricate interrelationships among variables. Specifically, when a particular variable is deemed essential, other variables may be considered unimportant and vice versa. Researchers should take heed of these findings and consider the limitations of the LR model when dealing with non-linear data.
This study has several limitations. First, we acknowledge that the sample of players was relatively small, which resulted in a relatively small number of injuries. It should be mentioned that this study examined players from an amateur league consisting of 11 teams. Even though we contacted all teams, 6 teams ultimately participated, a reasonable recruitment rate of 54.5%. Another limitation is that, by defining injuries as time-loss injuries, we did not consider players’ problems that required medical assistance but did not result in time loss. In addition, the measurements were performed in a field setting, which precluded the use of belt fixation. These limitations should be taken into consideration when interpreting our findings. On the other hand, a strength of the study was its internal validity, since all measurements were performed by the same investigator.