1. Introduction:
While sonography is the primary method for evaluating fetal anomalies, it has limitations regarding specificity and visualization. MRI can be a valuable adjunct to sonography, particularly when sonographic findings are inconclusive, as it can provide additional diagnostic information and help improve the accuracy of prenatal diagnoses (1).
Accurately predicting gestational age (GA) has several applications, including pregnancy dating, assessing fetal growth and development, determining the timing of delivery and interventions (such as administering steroids), and detecting fetal growth restriction, preterm labor, and other complications. It is essential for making appropriate obstetrical management decisions and ensuring optimal maternal and fetal outcomes (2).
Accurate GA assessment is vital for proper obstetrical management decisions, scheduling and interpreting prenatal tests, and evaluating fetal growth. Knowing the gestational age helps prevent preterm and post-term births, which can harm both the mother and baby (3). Additionally, it enables healthcare professionals to distinguish between normal and abnormal findings. GA is also crucial when interpreting test results, such as the maternal triple-screen blood test, which is normally conducted and interpreted between the 15th and 18th week of pregnancy (4).
GA can be determined using the first day of the mother's last menstrual period (LMP) or ultrasound (US) measurements (5). While US is generally more accurate, measurement errors and biological variability can also affect its accuracy. The accuracy of gestational age dating decreases with time since conception. In the first trimester, the error range is within 1 week, increasing to 1-2 weeks in the second trimester and up to 1 month in the third trimester (6). When using menstrual dating alone to determine GA, estimations were inaccurate in 11% to 42% of cases. However, data still support the use of US dating, despite its own sources of inaccuracy (2). The American College of Obstetricians and Gynecologists (ACOG), the American Institute of Ultrasound in Medicine (AIUM), and the Society for Maternal-Fetal Medicine (SMFM) specifically recommend using US dating to determine GA if LMP is unknown or inconsistent with US dates.
Ultrasound parameters used to determine gestational age differ by trimester and include mean sac diameter (MSD), crown-rump length (CRL), biparietal diameter (BPD), corrected biparietal diameter (BPDC), and femur length. The MSD is used in the early part of the first trimester, up to 6 weeks. The CRL is used in the later part of the first trimester, up to 12 weeks. The BPD, BPDC, and femur length are all used for gestational age dating during the second and third trimesters. The BPD measures the diameter of a transverse section of the fetal skull at the level of the parietal eminences. The BPDC is calculated from the BPD and the occipitofrontal diameter.
The occipitofrontal diameter, measured as the length from the frontal bone to the occipital bone, is also an important biometric parameter for gestational age dating.
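As a hedged illustration of how a corrected BPD can be computed from the BPD and the occipitofrontal diameter (OFD), a commonly cited area-corrected form from the ultrasound literature is shown below. Whether this exact expression is the one used in this study is an assumption; both diameters are taken in the same units.

```latex
\mathrm{BPDC} = \sqrt{\frac{\mathrm{BPD} \times \mathrm{OFD}}{1.265}}
```

With this form, a head whose OFD/BPD ratio equals 1.265 gives BPDC equal to BPD, so the correction changes the value only when the head is rounder or more elongated than that reference shape.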
Compared to ultrasound, fetal MRI offers several advantages for GA prediction. MRI provides better image resolution and tissue contrast, enabling more accurate measurements of fetal anatomy and development. Additionally, MRI can provide more comprehensive information about fetal brain development, which is crucial for detecting abnormalities and planning interventions. Fetal MRI is also less dependent on the operator's skill, which can lead to greater consistency and accuracy in GA prediction. These advantages make fetal MRI a promising tool for GA prediction and prenatal care.
While ultrasound is the primary modality for determining gestational age due to its widespread availability and accuracy, MRI can provide additional information in certain clinical situations. Here are some roles of MRI in determining gestational age:
Evaluation of Fetal Anomalies: MRI is particularly useful in assessing fetal anomalies and structural abnormalities that may impact GA determination. It can provide detailed images of the fetus and surrounding structures, allowing for a comprehensive evaluation of fetal development and identification of any abnormalities.
Confirmation of Ultrasound Findings: In cases where ultrasound findings are inconclusive or unclear, MRI can serve as a complementary imaging modality to confirm or further evaluate the findings. It can provide additional anatomical information and help clarify any GA uncertainties.
Assessment of Fetal Brain Development: MRI offers excellent soft tissue contrast and detailed visualization of the fetal brain. It can be valuable in evaluating brain development and detecting abnormalities impacting GA determination.
Evaluation of Placental Function: MRI can provide information about the placental function and blood flow, which can be important in assessing gestational age and overall fetal well-being.
Assessment of Fetal Growth: While ultrasound is the primary method for assessing fetal growth, MRI can be used in cases where ultrasound measurements are limited or difficult to obtain accurately. MRI can provide volumetric measurements and estimations of fetal weight, which can contribute to assessing gestational age and fetal growth.
AI has been utilized in processing anatomic fetal brain MRI to automatically predict landmarks and perform segmentation. Various AI models, including convolutional neural networks and U-Net, have been employed and have achieved accuracy levels of 95% and higher. AI has shown potential in aiding the preprocessing (7) and post-processing (8) of fetal images, as well as in image reconstruction. Additionally, AI can be applied to tasks such as gestational age prediction with an accuracy of one week (9), fetal brain extraction (10, 11), fetal brain segmentation (12), and placenta detection. Furthermore, certain linear measurements of the fetal brain, such as cerebral and bone biparietal diameter, have been proposed as potential applications of AI in this field.
Deep learning techniques have shown great potential in accurately predicting GA in fetuses using magnetic resonance imaging (MRI). Artificial neural networks (ANNs) and convolutional neural networks (CNNs) are two types of deep learning computing paradigms that have been utilized for medical image recognition tasks (13). CNNs have demonstrated high accuracy in predicting the chronological age of adults using brain MRI scans (14). By applying these techniques to fetal MRI scans, researchers can accurately determine the gestational age of fetuses, which is essential for appropriate obstetrical management decisions and optimal maternal and fetal outcomes.
The application of artificial intelligence techniques, especially segmentation techniques, to extract fetal brain structures has revolutionized the field of prenatal imaging analysis. DynUNet (Dynamic U-Net) is a deep-learning network architecture for image segmentation tasks. It has great potential in various image segmentation tasks and has been widely used in computer vision. Based on this framework, we automatically combine OpenCV (Open Source Computer Vision) algorithms, including edge detection, convex hull extraction, and minimum bounding rectangle fitting, to obtain key data such as BPD, FOD, and HC for GA prediction.
This study aims to apply deep learning techniques to fetal MRI studies to determine GA accurately. For this purpose, we used three trusted variables: BPD, FOD, and HC. We measured them manually (by a radiologist) as well as with an AI tool. Finally, we compared the accuracy of GA prediction between the two methods for GA-BPD, GA-FOD, GA-HC, and average GA.
2. Method:
2.3. Dataset and measurement
The measurements of Biparietal Diameter (BPD), Fronto-occipital Diameter (FOD), and Head Circumference (HC) were obtained from 52 fetal MRI studies included in this study. Manual measurements were performed by an expert radiologist using PACS imaging software. BPD was measured as the maximum distance between the inner edges of the parietal bones, while FOD was measured as the maximum distance between the inner edges of the frontal and occipital bones. For HC, the measurement was calculated using a specific formula.
AI model for measurement:
Fetal Brain Extraction: For automatic measurements using AI models, fetal brains were first extracted by the Dynamic UNet tool, a deep learning pipeline based on the nnU-Net adaptive framework for U-Net-based medical image segmentation. We used the PyTorch-based MONAIfbs (MONAI Fetal Brain Segmentation) toolkit for automatic fetal brain segmentation on HASTE-like MR images.
Defining the length and width of the brain: After obtaining the mask, we use OpenCV to measure the perimeter of the mask (HC) and the smallest rectangle that encloses the mask, where the length is FOD and the width is BPD (Fig. 1a).
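The measurement step can be sketched without dependencies as follows. This is an illustration only, not the authors' code: it assumes the mask boundary is already available as an ordered list of pixel coordinates and that a pixel spacing `mm_per_px` is known (both names are hypothetical). In OpenCV, the analogous calls would be `cv2.findContours`, `cv2.arcLength`, `cv2.convexHull`, and `cv2.minAreaRect`.

```python
import math

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def perimeter(polygon):
    """Sum of segment lengths of a closed polygon given as ordered vertices."""
    n = len(polygon)
    return sum(math.dist(polygon[i], polygon[(i + 1) % n]) for i in range(n))

def min_area_rect_sides(hull):
    """Return (long_side, short_side) of the minimum-area enclosing rectangle.

    Relies on the fact that one side of that rectangle lies along a hull edge,
    so it is enough to try every edge direction.
    """
    best = None
    n = len(hull)
    for i in range(n):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % n]
        length = math.hypot(x1 - x0, y1 - y0)
        ux, uy = (x1 - x0) / length, (y1 - y0) / length
        along = [x * ux + y * uy for x, y in hull]     # projection on the edge
        across = [-x * uy + y * ux for x, y in hull]   # projection on its normal
        w = max(along) - min(along)
        h = max(across) - min(across)
        if best is None or w * h < best[0]:
            best = (w * h, max(w, h), min(w, h))
    return best[1], best[2]

def measure_head(boundary_px, mm_per_px):
    """Rectangle length -> FOD, rectangle width -> BPD, contour perimeter -> HC."""
    hull = convex_hull(boundary_px)
    fod_px, bpd_px = min_area_rect_sides(hull)
    return {
        "BPD": bpd_px * mm_per_px,
        "FOD": fod_px * mm_per_px,
        "HC": perimeter(boundary_px) * mm_per_px,
    }
```

For example, `measure_head([(0, 0), (100, 0), (100, 80), (0, 80)], 0.5)` treats a 100 by 80 pixel rectangle as the mask boundary and reports the long side as FOD, the short side as BPD, and the outline length as HC.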
Choosing the median of all axial series: Usually, the same patient has several sets of sequences, such as Axial, Sagittal, and Coronal. Fig. 1b and Fig. 1c show the 15_AX_BrainT2_HASTE series and the 32_AX_BrainT2_HASTE series of the same patient (exam_000001). The AI chooses the median of all axial series as the final automatic measurement result.
(It is worth mentioning that the BPD and FOD measured clinically come from Coronal and Sagittal sequences, respectively, rather than from Axial series, but the final results show a high degree of correlation.)
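A minimal sketch of the median step, assuming one set of measurements per axial series and a per-parameter median (the text does not say whether the median is taken per parameter or per series, so this is an assumption). The function name and the numeric values are hypothetical.

```python
from statistics import median

def aggregate_axial(series_measurements):
    """Take the per-parameter median across all axial series of one exam.

    series_measurements maps a series name to a dict with BPD, FOD, HC in mm.
    """
    keys = ("BPD", "FOD", "HC")
    return {k: median(m[k] for m in series_measurements.values()) for k in keys}

# Illustrative values only; the third series name is made up for the example.
exam = {
    "15_AX_BrainT2_HASTE": {"BPD": 70.0, "FOD": 90.0, "HC": 255.0},
    "32_AX_BrainT2_HASTE": {"BPD": 72.0, "FOD": 93.0, "HC": 262.0},
    "40_AX_BrainT2_HASTE": {"BPD": 71.0, "FOD": 91.0, "HC": 258.0},
}
final = aggregate_axial(exam)
```

Using the median rather than the mean keeps a single poorly segmented series from shifting the final value.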
Figure 1 illustrates the automatic measurement process of Biparietal Diameter (BPD), Fronto-occipital Diameter (FOD), and Head Circumference (HC) on T2 fetal MRI. The yellow box represents BPD and FOD, while the red circle indicates HC measurement.
Figure 1.
Automatic Measurement of BPD, FOD, and HC on T2 Fetal MRI.
(a) AI Processed Mask: The AI model directly processes the mask to obtain measurement data. The width of the yellow box represents the Biparietal Diameter (BPD) and its length represents the Fronto-occipital Diameter (FOD). The red circle depicts the measurement of Head Circumference (HC).
(b) Overlay on Original MRI: The mask from (a) is superimposed onto the original MRI image, visually representing the measurements.
(c) Measurements on Other Series: Further showcasing the versatility of the AI model, measurements on other series of the same patient demonstrate its consistent performance across different image series.
2.4. Prediction of Fetal Age
To predict the age of the fetus, we used the measurements of BPD, FOD, and HC obtained both manually and automatically (by the AI model). We then compared the age predicted from the AI measurements with the age predicted from the manual measurements across different criteria: BPD, FOD, HC, and corrected BPD.
To establish a reliable basis for comparison, the predicted age was determined by applying standard fetal growth charts based on Biparietal Diameter (BPD) and Fronto-occipital Diameter (FOD) measurements from established clinical references commonly used in prenatal care:
MRI of the Fetal Brain Normal Development and Cerebral Pathologies, 1st ed. 2004 Edition, by C. Garel (15). (We mentioned this reference as Garel in the paper). Supplement Tables S1 and S2.
We also used another trusted reference, mainly for ultrasound: Hadlock FP, Deter RL, Harrist RB, et al. Fetal biparietal diameter: A critical reevaluation of the relation to menstrual age by means of real-time ultrasound. J Ultrasound Med 1982 (16). (We refer to this as the Freq reference in the paper.) Supplement Tables S3 and S4 are derived from the reference table in Hadlock et al.
We also used another reference for ultrasound measurement: Snijders RJ, Nicolaides KH. Fetal biometry at 14-40 weeks gestation. Ultrasound Obstet Gynecol. 1994 (17). (We refer to this as the Bio reference in the paper.) Supplement Tables S5, S6, and S7.
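One plausible way to turn a measurement into a GA using such a reference table is linear interpolation between tabulated rows. The sketch below assumes that approach; the text does not state how the charts were applied. The table values are placeholders for illustration and are not taken from Garel, Hadlock et al., or Snijders and Nicolaides.

```python
from bisect import bisect_left

# Placeholder (measurement_mm, GA_weeks) pairs, NOT real reference data.
PLACEHOLDER_CHART = [(50.0, 20.0), (60.0, 24.0), (70.0, 28.0), (80.0, 32.0)]

def ga_from_chart(value_mm, chart):
    """Linearly interpolate GA (weeks) from a chart sorted by measurement.

    Values outside the tabulated range are clamped to the nearest row.
    """
    xs = [m for m, _ in chart]
    if value_mm <= xs[0]:
        return chart[0][1]
    if value_mm >= xs[-1]:
        return chart[-1][1]
    i = bisect_left(xs, value_mm)
    (x0, y0), (x1, y1) = chart[i - 1], chart[i]
    return y0 + (y1 - y0) * (value_mm - x0) / (x1 - x0)
```

With the placeholder chart, a measurement of 65.0 mm falls halfway between the 60 mm and 70 mm rows. Applying the same function to the manual and the AI measurement gives the two GA estimates that are then compared.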
2.5. Statistical Analysis:
Statistical analysis was conducted using IBM SPSS Statistics software. Paired t-tests were employed to analyze the differences between the predicted age by the AI model and the manual measurements. A significance level of p < 0.05 was used to determine statistical significance. Additionally, Pearson's correlation coefficient was calculated to assess the correlation between the predicted age by the AI model and the manual measurements.
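The study ran these tests in SPSS. For readers who want to see what the two statistics compute, a dependency-free sketch follows; the function names are illustrative. It returns only the paired t statistic and its degrees of freedom. The p-value would come from the t distribution (for example via `scipy.stats.ttest_rel`).

```python
import math

def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b.

    Raises ZeroDivisionError if all paired differences are identical.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)  # sample variance
    return mean_d / math.sqrt(var_d / n), n - 1

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally sized samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Note that the two statistics answer different questions: the paired t-test looks for a systematic offset between AI and manual predictions, while Pearson's r measures how linearly they move together and stays high even when a constant bias is present.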
2.6. Ethical Considerations:
This study received approval from our institutional IRB, and a waiver of informed consent was granted, given the study's retrospective nature. All patient data were anonymized and handled with utmost confidentiality throughout the entire duration of the study.
3. Result:
The provided data includes comparisons between GA measurements obtained from different methods (manual and AI) and different biometric parameters (BPD, FOD, BPDC, HC). All the outputs of our measurement are in Table 8:
Table 8.
Name of Outputs of our Measurements.
Variable Name | Description
FOD_manual | Manual measurement of FOD by the radiologist
FOD_GA_manual | GA predicted according to FOD by the radiologist
BPD_manual | Manual measurement of BPD by the radiologist
BPD_GA_manual | GA predicted according to BPD by the radiologist
FOD_AI | Automatic measurement of FOD by AI
FOD_err | Difference between FOD measurement (AI versus Manual)
FOD_GA_AI | Automatic GA predicted according to FOD by AI
FOD_GA_err | Difference between GA predicted according to FOD (AI versus Manual)
BPD_AI | Automatic measurement of BPD by AI
BPD_err | Difference between BPD measurement (AI versus Manual)
BPD_GA_AI | Automatic GA predicted according to BPD by AI
BPD_GA_err | Difference between GA predicted according to BPD (AI versus Manual)
We compared the manual measurements performed by a radiologist with the measurements obtained through an AI model. The results are presented in three main parts:
Part 1: Results with biometric measurements (BPD, FOD, HC).
Part 2: Comparison between references.
Part 3: Comparison of manual versus AI measurements.
Part 1: Result with Biometric Measurement (BPD, FOD, HC)
Part 2 of Results: Comparison of Predictions Among References
In this section, we compared the measurements obtained from the AI model and the manual method for three variables: Biparietal Diameter (BPD), Fronto-occipital Diameter (FOD), and Head Circumference (HC). The objective was to determine which reference yielded stronger correlations between the measurements.
2.1. Comparison of References for BPD
The following figure and tables present the correlation of the gestational age (GA) predictions based on Biparietal Diameter (BPD) measurements in three different references, along with the correlation with the Picture Archiving and Communication System (PACS).
Figure 2.
Correlation of BPD in three references.
Table 9.
Difference in GA prediction according to BPD measurement.
Comparison | Difference
GA_PACS vs GA_BPD_garel_manual | 1.92 weeks
GA_PACS vs GA_BPD_freq_manual | 1.90 weeks
GA_PACS vs GA_BPD_bio_manual | 1.41 weeks
GA_PACS vs GA_BPD_garel_AI | 2.17 weeks
GA_PACS vs GA_BPD_freq_AI | 2.17 weeks
GA_PACS vs GA_BPD_bio_AI | 1.24 weeks
Table 10.
Pearson correlation coefficients between measurements.
 | GA_PACS | GA_BPD_garel_manually | GA_BPD_garel_AI | GA_BPD_freq_manually | GA_BPD_freq_AI | GA_BPD_bio_manually | GA_BPD_bio_AI
GA_PACS | 1.000000 | 0.972165 | 0.970168 | 0.972574 | 0.970744 | 0.976040 | 0.973907
GA_BPD_garel_manually | | 1.000000 | 0.997013 | 0.999944 | 0.997077 | 0.999306 | 0.996566
GA_BPD_garel_AI | | | 1.000000 | 0.996924 | 0.999954 | 0.996011 | 0.999359
GA_BPD_freq_manually | | | | 1.000000 | 0.997013 | 0.999382 | 0.996568
GA_BPD_freq_AI | | | | | 1.000000 | 0.996212 | 0.999481
GA_BPD_bio_manually | | | | | | 1.000000 | 0.996819
GA_BPD_bio_AI | | | | | | | 1.000000
2.2. Comparison of References for FOD
The figure and tables presented below illustrate the correlation of GA predictions based on Fronto-occipital Diameter (FOD) measurements using three different references and the correlation with the GA recorded in the Picture Archiving and Communication System (PACS).
Figure 3.
Correlation of FOD in three references.
Table 11.
Difference in GA prediction according to FOD measurement.
Comparison | Difference
GA_PACS vs GA_FOD_garel_manual | 1.89 weeks
GA_PACS vs GA_FOD_bio_manual | 2.26 weeks
GA_PACS vs GA_FOD_garel_AI | 1.77 weeks
GA_PACS vs GA_FOD_bio_AI | 2.26 weeks
Table 12.
Pearson correlation coefficients between measurements.
 | GA_PACS | GA_FOD_garel_manually | GA_FOD_garel_AI | GA_FOD_bio_manually | GA_FOD_bio_AI
GA_PACS | 1.000000 | 0.973610 | 0.970596 | 0.977314 | 0.975004
GA_FOD_garel_manually | | 1.000000 | 0.989438 | 0.998531 | 0.989284
GA_FOD_garel_AI | | | 1.000000 | 0.988759 | 0.998707
GA_FOD_bio_manually | | | | 1.000000 | 0.991186
GA_FOD_bio_AI | | | | | 1.000000
2.3. Comparison of References for HC and BPDC
The figure and tables below show the correlation of the GA predicted according to HC and BPDC in three references, as well as with PACS.
Figure 4.
Correlation of HC and BPDC in three references.
Table 13.
Difference in GA prediction according to HC and BPDC measurement.
Comparison | Difference
GA_PACS vs GA_BPDC_freq_manual | 1.30 weeks
GA_PACS vs GA_HC_freq_manual | 1.40 weeks
GA_PACS vs GA_HC_bio_manual | 1.74 weeks
GA_PACS vs GA_BPDC_freq_AI | 1.24 weeks
GA_PACS vs GA_HC_freq_AI | 1.05 weeks
GA_PACS vs GA_HC_bio_AI | 1.26 weeks
Table 14.
Pearson correlation coefficients between measurements.
 | GA_PACS | GA_BPDC_freq_manually | GA_BPDC_freq_AI | GA_HC_freq_manually | GA_HC_freq_AI | GA_HC_bio_manually | GA_HC_bio_AI
GA_PACS | 1.000000 | 0.980418 | 0.978139 | 0.981462 | 0.979677 | 0.981587 | 0.980043
GA_BPDC_freq_manually | | 1.000000 | 0.996551 | 0.999851 | 0.996652 | 0.999732 | 0.996508
GA_BPDC_freq_AI | | | 1.000000 | 0.996446 | 0.999011 | 0.996315 | 0.998772
GA_HC_freq_manually | | | | 1.000000 | 0.997054 | 0.999946 | 0.996994
GA_HC_freq_AI | | | | | 1.000000 | 0.997040 | 0.999942
GA_HC_bio_manually | | | | | | 1.000000 | 0.997043
GA_HC_bio_AI | | | | | | | 1.000000
Part 3 of results: Compare measurements by manual versus AI
In this part, we compared the indexes (BPD, FOD, and HC) as measured manually (by the radiologist) versus by the AI model.
We used statistical measures, including Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), to assess the accuracy of our predictions. Lower scores in these measures indicate higher prediction accuracy. Additionally, we employed the Pearson correlation coefficient (r) to evaluate the linear correlation between the AI and manual measurements. Values of the correlation coefficient close to 1 signify a strong positive correlation, indicating that the AI predictions align well with the manual measurements.
Figure 5.
Correlation of AI versus manually measured BPD: MAE: 1.6442; RMSE: 1.9790; Pearson correlation coefficient (r): 0.9963.
Figure 6.
Correlation of AI versus manually measured FOD: MAE: 1.5481; RMSE: 2.2378; Pearson correlation coefficient (r): 0.9932.
Figure 7.
Correlation of AI versus manually measured BPDC: MAE: 1.1759; RMSE: 1.4460; Pearson correlation coefficient (r): 0.9970.
Figure 8.
Correlation of AI versus manually measured HC: MAE: 7.2755; RMSE: 8.1365; Pearson correlation coefficient (r): 0.9973.
The MAE (mean absolute error) is 7.2755 and represents the mean absolute difference between the actual (HC_sorted_m) and predicted (HC_a) values. It shows that, on average, the predicted value differs from the actual value by 7.2755 mm.
The RMSE (Root Mean Squared Error) is 8.1365; it is the square root of the average squared difference between the actual and predicted values. It measures the typical difference between actual and predicted values in the same units as the data (in this case, millimeters).
The Pearson correlation coefficient (r) is 0.9973, indicating a strong positive linear relationship between HC_sorted_m and HC_a. This means that when the value of HC_sorted_m increases, HC_a also tends to increase, and vice versa. The closer the r value is to 1, the stronger the correlation between the two variables. In this case, the strong correlation indicates that the predicted value (HC_a) closely tracks the actual value (HC_sorted_m).
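The two error measures described above can be written out in a few lines. This is a generic sketch with illustrative function names, where `actual` stands for the manual values and `predicted` for the AI values.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all pairs."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference over all pairs."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )
```

Because the differences are squared before averaging, the RMSE is never smaller than the MAE and is pulled upward by a few large errors. Both are in the units of the measurement, so values for HC are not directly comparable with values for BPD or FOD without considering the size of the quantity being measured.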
4. Discussion:
In this study, we aimed to evaluate the performance of an AI model in predicting the gestational age (GA) of fetuses using three variables measured in fetal brain MRI: Biparietal Diameter (BPD), Fronto-occipital Diameter (FOD), and Head Circumference (HC). The study used a dataset of 52 normal fetal MRI cases from Rush University, which included T2 HASTE sequences. BPD, FOD, and HC measurements were obtained both manually by a radiologist and automatically by an AI pipeline that used a Dynamic UNet model to extract the fetal brain and then measured the parameters.
The differences between manual and AI GA measurements vary depending on the specific biometric parameter used. However, the different measurements have a high correlation, indicating a consistent relationship between the methods. The analysis also suggests that AI-based measurements of HC show a stronger correlation with the actual values compared to BPD, FOD, and BPDC. We discuss the results in three sections:
Discussion for Part 1 of Result with Biometric Measurement (BPD, FOD, HC)
The analysis of the data provides insights into the accuracy of predicting gestational age (GA) using different biometric measurements, namely Biparietal Diameter (BPD), Fronto-occipital Diameter (FOD), and Head Circumference (HC). The key findings are discussed below:
BPD:
The BPD is one of the most measured parameters in the fetus. Campbell was the first investigator to link fetal BPD to gestational age (18); however, since this original report, numerous publications on this subject have appeared in the literature (19-21).
The BPD may be rapidly and reproducibly measured by ultrasound examination from 12 weeks' gestation until the end of pregnancy. The BPD is imaged in the transaxial plane of the fetal head at a level depicting the thalamus in the midline, equidistant from the temporoparietal bones, and usually the cavum septum pellucidum anteriorly (22). Although several methods have been used to measure BPD, the most accepted method is measurement from leading edge to leading edge (outer-to-inner).
According to similar studies, the accuracy of estimating gestational age using BPD depends on the stage of pregnancy (23). Between 12 and 26 weeks of gestation, the BPD measurement can provide an estimation accurate to within ±10 to 11 days. As the pregnancy progresses beyond 26 weeks, the accuracy of BPD measurement decreases, and it can have an error of up to ±3 weeks near term (24).
Several factors can affect the accuracy of BPD measurements. Biological factors such as differences in maternal age, parity (number of previous pregnancies), pregnancy weight, geographic location, and specific population characteristics can contribute to variations in BPD measurements. Technical factors like measurement techniques, interobserver error, and the use of single or multiple measurements can also influence the accuracy of BPD in estimating gestational age (20, 25).
Although most dating curves show a general relationship between BPD and gestational age, there can be significant differences in estimating gestational age based on a particular BPD measurement. Additionally, the accuracy of BPD measurements is highest when the shape of the fetal head is appropriately ovoid. If the head shape is unusually rounded or elongated, BPD measurements may overestimate or underestimate gestational age, respectively.
BPD results in our study: The differences between manual and AI measurements of BPD were relatively small across the different references, indicating good agreement. According to the Garel reference, the difference in GA predictions was 0.66 weeks, demonstrating a close alignment between the two methods. Similar differences were observed when considering the Freq and Bio references. However, larger differences were observed when comparing BPD-based predictions with the GA in the PACS, ranging from 1.24 to 2.17 weeks. These differences varied depending on the specific reference used, emphasizing the influence of reference selection on the accuracy of GA predictions.
FOD:
To assess the appropriateness of head shape, the BPD can be compared with the FOD (26), and the ratio of these diameters is called the cephalic index (CI). The normal range for CI is between 0.70 and 0.86 (±2 standard deviations). In cases where the fetus has an abnormal cephalic index (which is rare, noted in less than 2% of fetuses before 26 weeks' gestation), gestational age estimates may be more accurately determined using other fetal parameters such as head circumference.
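The cephalic index check described above amounts to a ratio and a range test. A minimal sketch, using the 0.70 to 0.86 range quoted in the text; the function names are illustrative.

```python
def cephalic_index(bpd_mm, fod_mm):
    """Ratio of BPD to FOD, both in the same units."""
    if fod_mm <= 0:
        raise ValueError("FOD must be positive")
    return bpd_mm / fod_mm

def head_shape_is_typical(bpd_mm, fod_mm, low=0.70, high=0.86):
    """True when the cephalic index lies within the quoted normal range."""
    return low <= cephalic_index(bpd_mm, fod_mm) <= high
```

For instance, a BPD of 80 mm with an FOD of 100 mm gives a ratio of 0.8, which is inside the range, whereas a BPD of 95 mm with the same FOD gives 0.95 and would suggest relying on HC instead.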
Similar to the BPD measurements, the differences between manual and AI measurements of FOD were also relatively small. According to the Garel references, the difference in GA predictions was 0.59 weeks, indicating a strong agreement between the two measurement approaches. The difference was slightly smaller when considering the Bio reference. However, larger differences were observed when comparing FOD measurements with GA in the PACS, ranging from 1.77 to 2.26 weeks. As with BPD, these differences varied based on the specific reference used.
HC:
Head circumference (HC) measurement can be utilized to estimate gestational age similarly to the biparietal diameter (BPD) measurement. While tracing the outer perimeter of the head using a trackball on ultrasound equipment or a digitizer is the most reliable method for measuring HC, there is also a formula that involves using the BPD and fronto-occipital diameters to calculate HC, with a maximum error of 6% (27).
The accuracy of estimating gestational age through HC measurement is comparable to that of BPD measurement (28). However, in cases where the fetus has an abnormal head shape, such as brachycephaly or dolichocephaly, HC may be a more precise indicator of fetal age compared to BPD (27).
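The text mentions a formula that derives HC from the BPD and the fronto-occipital diameter without stating it. One simple possibility is the ellipse-perimeter approximation sketched below. The coefficient actually used by the authors is not given, so the constant here (pi/2) is an assumption, and published formulas may use a slightly different factor depending on where the calipers are placed.

```python
import math

def hc_from_diameters(bpd_mm, fod_mm):
    """Approximate head circumference as the perimeter of an ellipse
    whose two axes are BPD and FOD: HC ~ pi * (BPD + FOD) / 2."""
    if bpd_mm <= 0 or fod_mm <= 0:
        raise ValueError("diameters must be positive")
    return math.pi * (bpd_mm + fod_mm) / 2.0
```

This approximation is exact for a circle and becomes less accurate as the head outline departs from an ellipse, which is consistent with the text's remark that formula-based HC carries some error compared with tracing the perimeter.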
HC in our study: The differences between manual and AI measurements of HC were comparable to those observed for BPD and FOD. According to the Freq reference, the difference in GA predictions was 0.75 weeks, indicating a reasonably close alignment between the two measurement methods. The difference was slightly larger when considering the Bio reference. When comparing HC measurements with GA in the PACS, differences ranging from 1.05 to 1.74 weeks were observed. As with BPD and FOD, these differences varied based on the specific reference used.
The findings suggest that the AI model demonstrates good agreement with manual measurements across all three biometric measurements (BPD, FOD, HC). The differences observed in GA predictions between the manual and AI measurements highlight the importance of reference selection when interpreting the accuracy of the predictions. These results emphasize the need for further validation and testing of the AI model in diverse clinical scenarios to ensure its reliable performance for estimating gestational age using biometric measurements derived from fetal brain MRI.
The use of multiple fetal growth parameters, including BPD, HC, AC, and FL measurements, can improve the accuracy of gestational age assessment (24). To enhance the precision of determining gestational age, Hadlock and colleagues employed a combination of multiple measurements (29, 30). When multiple parameters predict the same end point, combining their mean gestational ages increases the probability of correctly predicting that end point. This approach enhances accuracy compared to relying on a single parameter alone. However, if the estimates from different parameters are significantly different, averaging multiple parameters may decrease the accuracy of the best predictor or predictors. It is important to avoid averaging fetal growth parameters in certain conditions, such as fetal macrosomia, intrauterine growth retardation, and congenital anomalies, as it may not provide accurate results.
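The averaging idea, together with the caution about discordant estimates, can be sketched as follows. The function name and the 2-week spread threshold are hypothetical choices for illustration; the text does not define what counts as "significantly different".

```python
def combined_ga(estimates_weeks, max_spread_weeks=2.0):
    """Average several GA estimates and flag whether they are concordant.

    estimates_weeks maps a parameter name (e.g. "BPD") to its GA estimate.
    Returns (mean_ga, spread, concordant), where concordant is False when the
    estimates differ by more than max_spread_weeks (a hypothetical threshold).
    """
    values = list(estimates_weeks.values())
    if not values:
        raise ValueError("at least one estimate is required")
    spread = max(values) - min(values)
    mean_ga = sum(values) / len(values)
    return mean_ga, spread, spread <= max_spread_weeks
```

When the flag is False, the mean is still returned but should not be trusted blindly; in that situation the text advises against averaging and in favor of the best single predictor.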
Discussion for Part 2 of Result: Compare predictions between references:
The provided data includes correlation coefficients between different references and measurements for predicting gestational age using biometric parameters. The interpretation of the correlation analysis is as follows:
The Pearson correlation coefficients indicate a strong positive correlation between the AI system and manual measurements, suggesting the AI system's accuracy in predicting the indices. The coefficients are close to 1 for all three indices, indicating a high level of agreement between the AI and manual measurements.
However, the root mean square error (RMSE) and mean absolute error (MAE) values are higher for HC compared to BPD and FOD, indicating a higher error rate in the predictions of HC. This suggests that the AI's predictions for HC may be less reliable or more variable than its predictions for BPD and FOD. Considering the clinical significance of these measurements, the higher error rates for HC should be carefully evaluated and considered.
Nevertheless, the overall high correlation coefficients across all measurements suggest that the AI system's predictions align well with the manual measurements. This indicates that the AI system could be a valuable tool to assist with these measurements. However, further validation and testing in a broader range of clinical scenarios are necessary to ensure its performance in various settings.
Compared with the other parameters, the MAE and RMSE between the manual and AI measurements of HC are relatively large. This discrepancy arises because the manual HC is derived from equations involving BPD and FOD, whereas the AI directly extracts HC from the original MRI image by accumulating each small polyline segment of the contour. The AI-based measurement of HC aligns better with the actual fetal values and exhibits a stronger Pearson correlation coefficient compared to BPD, FOD, and corrected BPD (BPDC). Thus, it can be considered the preferred method for measuring HC.
Discussion for Part 3 of Results: Comparison of Manual versus AI Measurements:
The results comprise the mean absolute error (MAE), root mean squared error (RMSE), and Pearson correlation coefficient (r) between the manual and AI measurements of each parameter. They can be interpreted as follows:
1. BPD:
- The MAE is 1.6442, the average absolute difference between the manually measured and AI-predicted BPD values.
- The RMSE is 1.9790, the square root of the average squared difference between the manual and AI BPD values. It indicates the typical size of the difference and penalizes large differences more heavily than the MAE.
- The Pearson correlation coefficient (r) is 0.9963, indicating a strong positive linear relationship between the manually measured BPD and the AI-predicted BPD. A value close to 1 indicates a strong correlation.
2. FOD:
- The MAE is 1.5481, the average absolute difference between the manually measured and AI-predicted FOD values.
- The RMSE is 2.2378, the square root of the average squared difference between the manual and AI FOD values.
- The Pearson correlation coefficient (r) is 0.9932, indicating a strong positive linear relationship between the manually measured FOD and the AI-predicted FOD.
3. BPDC:
- The MAE is 1.1759, the average absolute difference between the manually measured and AI-predicted BPDC values.
- The RMSE is 1.4460, the square root of the average squared difference between the manual and AI BPDC values.
- The Pearson correlation coefficient (r) is 0.9970, indicating a strong positive linear relationship between the manually measured BPDC and the AI-predicted BPDC.
4. HC:
- The MAE is 7.2755, the average absolute difference between the manually measured and AI-predicted HC values.
- The RMSE is 8.1365, the square root of the average squared difference between the manual and AI HC values.
- The Pearson correlation coefficient (r) is 0.9973, indicating a strong positive linear relationship between the manually measured HC and the AI-predicted HC.
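The three agreement statistics listed above can be computed as in the following minimal sketch. It uses only the Python standard library; the function name and the sample values are hypothetical and are not data from this study.

```python
import math


def agreement_metrics(manual, ai):
    """Return (MAE, RMSE, Pearson r) between paired manual and AI measurements."""
    if len(manual) != len(ai) or len(manual) < 2:
        raise ValueError("need two equally sized samples with at least 2 values")
    n = len(manual)
    diffs = [a - m for m, a in zip(manual, ai)]

    # Mean absolute error: average magnitude of the paired differences.
    mae = sum(abs(d) for d in diffs) / n
    # Root mean squared error: penalizes large differences more than MAE.
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    # Pearson r: covariance divided by the product of the standard deviations.
    mean_m = sum(manual) / n
    mean_a = sum(ai) / n
    cov = sum((m - mean_m) * (a - mean_a) for m, a in zip(manual, ai))
    var_m = sum((m - mean_m) ** 2 for m in manual)
    var_a = sum((a - mean_a) ** 2 for a in ai)
    r = cov / math.sqrt(var_m * var_a)
    return mae, rmse, r


if __name__ == "__main__":
    # Hypothetical paired measurements, for illustration only.
    manual_values = [50.0, 60.0, 70.0, 80.0]
    ai_values = [51.0, 59.0, 72.0, 80.0]
    mae, rmse, r = agreement_metrics(manual_values, ai_values)
    print(f"MAE={mae:.4f}  RMSE={rmse:.4f}  r={r:.4f}")
```

Because Pearson r is unaffected by a constant offset or a rescaling of one method relative to the other, a parameter can show the largest MAE and RMSE and still have the highest r, as HC does here.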
The findings indicate that Head Circumference (HC) has larger Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) values compared to other measurements such as Biparietal Diameter (BPD), Fronto-occipital Diameter (FOD), and corrected BPD (BPDC), suggesting a higher average difference between the actual and predicted values. Despite this, the HC measurement still demonstrates a strong correlation (r=0.9973) with the AI-predicted values. These larger MAE and RMSE values in HC may be attributed to normal variations, while the AI-based HC measurement offers improved accuracy compared to manual measurements for predicting gestational age.
The study described has several advantages over similar studies:
Comprehensive evaluation: The study assesses the AI model's performance in predicting gestational age using multiple biometric measurements, providing a comprehensive analysis.
Comparison with different references: The study compares AI predictions against multiple references and assesses their agreement with the GA recorded in the Picture Archiving and Communication System (PACS).
Statistical evaluation: The study uses statistical measures like MAE, RMSE, and Pearson correlation coefficients to evaluate the accuracy and correlation of the AI model's predictions.
Inclusion of manual measurements: Manual measurements are included as a reference for comparison, allowing assessment of the agreement between AI and human experts.
Focus on AI versus manual measurements: The study compares AI and manual measurements, evaluating their accuracy and correlation for each biometric parameter.
Discussion of clinical implications: The study discusses the clinical significance of the findings, highlighting the importance of reference selection and the potential benefits of integrating AI models in prenatal care.
Overall, this study provides a comprehensive evaluation of an AI model's performance in predicting gestational age using fetal brain MRI measurements, offering valuable insights for researchers and practitioners in prenatal care.
In a similar study, Shi et al. (31) evaluated how accurately various MRI-derived biometric measurements determine the GA of fetuses in the second half of gestation. The study used MRI scans of 637 fetuses and evaluated 9 standard fetal biometric parameters. Regression models were constructed to predict GA from these measurements, and a polynomial regression model was found to be the best descriptor. The authors concluded that MRI biometry offers a potential model for estimating fetal GA in the second half of gestation. Both our study and that of Shi et al. contribute to the understanding of MRI-based biometric measurements for estimating GA. Our study additionally provides insights into the potential of AI models to enhance accuracy and efficiency in prenatal care.
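As an illustration of the kind of polynomial regression described above, the following minimal sketch fits a second-degree polynomial that maps one biometric measurement to GA. It is not the model of Shi et al. or of this study: the degree, the function names, and the synthetic data (generated from an arbitrary quadratic) are all assumptions made for the example. The fit solves the least-squares normal equations in pure Python so the block has no external dependencies.

```python
def fit_polynomial(xs, ys, degree=2):
    """Least-squares polynomial fit; returns coefficients [c0, c1, ..., c_degree]."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]

    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - known) / A[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


if __name__ == "__main__":
    # SYNTHETIC data: BPD in cm and GA in weeks from an arbitrary quadratic.
    bpd_cm = [5.0, 6.0, 7.0, 8.0, 9.0]
    ga_weeks = [5.0 + 2.0 * x + 0.15 * x * x for x in bpd_cm]
    coeffs = fit_polynomial(bpd_cm, ga_weeks, degree=2)
    print("fitted coefficients:", [round(c, 4) for c in coeffs])
    print("predicted GA at BPD 7.5 cm:", round(predict(coeffs, 7.5), 2))
```

Expressing the measurement in centimetres rather than millimetres keeps the powers of x small, which helps the numerical conditioning of the normal equations. With real data, a library routine based on a more stable decomposition would usually be preferable.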
In another study, Burgos-Artizzu et al. (32) assessed the performance of an AI method for estimating GA in second- and third-trimester fetuses by analyzing fetal brain morphology on standard cranial ultrasound sections. The AI method was compared with existing formulas based on standard fetal biometry. The study used routine fetal ultrasound scans and analyzed transthalamic axial plane images from 1394 patients. The AI method, either alone or in combination with fetal biometric parameters, showed a 95% confidence interval error of 14.2 days and 11.0 days, respectively, compared with 12.9 days for the best method using standard biometrics alone. In the third trimester, the AI method combined with biometric parameters had a lower error of 14.3 days compared with 17 days for fetal biometrics, while in the second trimester the errors were 6.7 and 7 days, respectively. The AI method performed particularly well in estimating GA for small-for-gestational-age fetuses. Taken together with our results, these findings demonstrate the effectiveness of AI in estimating GA across imaging modalities. While Burgos-Artizzu et al. focused on ultrasound-based analysis of fetal brain morphology, our study uses fetal brain MRI and biometric measurements. Both approaches show promise in improving the accuracy of GA estimation and have the potential to enhance prenatal care practices.
In a similar study, Kojita et al. (11) evaluated the performance of a deep learning model for predicting gestational age using fetal brain MRI acquired after the first trimester. The model was compared with the traditional method of estimating gestational age from the BPD measurement. A total of 184 T2-weighted MRI scans from fetuses were included in the study. The deep learning model was trained on a subset of cases and validated on another subset, while the remaining cases were used as test data. The model's prediction of gestational age showed a substantial correlation with the reference standard (ρc = 0.964), outperforming the BPD prediction (ρc = 0.920). Both the model and BPD predictions had larger differences from the reference standard as gestational age increased. However, the upper limit of the model's prediction was significantly shorter than that of BPD. Like our study, it concludes that deep learning can accurately predict gestational age from fetal brain MRI acquired after the first trimester, providing potential benefits for prenatal care in cases where early ultrasound measurements are lacking.
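The ρc values quoted above differ from the Pearson r used in our analysis. Assuming ρc denotes Lin's concordance correlation coefficient, it can be sketched as follows. The function name and sample values are hypothetical and are not data from either study.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between paired samples.

    rho_c = 2 * s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2),
    using population (1/n) variances and covariance.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally sized samples with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


if __name__ == "__main__":
    # Hypothetical reference and predicted GA in weeks, for illustration only.
    reference_ga = [20.0, 25.0, 30.0, 35.0]
    predicted_ga = [22.0, 27.0, 32.0, 37.0]  # constant 2-week overestimate
    print(round(concordance_correlation(reference_ga, predicted_ga), 4))
```

Unlike Pearson r, which would equal 1 for the constant 2-week overestimate in this example, ρc falls below 1 whenever predictions are systematically shifted or scaled relative to the reference, so it measures agreement and not only linear association.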
5. Conclusion:
In this study, we aimed to evaluate the performance of an AI model in predicting the gestational age (GA) of fetuses using biometric measurements obtained from fetal brain MRI, specifically Biparietal Diameter (BPD), Fronto-occipital Diameter (FOD), and Head Circumference (HC). In addition to manual measurements, we developed an AI model based on the Dynamic UNet architecture to extract the fetal brain and calculate these variables automatically.
Our dataset included 52 normal fetal MRI cases with T2 HASTE sequences from Rush University. The results demonstrate the AI model's high accuracy and potential in predicting GA. The AI-based BPD, FOD, and HC measurements showed strong correlations with manual measurements. Notably, the AI-based HC measurement exhibited a stronger correlation with actual values than BPD, FOD, and corrected BPD (BPDC), suggesting that it is a reliable method for predicting gestational age.
Comparisons between manual and AI measurements revealed small differences in BPD and FOD across different references. However, when comparing measurements with GA in the Picture Archiving and Communication System (PACS), differences varied based on the reference used, highlighting the importance of reference selection.
BPD measurements are commonly used to estimate gestational age during pregnancy. However, their accuracy depends on factors such as gestational age, biological and technical variations, and the shape of the fetal head. Other fetal parameters may be used in cases where the head shape is abnormal to improve the accuracy of gestational age estimation.
Integrating AI models in prenatal care offers several advantages, including improved accuracy, automation, and time efficiency. The AI-based measurements demonstrated consistent correlations with manual measurements, supporting their reliability in assessing fetal development and monitoring pregnancies. The developed AI model provides an accurate and efficient prediction of gestational age, which can aid in clinical management, evaluation of fetal growth, and timely interventions. By reducing human error and variability associated with manual measurements, the AI model has the potential to enhance the precision and effectiveness of prenatal care.
In summary, this study underscores the potential of AI models in accurately predicting gestational age using biometric measurements derived from fetal brain MRI. The strong correlations between manual and AI measurements validate the accuracy of the AI model, particularly in HC predictions. The findings support the integration of AI as a valuable tool in prenatal care, empowering clinicians with automated and reliable GA prediction and contributing to improved decision-making and patient care.
Author Contributions
Conceptualization, S.B.; Methodology, F.V., H.A.A., K.K.M., X.L., M.K. and S.B.; Software, F.V., M.S., X.L., S.A. (Shehbaz Ansari), M.A. and J.O.A.; Validation, K.K.M. and M.A.; Formal analysis, X.L.; Investigation, F.V., K.K.M. and M.K.; Data curation, M.S., K.K.M. and S.A. (Shehbaz Ansari); Writing – original draft, F.V. and X.L.; Writing – review & editing, F.V., H.A.A., M.S. and S.A. (Seth Adler); Supervision, H.A.A., M.K. and S.B.; Funding acquisition, S.B. All authors have read and agreed to the published version of the manuscript.
Funding
This project was supported by the Colonel Robert R. McCormick Professorship of Diagnostic Imaging fund at Rush University Medical Center.
Institutional Review Board Statement
The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board of Rush University Medical Center; ORA Number: 18111906-IRB01-AM04; Date of approval: 4/15/2019
Informed Consent Statement
Patient consent was waived because of the retrospective nature of the study and the use of anonymized imaging data.
Data Availability Statement
The source code supporting the reported results is available in the Rush University Medical Center GitHub repository marksupanich/RUSHRadiologyResearch (Repository for RUSH Radiology Research).
Conflicts of Interest
All authors declare that they do not have any conflict of interest.
References
- Levine D. Ultrasound versus Magnetic Resonance Imaging in Fetal Evaluation. Topics in Magnetic Resonance Imaging. 2001;12(1):25-38.
- Reddy UM, Abuhamad AZ, Levine D, Saade GR. Fetal imaging: executive summary of a joint Eunice Kennedy Shriver National Institute of Child Health and Human Development, Society for Maternal-Fetal Medicine, American Institute of Ultrasound in Medicine, American College of Obstetricians and Gynecologists, American College of Radiology, Society for Pediatric Radiology, and Society of Radiologists in Ultrasound Fetal Imaging Workshop. Journal of ultrasound in medicine : official journal of the American Institute of Ultrasound in Medicine. 2014;33(5):745-57.
- Lahti M, Eriksson JG, Heinonen K, Kajantie E, Lahti J, Wahlbeck K, et al. Late preterm birth, post-term birth, and abnormal fetal growth as risk factors for severe mental disorders from early to late adulthood. Psychological medicine. 2015;45(5):985-99.
- Palka C, Guanciali-Franchi P, Morizio E, Alfonsi M, Papponetti M, Sabbatinelli G, et al. Non-invasive prenatal screening: A 20-year experience in Italy. European Journal of Obstetrics & Gynecology and Reproductive Biology: X. 2019;3:100050.
- Whitworth M, Bricker L, Neilson JP, Dowswell T. Ultrasound for fetal assessment in early pregnancy. The Cochrane database of systematic reviews. 2010(4):CD007058.
- Committee Opinion No. 700 Summary: Methods for Estimating the Due Date. Obstetrics & Gynecology. 2017;129(5):967-8.
- Gagoski B, Xu J, Wighton P, Tisdall MD, Frost R, Lo WC, et al. Automated detection and reacquisition of motion-degraded images in fetal HASTE imaging at 3 T. Magnetic resonance in medicine. 2022;87(4):1914-22.
- Li H, Yan G, Luo W, Liu T, Wang Y, Liu R, et al. Mapping fetal brain development based on automated segmentation and 4D brain atlasing. Brain Structure and Function. 2021;226(6):1961-72.
- Kojita Y, Matsuo H, Kanda T, Nishio M, Sofue K, Nogami M, et al. Deep learning model for predicting gestational age after the first trimester using fetal MRI. European Radiology. 2021;31:3775-82.
- Ison M, Weigl E, Donner R, Kasprian G, Prayer D, Langs G. Fully Automated Brain Extraction and Orientation in Raw Fetal MRI. 2012.
- Shen L, Zheng J, Lee EH, Shpanskaya K, McKenna ES, Atluri MG, et al. Attention-guided deep learning for gestational age prediction using fetal brain MRI. Scientific reports. 2022;12(1):1408.
- Makropoulos A, Counsell SJ, Rueckert D. A review on automatic fetal and neonatal brain MRI segmentation. NeuroImage. 2018;170:231-48.
- Vahedifard F, Ai HA, Supanich MP, Marathu KK, Liu X, Kocak M, et al. Automatic Ventriculomegaly Detection in Fetal Brain MRI: A Step-by-Step Deep Learning Model for Novel 2D-3D Linear Measurements. Diagnostics. 2023;13(14):2355.
- Vahedifard F, Adepoju JO, Supanich M, Ai HA, Liu X, Kocak M, et al. Review of deep learning and artificial intelligence models in fetal brain magnetic resonance imaging. World journal of clinical cases. 2023;11(16):3725-35.
- Garel C. Fetal Brain Normal Development and Cerebral Pathologies.
- Hadlock FP, Deter RL, Harrist RB, Park SK. Fetal biparietal diameter: a critical re-evaluation of the relation to menstrual age by means of real-time ultrasound. Journal of ultrasound in medicine : official journal of the American Institute of Ultrasound in Medicine. 1982;1(3):97-104.
- Snijders RJ, Nicolaides KH. Fetal biometry at 14-40 weeks' gestation. Ultrasound in obstetrics & gynecology : the official journal of the International Society of Ultrasound in Obstetrics and Gynecology. 1994;4(1):34-48.
- Campbell S. The prediction of fetal maturity by ultrasonic measurement of the biparietal diameter. The Journal of obstetrics and gynaecology of the British Commonwealth. 1969;76(7):603-9.
- Sabbagha RE, Hughey M. Standardization of sonar cephalometry and gestational age. Obstetrics and gynecology. 1978;52(4):402-6.
- Davison JM, Lind T, Farr V, Whittingham TA. The limitations of ultrasonic fetal cephalometry. The Journal of obstetrics and gynaecology of the British Commonwealth. 1973;80(9):769-75.
- Kurtz AB, Wapner RJ, Kurtz RJ, Dershaw DD, Rubin CS, Cole-Beuglet C, et al. Analysis of biparietal diameter as an accurate indicator of gestational age. Journal of clinical ultrasound : JCU. 1980;8(4):319-26.
- Shepard M, Filly RA. A standardized plane for biparietal diameter measurement. Journal of ultrasound in medicine : official journal of the American Institute of Ultrasound in Medicine. 1982;1(4):145-50.
- Smazal SF, Jr., Weisman LE, Hopper KD, Ghaed N, Shirts S. Comparative analysis of ultrasonographic methods of gestational age assessment. Journal of ultrasound in medicine : official journal of the American Institute of Ultrasound in Medicine. 1983;2(4):147-50.
- MacGregor S, Sabbagha R. Glob. libr. women's med. (ISSN: 1756-2228). 2008.
- Lunt RM, Chard T. Reproducibility of measurement of fetal biparietal diameter by ultrasonic cephalometry. The Journal of obstetrics and gynaecology of the British Commonwealth. 1974;81(9):682-5.
- Hadlock FP, Deter RL, Carpenter RJ, Park SK. Estimating fetal age: effect of head shape on BPD. AJR American journal of roentgenology. 1981;137(1):83-5.
- Shields JR, Medearis AL, Bear MB. Fetal head and abdominal circumferences: ellipse calculations versus planimetry. Journal of clinical ultrasound : JCU. 1987;15(4):237-9.
- Hadlock FP, Deter RL, Harrist RB, Park SK. Fetal head circumference: relation to menstrual age. AJR American journal of roentgenology. 1982;138(4):649-53.
- Hadlock FP, Deter RL, Harrist RB, Park SK. Estimating fetal age: computer-assisted analysis of multiple fetal growth parameters. Radiology. 1984;152(2):497-501.
- Hadlock FP, Harrist RB, Shah YP, King DE, Park SK, Sharman RS. Estimating fetal age using multiple parameters: a prospective evaluation in a racially mixed population. American journal of obstetrics and gynecology. 1987;156(4):955-7.
- Shi Y, Xue Y, Chen C, Lin K, Zhou Z. Association of gestational age with MRI-based biometrics of brain development in fetuses. BMC Medical Imaging. 2020;20(1):125.
- Burgos-Artizzu XP, Coronado-Gutiérrez D, Valenzuela-Alcaraz B, Vellvé K, Eixarch E, Crispi F, et al. Analysis of maturation features in fetal brain ultrasound via artificial intelligence for the estimation of gestational age. American Journal of Obstetrics & Gynecology MFM. 2021;3(6):100462.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).