1. Introduction
Redistribution layer (RDL) is one of the most important components of the advanced packages like Chip on Wafer on Substrate (CoWoS) [
1], Foveros [
2], X-Cube [
3], System on Integrated Chips (SoIC) [
4,
5], especially the multilayer RDL interposer based CoWoS-R [
6], FO-PLP [
7] and chiplet-based architectures [
8,
9]. To enable high-bandwidth signal processing between chiplets [
10], several techniques such as dual damascene processing, semi-additive processing (SAP) and polymer damascene processing have been introduced. These techniques narrow the RDL trace width/pitch to as small as 2 microns [
11,
12] and even sub-micron [
13]. To enhance the electricity transmission capacity, traces become thicker, with an aspect ratio of up to 4.2 [
14], which is very different from traditional shell-like thin traces. Furthermore, wafer level integrations can stack up to 5 metal layers [
1], with the area up to 1200mm
2 [
15], or even as extensive as 2500mm
2 [
1]. As a result, RDLs in advanced packages become extremely complex in shape and behavior, posing great challenges in accurately capturing its thermo-mechanical characteristics. Additionally, in advanced packages as substrate is extremely thinned to several microns or even sub-micron [
16] which is similar to or even thinner than RDL, making the RDLs more prominent comparing to other package components [
17,
18]. For this reason, RDLs are playing crucial roles in package thermo-mechanical characteristics.
Integrated packages face severe thermo-mechanical risk due to the coefficient of thermal expansion (CTE) mismatch between materials, the accelerated thermal cycling test (TCT) [
19] is an important approach for assessing their reliability. As a low-cost alternative of testing, finite element analysis (FEA) simulation under TCT conditions becomes a significant research area on design for reliability (DFR). Balancing between efficiency and accuracy of simulation is a critical issue in this field.
For traditional integrations, due to the thicker substrate and larger trace width/pitch, axisymmetric FEA model can meet the efficiency and accuracy requirements for DFR. By 2D axisymmetric model, Lee et al. [
20], Che et al. [
21], Machani et al. [
22] simplified the RDLs into homogeneous rectangles and refined the outermost bump, then evaluated the fatigue life of the bump through transient simulation. By 3D axisymmetric model, Lee et al. [
23], Che et al. [
24] simplified the RDL into homogeneous films and refined the corner solder, thus assessed the reliability of stacked-chip packages. However, advanced packaging has different structural characteristics.
For advanced packages, RDL plays a more significant role in thermo-mechanical reliability. Therefore, more precise RDL modeling approaches are required for DFR of advanced packages. Dividing the RDL then employing material equivalization is a widely used strategy to improve the accuracy. In this strategy, division size and equivalization method are crucial for efficiency and accuracy. When efficiency is concerned, RDL will be divided into large size regions, and there are three typical material equivalization methods. The first method, employed by Wang et al. [
25], involves calculating the isotropic materials based on the copper ratio within each region. The second method, used by McCaslin et al. [
26], is determining the anisotropic equivalent material properties of regions consisting of simple shape traces through composite material mechanics. The third one, utilized by Valdevit et al. [
27], is the morphology-based approach, where regions are categorized into 'lines', 'vias', or 'web' types, then specific mathematical models are developed to compute the material properties for each type of region. When accuracy is concerned, RDL will be divided into small size blocks, and there are also three typical material equivalization methods. The first method, employed by Lien et al. [
28], involves calculating the isotropic equivalent material properties based on the metal percentage within each block. The second method, used by Lee et al. [
29], focuses on calculating the anisotropic material properties and the equivalent reference temperature of blocks using composite material mechanics. The third one, introduced by Gibson [
30], and Lee et al. [
29], is a simulation-based approach that calculates the anisotropic equivalent material properties of blocks containing complex traces. To achieve higher accuracy, Yaddanapudi [
31] divided the RDL into extremely small size pixels and directly defined each pixel as either metal or dielectric, a high-fidelity structural response of RDL was achieved by consuming large amount of solving resources. In general, existing work indicates that there is a trade-off between division sizes and the material equivalization method. The key to balancing efficiency and accurately lies in obtaining the anisotropic equivalent material properties quickly and accurately under larger division size.
Recently, machine learning (ML) has shown the capability of extracting interpretable models from scientific data automatically [
32], and it has been increasingly employed in the material equivalization of composite materials in 3D and 2D structures. Regarding 3D, Dai et al. [
33] used feature matrixes to digitally represent the lattices orientations of 3D structures, and then created a graph-Artificial Neural Network (ANN) to predict the structural performance. Regarding 2D, Liu et al. [
34], Ye et al. [
35] and Gong et al. [
36] used high-resolution grayscale matrix to digitalize the planar strictures, then constructed convolutional neural networks (CNNs) to quickly predict the equivalent mechanical properties. Due to the accurate and fast predictive capabilities of models created based on ML, it has been introduced into the modeling and simulation of integrated package. For example, Selvanayagam et al. [
37,
38] partitioned an 8-RDLs interposer into 3×3 regions, then established surrogate models that link the copper ratio in each layer within each region to the warpage in TCT condition, then ultimately optimized the global package warpage.
In this work, we applied ML for material equivalization and developed a RDL modeling and simulation method for DFR of advanced packages. In this method, a RDL is divided into equal-size blocks. To describe the layout within each block, we further subdivided each block into pixels and represent each pixel using the metal percentage. Consequently, each block can be represented as a tensor composed of pixel metal percentages, and the entire RDL is digitalized as the collection of such block tensor. An ANN is constructed using either fully connected neural networks (FCNN) or convolutional neural networks (CNN), to model the relationship between the block tensor and the corresponding equivalent material properties. Subsequently, the ANN is trained by a small subset of tensors with the equivalent material properties obtained through finite element analysis (FEA) as labels, then the material surrogate model is constructed. Lastly, all blocks are transformed into FEA elements and assigned the material properties predicted by the model, then completing the modeling and simulations of the RDL.
To validate the efficiency and accuracy of the proposed method, line bending simulations were conducted for a 21.6×21.6mm² RDL. The detailed fine mesh model served as the benchmark, with the reaction force being regarded as the accuracy indicator. The results show that as the reduces of substrate thickness, the reaction force error of the traditional method that neglects layout impact increases sharply, indicating that critical layout impact must be considered in the simulation of advanced package. When the proposed method is used with a substrate thickness of 50μm, the reaction force error is 2.81% the layout impact can accurately be estimated utilizing only 200×200 elements, the number of elements is only 1/15 of the traditional approach.
As an application case, a simulation was introduced to investigate the thermo-mechanical response of a 2.5D integrated CPU chip at the maximum temperature state in TCT. The results indicate that for advanced packages the maximum stress more likely to occur in the RDL, which is different from the traditional integrations where the maximum stress occurs in the bumps. It also reveals that the stress of both the RDL and bumps are significantly impacted by the layout. Moreover, it is observed that the stress in RDL is particularly impacted by vias and bumps in adjacent layers.
The method precisely concerns these layout-related impacts with minimal resources and time, presents an opportunity to improve the efficiency of advanced package DFR.
2. Methods
The general workflow of the proposed modeling method is shown in
Figure 1 and consists of the following steps:
Step 1: Two-level RDL digitalization, this step is consisted by global and local discretization. In the global level, the package level RDL pattern is divided into equal-size RDL blocks containing a part of traces, and there are Qx blocks along the X direction and Qy along Y. To capture the layout feature within block, in the local level, each RDL block is divided into pixels with uniform size. The subdivision results in PQx pixels along the X direction and PQy pixels along Y, the pixel value is defined as the metal percentage within pixel region. Consequently, the RDL block can be represented by a PQx×PQy tensor, and the package-level RDL can be digital represented by Qx×Qy tensors.
Step 2: ANN Training dataset preparing, this step prepares the ANN training dataset by randomly selected a small subset of RDL blocks. The dataset consists of input tensors and labels representing the anisotropic equivalent material properties of each block obtained through 3D FEA simulations [
16], including Young's modules, Poisson's ratios, shear modules, thermal conductivities and CTEs in all directions.
Step 3: ANN-based surrogate model establishing, this step establishes an ANN-based material surrogate model. The input data of the surrogate model is the RDL block tensor and the output data include all equivalent properties. The surrogate model employs ANN and is trained by the dataset prepared in Step 2. Once established, the ANN eliminates the need for repeating Steps 2 and 3, it would be highly advantageous for the repeated design iteration process.
Step 4: Equivalent properties predicting, all equivalent properties for all the RDL blocks generated in Step 1 are predicted by the surrogate model established in Steps 2, 3.
Step 5: Global FEA model building, all RDL blocks are transformed into 3D hexahedral solid elements to construct the global RDL FEA model and assigned the material properties predicted by Step 4. The model consists of Qx×Qy elements encompassing the entire RDL, and can also serve as a part of the package-level FEA model.
As the method implementation, a series of programs are established, including: the ANN implemented by Pytorch, the main framework developed in Python and the FEA simulator powered by ANSYS 18.2. The RDL blocks creating processes are implemented by C++. In addition, the C# developed program is used as the graphical post-processor.
2.1. ANN Architectures
Activation function, rectified linear unit (ReLU) is selected as the activation function, shown by (1), (2):
Loss Function, cross-entropy loss is selected as the loss function, shown by (3):
Where, n is the number of samples,
yi represents the true value of the
i-th sample,
represents the predicted value of the
i-th sample,
wi represents the weight of the
i-th sample. The loss for a batch prediction, shown by (4):
Where, batch is the number of predicting procedures for one predicting batch. For the j-th predicting procedure of the batch, yij represents the true value of the i-th sample, represents the predicted value of the i-th sample and wi represents the weight of the i-th sample.
Optimizer, the stochastic gradient descent (SGD) is adopted to optimize neural network, and momentum is considered during the optimization process. Learning rate and momentum parameters are adjusted to fine-tune the optimization process, as indicated in equations (5) and (6)
Where, t is the current iteration step number, relatively t–1 is the number of the previous step, lr represents the learning rate, g is the gradient, m means the momentum. vt, wt represents the velocity and weight in current iteration step respectively. vt-1, wt-1 represents the velocity and weight in the previous step respectively.
ANN networks, full-connected neural network (FCNN) and convolution neural network (CNN) are created. As shown in
Figure 2, the FCNN includes an input layer, output layer and several hidden layers. The parameters of the input layer are represented by the vector shown by (7) and the size of the input vector shown by (8):
where,
is the metal percentage of the
p-th pixel, Q
in is the vector size which is equal to the number of pixels of a single RDL block,
PQx and
PQy represent the number of RDL block’s pixels along the X and Y direction respectively. The output vector of the FCNN includes all 15 anisotropic equivalent material properties of the RDL block, shown by (9):
The hidden layer can be represented by vector as (10):
where,
is the value of the
i-th node,
Qn is the number of nodes of the
n-th hidden layer.
Shown as
Figure 3, the CNN is consisted by convolution network and the full-connected network, the input parameters are the tensor with size of [
PQx,
PQy, 1], and the output vector is the same as (7). The convolution network contains convo lution layers and max pooling layers and the tensor is transf ormed through layers. The tensor size is [
Cnx,
Cny,
Cnz] after the
n-th convo lution layer, and [
Pnx,
Pny,
Pnz] after the
n-th max pooling layer. The output tensor of the convolution network can be compressed into the vector of size
QLin.
where,
Pnx,
Pny,
Pnz are the output tensor size in X, Y and Z directions respectively.
2.2. Training dataset augmentation
To augment the training dataset of blocks, the geometric symmetry and data transformations are employed. The initial block with anisotropic properties (E
x = E
x0, E
y = E
y0) is shown in
Figure 4a, and it can be rotated by 90°, 180°, and 270°, as indicated in
Figure 4b,d, respectively. When the angle of rotation is 90° or 270°, the anisotropic properties are transformed to E
x = E
y0, E
y = E
x0. The initial block and its rotated versions can also be flipped in the Y direction, as shown in
Figure 4e–h, while keeping the same anisotropic properties. Eight unique blocks can be generated from the initial one, without requiring additional FEA solutions.
Figure 1.
Workflow of the proposed ML-based RDL modeling method.
Figure 1.
Workflow of the proposed ML-based RDL modeling method.
Figure 2.
Full-connected neural network (FCNN) architecture of the proposed method.
Figure 2.
Full-connected neural network (FCNN) architecture of the proposed method.
Figure 3.
Convolution neural network (CNN) architecture of the proposed method.
Figure 3.
Convolution neural network (CNN) architecture of the proposed method.
Figure 4.
RDL block augmentation. (a) Initial block, (b) Rotating of 90°, (c) Rotating of 180°, (d) Rotating of 270°, (e) Initial block and flipped in Y, (f) Rotating of 90° and flipping in Y, (g) Rotating of 180° and flipping in Y, (h) Rotating of 270° and flipping in Y.
Figure 4.
RDL block augmentation. (a) Initial block, (b) Rotating of 90°, (c) Rotating of 180°, (d) Rotating of 270°, (e) Initial block and flipped in Y, (f) Rotating of 90° and flipping in Y, (g) Rotating of 180° and flipping in Y, (h) Rotating of 270° and flipping in Y.
Figure 5.
Model and load cases of the two-layered model bending simulations. (a) Parallel trace bending load case (Case P), (b) Vertical trace bending load case (Case V).
Figure 5.
Model and load cases of the two-layered model bending simulations. (a) Parallel trace bending load case (Case P), (b) Vertical trace bending load case (Case V).
Figure 6.
Layout pattern of M1.
Figure 6.
Layout pattern of M1.
Figure 7.
Von Mises stress distribution of the detailed fine mesh model (The substrate thickness, TB = 50μm). (a) CaseP, (b) CaseV.
Figure 7.
Von Mises stress distribution of the detailed fine mesh model (The substrate thickness, TB = 50μm). (a) CaseP, (b) CaseV.
Figure 8.
The loss variation during ANN training and testing procedure.
Figure 8.
The loss variation during ANN training and testing procedure.
Figure 9.
The anisotropic material of the RDL calculated by simulation-based and ML-based methods.
Figure 9.
The anisotropic material of the RDL calculated by simulation-based and ML-based methods.
Figure 10.
Von Mises stress distribution of the proposed ML-based modeling method (The substrate thickness, TB = 50μm). (a) CaseP, (b) CaseV.
Figure 10.
Von Mises stress distribution of the proposed ML-based modeling method (The substrate thickness, TB = 50μm). (a) CaseP, (b) CaseV.
Figure 11.
Boundary conditions and loads for two-layered model warpage simulations.
Figure 11.
Boundary conditions and loads for two-layered model warpage simulations.
Figure 12.
Result of the parallel trace warpage case (CasePT). (a) Z-direction displacement distribution of the proposed ML-based method, (b) Z-direction displacement distribution of the detailed fine mesh model, (c) von Mises stress distribution of the proposed ML-based method, (d) von Mises stress distribution of the detailed fine mesh model.
Figure 12.
Result of the parallel trace warpage case (CasePT). (a) Z-direction displacement distribution of the proposed ML-based method, (b) Z-direction displacement distribution of the detailed fine mesh model, (c) von Mises stress distribution of the proposed ML-based method, (d) von Mises stress distribution of the detailed fine mesh model.
Figure 13.
Result of the vertical trace warpage case (CaseVT). (a) Z-direction displacement distribution of the proposed ML-based method, (b) Z-direction displacement distribution of the detailed fine mesh model, (c) von Mises stress distribution of the proposed ML-based method, (d) von Mises stress distribution of the detailed fine mesh model.
Figure 13.
Result of the vertical trace warpage case (CaseVT). (a) Z-direction displacement distribution of the proposed ML-based method, (b) Z-direction displacement distribution of the detailed fine mesh model, (c) von Mises stress distribution of the proposed ML-based method, (d) von Mises stress distribution of the detailed fine mesh model.
Figure 14.
Reaction force errors of CaseP and CaseV variation by substrate thickness for different modeling methods when global mesh division Q = 200.
Figure 14.
Reaction force errors of CaseP and CaseV variation by substrate thickness for different modeling methods when global mesh division Q = 200.
Figure 15.
Reaction force errors of CaseP and CaseV variation by substrate thickness for different global division values. (a) Q = 100, (b) Q = 300.
Figure 15.
Reaction force errors of CaseP and CaseV variation by substrate thickness for different global division values. (a) Q = 100, (b) Q = 300.
Figure 16.
Von Mises stress distribution of different global mesh division values (The substrate thickness, TB = 50μm). (a) Q = 100, (b) Q = 300.
Figure 16.
Von Mises stress distribution of different global mesh division values (The substrate thickness, TB = 50μm). (a) Q = 100, (b) Q = 300.
Figure 17.
Reaction force errors and simulation cost of different global mesh division values for the traditional layout neglecting block-based method when the thickness of the substrate equals 50μm.
Figure 17.
Reaction force errors and simulation cost of different global mesh division values for the traditional layout neglecting block-based method when the thickness of the substrate equals 50μm.
Figure 18.
Von Mises stress distribution of the traditional layout neglecting block-based modeling method of different global mesh division values (TB = 50μm). (a) Q = 200, (b) Q = 600.
Figure 18.
Von Mises stress distribution of the traditional layout neglecting block-based modeling method of different global mesh division values (TB = 50μm). (a) Q = 200, (b) Q = 600.
Figure 19.
Loss and training time for different training dataset sizes.
Figure 19.
Loss and training time for different training dataset sizes.
Figure 20.
The anisotropic material of the RDL predicted by the ML-based method with training dataset augmentation.
Figure 20.
The anisotropic material of the RDL predicted by the ML-based method with training dataset augmentation.
Figure 21.
Training and prediction results of FCNNs. (a) 2 hidden layers FCNNs, (b) 3 hidden layers FCNNs.
Figure 21.
Training and prediction results of FCNNs. (a) 2 hidden layers FCNNs, (b) 3 hidden layers FCNNs.
Figure 22.
Prediction and training results of different CNNs.
Figure 22.
Prediction and training results of different CNNs.
Figure 23.
Geometry model of the 2.5D integrated CPU chip as an application.
Figure 23.
Geometry model of the 2.5D integrated CPU chip as an application.
Figure 24.
The maximum temperature state in TCT for the 2.5D integrated CPU chip.
Figure 24.
The maximum temperature state in TCT for the 2.5D integrated CPU chip.
Figure 25.
Von Mises stress distribution of the maximum temperature state of the TCT of the 2.5D integrated CPU chip. (a) Key layers’ stress distributions, (b) M1 stress distribution backgrounded by the layout pattern, (c) Part of the M1 stress distribution, (d) Detail of the maximum stress region backgrounded by the M1 layout pattern and the vias and bumps in adjacent layers.
Figure 25.
Von Mises stress distribution of the maximum temperature state of the TCT of the 2.5D integrated CPU chip. (a) Key layers’ stress distributions, (b) M1 stress distribution backgrounded by the layout pattern, (c) Part of the M1 stress distribution, (d) Detail of the maximum stress region backgrounded by the M1 layout pattern and the vias and bumps in adjacent layers.
Table 1.
Material properties used in simulations.
Table 1.
Material properties used in simulations.
Name |
Si [39] |
Cu [40] |
PI [41] |
CTE |
2.6E-6 |
1.64E-5 |
2E-5 |
Young's Modulus (MPa) |
1.31E5 |
1.30E5 |
2.5E3 |
Poisson Ratio |
0.28 |
0.34 |
0.34 |