To save runtime, a fast analytical model of the clock mesh is needed. Clock skew is the most critical of the timing constraints. For a clock mesh, we define the clock skew as the difference in latency between mesh buffers driven by the same clock source. The worst case occurs when one sink sits on the mesh node connected to the driving buffer while the other sink sits at the diagonally opposite position, as shown in Figure 4. We therefore use the clock mesh window method [25] to build the skew model. The skew model in [14] relies on a simplified RC network that targets the grid sizing problem and does not consider the wire width.
To integrate the wire parameters into the analytical method, we introduce an interconnect model for the clock mesh window. Consider an interconnect of length l with a loading capacitance and a driver resistance, as shown in Figure 5. The interconnect problem is to determine the uniform width that minimizes the source-to-sink delay. Here r is the sheet resistance, and the wire capacitance is described by a unit area capacitance and a unit effective-fringing capacitance. To compute the distributed Elmore delay, the wire is usually divided into many small segments, each modeled as a π-type model. It can be shown that the Elmore delay is the same no matter how the wire is divided into shorter segments. Therefore, we can use the one-segment π-type model in Figure 6, which is characterized by the total wire resistance and the total wire capacitance. The Elmore delay from the driver to the load in Figure 4 can then be written as follows:
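As a minimal sketch of this delay model, assume the conventional notation R_d (driver resistance), C_L (loading capacitance), c_a (unit area capacitance), c_f (unit effective-fringing capacitance), and w (wire width); these symbol names are assumptions for illustration. The one-segment π-model then gives t = R_d (C_w + C_L) + R_w (C_w / 2 + C_L), with R_w = r l / w and C_w = (c_a w + c_f) l. The code below evaluates the delay for any number of equal π-sections, so the segmentation-invariance claim can be checked numerically, and derives the width that minimizes the one-segment delay.

```python
import math


def elmore_delay(l, w, r, c_a, c_f, R_d, C_L, segments=1):
    """Elmore delay from driver to load for a uniform wire split into
    `segments` equal pi-sections. All symbol names are assumed notation."""
    R_seg = r * l / w / segments            # resistance of one section
    C_seg = (c_a * w + c_f) * l / segments  # capacitance of one section
    # The driver resistance charges the whole wire plus the load.
    delay = R_d * (segments * C_seg + C_L)
    for k in range(1, segments + 1):
        # Section k charges half of its own capacitance,
        # all later sections, and the load.
        downstream = C_seg / 2 + (segments - k) * C_seg + C_L
        delay += R_seg * downstream
    return delay


def best_uniform_width(l, r, c_a, c_f, R_d, C_L):
    """Width at which d(delay)/dw = 0 for the one-segment model."""
    return math.sqrt(r * (c_f * l / 2 + C_L) / (R_d * c_a))
```

Calling `elmore_delay` with `segments=1` and with a large segment count returns the same value up to floating-point rounding, which is why the one-segment model suffices.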
Based on formula (3), the estimated delay can be converted into a skew value. Latency and skew are therefore the hard constraints. We also introduce soft constraints and parameters that reflect the quality of the mesh topology. Given an index set I = {1, 2, ..., i} of potential mesh topologies, the penalty score is defined as follows. For each clock mesh, the score accounts for the total wire length, which covers both the mesh wires and the stub wires (the sum of the Manhattan distances from the mesh wires to all sinks), and for the wire area, which likewise covers mesh and stub resources. Formula (7) introduces the variance of the clock delay as a parameter that reflects robustness against OCV effects. The variance is computed from the latency of each mesh window, as marked in Figure 7, and from the mean of the collected latency values. To account for PVT conditions, we apply a margin of 15% up and down in this calculation. Each of the three factors has its own weight. We set these weights to indicate the importance of each factor and to balance the terms in the score function.
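The scoring idea can be sketched as follows, assuming the penalty is a plain weighted sum of the three factors named above. The function name, the weight names `alpha`, `beta`, and `gamma`, and the weighted-sum form are all assumptions for illustration; the sketch also leaves out the 15% PVT margin.

```python
from statistics import pvariance


def penalty_score(wire_length, wire_area, window_latencies,
                  alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted-sum penalty for one candidate mesh (assumed form).

    wire_length      -- mesh plus stub wire length
    wire_area        -- mesh plus stub wire area
    window_latencies -- one latency value per mesh window
    """
    # Population variance of the window latencies around their mean.
    variance = pvariance(window_latencies)
    return alpha * wire_length + beta * wire_area + gamma * variance
```

A candidate whose windows all have the same latency contributes no variance term, so under this assumed form its score depends only on wire length and wire area.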
Next, we determine the topological set I of the clock mesh. To shrink the solution space, we expose a numerical range for the mesh size that is guided by engineering experience, such as 5 × 5 to 30 × 30; formula (8) gives the general expression. Figure 7, which strictly follows the model of Figure 2, shows a mesh planning state with one potential merging pattern. Under this merging pattern, the dotted lines are removed from the original clock mesh model in Figure 2. To reduce computational complexity, each window selects its most critical sinks, namely those at the maximum and minimum distance from the mesh buffers, since they determine the skew; the blue arrows in the figure mark the longest and shortest Manhattan distances. Because the layout segmentation contains redundant areas, such as the blue rectangles in Figure 7, whether a window is evaluated depends on the sink distribution. For the mesh planning phase, the physical inputs are the layout size, the track information, the sink set S, the physical library, and the engineering experience guidance. The set of mesh window merging patterns is computed from the layout size, the track information, and the engineering experience guidance. Designers can also specify the latency bound. Only regular clock meshes are considered in this work. We compute and store the latency and the penalty score for each mesh planning state, and return the strategy with the minimal penalty score.
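The window-level reduction described above can be sketched as follows. The sketch assumes integer sink coordinates, a regular mesh whose windows tile the layout, and a mesh buffer at the lower-left corner of each window; the buffer placement and both function names are hypothetical choices made only for illustration.

```python
import math
from collections import defaultdict


def window_pitch(width, height, rows, cols):
    """Window size of a rows x cols regular mesh, rounded up and
    returned as an integer pair."""
    return math.ceil(width / cols), math.ceil(height / rows)


def critical_distances(sinks, pitch, buffer_offset=(0, 0)):
    """Group sinks by window and return, for each non-empty window, the
    (shortest, longest) Manhattan distance to that window's buffer."""
    px, py = pitch
    windows = defaultdict(list)
    for x, y in sinks:
        windows[(x // px, y // py)].append((x, y))
    result = {}
    for (ix, iy), members in windows.items():
        # Assumed buffer position: window origin plus a fixed offset.
        bx = ix * px + buffer_offset[0]
        by = iy * py + buffer_offset[1]
        dists = [abs(x - bx) + abs(y - by) for x, y in members]
        result[(ix, iy)] = (min(dists), max(dists))
    return result
```

Windows without sinks never enter the result, which mirrors skipping the redundant areas of the layout.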
Given the input information, line 2 of Algorithm 1 collects the merging pattern set for the clock mesh, using one stepping value per direction, and the topological set I follows from it. Values are rounded up and stored as integer pairs, as in the Figure 7 example. The second step is the scoring calculation. For every potential mesh planning state, we calculate the penalty score from the merging pattern, the candidate mesh wire widths, and the set S of key sinks. Because the states share no data, this step can be multi-threaded. States that violate the key constraints are discarded to save memory, as described in lines 6 to 9. Finally, we find the minimal value in the score vector and return the corresponding mesh planning strategy, that is, the mesh window merging strategy in Figure 3. With the analytical latency model, the algorithm runs efficiently and yields an optimized mesh plan.
Algorithm 1 The mesh planning phase
Input: The layout size information, track information, the set S of key sinks, engineering experience guidance, latency-bounded value, and physical library.
Output: The clock mesh topology.
1: for the layout size, track information, and engineering experience guidance do
2:     I ⟵ merging pattern set ⟵ stepping strategy values;
3: end for
4: for all merging patterns, mesh wire widths, and key sinks S do
5:     Calculate each penalty score value with multi-threads;
6:     if MAX(latency) < latency-bounded value then
7:         Push the penalty score back into the score vector;
8:     else Continue;
9:     end if
10: end for
11: for the score vector with topological set I in hash table do
12:     Find the minimal penalty score value and return the corresponding clock mesh topology
13: end for
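A minimal Python sketch of the same flow is given below. It assumes that the candidate topologies are all (rows, cols) pairs in a guidance range and that the caller supplies an `evaluate` function returning the window latencies and the penalty score of a state; `plan_mesh`, `evaluate`, and the state layout are hypothetical names introduced for illustration, not the authors' implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def plan_mesh(grid_range, wire_widths, evaluate, latency_bound):
    """Return (score, state) with the minimal penalty score among the
    states that meet the latency bound, or None if no state does."""
    lo, hi = grid_range
    # Lines 1-3: enumerate candidate topologies and wire widths.
    states = [(rows, cols, width)
              for rows, cols in product(range(lo, hi + 1), repeat=2)
              for width in wire_widths]
    # Lines 4-5: score the states in parallel; they share no data.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(evaluate, states))
    # Lines 6-9: keep only states whose worst latency meets the bound.
    feasible = [(score, state)
                for state, (latencies, score) in zip(states, results)
                if max(latencies) < latency_bound]
    # Lines 11-13: return the state with the minimal penalty score.
    return min(feasible, key=lambda item: item[0]) if feasible else None
```

In CPython, threads only speed this up when `evaluate` releases the global interpreter lock; `ProcessPoolExecutor` offers the same `map` interface for CPU-bound scoring.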