FIELD OF THE INVENTION

[0001]
This document concerns an invention relating generally to statistical timing analysis of integrated circuits.
BACKGROUND OF THE INVENTION

[0002]
For integrated circuits (e.g., VLSI chips) to work properly, the signals traveling along their gates and interconnects must be properly timed, and several factors are known to cause timing variations. As examples, variations in manufacturing process parameters (such as variations in interconnect diameter, gate quality, etc.) can cause timing parameters to deviate from their designed value. In lowpower applications, lower supply voltages can cause increased susceptibility to noise and increased timing variations. Densely integrated elements and nonideal onchip power dissipation can cause “hot spots” on a chip, which can also cause excessive timing variations.

[0003]
A classical approach to timing analysis is to analyze each signal path in a circuit and determine the worst case timing. However, this approach produces timing predictions that are often too pessimistic and grossly conservative. As a result, statistical timing analysis (STA)—which characterizes timing delays as statistical random variables—is often used to obtain more realistic timing predictions. By modeling each individual delay as a random variable, the accumulated delays over each path of the circuit will be represented by a statistical distribution. As a result, circuit designers can design and optimize chips in accordance with acceptable likelihoods rather than worstcase scenarios.

[0004]
In STA, a circuit is modeled by a directed acyclic graph (DAG) known as a timing graph wherein each delay source—either a logic gate or an interconnect—is represented as a node. Each node connects to other nodes through input and output edges. Nodes and edges are referred to as delay elements. Each node has a node delay, that is, a delay incurred in the corresponding logic gates or interconnect segments. Similarly, each edge has an edge delay, a term of signal arrival time which represents the cumulative timing delays up to and including the node that feeds into the edge. Each edge delay has a path history: the set of node delays through which a signal travels before arriving at this edge. Each delay element is then modeled as a random variable, which is characterized by its probability density function (pdf) and cumulative distribution function (cdf). The purpose of STA is then to estimate the edge delay distribution at the output(s) of a circuit based on (known or assumed) internal node delay distributions.

[0005]
The three primary approaches to STA are Monte Carlo simulation, pathbased STA, and blockbased STA. As its name implies, Monte Carlo simulation mechanically computes the statistical distribution of edge delays by analyzing all (or most) possible scenarios for the internal node delays. While this will generally yield an accurate timing distribution, it is computationally extremely timeconsuming, and is therefore often impractical to use.

[0006]
Pathbased STA attempts to identify some subset of paths (i.e., series of nodes and edges) whose time constraints are statistically critical. Unfortunately, pathbased STA has a computational complexity that grows exponentially with the circuit size, and thus it too is difficult to practically apply to many modern circuits.

[0007]
Blockbased STA, which has largely been developed owing to the shortcomings of Monte Carlo and pathbased STA, uses progressive computation: statistical timing analysis is performed block by block in the forward direction in the circuit timing graph without looking back at the path history, by use of only an ADD operation and a MAX operation:

[0008]
ADD: When an input edge delay X propagates through a node delay Y, the output edge delay will be Z=X+Y.

[0009]
MAX: When two edge delays X and Y merge in a node, a new edge delay Z=MAX(X,Y) will be formulated before the node delay is added.

[0010]
Note that the MAX operation can also be modeled as a MIN operation, since MIN(X,Y)=−MAX(−X,−Y). Thus, while a MIN operation can also be relevant in STA analysis, it is often simpler to use only one of the MAX and MIN operators. For sake of simplicity, throughout this document, the MAX operator will be used, with the understanding that the same results can be adapted to the MIN operator.

[0011]
With the two operators ADD and MAX, the computational complexity of block based STA grows linearly (rather than exponentially) with respect to the circuit size, which generally results in manageable computations. The computations are further accelerated by assuming that all timing variables in a circuits follow the Gaussian (normal) distribution: since a linear combination of normally distributed variables is also normally distributed, the correlation relations between the delays along a circuit path are efficiently preserved.

[0012]
To illustrate, in the ADD operation ADD(X,Y)=Z, if both input delay elements X and Y are Gaussian random variables, then the delay Z=X+Y will also be a Gaussian random variable whose mean and variance are:
$\begin{array}{cc}\mathrm{Mean}\text{:}\text{\hspace{1em}}{\mu}_{Z}={\mu}_{X}+{\mu}_{Y}& \left(1\right)\\ \mathrm{Variance}\text{:}\text{\hspace{1em}}{\sigma}_{Z}^{2}={\sigma}_{X}^{2}+{\sigma}_{Y}^{2}+2\text{\hspace{1em}}\mathrm{cov}\left(X,Y\right)& \left(2\right)\end{array}$
where cov(X,Y)=E{(X−μ_{X})(Y−μ_{Y})} is the covariance between X and Y.

[0013]
In contrast, in the MAX operation Z=MAX(X,Y), MAX is a nonlinear operator: even if the input delays X and Y are Gaussian random variables, Z will not (usually) have a Gaussian distribution. However, as shown in C. Clark, “The greatest of a finite set of random variables,” Operations Research, pp. 145162, March 1961, if X and Y are Gaussian and statistically independent, the first and second moments of the distribution of MAX(X,Y) are defined by:
$\begin{array}{cc}\mathrm{Mean}\text{:}\text{\hspace{1em}}{\mu}_{Z}={\mu}_{X}\xb7Q+{\mu}_{Y}\left(1Q\right)+\theta \text{\hspace{1em}}P& \left(3\right)\\ \mathrm{Variance}\text{:}\text{\hspace{1em}}{\sigma}_{Z}^{2}=\left({\mu}_{X}^{2}+{\sigma}_{X}^{2}\right)\text{\hspace{1em}}Q+\left({\mu}_{Y}^{2}+{\sigma}_{Y}^{2}\right)\left(1Q\right)+\left({\mu}_{X}+{\mu}_{Y}\right)\theta \text{\hspace{1em}}P{\mu}_{Z}^{2}& \left(4\right)\end{array}$
where θ=σ
_{(X−Y)}. P and Q are the pdf and cdf of the standard Gaussian distribution evaluated at λ=μ
_{(X−Y)}/σ
_{(X−Y)}:
$\begin{array}{cc}P\left(\lambda \right)=\frac{1}{\sqrt{2\text{\hspace{1em}}\pi}}\mathrm{exp}\left(\frac{{\lambda}^{2}}{2}\right)\text{\hspace{1em}}Q\left(\lambda \right)={\int}_{\infty}^{\lambda}P\left(x\right)dx& \left(5\right)\end{array}$
It is then possible to define a Gaussian approximation for the nonGaussian Z=MAX(X,Y). In C. Visweswariah, K. Ravindran, and K. Kalafala, “Firstorder parameterized blockbased statistical timing analysis,” TAU'04, February 2004, the Z=MAX(X,Y) is approximated by a Gaussian random variable
which is a linear combination of X, Y, and an additional independent Gaussian random variable Δ:
Z=MAX(
X, Y)≈
QX+(1
−Q)
Y+Δ= (6)
where Q is defined in the foregoing Equation (5), and is referred to as “tightness.” The purpose of the additional random variable Δ is to ensure that the first and second moments (the mean and the variance) of
match those of Z as specified in the foregoing Equations (3) and (4).

[0014]
In the foregoing Clark reference, it was shown that if W is a Gaussian random variable, then the crosscovariance between W and Z=MAX(X,Y) can be found analytically as:
cov(
W,Z)=
Qcov(
W,X)+(1
−Q)cov(
W,Y) (7)
Substituting Equation (6):
cov(
W,
)=
Qcov(
W,X)+(1
−Q)cov(
W,Y)=cov(
W,Z)
Hence, a convenient property of the approximator
is that the crosscovariance between Z and another timing variable W is preserved when the nonGaussian Z=MAX(X,Y) is replaced by the Gaussian random variable
. Thus, the use of the Gaussian random variable
as an approximation to the nonGaussian Z=MAX(X,Y) allows preservation of linearity.

[0015]
Unfortunately, one flaw of blockbased STA is that its underlying assumption of a simple linear (additive) combination of sequential path delays is often incorrect. The delays of elements in a circuit can be correlated due to various phenomena, two common ones being known as global variations and path reconvergence. Global variations are effects that impact a number of elements simultaneously, such as inter or intradie spatial correlations, temperature or supply voltage fluctuations, etc. These generate global correlation between delay elements, wherein all globally correlated elements are simultaneously affected. An example of the effect of global variations is schematically depicted in FIG. 1(a), wherein node delays X, Y, and Z all depend on some influence g.

[0016]
Path reconvergence occurs where elements share a common element or path along their past path histories owing to path intersections, and this leads to path correlation (local correlation of elements along some section of a path). An example of the effect of path correlation is schematically depicted in FIG. 1(b), wherein edge delays X and Y both depend on node delay p.

[0017]
The underlying problem of global and path correlation is that while the output of the MAX operator can be directly approximated by a Gaussian distribution having its first two moments matching those of Equations (3) and (4), this approach fails to retain any correlation information after the MAX operation is performed. In short, the MAX operator destroys correlation information which may be critical to accurate timing prediction. Several approaches have been proposed for dealing with global and path correlation, but the field of timing analysis is lacking in methods for accounting for both of these correlations in an accurate and computationally efficient manner.

[0018]
One approach to compensating for global variations is to use a canonical timing model (C. Visweswariah, K. Ravindran, and K. Kalafala, “Firstorder parameterized blockbased statistical timing analysis,” TAU'04, February 2004; A. Agarwal, D. Blaauw, and V. Zolotov, “Statistical timing analysis for intradie process variations with spatial correlations,” Computer Aided Design, 2003 International Conference on. ICCAD2003, pp. 900907, November 2003; H. Chang and S. S. Sapatnekar, “Statistical timing analysis considering spatial correlations using a single pertlike traversal,” ICCAD'03, pp. 621625, November 2003). In the canonical timing model, each of the node delays is represented as a summation of three terms:
$\begin{array}{cc}{n}_{i}={\mu}_{i}+{\alpha}_{i}{R}_{i}+\sum _{j=1}{\beta}_{i,j}{G}_{j}& \left(8\right)\end{array}$
where n_{i }(i=1,2, . . . ) is the random variable corresponding to the ith node delay in the timing graph; μ_{i }is the expected value of n_{i}; R_{i}; (called the node variation or local variation), is a zeromean, unity variance Gaussian random variable representing the localized statistical uncertainties of n_{i}; G_{j }represents the jth global variation, and is also modeled as a zeromean, unity variance Gaussian random variable; {R_{i}} and {G_{j}} are additionally assumed to be mutually independent; and the weight parameters α_{i }(named node sensitivity or local sensitivity) and β_{ij }(named global sensitivity) are deterministic constants, explicitly expressing the amount of dependence of n_{i }on each of the corresponding independent random variables.

[0019]
With this canonical representation, the variance of a node delay n_{i }and its correlation (covariance) with another node delay n_{k }can be evaluated as:
$\begin{array}{cc}\begin{array}{c}\mathrm{Variance}\text{:}\\ {\sigma}_{{n}_{i}}^{2}=E\left\{{\left({n}_{i}{\mu}_{i}\right)}^{2}\right\}={\alpha}_{i}^{2}+\sum _{j}{\beta}_{i,j}^{2}\end{array}& \left(9\right)\\ \begin{array}{c}\mathrm{Covariance}\text{:}\text{\hspace{1em}}\\ \mathrm{cov}\left({n}_{i},{n}_{k}\right)=E\left\{\left({n}_{i}{\mu}_{i}\right)\left({n}_{k}{\mu}_{k}\right)\right\}=\sum _{j}{\beta}_{i,j}{\beta}_{k,j}\end{array}& \left(10\right)\end{array}$
However, if Equation 8 is also used to represent edge delays, this approach will implicitly assume that edge delays will only experience global variations, and that no path reconvergence occurs in the timing graph. This approach is acceptable where no path reconvergence is present, or where global variation dominates the correlations in the timing graph, but it will have severe problems where path correlation is important—which is unfortunately a common situation. To illustrate, in FIG. 1(b), both edge delays X and Y share a common path history including node p. However, in the canonical representation of edge delays X and Y, the local variation R_{p }of node p is not present: the path correlation between X and Y due to R_{p }is (incorrectly) dropped.

[0020]
In the Visweswariah et al. reference, the aforementioned concept of tightness is used to retain global correlation information through the nonlinear MAX operation. A tightnessbased linear combination is proposed to approximate the MAX operator while including an independent random variable Δ for the purpose of matching moments and covariance (Equation (6)). While the purpose of the inclusion of an independent Gaussian random variable Δ is to ensure the matching of the covariance of
to the output of the MAX operator Z, this parsimonious random variable may not accurately propagate correlation information, and thus may inadvertently introduce additional modeling error of the output pdf.

[0021]
In A. Devgan and C. Kashyap, “Blockbased static timing analysis with uncertainty,” ICCAD'03, pp. 607614, November 2003, a common node detection procedure is introduced to deal with path correlation (path reconvergence), but here global correlation is neglected. This method assumes that if two edge delays X and Y ever pass a common node whose output edge delay is W, then X=X′+W and Y=Y′+W. Operation MAX(X,Y) is then done as W+MAX(X′+Y′). This approximation is imperfect since X and Y usually don't have a very strong dependence on W. A counter example is illustrated in FIG. 2, where both X and Y are theoretically dependent on W, but practically speaking, X will be independent of W if U>>W, and similarly Y will be independent of W if V>>W.

[0022]
Given that the trend in circuit fabrication is to everincreasing speed and everdecreasing size, there is clearly a pressing need for accurate methods of statistical timing analysis which compensate for both global and path correlation, and which are computationally efficient so that rapid design and testing is feasible.
SUMMARY OF THE INVENTION

[0023]
The invention, which is defined by the claims set forth at the end of this document, is directed to methods of blockbased statistical timing analysis (STA) which can take into account correlations caused by both global variations and path reconvergence, and which can at least partially alleviate the drawbacks of prior STA methods. A basic understanding of some of the preferred features of the invention can be attained from a review of the following brief summary of the invention, with more details being provided in the following section of this document entitled Detailed Description of Preferred Versions of the Invention.

[0024]
Initially, Section 1 of the Detailed Description section describes how the nonlinear MAX operation can be approximated by a weighted linear mixing operator which performs a linear supposition of its inputs, and which has the moments of its output matched to those of the MAX operation. The preferred form of this operation is U=ρX+(1−ρ)Y+ζ, wherein X and Y are the inputs and ρ and ζ are constants chosen such that U has matching moments with Z=MAX(X,Y). Whereas Gaussian inputs to the MAX operator will result in a nonlinear/nonGaussian output which lacks any information regarding correlation between the inputs, the linear mixing operator proposed here conveniently retains both linearity and correlation information, and it is therefore preferred for use in the steps/methods discussed below.

[0025]
Section 2 of the Detailed Description then discusses how a modified version of the traditional canonical timing model, referred to herein as the Extended Canonical Timing Model, can be used to represent all delay elements (both nodes and edges) in a circuit as a weighted linear combination of a set of independent random variables, with the weights being capable of representing the effects of both global and local (path) correlation. A preferred form for the weighted linear combination is:
$X={\mu}_{X}+\sum _{i=1}^{N}{\alpha}_{X,i}{R}_{i}+\sum _{j=1}^{M}{\beta}_{X,j}{G}_{j}$

 wherein: μ_{x }is the mean of a Gaussian distribution defining delay element X;
 N is the number of signal delays in the circuit;
 M is the number of global variations in the circuit;
 R_{i }and G_{j }are each Gaussian random variables;
 α_{X,i }is a node sensitivity reflecting the dependence of X on signal delay i in the circuit; and
 β_{Xj }is a global sensitivity reflecting the dependence of X on global variation j in the circuit.
Further, this combination can be easily expressed in a vector format, referred to herein as the Variation Vector Timing Model, which can greatly ease computation.

[0032]
Section 3 of the Detailed Description then discusses how correlated timing elements can be decomposed into independent (uncorrelated) ones, and then used in the extended canonical timing model (and its variation vector equivalent) to further simplify computation. Here, timing elements X and Y having cov(X, Y)≠0 (i.e., correlated elements) can be substituted with equivalent delays X′+W=X and Y′+W=Y, wherein W is such that cov(X, Y)=0. As will be discussed in Section 3, this can greatly simplify MAX operations.

[0033]
Section 4 of the Detailed Description provides further details how the foregoing steps/methods can be propagated throughout a timing graph (i.e., how the foregoing steps/methods can be used to attain calculated delays throughout a circuit/timing graph).

[0034]
Section 5 of the Detailed Description then describes a preferred modification of the foregoing steps wherein the variation vector timing model is “pruned” to eliminate coefficients that have minor impact on final results, thereby greatly enhancing computational speed and reducing storage burdens. Rather than dropping these coefficients entirely (which can increase error owing to the cumulative effect of these coefficients), they may be replaced by a correction term which at least partially accounts for their collective impact.

[0035]
Section 6 of the Detailed Description then reviews experimental results generated during testing of the foregoing steps/methods.

[0036]
Section 7 then defines a bounded approximation that can be used for the MAX operator in situations where the output of the MAX operation is significantly nonGaussian, a situation which might arise (for example) where the input delays have significantly different variances. Whereas other approximations can sometimes underestimate the MAX output in this situation (which can lead to risky timing assumptions and possibly excessive circuit failure), the proposed bounded approximation is conservatively defined in such a manner that it will marginally exceed the MAX output, to some desired level of confidence.

[0037]
Further advantages, features, and objects of the invention will be apparent from the following detailed description of the invention in conjunction with the associated drawings.
BRIEF DESCRIPTION OF THE DRAWINGS

[0038]
FIG. 1(a) depicts a timing graph schematically depicting the effect of influence g on nodes X, Y, and Z, introducing global correlation of these nodes.

[0039]
FIG. 1(b) depicts a timing graph schematically depicting the effect of node p on edges X and Y, wherein edges X and Y experience path correlation owing to shared portions of their path histories.

[0040]
FIG. 2 provides an exemplary series of interconnected gates which pose challenges in accounting for path correlation.

[0041]
FIG. 3 schematically illustrates a method for constructing a vector w from vectors x and y (which represent delay elements X and Y) wherein w can be used to decompose correlated delay elements X and Y into substitute delay elements which are uncorrelated.

[0042]
FIG. 4 schematically illustrates a method for decomposing correlated delay elements X and Y into uncorrelated substitute elements X′ (=X−W), Y′ (=Y−W), and W.

[0043]
FIG. 5 illustrates an exemplary circuit having a highly nonlinear (nonGaussian) output owing to the cascaded gates (and a corresponding string of MAX operations, with each receiving the output of the prior operation).
DETAILED DESCRIPTION OF PREFERRED VERSIONS OF THE INVENTION

[0000]
1. Asymptotic Approximation of MAX Operator (Linear Mixing Approximation)

[0044]
The invention first proposes an approximation for the nonlinear MAX operator by using a weighted linear mixing operator, such that if the inputs to the linear mixing operator have a Gaussian distribution, so will the output after the linear mixing operation. However, the linear mixing operator also preserves the correlation information that is destroyed by the conventional MAX operation, thereby significantly improving the accuracy of the STA. The preferred linear approximation is of the form:
U=ρX+(1−ρ)
Y+ζ (11)
wherein ρ is a constant between 0 and 1 (0<ρ<1) called the contribution factor, and ζ is an arbitrary constant, chosen so that where X and Y have a Gaussian distribution and cov(X,Y)=0, the first two moments (mean and variance) of the resulting Gaussian output U match those of Z=MAX(X,Y). This requirement leads to the following two equations:
μ
_{U}=ρμ
_{X}+(1−ρ)μ
_{Y}+ζ=μ
_{Z} (12)
σ
_{U} ^{2} =E{U−E{U} ^{2}}=ρ
^{2}σ
_{X} ^{2}+(1−ρ)
^{2}σ
_{Y} ^{2}=σ
_{Z} ^{2} (13)
From Equation (12):
ζ=μ
_{Z}−(μ
_{X}+μ
_{Y})ρ
where the contribution factor ρ can be determined from Equation (13) by solving the quadratic equation:
(σ
_{X} ^{2}+σ
_{Y} ^{2})ρ
^{2}−2σ
_{Y} ^{2}ρ+σ
_{Y} ^{2}−σ
_{Z} ^{2}0 (13)
Here, the contribution factor p must have at least one realvalued root between 0 and 1. This is met when the following inequality is satisfied:
$\begin{array}{cc}{\sigma}_{Z}^{2}\ge \frac{{\sigma}_{X}^{2}{\sigma}_{Y}^{2}}{{\sigma}_{X}^{2}+{\sigma}_{Y}^{2}}& \left(14\right)\end{array}$
And thus the contribution factor ρ has two realvalued roots that can be expressed as:
$\begin{array}{cc}{\rho}_{\pm}=\frac{{\sigma}_{Y}^{2}}{{\sigma}_{X}^{2}+{\sigma}_{Y}^{2}}\pm \sqrt{{\left(\frac{{\sigma}_{X}^{2}}{{\sigma}_{X}^{2}+{\sigma}_{Y}^{2}}\right)}^{2}+\frac{{\sigma}_{Z}^{2}{\sigma}_{X}^{2}}{{\sigma}_{X}^{2}+{\sigma}_{Y}^{2}}}& \left(15\right)\end{array}$
It can be shown that if
σ
_{Y} ^{2}≧σ
_{X} ^{2, }0<σ
_{−}<1
and if
σ
_{Y} ^{2}<σ
_{X} ^{2}, 0<σ
_{+}<1
So by switching the contribution factor ρ between ρ
_{−} and ρ
_{+} according to the relative magnitude of σ
_{X} ^{2 }and
, 0<ρ<1 can always be guaranteed.

[0045]
For independent Gaussian random variables X and Y, if the pdf of MAX(X,Y) as estimated from Monte Carlo simulation is compared to its Gaussian approximation from the linear mixing operator, it can be seen that a very close match results. Thus, the linear mixing operator is sufficient to model the results of MAX(X,Y) where X and Y are independent. However, where X and Y are correlated (as from path correlation), error can exist between MAX(X,Y) and its linear mixing approximation (though simulations can verify that it is nonetheless less than that of the tightnessbased approach of C. Visweswariah, K. Ravindran, and K. Kalafala, “Firstorder parameterized blockbased statistical timing analysis,” TAU'04, February 2004). Nevertheless, as will be discussed later in this document, correlated random variables modeling delay elements in a circuit timing graph can be decomposed into independent delay elements, and the MAX operation on the correlated delay elements can be equated to a MAX on the decomposed independent delay elements followed by an ADD operator. Such decomposition can rely on the variation vector timing model (or extended canonical timing model) discussed below.

[0000]
2. Variation Vector Timing Model (Extended Canonical Timing Model)

[0046]
As previously noted, the canonical timing model of Equation (8) only preserves node delay correlations caused by global variation, while neglecting edge delay correlations from path reconvergence. The present invention preferably implements an extended canonical timing model that is capable of representing all the correlation, whether global or path correlation, between any pair of delay elements in a circuit, whether node delays or edge delays. The extended canonical timing model is based on the following principles.

[0047]
Assume that there are N nodes and M global variations in a timing graph, and that the linear approximation for the MAX operator can be expressed by Equation (11). If every node delay can be modeled by the canonical format of Equation (8), then every delay element—whether a node delay or an edge delay—can then be represented by the following extended canonical timing expression:
$\begin{array}{cc}X={\mu}_{X}+\sum _{i=1}^{N}{\alpha}_{X,i}{R}_{i}+\sum _{j=1}^{M}{\beta}_{X,j}{G}_{j}& \left(16\right)\end{array}$
This can be established using the mathematical induction principle:

[0048]
(a) Node delays: if X is a node delay, it will automatically have the extended canonical timing format because the original canonical timing Equation (8) is a subset of Equation (16) in that for the kth node delay, only one α_{X,k }has nonzero value while all other α_{X,j≠k }are set to zero.

[0049]
(b) ADD operation: X=A+B (e.g., where X is the output edge delay resulting from input edge delay A entering node B), and delay elements A, B all fit Equation (16), then X must have the extended canonical timing format.

[0050]
(c) MAX operation: if X=MAX(A,B) (e.g., where X is the output edge delay resulting from merger of input edge delays A and B), given that delay elements A and B fit Equation (16), then X=MAX(A,B)=ρA+(1−ρ)B+ζ as per Equation (11) if A and B are independent. (If A and B are correlated, they can first be decorrelated as discussed later in this document: here A=A′+W, B=B′+W, cov(A′,B′)=0, and A′, B′, and W also each meet Equation (16). Thus, X=MAX(A,B)=MAX(A′+W, B′+W)=W+max(A′,B′)=W+ρA′+(1−ρ)B′+ζ, and X will still have the extended canonical delay format of Equation (16).)

[0051]
(d) Edge delays: any edge delay can ultimately be expressed as the result of one or multiple steps of ADD and/or MAX operations from node delays. Thus, based on assertions (a)(c), the mathematical induction principle ensures that edge delays will also have the extended canonical format of Equation (16).

[0052]
Thus, with Equation (16), both global and path correlations can be handled uniformly through edges and nodes. More specifically, global variations are represented by the set of global sensitivity terms β_{Xj}, and dependence on path history (local variations from path reconvergence) are represented by nonzero node sensitivity terms α_{X,k}.

[0053]
For ease of processing, the extended canonical format of Equation (16) can be rewritten in a compacted vector format as
X=μ _{X} +x ^{T} b (17)
where b≡[R_{1}, . . . R_{N}, . . . G_{M}]^{T }is a random vector consisting of zeromean, unity variance independent Gaussian random variables and x≡[α_{X,1}, α_{X,N}, β_{X,1}, . . . β_{X,M}]^{T }is a deterministic vector, referred to herein as the Variation Vector (vv) of X. Thus, each delay element (X) in a circuit will be uniquely represented by its mean (μ_{X}) and variation vector (x), noted as
X=X(μ_{X} , x) (18)
The variation vector can be seen to have the following properties:

[0054]
(a) If c is a constant, the delay element X and the delay element X+c have the same variation vector x. This property indicates that a variation vector remains unchanged if a constant is added to the corresponding delay element, since the variation vector contains only the variance information of the delay element while the added constant only affects the mean of the delay element.

[0055]
(b) If delay elements X, Y, and Z can be expressed as Z=X+Y, then their corresponding variation vectors can be expressed as z=x+y. In other words, the variation vectors of delay elements are additive in the same manner as their delay elements.

[0056]
(c) If the relationship between delay elements X and Z can be expressed as Z=kX, then the relationship between their variation vectors can be expressed as z=kx. In other words, variation vectors are multiplicative in the same manner as their delay elements.

[0057]
(d) The variance of delay element X can be expressed as ρ^{2}x=x^{T}·x=∥x∥.

[0058]
(e) The covariance of delay elements X and Y can be expressed in terms of their variation vectors as cov(X,Y)=x^{T}·y=y^{T}·x.

[0059]
As will be discussed later, properties (b) and (c) are useful to the concept of variation vector propagation. Properties (d) and (e) make variation vectors an convenient and systematic way to evaluate the variances and correlations for any delay elements.

[0000]
3. Correlation Decomposition

[0060]
As discussed previously, if X and Y are independent, the linear mixing operator can be used to obtain an accurate Gaussian approximation of the pdf of MAX(X,Y) (i.e., the linear mixing operator is sufficient to model the results of MAX(X,Y) where X and Y are not correlated, as by the presence of path correlation). Since path correlation is common in VSLI (Very Large Scale Integration) and other modern circuits, there remains a need to determine (or at least approximate) the output of the MAX operation for correlated delay elements X and Y. There are known methods for decomposing correlated (dependent) delay elements into uncorrelated (independent) substitute elements, e.g., the Principal Components Analysis (PCA) method of C. Clark, “The greatest of a finite set of random variables,” Operations Research, pp. 145162, March 1961. Unfortunately, the MAX operation is generally not communicative with linear transformation operators U (i.e., any linear transformation operation U performed on correlated independent variables generally does not result in transformed variables having the same MAX output):
MAX {U(X,Y)}≠U{MAX(X,Y)}

[0061]
so this approach is of little use in calculating the MAX output for two correlated delay elements.

[0062]
However, an alternative approach to decomposing correlated delay elements can be developed based on the foregoing extended canonical timing model. As illustrated below, if delay elements X and Y in a circuit are represented in the extended canonical timing model as X=X(μ_{X}, x) and Y=Y(μ_{Y}, y), and if X and Y are correlated, there will be a third delay element W in the extended canonical timing model with an arbitrary mean of μ_{W}, with W=W(μ_{W}, w), such that cov(X′=X−W, Y′=Y−W)=0. In other words, the correlated X, Y can be decomposed into independent X′=X−W, Y′=Y−W, which allows MAX(X,Y) to be expressed as a function of W and MAX(X′, Y′).

[0063]
To illustrate, assume that the variation vectors of X and Y are x=(x_{1}, x_{2}, . . . , x_{N+M})^{T }and y=(y_{1}, Y_{2}, . . . , y_{N+M})^{T}. A new variation vector of w=(w_{1}, W_{2}, . . . , w_{N+M})^{T }can be constructed as in FIG. 3, wherein
w _{i}=MIN(x _{i} , y _{i}) i=1, 2, . . . , N+M (19)
and the MIN operation is token in the sense of absolute value. The constructed variation vector w will completely define a new random variable in the extended canonical timing format as W=W(μ_{W}, w), with an arbitrary mean value of μ_{W}. Further, cov(X−W, Y−W)=(x−w)^{T }(y−w)=0 since it is impossible for (x−w) and (y−w) to have common nonzero components.

[0064]
With this method of correlation decomposition, correlated delay elements X and Y are decomposed into independent X′, Y′and W by the relations:
X=X′+W
Y=Y′+W
cov(X′,Y′)=0
and a valuable feature of the decomposition is that it is communicative with the MAX operation:
MAX(X, Y)=MAX(X′+W, Y′+W)=W+MAX(X′, Y′)
Thus, all MAX operations on dependent (correlated) delay elements can be simplified as a MAX operation on independent delay elements followed by an ADD operation.
4. Propagating Mean and Variation Vectors Throughout a Timing Graph

[0065]
Using the foregoing techniques, the means and variation vectors of the nodes in a timing graph can be propagated through the graph to obtain the means and variation vectors of the edges throughout the entire circuit (graph). As previously noted, all STA operations in a graph can be represented by ADD and MAX operations. With ADD operations,
Z(μ_{Z} , z)=X(μ_{X} , x)+Y(μ_{Y} , y)
and mean and variation propagation is straightforward, and consistent with the foregoing equations (1) and (2):
μ_{Z}=μ_{X}+μ_{Y} (20)
z=x+y (21)
In contrast, the MAX operation:
Z(μ_{Z} , z)=MAX{X(μ_{X} , x), Y(μ_{Y} , y)}
has more complicated mean and variation vector propagation, at least where X and Y are dependent. Four steps are involved, as schematically depicted in FIG. 4:
(1) Correlation Decomposition:
X(μ_{X} , x)=X′(μ_{X′} , x′)+W(0, w)
Y(μ_{Y} , y)=Y′(μ_{Y′} , y′)+W(0, w)
(note that here the arbitrary μ_{W }has been set to zero), and where
cov(X′, Y′)=x′ ^{T} ·y′=0
(2) Calculate μ_{Z′} and σ_{Z′} for Z′=MAX(X′,Y′) (from Equations (3) and (4))
(3) Calculate the contribution factor ρ for Z′=MAX(X′,Y′) (from Equation (11))
(4) Final Results for Z=MAX(X,Y):
μ_{Z}=μ_{Z′}+μ_{W}=μ_{Z′} (22)
z=ρx′+(1−ρ)y′+w (23)
The foregoing assumes correlated X and Y, but if X and Y are known to be independent, decomposition is unnecessary and the MAX operation can be simplified by use of the linear mixing approximation discussed earlier.

[0066]
Occasionally, multiple edges will merge in a node, or other situations may arise where more than two delay elements are involved in the MAX operation. In these circumstances, MAX can be performed iteratively, with MAX being performed on two of the elements at each iteration.

[0000]
5. Streamlining of Computations: Sparsity Enhancement

[0067]
Assuming N nodes in a timing graph, each node delay (as well as a corresponding edge delay) will have a N+M dimensional variation vector. Thus, the total computation and storage required will be O(N(N+M))≈O(N^{2}). However, during testing of the foregoing procedures with benchmark circuits, it has been found that many components in the variation vector have very small values, indicating that their contributions to the overall variance is insignificant. Thus, by setting these small coefficients to zero, the variation vector will become a sparse vector containing many zero components, greatly streamlining computations. A preferred method of detecting and eliminating these small components, referred to herein as the flexible vector approach, follows.

[0068]
The flexible vector approach focuses on curtailing the node sensitivity (also called local sensitivity) portion α_{X,i }of the variation vector (Equation (17)) when the node sensitivity has small values. A drop threshold is defined such that if the node sensitivity is smaller than the drop threshold, the node sensitivity is deemed to have small value, and is preferably placed into a drop candidate pool to be pruned from the variation vector.

[0069]
However, as is often the case, the resulting computational efficiency is achieved at the price of accuracy: dropping node sensitivities α_{X,i }with small magnitude is the same as applying truncation to the variation vector, and in subsequent computations, the quantization error may accumulate, resulting in nonnegligible error (particularly for large circuits). One solution to this problem is to lump the components in the drop candidate pool into a single correction term:
x _{pool} =√{square root over (Σx^{2} _{dropped Components})} (24)
When two variation vectors merge through either an ADD or a MAX operation, their pooling components are assumed to be independent. Hence,
z _{pool}(ADD)=√{square root over (x ^{2} _{pool} +y ^{2} _{pool})} (25)
z _{pool}(MAX)=√{square root over (ρ^{2} x ^{2} _{pool}+(1−ρ)^{2} y ^{2} _{pool})} (26)
Using this drop and pool mechanism, what is really dropped during computation is the path correlation. Thus, a good indication of the extent of path correlation in a circuit can be had by simply looking at the length of the variation vector. If a node is not in any statistically critical paths, its variation will be automatically dropped. A node is also dropped if it is in a critical path but is not statistically important in that path. Thus, by looking to the variation vector generated by use of the foregoing methodology, a circuit designer can readily determine which paths and nodes are statistically critical to circuit performance—information which can be of great value in circuit design.

[0070]
Additionally, if the average length of the node part of the pruned variation vectors for a given drop threshold is then defined as the path correlation length (Γ), it can be seen that using the foregoing poolanddrop methodology, the computation complexity is significantly reduced, from O(N^{2}) down to O[(Γ+M)·N] (where Γ, M<<N).

[0000]
6. Experimental Results

[0071]
It is useful to consider experimental results for the foregoing methodology. The methodology was implemented in C/C++ and tested by ISCAS85 benchmark circuits. Before testing, all benchmark circuits were remapped into a library which has gates of not, nand2, nand3, nor2, nor3 and xor/xnor. TABLE 1 summarizes the gate count for each test circuit after gate remapping:
 TABLE 1 
 
 
 Name 
 c432  c499  c880  c1335  c1908  c2670  c3540  c5315  c6288  c7552 
 
Gate Counts  280  373  641  717  1188  2004  2485  3865  2704  5355 

All gates in the library were implemented in 0.18 μm technology, and their delays were characterized by Monte Carlo simulation with Cadence tools (Cadence Design Systems, Inc., San Jose, Calif., USA) assuming that all variation sources, whether process variations or operational variations, follow the Gaussian distribution.

[0072]
For illustration purposes, only three parameter variations were considered global: channel length (L), supply voltage (Vdd) and temperature (T). All other variation sources, specified in the 0.18 μm technology file, were assumed to be localized in the gate being considered. Further, the spatial dependency of the gate delays was not considered for sake of simplicity (though in real life, gate delay parameters are position dependent).

[0073]
Monte Carlo simulations with 10,000 repetitions were run to determine “ideal results” for each benchmark circuit, against which the present methodology could be compared. Each repetition involved a process of static timing analysis by fixing global and node variations into a set of randomly sampled values. The global variations are sampled once for each repetition, while the node variation for each gate is newly sampled each time the gate is computed.

[0074]
TABLE 2 summarizes the edge delay distribution parameters at the primary output of each testing circuit from the “ideal” Monte Carlo (MC) STA and three other STA methodologies. The results labeled as Present represent a high accuracy test of the present methodology wherein the drop threshold is set to as small as 1% (i.e., most path correlations are considered), while the results labeled as Canonical represent the traditional canonical timing model (which is equivalent to the present methodology wherein the drop threshold is set to 100%, i.e., only global correlation is considered). For comparison purposes, a fourth STA method (labeled as “No Corr.”), wherein no correlation was considered, was also implemented and simulated. The mean and standard variation μ and σ of the distribution are shown in picoseconds. The column τ
_{97}=μ+2σ shows the delay estimation at a confidence level of 97%.
 TABLE 2 
 
 
 CPU  Delay Distribution (ps) 
Circuit  STA Method  Time  μ  σ  τ_{97} 

c432  Monte Carlo  6.449  1288.8  219.3  1727.5 
 Present  0.030  1299.0  220.4  1739.9 
 Canonical  0.010  1348.6  216.0  1780.7 
 No Corr.  0.000  1392.6  22.3  1437.1 
c499  Monte Carlo  8.182  1073.6  178.9  1431.4 
 Present  0.030  1084.8  180.5  1445.8 
 Canonical  0.010  1125.4  178.3  1482.0 
 No Corr.  0.000  1148.6  20.5  1189.5 
c880  Monte Carlo  14.83  1445.4  266.3  1977.9 
 Present  0.050  1447.6  264.9  1977.3 
 Canonical  0.010  1463.1  264.2  1911.5 
 No Corr.  0.010  1471.6  16.5  1504.5 
c1355  Monte Carlo  19.01  1445.4  251.4  1948.3 
 Present  0.071  1460.9  250.7  1962.3 
 Canonical  0.010  1529.4  249.0  2027.4 
 No Corr.  0.000  1536.8  12.5  1561.8 
c1908  Monte Carlo  35.80  1828.2  327.3  2482.8 
 Present  0.150  1841.9  328.4  2498.6 
 Canonical  0.030  1881.7  326.5  2534.7 
 No Corr.  0.010  1895.1  27.0  1949.2 
 Monte Carlo  72.16  2097.0  382.9  2862.8 
 Present  0.181  2104.4  382.9  2870.1 
 Canonical  0.050  2161.8  379.7  2921.2 
 No Corr.  0.020  2193.1  22.0  2237.1 
c3540  Monte Carlo  84.02  2747.2  498.8  3744.8 
 Present  0.240  2752.3  502.1  3756.5 
 Canonical  0.050  2850.3  500.  3851.5 
 No Corr.  0.020  2859.7  23.4  2906.5 
c5315  Monte Carlo  140.8  2399.3  441.7  3282.6 
 Present  0.641  2404.8  442.2  3289.2 
 Canonical  0.080  2474.1  441.3  3356.7 
 No Corr.  0.040  2544.8  31.7  2608.2 
c6288  Monte Carlo  114.2  6740.1  1286.8  9313.6 
 Present  5.198  6775.9  1275.1  9326.2 
 Canonical  0.070  7290.8  1273.1  9836.9 
 No Corr.  0.030  7325.1  14.9  7355.0 
c7552  Monte Carlo  203.0  1911.7  348.7  2609.0 
 Present  0.571  1916.6  353.8  2624.2 
 Canonical  0.110  1974.3  352.5  2679.3 
 No Corr.  0.050  2027.4  26.3  2080.1 


[0075]
TABLE 2 illustrates that the present methodology requires greater CPU time than the Canonical and No Corr. methods, but orders of magnitude less than Monte Carlo simulation. TABLE 3 then summarizes the accuracy of the various STA methods compared with the Monte Carlo STA method. It is apparent that the No Corr. method, lacking any consideration of correlation, fails to give reasonable variance estimation (which exemplifies the importance of correlations in STA). However, the No Corr. method still has fairly reasonable mean estimation, illustrating that the mean delay is not very sensitive to the correlation. The present method has significantly less error in mean estimation than both the traditional canonical timing model (“Canonical”) and the No. Corr. method, illustrating the importance of considering path correlation. Variance estimation error in the present method, while significantly less than the No. Corr. method, is generally (though not always) slightly better than in the traditional canonical timing model. The closeness in variance estimation error between the present and traditional canonical timing models is possibly owing to the fact that the variance in the tested cases is dominated by global variation.
 TABLE 3 
 
 
 Mean Error (δμ)  Variance Error (δσ) 
  Canon    Canon  
Circuit  Present  ical  No Corr.  Present  ical  No Corr. 

c432  0.79%  4.64%  8.05%  0.50%  1.50%  89.80% 
c499  1.04%  4.82%  6.99%  0.89%  0.34%  88.50% 
c880  0.15%  1.22%  1.81%  0.53%  0.79%  93.80% 
c1355  1.07%  5.81%  6.32%  0.28%  0.95%  95.00% 
c1908  0.75%  2.93%  3.66%  0.27%  0.24%  91.80% 
c2670  0.35%  3.09%  4.58%  0.00%  0.84%  94.30% 
c3540  0.19%  3.75%  4.10%  0.66%  0.36%  95.30% 
c5315  0.23%  3.12%  6.06%  0.11%  0.09%  92.80% 
c6288  0.53%  8.17%  8.68%  0.65%  1.06%  98.80% 
c7552  0.25%  3.27%  6.05%  1.46%  1.09%  92.50% 


[0076]
Thus, in summary, the present method yields mean and variance estimation which is close to that of Monte Carlo simulation, but without the significant runtime penalty carried by Monte Carlo methods. If the accuracy of mean estimation can be relaxed, the drop threshold can be raised in the present method to yield faster runtime performance at the cost of some mean overestimation.

[0077]
As previously discussed, another benefit of the use of the present methods is that they allow the degree of path correlation in a circuit to be objectively quantified, e.g., in the form of the path correlation length (Γ) (previously defined as the average length of the node part of the pruned variation vectors for a given drop threshold). For the foregoing ISCAS85 benchmark circuits, the path correlation length Γ at drop threshold of 1% is summarized in TABLE 4. This table illustrates that the path correlation length Γ is much smaller than the circuit size, and basically independent of the circuit size since it remains in the range of about 1020 even when the circuit size changes dramatically. This illustrates how the use of the flexible vector format can help reduce the complexity of the present methods, since the drop threshold can be set to a desired degree of computational complexity for the given circuit size. The only exceptionally high path correlation length among the tested circuits happens with the c6288 circuit, which is known as a 16bit array multiplier. Here there are large amount of equal delay paths in the circuit, so few node variations can be dropped owing to their equal importance. This results in greater path correlation length.
 TABLE 4 
 
 
 Name 
 c432  c499  c880  c1335  c1908  c2670  c3540  c5315  c6288  c7552 
 
Γ  22.0  11.1  14.2  19.3  27.0  15.4  21.2  14.4  80.9  16.0 

7. Use of Bounded MAX Approximation for NonGaussian Timing Variables

[0078]
The foregoing discussion assumed that timing variables all have a normallydistributed (Gaussian) distribution. However, in some circuits, nonGaussian timing variables can exist, usually owing to nonGaussian gate delays and/or the nonGaussian output of the traditional MAX operator. One can compensate for nonGaussian gate delays by approximating them with a Gaussian distribution which conservatively bounds the nonGaussian variables, but the nonGaussian MAX output poses greater challenges. If timing variables are significantly nonGaussian, timing predictions given by both the preferred and prior STA methods previously described can be inaccurate, and risky to rely upon. It is therefore desirable to apply the Gaussian assumption whenever it is valid, and at the same time use a bounded conservative timing prediction whenever the Gaussian assumption is invalid. The following discussion provides a preferred approach to this objective.

[0079]
Initially, it is useful to again briefly review the traditional MAX operator. As discussed previously, its output is nonlinear, such that the MAX of two Gaussian random variables will no longer be Gaussian. If the output of the MAX operation from two Gaussian random variable inputs is plotted, it can be seen that while the Gaussian inputs each have a symmetric pdf, the MAX output is usually asymmetric: if the pdfs of the input X and Y overlap, Z=MAX(X,Y) cannot have any values to the left of the overlap; to the right side of the overlap region, Z will match the PDF of whichever of X and Y has the greater mean; but in the overlap region, there may be areas where X>Y, and conversely, where Y>X. As a result, the pdf of Z in the overlap region is similar to the pdf of whichever of X and Y has the greater mean, except that it is “squeezed” or “compressed” to the right. As a result, the overall pdf of Z=MAX(X,Y) is that it maintains a Gaussian tail at its right side, while its left side is compressed rightwardly to have a shorter tail, resulting in positive skew. Thus, the skewness can be used as a measure of how far the pdf of the MAX output deviates from an ideal Gaussian distribution (i.e., as a measure of its nonlinearity). For Z=MAX(X, Y), the skewness can be defined as:
$\begin{array}{cc}\kappa =\frac{1}{{\sigma}_{Z}^{3}}E\left\{{\left(Z{\mu}_{Z}\right)}^{3}\right\}& \left(27\right)\end{array}$
where μ_{Z }and σ_{Z }are the mean and standard deviation of Z.

[0080]
If one then analyzes the skewness resulting from a variety of X and Y pdfs, it is seen that in many cases, skewness is zero—meaning Z=max(X, Y ) is normally distributed, the MAX operator is linear, and the Gaussian assumption is valid. However, skewness is nonzero, and the MAX operator is significantly nonlinear, where X and Y have very similar means but very different variances.

[0081]
This document previously discussed the use of the tightnessbased MAX approximation of Equation (6) (with tightness being defined in Equation (5)), and the contributionfactor based MAX approximation of Equation (11) (with the contribution factor being defined in Equation (15)). If the results of these equations are compared with the results of Monte Carlo simulation, it can be seen that when the nonlinearity of the MAX operation grows significant (i.e., when the output of MAX operator is significantly nonGaussian)—i.e., where skewness grows—error grows at higher confidence levels, and these approximations begin to underestimate the delays. In contrast to overestimated delays, which lead to conservatively safe circuit designs, the underestimated delays could lead to excessive design failures. Thus, modified versions of the foregoing MAX linear approximations have been developed to compensate for this potential difficulty.

[0082]
It is first useful to note that for any random variables X, Y, and R=X−Y, the following transformation will always be true:
MAX(
X, Y)=MAX(
X−Y, 0)+
Y=MAX(
R, 0)+
Y (28)
When we then consider the MAX approximations of Equations (6) and ( 11)—here Equation (6) will be used for sake of illustration:
Z=MAX(
X, Y)≈
=QX+(1
−Q)
Y+Δ=
then MAX(R, 0) can be reexpressed as:
W=MAX(
R, 0)≈
QR+(1
−Q)·0
+Δ=QR+Δ=
This observation is important because it can be shown that the cdf of
=QR+Δ will effectively serve as an upper bound on the cdf of W=MAX(R, 0). (The formal proof will not be presented here, but this can be established experimentally by simulation results as well.) Thus,
=QR+Δ can serve as a conservative approximation for W=MAX(R, 0). To avoid being too conservative, this approximation can be constrained to apply within a certain confidence interval of [a, b] as:
W=MAX(
R,0)≦
QR+Δ=
(
a≦R≦b) (29)
Agarwal et al. (in A. Agarwal, V. Zolotov, and D. Blaauw, “Statistical timing analysis using bounds and selective enumeration,” IEEE Transactions on ComputerAided Design of Integrated Circuits and Systems, vol. 22, no. 9, pp. 12431260, September 2003) describe a distribution which, if applied here, would effectively amount to MAX(R, 0) being bounded within an interval with 100% confidence. Since this may be more conservative than necessary, the approach of Equation (28) can be useful since a user can flexibly decide the acceptable amount of risk involved.

[0083]
If the mean of Δ is then defined as:
$\begin{array}{cc}E\left\{\Delta \right\}=\begin{array}{cc}a\text{\hspace{1em}}Q& \left(a<b<0\right)\\ b\left(1Q\right)& \left(0<a<b\right)\\ \mathrm{MAX}\left(a\text{\hspace{1em}}Q,bb\text{\hspace{1em}}Q\right)& \left(a\le 0\le b\right)\end{array}& \left(30\right)\end{array}$
this provides a conservative upper bound of MAX(R, 0) over the confidence interval of [a, b] with the form of QR+Δ. In other words, if the confidence interval [a, b] for R corresponds to a confidence of (for example) 97%, then the use of QR+Δ as an approximation for MAX(R,0) will yield values which have a probability of at least 97% of overestimating the true value of MAX(R,0). The use of this approximation is thus conservative (and safe). Note that while this approximation was derived using the tightnessbased MAX approximation of Equation (6), a similar approximation can be derived by use of the contributionfactor based MAX approximation of Equation (11).

[0084]
While the foregoing approach will result in a safely conservative approximation, it will possibly be too conservative when the MAX output is almost linear (Gaussian) distribution. As noted previously, when the MAX operation is highly linear, the approach of Equations (6) (and/or (11)) is sufficient and there is no need for a bounded conservative approximation. Thus, the foregoing approach can be used to complement the approach of Equation (6) (or (11)): when skewness is low and MAX is linear (or nearly so), the bounded approximation is overly conservative and Equation (6) (or (11)) can be used; and when skewness is high, the approach of Equation (6) (or (11)) is risky and the bounded approximation can be used.

[0085]
This approach has been experimentally tested for the circuit depicted in
FIG. 5, where the overall delay and all internal arrival time variables are significantly nonGaussian (with the skewness of the output distribution being 2.2). TABLE 5 presents the predicted delay at the 97% confidence interval, τ
_{97}, using Equation (6), the foregoing bounded approximation, and Monte Carlo simulation.
 TABLE 5 
 
 
    Bounded  
  Equation (6)   Approximation  Monte Carlo 
 

 Skewness κ  2.2 
 τ_{97}  419.4  ps  455.5  ps  450.3  ps 
 μ  379.1  ps  399.6  ps  379.8  ps 
 σ  20.2  ps  27.8  ps  26.4  ps 
 

[0086]
Equation (6) significantly underestimates the τ_{97}, which is important because this underestimation, if relied upon, could lead to excessive design failure. On the other hand, the foregoing bounded approximation leads to a 97% delay which is slightly higher (and thus more conservative and safer than) the Monte Carlo “standard.” Thus, in cases where the Gaussian assumption is bad (skewness is high), the bounded approximation provides superior results. Further, while not reflected in TABLE 5, the computation time for the bounded approximation is only negligibly greater than that for Equation (6) since Equations (27) and (30) bear minimal additional computation overhead.

[0087]
The invention is not intended to be limited to the preferred methods and steps described above, but rather is intended to be limited only by the claims set out below. Thus, the invention encompasses all different versions that fall literally or equivalently within the scope of these claims. It should also be understood that in these claims, where symbols and formulae are expressed, the claims are not to be interpreted as meaning that the invention is limited to these symbols and formulae. Rather, the claims extend to processes utilizing the relations set forth by the formulae, regardless of whether the same characters/symbology are used, and regardless of whether the formulae are expressed in the form set forth in the claims or in a different format.