REGIONAL CLOCK SKEW MEASUREMENT
FIELD OF THE INVENTION
This invention relates generally to integrated circuits, and more particularly to measuring clock skew in a programmable logic device.
BACKGROUND OF THE INVENTION
Programmable logic devices (PLDs) are a well-known type of integrated circuit that can be programmed to perform specified logic functions. One type of PLD, the field programmable gate array (FPGA), typically includes an array of programmable tiles. These programmable tiles can include, for example, input/output blocks (IOBs), configurable logic blocks (CLBs), dedicated random access memory blocks (BRAM), multipliers, digital signal processing blocks (DSPs), processors, clock managers, delay lock loops (DLLs), and so forth.
Each programmable tile typically includes both programmable interconnect and programmable logic. The programmable interconnect typically includes a large number of interconnect lines of varying lengths interconnected by programmable interconnect points (PIPs). The programmable logic implements the logic of a user design using programmable elements that can include, for example, function generators, registers, arithmetic logic, and so forth.
The programmable interconnect and programmable logic are typically programmed by loading a stream of configuration data into internal configuration memory cells that define how the programmable elements are configured. The configuration data can be read from memory (e.g., from an external PROM) or written into the FPGA by an external device. The collective states of the individual memory cells then determine the function of the FPGA.
Another type of PLD is the Complex Programmable Logic Device, or CPLD. A CPLD includes two or more "function blocks" connected together and to input/output (I/O) resources by an interconnect switch matrix. Each function block of the CPLD includes a two-level AND/OR structure similar to those used in Programmable Logic Arrays (PLAs) and Programmable Array Logic (PAL) devices. In some CPLDs, configuration data is stored on-chip in non-volatile memory. In other CPLDs, configuration data is stored on-chip in non-volatile memory, then downloaded to volatile memory as part of an initial configuration sequence.
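As an illustration only, a two-level AND/OR structure can be modeled in software as a sum of products: each product term ANDs a set of literals, and the term outputs are ORed together. The following Python sketch is not taken from any particular device; the function name `sum_of_products`, the term encoding, and the example function are hypothetical.

```python
def sum_of_products(product_terms, inputs):
    """Evaluate a two-level AND/OR structure.

    Each product term maps an input name to the value its literal requires
    (True for the input itself, False for its complement). A term is the AND
    of its literals; the result is the OR of all terms.
    """
    return any(
        all(inputs[name] == required for name, required in term.items())
        for term in product_terms
    )


# Hypothetical function: f = (a AND NOT b) OR (b AND c)
terms = [
    {"a": True, "b": False},
    {"b": True, "c": True},
]

f_value = sum_of_products(terms, {"a": True, "b": False, "c": False})
```

In an actual device the product terms would be selected by configuration data rather than by a Python list; the sketch shows only the logic structure.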
For all of these programmable logic devices (PLDs), the functionality of the device is controlled by data bits provided to the device for that purpose. The data bits can be stored in volatile memory (e.g., static memory cells, as in FPGAs and some CPLDs), in non-volatile memory (e.g., FLASH memory, as in some CPLDs), or in any other type of memory cell.
Other PLDs are programmed by applying a processing layer, such as a metal layer, that programmably interconnects the various elements on the device. These PLDs are known as mask programmable devices. PLDs can also be implemented in other ways, for example, using fuse or antifuse technology. The terms "PLD" and "programmable logic device" include but are not limited to these exemplary devices, as well as encompassing devices that are only partially programmable.
PLDs and other logic devices have functional blocks that are physically separate on the IC chip operating off a common clock signal. The PLD has a clock distribution network (clock "tree") designed to provide the clock signal to various portions of the IC chip with minimal clock skew. Clock skew is the deviation from ideality of the clock signal at a particular location on the PLD relative to a reference clock signal, such as the clock signal provided to a clock tree or the clock signal present at the clock driver output of the PLD. The clock signal at the particular location is typically delayed from the reference clock signal, and the clock skew is expressed in time, such as in picoseconds ("ps") of delay. For example, if a rising clock edge occurs at time=0 on the reference clock signal, the rising clock edge might occur a few hundred ps later elsewhere on the clock tree.
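The definition of clock skew as a delay relative to a reference edge can be illustrated with a short Python sketch. The location names and arrival times below are hypothetical values chosen for illustration, not measurements from any device.

```python
# Hypothetical rising-edge arrival times in picoseconds (illustrative values only).
REFERENCE_EDGE_PS = 0.0  # rising edge on the reference clock signal (time = 0)

arrival_ps = {
    "near_driver": 40.0,
    "mid_tree": 180.0,
    "far_corner": 310.0,
}


def skew_ps(arrival, reference=REFERENCE_EDGE_PS):
    """Skew at one location: delay of its clock edge relative to the reference edge."""
    return arrival - reference


# Skew of each location relative to the reference clock signal.
skews = {name: skew_ps(t) for name, t in arrival_ps.items()}

# Spread between the earliest and latest locations on the tree.
worst_case_spread = max(skews.values()) - min(skews.values())
```

Here each location's skew is simply its edge delay from the reference, and the spread between locations indicates how far apart two blocks on the same clock can be in time.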
SUMMARY OF THE INVENTION
In an embodiment of the present invention, an integrated circuit ("IC"), such as a field-programmable gate array ("FPGA") or a complex programmable logic device ("CPLD"), has a global clock buffer coupled to a first regional clock buffer through a first global clock spine. A first flip-flop is close to a first end of a first regional clock spine, and is coupled to a circuit block, such as a configurable logic block ("CLB"). The CLB is coupled to the global clock buffer through a first routing portion and a second routing portion couples the first flip-flop to the CLB so as to form a first clock ring allowing measurement of a first clock ring delay. In further embodiments, additional clock rings are configured in the IC, allowing measurements of additional clock ring delays. In suitably symmetric devices, skew along the regional clock spine is calculated from the clock ring delays.
BRIEF DESCRIPTION OF THE DRAWINGS
FIGS. 1A and 1B are simplified plan diagrams of FPGAs suitable for being configured into or operating according to embodiments of the invention.
FIG. 2A is a simplified plan view diagram of an IC showing a symmetrical global clock tree.
FIG. 2B is a simplified plan view diagram of the IC of FIG. 2A showing asymmetrical regional clock trees.
FIG. 2C is a simplified plan view diagram of the IC of FIG. 2B with a first clock ring formed with a global clock buffer and a first regional clock buffer.
FIG. 2D is a simplified plan view diagram of the IC of FIG. 2B with a second clock ring formed with the global clock buffer and the first regional clock buffer.
FIG. 2E is a simplified plan view diagram of the IC of FIG. 2B with a third clock ring formed with the global clock buffer and a second regional clock buffer.
FIG. 2F is a simplified plan view diagram of the IC of FIG. 2B with a fourth clock ring formed with the global clock buffer and the second regional clock buffer.
FIG. 3 is a simplified flow chart of a method of measuring skew in an asymmetric clock tree according to an embodiment of the present invention.
DETAILED DESCRIPTION OF THE DRAWINGS
PLDs often have many regions operating on their own regional clock signals. A global clock tree provides the clock signal to regional clock trees, which in turn provide the clock signal throughout a regional circuit. For example, the global clock tree extends along a center line of the chip, connected to regional clock trees on either side in a symmetrical fashion that branch out from the left or right edge of the chip. This arrangement decreases skew within a regional circuit, compared to directly clocking the regional circuits off of the global clock tree. Sometimes, a regional clock tree is used to distribute the clock signal to several regional circuits. In such arrangements, the skew on the clock distribution network extending along lines parallel to the center line of the chip is called "vertical" skew, and the skew on the clock distribution network extending along lines at right angles to the center line of the chip is called "horizontal" skew.
One way to measure clock delay inside an FPGA is to use a built-in self test ("BIST") ring. Clock skew on a symmetrical clock tree can be deduced by measuring delay at various points of the global clock tree using a BIST ring. Programmable interconnections are used to connect the clock output back to the clock driver to complete the BIST ring. For example, a BIST ring path is configured through available chip resources, such as hex lines, double lines, long lines, and CLBs having flip-flops.
Typically, about twenty parameters are summed to obtain a total delay around the ring. The ring is re-configured several times to obtain, e.g., twenty equations with twenty solutions (i.e., measured delays). Linear algebra is used to solve for the delay of the desired link (parameter) in the BIST ring. However, the errors in the measurements for the programmable connections can be large compared to the clock skew that is being measured. In one example, the clock distribution network is designed with a maximum clock skew in the range of 100 ps. Further discussion of using BIST ring techniques to measure clock skew of a symmetrical global clock network is found in U.S. patent application Ser. No. 10/021,448, entitled METHODS AND CIRCUITS FOR MEASURING CLOCK SKEW ON PROGRAMMABLE LOGIC DEVICES, filed Oct. 30, 2001 by Siuki Chan, and in U.S. patent application Ser. No. 10/021,447, entitled METHODS AND CIRCUITS FOR MEASURING CLOCK SKEW ON PROGRAMMABLE LOGIC DEVICES, filed Oct. 30, 2001 by Siuki Chan, the disclosures of which are hereby incorporated by reference for all purposes.
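The linear-algebra step can be sketched in Python on a reduced problem with three parameters rather than about twenty. This is a minimal illustration under stated assumptions, not the method of the referenced applications: the parameter names (`clock_link`, `route_a`, `route_b`), the 0/1 matrix recording which parameters each ring configuration includes, and the measured totals are all hypothetical.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the caller's lists are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest pivot magnitude for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("ring configurations are not linearly independent")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from all rows below the pivot row.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


# Hypothetical example. Columns: clock_link, route_a, route_b.
# Each row marks which parameters one ring configuration passes through.
ring_uses = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
]
# Hypothetical measured total delay of each ring configuration, in ps.
measured_ps = [420.0, 570.0, 750.0]

clock_link_ps, route_a_ps, route_b_ps = solve_linear(ring_uses, measured_ps)
```

With these illustrative numbers the solved clock link delay is 120 ps while the routing delays are 300 ps and 450 ps, so a small relative error in the routing measurements propagates into an error comparable to the clock delay itself, which is the difficulty noted above.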
FIG. 1A is a simplified plan diagram of an integrated circuit ("IC") known as an FPGA 100, such as a VIRTEX™ IV FPGA manufactured by XILINX, Inc., which is suitable for being configured into or operating according to embodiments of the invention. FIG. 1A illustrates an FPGA architecture 100 that includes a large number of different programmable tiles including multi-gigabit transceivers (MGTs 101), configurable logic blocks (CLBs 102), random access memory blocks (BRAMs 103), input/output blocks (IOBs 104), configuration and clocking logic (CONFIG/CLOCKS 105), digital signal processing blocks (DSPs 106), specialized input/output blocks (I/O 107) (e.g., configuration ports and clock ports), and other programmable logic 108 such as digital clock managers, analog-to-digital converters, system monitoring logic, and so forth. Some FPGAs also include dedicated processor blocks (PROC 110).
Each programmable tile includes a programmable interconnect element (INT 111) having standardized connections to and from a corresponding interconnect element in each adjacent tile. Therefore, the programmable interconnect elements taken together implement the programmable interconnect structure for the illustrated FPGA. The programmable interconnect element (INT 111) also includes the connections to and from the programmable logic element within the same tile, as shown by the examples included at the top of FIG. 1A.
For example, a CLB 102 can include a configurable logic element (CLE 112) that can be programmed to implement user logic plus a single programmable interconnect element (INT 111). A BRAM 103 can include a BRAM logic element (BRL 113) in addition to one or more programmable interconnect elements. Typically, the number of interconnect elements included in a tile depends on the height of the tile. In the pictured embodiment, a BRAM tile has the same height as four CLBs, but other numbers (e.g., five) can also be used. A DSP tile 106 can include a DSP logic element (DSPL 114) in addition to an appropriate number of programmable interconnect elements. An IOB 104 can include, for example, two instances of an input/output logic element (IOL 115) in addition to one instance of the programmable interconnect element (INT 111). As will be clear to those of skill in the art, the actual I/O pads connected, for example, to the I/O logic element 115 are manufactured using metal layered above the various illustrated logic blocks, and typically are not confined to the area of the input/output logic element 115.
In the pictured embodiment, a columnar area near the center of the die (shown shaded in FIG. 1A) is used for configuration, global clock, and other control logic. Horizontal areas 109 extending from this column are used to distribute the clocks and configuration signals across the breadth of the FPGA. As illustrated, the global clock first propagates down the central vertical spine and then extends out via the horizontal spines, e.g., 109.
Some FPGAs utilizing the architecture illustrated in FIG. 1A include additional logic blocks that disrupt the regular columnar structure making up a large part of the FPGA. The additional logic blocks can be programmable blocks and/or dedicated logic. For example, the processor block PROC 110 shown in FIG. 1A spans several columns of CLBs and BRAMs.
Note that FIG. 1A is intended to illustrate only an exemplary FPGA architecture. The numbers of logic blocks in a column, the relative widths of the columns, the number and order of columns, the types of logic blocks included in the columns, the relative sizes of the logic blocks, and the interconnect/logic implementations included at the top of FIG. 1A are purely exemplary. For example, in an actual FPGA more than one adjacent column of CLBs is typically included wherever the CLBs appear, to facilitate the efficient implementation of user logic.
A further description of the global clock tree is disclosed in co-pending U.S. patent application Ser. No. 10/836,722, entitled "A Differential Clock Tree in an Integrated Circuit," by Vasisht M. Vadi, et al., filed Apr. 30, 2004, which is herein incorporated by reference.
In one embodiment, FPGA 130 of FIG. 1B has the processor block 110 of FIG. 1A replaced by extending the BRAM 103 and CLB 102 columns into the area occupied by the PROC 110. In this case FPGA 130 is symmetric. In particular, the top half is essentially a mirror image of the bottom half, and the right half is essentially a mirror image of the left half. Arranging physically large ICs in a symmetric fashion allows convenient clock distribution from a global clock, such as from or near a center column 132 of the FPGA 130. Furthermore, similarly positioned functional blocks within opposite halves typically perform in a similar fashion with regard to delay, power consumption, and available routing resources.
In other embodiments, an IC does not have exact mirror symmetry. For example, an IC having a large embedded block, such as an embedded microprocessor (e.g., PROC 110 in FIG. 1A), does not have a similar embedded block in its