|Publication number||US6580294 B1|
|Application number||US 10/020,446|
|Publication date||Jun 17, 2003|
|Filing date||Dec 18, 2001|
|Priority date||Dec 18, 2001|
|Also published as||US20030112036|
|Inventors||Thomas D. Fletcher|
|Original Assignee||Intel Corporation|
1. Technical Field
The present invention generally relates to semiconductor circuits. More particularly, the invention relates to differential domino logic stages for digital adders.
Fundamental to the operation of virtually all digital microprocessors is the function of digital (i.e., binary) addition. Addition is used not only to provide numerical sums, but also in the implementation of numerous logic functions. In a typical microprocessor, many adders are used for these functions. When two digital words are added, the carry bit that results from the addition of less significant bits must be considered when adding more significant bits. The carry bit can easily be considered by rippling a carry signal through the entire addition chain as the addition is performed. A problem with such an approach, particularly for relatively large words (e.g., 64 bits), is that substantial time is required to ripple the carry signal. Since adders often perform logic functions in critical time paths, the time needed to ripple the carry signal can slow down the microprocessor.
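By way of illustration only, the ripple approach described above can be sketched in Python as follows. The function name `ripple_carry_add` and the default width of 64 are assumptions made for this sketch and do not come from the specification. The loop makes the serial dependency visible: the carry into each bit position is not known until every less significant position has been processed.

```python
def ripple_carry_add(a, b, width=64):
    """Add two unsigned integers bit by bit, rippling the carry upward.

    Returns (sum modulo 2**width, carry out of the most significant bit).
    """
    carry = 0
    total = 0
    for i in range(width):  # one step per bit position, least significant first
        abit = (a >> i) & 1
        bbit = (b >> i) & 1
        # sum bit for this position uses the carry from the position below
        total |= (abit ^ bbit ^ carry) << i
        # carry out of this position feeds the next more significant position
        carry = (abit & bbit) | (carry & (abit ^ bbit))
    return total, carry
```

For a 64-bit word the loop body runs 64 times in sequence, which is the delay that look-ahead techniques are intended to shorten.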
In response to the above concerns, techniques such as the static carry look-ahead (CLA) adder described in U.S. Pat. No. 5,847,984 to Mahurin have evolved. A difficulty associated with such a static adder, however, is that there typically is relatively high input loading on the circuit. High input loads can compromise speed. Domino circuits use clock signals to dynamically obtain “precharge” and “evaluation” phases for the domino circuits. These phases enable a reduction in input loading resulting in higher gain per stage and considerable speed increases. Two types of domino circuits are single ended and differential circuits. Single ended domino circuits use fewer transistors than the equivalent static circuits, but require two stages of logic when constructing exclusive OR (XOR) gates. This characteristic can be important considering the fact that XOR gates are used in the fabrication of arithmetic logic units (ALUs). Domino circuits such as the p-type polysilicon (or metal oxide) semiconductor (PMOS) circuit 10 of FIG. 3 and the n-type polysilicon (or metal oxide) semiconductor (NMOS) circuit 12 of FIG. 4, on the other hand, are commonly referred to as differential domino circuits, and are more robust and faster than single ended domino circuits. An important characteristic of differential domino circuits is that they lend themselves to the implementation of XOR gates with one stage of logic.
Traditionally, each differential domino logic stage has a precharge circuit 14, a first evaluate circuit 16 and a second evaluate circuit 18. The precharge circuit 14 is connected to a first potential 20 and a differential output defined by a first output node 22 and a second output node 24. The first evaluate circuit 16 is connected to a second potential 26 and the first output node 22. The second evaluate circuit 18 is connected to the second potential 26 and the second output node 24. It is important to note that the first (or “true”) evaluate circuit 16 and the second (or “not true”) evaluate circuit 18 are not symmetric under the conventional approach. Simply put, input transistor T1 is in parallel with the transistor stack T2/T3, whereas input transistor T4 is not in parallel with the transistor stack T5/T6. This is because in an adder the first evaluate circuit 16 implements the expression g1+p1g0, whereas the second evaluate circuit 18 implements the expression g1n(p1n+g0n). Such an asymmetrical architecture can be more difficult to fabricate and does not allow the g0n transistor (T6) to be connected directly to the output node.
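As a non-limiting sketch, the relationship between the two conventional expressions can be checked exhaustively in Python. The function names below are assumptions introduced for this illustration. The check confirms that g1n(p1n+g0n) is the logical complement of g1+p1g0 when each "n" input is the complement of its counterpart, which is why the two conventional evaluate circuits take different shapes (a device in parallel with a stack versus a device in series with a parallel pair).

```python
from itertools import product


def true_rail(g1, p1, g0):
    # conventional first evaluate circuit: g1 + p1*g0
    return g1 or (p1 and g0)


def complement_rail(g1n, p1n, g0n):
    # conventional second evaluate circuit: g1n*(p1n + g0n)
    return g1n and (p1n or g0n)


def rails_are_complementary():
    """Return True if the two expressions never agree for any input combination."""
    for g1, p1, g0 in product([False, True], repeat=3):
        t = true_rail(g1, p1, g0)
        c = complement_rail(not g1, not p1, not g0)
        if t == c:  # a differential pair must always disagree
            return False
    return True
```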
The various advantages of the present invention will become apparent to one skilled in the art by reading the following specification and appended claims, and by referencing the following drawings, in which:
FIG. 1 is a transistor level diagram of an example of a logic stage in accordance with one embodiment of the present invention;
FIG. 2 is a transistor level diagram of an example of a logic stage in accordance with an alternative embodiment of the present invention;
FIG. 3 is a transistor level diagram of an example of a conventional logic stage useful in understanding the invention; and
FIG. 4 is a transistor level diagram of an alternative conventional logic stage, useful in understanding the invention.
FIG. 1 shows a logic stage 28 utilizing p-type polysilicon (or metal oxide) semiconductor (PMOS) technology. The PMOS logic stage 28 generally has a precharge circuit 30, a first evaluate circuit 32 and a second evaluate circuit 34. As will be discussed in greater detail below, the PMOS logic stage 28 is commonly referred to as a differential domino circuit and has significant advantages over similar static circuits and single ended domino circuits as already discussed. While the logic stage 28 will be primarily discussed with regard to carry look ahead (CLA) adders, the invention is not so limited. In fact, the principles described herein can be beneficial to any circuit in which speed and performance are issues of concern. Notwithstanding, there are a number of aspects of CLA adders for which the logic stage 28 is uniquely suited.
It can generally be seen that the precharge circuit 30 is connected to a first potential 36 and a differential output defined by a first output node 38 and a second output node 40. In the illustrated embodiments, the output nodes 38, 40 correspond to a group generate output for a range of bits defined by a less significant bit and a more significant bit. The first evaluate circuit 32 is connected to a second potential 42 and the first output node 38. The second evaluate circuit 34 is connected to the second potential 42 and the second output node 40. It is important to note that the second evaluate circuit 34 is symmetric with the first evaluate circuit 32. In particular, it can be seen that the second evaluate circuit 34 implements the expression p1n+g1ng0n as opposed to the traditional expression g1n(p1n+g0n). This is possible by making use of the fact that the traditional expression can be expanded to g1np1n+g1ng0n and the fact that when p1n is low g1n is also low. Thus, g1n can be eliminated from the first term of the traditional expression to obtain the expression implemented by the second evaluate circuit 34 of PMOS logic stage 28.
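The simplification described above can be illustrated with the following Python sketch, offered as an example only. The booleans model whether each complement signal is asserted rather than its voltage level, and the helper names (`traditional`, `symmetric`, `expressions_agree`, `or_propagate`, `xor_propagate`) are assumptions of this sketch. It is further assumed that generate is a AND b and that propagate is a OR b, so that an asserted p1n guarantees an asserted g1n; the specification does not state the propagate definition explicitly.

```python
from itertools import product


def traditional(g1n, p1n, g0n):
    # conventional second evaluate circuit: g1n*(p1n + g0n)
    return g1n and (p1n or g0n)


def symmetric(g1n, p1n, g0n):
    # symmetric second evaluate circuit: p1n + g1n*g0n
    return p1n or (g1n and g0n)


def expressions_agree(propagate):
    """Compare both forms over every operand-bit combination a1, b1, a0, b0."""
    for a1, b1, a0, b0 in product([0, 1], repeat=4):
        g1n = not (a1 and b1)        # complement of generate, more significant bit
        p1n = not propagate(a1, b1)  # complement of propagate, more significant bit
        g0n = not (a0 and b0)        # complement of generate, less significant bit
        if traditional(g1n, p1n, g0n) != symmetric(g1n, p1n, g0n):
            return False
    return True


def or_propagate(a, b):
    return bool(a or b)   # assumed definition: p = a OR b


def xor_propagate(a, b):
    return bool(a ^ b)    # alternative definition, shown for contrast
```

Under these assumptions `expressions_agree(or_propagate)` holds for all sixteen operand combinations. With an XOR-style propagate the two forms differ when both more significant operand bits are 1, so the symmetric form relies on the stated relationship between p1n and g1n.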
It can therefore be seen that each evaluate circuit 32, 34 includes a transistor stack connected between the second potential 42 and one of the output nodes 38, 40. Each evaluate circuit 32, 34 also includes an input transistor connected in parallel with the transistor stack. Specifically, the first evaluate circuit 32 has a transistor stack T2/T3 connected between the second potential 42 and output node 38. Input transistor T1 is connected in parallel with the transistor stack T2/T3. Similarly, the second evaluate circuit 34 has transistor stack T5/T6 connected between the second potential 42 and the output node 40. Input transistor T4 is connected in parallel with the transistor stack T5/T6.
Each transistor stack includes a first series transistor connected to the second potential 42 and a second series transistor connected between the first series transistor and one of the output nodes 38, 40. In one embodiment, the first series transistor is larger than the second series transistor in order to achieve a “tapering” effect. By tapering the series transistors, a number of benefits can be achieved. For example, one benefit is the ability to place the smaller transistor in the critical path of the adder. This benefit is particularly important with regard to the second series transistor T6 of the second evaluate circuit 34. Specifically, it should be noted that in standard CLA architectures, the g0n signal is in the critical path. By using transistor T6 to receive the generate input corresponding to the less significant bit (g0n) of the adder circuit, the input load can be reduced, which speeds up the critical path. Thus, the input load of T6 can be reduced because the T5/T6 transistor stack is tapered such that T5 is larger than T6. Simply put, the g0n transistor T6 is moved closer to the output to obtain speed and performance benefits. Furthermore, the input transistor T4 of the second evaluate circuit 34 is no longer stacked and can also be reduced in size. Such size reductions speed up the propagate path, which in turn speeds up the generate path.
Turning now to FIG. 2, it can be seen that similar benefits can be achieved with an n-type polysilicon (or metal oxide) semiconductor (NMOS) logic stage 28′. The above discussion therefore applies with the caveat that in the NMOS logic stage 28′, the first potential 36′ is greater than the second potential 42′, whereas for the PMOS logic stage 28 the first potential 36 is less than the second potential 42. Thus, logic stage 28′ includes a precharge circuit 30′, a first evaluate circuit 32′, and a second evaluate circuit 34′, wherein the evaluate circuits 32′, 34′ are symmetric. As already discussed, the second series transistor T6′ is to receive a generate input corresponding to a less significant bit, whereas the first series transistor T5′ and the input transistor T4′ are to receive inputs corresponding to a more significant bit. Transistor T6′ is connected directly to output node 40′ to obtain the tapering benefits already discussed. Furthermore, transistor T4′ is connected directly between the output node 40′ and second potential 42′ in order to speed up the propagate path.
With continuing reference to FIGS. 1 and 2, it can be seen that the precharge circuit 30 includes a pair of clocked transistors T7, T8 to receive a clock input. The clocked transistors T7, T8 define an evaluate phase and a precharge phase for the logic stage 28 based on the clock input. The precharge circuit 30 further includes a pair of cross-coupled keeper transistors T9, T10 to hold data at the output nodes 38, 40. Precharge circuits such as those shown are well understood as evidenced by the discussion in U.S. Pat. No. 6,205,463 to Manglore et al.
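The precharge and evaluate phases can be sketched with a simple behavioral model in Python, given purely as an illustration. The model follows the NMOS polarity of FIG. 2, in which both output nodes are precharged toward the higher first potential and a conducting evaluate circuit pulls its node toward the lower second potential. The class name, method names, and the use of 1 and 0 for the two potentials are assumptions of this sketch. Transistor sizing, the clock waveform, and keeper strength are not modeled, and the inputs are assumed to be valid (an asserted g1 implies an asserted p1).

```python
class DifferentialDominoStage:
    """Behavioral model: nodes precharge to 1, a conducting network pulls its node to 0."""

    def __init__(self):
        self.node_true = 1  # node connected to the first evaluate circuit
        self.node_comp = 1  # node connected to the second evaluate circuit

    def precharge(self):
        # precharge phase: both output nodes are restored to the first potential
        self.node_true = 1
        self.node_comp = 1

    def evaluate(self, g1, p1, g0):
        # evaluate phase: derive the complement inputs of the dual-rail pair
        g1n, p1n, g0n = 1 - g1, 1 - p1, 1 - g0
        if g1 | (p1 & g0):      # first evaluate circuit conducts: g1 + p1*g0
            self.node_true = 0
        if p1n | (g1n & g0n):   # second evaluate circuit conducts: p1n + g1n*g0n
            self.node_comp = 0
        # a node that is not pulled down simply keeps its precharged value,
        # which stands in for the holding action of the keeper transistors
        return self.node_true, self.node_comp
```

For every valid input combination exactly one of the two nodes is discharged during the evaluate phase, which is the property that makes the output differential.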
The logic stages described herein can be used to construct adders that are faster, more robust and less difficult to manufacture. For example, by alternating PMOS and NMOS logic stages with relatively fast clock inverters disposed between the stages, XOR functions can be performed more easily and critical paths are significantly reduced.
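As one hypothetical way of composing such stages into an adder, the following Python sketch applies the group generate expression g1+p1g0 repeatedly over doubling bit ranges. The doubling arrangement used here is a generic parallel-prefix (Kogge-Stone style) layout chosen for the illustration; the specification does not prescribe a particular tree topology, and the function name `cla_add`, the OR-style propagate, and the zero carry-in are likewise assumptions. Clocking and the alternation of PMOS and NMOS stages are not modeled.

```python
def cla_add(a, b, width=64):
    """Add two unsigned integers using log-depth generate/propagate merging.

    Returns (sum modulo 2**width, carry out, number of merge levels used).
    """
    # bit-level generate (a AND b) and assumed OR-style propagate (a OR b)
    G = [((a >> i) & (b >> i)) & 1 for i in range(width)]
    P = [((a >> i) | (b >> i)) & 1 for i in range(width)]
    dist = 1
    levels = 0
    while dist < width:
        new_G, new_P = G[:], P[:]
        for i in range(dist, width):
            # merge a more significant group (index i) with a less significant
            # group (index i - dist): G = g1 + p1*g0, P = p1*p0
            new_G[i] = G[i] | (P[i] & G[i - dist])
            new_P[i] = P[i] & P[i - dist]
        G, P = new_G, new_P
        dist *= 2
        levels += 1
    # G[i] is now the carry out of bit i, assuming a carry-in of zero
    total = 0
    for i in range(width):
        carry_in = G[i - 1] if i > 0 else 0
        total |= (((a >> i) ^ (b >> i) ^ carry_in) & 1) << i
    return total, G[width - 1], levels
```

For a 64-bit word this arrangement uses six merge levels rather than 64 serial carry steps.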
Those skilled in the art can now appreciate from the foregoing description that the broad techniques of the present invention can be implemented in a variety of forms. Therefore, while this invention has been described in connection with particular examples thereof, the true scope of the invention should not be so limited since other modifications will become apparent to the skilled practitioner upon a study of the drawings, specification, and following claims.
|Cited Patent||Filing date||Publication date||Applicant||Title|
|US5384493 *||Oct 5, 1992||Jan 24, 1995||Nec Corporation||Hi-speed and low-power flip-flop|
|US5777491 *||Jun 10, 1996||Jul 7, 1998||International Business Machines Corporation||High-performance differential cascode voltage switch with pass gate logic elements|
|US5847984||Jul 17, 1997||Dec 8, 1998||Advanced Micro Devices, Inc.||Combination propagate and generate carry lookahead gate|
|US6133761 *||Oct 30, 1997||Oct 17, 2000||Kabushiki Kaisha Toshiba||Logic circuit|
|US6205463||May 5, 1997||Mar 20, 2001||Intel Corporation||Fast 2-input 32-bit domino adder|
|US6316960||Apr 6, 1999||Nov 13, 2001||Intel Corporation||Domino logic circuit and method|
|Citing Patent||Filing date||Publication date||Applicant||Title|
|US7982507||Nov 13, 2008||Jul 19, 2011||Rambus Inc.||Equalizing transceiver with reduced parasitic capacitance|
|US8330503 *||Mar 19, 2008||Dec 11, 2012||Rambus Inc.||Equalizing transceiver with reduced parasitic capacitance|
|US20090066376 *||Nov 13, 2008||Mar 12, 2009||Rambus Inc.||Equalizing Transceiver With Reduced Parasitic Capacitance|
|U.S. Classification||326/98, 326/95, 327/208, 326/93|
|International Classification||G06F7/508, G06F7/50, H03K19/173|
|Cooperative Classification||H03K19/1738, G06F7/508, G06F2207/3872|
|European Classification||H03K19/173C4, G06F7/508|
|Date||Code||Event||Description|
|Dec 18, 2001||AS||Assignment|||
|Dec 18, 2006||FPAY||Fee payment||Year of fee payment: 4|
|Jan 24, 2011||REMI||Maintenance fee reminder mailed|||
|Mar 24, 2011||FPAY||Fee payment||Year of fee payment: 8|
|Mar 24, 2011||SULP||Surcharge for late payment||Year of fee payment: 7|
|Nov 19, 2014||FPAY||Fee payment||Year of fee payment: 12|