|Publication number||US7200763 B2|
|Application number||US 10/682,758|
|Publication date||Apr 3, 2007|
|Filing date||Oct 9, 2003|
|Priority date||Oct 9, 2003|
|Also published as||US20050081073|
|Publication number||10682758, 682758, US 7200763 B2, US 7200763B2, US-B2-7200763, US7200763 B2, US7200763B2|
|Inventors||Emrys J. Williams|
|Original Assignee||Sun Microsystems, Inc.|
The present invention relates to semiconductor devices such as processors, and more particularly to controlling the power consumption of such devices.
System 10 further includes components, depicted schematically in
It will be appreciated that there are many other known configurations for the power supply within a computing system apart from that shown in
One problem in the design of power supplies for computers is that there can be significant fluctuations in the amount of power that certain components require. This is particularly a problem in relation to digital electronic devices, such as CPUs and other similar forms of logic devices. These components operate at very high speeds (clock rates in excess of 1 GHz are common now), and their power consumption can vary over a very short timescale dependent upon the particular instructions being executed. The system power supply configuration, such as DC/DC converter 102 in
This problem is illustrated in more detail in
As shown in
More particularly, the power supply is typically designed with capacitor C1 to hold V2 steady. However, a device that is in effect placed in parallel with processor 210 will actually see voltage V3 (i.e. voltage V2 as modified by inductance L2). In general, therefore, voltage V3 must be kept as close as possible to voltage V2 in order to remain within the specifications for the various devices in system 10. If the received voltage, V3, suffers an excursion that takes it outside these specifications, then proper operational behaviour of system 10 can no longer be assured.
The electrical behaviour of processor 210 can be modelled in simplified fashion by the configuration in block 211A, comprising a capacitor, C3, and a switch, S1. For each operational cycle of processor 210, switch S1 is set to position X0 or X1, according to the particular operation to be performed. In the switch setting X0, the capacitor C3 charges up, until it is saturated, at which point no further charge is drawn. Alternatively, with switch S1 set to position X1, capacitor C3 discharges, until no charge is left. (It is assumed that the time taken for capacitor C3 to charge or discharge corresponds approximately to one operational cycle of processor 210).
The current flow through block 211A therefore depends upon the particular sequence of settings for switch S1, which in turn depends upon the sequence of instructions being executed by the processor. If switch S1 is rapidly alternated between position X0 and position X1, then capacitor C3 repeatedly charges up and discharges. This will cause block 211A to draw a relatively high current from power supply 202. Alternatively, if switch S1 is maintained for multiple operations in a fixed position, whether X0 or X1, then the current drawn by block 211A is relatively small (and indeed falls towards zero).
A more accurate representation of the electrical behaviour of processor 210 has multiple blocks 211, each analogous to block 211A, and all in parallel with one another. (The presence of these multiple blocks is indicated schematically in
The total current drawn by the processor 210 corresponds to the sum of the separate currents drawn by each of the individual blocks 211A, 211N. The setting of the switch within each block depends upon the instructions being processed. In general it is expected that some blocks will draw current while others do not. However, it is possible that a particular instruction sequence causes the switches in all (or most) of the blocks to alternate between position X0 and position X1. This will then lead to very high current being drawn by processor 210. Conversely, if the instruction sequence leaves the switch setting unchanged for multiple operations in most or all of the blocks, then a very small amount of current will be drawn. These circumstances will lead to a sharp rise or fall in the current taken by processor 210.
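The switched-capacitor model above can be illustrated with a short sketch. It assumes that a block draws one full charge Q = C3·V from the supply whenever its switch moves to X0 while the capacitor is empty, and nothing otherwise; the function names, the unit values, and the assumption that every capacitor starts discharged are illustrative only.

```python
def block_charge_drawn(positions, c3=1.0, v=1.0, start_charged=False):
    """Charge drawn from the supply by one block over a run of switch settings.

    positions: sequence of 0 (switch at X0, charge) or 1 (switch at X1, discharge).
    Assumption: one full charge c3 * v is drawn only when the switch is at X0
    and the capacitor is currently empty.
    """
    charged = start_charged
    total = 0.0
    for pos in positions:
        if pos == 0 and not charged:
            total += c3 * v      # capacitor charges up from the supply
            charged = True
        elif pos == 1:
            charged = False      # capacitor discharges; nothing drawn from supply
    return total


def processor_charge_drawn(blocks):
    """Total charge drawn by a processor modelled as parallel blocks."""
    return sum(block_charge_drawn(p) for p in blocks)


alternating = [0, 1, 0, 1, 0, 1, 0, 1]   # switch toggles every cycle
held = [0] * 8                           # switch held at X0
print(block_charge_drawn(alternating), block_charge_drawn(held))
```

A rapidly alternating switch draws charge on every other cycle, while a held switch draws at most one charge in total, which matches the high-current and near-zero-current cases described above.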
One particular situation in which the current taken by the processor 210 can change suddenly is due to the presence of processor idle time. Thus CPUs rarely operate at 100% capacity, but rather have to wait at times to perform the next instruction. This waiting can arise for various reasons, for example, there may be a delay while the CPU obtains data from a hard disk, or the CPU may be waiting for a user to enter a command to proceed. Since the CPU state does not alter during such idle time, no current is drawn. Consequently, the onset of idle time generally leads to a sudden fall in the current taken by the processor; conversely, the end of idle time (i.e. restarting normal instruction execution) will typically cause a sharp rise in the current taken by the processor.
Unfortunately, such sudden changes in processor current consumption can cause a problem in view of inductance L2 (which is difficult to eliminate, as previously explained). Thus the voltage across inductance L2 varies in proportion to dI/dt, where I is the current through L2. If the current drawn by processor 210 fluctuates rapidly, whether quickly rising or falling, then this will potentially cause V3 to depart significantly from V2. Typically a sudden rise in current consumption will cause V3 to fall, whereas a sudden drop in current consumption will lead to a rise in V3. These changes in the power supply network may cause severe operational difficulties for system 10.
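As a rough numerical illustration of this relationship, the sketch below estimates V3 as V2 minus the inductive drop L2·dI/dt for a linear change in current. It ignores capacitor C2 and any series resistance, and the component values are hypothetical, chosen only to give round numbers.

```python
def estimate_v3(v2, l2, delta_i, delta_t):
    """Estimate V3 = V2 - L2 * dI/dt for a linear current change.

    A positive delta_i (rising consumption) lowers V3; a negative delta_i
    (falling consumption) raises it. C2 and resistance are ignored.
    """
    return v2 - l2 * (delta_i / delta_t)


# Hypothetical values: 1 V supply, 1 nH stray inductance, 10 A change in 100 ns.
v3_rise = estimate_v3(v2=1.0, l2=1e-9, delta_i=10.0, delta_t=1e-7)
v3_fall = estimate_v3(v2=1.0, l2=1e-9, delta_i=-10.0, delta_t=1e-7)
print(v3_rise, v3_fall)
```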
One way to address this problem is by inserting capacitor C2, as shown in
In view of these circumstances, power supply circuits in modern computer systems for providing power to digital logic devices such as processor 210 have to be very carefully designed. In particular, any voltage fluctuations on the supply lines have to be minimised (such as by reducing L2 as much as possible), and the various devices attached to the supply lines then have to be robust enough to accommodate any residual voltage fluctuations. It will be appreciated that this places significant design constraints on the system, and generally adds to overall costs.
Note also that computer systems are typically available in a very wide range of configurations, for example, models in a given product range may vary in terms of processor speed, memory capacity, storage capacity, and so on, and may also evolve over time, as new components (e.g. higher speed processors) become available. In addition, customers frequently perform upgrades to installed systems as well, such as by supplementing or replacing existing components with new and more powerful components that were not necessarily available when the system was originally designed and purchased.
As a result, any given system is typically available in, or may be modified to, a very wide range of configurations. It is difficult for a manufacturer to rigorously test every single potential configuration. Instead, particular components (e.g. processors) are generally designed to be compatible (i.e. interchangeable) with one another, so that a modification or upgrade of such a component should not take a system outside its proper operating regime, including in terms of the power supply requirements. Nevertheless, there may be some subtle and unexpected differences in the way that slightly different versions of a component perform or interact with other components that do impact power consumption. This in turn might adversely affect system operation and reliability for certain particular configurations.
In accordance with one embodiment of the invention, there is provided a method of controlling the power consumption of a semiconductor device that is operable to process a sequence of instructions. The method involves monitoring the power consumption of the device and detecting any significant change. Responsive to the detection of such a change, one or more dummy instructions are inserted into the sequence of instructions. The dummy instructions are selected in order to smooth or to limit the change in power consumption of the device. In this manner, the power consumption by the semiconductor device can be regarded as self-regulating. This then helps to avoid disturbances on the voltage supply lines to the semiconductor device, where such disturbances might otherwise cause problems for additional devices on those lines. Typically this therefore allows the additional devices to be formed to less stringent specifications, and hence at reduced cost.
In general, instructions that consume relatively little power are inserted into the instruction sequence if the significant change in the power consumption of the device represents a rise in power consumption. Conversely, instructions that consume relatively high power are inserted if the significant change in the power consumption of the device represents a fall in power consumption. This therefore has the effect of mitigating what would otherwise be the relatively sudden change in power consumption, and tends instead to spread or smooth the change out over a longer time period.
Note that inserting one or more dummy instructions does not imply that the original instruction is lost. Rather, the original instructions can be buffered or otherwise maintained, ready for resumption when the dummy instructions have been executed, and no further dummy instructions are required. In this way, the overall processing results of the instructions are not affected by the insertion of the dummy instruction(s), apart from a possible slight delay. However, any such delay is unlikely to be significant, assuming that the proportion of dummy instructions is small.
There are various ways in which the power consumption of the device can be monitored. In one embodiment, the sequence of instructions is received and the power consumption of the device for each instruction is then estimated. One way of doing this is to access a look-up table to find a predicted power consumption for the instruction concerned. In another embodiment, a state model of the device is maintained. The state is then updated from a previous state to a new state in accordance with the instruction to be processed, with the predicted power consumption being determined from the previous state and the new state of the device.
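The two estimation approaches can be sketched as follows. The opcode names and power values in the table are hypothetical, and the state-model estimate assumes that predicted power is proportional to the number of state bits that change between the previous and new states; the text does not specify how the two states are combined.

```python
# Hypothetical per-instruction power estimates, in arbitrary units.
POWER_TABLE = {"NOP": 0, "ADD": 1, "MUL": 2}


def estimate_from_table(opcode, default=1):
    """Look-up-table estimate: predicted power for a single instruction."""
    return POWER_TABLE.get(opcode, default)


def estimate_from_states(prev_state, new_state):
    """State-model estimate: count the bits that toggle between two states.

    Assumption: each toggled bit costs one unit of power.
    """
    return bin(prev_state ^ new_state).count("1")


print(estimate_from_table("MUL"))            # table look-up
print(estimate_from_states(0b1010, 0b0110))  # two bits differ
```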
Note that the estimated or predicted power consumption does not always have to be exactly correct, as long as overall it provides a reasonable approximation to ongoing power consumption. One reason for this is that if changes in power consumption occur on short timescales, intermediate timescales, and long timescales, then it is power consumption changes on intermediate timescales that are most significant for present purposes. For example, the detection of a significant change in the power consumption of the device may be performed over a timescale of N instruction cycles, such as by determining a difference between the power consumption over a current set of N instruction cycles and the power consumption over a preceding set of N instruction cycles (with a significant change then being detected if this difference exceeds a predetermined threshold).
This focus on intermediate timescales arises because there is typically a smoothing capacitor across a power input to the device that is able to absorb fluctuations on a short timescale. If C is this capacitance, V the normal operating voltage of the device, T the duration of a single instruction cycle, and I the maximum current that can be supplied to the device, then an appropriate value for N is approximately N=VC/IT. Thus the capacitor can absorb variations in power consumption over a smaller timescale than N cycles, but may not be able to do so over a longer timescale.
On the other hand, fluctuations in power consumption over much longer timescales are generally not so problematic. This is mainly because they do not react with any stray inductance to cause unwanted voltage excursions. For example, if the stray inductance is L, then variations on a timescale T>>IL/V (where I and V are as above) represent only relatively minor fluctuations in the voltage supply network, and so can normally be accommodated without too many problems.
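The two timescale bounds given above, N = VC/IT for the capacitor and T >> IL/V for the stray inductance, can be evaluated with a short sketch. All component values below are hypothetical and serve only to show the arithmetic.

```python
def window_cycles(v, c, i_max, t_cycle):
    """N = V*C / (I*T): cycles over which the smoothing capacitor can absorb variation."""
    return (v * c) / (i_max * t_cycle)


def inductive_timescale(i_max, l_stray, v):
    """I*L / V: variations much slower than this cause only minor voltage excursions."""
    return (i_max * l_stray) / v


# Hypothetical values: 1 V, 1 uF, 100 A maximum current, 1 ns cycle, 1 nH stray inductance.
n = window_cycles(v=1.0, c=1e-6, i_max=100.0, t_cycle=1e-9)
t_long = inductive_timescale(i_max=100.0, l_stray=1e-9, v=1.0)
print(round(n), t_long)
```

With these assumed values the window N comes out at about 10 cycles, and the inductive timescale at about 100 ns, i.e. roughly 100 cycles of 1 ns each.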
The detection of a change in the power consumption can be made in respect of the estimated power consumption itself, or by using any other appropriate physical parameter related to or derived from the power consumption. Thus in one embodiment, the device is provided with a model of the power supply network for the device (typically this is in the form of a logical filter network, representing the various inductances, capacitances, and so on illustrated in
Note that one common reason for a sudden drop in power consumption is where the processor goes into idle mode, which typically occurs if the processor has to wait for an instruction and/or data to process. This sudden drop may cause dummy instructions that consume power to be inserted into the instruction queue. Since the instruction queue is empty (or halted) at this point, given that the processor is idling, such insertion of dummy instructions does not cause any delay in useful processing.
The injection of dummy instructions that consume power can therefore be regarded as more favourable than the injection of dummy instructions that do not consume any power, since the former will frequently occur during idle periods. (In contrast, a dummy instruction that does not consume power is utilised when there is a sudden rise in power consumption, which cannot be at the onset of processor idling). Consequently, to improve efficiency, the system can be configured so that an instruction is more likely to be inserted following a sudden fall in power consumption than following a sudden rise in power consumption.
One way of achieving this asymmetry is to set a nominal voltage for the device that is off-centre from the specified operating voltage range of the system. Thus the nominal voltage represents the default, steady state voltage that is experienced by the device, assuming that the processor draws a steady current. Changes in processor power consumption then cause the actual voltage to depart from its nominal value. The actual voltage can be estimated using the network model described above to ensure that it remains within the specified operating voltage range, otherwise one or more dummy instructions are inserted.
If the nominal voltage is arranged to be closer to the top of the acceptable operating voltage range than to the bottom of this range, then a sudden fall in power consumption (leading to a rise in voltage) is more likely to take the voltage outside the acceptable range than a sudden rise in power consumption (which would lead to a fall in voltage). Therefore, the leeway or margin for accommodating a sudden rise in power consumption without having to insert dummy instructions (that would cause a processing delay) is increased. Conversely, the margin for accommodating a sudden fall in power consumption is reduced. However, since such a sudden fall in power consumption may well be due to the processor going into idle mode, when dummy instructions can be inserted without any performance penalty, this bias or asymmetry helps to improve overall efficiency and processing throughput.
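The effect of an off-centre nominal voltage can be sketched as below. The voltage limits, the nominal value, and the function names are hypothetical; the sketch only shows that the same size of excursion triggers a dummy instruction in one direction but not in the other.

```python
def margins(v_nominal, v_min, v_max):
    """Return (upward margin, downward margin) around the nominal voltage."""
    return v_max - v_nominal, v_nominal - v_min


def dummy_needed(v_predicted, v_min, v_max):
    """Pick a dummy instruction if the predicted voltage leaves the allowed range.

    Voltage too high (consumption fell): insert high-power dummy D1.
    Voltage too low (consumption rose): insert low-power dummy D0.
    """
    if v_predicted > v_max:
        return "D1"
    if v_predicted < v_min:
        return "D0"
    return None


# Hypothetical range 0.9 V to 1.1 V with the nominal voltage biased towards the top.
V_MIN, V_MAX, V_NOM = 0.9, 1.1, 1.05
up, down = margins(V_NOM, V_MIN, V_MAX)
print(up < down)                                # less headroom above than below
print(dummy_needed(V_NOM + 0.08, V_MIN, V_MAX)) # fall in consumption: dummy inserted
print(dummy_needed(V_NOM - 0.08, V_MIN, V_MAX)) # equal rise in consumption: tolerated
```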
In accordance with another embodiment of the invention, there is provided a semiconductor device comprising: a queue operable to hold a sequence of instructions; a monitor operable to estimate power consumption of the device; a detector operable to flag a significant change in the estimated power consumption of the device; and a multiplexer. A first input of the multiplexer is connected to the queue, a second input of the multiplexer is connected to one or more stored dummy instructions, and a control input of the multiplexer is connected to the detector. In this way, should a significant change in estimated power consumption be detected, the output of the multiplexer can be switched to an appropriate dummy instruction, in order to obviate or at least mitigate such change.
In accordance with another embodiment of the invention, there is provided a semiconductor device operable to process a sequence of instructions, comprising: a monitor operable to estimate power consumption of the device; a detector operable to flag a significant change in the estimated power consumption of the device; and a set of one or more dummy instructions for injection into the sequence of instructions in response to detecting a significant change in the estimated power consumption. The one or more dummy instructions are selected in order to reduce or to limit the change in power consumption.
In accordance with another embodiment of the invention, there is provided a power supply unit, a semiconductor device operable to process a sequence of instructions, and a power supply line from the power supply unit to the semiconductor device incorporating at least some inductance. The semiconductor device is operable to process one or more dummy instructions if the sequence of instructions for processing would otherwise require a significant change in current passing through the inductance.
The power regulation facility described herein is particularly useful in conjunction with digital electronic semiconductor load devices, especially those liable to occasional peaks in power consumption, such as a CPU and a DRAM. These are typically located in a computer system, which will therefore benefit from this facility. Nevertheless, the approach is potentially applicable to a wide range of other electronic systems, including telecommunication apparatus, household electronic goods (televisions, DVD players, etc.), and so on.
It will be appreciated that such system embodiments can generally utilise the same particular features as described above in relation to the method embodiments.
Various embodiments of the invention will now be described in detail by way of example only with reference to the following drawings in which like reference numerals pertain to like elements and in which:
Instructions for execution by CPU 210 are received into instruction queue 310, where they are stored pending supply to the processing unit 335. The processing unit is responsible for actual execution of the instructions. If the processor is idling, the instruction queue may be temporarily halted or empty (the latter case can be regarded as having a sequence of null instructions in the instruction queue).
Also connected to the output of instruction queue 310 is a monitor 330, while interposed between the instruction queue 310 and processing unit 335 is a multiplexer 315. Thus instructions output from queue 310 pass firstly to processing unit 335 via multiplexer 315, and secondly to monitor 330.
CPU 210 also includes a second multiplexer 325. The output of this second multiplexer passes to the second input of multiplexer 315 (the first input of multiplexer 315 being from instruction queue 310). Monitor 330 controls which of these two inputs is passed by multiplexer 315 to processing unit 335 (as shown schematically in
Connected to the two inputs of multiplexer 325 are dummy instructions D0 and D1 respectively. Monitor 330 controls which of these two inputs is passed to multiplexer 315 (as shown schematically in
In operation, monitor 330 examines the instructions it receives from instruction queue 310 in order to assess the electrical current requirements of processing unit 335 (and hence CPU 210). In particular, monitor 330 detects any sharp rise or fall in consumption. As long as no significant change is detected, monitor 330 sets control signal B to multiplexer 315 so as to allow instructions from instruction queue 310 to pass through to processing unit 335. This can be regarded as the default or normal state of CPU 210.
However, if monitor 330 detects that the instruction sequence from queue 310 would cause a significant change in the power consumption of CPU 210, then it asserts signal B. This then changes the setting of the multiplexer 315, so that rather than outputting an instruction from queue 310 to the processing unit 335, it outputs a dummy instruction, D0 or D1 instead. Note that in these circumstances the instruction from queue 310 is not lost, but rather remains buffered within instruction queue 310. Then, when control signal B is subsequently released, the flow of instructions from queue 310 to processing unit 335 is resumed at this point. Consequently, the insertion of the dummy instruction(s) does not impact the overall processing flow (apart from any slight delay).
As previously stated, monitor 330 is also responsible for controlling multiplexer 325 (via Arrow A). This therefore allows monitor 330 not only to decide whether to insert a dummy instruction, but also to select which particular dummy instruction, D0 or D1, is to be inserted. Monitor 330 is configured so that if a sudden rise in the power consumption of CPU 210 is detected, then multiplexer 325 is controlled to pass instruction D0. In contrast, if a sudden fall in power consumption by CPU 210 is detected, then multiplexer 325 is controlled to pass instruction D1.
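The two-multiplexer arrangement can be modelled in software as a minimal sketch. The boolean encoding of signals A and B, and the use of strings to stand for instructions, are assumptions made for illustration; the hardware encoding is not specified here.

```python
from collections import deque

D0 = "D0"  # dummy instruction consuming relatively little power
D1 = "D1"  # dummy instruction consuming relatively high power


def mux_325(select_d1):
    """Second multiplexer: signal A chooses between the two dummy instructions."""
    return D1 if select_d1 else D0


def mux_315(queue, b_asserted, select_d1):
    """First multiplexer: pass the next queued instruction unless B is asserted.

    When B is asserted, a dummy is output and the queue is left untouched,
    so the postponed instruction is not lost.
    """
    if b_asserted:
        return mux_325(select_d1)
    return queue.popleft() if queue else None


queue = deque(["I1", "I2"])
print(mux_315(queue, b_asserted=True, select_d1=False))   # rise detected: D0
print(mux_315(queue, b_asserted=False, select_d1=False))  # normal flow resumes with I1
```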
The use of dummy instructions D0 and D1 in this manner allows monitor 330 to mitigate or smooth fluctuations in the current drawn by CPU 210. This is done primarily with the aim of reducing current variations through inductance L2 (see
Note that the problems caused by inductance L2 are particularly acute for current variations through processor 210 on a timescale of approximately N processor instructions (where the precise value of N depends on the parameters of the particular circuits involved). Thus fluctuations on a timescale much less than N, i.e. typically one or two processor cycles, can be accommodated by capacitor C2, since the total charge involved is relatively small. Consequently, these do not cause current variations through the inductor L2, and hence should not lead to voltage problems. Conversely, fluctuations on timescales much longer than N have (in effect) a large denominator in dI/dt, and hence the voltage excursions across inductance L2 are again relatively small. In contrast, for fluctuations on intermediate timescales (corresponding approximately to N processor cycles), the charge involved may exceed the damping capacity of capacitor C2, while the timing is still sufficiently short to induce potentially significant voltage variations across inductor L2.
Table 1 provides an illustration of the principle of operation of the embodiment of
It will also be assumed that the system is particularly sensitive to variations in power consumption on a timescale of approximately N=4 processor cycles. In the second row of Table 1 therefore, the instructions of the first row are grouped into blocks of four instructions, and the total power consumption associated with each block of four instructions is specified. Thus the first block of four instructions consumes 1 unit of power, the next block of four instructions consumes 3 units of power, and so on.
We now assume that monitor 330 is designed to impose a constraint that the power consumption should not vary by more than 1 unit between adjacent blocks of 4 instructions. It will be appreciated that this helps to ensure that the current drawn through the supply inductance (such as L2 in
Looking therefore at the second row of Table 1, it will be seen that the instruction sequence originally presented in the first line of Table 1 contravenes the above constraint, in that there are indeed changes in current consumption of more than 1 unit between successive blocks of 4 instructions. For example, the increase in power consumption between the first block, block A, and the second block, block B, is two units, while there is a decrease in consumption of three units from the third block, block C, to the fourth block, block D.
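The block-wise check described above can be sketched as follows. Table 1 is not reproduced in this text, so the per-instruction power values below are hypothetical: they assume each instruction consumes 0 or 1 unit and are chosen only to match the block totals and changes quoted above (1 and 3 units for blocks A and B, and a drop of three units from block C to block D).

```python
def block_totals(powers, n=4):
    """Sum per-instruction power over consecutive blocks of n instructions."""
    return [sum(powers[i:i + n]) for i in range(0, len(powers), n)]


def violations(totals, max_change=1):
    """Return (block index, next block index, change) where the change is too large."""
    return [
        (i, i + 1, totals[i + 1] - totals[i])
        for i in range(len(totals) - 1)
        if abs(totals[i + 1] - totals[i]) > max_change
    ]


# Hypothetical instruction powers for blocks A, B, C and D.
powers = [0, 0, 1, 0,   1, 1, 0, 1,   1, 1, 1, 1,   0, 1, 0, 0]
totals = block_totals(powers)
print(totals)
print(violations(totals))
```

In this assumed sequence the A to B and C to D transitions breach the one-unit constraint, while the B to C transition does not.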
In order to eliminate such significant changes, monitor 330 modifies the instruction sequence by inserting dummy instructions, as described above in relation to
The fourth row of Table 1 shows the aggregated power consumption for each block of four instructions after the dummy instructions have been inserted into the instruction sequence. It will be seen that compared to the second row of Table 1, the variation in power consumption from one block to the next is considerably smoothed out for the modified sequence. In particular, the power consumption within any given block of four instructions now differs by only one unit from the power consumption of the immediately adjacent blocks. Consequently, the power drawn through the supply inductance can be maintained at a reasonably constant level over intermediate timescales, thereby helping to prevent any troublesome voltage excursions. This in turn means that some of the design constraints upon the processor may be relaxed, given that variations in power consumption can now be safely accommodated. For example, developers no longer need to test for (or to prohibit) instruction sequences that can lead to rapidly rising or falling power consumption.
Similarly, the power supply network and/or any additional components attached to that network no longer need such a high level of protection against fluctuations on the power supply network. For example, it may be possible to reduce the size of capacitor C2 (see
It will be appreciated that the above advantages do come at the expense of slightly slower processing. Thus it will be noted that the modified instruction sequence in the third line of Table 1 is somewhat longer than the original instruction sequence (in the first line of Table 1), due of course to the insertion of the dummy instructions into the instruction sequence. Given that in terms of instruction processing, the dummy instructions are just placeholders, with no logical effect, the net result is that it has taken slightly longer to arrive at the same overall processing result. In other words, there is a slight decrease in processing speed.
However, the system is generally designed so that monitor 330 only needs to insert dummy instructions on relatively rare occasions. Thus the processor and power supply network should be able to accommodate current variations arising from the majority of instruction sequences, and so usually the normal or unmodified instruction sequence is executed. It is only when there is some particularly troublesome instruction sequence which would cause an unusually high level of voltage fluctuation that the dummy instructions are inserted. Consequently, it is relatively rare for the instruction sequence to be lengthened by the insertion of dummy instructions. Hence the overall effect on processing efficiency is relatively insignificant.
It is also noted that the most common cause of a sudden fall in power consumption is where the processor starts idling (rather than having some rare instruction sequence that draws unusually low power). It will be appreciated that inserting dummy instructions (D1) at this point does not have any impact on processing throughput, since the processor is idling anyway.
The set of consumption values CR1 441 are then passed into a filter network 480, which in mathematical terms calculates the following:
where xj represents the sequence of current consumption values CR1 441 resulting from the input instruction sequence. More particularly, the first term in parentheses is determined by the pairing of a first delay unit 460 and a first summation unit 450. The former can be readily implemented by an N-stage shift register, where N again corresponds to the number of processor cycles over which current fluctuations are to be smoothed. The latter, namely summation unit 450, has two inputs, denoted PLUS and MINUS. The values on these inputs are respectively added to and subtracted from the current value stored in summation unit 450.
The second term in parentheses in Equation 1 is then determined by another pairing of a second summation unit 455 and a second delay unit 465. The design and operation of this second pairing are the same as for the first summation unit 450 and the first delay unit 460.
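Equation 1 itself does not survive in this text. Based solely on the description of the two summation and delay pairings, a form consistent with it would be the following, in which the exact index bounds are an assumption:

```latex
y_i = \left( \sum_{j=i-N+1}^{i} x_j \right) - \left( \sum_{j=i-2N+1}^{i-N} x_j \right)
```

Under this reading, the first sum covers the N most recent consumption values and the second covers the N values before those, so that y_i is positive for a rise in consumption between the two blocks and negative for a fall.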
The outputs of the first and second summation units 450 and 455 are fed to a difference calculator 430, which in effect determines the value of yi from Equation 1 above. The difference calculator 430 outputs two signals, the first representing the magnitude (MAG 432) of the calculated difference, while the second is a binary signal representing the sign (SIGN 431) of the calculated difference. In one embodiment, if the magnitude is zero, then SIGN 431 indicates a positive value, although as will become apparent below, in this situation the value of SIGN 431 is not in fact significant. Note that the SIGN 431 signal corresponds to output A from monitor 330, as depicted in
The initial operation of filter network 480, which is a form of finite impulse response (FIR) filter, is to low pass filter the input sequence CR1 441. This removes the high frequency current variations on a timescale shorter than N processor cycles, since these can be absorbed by a protective capacitor (C2 in
Difference calculator 430 outputs magnitude signal MAG 432 to a comparator 420, which compares MAG 432 to a threshold (THRESH1) that is stored in a register 410 or other suitable memory device. Comparator 420 then outputs a binary signal B (corresponding to that depicted in
For most of the time, it is expected that the value of MAG 432 is below the value of THRESH1, and so the value of signal B is null. This indicates that the variation in current consumption between the current block (i.e. for the N immediately preceding instructions) and the previous block (i.e. for the N earlier instructions) is sufficiently low that the system should be able to tolerate any voltage excursion across the supply inductance. Accordingly, with reference to
However, if the difference in current consumption between the current and previous blocks of instructions is too great, then MAG 432 will exceed THRESH1. In this case, the output of the comparator 420 changes, and output B is now asserted. Consequently, multiplexer 315 is set so that rather than passing instructions from instruction queue 310, instead a dummy instruction is passed from multiplexer 325. At this stage the value of SIGN 431 becomes important, in that if difference calculator 430 detects a negative difference, indicating a fall in power consumption from the previous block to the current block, SIGN 431 is set to control multiplexer 325 to pass dummy instruction D1. This is a dummy instruction that consumes a relatively large amount of power, and so helps to mitigate the fall in power consumption of the normal instruction sequence. Conversely, if the difference calculator 430 detects a positive difference, representing a rise in power consumption from the previous block to the present block, SIGN 431 is set to control multiplexer 325 to pass dummy instruction D0. This dummy instruction consumes relatively little power, and so offsets the rise in power consumption of the normal instruction sequence.
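The behaviour of the filter network, difference calculator and comparator can be approximated in software by the sketch below. It assumes that each delay unit is an N-stage shift register initialised to zero, that the second pairing is fed by the output of the first delay unit, and that B is asserted when the magnitude strictly exceeds the threshold; the class and attribute names are illustrative.

```python
from collections import deque


class FilterNetwork:
    """Software model of two delay/summation pairings plus difference and compare."""

    def __init__(self, n, thresh1):
        self.thresh1 = thresh1
        self.delay1 = deque([0] * n, maxlen=n)  # first N-stage shift register
        self.delay2 = deque([0] * n, maxlen=n)  # second N-stage shift register
        self.sum1 = 0                           # sum of the N most recent values
        self.sum2 = 0                           # sum of the N values before those

    def step(self, x):
        """Feed one consumption value; return (MAG, SIGN is positive, B asserted)."""
        leaving1 = self.delay1[0]
        self.delay1.append(x)                   # oldest value drops out of register 1
        self.sum1 += x - leaving1               # PLUS input x, MINUS input leaving1
        leaving2 = self.delay2[0]
        self.delay2.append(leaving1)
        self.sum2 += leaving1 - leaving2
        diff = self.sum1 - self.sum2
        return abs(diff), diff >= 0, abs(diff) > self.thresh1


net = FilterNetwork(n=2, thresh1=1)
for value in [1, 1, 0, 0]:
    print(net.step(value))
```

Keeping running sums that are adjusted by the incoming and outgoing values avoids re-adding the whole window on every cycle, which mirrors the PLUS and MINUS inputs of the summation units.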
It will be appreciated that the circuitry of
These operations are illustrated in the flowchart of
On the other hand, if there has been a significant change in power consumption, the method proceeds to determine whether this change is a fall or a rise (step 540). In the former case, a dummy instruction taking high power is inserted into the instruction stream (step 550), while in the latter case, a dummy instruction taking low power is inserted into the instruction stream (step 560). Processing then returns to step 520, with the loop to insert dummy instructions being repeated until it is determined that the change in power consumption has now been reduced to an acceptably low level. At this point, the instruction originally received at step 510 can now be executed (step 570), and the next instruction for processing obtained.
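The loop described above can be sketched in software as follows. This is a simplified model rather than the circuit itself: it assumes a zero-power (idle) history before the first instruction, dummy powers of 0 for D0 and 1 for D1, a comparison of the N most recent values (including the candidate instruction) against the N values before them, and a hypothetical cap on dummies per instruction to guarantee termination.

```python
def regulate(instr_powers, n=4, thresh=1, d0_power=0, d1_power=1, max_dummies=None):
    """Return the executed stream as (kind, power) pairs, with dummies inserted."""
    if max_dummies is None:
        max_dummies = 2 * n                    # assumed safety cap, not from the text
    history = [0] * (2 * n)                    # assumed idle history
    out = []
    for p in instr_powers:                     # step 510: receive instruction
        inserted = 0
        while inserted < max_dummies:
            h = history + [p]
            diff = sum(h[-n:]) - sum(h[-2 * n:-n])   # step 520: change in consumption
            if abs(diff) <= thresh:
                break                          # change is acceptably small
            # step 540: a fall takes high-power D1 (550), a rise low-power D0 (560)
            dummy = d1_power if diff < 0 else d0_power
            history.append(dummy)
            out.append(("dummy", dummy))
            inserted += 1
        history.append(p)                      # step 570: execute the instruction
        out.append(("instr", p))
    return out


stream = regulate([1, 1, 1, 1])
print(stream)
```

For a burst of four unit-power instructions following idle time, this model spreads the rise by interleaving low-power dummies, while every original instruction is still executed in its original order.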
Furthermore, if a dummy instruction has been inserted into the instruction stream in place of the postponed instruction, the contents of summation unit 450 and delay unit 460 must be updated in order to reflect this fact. The skilled person will be aware of a variety of mechanisms for achieving this. One possibility is to overwrite the power consumption value for the postponed instruction in delay unit 460 with that for the relevant dummy instruction. In addition, the power consumption value for the postponed instruction can be subtracted out of summation unit 450, and that for the relevant dummy instruction added in.
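The overwrite-and-adjust mechanism just described might look as follows in software. The sketch assumes that summation unit 450 holds a running total over the values stored in delay unit 460; the class name `PowerWindow` and its methods are hypothetical.

```python
from collections import deque


class PowerWindow:
    """Running sum of per-instruction power estimates over a fixed window."""

    def __init__(self, length):
        self.delay = deque([0] * length, maxlen=length)  # models delay unit 460
        self.total = 0                                   # models summation unit 450

    def push(self, power):
        """Account for a newly considered instruction."""
        oldest = self.delay[0]           # value about to leave the window
        self.delay.append(power)
        self.total += power - oldest

    def replace_latest(self, dummy_power):
        """Swap the postponed instruction's value for the dummy's value."""
        postponed = self.delay[-1]
        self.delay[-1] = dummy_power             # overwrite in the delay unit
        self.total += dummy_power - postponed    # subtract out, then add in
```

After `push(5)`, `push(7)` and `replace_latest(2)` on a window of length three, the stored values are 0, 5, 2 and the running total is 7.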
Another possibility is to provide the pairing of summation unit 450 and delay unit 460 in triplicate. The first pairing then represents that shown in
It will be appreciated that there are many potential variations on the embodiments illustrated in
Furthermore, while filter network 480 represents one possible mechanism for detecting variations in power consumption on a particular timescale, the skilled person will be aware of many other circuits for implementing appropriate functionality. For example, in one embodiment filter network 480 models the power supply network, and in particular the behaviour of voltage V3 as a result of the current consumption of each individual instruction (this is dependent upon circuit elements L2, C2, and so on). The predicted value of V3 is compared against specifications (i.e. the minimum and maximum limit values of V3 that can be properly accommodated by the components in the system). Dummy instructions are then inserted as appropriate to prevent or counter any excursion of V3 outside these specifications. It will be appreciated that modelling the power supply network in this manner provides a more rigorous approach than the generalised filtering of Equation 1 above.
Note that in this embodiment, it is the supply voltage level itself that is calculated, rather than an estimated level of power consumption (although the two are of course directly related). It will be appreciated therefore that with regard to step 520 in
An example of a model of a power supply network has been developed for the circuit shown in
In the model, a dummy instruction (D1) is inserted if the voltage rises above a predetermined maximum. The inserted instruction causes the processor to draw current, which tends to decrease the voltage. Conversely, if the voltage falls below a predetermined minimum, an instruction is inhibited. The lack of instruction to be executed prevents the processor from drawing any current, and tends to increase the voltage. (Note that inhibiting an instruction can be regarded as analogous to inserting a void dummy instruction, D0).
The model incorporates seven logical (Boolean) parameters as set out in Table 2 below. It is assumed that B1 (Enable), which determines whether or not voltage regulation is applied, is set once at the beginning of the run. A sequence of values for B2 is provided (corresponding to the instruction sequence). This then allows corresponding values of parameters B3, B4, B5, B6 and B7 to be calculated for each processor cycle, as set out in Table 2.
|Parameter||Name||Description||Value|
|B1||Enable||Indicates whether or not voltage limiting circuit is in operation||From User Setting|
|B2||Want||… wants an operation||…|
|B3||Inhibit||Indicates whether an instruction should be prohibited to avoid …||If V2 < Vmin, B3=1; …|
|B4||Augment||Indicates whether an instruction should be inserted to avoid …||If V2 > Vmax, B4=1; …|
|B5||…||Indicates an inhibited instruction is to …||If B2=B3=1, B5=1; …|
|B6||QueueDec||Indicates a queued instruction is to …||If B2=B3=0 AND …|
|B7||…||… an instruction is to be processed||If B1=0, B7=B2; If B4=1 OR B6=1 OR (B3=0 AND …|
In the model, we can imagine all instructions being received into a FIFO queue. The variable QUEUE is then used to hold the number of instructions in this queue. The value of QUEUE is initially set to zero, and the queue remains empty unless (until) there are inhibited instructions. The value of QUEUE is then modified in accordance with the values of B5 and B6 from the previous processor cycle, i.e. QUEUE is increased by one if B5=1, and decreased by one if B6=1. (Note that the definition of the model is such that only one of B5 or B6 is ever set per cycle. Accordingly, if an instruction arrives and an instruction is also processed, so that both B2 and B7 equal 1, then B5 and B6 are both set to zero, and the queue length is unchanged).
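One cycle of the Boolean logic of Table 2, together with the QUEUE update, can be sketched as follows. Several points are assumptions made for illustration: that B3 and B4 are forced to zero when B1=0, that B6 additionally requires a non-empty queue, and that the final term of B7 is (B3=0 AND B2=1). The function name `control_step` and the returned dictionary are hypothetical; the default limits are the 0.97 and 1.02 values used in the model.

```python
def control_step(enable, want, v2, queue, vmin=0.97, vmax=1.02):
    """Evaluate B1..B7 for one cycle and return (flags, next QUEUE value)."""
    b1, b2 = bool(enable), bool(want)
    b3 = b1 and v2 < vmin            # Inhibit (gating by B1 is an assumption)
    b4 = b1 and v2 > vmax            # Augment (gating by B1 is an assumption)
    b5 = b2 and b3                   # wanted but inhibited: goes onto the queue
    # Assumption: a queued instruction is released only if the queue is non-empty.
    b6 = (not b2) and (not b3) and queue > 0
    if not b1:
        b7 = b2                      # regulation off: process whatever is wanted
    else:
        # Assumption: the last term is (B3=0 AND B2=1).
        b7 = b4 or b6 or ((not b3) and b2)
    next_queue = queue + int(b5) - int(b6)
    flags = {"B1": b1, "B2": b2, "B3": b3, "B4": b4,
             "B5": b5, "B6": b6, "B7": b7}
    return flags, next_queue
```

With regulation enabled, a wanted instruction arriving while V2 is below the minimum is not processed and the queue grows by one; once no instruction is wanted and V2 is back in range, a queued instruction is processed and the queue shrinks by one.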
The model further defines various physical parameters, as set out in Table 3 below:
|Variable||Description||Initial value (N = 1)||Value at cycle N|
|V2||The voltage into the processor (as shown in FIG. 2)||1.0 (as V1)||V2(N − 1) + dV2(N − 1)|
|dV2||The change in …||…||…|
|…||The charge flowing … C2 per cycle||…||Q1(N) − Q3(N) − QR(N)|
|Q1||The charge flowing into the processor …||…||I1(N) * T|
|I1||The current into …||…||I1(N − 1) + dI1(N − 1)|
|dI1||The change in …||…||(V1 − V2(N)) * (T/L1)|
|Q3||… consumed (via C3) by an active …||…||C3 * V2(N) * B7(N)|
|QR||… resistor R2 per cycle||…||V2(N) * T/R2|
The variable N in Table 3 counts cycles. It is assumed in Table 3 that the processor is initially (at N=1) in an off state, whereby V2=V1, and I1=V1/R2. The system remains in this steady state for as long as B7 (as defined in Table 2) stays at zero, i.e. there are no active instructions in processor 210.
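The per-cycle update of Table 3 can be sketched as below. The component values in `Params` are placeholders chosen purely for illustration and are not the values used in the example run; the relation dV2 = (charge into C2)/C2 is likewise an assumption. The names `Params`, `State`, `initial_state` and `step` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Params:
    V1: float = 1.0     # supply voltage (V2 starts at 1.0, "as V1")
    T: float = 1e-9     # cycle time, placeholder value
    L1: float = 5e-9    # inductance, placeholder value
    C2: float = 1e-9    # capacitance, placeholder value
    C3: float = 1e-11   # per-instruction capacitance, placeholder value
    R2: float = 10.0    # resistance, placeholder value


@dataclass
class State:
    V2: float
    I1: float
    dV2: float = 0.0
    dI1: float = 0.0


def initial_state(p):
    """Off state at N = 1: V2 = V1 and I1 = V1/R2."""
    return State(V2=p.V1, I1=p.V1 / p.R2)


def step(s, p, b7):
    """Advance the model by one cycle; b7 is 1 if an instruction is active."""
    v2 = s.V2 + s.dV2                    # V2(N) = V2(N-1) + dV2(N-1)
    i1 = s.I1 + s.dI1                    # I1(N) = I1(N-1) + dI1(N-1)
    q1 = i1 * p.T                        # charge delivered this cycle
    q3 = p.C3 * v2 * (1 if b7 else 0)    # charge consumed by an active instruction
    qr = v2 * p.T / p.R2                 # charge through resistor R2
    q2 = q1 - q3 - qr                    # net charge into C2
    dv2 = q2 / p.C2                      # assumption: dV2 = Q/C2
    di1 = (p.V1 - v2) * (p.T / p.L1)     # change in inductor current
    return State(v2, i1, dv2, di1)
```

Starting from `initial_state` and stepping with `b7=0` leaves V2 at V1, in line with the steady state described above, while stepping with `b7=1` pulls V2 below V1 from the second cycle onwards.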
As an example of the power regulation of the model of Tables 2 and 3, the following values were adopted for the various parameters in the model (all in the appropriate SI units):
5 × 10^−9
5 × 10^−9
3 × 10^−10
5 × 10^−11
A sample input instruction sequence, corresponding to the WANT variable, was provided, comprising 7 zeros, followed by 112 ones, followed by 206 zeros (where one indicates in effect the presence of an instruction, and a zero indicates the absence of an instruction). The outcome of this processing is shown in the graphs of
It can be seen from
Inhibiting an instruction allows the voltage to recover enough to enable the next instruction to be processed. However, the processor voltage again falls below the Vmin threshold as one or more subsequent instructions are processed, causing the inhibit variable to be triggered once more. This cycling of the inhibit variable is rapid at first, but slows as the current (I1) into the processor gradually rises (see
Subsequently, the instruction sequence to be executed is exhausted, as indicated by the Want variable returning to zero. This then allows the queued (inhibited) instructions to be processed (preserving of course the original instruction ordering). This causes the QueueDec variable to become one. (It will be appreciated that these transitions in the Want and QueueDec variables are transparent to the processor, which simply sees a continuous input of instructions to be executed).
After the queue has been emptied, there are no more instructions to process. The resultant sudden reduction in load leads to the processor voltage V2 rising above the Vmax threshold. Accordingly, dummy instructions are periodically inserted, as indicated by the Augment variable, in order to ensure that the processor maintains some load on the power supply. As a dummy instruction executes, the voltage V2 falls. This permits at least one idle cycle before the voltage starts to rise again, and another dummy instruction must be inserted.
The Augment variable now cycles on and off (analogous to the Inhibit variable previously) as the processor voltage repeatedly rises above Vmax and then recovers. However, allowing the processor to idle for at least some instructions causes the input current (I1) to fall gradually (see
It will be noted that in the model just described, the minimum voltage (0.97) has a greater offset from the nominal voltage (1.0) than does the maximum voltage (1.02). This is in contrast to the embodiment of
As previously stated, the most likely cause of a sudden fall in power consumption is the processor starting to idle. In this situation, the insertion of dummy instructions has no impact on overall processing, since the dummy instructions are not replacing any useful instructions. In contrast, if there is a sudden rise in power consumption, then any dummy instructions inserted will generally delay execution of program instructions.
This asymmetry can be exploited by providing an off-centre nominal voltage, as in the above model. For example, let us assume that V2 (as shown in
Having an off-centre nominal voltage for V2 therefore provides greater margin to accommodate voltage falls (due to increased power consumption) than voltage rises (due to decreased power consumption). Consequently, the voltage is more likely to go out of range in the latter situation than the former; in other words, insertion of dummy instruction D1 is more likely to be required than insertion of dummy instruction D0. However, as explained above, dummy instruction D1 is typically invoked during processor idle time, without any performance penalty. Thus the bias in favour of inserting D1 rather than D0 will result in a net decrease in overall processing delay, compared to the centralised nominal voltage shown in the embodiment of
A further possible area of modification or adaptation is LUT 440 (see
In conclusion, although the approach described herein is typically intended for use in a computer system, it is applicable to any electronic system that has a power supply and one or more semiconductor digital electronic load devices for processing instructions, such as a microprocessor, a DSP, a controller, an application specific integrated circuit (ASIC), and so on. It will be appreciated that this includes not only a wide variety of computing systems (mainframe, server, workstation, desktop, laptop, handheld, etc.), but also a great range of other electronic systems (e.g. telecommunications apparatus, household electronic devices such as televisions and DVD players, subsystems for transport devices such as cars and aeroplanes, and so on). In addition, the regulation can be employed for a wide range of power supply circuits and configurations (such as shown in
Thus while a variety of particular embodiments have been described in detail herein, it will be appreciated that this is by way of exemplification only. The skilled person will be aware of many further potential modifications and adaptations that fall within the scope of the claimed invention and its equivalents.
|Cited Patent||Filing date||Publication date||Applicant||Title|
|US4272717||Mar 12, 1979||Jun 9, 1981||Hewlett-Packard Company||Output capacitor discharge circuit|
|US4933829||Apr 17, 1989||Jun 12, 1990||Compaq Computer Corporation||Free running flyback DC power supply with current limit circuit|
|US5584031||Mar 20, 1995||Dec 10, 1996||Motorola Inc.||System and method for executing a low power delay instruction|
|US5825674||Nov 28, 1995||Oct 20, 1998||Intel Corporation||Power control for mobile electronics using no-operation instructions|
|US6157008||Jul 8, 1999||Dec 5, 2000||Maytag Corporation||Power distribution system for an appliance|
|US6205555 *||Feb 16, 1999||Mar 20, 2001||Kabushiki Kaisha Toshiba||Processor power consumption estimating system, processor power consumption estimating method, and storage medium storing program for executing the processor power consumption estimating method|
|US6367023 *||Dec 23, 1998||Apr 2, 2002||Intel Corporation||Method and apparatus of measuring current, voltage, or duty cycle of a power supply to manage power consumption in a computer system|
|US6463396||Apr 19, 1999||Oct 8, 2002||Kabushiki Kaisha Toshiba||Apparatus for controlling internal heat generating circuit|
|US6704876 *||Sep 26, 2000||Mar 9, 2004||Sun Microsystems, Inc.||Microprocessor speed control mechanism using power dissipation estimation based on the instruction data path|
|US6775787 *||Jan 2, 2002||Aug 10, 2004||Intel Corporation||Instruction scheduling based on power estimation|
|US20010003207||Dec 23, 1998||Jun 7, 2001||Intel Corporation||Method and apparatus of measuring power consumption in a computer system to meet the power delivery specifications of a power outlet|
|US20040181698||Mar 13, 2003||Sep 16, 2004||Sun Microsystems, Inc.||Method and apparatus for supplying power in electronic equipment|
|GB2260233A||Title not available|
|GB2361326A||Title not available|
|JP2000330673A||Title not available|
|JP2001034370A||Title not available|
|WO1992010032A1||Nov 26, 1990||Jun 11, 1992||Adaptive Solutions, Inc.||Temperature-sensing control system and method for integrated circuits|
|WO1998038734A2||Feb 16, 1998||Sep 3, 1998||Koninklijke Philips Electronics N.V.||Amplifier arrangement|
|1||Combined Search and Examination Report under Sectios 17 and 18(3), Application No. GB0415878.8, mailed Nov. 18, 2004, 5 pages.|
|2||H.D.L. Hollmann, et al., "Protection of Software Algorithms Executed on Secure Modules," Future Generation Computer Systems 13, Elsevier Science B. V., 1997, pp. 55-63.|
|Citing Patent||Filing date||Publication date||Applicant||Title|
|US8762744 *||Dec 6, 2005||Jun 24, 2014||Arm Limited||Energy management system configured to generate energy management information indicative of an energy state of processing elements|
|US8904208 *||Nov 4, 2011||Dec 2, 2014||International Business Machines Corporation||Run-time task-level dynamic energy management|
|US9021281 *||Nov 12, 2013||Apr 28, 2015||International Business Machines Corporation||Run-time task-level dynamic energy management|
|US9170916 *||Jul 11, 2011||Oct 27, 2015||Damian Dalton||Power profiling and auditing consumption systems and methods|
|US20090254767 *||Dec 6, 2005||Oct 8, 2009||Arm Limited||Energy Management|
|US20120011378 *||Jul 11, 2011||Jan 12, 2012||Stratergia Ltd||Power profiling and auditing consumption systems and methods|
|US20130117588 *||Nov 4, 2011||May 9, 2013||International Business Machines Corporation||Run-Time Task-Level Dynamic Energy Management|
|U.S. Classification||713/320, 713/323, 712/E09.049, 712/E09.032, 713/322, 713/300|
|International Classification||G06F1/28, G06F1/32, G06F9/38, G06F1/30, G06F1/26, G06F9/30|
|Cooperative Classification||G06F9/3836, G06F9/30083, G06F9/3869, G06F9/30076, G06F9/38|
|European Classification||G06F9/30A8P, G06F9/30A8, G06F9/38|
|May 3, 2004||AS||Assignment|
Owner name: SUN MICROSYSTEMS, INC., CALIFORNIA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:WILLIAMS, EMRYS J.;CURTIS, MARK/SUN MICROSYSTEMS LIMITED;REEL/FRAME:015292/0056
Effective date: 20031001
|Sep 1, 2010||FPAY||Fee payment|
Year of fee payment: 4
|Sep 3, 2014||FPAY||Fee payment|
Year of fee payment: 8
|Dec 16, 2015||AS||Assignment|
Owner name: ORACLE AMERICA, INC., CALIFORNIA
Free format text: MERGER AND CHANGE OF NAME;ASSIGNORS:ORACLE USA, INC.;SUN MICROSYSTEMS, INC.;ORACLE AMERICA, INC.;REEL/FRAME:037302/0732
Effective date: 20100212