US 20050268141 A1
A graphics processing device implementing a set of techniques for power management, preferably at both a subsystem level and a device level, and preferably including peak power management, a system including a graphics processing device that implements such a set of techniques for power management, and the power management methods performed by such a device or system. In preferred embodiments, the device includes at least two subsystems and hardware mechanisms that automatically seek the lowest power state for the device that does not impact performance of the device or of a system that includes the device. Preferably, the device includes a control unit operable in any selected one of multiple power management modes, and system software can intervene to cause the control unit to operate in any of these modes. For example, the device can include a register interface to which an external processor can write control bits to select among the modes. Preferably, the control unit is operable in a subsystem power management mode in which the hardware mechanisms prevent assertion of clocks to idle subsystems, and disable generation of clocks that are not used by non-idle subsystems, or a device power management mode in which power consumption is controlled at levels of broader scope than individual subsystems, such as by disabling generation of a device clock, preventing assertion of a device clock to circuitry of the device, controlling device clock frequencies, and controlling voltage regulators employed to provide power to the device. Peak power management in accordance with the invention is designed to artificially limit the peak power drawn by the device to a predetermined thermal design point by dynamically lowering clock frequencies or voltage or both when power consumption exceeds a threshold.
The invention can be implemented to reduce power consumption in mobile computing systems and is also applicable to desktop systems, for example to manage peak power requirements.
1. A system, including:
a system bus;
a CPU connected along the system bus;
the graphics processing device being connected along the system bus; and
at least one input device connected along the system bus,
wherein at least one of the CPU and the graphics processing device are configured to operate in a frame generation mode in which frames of image data are generated by the graphics processing device at a selected frame rate and the CPU is configured to respond to control data asserted from the input device over the system bus by causing the device to enter the frame generation mode, where the control data determines the selected frame rate.
2. The system of
3. The system of
4. The system of
5. The system of
6. The system of
7. The system of
8. The system of
9. The system of
10. The system of
11. The system of
12. The system of
13. The system of
14. The system of
15. The system of
16. The system of
17. A system, including:
a system bus;
a CPU connected along the system bus;
a graphics processing device being connected along the system bus; and
a liquid crystal display connected along the system bus,
wherein at least one of the CPU and the graphics processing device is configured to operate in any selected one of a first mode and a second mode, where the display is refreshed in the first mode by asserting a frame of image data to the display at a first rate Rn, and wherein the display is refreshed in the second mode by asserting a frame of image data to the display at a second rate Rr, where Rr is less than Rn.
18. The system of
19. A system, including:
a system bus;
a CPU connected along the system bus;
a graphics processing device being connected along the system bus; and
a backlit display connected along the system bus, wherein the backlit display includes a backlight,
wherein at least one of the CPU and the graphics processing device is configured to drive the backlight with any selected one of at least two predetermined duty cycles.
The application is a divisional of, and claims priority benefit of, co-pending U.S. patent application Ser. No. 09/972,414, titled “Method And Apparatus For Power Management Of Graphics Processors And Subsystems Thereof”, filed Oct. 5, 2001, having common inventors and assignee as this application. The subject matter of the related patent application is hereby incorporated by reference.
The present invention relates to methods and apparatus for power management of graphics chips (graphics processors implemented as integrated circuits), graphics cores (portions of graphics chips), and systems including graphics processors.
Typically, most of the power consumed by a graphics processor (“GPU”) while it renders video data for display is consumed as a result of the toggling of clocks within the GPU.
Conventional methods for power management of processing circuitry (e.g., CPUs and graphics processing circuitry) include placing the processing circuitry in a state in which it consumes low power when it is idle, placing a circuit block of the processing circuitry in a state in which it consumes low power when it is idle, and controlling at least one voltage asserted to the processing circuitry and the frequency of at least one clock used by the processing circuitry to reduce power consumption.
Each of the following terms is used throughout the specification, including in the claims, in the following sense:
In a class of embodiments, the invention is a graphics processing device (an integrated circuit or portion of an integrated circuit) implementing a set of techniques for power management, preferably at both a subsystem level and a device level, and preferably including peak power management. Other aspects of the invention are a system including a graphics processing device that implements such a set of techniques for power management, and the power management methods themselves.
In preferred embodiments of the invention, hardware mechanisms automatically seek the lowest power state for the device that does not impact performance of the device or performance of a system that includes the device. Preferably, the hardware mechanisms disable generation of clocks that are not used by non-idle subsystems and prevent assertion of clocks to idle subsystems. In preferred embodiments of the inventive system, system software intervenes (to determine the power management mode of a graphics processing device) in cases in which the device does not have sufficient information to seek the most appropriate power state, and cases in which a user wishes to override the automatic mechanisms. For example, the system software can be implemented to give a user direct control over power management decisions from a dialog box of a displayed control panel. The user can use the control panel to override automatic mechanisms and choose a high-performance device state (with high power consumption) when playing a game to improve the user's graphics experience at the expense of power consumption. In response, system software must intervene to communicate the user's decision to the device.
The invention addresses six areas of power management: subsystem power management, device power management, peak power management, frame rate management, display interface frequency management, and backlight intensity management.
Subsystem power management focuses on individual subsystems of a graphics processing device, such as a graphics subsystem or an MPEG subsystem. Hardware mechanisms in the device automatically disable generation of clocks that are not used by non-idle subsystems and prevent assertion of clocks to idle subsystems, and optionally also perform other subsystem power management operations. In some implementations, external control of subsystem power management is necessary at times (e.g., when the automatic hardware mechanisms do not have sufficient information to seek the most appropriate power state, or when a user wishes to override the automatic mechanisms).
Device power management controls power consumption by a device at levels of broader scope than individual subsystems, such as by disabling generation of a device clock, preventing assertion of a device clock to any circuitry of the device, controlling device clock frequencies, and controlling voltage regulators employed to provide power to the entire device. Preferably, device power management is controlled by an entity (e.g., system software) external to the device, e.g., in response to changes in a system's pattern of using the device or in response to a system-level change in power state such as a change to battery use from A/C power use.
Peak power management is preferably implemented entirely by automatic hardware mechanisms of a device in accordance with the invention. Peak power management in accordance with the invention is designed to artificially limit the peak power drawn by a device to a pre-determined thermal design point by dynamically lowering clock frequencies and/or voltage when power consumption exceeds a threshold (preferably so as to optimize for real applications rather than contrived applications).
In accordance with frame rate management, a graphics processing device is configured to operate in a frame generation mode in which it generates frames of image data at a selected frame rate, where the selected frame rate is a selected one of a set of predetermined frame rates. Preferably, the predetermined frame rates include a maximum frame rate and a reduced frame rate, the device generates pixels of the image data at a first pixel rate when operating in the frame generation mode with the selected frame rate equal to the maximum frame rate, and the device generates pixels of the image data at the first pixel rate but with idle time between generation of at least two subsets of the pixels (e.g., between lines, blocks of lines, fields, or frames of the image data) when operating in the frame generation mode with the selected frame rate equal to the reduced frame rate.
In accordance with display interface frequency management, a system (including a graphics processing device and a liquid crystal display) is configured to operate in at least two modes: a first mode in which the display is refreshed (by asserting a frame of image data to the display) at a normal rate, Rn (e.g., Rn=60 frames per second); and a second mode in which the display is updated at a reduced rate, Rr (e.g., Rr=40 frames per second). It is especially desirable to implement the system with this capability where the cells of the display require a minimum time, T, to change state, where T is greater than 1/Rn but less than 1/Rr.
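The timing condition above can be checked numerically. The following is a minimal sketch only: the function name and the 20 ms settling time are assumptions chosen for illustration, while the 60 and 40 frames-per-second rates are the example values given in the text.

```python
def cell_settles_within_frame(refresh_rate_hz: float, settle_time_s: float) -> bool:
    """Return True if a display cell that needs settle_time_s to change state
    can do so within one frame period (1 / refresh_rate_hz)."""
    return settle_time_s < 1.0 / refresh_rate_hz


RN = 60.0  # normal refresh rate Rn in frames per second (example from the text)
RR = 40.0  # reduced refresh rate Rr in frames per second (example from the text)
T = 0.020  # hypothetical cell settling time of 20 ms (assumed, not from the text)

normal_ok = cell_settles_within_frame(RN, T)   # frame period is about 16.7 ms
reduced_ok = cell_settles_within_frame(RR, T)  # frame period is 25 ms
```

With these assumed numbers, T lies between 1/Rn and 1/Rr, which is the case in which the reduced-rate mode is described as especially desirable.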
In accordance with backlight intensity management, a system (including a graphics processing device and a backlit display) is configured to drive the display's backlight with any of at least two predetermined duty cycles (and preferably with any of a large number of predetermined duty cycles). The power consumption of the backlight (as well as the time-averaged brightness of the light emitted thereby) is reduced by reducing the duty cycle of the signal that provides power thereto.
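The relationship between duty cycle and backlight power can be sketched as follows. This is an idealized model that assumes time-averaged power scales linearly with duty cycle; the function name, the set of duty cycles, and the 2 W full-on figure are all hypothetical and do not come from the text.

```python
# Hypothetical set of predetermined duty cycles (fractions of each period
# during which the backlight is powered).
PREDETERMINED_DUTY_CYCLES = (0.25, 0.50, 0.75, 1.00)


def backlight_average_power(full_on_power_w: float, duty_cycle: float) -> float:
    """Time-averaged backlight power under an idealized linear model."""
    if duty_cycle not in PREDETERMINED_DUTY_CYCLES:
        raise ValueError("duty cycle is not one of the predetermined values")
    return full_on_power_w * duty_cycle


# Assumed 2 W backlight driven at each predetermined duty cycle.
power_by_duty = {d: backlight_average_power(2.0, d) for d in PREDETERMINED_DUTY_CYCLES}
```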
An important benefit of the invention is reduction of power consumption within mobile computing systems, and the invention is also applicable to desktop systems seeking to manage peak power requirements. In preferred embodiments, the inventive device is operable in any selected one of several power management modes, and includes a register interface to which an external processor can write control bits to select among these modes.
With reference to
The components of device 2 include host slave unit 15, host control bus 11, host clock PLL 24 (which generates a host clock, and asserts the host clock to unit 15 and each of a set of multiplexers including multiplexers 45, 46, and 47), device clock PLLs (including PLLs 25, 26, 27, each of which generates a different device clock and asserts the device clock to one of the multiplexers), subsystem clocks (including the subsystem clock asserted to one input of AND gate 36, the subsystem clock asserted to one input of AND gate 38, the subsystem clock asserted to one input of AND gate 40, and the subsystem clock asserted to one input of AND gate 42), subsystems (including subsystems 16, 18, 20, and 22), and control unit 12, connected as shown. The number of device clocks, subsystem clocks, and subsystems, and the relationships between the device clocks and subsystems is implementation-specific. Subsystems 16, 18, 20, and 22 include at least one graphics subsystem and an MPEG subsystem.
Host slave unit 15 (which can be an AGP interface) is the interface between system bus 5 and host control bus 11, and thus between subsystems 16, 18, 20, and 22 (connected along host control bus 11) and CPU 4. The subsystems (including subsystems 16, 18, 20, and 22) can be defined with very coarse granularity (e.g., subsystem 16 can be an entire 3D image data processing pipeline), or with finer granularity (e.g., subsystem 16 can be a setup unit within a 3D image data processing pipeline, and subsystem 18 can be another subsystem of the pipeline).
Host slave unit 15 uses host control bus 11 to access subsystem registers (within the subsystems) in response to slave transactions initiated by CPU 4. These slave transactions are always allowed to complete successfully, regardless of the state of the power management control registers of the relevant one (or ones) of the subsystems. In other words, a clock must be provided for each subsystem even when the subsystem's device clock (the device clock determining the subsystem's subsystem clock) has been disabled to reduce power consumption, if necessary to implement a slave transaction. In one embodiment, this is accomplished as follows using multiplexers 45, 46, and 47. The host clock generated by PLL 24 (which controls host slave unit 15) is asserted to one input of each of multiplexers 45, 46, and 47. In response to one or more control signals asserted by control unit 12, multiplexer 45 selects the host clock (and deselects the output of PLL 25) when control unit 12 disables PLL 25, multiplexer 46 selects the host clock (and deselects the output of PLL 26) when control unit 12 disables PLL 26, and multiplexer 47 selects the host clock (and deselects the output of PLL 27) when control unit 12 disables PLL 27. Control unit 12 also asserts the appropriate control bits to AND gates 30, 32, 34, 36, 38, 40, and 42 to ensure that each of the subsystems receives a toggling clock (either the host clock, or a subsystem clock determined by the output of one of PLLs 25, 26, and 27) so that each of the subsystems can respond to host slave transactions even when device 2 is in a reduced-power state (with one or more of PLLs 25, 26, and 27 disabled).
At least one subsystem clock determined by each of the device clocks generated by circuits 25, 26, and 27 is asserted to each of at least one of the subsystems. Specifically, a first subsystem clock (determined by the device clock output from PLL 25) is asserted through multiplexer 45, AND gate 30, and AND gate 36 to subsystem 16, a second subsystem clock (also determined by the device clock output from PLL 25) is asserted through multiplexer 45, AND gate 30, and AND gate 38 to subsystem 18, a third subsystem clock (determined by the device clock output from PLL 26) is asserted through multiplexer 46, AND gate 32, and AND gate 40 to subsystem 20, and a fourth subsystem clock (determined by the device clock output from PLL 27) is asserted through multiplexer 47, AND gate 34, and AND gate 42 to subsystem 22.
Each branch of a device clock tree that can be separately disabled by control unit 12 (and the clock asserted through such branch) is referred to as a subsystem clock. For example, the first subsystem clock is asserted through the branch from AND gate 30 through AND gate 36 to subsystem 16, and the second subsystem clock is asserted through the branch from AND gate 30 through AND gate 38 to subsystem 18. Circuitry within subsystem 16 operates in response to the first subsystem clock, and circuitry within subsystem 18 operates in response to the second subsystem clock. Control unit 12 asserts a control signal (indicative of a logical zero) to AND gate 36 and another control signal (indicative of a logical one) to AND gate 38 to enable the second subsystem clock and disable the first subsystem clock. In general, one or more subsystem clocks control each subsystem. Subsystem clocks are typically not shared by subsystems, although it is contemplated that they are so shared in some embodiments of the invention.
Control unit 12 can also assert a control signal (indicative of a logical one or zero) to each of AND gates 30, 32, and 34 to enable or disable each device clock (in the sense of allowing each device clock to be asserted, or preventing each device clock from being asserted, to the relevant subsystem or subsystems). For example, unit 12 asserts a control signal (indicative of a logical one) to AND gate 30 and control signals (indicative of logical zeroes) to AND gates 32 and 34 to enable one device clock (the device clock generated by PLL 25) and disable the device clocks generated by PLLs 26 and 27. Control unit 12 can also assert a control signal to each of the clock generation PLLs (e.g., PLL 24, PLL 25, PLL 26, and PLL 27) to enable or disable generation of each device clock. Power consumption by subsystems 16 and 18 (and thus by device 2) would decrease in response to assertion of a control signal indicative of a logical zero to AND gate 30 (although PLL 25 would continue to consume power in this case while it continues to generate a device clock), and power consumption by device 2 could then be further decreased by asserting another control signal (from control unit 12) to PLL 25 to cause PLL 25 to cease generation of a device clock.
In operation of device 2, each of the subsystems (including subsystems 16, 18, 20, and 22) asserts a status signal to control unit 12. For example, each status signal is a single bit indicative of whether the subsystem is or is not idle. In one mode of operation, control unit 12 asserts control signals to elements 24-27, 30, 32, 34, 36, 38, 40, and 42 in response to the status signals. For example, control unit 12 can be implemented to respond (when operating in this mode) to a status signal indicating that subsystem 22 is idle by asserting to PLL 27 a control signal causing PLL 27 to cease generation of a device clock for subsystem 22. For another example, control unit 12 can be implemented to respond (when operating in this mode) to status signals indicating that only subsystem 16 is idle, by asserting a control signal indicative of a logical zero to AND gate 36, asserting control signals indicative of logical ones to the other AND gates, and allowing all of PLLs 24-27 to generate clocks.
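The clock distribution paths described above can be modeled in software to make the gating behavior concrete. The following is a behavioral sketch only, not a description of the actual circuitry: the table encodes the PLL, multiplexer, and AND gate numbers given in the text, while the function name, the dictionary layout, and the string return values are assumptions introduced for illustration.

```python
# Path from clock source to each subsystem, using the reference numerals
# from the description: PLL -> multiplexer -> device-level AND gate ->
# subsystem-level AND gate -> subsystem.
CLOCK_TREE = {
    16: {"pll": 25, "mux": 45, "device_gate": 30, "subsystem_gate": 36},
    18: {"pll": 25, "mux": 45, "device_gate": 30, "subsystem_gate": 38},
    20: {"pll": 26, "mux": 46, "device_gate": 32, "subsystem_gate": 40},
    22: {"pll": 27, "mux": 47, "device_gate": 34, "subsystem_gate": 42},
}


def clock_source(subsystem, pll_enabled, gate_enabled):
    """Return "device", "host", or None (no toggling clock) for a subsystem.

    The multiplexer is modeled as selecting the host clock whenever the
    corresponding device clock PLL is disabled.
    """
    path = CLOCK_TREE[subsystem]
    if not (gate_enabled[path["device_gate"]] and gate_enabled[path["subsystem_gate"]]):
        return None  # a logical zero on either AND gate blocks the clock
    return "device" if pll_enabled[path["pll"]] else "host"


all_gates = {g: True for g in (30, 32, 34, 36, 38, 40, 42)}
all_plls = {25: True, 26: True, 27: True}

# Example from the text: only subsystem 16 is idle, so AND gate 36 receives
# a logical zero while the other gates receive logical ones.
gates_16_idle = {**all_gates, 36: False}

# Example from the text: PLL 27 is disabled; with the gates left enabled,
# multiplexer 47 passes the host clock so subsystem 22 can still respond
# to slave transactions.
plls_27_off = {**all_plls, 27: False}
```

In this model, `clock_source(16, all_plls, gates_16_idle)` yields `None` while subsystem 18 still receives its device-derived clock, and `clock_source(22, plls_27_off, all_gates)` yields `"host"`.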
As described above, a primary function of control unit 12 is to control the clock distribution circuitry of device 2. For illustrative purposes,
Consistent with the definition set forth above, the term “subsystem” is used herein to denote a block of physically related circuitry, the registers (and other clocked circuitry) of which are physically resident on the same controllable branch of a clock tree, such that assertion of the clock to the subsystem can be disabled in its entirety without affecting assertion of clocks to other circuitry of the device (circuitry other than the subsystem). While it is necessary only that the circuitry of each subsystem is physically related by the clock tree, it is common and convenient for the circuitry of each subsystem to be logically related as well.
The vast majority of the power consumed by a typical graphics processor is consumed as a consequence of the toggling of clocks. The
To determine the necessity for generating and asserting each clock of device 2, it is necessary to consider the two classes of registers that rely on these clocks: host registers and non-host registers. A typical subsystem of device 2 contains both host and non-host registers. Host registers are accessible via host slave unit 15 and host control bus 11 as a consequence of slave accesses to device 2 by CPU 4. Host registers are used for configuration management and are accessed relatively infrequently. Non-host registers are not accessible via host slave unit 15. Non-host registers of a subsystem are used during the subsystem's normal operation rather than for configuration management, and typically have a much higher access frequency than host registers.
In order to avoid requiring system software to query the power management state of device 2 before accessing a host register within any of the subsystems of device 2, preferred embodiments of the invention require each subsystem to respond properly to slave accesses (by CPU 4, via host slave unit 15 and host control bus 11) regardless of the power management state of the subsystem and of device 2. Since response to each such access typically requires that the target subsystem operate in response to a clock, device 2 is configured to provide an appropriate clock to the target subsystem even when device 2 is in a reduced-power state in which a subsystem clock is not asserted to the target subsystem. This can be accomplished (as described above) by configuring control unit 12 to cause multiplexers (e.g., multiplexers 45, 46, and 47 of
Preferably, control unit 12 can be configured by CPU 4 (programmed with system software) to operate in any selected one of a number of different power management modes for controlling power consumption for each subsystem. For example, in a class of embodiments, register array 12A of control unit 12 (or other circuitry of device 2 that is accessible by control unit 12) includes a two-bit host register (called a “PM_SUBSYSTEM_CONTROL” register) for each of subsystems 16, 18, 20, and 22. System software can write a two-bit word to each “PM_SUBSYSTEM_CONTROL” register to indicate the power management mode for the corresponding subsystem. For convenience, the PM_SUBSYSTEM_CONTROL registers for all the subsystems can be aggregated into a physically contiguous array of registers or a single larger register (e.g., a 32-bit register of array 12A, when device 2 includes fifteen or sixteen subsystems). Control unit 12 decodes the two-bit word in each PM_SUBSYSTEM_CONTROL register as follows:
Preferably, each subsystem is designed to operate efficiently when controlled in the AUTOMATIC mode, so that system software never needs to require that it be controlled in the SUSPENDED mode. The SUSPENDED mode is a concession that this goal may not always be achievable. So that system software can determine whether it must be involved in the subsystem's power management (by triggering SUSPENDED mode control of the subsystem at appropriate times), the system software must be aware of the quality of each subsystem's operation in response to AUTOMATIC mode control.
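Aggregating the two-bit PM_SUBSYSTEM_CONTROL fields into a single 32-bit register can be sketched as follows. This is an illustration under stated assumptions: the numeric encodings assigned to AUTOMATIC and SUSPENDED, the placement of subsystem index 0 in the least significant bits, and the helper names are all hypothetical and are not the encoding used by control unit 12.

```python
FIELD_BITS = 2            # each PM_SUBSYSTEM_CONTROL field is two bits wide
FIELD_MASK = 0b11
MAX_SUBSYSTEMS = 32 // FIELD_BITS  # sixteen fields fit in a 32-bit register

# Hypothetical encodings; the text names these modes but the values here
# are assumed purely for illustration.
MODE_AUTOMATIC = 0b01
MODE_SUSPENDED = 0b10


def set_mode(register: int, index: int, mode: int) -> int:
    """Return the register value with the field for subsystem `index` set to `mode`."""
    if not 0 <= index < MAX_SUBSYSTEMS:
        raise ValueError("subsystem index out of range")
    if not 0 <= mode <= FIELD_MASK:
        raise ValueError("mode does not fit in two bits")
    shift = index * FIELD_BITS
    cleared = register & ~(FIELD_MASK << shift) & 0xFFFFFFFF
    return cleared | (mode << shift)


def get_mode(register: int, index: int) -> int:
    """Extract the two-bit mode field for subsystem `index`."""
    if not 0 <= index < MAX_SUBSYSTEMS:
        raise ValueError("subsystem index out of range")
    return (register >> (index * FIELD_BITS)) & FIELD_MASK


# Put every subsystem in the assumed AUTOMATIC encoding, then suspend index 3.
reg = 0
for i in range(MAX_SUBSYSTEMS):
    reg = set_mode(reg, i, MODE_AUTOMATIC)
reg = set_mode(reg, 3, MODE_SUSPENDED)
```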
We next describe in more detail the device-level power management techniques employed by preferred implementations of the
We first consider frame rate management, which is both a device-level power management technique and a system-level power management technique. The power consumed by the inventive device (e.g., device 2) and a system including the inventive device (e.g., the
Consider an implementation of the
More generally, the driver software causes CPU 4 and GPU 2 to respond to a command to run a game program at a reduced frame rate (e.g., 60 fps, 30 fps, or 40 fps) by entering a mode in which GPU 2 generates (and asserts to display device 10) pixels of video data at the same pixel rate as when the system runs the game at the maximum frame rate (e.g., 120 fps) in the sense that each clock of the system has the same frequency during execution of the game at 120 fps as during execution of the game at the reduced frame rate, but with idle intervals between assertion of subsets of the pixels (e.g., between lines, blocks of lines, fields, or frames of the data), thereby generating frames of the video data at the selected, reduced frame rate (e.g., 60 fps, 40 fps, or 30 fps). During each idle interval (e.g., each lengthened vertical blanking interval), GPU 2 consumes very low power (e.g., it is idle, and consumes only the small amount of power needed for logic circuitry to count to the end of the idle interval and trigger the rendering of the next subset of the data).
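The amount of idle time created by a reduced frame rate can be estimated with simple arithmetic. The sketch below assumes that generating one frame at the unchanged pixel rate takes exactly one frame period of the maximum frame rate, so that no idle time remains at the maximum rate; that assumption and the function names are introduced here for illustration.

```python
def idle_time_per_frame_s(max_fps: float, selected_fps: float) -> float:
    """Idle time per frame when frames are generated at the full pixel rate
    (assumed to take 1/max_fps each) but issued only selected_fps times per second."""
    if not 0 < selected_fps <= max_fps:
        raise ValueError("selected frame rate must be in (0, max_fps]")
    return 1.0 / selected_fps - 1.0 / max_fps


def idle_fraction(max_fps: float, selected_fps: float) -> float:
    """Fraction of each frame period during which the device is idle."""
    return idle_time_per_frame_s(max_fps, selected_fps) * selected_fps


# Frame rates mentioned in the text: 120 fps maximum; 60, 40, or 30 fps reduced.
fractions = {fps: idle_fraction(120.0, fps) for fps in (120.0, 60.0, 40.0, 30.0)}
```

Under this assumption, running at 60 fps leaves the device idle for about half of each frame period, and running at 30 fps for about three quarters of it.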
In variations on the described embodiment, the user selects the frame rate (at which the system operates while executing a game program) using hardware rather than software, such as by actuating one or more switches.
Another device-level power management technique is device clock control. As previously described, device clocks have a more global scope than subsystem clocks. A device clock originates at the root of a clock tree (typically a PLL, such as PLL 25, 26, or 27 of
Even when a subsystem clock is disabled through subsystem power management (e.g., by assertion of a control signal indicative of a logical zero to AND gate 36), a device clock can still consume power through its device clock generation circuitry and through the portion of the device clock tree up to the branches of the tree at which the subsystem clock originates.
Preferably, system software (i.e., CPU 4, programmed with system software) configures control unit 12 to operate in any selected one of a number of different device-level power management modes for controlling each device clock. For example, in a class of embodiments, register array 12A of control unit 12 (or other circuitry of device 2 that is accessible by control unit 12) includes a four-bit host register (called a “PM_DEVICE_CONTROL” register) for each device clock generation circuit (e.g., each of PLLs 25, 26, and 27). System software can write a four-bit word to each “PM_DEVICE_CONTROL” register to indicate one of three power management modes for controlling the corresponding device clock (a mode in which the device clock generation circuit is powered-on and driving its clock tree, a second mode in which the device clock generation circuit is powered-on yet unused in its clock tree in deference to the host clock, and a third mode in which the device clock generation circuit is powered-down and unused in its clock tree in deference to the host clock). For convenience, the PM_DEVICE_CONTROL registers for all device clocks can be aggregated into a physically contiguous array of registers or a single larger register (e.g., a 32-bit register, when device 2 includes five device clock generation PLLs). Control unit 12 decodes the four-bit word in each PM_DEVICE_CONTROL register as follows:
In a variation on the described BYPASS mode, control unit 12 is configured to enable the device clock generation circuit (if present) and to select the device clock to drive the device clock tree when status signals (from the relevant ones of the subsystems) indicate that at least one subsystem (containing circuitry to be driven by the device clock, or a subsystem clock derived from the device clock) is not idle, and otherwise to enable the device clock generation circuit and select the host clock to drive the device clock tree.
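The selection rule in this variation can be expressed compactly. The following sketch models only the decision described in the preceding paragraph; the function name, the tuple it returns, and the string labels are hypothetical.

```python
def bypass_variation_select(subsystem_idle_flags):
    """Decide how a device clock tree is driven in the described variation.

    subsystem_idle_flags: one boolean per subsystem whose circuitry is driven
    by this device clock (True means the subsystem reports idle).

    Returns (generation_circuit_enabled, selected_clock). In this variation the
    generation circuit stays enabled in both cases; only the selection changes.
    """
    any_busy = not all(subsystem_idle_flags)
    return (True, "device" if any_busy else "host")


# Subsystems 16 and 18 share the device clock from PLL 25 in the description.
busy_case = bypass_variation_select([True, False])  # subsystem 18 is not idle
idle_case = bypass_variation_select([True, True])   # both subsystems are idle
```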
Another device-level power management technique is host clock management. In conventional laptop computer systems that include a graphics device, the graphics device typically has a hardware “suspend” pin, to which the CPU (or other system element) can assert a control signal which guarantees to the graphics device that its services are not required and that it can completely power-down with no adverse effects. Without such a guarantee, there is typically no mechanism for the graphics device to determine when it can safely power-down. For example, since the device's host slave unit and any registers available across the host control bus must remain available at all times absent a guarantee from the system that they are not needed, the device cannot safely power-down its circuitry for responding to a slave access until the hardware suspend pin is asserted.
The system of
In implementing host clock management or device clock control, each host clock generation circuit and device clock generation circuit should be powered up and powered down appropriately. Because clock generation PLLs (in a graphics processing device) take time to acquire their set frequency and in order to avoid glitches or runt pulses on the clock waveforms generated by such PLLs, it is important to power-up and power-down such PLLs in a preferred sequence when altering the power management state of one or more device clocks or the host clock in accordance with the invention. The preferred sequence for powering-down a device clock PLL (in an implementation of device 2 of
The preferred sequence for powering-up a device clock PLL (in an implementation of device 2 of
The power-up and power-down sequences for the host clock generation circuit (e.g., PLL 24 of
Other device-level power management techniques are dynamic supply voltage management and device clock frequency management. To implement supply voltage management, control unit 12 of device 2 is implemented to control external voltage regulator 14 as described above. Typically, control unit 12 asserts a control signal to voltage regulator 14 (to select one of the available set points of regulator 14) in response to a control signal asserted to control unit 12 from CPU 4 (via system bus 5, host slave unit 15, and host control bus 11). Thus, a user can cause the system software to implement a user-specified tradeoff between device performance and power consumption. For example, if coarse granularity of the supply voltage is sufficient, device 2 and regulator 14 can be implemented to support just three voltage levels: a nominal-voltage level (e.g., 1.2 volts), a low-voltage level (e.g., 1.0 volt), and a high-voltage (high-performance) level (e.g., 1.5 volts). Preferably, no device state is lost when device 2 operates at any of the available voltage settings. Optionally, control unit 12 and regulator 14 are implemented so that system software can cause control unit 12 to disable device 2's power supply entirely (or system software can directly disable device 2's power supply) when device 2 enters a “suspended” state, in which case device 2 will lose its state.
Each device that embodies the invention will have a maximum clock frequency for each supported voltage level for each device clock. System software must respect these limits whenever causing a change in the supply voltage level for the device. Typically, when transitioning from a higher supply voltage to a lower supply voltage, the device clock (and host clock) frequencies should be adjusted downward and the clock generation circuitry should be given adequate time to lock to the new frequencies before the voltage is actually changed. Typically, when transitioning from a lower voltage to a higher voltage, the device clock (and host clock) frequencies should be adjusted upward after the voltage has been changed.
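The ordering constraints on voltage and frequency changes can be summarized in a short sketch. The step labels and the function name are hypothetical; only the ordering (lower the frequencies and wait for lock before lowering the voltage, raise the frequencies only after raising the voltage) is taken from the preceding paragraph.

```python
def transition_steps(old_voltage: float, new_voltage: float) -> list:
    """Return the ordered steps for a supply voltage change."""
    if new_voltage < old_voltage:
        # Going down: slow the clocks first so that the device never runs
        # faster than the lower voltage supports.
        return ["lower_clock_frequencies", "wait_for_clock_lock", "change_voltage"]
    if new_voltage > old_voltage:
        # Going up: raise the voltage first, then speed the clocks up.
        return ["change_voltage", "raise_clock_frequencies"]
    return []  # no change requested


# Example voltage levels mentioned in the text: 1.5 V, 1.2 V, and 1.0 V.
down = transition_steps(1.5, 1.0)
up = transition_steps(1.0, 1.2)
```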
In a preferred implementation, a host register in control unit 12 (e.g., a 32-bit register known as a “PM_CORE_VOLTAGE_CONTROL” register) determines the control signals asserted by control unit 12 to control the voltage level of regulator 14. The mappings from the value in that register to one of the available voltage levels and the associated device clock (and host clock) frequencies are device-specific.
Another of the power management techniques implemented in preferred embodiments of the invention is peak power management. System designers must consider the maximum power dissipation of each device to determine an appropriate strategy for heat removal. Space considerations for mobile computing systems limit the options for heat removal. Conventionally, determination of the maximum power dissipation for a device requires finding or writing a worst-case software application. Unfortunately, such an application is often contrived and can dissipate considerably more power than the worst likely (or “real”) application. However, although the application is contrived, the system's thermal design must still respect it.
Preferably, peak power management in accordance with the invention is designed to artificially limit the peak power drawn by a device to a predetermined thermal design point by dynamically lowering device clock (and host clock) frequencies, reducing the supply voltage, or both, whenever power dissipation by the device rises beyond a predetermined threshold. For example, the threshold can be set to correspond (at least approximately) to the power dissipation in the worst “real” application. Alternatively, for systems with more stringent thermal requirements, the threshold can be a lower level of power dissipation.
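A threshold-based limiter of this kind might look like the following Python sketch. Everything specific in it is an assumption made for illustration: the table of operating points, the 3.0 W threshold, and in particular the lower "release" level used to step performance back up, which the description does not specify.

```python
# Hypothetical operating points, highest performance first: (volts, core MHz).
OPERATING_POINTS = [(1.5, 350), (1.2, 250), (1.0, 150)]


def next_point(index, power_watts, threshold_watts, release_watts):
    """Return the index of the operating point to use for the next interval."""
    if power_watts > threshold_watts and index < len(OPERATING_POINTS) - 1:
        return index + 1  # over the threshold: step down one operating point
    if power_watts < release_watts and index > 0:
        return index - 1  # well under the threshold: step back up (assumption)
    return index


def run(samples, threshold_watts=3.0, release_watts=2.0, start=0):
    """Replay measured power samples and record the chosen operating point."""
    index = start
    history = []
    for power in samples:
        index = next_point(index, power, threshold_watts, release_watts)
        history.append(index)
    return history
```

Keeping the release level below the threshold is one way to avoid oscillating between two operating points when dissipation hovers near the limit; a real implementation could equally use timers or averaging.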
Preferably, a system including a graphics processing device (e.g., device 2 of
Other power management techniques that can be implemented in some embodiments of the invention include:
In order to illustrate advantages attainable by implementing the invention, the following section describes a typical usage pattern of a laptop computer that embodies the invention, highlighting the relationship between the modes of use and the state of the power management system. The example will consider a series of activities starting with a marketing manager preparing a PowerPoint presentation (consuming power equal to 1.5 watts), then playing a video game (consuming power equal to 3 watts), reengaging with PowerPoint (1.5 watt consumption), then leaving the screen idle (consuming power equal to 0.5 watt) until the laptop blacks the screen and enters a suspend state (consuming no more than an insignificant amount of power).
Since the hardware architecture of the inventive device and system (e.g., those embodiments described herein with reference to
To appreciate the benefit of the various power saving techniques implemented in accordance with the invention, it is important to understand the parameters that contribute to power consumption. A simple model of power consumption, sufficient for purposes of this example, is given by the following equation:
$$P = C V^{2} f$$
where P is the power consumed, C is the capacitance, V is the supply voltage, and f is the clock frequency.
The capacitance C is invariant for any specific implementation, leaving voltage and clock frequency as the parameters to be optimized in accordance with the invention.
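To see how the two remaining parameters trade off, the short Python sketch below assumes the usual dynamic-power model, in which power is proportional to capacitance times voltage squared times clock frequency. The function names are hypothetical, the model ignores static (leakage) power, and applying it to the wattages of the usage example is an illustrative assumption, since the description does not state the clock frequencies involved.

```python
def relative_power(volts, freq, ref_volts, ref_freq):
    """Power relative to a reference point; the capacitance C cancels out."""
    return (volts / ref_volts) ** 2 * (freq / ref_freq)


def freq_ratio_for(power_ratio, volts, ref_volts):
    """Clock-frequency ratio that yields a given power ratio at a new voltage."""
    return power_ratio / (volts / ref_volts) ** 2


# Nominal 1.2 V to high-performance 1.5 V, doubling power (1.5 W to 3 W).
high = freq_ratio_for(3.0 / 1.5, volts=1.5, ref_volts=1.2)

# Nominal 1.2 V to low-voltage 1.0 V, cutting power to a third (1.5 W to 0.5 W).
low = freq_ratio_for(0.5 / 1.5, volts=1.0, ref_volts=1.2)

print(f"high-performance clock ratio: {high:.2f}")
print(f"low-power clock ratio: {low:.2f}")
```

Under these assumptions the voltage increase alone scales power by about 1.56, so a clock roughly 1.28 times faster would account for the doubling, and a clock at roughly 0.48 times the nominal rate would account for the idle figure. Because voltage enters squared, lowering it is the more effective lever whenever the workload tolerates the lower clock ceiling.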
The above-mentioned example illustrates how these parameters are optimized in accordance with the invention to fit the system's mode of operation. While the marketing manager is preparing his PowerPoint presentation, the graphics device of his computer system is in its normal operating state: the voltage and clock frequencies are set to their nominal levels, the device is managing the subsystem clock enables for the subsystems of the device (based on the activity of each subsystem), and all clocks required by the subsystems in use are enabled. Typically, the graphics device would have five or more device clocks: a core clock, a memory clock, video clocks for two heads, and a TV clock. In the example, the core clock, memory clock, and first head's video clock are certainly enabled during preparation of the PowerPoint presentation, although it is likely that the second head's video clock and the TV clock are disabled because their corresponding subsystems are not in use. Furthermore, those clocks that are enabled at the device level will be disabled at the subsystem level when not in use. For example, while the marketing manager is considering what his next slide should portray, thus leaving the graphics subsystem temporarily idle, the subsystem clock for the graphics subsystem will automatically be disabled to reduce power consumption.
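The two levels of clock control in this paragraph can be modeled as a simple rule: a subsystem receives a clock only if the device clock that feeds it is enabled and the subsystem itself is busy. The Python sketch below is an illustration only; the subsystem names and the mapping in `SUBSYSTEM_CLOCK` are hypothetical, chosen to match the five device clocks named above.

```python
# Hypothetical mapping from each subsystem to the device clock that feeds it.
SUBSYSTEM_CLOCK = {
    "graphics": "core",
    "memory_ctrl": "memory",
    "head0": "video0",
    "head1": "video1",
    "tv_encoder": "tv",
}


def gate_clocks(device_enabled, busy):
    """Return, per subsystem, whether its clock is actually asserted.

    device_enabled: set of device clocks enabled at the device level.
    busy: set of subsystems that currently have work to do.
    """
    return {
        sub: (clk in device_enabled) and (sub in busy)
        for sub, clk in SUBSYSTEM_CLOCK.items()
    }


# Presentation editing while the user pauses: the core clock is enabled at
# the device level, but the idle graphics subsystem is gated off.
state = gate_clocks(
    device_enabled={"core", "memory", "video0"},
    busy={"memory_ctrl", "head0"},
)
print(state)
```

In hardware the "busy" input would come from each subsystem's own activity detection rather than from software, which is what lets the gating happen automatically.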
When the marketing manager launches the video game, the device is placed into its high-performance mode: the voltage V applied across the device core is raised to its high-performance level and the core clock frequency is increased accordingly (a higher core voltage allows a higher core clock frequency). Upon exiting the video game, the core voltage and core clock frequency revert to their nominal levels.
When the manager stops work, leaving the system's screen idle, the device enters a low-power state by reducing the core voltage and lowering the core clock and memory clock frequencies accordingly. Finally, the system goes into a suspend mode, which disables all clocks (except the host clock, ensuring that the device can still respond to requests from the system using circuitry clocked by the host clock). Ultimately, when system software causes assertion of an appropriate signal to one or more host registers of the device (or assertion of a signal to a hardware suspend pin of the device), the host clock and any sources of static power are disabled so that the entire device is quiescent.
This simple example has included many of the dynamic power management techniques available through the inventive architecture. In the example, the device automatically manages subsystem-level clocking to match the activity of each subsystem, system software manages clock frequencies and voltage levels to match application requirements, and all significant power consumption mechanisms, including generation of the host clock, can be shut off when the system enters a deep suspend state.
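The sequence of modes in the laptop example can be summarized as a small state machine. The following Python sketch is one hypothetical encoding: the state names, event names, and clock labels are not taken from the description, and the settings table simply restates what the example says about each mode.

```python
from enum import Enum


class State(Enum):
    NOMINAL = "nominal"
    HIGH_PERFORMANCE = "high_performance"
    LOW_POWER = "low_power"
    SUSPEND = "suspend"
    DEEP_SUSPEND = "deep_suspend"


# Per-state settings: (core voltage level, set of clocks left running).
SETTINGS = {
    State.NOMINAL: ("nominal", {"host", "core", "memory", "video0"}),
    State.HIGH_PERFORMANCE: ("high", {"host", "core", "memory", "video0"}),
    State.LOW_POWER: ("low", {"host", "core", "memory", "video0"}),
    State.SUSPEND: ("low", {"host"}),  # only the host clock keeps running
    State.DEEP_SUSPEND: ("off", set()),  # host clock and static power disabled
}

# Hypothetical event names for the activities in the example.
TRANSITIONS = {
    (State.NOMINAL, "launch_game"): State.HIGH_PERFORMANCE,
    (State.HIGH_PERFORMANCE, "exit_game"): State.NOMINAL,
    (State.NOMINAL, "screen_idle"): State.LOW_POWER,
    (State.LOW_POWER, "suspend_timeout"): State.SUSPEND,
    (State.SUSPEND, "suspend_signal"): State.DEEP_SUSPEND,
}


def replay(events, state=State.NOMINAL):
    """Apply events in order, ignoring any that are invalid in the current state."""
    trace = [state]
    for event in events:
        state = TRANSITIONS.get((state, event), state)
        trace.append(state)
    return trace


trace = replay(
    ["launch_game", "exit_game", "screen_idle", "suspend_timeout", "suspend_signal"]
)
print(" -> ".join(s.value for s in trace))
```

The table covers only the path walked through in the example; wake-up transitions and the automatic subsystem-level gating that runs within each state are left out for brevity.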
It should be understood that while certain forms of the invention have been illustrated and described herein, the invention is not to be limited to the specific embodiments described and shown or the specific methods described.
While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.