(12) United States Patent    (10) Patent No.: US 6,182,176 B1
Ziegler et al.    (45) Date of Patent: Jan. 30, 2001
(54) QUEUE-BASED PREDICTIVE FLOW CONTROL MECHANISM
(75) Inventors: Michael L. Ziegler, Whitinsville, MA
(US); Robert J. Brooks, Roseville;
William R. Bryg, Saratoga, both of CA
(US); Craig R. Frink, Chelmsford;
Thomas R. Hotchkiss, Groton, both of
MA (US); Robert D. Odineal,
Roseville, CA (US); James B.
Williams, Lowell, MA (US); John L.
Wood, Rochester, NH (US)
(73) Assignee: Hewlett-Packard Company, Palo Alto, CA (US)
(*) Notice: Under 35 U.S.C. 154(b), the term of this patent shall be extended for 0 days.
(21) Appl. No.: 08/201,185
(22) Filed: Feb. 24, 1994
(51) Int. Cl. .............................. G06F 13/14
(52) U.S. Cl. .............................. 710/112; 710/113
(58) Field of Search .............................. 395/250, 400,
(56) References Cited
U.S. PATENT DOCUMENTS
5,204,954 * 4/1993 Hammer et al. .............. 395/425
5,257,374 * 10/1993 Hammer et al. .............. 395/650
5,265,235 * 11/1993 Sindhu et al. .............. 395/425
European Patent Application No. 0497054A3, Sindhu et al., Aug. 5, 1992.
European Patent Application No. 0317468A3, Hammer et al., May 24, 1989.
* cited by examiner
Primary Examiner—Richard L. Ellis
(57) ABSTRACT

A shared bus system having a bus and a set of client modules coupled to the bus. Each client module is capable of sending transactions on the bus to other client modules and receiving transactions on the bus from other client modules for processing. Each module has a queue for storing transactions received by the module for processing. A bus controller limits the types of transactions that can be sent on the bus to prevent any module's queue from overflowing.
1 Claim, 2 Drawing Sheets
QUEUE-BASED PREDICTIVE FLOW CONTROL MECHANISM
FIELD OF THE INVENTION
The present invention relates to computer systems that have a shared bus, and more particularly to controlling transactions issued on a shared bus.
BACKGROUND OF THE INVENTION
Computer systems commonly have a plurality of components, such as processors, memory, and input/output devices, and a shared bus for transferring information among two or more of the components. Typically, the components are coupled to the bus in the form of component modules, each of which may contain one or more processors, memory, and/or input/output devices. Information is transmitted on the bus among component modules during bus cycles, each bus cycle being a period of time during which a selected module is permitted to transfer, or drive, a limited quantity of information on the bus. Modules commonly send transactions on the bus to other modules to perform operations such as reading and writing data.
One class of computer system has two or more main processor modules for executing software running on the system (or one or more processor modules and one or more coherent input/output modules) and a shared main memory that is used by all of the processors and coherent input/output modules in the system. The main memory is generally coupled to the bus through a main memory controller. In many cases, one or more processors also have a cache memory, which stores recently used data values for quick access by the processor.
Ordinarily, a cache memory stores both the frequently used data and the addresses where these data items are stored in main memory. When the processor seeks data from an address in memory, it requests that data from the cache memory using the address associated with the data. The cache memory checks to see whether it holds data associated with that address. If so, the cache memory returns the requested data directly to the processor. If the cache memory does not contain the desired information (i.e., a "cache miss" occurs), the cache requests the data from main memory and stalls the processor while it is waiting for the data. Since cache memory is faster than main memory, this strategy results in improved system performance.
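The hit/miss sequence described above can be sketched as a minimal cache model (the class, its fields, and the dictionary-backed "main memory" are illustrative assumptions, not structures from the patent):

```python
# Minimal sketch of the cache lookup described above: the cache holds
# (address -> data) pairs; a hit returns the data directly, while a miss
# fetches from main memory (here a plain dict) and fills the cache line.
class Cache:
    def __init__(self, main_memory):
        self.lines = {}               # address -> cached data
        self.main_memory = main_memory

    def read(self, address):
        if address in self.lines:     # cache hit: return directly
            return self.lines[address]
        # cache miss: the processor would stall here while the slower
        # main memory is accessed and the line is filled
        data = self.main_memory[address]
        self.lines[address] = data
        return data
```

A first read of an address misses and fills the line; subsequent reads of the same address hit and are served from the cache.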
In the case of a shared memory multi-processor computer in which each processor has cache memory, the situation is somewhat more complex. In such a computer, the most current data may be stored in one or more cache memories, or in the main memory. Software executing on the processors must utilize the most current values for data associated with particular addresses. Thus, a "cache coherency scheme" must be implemented to assure that all copies of data for a particular address are the same.
In a typical write-back coherency scheme, when data is requested by a module, each module having cache memory performs a "coherency check" of its cache memory to determine whether it has data associated with the requested address and reports the results of its coherency check. Each module also generally reports the status of the data stored in its cache memory in relation to the data associated with the same address stored in main memory and other cache memories. For example, a module may report that its data is "private" (i.e., the data value is only usable by this module) or that the data is "shared" (i.e., the data may reside in more than one cache memory at the same time). A module may
also report whether its data is "clean" (i.e., the same as the data associated with the same address stored in main memory) or "dirty" (i.e., the data has been changed after it was obtained).
The results of the coherency checks performed by each module are analyzed by a selected processor and the most current data is provided to the module that requested the data. A "coherent transaction" is any transaction that requires a check of other caches to see whether data associated with a memory address is stored in the other caches, or to verify that data is current. Most reads and some writes to memory are coherent transactions. Those skilled in the art are familiar with many types of coherent transactions, such as a conventional read private, and non-coherent transactions, such as a conventional write-back.
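The analysis step can be illustrated with a hypothetical sketch. The state strings below encode the private/shared and clean/dirty attributes described above; the names, and the rule that a dirty private copy must supply the data, are assumptions about a generic write-back scheme rather than claims from the patent:

```python
# Hypothetical aggregation of coherency-check reports. Each module
# reports one of: "invalid" (no copy), "shared-clean", "private-clean",
# or "private-dirty" (modified since it was fetched from main memory).
def most_current_source(reports):
    """reports: module name -> reported state for the requested address.

    A module holding a "private-dirty" copy has the only up-to-date
    value, so it must supply the data; otherwise main memory is current.
    """
    for module, state in reports.items():
        if state == "private-dirty":
            return module
    return "main memory"
```

For example, if every cache reports its copy invalid or clean, main memory services the read; if one cache reports a dirty private copy, that cache is selected as the source.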
In many conventional coherency schemes, reporting the results of coherency checks requires a significant amount of communication between the modules and the coherency processor that makes the final decision on how a memory request is to be satisfied. Each module having a cache memory must be informed of a required coherency check and must report the result of its coherency check to the coherency processor. Even if the number of communications is reduced, conventional means of processing and reporting the results of coherency checks are often slow. Coherency checks must be carried out in a manner that does not substantially reduce the effective bandwidth of the shared bus used by the modules for the inter-module communications.
To reduce the impact of memory latency delays, many conventional buses are "split transaction" buses; that is, a transaction does not need to be processed immediately after it is placed on the bus. For example, after a memory read transaction is issued on the bus, the module that issued the read relinquishes the bus, allowing other modules to use the bus for other transactions. When the requested data is available, the responding module for the read obtains control of the bus, and then transmits the data. It is often possible for modules in a shared bus system to initiate transactions faster than they can be serviced by the responding module, or faster than coherency checks can be performed by the other modules. For example, input/output devices often operate at a much slower speed than microprocessors and, thus, modules connecting input/output devices to the bus may be slow to respond. Similarly, main memory accesses are relatively slow, and it is possible for the processor modules to request data faster than it can be read from the main memory. Cache coherency checks may also be slow because the coherency checking processors in a module may be busy with other operations. Thus, it is often necessary to either slow down initiation of new transactions by modules or to handle the overflow of transactions when too many transactions are initiated in too short a time for them to be adequately processed or for coherency checks to be performed.
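The queue-based alternative summarized in the abstract, in which a bus controller prevents any module's inbound queue from overflowing, might be sketched as follows (the class, its slot-counting scheme, and all names are illustrative assumptions, not the patent's claimed mechanism):

```python
# Illustrative sketch of queue-based predictive flow control: the bus
# controller tracks the free space in every module's transaction queue
# and only lets a transaction onto the bus when each receiver can hold it.
class BusController:
    def __init__(self, queue_depths):
        # queue_depths: module name -> total queue slots in that module
        self.free = dict(queue_depths)

    def can_issue(self, receivers):
        """True if every receiving module has a free queue slot."""
        return all(self.free[m] > 0 for m in receivers)

    def issue(self, receivers):
        if not self.can_issue(receivers):
            return False              # hold the transaction at the sender
        for m in receivers:
            self.free[m] -= 1         # reserve a slot in each receiver
        return True

    def retire(self, module):
        self.free[module] += 1        # module finished a queued transaction
```

Because a slot is reserved before the transaction is driven on the bus, a receiver is never offered more transactions than it has queue space to hold, so no busy-abort and retry cycle is needed.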
A typical prior art method for dealing with transaction overflow uses a "busy-abort" mechanism to handle the situation in which too many transactions of some type are initiated too quickly. When the responding module for the transaction sees a new transaction request that it cannot respond to immediately, the responding module sends back a "busy-abort" signal indicating that the transaction cannot be serviced at that time (e.g., an input/output module is occupied or a processor module having a cache memory cannot perform a coherency check fast enough). The requesting module then aborts its request and tries again at a later time. This approach increases design complexity because the requesting module must retain the transaction
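The busy-abort behavior described above might look like this hypothetical sketch, in which the polling callable stands in for the bus signaling and the bounded retry count is an assumption:

```python
# Hypothetical sketch of the busy-abort scheme: the requesting module
# must retain the transaction and reissue it until the responder can
# accept it, or give up after a bounded number of attempts.
def send_with_retry(responder_ready, transaction, max_retries=5):
    """responder_ready: callable returning True once the responder can accept."""
    for attempt in range(max_retries):
        if responder_ready():
            return ("accepted", attempt)
        # responder signaled busy-abort: keep the transaction, try later
    return ("gave up", max_retries)
```

The retained transaction and the retry loop are exactly the design-complexity cost the passage describes: the requester must buffer the request and repeatedly re-arbitrate for the bus.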