CA2089437C - Adaptive cache miss prediction mechanism - Google Patents

Adaptive cache miss prediction mechanism

Info

Publication number
CA2089437C
CA2089437C
Authority
CA
Canada
Prior art keywords
operand
cache
address
addresses
arithmetic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CA002089437A
Other languages
French (fr)
Other versions
CA2089437A1 (en)
Inventor
Charles P. Ryan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bull HN Information Systems Inc
Original Assignee
Bull HN Information Systems Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bull HN Information Systems Inc filed Critical Bull HN Information Systems Inc
Publication of CA2089437A1 publication Critical patent/CA2089437A1/en
Application granted granted Critical
Publication of CA2089437C publication Critical patent/CA2089437C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0862Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/60Details of cache memory
    • G06F2212/6026Prefetching based on access pattern detection, e.g. stride based prefetch

Abstract

In a data processing system which employs a cache memory feature, a method and exemplary special purpose apparatus for practicing the method are disclosed to lower the cache miss ratio for called operands. Recent cache misses are stored in a first in, first out miss stack, and the stored addresses are searched for displacement patterns thereamong. Any detected pattern is then employed to predict a succeeding cache miss by prefetching from main memory the signal identified by the predictive address. The apparatus for performing this task is preferably hard wired for speed purposes and includes subtraction circuits for evaluating variously displaced addresses in the miss stack and comparator circuits for determining if the outputs from at least two subtraction circuits are the same indicating a pattern yielding information which can be combined with an address in the stack to develop a predictive address. The cache miss prediction mechanism is adaptively selectively enabled by an adaptive circuit that develops a short term operand cache hit ratio history and responds to ratio improving and ratio deteriorating trends by accordingly enabling and disabling the cache miss prediction mechanism.
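The displacement-pattern search summarized above can be sketched in software. The following is a minimal illustrative model and not the patented hard-wired circuit; the particular stack positions chosen for each candidate pattern are assumptions for the example, and here all three displacements must agree, corresponding to the fully enabled comparator chain.

```python
def predict_next_miss(miss_stack):
    """Search the recent-miss stack for an equal-displacement pattern.

    miss_stack[0] is the most recent miss address. For each candidate
    tuple of stack positions, three displacements are formed (mirroring
    the subtraction circuits); if they all agree, the common displacement
    is added to the newest address of the tuple (the adder circuit's
    role) to form the predictive address.
    """
    # Candidate position tuples, analogous to the switch states that
    # select A B C D / B D F H / C F I L / D H L P from the stack.
    patterns = [(0, 1, 2, 3), (1, 3, 5, 7), (2, 5, 8, 11), (3, 7, 11, 15)]
    for p in patterns:
        a, b, c, d = (miss_stack[i] for i in p)
        d1, d2, d3 = a - b, b - c, c - d        # subtraction circuits
        if d1 == d2 == d3:                      # comparators all agree
            return a + d1                       # adder: predictive address
    return None                                 # no pattern match found

# A stride-2 miss pattern among the four most recent addresses:
stack = [106, 104, 102, 100] + [0] * 12
print(predict_next_miss(stack))   # 108
```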

Description

ADAPTIVE CACHE MISS PREDICTION MECHANISM

Field of the Invention

This invention relates to the art of data processing systems which include a cache memory feature and, more particularly, to a method and apparatus for adaptively selectively predicting memory cache misses for operand calls and using this information to transfer data from a main memory to cache memory to thereby lower the cache miss ratio.

Background of the Invention

The technique of employing a high speed cache memory intermediate a processor and a main memory to hold a dynamic subset of the information in the main memory in order to speed up system operation is well known in the art. Briefly, the cache holds a dynamically variable collection of main memory information fragments selected and updated such that there is a good chance that the fragments will include instructions and/or data required by the processor in upcoming operations. If there is a cache "hit" on a given operation, the information is available to the processor much faster than if main memory had to be accessed to obtain the same information. Consequently, in many high performance data processing systems, the "cache miss ratio" is one of the major limitations on the system execution rate, and it should therefore be kept as low as possible.

The key to obtaining a low cache miss ratio is obviously one of carefully selecting the information to be placed in the cache from main memory at any given instant. There are several techniques for selecting blocks of instructions for transitory residence in the cache, and the more or less linear use of instructions in programming renders these techniques statistically effective. However, the selection of operand information to be resident in cache memory at a given instant has been much less effective and has been generally limited to transferring one or more contiguous blocks including a cache miss address. This approach only slightly lowers the cache miss ratio and is also an ineffective use of cache capacity.

Thus, those skilled in the art will understand that it would be highly desirable to provide means for selecting operand information for transitory storage in a cache memory in such a manner as to significantly lower the cache miss ratio. That end was accomplished in accordance with the invention disclosed and claimed in United States Patent Application Serial No. 07/364,943, filed June 12, 1989, for CACHE MISS PREDICTION METHOD AND APPARATUS by Charles P. Ryan, now United States Patent No. 5,093,777, by special purpose apparatus in the cache memory which stores recent cache misses and searches for operand patterns therein. Any detected operand pattern is then employed to anticipate a succeeding cache miss by prefetching from main memory the block containing the predicted cache miss.

It was determined, however, that under certain operating conditions, the full time use of the foregoing procedure can actually raise the long term miss ratio (i.e., lower the long term hit ratio). In a typical cache based processor that executes a single process during a given period, the cache hit ratio will stabilize after some time interval following the institution of the process. If a change to another process is made, new instructions and data must be loaded into the cache such that the cache hit ratio instantaneously drops dramatically and then increases as the new process is "experienced". If the cache miss prediction mechanism is in operation, the initial rate of increase in the cache hit ratio is much faster. However, the hit ratio never reaches the level it would reach in the long term if the cache miss prediction mechanism was not in use. This result is caused by the fact that the cache miss prediction mechanism continues to find and load from main memory the next possible miss which, however, is not used, thus forcing the cache to replace blocks that are more important.

The invention disclosed and claimed in United States Patent Application Serial No. 07/ filed February 26, 1992, for SELECTIVELY ENABLED CACHE MISS PREDICTION METHOD AND APPARATUS by Charles P. Ryan, now United States Patent No. , overcomes the limiting effect of using the cache miss prediction mechanism continuously after a process has been changed by selectively enabling the cache miss prediction mechanism only during cache "in-rush" following a process change to increase the recovery rate; thereafter, it is disabled, based upon timing-out a timer or reaching a hit ratio threshold, in order that normal procedures allow the hit ratio to stabilize at a higher percentage than if the cache miss prediction mechanism were operated continuously.

There are operating conditions, however, under which it would be advantageous to have the cache miss prediction mechanism in operation even after cache in-rush following a process change. An example of such an operating condition is when very large sets (even in excess of the cache size) of regularly addressed operand data (matrix/vector/strings) are used by a procedure.

Objects of the Invention

It is therefore a broad object of this invention to provide an improved cache memory in a data processing system.

It is another object of this invention to provide a cache memory particularly characterized by exhibiting a lower cache miss ratio in operation.

It is a more specific object of this invention to provide a cache memory for adaptively selectively enabling circuitry for effectively predicting operand cache misses not only during an "in-rush" period following a process change but also in other operating conditions in which it would be advantageous to enable the cache miss prediction mechanism.


Summary of the Invention

Briefly, these and other objects of the invention are achieved by special purpose apparatus which stores recent cache misses and searches for address patterns therein. Any detected pattern is then employed to anticipate a succeeding cache miss by prefetching from main memory the block containing the predicted cache miss. The cache miss prediction mechanism is adaptively selectively enabled by an adaptive circuit that develops a short term operand cache hit ratio history and responds to ratio improving and ratio deteriorating trends by accordingly enabling and disabling the cache miss prediction mechanism.
In accordance with the present invention, there is provided a method for predicting operand addresses of operand requests in a data processing system in which a cache thereof is repeatedly interrogated to determine whether an address corresponding to a request operand address is stored therein, which stored operand address corresponds to an operand also stored in said cache; wherein said data processing system includes a main memory and a stack for holding addresses; said method being carried out by said data processing system in operating said cache, characterized by the computer-implemented steps of:

A) upon the occurrence of a cache miss when an operand is requested from said cache, entering the request operand address into said stack;

B) examining the request operand addresses present in said stack to determine whether one of a plurality of predetermined operand address patterns is represented by said request operand addresses; and

C) (1) if no one of said patterns is determined to be represented in step (B), returning to step (A), but (2) (i) if one of said patterns is determined to be represented in step (B), generating an operand address of a predicted operand request, (ii) using said generated operand address to obtain an operand from said main memory and write such operand into said cache, and (iii) returning to step (A);

wherein, concurrently with the performing of steps A-C, the following steps are performed:

D) determining successive short term operand hit ratios of said cache memory;

E) comparing pairs of successive ratios determined in step D; and

F) generating a control signal when the ratios of said compared pairs represent a downward trend in said successively determined short term operand hit ratios;

wherein said procedure of said steps A through C is halted upon the occurrence of said control signal.
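Steps D through F, the concurrent hit-ratio monitor, can be sketched as follows. The trend test used here (a single deteriorating pair raises the control signal) is an assumption for illustration; the claim does not fix how many pairs constitute a downward trend.

```python
def downward_trend(hit_ratios):
    """Compare each pair of successive short term operand hit ratios
    (steps D and E) and raise the control signal, returned as True,
    when a compared pair shows a downward trend (step F)."""
    for prev, curr in zip(hit_ratios, hit_ratios[1:]):
        if curr < prev:          # ratio deteriorating: halt steps A-C
            return True
    return False

print(downward_trend([0.80, 0.85, 0.91]))  # False (improving trend)
print(downward_trend([0.80, 0.85, 0.82]))  # True (halt prediction)
```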
In accordance with another aspect of the invention, there is provided a method of predicting operand addresses of operand requests in a data processing system in which a cache thereof is repeatedly interrogated to determine whether an operand address corresponding to a requested operand is stored therein, said cache storing operands and their corresponding operand addresses; wherein said data processing system includes a main memory; said method being carried out by said data processing system in operating said cache; wherein said method includes the controllably operable address prefetch prediction function of:

(a) during the operation of said cache, from time to time determining whether a plurality of earlier-generated request operand addresses correspond to one of a plurality of predetermined operand address patterns, and

(b) (i) if no one of said patterns is determined to correspond to said plurality of request operand addresses, returning to step (a), but (ii) if one of said patterns is determined to correspond to said plurality of request operand addresses, generating an operand address of a predicted operand request employing said one pattern, using said predicted operand address to obtain the address of an operand from main memory, entering said obtained operand into the cache, and returning to step (a);

said method being characterized by the computer-implemented steps of: concurrently with the performing of said address prefetch prediction function, the following steps are performed:

A) determining successive short term operand hit ratios of said cache;

B) comparing pairs of successive ratios determined in step A; and

C) generating a control signal when the ratios of said compared pairs represent a downward trend in said successively determined short term operand hit ratios; wherein said address prefetch prediction function is halted upon the occurrence of said control signal.
In accordance with another aspect of the invention, there is provided apparatus for controllably generating a predicted operand address of an operand request in a data processing system in which a cache thereof is repeatedly interrogated to determine whether an address corresponding to a request operand address is stored therein, which stored operand address corresponds to an operand also stored in said cache; wherein said data processing system includes a store for holding a plurality of said addresses in respective cells thereof; said apparatus being characterized by: a controllable operand address prediction mechanism comprising: a plurality of switches, each of said switches having a plurality of input terminals, an output terminal and a control terminal, whereby a control signal applied to said control terminal causes said switch to couple one of said input terminals to said output terminal; said control signal being delivered to said control terminals to cause the input terminals of said switches to be successively coupled to said output terminals; a circuit coupling each of said input terminals of said switches to one of said cells; a plurality of first arithmetic circuits, each of said first arithmetic circuits having a pair of input terminals and an output terminal, the input terminals of each of said first arithmetic circuits being coupled to the respective output terminals of two of said switches, each of said first arithmetic circuits performing an arithmetic operation on the two addresses received by its input terminals from the two cells of said store coupled to said input terminals by said two switches and delivering a signal at its output terminal which represents the result of said arithmetic operation; a plurality of comparators, each of said comparators having a pair of input terminals and an output terminal, the input terminals of each of said comparators being coupled to the respective output terminals of two of said first arithmetic circuits, each of said comparators comparing the two arithmetic result signals received thereby and delivering a signal at its output terminal denoting whether said result signals are alike; a second arithmetic circuit having a pair of input terminals and an output terminal, a first input terminal of said second arithmetic circuit being coupled to the output terminal of one of said switches and the second input terminal of said second arithmetic circuit being coupled to the output terminal of one of said first arithmetic circuits, said second arithmetic circuit performing an arithmetic operation on the address received by said first input terminal and the arithmetic result represented by the signal received by said second input terminal and delivering an output signal at its output terminal which represents the result of said arithmetic operation performed by said second arithmetic circuit; whereby when the signals delivered at the output terminals of all of said comparators denote that the compared arithmetic results are alike, the arithmetic result represented by the concurrent output signal of said second arithmetic circuit represents said predicted operand request address; and a control circuit coupled to said prediction mechanism and comprising: means for generating numbers representing successive short term operand hit ratios of said cache memory, means for comparing pairs of successively generated ones of said numbers, and means for generating a control signal when the numbers of said compared pairs represent a downward trend in said short term operand hit ratios; wherein the operation of said prediction mechanism is halted upon the occurrence of said control signal.
Description of the Drawing

The subject matter of the invention is particularly pointed out and distinctly claimed in the concluding portion of the specification. The invention, however, both as to organization and method of operation, may best be understood by reference to the following description taken in conjunction with the subjoined claims and the accompanying drawing of which:

FIG. 1 is a generalized block diagram of a typical data processing system employing a cache memory and therefore constituting an exemplary environment for practicing the invention;

FIG. 2 is a flow diagram illustrating, in simplified form, the sequence of operations by which the invention is practiced;

FIG. 3 is a logic diagram of a simple exemplary embodiment of the cache miss prediction mechanism;

FIG. 4 is a logic diagram of a more powerful exemplary embodiment of the cache miss prediction mechanism; and

FIG. 5 is a logic diagram of an exemplary embodiment of the adaptive circuit for selectively enabling the cache miss prediction mechanism.

Detailed Description of the Invention

Referring now to FIG. 1, there is shown a high level block diagram for a data processing system incorporating a cache memory feature. Those skilled in the art will appreciate that this block diagram is only exemplary and that many variations on it are employed in practice. Its function is merely to provide a context for discussing the subject invention. Thus, the illustrative data processing system includes a main memory unit 13 which stores the data signal groups (i.e., information words, including instructions and operands) required by a central processing unit 14 to execute the desired procedures. Signal groups with an enhanced probability for requirement by the central processing unit 14 in the near term are transferred from the main memory unit 13 (or a user unit 15) through a system interface unit 11 to a cache memory unit 12. (Those skilled in the art will understand that, in some data processing system architectures, the signal groups are transferred over a system bus, thereby requiring an interface unit for each component interacting with the system bus.) The signal groups are stored in the cache memory unit 12 until requested by the central processing unit 14. To retrieve the correct signal group, address translation apparatus 16 is typically incorporated to convert a virtual address (used by the central processing unit 14 to identify the signal group to be fetched) to the real address used for that signal group by the remainder of the data processing system to identify the signal group.

The information stored transiently in the cache memory unit 12 may include both instructions and operands stored in separate sections or stored homogeneously. Preferably, in the practice of the present invention, instructions and operands are stored in separate (at least in the sense that they do not have commingled addresses) memory sections in the cache memory unit 12 inasmuch as it is intended to invoke the operation of the present invention as to operand information only.
The cache miss prediction mechanism which is an aspect of the invention is based on recognizing and taking advantage of sensed patterns in cache misses resulting from operand calls. In an extremely elementary example, consider a sensed pattern in which three consecutive misses ABC are, in fact, successive operand addresses with D being the next successive address. This might take place, merely by way of example, in a data manipulation process calling for successively processing successive rows in a single column of data. If this pattern is sensed, the likelihood that signal group D will also be accessed, and soon, is enhanced such that its prefetching into the cache memory unit 12 is in order.
The fundamental principles of the cache miss prediction mechanism are set forth in the operational flow chart of FIG. 2. When a processor (or other system unit) asks for an operand, a determination is made as to whether or not the operand is currently resident in the cache. If so, there is a cache hit (i.e., no cache miss), the operand is sent to the requesting system unit and the next operand request is awaited. However, if there is a cache miss, the request is, in effect, redirected to the (much slower) main memory.

Those skilled in the art will understand that the description to this point of FIG. 2 describes cache memory operation generally. In the context of the present invention, however, the address of the cache miss is meaningful. It is therefore placed at the top of a miss stack to be described in further detail below. The miss stack (which contains a history of the addresses of recent cache misses in consecutive order) is then examined to determine if a first of several patterns is present. This first pattern might be, merely by way of example, contiguous addresses for the recent cache misses. If the first pattern is not sensed, additional patterns are tried. Merely by way of example again, a second pattern might be recent cache misses calling for successive addresses situated two locations apart. So long as there is no pattern match, the process continues through the pattern repertoire. If there is no match when all patterns in the repertoire have been examined, the next cache miss is awaited to institute the process anew.

However, if a pattern in the repertoire is detected, a predictive address is calculated from the information in the miss stack and from the sensed pattern. This predictive address is then employed to prefetch from main memory into cache the signal group identified by the predictive address. In the elementary example previously given, if a pattern is sensed in which consecutive operand cache miss addresses ABC are consecutive and contiguous, the value of the predictive address, D, will be C + 1.

In order to optimize the statistical integrity of the miss stack, the predictive address itself may be placed at the top of the stack since it would (highly probably) itself have been the subject of a cache miss if it had not been prefetched in accordance with the invention.
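The FIG. 2 flow just described, namely push the missed address, scan the pattern repertoire, prefetch on a match, and push the predictive address back onto the stack, can be sketched as follows. The two-entry repertoire is a hypothetical stand-in for the full set of patterns; the patent leaves the repertoire open.

```python
from collections import deque

STACK_DEPTH = 16
miss_stack = deque(maxlen=STACK_DEPTH)   # newest miss at index 0

# Each repertoire entry names the stack positions whose displacements
# are tested for equality (contiguous misses, then stride-2 spacing).
REPERTOIRE = [(0, 1, 2), (0, 2, 4)]

def on_cache_miss(address, prefetch):
    miss_stack.appendleft(address)           # top of the miss stack
    for positions in REPERTOIRE:
        if len(miss_stack) <= positions[-1]:
            continue                          # not enough history yet
        a, b, c = (miss_stack[i] for i in positions)
        if a - b == b - c:                    # displacement pattern found
            predictive = a + (a - b)
            prefetch(predictive)              # fetch from main memory
            # Keep the stack statistically honest: the predicted address
            # would itself have missed had it not been prefetched.
            miss_stack.appendleft(predictive)
            return predictive
    return None                               # await the next miss

fetched = []
for addr in (100, 101, 102):                  # contiguous misses A, B, C
    on_cache_miss(addr, fetched.append)
print(fetched)   # [103], i.e. D = C + 1
```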
Since speed of operation is essential, the cache miss prediction mechanism may advantageously be embodied in a "hard wired" form (e.g., in a gate array) although firmware control is contemplated. Consider first a relatively simple hardwired implementation shown in FIG. 3. A miss stack 20 holds the sixteen most recent cache miss addresses, the oldest being identified as address P with entry onto the stack being made at the top. Four four-input electronic switches 21, 22, 23, 24 are driven in concert by a shift pattern signal via line 25 such that: in a first state, addresses A, B, C, D appear at the respective outputs of the switches; in a second state, addresses B, D, F, H appear at the outputs; in a third state, addresses C, F, I, L appear at the outputs; and in a fourth state, addresses D, H, L, P appear at the outputs. Subtraction circuits 26, 27, 28 are connected to receive as inputs the respective outputs of the electronic switches 21, 22, 23, 24 such that: the output from the subtraction circuit 26 is the output of the switch 21 minus the output of the switch 22; the output from the subtraction circuit 27 is the output of the switch 22 minus the output of the switch 23; and the output from the subtraction circuit 28 is the output of the switch 23 minus the output of the switch 24.

The output from the subtraction circuit 26 is applied to one input of an adder circuit 31 which has its other input driven by the output of the electronic switch 21. In addition, the output from the subtraction circuit 26 is also applied to one input of a comparator circuit 29. The output from the subtraction circuit 27 is applied to the other input of the comparator circuit 29 and also to one input of another comparator circuit 30 which has its other input driven by the output of the subtraction circuit 28. The outputs from the comparator circuits 29, 30 are applied, respectively, to the two inputs of an AND-gate 32 which selectively issues a prefetch enable signal.
Consider now the operation of the circuit shown in FIG. 3. As previously noted, miss stack 20 holds the last sixteen cache miss addresses, address A being the most recent. When the request for the signal group identified by address A results in a cache miss, circuit operation is instituted to search for a pattern among the addresses resident in the miss stack. The electronic switches 21, 22, 23, 24 are at their first state such that address A is passed through to the output of switch 21, address B appears at the output of switch 22, address C appears at the output of switch 23 and address D appears at the output of switch 24. If the differences between A and B, B and C, and C and D are not all equal, not all the outputs from the subtraction circuits 26, 27, 28 will be equal such that one or both the comparator circuits 29, 30 will issue a no compare, and AND-gate 32 will not be enabled, thus indicating a "no pattern match found" condition.

The switches are then advanced to their second state in which addresses B, D, F, H appear at their respective outputs. Assume now that (B - D) = (D - F) = (F - H); i.e., a sequential pattern has been sensed in the address displacements. Consequently, both the comparators 29, 30 will issue compare signals to fully enable the AND-gate 32 and produce a prefetch enable signal. Simultaneously, the output from the adder circuit 31 will be the predictive address (B + (B - D)). It will be seen that this predictive address extends the sensed pattern and thus increases the probability that the prefetched signal group will be requested by the processor, thereby lowering the cache miss ratio.
11 if necessary. If no pattern was sensed, the circuit would 12 await the next cache miss which will place a new entry at 13 the top of the miss stack and push address P out the bottom 14 of the stack bef ore the pattern match search is again instituted.
16 Consider now the somewhat more complex and power~ul 17 pmho~l;r ~ of the oache miss prediction r--hi~n~
18 illustrated in FIG. 4. Electronic switches 41, 42, 43, 44 19 receive at their respective inputs recent oache miss addresses as stored in the miss stack 40 in the PYprrl~ry 21 arrangement shown. It will be noted that each of the 22 electronic switches 41, 42, 43, 44 has eight inputs which 23 can be sequentially selectively transferred to the single 24 outputs under the influence of the shi~t pattern signal. It will also be noted that the miss stack 40 stores, in 26 addition to the sixteen latest cache miss addresse6 A - P, 27 three future entries WXY. Subtraction circu1ts 45, 46, 47 2~89~37 ` ~ , } ~
,.
per~orm the 5ame office n5 the coLLe~onding 5ubtraction Z circuits 26, 27, 28 of the FIG. 3 ~nhofl;r-~t previously 3 de5cribed. Similarly, adder circuit 48 corre5ponds to the 4 adder circuit 31 previously described.
Comparator circuit 49 receives the respective outputs of the subtraction circuits 45, 46, and its output is applied to one input of an AND-gate 38 which selectively issues the prefetch enable signal. Comparator circuit 50 receives the respective outputs of the subtraction circuits 46, 47, but, unlike its counterpart comparator 30 of the FIG. 3 embodiment, its output is applied to one input of an OR-gate 39 which has its other input driven by a reduce lookahead signal. The output of OR-gate 39 is coupled to the other input of AND-gate 38. With this arrangement, activation of the reduce lookahead signal enables OR-gate 39 and partially enables AND-gate 38. The effect of applying the reduce lookahead signal is to compare only the outputs of the subtraction circuits 45, 46 in the comparator circuit 49 such that a compare fully enables the AND-gate 38 to issue the prefetch enable signal. This mode of operation may be useful, for example, when the patterns seem to be changing every few cache misses, and it favors the most recent examples.

With the arrangement of FIG. 4, it is advantageous to try all the patterns within pattern groups (as represented by the "YES" response to the ">1 PATTERN GROUP?" query in the flow diagram of FIG. 2) even if there is a pattern match detected intermediate the process. This follows from the fact that more than one of the future entries WXY to the miss stack may be developed during a single pass through the pattern repertoire or even a subset of the pattern repertoire. With the specific implementation of FIG. 4 (which is only exemplary of many possible useful configurations), the following results are obtainable:
SWITCH STATE    PATTERN    GOAL
     0           ABCD        W
     1           WABC        X
     2           XWAB        Y
     3           BDFH        W
     4           WBDF        X
     5           CFIL        W
     6           WCFI        X
     7           DHLP        W
The goal states are searched in groups by switch state; i.e.: Group 1 includes switch states 0, 1, 2 and could result in filling future entries WXY; Group 2 includes states 3, 4 and could result in filling entries WX; Group 3 includes states 5, 6 and could also result in filling entries WX; and Group 4 includes state 7 and could result in filling entry W. When a goal state is reached that has been predicted, the search is halted for the current cache miss; i.e., it would not be desirable to replace an already developed predictive address W with a different predictive address W.
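The grouped search and early halt can be sketched in Python as follows. The group membership mirrors the text; the matches and predict callables are hypothetical stand-ins for the pattern-matching hardware and the address-generation arithmetic:

```python
# Switch-state groups and the future entries they can fill, per the text.
GROUPS = {
    1: (0, 1, 2),   # can fill W, X, Y
    2: (3, 4),      # can fill W, X
    3: (5, 6),      # can fill W, X
    4: (7,),        # can fill W
}

def search_patterns(matches, predict):
    """matches(state) -> bool; predict(state) -> dict of future entries.
    Tries every pattern within a group even after an intermediate match,
    then halts for the current cache miss once a group has produced a
    prediction, so an already developed entry W is never replaced."""
    predicted = {}
    for states in GROUPS.values():
        for state in states:
            if matches(state):
                for name, addr in predict(state).items():
                    predicted.setdefault(name, addr)  # keep first W, X, Y
        if predicted:
            return predicted   # goal state reached: halt the search
    return predicted
```

A match on switch state 1, for example, would fill all three future entries in a single pass through Group 1 and stop before Groups 2 through 4 are tried.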
Those skilled in the art will understand that the logic circuitry of FIGs. 3 and 4 is somewhat simplified since multiple binary digit information is presented as if it were single binary digit information. Thus, in practice, arrays of electronic switches, gates, etc. will actually be employed to handle the added dimension as may be necessary and entirely conventionally. Further, timing signals and logic for incorporating the cache miss prediction mechanism into a given data processing system environment will be those appropriate for that environment and will be the subject of straightforward logic design.
The foregoing discussion relates to the invention disclosed and claimed in the above-referenced United States Patent No. 5,093,777 which forms an exemplary environment for the present invention. Attention is now directed to FIG. 5 which is a logic diagram of exemplary control apparatus for adaptively selectively enabling the cache prediction mechanism in accordance with the present invention.
An OR-gate 71 is driven by "system reset" and "process switch" signals, and its output drives one input to another OR-gate 72. The output of the OR-gate 72 serves to reset the counters 61, 63 and the registers 64, 64', 65, 65', 67 and 68 in the circuit. The counter 61 drives a 100 count decoder 62 whose output is coupled to one input of an AND-gate 73. The other input to the AND-gate 73 is driven by an "operand data request" signal which is also applied to an increment input of the counter 61. The output of the AND-gate 73 drives load inputs to the counter 63 and each of the registers 64, 64', 65 and 65' and to one input of AND-gate 75.
An "operand cache hit" signal drives an increment input to presettable counter 63 whose output is applied to the input of register 64. In addition, "0" logic signals are present at the data inputs to counter 63. The instantaneous count held in counter 63 is applied to the data inputs of register A 64. A one-bit register 64', which can be preset to "1", serves as a validity indicator for register 64. The instantaneous count held in register 64 is applied to the data inputs of register B 65, and a one-bit register 65' serves as a validity indicator for register 65. The state of one-bit register 65' is applied to the second input of AND-gate 75 which drives load inputs to one-bit registers 67 and 68.
The instantaneous counts in the registers 64 and 65 are applied to the inputs of a comparator 66 which issues a "1" output only when the count in register A 64 is equal to or greater than the count in register B 65. This output signal drives an inverter 74 which drives one-bit register C 67. The state of one-bit register 67 is applied to the input of one-bit register D 68 and also to one input of AND-gate 76. The state of one-bit register 68 is also applied to an input of AND-gate 76. Preferably, but not necessarily, AND-gate 76 has a third input driven by a "mode" signal which may be supplied by the process in execution.
Optional circuitry which may be employed includes hit ratio register 69 which may be loaded with a predetermined "hit ratio threshold" supplied by a new process. The value resident in the hit ratio register 69 is applied to one input of a comparator 70 which has its other input driven by the instantaneous count in register A 64 such that the comparator 70 issues a logic "1" only when A is greater than or equal to the hit ratio threshold. The output from comparator 70 is applied to one input of an AND-gate 78 which receives at its other input the logical inversion of the "mode" signal. The outputs from the AND-gates 76 and 78 are applied to the two inputs of an OR-gate 77 which issues a "disable miss prediction" signal when it is enabled.
Consider now the operation of the apparatus shown in FIG. 5. Either a "system reset" or a "process switch" will reset the counters 61, 63 and the registers 64, 64', 65, 65', 67 and 68 to initialize the adaptive circuitry. The counter 61 and the 100 decoder 62 cooperate to enable the AND-gate 73 every 100 times the "operand data request" signal increments the counter 61. This causes the counter 63 to load a count of zero as it passes along its just previous count to register A 64. Register 64 passes its previous count to register B 65. One-bit register A' 64' is loaded with a "1", and its previous state is passed to register B' 65'. That previous state will only be a "0" if a system reset or process switch took place just prior to the immediately previous count to 100 by the counter 61.
The counter 63 is now incremented each time an operand cache hit takes place such that it accumulates, over the next 100 operand data requests, the short term cache hit ratio. In the meantime, comparator 66 compares the hit ratio now held in register A 64 to the ratio now held in register B 65 to determine if A is greater than or equal to B; i.e., if the most recent ratio is as good as or better than the second most recent ratio. Hence, an upward (or flat) or downward trend is sensed.
If an upward trend is sensed, the output from the comparator 66 is a "1" which is inverted by the inverter 74 and applied as a logic "0" to the input of one-bit register C 67. This state will be loaded into the register 67 when the AND-gate 73 is next enabled if the state of the one-bit register B' 65' is "1"; i.e., if the AND-gate 75 is fully enabled. This would be the case two cycles following the occurrence of a system reset or process switch since the "1" applied to the one-bit register A' 64' would have flushed out the "0"s placed therein and in the one-bit register B' 65' by the OR-gate 72. Thus, it will be seen that the purpose of the registers 64' and 65' is to lock out invalid hit ratio information which would otherwise be generated immediately following a system reset or process switch.
When the state of the comparator 66, inverted through the inverter 74, is loaded into the one-bit register 67, the previous state of the register 67 is loaded into the one-bit register 68. As a result, if the hit ratio trend is down for two consecutive cycles such that the one-bit registers 67, 68 both contain "1"s, and assuming the "mode" signal is "1", then the AND-gate 76 is enabled to enable the OR-gate 77 which therefore issues the "disable miss prediction" signal. This signal remains until a subsequent comparison of the contents of the registers 64, 65 again indicates an upward (or flat) trend in the short term hit ratio.
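The behavior of the FIG. 5 trend detector can be modeled as follows. This Python sketch assumes the 100-request sampling window described above; the register and counter names follow the figure, but the class interface itself is an illustration, not the patented logic:

```python
class TrendDetector:
    """Behavioral model of the counters/registers 61-68 of FIG. 5."""

    def __init__(self):
        self.counter63 = 0      # counts operand cache hits in the window
        self.reg_a = 0          # register A 64: most recent hit ratio
        self.reg_b = 0          # register B 65: second most recent ratio
        self.valid_a = False    # one-bit register A' 64'
        self.valid_b = False    # one-bit register B' 65'
        self.reg_c = False      # one-bit register C 67: down-trend flag
        self.reg_d = False      # one-bit register D 68: previous flag
        self.requests = 0       # counter 61

    def operand_request(self, hit: bool) -> bool:
        """Feed one operand data request; return the current state of the
        "disable miss prediction" signal (the "mode" signal assumed "1")."""
        self.requests += 1
        if hit:
            self.counter63 += 1
        if self.requests % 100 == 0:        # decoder 62 enables AND-gate 73
            downward = self.reg_a < self.reg_b  # comparator 66 via inverter 74
            if self.valid_b:                # AND-gate 75: ratios are valid
                self.reg_d = self.reg_c     # shift C into D
                self.reg_c = downward
            # shift the hit-ratio pipeline: counter 63 -> A -> B
            self.reg_b, self.valid_b = self.reg_a, self.valid_a
            self.reg_a, self.valid_a = self.counter63, True
            self.counter63 = 0
        # AND-gate 76: disable after two consecutive downward cycles
        return self.reg_c and self.reg_d
```

Note that the validity bits suppress the first two comparisons after a reset, matching the lockout role of registers 64' and 65' described above.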
The optional use of the "mode" signal permits selective overriding of the adaptive apparatus by establishing a predetermined hit ratio threshold at which, notwithstanding the presence of an upward trend in the hit ratio, the "disable miss prediction" signal is issued. This may be an appropriate operating procedure when it is known that a new process would not benefit from, or whose performance may be detrimentally affected by, the use of the adaptive circuit following cache "in-rush". In the early stages of the execution of a new process, the new process switches the "mode" signal to "0" to partially enable the AND-gate 78 and to disable the AND-gate 76 to thereby lock out the decision section of the adaptive circuit and also causes the predetermined hit ratio to be loaded into the register 69.
Under these conditions, a short term hit ratio is still being developed in the register A 64, and this hit ratio is compared to the hit ratio threshold held in the register 69 by the comparator 70. As long as the actual hit ratio is less than the hit ratio threshold, the comparator 70 output is "0". However, when the hit ratio reaches or exceeds the hit ratio threshold, the comparator 70 output switches to "1" to fully enable the AND-gate 78 to enable the OR-gate 77 which issues the "disable miss prediction" signal. It will be noted that, if the short term hit ratio again falls below the hit ratio threshold, the "disable miss prediction" signal will switch back to "0", thus permitting the cache miss prediction mechanism to resume operation.
While the principles of the invention have now been made clear in an illustrative embodiment, there will be immediately obvious to those skilled in the art many modifications of structure, arrangements, proportions, the elements, materials, and components, used in the practice of the invention which are particularly adapted for specific environments and operating requirements without departing from those principles.

Claims (12)

1. The method for predicting operand addresses of operand requests in a data processing system in which a cache thereof is repeatedly interrogated to determine whether an address corresponding to a request operand address is stored therein, which stored operand address corresponds to an operand also stored in said cache; wherein said data processing system includes a main memory and a stack for holding addresses; said method being carried out by said data processing system in operating said cache, characterized by the computer-implemented steps of, A) upon the occurrence of a cache miss when an operand is requested from said cache, entering the request operand address into said stack;
B) examining the request operand addresses present in said stack to determine whether one of a plurality of predetermined operand address patterns is represented by said request operand addresses; and C) (1) if no one of said patterns is determined to be represented in step (B), returning to step (A), but (2) (i) if one of said patterns is determined to be represented in step (B), generating an operand address of a predicted operand request, (ii) using said generated operand address to obtain an operand from said main memory and write such operand into said cache, and (iii) returning to step (A), wherein, concurrently with the performing of steps A-C, the following steps are performed;
D) determining successive short term operand hit ratios of said cache memory;
E) comparing pairs of successive ratios determined in step D; and F) generating a control signal when the ratios of said compared pairs represent a downward trend in said successively determined short term operand hit ratios;
wherein said procedure of said steps A through C is halted upon the occurrence of said control signal.
2. The method of claim 1, wherein said procedure of said steps A through C is also halted when said short term operand hit ratio reaches a predetermined value.
3. The method of claim 2, wherein said predetermined value is determined by the process commencing execution following a process change.
4. The method of predicting operand addresses of operand requests in a data processing system in which a cache thereof is repeatedly interrogated to determine whether an operand address corresponding to a requested operand is stored therein, said cache storing operands and their corresponding operand addresses;
wherein said data processing system includes a main memory; said method being carried out by said data processing system in operating said cache; wherein said method includes the controllably operable address prefetch prediction function of;
(a) during the operation of said cache, from time to time determining whether a plurality of earlier-generated request operand addresses correspond to one of a plurality of predetermined operand address patterns, and (b) (i) if no one of said patterns is determined to correspond to said plurality of request operand addresses, returning to step (a), but (ii) if one of said patterns is determined to correspond to said plurality of request operand addresses, generating an operand address of a predicted operand request employing said one pattern, using said predicted operand address to obtain the address of an operand from main memory, entering said obtained operand into the cache, and returning to step (a);
said method being characterized by the computer implemented steps of:
concurrently with the performing of said address prefetch prediction function, the following steps are performed:
A) determining successive short term operand hit ratios of said cache;
B) comparing pairs of successive ratios determined in step A; and C) generating a control signal when the ratios of said compared pairs represent a downward trend in said successively determined short term operand hit ratios;

wherein said address prefetch prediction function is halted upon the occurrence of said control signal.
5. The method of claim 4, wherein each of said short term operand hit ratios is determined by counting the number of operand cache hits occurring during an interval in which a predetermined number of operands are requested from said cache.
6. The method of claim 4, wherein said control signal also is generated when said short term operand hit ratio reaches a predetermined value.
7. The method of claim 6, wherein said predetermined value is determined by the process commencing execution following a process change.
8. The method of claim 4, wherein substep (a) of said address prefetch prediction function occurs whenever an operand cache miss occurs.
9. The method of claim 4, wherein said address prefetch prediction function is enabled for operation (a) during cache "in-rush" following a process change, at which time said steps A-C are not used for halting said address prefetch prediction function, and (b) subsequent to said cache "in-rush", at which time said steps A-C are performed concurrently with said function.
10. Apparatus for controllably generating a predicted operand address of an operand request in a data processing system in which a cache thereof is repeatedly interrogated to determine whether an address corresponding to a request operand address is stored therein, which stored operand address corresponds to an operand also stored in said cache; wherein said data processing system includes a store for holding a plurality of said addresses in respective cells thereof; said apparatus being characterized by:
a controllable operand address prediction mechanism comprising, a plurality of switches, each of said switches having a plurality of input terminals, an output terminal and a control terminal, whereby a control signal applied to said control terminal causes said switch to couple one of said input terminals to said output terminal;
said control signal being delivered to said control terminals to cause the input terminals of said switches to be successively coupled to said output terminals;
a circuit coupling each of said input terminals of said switches to one of said cells;
a plurality of first arithmetic circuits, each of said first arithmetic circuits having a pair of input terminals and an output terminal, the input terminals of each of said first arithmetic circuits being coupled to the respective output terminals of two of said switches, each of said first arithmetic circuits performing an arithmetic operation on the two addresses received by its input terminals from the two cells of said store coupled to said input terminals by said two switches and delivering a signal at its output terminal which represents the result of said arithmetic operation;
a plurality of comparators, each of said comparators having a pair of input terminals and an output terminal, the input terminals of each of said comparators being coupled to the respective output terminals of two of said first arithmetic circuits, each of said comparators comparing the two arithmetic result signals received thereby and delivering a signal at its output terminal denoting whether said result signals are alike;
a second arithmetic circuit having a pair of input terminals and an output terminal, a first input terminal of said second arithmetic circuit being coupled to the output terminal of one of said switches and the second input terminal of said second arithmetic circuit being coupled to the output terminal of one of said first arithmetic circuits, said second arithmetic circuit performing an arithmetic operation on the address received by said first input terminal and the arithmetic result represented by the signal received by said second input terminal and delivering an output signal at its output terminal which represents the result of said arithmetic operation performed by said second arithmetic circuit;
whereby when the signals delivered at the output terminals of all of said comparators denote that the compared arithmetic results are alike, the arithmetic result represented by the concurrent output signal of said second arithmetic circuit represents said predicted operand request address; and a control circuit coupled to said prediction mechanism and comprising:
means for generating numbers representing successive short term operand hit ratios of said cache memory, means for comparing pairs of successively generated ones of said numbers, and means for generating a control signal when the numbers of said compared pairs represent a downward trend in said short term operand hit ratios;
wherein the operation of said prediction mechanism is halted upon the occurrence of said control signal.
11. The apparatus of claim 10, wherein the operation of said prediction mechanism also is halted when said short term operand hit ratio reaches a predetermined value.
12. The apparatus of claim 11, wherein said predetermined value is determined by the process commencing execution following a process change.
CA002089437A 1992-03-13 1993-02-12 Adaptive cache miss prediction mechanism Expired - Fee Related CA2089437C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US07/850,713 1992-03-13
US07/850,713 US5367656A (en) 1992-03-13 1992-03-13 Controlling cache predictive prefetching based on cache hit ratio trend

Publications (2)

Publication Number Publication Date
CA2089437A1 CA2089437A1 (en) 1993-09-14
CA2089437C true CA2089437C (en) 1996-08-27

Family

ID=25308915

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002089437A Expired - Fee Related CA2089437C (en) 1992-03-13 1993-02-12 Adaptive cache miss prediction mechanism

Country Status (4)

Country Link
US (1) US5367656A (en)
EP (1) EP0560100B1 (en)
CA (1) CA2089437C (en)
TW (1) TW270184B (en)

Families Citing this family (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5093777A (en) * 1989-06-12 1992-03-03 Bull Hn Information Systems Inc. Method and apparatus for predicting address of a subsequent cache request upon analyzing address patterns stored in separate miss stack
SE469402B (en) * 1991-05-02 1993-06-28 Swedish Inst Of Computer Scien PROCEDURE TO Fetch DATA FOR A CACHE MEMORY
US5715421A (en) * 1992-10-16 1998-02-03 Seiko Epson Corporation Apparatus and method of addressing paged mode memory including adjacent page precharging
JPH06314241A (en) * 1993-03-04 1994-11-08 Sharp Corp High-speed semiconductor memory and high-speed associative storage device
US5426764A (en) * 1993-08-24 1995-06-20 Ryan; Charles P. Cache miss prediction apparatus with priority encoder for multiple prediction matches and method therefor
AU2364095A (en) * 1994-05-12 1995-12-05 Ast Research, Inc. Cpu activity monitoring through cache watching
US5701426A (en) * 1995-03-31 1997-12-23 Bull Information Systems Inc. Data processing system and method using cache miss address prediction and forced LRU status in a cache memory to improve cache hit ratio
EP0752645B1 (en) * 1995-07-07 2017-11-22 Oracle America, Inc. Tunable software control of Harvard architecture cache memories using prefetch instructions
US5790823A (en) * 1995-07-13 1998-08-04 International Business Machines Corporation Operand prefetch table
US5694568A (en) * 1995-07-27 1997-12-02 Board Of Trustees Of The University Of Illinois Prefetch system applicable to complex memory access schemes
US5765213A (en) * 1995-12-08 1998-06-09 Emc Corporation Method providing for the flexible prefetching of data from a data storage system
US5919256A (en) * 1996-03-26 1999-07-06 Advanced Micro Devices, Inc. Operand cache addressed by the instruction address for reducing latency of read instruction
US5761468A (en) * 1996-05-15 1998-06-02 Sun Microsystems Inc Hardware mechanism for optimizing instruction and data prefetching by forming augmented prefetch instructions
US5893930A (en) * 1996-07-12 1999-04-13 International Business Machines Corporation Predictive translation of a data address utilizing sets of associative entries stored consecutively in a translation lookaside buffer
US5822790A (en) * 1997-02-07 1998-10-13 Sun Microsystems, Inc. Voting data prefetch engine
US6098154A (en) * 1997-06-25 2000-08-01 Sun Microsystems, Inc. Apparatus and method for generating a stride used to derive a prefetch address
US6138212A (en) * 1997-06-25 2000-10-24 Sun Microsystems, Inc. Apparatus and method for generating a stride used to derive a prefetch address
US6047363A (en) * 1997-10-14 2000-04-04 Advanced Micro Devices, Inc. Prefetching data using profile of cache misses from earlier code executions
WO1999034356A2 (en) * 1997-12-30 1999-07-08 Genesis One Technologies, Inc. Disk cache enhancer with dynamically sized read request based upon current cache hit rate
US6055650A (en) * 1998-04-06 2000-04-25 Advanced Micro Devices, Inc. Processor configured to detect program phase changes and to adapt thereto
US6341281B1 (en) 1998-04-14 2002-01-22 Sybase, Inc. Database system with methods for optimizing performance of correlated subqueries by reusing invariant results of operator tree
US6401193B1 (en) 1998-10-26 2002-06-04 Infineon Technologies North America Corp. Dynamic data prefetching based on program counter and addressing mode
US6311260B1 (en) 1999-02-25 2001-10-30 Nec Research Institute, Inc. Method for perfetching structured data
US6470427B1 (en) 1999-11-09 2002-10-22 International Business Machines Corporation Programmable agent and method for managing prefetch queues
US6862657B1 (en) * 1999-12-21 2005-03-01 Intel Corporation Reading data from a storage medium
US6934807B1 (en) * 2000-03-31 2005-08-23 Intel Corporation Determining an amount of data read from a storage medium
US6973542B1 (en) * 2000-07-18 2005-12-06 International Business Machines Corporation Detecting when to prefetch inodes and then prefetching inodes in parallel
US7257810B2 (en) * 2001-11-02 2007-08-14 Sun Microsystems, Inc. Method and apparatus for inserting prefetch instructions in an optimizing compiler
US7234136B2 (en) * 2001-11-02 2007-06-19 Sun Microsystems, Inc. Method and apparatus for selecting references for prefetching in an optimizing compiler
US7035979B2 (en) * 2002-05-22 2006-04-25 International Business Machines Corporation Method and apparatus for optimizing cache hit ratio in non L1 caches
JP4066833B2 (en) * 2003-02-18 2008-03-26 日本電気株式会社 Disk array control device and method, and disk array control program
US7328309B2 (en) * 2004-10-14 2008-02-05 International Business Machines Corporation On-demand cache memory for storage subsystems
JP4827469B2 (en) * 2005-09-08 2011-11-30 パナソニック株式会社 Cache memory analysis method, processor, and simulated information processing apparatus
US7917702B2 (en) * 2007-07-10 2011-03-29 Qualcomm Incorporated Data prefetch throttle
US7925865B2 (en) * 2008-06-02 2011-04-12 Oracle America, Inc. Accuracy of correlation prefetching via block correlation and adaptive prefetch degree selection
WO2013188460A2 (en) 2012-06-15 2013-12-19 Soft Machines, Inc. A virtual load store queue having a dynamic dispatch window with a distributed structure
EP2862060A4 (en) * 2012-06-15 2016-11-30 Soft Machines Inc A method and system for filtering the stores to prevent all stores from having to snoop check against all words of a cache
CN104823168B (en) 2012-06-15 2018-11-09 英特尔公司 The method and system restored in prediction/mistake is omitted in predictive forwarding caused by for realizing from being resequenced by load store and optimizing
CN104583957B (en) 2012-06-15 2018-08-10 英特尔公司 With the speculative instructions sequence without the rearrangement for disambiguating out of order load store queue
EP2862061A4 (en) 2012-06-15 2016-12-21 Soft Machines Inc A virtual load store queue having a dynamic dispatch window with a unified structure
CN104583956B (en) 2012-06-15 2019-01-04 英特尔公司 The instruction definition resequenced and optimized for realizing load store
KR101996462B1 (en) 2012-06-15 2019-07-04 인텔 코포레이션 A disambiguation-free out of order load store queue
US20140189244A1 (en) * 2013-01-02 2014-07-03 Brian C. Grayson Suppression of redundant cache status updates
KR102429903B1 (en) * 2015-12-03 2022-08-05 삼성전자주식회사 The control method of a page fault in the non-volatile main memory system
US10592414B2 (en) * 2017-07-14 2020-03-17 International Business Machines Corporation Filtering of redundantly scheduled write passes

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5093777A (en) * 1989-06-12 1992-03-03 Bull Hn Information Systems Inc. Method and apparatus for predicting address of a subsequent cache request upon analyzing address patterns stored in separate miss stack
US5285527A (en) * 1991-12-11 1994-02-08 Northern Telecom Limited Predictive historical cache memory

Also Published As

Publication number Publication date
EP0560100A1 (en) 1993-09-15
TW270184B (en) 1996-02-11
US5367656A (en) 1994-11-22
CA2089437A1 (en) 1993-09-14
EP0560100B1 (en) 1999-06-09

Similar Documents

Publication Publication Date Title
CA2089437C (en) Adaptive cache miss prediction mechanism
US5694572A (en) Controllably operable method and apparatus for predicting addresses of future operand requests by examination of addresses of prior cache misses
US5701426A (en) Data processing system and method using cache miss address prediction and forced LRU status in a cache memory to improve cache hit ratio
US5941981A (en) System for using a data history table to select among multiple data prefetch algorithms
CA1204219A (en) Method and apparatus for prefetching instructions
US5553255A (en) Data processor with programmable levels of speculative instruction fetching and method of operation
US5507028A (en) History based branch prediction accessed via a history based earlier instruction address
US4980823A (en) Sequential prefetching with deconfirmation
TWI519955B (en) Prefetcher, method of prefetch data and computer program product
US7293164B2 (en) Autonomic method and apparatus for counting branch instructions to generate branch statistics meant to improve branch predictions
US4894772A (en) Method and apparatus for qualifying branch cache entries
US4942520A (en) Method and apparatus for indexing, accessing and updating a memory
US4860199A (en) Hashing indexer for branch cache
EP0097790A2 (en) Apparatus for controlling storage access in a multilevel storage system
US5450561A (en) Cache miss prediction method and apparatus for use with a paged main memory in a data processing system
JPH0557617B2 (en)
JPH10232827A (en) Method and device for writing back prefetch cache
CA2121221C (en) High speed cache miss prediction method and apparatus
US5495591A (en) Method and system for cache miss prediction based on previous cache access requests
EP0377436A2 (en) Apparatus and method for increased operand availability in a data processing unit with a store through cache memory unit strategy
EP0296430A2 (en) Sequential prefetching with deconfirmation
JP3043631B2 (en) File look-ahead system and method
JPH07219838A (en) Data look-ahead controller

Legal Events

Date Code Title Description
EEER Examination request
MKLA Lapsed