CA1205565A

CA1205565A - Collector

Info

Publication number: CA1205565A
Application number: CA000438919A
Authority: CA
Inventors: Russell W. Guenthner; Joseph C. Circello; Gregory C. Edgington; Leonard G. Trubisky
Original assignee: Honeywell Information Systems Inc
Current assignee: Bull HN Information Systems Inc
Priority date: 1982-10-13
Filing date: 1983-10-13
Publication date: 1986-06-03
Also published as: JPS5991547A; DE3377027D1; AU571461B2; EP0106670A3; EP0106670B1; JPS6353571B2; EP0106670A2; US4594660A; AU2007983A

Abstract

COLLECTOR

ABSTRACT

A collector for the results of a pipelined central processing unit of a digital data processing system. The processor has a plurality of execution units, with each execution unit executing a different set of instructions of the instruction repertoire of the processor. The execution units execute instructions issued to them in order of issuance by the pipeline and in parallel. As instructions are issued to the execution units, the operation code identifying each instruction is also issued in program order to an instruction execution queue of the collector. The results of the execution of each instruction by an execution unit are stored in a result stack associated with each execution unit.
Collector control causes the results of the execution of instructions to program visible registers to be stored in a master safe store register in program order which is determined by the order of instructions stored in the instruction execution stack on a first-in first-out basis. The collector also issues write commands to write results of the execution of instructions into memory in program order.

Description


1. Field of the Invention
This invention is in the field of digital data processing systems in which the central processor of the system includes a plurality of execution units. Each of the execution units executes a different subset of the instructions constituting the repertoire of the processor. The execution units are independent of each other and act in parallel. More particularly, this invention relates to a collector in which the results of the execution of instructions by the execution units are received and stored in program order and in which a current copy of the program addressable registers of the central processor is maintained, which copy is available for recovering from faults.

2. Description of the Prior Art
Typically, in prior art central processing systems, the processor includes circuits for producing the addresses of the instruction words, fetching the instructions from memory, preparing the addresses of operands, fetching the operands from memory, loading data into designated registers, executing the instruction and, when the results are produced, writing the results into memory or into program visible registers.


To increase the performance, i.e., throughput, of data processing systems, various modifications have been incorporated in central processing units. To reduce the time required to obtain operands and instructions, high-speed caches located in the processor have been provided. In order to speed up the systems, the systems are synchronized, i.e., a clock produces clock pulses which control each step of the operation of a central processing unit. In pipelined processors, the steps of preparing and fetching the instructions and the operands are overlapped to increase performance.
Because some instructions in a synchronous data processing system take many more clock periods than others, or much more time than others to execute, there is an imbalance in the time required to execute different instructions. One solution to this problem is to divide the processor into a plurality of execution units, where each execution unit will execute a subset of the instruction repertoire of the processor. Executing more than one instruction at a time by operating the execution units in parallel increases the throughput of the processor; however, if the processor is provided with multiple execution units which execute instructions in parallel, there is a need, or requirement, to make certain that the results of the execution of instructions by each of the execution units are assembled in program order so that the data that is written into memory and into the program visible registers is written in proper order.

It is also necessary that there be readily available a current and correct copy of the contents of the program addressable registers to allow for precise handling of faults and interrupts, and to allow for recovering from hardware errors.

SUMMARY OF THE INVENTION

The present invention provides a collector for a central processor of a digital data processing system. The processor has a given repertoire of instructions and a plurality of execution units. Each of the execution units has the capability of executing a different set of instructions of the instruction repertoire. The central pipeline unit of the processor supplies the necessary instructions and operands to an execution unit having the capability of performing the instructions. The instructions and operands are supplied in program order. As the instructions are applied in program order to the execution unit that can execute them, the operational code of the instruction word and other relevant information are transmitted to an instruction execution queue in program order. All but one of the execution units are provided with an input stack which is capable of stacking input instructions in the order received. Each execution unit is provided with a result stack into which are placed the results of the execution of the instructions. Each execution unit executes the instructions applied to it by the pipeline unit in the order the instructions are received. The master safe store and the store stack of the collector then have applied to them the results of the execution of all instructions, with the results being taken from the execution unit result stacks in program order. The control portion of the collector uses the instruction code stored in the first, or bottom, entry of the first-in, first-out stack of the instruction execution queue to enable the transmission of the result of the execution of that instruction from the result stack of the appropriate execution unit to either the master safe store or to the store stack. A valid copy of the program visible registers of the CPU is maintained in the master safe store, and the data to be written into memory is placed in the store stack prior to being written into memory, or into cache memory.
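The in-order collection described above can be illustrated with a minimal Python sketch. It is not the patented circuitry: the unit names are taken from the description, but the opcode mnemonics, the function names, and the choice to store the unit name beside the opcode in the queue (rather than deriving it by table look-up) are assumptions made for brevity.

```python
from collections import deque

# Instruction execution queue: one entry per issued instruction, in program order.
ieq = deque()
# One first-in, first-out result stack per execution unit (two units shown).
result_stacks = {"CEU": deque(), "BINAU": deque()}

def issue(opcode, unit):
    """Record an issued instruction in the queue in program order."""
    ieq.append((opcode, unit))

def complete(unit, result):
    """An execution unit places a finished result in its own result stack."""
    result_stacks[unit].append(result)

def collect():
    """Take results in program order; stop at the oldest unfinished instruction."""
    collected = []
    while ieq:
        opcode, unit = ieq[0]
        if not result_stacks[unit]:
            break  # the oldest instruction has not produced its result yet
        ieq.popleft()
        collected.append((opcode, result_stacks[unit].popleft()))
    return collected

issue("MPY", "BINAU")   # issued first, takes many clock periods
issue("ADD", "CEU")     # issued second, finishes quickly
complete("CEU", 7)      # the ADD result is ready first
early = collect()       # nothing is collected while MPY is outstanding
complete("BINAU", 42)
late = collect()        # both results now come out in program order
```

Although the ADD finishes first, its result waits in the CEU result stack until the earlier MPY has been collected.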
It is, therefore, an object of this invention to provide a collector which allows for increased throughput of the central processor.
It is another object of this invention to provide a collector for a central processing unit having several execution units in which the results of the instructions in program order are received and stored in the collector.
It is yet another object of this invention to provide a collector having a master safe store in which a valid copy of the data stored in program visible registers is stored.
It is still another object of this invention to provide improved apparatus for fault handling and recovery and for interrupt handling in a pipelined processor.
It is still another object of this invention to provide apparatus which provides an improved capability for hardware instruction retry.

It is a further object of this invention to provide a collector which allows the execution units to operate independently and in parallel without interference from each other and permits precise fault handling, interrupt handling, and error recovery without deleteriously impacting the performance of the individual execution units, and without requiring the execution units to delay between execution of instructions.
In accordance with the present invention, there is provided in a central processing unit of a data processing system having a memory, the CPU having a repertoire of instructions, a plurality of execution units, each execution unit adapted to execute a different subset of the repertoire of instructions of the CPU in the order received, the CPU having means for issuing instructions to the execution units in program order, the improvements comprising: means for storing instructions issued by the CPU in program order; result means associated with each execution unit for storing the results of the execution of each instruction in the order executed by its associated execution unit; store means for storing the results of the execution of each instruction executed by the execution units; and control means responsive to the order instructions are received by the means for storing instructions issued by the CPU for issuing results stored in the result means associated with each execution unit in program order and for storing such results in the store means.
In accordance with the present invention, there is also provided in a central processing unit of a synchronous digital data processing system having a memory, the CPU having program addressable registers, a repertoire of instructions, a plurality of execution units, each execution unit adapted to execute a different subset of the repertoire of instructions in the order received, at least one of the execution units executing a subset of the instructions that can be executed in one clock period, the CPU having means for issuing instructions to the execution units in program order, the improvement comprising: an instruction execution queue for storing the operation code of each instruction issued by the CPU in program order; an input stack associated with each execution unit except the unit which executes the subset of the instruction repertoire in one clock period; a result stack associated with each execution unit for storing the results of the execution of each instruction in the order executed by its associated execution unit; a master safe store and a store stack for storing the results of the execution of each instruction executed by the execution units; and collector control means responsive to the order instructions are received by the instruction execution queue for issuing results stored in the result stack associated with the execution unit in program order and for storing such results in the master safe store if the results change the contents of an addressable register and in the store stack if the results are to be written into memory.
In accordance with the present invention, there is also provided a collector for a central processor having a repertoire of instructions, said central processing unit having a plurality of execution units, each execution unit adapted to execute a predetermined set of operation codes of the instruction repertoire of the processor, the processor executing operation codes in program order, the collector comprising: an instruction execution queue adapted to receive an IEQ word including an operation code in program order; a plurality of result stacks, one result stack being associated with each execution unit; each execution unit result stack adapted to receive the results of the execution of each operation code by the execution unit with which it is associated; and means for receiving and storing results from the execution of instructions executed by the execution unit and for transmitting those results which are to be stored to the storage means and control means for issuing in program order the results of the execution of each operation code by the execution units to the means for receiving and storing results; whereby upon the occurrence of a fault, an interrupt, or a hardware error, a correct copy of the contents of the program addressable registers as they existed immediately prior to the occurrence is stored in the master safe store.
In accordance with the present invention, there is also provided a collector for a central processor having a plurality of program addressable registers having addressable memory locations, a system memory, a repertoire of instructions, the central processing unit having a plurality of execution units, each execution unit adapted to execute a predetermined set of operation codes of the instruction repertoire in the order received, the CPU having means for issuing instructions to the execution units in program order, the collector comprising: an instruction execution queue adapted to receive an IEQ word which includes the operation code of an instruction word in program order; a plurality of result stacks, one such result stack being associated with each execution unit, each execution unit result stack adapted to receive the results of the execution of each operation code by the execution unit with which it is associated; a master safe store for storing changes to the contents of program addressable registers; a store stack for storing the results of the execution of each operation code that changes the data stored in a memory location in system memory; and collector control means responsive to the order operation codes are received by the IEQ for causing results stored in the result stack associated with the execution unit capable of executing the operation code to be stored in the master safe store if the result changes the contents of a program addressable register and in the store stack if the results change the data stored in a memory location in system memory.

BRIEF DESCRIPTION OF THE DRAWINGS

Other objects, features and advantages of the invention will be readily apparent from the following description of certain preferred embodiments thereof taken in conjunction with the accompanying drawings, although variations and modifications may be effected without departing from the spirit and scope of the novel concepts of the disclosure, and in which:
Figure 1 is a block diagram of a central processing unit (CPU) of a general-purpose data processing system provided with the collector of this invention;
Figure 2 is a block diagram of a portion of the collector of this invention;
Figure 3 is a block diagram of another portion of the collector of this invention;
Figure 4 is the format of an instruction word;
Figure 5 is the format of an instruction execution queue word; and
Figure 6 is a schematic illustration of a master safe store register illustrating the positions in the register in which the contents of program visible registers are stored.


DESCRIPTION OF THE PREFERRED EMBODIMENT

In Figure 1, the major components, or subsystems, of a central processing unit 10 are illustrated. The central unit pipeline structure 12 controls the overall operation of processor 10. The instruction fetch unit 14 supplies, or transmits, the addresses of instruction words to the instruction cache 16. In response to the instruction address being applied to the instruction cache 16 from instruction fetch unit 14, an instruction word is transmitted from cache 16 to instruction fetch unit 14 which, in turn, forwards instruction words to the central pipeline unit structure 12. Central pipeline unit 12 decodes the instruction code, or operation code, of each instruction word and forwards the operation code plus additional information derived from the instruction word to instruction execution queue 18 for storage. Since most operation codes require an operand, central pipeline unit 12 produces the address of a data word, or operand, and forwards the operand address to operand cache 20. After the operand is received from the operand cache, the operand and the instruction code are transmitted to the distributor 22. The distributor, by analyzing the operation code, typically by use of a table look-up technique, determines the execution unit to which the instruction and operand are to be forwarded, distributed, or ingated.
The four execution units of CPU 10 are the central execution unit (CEU) 24, the virtual memory system and Multics descriptor processing execution unit (VMSM) 26, the binary arithmetic execution unit (BINAU) 28, and the decimal character execution unit (DECCU) 30.
Each of the execution units 24, 26, 28 and 30 is capable of receiving instructions and operands, and processing them independently of the other execution units. Each of the execution units includes logic circuits which are optimized for performing that set of instructions assigned to it. In the preferred embodiment, central execution unit 24 performs basic computer operations, such as simple loads, adds, subtracts, etc., and certain miscellaneous instructions. CEU 24 is unique among the four execution units in that it executes each instruction as received, i.e., within one clock period. As a result, central execution unit 24 is not provided with an input stack as are the other execution units illustrated in Figure 1.
VMSM execution unit 26 executes instructions relating to virtual memory, security and special instructions that are peculiar to a secure operating system. BINAU execution unit 28 executes binary arithmetic instructions, such as multiply, divide and floating point instructions. The decimal/character execution unit 30 executes alphanumeric, decimal arithmetic, and bit string instructions. Execution unit 26 is provided with, or has associated with it, an input stack 32; execution unit 28 is provided with an input stack 34; and execution unit 30 is provided with an input stack 36. The function of input stacks 32, 34 and 36 is to store the operation code and operands, if required, of instruction codes awaiting execution by its associated execution unit.
Each of the input stacks 32, 34, 36 is a conventional first-in, first-out stack having sixteen levels, with each level adapted to store a double data word, or two operands. In the preferred embodiment, each word has 36 bits so that a double word has 72 bits. In addition, the operation code of the instruction word to be performed, or executed, by the execution unit in whose stack the operand and instruction code are located is also stored in the input stack. The input stacks 32, 34, 36 of execution units 26, 28 and 30 are fifo, or first-in, first-out, stacks, so that the first operation code and one or two operand words required for each operation code applied to a given execution unit is the first one read out of the input stack for execution by a given unit. Each of the execution units is also provided with a results stack. Results stack 38 is associated with the central execution unit 24, results stack 40 is associated with execution unit 26, results stack 42 is associated with the BINAU execution unit 28, and results stack 44 is associated with the execution unit 30. In the preferred embodiment, the results stacks are conventional first-in, first-out stacks, each of which has sixteen levels. The results of the operation of an instruction are stored in the stacks in the order in which they are executed. Each level of a results stack has the capability of storing a double word, as well as additional information with respect to the double word, as will be described in more detail below.
Results stacks 38, 40, 42, 44 are a part of collector 46, as are instruction execution queue 18 and collector control 47. The operational code of each instruction word in execution, along with other information, is a part of an instruction execution queue word illustrated in Figure 5 and is stored in instruction execution queue 18, which, in the preferred embodiment, is a conventional first-in, first-out stack of sixteen levels, or layers.
The central unit pipeline structure 12 forwards the operation code of each instruction in program order to instruction execution queue 18 for storage therein. Up to sixteen instruction execution queue, IEQ, words can be stored in queue 18. Collector control 47 uses the operation code of each IEQ word to control the reading out of the results located, or stored, in the results stacks 38, 40, 42, 44 of each of the execution units 24, 26, 28, 30, so that the results in proper program order can be stored in either the master safe store, MSS, 48 or into store stack 50. Results that are stored in store stack 50 are for writes of operands to memory. Instructions which change program addressable registers of CPU 10 generate results that are stored in the master safe store 48, so that at such time as an interrupt, a fault, or a hardware error occurs, the contents of the program addressable registers of the CPU 10 are available in master safe store 48. The availability of current and valid contents of all program addressable registers greatly facilitates fault recovery, handling of interrupts, and retrying of instructions as appropriate.
Main memory 51 of the data processing system of which CPU 10 is a subsystem provides instructions for the instruction cache 16 and operands for operand cache 20. All stores, or writes, to main memory 51 are from data stored in operand cache 20. Thus, whenever data is to be written into memory as a result of an execution of an instruction, the necessary data, operands, are stored in store stack 50 in program order and are issued or written into the operand cache 20 in program order. As a block of operand cache 20 is released so that new data can be written into that block, the operand cache control will have data in that block of the cache written into main memory before new data is written into that block, as is well known in the art.
In Figure 2, a portion of the collector 46 is illustrated; more particularly, the instruction execution queue 18 and collector control logic circuitry 47 of collector 46. Collector sequence control 52 of collector control 47 has applied to it fault detection flags, error detection flags, and interrupt flags from the fault detection circuit 54, error detection circuit 56, and interrupt detection circuit 58 of central processor 10.

Collector sequence control 52 examines the instruction code of the first of the remaining instructions entered into queue 18 and from the instruction code determines, using a conventional table look-up technique, in which one of the result stacks the result of the execution of that instruction is located. Collector sequence control 52 causes the results of the first in time of the remaining instructions stored into the fifo results stack of that execution unit to be read out and written into either master safe store 48 or store stack 50 by appropriate control signals produced by collector control 47. If the results are to be written into main memory 51, the operand, or operands, is caused to be written into store stack 50, which, as described above, is a conventional sixteen entry first-in, first-out stack with each level having the capability of storing two operands or, in the preferred embodiment, seventy-two bits of data. If the next result to be read out of an execution unit in program order is a change in a software addressable register of CPU 10, then that change in a program addressable register will be stored in master safe store 48 with the register being designated by additional bits of information stored in the results stack along with the result of the execution of a given instruction.
Collector control 47 controls the reading out of the operands from result stacks 38, 40, 42, 44 and where they are to be stored, i.e., in master safe store 48 or in store stack 50. Input signals to collector sequence control 52 include the operation code stored in execution queue 18 of the next instruction in program order to be read from one of the execution units. Additional inputs to the collector sequence control 52 are fault detection flags from a source 54 and error detection flags from source 56, which are conventional parts of the CPU 10, as well as interrupts from the I/O system, source 58. Collector control 47 then examines the operation code of the next instruction to be read out of execution queue 18, which code is transferred to collector sequence control 52. From the operational code, the collector sequence control determines which execution unit should have executed that particular instruction and, because of the order in which the instructions are applied to the execution units and to the instruction execution queue, the next result to be read out of the fifo results stack of the execution unit capable of executing that instruction will be the result of the execution of that instruction. If there is an operand, or operands, in the lowest level of that result stack, then collector sequence control 52 will issue the appropriate control signals to gate the operand, or operands, from the appropriate result stack and will also issue appropriate control signals to store the operand in either master safe store 48 or store stack 50. If the instruction is a write to memory, the result from the selected result stack is written into store stack 50. If the result of the selected instruction is a change in a program visible register, the result of the selected instruction will be stored in a portion of master safe store 48 designated to store the contents of the register changed by that instruction. That portion of master safe store 48 into which a change in a program addressable register is written is determined by additional binary data stored with the operand in the result stack.
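One step of the collector sequence control described above might be modeled as in the following Python sketch. The table contents, the opcode mnemonics, and the dictionary layout of a result entry (including the "kind" field that marks a memory write) are illustrative assumptions; the text states only that a table look-up selects the result stack and that additional bits stored with the result designate the register.

```python
from collections import deque

# Hypothetical opcode-to-unit table; the real table contents are not given in the text.
UNIT_FOR_OPCODE = {"LDA": "CEU", "STA": "CEU", "MPY": "BINAU"}

def collector_step(ieq, result_stacks, master_safe_store, store_stack):
    """Handle the oldest queued opcode if its result is ready; return True on progress."""
    if not ieq:
        return False
    stack = result_stacks[UNIT_FOR_OPCODE[ieq[0]]]  # table look-up on the opcode
    if not stack:
        return False  # lowest level of that result stack is still empty
    ieq.popleft()
    result = stack.popleft()
    if result["kind"] == "memory":
        # Writes to memory are queued in the store stack in program order.
        store_stack.append((result["address"], result["value"]))
    else:
        # Register changes update the copy held in the master safe store.
        master_safe_store[result["register"]] = result["value"]
    return True

ieq = deque(["LDA", "MPY", "STA"])
stacks = {"CEU": deque(), "BINAU": deque()}
mss, store = {}, deque()
stacks["CEU"].append({"kind": "register", "register": "A", "value": 5})
stacks["CEU"].append({"kind": "memory", "address": 0o1000, "value": 5})

progress = 0
while collector_step(ieq, stacks, mss, store):
    progress += 1  # only LDA is collected; MPY has no result yet

stacks["BINAU"].append({"kind": "register", "register": "Q", "value": 35})
while collector_step(ieq, stacks, mss, store):
    pass  # MPY and then STA are collected
```

The STA result sits in the CEU result stack until the earlier MPY has been collected, so the store stack never receives a write out of program order.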
In Figure 3, additional details of collector 46, particularly the circuit connections between the result stacks 38, 40, 42, 44 and the master safe store 48 and the store stack 50, are illustrated. The results of the execution of an operation code by an execution unit, such as central execution unit 24, are transmitted to CEU result stack 38 and from it to a set of switches 70 when CEU result stack 38 is enabled by a control signal from collector control 47. The result will be stored in master safe store 48 if it is enabled by control signals from collector control or in store stack 50 if it is enabled. The foregoing is true for each of the result stacks 40, 42 and 44 also. Between result stacks 38, 40, 42 and 44 and the master safe store 48 and store stack 50, there is located a 1 of 4 select switch 70. Selector switch 70 selects, or determines, which signals, i.e., those from which result stack, will be transmitted to master safe store 48 or store stack 50.
Figure 4 illustrates the format of an instruction word 72 for CPU 10. The first 18 bits, denoted Y, bit positions 0 through 17, are the address of the operand on which an operation is to be performed. Bits 18 through 27 constitute the operational code which uniquely identifies the instruction to be performed. Bit 28 is an interrupt inhibit bit which, when a logical 1, inhibits an interrupt from being serviced, or recognized, while an execution unit is executing that operation code, or instruction. Bit 29 specifically refers to segments external to the current instruction segment. The tag field, bit positions 30 through 35, is generally used to control address modification.
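The field layout of the 36-bit instruction word can be made concrete with a short Python sketch. It assumes that bit position 0 is the most significant bit of the word, which the text does not state explicitly; the function and key names are likewise illustrative.

```python
WORD_BITS = 36  # each word has 36 bits in the preferred embodiment

def field(word, first, last):
    """Extract bits first..last (inclusive), numbering bit 0 as the most significant bit."""
    width = last - first + 1
    shift = WORD_BITS - 1 - last
    return (word >> shift) & ((1 << width) - 1)

def decode_instruction(word):
    """Split an instruction word into the fields described for Figure 4."""
    return {
        "y": field(word, 0, 17),                   # operand address
        "opcode": field(word, 18, 27),             # operational code
        "interrupt_inhibit": field(word, 28, 28),  # interrupt inhibit bit
        "bit29": field(word, 29, 29),              # external segment reference bit
        "tag": field(word, 30, 35),                # address modification tag
    }

# Build a sample word from arbitrary field values, then decode it again.
word = (0o001234 << 18) | (0o235 << 8) | (1 << 7) | 0o07
decoded = decode_instruction(word)
```

The sample values are arbitrary and do not correspond to any real operation code of the processor.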
Figure 5 is the format of an instruction execution queue (IEQ) word 74, one of which can be stored in each of the 16 levels of instruction execution queue 18, a fifo stack. In IEQ word 74, bit positions 0 through 9 are the operational code of the instruction word 72 as illustrated in Figure 4. Bit 10 indicates the location of the operand within a double word, i.e., whether it is the first or the second of the two words. Bits 11 through 13 identify the address registers into which the result of the execution of the operation code is to be written. Bit 14 corresponds to instruction word bit 28, the interrupt inhibit bit. Bit 15 is the same as instruction word bit 29. Bits 16 through 21 constitute instruction word bits 30 through 35, or the tag. Bit 22 is instruction word bit 5, a truncation fault enable bit, and bit 23 indicates the store of A/Q with character modification. In addition, the instruction word address, or the contents of the instruction counter of CPU 10, is a part of IEQ word 74, an 18-bit binary number, as is the memory address needed for a write command which, in this example, is a 28-bit binary number.
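As a worked illustration of the IEQ word layout, the following Python sketch packs the first 24 bit positions into an integer. The class and field names are hypothetical, bit 0 is assumed to be the most significant of the 24 control bits, and the two addresses are kept as separate fields because the text does not give their bit positions within the word.

```python
from dataclasses import dataclass

@dataclass
class IEQWord:
    opcode: int                   # bits 0-9
    second_word: int              # bit 10: operand is the second word of a double word
    address_register: int         # bits 11-13
    interrupt_inhibit: int        # bit 14 (instruction word bit 28)
    bit29: int                    # bit 15 (instruction word bit 29)
    tag: int                      # bits 16-21 (instruction word bits 30-35)
    truncation_fault_enable: int  # bit 22
    aq_char_store: int            # bit 23: store of A/Q with character modification
    instruction_address: int      # 18-bit instruction counter value
    memory_address: int           # 28-bit address for a write command

    def control_bits(self):
        """Pack bit positions 0 through 23 into one integer, bit 0 first."""
        layout = [
            (self.opcode, 10), (self.second_word, 1), (self.address_register, 3),
            (self.interrupt_inhibit, 1), (self.bit29, 1), (self.tag, 6),
            (self.truncation_fault_enable, 1), (self.aq_char_store, 1),
        ]
        packed = 0
        for value, width in layout:
            if not 0 <= value < (1 << width):
                raise ValueError("field value does not fit in its width")
            packed = (packed << width) | value
        return packed

sample = IEQWord(opcode=0o235, second_word=1, address_register=3,
                 interrupt_inhibit=0, bit29=0, tag=0o07,
                 truncation_fault_enable=0, aq_char_store=0,
                 instruction_address=0o4000, memory_address=0o1000)
packed = sample.control_bits()
```

The field widths sum to 24 bits, matching bit positions 0 through 23 of the description.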
Figure 6 illustrates the location in master safe store 48 in which the contents of the various program addressable registers are stored. In the registers indicated, an X denotes an index register, A the accumulation register, Q the quotient register, AR 0-7 address registers, BER a base extension register, BAR a base address register, and MBA and MBB master base registers A and B. The instruction counter, or the address of the last instruction loaded into the master safe store, is stored in portion 76, and the current contents of the CPU indicator register are stored in portion 78 of master safe store 48.
The repertoire of instructions for processor 10 consists of several classes of operation codes, or instructions. Some of these operation codes can be deemed to be basic operations, i.e., simple loads, adds, subtracts, etc., which require but one clock period to execute. Other classes are floating point, multiply and divide, both floating point and fixed point, address register modification, shifts and rotates, decimal arithmetic, alphanumeric, bit string, virtual memory and security-related, store and miscellaneous instructions. Processor 10 is physically and functionally divided along these same lines with each execution unit having a highly optimized set of logic for executing the class of instructions assigned to it.

The central unit pipeline structure 12 controls the overall operation of processor 10 and has the function of sending operation codes, or commands, and associated operands to the various execution units 24, 26, 28, 30, where the actual execution of each operation code is performed. The instruction fetch unit 14 under the control of the central unit pipeline structure 12 fetches instructions from the instruction cache 16 and forwards the instruction word to the central unit pipeline structure 12. The central pipeline unit 12 decodes the instructions, fetches the operands from the operand cache 20, and sends the operation code and operands to one of the execution units 24, 26, 28, or 30. Within the central unit pipeline structure 12 are performed instruction preprocessing, instruction decode, and operand address formation, including paging and search of an associative memory. Structure 12 also controls cache 20.
The execution units 24, 26, 28, 30 receive commands from the central pipeline unit 12 and operands from the cache 20, and perform, or execute, the instruction code. The execution of an instruction generally involves the formation of some result based upon current register contents and the input operand which produces a change to a program visible register or to memory.
Processor 10 is divided into four major execution units, each of which is made up of one or more subunits. These units are (1) the central execution unit, CEU 24, (2) the binary arithmetic unit, which performs floating point and multiply and divide instructions, BINAU 28, (3) the decimal character unit, DECCU 30, and (4) the virtual memory, security and miscellaneous unit, VMSM 26. Each of the execution units 24, 26, 28, 30 receives instructions and operands and processes them independently of what any of the other execution units may be doing.
Execution units 26, 28 and 30 each have an input stack 32, 34, 36, a 16-level fifo stack with each level capable of holding two data words, or operands. A double word in the preferred embodiment comprises seventy-two bits plus parity bits. In addition, each level holds an associated operation code. Thus, an input stack can hold up to sixteen commands and operands awaiting execution by the execution unit to which the stack is assigned. It might be noted that the decision as to which execution unit receives, or is assigned, a given instruction and its associated operands is determined by distributor 22 by examining the operation code of the instruction. The particular method used in the preferred embodiment is the table look-up technique. Input stacks 32, 34, 36 allow central pipeline structure 12 to issue operands and associated operation codes to the execution units at the rate of one per clock period, without waiting for completion of the execution of preceding multiple execution cycle instructions, for example. Such an arrangement also allows execution of instructions in the different execution units to be overlapped. However, within an execution unit there is no overlap. Each instruction code is always executed in the order it is received from the pipeline unit 12 and forwarded to distributor 22.
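The distributor's table look-up and the sixteen-level input stacks can be sketched in Python as follows. The opcode mnemonics and table contents are hypothetical, and the "stall" outcome for a full stack is an assumption, since the text does not say what the pipeline does when an input stack is full.

```python
from collections import deque

STACK_DEPTH = 16  # each input stack has sixteen levels

# Hypothetical opcode-to-unit table; actual opcode assignments are not given in the text.
DISPATCH_TABLE = {"ADD": "CEU", "MPY": "BINAU", "DIV": "BINAU",
                  "MVT": "DECCU", "LDP": "VMSM"}

# The CEU has no input stack because it executes each instruction as received.
input_stacks = {unit: deque() for unit in ("VMSM", "BINAU", "DECCU")}

def distribute(opcode, operands):
    """Route an opcode and its operands to the execution unit chosen by table look-up."""
    unit = DISPATCH_TABLE[opcode]
    if unit == "CEU":
        return unit, "executed"
    stack = input_stacks[unit]
    if len(stack) >= STACK_DEPTH:
        return unit, "stall"  # assumption: issue is held while the stack is full
    stack.append((opcode, operands))  # each level holds the opcode and up to two operands
    return unit, "queued"

first = distribute("ADD", (1, 2))
second = distribute("MPY", (3, 4))
```

Because each stack is first-in, first-out, a unit always sees its instructions in the order the pipeline issued them, even though different units run at different speeds.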

The system architecture of processor 10, i.e., having several execution units, requires that several copies of the major registers, for example the A and the Q, be kept. As processing proceeds, the valid copy of a particular register may be in any one of the execution units or in any one of several different register banks within processor 10. Central unit pipeline structure 12 maintains a record of the currently valid copy for each register and recognizes when the execution of the next instruction requires transferring a copy of the contents of a register from one execution unit to another. However, maintaining a valid copy of the contents of a particular register is complicated by the length of pipeline 12, which is five instructions deep.
The ability to determine the contents of each addressable register immediately prior to the occurrence of a fault is a requirement for prompt fault recovery. In any pipeline computer, processing of any one instruction is overlapped with the processing of several other instructions in different stages of execution. In addition, in CPU 10 the execution of several instructions may simultaneously occur in different execution units. As a result, at any one time, the registers in pipeline 12 and in execution units 24, 26, 28 and 30 could contain register changes resulting from the processing and execution of several different instruction codes.

When an instruction fault or an interrupt occurs, pipeline 12 must be halted at the end of the last instruction before the fault or interrupt occurred. All register changes as the result of the execution of instructions in program order prior to the fault or interrupt should be completed, and any program visible register changes or changes to memory as the result of the execution of later in program order instructions must be canceled or deleted. Collector 46 provides a valid, current copy of each of the program addressable registers to facilitate fault recovery and for handling interrupts.
Collector 46 consists of execution result stacks 38, 40, 42, 44, instruction execution queue 18, master safe store 48 in which the contents of all program visible registers of CPU 10 are stored, store stack 50 through which the operands to be written into memory and their addresses are transmitted for storage in the operand cache 20, and collector control 47. Each execution unit 24, 26, 28, 30 has its own result stack 38, 40, 42, 44. As each execution unit completes the execution of an instruction, it forwards to its associated results stack 38, 40, 42, or 44, an updated copy of any program visible registers which were changed by that instruction or any change to memory 20. These register changes are stored in the 16-level fifo stack constituting result stacks 38, 40, 42, 44 in the order they are received in the execution results stack. Likewise, the result of any store instruction executed by central execution unit 24 is also placed in result stack 38 of CEU 24 in program order. Thus, each result stack contains a record of up to the last 16 program visible register changes or changes to memory, or both, in program order, i.e., the first in will be the first out, or the result that is stored in a result stack the longest will be the first one read out of the stack.
The master copy of all program visible registers is maintained in the master safe store 48. Master safe store 48 is a register bank in which valid copies of the contents of the program visible, i.e., program addressable, registers are stored. The locations in which the contents of each such register are located in MSS 48 are illustrated in Figure 6. Collector 46 through collector control 47, which includes the collector sequence control 52, store processing logic 60, and result stack unload control 62, takes results, register changes or stores, out of the execution results stacks one at a time and copies them into master safe store 48 or sends them to the store stack 50 for writing into the operand cache 20 in program order.
An important aspect of collector 46 is that all program visible register changes are received by and all stores to memory 20 pass through collector 46.
In order to properly handle a fault, it is necessary to stop the operation code, or instruction, processing so that to the computer program it appears that no instruction following the faulting instruction has been executed. When a fault or interrupt occurs a flag denoting the fault is set in the level of the result stack of the execution unit in which the results of the instruction in which the fault occurred is stored. When the collector sequence control identifies that a fault flag is stored with the results of the next instruction in program order to be stored in either master safe store 48 or store stack 50, the results at that level and all results of instructions subsequent thereto in program order will be discarded. It should be noted though that collector 46 will continue to unload the results stacks of execution units and to transfer changes in program visible register to master safe store 48 or to store stack 50 and thence to the operand cache 20 until the result in which the fault occurred is reached in the instruction stream, or in program order. Thus, at the time collector 46 senses, or recognizes, a fault, both operand cache 20 and master safe store 48 are in the correct state, i.e., have valid copies of program visible registers and only valid data has been written into operand cache 20 preparatory to being written into memory 51.
The data in MSS 48 is available for subsequent fault processing in order to recover therefrom. The master copy of the registers in master safe store 48 is the only valid copy of such registers, and any outstanding stores to the cache including that instruction in which the fault occurred and any subsequent are canceled. CPU 10 then begins the normal process of fault recovery.
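The discard rule set out in the preceding paragraphs may be illustrated, under simplifying assumptions, by the following Python sketch. Results are assumed to be presented already merged into program order, and the dictionary fields (`seq`, `kind`, `target`, `value`, `fault`) and the function name `commit_until_fault` are hypothetical; the sketch shows only that results ahead of the faulting instruction are committed and that the faulting result and everything after it are dropped.

```python
def commit_until_fault(results_in_program_order, master_safe_store, store_stack):
    """Commit results in order; discard the first flagged result and all later ones."""
    discarded = []
    faulted = False
    for result in results_in_program_order:
        if faulted or result["fault"]:
            faulted = True
            discarded.append(result["seq"])  # never reaches the MSS or the store stack
            continue
        if result["kind"] == "register":
            master_safe_store[result["target"]] = result["value"]
        else:  # a change to memory goes to the store stack
            store_stack.append((result["target"], result["value"]))
    return discarded


master_safe_store = {"A": 0, "Q": 0}
store_stack = []
stream = [
    {"seq": 0, "kind": "register", "target": "A", "value": 5, "fault": False},
    {"seq": 1, "kind": "store", "target": 512, "value": 5, "fault": False},
    {"seq": 2, "kind": "register", "target": "Q", "value": 9, "fault": True},
    {"seq": 3, "kind": "register", "target": "A", "value": 7, "fault": False},
]
discarded = commit_until_fault(stream, master_safe_store, store_stack)
```

After the call, the modeled master safe store holds the register state as it stood immediately before the faulting instruction, which is the property relied upon for fault recovery.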

The record of the proper program order for all instructions in execution, or being processed, by CPU 10 is maintained in the collector's instruction execution queue 18. Instruction execution queue 18 contains one entry for each instruction in process. Entries into master safe store 48 and into the store stack 50 are ordered so that they are unloaded in proper program order, i.e., the same order in which the instructions are stored into the instruction execution stack 18 by the central pipeline structure 12. Instruction execution queue word 74 contains the operation code of the instruction and identifies, using a table look-up technique, the execution result stack in which the result of that instruction when executed is, or will be, entered. The result of each instruction executed is then transferred from the appropriate result stack to master safe store 48 or to store stack 50 in program order. Thus, in collector 46 instructions are completed, i.e., the results of each are received and arranged in the proper, or program, order.
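The ordering role of the instruction execution queue may be sketched in Python as follows. This is a simplified model and not the disclosed circuitry: the table `RESULT_STACK_FOR`, the opcode mnemonics, and the function `collect` are assumptions, and it is further assumed that the collector simply waits when the result of the oldest outstanding instruction has not yet arrived.

```python
from collections import deque

# Hypothetical look-up table from opcode to the result stack that holds its result.
RESULT_STACK_FOR = {"ADD": "CEU", "FMUL": "BINAU", "DADD": "DECCU"}


def collect(ieq, result_stacks):
    """Unload results in the order the IEQ words were entered."""
    committed = []
    while ieq:
        opcode = ieq[0]  # oldest outstanding instruction
        stack = result_stacks[RESULT_STACK_FOR[opcode]]
        if not stack:
            break  # its unit has not finished yet; younger results must wait
        ieq.popleft()
        committed.append(stack.popleft())
    return committed


# Program order: FMUL, ADD, DADD, ADD. The fast unit has already finished
# both ADDs, while the decimal unit has not yet produced its result.
ieq = deque(["FMUL", "ADD", "DADD", "ADD"])
result_stacks = {
    "CEU": deque(["add#1", "add#3"]),
    "BINAU": deque(["fmul#0"]),
    "DECCU": deque(),
}
first_pass = collect(ieq, result_stacks)
result_stacks["DECCU"].append("dadd#2")  # the slow result arrives later
second_pass = collect(ieq, result_stacks)
```

Although the second ADD finished early, its result is not unloaded until the DADD ahead of it in program order has been unloaded, so the committed sequence matches the issue sequence.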
In addition to handling fault recovery and interrupts, collector 46 also performs the actual execution of all memory store instructions. Master safe store 48 contains a copy of all program visible registers, so it is a convenient place to obtain the contents of program visible registers which are to be written into memory. Handling store instructions in collector 46 with the data to be written into memory 51 coming from either master safe store 48 or execution unit result stacks via store stack 50 maintains program order and avoids the necessity of the execution units 24, 26, 28 and 30 from being involved in store instructions. In this sense, collector 46 is another execution unit for processing stores. As a result, simple stores can be overlapped with the execution of any other instruction taking two or more clock periods. The information stored in master safe store 48 also makes it relatively easy for CPU 10 to retry hardware instructions.
Instruction execution queue 18 is a first-in, first-out stack containing an entry for every outstanding instruction of central processor 10. The order of entry of each of the IEQ words 74 in instruction execution queue 18 controls the completion of instructions by controlling the order of unloading of the results of the execution of each operation code from the result stacks 38, 40, 42, 44. Instruction execution queue 18 also contains the contents of the instruction counter for each operation code and the address in memory 51 into which stores are made from store stack 50.
The execution unit results stacks 38, 40, 42, 44 are first-in, first-out stacks in which the results from the execution of each operation code by an execution unit are stored. An instruction, or operation, code is completed when the result of any register change has been recorded in the master safe store or any changes to memory have been transferred to the store stack 50. Fault flags and indicator register changes of associated execution units are also contained in each entry in the execution result stack for each execution unit.
The master safe store 48 contains the final or correct copy of all program visible registers in CPU 10 plus such program visible but generally miscellaneous registers such as the instruction counter and the indicator register as indicated in Figure 6. All changes in program visible register of CPU 10 are reflected in the information stored in master safe store 48.
Changes to master safe store 48 are completed in program sequence as each IEQ word 74 is removed from instruction execution queue 18 and the results of the execution of the operation code of that IEQ word are removed from the appropriate result stack. In the normal operation of CPU 10, results are held in the result stacks 38, 40, 42, 44 long enough so that there is a high probability that no fault has occurred in the processing of that instruction. Then, if no fault is detected by collector control 47, the results are transferred to master safe store 48 or to store stack 50.
Data to be written into memory 51 can come from master safe store 48 by transferring such data to store stack 50 on simple store instructions or from the execution result stacks for all other stores. Store instructions executed from data in the master safe store take no time in any of the execution units and only one pipeline cycle is required to enter a store instruction in the instruction execution queue.
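The two sources of store data just described may be illustrated by the following Python sketch. The field names of the modeled IEQ word (`simple_store`, `register`, `address`) and the function `process_store` are hypothetical; the sketch assumes only what the text states, namely that a simple store takes its data from the master safe store, that any other store takes its data from a result stack, and that the address is supplied with the IEQ entry.

```python
from collections import deque


def process_store(ieq_word, master_safe_store, result_stack, store_stack):
    """Move one store's data, paired with its address, onto the store stack."""
    if ieq_word["simple_store"]:
        # Simple store: the register contents are already in the master safe
        # store, so no execution unit is involved.
        data = master_safe_store[ieq_word["register"]]
    else:
        # Any other store: the data was produced by an execution unit and is
        # waiting at the head of that unit's result stack.
        data = result_stack.popleft()
    store_stack.append((ieq_word["address"], data))


master_safe_store = {"A": 17, "Q": 3}
ceu_result_stack = deque([99])
store_stack = deque()

process_store({"simple_store": True, "register": "A", "address": 64},
              master_safe_store, ceu_result_stack, store_stack)
process_store({"simple_store": False, "register": None, "address": 65},
              master_safe_store, ceu_result_stack, store_stack)
```

Because both kinds of store pass through the same first-in, first-out store stack, they reach the operand cache in the order in which they were processed.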

There are two ways in which a fault can be signalled to the result stack unload control 62. Faults that can be flagged with the result of an instruction are entered in the same entry as the result in the results stack of the execution unit in which the fault occurred or was detected. This is done by raising an execution fault flag on the input to the appropriate result stack, entering the fault type in the execution unit's fault type register, and entering a five-bit fault code in the execution unit's fault code register. This type of fault signalling is used by the execution units for any fault where the fault signal can be placed in the result stack, either with or in place of the data associated with the normal processing of an instruction.
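The first kind of fault signalling described above might be modeled as in the following Python sketch. The class `ExecutionUnitModel`, its method names, and the fault type string are assumptions made for the example; only the fault flag carried in the result stack entry, the separate fault type and fault code registers, and the five-bit width of the fault code come from the text.

```python
from collections import deque


class ExecutionUnitModel:
    """Toy execution unit with a result stack and two fault registers."""

    def __init__(self):
        self.result_stack = deque()
        self.fault_type_register = None
        self.fault_code_register = None

    def post_result(self, data):
        # Normal completion: the entry carries data and no fault flag.
        self.result_stack.append({"data": data, "fault": False})

    def post_fault(self, fault_type, fault_code, data=None):
        # A five-bit code can only take the values 0 through 31.
        if not 0 <= fault_code < 32:
            raise ValueError("fault code must fit in five bits")
        self.fault_type_register = fault_type
        self.fault_code_register = fault_code
        # The flag travels with, or in place of, the normal result data.
        self.result_stack.append({"data": data, "fault": True})


unit = ExecutionUnitModel()
unit.post_result(41)
unit.post_fault("hypothetical-overflow", 0b10110)
flags = [entry["fault"] for entry in unit.result_stack]
```

Placing the flag in the same first-in, first-out entry as the result means that the unload control meets the fault exactly when the faulting instruction's turn comes in program order.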
For interrupt handling, an interrupt present line from the I/O portion of the data processing system signals CPU 10 that an interrupt must be handled. Logic circuit 62 checks the interrupt inhibit bit of the last instruction unloaded. If the interrupt inhibit bit is a one, or on, then collector 46 will continue to unload the results stack. If the interrupt inhibit bit is off, it will acknowledge the interrupt, stop further unloading of any results in the results stack, and invoke interrupt processing.
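The interrupt inhibit check may be sketched in Python as follows. The function `unload_until_interrupt` and the tuple layout are hypothetical, and the sketch assumes for simplicity that the interrupt present line is already raised before the first result shown is unloaded.

```python
def unload_until_interrupt(entries, interrupt_present):
    """Unload (result, inhibit_bit) pairs until an interrupt may be acknowledged."""
    unloaded = []
    for result, inhibit_bit in entries:
        unloaded.append(result)
        # The inhibit bit of the instruction just unloaded decides what happens next.
        if interrupt_present and not inhibit_bit:
            return unloaded, True  # acknowledge and stop further unloading
    return unloaded, False


entries = [("r0", True), ("r1", True), ("r2", False), ("r3", False)]
with_interrupt = unload_until_interrupt(entries, interrupt_present=True)
without_interrupt = unload_until_interrupt(entries, interrupt_present=False)
```

With the line raised, unloading continues through the two inhibited instructions and stops after the first instruction whose inhibit bit is off; with no interrupt present, all four results are unloaded.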
The removal or the storage of results into master safe store 48 if it is a change to a program visible register or into the store stack if it is a write to memory are controlled by execution queue words 74 stored in the first-in, first-out stack of instruction execution queue 18.

In those instances where the information stored in the master safe store is to be written into memory, the data is transferred from master safe store register 48 into store stack 50 and then the address from IEQ stack 18 and the data to be stored are forwarded to the operand cache 20. Operand cache 20 will, when the block of the cache into which data has been written is being released by cache 20, before releasing that block write the data in that block into memory 51. The information contained in the results stacks, particularly that for storage in the master safe store 48, will include, in addition to the contents of the register that have been changed by the instruction, the contents of the instruction counter of the last instruction read out of IEQ 18.
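The behavior of the operand cache on release of a written block may be illustrated by the following Python sketch. The class `OperandCacheModel`, the block size of four words, and the use of dictionaries for the cache and for memory are all assumptions made for the example; the text states only that a block into which data has been written is written into memory before the block is released.

```python
class OperandCacheModel:
    """Toy cache that copies a written block to memory when the block is released."""

    def __init__(self, memory, block_size=4):
        self.memory = memory          # dict of address -> data, standing in for memory
        self.block_size = block_size  # hypothetical block size
        self.blocks = {}              # block number -> {address: data}
        self.written = set()          # block numbers holding stored data

    def write(self, address, data):
        block = address // self.block_size
        self.blocks.setdefault(block, {})[address] = data
        self.written.add(block)

    def release(self, block):
        contents = self.blocks.pop(block, {})
        if block in self.written:
            # Write the block back before it leaves the cache.
            self.memory.update(contents)
            self.written.discard(block)


memory = {}
cache = OperandCacheModel(memory)
cache.write(8, 42)               # address 8 falls in block 2
memory_before_release = dict(memory)
cache.release(2)
```

In this model a store forwarded from the store stack changes only the cache at first, and memory is updated later when the block is released.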
From the foregoing, it is seen that this invention provides a mechanism for collecting in proper program order instruction results from a plurality of execution units which are executing instructions in parallel. The collector also maintains a master copy of program visible registers and performs the execution of store instructions. Collector 46 aids by providing the contents of the MSS 48 to aid in interrupt, fault, or error processing and in assisting CPU 10 in recovering because it contains a valid copy of the program visible registers of CPU 10 as they existed prior to the interrupt, fault, or error.
It should be evident that various modifications can be made to the described embodiment without departing from the scope of the present invention.
What is claimed is:

Claims

Claim 1. In a central processing unit of a data processing system having a memory, said CPU having a repertoire of instructions, a plurality of execution units, each execution unit adapted to execute a different subset of the repertoire of instructions of the CPU in the order received, said CPU having means for issuing instructions to the execution units in program order; the improvements comprising:
means for storing instructions issued by the CPU in program order; result means associated with each execution unit for storing the results of the execution of each instruction in the order executed by its associated execution unit; store means for storing the results of the execution of each instruction executed by the execution units; and control means responsive to the order instructions are received by the means for storing instructions issued by the CPU for issuing results stored in the result means associated with each execution unit in program order and for storing such results in said store means.
Claim 2. In a central processing unit as defined in Claim 1 in which results means associated with each execution unit is a fifo stack.
Claim 3. In a central processing unit as defined in Claim 2 in which the means for storing instructions issued by the CPU in program order is a fifo stack.
Claim 4. In a central processing unit as defined in Claim 3 in which the store means for storing the results of the execution of each instruction executed by the execution units is a master safe store and a store stack.
Claim 5. In a central processing unit as defined in Claim 4 in which the store stacks each is a fifo stack.
Claim 6. In a central processing unit of a synchronous digital data processing system having a memory, said CPU having program addressable registers, a repertoire of instructions, a plurality of execution units, each execution unit adapted to execute a different subset of the repertoire of instructions in the order received, at least one of said execution units executing a subset of the instructions that can be executed in one clock period, said CPU having means for issuing instructions to the execution units in program order, the improvement comprising:
an instruction execution queue for storing the operation code of each instruction issued by the CPU in program order; an input stack associated with each execution unit except the unit which executes the subset of the instruction repertoire in one clock period; a result stack associated with each execution unit for storing the results of the execution of each instruction in the order executed by its associated execution unit; a master safe store and a store stack for storing the results of the execution of each instruction executed by the execution units; and collector control means responsive to the order instructions are received by the instruction execution queue for issuing results stored in the result stack associated with the execution unit in program order and for storing such results in the master safe store if the results change the contents of an addressable register and in the store stack if the results are to be written into memory.
7. A collector for a central processor having a repertoire of instructions, said central processing unit having a plurality of execution units, each execution unit adapted to execute a predetermined set of operation codes of the instruction repertoire of the processor, said processor executing operation codes in program order, said collector comprising:
an instruction execution queue adapted to receive an IEQ word including an operation code in program order; a plurality of result stacks, one result stack being associated with each execution unit; each execution unit result stack adapted to receive the results of the execution of each operation code by the execution unit with which it is associated; and means for receiving and storing results from the execution of instructions executed by the execution unit and for transmitting those results which are to be stored to the storage means and control means for issuing in program order the results of the execution of each operation code by the execution units to the means for receiving and storing results; whereby upon the occurrence of a fault, an interrupt, or a hardware error, a correct copy of the contents of the program addressable registers as they existed immediately prior to the occurrence is stored in the master safe store.
8. A collector for a central processor as defined in Claim 7 in which the instruction execution queue is a fifo stack.
9. A collector for a central processor as defined in Claim 8 in which each of the plurality of result stacks is a fifo stack.
10. A collector for a central processor having a plurality of program addressable registers having addressable memory locations, a system memory, a repertoire of instructions, said central processing unit having a plurality of execution units, each execution unit adapted to execute a predetermined set of operation codes of the instruction repertoire in the order received, said CPU having means for issuing instructions to the execution units in program order, said collector comprising:
an instruction execution queue adapted to receive an IEQ word which includes the operation code of an instruction word in program order; a plurality of result stacks, one such result stack being associated with each execution unit, each execution unit result stack adapted to receive the results of the execution of each operation code by the execution unit with which it is associated; a master safe store for storing changes to the contents of program addressable registers; a store stack for storing the results of the execution of each operation code that changes the data stored in a memory location in system memory; and collector control means responsive to the order operation codes are received by the IEQ for causing results stored in the result stack associated with the execution unit capable of executing the operation code to be stored in the master safe store if the result changes the contents of a program addressable register and in the store stack if the results change the data stored in a memory location in system memory.