CA1171544A - Multiprocessor computer system with dynamical allocation of multiprocessing tasks and processor for use in such multiprocessor computer system - Google Patents

Multiprocessor computer system with dynamical allocation of multiprocessing tasks and processor for use in such multiprocessor computer system

Info

Publication number
CA1171544A
CA1171544A CA000397122A
Authority
CA
Canada
Prior art keywords
program
data processing
local
address
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
CA000397122A
Other languages
French (fr)
Inventor
Denis R.C. Pottier
Edmond Lemaire
Luyen Letat
Jean Gobert
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Koninklijke Philips NV
Original Assignee
Philips Gloeilampenfabrieken NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Philips Gloeilampenfabrieken NV filed Critical Philips Gloeilampenfabrieken NV
Application granted granted Critical
Publication of CA1171544A publication Critical patent/CA1171544A/en
Expired legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/40Transformation of program code
    • G06F8/41Compilation
    • G06F8/45Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
    • G06F8/451Code distribution

Abstract

ABSTRACT OF THE DISCLOSURE:

"Multiprocessor computer system with dynamical allocation of multi-processing tasks and processor for use in such multiprocessor computer system"
A multiprogramming data processing system comprises a plurality of data processing devices P1, P2, P3, P4 each having local storage 110-116 and has furthermore an interconnecting standard bus 100. The program is divided in program segments S1-S4, while the program segments are grouped into program portions (k, m, n) The respective program portions are each stored at one of the local memory sections. When an extended branch instruction calls and address in a different program portion, a portion change interruptsignal (26) is generated, whereby dynamical allocation of the execution of program segments may be reali-zed. When a privileged portion (0) is called, the portion change inter-rupt is nullified, both at the calling to, (28) and the return (23) from the privileged program portion.

Description

"Multiprocessor computer system with dynamical allocation of multiprocessing tasks and processor for use in such multiprocessor computer system".

BACKGROUND OF THE INVENTION
The invention relates to a multiprogramming data processing system comprising a plurality of data processing devices interconnected by a common standard bus for transporting address signals and data signals, wherein each data processing device comprises:
a) a local data processor having a first data port connected to the standard bus;
b) a local memory section having a second data port connected to the standard bus;
c) a local bus interconnecting the local data processor to the local memory section;
d) priority means for providing a local data processor with privileged access to the associated local memory in preference over a memory access arriving over the standard bus;
the data processing system furthermore having an overall memory encompassing the aggregated local memories for storing respective program segments for any program to be executed. Such data processing systems or computer systems are products of the data processing industry. The notion of a program segment is well known; the division of a program into segments is carried out by the programmer writing the program. The elements of a program segment are usually logically related, for example in that all such elements are either user data items, or are all parameter values, or are all program code or instruction statements. Segments may be of equal or unequal lengths, while a single category of elements, such as user data, may also be distributed over different segments. The overall memory may encompass a relatively large shared memory of equal or lower operating speed in comparison to the respective local memories. The data processing system may comprise other devices such as input/output devices. The local data processors and/or local memory sections may have different capabilities. A distributed system of the kind described is suitable for executing a plurality of programs in parallel. The stationary linkage of a specific program to one associated data processing device has several disadvantages. In the first place, certain program segments may be used in several programs. For example, a certain set of code statements may be used with different sets of user data, while each set of user data must furthermore be processed by means of respective further sets of code statements. Now the memory capacity would be stressed if the common segment(s) were stored a plurality of times, each time at a respective one of the data processing devices. Furthermore, in case of a modification within a common segment, the uniqueness of the content would be destroyed. Alternatively, if a common segment were stored only once, frequent access thereto via the standard bus would be detrimental to processing speed. Furthermore, the capacity of one local memory would readily be insufficient to store all segments of one long program. Finally, such stationary linkage would be quite inflexible.
SUMMARY OF THE INVENTION
It is an object of the present invention to provide a multiprogramming data processing system as explained hereinbefore with means for allowing dynamical assignment of respective programs to respective data processing devices therein, without the need for centralized interrogation of all data processing devices concerning the status of execution of the program assigned thereto, and to realize the required mechanisms with quite modest additions to the respective data processing devices.
The object of the invention is realized in that the data processing system furthermore has waiting list means for receiving and storing a succession of program execution request signals arriving by way of the standard bus and for each such request signal generating a waiting list item for an associated data processing device;
and wherein each data processing device furthermore comprises:
e. first detector means for selectively detecting, during execution of a program segment therein, a non-branch instruction, a first type branch instruction governing a branch within the current program segment, and a second type branch instruction governing a branch outside the current segment;
f. second detector means for, upon detection of a second type branch instruction, selectively detecting a branch within the current program portion containing the program segment being executed and a portion changing branch to a second program portion differing from the current program portion, and thereupon generating a portion changing interrupt signal while terminating the execution of the current program portion;
g. execution request signalling means for, upon generation of a portion changing interrupt signal, accessing said waiting list means with a program execution request signal for the program containing the program portion thus terminated and a waiting list accessing signal for interrogating a waiting list item from its own waiting list.
Consequently, two new elements are introduced. In the first place, a portion of a program is built up from one or more segments of that program. The portion, by virtue of allocating a portion number to its constituent program segments, may be realized by the programmer during generation of the program. Alternatively, the portion number is allocated dynamically during loading of the program: the portion number should preferably be the same as the portion number of one or more program segments resident in the same local memory and also forming part of the same program. It is, however, sometimes preferable that one or more program portions contain only a single program segment. The first portion number allocating method would ensure that program segments having a strong operational interrelationship, for example in that one would frequently address the other(s), would reside in the same local memory. In any case, segments contained in the same program portion must also be contained in the same local memory section. If a segment of a specific program portion is being loaded, it is mandatory that further segments of the same program portion be accommodated in the same local memory section.
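By way of illustration only, the simple load-time allocation rule mentioned above can be sketched in C; the data layout, the function names and the example segment placement (mirroring S1-S4 of Fig. 1) are assumptions of this sketch, not part of the patent.

#include <stdio.h>

/* Hypothetical descriptor of a loaded segment; the patent prescribes no
 * concrete layout. */
struct segment {
    int program_id;   /* program the segment belongs to            */
    int memory_id;    /* local memory section chosen by the loader */
    int portion;      /* portion number assigned below             */
};

/* Simple rule from the description: segments of the same program loaded
 * into the same local memory section receive the same portion number;
 * any other segment starts a new portion.  Portion 0 is kept free here
 * because it is used later as the privileged portion. */
static void assign_portions(struct segment s[], int n)
{
    int next = 1;
    for (int i = 0; i < n; i++) {
        s[i].portion = 0;
        for (int j = 0; j < i; j++)
            if (s[j].program_id == s[i].program_id &&
                s[j].memory_id == s[i].memory_id) {
                s[i].portion = s[j].portion;   /* join the co-resident group */
                break;
            }
        if (s[i].portion == 0)
            s[i].portion = next++;             /* open a new program portion */
    }
}

int main(void)
{
    /* S1, S2 in section 114; S3, S4 in section 112 (cf. Fig. 1). */
    struct segment s[4] = {
        {1, 114, 0}, {1, 114, 0}, {1, 112, 0}, {1, 112, 0}
    };
    assign_portions(s, 4);
    for (int i = 0; i < 4; i++)
        printf("S%d -> portion %d\n", i + 1, s[i].portion);
    return 0;   /* prints portions 1, 1, 2, 2 with this simple rule */
}

Note that this simple rule would give S3 and S4 the same portion number; the example of Fig. 1 deliberately places S4 in a separate portion "n" for the reasons discussed further on.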
Now, it is not necessary to include the portion number in the writing of the source program, and the notion thereof does not complicate the task of the programmer. Furthermore, the portability of the program from one machine to another is preserved: for example, in the case where a single processing device were present, the control system of this elementary system could ignore the portion number by disabling the second detector means.
The waiting list means may reside in a specific program that is resident in a specific one of the data processing devices; for each of the data processing devices a waiting list is kept of the program execution request signals. The distribution of execution time may be based on a momentary priority: each program remains under execution until a portion change interrupt occurs and then the next level of priority is taken up. Otherwise, the maximum uninterrupted execution time for any program portion could be limited to a predetermined maximum value. Also, other types of time distribution may be used. In the above situation the specific data processing device that contains the waiting list means functions as a central processing device.
Alternatively, each data processing device could form a list of received program execution request signals directed to itself. The next program to be executed would then again be selectable from the respective data processing device's own waiting list. In consequence, the execution of a program is dynamically assigned to the data processing device containing the program portion being executed within that program. If the destination portion is present at the same data processing device as the origin portion, in certain situations it may be necessary to reorganize the local memory section; for example, if a portion of a program is no longer needed it could be unloaded to background memory.
If the destination portion is present at a different data processing device, the waiting list of the latter gets a new item. Also in this case the origin portion may be unloaded to background memory. In special cases, as will be explained hereinafter, a local data processor may execute a program portion present in another local memory.
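A minimal sketch, in C, of how such a per-device waiting list with read and write pointers might be organized; the structure, field names and sizes are assumptions made for illustration and are not specified in the patent text.

#include <stdint.h>
#include <stdbool.h>

#define WL_SIZE 8
#define NREGS   8

struct wl_item {
    uint16_t program_id;     /* which program to (re)start            */
    uint32_t start_address;  /* address of the next instruction       */
    uint32_t regs[NREGS];    /* saved initial/previous register state */
};

struct waiting_list {
    struct wl_item items[WL_SIZE];
    unsigned rd, wr;         /* read and write pointers (start at 0)  */
};

/* Post a program execution request to a device's waiting list. */
static bool wl_post(struct waiting_list *wl, struct wl_item item)
{
    if (wl->wr - wl->rd == WL_SIZE)   /* list full */
        return false;
    wl->items[wl->wr++ % WL_SIZE] = item;
    return true;
}

/* Interrogate (take) the next item from the device's own waiting list. */
static bool wl_take(struct waiting_list *wl, struct wl_item *out)
{
    if (wl->rd == wl->wr)             /* nothing ready for this device */
        return false;
    *out = wl->items[wl->rd++ % WL_SIZE];
    return true;
}

An item would be posted by the execution request signalling means of the device that terminates a program portion, and taken again when the device holding the destination portion is ready to resume execution.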
In a preferred mode of execution each data processing device furthermore comprises third detector means for detecting in a second type branch instruction a predetermined program portion number and thereupon nullifying the associated portion changing interrupt signal, both at entering and at returning from said predetermined program portion.
An advantageous realization of this concept is attained when the control system of the data processing device has capacity for storing a portion number (A). Upon occurrence of a portion change trap, two further portion numbers are relevant: that of the program portion being executed (B) and that of the destination program portion (C). Normally, A and B are equal, while A is updated at each program portion change. Now, however, the predetermined portion number may be associated with a program portion containing an often used service routine. In certain circumstances it is advantageous that a change to such a service routine would not entail terminating the execution of the current program portion, and the change-over between the data processing devices, because this would slow down the overall capability of the data processing system. In this respect it should be noted that each portion change takes a certain time because register contents must be saved and other termination steps taken. Thus the slow-down caused by resorting to a non-privileged access to a local memory section would be less than the slow-down caused by effecting a portion change. If in such a case portion number A is not updated, at the return to the originating portion the portion numbers A and C would be equal. The latter condition is readily detectable. In certain other cases no switch to a different data processing device is undertaken even if the latter contains the destination program portion. This would be the case with dynamically shared programs. Such double access to shared programs has, for example, been disclosed in United States Patent 3997875 to Applicant. The shared program could, for example, be the control program of a printer device. The situation could also be the other way round, where the data processing device currently performing the execution calls the control program of the printer, which control program is then executed by a different data processing device. The latter could execute the control program (portion) as a prerequisite to outputting its own updated information (e.g. a user data program portion). For brevity, this is not explained further.
The difference between the portion change trap mechanism and an older trap mechanism, "segment not stored in memory", should be clear. The latter mechanism is used in a hierarchic memory organization with a fast foreground memory and a slower background memory. Addressing of a segment that is not present in the foreground memory necessitates the loading of the latter segment according to a segment replacement algorithm. Typically this feature is part of a monoprocessor system. According to the above, the portion change trap occurs when the destination segment is already present in memory. Of course, the present invention could be combined with this older mechanism, where the calling of a program portion not resident in any local memory section would trigger a different trap.
The invention also relates to a data processing device for use in a data processing system as described hereinbefore, wherein said second detector means comprise comparing means for, upon decoding of a second type branch instruction, comparing the portion numbers of the current program portion and the program portion containing the destination address of the second type branch. In this way a simple realization is reached.
BRIEF DESCRIPTION OF THE FIGURES
An embodiment of a realization according to the invention is described with reference to the following figures:
Fig. 1 shows the general lay-out of a multiprocessor data processing system;
Fig. 2 shows a single processor;
Fig. 3 shows a flow diagram of the realization of a portion change trap - the "execution" part;
Fig. 4 shows a flow diagram of the realization of a portion change trap - the "fetch" part;
Fig. 5 gives a variation of Fig. 4.
GENERAL DESCRIPTION OF A PREFERRED EMBODIMENT OF THE MULTIPROCESSOR SYSTEM
Fig. 1 shows the general lay-out of a multiprocessor system. The system in this embodiment comprises four processors P1, P2, P3, P4, each of which has access to the standard bus 100 via its proper access line 102, 104, 106, 108, respectively. For convenience both these proper access lines and the standard bus 100 have been drawn as only a single line. For brevity, the mechanism for granting the use of the bus to the several data processing devices is not further described; it could be effected by distributed arbitration. The memory M is divided into four memory sections 110, 112, 114, 116, each provided with its proper memory access device 118, 120, 122, 124. The memory access devices comprise the usual address buffers, data buffers, and furthermore two access ports. One access port is connected to the associated processor (126, 128, 130, 132), while the other one (134, 136, 138, 140) is connected to the standard bus 100. Each memory access device furthermore comprises a priority element for granting a memory access emanating from the associated processor (for example from processor P3 to memory section 114) priority over a memory access emanating from another processor and communicated therefrom via the standard bus. The priority may be simply realized as follows: the associated request signals are OR-ed to the memory section proper, while the ensuing acknowledge signals from the memory section are forwarded to the originator of the request. If both the local processor and a non-local processor would request at the same time, the request signal of the former would block the acknowledge signal to the latter. Consequently, data transport via the standard bus is relatively slow for several reasons. In the first place the standard bus is a shared facility among a plurality of potential requesting stations (in this case four). In the second place, even if a request for the standard bus has been granted, the memory access to be effected thereby may be negated because the memory section affected grants priority to an access request by its proper processor. Finally, the local buses 126, 128, 130, 132 may accommodate a higher data rate in that the local memory section and its processor are physically close to each other; also the exchange protocol of the standard bus operates more slowly.
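The priority element just described can be pictured with the following C fragment; it is only an interpretation of the OR-ing and acknowledge-blocking of requests, with invented names, not a description of the actual hardware.

#include <stdbool.h>

/* Rough sketch (an assumption) of the priority element in each memory
 * access device 118-124: the request of the local processor and a
 * request arriving over the standard bus are OR-ed towards the memory
 * section, and the acknowledge only reaches the bus requester when the
 * local processor is idle. */
struct mem_access_device {
    bool local_req;   /* request from the associated (local) processor */
    bool bus_req;     /* request arriving over the standard bus 100    */
};

/* Combined request seen by the memory section proper. */
static bool section_request(const struct mem_access_device *d)
{
    return d->local_req || d->bus_req;
}

/* Route the section's acknowledge: the local request blocks the
 * acknowledge to the standard-bus requester, giving the local processor
 * privileged access to its own memory section. */
static void route_ack(const struct mem_access_device *d, bool ack,
                      bool *ack_local, bool *ack_bus)
{
    *ack_local = ack && d->local_req;
    *ack_bus   = ack && d->bus_req && !d->local_req;
}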
Now, each processor may execute instructions of any program present in the whole of memory M. For enhanced processing capability, generally, the execution of a program segment present in a certain local memory, such as memory 110, is assigned to the associated data processor, in this case data processor P1. The assignment algorithm is uniquely part of the monitor program or the operating system of the data processing system. Such an operating system could be of a conventional nature. The specific data processing device containing the assignment algorithm has a privileged role as a central processor, for example in that it has the highest priority level in case of multiple simultaneous requests to the standard data bus. The assignment program forms, for every data processing device, a waiting list of programs assigned thereto and ready for execution. Such a waiting list could be realized in a respective sequence of storage positions for each data processing device, provided with a read pointer and a write pointer. Each item of the waiting list comprises the necessary indications for the processor to start execution of the program, such as a program identifier, the address of the starting instruction, and the initial contents of a given number of registers. Generally, the program instructions contained, for example, in memory 114 are more efficiently executed by processor P3 than by others, while data transport is effected via local bus 130. For the other memory sections 110...116 a corresponding situation exists, because then no large-scale data transport over the standard bus 100 would be required.
It should be noted that the standard bus 100 could also be connected to other devices AD1, AD2 such as background memory, input-output devices and data communication controllers with associated external communication lines.
Fig. 2 shows diagrammatically the structure of one of the processors. According to the present state of large scale integration, a processor essentially consists of two interconnected and microprogrammed IC modules LSI1, LSI2. In this specific situation, two additional smaller modules 142, 144 were added. In future these four modules will readily be united into a single module. Now module LSI1 comprises especially the hardware for executing the instructions and may be represented by an MC68000 sixteen-bit microprocessor, manufactured by Motorola Corporation, Phoenix, Arizona. It has been divulged in the MC68000 user's manual, original issue September 1, 1979, and described in an article by E. Stritter et al. in Computer, February 1979, p. 43-52. Notable features of this microcomputer are a 16-bit data word, eight 32-bit data registers, seven 32-bit address registers, two 32-bit stack pointers and a 24-bit program counter. It has an instruction length of one to five 16-bit words, of which the first, or operation word, specifies the operation and modes (the further ones specify an immediate operand and address extensions). Furthermore, a sixteen-bit status register is provided, containing an eight-bit user byte and an eight-bit system byte. In addition to address, data and power pins, the package has three processor status pins (FC0, FC1, FC2), three peripheral control pins (E, VMA, VPA), three system control pins (BERR, RESET, HALT), five asynchronous bus control pins (AS, R/W, UDS, LDS, DTACK), three bus arbitration control pins (BR, BG, BGACK) and three interrupt control pins (IPL0, IPL1, IPL2). As powerful as it is, the MC68000 microcomputer cannot manage large memory resources by itself. It therefore relies on a second integrated circuit module LSI2 of the type MC68451 to handle the tasks associated with virtual memory and multitasking support. The unit LSI2 may effect:
- address translation from logical to physical memory
- separation of user and system resource address space
- separation of program and data address spaces
- interprocess communication through shared resources
- support of both paging and segmentation of memory
- provision of multiple memory management units in a system
The 64-pin package has a first set of pins associated with the logical address space (address pins A8-A23 and A1-A5, processor status pins, asynchronous bus control pins), a second set of pins associated with the physical address space (bus management pins MAS, WIN, CS, IRQ, IACK, data pins PA8-PA23 and further pins ED, HAD) and several so-called global connections.
Now, as shown, the first seven address bits of LSI1 are directly forwarded to the input/output line 102, as are data bits D0-D15 and a read/write control signal. Address bits A8-A23 are furthermore transported to memory management unit LSI2. The latter has a multiplexed data/address port connected to data transceiver 142, make Texas Instruments, type 74LS245, and to address latch 144, make Texas Instruments, type 74LS373. Further control is provided by outputs ED and HAD of element LSI2. In this way the line 102 is provided with a 23-bit address path and a 16-bit data path. A further enlargement of the set-up shown, by means of doubling the memory management unit, would allow for handling 64 segment descriptors. The inclusion of further control signals into the control part of the standard bus is not further described for brevity. In the following, specific reference is made to the instruction register of element LSI1, named R1, and therein the parts COP for storing the operation code and OP for storing an operand; furthermore to one of the cited registers, D1, and a status signalling bit TCP, which may be in the status register or in a further one of the cited registers. Furthermore, reference is had to register Rr of element LSI2, which serves for storing a relevant segment descriptor; these parts form normal elements of the two building blocks LSI1 and LSI2. For simplicity the storage registers and bits have only been indicated schematically. Consequently, in this set-up the processor, such as P1 in Fig. 1, has only a single data/address port 102. This necessitates a slight modification in Fig. 1 in that line 102 would be connected to block 118 in lieu of line 126. The latter would then be absent. A similar situation would occur in the other data processing devices.
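For the discussion of Figs. 3 to 5 that follows, it may help to collect the registers and bits just introduced into one illustrative C structure. This declaration is an assumption of this sketch (the real parts are hardware registers inside LSI1 and LSI2); it merely reuses the names R1, COP, OP, TCP, Rr, RE1, RE2 and the storage address PC from the text (PC appears only in the Fig. 5 variant).

#include <stdint.h>
#include <stdbool.h>

struct processor_state {
    struct {
        uint16_t COP;        /* operation code of the current instruction        */
        uint32_t OP;         /* operand, e.g. destination address A2             */
    } R1;                    /* instruction register of LSI1                     */
    bool     TCP;            /* set on an out-of-segment (second type) branch    */
    uint16_t Rr;             /* segment descriptor / destination segment (LSI2)  */
    uint16_t RE1, RE2;       /* ALU input registers: current / destination portion */
    uint16_t PC_portion;     /* storage address PC: portion currently executed   */
    uint32_t next_address;   /* logical address of the next instruction          */
};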
Fig. 3 gives a flow diagram of the realization of a portion change trap, the "execution" part thereof; it is understood that the instruction under execution has been fetched already and is present in the instruction register. Block 22 symbolizes the entry to the flow diagram, notably from the flow diagram of Fig. 4. Block 9 of Fig. 3 symbolizes the initial stages of the execution of this instruction. In the exemplary situation shown in Fig. 1 this would be, for example, the instruction that was present in local memory section 114 at address A1. The execution therefore is done in processor P3. At a certain instant during this execution the processor P3 undertakes the detection whether the current instruction is a branch instruction and notably whether this is a so-called "extended branch" instruction; the latter action is symbolized by block 10. The program containing this current instruction was assigned to this specific processor by the assignment algorithm and, up to the commencement of the execution thereof, placed in the waiting list of processor P3; this could be done by a centralized assignment program. In this example, the program in execution consists of three program segments, of which segments S1, S2 have been stored in memory section 114 and program segment S3 has been stored in another memory section 112. Now, the two program segments S1, S2 have been placed in program portion "k", for example in that the indicator number "k" was inserted into the segment descriptors, either at conception of the program by a programmer, or during loading of the respective segments by a portion number assignment routine. This routine, in a simple form, would assign the same portion number to all program segments of a program provided they were loaded into the same local memory section. It would be allowable that one local memory section contained two groups of program segments associated with two respective programs. Of course, in the latter case each group of program segments has a respective different portion number. Now, in the example shown in Fig. 1, segment S3 is placed in program portion "m", wherein k differs from m. Furthermore, no further program segments, not stored in respective memory sections 114, 112, form part of program portions "k", "m". As stated earlier, the grouping of program segments into program portions may be done for operational reasons, in that the respective program segments of a single program portion are used coincidingly.
Reverting to Fig. 3, the operation code of the current instruction, address A1, is present in register R1 of processor P3 for (further) decoding. The decoding and execution of the instruction is done under microprogram control, but, apart from the steps described, no further explanation is given of the remaining, conventional operation. The detection of the "extended branch" code, to an address outside the current segment, is done in block 10. If this "extended branch" is not present (no branch instruction, or a branch within the current segment: a branch of the first type) the system goes directly to block 200. If, on the other hand, the current instruction governs a branch of the second type, the processor goes to block 11. In block 11 the signalling bit TCP is set to "1". This "1" signals the out-of-segment branch, and this signalling bit should therefore be at zero at the exit from block 9. This is effected in that either bit TCP is reserved only for this specific signalling, or the signalling bit is shared with a further function; in the latter case the further function must ensure that bit TCP is reset to zero before the end of block 9 is reached. In block 12 the logical address A2 of the desired branch is loaded; this address would be the operand of the branch instruction. Furthermore, the number of the destination segment, for example segment S3, is buffered in register Rr. In block 200 the remainder (if any) of blocks 9, 200 forms the, possibly conventional, execution of the instruction, wherein either one of those blocks 9, 200 may be absent. In block 14 it is detected whether the current instruction was the last one of the program. If so (Y), in block 15 the program is terminated and the waiting list is accessed to detect whether a further program is to be executed by the relevant processor (here P3). If not (N), in block 202 the execution of the current instruction has thus been completed, and the system goes to Fig. 4.
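Interpreted as code, the "execution" part of Fig. 3 could look roughly as follows; the function and parameter names are hypothetical, and the instruction decoding itself (blocks 9 and 200) is assumed to have been done by the caller.

#include <stdint.h>
#include <stdbool.h>

struct exec_out {
    bool     tcp;            /* block 11: out-of-segment branch flagged          */
    uint32_t next_address;   /* block 12: logical destination address A2         */
    uint16_t dest_segment;   /* block 12: destination segment, kept in Rr        */
    bool     program_done;   /* blocks 14/15: last instruction of the program    */
};

static struct exec_out execution_part(bool is_extended_branch,
                                      uint32_t branch_operand_a2,
                                      uint16_t branch_dest_segment,
                                      bool is_last_instruction)
{
    struct exec_out out = { .tcp = false };    /* TCP is zero at the exit of block 9 */

    if (is_extended_branch) {                  /* block 10: second type branch?      */
        out.tcp          = true;               /* block 11                           */
        out.next_address = branch_operand_a2;  /* block 12                           */
        out.dest_segment = branch_dest_segment;
    }
    /* block 200: conventional remainder of the execution (omitted here)            */

    out.program_done = is_last_instruction;    /* block 14 -> block 15 or block 202  */
    return out;
}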
Fig. 4 shows a flow diagram of a second part of a realization of the portion change trap, the "fetch" part. In addition thereto, the initializing steps of the program are shown. Block 16 symbolizes the start of the program when it is taken from the waiting list of the relevant data processing device. Block 17 symbolizes the normal initializing steps before the program is started; among other things, it would be required that at the end thereof signalling bit TCP were zero, and the address of the initial instruction (here considered as the "next" address, A2) loaded. The functions in blocks 14, 15, 16, 17 are given by way of example.
Block 202 symbolizes the entry from the flow diagram of Fig. 3. In block 18 the logical address of the next instruction, A2, is translated into a physical address. As is well known, a logical memory address is a kind of number: the address parts (bits or bit groups) need bear no relationship to the physical partitioning of the memory, such as in this case to the respective memory sections. On the other hand, a physical address may contain several parts, each part thereof addressing a specific memory device or sub-device. Consequently, this translation would produce the portion number, which in case of address A2 would be "m". Next, the processor, in block 19, tests the state of signalling bit TCP, which, in case of a negative response (N), controls a transition to block 20, which will be considered hereinafter. If, however, signalling bit TCP is in the "1" state, the system goes to block 204. In block 204 signalling bit TCP is reset to zero and is not further used until a possible next entrance into Fig. 3, block 11, or, as explained before, possibly for executing a different routine which would eventually leave this bit position at zero, at the latest when leaving block 9 in Fig. 3. Reverting to Fig. 4, in block 21 the number of the current program portion (in case of current address A1 this would be "k") is buffered in the input register RE1 of the arithmetic and logic unit (ALU) of the part LSI1 of processor P3. In block 22 the portion number of the destination address (in case of destination address A2 in program segment S3 this would be "m") is buffered in a further input register RE2 of the arithmetic and logic unit of the part LSI1 of processor P3. The destination address contains, in this example, the operation code (COP) of the instruction required next. Alternatively, such a destination address could contain any other program element required next. In block 23 the unit LSI1 tests whether the contents of the two registers RE1 and RE2 are identical. If they are identical (Y), no portion change is required and the system goes to block 20. This could, even in case of an extended branch instruction, occur during a transition from program segment S1 to program segment S2, both of which are in program portion "k". If the two numbers are not equal (N), in block 24 the status of the current program is saved: depending on the state of the program, the contents of the program counter, of crucial parameters and register contents are saved in buffer storage proper to the current program. Saving measures are well known, and for brevity no further explanation is given. Next, in block 25 the current portion number is updated from "k" to "m". Alternatively, the updating could be executed in block 17, while block 25 were suppressed. In the next block, block 26, a portion changing interrupt signal is generated for the waiting list program. This portion changing interrupt signal contains the program portion number "m" (destination) and possibly one or more data that are to be used instantaneously when entering program portion "m". The portion changing interrupt signal furthermore implies the instruction fetch of the first instruction of the trap routine by processor P3. This routine is present in a program segment stored in the local memory section of processor P3; it has not been shown in the Figure. Finally, in block 27, processor P3 abandons the current program and selects the next program present in its own waiting list. The start of the execution thereof would imply going to block 16 in Fig. 4 again. The same occurs when the "stop" block 15 in Fig. 3 is reached.
On the other hand, the assignment program must place the portion changing interrupt request in the waiting list for the intended processor, in this case processor P2. If the assignment program is to be executed by a specific one of the processors, the associated task may have preference over any task presently being executed in this specific processor. In certain cases, the user may have specified that a different processor would be better suited to execute program segment S3; then the assignment program would process the request correspondingly. It is to be noted that also the generation of a waiting list for any processor may be done according to one of several principles. Examples are "first come, first served", and an algorithm wherein each program has a fixed priority level among the programs. If, however, in the foregoing block 20 is reached, therein the destination instruction is loaded from the local memory section into the instruction register of the local processor. In this case no portion change would be effected. Thereafter, via block 22, the execution cycle in Fig. 3 is reached again.
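The decision taken in the "fetch" part of Fig. 4 can be summarized by the following C sketch; it is an interpretation only, with hypothetical names, and it leaves out the actual saving of registers (block 24) and the construction of the interrupt message (block 26).

#include <stdint.h>
#include <stdbool.h>

enum fetch_action {
    FETCH_LOCALLY,            /* block 20: load instruction from local memory */
    RAISE_PORTION_CHANGE      /* block 26: signal the waiting list program    */
};

static enum fetch_action fetch_part(bool *tcp,                 /* signalling bit TCP          */
                                    uint16_t *current_portion, /* updated in block 25         */
                                    uint16_t dest_portion)     /* from translation, block 18  */
{
    if (!*tcp)                           /* block 19: no second type branch pending */
        return FETCH_LOCALLY;

    *tcp = false;                        /* block 204: reset TCP                    */

    /* Blocks 21-23: compare current portion (RE1) with destination (RE2).          */
    if (*current_portion == dest_portion)
        return FETCH_LOCALLY;            /* e.g. S1 -> S2, both in portion "k"      */

    /* Block 24: save program status (program counter, registers, ...) - omitted.   */
    /* Block 25: update the current portion number, e.g. "k" -> "m".                */
    *current_portion = dest_portion;

    /* Blocks 26/27: generate the portion changing interrupt carrying the
     * destination portion number and abandon the current program.                  */
    return RAISE_PORTION_CHANGE;
}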
In certain cases the program could contain further program segments, such as segment S4, also present in local memory section 112. A first approach would be to give program segments S3 and S4 the same program portion number. Branching from program segment S3 to program segment S4 (for example to destination address A3) would then produce no portion change. In other cases it could be advantageous or even necessary to assign different program portion numbers "m" and "n", as shown. Examples could be that program segment S4 is shared by several programs, or that the occurrence of a branch between segments S3 and S4 would be an exceptional case. It should be remembered that all program segments of the same program portion should be stored in a single local memory section, and the ensuing storage requirement for one local memory section could endanger the flexibility of memory management. However, if program segments S3, S4 have different portion numbers, a branch between them would trigger a portion change and this would cause a decrease of processing efficiency. Obviously, the adverse effects of both choices may vary from case to case.
In certain situations it is advantageous to nullify a "portion changing trap" signal in case it would refer to a specific or privileged portion number. Although several such privileged portion numbers could exist in any single system, in the following only a single one is considered: portion number "zero". This program portion could contain one or more program segments, such as segment S0. Such a segment could contain a service routine, such as a conversion or encryption routine. This service routine would repeatedly be addressed from various other segments S1, S2, S3, S4 of the program being executed, or even from various different programs. As shown in Fig. 1, segment S0, portion "zero", has been loaded in memory section 110. Especially if the execution of the service routine requires only limited data/address transfer over the common standard bus and local bus 134, the delay caused by such transfer (and the required contention for the shared bus facilities) would only produce a limited slow-down of the processing. This slow-down should be balanced against the delay incurred by the double portion change during the call to and the return from the service routine, the updating of the waiting lists, and so on.
In consequence, Fig. 5 gives a variation of Fig. 4, in which only a part of the earlier flow diagram has been repeated. Blocks 19, 20, 22, 23, 24 are identical to the corresponding blocks in Fig. 4. Two notable differences are present with respect to Fig. 4. In the first place a storage address PC has been used in the machine for storing the program portion number of the portion currently being executed. This would be the storage address that is updated in block 25 of Fig. 4. Now, in block 29, the number of the current program portion is fetched from address position PC and buffered in the input register RE1 of the arithmetic and logic unit of the part LSI1 of the originating processor. Furthermore, in block 28 it is tested whether the portion number of the destination portion is equal to zero. If it is, the trap is not effected and the system goes to block 20: this means that the contents of address position PC are not updated either. Now, when the privileged program portion is called, the test in block 28 has a positive outcome. Alternatively, upon return from the privileged program portion, the test in block 23 has a positive outcome because PC still contains the first portion number. On the other hand, when a non-privileged program portion is called, the tests in blocks 23, 28 give negative outcomes, both at the call to the latter program portion and at the return therefrom. It should be noted that a call from the privileged program portion to another portion than the originating portion does not imply a return. Consequently, both tests in blocks 23, 28 would give a negative outcome and, in consequence, block 24 is reached. A similar situation would arise when there were two (or more) privileged program portions. The test in block 28 would then be for equality to either of the privileged program portion numbers.
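A compact C rendering of the Fig. 5 variation, under the same caveats as the earlier sketches: the test against the privileged portion number zero (block 28) suppresses both the trap and the update of address position PC, so that the later return from the service routine compares equal in block 23 and does not trap either.

#include <stdint.h>
#include <stdbool.h>

#define PRIVILEGED_PORTION 0u

/* Returns true when a portion changing trap must be effected. */
static bool portion_change_required(uint16_t *pc_portion,   /* address position PC    */
                                    uint16_t dest_portion)  /* destination portion no */
{
    if (dest_portion == PRIVILEGED_PORTION)   /* block 28: call into portion 0        */
        return false;                         /* block 20: no trap, PC stays unchanged */

    if (*pc_portion == dest_portion)          /* block 23: also covers the return
                                                 from the privileged portion           */
        return false;

    *pc_portion = dest_portion;               /* update of PC as in block 25 of Fig. 4 */
    return true;                              /* blocks 24, 26, 27: save state and trap */
}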


Claims (4)

THE EMBODIMENTS OF THE INVENTION IN WHICH AN EXCLUSIVE
PROPERTY OR PRIVILEGE IS CLAIMED ARE DEFINED AS FOLLOWS:
1. A multiprogramming data processing system comprising a plurality of data processing devices interconnected by a common standard bus (100) for transporting address signals and data signals, wherein each data processing device comprises:
a. a local data processor P1-P4 having a first data port connected to the standard bus;
b. a local memory section (110-116) having a second data port connected to the standard bus;
c. a local bus (126-132) interconnecting the local data processor to the local memory section;
d. priority means (118-124) for providing a local data processor with a privileged access to the associated local memory in preference over a memory access arriving over the standard bus;
the data processing system furthermore having:
A. an overall memory encompassing the aggregated local memories for storing respective program segments for any program to be executed;
B. waiting list means for receiving and storing a succession of program execution request signals arriving by way of the standard bus and for each such request signal generating a waiting list item for an associated data processing device;
wherein each data processing device furthermore comprises:
e. first detector means (10) for selectively detecting during execution of a program segment therein a non-branch instruction, a first type branch instruction governing a branch within the current program segment, and a second type branch instruction governing a branch outside the current segment;
f. second detector means (23) for upon detection of a second type branch instruction selectively detecting a branch within the current program portion containing the program segment being executed and a portion changing branch to a second program portion differing from the current program portion and thereupon generating a portion changing interrupt signal (26) while terminating the execution of the current program portion;

g. execution request signalling means (27) for upon generation of a portion changing interrupt signal accessing said waiting list means with a program execution request signal for the program containing the program portion thus terminated and a waiting list accessing signal for interrogating a waiting list item from its own waiting list.
2. A system as claimed in claim 1 wherein each data processing device furthermore comprises third detector means (28) for detecting in a second type branch instruction a predetermined program portion number and thereupon nullifying the associated portion changing interrupt signal both at entering and at returning from said predetermined program portion.
3. A data processing device for use in a data processing system as claimed in claims 1 or 2, wherein said second detector means comprise comparing means for upon decoding of a second type branch instruction comparing the portion numbers of the current program portion and the program portion containing the destination address of the second type branch.
4. A data processing device for use in a data processing system as claimed in claim 1 or 2, wherein said second detector means comprise address translator means (18) for upon reception of a second type branch instruction translating a logical address into a physical address comprising a device number, and comparing means for comparing the latter device number to its own device number.
CA000397122A 1981-02-25 1982-02-25 Multiprocessor computer system with dynamical allocation of multiprocessing tasks and processor for use in such multiprocessor computer system Expired CA1171544A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR8103747A FR2500659B1 (en) 1981-02-25 1981-02-25 DEVICE FOR THE DYNAMIC ALLOCATION OF THE TASKS OF A MULTIPROCESSOR COMPUTER
FR8103747 1981-02-25

Publications (1)

Publication Number Publication Date
CA1171544A true CA1171544A (en) 1984-07-24

Family

ID=9255623

Family Applications (1)

Application Number Title Priority Date Filing Date
CA000397122A Expired CA1171544A (en) 1981-02-25 1982-02-25 Multiprocessor computer system with dynamical allocation of multiprocessing tasks and processor for use in such multiprocessor computer system

Country Status (6)

Country Link
US (1) US4459664A (en)
EP (1) EP0059018B1 (en)
JP (1) JPS57174746A (en)
CA (1) CA1171544A (en)
DE (1) DE3263600D1 (en)
FR (1) FR2500659B1 (en)

Families Citing this family (44)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS58222361A (en) * 1982-06-18 1983-12-24 Fujitsu Ltd Control system of priority decision for access request in data processing system
GB2175112B (en) * 1983-05-07 1987-06-10 Hitachi Ltd Sequence control method and apparatus
JPH0776932B2 (en) * 1984-12-07 1995-08-16 日本電気株式会社 Data transmission method
US4814982A (en) * 1984-12-24 1989-03-21 General Electric Company Reconfigurable, multiprocessor system with protected, multiple, memories
US4914570A (en) * 1986-09-15 1990-04-03 Counterpoint Computers, Inc. Process distribution and sharing system for multiple processor computer system
DE3639571A1 (en) * 1986-11-20 1988-06-01 Standard Elektrik Lorenz Ag METHOD AND CIRCUIT ARRANGEMENT FOR CHARGING A SECONDARY COMPUTER
JPH0738183B2 (en) * 1987-01-29 1995-04-26 日本電気株式会社 Communication processing method between central processing units
US5050070A (en) * 1988-02-29 1991-09-17 Convex Computer Corporation Multi-processor computer system having self-allocating processors
US5159686A (en) * 1988-02-29 1992-10-27 Convex Computer Corporation Multi-processor computer system having process-independent communication register addressing
US5128996A (en) * 1988-12-09 1992-07-07 The Exchange System Limited Partnership Multichannel data encryption device
DE68926043T2 (en) * 1989-07-20 1996-08-22 Toshiba Kawasaki Kk Multiprocessor computer system
US5163131A (en) * 1989-09-08 1992-11-10 Auspex Systems, Inc. Parallel i/o network file server architecture
JP2945757B2 (en) * 1989-09-08 1999-09-06 オースペックス システムズ インコーポレイテッド Multi-device operating system architecture.
US5584044A (en) * 1990-09-28 1996-12-10 Fuji Photo Film Co., Ltd. Integrated circuit memory card for write in/read out capability having plurality of latching means for expandable addressing using counting means for enabling latches thereof
US5282272A (en) * 1990-12-21 1994-01-25 Intel Corporation Interrupt distribution scheme for a computer bus
JPH05151178A (en) * 1991-11-26 1993-06-18 Toshiba Corp Distributed cooperative type problem solving device
US5278905A (en) * 1992-05-13 1994-01-11 At&T Bell Laboratories Method and apparatus for processor base encryption
US5414642A (en) * 1992-08-26 1995-05-09 Eastman Kodak Company Method and apparatus for combining data sets in a multiprocessor
US5301324A (en) * 1992-11-19 1994-04-05 International Business Machines Corp. Method and apparatus for dynamic work reassignment among asymmetric, coupled processors
US7174352B2 (en) 1993-06-03 2007-02-06 Network Appliance, Inc. File system image transfer
US6138126A (en) * 1995-05-31 2000-10-24 Network Appliance, Inc. Method for allocating files in a file system integrated with a raid disk sub-system
EP1003103B1 (en) * 1993-06-03 2008-10-01 Network Appliance, Inc. Write anywhere file-system layout method and apparatus
US6604118B2 (en) 1998-07-31 2003-08-05 Network Appliance, Inc. File system image transfer
JP3862274B2 (en) * 1993-06-03 2006-12-27 ネットワーク・アプライアンス・インコーポレイテッド File allocation method of file system integrated with RAID disk subsystem
WO1994029795A1 (en) * 1993-06-04 1994-12-22 Network Appliance Corporation A method for providing parity in a raid sub-system using a non-volatile memory
US5530897A (en) * 1993-10-01 1996-06-25 International Business Machines Corporation System for dynamic association of a variable number of device addresses with input/output devices to allow increased concurrent requests for access to the input/output devices
JP2000503154A (en) * 1996-01-11 2000-03-14 エムアールジェイ インコーポレイテッド System for controlling access and distribution of digital ownership
US6219741B1 (en) * 1997-12-10 2001-04-17 Intel Corporation Transactions supporting interrupt destination redirection and level triggered interrupt semantics
US6457130B2 (en) 1998-03-03 2002-09-24 Network Appliance, Inc. File access control in a multi-protocol file server
US6317844B1 (en) 1998-03-10 2001-11-13 Network Appliance, Inc. File server storage arrangement
US6343984B1 (en) 1998-11-30 2002-02-05 Network Appliance, Inc. Laminar flow duct cooling system
US6647408B1 (en) 1999-07-16 2003-11-11 Novell, Inc. Task distribution
EP1188294B1 (en) 1999-10-14 2008-03-26 Bluearc UK Limited Apparatus and method for hardware implementation or acceleration of operating system functions
US6874027B1 (en) 2000-04-07 2005-03-29 Network Appliance, Inc. Low-overhead threads in a high-concurrency system
US7178137B1 (en) 2001-04-05 2007-02-13 Network Appliance, Inc. Automatic verification of scheduling domain consistency
US7694302B1 (en) 2001-04-05 2010-04-06 Network Appliance, Inc. Symmetric multiprocessor synchronization using migrating scheduling domains
US7457822B1 (en) 2002-11-01 2008-11-25 Bluearc Uk Limited Apparatus and method for hardware-based file system
US8041735B1 (en) 2002-11-01 2011-10-18 Bluearc Uk Limited Distributed file system and method
US7373640B1 (en) 2003-07-31 2008-05-13 Network Appliance, Inc. Technique for dynamically restricting thread concurrency without rewriting thread code
US8171480B2 (en) * 2004-01-27 2012-05-01 Network Appliance, Inc. Method and apparatus for allocating shared resources to process domains according to current processor utilization in a shared resource processor
US8347293B2 (en) * 2005-10-20 2013-01-01 Network Appliance, Inc. Mutual exclusion domains to perform file system processes on stripes
JP4353990B2 (en) * 2007-05-18 2009-10-28 株式会社半導体理工学研究センター Multiprocessor controller
US8627331B1 (en) 2010-04-30 2014-01-07 Netapp, Inc. Multi-level parallelism of process execution in a mutual exclusion domain of a processing system
JP5775738B2 (en) * 2011-04-28 2015-09-09 富士通株式会社 Information processing apparatus, secure module, information processing method, and information processing program

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NL7016115A (en) * 1969-11-08 1971-05-11
US3997875A (en) * 1973-01-08 1976-12-14 U.S. Philips Corporation Computer configuration with claim cycles
US3825902A (en) * 1973-04-30 1974-07-23 Ibm Interlevel communication in multilevel priority interrupt system
GB1474385A (en) * 1973-12-14 1977-05-25 Int Computers Ltd Multiprocessor data processing systems
US4001783A (en) * 1975-03-26 1977-01-04 Honeywell Information Systems, Inc. Priority interrupt mechanism
US4047161A (en) * 1976-04-30 1977-09-06 International Business Machines Corporation Task management apparatus
US4189771A (en) * 1977-10-11 1980-02-19 International Business Machines Corporation Method and means for the detection of deadlock among waiting tasks in a multiprocessing, multiprogramming CPU environment
US4209839A (en) * 1978-06-16 1980-06-24 International Business Machines Corporation Shared synchronous memory multiprocessing arrangement
US4257097A (en) * 1978-12-11 1981-03-17 Bell Telephone Laboratories, Incorporated Multiprocessor system with demand assignable program paging stores
FR2474200B1 (en) * 1980-01-22 1986-05-16 Bull Sa METHOD AND DEVICE FOR ARBITRATION OF ACCESS CONFLICTS BETWEEN AN ASYNCHRONOUS QUERY AND A PROGRAM IN CRITICAL SECTION
US4414624A (en) * 1980-11-19 1983-11-08 The United States Of America As Represented By The Secretary Of The Navy Multiple-microcomputer processing
US4412285A (en) * 1981-04-01 1983-10-25 Teradata Corporation Multiprocessor intercommunication system and method
US4425618A (en) * 1981-11-23 1984-01-10 Bell Telephone Laboratories, Incorporated Method and apparatus for introducing program changes in program-controlled systems

Also Published As

Publication number Publication date
DE3263600D1 (en) 1985-06-27
EP0059018A1 (en) 1982-09-01
FR2500659A1 (en) 1982-08-27
US4459664A (en) 1984-07-10
FR2500659B1 (en) 1986-02-28
JPS57174746A (en) 1982-10-27
EP0059018B1 (en) 1985-05-22

Similar Documents

Publication Publication Date Title
CA1171544A (en) Multiprocessor computer system with dynamical allocation of multiprocessing tasks and processor for use in such multiprocessor computer system
US5251306A (en) Apparatus for controlling execution of a program in a computing device
US4794524A (en) Pipelined single chip microprocessor having on-chip cache and on-chip memory management unit
CA1204516A (en) Data flow type information processing stystem
US4229790A (en) Concurrent task and instruction processor and method
US5185868A (en) Apparatus having hierarchically arranged decoders concurrently decoding instructions and shifting instructions not ready for execution to vacant decoders higher in the hierarchy
EP0361176B1 (en) Method and apparatus for communicating data between multiple tasks in data processing systems
US5043873A (en) Method of parallel processing for avoiding competition control problems and data up dating problems common in shared memory systems
US4408274A (en) Memory protection system using capability registers
US3665404A (en) Multi-processor processing system having interprocessor interrupt apparatus
US4975836A (en) Virtual computer system
CA1305799C (en) Logical resource partitioning of a data processing system
US4648034A (en) Busy signal interface between master and slave processors in a computer system
EP0604471B1 (en) Multi-media signal processor computer system
CA1186064A (en) Mechanism for creating dependency free code for multiple processing elements
US4041462A (en) Data processing system featuring subroutine linkage operations using hardware controlled stacks
CA1103368A (en) Task handling apparatus for a computer system
US4829425A (en) Memory-based interagent communication mechanism
EP0213952A2 (en) Computer system for controlling virtual machines
US4725946A (en) P and V instructions for semaphore architecture in a multiprogramming/multiprocessing environment
US3654621A (en) Information processing system having means for dynamic memory address preparation
WO1987005417A1 (en) Instruction prefetch control apparatus
GB2166271A (en) Method of appointing an executive in a distributed processing system
US3680058A (en) Information processing system having free field storage for nested processes
JP2539352B2 (en) Hierarchical multi-computer system

Legal Events

Date Code Title Description
MKEC Expiry (correction)
MKEX Expiry