WO1992022030A1 - Interrupt driven, separately clocked, fault tolerant processor synchronization

Info

Publication number
WO1992022030A1
Authority
WO
WIPO (PCT)
Prior art keywords
microprocessors
microprocessor
program
fault tolerant
timing
Application number
PCT/US1992/004557
Other languages
French (fr)
Inventor
Jay R. Goetz
Original Assignee
Honeywell Inc.
Application filed by Honeywell Inc.
Priority to JP5500594A (JPH06508229A)
Publication of WO1992022030A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/1675 Temporal synchronisation or re-synchronisation of redundant processing components
    • G06F11/1691 Temporal synchronisation or re-synchronisation of redundant processing components using a quantum
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/18 Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
    • G06F11/183 Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits by voting, the voting not being performed by the redundant components
    • G06F11/184 Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits by voting, the voting not being performed by the redundant components where the redundant components implement processing functionality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/16 Error detection or correction of the data by redundancy in hardware
    • G06F11/1675 Temporal synchronisation or re-synchronisation of redundant processing components
    • G06F11/1679 Temporal synchronisation or re-synchronisation of redundant processing components at clock signal level

Abstract

An N-modular redundancy fault tolerant computing system comprises a plurality of microprocessors having their data outputs applied simultaneously to a voter. In accordance with this invention, the system can be event driven while still maintaining synchrony. This is accomplished by providing timing means for establishing a predetermined program execution interval comprised of a fixed number of clock cycles for the plurality of computer means of the N-modular fault tolerant system, the timing means assuring that all of said computer means achieve an identical machine state at the end of the predetermined program execution interval.

Description

INTERRUPT DRIVEN, SEPARATELY CLOCKED, FAULT TOLERANT PROCESSOR SYNCHRONIZATION
BACKGROUND OF THE INVENTION
I. Field of the Invention: This invention relates generally to highly reliable, fault tolerant digital data processing systems, and more particularly to an N-modular redundancy microprocessor system in which the plural processors may be event-driven but remain in synchronization.
II. Discussion of the Prior Art: It is well known in the digital computing arts to employ a multiplicity of redundant processors to achieve fault tolerant operation, i.e., a microprocessor system capable of error-free operation in spite of one or more hardware faults. The most common prior art fault tolerant architecture is referred to as N-modular redundancy, where N represents an odd number greater than one, typically three or five. The N identical processors are programmed to execute identical programs in synchrony in response to a common set of input signals (data). Fault tolerance is achieved by continuously voting on the output signals produced by each processor in a majority decision logic voter. The result of the voting circuit is guaranteed to be correct, provided a majority of the processors compute a correct result. Synchronization may be established on instruction boundaries or, alternatively, for each processor clock cycle. In either case, voting is typically performed for each instruction, thus requiring that all processors execute identical programs in lock step.
It is also well known in the digital computing arts to employ interrupts to perform what is known as "event-driven computing". Interrupts allow the processor to function more efficiently in real-time applications, such as inertial navigation and flight control. In such systems, the processor is responsive to input signals which occur in real time, i.e., asynchronously with respect to the processor clock. Interrupts are the means by which execution of an instant program sequence is temporarily suspended while an interrupt subroutine, often termed the service routine, processes the input data associated with the interrupt. The last step of any interrupt subroutine is the execution of a Return-From-Interrupt instruction such that execution of the instant program resumes exactly at the point where its suspension occurred. Typical microprocessor systems are responsive to a multiplicity of interrupts and provide circuitry to prioritize and selectively mask interrupts. The Type 8259 Programmable Interrupt Controller manufactured by the Intel Corporation may be considered typical. Any real-time program must be written to assure that any combination of asynchronous events which generate interrupt requests will result in an orderly execution of the associated interrupt subroutines. To accomplish this, the interrupt controller must periodically strobe or sample the state of the multiple interrupt request lines with a signal derived from the processor clock. The set of interrupt request samples is processed by a priority encoder to determine which of several simultaneous requests will be processed first. The interrupt controller then generates an interrupt signal to trigger execution of the associated service routine when the execution of the present instruction is complete.
Those skilled in the art can appreciate that it is not possible to assure the lock-step operation of plural redundant processors which is required for fault tolerant voting in accordance with the prior art when the processors are event-driven. When an interrupt request occurs simultaneously with the strobe signal, the results may be indeterminate. In spite of the best efforts to synchronize the processor clocks to one another and thereby synchronize interrupt request sampling, an event that one processor may resolve as "in time", i.e., occurring before the strobe signal, another processor may resolve as "too late", i.e., occurring after the strobe signal. The result is that the programs of the respective processors are interrupted at different points in the program and majority voting is no longer valid since synchronization is lost. In a like manner, two nearly simultaneous events may be serviced in one order in a first processor and in the reverse order in a second processor. This likewise invalidates the voting. Thus, the prior art use of N-modular redundancy generally precludes an event-driven processor architecture.
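The sampling hazard described above can be illustrated with a minimal sketch. It assumes a simplified model in which each processor samples the request line on its own strobe edges, with a small phase skew between processors. The function name `first_strobe_seeing` and all numeric values are hypothetical and chosen only for illustration.

```python
def first_strobe_seeing(event_time, strobe_period, phase):
    """Index of the first strobe edge at or after event_time.

    Hypothetical model: strobe edges occur at phase + k * strobe_period
    for k = 0, 1, 2, ... All times are integers in arbitrary units.
    """
    k = -(-(event_time - phase) // strobe_period)   # ceiling division
    return max(k, 0)

period = 1000    # strobe period (arbitrary units)
event = 3005     # asynchronous event arriving close to a strobe edge

# Processor A strobes at 0, 1000, 2000, ...; processor B is skewed by 10 units.
a = first_strobe_seeing(event, period, phase=0)
b = first_strobe_seeing(event, period, phase=10)
print(a, b)      # the two processors latch the event on different strobes
```

In this model processor B resolves the event as "in time" for its strobe 3, while processor A resolves it as "too late" and only sees it on strobe 4, so the two programs would be interrupted at different points.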
It is accordingly a principal object of the present invention to provide a means of processor synchronization which permits a redundant processor architecture to be both event driven and fault tolerant, yielding a system exhibiting high reliability and functional efficiency not heretofore found in the prior art.
SUMMARY OF THE INVENTION
In accordance with the present invention, the foregoing object is achieved by providing a fault tolerant computing system having a plurality of microprocessors, each programmed to execute the same stored program of instructions in synchrony in response to a common set of data input signals, where each of the plurality of microprocessors has its data output coupled in common to a majority decision logic voting means which operates to determine the extent of comparison of the data outputs of all of the processors. The system also includes a processor clock for each of the processors which produces timing pulses for timing the execution of the stored program by the microprocessors into clock cycles. In addition, timing means are included for each of the plural processors employed which function to establish a predetermined program execution interval comprised of a fixed and identical number of clock cycles for all of the microprocessors employed in the redundant system, the timing means assuring that all of the microprocessors achieve an identical machine state at the conclusion of the program execution interval.
While the present invention will be described as including discrete logic modules external to the microprocessor, those skilled in the art can appreciate that the invention may also be practiced in software executed by the microprocessors themselves.
DESCRIPTION OF THE DRAWINGS
The foregoing features, objects, and advantages of the invention will become apparent to those skilled in the art from the following detailed description of a preferred embodiment, especially when considered in conjunction with the accompanying drawings in which:
Figure 1 illustrates a system block diagram of a typical prior art, triple-modularly-redundant fault tolerant system;
Figure 2 is a logical block diagram illustrating the principles of the invention; and
Figure 3 is a program flow chart illustrative of the general program organization required to practice the instant invention.
DESCRIPTION OF THE PREFERRED EMBODIMENT
Figure 1 comprises a general block diagram of a prior art, N-modular redundancy, fault-tolerant architecture in which N equals three. It is to be understood that the program memory and data memory, which are not explicitly shown in Figure 1, may be part of the microprocessor or, optionally, may be added to the microprocessor bus. The three processors 10, 12 and 14 are maintained in synchronism by virtue of being driven by a common master oscillator 16 which is connected to provide the clock source for each of the microprocessors. System inputs are provided in common, via bus 18 and input/output interfaces 20, 22 and 24, to each of the plural microprocessors. Likewise, each of the plural microprocessors provides output data via the buses 26, 28 and 30 to voter logic 32 such that the system outputs on bus 34 correspond to majority agreement between the plural inputs to the voter.
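The majority decision performed by the voter logic can be sketched in software. The following is a minimal word-level illustration, not the circuit of Figure 1; the function name `majority_vote` and the sample values are assumptions made for the example.

```python
from collections import Counter

def majority_vote(outputs):
    """Return the value agreed on by a strict majority of the processors.

    Returns None when no value is produced by more than half of them.
    """
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None

print(majority_vote([0x5A, 0x5A, 0x5A]))   # all three processors agree
print(majority_vote([0x5A, 0x13, 0x5A]))   # one faulty output is masked
print(majority_vote([1, 2, 3]))            # no majority exists
```

With N equal to three, a single faulty processor is outvoted by the other two, which is the fault-masking property the architecture relies on.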
As mentioned above, voting is typically performed for each instruction, thus requiring all processors to execute identical programs in lock step. The prior art system of Figure 1 is incapable of operating in an event-driven mode.
Referring next to Figure 2, there is shown a single microprocessor 36 and the associated logic circuitry which will allow that microprocessor to be used along with other redundant microprocessors, each identical to that shown in Figure 2, in configuring an N-modular redundancy fault tolerant processor capable of operating in an event-driven mode. With no limitation intended, the microprocessors employed in the system may preferably comprise a RISC (Reduced Instruction Set Computer), Type 80960 microprocessor manufactured by the Intel Corporation. Such microprocessors sequentially execute a stored program in response to clock signals from a processor clock 38 provided that the "hold" signal on line 40 is low. Whenever the "hold" signal goes high, program execution is suspended. The remainder of the circuitry of Figure 2 functions to define a computational frame, i.e., a specified period of time during which a specified number of clock cycles are executed. An essential feature of the invention is that all of the plural microprocessors in the N-modular redundant system are guaranteed to execute the identical number of clock cycles in each computational frame, even though the frequency of their individual processor clocks may vary within a practical tolerance.
At the end of each computational frame, set/reset flip-flop 42 is set and the microprocessor 36 is put in the "hold" mode. The start of a computational frame is established by a cyclic interrupt signal on line 44 which is applied simultaneously to the interrupt input terminal (INT) of each of the redundant microprocessors 36 employed in the system from a master timing source (not shown). The assertion of the cyclic interrupt on line 44 causes three concurrent actions: (1) a set/reset flip-flop 46 is set, thus holding the presettable counter 48 in its "cleared" state; (2) flip-flop 42 is reset, thus releasing the "hold" on microprocessor 36; and (3) an interrupt is initiated in microprocessor 36.
The microprocessor 36 and its associated program can be considered to be a state machine, albeit a very complex one. This means that the results produced by the microprocessor are deterministic and that multiple microprocessors executing identical programs will produce identical results in response to a common interrupt signal, provided that all microprocessors have been placed in the "hold" mode at the same program address. Microprocessor 36 will enter its interrupt service routine a predetermined number of clock cycles after the occurrence of the cyclic interrupt. A predetermined number of clock pulses later, an output instruction is executed to set output latch 50. Specifically, I/O control 52 has the structure typical of conventional memory mapped I/O. Thus, a write instruction of a predetermined data pattern to a predetermined address, via lines 54, 56 and 58, will set data line 60 to the logical "1" state and will generate a clock pulse on I/O write line 62. A subsequent write instruction is used to reset the output latch 50. When that latch is set, the load signal on line 64 causes a predetermined binary value established by a switch register 66 to be "jammed" into the presettable down counter 48. When output latch 50 has been reset, the counter 48 begins to decrement with each processor clock cycle. When counter 48 reaches a count of 0, the next positive transition of the processor clock 38 generates a carry-out signal (CY) on line 68 which functions to set the flip-flop 42, thereby synchronously asserting the "hold" signal for the microprocessor 36. Program execution is suspended at this point until the next cyclic interrupt on line 44 releases the microprocessor 36 from its "hold" mode.
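The frame timing produced by the counter and hold logic can be approximated by a short simulation. This is a simplified sketch under stated assumptions: the processor is released at time zero by the cyclic interrupt, the counter starts from its preset value on the cycle in which the output latch is reset, and "hold" is asserted when the count reaches zero. The function name `run_frame`, the time units, and the scaled-down numbers are hypothetical.

```python
def run_frame(clock_period, frame_period, preset, latch_reset_cycle):
    """Simulate one computational frame for a single processor.

    Returns (cycles_executed, hold_time). Times are integers in
    arbitrary units. Simplified model, not the exact Figure 2 circuit.
    """
    cycles = 0
    counter = None                       # down counter not yet running
    while True:
        cycles += 1                      # one processor clock cycle executes
        if cycles == latch_reset_cycle:
            counter = preset             # latch reset: counting starts here
        elif counter is not None:
            counter -= 1                 # decrement once per clock cycle
            if counter == 0:
                break                    # carry-out sets the hold flip-flop
    hold_time = cycles * clock_period
    if hold_time >= frame_period:
        raise RuntimeError("hold not reached before the next cyclic interrupt")
    return cycles, hold_time

# Two processors whose clocks differ by +/- 0.1 % (scaled-down example).
fast = run_frame(clock_period=99_900, frame_period=100_000_000,
                 preset=980, latch_reset_cycle=16)
slow = run_frame(clock_period=100_100, frame_period=100_000_000,
                 preset=980, latch_reset_cycle=16)
print(fast, slow)
```

Both processors execute the same number of cycles (latch reset delay plus preset) and differ only in the instant at which they reach "hold"; the faster one simply waits longer for the next cyclic interrupt.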
It can be seen, then, that the aforementioned circuitry provides a means to execute a predefined number of clock cycles in response to each cyclic interrupt. For example, consider a typical system employing a 10 MHz clock with a frequency tolerance of plus or minus 0.005% and a computational frame period of 10 milliseconds, i.e., nominally 100,000 clock cycles per frame. The slowest possible processor clock completes 99995 cycles within the frame period, so a microprocessor running with that clock could reliably execute 99993 clock cycles and yet be assured of reaching the "hold" state before the next cyclic interrupt takes place. Assuming that output latch 50 is reset 16 clock cycles after the cyclic interrupt, a binary value of 11000011010001001 (99977 decimal) will accomplish this result. The hold interval at the end of each computational frame allows the slowest microprocessor an opportunity to catch up with the fastest microprocessor so that the next computational frame starts in synchronization. Within a given frame, the divergence from synchronization is limited to only a few clock cycles, five in the above example, by the processor clock frequency tolerance. This divergence is readily tolerated by the voting circuitry provided that each output state to be voted is stable for more than five clock cycles and voting is not performed until five clock cycles have elapsed after any state transition at the voter inputs. Voting is thus performed within a window of time when the microprocessors are outputting corresponding data in spite of a slight synchronization divergence.
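The arithmetic of this example can be checked with a few lines of code. The sketch assumes a 10 MHz clock and a 10 ms frame, which is the reading under which the quoted values 99993 and 99977 are mutually consistent; the variable names are illustrative.

```python
from fractions import Fraction

clock_hz = 10_000_000                 # assumed nominal clock: 10 MHz
frame_s = Fraction(10, 1000)          # assumed frame period: 10 ms
tolerance = Fraction(5, 100_000)      # plus or minus 0.005 %

nominal = clock_hz * frame_s          # cycles per frame at nominal frequency
slowest = nominal * (1 - tolerance)   # cycles the slowest clock completes
drift = nominal - slowest             # divergence from nominal, in cycles

frame_cycles = 99993                  # cycles executed per frame (from the text)
latch_reset_delay = 16                # cycles before the counter starts
preset = frame_cycles - latch_reset_delay

print(nominal, slowest, drift)
print(preset, format(preset, "b"))
```

The chosen 99993 cycles leaves two cycles of margin below the 99995 cycles that the slowest clock can complete, and the preset value reproduces the binary pattern set on the switch register.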
From the foregoing discussion, it is clear that synchronization is maintained within an acceptable tolerance when the cyclic interrupt is the only asynchronous stimulus. Further, it is possible to maintain such synchronization with any number of additional interrupts provided the following constraints are observed: (1) all microprocessors must sense the identical set of interrupts within a computational frame, (2) servicing of all interrupts sensed in a given computational frame must complete within that frame, and (3) a common order of interrupt servicing must be enforced or, alternatively, the microprocessor outputs must be independent of the order of interrupt servicing. The system can tolerate the additional divergence temporarily caused by event interrupts, recognizing that even though each microprocessor may thread the program differently while servicing the interrupts, each path involves an identical number of clock cycles so that the system will reconverge within the processor clock tolerances when interrupt servicing is complete. Hence, each computational frame still ends with all microprocessors in an identical state, i.e., in a hold mode. The contents of data memory and CPU registers, particularly the program address register, will be identical. The above constraints are satisfied by controlling the time when interrupt requests are sampled and by organizing the program into partitions which can be assured to complete within a single computational frame. Preferably, all event interrupt requests should be resynchronized by a clock which is a coherent multiple of the cyclic interrupt. The common set of resynchronized interrupt requests is provided to the interrupt controller associated with each microprocessor. This establishes a set of windows during which interrupts can occur for each computational frame. The frequency of this clock should be chosen such that a sufficiently high sampling rate is achieved while at the same time assuring that the latest interrupt in any computational frame will be serviced within that frame.
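The effect of resynchronizing event requests to a common clock can be sketched as follows. The example assumes that a request is presented to every interrupt controller at the first resynchronizing clock edge at or after its arrival, and that the frame period is an exact multiple of the window length. The function name `resync_slot` and the numbers are hypothetical.

```python
def resync_slot(event_time, frame_period, windows_per_frame):
    """Map an asynchronous event time to a (frame, window) slot.

    The resynchronizing clock divides each frame into windows_per_frame
    equal windows (a coherent multiple of the cyclic interrupt). Times
    are integers in arbitrary units.
    """
    window = frame_period // windows_per_frame
    edge = -(-event_time // window)            # next window edge (ceiling)
    return divmod(edge, windows_per_frame)     # (frame index, window index)

print(resync_slot(3005, 10_000, 10))   # presented at window 4 of frame 0
print(resync_slot(3000, 10_000, 10))   # exactly on an edge: window 3
print(resync_slot(9500, 10_000, 10))   # presented at the start of frame 1
```

Because the slot is determined once by the common clock rather than by each processor's own strobe, every microprocessor is handed the same request in the same window, which addresses constraint (1).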
It can be seen then that timing means are provided for establishing a predetermined program execution interval comprised of a fixed number of clock cycles for the plurality of computer means of the N-modular fault tolerant system, the timing means assuring that all of said computer means achieve an identical machine state at the end of the predetermined program execution interval.
Figure 3 is a generalized program flow diagram illustrating how any program which is concurrently executed on an N-modular set of microprocessors may be organized to satisfy the aforementioned constraints. At the initial event of a "Power-on Reset" (block 70), interrupts are disabled and the microprocessors begin unsynchronized program execution. Each processor executes an initialization routine (block 72) to set its registers and data memory to identical predetermined values. Next, a Wait flag is set (block 74), which will only be reset when servicing a cyclic interrupt. Next, the cyclic interrupt is enabled (block 76), following which the microprocessors enter a tight loop waiting for the Wait flag to be reset. This provides the required initial synchronization. The next cyclic interrupt causes program execution to switch to the cyclic interrupt routine indicated generally by numeral 78, the first step of which (block 80) is to load the presettable down counter 48 in the manner previously described.
When all other interrupt processing is complete (block 82), a programmable counter indicative of the number of clock cycles per computational frame is set (block 84), the Wait flag is reset (block 86) and a "Return-from-Interrupt" instruction is executed (block 88) to jump back to the tight loop in which the interrupt occurred. With the Wait flag now reset, execution falls through to execute program partition 1 (block 90). The last program step of any program partition is to decrement the computational frame counter by the number of clock cycles used in the execution of the program partition. Next, a test is performed to determine if the next scheduled program partition can be completely executed within the present computational frame (block 92). If the program partition can complete, the program branches to the next program partition (block 94). Otherwise, the Wait flag is set (block 96) to delay the execution of program partition 2 until the next computational frame.
This simple example shows how a program may be organized to assure that all processors complete each computational frame in the identical machine state using a fixed rotational scheme to schedule program partitions. A more versatile scheme would be to select, from the set of program partitions which are pending execution, the longest program partition which can complete within the present computational frame, and to set the Wait flag only when no pending program partition can still complete. Although the preferred embodiment shows the computational frame timing means to be external to microprocessor 36, some or all of this circuitry may be included in the on-chip circuitry of the microprocessor itself without departing from the spirit of this invention.
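The flow of blocks 84 through 96 can be sketched as a small simulation. This is only an illustration under stated assumptions: the function name `run_frames`, the list of per-partition cycle counts, and the wrap-around from the last partition back to the first are hypothetical, and interrupt servicing overhead is ignored.

```python
def run_frames(partition_cycles, frame_cycles, num_frames):
    """Simulate fixed-rotation scheduling of program partitions.

    partition_cycles: clock cycles used by each partition (assumed known).
    Returns, per frame, the list of partition indices executed in it.
    """
    if any(c > frame_cycles for c in partition_cycles):
        raise ValueError("every partition must fit in one computational frame")

    schedule = []
    nxt = 0  # index of the next scheduled partition
    for _ in range(num_frames):
        remaining = frame_cycles  # block 84: frame counter is set
        wait = False              # block 86: Wait flag is reset
        ran = []
        while not wait:
            # Blocks 90/94: execute the scheduled partition, then decrement
            # the frame counter by the cycles that partition used.
            remaining -= partition_cycles[nxt]
            ran.append(nxt)
            nxt = (nxt + 1) % len(partition_cycles)  # assumed wrap-around
            # Block 92: can the next partition complete in this frame?
            if partition_cycles[nxt] > remaining:
                wait = True       # block 96: hold until the next frame
        schedule.append(ran)
    return schedule


if __name__ == "__main__":
    # Three partitions of 400, 300 and 500 cycles in 1000-cycle frames.
    print(run_frames([400, 300, 500], 1000, 3))
    # prints [[0, 1], [2, 0], [1, 2]]
```

Because the decision at block 92 depends only on the counter value and the known partition lengths, every processor running this logic makes the same choice in the same frame, which is the property the flow diagram relies on.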
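The more versatile selection rule can be sketched in the same spirit. The names `pick_longest_fit` and `run_frame`, the dictionary of pending partitions, and the alphabetical tie-break are assumptions made here; the tie-break is included only because every processor must make the same choice, and the specification does not say how ties are resolved.

```python
def pick_longest_fit(pending, remaining):
    """Return the name of the longest pending partition that still fits.

    pending maps a partition name to its cycle count (hypothetical data).
    Returns None when nothing fits, i.e. when the Wait flag would be set.
    """
    fits = [(cycles, name) for name, cycles in pending.items()
            if cycles <= remaining]
    if not fits:
        return None
    # Longest first; ties broken by name so all processors agree.
    fits.sort(key=lambda item: (-item[0], item[1]))
    return fits[0][1]


def run_frame(pending, frame_cycles):
    """Run one computational frame; return (order, leftover, unused cycles)."""
    remaining = frame_cycles
    leftover = dict(pending)  # copy so the caller's mapping is untouched
    order = []
    while True:
        choice = pick_longest_fit(leftover, remaining)
        if choice is None:
            break  # no pending partition can complete: set the Wait flag
        remaining -= leftover.pop(choice)
        order.append(choice)
    return order, leftover, remaining


if __name__ == "__main__":
    pending = {"A": 400, "B": 300, "C": 500, "D": 250}
    print(run_frame(pending, 1000))
    # prints (['C', 'A'], {'B': 300, 'D': 250}, 100)
```

In this example the greedy rule leaves 100 cycles unused; whether a different selection rule would pack the frame better is outside what the text describes.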
This invention has been described herein in considerable detail in order to comply with the Patent Statutes and to provide those skilled in the art with the information needed to apply the novel principles and to construct and use such specialized components as are required. However, it is to be understood that the invention can be carried out by specifically different equipment and devices, and that various modifications, both as to the equipment details and operating procedures, can be accomplished without departing from the scope of the invention itself. What is claimed is:

Claims

1. A fault tolerant computing system comprising, in combination:
(a) a plurality of microprocessors, each programmed to execute the same stored program of instructions in synchrony in response to a common set of data input signals, each of said plurality of microprocessors having its data output coupled in common to majority decision logic means, said majority decision logic means determining the extent of comparison of the data outputs of all of said plurality of microprocessors;
(b) processor clock means individually associated with each of said plurality of microprocessors for producing timing pulses for timing the execution of said stored program of instructions by its microprocessor into clock cycles; and
(c) timing means associated with each of said plurality of microprocessors for establishing a predetermined program execution interval comprised of a fixed and identical number of said clock cycles for all of said plurality of microprocessors, said timing means assuring that all of said plurality of microprocessors achieve identical machine states at the conclusion of said program execution interval.
2. The fault tolerant computing system as in Claim 1 wherein said timing means for each of said microprocessors comprises:
(a) clock pulse counting means coupled to receive said timing pulses for producing a first control signal when the count value accumulated therein reaches a predetermined value;
(b) means controlled by the execution of said program of instructions by said microprocessor for loading said clock pulse counting means with a predetermined initial value;
(c) control means operatively associated with said clock pulse counting means and with said microprocessor and responsive to the receipt of a cyclic interrupt signal applied to all of said microprocessors simultaneously for initializing said clock pulse counting means prior to enabling said timing pulses from said processor clock to increment or decrement said initial value until said predetermined value is reached to produce said first control signal, said first control signal being applied to said control means for suspending execution of said stored program by said microprocessor until a next cyclic interrupt signal is applied to said microprocessor, said program execution interval being the interval between the occurrence of said cyclic interrupt signal and the production of said first control signal by said clock pulse counting means.
3. The fault tolerant computing system as in Claim 1 wherein said timing means is internal to said plurality of microprocessors.
4. The fault tolerant computing system as in Claim 1 wherein said timing means includes circuit means external to said plurality of microprocessors.
PCT/US1992/004557 1991-06-06 1992-06-02 Interrupt driven, separately clocked, fault tolerant processor synchronization WO1992022030A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP5500594A JPH06508229A (en) 1991-06-06 1992-06-02 Synchronization of fault-tolerant processors using interrupt-driven discrete clock schemes

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US07/711,638 US5233615A (en) 1991-06-06 1991-06-06 Interrupt driven, separately clocked, fault tolerant processor synchronization
US711,638 1991-06-06

Publications (1)

Publication Number Publication Date
WO1992022030A1 true WO1992022030A1 (en) 1992-12-10

Family

ID=24858906

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1992/004557 WO1992022030A1 (en) 1991-06-06 1992-06-02 Interrupt driven, separately clocked, fault tolerant processor synchronization

Country Status (4)

Country Link
US (1) US5233615A (en)
JP (1) JPH06508229A (en)
CA (1) CA2107083A1 (en)
WO (1) WO1992022030A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1995006277A2 (en) * 1993-08-18 1995-03-02 Honeywell Inc. Separately clocked processor synchronization improvement
US5613127A (en) * 1992-08-17 1997-03-18 Honeywell Inc. Separately clocked processor synchronization improvement
WO1999036847A2 (en) * 1998-01-20 1999-07-22 Alliedsignal Inc. Fault tolerant computing system using instruction counting
EP0972244A1 (en) * 1997-04-02 2000-01-19 General Dynamics Information Systems, Inc. Fault tolerant computer system
EP0980546A1 (en) * 1997-05-07 2000-02-23 General Dynamics Information Systems, Inc. Non-intrusive power control for computer systems

Families Citing this family (63)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DK0543825T3 (en) * 1990-08-14 1995-03-20 Siemens Ag Device for interrupt distribution in a multi-computer system
US5455935A (en) * 1991-05-31 1995-10-03 Tandem Computers Incorporated Clock synchronization system
US5537655A (en) * 1992-09-28 1996-07-16 The Boeing Company Synchronized fault tolerant reset
US5379415A (en) * 1992-09-29 1995-01-03 Zitel Corporation Fault tolerant memory system
JPH0773059A (en) * 1993-03-02 1995-03-17 Tandem Comput Inc Fault-tolerant computer system
DE59302826D1 (en) * 1993-03-16 1996-07-11 Siemens Ag Synchronization procedure for automation systems
US5758058A (en) * 1993-03-31 1998-05-26 Intel Corporation Apparatus and method for initializing a master/checker fault detecting microprocessor
US5548797A (en) * 1994-10-03 1996-08-20 International Business Machines Corporation Digital clock pulse positioning circuit for delaying a signal input by a fist time duration and a second time duration to provide a positioned clock signal
US5680408A (en) * 1994-12-28 1997-10-21 Intel Corporation Method and apparatus for determining a value of a majority of operands
US5915082A (en) * 1996-06-07 1999-06-22 Lockheed Martin Corporation Error detection and fault isolation for lockstep processor systems
EP0825506B1 (en) 1996-08-20 2013-03-06 Invensys Systems, Inc. Methods and apparatus for remote process control
US6691183B1 (en) 1998-05-20 2004-02-10 Invensys Systems, Inc. Second transfer logic causing a first transfer logic to check a data ready bit prior to each of multibit transfer of a continous transfer operation
US7089530B1 (en) 1999-05-17 2006-08-08 Invensys Systems, Inc. Process control configuration system with connection validation and configuration
US7096465B1 (en) 1999-05-17 2006-08-22 Invensys Systems, Inc. Process control configuration system with parameterized objects
US6754885B1 (en) 1999-05-17 2004-06-22 Invensys Systems, Inc. Methods and apparatus for controlling object appearance in a process control configuration system
WO2000070417A1 (en) 1999-05-17 2000-11-23 The Foxboro Company Process control configuration system with parameterized objects
US7272815B1 (en) 1999-05-17 2007-09-18 Invensys Systems, Inc. Methods and apparatus for control configuration with versioning, security, composite blocks, edit selection, object swapping, formulaic values and other aspects
US7043728B1 (en) 1999-06-08 2006-05-09 Invensys Systems, Inc. Methods and apparatus for fault-detecting and fault-tolerant process control
US6788980B1 (en) 1999-06-11 2004-09-07 Invensys Systems, Inc. Methods and apparatus for control using control devices that provide a virtual machine environment and that communicate via an IP network
US6501995B1 (en) 1999-06-30 2002-12-31 The Foxboro Company Process control system and method with improved distribution, installation and validation of components
AU6615600A (en) 1999-07-29 2001-02-19 Foxboro Company, The Methods and apparatus for object-based process control
US6473660B1 (en) 1999-12-03 2002-10-29 The Foxboro Company Process control system and method with automatic fault avoidance
US6779128B1 (en) 2000-02-18 2004-08-17 Invensys Systems, Inc. Fault-tolerant data transfer
GB2370380B (en) 2000-12-19 2003-12-31 Picochip Designs Ltd Processor architecture
US20030217054A1 (en) 2002-04-15 2003-11-20 Bachman George E. Methods and apparatus for process, factory-floor, environmental, computer aided manufacturing-based or other control system with real-time data distribution
EP1398699A1 (en) * 2002-09-12 2004-03-17 Siemens Aktiengesellschaft Method for synchronizing events, in particular for fault-tolerant systems
GB2396446B (en) * 2002-12-20 2005-11-16 Picochip Designs Ltd Array synchronization
US7761923B2 (en) 2004-03-01 2010-07-20 Invensys Systems, Inc. Process control methods and apparatus for intrusion detection, protection and network hardening
KR101017444B1 (en) 2004-10-25 2011-02-25 로베르트 보쉬 게엠베하 Method and device for mode switching and signal comparison in a computer system comprising at least two processing units
US7236005B1 (en) 2005-02-09 2007-06-26 Intel Corporation Majority voter circuit design
US7346793B2 (en) * 2005-02-10 2008-03-18 Northrop Grumman Corporation Synchronization of multiple operational flight programs
US7426614B2 (en) * 2005-04-28 2008-09-16 Hewlett-Packard Development Company, L.P. Method and system of executing duplicate copies of a program in lock step
US7730350B2 (en) * 2005-04-28 2010-06-01 Hewlett-Packard Development Company, L.P. Method and system of determining the execution point of programs executed in lock step
DE102005037230A1 (en) * 2005-08-08 2007-02-15 Robert Bosch Gmbh Method and device for monitoring functions of a computer system
US7860857B2 (en) * 2006-03-30 2010-12-28 Invensys Systems, Inc. Digital data processing apparatus and methods for improving plant performance
US7549085B2 (en) * 2006-04-28 2009-06-16 Hewlett-Packard Development Company, L.P. Method and apparatus to insert special instruction
US20080270017A1 (en) * 2007-04-26 2008-10-30 Saks Steven L Asynchronous inertial navigation system
US7966538B2 (en) * 2007-10-18 2011-06-21 The Regents Of The University Of Michigan Microprocessor and method for detecting faults therein
GB2454865B (en) * 2007-11-05 2012-06-13 Picochip Designs Ltd Power control
US8200947B1 (en) * 2008-03-24 2012-06-12 Nvidia Corporation Systems and methods for voting among parallel threads
US7996714B2 (en) * 2008-04-14 2011-08-09 Charles Stark Draper Laboratory, Inc. Systems and methods for redundancy management in fault tolerant computing
US8010846B1 (en) 2008-04-30 2011-08-30 Honeywell International Inc. Scalable self-checking processing platform including processors executing both coupled and uncoupled applications within a frame
RU2495476C2 (en) 2008-06-20 2013-10-10 Инвенсис Системз, Инк. Systems and methods for immersive interaction with actual and/or simulated facilities for process, environmental and industrial control
JP5481889B2 (en) * 2009-03-11 2014-04-23 日本電気株式会社 Fault tolerant computer, synchronous control method thereof and computer program
GB2470037B (en) 2009-05-07 2013-07-10 Picochip Designs Ltd Methods and devices for reducing interference in an uplink
US8127060B2 (en) 2009-05-29 2012-02-28 Invensys Systems, Inc Methods and apparatus for control configuration with control objects that are fieldbus protocol-aware
US8463964B2 (en) 2009-05-29 2013-06-11 Invensys Systems, Inc. Methods and apparatus for control configuration with enhanced change-tracking
GB2470891B (en) 2009-06-05 2013-11-27 Picochip Designs Ltd A method and device in a communication network
GB2470771B (en) 2009-06-05 2012-07-18 Picochip Designs Ltd A method and device in a communication network
US8564616B1 (en) 2009-07-17 2013-10-22 Nvidia Corporation Cull before vertex attribute fetch and vertex lighting
US8542247B1 (en) 2009-07-17 2013-09-24 Nvidia Corporation Cull before vertex attribute fetch and vertex lighting
GB2474071B (en) 2009-10-05 2013-08-07 Picochip Designs Ltd Femtocell base station
US8384736B1 (en) 2009-10-14 2013-02-26 Nvidia Corporation Generating clip state for a batch of vertices
US8976195B1 (en) 2009-10-14 2015-03-10 Nvidia Corporation Generating clip state for a batch of vertices
GB2482869B (en) 2010-08-16 2013-11-06 Picochip Designs Ltd Femtocell access control
GB2489716B (en) 2011-04-05 2015-06-24 Intel Corp Multimode base system
GB2489919B (en) 2011-04-05 2018-02-14 Intel Corp Filter
GB2491098B (en) 2011-05-16 2015-05-20 Intel Corp Accessing a base station
US9342358B2 (en) 2012-09-14 2016-05-17 General Electric Company System and method for synchronizing processor instruction execution
US9256426B2 (en) 2012-09-14 2016-02-09 General Electric Company Controlling total number of instructions executed to a desired number after iterations of monitoring for successively less number of instructions until a predetermined time period elapse
US9384858B2 (en) 2014-11-21 2016-07-05 Wisconsin Alumni Research Foundation Computer system predicting memory failure
US10089194B2 (en) * 2016-06-08 2018-10-02 Qualcomm Incorporated System and method for false pass detection in lockstep dual core or triple modular redundancy (TMR) systems
CN113110124B (en) * 2021-03-11 2022-08-19 上海新时达电气股份有限公司 double-MCU control method and control system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4733353A (en) * 1985-12-13 1988-03-22 General Electric Company Frame synchronization of multiply redundant computers
EP0372580A2 (en) * 1987-11-09 1990-06-13 Tandem Computers Incorporated Synchronization of fault-tolerant computer system having multiple processors
US4937741A (en) * 1988-04-28 1990-06-26 The Charles Stark Draper Laboratory, Inc. Synchronization of fault-tolerant parallel processing systems

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4497059A (en) * 1982-04-28 1985-01-29 The Charles Stark Draper Laboratory, Inc. Multi-channel redundant processing systems
US4965717A (en) * 1988-12-09 1990-10-23 Tandem Computers Incorporated Multiple processor system having shared memory with private-write capability

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
15th ANNUAL SYMPOSIUM ON FAULT-TOLERANT COMPUTING, June 19-21, 1985, pages 246-251, IEEE, New York, US; T. YONEDA et al.: 'Implementation of interrupt handler for loosely-synchronized TMR systems' *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5613127A (en) * 1992-08-17 1997-03-18 Honeywell Inc. Separately clocked processor synchronization improvement
WO1995006277A2 (en) * 1993-08-18 1995-03-02 Honeywell Inc. Separately clocked processor synchronization improvement
WO1995006277A3 (en) * 1993-08-18 1995-06-08 Honeywell Inc Separately clocked processor synchronization improvement
EP0972244A1 (en) * 1997-04-02 2000-01-19 General Dynamics Information Systems, Inc. Fault tolerant computer system
EP0972244A4 (en) * 1997-04-02 2000-11-15 Gen Dynamics Inf Systems Inc Fault tolerant computer system
EP0980546A1 (en) * 1997-05-07 2000-02-23 General Dynamics Information Systems, Inc. Non-intrusive power control for computer systems
EP0980546A4 (en) * 1997-05-07 2000-11-15 Gen Dynamics Inf Systems Inc Non-intrusive power control for computer systems
WO1999036847A2 (en) * 1998-01-20 1999-07-22 Alliedsignal Inc. Fault tolerant computing system using instruction counting
WO1999036847A3 (en) * 1998-01-20 2000-12-28 Allied Signal Inc Fault tolerant computing system using instruction counting
US6374364B1 (en) 1998-01-20 2002-04-16 Honeywell International, Inc. Fault tolerant computing system using instruction counting

Also Published As

Publication number Publication date
JPH06508229A (en) 1994-09-14
US5233615A (en) 1993-08-03
CA2107083A1 (en) 1992-12-07

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): CA JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IT LU MC NL SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2107083

Country of ref document: CA

122 Ep: pct application non-entry in european phase