WO1999036847A2 - Fault tolerant computing system using instruction counting - Google Patents
- Publication number
- WO1999036847A2 (application PCT/US1999/001221)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- microprocessors
- application program
- application
- data
- execution
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1675—Temporal synchronisation or re-synchronisation of redundant processing components
- G06F11/1683—Temporal synchronisation or re-synchronisation of redundant processing components at instruction level
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/0703—Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
- G06F11/0751—Error or fault detection not based on redundancy
- G06F11/0754—Error or fault detection not based on redundancy by exceeding limits
- G06F11/076—Error or fault detection not based on redundancy by exceeding limits by exceeding a count or rate limit, e.g. word- or bit count limit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/1675—Temporal synchronisation or re-synchronisation of redundant processing components
- G06F11/1691—Temporal synchronisation or re-synchronisation of redundant processing components using a quantum
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/18—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits
- G06F11/182—Error detection or correction of the data by redundancy in hardware using passive fault-masking of the redundant circuits based on mutual exchange of the output between redundant processing components
Definitions
- the invention relates to fault tolerant computing systems and in particular to fault tolerant systems where application programs are synchronized at the processor level.
- a central computer, which may include multiple processors for redundancy, receives via various input/output (I/O) modules several types of flight data useful for anticipating and warning of hazardous flight conditions.
- I/O input/output
- Such information may include but is not limited to: barometric altitude, radio altitude, roll and pitch, airspeed, flap setting, gear position, and navigation data. This information is communicated to the central computer via a data bus.
- the bus includes four data lines and has a pair of Bus Interface Units ("BIU") for each processor or node on the data system, where each BIU is connected to two data lines in the bus. Data is transferred according to a time schedule contained in a table memory associated with each BIU.
- the tables define the length of time windows on the bus and contain the source and destination addresses in the processor memory for each message transmitted on the bus.
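The table-driven transfer described above can be pictured with a minimal Python sketch. The field names, window lengths, and addresses below are hypothetical values chosen for illustration; the source only says that the tables hold window lengths and source and destination addresses.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Window:
    length_us: int   # duration of the time window on the bus (hypothetical unit)
    src_addr: int    # source address in the sending processor's memory
    dst_addr: int    # destination address in the receiving processor's memory

# Hypothetical schedule table, standing in for a BIU's table memory.
SCHEDULE = [
    Window(length_us=50, src_addr=0x1000, dst_addr=0x2000),
    Window(length_us=30, src_addr=0x1040, dst_addr=0x2040),
    Window(length_us=20, src_addr=0x1080, dst_addr=0x2080),
]

def window_at(t_us: int, schedule=SCHEDULE) -> Window:
    """Return the window active at time t_us; the schedule repeats cyclically."""
    t = t_us % sum(w.length_us for w in schedule)
    for w in schedule:
        if t < w.length_us:
            return w
        t -= w.length_us
    raise AssertionError("unreachable: t always falls inside the cycle")
```

Because every node holds the same table, each one can work out locally which message owns the bus at any instant, without arbitration.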
- These types of systems also use for some applications two processors that operate in a lock-step arrangement with additional logic provided to cross-compare the activity of the two processors.
- the two processors, each with its own memory, execute identical copies of a software application in exact synchrony. This approach usually requires that the two processors be driven by synchronized clock signals.
- This invention provides a way of using hardware facilities that are part of commercially available microprocessors together with control software to implement a fault tolerant computing system.
- a robust fault-tolerant computing system can be built.
- Application software that executes on the system can remain simple because it does not need to be aware of the measures taken to achieve the fault-tolerant characteristics of the system, that is, no special redundancy management code is built into application code.
- the redundancy management code is entirely at the operating system level.
- the application software does not need to adhere to restrictive design rules to allow the system's fault detection and containment mechanisms to work.
- the invention thus provides a way to separate the concerns of fault tolerance mechanisms and application logic. This makes it much easier and therefore less expensive to build robust fault tolerant computing systems.
- This invention has commercial value because it allows strong and robust fault tolerant computing systems to be built with low cost commercial off the shelf components. For example, by using counters or event monitors that are built into the microprocessor chips, it is possible to count the instructions being executed in an application program so as to cause the application programs to execute in congruent frames. Therefore, systems built using this technology will have a substantial advantage over systems built with fault tolerant architectures that require custom electronics, custom integrated circuits or tricky and expensive application software design techniques.
- This invention is also valuable because it uses technology that will be enhanced and extended as part of the natural growth path of microprocessor technology. Future microprocessors and microcontrollers now on the drawing boards having hardware that can be used to monitor the execution of application programs will almost certainly increase the advantage of this approach to fault tolerant systems.
- Fig. 1 is a block diagram of a fault tolerant computing system according to the invention;
- Fig. 2 is a logic diagram of a data input function of the system of Fig. 1 according to the invention;
- Fig. 3 is a logic diagram illustrating data verification in the system of Fig. 1; and
- Fig. 4 is a timing diagram illustrating loose synchronization of microprocessors according to the invention.
- This microcontroller includes a 32-bit microprocessor which is a variant of the IBM/Motorola PowerPC architecture together with a second processor called the
- the component also includes eight serial I/O adapters integrated on-chip with the processor; of these, four are of similar type and are referred to as Serial Communications Controllers (SCCs).
- SCCs Serial Communications Controllers
- the PowerPC part of the component includes a Memory Management Unit (MMU) that supports page level memory allocation and relocation.
- MMU Memory Management Unit
- the MMU is able to treat physical memory as a range of pages of 4096 bytes. Each page is simply a range of addresses of 4096 bytes length.
- the code that is executing in the processor identifies an address space consisting of pages having locations which are controlled by the MMU.
- the MMU is capable of making various pages accessible or inaccessible to the executing code.
- the MMU may alter the address ranges where the executing code "sees" each page of physical memory (relocation).
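The page arithmetic and the two MMU roles described above (access control and relocation) can be sketched in Python. This is a toy model under stated assumptions: the `PageMap` class and its method names are hypothetical and do not represent the real MMU programming interface; only the 4096-byte page size comes from the text.

```python
PAGE_SIZE = 4096  # bytes per page, as stated in the text

def split_address(addr: int) -> tuple[int, int]:
    """Split an address into (page number, offset within the page)."""
    return addr // PAGE_SIZE, addr % PAGE_SIZE

class PageMap:
    """Toy model of relocation and access control (hypothetical interface)."""

    def __init__(self):
        self._map = {}  # page as seen by the code -> physical page

    def map_page(self, vpage: int, ppage: int) -> None:
        """Make a page accessible and choose where the code 'sees' it."""
        self._map[vpage] = ppage

    def unmap_page(self, vpage: int) -> None:
        """Make a page inaccessible to the executing code."""
        self._map.pop(vpage, None)

    def translate(self, vaddr: int) -> int:
        vpage, offset = split_address(vaddr)
        if vpage not in self._map:
            raise PermissionError(f"page {vpage} is not accessible")
        return self._map[vpage] * PAGE_SIZE + offset
```

For example, after `map_page(3, 10)` an access to address `0x3004` lands at offset 4 of physical page 10, and after `unmap_page(3)` the same access is refused.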
- a feature of the invention involves the use of a processor hardware feature that can preempt (interrupt) executing application code after a precise number of instructions have been executed rather than the usual method of preempting executing code after a predetermined amount of time has elapsed.
- this capability allows essentially unrestricted application software architectures to be used with the system.
- the use of this feature has very significant advantages in implementing fault tolerant computing systems.
- Fig. 1 shows an example of a fault tolerant computing system 10 having a group of six microprocessors 12A-F, such as the MPC860, that can be programmed with appropriate software that makes use of the invention.
- Each of the microprocessors 12A-F is able to communicate with all others by means of a group of serial buses indicated at 14, using a set of Serial Communications Controllers (SCCs) 16A-D that are incorporated on the microprocessors 12A-F with the microprocessor cores.
- the three microprocessors 12A-C to the left in Fig. 1 are also connected to external devices by means of an I/O circuit 18A-C as needed for the application.
- the microprocessors 12A-C are thus able to take care of the input/output functions of the system 10 while the microprocessors 12D-F on the right of Fig. 1 are able to perform processing duties, obtaining input from and returning output to the microprocessors 12A-C on the left.
- each of the microprocessors 12A-F drives only one of the serial buses 14, but is able to receive from all of the serial buses 14.
- each of the microprocessors 12A-F has only four SCCs 16A-D. Therefore, the microprocessors 12A-F are not capable of obtaining data simultaneously from more than four of the microprocessors 12A-F.
- a set of four multiplexer circuits 20A-D is added external to each of the microprocessors 12A-F to enable them to obtain data from all the microprocessors 12A-F in a time-sequenced manner.
- Software at the executive level in each of the microprocessors 12A-F is used to synchronize and sequence the communications between the microprocessors 12A-F.
- With 4-to-1 multiplexers 20A-D as shown in Fig. 1, up to sixteen microprocessors could be used in a system.
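The sixteen-processor limit follows from simple arithmetic: four SCCs, each fed by a 4-to-1 multiplexer, give 4 × 4 = 16 selectable bus inputs. The Python sketch below works this out and shows one possible time-sequenced listening pattern. The wiring (bus `i` on SCC `i // 4`, multiplexer input `i % 4`) and the shared select value are assumptions made for illustration, not details from the source.

```python
NUM_SCCS = 4     # SCCs per microprocessor, from the text
MUX_INPUTS = 4   # each SCC sits behind a 4-to-1 multiplexer

MAX_SOURCES = NUM_SCCS * MUX_INPUTS  # 16 buses can be reached in total

def sources_in_slot(slot: int) -> list[int]:
    """Bus indices readable in a time slot, assuming one shared select value."""
    sel = slot % MUX_INPUTS
    return [scc * MUX_INPUTS + sel for scc in range(NUM_SCCS)]

def slots_to_hear_all() -> int:
    """Number of time slots needed before every bus has been listened to once."""
    return MUX_INPUTS
```

In any one slot a processor still hears only four buses, which is why the executive-level software has to sequence the communications.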
- Fig. 2 illustrates in logical block diagram form three data input processing functions as might be implemented on the microprocessors 12A-C that have access to I/O signals from the I/O circuits 18A-C shown in Fig. 1.
- Each of these functions periodically samples signals from various sensors as indicated by a set of blocks 22A-C. These signals can be from redundant sensors but, in general, they need not be identical or even synchronized with each other. After sampling, the signals are sent for rule based voting as indicated by a set of function blocks 24A-C.
- the function indicated by the blocks 24A-C might be implemented in software executing on the same microprocessor 12A-C as the data input function in the blocks 22A-C or on different microprocessors in the system 10.
- each of the rule based voting functions 24A-C has access to all of the input signals from blocks 22A-C, both its own and those from the other microprocessors 12A, B or C. Rules are then applied to the available signal samples to determine which ones to select and how to make the best use of the available data. This function depends on the details of the input data and the sensors from which the data is obtained and would readily be apparent to those of ordinary skill in the art of data verification.
- When the rule based voting functions 24A-C have finished processing a frame of data, they will have separately produced, in the absence of faults, three independent versions of the input data.
- the three versions of the input data in this embodiment will be loosely synchronized and bit-for-bit identical. This property makes fault containment much easier in the subsequent processing.
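The source leaves the actual voting rules to the application, so the following Python sketch uses one hypothetical rule purely for illustration: discard missing samples, then select the lower-median sample. Because the rule selects an actual sample rather than averaging, every voter that sees the same set of samples produces a bit-for-bit identical result.

```python
from statistics import median_low

def rule_based_vote(samples):
    """Pick one value from redundant sensor samples (None marks a missing one).

    Hypothetical rule: ignore missing samples and select the lower median
    of the rest, so the output is always one of the actual inputs.
    """
    valid = [s for s in samples if s is not None]
    if not valid:
        raise ValueError("no valid samples")
    return median_low(valid)
```

With samples `[1012, 1015, 1500]` the outlier 1500 is outvoted and 1015 is selected, whichever microprocessor runs the function and in whatever order the samples arrived.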
- next processing steps occur at some point between the output of the rule based voting identified by the blocks 24A-C and three identical application programs represented by a set of blocks 26A-C in Fig. 2 that make use of the incoming data from the sensors via the blocks 22A-C.
- a transmitting processor, for example the microprocessor 12A, executes one instance of the input processing function.
- This microprocessor 12A produces a sequence of output data sets that are then transmitted via that microprocessor's SCC 16D to the serial output bus 14.
- a receiving processor, for example the microprocessor 12D, that is the host of one of the redundant instances of the receiving application 26A.
- the receiving microprocessor 12D obtains the data from the transmitting microprocessor 12A on one of its serial input buses 14. At the same time, it receives data from several other transmitting processors 12B and C.
- the system 10 is preferably set up so that the data obtained from the several processors is loosely synchronized and congruent as long as there are no hardware faults in the system. With this arrangement, it becomes possible to use voting at the receiving microprocessors such as 12D-F to handle faults in the incoming data.
- Voting consists of examining the incoming data and checking it to determine if all copies match exactly as indicated by a set of blocks 28A-C shown in Figs. 2 and 3. If all the incoming data matches, then any version of the data can be used. If only two versions of the data match, then either of the matching versions can be used and the non-matching data is discarded as illustrated at 30 of Fig. 3. If there are no matches then the voter will be unable to determine which, if any, of the data sources should be trusted but for this to occur more than one failure must have occurred.
- the architecture of the system 10 can operate effectively with any single failure in the data input system while providing loosely synchronized congruent data for use by subsequent application programs such as application programs 26A-C.
- One of the most significant features of the invention permits the software applications such as 26A-C to be executed in the microprocessors 12D-F in such a way that it becomes possible to check the execution of those applications 26A-C, and to reject any that produce faulty outputs, without the necessity of designing or implementing application programs with fault tolerance in mind. It should be noted that the assumption in this description of the invention is that the faulty output would occur because of a hardware processing fault, not a software fault. As discussed below, system verification is accomplished by comparing the results from one instance of the application executing on one processor with the results from another identical instance of the application executing on another processor.
- the voter 28A now selects any one of the input message blocks that has the maximum number of matches in agreement with other message blocks.
- a second 32 or a third message block 34 might be selected because the first message block 30 contains an error.
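The exact-match voting described above can be sketched in a few lines of Python. The function name and the `(block, count)` return shape are assumptions for illustration; message blocks are modeled as `bytes` so that comparison is bit-for-bit.

```python
from collections import Counter

def vote(blocks):
    """Return (selected_block, match_count) for the received message blocks.

    Selects a block that agrees bit-for-bit with the largest number of
    blocks. Returns (None, 1) when no two blocks match, since the voter
    then cannot tell which source, if any, should be trusted.
    """
    if not blocks:
        return None, 0
    best, count = Counter(blocks).most_common(1)[0]
    if count < 2 and len(blocks) > 1:
        return None, count
    return best, count
```

With three sources, a single corrupted block is simply outvoted: `vote([b"bad", b"ok", b"ok"])` returns `(b"ok", 2)`. The receiving application only ever sees the selected block.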
- Incoming messages are received from multiple sources and stored.
- the receiving application is not aware of the message selection made and does not take part in any way in accomplishing voting or redundancy management on its own. Therefore, its logic is not made more complicated by redundancy management issues.
- Testing and proving such a system, such as certification for use on an aircraft, is made simpler and less expensive because the fault tolerant aspects of the system can be understood, tested and proven independently of the complexities of the application software.
- Data messages from the transmitting processors 12A-C need not be tightly synchronized. However, it is desirable in this particular embodiment that the voters 28A-C wait until all the messages have arrived or should have arrived before voting; therefore loose synchronization is needed. For example, it can be required that non-failed data sources deliver their data messages before a predetermined time has expired. In the fault tolerant system of the invention, it is desirable to ensure that new data is presented to the several redundant application programs 26A-C only at points in the programs' execution that are the same on all processors. In other words, it is desirable to synchronize the operation of the microprocessors 12D-F during the processing of the application programs. As described above, this can be done by one of several methods. For example:
- steps that run to completion each time the application is started. No new data is provided to the program once it starts executing. New data is provided only between steps.
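A minimal Python sketch of the run-to-completion approach, combined with the deadline rule for loose synchronization mentioned earlier. The function names, the `(arrival_time, source, payload)` tuple layout, and the deadline value are hypothetical; the point is only that input is latched between steps and never changes while a step runs.

```python
def collect(arrivals, deadline):
    """Messages (keyed by source) that arrived by the deadline; late ones are ignored.

    arrivals: iterable of (arrival_time, source, payload) tuples.
    """
    return {src: payload for t, src, payload in arrivals if t <= deadline}

def run_steps(step, frames, deadline):
    """Run `step` once per frame; new data is provided only between steps."""
    state = 0
    for arrivals in frames:
        inputs = collect(arrivals, deadline)  # frozen before the step starts
        state = step(state, inputs)           # runs to completion on that snapshot
    return state
```

Two redundant instances driven by the same frames and the same deadline see identical snapshots and therefore end every step in the same state.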
- Fig. 4 provides, in timing diagram form, an illustration of the two microprocessors 12D and 12E executing a common application program utilizing congruent preemption under the control of an operating system according to the preferred embodiment of the invention. In this example, the executives or operating systems of both microprocessors 12D and 12E are running at points C1 and D1 respectively.
- the executive causes the application programs 26A and 26B to begin to execute.
- After a predetermined count of application instruction completions in each of the microprocessors 12D and 12E, each executive generates a congruent preemption to halt the execution of the application programs as shown at C3 and D3.
- the system 10 can compare the results of the two application programs or, if the output of the programs is to be transmitted to another system, the results can be voted on to ensure an accurate output.
- When the executives in each of the microprocessors 12D and 12E complete their tasks, they initiate a resumption of the application programs 26A and 26B at points C4 and D4. This process continues indefinitely or until the application programs 26A and 26B are completed. In this manner the applications 26A-26C can be made to execute in congruent frames.
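The effect of congruent preemption can be illustrated with a small Python simulation. Everything here is a toy model: the `Cpu` class, the arithmetic standing in for application instructions, the quantum of 1000 instructions, and the injected bit flip are all hypothetical. The sketch shows that when both processors are preempted after the same instruction count, their states are directly comparable at each frame boundary, and a hardware fault shows up in the frame where it occurs.

```python
QUANTUM = 1000  # instructions per frame; a hypothetical value

class Cpu:
    """Toy processor running a deterministic program one 'instruction' at a time."""

    def __init__(self, fault_at=None):
        self.acc = 0
        self.executed = 0
        self.fault_at = fault_at  # instruction index at which to inject a fault

    def run_frame(self, quantum=QUANTUM):
        for _ in range(quantum):
            self.acc = (self.acc * 31 + self.executed) & 0xFFFFFFFF
            if self.executed == self.fault_at:
                self.acc ^= 1  # simulated hardware fault flips one bit
            self.executed += 1
        return self.acc  # state seen by the executive at the preemption point

def first_mismatch(cpu_a, cpu_b, frames):
    """Return the index of the first frame whose results differ, or None."""
    for i in range(frames):
        if cpu_a.run_frame() != cpu_b.run_frame():
            return i
    return None
```

Wall-clock speed never enters the comparison. Only the instruction count defines a frame, which is why the two processors need not share a clock.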
- This method makes use of built-in hardware on the microprocessor chip that is usually used for other purposes such as debugging the chip to count the instructions executed in the application program.
- a number of commercially available microprocessors including the Motorola MPC860, the MPC823, the 604e and the 750 contain suitable hardware to perform this function.
- the MPC860 for example has a subsystem that was designed as an aid to software development and debugging that is called "Development Support" by Motorola. It includes eight internal comparators that can detect various events that occur during instruction execution. It also includes two 16-bit counters that are capable of counting events.
- This subsystem for counting instruction completions involves setting up the comparators to detect instructions executed within a desired address range, for instance the address range where the application being monitored is located, and setting up a counter to count the number of events detected. When a predetermined number of counts is reached, the processor will "trap" to an executive routine.
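The comparator-plus-counter arrangement can be modeled in Python as follows. This is a behavioral sketch only: the class, its method names, and the count-down convention are assumptions, and the real register names and setup sequence are not given in the source. The 16-bit limit reflects the counter width stated above.

```python
class InstructionCounter:
    """Toy model: count instruction completions in an address range, trap at a limit."""

    def __init__(self, lo, hi, limit):
        if not 0 < limit <= 0xFFFF:
            raise ValueError("limit must fit a 16-bit counter")
        self.lo, self.hi = lo, hi  # comparator bounds: lo inclusive, hi exclusive
        self.remaining = limit     # counts down to zero

    def on_complete(self, pc):
        """Call once per completed instruction; return True when the trap fires."""
        if self.lo <= pc < self.hi:
            self.remaining -= 1
            if self.remaining == 0:
                return True
        return False
```

Instructions outside the monitored range, such as executive or interrupt-handler code, do not advance the count, so only the application's own progress determines where the trap occurs.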
- the models 604e and the 750 contain a different mechanism, but one that can also be put to the same use.
- In these processors there is a subsystem referred to as the Performance Monitor. This facility was designed into the processors to provide the ability to monitor and count predefined events such as processor clocks, misses in the instruction cache, data cache, or L2 cache, the types of instructions dispatched, mispredicted branches and other events. In particular, it is possible to set up the event monitor system to take a "Performance Monitor Interrupt" after a predetermined number of application program instruction completions.
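One way to obtain an interrupt after a chosen number of events from a free-running up-counter is to preload it. The Python sketch below assumes, and this is an assumption not stated in the source, that the interrupt condition is a 32-bit counter becoming negative (most significant bit set); the exact condition should be checked against the processor's user manual. The function names are hypothetical.

```python
NEGATIVE = 0x8000_0000  # assumed threshold: most significant bit of a 32-bit counter

def preload_for(n_events: int) -> int:
    """Counter start value so that the n-th counted event makes the counter negative."""
    if not 0 < n_events <= NEGATIVE:
        raise ValueError("n_events out of range")
    return NEGATIVE - n_events

def events_until_interrupt(start: int) -> int:
    """Simulate counting up from `start` until the assumed interrupt condition holds."""
    count, value = 0, start
    while value < NEGATIVE:
        value += 1
        count += 1
    return count
```

Preloading with `preload_for(1000)` makes the simulated interrupt fire on exactly the 1000th instruction completion, which is the behavior the congruent-frame scheme relies on.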
- microprocessors have similar capabilities. Also, as microprocessors become more complex and as the amount of logic on the processors increases, the need for features to assist with debug and performance measurement increases. For these reasons, it is likely that in the future more and more microprocessors will include a mechanism capable of counting instruction completions. Such microprocessors may then be used in fault tolerant architectures of this type.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP99902396A EP1082660A2 (en) | 1998-01-20 | 1999-01-20 | Fault tolerant computing system using instruction counting |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US7191498P | 1998-01-20 | 1998-01-20 | |
US60/071,914 | 1998-01-20 | ||
US09/234,797 US6374364B1 (en) | 1998-01-20 | 1999-01-19 | Fault tolerant computing system using instruction counting |
US09/234,797 | 1999-01-19 |
Publications (3)
Publication Number | Publication Date |
---|---|
WO1999036847A2 (en) | 1999-07-22 |
WO1999036847A8 (en) | 1999-09-30 |
WO1999036847A3 (en) | 2000-12-28 |
Family
ID=26752816
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1999/001221 WO1999036847A2 (en) | 1998-01-20 | 1999-01-20 | Fault tolerant computing system using instruction counting |
Country Status (3)
Country | Link |
---|---|
US (1) | US6374364B1 (en) |
EP (1) | EP1082660A2 (en) |
WO (1) | WO1999036847A2 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2004034172A2 (en) * | 2002-09-12 | 2004-04-22 | Siemens Aktiengesellschaft | Method for synchronizing events, particularly for processors of fault-tolerant systems |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0447576A1 (en) * | 1987-11-09 | 1991-09-25 | Tandem Computers Incorporated | Synchronization of fault-tolerant computer system having multiple processors |
WO1992022030A1 (en) * | 1991-06-06 | 1992-12-10 | Honeywell Inc. | Interrupt driven, separately clocked, fault tolerant processor synchronization |
WO1995015529A1 (en) * | 1993-12-01 | 1995-06-08 | Marathon Technologies Corporation | Fault resilient/fault tolerant computing |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0773059A (en) * | 1993-03-02 | 1995-03-17 | Tandem Comput Inc | Fault-tolerant computer system |
US5896523A (en) * | 1997-06-04 | 1999-04-20 | Marathon Technologies Corporation | Loosely-coupled, synchronized execution |
1999
- 1999-01-19: US application US09/234,797 (US6374364B1), not active: Expired - Fee Related
- 1999-01-20: EP application EP99902396A (EP1082660A2), not active: Ceased
- 1999-01-20: WO application PCT/US1999/001221 (WO1999036847A2), not active: Application Discontinuation
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2004034172A2 (en) * | 2002-09-12 | 2004-04-22 | Siemens Aktiengesellschaft | Method for synchronizing events, particularly for processors of fault-tolerant systems |
WO2004034172A3 (en) * | 2002-09-12 | 2004-09-23 | Siemens Ag | Method for synchronizing events, particularly for processors of fault-tolerant systems |
Also Published As
Publication number | Publication date |
---|---|
US6374364B1 (en) | 2002-04-16 |
WO1999036847A8 (en) | 1999-09-30 |
EP1082660A2 (en) | 2001-03-14 |
WO1999036847A3 (en) | 2000-12-28 |
Similar Documents
Publication | Title |
---|---|
US6374364B1 (en) | Fault tolerant computing system using instruction counting |
US4497059A (en) | Multi-channel redundant processing systems |
CA1306546C (en) | Dual zone, fault tolerant computer system with error checking on i/o writes |
US5239641A (en) | Method and apparatus for synchronizing a plurality of processors |
EP0306209B1 (en) | Dual rail processors with error checking at single rail interfaces |
US7774659B2 (en) | Method of monitoring the correct operation of a computer |
US8930752B2 (en) | Scheduler for multiprocessor system switch with selective pairing |
EP0514075A2 (en) | Fault tolerant processing section with dynamically reconfigurable voting |
Goldberg | Development and analysis of the software implemented fault-tolerance (SIFT) computer |
US8671311B2 (en) | Multiprocessor switch with selective pairing |
JPH052654A (en) | Method and circuit for detecting fault of microcomputer |
US7624336B2 (en) | Selection of status data from synchronous redundant devices |
KR20020063237A (en) | Systems and methods for fail safe process execution, monitering and output conterol for critical system |
Randell | Reliable computing systems |
JPH02220164A (en) | Input/output control processing delaying apparatus |
Weinstock | SIFT: System design and implementation |
Palumbo et al. | A performance evaluation of the software-implemented fault-tolerance computer |
Smith Jr et al. | Development and evaluation of a fault-tolerant multiprocessor (FTMP) computer. Volume 1: FTMP principles of operation |
Moser et al. | Design verification of SIFT |
WALTER | MAFT-An architecture for reliable fly-by-wire flight control |
Hopkins Jr et al. | The evolution of fault tolerant computing at the Charles Stark Draper Laboratory, 1955–85 |
Thompson | Transputer-based fault tolerance in safety-critical systems |
Lala et al. | Reducing the probability of common-mode failure in the fault tolerant parallel processor |
Lala et al. | Fault tolerance in embedded real-time systems: importance and treatment of common mode failures |
Smith et al. | Development and evaluation of a fault-tolerant multiprocessor (FTMP) computer |
Legal Events
Code | Title | Description |
---|---|---|
AL | Designated countries for regional patents | Kind code of ref document: A2 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
AL | Designated countries for regional patents | Kind code of ref document: C1 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
CFP | Corrected version of a pamphlet front page | |
CR1 | Correction of entry in section i | Free format text: PAT. BUL. 29/99 UNDER (30) REPLACE "NOT FURNISHED" BY "09/234797" |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | |
WWE | Wipo information: entry into national phase | Ref document number: 1999902396 Country of ref document: EP |
AL | Designated countries for regional patents | Kind code of ref document: A3 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
WWP | Wipo information: published in national office | Ref document number: 1999902396 Country of ref document: EP |
WWR | Wipo information: refused in national office | Ref document number: 1999902396 Country of ref document: EP |
WWW | Wipo information: withdrawn in national office | Ref document number: 1999902396 Country of ref document: EP |