|Publication number||US20090077553 A1|
|Application number||US 11/854,953|
|Publication date||Mar 19, 2009|
|Filing date||Sep 13, 2007|
|Priority date||Sep 13, 2007|
|Inventors||Jian Tang, Yufu Li|
|Original Assignee||Jian Tang, Yufu Li|
Server computer systems demand high levels of reliability, availability and serviceability (“RAS”). Reliability, availability, and serviceability are enhanced in some servers through RAS features. Some RAS features allow system configuration changes, such as changes necessary for link, memory, and processor maintenance and swapping, to be made in an Operating System (“OS”) transparent manner. Some system architectures utilize System Management Interrupts (“SMI”) to implement RAS features, but to meet real-time demands in such systems, SMI latency limits are on the order of microseconds. In link-based systems, changing the system configuration requires the system to enter a quiesce state that pauses OS execution, such as for several milliseconds. Current operating systems are not tolerant of long time tick losses while the underlying system is in a quiesce state. Some previous efforts have utilized a quiesce data buffer to separate data calculations from the data commitment, or configuration change implementation. Such efforts have been successful in reducing quiesce time, but as systems continue to increase in the number of included resources, such as an increased number of processors, these efforts have limitations. Further, these efforts utilize only a single processor, designated as a System Bootstrap Processor (“SBSP”), to implement configuration changes while in a quiesce state. All Application Processors (“AP”) are placed in an idle loop during system quiesce and do not participate in the implementation of the configuration changes.
Various embodiments described herein provide one or more of systems, methods, and software/firmware that provide increased efficiency in implementing configuration changes during system quiesce time. Some embodiments may separate a quiesce data buffer into small slices wherein each slice includes configuration change data or instructions. These slices may be individually distributed by a system bootstrap processor, or other processor, to other processors or logical processors of a multi-core processor in the system. In some such embodiments, the system bootstrap processor and application processors may change system configuration in parallel while a system is in a quiesce state so as to minimize time spent in the quiesce state. Furthermore, typical system configuration changes become local operations, such as local hardware register modifications, which suffer much less transaction delay than the remote hardware register accesses that have previously been performed. These embodiments, and others, are described in greater detail herein.
In the following detailed description, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration specific embodiments in which the inventive subject matter may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice them, and it is to be understood that other embodiments may be utilized and that structural, logical, and electrical changes may be made without departing from the scope of the inventive subject matter. Such embodiments of the inventive subject matter may be referred to, individually and/or collectively, herein by the term “invention” merely for convenience and without intending to voluntarily limit the scope of this application to any single invention or inventive concept if more than one is in fact disclosed.
The following description is, therefore, not to be taken in a limited sense, and the scope of the inventive subject matter is defined by the appended claims.
The functions or algorithms described herein are implemented in hardware, software or a combination of software and hardware in one embodiment. The software comprises computer executable instructions stored on computer readable media such as memory or other types of storage devices. Further, described functions may correspond to modules, which may be software, hardware, firmware, or any combination thereof. Multiple functions are performed in one or more modules as desired, and the embodiments described are merely examples. The software is executed on a digital signal processor, ASIC, microprocessor, or other type of processor operating on a system, such as a personal computer, a server, a router, or other device capable of processing data including network interconnection devices.
Some embodiments implement the functions in two or more specific interconnected hardware modules or devices with related control and data signals communicated between and through the modules, or as portions of an application-specific integrated circuit. Thus, the exemplary process flow is applicable to software, firmware, and hardware implementations.
In system 100 as an example embodiment, one of the processors 102, 106, 110, 114 is designated as a system bootstrap processor (“SBSP”). The non-SBSP processors are then designated as application processors (“AP”).
In a common scenario, assume that CPU 3 114 needs to be removed from service along with its local memory 116 while an operating system is running on the system 100. Removal of CPU 3 114 requires RTA and SAD reconfigurations such that the related entries are removed on all the other CSI components, which may include CPU 0 102, CPU 1 106, CPU 2 110, IOH 0 120, and IOH 1 128. CSI components, in some link-based embodiments, support a quiesce mode by which normal traffic may be paused to perform the RTA/SAD change operations.
When the processor 114 and memory 116 are ready to be removed, a system management interrupt (“SMI”) may be generated to begin the remove operation. However, prior to placing the system in a quiesce state, the SBSP calculates configuration data changes and may register the configuration data to a quiesce data buffer.
The SBSP then organizes the data in the quiesce data buffer, or other location, into slices. Each slice may correspond to one processor socket or logical processor in the system and contains only the quiesce data that belongs to that socket, processor, or its neighbor IOH. For example, a slice for processor socket 0 102 may contain all RTA/SAD entries that need to be updated in processor socket 0 102 and IOH socket 0 120.
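The slicing step described above can be sketched as a simple grouping operation. The following is a minimal illustration only, not the disclosed firmware: the entry format `(component, register, value)`, the `IOH_OWNER` mapping, and the function name `slice_quiesce_buffer` are all hypothetical and chosen for readability.

```python
from collections import defaultdict

# Hypothetical topology (an assumption for this sketch): the processor socket
# that commits changes on behalf of each neighboring IOH.
IOH_OWNER = {"IOH0": "CPU0", "IOH1": "CPU2"}


def slice_quiesce_buffer(entries, ioh_owner=IOH_OWNER):
    """Group (component, register, value) entries into per-socket slices."""
    slices = defaultdict(list)
    for component, register, value in entries:
        # An IOH entry lands in the slice of its assigned neighbor socket;
        # a CPU entry lands in that CPU's own slice.
        socket = ioh_owner.get(component, component)
        slices[socket].append((component, register, value))
    return dict(slices)


# Illustrative buffer contents; register names and values are made up.
quiesce_buffer = [
    ("CPU0", "RTA[3]", 0x0),
    ("IOH0", "SAD[1]", 0x0),
    ("CPU1", "RTA[3]", 0x0),
    ("CPU2", "SAD[2]", 0x0),
    ("IOH1", "RTA[3]", 0x0),
]
slices = slice_quiesce_buffer(quiesce_buffer)
```

In this sketch the slice for `CPU0` holds both its own entry and the entry for its neighbor `IOH0`, mirroring the socket 0 example in the text.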
The pre-quiesce sub-portion 240 of the SBSP portion may include calculating configuration data slices in a buffer 204. Such calculations may include determining what and where configuration changes need to be made as a function of the SMI. The calculation of configuration data slices in the buffer 204 may also include slicing the data as a function of processors and their location in reference to other system components such as IOHs and slices of configuration changes assigned to other APs. For example, if two processors are neighbors of an IOH and only one processor has local configuration changes, the configuration changes that are to be implemented within the IOH may be placed in the slice of the other processor.
After the slices are calculated 204, the method 200 further includes communicating the slices to the APs 206. The slices may be communicated 206 in any number of ways. One way may include utilization of a globally accessible register or memory location in which the slices are placed for pickup by the APs. Another way to communicate the slices may include packetized CSI messages or messages sent via another suitable technology. The slices are typically communicated as a starting address and a bit or byte length of the slice in a shared memory. However, other embodiments may include communicating the actual data of the slice, which may eliminate some memory operations necessary for an AP to obtain a slice.
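Communicating a slice as a starting address and byte length can be illustrated with a shared byte buffer and a table of descriptors. This is a sketch under stated assumptions: the 12-byte entry layout (`ENTRY`), the `mailbox` dictionary, and the helper names `pack_slices` and `copy_slice` are hypothetical and are not taken from the disclosure.

```python
import struct

# Hypothetical entry layout (an assumption): little-endian 32-bit register id
# followed by a 64-bit value, 12 bytes per entry with no padding.
ENTRY = struct.Struct("<IQ")


def pack_slices(slices):
    """Pack per-socket entry lists into one shared buffer.

    Returns the buffer and a mailbox mapping each socket to the
    (start offset, byte length) descriptor of its slice.
    """
    buffer = bytearray()
    mailbox = {}
    for socket, entries in slices.items():
        start = len(buffer)
        for reg_id, value in entries:
            buffer += ENTRY.pack(reg_id, value)
        mailbox[socket] = (start, len(buffer) - start)
    return buffer, mailbox


def copy_slice(buffer, descriptor):
    """Copy one slice out of the shared buffer and decode its entries."""
    start, length = descriptor
    local = bytes(buffer[start:start + length])  # local copy of the slice
    return [ENTRY.unpack_from(local, off) for off in range(0, length, ENTRY.size)]


shared, mailbox = pack_slices({"CPU0": [(0x10, 0)], "CPU1": [(0x10, 0), (0x24, 7)]})
cpu1_entries = copy_slice(shared, mailbox["CPU1"])
```

Passing only the descriptor keeps the message small; the alternative noted in the text, sending the slice data itself, would replace the descriptor lookup and copy with a single transfer.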
The method 200 continues with the SBSP copying a quiesce data slice allocated to the SBSP into local memory or cache 208. The pre-quiesce sub-portion 240 concludes by determining 216 if each AP has copied, or otherwise received, its respective slice.
Referring now to the AP pre-quiesce sub-portion 242, the method 200 includes the AP getting a quiesce data slice address and length 210 from the mailbox mechanism, via a message, or in another way depending on the particular embodiment. The AP may then copy the quiesce data slice to local memory or cache 212. However, as noted above, the getting of the quiesce data slice address and length 210 and copying of the quiesce data slice 212 may be a single operation. After the AP copies, or otherwise places, the data into local memory or cache 212, the AP tells the SBSP that the quiesce data copy is complete 214. Again, this messaging may be performed utilizing a mailbox mechanism or other messaging technology. Note that although only a single AP portion of the method 200 is illustrated, the same AP portion of the method may be performed in parallel by virtually any number of APs. Further, the AP portion of the method 200 may be performed in parallel with the SBSP portion of the method 200.
At this point, the method 200 is ready to enter the system into a quiesced state. Referring now to the quiesce sub-portion 250 of the SBSP portion, the method 200 includes quiescing the system 218. The SBSP then processes its quiesce data slice, if one is assigned, and commits the quiesce data to the local CPU and/or IOH neighbor 222. The SBSP then determines when all of the APs have finished committing their respective data slices 228 and de-quiesces the system 230.
At the same time as the quiesce sub-portion 250 of the SBSP portion of the method 200 is being processed, the quiesce sub-portion 252 of the AP portion is processed. This sub-portion 252 of the method 200 includes the AP determining if the socket of the AP is quiesced 220. Once quiesced, the AP processes its quiesce data slice, if one is assigned, and commits the quiesce data to the local CPU and/or IOH neighbor 224. After committing the quiesce data 224, the AP tells the SBSP that the quiesce data has been committed 226 and the AP waits for its socket to be de-quiesced 232. Once all of the AP sockets have been de-quiesced 232 and the SBSP has de-quiesced the remainder of the system 230, both the AP and SBSP portions of the method exit the SMI state 234 and the method 200 is complete.
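The ordering of the SBSP and AP portions of the method 200 can be modeled with threads standing in for processors. This is only a simulation sketch: the barriers and events below are assumed stand-ins for the mailbox and quiesce signaling, the `registers` dictionary stands in for hardware registers, and the name `run_quiesce` is hypothetical. The numbers in the comments refer to the reference numerals used in the text.

```python
import threading


def run_quiesce(slices, sbsp="CPU0"):
    """Simulate the SBSP/AP handshake and return the committed register state."""
    aps = [s for s in slices if s != sbsp]
    copied = threading.Barrier(len(aps) + 1)     # 214/216: every slice copied
    quiesced = threading.Event()                 # 218/220: system is quiesced
    committed = threading.Barrier(len(aps) + 1)  # 226/228: every slice committed
    dequiesced = threading.Event()               # 230/232: system is de-quiesced
    registers = {}                               # stand-in for hardware registers
    lock = threading.Lock()                      # protects the simulated dict only

    def commit(socket, local_slice):
        with lock:
            for reg, value in local_slice:
                registers[(socket, reg)] = value

    def ap(socket):
        local_slice = list(slices[socket])  # 210/212: copy slice locally
        copied.wait()                       # 214: report copy complete
        quiesced.wait()                     # 220: wait until quiesced
        commit(socket, local_slice)         # 224: commit local changes
        committed.wait()                    # 226: report commit complete
        dequiesced.wait()                   # 232: wait for de-quiesce

    threads = [threading.Thread(target=ap, args=(s,)) for s in aps]
    for t in threads:
        t.start()
    local_slice = list(slices.get(sbsp, []))  # 208: SBSP copies its own slice
    copied.wait()                             # 216: all APs have their slices
    quiesced.set()                            # 218: quiesce the system
    commit(sbsp, local_slice)                 # 222: SBSP commits in parallel
    committed.wait()                          # 228: all APs have committed
    dequiesced.set()                          # 230: de-quiesce the system
    for t in threads:
        t.join()                              # 234: all portions exit
    return registers


state = run_quiesce({"CPU0": [("RTA3", 0)], "CPU1": [("RTA3", 0), ("SAD1", 5)]})
```

The point of the sketch is the ordering: no commit happens before the quiesce signal, and the de-quiesce signal is not raised until every participant has reported its commit.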
In various embodiments, needed configuration changes may include one or more of an update to a routing table array (“RTA”), a source address decoder, a target address decoder, or other configuration setting depending on the needed change and the particular system of the embodiment. Such changes may be needed due to addition or subtraction of an element from a computing environment of the system, detected errors within the system, or other events that may necessitate a system configuration change.
In some embodiments of the method 300, identifying the configuration change task delegation scheme 306 may include identifying one or more configuration settings in need of modification, identifying where the one or more configuration settings are located, and tasking each processor that needs configuration changes with making its own configuration changes. Identifying the configuration change task delegation scheme 306 may also include, for each device in need of a configuration change, identifying a processor in proximity to the device that is not already tasked with a configuration change task and tasking that processor with making the needed device configuration changes.
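One way to picture such a delegation scheme is the following sketch. It is an illustration under assumptions rather than the claimed method: the data shapes, the `delegate` name, and in particular the fallback to the least-loaded neighbor when every nearby processor is already tasked are hypothetical choices made here, since the text does not specify that case.

```python
def delegate(cpu_changes, device_changes, neighbors):
    """Return a {processor: [changes]} task assignment.

    cpu_changes:    {cpu: [change, ...]} changes local to each processor
    device_changes: {device: [change, ...]} changes for other devices (e.g. IOHs)
    neighbors:      {device: [cpu, ...]} processors in proximity to each device
    """
    # Each processor that needs changes is tasked with its own changes.
    tasks = {cpu: list(changes) for cpu, changes in cpu_changes.items() if changes}
    for device, changes in device_changes.items():
        candidates = neighbors[device]
        # Prefer a nearby processor that has no task yet.
        idle = [cpu for cpu in candidates if cpu not in tasks]
        if idle:
            chosen = idle[0]
        else:
            # Assumed fallback: pick the neighbor with the fewest changes so far.
            chosen = min(candidates, key=lambda cpu: len(tasks[cpu]))
        tasks.setdefault(chosen, []).extend(changes)
    return tasks


plan = delegate(
    cpu_changes={"CPU0": ["RTA[3]"], "CPU1": []},
    device_changes={"IOH0": ["SAD[1]"]},
    neighbors={"IOH0": ["CPU0", "CPU1"]},
)
```

Here `CPU1` has no local changes, so it receives the `IOH0` change while `CPU0` handles its own, which spreads the work that must be done while the system is quiesced.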
In some embodiments, either of the methods 200, of
It is emphasized that the Abstract is provided to comply with 37 C.F.R. § 1.72(b) requiring an Abstract that will allow the reader to quickly ascertain the nature and gist of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.
In the foregoing Detailed Description, various features are grouped together in a single embodiment to streamline the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments of the inventive subject matter require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment.
It will be readily understood to those skilled in the art that various other changes in the details, material, and arrangements of the parts and method stages which have been described and illustrated in order to explain the nature of the inventive subject matter may be made without departing from the principles and scope of the inventive subject matter as expressed in the subjoined claims.
|Cited Patent||Filing date||Publication date||Applicant||Title|
|US7379418 *||May 12, 2003||May 27, 2008||International Business Machines Corporation||Method for ensuring system serialization (quiesce) in a multi-processor environment|
|US7761696 *||Mar 30, 2007||Jul 20, 2010||Intel Corporation||Quiescing and de-quiescing point-to-point links|
|Citing Patent||Filing date||Publication date||Applicant||Title|
|US7640453 *||Dec 29, 2006||Dec 29, 2009||Intel Corporation||Methods and apparatus to change a configuration of a processor system|
|Cooperative Classification||G06F8/67, G06F9/4405, G06F9/44505|
|European Classification||G06F9/445C, G06F8/67, G06F9/44A2|
|Mar 10, 2009||AS||Assignment||Owner name: INTEL CORPORATION, CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:TANG, JIAN;LI, YUFU;REEL/FRAME:022376/0247; Effective date: 20070910|