Publication number: US20050015545 A1
Publication type: Application
Application number: US 10/708,156
Publication date: Jan 20, 2005
Filing date: Feb 12, 2004
Priority date: Jul 2, 2003
Also published as: CN1320436C, CN1320437C, CN1553346A, CN1558321A, CN1598755A, CN100334567C, US7281072, US8301809, US9594510, US20050005044, US20050005062, US20050005063, US20130013827
Publication number (other formats): 10708156, 708156, US 2005/0015545 A1, US 2005/015545 A1, US 20050015545 A1, US 20050015545A1, US 2005015545 A1, US 2005015545A1, US-A1-20050015545, US-A1-2005015545, US2005/0015545A1, US2005/015545A1, US20050015545 A1, US20050015545A1, US2005015545 A1, US2005015545A1
Inventors: Ling-Yi Liu, Tse-Han Lee, Michael Gordon Schnapp, Yun-Huei Wang, Chung-Hua Pao
Original Assignee: Ling-Yi Liu, Tse-Han Lee, Michael Gordon Schnapp, Yun-Huei Wang, Chung-Hua Pao
Redundant external storage virtualization computer system
US 20050015545 A1
Abstract
A redundant external storage virtualization computer system. The redundant storage virtualization computer system includes a host entity for issuing an IO request, a redundant external storage virtualization controller pair coupled to the host entity for performing an IO operation in response to the IO request issued by the host entity, and a plurality of physical storage devices for providing storage to the computer system. Each of the physical storage devices is coupled to the redundant storage virtualization controller pair through a point-to-point serial signal interconnect. The redundant storage virtualization controller pair includes a first and a second storage virtualization controller both coupled to the host entity. In the redundant storage virtualization controller pair, when the second storage virtualization controller is not on line, the first storage virtualization controller will take over the functionality originally performed by the second storage virtualization controller.
Claims(68)
1. A computer system comprising:
a host entity for issuing an IO request;
a redundant external storage virtualization controller (SVC) pair coupled to the host entity for performing an IO operation in response to the IO request issued by the host entity comprising a first and a second external storage virtualization controller coupled to the host entity; and
a plurality of physical storage devices for providing storage to the computer system, each of the physical storage devices coupled to the redundant storage virtualization controller pair through a point-to-point serial signal interconnect;
wherein when the second storage virtualization controller is not on line, the first storage virtualization controller will automatically take over the functionality originally performed by the second storage virtualization controller.
2. The computer system of claim 1, wherein for each of the physical storage devices, the computer system further comprises an access control switch coupled between the physical storage device and the redundant storage virtualization controller pair for selectively switching the connection of the physical storage device to the redundant SVC pair between the first and the second storage virtualization controller.
3. The computer system of claim 1 wherein in the redundant storage virtualization controller pair, each of the storage virtualization controllers further comprises:
a host-side IO device interconnect controller coupled to the host entity;
a central processing circuitry coupled to the host-side IO device interconnect controller for performing the IO operation in response to the IO request issued by the host entity; and
a plurality of device-side IO device interconnect controllers coupled to the central processing circuitry;
wherein each of the physical storage devices is coupled to the device-side IO device interconnect controllers through the point-to-point serial signal interconnect.
4. The redundant storage virtualization computer system of claim 1 wherein the point-to-point serial signal interconnect is a Serial ATA IO device interconnect.
5. A redundant storage virtualization subsystem for providing storage to a host entity, comprising:
a redundant external storage virtualization controller (SVC) pair for coupling to the host entity for performing an IO operation in response to an IO request issued by the host entity comprising a first and a second storage virtualization controller for coupling to the host entity; and
a plurality of physical storage devices (PSDs) for providing storage to the host, each of the physical storage devices coupled to the redundant storage virtualization controller pair through a point-to-point serial signal interconnect;
wherein when the second storage virtualization controller is not on line, the first storage virtualization controller will automatically take over the functionality originally performed by the second storage virtualization controller.
6. The redundant storage virtualization subsystem of claim 5 further comprising an access control switch coupled between a said physical storage device and the redundant storage virtualization controller pair for selectively switching the connection of the physical storage device to the redundant storage virtualization controller pair between the first and the second storage virtualization controller.
7. The redundant storage virtualization subsystem of claim 5, wherein a said PSD is received in a canister removably attached to the redundant storage virtualization subsystem.
8. The redundant storage virtualization subsystem of claim 7, wherein said PSD is a SATA PSD.
9. The redundant storage virtualization subsystem of claim 7, wherein said PSD is a PATA PSD.
10. The redundant storage virtualization subsystem of claim 5 further comprising an access control switch coupled between a said physical storage device and the redundant storage virtualization controller pair for selectively allowing patching through of the serial signal of the physical storage device to and from the first SVC when in a first patching state of said access control switch and to and from the second SVC when in a second patching state of said access control switch.
11. The redundant storage virtualization subsystem of claim 10, wherein an access ownership arbitration mechanism is provided between said SVC pair and said access control switch to control the patching state of said access control switch.
12. The redundant storage virtualization subsystem of claim 11, wherein said access ownership arbitration mechanism comprises a pair of access request signal lines coupled between said SVC pair for each complementary pair of device-side IO device interconnect from said SVC pair to said access control switch; said first SVC being active on a first of said access request signal line pair and passive on a second of said access request signal line pair; said second SVC being active on said second and passive on said first of said access request signal line pair; and said SVC pair each being capable of issuing access request signal on its own said active access request signal line, and reading a requesting state of said access request signal and identifying a change of said requesting state since previous reading on its own said passive access request signal line.
13. The redundant storage virtualization subsystem of claim 11, wherein said access ownership arbitration mechanism comprises an access ownership detecting mechanism to determine if access ownership is possessed by a said SVC.
14. The redundant storage virtualization subsystem of claim 11, wherein said access ownership arbitration mechanism comprises an access ownership granting mechanism to grant access ownership when said access ownership is requested by a said SVC.
15. The redundant storage virtualization subsystem of claim 11, wherein said access ownership arbitration mechanism comprises a first and a second access ownership arbitration circuit (AOAC) each coupled to said first and second SVCs and said access control switch, and wherein if said first SVC issues a first access ownership request signal received by said first AOAC, access ownership will be granted to said first SVC when said second SVC does not possess the access ownership, and if said second SVC issues a second access ownership request signal received by said second AOAC, access ownership will be granted to said second SVC when said first SVC does not possess the access ownership.
16. The redundant storage virtualization subsystem of claim 15, further comprising an access ownership determining mechanism whereby when said first and said second SVC concurrently issue said first and second access ownership request signals to said first and said second AOAC, respectively, access ownership will be granted to a predetermined one of said SVC pair.
17. The redundant storage virtualization subsystem of claim 5 wherein in the redundant storage virtualization controller pair, each of the storage virtualization controllers further comprises:
at least one host-side IO device interconnect controller for coupling to the host entity;
a central processing circuitry coupled to the host-side IO device interconnect controller for performing the IO operation in response to the IO request issued by the host entity; and
a plurality of device-side IO device interconnect controllers coupled to the central processing circuitry;
wherein each of the physical storage devices is coupled to the device-side IO device interconnect controllers through the point-to-point serial signal interconnect.
18. The redundant storage virtualization subsystem of claim 17, wherein said at least one host-side IO device interconnect controller each comprises at least one host-side port, and a said host-side port of the first SVC and a said host-side port of said second SVC constitute a complementary port pair coupled to a same said host-side IO device interconnect.
19. The redundant storage virtualization subsystem of claim 17, wherein said at least one host-side IO device interconnect controller each comprises at least one host-side port, and a said host-side port of the first SVC and a said host-side port of said second SVC constitute a complementary port pair each coupled to a different said host-side IO device interconnect.
20. The redundant storage virtualization subsystem of claim 17, wherein a logical media unit is redundantly presented to the host entity by said redundant SVC pair on said complementary port pair.
21. The redundant storage virtualization subsystem of claim 17, wherein a said SVC includes a plurality of host-side IO device interconnect ports in said at least one host-side IO device interconnect controller.
22. The redundant storage virtualization subsystem of claim 21, wherein a logical media unit is redundantly presented to the host entity on more than one of said host-side IO device interconnect ports.
23. The redundant storage virtualization subsystem of claim 5, wherein said host-side IO device interconnect controller is of a type selected from one of the following: a Fibre Channel controller supporting Fabric, point-to-point, public loop and/or private loop connectivity in target mode, a parallel SCSI controller operating in target mode, an Ethernet controller supporting the iSCSI protocol operating in target mode, and a serial SCSI controller operating in target mode.
24. The redundant storage virtualization subsystem of claim 5 wherein the point-to-point serial signal interconnect is a Serial ATA IO device interconnect.
25. The redundant storage virtualization subsystem of claim 5, wherein when the first storage virtualization controller is not on line, the second storage virtualization controller will automatically take over the functionality originally performed by the first storage virtualization controller.
26. The redundant storage virtualization subsystem of claim 25, further comprising an access control switch for a pair of device-side IO device interconnects connecting to different SVCs and configured in redundant pair, said access control switch being coupled between the redundant storage virtualization controller pair for selectively switching the connection of the physical storage device to one of said redundant storage virtualization controllers; a cooperating mechanism for the SVC pair to cooperatively control a patching state of said access control switch; a monitoring mechanism for each SVC of said SVC pair to monitor status of the other SVC of said SVC pair; and, a state control mechanism for each SVC of said SVC pair to forcibly take complete control of the other SVC of said SVC pair.
27. The redundant storage virtualization subsystem of claim 5, wherein an inter-controller communication channel is provided between said redundant SVC pair for communicating state synchronization information.
28. The redundant storage virtualization subsystem of claim 27, wherein a said inter-controller communication channel is an existing IO device interconnect, whereby inter-controller communication exchange is multiplexed with IO requests and associated data.
29. The redundant storage virtualization subsystem of claim 27, wherein a said inter-controller communication channel is a dedicated channel, the primary function of which is to exchange said state synchronization information.
30. The redundant storage virtualization subsystem of claim 27, wherein a said inter-controller communication channel is selected from one of the following: Fibre, SATA, Parallel SCSI, Ethernet, Serial SCSI, I2C.
31. The redundant storage virtualization subsystem of claim 5, wherein said redundant SVC pair can perform IO request rerouting function.
32. The redundant storage virtualization subsystem of claim 5, wherein said redundant SVC pair can perform PSD access ownership transfer function.
33. The redundant storage virtualization subsystem of claim 5, wherein each SVC of said redundant SVC pair includes at least one expansion port for coupling to a second plurality of PSDs through multiple-device device-side IO device interconnects and each said expansion port on said first SVC has a complementary expansion port on the second SVC.
34. The redundant storage virtualization subsystem of claim 33, wherein said second plurality of PSDs each have complementary IO ports in dual-port pair, each connecting to a different one of said complementary expansion port pair.
35. The redundant storage virtualization subsystem of claim 33, wherein said complementary expansion port pair are connected to the said second plurality of PSDs through a switch circuit.
36. The redundant storage virtualization subsystem of claim 33, wherein each of said complementary expansion port pair has a redundant complement expansion port on the same SVC, and said complementary expansion port and redundant complement expansion port are connected to a different one of complementary IO ports in dual-port pair of a said second plurality of PSDs.
37. The redundant storage virtualization subsystem of claim 36, wherein said complementary expansion port pair are connected to the second plurality of PSDs through a switch circuit.
38. The redundant storage virtualization subsystem of claim 33, wherein a said device-side expansion port and interconnect is of a type selected from one of the following: Fibre, Parallel SCSI, Expanded Serial ATA, Ethernet, and Serial SCSI.
39. The redundant storage virtualization subsystem of claim 5, wherein said first and second SVC each includes a first and a second expansion port for coupling to a second plurality of PSDs through multiple-device device-side IO device interconnects and said second plurality of PSDs each have a first and a second IO port forming a dual-ported port pair of said second plurality of PSDs, said first and second expansion ports of said first SVC formed a first redundant complement, said first and second expansion ports of said second SVC formed a second redundant complement, said first expansion ports of said SVCs formed a third redundant complement, said second expansion ports of said SVCs formed a fourth redundant complement, and wherein an interconnect signal line switching mechanism is provided for each said redundant complement between corresponding IO device interconnects to switch connection to said dual-ported port pair.
40. The redundant storage virtualization subsystem of claim 39, wherein said interconnect signal line switching mechanism supports one of the following arrangements:
(1) direct connecting of a said expansion port on the first SVC to a first PSD IO port of said dual-ported port pair and direct connecting of a said expansion port on the second SVC to a second PSD port of said dual-ported port pair;
(2) interconnecting of two expansion ports in said redundant complement to an IO device interconnect in the redundant complement that connects to said first PSD IO port;
(3) interconnecting of two expansion ports in said redundant complement to an IO device interconnect in the redundant complement that connects to said second PSD IO port;
(4) directly connecting of a said expansion port on the first SVC to said IO device interconnect in the redundant complement that connects to said first PSD IO port;
(5) directly connecting of a said expansion port on the first SVC to said IO device interconnect in the redundant complement that connects to said second PSD IO port;
(6) directly connecting of a said expansion port on the second SVC to said IO device interconnect in the redundant complement that connects to said first PSD IO port; and,
(7) directly connecting of a said expansion port on the second SVC to said IO device interconnect in the redundant complement that connects to said second PSD IO port.
41. The redundant storage virtualization subsystem of claim 39, wherein said interconnect signal line switching mechanism supports all of the following arrangements:
(1) direct connecting of a said expansion port on the first SVC to a first PSD IO port of said dual-ported port pair and direct connecting of a said expansion port on the second SVC to a second PSD port of said dual-ported port pair;
(2) interconnecting of two expansion ports in said redundant complement to an IO device interconnect in the redundant complement that connects to said first PSD IO port;
(3) interconnecting of two expansion ports in said redundant complement to an IO device interconnect in the redundant complement that connects to said second PSD IO port;
(4) directly connecting of a said expansion port on the first SVC to said IO device interconnect in the redundant complement that connects to said first PSD IO port;
(5) directly connecting of a said expansion port on the first SVC to said IO device interconnect in the redundant complement that connects to said second PSD IO port;
(6) directly connecting of a said expansion port on the second SVC to said IO device interconnect in the redundant complement that connects to said first PSD IO port; and,
(7) directly connecting of a said expansion port on the second SVC to said IO device interconnect in the redundant complement that connects to said second PSD IO port.
42. The redundant storage virtualization subsystem of claim 5, wherein said first SVC of said redundant SVC pair includes a state-defining circuit for forcefully defining externally connected signal lines of said second SVC to a predetermined state.
43. The redundant storage virtualization subsystem of claim 5, wherein each SVC of said redundant SVC pair includes a self-killing circuit for forcefully defining externally connected signal lines thereof to a predetermined state.
44. An external storage virtualization controller for use in a redundant storage virtualization controller pair, comprising:
a host-side IO device interconnect controller for coupling to a host entity;
a central processing circuitry coupled to the host-side IO device interconnect controller for performing an IO operation in response to an IO request issued by the host entity;
a memory coupled to the central processing circuitry; and
at least one device-side IO device interconnect controller coupled to the central processing circuitry, for performing point-to-point serial signal transmission with a plurality of physical storage devices;
wherein when a second external storage virtualization controller in the redundant storage virtualization controller pair is not on line, said external storage virtualization controller will automatically take over the functionality originally performed by the second external storage virtualization controller.
45. The storage virtualization controller of claim 44 wherein the device-side IO device interconnect controller is a Serial ATA IO device interconnect controller comprising a plurality of Serial ATA ports, each for connecting to a said physical storage device through a Serial ATA IO device interconnect.
46. The storage virtualization controller of claim 44 further comprising an off-line detecting mechanism for detecting an off-line state of said second storage virtualization controller.
47. The storage virtualization controller of claim 44 wherein said functionality includes presenting and making available to the host entity accessible resources that were originally presented and made available by said second storage virtualization controller as well as accessible resources that were presented and made available by said storage virtualization controller itself.
48. A method for performing storage virtualization in a computer system having a first and a second external storage virtualization controller, the method comprising:
performing an IO operation by the second storage virtualization controller in response to an IO request issued by a host entity of the computer system to access at least one of a plurality of physical storage devices of the computer system in point-to-point serial signal transmission; and
when the second storage virtualization controller is not on line, performing the IO operation by the first storage virtualization controller in response to the IO request issued by the host entity to access said at least one of the physical storage devices of the computer system in point-to-point serial signal transmission.
49. The method of claim 48 wherein the point-to-point serial signal transmission is performed in a format complying with Serial ATA protocol.
50. The method of claim 48 wherein said first storage virtualization controller will automatically take over the functionality originally performed by said second storage virtualization controller when the second storage virtualization controller is not on line.
51. The method of claim 50 wherein said functionality includes presenting and making available to the host entity accessible resources that were originally presented and made available by said second storage virtualization controller as well as accessible resources that were presented and made available by said first storage virtualization controller itself.
52. The method of claim 48, further comprising providing a rerouting mechanism for said SVC pair to perform IO request rerouting function.
53. The method of claim 52, wherein said IO request rerouting function is performed by the steps of:
a request initiator of said SVC pair transferring IO request to an access owner of said SVC pair;
said access owner performing said IO request transferred from said request initiator; whereby data associated with said IO request transferred between said access owner and said PSDs are forwarded over to said request initiator.
54. The method of claim 48, further comprising the steps of:
performing an IO operation by the first storage virtualization controller in response to a second IO request issued by said host entity to access said at least one of a plurality of physical storage devices of the computer system in point-to-point serial signal transmission; and
when the first storage virtualization controller is not on line, performing the IO operation by the second storage virtualization controller in response to the IO request issued by the host entity to access said at least one of the physical storage devices in point-to-point serial signal transmission.
55. The method of claim 54, further comprising the steps of:
providing an access control switch coupled between a said physical storage device and the redundant storage virtualization controller pair for selectively allowing patching through of the serial signal of the physical storage device to and from the first SVC when in a first patching state of said access control switch and to and from the second SVC when in a second patching state of said access control switch;
providing a first signal line being of said first SVC active and of said second SVC passive for issuing a first access request signal from said first SVC to said second SVC; providing a second signal line being of said second SVC active and of said first SVC passive for issuing a second access request signal from said second SVC to said first SVC;
an access requester of said SVC pair asserting its active signal line to request access ownership to a said PSD from an access owner of said SVC pair;
said access owner deasserting its active signal line to relinquish said access ownership; and
said access requester asserting its active signal line and changing said patching state of said access control switch to acquire said access ownership.
56. The method of claim 55, further comprising the steps of said access owner holding up and queuing up new IO requests for later execution and completing all pending IOs, after the step of said access requester asserting its active signal line and before the step of said access owner deasserting its active signal line.
57. The method of claim 54, further comprising the steps of:
providing an access control switch coupled between a said PSD and the redundant storage virtualization controller pair for selectively switching the connection of the PSD to the redundant storage virtualization controller pair between the first and the second storage virtualization controller;
providing a first access ownership arbitration circuit for acquiring access ownership of said access control switch for said first SVC having a first access ownership request (AOR) signal line as a first input line thereof coupled to said first SVC, and a first access control switch control signal (ACSCS) line coupled to said access control switch as a first output line;
providing a second access ownership arbitration circuit for acquiring access ownership of said access control switch for said second SVC having a second access ownership request signal line as a first input line thereof coupled to said second SVC, and a second access control switch control signal line coupled to said access control switch as a second output line; whereby
when one SVC of said SVC pair asserts its AOR signal line while the other SVC of said SVC pair has been asserting its ACSCS line, said ACSCS line of said one SVC will not be asserted until said other SVC deasserts its said ACSCS line.
58. The method of claim 57, further comprising the steps of:
providing said first access ownership arbitration circuit a first alternate SVC access ownership request (ASAOR) signal line as a second input line thereof coupled to said second SVC;
providing said second access ownership arbitration circuit a second alternate SVC access ownership request (ASAOR) signal line as a second input line thereof coupled to said first SVC; whereby
when a said ASAOR signal line of one SVC of said SVC pair is asserted, said ASAOR signal line of the other SVC of said SVC pair will be asserted unless said first and second SVCs are asserting said AOR signal lines concurrently.
59. The method of claim 58, further comprising the steps of:
providing an access ownership determining mechanism for granting access ownership of said access control switch to one SVC of said SVC pair when said first and second SVCs are asserting said AOR signal lines concurrently.
60. The method of claim 59, wherein said access ownership determining mechanism grants access ownership of said access control switch to said one SVC of said SVC pair by asserting said ASAOR signal line coupled to said access ownership arbitration circuit for the other SVC of said SVC pair.
61. The method of claim 57, wherein information exchanges associated with access ownership transfer between said SVC pair is communicated as a part of inter-controller communications.
62. The method of claim 48, further comprising providing an access ownership transferring mechanism for one of said SVCs that possessed access ownership of said PSD to transfer said access ownership to the other of said SVCs.
63. The method of claim 62, wherein said access ownership transferring mechanism performs the steps of:
(a) an access requester of said SVC pair issuing an access request signal to an access owner of said SVC pair for requesting access ownership to a said PSD;
(b) said access owner relinquishing said access ownership such that it is not an access owner now; and
(c) said access requester acquiring said access ownership and becoming a new access owner of said PSD.
64. The method of claim 63, further comprising the steps of said access owner holding up and queuing up new IO requests for later execution and completing all pending IOs, after the step of (a) said access requester issuing said access request signal to said access owner and before the step of (b) said access owner relinquishing said access ownership.
65. The method of claim 63, wherein in step (b), said access owner relinquishes said access ownership by modifying a state of an access control switch coupled between said SVC pair and said PSD.
66. The method of claim 63, wherein in step (c), said access requester acquires said access ownership by modifying a state of an access control switch coupled between said SVC pair and said PSD.
67. A computer-readable storage medium having a computer program code stored therein that is capable of causing a computer system having a host entity, a first and a second external storage virtualization controller coupled to the host entity and a plurality of physical storage devices coupled to the first and the second storage virtualization controller to perform the steps of:
performing an IO operation by the second storage virtualization controller in response to an IO request issued by the host entity to access at least one of the physical storage devices in point-to-point serial signal transmission; and
said first storage virtualization controller automatically performing the IO operation that was originally performed by the second storage virtualization controller in response to the IO request issued by the host entity to access at least one of the physical storage devices in point-to-point serial signal transmission when the second storage virtualization controller is not on line.
68. The computer readable medium of claim 67 wherein each of the physical storage devices is coupled to the first and the second storage virtualization controller through a Serial ATA IO device interconnect.
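The access ownership transfer recited in claims 63 and 64 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the names `AccessControlSwitch`, `Controller`, `transfer_ownership` and the `arriving` parameter are hypothetical, IO requests are modeled as plain strings, and the sketch leaves open what later happens to the held requests.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class AccessControlSwitch:
    """Hypothetical model of the switch patching one PSD through to one SVC."""
    patched_to: str  # name of the SVC currently patched through


@dataclass
class Controller:
    """Hypothetical model of one SVC of the redundant pair."""
    name: str
    owns: bool = False      # True while this SVC is the access owner
    holding: bool = False   # True while new IO requests are being held
    pending: list = field(default_factory=list)    # IOs accepted for execution
    held: deque = field(default_factory=deque)     # IOs queued for later execution
    completed: list = field(default_factory=list)  # IOs that have finished

    def submit(self, io: str) -> None:
        # During a transfer, new requests are queued instead of executed.
        if self.holding:
            self.held.append(io)
        else:
            self.pending.append(io)

    def drain(self) -> None:
        # Complete every IO that was already pending.
        self.completed.extend(self.pending)
        self.pending.clear()


def transfer_ownership(requester, owner, switch, arriving=()):
    """Move access ownership of one PSD from `owner` to `requester`.

    `arriving` stands in for IO requests that reach the owner while the
    transfer is in progress (an assumption made for illustration).
    """
    if not owner.owns or switch.patched_to != owner.name:
        raise ValueError("owner does not currently hold access ownership")
    # Step (a): the requester asks for ownership; the owner starts holding
    # new requests and completes all pending IOs (claim 64).
    owner.holding = True
    for io in arriving:
        owner.submit(io)
    owner.drain()
    # Step (b): the owner relinquishes ownership.
    owner.owns = False
    # Step (c): the requester acquires ownership by changing the switch state.
    switch.patched_to = requester.name
    requester.owns = True
    requester.holding = False
```

In this sketch the requester changes the switch state in step (c), as in claim 66; claim 65 instead has the owner modify the switch state when relinquishing ownership in step (b).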
Description
    BACKGROUND OF INVENTION
  • [0001]
    1. Field of the Invention
  • [0002]
    The invention relates to a redundant external storage virtualization computer system. More particularly, a redundant external storage virtualization computer system that uses point-to-point serial-signal interconnects as the primary device-side IO device interconnects is disclosed.
  • [0003]
    2. Description of the Prior Art
  • [0004]
    Storage virtualization is a technology that has been used to virtualize physical storage by combining sections of physical storage devices (PSDs) into logical storage entities, herein referred to as logical media units (LMUs), that are made accessible to a host system. This technology has been used primarily in redundant arrays of independent disks (RAID) storage virtualization, which combines smaller physical storage devices into larger, fault tolerant, higher performance logical media units via RAID technology.
  • [0005]
    An External (sometimes referred to as “Stand-alone”) Storage Virtualization Controller is a Storage Virtualization Controller that connects to the host system via an IO interface and that is capable of supporting connection to devices that reside external to the host system and, otherwise, operates independently of the host.
  • [0006]
    One example of an external Storage Virtualization Controller is an external, or stand-alone, direct-access RAID controller. A RAID controller combines sections on one or multiple physical storage devices (PSDs), the combination of which is determined by the nature of a particular RAID level, to form logical media units that are contiguously addressable by a host system to which the logical media unit is made available. A single RAID controller will typically support multiple RAID levels so that different logical media units may consist of sections of PSDs combined in different ways by virtue of the different RAID levels that characterize the different units.
  • [0007]
    Another example of an external Storage Virtualization Controller is a JBOD emulation controller. A JBOD, short for “Just a Bunch of Drives”, is a set of physical DASDs that connect directly to a host system via one or more multiple-device IO device interconnect channels. PSDs that implement point-to-point IO device interconnects to connect to the host system (e.g., Parallel ATA HDDs, Serial ATA HDDs, etc.) cannot be directly combined to form a “JBOD” system as defined above, for they do not allow the connection of multiple devices directly to the IO device channel.
  • [0008]
    Another example of an external Storage Virtualization Controller is a controller for an external tape backup sub-system.
  • [0009]
    The primary motivation in configuring a pair of external storage virtualization controllers (SVCs) into a redundant pair is to allow continued, uninterrupted access to data by a host (or more than one host) even in the event of a malfunction or failure of a single SVC. This is accomplished by incorporating functionality into the SVCs that allows one controller to take over for the other in the event that the other becomes handicapped or completely incapacitated.
  • [0010]
    On the device side, this requires that both controllers are able to access all of the physical storage devices (PSDs) that are being managed by the SVCs, no matter which SVC any given PSD may initially be assigned to be managed by. On the host side, this requires that each SVC have the ability to present and make available to the host all accessible resources, including those that were originally assigned to be managed by the alternate SVC, in the event that its mate does not initially come on line or goes off line at some point (e.g., due to a malfunction/failure, maintenance operation, etc.).
  • [0011]
    A typical device-side implementation of this would be one in which device-side IO device interconnects are of the multiple-initiator, multiple-device kind (such as Fibre, Parallel SCSI), and all device-side IO device interconnects are connected to both SVCs such that either SVC can access any PSD connected on a device-side IO device interconnect. When both SVCs are on-line and operational, each PSD would be managed by one or the other SVC, typically determined by user setting or configuration. As an example, all member PSDs of a logical media unit (LMU) that consists of a RAID combination of PSDs would be managed by the particular SVC to which the logical media unit itself is assigned.
  • [0012]
    A typical host-side implementation would consist of multiple-device IO device interconnects to which the host(s) and both SVCs are connected and, for each interconnect, each SVC would present its own unique set of device IDs, to which LMUs are mapped. If a particular SVC does not come on line or goes off line, the on-line SVC presents both sets of device IDs on the host-side interconnect, its own set together with the set normally assigned to its mate, and maps LMUs to these IDs in the identical way they are mapped when both SVCs are on-line and fully operational. In this kind of implementation, no special functionality on the part of the host that switches over from one device/path to another is required to maintain access to all logical media units in the event that an SVC is not on-line. This kind of implementation is commonly referred to as “transparent” redundancy.
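    The “transparent” takeover described above can be sketched in a few lines. This is a minimal illustration only: the class name SVC, the functions present_ids and route, and the sample device IDs and LMU names are all hypothetical and are not taken from the disclosure. It shows the surviving controller presenting its mate's device IDs with the identical LMU mapping, so the host keeps addressing the same IDs.

```python
from dataclasses import dataclass, field


@dataclass
class SVC:
    """One storage virtualization controller on a shared host-side interconnect."""
    name: str
    own_ids: dict            # device ID -> LMU normally presented by this SVC
    online: bool = True
    presented: dict = field(default_factory=dict)


def present_ids(svc_a: SVC, svc_b: SVC) -> None:
    """Recompute which device IDs each SVC presents to the host."""
    for svc, mate in ((svc_a, svc_b), (svc_b, svc_a)):
        if not svc.online:
            svc.presented = {}
            continue
        svc.presented = dict(svc.own_ids)
        if not mate.online:
            # The survivor also presents the mate's IDs with the identical mapping.
            svc.presented.update(mate.own_ids)


def route(device_id: int, svc_a: SVC, svc_b: SVC) -> str:
    """Return the name of the SVC that answers a host request for device_id."""
    for svc in (svc_a, svc_b):
        if device_id in svc.presented:
            return svc.name
    raise LookupError(f"device ID {device_id} is not presented")


# Hypothetical assignment: SVC1 owns IDs 0 and 1, SVC2 owns ID 2.
svc1 = SVC("SVC1", {0: "LMU-A", 1: "LMU-B"})
svc2 = SVC("SVC2", {2: "LMU-C"})
present_ids(svc1, svc2)
before = route(2, svc1, svc2)   # answered by SVC2 while both are on line

svc2.online = False
present_ids(svc1, svc2)
after = route(2, svc1, svc2)    # answered by SVC1 after the takeover
```

    The host-side code path (route) is the same before and after the failure, which is the sense in which no special host functionality is needed.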
  • [0013]
    Redundant SVC configurations are typically divided into two categories. The first is “active-standby” in which one SVC is presenting, managing, and processing all IO requests for all logical media units in the storage virtualization subsystem (abbreviated SVS) while the other SVC simply stands by ready to take over in the event that the active SVC becomes handicapped or incapacitated. The second is “active-active” in which both SVCs are presenting, managing, and processing IO requests for the various LMUs that are present in the SVS concurrently. In active-active configurations, both SVCs are always ready to take over for the other in the event that one malfunctions, causing it to become handicapped or incapacitated. Active-active configurations typically provide better levels of performance because the resources of both SVCs (e.g., CPU time, internal bus bandwidth, etc) can be brought to bear in servicing IO requests rather than the resources of only one SVC.
  • [0014]
    Another essential element of a redundant storage virtualization system is the ability for each SVC to monitor the status of the other. Typically, this would be accomplished by implementing an inter-controller communications channel (abbreviated ICC) between the two SVCs over which they can exchange the operating status. This communications channel may be dedicated, the sole function of which is to exchange parameters and data relating to the operation of the redundant storage virtualization sub-system, or it can be one or more of the IO device interconnects, host-side or device-side, over which operational parameters and data exchange are multiplexed together with host-SVC or device-SVC IO-request-associated data on these interconnects.
  • [0015]
    Yet another important element of a redundant storage virtualization system is the ability of one SVC to completely incapacitate the other so that it can completely take over for the other SVC without interference. For example, for the surviving SVC to take on the identity of its mate, it may need to take on the device IDs that the SVC going off line originally presented on the host-side IO device interconnect, which, in turn, requires that the SVC going off line relinquish its control over those IDs.
  • [0016]
    This “incapacitation” is typically accomplished by the assertion of reset signal lines on the controller being taken off line, bringing all externally connected signal lines to a pre-defined state that eliminates the possibility of interference with the surviving SVC. Interconnecting reset lines between the SVCs so that one can reset the other in this event is one common way of achieving this. Another way to accomplish this is to build in the ability of an SVC to detect when it may itself be malfunctioning and “kill” itself by asserting its own reset signals (e.g., inclusion of a “watchdog” timer that will assert a reset signal should the program running on the SVC fail to poll it within a predefined interval), bringing all externally connected signal lines to a pre-defined state that eliminates the possibility of interference with the surviving SVC.
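    The watchdog behaviour mentioned above can be modelled as follows. This is a simplified sketch under stated assumptions: the class Watchdog, its methods poll and tick, and the one-second interval are hypothetical, and time is passed in explicitly so that the example is deterministic. A real watchdog would be a hardware timer driving the reset lines.

```python
class Watchdog:
    """Asserts a reset if the controller program does not poll it in time."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.last_poll = 0.0
        self.reset_asserted = False

    def poll(self, now: float) -> None:
        """Called by the healthy controller program to restart the interval."""
        if not self.reset_asserted:
            self.last_poll = now

    def tick(self, now: float) -> bool:
        """Timer-side check; latches the reset once the interval is exceeded."""
        if now - self.last_poll > self.timeout:
            self.reset_asserted = True
        return self.reset_asserted


wd = Watchdog(timeout=1.0)
wd.poll(now=0.5)
healthy = wd.tick(now=1.2)   # 0.7 s since the last poll: no reset
hung = wd.tick(now=2.0)      # 1.5 s since the last poll: reset asserted
```

    Once latched, the reset stays asserted even if the program later polls again, which mirrors the requirement that the failed SVC not interfere with the survivor.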
  • [0017]
    Traditionally storage virtualization has been done with Parallel SCSI or Fibre IO device interconnects as the primary device-side IO device interconnects connecting physical storage devices to the storage virtualization controller pair. Both Parallel SCSI and Fibre are multiple-device IO device interconnects. Multiple-device IO device interconnects share bandwidth among all hosts and all devices interconnected by the interconnects.
  • [0018]
    Please refer to FIG. 1, where a block diagram of a conventional redundant external storage virtualization computer system is illustrated. Note the interconnection of the host-side IO device interconnects that allows an SVC to take over for its mate by taking over the IO device interconnect IDs that would normally be presented onto the interconnect by its mate and mapping logical media units to these IDs in the same way its mate would. Also, note the interconnection of the device-side IO device interconnects that allow both SVCs access to all PSDs connected to the device-side IO device interconnects. In this example, a typical IO device interconnect that might be used on either host side or device side might be parallel SCSI or Fibre FC-AL, both multiple-initiator, multiple-device IO device interconnects. Therefore, both SVCs operating in target mode (i.e., device mode) are connected to a single interconnect on the host side and allow both SVCs operating in initiator mode, together with multiple devices, to be interconnected on the device side. The configuration shown in FIG. 1 suffers from the drawback that a malfunction of a single PSD, depending on the nature of the malfunction, can potentially bring down an entire device-side IO device interconnect making all other PSDs connected on the same interconnect inaccessible.
  • [0019]
    FIG. 2 diagrams an improvement on this that effectively avoids the possibility that access to other PSDs connected on the same device-side IO device interconnect might be disrupted due to a malfunction that causes a single device-side interconnect to fail by making use of dual-ported PSDs and adding an additional interconnect to each PSD. In this way, the blockage of a single device-side IO device interconnect, possibly caused by a malfunction of an interconnect controller IC on the PSD, would not result in the inaccessibility of other PSDs connected on the same interconnect because the second interconnect connected to each of the same PSDs can be used to access those PSDs without interference.
  • [0020]
    The configuration shown in FIG. 2 has the further advantage that IO request load can be distributed between the redundant device-side interconnects, thereby effectively doubling the overall bandwidth of the device-side IO device interconnect subsystem as compared to the single-interconnect-per-PSD-set configuration shown in FIG. 1. In this case, the device-side IO device interconnect of choice would typically be Fibre FC-AL because of the dual-ported nature of Fibre FC-AL PSDs currently on the market and the elements of the Fibre protocol that allow an initiator, such as an SVC, to determine which interconnect IDs on different interconnects correspond to the same PSD.
  • [0021]
    While the configuration depicted in FIG. 2 is, indeed, far more robust than that depicted in FIG. 1 in the face of device-side IO device interconnect failure, there is still the possibility that a PSD might malfunction in such a way that it could bring down both IO device interconnects that are connected to its dual-ported port pair. Were this to happen, once again, access to other PSDs connected on the same interconnect pair would be disrupted. In a logical media unit that consists of a standard singly-redundant RAID combination of PSDs (e.g., RAID 5), this could prove disastrous for it can cause multiple PSDs in the combination to go off line causing the entire LMU to go off line.
  • SUMMARY OF INVENTION
  • [0022]
    It is therefore a primary objective of the claimed invention to provide a redundant external storage virtualization computer system using point-to-point serial-signal transmissions as the primary device-side IO device interconnects to solve the above-mentioned problem.
  • [0023]
    According to the claimed invention, a redundant external storage virtualization computer system is introduced. The redundant external storage virtualization computer system includes a host entity for issuing an IO request, a redundant storage virtualization controller pair coupled to the host entity for performing an IO operation in response to the IO request issued by the host entity, and a plurality of physical storage devices for providing storage to the computer system. Each of the physical storage devices is coupled to the redundant storage virtualization controller pair through a point-to-point serial-signal interconnect. The redundant storage virtualization controller pair includes a first and a second storage virtualization controller coupled to the host entity. In the redundant storage virtualization controller pair, when the first storage virtualization controller is not on line or not in operation, the second storage virtualization controller will take over the functionality originally performed by the first storage virtualization controller. In one embodiment of the present invention, the point-to-point serial-signal interconnect is a Serial ATA IO device interconnect.
  • [0024]
    It is an advantage of the claimed invention that in the redundant external storage virtualization computer system using Serial ATA as the primary device-side IO device interconnect, each physical storage device has a dedicated interconnect to the storage virtualization controller pair.
  • [0025]
    It is another advantage of the claimed invention that not only the payload data portion of information but also the control information are protected by the SATA IO device interconnect.
  • [0026]
    These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.
  • BRIEF DESCRIPTION OF DRAWINGS
  • [0027]
    FIG. 1 is a block diagram of a conventional redundant external storage virtualization computer system.
  • [0028]
    FIG. 2 is a block diagram of another conventional redundant external storage virtualization computer system.
  • [0029]
    FIG. 3 is a block diagram of a redundant external storage virtualization computer system according to the present invention.
  • [0030]
    FIG. 4 is a block diagram of an access control switch using only one signal line to control states and a related truth table.
  • [0031]
    FIG. 5 is a block diagram of an access control switch using two signal lines to control states and a related truth table.
  • [0032]
    FIG. 6 is a block diagram of an SVC according to the present invention.
  • [0033]
    FIG. 7 is an embodiment block diagram of the CPC in FIG. 6.
  • [0034]
    FIG. 8 is an embodiment block diagram of the CPU chipset/parity engine in FIG. 7.
  • [0035]
    FIG. 9 is a block diagram of the SATA IO device interconnect controller of FIG. 6.
  • [0036]
    FIG. 10 is a block diagram of the PCI-X to SATA controller of FIG. 9.
  • [0037]
    FIG. 11 is a block diagram of the SATA port of FIG. 10.
  • [0038]
    FIG. 12 illustrates the transmission structure complying with serial ATA protocol.
  • [0039]
    FIG. 13 illustrates a first FIS data structure complying with serial ATA protocol.
  • [0040]
    FIG. 14 illustrates a second FIS data structure complying with serial ATA protocol.
  • [0041]
    FIG. 15 is a flow chart of a switchover process.
  • [0042]
    FIG. 16 is a flow chart of a switchover process when a Binary Signal Pair Access Ownership Arbitration mechanism is used.
  • [0043]
    FIG. 17 is a timing diagram for a switchover process of an alternate Binary Signal Pair Access Ownership Arbitration mechanism.
  • [0044]
    FIG. 18 is a flow chart for the switchover process depicted in FIG. 17.
  • [0045]
    FIG. 19 is a flow chart of a mandatory switchover process when one SVC in the SVC pair malfunctions.
  • [0046]
    FIG. 20 is a flow chart of an IO request routing.
  • [0047]
    FIG. 21 is a block diagram of a redundant SVC interconnected expansion port implementation.
  • [0048]
    FIG. 22 is a block diagram showing how hardware switches might be employed to achieve a switchable interconnection.
  • [0049]
    FIG. 23 shows a circuit design that relies on hardware signal detection to activate the switch state changes.
  • [0050]
    FIG. 24 shows a circuit design that takes input from the SVC1 and SVC2, C1 and C2 respectively, to trigger the switch state changes.
  • [0051]
    FIG. 25 shows a hybrid circuit of those shown in FIG. 23 and FIG. 24.
  • [0052]
    FIG. 26 is a block diagram of a redundant SVC interconnected redundant expansion port implementation.
  • [0053]
    FIG. 27 is a block diagram of another redundant SVC interconnected expansion port implementation.
  • [0054]
    FIG. 28 is a block diagram showing an implementation that uses hardware switches to interconnect the two IO device interconnects connecting the two SVCs to the storage units shown in FIG. 27.
  • [0055]
    FIG. 29 shows a circuit design that relies on hardware signal detection to activate the switch state changes of hardware switches shown in FIG. 28.
  • [0056]
    FIG. 30 shows a circuit design that takes input from the SVC1 and SVC2, C1 and C2 respectively, to trigger the switch state changes of hardware switches shown in FIG. 28.
  • [0057]
    FIG. 31 shows a hybrid circuit of those shown in FIG. 29 and FIG. 30.
  • [0058]
    FIG. 32 is a flow chart of an IO request routing in a redundant SVC interconnected expansion port implementation.
  • [0059]
    FIG. 33 is a block diagram of a redundant external storage virtualization computer system comprising two separate host-side ports on each SVC connecting to two entirely separate host-side IO device interconnects and host ports.
  • [0060]
    FIG. 34 is a block diagram showing an example of a switch circuit that can be used to accomplish the host side interconnection of FIG. 33.
  • [0061]
    FIG. 35 is a block diagram of a redundant external storage virtualization computer system comprising one host-side port on each SVC connecting to one host-side IO device interconnect and host ports.
  • [0062]
    FIG. 36 is a block diagram of a removable redundant SATA-PSD canister.
  • [0063]
    FIG. 37 is a block diagram that includes more details of the individual PCBs in the canister of FIG. 36.
  • [0064]
    FIG. 38 is a block diagram of a removable redundant PATA-PSD canister.
  • [0065]
    FIG. 39 is a block diagram that includes more details of the individual PCBs in the canister of FIG. 38.
  • [0066]
    FIG. 40 is a truth table for the control switch of FIG. 4.
  • [0067]
    FIG. 41 is a truth table for the control switch of FIG. 5.
  • [0068]
    FIG. 42 is a table for rerouting loop connections in the event of malfunction.
  • [0069]
    FIG. 43 is a truth table for the circuit shown in FIG. 29.
  • [0070]
    FIG. 44 is a truth table for the circuit shown in FIG. 30.
  • [0071]
    FIG. 45 is a truth table for the circuit shown in FIG. 31.
  • DETAILED DESCRIPTION
  • [0072]
    Please refer to FIG. 3, where an embodiment block diagram of the current invention is illustrated. The system contains a host entity 10 and a redundant storage virtualization subsystem (SVS) 20. The SVS 20 contains a redundant storage virtualization controller pair (including SVC1 200 and SVC2 200) and a plurality of PSDs 420. The redundant storage virtualization subsystem configuration incorporates dedicated, point-to-point IO device interconnects to connect all PSDs 420 to both SVCs 200. The storage virtualization controller 200 can be a RAID controller or a JBOD emulator.
  • [0073]
    Although there is illustrated in FIG. 3 only one host entity 10 connected with one SVS 20, there can be more than one host entity 10 attached to the SVS 20. The host entity 10 can be a host computer, such as a server system, a workstation, a PC system, or the like. Alternatively, the host entity 10 can be another SVC.
  • [0074]
    In order to allow both controllers to access the same PSD 420, an access control switch 342 is inserted in the device-side IO device interconnect path between the SVCs 200 and a PSD 420. Because of the point-to-point nature of the interconnect, only one SVC 200, i.e. the SVC 200 to which the particular PSD 420 is assigned at the time, can be actively accessing the PSD 420 at a time. The other SVC 200 remains in a stand-by mode with respect to this PSD 420 with its IO device interconnect to the particular PSD 420 disabled. One signal line from each SVC 200 is provided for controlling the access control switch 342. The switch 342 determines which SVC interconnect is patched through to the PSD 420.
  • [0075]
    As depicted in FIG. 4, these signal lines may be wired together outside the switch to form a single control line that controls the state of the switch 342 according to a truth table shown in FIG. 40. Alternately, as shown in FIG. 5, the switch 342 can be designed to accept two control inputs, one from each SVC 200 with a truth table (FIG. 41) determining the state of the switch 342 for the four possible combinations of signals from the two SVCs.
  • [0076]
    In a redundant SVS, it is important that any active components or groups thereof be positioned in hot-swappable units so that the subsystem does not have to be brought down in order to replace such components should a component failure occur. Such hot-swappable units are typically referred to as “Field Replaceable Units” (abbreviated FRU). Being active components, both the PSD itself and the access control switch would quite naturally also be located on FRUs. It makes sense to put them together on the same FRU, for one cannot achieve its intended functionality without the other. Therefore, the access control switch would typically be situated with the PSD in the removable PSD canister. FIG. 36 and FIG. 38 show block diagrams of one possible such arrangement.
  • [0077]
    In one implementation, all of the PSDs 420 in the SVS 20 can be combined to form a PSD array 400, and all the access control switches 342 can be combined to form a switching circuit 340. An example of such an implementation is shown in FIG. 6. FIG. 6 depicts a block diagram showing an embodiment of an SVC 200 according to the present invention and the connection thereof to the host entity 10 and the PSD array 400. In this embodiment, the SVC1 200 comprises a host-side IO device interconnect controller 220, a central processing circuit (CPC) 240, a memory 280, a SATA IO device interconnect controller 300, and a redundant controller communicating (RCC) interconnect controller 236. Although illustrated in separate functional blocks, some or all of these functional blocks can be incorporated into one chip. For example, the RCC interconnect controller 236 can be integrated with the host-side IO device interconnect controller 220 as a single-chip IC.
  • [0078]
    The host-side IO device interconnect controller 220 is connected to the host entity 10 and the CPC 240 to serve as an interface and buffer between the SVC1 200 and the host entity 10, and receives IO requests and related data from the host entity 10 and maps and/or transfers them to the CPC 240.
  • [0079]
    When the CPC 240 receives the IO requests of the host entity 10 from the host-side IO device interconnect controller 220, the CPC 240 parses them, performs some operations in response to the IO requests, and sends the requested data and/or reports and/or information of the SVC1 200 back to the host entity 10 through the host-side IO device interconnect controller 220.
  • [0080]
    When a read request is received from the host entity 10, after parsing the request and performing one or more operations in response, the CPC 240 gets the requested data either internally or from the memory 280, or both, and transfers it to the host entity 10. If the data is available neither internally nor in the memory 280, the IO request will be issued to the PSD array 400 through the SATA IO device interconnect controller 300 and the switching circuit 340. The requested data will then be transferred from the PSD array 400 to the memory 280 and passed to the host entity 10 through the host-side IO device interconnect controller 220.
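    The read path described above amounts to a lookup in buffer memory with a fall-through to the PSD array. The sketch below is illustrative only: the function handle_read, the dictionaries standing in for the memory 280 and the PSD array 400, and the block numbers are hypothetical simplifications.

```python
def handle_read(block: int, memory: dict, psd_array: dict) -> tuple:
    """Serve a read from buffer memory if possible, else from the PSD array."""
    if block in memory:
        return memory[block], "memory"
    data = psd_array[block]   # issued through the device-side controller
    memory[block] = data      # staged in memory before being passed to the host
    return data, "psd_array"


memory = {7: b"cached"}
psd_array = {7: b"cached", 8: b"on-disk"}
first = handle_read(8, memory, psd_array)    # miss: fetched from the PSD array
second = handle_read(8, memory, psd_array)   # hit: now served from memory
```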
  • [0081]
    When a write request is received from the host entity 10, after parsing the request and performing one or more operations, the CPC 240 gets the data from the host entity 10 through the host-side IO device interconnect controller 220, stores them in the memory 280, and then transmits the data to the PSD array 400 through the CPC 240. When the write request is a write back request, the IO complete report can be issued to the host entity 10 first and then the CPC 240 performs the actual write operation later; otherwise, an IO complete report can be issued to the host entity 10 after the requested data is actually written into the PSD array 400.
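    The difference between the two write policies is only the order of the IO complete report and the actual write to the PSD array. The following toy model makes that ordering explicit. The class WritePath, its attributes, and the event names are hypothetical and serve only as an illustration.

```python
class WritePath:
    """Toy model of when the IO complete report is sent relative to the PSD write."""

    def __init__(self, write_back: bool):
        self.write_back = write_back
        self.memory = {}      # buffer memory (the memory 280 in the text)
        self.psd_array = {}   # backing physical storage
        self.dirty = set()    # blocks reported complete but not yet on the PSDs
        self.log = []         # ordered record of events, for illustration

    def write(self, block: int, data: bytes) -> None:
        self.memory[block] = data
        if self.write_back:
            # Report completion first; the actual write happens later.
            self.dirty.add(block)
            self.log.append("io_complete")
        else:
            # Report completion only after the data is on the PSD array.
            self._commit(block)
            self.log.append("io_complete")

    def flush(self) -> None:
        """Perform the deferred writes of the write-back policy."""
        for block in sorted(self.dirty):
            self._commit(block)
        self.dirty.clear()

    def _commit(self, block: int) -> None:
        self.psd_array[block] = self.memory[block]
        self.log.append("psd_write")


wb = WritePath(write_back=True)
wb.write(3, b"x")
wb.flush()

wt = WritePath(write_back=False)
wt.write(3, b"x")
```

    In this model, wb.log records the completion report before the PSD write, and wt.log records them in the opposite order.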
  • [0082]
    The memory 280 is connected to the CPC 240 and acts as a buffer to buffer the data transferred between the host entity 10 and the PSD array 400 through the CPC 240. In one embodiment, the memory 280 can be a DRAM; or more particularly, the DRAM can be a SDRAM.
  • [0083]
    The SATA IO device interconnect controller 300 is the device-side IO device interconnect controller connected between the CPC 240 and the PSD array 400. It serves as an interface and buffer between the SVC 200 and the PSD array 400 and receives IO requests and related data issued from CPC 240 and maps and/or transfers them to the PSD array 400. The SATA IO device interconnect controller 300 re-formats the data and control signals received from CPC 240 to comply with SATA protocol and transmits them to the PSD array 400.
  • [0084]
    An enclosure management service (EMS) circuitry 360 can be attached to the CPC 240 for managing circuitry on an enclosure containing the PSD array 400. In another arrangement of the SVS 20, the EMS circuitry 360 can be omitted, depending on the actual requirements of the various product functionality. Alternatively, the function of the EMS circuitry 360 can be incorporated into the CPC 240.
  • [0085]
    In this embodiment, the RCC interconnect controller 236 is implemented in SVC1 200 to connect the CPC 240 to SVC2 200. In addition, the SATA IO device interconnect controller 300 is connected to the PSD array 400 through the switching circuit 340. The switching circuit 340 is also connected to the SVC2 200. In this arrangement, the SVC2 200 can be attached to the SVC1 200. The PSD array 400 can be accessed by the two SVCs 200 through the switching circuit 340. Moreover, the control/data information from the host IO can be transferred from the CPC 240 through the RCC interconnect controller 236 to the SVC2 200 and further to a second PSD array (not shown).
  • [0086]
    In FIG. 7, an embodiment of the CPC 240 is shown, comprising the CPU chipset/parity engine 244, the CPU 242, a ROM (Read Only Memory) 246, an NVRAM (Non-volatile RAM) 248, an LCD module 350 and an enclosure management service circuitry EMS 360. The CPU can be, e.g., a Power PC CPU. The ROM 246 can be a FLASH memory for storing BIOS and/or other programs. The NVRAM is provided for saving information regarding the IO operation execution status of the disk, which can be examined after an abnormal power shut-off occurs before the IO operation execution completes. The LCD module 350 displays the operation of the subsystem. The EMS 360 can control the power of the DASD array and perform other management functions. The ROM 246, the NVRAM 248, the LCD module 350 and the enclosure management service circuitry EMS 360 are connected to the CPU chipset/parity engine 244 through an X-bus.
  • [0087]
    FIG. 8 is a block diagram illustrating an embodiment of the CPU chipset/parity engine 244 according to the present invention. In the present embodiment, the CPU chipset/parity engine 244 mainly comprises parity engine 260, CPU interface 910, memory interface 920, PCI interfaces 930, 932, X-BUS interface 940, and PM BUS 950. The PM BUS 950 is, for example, a 64-bit, 133 MHz bus and connects the parity engine 260, CPU interface 910, memory interface 920, PCI interfaces 930, 932, and X-BUS interface 940 together for communicating data signals and control signals among them.
  • [0088]
    Data and control signals from the host-side IO device interconnect controller 220 enter the CPU chipset/parity engine 244 through PCI interface 930 and are buffered in PM FIFO 934. The PCI interface 930 to the host-side IO device interconnect controller 220 can be, for example, of a bandwidth of 64-bit, 66 MHz. When in the PCI slave cycle, the PCI interface 930 owns the PM bus 950 and the data and control signals in the PM FIFO 934 are then transmitted to either the memory interface 920 or to the CPU interface 910.
  • [0089]
    The data and control signals received by the CPU interface 910 from PM bus 950 are transmitted to CPU 242 for further treatment. The communication between the CPU interface 910 and the CPU 242 can be performed, for example, through a 64 bit data line and a 32 bit address line. The data and control signals can be transmitted to the memory interface 920 through a CM FIFO 922 of a bandwidth of 64 bit, 133 MHz.
  • [0090]
    An ECC (Error Correction Code) circuit 924 is also provided and connected between the CM FIFO 922 and the memory interface 920 to generate ECC code. The ECC code can be generated, for example, by XORing 8 bits of data for a bit of ECC code. The memory interface 920 then stores the data and ECC code to the memory 280, for example, an SDRAM. The data in the memory 280 is transmitted to PM bus 950 through the ECC correction circuit 926 and compared with the ECC code from the ECC circuit 924. The ECC correction circuit 926 has the functionality of one-bit auto-correcting and multi-bit error detecting.
  • [0091]
    The parity engine 260 can perform parity functionality of a certain RAID level in response to the instruction of the CPU 242. Of course, the parity engine 260 can be shut off and perform no parity functionality at all in some situation, for example, in a RAID level 0 case. In one embodiment as shown in the FIG. 8, the parity engine 260 can include an XOR engine 262 to connect with the PM bus 950 through XOR FIFO 264. The XOR engine 262 can perform, for example, the XOR function for a memory location with given address and length of the location.
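    As an illustration of what an XOR function over memory regions provides, the following sketch computes a parity block as the byte-wise XOR of equal-length data blocks and then rebuilds one block from the others. The function name xor_blocks and the sample data are hypothetical, and the single-parity layout is assumed in the style of RAID 5. The parity engine 260 itself is hardware that operates on memory addresses and lengths; it is not modelled here.

```python
from functools import reduce


def xor_blocks(blocks: list) -> bytes:
    """Byte-wise XOR of equal-length blocks, as an XOR engine would compute."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_blocks(data)

# Rebuild the second block from the surviving blocks plus the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
```

    Because XOR is its own inverse, the same operation serves both parity generation and reconstruction of a single missing block.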
  • [0092]
    The PLL (Phase Locked Loop) 980 is provided for maintaining desirable phase shifts between related signals. The timer controller 982 is provided as a timing base for various clocks and signals. The internal registers 984 are provided to register the status of the CPU chipset/parity engine 244 and for controlling the traffic on the PM bus 950. In addition, a pair of UART functionality blocks 986 are provided so that the CPU chipset/parity engine 244 can communicate with the outside through an RS232 interface.
  • [0093]
    In an alternative embodiment, PCI-X interfaces can be used in place of the PCI interfaces 930, 932. Those skilled in the art will know such a replacement can be accomplished without difficulty.
  • [0094]
    Please refer to FIG. 9, where an embodiment block diagram of the SATA IO device interconnect controller 300 of FIG. 6 is illustrated. According to the present embodiment, the SATA IO device interconnect controller 300 comprises two PCI-X to SATA controllers 310. FIG. 10 shows an embodiment block diagram of the PCI-X to SATA controller 310 of FIG. 9. As shown in FIG. 10, each PCI-X to SATA controller 310 comprises a PCI-X Interface 312 connected to the CPC 240, a Dec/Mux Arbiter 314 connected to the PCI-X interface 312, and 8 SATA Ports 600 connected to the Dec/Mux Arbiter 314. The PCI-X interface 312 comprises a bus interface 318 connecting to the Dec/Mux arbiter 314 and a configuration circuitry 316 storing the configuration of the PCI-X to SATA controller 310. The Dec/Mux arbiter 314 performs arbitration between the PCI-X interface 312 and the plurality of SATA ports 600 and address decoding of the transactions from the PCI-X interface 312 to the SATA ports 600. Through an SATA port 600 and the switching circuit 340, the data are transmitted to a PSD 420.
  • [0095]
    Next please refer to FIG. 11. FIG. 11 shows a block diagram illustrating an embodiment of the SATA port 600 of FIG. 10. As shown in FIG. 11, the SATA port 600 comprises a superset register 630, a command block register 640, a control block register 650, and a DMA register 620, all connected to the bus interface 318 of the PCI-X interface 312 through the Dec/Mux Arbiter 314. By filling these registers, data will be transferred between the Dec/Mux arbiter 314 and a transport layer 690 through a dual port FIFO 660 under the control of a DMA controller 670. The information received by the transport layer 690 will be reformatted into a frame information structure (FIS) primitive and transmitted to a Link layer 700.
  • [0096]
    The Link layer 700 then re-formats the FIS into a frame by adding SOF, CRC, EOF, etc., thereto, performs the 8b/10b encoding to produce encoded 8b/10b characters, and transmits the frame to a PHY layer 710.
  • [0097]
    The PHY layer 710 will transmit signals through a pair of differential signal lines, transmission lines LTX+, LTX−, and receive signals through another pair of differential signal lines, reception lines LRX+, LRX−, through the switching circuit 340 to and from a PSD controller in a PSD 420. The two signal lines of each pair, for example LTX+/LTX−, transmit signals TX+/TX− simultaneously at inverse voltages, for example, +V/−V or −V/+V, with respect to a reference voltage Vref, so that the voltage difference will be +2V or −2V, which enhances signal quality. This is also applicable to the transmission of the reception signals RX+/RX− on reception lines LRX+, LRX−.
  • [0098]
    When receiving a frame from the PHY layer 710, the Link layer 700 will decode the encoded 8b/10b characters and remove the SOF, CRC, EOF. A CRC will be calculated over the FIS to compare with the received CRC to ensure the correctness of the received information. When receiving a FIS from the Link layer 700, the transport layer 690 will determine the FIS type and distribute the FIS content to the locations indicated by the FIS type.
  • [0099]
    A transmission structure complying with serial ATA protocol is shown in FIG. 12. The information communicated on the serial line is a sequence of 8b/10b encoded characters. The smallest unit thereof is a double-word (32 bits).
  • [0100]
    The contents of each double-word are grouped to provide low-level control information or to transfer information between a host and a device connected thereto. Two types of data structures transmitted on signal lines are primitives and frames.
  • [0101]
    A primitive consists of a single double-word and is the simplest unit of information that may be communicated between a host and a device. When the bytes in a primitive are encoded, the resulting pattern is unlikely to be misinterpreted as another primitive or a random pattern. Primitives are used primarily to convey real-time state information, to control the transfer of information and to coordinate communication between the host and the device. The first byte of a primitive is a special character.
  • [0102]
    A frame consists of a plurality of double-words, and starts with an SOF (Start Of Frame) primitive and ends with an EOF (End Of Frame) primitive. The SOF is followed by a user payload called a FIS (Frame Information Structure). A CRC (Cyclic-Redundancy Check Code) is the last non-primitive double-word immediately preceding the EOF primitive. The CRC is calculated over the contents of the FIS. Some other flow control primitives (HOLD or HOLDA) are allowed between the SOF and EOF to adjust data flow for the purpose of speed matching.
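    The frame layout just described can be sketched in software as follows. This is a simplified model under stated assumptions: primitives are represented by symbolic strings rather than encoded double-words, 8b/10b encoding is omitted, and the checksum is zlib's CRC-32 used purely as a stand-in, not the CRC defined by the Serial ATA link layer. The function names are hypothetical.

```python
import struct
import zlib

# Symbolic stand-ins for primitives (assumption: real primitives are
# specially encoded double-words, which this model does not reproduce).
SOF, EOF, HOLD, HOLDA = "SOF", "EOF", "HOLD", "HOLDA"
FLOW_CONTROL = {HOLD, HOLDA}


def crc_of(fis_dwords):
    """Stand-in checksum over the FIS contents (NOT the SATA CRC)."""
    return zlib.crc32(struct.pack(f"<{len(fis_dwords)}I", *fis_dwords))


def build_frame(fis_dwords):
    """SOF, FIS payload, CRC as last non-primitive double-word, EOF."""
    return [SOF, *fis_dwords, crc_of(fis_dwords), EOF]


def parse_frame(frame):
    """Strip SOF/EOF and flow-control primitives, then verify the CRC."""
    if frame[0] != SOF or frame[-1] != EOF:
        raise ValueError("frame must start with SOF and end with EOF")
    body = [dw for dw in frame[1:-1] if dw not in FLOW_CONTROL]
    if len(body) < 2:
        raise ValueError("frame carries no FIS")
    *fis, received_crc = body
    if crc_of(fis) != received_crc:
        raise ValueError("CRC mismatch")
    return fis


fis = [0x00000046, 0xDEADBEEF]
frame = build_frame(fis)
# HOLD/HOLDA inserted mid-frame for speed matching do not affect the payload.
held = frame[:2] + [HOLD, HOLDA] + frame[2:]
```

    Parsing either frame or held returns the original FIS, while a frame whose payload was altered in transit is rejected by the CRC comparison.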
  • [0103]
    The transport layer constructs FISs for transmission and decomposes FISs received from the link layer. The transport layer does not maintain context of ATA commands or previous FIS content. As requested, the transport layer constructs an FIS by gathering FIS content and placing them in proper order. There are various types of FIS, two of which are shown in FIG. 13 and FIG. 14.
  • [0104]
    As shown in FIG. 13, a DMA setup FIS contains a HEADER in field 0. The first byte (byte 0) thereof defines the FIS type (41h), and the FIS type defines the remaining fields of this FIS and defines the total length of this FIS as seven double-words. Bit D in byte 1 indicates the direction of the subsequent data transfer. D=1 means transmitter to receiver; D=0 means receiver to transmitter. Bit I in byte 1 is an interrupt bit. Bit R in byte 1 is a reserved bit and set to 0. DMA buffer identifier low/high field (field 1) indicates the DMA buffer region in the host memory. DMA buffer offset field (field 4) is the byte offset into the buffer. DMA transfer count field (field 5) is the number of bytes that will be read or written by the device.
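    A minimal sketch of packing such a seven double-word DMA setup FIS is given below. Several details are assumptions made for illustration and are not stated in the text above: the bit positions chosen for D and I within byte 1, the placement of the buffer identifier low and high halves in fields 1 and 2, the treatment of fields 3 and 6 as reserved zeros, and little-endian byte order. The function name pack_dma_setup is hypothetical.

```python
import struct

FIS_TYPE_DMA_SETUP = 0x41
DIR_BIT = 1 << 5        # assumed position of bit D within byte 1
INTERRUPT_BIT = 1 << 6  # assumed position of bit I within byte 1


def pack_dma_setup(buffer_id, offset, count, to_receiver, interrupt=False):
    """Pack a DMA setup FIS as seven little-endian double-words (28 bytes)."""
    flags = (DIR_BIT if to_receiver else 0) | (INTERRUPT_BIT if interrupt else 0)
    header = FIS_TYPE_DMA_SETUP | (flags << 8)  # byte 0 = type, byte 1 = flags
    return struct.pack(
        "<7I",
        header,                             # field 0: HEADER
        buffer_id & 0xFFFFFFFF,             # field 1: buffer identifier low (assumed)
        (buffer_id >> 32) & 0xFFFFFFFF,     # field 2: buffer identifier high (assumed)
        0,                                  # field 3: reserved (assumed)
        offset,                             # field 4: DMA buffer offset in bytes
        count,                              # field 5: DMA transfer count in bytes
        0,                                  # field 6: reserved (assumed)
    )


raw = pack_dma_setup(0x1122334455667788, offset=0x200, count=4096,
                     to_receiver=True)
fields = struct.unpack("<7I", raw)
```

    The receiver can identify the FIS from byte 0 alone and then knows, from the type, that exactly seven double-words follow this layout.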
  • [0105]
    As shown in FIG. 14, a DATA FIS contains a HEADER in field 0. The first byte (byte 0) thereof defines the FIS type (46h), and the FIS type defines the remaining fields of this FIS and defines the total length of this FIS as n+1 double-words.
  • [0106]
    The R bits in byte 1 are reserved bits and set to 0. The fields 1 through n are double-words of data, which contain the data to transfer. The maximum amount of a single DATA FIS is limited.
  • [0107]
    Please refer back to FIG. 4 and FIG. 5. Typically, the access control switch 342, of which the two depicted in FIG. 4 and FIG. 5 are examples, will be kept in a state that patches through to the PSD 420 the SVC 200 that is assigned to handle the device-side IO requests that are generated as a result of operations initiated in response to host-side IO requests. However, under certain conditions, it may be necessary to temporarily allow the alternate SVC 200 to access the PSD 420. An example of a configuration in which such a condition can arise is one in which one of the SVCs 200 is designated as the master when it comes to certain PSD-related management functions that are performed by the SVC 200 (e.g., monitoring the health of the PSD 420 or accessing an area of the media on the PSD 420 that is reserved for the SVC pair's internal use), while at the same time, some of the LMUs made accessible to the host entity 10 over the host-side IO device interconnects are assigned to the alternate SVC 200 for the purposes of host-side IO request processing. In this case, the SVCs 200 may communicate between themselves (possibly through the RCC interconnect controller 236 of FIG. 6) to determine an appropriate opportunity in which the state of the switch 342 can be safely altered to allow the alternate SVC 200 access to the PSD 420 without disruption of PSD 420 accesses in process. This process will be referred to hereafter as “PSD access ownership transfer”.
  • [0108]
    A switchover process is depicted in the flow chart in FIG. 15. Each SVC 200 is able to determine the current state of the switch 342. This can be accomplished by allowing the SVC 200 to read back the state of the control signal(s) or by having each SVC 200 maintain an image in memory of the current state. The SVC 200 that is not currently patched through to the PSD 420 but requires PSD access, termed access requester, posts a request to the SVC 200 that is currently patched through to the PSD 420, termed access owner, to allow the state of the access control switch 342 to be switched to allow the access requester access to the PSD 420. On receipt of this request, the access owner waits for a “convenient” opportunity to start the switchover process. This process entails allowing any pending IO requests to complete while queuing any IO requests that have not yet started. When all pending requests have completed, the access owner modifies the state of its switch control signal to relinquish accessibility to the PSD 420 and then sends an acknowledgement to the access requester informing it that it can now safely access the PSD 420. At this point, the access requester also modifies the state of its switch control signal to gain accessibility to the PSD 420 which completes the process of switchover and the access requester now becomes the new access owner.
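    The switchover steps above can be sketched as a small simulation. The class and method names below are hypothetical, the request/acknowledgement exchange is reduced to direct method calls, and IO completion is modelled as simply emptying the pending list; none of this is taken from FIG. 15 itself.

```python
from collections import deque


class AccessSwitch:
    """Models the access control switch: at most one SVC is patched through."""

    def __init__(self, owner):
        self.owner = owner

    def release(self, name):
        if self.owner == name:
            self.owner = None

    def engage(self, name):
        if self.owner is not None:
            raise RuntimeError("switch still held by " + self.owner)
        self.owner = name


class SVC:
    def __init__(self, name, switch):
        self.name, self.switch = name, switch
        self.pending = []       # IO requests already started
        self.queued = deque()   # IO requests held back
        self.draining = False

    def submit(self, io):
        # Start the IO only if patched through and not in mid-switchover.
        if self.switch.owner == self.name and not self.draining:
            self.pending.append(io)
        else:
            self.queued.append(io)

    def begin_switchover(self):
        """Access owner: stop starting new IOs once a request is received."""
        self.draining = True

    def finish_switchover(self):
        """Access owner: pending IOs complete, then the switch is released."""
        completed, self.pending = self.pending, []
        self.switch.release(self.name)
        self.draining = False
        return completed        # returning stands in for the acknowledgement

    def take_ownership(self):
        """Access requester: engage the switch after the acknowledgement."""
        self.switch.engage(self.name)
        while self.queued:
            self.pending.append(self.queued.popleft())


switch = AccessSwitch("SVC1")
svc1, svc2 = SVC("SVC1", switch), SVC("SVC2", switch)
svc1.submit("r1")               # started by the access owner
svc2.submit("w1")               # requester has no access yet, so it queues
svc1.begin_switchover()
svc1.submit("r2")               # arrives during draining, so it queues
completed = svc1.finish_switchover()
svc2.take_ownership()
```

    After the last call SVC2 is the new access owner with "w1" started, while "r2" remains queued on SVC1 until ownership is switched back.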
  • [0109]
    At this point, the new access owner is free to issue IO requests to the PSD 420 at will. It may keep ownership until switching back of access ownership is requested by the original access owner, following the same procedure as above. Alternately, at some “convenient” opportunity, as when all the IO requests it has to issue for the time being are executed to completion, the new access owner can automatically start the switch-back process by modifying the state of its access control switch control signal to relinquish accessibility to the PSD 420 and issuing an unsolicited acknowledgement to the original access owner informing it that access ownership has been relinquished and the original access owner can now take back access ownership.
  • [0110]
    Typically, whether the new access owner keeps ownership until a request for switch-back from the original access owner is posted or it automatically transfers ownership back to the original access owner might be fixed by implementation or might be dynamically decided based on such factors as relative frequency of access by the two SVCs 200 and the relative performance impact of keeping ownership versus automatically restoring ownership to the original access owner.
  • [0111]
    The mechanism of inter-controller communication to achieve the above switchover process can take any of a number of forms. One possible mechanism of communication, referred to here as the Binary Signal Pair Access Ownership Arbitration mechanism, is a pair of “access request” signal lines per SATA IO device interconnect, with each digital binary signal line having one SVC 200 set/clear (active SVC on this signal line) the signal and the other SVC 200 read the state of the signal (passive SVC). One of the access request signal lines has SVC1 200 active and SVC2 200 passive while the other access request signal line has SVC1 200 passive and SVC2 200 active. On the passive side, the SVC 200 can read both the current state of the alternate SVC's access request signal and whether the signal has changed state since last reading. On reading, the latter will be cleared.
  • [0112]
    In this mechanism, at the outset, one SVC 200 has ownership of the SATA IO device interconnect and keeps the signal line on which it is active asserted. When the alternate SVC 200 wishes to assume ownership, it becomes the access requester and asserts the signal line on which it is active. It then monitors the signal line on which it is passive, watching for a change in its state indicating that it was deasserted at some point, which, in turn, indicates that the access owner acknowledged its request. At this point, the requesting SVC 200 can take control of the SATA IO device interconnect by altering the state of the switch 342 so that it is patched through to the PSD 420.
  • [0113]
    The access owner, on the other hand, continuously monitors for assertion the signal line on which it is passive. Following detection of assertion, at a “convenient” time, it starts waiting for any pending IO requests to complete while queuing any new IO requests. When all pending IO requests are complete, it acknowledges the access control request by deasserting the signal line on which it is active. If it wishes access control to be returned, as when there are new queued IO requests to be issued, it reasserts the signal line. The access requester monitors the signal line on which it is passive for a change in state rather than for a deasserted state because the access owner may have deasserted, then immediately asserted the signal line such that the access requester may not have the chance to detect the deasserted state. FIG. 16 depicts the flow chart described above.
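    The reason for watching for a change in state rather than a deasserted level can be shown with a minimal model of one access request signal line. The class name RequestLine and its methods are hypothetical; the model assumes only what the text states, namely that the passive side can read the current state together with a changed-since-last-read indication that is cleared on reading.

```python
class RequestLine:
    """One access request line: driven by one SVC, read by the other."""

    def __init__(self, asserted=False):
        self.asserted = asserted
        self._changed = False

    def drive(self, asserted):
        """Active side: set or clear the signal, latching any change."""
        if asserted != self.asserted:
            self.asserted = asserted
            self._changed = True

    def read(self):
        """Passive side: return (state, changed); reading clears the latch."""
        changed, self._changed = self._changed, False
        return self.asserted, changed


owner_line = RequestLine(asserted=True)  # driven by the current access owner
requester_line = RequestLine()           # driven by the access requester

requester_line.drive(True)               # requester asks for ownership
# The owner acknowledges by deasserting, then immediately reasserts because
# it already has newly queued IO requests and wants ownership back.
owner_line.drive(False)
owner_line.drive(True)

state, changed = owner_line.read()       # what the requester observes
```

    Here the requester never sees the deasserted level (state is asserted), yet the change latch still reveals that the acknowledgement took place.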
  • [0114]
    A variation on the above Binary Signal Pair Access Ownership Arbitration mechanism for achieving coordinated access ownership transfer would be to implement a pair of HW circuits, referred to here as “Access Ownership Arbitration” circuits (abbreviated AOA), one for each SVC, that indirectly control the access control switch control signals rather than those signals being controlled directly by the SVC.
  • [0115]
    The output from one of the two AOAs would be connected to and control the access control switch control signal associated with one of the SVCs and the output from the other circuit would be connected to and control the access control switch control signal associated with the other SVC. In addition, each of these AOAs would have the “Access Ownership Request” signals (abbreviated AOR) from both of the SVCs as inputs. When an SVC neither possesses nor is requesting access ownership, its AOR is kept in a deasserted state. While in this state, the output signal of the AOA associated with this SVC is inactive. When the SVC wishes to assume ownership, it asserts its AOR. If the other SVC's AOR is not active, then the AOA associated with the requesting SVC would assert its output signal thereby asserting the access control switch control signal associated with the requesting SVC. If the other SVC's AOR is active, then the requesting SVC's AOA's output remains deasserted until the other SVC's AOR is deasserted, at which time the requesting SVC's AOA output becomes active. The requesting SVC's AOA output then remains active until the requesting SVC's AOR is deasserted, independent of the state of the other SVC's AOR. Typically, the two AOAs would be located in close proximity to the access control switch itself, such as in the PSD canister together with the access control switch.
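    The behaviour of the two AOA circuits can be approximated by the following behavioural sketch. It is a software model, not a circuit design: AOR changes are applied one at a time, so truly simultaneous assertion is not modelled, and the class name AOAPair is hypothetical.

```python
class AOAPair:
    """Behavioural model of the two Access Ownership Arbitration circuits."""

    def __init__(self):
        self.aor = [False, False]  # Access Ownership Request from each SVC
        self.out = [False, False]  # AOA outputs driving the switch control

    def set_aor(self, svc, asserted):
        self.aor[svc] = asserted
        self._settle()

    def _settle(self):
        # An output drops as soon as its own SVC deasserts its AOR.
        for i in (0, 1):
            if not self.aor[i]:
                self.out[i] = False
        # A waiting requester is granted once the other output is inactive;
        # a granted output is never revoked by the other SVC's AOR.
        for i in (0, 1):
            if self.aor[i] and not self.out[i] and not self.out[1 - i]:
                self.out[i] = True


aoa = AOAPair()
aoa.set_aor(0, True)            # SVC 0 requests: granted immediately
first = list(aoa.out)
aoa.set_aor(1, True)            # SVC 1 requests: must wait
waiting = list(aoa.out)
aoa.set_aor(0, False)           # SVC 0 relinquishes: SVC 1 is granted
handed_over = list(aoa.out)
aoa.set_aor(0, True)            # SVC 0 re-requests: waits for SVC 1
re_request = list(aoa.out)
```

    The four snapshots show grant, blocked request, hand-over on deassertion, and a re-request that leaves the current holder undisturbed.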
  • [0116]
    A facility by which an SVC can determine whether or not it was granted access ownership in the event that both SVCs assert their AORs concurrently, and by which the SVC currently possessing access ownership can determine when the other SVC is requesting access ownership, is also required in this mechanism. Providing an SVC the ability to determine the state of the access control switch would accomplish the former, while providing the ability to determine the state of the other SVC's AOR would achieve the latter. However, since these two determinations are made at different times during the access ownership transfer process, they can be combined into a single facility consisting of a single digital binary signal per SVC, referred to here as the “Alternate SVC Access Ownership Request” signal (abbreviated ASAOR), the state of which can be read by the firmware running on the SVC. Normally, this signal would reflect the state of the other SVC's AOR. However, when the SVC is granted access ownership, its ASAOR would be cleared to inactive, independent of the state of the other SVC's AOR, and remain in that state until read by the SVC firmware, after which it would go back to reflecting the state of the other SVC's AOR. FIG. 17 shows a timing diagram depicting the interaction between the various signals in this implementation.
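    The read behaviour of the ASAOR signal can be modelled in a few lines. The class below is a hypothetical sketch of the semantics stated in the paragraph (reflect the other SVC's AOR, except that a grant forces one inactive reading); it does not represent the timing shown in FIG. 17.

```python
class ASAOR:
    """Model of the Alternate SVC Access Ownership Request signal of one SVC."""

    def __init__(self):
        self.other_aor = False       # current state of the other SVC's AOR
        self._grant_latched = False  # set when this SVC is granted ownership

    def on_grant(self):
        self._grant_latched = True

    def read(self):
        """Firmware read: inactive once after a grant, else the other AOR."""
        if self._grant_latched:
            self._grant_latched = False
            return False
        return self.other_aor


signal = ASAOR()
signal.other_aor = True        # the other SVC currently asserts its AOR
before_grant = signal.read()   # active: ownership not yet granted
signal.on_grant()              # the other SVC yielded, then re-asserted at once
at_grant = signal.read()       # inactive: this SVC learns it was granted
after_grant = signal.read()    # active again: the other SVC wants it back
```

    A single readable signal thus answers both questions, because the grant indication and the request indication are needed at different points in the transfer.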
  • [0117]
    In this mechanism, when an SVC wishes to assume ownership, it asserts its AOR and then starts monitoring its ASAOR. When the SVC detects that its ASAOR is inactive, it knows it has been granted ownership and can proceed to access the PSD. It then keeps its AOR asserted until it wishes to relinquish access ownership, at which point it deasserts its AOR. During the period in which the SVC wishes to maintain access ownership, in addition to keeping its AOR asserted, it also monitors its ASAOR for assertion. If it detects assertion indicating that the other SVC wishes to assume ownership, at a “convenient” time, it would start waiting for any pending IO requests to complete while queuing any new IO requests. When all pending IO requests are complete, it would then relinquish ownership and deassert its AOR. If it wishes access ownership to be returned, as when there are new queued IO requests to be issued, it would immediately reassert its AOR. FIG. 18 depicts the flow described above.
  • [0118]
    Another possible communication mechanism is passing access ownership transfer requests and acknowledgements over communication channels that support the transfer of multiple bits and/or bytes of information. A set of inexpensive dedicated communication channels, such as I2C channels, can be implemented for the purpose of exchanging these requests and acknowledgements. Alternately, the implementation can take advantage of the existing inter-controller communication channel (ICC) that allows the two SVCs 200 in the redundant pair to communicate with each other to exchange these access ownership transfer requests and acknowledgements as part of the normal state-synchronization information that gets exchanged between the two SVCs 200.
  • [0119]
    A condition under which switchover of access ownership would be mandated is when the access owner SVC 200 malfunctions in such a way that the alternate SVC 200 must take over its functions. FIG. 19 diagrams the process of switchover of the access ownership in this case. On detection of malfunction of the malfunctioning SVC 200, the alternate SVC 200 asserts the malfunctioning SVC's reset signal to completely incapacitate it and to force all external signal lines into pre-defined states. One such external signal line is the access control switch control signal of the malfunctioning SVC 200. On assertion of the SVC reset signal, this signal line is set to a state that enables the patching through of the surviving SVC 200 to the PSD 420. Following the assertion of the malfunctioning SVC reset signal, the surviving SVC 200 sets the state of its access control switch control signal to engage patching through of itself to the PSD 420. This completes the switchover process.
  • [0120]
    The access control switch 342 will remain in this state until the malfunctioning SVC 200 is replaced or brought back on line and requests ownership to be transferred over to it. The state of the access control switch signal line for each controller at reset, power-up, and during initialization remains such as to disable patching through of itself to the PSD 420 to ensure that it does not interfere with potentially on-going PSD 420 accesses by the on-line SVC 200 by inadvertently forcing the access control switch 342 into a state that disrupts such accesses.
  • [0121]
    An alternate method of handling “occasional” access requirements on the part of the SVC 200 that does not normally have access ownership of the PSD 420 is to have the access owner act as an agent for issuing the IO requests that the SVC 200 requiring access, termed access requester, needs to have executed, an operation termed here “IO Request Rerouting”. This would typically entail transferring all the necessary IO request information to the access owner for it to construct into an IO request to issue to the PSD 420 for the access requester. In addition to the IO request information, the access requester would transfer any payload data to be written to the PSD to the access owner before or during IO request issuance and execution. Any payload data being read from the PSD would be transferred back to the access requester during or after IO request execution. Completion status of the operation, typically information that indicates whether the operation “succeeded” or “failed” and for what reason, would be passed back to the access requester on completion of the IO request execution. FIG. 20 depicts such a flow chart.
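    IO Request Rerouting can be sketched as the access owner executing a request on the requester's behalf and returning the completion status together with any read data. The PSD is replaced here by an in-memory dictionary, and all class names, field names and status strings are assumptions made for this example.

```python
class FakePSD:
    """In-memory stand-in for a PSD, addressed by logical block number."""

    def __init__(self):
        self.blocks = {}

    def execute(self, op, lba, data=None):
        if op == "write":
            self.blocks[lba] = data
            return {"status": "succeeded", "reason": None, "data": None}
        if op == "read":
            if lba not in self.blocks:
                return {"status": "failed", "reason": "block never written",
                        "data": None}
            return {"status": "succeeded", "reason": None,
                    "data": self.blocks[lba]}
        return {"status": "failed", "reason": "unknown operation", "data": None}


class AccessOwner:
    """The SVC that is patched through to the PSD and acts as agent."""

    def __init__(self, psd):
        self.psd = psd

    def reroute(self, request):
        # Build and issue the IO from the requester's request information;
        # write payload travels in, read payload and status travel back.
        return self.psd.execute(request["op"], request["lba"],
                                request.get("data"))


owner = AccessOwner(FakePSD())
# The access requester never touches the PSD or the access control switch.
write_result = owner.reroute({"op": "write", "lba": 7, "data": b"config"})
read_result = owner.reroute({"op": "read", "lba": 7})
missing = owner.reroute({"op": "read", "lba": 99})
```

    No change of the access control switch state occurs at any point, which is the property that distinguishes this approach from access ownership transfer.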
  • [0122]
    There are a couple of advantages of IO Request Rerouting over actually transferring access ownership back and forth in order to allow both SVCs 200 to have access to each PSD 420. Firstly, because of the nature of Serial ATA protocol, requiring a rather extended “bring-up” on the interconnect in going from a “down” state to an “up” state, there could be a significant latency between when an access requester receives ownership from the access owner and when it can actually start initiating PSD accesses. Secondly, the process of “downing” the SATA interface then bringing it back up again may result in the need for SATA interface circuitry on either side to enter states of abnormal condition handling. Occasionally, because abnormal condition handling procedures are typically not as thoroughly tested as normal condition processing, bugs may appear which may interfere with the successful re-bring-up of the interconnect. To minimize this risk, it is good practice to try to minimize the occurrence of what either side would interpret and deal with as abnormal conditions, which, in this case, would include minimizing the instances in which access ownership needs to be transferred.
  • [0123]
    One limitation of a “pure” Serial ATA SVC in which all of the device-side IO device interconnects are Serial ATA is that the number of PSDs that can be connected is limited by the number of device-side IO device interconnects that can be packed onto a single SVC. Because the SATA specification only allows for maximum signal line lengths of 1.5 m, the PSDs connected to one SVC must be packed close enough so that no signal line length exceeds 1.5 m. A typical SATA storage virtualization subsystem will only provide for connection of a maximum of 16 SATA PSDs because of these limitations. So a “pure” SATA storage virtualization subsystem is unable to match the expandability of a Fibre FC-AL storage virtualization subsystem, which would typically allow for connection of up to 250 PSDs via connection of external expansion chassis on the same set of device-side IO device interconnects.
  • [0124]
    In order to overcome this limitation, the current invention optionally includes one or more expansion device-side multiple-device IO device interconnects, herein referred to as device-side expansion ports, such as Parallel SCSI or Fibre FC-AL, on the SVC. These interconnects would typically be wired in such a way as to allow external connection of external expansion chassis. These chassis can be simple “native” JBODs of PSDs directly connected to the interconnect without any intervening conversion circuitry or can be intelligent JBOD emulation subsystems that emulate “native” JBODs using a combination of SATA or PATA PSDs and a single or redundant set of SVCs that provide the conversion from the multiple-device IO device interconnect protocol that provides the connection of the JBOD subsystem to the primary storage virtualization subsystem to the device-side IO device interconnect (SATA or PATA) protocol that provides the connection between the JBOD SVC(s) and the PSDs that they manage.
  • [0125]
    The current invention introduces three possible options for wiring of the device-side expansion ports. FIG. 21 depicts an implementation in which each device-side expansion port on one SVC is interconnected with its complement on the other SVC, referred to here as redundant SVC interconnected expansion port implementation. This allows both SVCs to share the device-side interconnect and the bandwidth it provides during normal operation and allows both SVCs full access to each storage unit port. It further allows either SVC to retain full access to all of the storage units, including those that were originally assigned to the alternate SVC, even in the event of an alternate SVC malfunction.
  • [0126]
    FIG. 22 shows how hardware switches might be employed to achieve this kind of switchable interconnection with an expansion port of a loop-style multiple-device IO device interconnect such as Fibre FC-AL. During normal operation, the state of all of the switches would be “0” thereby interconnecting both SVCs onto the interconnect with the storage unit. If SVC1 malfunctioned, M2 would be set to “1” while keeping M1 cleared at “0” thereby bypassing SVC1 and creating a direct connection from SVC2 to the storage units. If SVC2 malfunctioned, M1 would be set to “1” (M2 is “Don't Care”) thereby bypassing SVC2 and creating a direct connection from SVC1 to the storage units. This switching can be initiated by a hardware signal detection circuit (SDC) that detects whether or not there is a valid signal present on S1 or S2 or it can be initiated by one of the two SVCs when it detects a malfunction in the alternate SVC.
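    The switch settings stated above for FIG. 22 can be captured in a small lookup function. Only the two switches named in the text, M1 and M2, are modelled; the function name and the use of None for “Don't Care” are choices made for this sketch.

```python
DONT_CARE = None


def fig22_switches(svc1_ok, svc2_ok):
    """Return the M1/M2 settings described in the text for each SVC status."""
    if svc1_ok and svc2_ok:
        return {"M1": 0, "M2": 0}          # both SVCs on the interconnect
    if svc2_ok:
        return {"M1": 0, "M2": 1}          # SVC1 bypassed
    if svc1_ok:
        return {"M1": 1, "M2": DONT_CARE}  # SVC2 bypassed
    raise ValueError("no surviving SVC to connect to the storage units")


normal = fig22_switches(True, True)
svc1_failed = fig22_switches(False, True)
svc2_failed = fig22_switches(True, False)
```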
  • [0127]
    FIG. 23 shows a circuit design that relies on hardware signal detection to activate the switch state changes. FIG. 24 shows a circuit design that takes input from the SVC1 and SVC2, C1 and C2 respectively, to trigger the switch state changes. In this implementation, each control signal would be forced into a CLEAR (“0”) state when its corresponding SVC is off line (corresponding circuitry not shown in figure) to avoid the consequences that might arise should the control signal be tri-state or floating and thereby potentially be interpreted by follow on circuitry as SET (“1”). FIG. 25 shows a hybrid of the two that supports switch state changes activated either by hardware signal detection or by input from the SVCs that offers greater flexibility than either one alone.
  • [0128]
    An enhanced implementation, depicted in FIG. 26 and referred to here as the redundant SVC interconnected redundant expansion port implementation, would have pairs of redundant expansion ports rather than independent ports in order to keep a break or malfunction in an interconnect that services the expansion port of an SVC from causing a complete loss of access by the SVC to storage units connected on the interconnect. In such a configuration, each port in a redundant pair would connect to one of the ports in each of the dual-ported PSDs connected onto the interconnect or to one of the ports in a dual-ported storage virtualization subsystem that emulates a multiplicity of PSDs connected onto the interconnect (e.g., JBOD emulation storage virtualization subsystem). Should one of the expansion ports or the connected interconnects malfunction on an SVC, IO requests would be rerouted through the other expansion-port/interconnect.
  • [0129]
    FIG. 27 depicts another possible implementation in which each expansion port on one SVC has a redundant complement on the other SVC. The expansion port on one SVC and its redundant complement on the other SVC are connected to the two ports of each dual-ported storage unit in such a way that one SVC's expansion port is connected to one port in the dual-ported pair and its complement on the other SVC to the other port. The complementary expansion ports are not interconnected but rather achieve redundancy by virtue of the dual-ported nature of each storage unit. However, dual-portedness of the storage units alone is not sufficient to support redundancy such that access by both SVCs can be maintained in the face of a malfunctioning expansion port on one of the SVCs or a malfunctioning interconnect connecting the expansion port on one of the SVCs to storage units. To achieve this, it is necessary to provide some mechanism for rerouting IO requests over to the interconnect that connects the alternate SVC to the storage units in the event that an SVC's own expansion port or the interconnect connecting it to the storage units malfunctions.
  • [0130]
    FIG. 28 depicts an implementation that uses hardware switches to interconnect the two IO device interconnects connecting the two SVCs to the storage units shown in FIG. 27 in the event of a break or malfunctioning of the portion of interconnect that runs between the SVS and the storage units. This portion of the interconnect would typically be a cable that runs between chassis and is especially vulnerable to breaks. In FIG. 28, the state of all of the switches would be “0” during normal operation, that is, operation in which all interconnects are functioning properly. Signals from the expansion ports are routed directly to an associated port in the storage unit, referred to as the default path, so that each complementary expansion-port/interconnect in the redundant pair, one on each SVC, can operate completely independently without interference of any kind.
  • [0131]
    When it becomes necessary to route all IOs from both SVCs' expansion ports to Storage Unit Port 1 due to a break in the interconnect connecting to Storage Unit Port 2 or perhaps a malfunction in Storage Unit Port 2 itself, M1 and M2 would be set to 1 while M3 would remain clear at 0 and M4 and M5 are “Don't Care”. To route all IOs from both SVCs' expansion ports to Storage Unit Port 2, M1 and M3 would be set to 1 while M4 and M5 would remain clear at 0 and M2 is “Don't Care”. If SVC1 goes off line, to route IOs from SVC2 directly to Storage Unit Port 2, M1, M4, and M5 would remain clear at 0 while M2 and M3 are “Don't Care”. If, in addition to SVC1 going off line, there is also a break in the interconnect to Storage Unit Port 2 or perhaps a malfunction in the port itself, then IOs from SVC2 would need to be routed to Storage Unit Port 1. This would be done by setting M5 and M2 to 1 while M1 remains clear at 0 and M3 and M4 are “Don't Care”. Conversely, if SVC2 goes off line, to route IOs from SVC1 directly to Storage Unit Port 1, M2 and M3 would remain clear at 0 while M1, M4, and M5 are “Don't Care”. If, in addition to SVC2 going off line, there is also a break in the interconnect to Storage Unit Port 1 or perhaps a malfunction in the port itself, then IOs from SVC1 would need to be routed to Storage Unit Port 2. This would be done by setting M3 and M4 to 1 while M5 remains clear at 0 and M1 and M2 are “Don't Care”. The table shown in FIG. 42 summarizes the switch settings for the various possible scenarios.
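    The scenarios enumerated in this paragraph can be collected into a lookup table, transcribed here from the text rather than from FIG. 42. The key names, the use of None for “Don't Care”, and the helper satisfies are assumptions made for this sketch.

```python
X = None  # "Don't Care"

# Key: (which SVCs are on line, which Storage Unit Port carries the IOs).
FIG28_SETTINGS = {
    ("both", "default"):   dict(M1=0, M2=0, M3=0, M4=0, M5=0),
    ("both", "port1"):     dict(M1=1, M2=1, M3=0, M4=X, M5=X),
    ("both", "port2"):     dict(M1=1, M2=X, M3=1, M4=0, M5=0),
    ("svc2_only", "port2"): dict(M1=0, M2=X, M3=X, M4=0, M5=0),
    ("svc2_only", "port1"): dict(M1=0, M2=1, M3=X, M4=X, M5=1),
    ("svc1_only", "port1"): dict(M1=X, M2=0, M3=0, M4=X, M5=X),
    ("svc1_only", "port2"): dict(M1=X, M2=X, M3=1, M4=1, M5=0),
}


def satisfies(actual, required):
    """True if a concrete switch state meets a row, ignoring Don't Cares."""
    return all(want is X or actual[name] == want
               for name, want in required.items())


all_clear = dict(M1=0, M2=0, M3=0, M4=0, M5=0)
to_port1 = dict(M1=1, M2=1, M3=0, M4=0, M5=0)
```

    Note that the all-clear default state also satisfies the two rows in which a single surviving SVC uses its own default port, which is consistent with the default path needing no switching.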
  • [0132]
    The switching could be initiated by a hardware signal detection circuit (SDC) that detects whether or not there is a valid signal present on S1 or S2 or it could be initiated by one of the two SVCs when it detects a break or malfunction in the default path. FIG. 29 shows a circuit design that relies on hardware signal detection to activate the switch state changes (a truth table for the circuit is shown in FIG. 43). In this figure, R1 is an override signal that allows SVC1 to forcibly take control of the state of the switches in the event that SVC2 and/or associated circuitry malfunction. Similarly, R2 is an override signal that allows SVC2 to forcibly take control of the state of the switches in the event that SVC1 and/or associated circuitry malfunction. FIG. 30 shows a circuit design that takes input from the SVC1 and SVC2, C1 and C2 respectively, to trigger the switch state changes (a truth table for the circuit is shown in FIG. 44). In this case, setting C1 by SVC1 is an indication that its IO requests should be routed through Storage Unit Port 2 instead of the default Storage Unit Port 1 while setting C2 by SVC2 is an indication that its IO requests should be routed through Storage Unit Port 1 instead of the default Storage Unit Port 2. FIG. 31 shows a hybrid of the two that supports switch state changes activated either by hardware signal detection or by input from the SVCs that offers greater flexibility than either one alone (a truth table for the circuit is shown in FIG. 45).
  • [0133]
    Yet another option for wiring of device expansion ports in the configuration depicted in FIG. 27 is without any interconnection at all. In this case, redundancy can be achieved by rerouting IO requests from one SVC to the other and out the surviving complementary expansion-port/device-side-interconnect over the inter-SVC communication interconnect that is normally used for synchronizing the mutual states of the two SVCs with each other.
  • [0134]
    When an SVC detects that a storage unit connected on an IO device interconnect that connects to one of its expansion ports can no longer be accessed, whether it is due to a detected break/malfunction in the expansion-port/interconnect or some other cause, the detecting SVC passes the IO request to the alternate SVC for the alternate SVC to issue to the same storage unit via the complementary expansion-port/interconnect and alternate storage unit port. Any data/status associated with the IO request is transferred between the two SVCs during the execution of the IO request. If the expansion-port/interconnect on the alternate SVC appears to be up and functioning normally yet access to the storage unit fails on the alternate SVC also, the storage unit would be considered as having failed or having been removed. If access succeeds, then the loss of access would be considered to be localized to the original SVC and IO requests associated with future accesses to the storage unit are automatically rerouted to the alternate SVC for issuance over the complementary expansion-port/interconnect. During this time, the original SVC monitors the accessibility of the storage unit via its expansion-port/interconnect typically by periodically issuing internally generated IO requests that check the state of the interconnect and storage unit. If, at some point, the original SVC discovers that the storage unit can now be accessed over its expansion-port/interconnect, it will stop rerouting IO requests to the alternate SVC and start issuing them directly over its own expansion-port/interconnect again. FIG. 32 shows a flow chart of this process.
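    The rerouting and fail-back behaviour described above can be sketched as follows. The paths are modelled as callables that raise OSError when the storage unit cannot be reached; the class names, the probe request, and the synchronous calls standing in for inter-SVC transfers are assumptions of this example and are not taken from FIG. 32.

```python
class Path:
    """Stand-in for one expansion-port/interconnect to a storage unit port."""

    def __init__(self, name):
        self.name = name
        self.up = True

    def __call__(self, io):
        if not self.up:
            raise OSError(self.name + " unreachable")
        return (self.name, io)


class ReroutingSVC:
    def __init__(self, own_path, alternate_path):
        self.own_path = own_path              # this SVC's expansion port
        self.alternate_path = alternate_path  # reached via the alternate SVC
        self.rerouting = False
        self.unit_failed = False

    def issue(self, io):
        if not self.rerouting:
            try:
                return self.own_path(io)
            except OSError:
                pass                          # fall through to the alternate
        try:
            result = self.alternate_path(io)
        except OSError:
            self.unit_failed = True           # unreachable on both paths
            raise
        self.rerouting = True                 # loss is local: keep rerouting
        return result

    def probe(self):
        """Periodic internal check; resume direct issue once access returns."""
        try:
            self.own_path("probe")
        except OSError:
            return False
        self.rerouting = False
        return True


own, alt = Path("svc1-port"), Path("svc2-port")
svc = ReroutingSVC(own, alt)
r1 = svc.issue("r1")      # normal: issued over the SVC's own path
own.up = False
r2 = svc.issue("r2")      # own path fails: rerouted to the alternate SVC
own.up = True
r3 = svc.issue("r3")      # still rerouted until a probe succeeds
recovered = svc.probe()
r4 = svc.issue("r4")      # direct issue resumes
```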
  • [0135]
    Another feature that an SVC might typically implement is redundancy in the host-side interconnects, in which multiple host-side interconnect ports are included on the SVC and LMUs are presented to the host identically over two or more of these interconnects. This feature is designed to allow the host to maintain access to the LMU even if one of the interconnects and/or ports on the interconnect should break, become blocked, or otherwise malfunction.
  • [0136]
    FIG. 33 depicts a redundant external storage virtualization computer system comprising two separate host-side ports on each SVC connecting to two entirely separate host-side IO device interconnects and host ports. Each port on one SVC has a complementary port on the alternate SVC to which it is interconnected. In a typical implementation supporting redundancy in the host-side interconnects, each SVC would present the same set of logical media units in an identical fashion on both of its ports.
  • [0137]
    Under normal operation, host(s) can access logical media units through an SVC that is configured to present the LMU over a host-side interconnect. This can be one SVC or both of the SVCs in the redundant pair. If one SVC were to malfunction, logical media units that were already being presented to the host(s) by both SVCs would remain accessible through the normally-functioning SVC and, with the help of special purpose “multiple-redundant-pathing” functionality on the host, on detection that IO request processing through one of the SVCs is disrupted, the IO requests would be completely routed to the normally-functioning SVC.
  • [0138]
    Those LMUs that were originally only being presented to the host by the SVC that is now malfunctioning would immediately be presented to the host(s) by the normally-functioning SVC over host-side interconnects that connect it to the hosts. For these LMUs, the normally-functioning SVC would be able to transparently take over the processing of host IO requests simply by presenting itself on each interconnect, together with all the reassigned logical media units, in an identical way to what the malfunctioning SVC did prior to its malfunctioning. With this kind of “transparent takeover”, the host need not implement special functionality to make it aware of the SVC malfunctioning and reroute IOs itself in response.
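    The idea of transparent takeover can be illustrated with a toy model in which the surviving controller adopts the identity of the failed one along with its logical media units, so that the host keeps addressing the same identity. This is a sketch under stated assumptions: the class `SVC`, its methods, and the string identities such as "id-A" are hypothetical and do not correspond to any real interconnect addressing scheme.

```python
class SVC:
    """Toy storage virtualization controller (hypothetical model)."""

    def __init__(self, port_id, lmus):
        # Map each identity presented on the interconnect to the LMUs
        # that are visible under that identity.
        self.identities = {port_id: set(lmus)}
        self.online = True

    def take_over(self, failed):
        """Present the failed controller's identities and LMUs as our own."""
        failed.online = False
        for port_id, lmus in failed.identities.items():
            self.identities.setdefault(port_id, set()).update(lmus)

    def handles(self, port_id, lmu):
        """True if a host request addressed to (port_id, lmu) is served here."""
        return self.online and lmu in self.identities.get(port_id, ())
```

    After `b.take_over(a)`, a host request addressed to the identity and LMU formerly presented by `a` is served by `b` with no change on the host side.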
  • [0139]
    In addition to SVC redundancy, the two sets of complementary ports in turn form a redundant port complement. A host that has two independent ports connected using two separate IO device interconnects to these two complementary redundant port sets then has two independent paths to each logical media unit over which it can issue IO requests. Should one of the ports on the host or on an SVC malfunction or should the IO device interconnect itself break or become blocked, the hosts implementing multiple-redundant-pathing functionality can reroute IO requests over the other redundant path. Alternately, when both paths are functioning normally, the host can elect to issue IO requests over both paths in an effort to balance the load between the paths, a technique referred to as “load balancing”.
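    Host-side multiple-redundant-pathing with load balancing might look roughly like the following. This is a minimal sketch, assuming a simple round-robin policy (the source does not name a balancing policy); `MultipathHost`, `submit`, and the use of `ConnectionError` to signal a broken path are all hypothetical choices made for illustration.

```python
from itertools import count


class MultipathHost:
    """Host that spreads IO over redundant paths and reroutes on failure."""

    def __init__(self, paths):
        # paths: mapping of path name -> callable(request) that returns a
        # result, or raises ConnectionError when the path is unusable.
        if not paths:
            raise ValueError("at least one path is required")
        self.paths = dict(paths)
        self._counter = count()

    def submit(self, request):
        names = list(self.paths)
        # Round-robin choice of the first path to try (load balancing).
        start = next(self._counter) % len(names)
        for offset in range(len(names)):
            name = names[(start + offset) % len(names)]
            try:
                return name, self.paths[name](request)
            except ConnectionError:
                continue  # reroute the request over the next redundant path
        raise ConnectionError("no path to the logical media unit is available")
```

    With two healthy paths, successive requests alternate between them; if one path raises `ConnectionError`, every request completes over the other.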
  • [0140]
    To achieve the transparent takeover functionality described above, the two ports, one on each SVC, that form each complementary port pair are physically interconnected. For bus-style multiple-device IO device interconnects such as Parallel SCSI, the interconnection simply consists of wiring the devices together directly without any intervening circuitry. For other types of interconnects, special switch circuitry may be required to achieve the physical interconnection required. FIG. 34 shows an example of a switch circuit that can be used to accomplish this interconnection for Fibre interconnects in which hardware signal detection (SDC) is used to activate the switch state changes.
  • [0141]
    In configurations in which the hosts implement multiple-redundant-pathing functionality, there is an alternate host-side interconnect configuration that requires fewer interconnects to achieve similar levels of redundancy as shown in FIG. 35. Note that host-side interconnects connecting an SVC to the hosts are not interconnected to the alternate SVC. In this configuration, interconnect redundancy is achieved by making each LMU accessible to the hosts over a host-side interconnect of one SVC also accessible through an alternate host-side interconnect on the alternate SVC. Should one of the interconnects break, become blocked, or otherwise malfunction, the hosts would still be able to access the LMU through the alternate SVC via the alternate interconnect. Similarly, should one of the SVCs malfunction, the other SVC can take over and, once again, the hosts would still be able to access the LMU through the normally-functioning SVC via the alternate interconnect.
  • [0142]
    A variation on the Redundant Serial ATA storage virtualization Subsystem uses Parallel ATA PSDs rather than Serial ATA PSDs. For each PSD, it incorporates a SATA-to-PATA conversion circuit that resides in close proximity to the PATA PSD between the access control switch and the PSD and typically together with the access control switch in the same FRU. This conversion circuit converts SATA signals and protocol to PATA and back again in the opposite direction. The importance of a Redundant Serial ATA SVS that uses Parallel ATA PSDs lies in the fact that, in the short term, supplies of Serial ATA drives will still be relatively short compared to Parallel ATA, and Serial ATA drives will still be significantly more expensive. During this transitional period, this kind of a subsystem would allow PATA PSDs to be substituted for SATA PSDs, eliminating the concerns over SATA PSD supply and cost. Such a subsystem would typically place the conversion circuit, together with the access control switch, in the removable canister in which the PSD resides. The removable canister allows the PSD and any related circuitry to be easily swapped out in the event that a PSD and/or related circuitry needs servicing. By placing the conversion circuit in the canister, when SATA drives become readily available at a competitive price point, the entire canister contents can be swapped out with a SATA PSD and SATA-related circuitry.
  • [0143]
    Please refer to FIG. 36 and FIG. 38. FIG. 36 depicts a block diagram of a removable redundant SATA-PSD canister. FIG. 37 includes more detailed block diagrams of the individual PCBs in the canister on which the access control switch is situated. FIG. 38 depicts a block diagram of a removable redundant PATA-PSD canister. FIG. 39 includes more detailed block diagrams of the individual PCBs in the canister on which the access control switch and SATA-to-PATA conversion circuits are situated. Both canisters have a pair of SATA IO device interconnects and a set of access control switch control signals coming in from the two SVCs connecting into an access control switch circuit. The primary difference is the presence of a SATA-to-PATA conversion circuit in the removable PATA-PSD canister, which is absent in the removable SATA-PSD canister.
  • [0144]
    Those skilled in the art will readily observe that numerous modifications and alterations of the device may be made while retaining the teaching of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.
Referenced by
Citing Patent | Filing date | Publication date | Applicant | Title
US7363457 | Jul 21, 2005 | Apr 22, 2008 | Sun Microsystems, Inc. | Method and system for providing virtualization data services for legacy storage devices
US7502954 * | May 26, 2004 | Mar 10, 2009 | Emc Corporation | High availability data storage system
US7584304 * | Jul 13, 2005 | Sep 1, 2009 | Samsung Electronics | Non-volatile memory storage device including an interface select switch and associated method
US7627005 * | Sep 29, 2005 | Dec 1, 2009 | Emc Corporation | Multiplexing system
US7676614 * | Jan 12, 2006 | Mar 9, 2010 | Infortrend Technology, Inc. | Redundant storage virtualization computer system
US8201020 * | Nov 12, 2009 | Jun 12, 2012 | International Business Machines Corporation | Method apparatus and system for a redundant and fault tolerant solid state disk
US8281090 * | Oct 9, 2006 | Oct 2, 2012 | Infortrend Technology, Inc. | Pool spares for data storage virtualization subsystem
US8412869 * | Jan 7, 2010 | Apr 2, 2013 | Infortrend Technology, Inc. | Redundant storage virtualization computer system
US8489914 | Apr 30, 2012 | Jul 16, 2013 | International Business Machines Corporation | Method apparatus and system for a redundant and fault tolerant solid state disk
US8756454 | May 13, 2013 | Jun 17, 2014 | International Business Machines Corporation | Method, apparatus, and system for a redundant and fault tolerant solid state disk
US8825974 | Aug 30, 2012 | Sep 2, 2014 | Infortrend Technology, Inc. | Pool spares for data storage virtualization subsystem
US9110607 | Jul 25, 2014 | Aug 18, 2015 | Infortrend Technology, Inc. | Pool spares for data storage virtualization subsystem
US9256521 * | Nov 3, 2011 | Feb 9, 2016 | Pmc-Sierra Us, Inc. | Methods and apparatus for SAS controllers with link list based target queues
US9342413 * | Apr 26, 2007 | May 17, 2016 | Infortrend Technology, Inc. | SAS RAID head
US20060069820 * | Jul 13, 2005 | Mar 30, 2006 | Jeong-Woo Lee | Non-volatile memory storage device including an interface select switch and associated method
US20060155883 * | Jan 12, 2006 | Jul 13, 2006 | Infortrend Technology, Inc. | Redundant storage virtualization computer system
US20070073967 * | Sep 29, 2005 | Mar 29, 2007 | Peeke Douglas E | Multiplexing system
US20070078794 * | Oct 9, 2006 | Apr 5, 2007 | Infortrend Technology, Inc. | Pool spares for data storage virtualization subsystem
US20070255900 * | Apr 26, 2007 | Nov 1, 2007 | Infortrend Technology, Inc. | SAS Raid Head
US20100115162 * | Jan 7, 2010 | May 6, 2010 | Infortrend Technology, Inc. | Redundant Storage Virtualization Computer System
US20110113279 * | Nov 12, 2009 | May 12, 2011 | International Business Machines Corporation | Method Apparatus and System for a Redundant and Fault Tolerant Solid State Disk
Classifications
U.S. Classification: 711/114
International Classification: G06F12/00, G06F13/14, G06F3/06, G06F12/08, G06F13/00
Cooperative Classification: G06F3/0617, G06F3/0683, G06F3/0658, G06F3/0664, G06F3/0607, G06F3/0689
European Classification: G06F3/06A2A4, G06F3/06A6L4, G06F3/06A4T4, G06F3/06A4V2
Legal Events
Date | Code | Event | Description
Feb 12, 2004 | AS | Assignment | Owner name: INFORTREND TECHNOLOGY, INC., TAIWAN
    Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:LIU, LING-YI;LEE, TSE-HAN;SCHNAPP, MICHAEL GORDON;AND OTHERS;REEL/FRAME:014326/0570
    Effective date: 20040212