MULTI-PROTOCOL STORAGE APPLIANCE THAT PROVIDES INTEGRATED SUPPORT FOR FILE AND BLOCK ACCESS PROTOCOLS
FIELD OF THE INVENTION
[0001] The present invention relates to storage systems and, in particular, to a multi-protocol storage appliance that supports file and block access protocols.
BACKGROUND OF THE INVENTION
[0002] A storage system is a computer that provides storage service relating to the organization of information on writable persistent storage devices, such as memories, tapes or disks. The storage system is commonly deployed within a storage area network (SAN) or a network attached storage (NAS) environment. When used within a NAS environment, the storage system may be embodied as a file server including an operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on, e.g., the disks. Each "on-disk" file may be implemented as a set of data structures, e.g., disk blocks, configured to store information, such as the actual data for the file. A directory, on the other hand, may be implemented as a specially formatted file in which information about other files and directories is stored.
[0003] The file server, or filer, may be further configured to operate according to a client/server model of information delivery to thereby allow many client systems (clients) to access shared resources, such as files, stored on the filer. Sharing of files is a hallmark of a NAS system, which is enabled because of its semantic level of access to files and file systems. Storage of information on a NAS system is typically deployed over a computer network comprising a geographically distributed collection of interconnected communication links, such as Ethernet, that allow clients to remotely access the information (files) on the filer. The clients typically communicate with the filer by exchanging discrete frames or packets of data according to pre-defined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP).
[0004] In the client/server model, the client may comprise an application executing on a computer that "connects" to the filer over a computer network, such as a point-to-point link, shared local area network, wide area network or virtual private network implemented over a public network, such as the Internet. NAS systems generally utilize file-based access protocols; therefore, each client may request the services of the filer by issuing file system protocol messages (in the form of packets) to the file system over the network identifying one or more files to be accessed without regard to specific locations, e.g., blocks, in which the data are stored on disk. By supporting a plurality of file system protocols, such as the conventional Common Internet File System (CIFS), the Network File System (NFS) and the Direct Access File System (DAFS) protocols, the utility of the filer may be enhanced for networking clients.
[0005] A SAN is a high-speed network that enables establishment of direct connections between a storage system and its storage devices. The SAN may thus be viewed as an extension to a storage bus and, as such, an operating system of the storage system enables access to stored information using block-based access protocols over the "extended bus". In this context, the extended bus is typically embodied as Fibre Channel (FC) or Ethernet media adapted to operate with block access protocols, such as Small Computer Systems Interface (SCSI) protocol encapsulation over FC or TCP/IP/Ethernet.
[0006] A SAN arrangement or deployment allows decoupling of storage from the storage system, such as an application server, and some level of information storage sharing at the application server level. There are, however, environments wherein a SAN is dedicated to a single server. In some SAN deployments, the information is organized in the form of databases, while in others a file-based organization is employed. Where the information is organized as files, the client requesting the information maintains file mappings and manages file semantics, while its requests (and server responses) address the information in terms of block addressing on disk using, e.g., a logical unit number (lun).
[0007] Previous approaches generally address the SAN and NAS environments using two separate solutions. For those approaches that provide a single solution for both environments, the NAS capabilities are typically "disposed" over the SAN storage system platform using, e.g., a "sidecar" device attached to the SAN platform. However, even these prior systems typically divide storage into distinct SAN and NAS storage domains. That is, the storage spaces for the SAN and NAS domains do not coexist and are physically partitioned by a configuration process implemented by, e.g., a user (system administrator).
[0008] An example of such a prior system is the Symmetrix® system platform available from EMC® Corporation. Broadly stated, individual disks of the SAN storage system (Symmetrix system) are allocated to a NAS sidecar device (e.g., Celerra™ device) that, in turn, exports those disks to NAS clients via, e.g., the NFS and CIFS protocols. A system administrator makes decisions as to the number of disks and the locations of "slices" (extents) of those disks that are aggregated to construct "user-defined volumes" and, thereafter, how those volumes are used. The term "volume" as conventionally used in a SAN environment implies a storage entity that is constructed by specifying physical disks and extents within those disks via operations that combine those extents/disks into a user-defined volume storage entity. Notably, the SAN-based disks and NAS-based disks comprising the user-defined volumes are physically partitioned within the system platform.
[0009] Typically, the system administrator renders its decisions through a complex user interface oriented towards users that are knowledgeable about the underlying physical aspects of the system. That is, the user interface revolves primarily around physical disk structures and management that a system administrator must manipulate in order to present a view of the SAN platform on behalf of a client. For example, the user interface may prompt the administrator to specify the physical disks, along with the sizes of extents within those disks, needed to construct the user-defined volume. In addition, the interface prompts the administrator for the physical locations of those extents and disks, as well as the manner in which they are "glued together" (organized) and made visible (exported) to a SAN client as a user-defined volume corresponding to a disk or lun. Once the physical disks and their extents are selected to construct a volume, only those disks/extents comprise that volume. The system administrator must also specify the form of reliability, e.g., a Redundant Array of Independent (or Inexpensive) Disks (RAID) protection level and/or mirroring, for that constructed volume. RAID groups are then overlaid on top of those selected disks/extents.
[0010] In sum, the prior system approach requires a system administrator to finely configure the physical layout of the disks and their organization to create a user-defined volume that is exported as a single lun to a SAN client. All of the administration associated with this prior approach is grounded on a physical disk basis. For the system administrator to increase the size of the user-defined volume, disks are added and RAID calculations are re-computed to include redundant information associated with data stored on the disks constituting the volume. Clearly, this is a complex and costly approach. The present invention is directed to providing a simple and efficient integrated solution to SAN and NAS storage environments.
SUMMARY OF THE INVENTION
[0011] The present invention relates to a multi-protocol storage appliance that serves file and block protocol access to information stored on storage devices in an integrated manner for both network attached storage (NAS) and storage area network (SAN) deployments. A storage operating system of the appliance implements a file system that cooperates with novel virtualization modules to provide a virtualization system that "virtualizes" the storage space provided by the devices. Notably, the file system provides volume management capabilities for use in block-based access to the information stored on the devices. The virtualization system allows the file system to logically organize the information as named file, directory and virtual disk (vdisk) storage objects to thereby provide an integrated NAS and SAN appliance approach to storage by enabling file-based access to the files and directories, while further enabling block-based access to the vdisks.
[0012] In the illustrative embodiment, the virtualization modules are embodied, e.g., as a vdisk module and a Small Computer Systems Interface (SCSI) target module. The vdisk module provides a data path from the block-based SCSI target module to blocks managed by the file system. The vdisk module also interacts with the file system to enable access by administrative interfaces, such as a streamlined user interface (UI), in response to a system administrator issuing commands to the multi-protocol storage appliance. In addition, the vdisk module manages SAN deployments by, among other things, implementing a comprehensive set of vdisk commands issued through the UI by a system administrator. These vdisk commands are converted to primitive file system operations that interact with the file system and the SCSI target module to implement the vdisks.
[0013] The SCSI target module, in turn, initiates emulation of a disk or logical unit number (lun) by providing a mapping procedure that translates logical block access to luns specified in access requests into virtual block access to vdisks and, for responses to the requests, vdisks into luns. The SCSI target module thus provides a translation layer of the virtualization system between a SAN block (lun) space and a file system space, where luns are represented as vdisks. By "disposing" SAN virtualization over the file system, the multi-protocol storage appliance reverses the approaches taken by prior systems to thereby provide a single unified storage platform for essentially all storage access protocols.
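The translation between lun space and file system space described above can be illustrated with a minimal Python sketch. It is not the appliance's implementation: the class names `Vdisk` and `LunMap`, the 512-byte block size, and the example path are all assumptions introduced here for illustration. The sketch only shows the idea that a logical block address on a lun becomes a byte range within the file that backs the vdisk.

```python
from dataclasses import dataclass

BLOCK_SIZE = 512  # assumed block size in bytes; the text does not specify one


@dataclass
class Vdisk:
    """Hypothetical stand-in for a vdisk: a named file whose bytes back an emulated disk."""
    path: str
    data: bytearray


class LunMap:
    """Hypothetical translation layer from lun block space to vdisk (file) space."""

    def __init__(self):
        self._vdisk_by_lun = {}

    def export(self, lun, vdisk):
        """Make a vdisk visible under the given lun number."""
        self._vdisk_by_lun[lun] = vdisk

    def _extent(self, lun, lba, nblocks):
        """Translate (lun, logical block address, count) into a byte range of the vdisk."""
        vdisk = self._vdisk_by_lun[lun]  # raises KeyError if the lun is not exported
        start = lba * BLOCK_SIZE
        end = start + nblocks * BLOCK_SIZE
        if lba < 0 or nblocks < 0 or end > len(vdisk.data):
            raise ValueError("block range lies outside the vdisk")
        return vdisk, start, end

    def read(self, lun, lba, nblocks):
        vdisk, start, end = self._extent(lun, lba, nblocks)
        return bytes(vdisk.data[start:end])

    def write(self, lun, lba, payload):
        if len(payload) % BLOCK_SIZE:
            raise ValueError("payload must be a whole number of blocks")
        vdisk, start, end = self._extent(lun, lba, len(payload) // BLOCK_SIZE)
        vdisk.data[start:end] = payload


# Usage: export a 4-block vdisk as lun 0, then write and read block 1.
lun_map = LunMap()
lun_map.export(0, Vdisk("/vol/vol1/lun0", bytearray(4 * BLOCK_SIZE)))  # path is hypothetical
lun_map.write(0, 1, b"\xab" * BLOCK_SIZE)
```

Because the backing store is an ordinary file-like object in this sketch, growing the emulated disk would amount to extending the file, which is the kind of simplification the file-system-based approach is meant to offer.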
[0014] Advantageously, the integrated multi-protocol storage appliance provides access controls and, if appropriate, sharing of files and vdisks for all protocols, while preserving data integrity. The storage appliance further provides embedded/integrated virtualization capabilities that obviate the need for a user to apportion storage resources when creating NAS and SAN storage objects. These capabilities include a virtualized storage space that allows the SAN and NAS objects to coexist with respect to global space management within the appliance. Moreover, the integrated storage appliance provides simultaneous support for block access protocols to the same vdisk, as well as a heterogeneous SAN environment with support for clustering.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] The above and further advantages of the invention may be better understood by referring to the following description in conjunction with the accompanying drawings in which like reference numerals indicate identical or functionally similar elements:
[0016] FIG. 1 is a schematic block diagram of a multi-protocol storage appliance configured to operate in storage area network (SAN) and network attached storage (NAS) environments in accordance with the present invention;
[0017] FIG. 2 is a schematic block diagram of a storage operating system of the multi-protocol storage appliance that may be advantageously used with the present invention;
[0018] FIG. 3 is a schematic block diagram of a virtualization system that is implemented by a file system interacting with virtualization modules according to the present invention; and
[0019] FIG. 4 is a flowchart illustrating the sequence of steps involved when accessing information stored on the multi-protocol storage appliance over a SAN network.
DETAILED DESCRIPTION OF AN ILLUSTRATIVE EMBODIMENT
[0020] The present invention is directed to a multi-protocol storage appliance that serves both file and block protocol access to information stored on storage devices in an integrated manner. In this context, the integrated multi-protocol appliance denotes a computer having features such as simplicity of storage service management and ease of storage reconfiguration, including reusable storage space, for users (system administrators) and clients of network attached storage (NAS) and storage area network (SAN) deployments. The storage appliance may provide NAS services through a file system, while the same appliance provides SAN services through SAN virtualization, including logical unit number (lun) emulation.
[0021] FIG. 1 is a schematic block diagram of the multi-protocol storage appliance 100 configured to provide storage service relating to the organization of information on storage devices, such as disks 130. The storage appliance 100 is illustratively embodied as a storage system comprising a processor 122, a memory 124, a plurality of network adapters 125, 126 and a storage adapter 128 interconnected by a system bus 123. The multi-protocol storage appliance 100 also includes a storage operating system 200 that provides a virtualization system (and, in particular, a file system) to logically organize the information as a hierarchical structure of named directory, file and virtual disk (vdisk) storage objects on the disks 130.
[0022] Whereas clients of a NAS-based network environment have a storage viewpoint of files, the clients of a SAN-based network environment have a storage viewpoint of blocks or disks. To that end, the multi-protocol storage appliance 100 presents (exports) disks to SAN clients through the creation of luns or vdisk objects. A vdisk object (hereinafter "vdisk") is a special file type that is implemented by the virtualization system and translated into an emulated disk as viewed by the SAN clients. The multi-protocol storage appliance thereafter makes these emulated disks accessible to the SAN clients through controlled exports, as described further herein.
[0023] In the illustrative embodiment, the memory 124 comprises storage locations that are addressable by the processor and adapters for storing software program code and data structures associated with the present invention. The processor and adapters may, in turn, comprise processing elements and/or logic circuitry configured to execute the software code and manipulate the data structures. The storage operating system 200, portions of which are typically resident in memory and executed by the processing elements, functionally organizes the storage appliance by, inter alia, invoking storage operations in support of the storage service implemented by the appliance. It will be apparent to those skilled in the art that other processing and memory means, including various computer readable media, may be used for storing and executing program instructions pertaining to the inventive system and method described herein.
[0024] The network adapter 125 couples the storage appliance to a plurality of clients 160a,b over point-to-point links, wide area networks, virtual private networks implemented over a public network (Internet) or a shared local area network, hereinafter referred to as an illustrative Ethernet network 165. Therefore, the network adapter 125 may comprise a network interface card (NIC) having the mechanical, electrical and signaling circuitry needed to connect the appliance to a network switch, such as a conventional Ethernet switch 170. For this NAS-based network environment, the clients are configured to access information stored on the multi-protocol appliance as files. The clients 160 communicate with the storage appliance over network 165 by exchanging discrete frames or packets of data according to pre-defined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP).
[0025] The clients 160 may be general-purpose computers configured to execute applications over a variety of operating systems, including the UNIX® and Microsoft® Windows™ operating systems. Client systems generally utilize file-based access protocols when accessing information (in the form of files and directories) over a NAS-based network. Therefore, each client 160 may request the services of the storage appliance 100 by issuing file access protocol messages (in the form of packets) to the appliance over the network 165. For example, a client 160a running the Windows operating system may communicate with the storage appliance 100 using the Common Internet File System (CIFS) protocol over TCP/IP. On the other hand, a client 160b running the UNIX operating system may communicate with the multi-protocol appliance using either the Network File System (NFS) protocol over TCP/IP or the Direct Access File System (DAFS) protocol over a virtual interface (VI) transport in accordance with a remote DMA (RDMA) protocol over TCP/IP. It will be apparent to those skilled in the art that other clients running other types of operating systems may also communicate with the integrated multi-protocol storage appliance using other file access protocols.
[0026] The storage network "target" adapter 126 also couples the multi-protocol storage appliance 100 to clients 160 that may be further configured to access the stored information as blocks or disks. For this SAN-based network environment, the storage appliance is coupled to an illustrative Fibre Channel (FC) network 185. FC is a networking standard describing a suite of protocols and media that is primarily found in SAN deployments. The network target adapter 126 may comprise an FC host bus adapter (HBA) having the mechanical, electrical and signaling circuitry needed to connect the appliance 100 to a SAN network switch, such as a conventional FC switch 180. In addition to providing FC access, the FC HBA may offload Fibre Channel network processing operations for the storage appliance.
[0027] The clients 160 generally utilize block-based access protocols, such as the Small Computer Systems Interface (SCSI) protocol, when accessing information (in the form of blocks, disks or vdisks) over a SAN-based network. SCSI is a peripheral input/output (I/O) interface with a standard, device independent protocol that allows different peripheral devices, such as disks 130, to attach to the storage appliance 100. In SCSI terminology, clients 160 operating in a SAN environment are initiators that initiate requests and commands for data. The multi-protocol storage appliance is thus a target configured to respond to the requests issued by the initiators in accordance with a request/response protocol. The initiators and targets have endpoint addresses that, in accordance with the FC protocol, comprise worldwide names (WWN). A WWN is a unique identifier, e.g., a node name or a port name, consisting of an 8-byte number.
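As a small illustration of the 8-byte WWN mentioned above, the following Python sketch converts between the raw 8-byte value and the colon-separated hexadecimal notation in which such identifiers are commonly written. The function names `format_wwn` and `parse_wwn` and the sample value are assumptions made for this example and do not come from the appliance.

```python
def format_wwn(raw: bytes) -> str:
    """Render an 8-byte WWN as colon-separated hexadecimal octets."""
    if len(raw) != 8:
        raise ValueError("a WWN is an 8-byte number")
    return ":".join(f"{octet:02x}" for octet in raw)


def parse_wwn(text: str) -> bytes:
    """Parse colon-separated (or plain) hexadecimal text into the 8-byte WWN."""
    raw = bytes.fromhex(text.replace(":", ""))
    if len(raw) != 8:
        raise ValueError("a WWN is an 8-byte number")
    return raw


# Usage with an arbitrary, made-up value.
sample = bytes([0x10, 0x00, 0x00, 0x00, 0xC9, 0x2B, 0x4F, 0x01])
text = format_wwn(sample)
```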
[0028] The multi-protocol storage appliance 100 supports various SCSI-based protocols used in SAN deployments, including SCSI encapsulated over TCP (iSCSI) and SCSI encapsulated over FC (FCP). The initiators (hereinafter clients 160) may thus request the services of the target (hereinafter storage appliance 100) by issuing iSCSI and FCP messages over the network 165, 185 to access information stored on the disks. It will be apparent to those skilled in the art that the clients may also request the services of the integrated multi-protocol storage appliance using other block access protocols. By supporting a plurality of block access protocols, the multi-protocol storage appliance provides a unified and coherent access solution to vdisks/luns in a heterogeneous SAN environment.
[0029] The storage adapter 128 cooperates with the storage operating system 200 executing on the storage appliance to access information requested by the clients. The information may be stored on the disks 130 or other similar media adapted to store information. The storage adapter includes I/O interface circuitry that couples to the disks over an I/O interconnect arrangement, such as a conventional high-performance, FC serial link topology. The information is retrieved by the storage adapter and, if necessary, processed by the processor 122 (or the adapter 128 itself) prior to being forwarded over the system bus 123 to the network adapters 125, 126, where the information is formatted into packets or messages and returned to the clients.
[0030] Storage of information on the appliance 100 is preferably implemented as one or more storage volumes (e.g., VOL1-2 150) that comprise a cluster of physical storage disks 130, defining an overall logical arrangement of disk space. The disks within a volume are typically organized as one or more groups of Redundant Array of Independent (or Inexpensive) Disks (RAID). RAID implementations enhance the reliability/integrity of data storage through the writing of data "stripes" across a given number of physical disks in the RAID group, and the appropriate storing of redundant information with respect to the striped data. The redundant information enables recovery of data lost when a storage device fails. It will be apparent to those skilled in the art that other redundancy techniques, such as mirroring, may be used in accordance with the present invention.
[0031] Specifically, each volume 150 is constructed from an array of physical disks 130 that are organized as RAID groups 140, 142, and 144. The physical disks of each RAID group include those disks configured to store striped data (D) and those configured to store parity (P) for the data, in accordance with an illustrative RAID 4 level configuration. It should be noted that other RAID level configurations (e.g., RAID 5) are also contemplated for use with the teachings described herein. In the illustrative embodiment, a minimum of one parity disk and one data disk may be employed. However, a typical implementation may include three data and one parity disk per RAID group and at least one RAID group per volume.
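To make the role of the parity disk concrete, the following Python sketch models one stripe of a RAID 4 group with three data disks and one parity disk, matching the typical layout described above. It assumes the conventional RAID 4 scheme in which the parity block is the bytewise XOR of the data blocks; the text itself does not spell out the parity computation, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    if any(len(block) != len(blocks[0]) for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct_missing(surviving_data_blocks, parity_block):
    """Rebuild the single missing data block of a stripe from the survivors and parity."""
    return xor_blocks(list(surviving_data_blocks) + [parity_block])


# One stripe across three data disks (D) and one parity disk (P).
stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(stripe)

# Simulate losing the second data disk and rebuilding its block.
rebuilt = reconstruct_missing([stripe[0], stripe[2]], parity)
```

The same XOR identity explains why growing a volume in the prior approach requires recomputing redundancy: parity depends on every data block in the stripe.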
[0032] To facilitate access to the disks 130, the storage operating system 200 implements a write-anywhere file system of a novel virtualization system that "virtualizes" the storage space provided by disks 130. The file system logically organizes the information as a hierarchical structure of named directory and file objects (hereinafter "directories" and "files") on the disks. Each "on-disk" file may be implemented as a set of disk blocks configured to store information, such as data, whereas the directory may be implemented as a specially formatted file in which names and links to other files and directories are stored. The virtualization system allows the file system to further logically organize information as a hierarchical structure of named vdisks on the disks, thereby providing an integrated NAS and SAN appliance approach to storage by enabling file-based (NAS) access to the named files and directories, while further enabling block-based (SAN) access to the named vdisks on a file-based storage platform. The file system simplifies the complexity of management of the underlying physical storage in SAN deployments.
[0033] As noted, a vdisk is a special file type in a volume that derives from a plain (regular) file, but that has associated export controls and operation restrictions that support emulation of a disk. Unlike a file that can be created by a client using, e.g., the NFS or CIFS protocol, a vdisk is created on the multi-protocol storage appliance via, e.g., a user interface (UI) as a special typed file (object). Illustratively, the vdisk is a multi-inode object comprising a special file inode that holds data and at least one associated stream inode that holds attributes, including security information. The special file inode functions as a main container for storing data, such as application data, associated with the emulated disk. The stream inode stores attributes that allow luns and exports to persist over, e.g., reboot operations, while also enabling management of the vdisk as a single disk object in relation to SAN clients. An example of a vdisk and its associated inodes that may be advantageously used with the present invention is described in co-pending and commonly assigned U.S. Patent Application Serial No. (1120560069) titled Storage Virtualization by Layering Vdisks on a File System, which application is hereby incorporated by reference as though fully set forth herein.
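The multi-inode structure described above can be pictured with a short Python sketch. This is only a conceptual model under stated assumptions: the class names, the `attributes` dictionary, and the sample attribute keys are invented for illustration and do not reflect the actual on-disk inode format, which is described in the incorporated application.

```python
from dataclasses import dataclass, field


@dataclass
class SpecialFileInode:
    """Main container for the data of the emulated disk."""
    data: bytearray = field(default_factory=bytearray)


@dataclass
class StreamInode:
    """Holds attributes (e.g., security and export information) for the vdisk."""
    attributes: dict = field(default_factory=dict)


@dataclass
class VdiskObject:
    """Hypothetical model of a vdisk: one special file inode plus at least one stream inode."""
    name: str
    file_inode: SpecialFileInode = field(default_factory=SpecialFileInode)
    stream_inodes: list = field(default_factory=lambda: [StreamInode()])

    def __post_init__(self):
        if not self.stream_inodes:
            raise ValueError("a vdisk needs at least one stream inode")

    def set_attribute(self, key, value):
        """Record an attribute in the first stream inode."""
        self.stream_inodes[0].attributes[key] = value

    def get_attribute(self, key):
        return self.stream_inodes[0].attributes[key]


# Usage: attribute names below are made up for the example.
vdisk = VdiskObject("example_vdisk")
vdisk.set_attribute("lun_id", 3)
vdisk.file_inode.data.extend(b"application data")
```

Keeping the attributes in a separate inode from the data is what lets export settings travel with the vdisk as a single object, independent of the contents of the emulated disk.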
[0034] In the illustrative embodiment, the storage operating system is preferably the NetApp® Data ONTAP™ operating system available from Network Appliance, Inc., Sunnyvale, Calif., that implements a Write Anywhere File Layout (WAFL™) file system. However, it is expressly contemplated that any appropriate storage operating system, including a write in-place file system, may be enhanced for use in accordance with the inventive principles described herein. As such, where the term "WAFL" is employed, it should be taken broadly to refer to any storage operating system that is otherwise adaptable to the teachings of this invention.
[0035] As used herein, the term "storage operating system" generally refers to the computer-executable code operable on a computer that manages data access and may, in the case of a multi-protocol storage appliance, implement data access semantics, such as the Data ONTAP storage operating system, which is implemented as a microkernel. The storage operating system can also be implemented as an application program operating over a general-purpose operating system, such as UNIX® or Windows NT®, or as a general-purpose operating system with configurable functionality, which is configured for storage applications as described herein.
[0036] In addition, it will be understood by those skilled in the art that the inventive system and method described herein may apply to any type of special-purpose (e.g., storage serving appliance) or general-purpose computer, including a standalone computer or portion thereof, embodied as or including a storage system. Moreover, the teachings of this invention can be adapted to a variety of storage system architectures including, but not limited to, a network-attached storage environment, a storage area network and disk assembly directly-attached to a client or host computer. The term "storage system" should therefore be taken broadly to include such arrangements in addition to any subsystems configured to perform a storage function and associated with other equipment or systems.
[0037] FIG. 2 is a schematic block diagram of the storage operating system 200 that may be advantageously used with the present invention. The storage operating system comprises a series of software layers organized to form an integrated network protocol stack or, more generally, a multi-protocol engine that provides data paths for clients to access information stored on the multi-protocol storage appliance using block and file access protocols. The protocol stack includes a media access layer 210 of network drivers (e.g., gigabit Ethernet drivers) that interfaces to network