WO2004097686A1 - Transparent file replication using namespace replication - Google Patents

Publication number
WO2004097686A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
nas
switch
request
file handle
Prior art date
Application number
PCT/US2004/012846
Other languages
French (fr)
Inventor
Thomas K. Wong
Panagiotis Tsirigotis
Anand Iyengar
Rajeev Chawla
Original Assignee
Neopath Networks, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US10/831,376 external-priority patent/US7346664B2/en
Priority claimed from US10/831,701 external-priority patent/US7587422B2/en
Application filed by Neopath Networks, Inc. filed Critical Neopath Networks, Inc.
Publication of WO2004097686A1 publication Critical patent/WO2004097686A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/119 Details of migration of file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G06F16/184 Distributed file systems implemented as replicated file system
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/30 Definitions, standards or architectural aspects of layered protocol stacks
    • H04L69/32 Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
    • H04L69/322 Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
    • H04L69/329 Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]

Abstract

A NAS switch provides a centralized point of reconfiguration after a network change that alleviates the need for reconfiguration of each connected client. A file server module (114) includes a file server interface (210), a replication module (220), and a synchronization module (230) with a persistent buffer (235). The file server interface manages client requests before replication without assistance. The replication module replicates a namespace separately from data contained therein. Afterwards, the synchronization module looks up the switch file handle in a file handle replication table to determine if the object has been replicated, and if so, sends one of the replica NAS file handles. The synchronization module also maintains synchronicity between the primary and replica file servers through critical NAS requests that modify objects such as create, delete, and the like. The synchronization module includes a persistent buffer (235) such as a non-volatile memory to improve data integrity.

Description

TRANSPARENT FILE REPLICATION USING NAMESPACE REPLICATION
BACKGROUND OF THE INVENTION
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority under 35 U.S.C. § 119(e) to: U.S. Provisional Patent Application No. 60/465,578, filed on April 24, 2003, entitled "Method and Apparatus for Transparent File Replication Using the Technique of Namespace Replication," by Thomas K. Wong et al; U.S. Provisional Patent Application No. 60/465,579, filed on April 24, 2003, entitled "Method and Apparatus for Transparent File Migration Using the Technique of Namespace Replication," by Thomas K. Wong et al; and is related to U.S. Patent Application No. [attorney docket #23313-07965], filed on [date even herewith], entitled "Transparent File Migration Using Namespace Replication," by Thomas K. Wong et al, each of which is herein incorporated by reference in its entirety.
FIELD OF THE INVENTION
[0002] This invention relates generally to storage networks and, more specifically, to a network device that tracks locations of an object before and after replication on a back-end, while maintaining transparency for a client on the front-end by using persistent file handles to access the objects.
DESCRIPTION OF RELATED ART
[0003] In a computer network, NAS (Network Attached Storage) file servers connected directly to the network provide an inexpensive and easily configurable solution for a storage network. These NAS file servers are self-sufficient because they contain file systems that allow interoperability with clients running any operating system and communication using open protocols. For example, a Unix-based client can use the NFS (Network File System) protocol by Sun Microsystems, Inc. of Santa Clara, California and a Windows-based client can use CIFS (Common Internet File System) by Microsoft Corp. of Redmond, Washington to access files on a NAS file server. However, the operating system does not affect communication between the client and file server. Thus, NAS file servers provide true universal file access.
[0004] By contrast, more expensive and powerful SAN (Storage Area Network) file servers use resources connected by Fibre Channel on a back-end, or dedicated network. Additionally, a SAN file system is part of the operating system or an application running on the client. Different operating systems may require additional copies of each file to be stored on the storage network to ensure compatibility. Communication between file servers on a SAN uses proprietary protocols, so the file servers are typically provided by a common vendor. As a result, NAS file servers are preferred when price and ease of use are major considerations. However, the benefits of NAS storage networks over SAN storage networks also have drawbacks.
[0005] One drawback with NAS file servers is that there is no centralized control. When NAS file servers are either added or removed from the storage network, each client must mount or unmount the associated storage resources as appropriate. This is particularly inefficient when there are changes in hardware, but not in the particular files available on the network, such as when a failing NAS file server is swapped out for an identically configured back-up NAS file server.
[0006] A related drawback is that a client must be reconfigured each time a file is relocated within the storage network, such as during file migration or file replication. To access objects, the client generates a NAS file handle from a mounted directory. The NAS file handle identifies a physical location of the object on the storage network. To request that the NAS file server perform an operation on the object (e.g., create, delete, etc.), the client sends a NAS request directly to the NAS file server with the NAS file handle. But when the file is relocated to a different NAS file server, subsequent requests for access to the object require a new look-up in an updated directory to generate a new NAS file handle for the new location.
[0007] An additional drawback is that NAS file servers are inaccessible during large data transfer operations such as file migrations and replications. These data transfers typically occur during non-business hours to reduce consequential downtime. However, ever-larger storage capacities increase the amount of time necessary for data transfers. Additionally, many enterprises and applications have a need for data that is always available.
[0008] Therefore, what is needed is a network device to provide transparency for clients of decentralized file servers such as NAS file servers. Furthermore, there is a need for the network device to maintain transparency through file replications by managing new locations of replicated files, and tracking their availability. Moreover, there is a need for the network device to provide access to data during file replication.
BRIEF SUMMARY OF THE INVENTION
[0009] The present invention meets these needs by providing file replications in a decentralized storage network that are transparent to a client. A NAS switch, in the data path of a client and NAS file servers, reliably coordinates file replication of a primary file server to a replica file server using namespace replication to track new file locations. Additionally, the NAS switch maintains data availability during time-consuming data transfers and as a result of failing file servers.
[0010] An embodiment of a system configured according to the present invention comprises the NAS switch in communication with the client on a front-end of the storage network, and both a primary file server and a replica file server on a back-end. The NAS switch associates NAS file handles (e.g., CIFS file handles or NFS file handles) received from the primary and replica file servers with switch file handles that are independent of a location. The NAS switch then exports the switch file handles to the client. In response to subsequent object access requests from the client, the NAS switch substitutes switch file handles with appropriate NAS file handles for submission to the appropriate NAS file server.
[0011] In another embodiment, the NAS switch comprises a replication module to coordinate replication of source objects at locations on the primary file server to destination objects at locations on the replica file server. Before replicating data, the replication module separately replicates a namespace of the directory hierarchy containing data to be replicated. Namespace replication can also include the use of stored file handles as pointers from objects to be replicated on the file server to the corresponding objects on the primary file server. This replication process allows the NAS switch to track replicated copies of an object. Additionally, the NAS switch keeps the primary file server available during replication, and also maintains consistency across both namespaces by replicating critical operations. The replication module advantageously provides replication services to decentralized file servers and file servers that do not otherwise natively support replication.
[0012] In yet another embodiment, the NAS switch comprises a synchronization module to select a switch file handle. The synchronization module looks up the NAS file handle in a file handle replication table to determine if the object has been replicated and, if not, returns a switch file handle similar to the NAS file handle. The synchronization module looks up replicated files in a synchronization location table to determine a current primary file server from which to access the object, and checks a status of the current primary server. The synchronization module returns a switch file handle corresponding to the current primary server, or to an alternate file server if the current primary server is not available.
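The selection logic of this embodiment can be sketched as follows. This is a minimal illustration, assuming in-memory dictionaries stand in for the file handle replication table and the synchronization location table; the names `ReplicaEntry`, `select_handle`, and the server labels are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaEntry:
    """Hypothetical row combining replication and location information."""
    current_primary: str                          # server currently treated as primary
    handles: dict = field(default_factory=dict)   # server name -> NAS file handle


def select_handle(nas_handle, replication_table, server_up):
    """Return (server, handle) to use for a request on nas_handle.

    replication_table: dict mapping an original NAS handle to a ReplicaEntry.
    server_up: dict mapping server name to bool (assumed source of status).
    """
    entry = replication_table.get(nas_handle)
    if entry is None:
        # Object has not been replicated: keep using the original handle.
        return None, nas_handle
    # Prefer the current primary server when it is available.
    if server_up.get(entry.current_primary, False):
        return entry.current_primary, entry.handles[entry.current_primary]
    # Otherwise fall back to any available alternate copy.
    for server, handle in entry.handles.items():
        if server_up.get(server, False):
            return server, handle
    raise RuntimeError("no available copy of the object")
```

With one replicated object and the primary marked as down, `select_handle` returns the replica's handle; for a handle that is absent from the table it returns the handle unchanged.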
[0013] In still another embodiment, the synchronization module maintains synchronicity between the primary and replica file servers. When the client requests a critical operation on a replicated object (e.g., create, delete, etc.), the synchronization module replicates the critical operation on other copies of the object. In one embodiment, the synchronization module further comprises a persistent buffer to store operations that have yet to be successfully completed in both namespaces. Thus, if a critical operation is unsuccessful due to a file server failure or otherwise, the synchronization module can resubmit the critical operation until successful.
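A minimal sketch of such a buffer, assuming a JSON file stands in for the non-volatile memory and that a caller-supplied `submit(server, op)` function returns `True` on success. The class and function names are hypothetical; the specification does not prescribe a storage format or retry policy.

```python
import json
import os


class PersistentBuffer:
    """File-backed list of operations not yet completed (illustrative only)."""

    def __init__(self, path):
        self.path = path
        self.pending = []
        if os.path.exists(path):
            # Reload operations that survived a restart.
            with open(path) as f:
                self.pending = json.load(f)

    def _flush(self):
        with open(self.path, "w") as f:
            json.dump(self.pending, f)

    def add(self, server, op):
        self.pending.append({"server": server, "op": op})
        self._flush()

    def retry(self, submit):
        """Resubmit every pending operation; keep the ones that still fail."""
        still_pending = []
        for item in self.pending:
            if not submit(item["server"], item["op"]):
                still_pending.append(item)
        self.pending = still_pending
        self._flush()


def replicate_critical(op, servers, submit, buffer):
    """Record a critical operation for every copy, then attempt delivery."""
    for server in servers:
        buffer.add(server, op)   # record before attempting, so a crash loses nothing
    buffer.retry(submit)
```

Recording each operation before submitting it is a design choice of this sketch: an operation that fails, or that is interrupted by a restart, remains in the buffer until a later `retry` succeeds.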
BRIEF DESCRIPTION OF THE DRAWINGS
[0014] FIG. 1 is a high-level block diagram illustrating a storage network system according to one embodiment of the present invention.
[0015] FIG. 2 is a block diagram illustrating the file server module according to one embodiment of the present invention.
[0016] FIG. 3 is a high-level flow chart illustrating a method of providing transparent file replication in a NAS storage network according to one embodiment of the present invention.
[0017] FIG. 4 is a flow chart illustrating the method of associating NAS file handles with switch file handles according to one embodiment of the present invention.
[0018] FIG. 5 is a flow chart illustrating the method of performing file replication using namespace replication according to one embodiment of the present invention.
[0019] FIG. 6 is a flow chart illustrating the method of replicating a directory hierarchy from a primary server to a replica server according to one embodiment of the present invention.
[0020] FIG. 7 is a flow chart illustrating the method of redirecting NAS requests concerning replicated objects according to one embodiment of the present invention.
[0021] FIG. 8 is a flow chart illustrating the method of determining a NAS file handle from a switch file handle according to one embodiment of the present invention.
DETAILED DESCRIPTION OF THE INVENTION
[0022] The present invention provides file replication in a decentralized storage network that is transparent to clients. The accompanying description is for the purpose of providing a thorough explanation with numerous specific details. Of course, the field of storage networking is such that many different variations of the illustrated and described features of the invention are possible. Those skilled in the art will thus undoubtedly appreciate that the invention can be practiced without some specific details described below, and indeed will see that many other variations and embodiments of the invention can be practiced while still satisfying its teachings and spirit. For example, although the present invention is described with reference to storage networks operating under the NAS protocol, it can similarly be embodied in future storage network protocols other than NAS, or in mixed protocol networks. Accordingly, the present invention should not be understood as being limited to the specific implementations described below, but only by the claims that follow.
[0023] The processes, features, or functions of the present invention can be implemented by program instructions that execute in an appropriate computing device. Example computing devices include enterprise servers, application servers, workstations, personal computers, network computers, network appliances, personal digital assistants, game consoles, televisions, set-top boxes, premises automation equipment, point-of-sale terminals, automobiles, and personal communications devices. The program instructions can be distributed on a computer readable medium, storage volume, or the Internet. Program instructions can be in any appropriate form, such as source code, object code, or scripting code.
[0024] FIG. 1 is a high-level block diagram illustrating a storage network system 100 according to one embodiment of the present invention. The system 100 comprises a NAS switch 110 and a client 140 coupled to a network 195. The NAS switch 110, a primary file server 120, and a replica file server 130, are each coupled in communication through a sub-network 196. Note that there can be various configurations of the system 100, such as embodiments including additional clients 140, additional primary and/or replica file servers 120, 130, and additional NAS switches 110. The system 100 components are implemented in, for example, a personal computer with an x86-type processor executing an operating system and/or an application program, a workstation, a specialized NAS device with an optimized operating system and/or application program, a modified server blade, etc. In one embodiment, the storage network 175 comprises a NAS using protocols such as NFS and CIFS. In another embodiment, the storage network 175 comprises a combination of NAS, SAN, and other types of storage networks. In yet another embodiment the storage network 175 comprises a decentralized standard or proprietary storage system other than NAS.
[0025] The NAS switch 110 provides continuous transparency to the client 140 with respect to physical configurations and replication operations on the storage network 175. Preferably, the NAS switch 110 emulates file server processes to the client 140 and emulates client processes to the file servers 120, 130. As such, the client 140 is unaware of the NAS switch 110 since the NAS switch is able to redirect NAS requests intended for the primary file server 120 to appropriate locations on the replica file server 130. Thus, the client 140 submits object requests, such as file writes and directory reads, directly to the NAS switch 110. Likewise, the file servers 120, 130 are unaware of the NAS switch 110 since the NAS switch is able to resubmit requests, contained in server file handles, as if they originated from the client 140. To do so, the NAS switch 110 can use mapping, translating, bridging, packet forwarding, other network interface functionality, and other control processes to perform file handle switching, thereby relieving the client 140 of the need to track changes in a file's physical location.
[0026] In one embodiment, the NAS switch 110 comprises a file server module 114 and a client module 112 to facilitate communications and file handle switching. The client module 112 receives exported file system directories from the file servers 120, 130 containing NAS switch handles. To create compatibility between the client 140 and the NAS switch 110, the client module 112 maps the file system directories to internal switch file systems which it sends to the client 140. To request an object, the client 140 traverses an exported switch file system and selects a switch file handle which it sends to the NAS switch 110 along with a requested operation.
[0027] The file server module 114 coordinates the replication process. The file server module 114 initiates tasks that are passively performed by the primary and replica file servers 120, 130. The file server module 114 replicates a namespace containing the data to be replicated from the primary file server 120 to the replica file server 130, and then replicates associated data. During and after replication, the file server module 114 redirects namespace and file object accesses by the client 140 to appropriate locations. Thus, data transfer services remain available to the client 140.
[0028] In one embodiment, the file server module 114 also tracks reconfigurations resulting from replication and other processes (e.g. adding or removing file server capacity) with a nested system of tables, or information otherwise linked to the switch file systems. The switch file handles are static as they are persistent through replications, but the associated NAS file handles can be dynamic as they are selected depending upon which particular copy is being accessed. To track various copies of an object, the file server module 114 maintains a file handle replication table, corresponding to each file system, that maps NAS file handles of replicated objects to locations on the storage network 175 and to status information about the replication locations. Further embodiments of the file server module 114 are described with respect to FIG. 2.
[0029] In general, NAS file handles uniquely identify objects on the primary or replica file servers 120, 130, such as a directory or file, as long as that object exists. NAS file handles are file server specific, and are valid only to the file servers 120, 130 that issued the file handles. The process of obtaining a NAS file handle from a file name is called a look-up. A NAS file handle, which identifies a directory or file object by location, may be formatted according to protocols such as NFS or CIFS as discussed in further detail below, e.g., with reference to Tables 1A and 1B. By contrast, a switch file handle identifies a directory or file object independent of location, making it persistent through file replications, migrations, and other data transfers. The switch file handle can be a modified NAS file handle that refers to an internal system within the NAS switch 110 rather than the primary file server 120. A stored file handle is stored in place of a migrated or to-be-replicated object as a pointer to an alternate location.
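The distinction between the two kinds of handle can be illustrated with a small mapping structure. This is a sketch under stated assumptions: handles are modeled as plain strings, and the `HandleMap` class, its method names, and the `sw-N` handle format are hypothetical.

```python
import itertools


class HandleMap:
    """Maps location-independent switch handles to current NAS handles."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._to_nas = {}      # switch handle -> (server, NAS handle)
        self._to_switch = {}   # (server, NAS handle) -> switch handle

    def export(self, server, nas_handle):
        """Return the switch handle for a NAS handle, creating one if needed."""
        key = (server, nas_handle)
        if key not in self._to_switch:
            switch_handle = f"sw-{next(self._next_id)}"
            self._to_switch[key] = switch_handle
            self._to_nas[switch_handle] = key
        return self._to_switch[key]

    def resolve(self, switch_handle):
        """Return the (server, NAS handle) currently behind a switch handle."""
        return self._to_nas[switch_handle]

    def relocate(self, switch_handle, server, nas_handle):
        """Point an existing switch handle at a new location."""
        old = self._to_nas[switch_handle]
        del self._to_switch[old]
        self._to_nas[switch_handle] = (server, nas_handle)
        self._to_switch[(server, nas_handle)] = switch_handle
```

The client only ever holds the value returned by `export`; after `relocate`, the same switch handle resolves to the new server and NAS handle, which is the persistence property described above.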
[0030] Object access requests handled by the NAS switch 110 include, for example, directory and/or file reads, writes, creation, deletion, moving, and copying. As used herein, various terms are used synonymously to refer to a location of an object prior to replication (e.g., "primary"; "source"; "original"; and "first") and various terms are used to refer to a location of the same object after replication (e.g., "replica"; "destination"; "substitute"; and "second"). Further embodiments of the NAS switch 110 and methods operating therein are described below.
[0031] The client 140 accesses resources on the primary and second file servers 120, 130 by using a switch file handle submitted to the NAS switch 110. To access an object, the client 140 first mounts an exported file system preferably containing switch file handles. In another embodiment, however, the exported file system also contains unaltered NAS file handles. The client 140 looks up an object to obtain its file handle and submits an associated request. From the perspective of the client 140, transactions are carried out by a file server 120, 130 having object locations that do not change. Thus, the client 140 interacts with the NAS switch 110 before and after a file replication in the same manner. A user of the client 140 can submit operations through a command line interface, a windows environment, a software application, or otherwise. In one embodiment, the client 140 provides access to a storage network 175 other than a NAS storage network.
[0032] The primary file server 120 is the default or original network file server for the client 140 before file replication. The primary file server 120 further comprises primary objects 125, which include directory metadata and file data such as enterprise data, records, database information, applications, and the like.
[0033] The replica file server 130 is able to substitute for, or take over as, the primary network file server for the client 140 during and after file replication. The NAS switch 110 resubmits client requests to the replica file server 130 rather than the primary file server 120 responsive to, for example, a failure, load imbalance, etc. on the primary file server 120. The replica file server 130 further comprises replica objects 135, which include the replicated source directories and files. In one embodiment, more than one replica file server 130 contains a replicated object. Both the primary and replica file servers 120, 130 also preferably comprise a file system compatible with NAS protocols. In one embodiment, the file servers 120, 130 comprise decentralized file servers, or file servers that otherwise do not natively support file replication.
[0034] The network 195 facilitates data transfers between connected hosts (e.g., 110, 120, 130, 140). The connections to the network 195 may be wired and/or wireless, packet and/or circuit switched, and use network protocols such as TCP/IP (Transmission Control Protocol/Internet Protocol), IEEE (Institute of Electrical and Electronics Engineers) 802.11, IEEE 802.3 (i.e., Ethernet), ATM (Asynchronous Transfer Mode), or the like. The network 195 comprises, for example, a LAN (Local Area Network), WAN (Wide Area Network), the Internet, and the like. In one embodiment, the NAS switch 110 acts as a gateway between the client 140, connected to the Internet, and the primary file server 120 and the replica file servers 130, connected to a LAN. The sub-network 196 is preferably a local area network providing optimal response time to the NAS switch 110. In one embodiment, the sub-network 196 is integrated into the network 195.
[0035] FIG. 2 is a block diagram illustrating the file server module 114 according to one embodiment of the present invention. The file server module 114 comprises a file server interface 210, a replication module 220, and a synchronization module 230 with a persistent buffer 235. Generally, the file server interface 210 manages client requests before replication without assistance, but afterwards, checks with the synchronization module 230 for alternative locations or additional processes required by, for example, critical operations. Note that rather than being strict structural separations, "modules" are merely exemplary groupings of functionality corresponding to one or many structures.
[0036] The file server interface 210 receives a switch file handle with a request from the client 140. If the synchronization module 230 does not recognize the switch file handle as an object subject to replication processes, the file server interface 210 forwards the request with an original NAS file handle. Alternatively, the file server interface 210 can receive a replica NAS file handle for the replica file server 130 from the synchronization module 230 responsive to, for example, a need to access the object at a replicated location or a need to maintain synchronicity between file servers 120, 130.
[0037] The replication module 220 in the NAS switch 110 coordinates replication such that the primary server 120 and the replica server 130 remain available to the client 140. The replication module 220 replicates directory metadata separate from time-consuming data replication. After successful data replication, the replication module 220 updates the file handle replication table including the location on the primary file server 120 and the location on the replica file server 130. In one embodiment, the replication module 220 recognizes replicated directories in exported file systems and maps replicated objects to primary objects.
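Separating directory metadata replication from data replication can be sketched as two passes over a tree. The sketch below assumes a nested dictionary models a file system (a `dict` is a directory, `bytes` is file data) and uses a hypothetical `StoredHandle` placeholder in the role of a stored file handle pointing back at the source; none of these names appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredHandle:
    """Placeholder left in the replica namespace, pointing at the source object."""
    source_path: str


def replicate_namespace(src_tree, prefix=""):
    """First pass: copy directory structure only.

    Returns (replica_tree, pending), where pending lists the file paths
    whose data still has to be copied.
    """
    replica, pending = {}, []
    for name, node in src_tree.items():
        path = f"{prefix}/{name}"
        if isinstance(node, dict):
            replica[name], sub_pending = replicate_namespace(node, path)
            pending.extend(sub_pending)
        else:
            replica[name] = StoredHandle(path)   # no data copied yet
            pending.append(path)
    return replica, pending


def copy_data(src_tree, replica_tree, pending):
    """Second pass: replace each placeholder with the file's data."""
    for path in pending:
        parts = path.strip("/").split("/")
        src, dst = src_tree, replica_tree
        for part in parts[:-1]:
            src, dst = src[part], dst[part]
        dst[parts[-1]] = src[parts[-1]]
```

After the first pass the replica already has the full directory hierarchy, so namespace requests can be answered while the slower second pass is still running.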
[0038] The synchronization module 230 substitutes a switch file handle with a replica NAS file handle for objects subject to replication processes. The synchronization module 230 recognizes such objects by looking up the NAS file handle in a directory replication table and/or a file handle replication table. The directory replication table contains entries for objects that are currently undergoing namespace replication. The file handle replication table contains entries for objects that have completed replication. In one embodiment, the synchronization module 230 further comprises a persistent buffer 235 such as a non-volatile memory to improve data integrity. For critical requests, the synchronization module 230 uses the persistent buffer 235 to ensure that operations are completed in both the primary and replica file servers 120, 130, for example, when one file server is unavailable or experiences any other type of failure.
[0039] FIG. 3 is a high-level flow chart illustrating a method 300 of providing transparent file replication in a NAS storage network according to one embodiment of the present invention. The client module 112 associates 310 an original NAS file handle with a switch file handle as described below with respect to FIG. 4. This enables the NAS switch 110 to act as an intermediary between the client 140 and the file servers 120, 130. The client 140 submits NAS requests using switch file handles as if the NAS switch 110 were a file server 120, 130, and, in turn, the file servers 120, 130 process NAS file handles from the NAS switch 110 as if they were submitted by the client 140.
[0040] The replication module 220 performs 320 file replication using namespace replication as described below with respect to FIG. 5. By separating directory replication from data replication, the replication module 220 is able to process changes to objects being modified during replication and maintains synchronicity between the primary and replica file servers 120, 130.
[0041] The replication module 220 redirects 330 NAS requests concerning replicated files as described below with respect to FIG. 7. Because the NAS switch 110 coordinates and stores elements involved in replication, the client 140 continues referring to objects stored in alternative locations with the same switch file handle used prior to replication. On the back end, however, many changes can occur. In one embodiment, the NAS switch 110 uses replications as synchronized data back-ups when the primary file server 120 is nonresponsive, or fails in other ways. In another embodiment, the NAS switch 110 balances requests between servers to optimize latency, I/O bandwidth, and other performance metrics.
[0042] FIG. 4 is a flow chart illustrating the method 310 of associating a NAS file handle with a switch file handle according to one embodiment of the present invention. Initially, the NAS switch 110 mounts 410 an exported directory of file systems from the primary server 120. In general, the file system organizes objects on a file server 120, 130 into a directory hierarchy of NAS file handles. In one embodiment, the NAS switch 110 receives exported directories from associated primary file servers 120, and, in turn, sends exported directories to associated clients 140.
[0043] The client module 112 generates 420 switch file handles independent of object locations in the primary file server 120. The client module 112 organizes exported file systems from the file server 120 by replacing file system or tree identifiers with a switch file system number as shown below in Tables 2A and 2B. The client module 112 exports 430 the switch file system to the client 140 to use in requesting operations. In the reverse process, the NAS switch 110 receives the NAS request and searches replicated file handles and/or replicated namespaces using the NAS file handle. Accordingly, the file server interface 210 checks entries of nested tables maintained by the synchronization module 230. The file server interface 210 generates a NAS file handle from the switch file handle based on an object location. Examples of the contents of NFS and CIFS file handles are shown in Tables 1A and 1B, while examples of switch file handles, or modified NFS and CIFS file handles, are shown in Tables 2A and 2B:
Table 1A - NFS File Handle Contents
Table 1B - CIFS File Handle Contents
Table 2A - Contents of NFS Switch File Handle
Table 2B - Contents of CIFS Switch File Handle
As discussed below, after objects have been replicated, the NAS switch 110 can access objects at new locations using updated NAS file handles.
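The association between switch file handles and NAS file handles described above can be sketched as a two-way map. This is only a minimal illustration, not the patent's implementation: the class names, the field layout of both handle types, and the `export`, `resolve`, and `relocate` methods are all assumptions introduced here, since the contents of Tables 1A through 2B are not reproduced in this text.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class NasFileHandle:
    """Location-dependent handle issued by a file server (illustrative fields)."""
    server: str
    fs_id: int
    file_id: int


@dataclass(frozen=True)
class SwitchFileHandle:
    """Location-independent handle exported to clients (illustrative fields)."""
    switch_fs: int
    object_id: int


class HandleMap:
    """Hypothetical two-way map kept by the switch."""

    def __init__(self):
        self._ids = count(1)
        self._to_nas = {}
        self._to_switch = {}

    def export(self, nas_handle, switch_fs):
        # Reuse the existing switch handle so clients always see a stable value.
        existing = self._to_switch.get(nas_handle)
        if existing is not None:
            return existing
        handle = SwitchFileHandle(switch_fs, next(self._ids))
        self._to_nas[handle] = nas_handle
        self._to_switch[nas_handle] = handle
        return handle

    def resolve(self, switch_handle):
        # Reverse process: switch file handle -> current NAS file handle.
        return self._to_nas[switch_handle]

    def relocate(self, switch_handle, new_nas_handle):
        # After replication the object may live elsewhere; the client-visible
        # switch handle stays the same and only the mapping changes.
        old = self._to_nas[switch_handle]
        self._to_switch.pop(old, None)
        self._to_nas[switch_handle] = new_nas_handle
        self._to_switch[new_nas_handle] = switch_handle
```

The point of the sketch is that `relocate` changes only the back-end mapping, so a client holding the switch file handle does not need to be told that the object has moved.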
[0045] FIG. 5 is a flow chart illustrating the method 320 of performing file replication using namespace replication according to one embodiment of the present invention. The replication module 220 replicates 510 a directory hierarchy of the primary server 120 to organize data copied from the primary file server 120 to the replica file server 130.
[0046] In a separate process, the replication module 220 copies 520 data. If no error occurs during the data transfer, the replica file server 130 commits the data migration. If an error does occur 730, the data transfer is repeated. To commit the data transfer, the replication module 220 locks the source file to prevent further access to the file. The replication module 220 marks the current entry in the replicated file list as done, and enters the source and destination file handles indicative of the locations on the primary and replica file servers 120, 130 in the file replication table. Finally, the replication module 220 resumes access to the source file.
[0047] If a critical request is issued to the primary server 530, the synchronization module 230 resubmits 540 the critical request to the replica server 130. When the data copy is complete 550, the replication module 220 updates 560 the file handle replication table.
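The copy loop of FIG. 5 can be sketched with two in-memory dictionaries standing in for the primary and replica file servers. This is a simplified illustration under stated assumptions: the request tuple format, the function names, and the choice of `create` and `delete` as the critical operations are hypothetical, and locking, error retry, and the persistent buffer are left out.

```python
# Assumed subset of content-modifying operations treated as critical.
CRITICAL_OPS = {"create", "delete"}


def apply_request(server, request):
    """Apply one (op, name, data) request to a dict-backed server."""
    op, name, data = request
    if op == "create":
        server[name] = data
    elif op == "delete":
        server.pop(name, None)
    elif op == "read":
        return server.get(name)
    else:
        raise ValueError(f"unsupported operation: {op}")
    return None


def copy_with_resubmission(primary, replica, incoming):
    """Copy every file, resubmitting critical requests that arrive meanwhile."""
    incoming = list(incoming)

    def handle(request):
        apply_request(primary, request)
        if request[0] in CRITICAL_OPS:       # critical request? (530)
            apply_request(replica, request)  # resubmit to the replica (540)

    for name in sorted(primary):             # snapshot of names at the start
        if name in primary:                  # skip files deleted in the meantime
            replica[name] = primary[name]    # copy data (520)
        if incoming:                         # one client request per copy step
            handle(incoming.pop(0))
    for request in incoming:                 # requests left after the last copy
        handle(request)

    # Data copy complete (550): update the file handle replication table (560).
    return {name: (("primary", name), ("replica", name)) for name in primary}
```

Non-critical requests such as `read` touch only the primary, which matches the observation that they do not need to be resubmitted.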
[0048] During data copying, if a client 140 issues a critical request 530 concerning the primary server 120, the synchronization module 230 resubmits 540 the critical request to the replica server 130. In one embodiment, the synchronization module 230 stores requests in the persistent buffer 235 to ensure that critical operations are carried out even if a failure occurs. However, if the request is not a critical request 530, the resubmission is not necessary. Non-critical requests include, for example, read, copy, and other passive operations. When data copying is complete 550, the synchronization module 230 updates 560 the file handle replication table. If there is more data to copy 550, the process loops back to copy data 520.
[0049] FIG. 6 is a flow chart illustrating the method 510 of replicating a directory hierarchy from the primary server 120 to the replica server 130 according to one embodiment of the present invention. The replication module 220 selects 610 a current source directory from the directory hierarchy of the primary file server 120 and the current destination directory from the replica file server 130. The replication module 220 adds 620 a mapping entry in a replication table with switch file handles related to the source and destination locations. The replication module 220 selects 630 a current object from a listing of file and directory objects in the current source directory.
If the current object is a directory 530, the replication module 220 creates 650 a directory in the replica file server 130 with the same name as the current directory in the primary file server 120. On the other hand, if the current object is a file 640, the replication module 220 creates 645 a file with a stored file handle for the object from the file handle in the current destination directory. In one embodiment, the stored file handle is similar to the switch file handle. Preferably, the stored file handle is a predetermined size so that the NAS switch 110 can determine whether a file contains a stored file handle merely by inspecting the file's size. An exemplary stored file format is shown in Table 3:
Figure imgf000020_0001
Table 3 — Exemplary Stored File Handle
[0050] Note, however, that there can be variations of the stored file format. The replication module 220 adds 655 a mapping entry in a replicated file list with source and destination switch file handles.
[0051] If all objects have been processed 660, no errors were committed in the process 670, and there are no more directories to replicate 680, the replication module 220 commits 690 the namespace replication. However, if there are more objects to be processed 660, the replication module 220 continues the process from selecting 630 objects. If there was an error in the directory or file creation 670, the replication module 220 deletes 675 the destination directory, and repeats the process from adding 620 mapping entries. Also, if there are more directories to process 680, the first file server 120 returns to selecting 510 primary directories.
[0052] To commit 690 the namespace replication, the replication module 220 adds entries to the replicated directory table. As a result, future object access requests will be directed to the replica file server 130 in addition to the primary file server 120. When critical operations are executed on the primary server 120, the replication module 220 uses the replicated directory table to recognize that the request needs to be resubmitted to the replica server 130. The primary file server 120 deletes 620 the replication table since it is no longer needed.
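The namespace replication of FIG. 6 can be approximated on a local file system: directories are recreated by name, and each file is represented on the destination by a fixed-size placeholder holding a stored file handle. The sketch below rests on several assumptions. The 64-byte record layout (a 4-byte tag, a SHA-256 digest of the source path, and zero padding) is invented for illustration because Table 3 is not reproduced here, the function names are hypothetical, and on error the sketch deletes the destination and re-raises rather than retrying.

```python
import hashlib
import os
import shutil

STORED_HANDLE_SIZE = 64  # hypothetical predetermined size, in bytes


def make_stored_handle(source_path):
    """Build an illustrative fixed-size record that identifies the source."""
    digest = hashlib.sha256(source_path.encode("utf-8")).digest()
    return (b"SFH1" + digest).ljust(STORED_HANDLE_SIZE, b"\0")


def replicate_namespace(src_root, dst_root):
    """Recreate the directory tree; represent each file by a placeholder."""
    replicated_files = []  # stands in for the replicated file list
    try:
        for dirpath, _dirnames, filenames in os.walk(src_root):
            rel = os.path.relpath(dirpath, src_root)
            dst_dir = os.path.normpath(os.path.join(dst_root, rel))
            os.makedirs(dst_dir, exist_ok=True)       # same directory name
            for name in filenames:
                src = os.path.join(dirpath, name)
                dst = os.path.join(dst_dir, name)
                with open(dst, "wb") as stub:         # placeholder, no data yet
                    stub.write(make_stored_handle(src))
                replicated_files.append((src, dst))
    except OSError:
        # On a creation error, remove the partial destination hierarchy.
        shutil.rmtree(dst_root, ignore_errors=True)
        raise
    return replicated_files


def is_placeholder(path):
    """A file of exactly the predetermined size is taken to hold a handle."""
    return os.path.getsize(path) == STORED_HANDLE_SIZE
```

Checking only the size keeps the test cheap, as the description suggests, but a real data file that happens to be exactly 64 bytes long would be misclassified, so a practical design would presumably verify the record contents as well.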
[0053] FIG. 7 is a flow chart illustrating the method 330 of redirecting requests concerning replicated objects according to one embodiment of the present invention. The NAS switch 110 receives 710 the NAS request containing the switch file handle from the client 140. The file server interface 210 determines 720 the NAS file handle from the switch file handle as described below with respect to FIG. 8.
[0054] If the switch file handle is a replicated file handle 730, and the NAS request is a critical request 740, the synchronization module 230 executes 750 the request in both primary and replica file servers 120, 130 through the persistent buffer 235. By replicating the critical request, the synchronization module 230 is able to keep identical directories and data on a primary file server 120 and each replica file server 130. Because replicated requests are stored in the persistent buffer 235 until successful in all file servers 120, 130, the NAS switch 110 ensures that temporarily unavailable file servers 120, 130 receive the same modifications. In one embodiment, if the synchronization module 230 is unable to successfully complete critical operations, an error message can be returned to the client 140. On the other hand, for non-replicated file handles 730 and/or non-critical NAS requests 740, the file server interface 210 executes 760 the request in the primary file server 120. Since non-critical operations do not modify contents or disrupt synchronicity between file servers 120, 130, replicated requests are not necessary.
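The dispatch rule of FIG. 7 can be sketched as follows. The `Server` and `Switch` classes, the use of paths in place of switch file handles, and the in-memory `deque` standing in for the persistent buffer 235 are all assumptions made for illustration; a real buffer would need to survive a restart of the switch.

```python
from collections import deque

# Assumed subset of content-modifying operations treated as critical.
CRITICAL_OPS = {"create", "delete"}


class Server:
    """Dict-backed stand-in for a file server that can become unavailable."""

    def __init__(self, name):
        self.name = name
        self.files = {}
        self.available = True

    def execute(self, op, path, data=None):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        if op == "create":
            self.files[path] = data
        elif op == "delete":
            self.files.pop(path, None)
        elif op == "read":
            return self.files.get(path)
        else:
            raise ValueError(f"unsupported operation: {op}")
        return None


class Switch:
    def __init__(self, primary, replicas, replicated):
        self.primary = primary
        self.replicas = list(replicas)
        self.replicated = set(replicated)  # paths stand in for switch handles
        self.pending = deque()             # stands in for persistent buffer 235

    def handle(self, op, path, data=None):
        if path in self.replicated and op in CRITICAL_OPS:
            # Replicated handle (730) and critical request (740): queue the
            # request for every server and try to execute it everywhere (750).
            for server in [self.primary, *self.replicas]:
                self.pending.append((server, op, path, data))
            self.flush()
            return None
        # Otherwise the primary alone executes the request (760).
        return self.primary.execute(op, path, data)

    def flush(self):
        """Retry buffered requests; keep the ones that still fail."""
        remaining = deque()
        while self.pending:
            entry = self.pending.popleft()
            server, op, path, data = entry
            try:
                server.execute(op, path, data)
            except ConnectionError:
                remaining.append(entry)
        self.pending = remaining
```

Calling `flush` again once a replica comes back applies the buffered modifications, which is how a temporarily unavailable server would end up with the same contents as the others in this simplified model.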
[0055] FIG. 8 is a flow chart illustrating the method 720 of determining a NAS file handle from a switch file handle according to one embodiment of the present invention. Note that nested tables of FIG. 8 are merely an example as various data structures can be used to associate NAS file handles with appropriate switch file handles.
[0056] The reproduction module 230 determines if a switch file handle represents a replicated object 810. As described above in Tables 1 and 2, the switch file handle contains a file system ID as exported by the NAS switch 110 to identify a file system as exported by the file servers 120, 130. The NAS switch locates a file handle replication table associated with the file system. The file handle replication table contains: a replicated file handle representing the switch file handle that has been replicated; a primary file handle representing the primary file server 120 when the object is replicated; a replication location ID representing an entry number to a replication location table identifying where the object is replicated; and primary file attributes representing attributes (e.g., creation date, etc.) that differ between file servers 120, 130, but can be substituted as attributes for the replicated objects when the primary file server 120 is down.
[0057] If the object has not been replicated, the replication module 230 returns the original NAS file handle. However, if the object has been replicated, the replication module 230 returns either the primary file handle or the replica file handle after determining 820 the primary file server 120 from the file handle replication table and the replica file servers 130 from the replication location table. The replication location table contains: a current primary file system ID representing the file system acting as the primary file system at the present time; an original primary file system ID representing the configured primary file server 120; and a list of replica file system IDs representing one or more replica file servers 130 containing the replicated object.
[0058] To select a file server 120, 130, the reproduction module 230 first determines whether the primary file server 120 is currently acting as the primary server 830. If so, the current primary file system ID from the replication location table matches the primary file handle from the file handle replication table. The reproduction module 230 thus returns 825 the primary file handle as the output NAS file handle. If the current primary file system ID does not match the primary file handle, the reproduction module 230 determines a replica file handle from the current primary file system ID. As such, the reproduction module 230 searches an associated file handle replication table for a primary file handle matching the original primary file handle. The reproduction module 230 returns 835 the replicated file handle of the same entry. In one embodiment, the synchronization module 230 first checks a status of the replica file server 130 in a replica file system status table. The replica file system status table contains: the replication location ID; the replicated file system ID; and a replica file system status representing whether a replica file server 130 is ready to act in a primary capacity, is ready to replicate, or is not ready.
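The nested-table lookup of FIG. 8 can be sketched with plain dictionaries. This is one simplified reading of the description rather than a definitive implementation: the class names, the `resolve` signature, the status string, and the layout in which each replica file system has its own table keyed by primary file handle are assumptions, and the primary file attributes column is omitted.

```python
from dataclasses import dataclass

READY_PRIMARY = "ready_primary"  # hypothetical status value


@dataclass(frozen=True)
class Handle:
    fs_id: int
    file_id: int


@dataclass
class ReplicationEntry:
    """One row of the file handle replication table (attributes omitted)."""
    primary_handle: Handle
    location_id: int


@dataclass
class Location:
    """One row of the replication location table."""
    current_primary_fs: int
    original_primary_fs: int
    replica_fs_ids: list


def resolve(switch_handle, original_handle, replication_table,
            locations, replica_tables, status):
    """Return the NAS file handle that a request should be sent to."""
    entry = replication_table.get(switch_handle)
    if entry is None:
        return original_handle              # object has not been replicated
    location = locations[entry.location_id]
    if location.current_primary_fs == entry.primary_handle.fs_id:
        return entry.primary_handle         # configured primary is still acting
    acting_fs = location.current_primary_fs
    if status.get(acting_fs) != READY_PRIMARY:
        raise LookupError(f"file system {acting_fs} is not ready to act as primary")
    # Find the replica handle recorded against the original primary handle.
    return replica_tables[acting_fs][entry.primary_handle]
```

Failover in this model amounts to changing `current_primary_fs` in one `Location` row; every switch file handle that refers to that row then resolves to the replica without the client noticing.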

Claims

WE CLAIM:
1. In a NAS (Network Attached Storage) switch, a method for accessing an object in a storage network, the method comprising: mapping a switch file handle that is independent of an object location to a first NAS file handle that is indicative of a location of the object on a primary file server; replicating the object on the primary file server to a replica object on a replica file server; and mapping the switch file handle to a second NAS file handle that is indicative of a location of the replica object on the replica file server.
2. The method of claim 1, wherein the object comprises a directory having objects representative of sub-directories and files.
3. The method of claim 1, wherein the object comprises a file.
4. The method of claim 1, further comprising receiving the switch file handle with a corresponding operation.
5. The method of claim 1, further comprising submitting the switch file handle including the NAS request to the replica server.
6. The method of claim 1, wherein the replicating the object comprises replicating a namespace containing the object by separately replicating a directory and data.
7. The method of claim 1, wherein the replicating the object comprises: during the replicating, sending critical requests involving the object to both the primary file server and the replica file server, the critical request comprising one from the group consisting of a create request, a delete request, a move request, and a copy request.
8. The method of claim 1, wherein the replicating the object further comprises: prior to replicating object, storing a file handle in the location on the replica file server, the stored file handle indicative of the location of the object on the primary file server.
9. The method of claim 1, wherein the replicating the object comprises updating a file handle replication table with an entry for the location on the primary file server and an entry for the location on the replica file server responsive to a successful migration.
10. The method of claim 1, further comprising: receiving the switch file handle with a request from a client; looking-up the second NAS file handle in the file handle replication table with the switch file handle; and sending the request to the location on the replica file server.
11. The method of claim 1, further comprising: receiving a request concerning a non-replicated object; looking-up the non-replicated object in the file replication table with a non-replicated file handle; and responsive to failing to find an entry for the non-replicated object in the file handle replication table, sending the second request to a second location on the primary file server.
12. The method of claim 1, wherein the request comprises one from the group consisting of a read request and a write request.
13. The method of claim 1, wherein the request comprises one from the group consisting of a create request, a delete request, a move request, and a copy request.
14. The method of claim 1, wherein the NAS file handle comprises one from the group consisting of a NFS (Network File System) file handle and a CIFS (Common Internet File System) file handle.
15. In a centralizing switch, a method for accessing an object in a decentralized storage network, the method comprising: mapping a switch file handle that is independent of an object location to a first server file handle that is indicative of a location of the object on a primary file server; replicating the object on the primary file server to a replica object on a replica file server; and mapping the switch file handle to a second server file handle that is indicative of a location of the replica object on the replica file server.
16. The method of claim 15, wherein the first file handle is a first NAS (Network Attached Storage) file handle and the second file handle is a second NAS file handle.
17. A NAS (Network Attached Storage) switch to access an object in a storage network, comprising: a file server interface to map a switch file handle that is independent of an object location to a first NAS file handle that is indicative of a location of the object on a primary file server; and a replication module to copy the object on the primary file server to a replica object on a replica file server; wherein the file server interface maps the switch file handle to a second NAS file handle that is indicative of a location of the replica object on the replica file server.
18. The NAS switch of claim 17, wherein the object comprises a directory having objects representative of sub-directories and files.
19. The NAS switch of claim 17, wherein the object comprises a file.
20. The NAS switch of claim 17, further comprising a client module to receive the switch file handle, with a corresponding operation, from a client.
21. The NAS switch of claim 17, wherein the file server module submits the second file handle including the NAS request to the replica server.
22. The NAS switch of claim 17, wherein the replicating module replicates a namespace containing the object by separately replicating a directory and data.
23. The NAS switch of claim 17, wherein a client module receives the switch file handle with a request from a client; a synchronization module looks-up the second NAS file handle in a file handle replication table with the switch file handle; and responsive to finding an entry in the file handle replication table, the server module also sends the request to the location on the replica file server.
24. The NAS switch of claim 17, wherein a client module receives a request concerning a non-replicated object; a synchronization module looks-up the non-replicated object in the file replication table with a non-replicated file handle; and responsive to failing to find an entry for the non-replicated object in the file handle replication table, the file server module sends the non-replicated request to the primary file server location using the first NAS file handle.
25. The NAS switch of claim 17, wherein the request comprises one from the group consisting of a read request and a write request.
26. The NAS switch of claim 17, wherein the request comprises one from the group consisting of a create request, a delete request, a move request, and a copy request.
27. The NAS switch of claim 17, wherein the NAS file handle comprises one from the group consisting of a NFS (Network File System) file handle and a CIFS (Common Internet File System) file handle.
28. A computer program product, comprising a computer-readable medium having computer program instructions and data embodied thereon In a NAS (Network Attached Storage) switch, a method for accessing an object in a storage network, comprising: mapping a switch file handle that is independent of an object location to a first NAS file handle that is indicative of a location of the object on a primary file server; replicating the object on the primary file server to a replica object on a replica file server; and mapping the switch file handle to a second NAS file handle that is indicative of a location of the replica object on the replica file server.
29. The computer program product of claim 28, wherein the object comprises a directory having objects representative of sub-directories and files.
30. The computer program product of claim 28, wherein the object comprises a file.
31. The computer program product of claim 28, further comprising receiving the switch file handle with a corresponding operation.
32. The computer program product of claim 28, further comprising submitting the switch file handle including the NAS request to the replica server.
33. The computer program product of claim 28, wherein the replicating the object comprises replicating a namespace containing the object by separately replicating a directory and data.
34. The computer program product of claim 28, further comprising: receiving the switch file handle with a request from a client; looking-up the second NAS file handle in the file handle replication table with the switch file handle; and sending the request to the location on the replica file server.
35. The computer program product of claim 28, further comprising: receiving a request concerning a non-replicated object; looking-up the non-replicated object in the file replication table with a non-replicated file handle; and responsive to failing to find an entry for the non-replicated object in the file handle replication table, sending the second request to a second location on the primary file server.
36. The computer program product of claim 28, wherein the request comprises one from the group consisting of a read request and a write request.
37. The computer program product of claim 28, wherein the request comprises one from the group consisting of a create request, a delete request, a move request, and a copy request.
38. The computer program product of claim 28, wherein the NAS file handle comprises one from the group consisting of a NFS (Network File System) file handle and a CIFS (Common Internet File System) file handle.
PCT/US2004/012846 2003-04-24 2004-04-26 Transparent file replication using namespace replication WO2004097686A1 (en)

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
US46557903P 2003-04-24 2003-04-24
US46557803P 2003-04-24 2003-04-24
US60/465,579 2003-04-24
US60/465,578 2003-04-24
US10/831,376 US7346664B2 (en) 2003-04-24 2004-04-23 Transparent file migration using namespace replication
US10/831,701 US7587422B2 (en) 2003-04-24 2004-04-23 Transparent file replication using namespace replication
US10/831,701 2004-04-23
US10/831,376 2004-04-23

Publications (1)

Publication Number Publication Date
WO2004097686A1 true WO2004097686A1 (en) 2004-11-11

Family

ID=33425572

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/US2004/012846 WO2004097686A1 (en) 2003-04-24 2004-04-26 Transparent file replication using namespace replication
PCT/US2004/012847 WO2004097572A2 (en) 2003-04-24 2004-04-26 Transparent file migration using namespace replication

Family Applications After (1)

Application Number Title Priority Date Filing Date
PCT/US2004/012847 WO2004097572A2 (en) 2003-04-24 2004-04-26 Transparent file migration using namespace replication

Country Status (3)

Country Link
EP (1) EP1618500A4 (en)
JP (1) JP4588024B2 (en)
WO (2) WO2004097686A1 (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7587422B2 (en) 2003-04-24 2009-09-08 Neopath Networks, Inc. Transparent file replication using namespace replication
US7720796B2 (en) 2004-04-23 2010-05-18 Neopath Networks, Inc. Directory and file mirroring for migration, snapshot, and replication
US7831641B2 (en) 2003-04-24 2010-11-09 Neopath Networks, Inc. Large file support for a network file server
US8131689B2 (en) 2005-09-30 2012-03-06 Panagiotis Tsirigotis Accumulating access frequency and file attributes for supporting policy based storage management
US8180843B2 (en) 2003-04-24 2012-05-15 Neopath Networks, Inc. Transparent file migration using namespace replication
US8190741B2 (en) 2004-04-23 2012-05-29 Neopath Networks, Inc. Customizing a namespace in a decentralized storage environment
US8195627B2 (en) 2004-04-23 2012-06-05 Neopath Networks, Inc. Storage policy monitoring for a storage network
US8539081B2 (en) 2003-09-15 2013-09-17 Neopath Networks, Inc. Enabling proxy services using referral mechanisms
US8832697B2 (en) 2005-06-29 2014-09-09 Cisco Technology, Inc. Parallel filesystem traversal for transparent mirroring of directories and files
CN109428942A (en) * 2017-09-05 2019-03-05 南京南瑞继保电气有限公司 Live File Serving System and file server are across live synchronous method more than a kind of

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4451293B2 (en) 2004-12-10 2010-04-14 株式会社日立製作所 Network storage system of cluster configuration sharing name space and control method thereof
DE102005013502A1 (en) * 2005-03-23 2006-09-28 Fujitsu Siemens Computers Gmbh A method for removing a mass storage system from a computer network and computer program product and computer network for performing the method
US7716420B2 (en) * 2006-04-28 2010-05-11 Network Appliance, Inc. Methods of converting traditional volumes into flexible volumes
JP4288525B2 (en) 2007-02-16 2009-07-01 日本電気株式会社 File sharing system and file sharing method
US9235595B2 (en) * 2009-10-02 2016-01-12 Symantec Corporation Storage replication systems and methods
JP2012008854A (en) * 2010-06-25 2012-01-12 Hitachi Ltd Storage virtualization device
JP6485352B2 (en) * 2013-07-23 2019-03-20 株式会社アレスシステム Receiving apparatus, method, computer program
CN104407947B (en) * 2014-10-29 2018-04-27 中国建设银行股份有限公司 Active and standby NAS switching methods and device
JP6720744B2 (en) 2016-07-15 2020-07-08 富士通株式会社 Information processing system, information processing apparatus, and control program
JP7347007B2 (en) 2019-08-28 2023-09-20 富士フイルムビジネスイノベーション株式会社 Information processing device, information processing system, and information processing program

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030154236A1 (en) * 2002-01-22 2003-08-14 Shaul Dar Database Switch enabling a database area network
US20030195903A1 (en) * 2002-03-19 2003-10-16 Manley Stephen L. System and method for asynchronous mirroring of snapshots at a destination using a purgatory directory and inode mapping
US20040017438A1 (en) * 2002-07-26 2004-01-29 Pollard Jeffrey R. Slotted substrates and methods and systems for forming same

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5689701A (en) * 1994-12-14 1997-11-18 International Business Machines Corporation System and method for providing compatibility between distributed file system namespaces and operating system pathname syntax
JP3410010B2 (en) 1997-12-24 2003-05-26 株式会社日立製作所 Subsystem migration method and information processing system
US5937406A (en) * 1997-01-31 1999-08-10 Informix Software, Inc. File system interface to a database
AU3304699A (en) * 1998-02-20 1999-09-06 Storm Systems Llc File system performance enhancement
US6850959B1 (en) 2000-10-26 2005-02-01 Microsoft Corporation Method and system for transparently extending non-volatile storage
US6976060B2 (en) 2000-12-05 2005-12-13 Agami Sytems, Inc. Symmetric shared file storage system
EP1368736A2 (en) * 2001-01-11 2003-12-10 Z-Force Communications, Inc. File switch and switched file system
US7284030B2 (en) * 2002-09-16 2007-10-16 Network Appliance, Inc. Apparatus and method for processing data in a network

Also Published As

Publication number Publication date
JP2006524873A (en) 2006-11-02
EP1618500A4 (en) 2009-01-07
WO2004097572A2 (en) 2004-11-11
WO2004097572A3 (en) 2005-01-06
EP1618500A2 (en) 2006-01-25
JP4588024B2 (en) 2010-11-24

Similar Documents

Publication Publication Date Title
US7587422B2 (en) Transparent file replication using namespace replication
US7346664B2 (en) Transparent file migration using namespace replication
US7720796B2 (en) Directory and file mirroring for migration, snapshot, and replication
US7072917B2 (en) Extended storage capacity for a network file server
US8195627B2 (en) Storage policy monitoring for a storage network
US8190741B2 (en) Customizing a namespace in a decentralized storage environment
US7831641B2 (en) Large file support for a network file server
Braam The Lustre storage architecture
WO2004097686A1 (en) Transparent file replication using namespace replication
EP1805665B1 (en) Storage policy monitoring for a storage network
US7877511B1 (en) Method and apparatus for adaptive services networking
US8131689B2 (en) Accumulating access frequency and file attributes for supporting policy based storage management
US6947940B2 (en) Uniform name space referrals with location independence
US6732117B1 (en) Techniques for handling client-oriented requests within a data storage system
US8909753B2 (en) Root node for file level virtualization
US20100114889A1 (en) Remote volume access and migration via a clustered server namespace
EP1934838A2 (en) Accumulating access frequency and file attributes for supporting policy based storage management
US20050108237A1 (en) File system
US6952699B2 (en) Method and system for migrating data while maintaining access to data with use of the same pathname
WO2004097571A2 (en) Extended storage capacity for a network file server
WO2023097229A1 (en) Fast database scaling utilizing a decoupled storage and compute architecture
Comer et al. Shadow editing: A distributed service for supercomputer access
CN111562936A (en) Object history version management method and device based on Openstack-Swift
WO2003107208A1 (en) Scalable storage system

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: COMMUNICATION UNDER RULE 69 EPC ( EPO FORM 1205A DATED 10/02/06 )

122 Ep: pct application non-entry in european phase