TECHNIQUES FOR STORING DATA BASED UPON STORAGE POLICIES
CROSS-REFERENCES TO RELATED APPLICATIONS
This application claims priority from and is a non-provisional of the following applications, the entire contents of which are herein incorporated by reference for all purposes:
(1) U.S. Provisional Patent Application No. 60/316,764 filed Aug. 31, 2001; and
(2) U.S. Provisional Patent Application No. 60/358,915 filed Feb. 21, 2002.
This application also incorporates by reference for all purposes the entire contents of the following applications:
(1) U.S. Provisional Patent Application No. 60/340,227 filed Dec. 14, 2001; and
(2) U.S. Non-Provisional patent application Ser. No. 10/133,123 filed Apr. 25, 2002.
BACKGROUND OF THE INVENTION
The present invention relates generally to the field of data storage and management, and more particularly to techniques for determining storage locations for data in a storage environment based upon storage policies configured for the storage environment.
Heterogeneous and complex storage environments comprising storage systems and devices with different cost, capacity, bandwidth, and other performance characteristics are rapidly replacing conventional homogeneous data storage environments. Due to their heterogeneous nature, managing storage of data in such environments is a difficult and complex task. An important information management function in such heterogeneous data storage environments is to determine where to store the data among the various available storage devices in a manner that reduces costs associated with the data storage while providing efficient data access.
In several conventional data storage environments, the decision where to store the data is generally manually determined by a user (e.g., a system administrator) of the data storage environment. The user may make the decision based upon data usage patterns and upon characteristics of the storage devices available in the storage environment for storing the data. Accordingly, in such environments, the system administrator has to gather frequency and data usage information, data access and performance requirements, and frequency of access information from users or consumers of the data. The administrator also has to determine characteristics (e.g., cost, capacity, other performance characteristics) of storage devices available for storing the data. The administrator then typically makes an educated guess as to where the data is to be stored. While the manual approach described above may be feasible in simple homogeneous storage environments supporting a small number of data consumers, such an approach is impractical for today's large and heterogeneous storage environments.
Presently, several conventional data management systems are available that automate part of the data storage decision making process. For example, automated data backup applications are available that perform hierarchical storage management (HSM) to move data from online to off-line storage (or primary to secondary backup media). However, conventional data management systems do not presently offer the flexibility, control, and automation desired by system administrators for managing large heterogeneous storage environments comprising a large number of data consumers, servers, and hosts.
In light of the above, there is a need for automated techniques that allow data storage administrators to efficiently manage distributed data and storage resources with minimum intervention in a manner that facilitates efficient data access while optimizing the use of available storage resources.
BRIEF SUMMARY OF THE INVENTION
Embodiments of the present invention provide automated techniques for determining storage locations for data in a storage environment based upon storage policies configured for the storage environment. The storage location is determined in a manner that enables efficient data access while optimizing the available storage resources with minimum human intervention. The storage locations are determined based upon characteristics associated with the data to be stored, based upon characteristics of the storage devices, and based upon storage policies configured for the storage environment.
According to an embodiment of the present invention, techniques are provided for determining a storage device for storing data in a storage environment comprising a plurality of storage devices. An embodiment of the present invention receives a signal to store a data file. The embodiment then identifies a set of one or more placement rules configured for the storage environment, each placement rule comprising data-related criteria identifying one or more conditions related to one or more characteristics of the data to be stored and device-related criteria identifying one or more conditions related to one or more storage device characteristics. A data value score (DVS) is calculated for each placement rule in the set of placement rules based upon the data-related criteria of the placement rule and characteristics of the data file. The embodiment then determines a storage device, from the plurality of storage devices, for storing the data file based upon the set of placement rules and their associated DVSs, characteristics of the plurality of storage devices, and characteristics of the data file to be stored.
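For illustration only, the flow summarized above (identify placement rules, calculate a DVS per rule, then choose a device) might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed method: the names `PlacementRule`, `data_value_score`, and `select_device` are hypothetical, the DVS is assumed to be the fraction of a rule's data-related conditions that the file satisfies, and the device is assumed to be the first one meeting all device-related conditions of the highest-scoring rule. This summary does not specify the scoring formula or the selection order.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

# A condition is a predicate over a dictionary of characteristics.
Condition = Callable[[Dict[str, Any]], bool]


@dataclass
class PlacementRule:
    """Hypothetical placement rule: data-related and device-related criteria."""
    name: str
    data_criteria: List[Condition]
    device_criteria: List[Condition]


def data_value_score(rule: PlacementRule, file_info: Dict[str, Any]) -> float:
    """ASSUMPTION: DVS = fraction of data-related conditions the file meets."""
    if not rule.data_criteria:
        return 1.0  # a rule with no data conditions applies to every file
    met = sum(1 for cond in rule.data_criteria if cond(file_info))
    return met / len(rule.data_criteria)


def select_device(rules: List[PlacementRule],
                  devices: List[Dict[str, Any]],
                  file_info: Dict[str, Any]) -> Optional[str]:
    """Walk rules in descending DVS order; return the first qualifying device."""
    scored = [(data_value_score(r, file_info), r) for r in rules]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    for score, rule in scored:
        if score == 0.0:
            continue  # ASSUMPTION: rules with a zero DVS are not considered
        for device in devices:
            if all(cond(device) for cond in rule.device_criteria):
                return device["name"]
    return None  # no rule and device combination qualified


# Illustrative data only; every value below is made up for the example.
rules = [
    PlacementRule(
        name="large-media",
        data_criteria=[lambda f: f["size_mb"] > 100,
                       lambda f: f["type"] == "video"],
        device_criteria=[lambda d: d["free_gb"] >= 500],
    ),
    PlacementRule(
        name="office-docs",
        data_criteria=[lambda f: f["type"] == "doc"],
        device_criteria=[lambda d: d["cost_per_gb"] <= 0.05],
    ),
]
devices = [
    {"name": "fast-array", "free_gb": 200, "cost_per_gb": 0.50},
    {"name": "bulk-nas", "free_gb": 2000, "cost_per_gb": 0.04},
]
file_info = {"name": "demo.mpg", "size_mb": 700, "type": "video"}

print(select_device(rules, devices, file_info))  # bulk-nas
```

In this example the "large-media" rule scores 1.0 and "office-docs" scores 0.0, so only the first rule is consulted; "fast-array" fails its free-space condition and "bulk-nas" is chosen. Other scoring schemes (for example, weighted conditions) would fit the same structure.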
The foregoing, together with other features, embodiments, and advantages of the present invention, will become more apparent when referring to the following specification, claims, and accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a simplified block diagram of a distributed system that may incorporate an embodiment of the present invention;
FIG. 2 is a simplified block diagram of a data management server according to an embodiment of the present invention;
FIG. 3 depicts examples of placement rules according to an embodiment of the present invention;
FIG. 4 is a simplified high-level flowchart depicting a method of selecting a storage device from a storage environment for storing a data file based upon a storage policy configured for the storage environment according to an embodiment of the present invention; and
FIGS. 5A and 5B depict a simplified high-level flowchart showing processing performed for identifying a storage device for storing the data file based upon the ranked