CAPACITY PLANNING FOR SERVER
This is a continuation of U.S. patent application Ser. No. 09/549,816, filed on Apr. 14, 2000, entitled "Capacity Planning For Server Resources", listing Matt Odhner, Giedrius Zizys and Kent Schliiter as inventors, and which is assigned to the assignee of this application, now U.S. Pat. No. 6,862,623, which is hereby incorporated by reference. This application is related to U.S. patent application Ser. No. 10/897,645, filed on Jul. 23, 2004, which is a divisional of application Ser. No. 09/549,816, and which is also assigned to the assignee of this application.
This invention relates to server systems, and more particularly to systems and methods for server resource capacity planning in server systems.
Capacity planning is forward-looking resource management that allows a computer system administrator to plan for expected changes in system resource utilization and alter a system to adequately handle such changes. Server performance and capacity planning are top concerns of computer administrators and business managers. If the lack of a proactive and continuous capacity planning procedure leads to unexpected unavailability and performance problems, the resulting downtime could be financially devastating to a company that depends heavily on server performance, such as an Internet-based merchant.
The importance of superior capacity planning is heightened by the continuous growth in server-dependent companies and potential customers for such companies. Even a solid company that has millions of customers can quickly decline in popularity if it does not increase its resources to handle a constant increase in customers. Excessive downtime of such a company can cause customers to take their business elsewhere.
Capacity planning requires both scientific and intuitive knowledge of a server system. It requires in-depth knowledge of the resource being provided and an adequate understanding of future server traffic. The difficulty of the problem has been increased by the development of technology in which multiple servers, or a server cluster, are employed to handle a network or an Internet website.
Current capacity planning methods do not adequately estimate the number of servers having certain resources that a system will need to handle expected loads (number of requests per second). Therefore, a capacity planning method and system are needed in which a user can provide an expected load that the system needs to handle and receive information on how to increase servers and/or resources to adequately handle that load.
Methods and systems for providing capacity planning of server resources are described herein. The methods and systems contemplate using measured data, estimations and extrapolation to provide capacity planning results that are more accurate than current schemes. Server resources for which utilization is calculated are processor utilization, communication bandwidth utilization, memory utilization, and general server utilization.
Utilization is expressed in terms of actual use of the resource in relation to the total amount of resource available for use. For example, processor utilization is expressed as a percentage of total processing power available. Communication bandwidth utilization is expressed as a percentage of total communication bandwidth available. Memory utilization is expressed as a percentage of total memory available. General server utilization is expressed as a ratio between a current service rate (number of requests per second served) and maximum possible service rate (maximum number of requests the server is capable of serving). This is less specific than showing the processor, bandwidth, and memory utilization, but it is useful for viewing resource constraints that do not fall under the other three categories.
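The utilization definitions above can be illustrated with a minimal sketch. The function names and all numeric figures below are hypothetical and chosen only for illustration; they are not taken from the described system.

```python
def utilization_percent(used: float, total: float) -> float:
    """Return actual use of a resource as a percentage of the total available."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


def general_server_utilization(current_rate: float, max_rate: float) -> float:
    """Return the ratio of current service rate to maximum possible service rate."""
    if max_rate <= 0:
        raise ValueError("max_rate must be positive")
    return current_rate / max_rate


# Hypothetical figures, for illustration only.
memory_util = utilization_percent(used=384, total=512)      # MB in use / MB installed
bandwidth_util = utilization_percent(used=30, total=100)    # Mbit/s used / Mbit/s available
general_util = general_server_utilization(current_rate=150, max_rate=600)  # requests/s
```

With these assumed inputs, memory utilization is 75%, bandwidth utilization is 30%, and general server utilization is a ratio of 0.25.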
In a first implementation described herein, referred to as a 'manual' method, a user provides several server parameter values that indicate operating parameters for one or more servers in a server cluster. The parameters include, but are not limited to, a specified load to be handled by the server cluster, the number of servers in the cluster, the available communication bandwidth, the processor type and speed for each machine, and the number of processors and amount of memory per machine.
In addition, the user provides document type information that includes the types of documents the server cluster will transmit in response to requests from clients. In the manual method, the documents are classified according to type and size of document, and the user provides the capacity planner with the percentage of each type of document as it relates to the entire amount of documents.
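One way to picture the document type input is as a list of classes, each with a size and a percentage share. The sketch below is an assumption made for illustration: the class labels, sizes, and the use of a weighted average size are hypothetical and are not stated in the description above.

```python
import math

# Hypothetical document mix: (type label, size in KB, percent of all documents).
document_mix = [
    ("html_small", 10, 50.0),
    ("image_medium", 40, 30.0),
    ("asp_large", 100, 20.0),
]


def validate_mix(mix):
    """Check that the percentages supplied by the user cover the whole mix."""
    total = sum(percent for _, _, percent in mix)
    if not math.isclose(total, 100.0):
        raise ValueError(f"percentages sum to {total}, expected 100")


def average_size_kb(mix):
    """Weighted average document size; an illustrative summary, not a claimed step."""
    validate_mix(mix)
    return sum(size_kb * percent / 100.0 for _, size_kb, percent in mix)


avg_kb = average_size_kb(document_mix)
```

Checking that the shares sum to 100 catches an incomplete classification before any utilization estimate is attempted.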
The user also provides information regarding the percentages of different client connections, e.g., 14K, 56K, ADSL, T1, etc. The differences in client connection types affect the resources of the server. For instance, if a client connects to the server cluster at a lower connection speed, then that connection will be held open for a longer period of time to accommodate data transmission and more server resources will be consumed than if the client had connected at a higher connection speed.
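The effect of connection speed on how long a connection is held open can be sketched as a simple transfer-time calculation. The nominal speeds below are assumptions for illustration (ADSL rates in particular vary widely), and the function names are hypothetical; protocol overhead and latency are ignored.

```python
# Nominal downstream speeds in bits per second. Illustrative assumptions only.
NOMINAL_BPS = {
    "14K": 14_400,
    "56K": 56_000,
    "ADSL": 1_500_000,
    "T1": 1_544_000,
}


def hold_time_seconds(document_bytes: int, connection: str) -> float:
    """Approximate time the connection stays open to transmit one document."""
    return document_bytes * 8 / NOMINAL_BPS[connection]


def mean_hold_time_seconds(document_bytes: int, connection_mix: dict) -> float:
    """Average hold time, weighted by the percentage of clients per connection type."""
    return sum(
        percent / 100.0 * hold_time_seconds(document_bytes, connection)
        for connection, percent in connection_mix.items()
    )


# Hypothetical client mix: percent of clients on each connection type.
client_mix = {"14K": 10.0, "56K": 60.0, "ADSL": 20.0, "T1": 10.0}
slow = hold_time_seconds(14_400, "14K")
fast = hold_time_seconds(14_400, "T1")
mean = mean_hold_time_seconds(14_400, client_mix)
```

Under these assumptions the same document occupies a connection far longer for a 14K client than for a T1 client, which is why the connection mix matters to the estimate.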
A theoretical maximum load value is obtained from a pre-defined load table that contains empirically-derived maximum load values handled by servers having a known amount of memory and a processor having a known speed. If the server does not have a processor speed and memory that exactly matches a load table entry, then the closest match is found and the load value for that match is used as the maximum load that can be handled by the system. This maximum load value is used in calculations to obtain the server resource utilization estimates.
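The load table lookup described above can be sketched as follows. The table entries are invented for illustration, and the notion of "closest" used here (sum of relative differences in processor speed and memory) is an assumption, since the text does not define how the closest match is chosen.

```python
# Hypothetical load table: (processor MHz, memory MB) -> maximum requests per second.
LOAD_TABLE = {
    (400, 128): 300,
    (400, 256): 400,
    (500, 256): 500,
    (500, 512): 650,
}


def max_load(cpu_mhz: int, memory_mb: int, table=LOAD_TABLE) -> int:
    """Return the table load for an exact match, else for the closest entry."""
    key = (cpu_mhz, memory_mb)
    if key in table:
        return table[key]

    def distance(entry):
        # Assumed metric: relative difference in speed plus relative difference in memory.
        mhz, mb = entry
        return abs(mhz - cpu_mhz) / cpu_mhz + abs(mb - memory_mb) / memory_mb

    closest = min(table, key=distance)
    return table[closest]
```

For example, under this assumed metric a 450 MHz server with 512 MB of memory would be matched to the (500, 512) entry.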
Once the server resource utilizations have been derived, a recommendation is made to the user as to what changes, if any, need to be made to the server cluster to accommodate the specified load. For instance, if a specified load input by the user produces a processor utilization estimate of, say, 90%, the capacity planner would recommend that another processor be added to the server cluster to safely handle the specified load.
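The recommendation step can be pictured as a comparison of each utilization estimate against a safety threshold. The 80% threshold, the resource names, and the wording of the suggested actions below are assumptions for illustration; the text gives only the example of a 90% processor estimate leading to a recommendation to add a processor.

```python
SAFE_THRESHOLD_PERCENT = 80.0  # assumed value, not taken from the description

# Hypothetical mapping from resource name to a suggested change.
ACTIONS = {
    "processor": "add a processor",
    "memory": "add memory",
    "bandwidth": "increase communication bandwidth",
}


def recommend(utilization: dict, threshold: float = SAFE_THRESHOLD_PERCENT) -> list:
    """Return one suggestion per resource whose estimate meets or exceeds the threshold."""
    return [
        f"{resource} at {percent:.0f}%: {ACTIONS.get(resource, 'add capacity')}"
        for resource, percent in utilization.items()
        if percent >= threshold
    ]


# Hypothetical utilization estimates for a specified load.
suggestions = recommend({"processor": 90.0, "memory": 40.0, "bandwidth": 55.0})
```

With these assumed estimates only the processor exceeds the threshold, so the single suggestion is to add a processor, matching the example in the text.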
In a second implementation described herein, referred to as a 'historical' method, a filter, such as an ISAPI (Internet Server Application Programmer Interface) filter, collects actual server communication parameter values at certain time intervals from the server cluster. Also, a monitor on each server in the server cluster collects other types of server parameter values at certain time intervals. The collected server parameter values are then used to extrapolate a