US 20080154525 A1 Abstract A computer program for performing a method of providing a parameter estimate from noisy data with aperiodic data arrival. The parameter of the measurement is estimated as a numerator divided by the denominator. The method involves setting a fixed time interval and then waiting for the time interval to expire or for a measurement to occur. If a measurement occurs before the time interval expires the numerator is estimated as a previous numerator plus the new measurement, and the denominator is estimated as a previous denominator plus one. Regardless of whether the measurement occurs or the time interval expires the numerator is estimated as a previous numerator times a step size and the denominator is estimated as a previous denominator times a step size. The method can be applied to numerous applications including assessing data temperature and predicting I/O response times.
Claims(17) 1. A computer-readable medium having a computer program stored thereon comprising executable instructions for performing a method comprising:
setting a fixed time interval; waiting for the time interval to expire or for a measurement to occur; if a measurement occurs, calculating a first numerator as a numerator of a previous measurement plus the measurement and calculating a first denominator as a denominator of the previous measurement plus one; calculating a second numerator as the first numerator times a step size and calculating a second denominator as the first denominator times the step size; estimating a staleness measurement from the second denominator; and calculating an estimate of the measurement as the second numerator divided by the second denominator. 2. The computer-readable medium of 3. The computer-readable medium of 4. The computer-readable medium of 5. The computer-readable medium of 6. The computer-readable medium of 7. The computer-readable medium of ^{τ }where τ is a quotient of a staleness period and an alarm period and a is the step size.8. The computer-readable medium of 9. A computer-readable medium having a computer program stored thereon comprising executable instructions for estimating a temperature comprising:
setting a time interval; waiting for the time interval to expire or for a measurement of temperature to occur; if a measurement of temperature occurs, calculating a first numerator as a numerator of a previous temperature measurement plus the measurement and calculating a first denominator as a denominator of the previous measurement plus one; calculating a second numerator as the first numerator times a step size and calculating a second denominator as the first denominator times the step size; estimating a staleness measurement from the second denominator; and calculating an estimate the temperature as the second numerator divided by the second denominator. 10. The computer-readable medium of 11. The computer-readable medium of 12. The computer-readable medium of 13. The computer-readable medium of 14. The computer-readable medium of 15. The computer-readable medium of ^{τ }where τ is a quotient of a staleness period and an alarm period and α is the step size.16. The computer-readable medium of 17. A computer-readable medium having a computer program stored thereon comprising executable instructions for predicting I/O response times comprising:
setting a fixed time interval; waiting for the time interval to expire or for an I/O measurement to occur; if an I/O measurement occurs, calculating a first numerator as a numerator of a previous measurement plus the I/O measurement and calculating a first denominator as a denominator of the previous measurement plus one; calculating a second numerator as the first numerator times a step size and calculating a second denominator as the first denominator times the step size; estimating a staleness measurement from the second denominator; and calculating an estimate of the I/O measurement response time as the second numerator divided by the second denominator.

Description

Estimates of a time-varying parameter are often required from a plurality of noisy measurements of that parameter. A standard technique for providing these estimates is called exponential smoothing. Exponential smoothing is essentially a simple average that weights recent measurements more heavily than earlier measurements. An estimate of the parameter is adjusted towards each new measurement according to the equation:

$$k_i = (1 - a)\,k_{i-1} + a\,x_i$$

where $k_i$ is the new estimate, $k_{i-1}$ is the current estimate and $x_i$ is the measurement. The configurable step size $a$ controls the size of the adjustment. Exponential smoothing is used in a wide range of applications.

A disadvantage of the exponential smoothing technique arises when measurements do not occur at fixed intervals. The technique provides no way of discounting old measurements according to their age, and it gives no indication of the age of the measurements that could be used to control the influence of old measurements. It discounts old measurements only on the basis of how many measurements have been received after the measurement in question. The assumption behind this technique is that measurements are received on a regular or semi-regular basis.
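The per-measurement update described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the usual form of exponential smoothing, in which the new estimate is (1 − a) times the current estimate plus a times the measurement; the function name and its arguments are hypothetical and do not come from the original text. Note that the loop advances once per measurement, however much time separates one measurement from the next.

```python
def exponential_smoothing(measurements, step_size, initial_estimate):
    """Return the successive estimates k_i = (1 - a) * k_{i-1} + a * x_i."""
    if not 0.0 < step_size < 1.0:
        raise ValueError("step_size must lie strictly between 0 and 1")
    estimate = initial_estimate  # must be supplied before any data arrives
    history = []
    for x in measurements:
        # One adjustment per measurement; elapsed time plays no role.
        estimate = (1.0 - step_size) * estimate + step_size * x
        history.append(estimate)
    return history


print(exponential_smoothing([10.0, 10.0], step_size=0.5, initial_estimate=0.0))
# [5.0, 7.5]
```

The sketch also makes the second drawback visible: an initial estimate has to be passed in, and it is weighted as though it summarized many measurements.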
For example, if there are two physical clusters, one of which receives I/Os more frequently than the other, the smoothed measurement of the response time of the frequently accessed cluster will react more quickly to changes in the measured parameter than that of the infrequently accessed cluster. If left uncorrected, these differences in responsiveness cause unpredictability.

Another disadvantage is that the smoothing equation given above requires an initial estimate of the parameter before the algorithm is run. After this initial estimate, exponential smoothing proceeds as if the estimate were as accurate as the smoothed average of many measurements. Exponential smoothing thus relies on the assumption that the initial estimate is accurate. This assumption is unrealistic in situations where system architects have little or no prior knowledge of the parameter being measured.

Described below is a computer program stored on tangible storage media for performing a method of providing a parameter estimate from noisy data with aperiodic data arrival.

One technique described below involves setting a fixed time interval and then waiting for the time interval to expire or for a measurement to occur. If a measurement occurs before the time interval expires, a numerator is estimated as a previous numerator plus the new measurement and the denominator is estimated as a previous denominator plus one. Regardless of whether the measurement occurs or the time interval expires, the numerator is estimated as a previous numerator times a step size and the denominator is estimated as a previous denominator times a step size. A staleness measurement is estimated from the denominator. The parameter is estimated as the numerator divided by the denominator. The method can be applied to many different applications including assessing data temperature and predicting I/O response times.

Staleness ranges from just above 0 to infinity.
A staleness value just above zero is extremely fresh, whereas an infinite staleness value is the stalest. A staleness value of zero can only occur if there are an infinite number of measurements.

The staleness value produced by the algorithm can be compared to a configurable staleness threshold to decide whether some action should be taken. Actions include ignoring the estimate and causing a measurement to be taken. For example, when the staleness value of a physical cluster exceeds the threshold, TVSA removes that physical cluster from consideration for migration and issues probe I/Os to measure the response time of the cluster and reduce the staleness value.

If a prior estimate of the parameter being measured is available, the initial staleness value can be set in accordance with the confidence in that estimate. For example, if a prior estimate is available and there is confidence in its accuracy, then the initial value of the denominator v can be set to a high value, which corresponds to a low staleness. After this initial value is set, the numerator is set as the denominator multiplied by the estimate of the parameter.

The staleness threshold is set by a user. Alternatively, the staleness threshold can be set indirectly. In setting the staleness threshold indirectly, users of the method can consider how long the parameter should remain fresh between measurements. The hypothetical situation used to set the staleness threshold according to this rationale is a parameter that had infinite staleness (v=0) until the moment a measurement arrived. The staleness period is measured from this moment in time. The staleness threshold is set according to the equation:

$$\text{staleness threshold} = (1 - a)^{t}$$

In this formula, t=(staleness period)/(alarm period). The staleness period is the maximum time allowed between measurements. If the staleness period is exceeded, a measurement is triggered.
The alarm period denotes how frequently the staleness value is updated in the absence of a measurement. This formula will always create a staleness threshold between 0 and 1. In general the staleness threshold can range between 0 and infinity. The disparity between the staleness threshold set by the formula and the general staleness threshold is due to the assumption that the staleness period is measured on a hypothetical parameter for which v was 0 before the measurement that began the staleness period.

The staleness threshold formula is derived as follows. At the beginning of the time period in which the final measurement of the parameter occurred, v was set to 0. After the measurement occurred, v was incremented to 1. In each successive time period v was multiplied by (1−a) but was not incremented, because no more measurements were made. The parameter v would have been multiplied by the (1−a) factor t=(staleness period)/(alarm period) times, so the value of v after the full staleness period elapsed would be $(1-a)^{t}$.

The program waits until a set interval expires or a measurement (x) is received. When a measurement is received, the numerator is set to the previous numerator plus the measurement (u=u+x) and the denominator is set to the previous denominator plus 1 (v=v+1). After the numerator and denominator have been updated, or if the interval expires without a measurement, the numerator is updated to the previous numerator times (1−the step size), (u=u(1−a)), and the denominator is updated to the previous denominator times (1−the step size), (v=v(1−a)).

The step size a is chosen as a value between 0 and 1. The closer the step size is to 1, the more confidence is put in the current measurement. The closer the step size is to 0, the more confidence is put in previous measurements. A typical value of the step size is 0.3.
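The update steps above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the class and method names are hypothetical, and the timer that decides when an interval has expired is left to the caller. The estimate is taken as u/v and the staleness as 1/v, as the description states; returning `None` and infinity when v is 0 is an assumption made here to avoid dividing by zero.

```python
import math


class AperiodicSmoother:
    """Hypothetical helper that tracks the numerator u and denominator v."""

    def __init__(self, step_size=0.3, u=0.0, v=0.0):
        if not 0.0 < step_size < 1.0:
            raise ValueError("step size must lie strictly between 0 and 1")
        self.a = step_size
        self.u = u  # discounted sum of measurements
        self.v = v  # discounted count of measurements

    def _discount(self):
        # Applied after every measurement and after every expired interval.
        self.u *= 1.0 - self.a
        self.v *= 1.0 - self.a

    def on_measurement(self, x):
        self.u += x
        self.v += 1.0
        self._discount()

    def on_interval_expired(self):
        self._discount()

    def estimate(self):
        # Undefined until a measurement or a prior estimate is present.
        return self.u / self.v if self.v > 0.0 else None

    def staleness(self):
        return 1.0 / self.v if self.v > 0.0 else math.inf


s = AperiodicSmoother(step_size=0.5)
s.on_measurement(10.0)    # u = 5.0,   v = 0.5
s.on_interval_expired()   # u = 2.5,   v = 0.25
s.on_measurement(20.0)    # u = 11.25, v = 0.625
print(s.estimate(), s.staleness())  # 18.0 1.6
```

An interval that expires without a measurement leaves the estimate u/v unchanged, because both terms shrink by the same factor, but it raises the staleness 1/v.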
Staleness can be measured by 1/denominator (1/v), and an estimate of the measurement can be given as the numerator divided by the denominator (u/v). An alternative measure to staleness is freshness. The freshness value is the denominator (v).

If measurements occur at the same fixed interval as the set interval, and confidence in the initial estimate leads to an initial value of v=(1−a)/a, then the method is identical to traditional exponential smoothing.

Aperiodic exponential smoothing with staleness reporting can be applied to measuring or forecasting any quantity that satisfies three conditions. The first condition is that the quantity changes with time. The second condition is that aperiodic measurements of the quantity are occasionally received. The third condition is that there is the ability to initiate measurements of the quantity. The self-initiated measurements may have some cost.

Examples of applications of the invention include network routing. The network could be either a data network such as the internet or a physical transport network such as a network of roads. With any network there are a number of paths for traffic from a source to a destination. The task of routing traffic along the network requires knowing or producing estimates of travel times along various paths of the network. The goal of routing is to send traffic from the source to the destination in the shortest time possible. Estimates of travel times along paths of the network arrive aperiodically, for example from packet arrival times on a data network or, on a transport network, from publicly available reports or radio communication from employees. Although estimates are provided by previously made traffic routing decisions, routing is usually done without the express purpose of measurement. Thus, from the point of view of measurement, the arriving estimates are out of the control of the router.
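For the routing setting, a decision rule built on per-path numerators and denominators might look like the following sketch. Everything named here is hypothetical: the function names, the path names, and the policy of sending traffic along the stalest path once its staleness exceeds the threshold. The threshold is assumed to be expressed on the same 1/v scale as the staleness value and to be finite.

```python
import math


def staleness(v):
    """Staleness is 1/v; a path never measured (v == 0) is infinitely stale."""
    return 1.0 / v if v > 0.0 else math.inf


def choose_path(paths, staleness_threshold):
    """Pick a path to route along.

    paths maps a path name to its (u, v) pair. Returns (name, reason), where
    reason is "probe" when the stalest path exceeds the threshold and
    "fastest" otherwise.
    """
    if not math.isfinite(staleness_threshold):
        raise ValueError("staleness_threshold must be finite")
    stalest = max(paths, key=lambda name: staleness(paths[name][1]))
    if staleness(paths[stalest][1]) > staleness_threshold:
        return stalest, "probe"
    # Every path is fresh enough here, so every v is greater than zero.
    fastest = min(paths, key=lambda name: paths[name][0] / paths[name][1])
    return fastest, "fastest"


paths = {"north": (30.0, 3.0), "south": (8.0, 1.0), "east": (1.0, 0.05)}
print(choose_path(paths, staleness_threshold=10.0))  # ('east', 'probe')

paths["east"] = (12.0, 1.0)  # pretend a fresh measurement has arrived
print(choose_path(paths, staleness_threshold=10.0))  # ('south', 'fastest')
```

In the first call the estimate for "east" rests on almost no recent data, so traffic is sent that way to obtain a measurement; once it is refreshed, routing falls back to the path with the lowest estimated travel time.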
The present invention can be applied to estimate travel times along paths and to indicate when the estimated travel time along a path has become stale. When an estimated travel time along a path has become stale, the decision can be made to route a packet of data or a physical vehicle along that path in order to measure the travel time along it.

As an alternative, the invention can be applied to military surveillance. For example, a military target may be monitored by a satellite camera, but only when the weather is clear. If the satellite is unable to view the target for several days due to cloudy weather, the military may send a low-flying plane over the target to compensate for the lack of satellite information. The military may use the invention to estimate some quantity, for example the number of personnel at the target, and to decide when to supplement satellite imagery with other more dangerous and expensive surveillance techniques.

A third application is a website that compiles reviews of a product or service whose quality may change over time. One example is reviewing restaurants. Most of the reviews hosted by the website are provided at no cost by visitors to the website. Some reviews are written by website employees. The website requires accurate and up-to-date reviews. The website could use the invention to estimate the current quality of a restaurant based on previous reviews and to decide when to send a paid employee to a restaurant that has not been reviewed for a while.

The text above describes one or more specific embodiments of a broader invention. The invention also is carried out in a variety of alternative embodiments and thus is not limited to those described here. Those other embodiments are also within the scope of the following claims.