US 20020012449 A1
Abstract
A tracking method is disclosed. The method of the present invention tracks an object using a probability distribution of the desired object. The method operates by first calculating a mean location of a probability distribution within a search window. Next, the search window is centered on the calculated mean location. The steps of calculating a mean location and centering the search window may be performed until convergence. The search window may then be resized. Successive iterations of calculating a mean, centering on the mean, and resizing the search window track an object represented by the probability distribution. In one embodiment, a flesh hue probability distribution is generated from an input video image. The flesh hue probability distribution is used to track a human head within the video image.
Claims(20)
1. A method of tracking a probability distribution, said method comprising:
calculating a mean location of a probability distribution within a search window;
centering said search window onto said mean location;
resizing said search window; and
repeating said steps of calculating, resizing and centering.
2. The method as claimed in
3. The method as claimed in
4. The method as claimed in
5. The method as claimed in
6. The method as claimed in determining an orientation of said probability distribution.
7. A method of tracking an object within a video image, said method comprising:
converting said video image into a probability distribution;
calculating a mean location of a probability distribution within a search window;
centering said search window onto said mean location; and
repeating said steps of calculating and centering.
8. The method as claimed in resizing said search window.
9. The method as claimed in
10. The method as claimed in
11. The method as claimed in
12. The method as claimed in determining an orientation of said probability distribution.
13. The method as claimed in
14. The method as claimed in
15. An apparatus for tracking an object, said apparatus comprising:
a video camera, said video camera capturing an image of said object;
a video digitizer, said video digitizer digitizing said image; and
a computer system, said computer system converting said image into a probability distribution, said computer system iteratively calculating a mean location of a probability distribution within a subarea of said image and centering said subarea onto said mean location.
16. The apparatus as claimed in
17. The apparatus as claimed in
18. The apparatus as claimed in
19. The apparatus as claimed in
20. The apparatus as claimed in
Description
[0001] The present invention relates to the field of image processing, computer vision, and computer graphical user interfaces. In particular, the present invention discloses a video image based tracking system that allows a computer to identify and track the location of a moving object within a sequence of video images.
[0002] There are many applications of object tracking in video images. For example, a security system can be created that tracks people that enter a video image. A user interface can be created wherein a computer tracks the gestures and movements of a person in order to control some activity.
[0003] However, traditional object tracking systems are computationally expensive and difficult to use. One example of a traditional method of tracking objects in a scene uses object pattern recognition and edge detection. Such methods are very computationally intensive. Furthermore, such systems are notoriously difficult to train and calibrate. The results produced by such methods often contain a significant amount of jitter such that the results must be filtered before they can be used for a practical purpose. This additional filtering adds more computation work that must be performed. It would therefore be desirable to have a simpler more elegant method of visually tracking a dynamic object.
[0004] A method of tracking a dynamically changing probability distribution is disclosed.
The method operates by first calculating a mean location of a probability distribution within a search window. Next, the search window is centered on the calculated mean location and the search window is then resized. Successive iterations of calculating a mean, centering on the mean, and resizing the search window track an object represented by the probability distribution.
[0005] Other objects, features, and advantages of the present invention will be apparent from the accompanying drawings and from the detailed description that follows.
[0006] The objects, features and advantages of the present invention will be apparent to one skilled in the art, in view of the following detailed description in which:
[0007] FIG. 1 illustrates an example computer workstation that may use the teachings of the present invention.
[0008] FIG. 2 illustrates a pixel sampling of a human face.
[0009] FIG. 3A illustrates a small portion of a sample image being converted into a flesh hue histogram.
[0010] FIG. 3B illustrates a normalized flesh hue histogram created by sampling a human face.
[0011] FIG. 4 illustrates a probability distribution of flesh hues of an input image.
[0012] FIG. 5 illustrates a flow diagram describing the operation of the mean shift method.
[0013] FIG. 6 illustrates an example of a continuously adaptive mean shift method applied to one dimensional data.
[0014] FIG. 7 illustrates a flow diagram describing the operation of the continuously adaptive mean shift method.
[0015] FIG. 8 illustrates an example of the continuously adaptive mean shift method applied to one dimensional data.
[0016] FIG. 9 illustrates a flow diagram describing the operation of a head tracker using the continuously adaptive mean shift method.
[0017] FIG. 10A illustrates a first diagram of a head within a video frame, a head tracker search window, and a calculation area used by the search window.
[0018] FIG. 10B illustrates a second diagram of a head within a video frame that is very close to the camera, a head tracker search window, and a calculation area used by the search window.
[0019] A method and apparatus for object tracking using a continuous mean shift method is disclosed. In the following description, for purposes of explanation, specific nomenclature is set forth to provide a thorough understanding of the present invention. However, it will be apparent to one skilled in the art that these specific details are not required in order to practice the present invention. For example, the present invention has been described with reference to an image flesh hue probability distribution. However, the same techniques can easily be applied to other types of dynamically changing probability distributions.
[0020] A method of tracking objects using a continuously adaptive mean shift method on a probability distribution is disclosed. To simplify the disclosure of the invention, one embodiment is presented wherein a human head is located and tracked within a flesh hue probability distribution created from a video image. However, the present invention can easily be used to track other types of objects using other types of probability distribution data. For example, the present invention could be used to track heat emitting objects using an infrared detection system. The present invention can also be used to track objects that are described using non-image data such as population distributions.
[0021] The disclosed embodiment operates by first capturing a “talking head” video image wherein the head and shoulders of the target person are within the video frame. Next, the method creates a two dimensional flesh hue probability distribution using a preset flesh hue histogram. Then, the location of the target person's head is determined by locating the center of the flesh hue probability distribution.
To determine the orientation of the target person's head, the major and minor axes of the flesh hue probability distribution are calculated.
[0022] Example Hardware
[0023] FIG. 1 illustrates one possible system for using the teachings of the present invention. In the illustration of FIG. 1, a user
[0024] Generating a Flesh Hue Histogram
[0025] The computer system
[0026] In one embodiment, each pixel in the video image is converted to or captured in a hue (H), saturation (S), and value (V) color space. Certain hue values in the sample region are accumulated into a flesh hue histogram. FIG. 3A illustrates a small nine by nine pixel block that has been divided into its hue (H), saturation (S), and value (V) components being converted into a flesh hue histogram. In the embodiment of FIG. 3A, the hue values are grouped into bins wherein each bin comprises five consecutive hue values. Hue values are only accumulated if their corresponding saturation (S) and value (V) values are above respective saturation (S) and value (V) thresholds. Referring to the example of FIG. 3A, the S threshold is 20 and the V threshold is 15 such that a pixel will only be added to the flesh hue histogram if the pixel's S value exceeds 20 and the pixel's V value exceeds 15. Starting at the upper left pixel, this first pixel is added to the flesh hue histogram since the pixel's S value exceeds 20 and the pixel's V value exceeds 15. Thus, a marker
[0027] After sampling all the pixels in the sample area, the flesh hue histogram is normalized such that the maximum value in the histogram is equal to a probability value of one (“1”). In a percentage embodiment of FIG. 3B, all the histogram bins contain flesh hue probability values between zero (“0”) and one hundred (“100”) as illustrated in FIG. 3B. Thus, in the normalized flesh hue probability histogram illustrated in FIG. 3B, pixel hues that are likely to be flesh hues are given high percentage values and pixel hues that are not likely to be flesh hues are given low percentage values.
[0028] Generating Flesh Hue Probability Images
[0029] Once a flesh hue probability histogram has been created, the computer system
[0030] Once a probability distribution has been created, the teachings of the present invention can be used to locate the center of an object and to track the object. An early embodiment of the present invention uses a standard mean shift method to track objects that have been converted into probability distributions.
[0031] FIG. 5 graphically illustrates how the standard mean shift method operates. Initially, at steps
[0032] An example of the mean shift method in operation is presented in FIG. 6. To simplify the explanation, the example is provided using a one dimensional slice of a two dimensional probability distribution. However, the same principles apply for a two or more dimensional probability distribution. Referring to step
[0033] To use the mean shift method for two dimensional image data, the following procedures are followed:
[0034] Find the zeroth moment:
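$$M_{00} = \sum_{x} \sum_{y} I(x, y)$$
[0035] Find the first moment for x & y:
$$M_{10} = \sum_{x} \sum_{y} x \, I(x, y); \qquad M_{01} = \sum_{x} \sum_{y} y \, I(x, y)$$
[0036] Then the mean location (the centroid) is:
$$x_c = \frac{M_{10}}{M_{00}}; \qquad y_c = \frac{M_{01}}{M_{00}}$$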
[0037] Where I(x, y) is the image value at position (x, y) in the image, and x and y range over the search window.
[0038] The mean shift method disclosed with reference to FIG. 5 and FIG. 6 provides relatively good results, but it does have a few flaws. For example, for dynamically changing and moving probability distributions such as probability distributions derived from video sequences, there is no proper fixed search window size. Specifically, a small window might get caught tracking a user's nose or get lost entirely for large movements. A large search window might include a user and his hands as well as people in the background. Thus, if the distribution dynamically changes in time, then a static search window will not produce optimal results.
[0039] To improve upon the mean shift method, the present invention introduces a continuously adaptive mean shift method referred to as a CAMSHIFT method. The CAMSHIFT method dynamically adjusts the size of the search window to produce improved results. The dynamically adjusting search window allows the mean shift method to operate better in environments where the data changes dynamically.
[0040] FIG. 7 graphically illustrates how the CAMSHIFT method of the present invention operates. At steps
[0041] An example of the continuously adaptive mean shift method in operation is presented in FIG. 8. Again, to simplify the explanation, the example is provided using a one dimensional slice of a two dimensional probability distribution. However, the same principles apply for a two or more dimensional distribution. In the example of FIG. 8, the continuously adaptive mean shift method adjusts the search window to a size that is proportional to the square root of the zeroth moment. Specifically, the continuously adaptive mean shift method in the example of FIG. 8 in two dimensions adjusts the search window to have a width and height of:
[0042] wherein M
[0043] where α
[0044] Referring to step
[0045] To provide an example usage of the continuously adaptive mean shift method, one specific embodiment of a head tracking system is provided. However, many variations exist. The example is provided with reference to FIG. 9, FIG. 10A and FIG. 10B.
[0046] To reduce the amount of data that needs to be processed, the head tracker uses a limited “calculation region” that defines the area that will be examined closely. Specifically, the calculation region is the area for which a probability distribution is calculated. Area outside of the calculation region is not operated upon. Referring to step
[0047] Next, at step
[0048] At step
[0049] A Kickstart Method
[0050] To initially determine the size and location of the search window, other methods of object detection and tracking may be used. For example, in one embodiment, a motion difference is calculated for successive video frames. The center of the motion difference is then selected as the center of the search window since the center of the motion difference is likely to be a person in the image.
[0051] Search Window Sizing
[0052] In a digital embodiment such as digitized video images, the probability distributions are discrete. Since the methods of the present invention climb the gradient of a probability distribution, the minimum search window size must be greater than one in order to detect a gradient. Furthermore, in order to center the search window, the search window should be of odd size. Thus, for discrete distributions, the minimum window size is set at three. Also, as the method adapts the search window size, the size of the search window is rounded to the nearest odd number greater than or equal to three in order to be able to center the search window. For tracking colored objects in video sequences, we adjust the search window size as described in equation 4.
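The centering and resizing steps described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function names (`odd_window_size`, `window_moments`, `camshift_step`), the square window, and the proportionality constant `scale` are all hypothetical, since the text states only that the window size is proportional to the square root of the zeroth moment. Resolving ties in the odd rounding upward is also an assumption.

```python
import math
import numpy as np


def odd_window_size(size, minimum=3):
    """Round `size` to the nearest odd integer, never below `minimum`.

    Ties (even values) round up; that tie rule is an assumption.
    """
    nearest_odd = 2 * math.floor(size / 2) + 1
    return max(minimum, nearest_odd)


def window_moments(prob, cx, cy, w, h):
    """Return (M00, M10, M01) of `prob` inside a w-by-h window centered at (cx, cy)."""
    rows, cols = prob.shape
    # Clip the window to the image bounds.
    x0, x1 = max(0, cx - w // 2), min(cols, cx + w // 2 + 1)
    y0, y1 = max(0, cy - h // 2), min(rows, cy + h // 2 + 1)
    patch = prob[y0:y1, x0:x1]
    ys, xs = np.mgrid[y0:y1, x0:x1]
    return patch.sum(), (xs * patch).sum(), (ys * patch).sum()


def camshift_step(prob, cx, cy, w, h, scale=2.0, max_iter=20):
    """Center the window on the mean location until it stops moving, then resize.

    `scale` is a hypothetical proportionality constant, not a value from the text.
    """
    for _ in range(max_iter):
        m00, m10, m01 = window_moments(prob, cx, cy, w, h)
        if m00 == 0:
            break  # nothing to track inside the window
        new_cx, new_cy = int(round(m10 / m00)), int(round(m01 / m00))
        if (new_cx, new_cy) == (cx, cy):
            break  # converged: the window is centered on the mean location
        cx, cy = new_cx, new_cy
    m00, _, _ = window_moments(prob, cx, cy, w, h)
    side = odd_window_size(scale * math.sqrt(m00))
    return cx, cy, side, side


# Usage: a 9x9 blob of probability 1 centered at x=25, y=18.
prob = np.zeros((41, 41))
prob[14:23, 21:30] = 1.0
result = camshift_step(prob, 20, 20, 15, 15)
```

Calling `camshift_step` again with the returned window, on the same or the next frame's distribution, gives the successive iterations that track the object.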
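[0053] Determining Orientation
[0054] After the probability distribution has been located by the search window, the orientation of the probability distribution can be determined. In the example of a flesh hue tracking system to locate a human head, the orientation of the head can be determined. To determine the probability distribution orientation, the second moment of the probability distribution is calculated. Equation 6 describes how a second moment is calculated.
[0055] Second moments are:
$$M_{20} = \sum_{x} \sum_{y} x^{2} \, I(x, y); \qquad M_{02} = \sum_{x} \sum_{y} y^{2} \, I(x, y); \qquad M_{11} = \sum_{x} \sum_{y} x \, y \, I(x, y)$$
[0056] After determining the second moments of the probability distribution, the orientation of the probability distribution (the angle of the head) can be determined.
[0057] Then the object orientation (major axis) is:
$$\theta = \frac{1}{2} \arctan\left( \frac{2\left(\frac{M_{11}}{M_{00}} - x_c y_c\right)}{\left(\frac{M_{20}}{M_{00}} - x_c^{2}\right) - \left(\frac{M_{02}}{M_{00}} - y_c^{2}\right)} \right)$$
where $M_{00}$ is the zeroth moment and $(x_c, y_c)$ is the mean location (the centroid).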
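A minimal Python sketch of deriving an orientation from second moments is given here, assuming the standard image-moment form of the major-axis angle. The function name `orientation` is hypothetical, and the sketch uses `math.atan2` rather than a plain arctangent of a quotient so that the case of equal x and y spread does not divide by zero; that substitution is a choice made for the sketch, not something stated in the text.

```python
import math
import numpy as np


def orientation(prob):
    """Angle in radians of the major axis of a 2-D distribution.

    The angle is measured from the x axis in image coordinates (y grows downward).
    """
    ys, xs = np.mgrid[0:prob.shape[0], 0:prob.shape[1]]
    m00 = prob.sum()
    if m00 == 0:
        raise ValueError("empty distribution")
    # Mean location (centroid) from the first moments.
    xc = (xs * prob).sum() / m00
    yc = (ys * prob).sum() / m00
    # Second moments taken about the centroid.
    a = (xs ** 2 * prob).sum() / m00 - xc ** 2
    c = (ys ** 2 * prob).sum() / m00 - yc ** 2
    b = (xs * ys * prob).sum() / m00 - xc * yc
    # atan2 handles a == c, where a plain quotient would divide by zero.
    return 0.5 * math.atan2(2.0 * b, a - c)


# Usage: a horizontal bar and a diagonal line of probability 1.
bar = np.zeros((21, 21))
bar[10, 3:18] = 1.0
diag = np.zeros((21, 21))
for i in range(3, 18):
    diag[i, i] = 1.0
bar_angle = orientation(bar)
diag_angle = orientation(diag)
```

In a tracker, `prob` would be restricted to the converged search window so that background pixels do not bias the angle.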
[0058] In the embodiment of a head tracker, the orientation of the probability distribution is highly correlated with the orientation of the person's head.
[0059] The foregoing has described a method of tracking objects by tracking probability densities. It is contemplated that changes and modifications may be made by one of ordinary skill in the art, to the materials and arrangements of elements of the present invention without departing from the scope of the invention.