WO2008019020A2 - Methods for discriminating moving objects in motion image sequences - Google Patents

Publication number
WO2008019020A2
Authority
WO
WIPO (PCT)
Prior art keywords
pixel
cpu
sequence
images
image
Prior art date
Application number
PCT/US2007/017156
Other languages
French (fr)
Other versions
WO2008019020A3 (en
Inventor
Steven Joseph Bushell
Bruce Allen Maxwell
Richard Mark Friedhoff
Original Assignee
Tandent Vision Science, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tandent Vision Science, Inc.
Priority to EP07810967A (EP2047405A4)
Publication of WO2008019020A2
Publication of WO2008019020A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/215 Motion-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/90 Determination of colour characteristics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/42 Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G06V10/422 Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation for representing the structure of the pattern or shape of an object therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/457 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by analysing connectivity, e.g. edge linking, connected component analysis or slices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/60 Extraction of image or video features relating to illumination properties, e.g. using a reflectance or lighting model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30232 Surveillance

Definitions

  • the present invention provides a method and system comprising image techniques that accurately and correctly reflect and represent physical phenomena occurring in the visual world, and more specifically, techniques which can be utilized to identify a moving object within a field of view.
  • an automated, computerized method for classifying pixel values in a motion sequence of images.
  • the method comprises the steps of determining spectral information relevant to the sequence of images, and utilizing the spectral information to classify a pixel as one of background, shadow and object.
  • an automated, computerized method for tracking an object in a motion sequence of images.
  • the method comprises the steps of determining motion pixels from the motion sequence of images, calculating second order tokens as a function of the motion pixels, and identifying motion of an object as a function of the second order tokens.
  • a computer system comprises a CPU and a memory storing a sequence of images, the CPU is arranged and configured to execute a routine to determine spectral information relevant to the sequence of images, and utilize the spectral information to classify a pixel as one of background, shadow and object.
  • a computer system comprises a CPU and a memory storing a motion sequence of images, the CPU is arranged and configured to execute a routine to determine motion pixels from the motion sequence of images, calculate second order tokens as a function of the motion pixels and identify motion of an object as a function of the second order tokens.
  • computer systems are provided, which include one or more computers configured (e.g., programmed) to perform the methods described above.
  • computer readable media are provided which have stored thereon computer executable process steps operable to control a computer(s) to implement the embodiments described above.
  • the automated, computerized methods can be performed by a digital computer, analog computer, optical sensor, state machine, sequencer or any device or apparatus that can be designed or programmed to carry out the steps of the methods of the present invention.
  • Figure 1 is a block diagram of a computer system arranged and configured to perform operations related to images.
  • Figure 2 shows an n X m pixel array image file for one frame of a sequence of frames of a motion image, stored in the computer system of figure 1.
  • Figures 3a-e depict a sample image sequence illustrating an object moving through a static background field of view.
  • Figure 4a is a flow chart for performing an operation to classify pixel types in a frame of a motion sequence with a static background image, according to a feature of the present invention.
  • Figure 4b is a flow chart for identifying token regions in the image file of figure 2, according to a feature of the present invention.
  • Figure 4c is a flow chart for identifying X-junctions in an image, according to a feature of the present invention.
  • Figure 4d shows an image having an x-junction.
  • Figure 4e is a flow chart for an X-junction testing sub-routine of the flow chart of figure 4c.
  • Figure 4f is a flow chart for identifying a local spectral ratio using an X-junction of figure 4d, according to a feature of the present invention.
  • Figures 5a-e depict a sample image sequence illustrating an object moving in a background field of view having a variable illumination.
  • Figure 6 is a graph showing a representation of a temporal linear token, according to a feature of the present invention.
  • Figure 7 is a flow chart for performing an operation to classify pixel types in a frame of a motion sequence with highly variable illumination, according to a feature of the present invention.
  • Figures 8a,b depict a sample image sequence showing two objects moving across a scene from full illumination to shadow under bi-illuminant conditions.
  • Figure 9a is a flow chart for using second order tokens as an aid in motion object tracking, according to a feature of the present invention.
  • Figure 9b is a flow chart of a method for creating an Nth order token for use in the routine of figure 9a.
  • Figures 10a-d depict a sample image sequence showing four objects moving randomly across a scene under bi-illuminant conditions.
  • Figure 11 is a graph showing a representation of a Gaussian mixture model in a linear RGB space.
  • Figure 12 is a flow chart for classifying pixels in a motion image sequence via an enhanced Gaussian mixture model, according to a feature of the present invention.
  • FIG. 1 shows a block diagram of a computer system 10 arranged and configured to perform operations related to images.
  • a CPU 12 is coupled to a device such as, for example, a digital video camera 14 via, for example, a USB port.
  • the digital video camera 14 operates to download a sequence of images sensed by the camera 14 of a preselected field of view, to the CPU 12.
  • the CPU 12 stores the downloaded images in a memory 16 as image files 18.
  • the image files 18 can be accessed by the CPU 12 for display in a sequence, on a monitor 20, or for print out, one at a time, on a printer 22.
  • each image file 18 comprises an n X m pixel array and corresponds to a frame of a sequence of frames recorded by the video camera 14.
  • Each pixel, p, is a picture element corresponding to a discrete portion of the overall image. All of the pixels together define the image represented by the image file 18.
  • Each pixel comprises a digital value corresponding to a set of color bands, for example, red, green and blue color components (RGB) of the picture element.
  • the present invention is applicable to any multi-band image, where each band corresponds to a piece of the electro-magnetic spectrum.
  • the pixel array includes m columns of n rows each, starting with the pixel p(1, 1) and ending with the pixel p(n, m).
  • the CPU 12 retrieves the corresponding image file 18 from the memory 16, and operates the monitor 20 or printer 22, as the case may be, as a function of the digital values of the pixels in the image file 18, as is generally known.
  • the display on the monitor 20 of a sequence of image files 18 provides a display of events within the field of view, as recorded by the video camera 14.
  • the CPU 12 operates to analyze the RGB values of the pixels of each of a sequence of stored image files 18 to achieve various objectives, such as, for example, the detection and identification of an object moving through the preselected field of view recorded by the video camera 14 in the sequence of images.
  • the image operation can be performed by the CPU 12 in connection with discriminating moving objects from shadows in sequence with a static background image. This can be, for example, a security camera mounted indoors, with a fixed field of view, and constant or slowly varying lighting.
  • the CPU 12 can analyze sequences of images with highly variable illumination, as in the instance of a video camera 14 mounted in an outdoor setting. In either case, the CPU 12 utilizes spectral information to classify pixels of the scene in a determination of the presence of objects.
  • an image comprises two components, material and illumination.
  • an illumination flux impinging on a material depicted in an image is a bi-illuminant flux which comprises an ambient illuminant and a direct or incident illuminant.
  • the incident illuminant is light that causes a shadow and is found outside a shadow perimeter.
  • the ambient illuminant is light present on both the bright and dark sides of a shadow, but is more perceptible within the dark region of a shadow.
  • Spectra for the incident illuminant and the ambient illuminant can be different from one another.
  • a spectral shift caused by a shadow i.e., a decrease of the intensity of the incident illuminant, will be substantially invariant over different materials present in a scene depicted in an image when the scene is illuminated by a common illumination flux.
  • the spectral shift caused by a shadow can be expressed by a spectral ratio of colors across an illumination boundary defined by a shadow on a material.
  • spectral ratios throughout the image that are associated with illumination change should be consistently and approximately equal, regardless of the color of the bright side or the material object characteristics of the boundary.
  • a characteristic spectral ratio for a particular image or scene within an image is a spectral ratio associated with illumination change caused by a shadow, as occurs in the particular image, and can be used to determine if a particular boundary in a scene is caused by a shadow or an object. This information can be utilized to classify pixels of a time sequence of images to detect moving objects.
  • the spectral ratio S = Dark/(Bright-Dark) is utilized because it has been discovered during development of the present invention that the normalized value for the ratio Dark/(Bright-Dark) is invariant across different geometric orientations for a material object, and thus, the ratio remains constant across illumination boundaries for objects at different orientations.
  • the normalized value for the ratio Dark/(Bright-Dark) produced by a fully shadowed pixel and a penumbra pixel will be the same as the normalized value produced by a fully shadowed pixel and a fully lit pixel.
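The invariance described above can be checked numerically. The following is a minimal sketch, assuming a simple bi-illuminant model in which a pixel color is the material reflectance multiplied by the ambient light plus some fraction of the direct light, and assuming that "normalized" means scaling the ratio vector to unit length. The illuminant values, the function names, and the normalization choice are illustrative assumptions, not taken from the source.

```python
import math

def spectral_ratio(dark, bright):
    """Per-band ratio Dark / (Bright - Dark) for two RGB triples."""
    return [d / (b - d) for d, b in zip(dark, bright)]

def normalize(vec):
    """Scale a vector to unit Euclidean length (assumed meaning of 'normalized')."""
    length = math.sqrt(sum(v * v for v in vec))
    return [v / length for v in vec]

# Hypothetical illuminants: bluish ambient light, yellowish direct light.
AMBIENT = [0.10, 0.12, 0.20]
DIRECT = [0.90, 0.80, 0.60]

def lit_color(material, direct_fraction):
    """Color of a material under ambient light plus a fraction of the direct light."""
    return [m * (a + direct_fraction * d)
            for m, a, d in zip(material, AMBIENT, DIRECT)]

material = [0.8, 0.4, 0.3]
shadow = lit_color(material, 0.0)      # fully shadowed: ambient only
penumbra = lit_color(material, 1 / 3)  # one third of the direct light
bright = lit_color(material, 1.0)      # fully lit

ratio_full = normalize(spectral_ratio(shadow, bright))
ratio_penumbra = normalize(spectral_ratio(shadow, penumbra))

# A different material under the same illumination flux.
other = [0.2, 0.7, 0.5]
ratio_other = normalize(spectral_ratio(lit_color(other, 0.0), lit_color(other, 1.0)))
```

Under this model the material reflectance cancels out of the ratio, and the penumbra fraction only rescales it, so all three normalized ratios agree to within floating-point error.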
  • FIGS. 3a-e show a sample image sequence illustrating an object moving through a static background field of view illuminated by constant bi-illuminant conditions (incident or direct illuminant and ambient illuminant), as may occur, for example, with a security camera mounted for monitoring a fixed field of view, and constant lighting.
  • the sequence illustrated in figures 3a-c can correspond to a sequence of image files 18, stored by the CPU 12 from images of a scene recorded by the video camera 14. They illustrate a ball moving across the static background of the field of view.
  • the sample image sequence shows a ball and its shadow moving across a background under bi-illuminant conditions. The most common pixel values at each location define the static background image.
  • Figure 3d corresponds to a depiction of the static background of the field of view, determined by the CPU 12 from the images of figures 3a-c, as will be described.
  • the normalized characteristic spectral ratio for the static background depicted in figure 3d provides a basis for analyzing each individual pixel of each image file 18 of the sequence, as represented by figures 3a-c, to determine whether an individual pixel of a particular image file 18 is a background pixel, a pixel from a moving object, such as the ball of the example of figures 3a-c, or a shadow pixel.
  • Figure 3e depicts an image of the ball without a shadow, as can be generated by the CPU 12 upon analysis of the images of figures 3a-c, as will appear.
  • figure 3a is redrawn by replacing the pixels identified as shadow with static background pixels from the same location.
  • the moving object and the static background image remain, greatly simplifying the tracking of the moving object, in this case the ball.
  • a location (X, Y) at a first time represents the static background, and pixels at this location will equal the value of pixels from the static background represented by figure 3d.
  • the location (X, Y) falls within a shadow cast by the moving ball, as the ball moves across the field of view.
  • the normalized spectral ratio between the current pixel at location (X, Y) and the static background pixel at location (X, Y) matches the scene normalized spectral ratio, and so the pixel is recognized as being a shadow.
  • the location (X, Y) falls within the ball object moving through the scene.
  • FIG. 3d shows the static background image calculated from the three frames of figures 3a-c.
  • the pixel values at location (X, Y) represent the static background image.
  • the static background image is used with the scene normalized spectral ratio to classify pixels that change from frame to frame as either background pixels, shadow pixels, or pixels from moving objects.
  • Figure 4a is a flow chart for performing an operation to classify pixel types in a frame of a motion sequence with a static background image, according to a feature of the present invention.
  • a sequence of image files 18, corresponding to N frames recorded in a time sequence by the video camera 14 is input to the CPU 12, for example, the images of figures 3a-c.
  • the recording of the N frames by the camera 14 can be in real time, relative to the execution of the routine of figure 4a by the CPU 12.
  • the CPU 12 analyzes each pixel of each frame represented by the sequence of image files 18. Step 102 is performed to determine each value for each pixel of the sequence of N frames, to establish the static background of the scene.
  • In step 104, the CPU 12 selects a pixel value for each scene location, in each instance a value corresponding to the most common value for the respective pixel from the N pixel values throughout the time sequence of N frames.
  • the common value occurring most frequently in the N frames will correspond to the static background value.
  • Performance of step 104 in respect of all of the pixel values for the N frames generates an image file 18 that depicts the static background, as for example, the image of figure 3d (step 106).
  • Other known techniques for building a static background can be implemented, in lieu of the most frequent occurrence approach described above.
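The most-frequent-value approach of steps 102-106 can be sketched as follows. This is a minimal illustration, assuming each frame is a list of rows of hashable RGB tuples and that background values repeat exactly from frame to frame; real footage with sensor noise would likely need the values quantized or binned first. The function name and the sample data are hypothetical.

```python
from collections import Counter

def static_background(frames):
    """Build an image whose pixel at each location is the most common value
    seen at that location across all frames."""
    rows, cols = len(frames[0]), len(frames[0][0])
    background = []
    for y in range(rows):
        row = []
        for x in range(cols):
            counts = Counter(frame[y][x] for frame in frames)
            row.append(counts.most_common(1)[0][0])
        background.append(row)
    return background

# Three 1 x 3 frames: a red "ball" pixel moves left to right over a gray background.
BG = (10, 10, 10)
BALL = (200, 0, 0)
frames = [
    [[BALL, BG, BG]],
    [[BG, BALL, BG]],
    [[BG, BG, BALL]],
]
background = static_background(frames)
```

Because the ball covers each location in only one of the three frames, the gray value wins the vote everywhere and the ball disappears from the result.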
  • step 106 will be the input to the CPU 12.
  • In step 108, an image file 18, corresponding to a single frame T of the N frames, is input to the CPU 12, for analysis relative to the image file 18 depicting the static background.
  • the image depicted in figure 3a is analyzed relative to the static background of figure 3d.
  • Each pixel of the subject frame is designated by its location within the corresponding image file 18: P(X, Y) and each pixel of the static background is likewise designated by a location: B(X, Y).
  • In step 110, the CPU 12 selects one pixel location P(X, Y) from the subject frame T for analysis to classify the selected pixel P(X, Y) as a background pixel, a pixel from a moving object, such as the ball of the example of figures 3a-c, or a shadow pixel.
  • In step 112, the CPU 12 calculates a normalized spectral ratio between the pixel P(X, Y) of the subject frame and the corresponding pixel in the static background image for the same X, Y location (B(X, Y)): Sn(P, B) = Norm(P/(B - P)).
  • the CPU 12 then proceeds to decision block 114.
  • the CPU 12 determines whether the pixel P(X, Y) equals the corresponding pixel B(X, Y) of the static background. If yes, the particular pixel P(X, Y) is classified as a pixel depicting the static background of the scene (step 116). In step 118, the CPU 12 stores the pixel classification. If no, the CPU 12 proceeds to decision block 120 for the respective pixel P(X, Y). In decision block 120, the CPU 12 compares the normalized spectral ratio for the pixel P(X, Y), Sn(P, B), as calculated in step 112, to the global or characteristic spectral ratio for the static background scene.
  • the characteristic spectral ratio comprises the spectral information used to classify the respective pixel P(X, Y).
  • the co-pending Application Serial No. 11/341,742 teaches methods for automatically determining a characteristic spectral ratio for a scene.
  • the CPU 12 is operated to identify spatio-spectral features of an image, that is, features that comprise conditions that are indicative of illumination flux.
  • An example of a spatio-spectral feature is an X-junction.
  • An X-junction is an area of an image where a material edge and an illumination boundary cross one another.
  • An X-junction is an optimal location for an accurate determination of an illuminant ratio or characteristic spectral ratio for a scene.
  • the spectral ratio information for illumination boundaries is determined on a local level, that is, an illuminant ratio is determined for each of several preselected local areas of a scene depicted in an image.
  • An analysis of a boundary is then executed utilizing the spectral ratio for the specific location of the boundary within the image.
  • the determination of locally relevant spectral ratios accommodates complexities that may be encountered in a real world image, for example, the interplay of several different sources of light in a room, inter-reflections, and so on.
  • a token analysis of an image is used to identify spatio-spectral features.
  • a token is a connected region of an image wherein the pixels of the region are related to one another in a manner relevant to identification of spatio-spectral features.
  • the pixels of a token can be related in terms of either homogeneous factors, such as, for example, close correlation of color among the pixels (a uniform token), or nonhomogeneous factors, such as, for example, differing color values related geometrically in a color space such as RGB space (for example, a linear token).
  • the use of tokens rather than individual pixels reduces complexity and noise in image processing and provides a more efficient, less intense computational operation for the computer system 10.
  • a uniform token analysis is used to identify X-junctions in an image.
  • a uniform token is a homogeneous token that comprises a connected region of an image with approximately constant pixel values (for example, within a range determined by the expected noise margin of the recording equipment or normal variations in materials) throughout the region.
  • a first order uniform token comprises a single robust color measurement among contiguous pixels of the image.
  • the analysis can include an examination of token neighbor relationships indicative of spatio-spectral features of an image.
  • Figure 4b shows a flow chart for identifying token regions in the subject frame of the N frames, (depicted as an image file 18 in figure 2), according to a feature of the present invention.
  • the CPU 12 sets up a region map in memory.
  • the CPU 12 clears the region map and assigns a region ID, which is initially set at 1.
  • Nstart can be any integer > 0; for example, it can be set at 11 or 15 pixels.
  • a seed test is begun.
  • the pixel is then tested in decision block 204 to determine if the selected pixel is part of a good seed.
  • the test can comprise a comparison of the color value of the selected pixel to the color values of a preselected number of its neighboring pixels as the seed, for example, the N x N array.
  • After reduction of the seed size, the routine returns to step 202, and continues to test for token seeds.
  • In step 212, the CPU 12 pushes the pixels from the seed onto a queue. All of the pixels in the queue are marked with the current region ID in the region map. The CPU 12 then inquires as to whether the queue is empty (decision block 214). If the queue is not empty, the routine proceeds to step 216.
  • In step 216, the CPU 12 pops the front pixel off the queue and proceeds to step 218.
  • In step 218, the CPU 12 marks "good" neighbors around the subject pixel, that is, neighbors approximately equal in color value to the subject pixel, with the current region ID. All of the marked good neighbors are placed in the region map and also pushed onto the queue. The CPU 12 then returns to the decision block 214. The routine of steps 214, 216, 218 is repeated until the queue is empty. At that time, all of the pixels forming a token in the current region will have been identified and marked in the region map.
  • In step 220, the CPU 12 increments the region ID for use with identification of a next token.
  • the CPU 12 then returns to step 206 to repeat the routine in respect of the new current token region.
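The seed test and queue-based growth of figure 4b can be sketched roughly as below. This is a simplified illustration under several assumptions: the image is a list of rows of RGB tuples, "approximately equal" means every band differs by at most a tolerance `tol`, neighbors are 4-connected, and the seed size shrinks by one pixel per pass down to a hypothetical `n_stop`. None of these specifics, nor the function names, come from the source.

```python
from collections import deque

def similar(c1, c2, tol):
    """True when two colors differ by at most tol in every band."""
    return all(abs(a - b) <= tol for a, b in zip(c1, c2))

def good_seed(image, y, x, n, tol):
    """True when the n x n block with top-left corner (y, x) fits in the image
    and all of its pixels are close in color to the corner pixel."""
    if y + n > len(image) or x + n > len(image[0]):
        return False
    ref = image[y][x]
    return all(similar(image[j][i], ref, tol)
               for j in range(y, y + n) for i in range(x, x + n))

def grow_tokens(image, n_start=2, n_stop=1, tol=5):
    """Return a region map assigning a token ID (1, 2, ...) to each pixel reached."""
    rows, cols = len(image), len(image[0])
    region_map = [[0] * cols for _ in range(rows)]  # 0 means unassigned
    region_id = 1
    for n in range(n_start, n_stop - 1, -1):  # reduce the seed size each pass
        for y in range(rows):
            for x in range(cols):
                if not good_seed(image, y, x, n, tol):
                    continue
                seed = [(j, i) for j in range(y, y + n) for i in range(x, x + n)]
                if any(region_map[j][i] for j, i in seed):
                    continue  # seed overlaps an existing token
                queue = deque(seed)
                for j, i in seed:
                    region_map[j][i] = region_id
                while queue:
                    j, i = queue.popleft()
                    for dj, di in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nj, ni = j + dj, i + di
                        if (0 <= nj < rows and 0 <= ni < cols
                                and not region_map[nj][ni]
                                and similar(image[nj][ni], image[j][i], tol)):
                            region_map[nj][ni] = region_id
                            queue.append((nj, ni))
                region_id += 1
    return region_map

# A 3 x 4 image: two columns of one color beside two columns of another.
A, B = (200, 200, 200), (40, 60, 80)
image = [[A, A, B, B] for _ in range(3)]
region_map = grow_tokens(image)
```

Each neighbor is compared to the pixel just popped from the queue, following the description of step 218, so a token can drift gradually in color across a smooth gradient; comparing against the seed color instead would be a stricter alternative.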
  • Upon completion of the token region map for the frame, the CPU 12 proceeds to the routine of figure 4c, to identify X-junctions in the image of the subject frame using the token region map.
  • the CPU 12 is input the token region map completed through execution of the routine of figure 4b, the color of each token region and a set of pixels to provide a basis for searching for X-junctions.
  • In a decision block, the CPU 12 determines whether there are still pixels in the pixel set for search of an X-junction. If no, the CPU 12 exits the routine with a map of likely X-junctions in the frame (step 304). If yes, the CPU 12 proceeds to step 306, to select a next pixel Pi, from the pixel set.
  • If the CPU 12 determines a yes condition in the decision block of step 314, the CPU 12 proceeds to step 318.
  • In step 318, the CPU 12 selects a token from the set Q to designate as a token B (with B not equal to A).
  • a decision block step 320
  • If the CPU 12 determines a yes condition in the decision block of step 320, the CPU 12 proceeds to step 324.
  • In step 324, the CPU 12 selects a token from the set Q to designate as a token D (D not equal to either A or B).
  • In a decision block (step 326), the CPU 12 tests the token D to determine whether D > Minimum intensity, for example an intensity equal to 20% of the maximum intensity within the subject frame. If no, the CPU 12 proceeds to a decision block (step 328) to determine whether there are more tokens in the set Q for testing as a D token. If yes, the CPU 12 selects another token and returns to step 324 to repeat the test. If no, the CPU 12 returns to step 322. If the CPU 12 determines a yes condition in the decision block of step 326, the CPU 12 proceeds to step 330.
  • In step 330, the CPU 12 selects a token from the set Q to designate as a token C (C not equal to either A or B or D).
  • a decision block step 332
  • If the CPU 12 determines a yes condition in the decision block of step 332, the CPU 12 proceeds to step 336.
  • Figure 4e shows a flow chart for the X-junction testing sub-routine, step 336 of the flow chart of figure 4c.
  • the token neighbor set {A, B, C, D} is set as the starting point of the routine.
  • the hypothesis is that A and B are the same material 1, and that D and C are the same material 2, and that B and C are in shadow (step 428, see figure 4d).
  • In step 430, the CPU 12 determines whether the pixels of token A > the pixels of token B and the pixels of token D > the pixels of token C, in each color band.
  • the colors B and C are multiplied by a factor, f, which is a scalar value greater than 1.
  • In step 432, it is determined whether the bright measurements for A and D tokens are brighter than a minimum threshold.
  • the next test comprises a determination of whether each of the bright tokens A and D, are significantly different in a color space, for example, in an RGB space. In this regard, a determination is made as to whether the color space distance (A, D) > threshold.
  • In the event a token set {A, B, C, D} fails the X-junction tests of step 336, the CPU 12 returns to step 334. If the token set {A, B, C, D} passes the X-junction tests of step 336, the CPU 12 proceeds to step 338 to mark the token set {A, B, C, D} as a valid X-junction. The CPU 12 then returns to step 302.
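The three tests of the X-junction sub-routine can be combined into a single predicate, sketched below. The text does not spell out how the factor f enters the comparison, so this sketch assumes it scales the dark colors before the band-by-band test (A > f·B and D > f·C). It also assumes that "bright measurement" means the mean of the color bands and that color-space distance is Euclidean. The threshold values and the function name are hypothetical.

```python
import math

def is_x_junction(a, b, c, d, f=1.2, min_bright=50.0, min_distance=30.0):
    """Test the hypothesis that A/B are one material and D/C another,
    with B and C in shadow. Arguments are mean RGB colors of the four tokens."""
    # Step 430 (assumed form): each bright token exceeds its scaled dark
    # partner in every color band.
    if not all(x > f * y for x, y in zip(a, b)):
        return False
    if not all(x > f * y for x, y in zip(d, c)):
        return False
    # Step 432 (assumed form): both bright tokens exceed a minimum brightness.
    if sum(a) / len(a) < min_bright or sum(d) / len(d) < min_bright:
        return False
    # The bright tokens must be clearly different materials in RGB space.
    return math.dist(a, d) > min_distance

# Two materials, each seen lit (a, d) and shadowed (b, c).
a, b = (200, 180, 160), (80, 90, 100)
d, c = (120, 200, 90), (50, 100, 55)
valid = is_x_junction(a, b, c, d)

# Fails when a "shadowed" token is brighter than its lit partner in one band.
inverted = is_x_junction(a, (80, 90, 170), c, d)

# Fails when both bright tokens have the same color (no material edge).
same_material = is_x_junction(a, b, b, a)
```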
  • step 502 comprises the performance of the processes of figures 4b-e, throughout the given image to identify all X-junctions within the image.
  • The CPU 12 proceeds to step 504 to calculate a spectral ratio for each bright/dark pixel pair in each X-junction, and store the results in a memory array.
  • In step 506, the CPU 12 executes a mean shift algorithm on the array of spectral ratios.
  • the mean shift algorithm can comprise, for example, an algorithm described in "Mean shift analysis and applications," Comaniciu, D.; Meer, P.; Computer Vision, 1999, The Proceedings of the Seventh IEEE International Conference on; Volume 2, 20-27 September, 1999; Pages 1197-1203.
  • the output of execution of the mean shift algorithm (step 508) is a spectral ratio for all or a specific local region of the image.
  • the execution of step 506 can include a survey of values for the spectral ratios throughout the image.
  • If the spectral ratios calculated over the image by, for example, one of the methods described above vary by an amount > a threshold variance, a local approach will be implemented for the spectral ratio information used in determining illumination boundaries. That is, the value at a specific X-junction, or a mean or median of a set of nearby X-junctions will be used as the spectral ratio when the CPU 12 determines illumination boundaries in the region of the image near the specific X-junction. If all of the spectral ratios for the entire image vary by less than the threshold variance, a global approach can be used with the same mean shift spectral ratio used in all illumination boundary determinations.
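To illustrate how a mean shift pass can pull a single characteristic ratio out of an array of per-junction ratios, the sketch below uses a basic flat-kernel mean shift. It is a simplified stand-in, not the specific algorithm of the cited Comaniciu and Meer paper; the bandwidth, the sample ratios, and the function names are assumptions chosen for illustration.

```python
import math

def mean_shift_mode(points, bandwidth, start, max_iter=100, tol=1e-6):
    """Shift a window center to the mean of the points inside it until it settles."""
    center = list(start)
    for _ in range(max_iter):
        window = [p for p in points if math.dist(p, center) <= bandwidth]
        if not window:
            break
        new_center = [sum(band) / len(window) for band in zip(*window)]
        if math.dist(new_center, center) < tol:
            return new_center
        center = new_center
    return center

def dominant_ratio(points, bandwidth):
    """Run mean shift from every point and keep the mode with the most support."""
    best, best_support = None, -1
    for p in points:
        mode = mean_shift_mode(points, bandwidth, p)
        support = sum(1 for q in points if math.dist(q, mode) <= bandwidth)
        if support > best_support:
            best, best_support = mode, support
    return best

# Hypothetical normalized spectral ratios from four X-junctions; the last is an outlier.
ratios = [
    (0.30, 0.50, 0.80),
    (0.31, 0.49, 0.81),
    (0.29, 0.51, 0.79),
    (0.90, 0.10, 0.20),
]
characteristic = dominant_ratio(ratios, bandwidth=0.1)
```

The outlier forms a mode of its own with a support of one, so the cluster of three consistent ratios determines the result.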
  • Upon determination by the CPU 12 of a characteristic spectral ratio for the static background scene (from step 106 of figure 4a), the CPU 12 returns to step 120 of figure 4a to run the comparison between the normalized spectral ratio for the pixel P(X, Y) from the subject frame T, Sn(P, B), as calculated in step 112, and the characteristic spectral ratio for the static background scene. If Sn(P, B) is similar enough to the characteristic spectral ratio for the static background scene, the pixel P(X, Y) is classified as a shadow pixel (step 122), and the CPU 12 stores the classification (step 118). The similarity between Sn(P, B) and the characteristic spectral ratio can be a function of noise present in the sensing equipment. If the ratios are not sufficiently similar, the pixel P(X, Y) is classified as a pixel from a moving object (step 124), and the CPU 12 stores the classification for the pixel P(X, Y) in step 118.
  • After step 118, the CPU 12 proceeds to a decision block (step 126).
  • the CPU 12 determines whether there are any more pixels in the subject frame T for classification. If yes, the CPU 12 returns to step 110, selects another pixel P(X, Y), and continues to the steps 112-124. If no, the CPU 12 proceeds to the decision block of step 128.
  • the CPU 12 determines whether there are any more frames T in the sequence of N frames, for examination of pixel values. If yes, the CPU 12 returns to step 108, selects another frame T, and continues to steps 110-124. If no, the routine is complete (step 130).
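The per-pixel decision of figure 4a (steps 112 to 124) might be sketched as follows. Several details are assumptions made for illustration: normalization is taken to mean scaling to unit length, similarity is a Euclidean distance below a tolerance `tol`, the equality test is done before the ratio is computed so that a background pixel never causes a division by zero, and a pixel that is not darker than the background in every band (or is entirely black) is treated as an object. The illuminant and pixel values are made up.

```python
import math

def normalized_ratio(p, b):
    """Sn(P, B) = Norm(P / (B - P)), per color band, scaled to unit length."""
    ratio = [pc / (bc - pc) for pc, bc in zip(p, b)]
    length = math.sqrt(sum(r * r for r in ratio))
    return [r / length for r in ratio]

def classify_pixel(p, b, scene_ratio, tol=0.05):
    """Classify frame pixel p against background pixel b."""
    if tuple(p) == tuple(b):
        return "background"
    # Assumption: a shadow darkens every band, and a black pixel carries no ratio.
    if any(pc >= bc for pc, bc in zip(p, b)) or not any(p):
        return "object"
    if math.dist(normalized_ratio(p, b), scene_ratio) <= tol:
        return "shadow"
    return "object"

# Hypothetical scene: ambient and direct illuminants and a lit background pixel.
ambient = (0.10, 0.12, 0.20)
direct = (0.90, 0.80, 0.60)
background = (200.0, 160.0, 120.0)

# Characteristic ratio of the scene: normalized ambient / direct.
raw = [a / d for a, d in zip(ambient, direct)]
scale = math.sqrt(sum(r * r for r in raw))
scene_ratio = [r / scale for r in raw]

# The same background location with the direct light fully blocked.
shadowed = tuple(bc * a / (a + d) for bc, a, d in zip(background, ambient, direct))

labels = [
    classify_pixel(background, background, scene_ratio),
    classify_pixel(shadowed, background, scene_ratio),
    classify_pixel((90.0, 40.0, 30.0), background, scene_ratio),
]
```

The third pixel is darker than the background in every band, yet its ratio points in a different direction from the scene ratio, so it is labeled as an object instead of a shadow.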
  • pixels having a value that is either clipped by the maximum level of the image sensor of the video camera 14, or at the low level of the dark range and therefore subject to sensor noise are often inaccurate, and can be ignored.
  • Any pixel having an overall intensity value that is higher than the corresponding background pixel can be classified as an object. This is because a background pixel in shadow is always less intense than a fully lit background pixel, and, thus, the higher intensity indicates that it is not a background pixel.
  • any pixel having an overall intensity value that is darker than the corresponding background pixel, but having at least one color channel that is brighter than that color channel for the background pixel can be classified as a pixel of an object. This is because when a change is due to shadow, all of the channels must change in the same direction.
  • a known texture matching technique can be used to compare the subject pixel to surrounding pixels. If there is not a match, the subject pixel may be related to a moving object.
  • Figures 5a-e depict a sample image sequence in time illustrating an object in a background field of view having a variable illumination, as may occur, for example, when utilizing a video camera mounted outdoors.
  • the shadow of a tree leaf under bi-illuminant conditions, is projected onto the scene background.
  • the shadow moves in time, within the background field of view, and is the result of a leaf a considerable distance from the scene, causing a significant penumbra and highly variable illumination flux on many pixels of the recorded sequence of images.
  • a location (X, Y) is within the full shadow of the leaf, and thus, illuminated solely by the ambient illuminant of the bi-illuminant flux.
  • In a second frame recorded at a time subsequent to the time of the first frame (figure 5b), the location (X, Y) is now in the shadow penumbra, illuminated by the ambient illuminant and approximately one third of the direct or incident illuminant.
  • In figure 5c, a third frame of the sample sequence, the location (X, Y) is in a portion of the penumbra illuminated by the ambient illuminant and approximately two thirds of the incident illuminant.
  • the shadow has moved beyond the location (X, Y), leaving the location within the scene background, fully lit by the incident illuminant and the ambient illuminant.
  • the location(X, Y) is covered by an object, in our example, a ball, that has moved into the field of view. The ball is lit by the incident illuminant and the ambient illuminant and projects a shadow.
  • a linear token is a set of pixels of differing color values related geometrically in a color space such as RGB space, for example, an axially extending linear or cylindrical geometric shape, from a dark end (in shadow) to a bright end (lit end), along a positive slope.
  • a linear or cylindrical configuration for the token is predicted by a bi-illuminant dichromatic reflection model (BIDR model), according to a feature of the present invention, when the color change is due to an illumination change forming a shadow (i.e.
  • $\gamma_b$ is the product of a shadow factor $s_{x,y,z}$ and a geometric factor $m_b(\theta_i)$
  • $M_a(\lambda)$ is the integral of the ambient illuminant and geometric body reflectance over a hemisphere, excluding the incident illuminant.
  • FIG. 6 is a graph showing a representation of a temporal linear token, according to a feature of the present invention.
  • a temporal linear token is a linear token wherein the set of pixels of differing color values related geometrically comprises a single pixel location, for example, location(X, Y) of figures 5a-e, at different times, for example, the time sequence of figures 5a-e.
  • the graph of figure 6 is in an RGB space, and depicts the color values for the location(X, Y) at each of frames 1-5 of figures 5a-e.
  • Point(X, Y) Frame 1 corresponds to the location(X, Y) at the time of frame 1, when the location(X, Y) is in full shadow of the leaf, and thus, illuminated solely by the ambient illuminant of the bi-illuminant flux.
  • Frame 2 the location(X, Y) is in the shadow penumbra, and illuminated by the ambient illuminant and one third of the direct or incident illuminant.
  • Frame 3 the location(X, Y) is in the shadow penumbra, and illuminated by the ambient illuminant and approximately two thirds of the direct or incident illuminant.
  • Point(X, Y) Frame 4 corresponds to the location(X, Y) after the shadow has moved beyond the location (X, Y), leaving the location within the scene background, fully lit by the incident illuminant and the ambient illuminant.
  • the points corresponding to frames 1-4 comprise a temporal linear token defining a line that represents a BIDR cylinder (the spectral information) that can be utilized to classify pixels of an image sequence.
  • any pixel that falls in or near the temporal linear token defined by, for example, the points of frames 1-4 (the BIDR cylinder) can be classified as a pixel depicting the background scene under varying illumination.
  • Any pixel that falls outside the BIDR cylinder in the RGB space would be classified as a pixel value of an object distinct from the background scene.
  • Point(X, Y), Frame 5 corresponding to the location(X, Y) when it is covered by an object, in our example, a ball, that has moved into the field of view, falls outside the BIDR cylinder in the RGB space of figure 6.
  • the pixel at point(X,Y), Frame 5 is recognized as belonging to a newly introduced object.
  • step 600 a sequence of image files 18, corresponding to N frames recorded in a time sequence by the video camera 14, is input to the CPU 12, for example, the images of figures 5a-e.
  • step 602 a location(X, Y) in the scene depicted in the sequence N is selected and input to the CPU 12.
  • step 604 the CPU 12 stores all of the pixel values throughout the sequence of N frames, for the selected location(X, Y) as a set C.
  • the CPU 12 performs a mathematical operation, such as, for example, a linear regression technique or a least median of squares regression technique, to find a best fit BIDR cylinder in an RGB space corresponding to the pixel values of the set C.
  • An example of such a mathematical technique is described in "Least Median of Squares Regression" by Peter J. Rousseeuw, Journal of the American Statistical Association, Vol. 79, No. 388 (December, 1984), pp. 871-880.
  • step 608 the CPU 12 determines whether a predetermined amount of the pixel values in the set C are within a predetermined relationship to the BIDR cylinder, the dimensions of the cylinder being related to an axis between a fully lit value for the pixel values of the set C, and a value for the pixels of set C in shadow, and a radius determined as a function of variations in pixel values due to noise and other naturally occurring phenomena.
  • the predetermined amount of pixels can be set at an amount that indicates a BIDR cylinder confirmation, for example, 80% of the pixels of the set C.
  • When the pixel values of the set C correspond to a location that is subject to variable illumination flux, as in our example, a location in the scene over which the shadow of a leaf variously passes, the values will define a BIDR cylinder that can then be used to classify the pixel values at that location.
  • step 610 the CPU 12 examines a selected pixel value from the set C, corresponding to a single frame T of the sequence N of frames.
  • step 614 the CPU 12 determines whether the selected pixel value for the frame T is within the stored BIDR cylinder. If yes, the pixel value is classified as representing the background under varying illumination (step 616), if no, the pixel value is classified as an object distinct from the background (step 618). In either instance, the CPU 12 proceeds to step 620 to store the classification for the selected pixel value.
  • In a decision block, the CPU 12 determines whether there are more pixel values in the set C for examination and classification. If yes, the CPU 12 returns to step 612, if no, the CPU 12 continues to a decision block (step 624). In the decision block of step 624, the CPU 12 determines whether there are any more locations(X, Y) in the scene depicted in the sequence of images N. If no, the CPU 12 exits the routine of figure 7 (step 626). If yes, the CPU 12 returns to step 602.
  • step 608 if most of the pixel values in the set C do not define a BIDR cylinder, the location(X, Y) can not be characterized using a temporal linear token (step 628), and the CPU 12 returns to step 624.
  • the CPU 12 can store all of the pixel locations that do not define a BIDR cylinder, and upon completion of the routine of figure 7, return to the routine of figure 4a, for execution in respect of the stored pixel locations.
  • step 108 is changed to input a set of pixels corresponding to the stored locations for the N images of the sequence, and step 126 exits the routine when there are no more pixels to examine.
  • Enhancements can be implemented to improve the accuracy of pixel classification.
  • temporal linear tokens defining BIDR cylinders can be utilized to determine spectral ratios for the image.
  • Each temporal linear token includes a brightest pixel and a darkest pixel. Those pixels can be used to calculate a spectral ratio for the corresponding token.
  • the temporal linear tokens with the greatest ratios of intensity provide a measurement of the characteristic spectral ratio for the scene.
  • Figures 8a,b depict a sample image sequence in time showing two objects moving across a scene from full illumination to shadow under bi-illuminant conditions. In figure 8a, there is shown two objects, disk 1 and disk 2, each comprising two materials.
  • Figure 9a is a flow chart for using second order tokens as an aid in motion object tracking, for example, the two disks of figures 8a,b, according to a feature of the present invention.
  • the routine of figure 4a for example, can be executed to identify objects, the routine of figure 9a is then used to track movement of the identified objects.
  • an input to the CPU 12 comprises a motion image sequence of N frames, for example, the frames depicted in figures 8a,b.
  • the CPU 12 calculates motion pixels from all frames. The calculation is completed by classifying pixels of the images utilizing, for example, the routine of one of figures 4a or 7.
  • the CPU 12 stores all of the motion pixels, i.e., in our example, all pixels associated with disk 1 and disk 2.
  • Upon storing of the motion pixels, the CPU 12 executes two sub-routines. In a first sub-routine, the CPU 12 calculates second order tokens from the stored motion pixels (step 706). The second order tokens will be used to analyze objects in the sequence of frames to determine the positions of each object (disk 1 and disk 2) throughout the sequence of frames.
  • An Nth order token is a set of N first order uniform tokens that are different colors, as measured in a selected color space, for example, RGB, hue or chromaticity, and are near to one another in the image.
  • disk 1 and disk 2 of figures 8a,b each comprise two distinct materials. Each material of each disk will form a uniform token, and the adjacent materials of each disk could form a second-order token, two close uniform tokens.
  • Figure 9b shows a flow chart of a method for creating an Nth order token.
  • the CPU 12 is given a list of tokens, for example, as identified through execution of the routine of figure 4b for each frame of the sequence, an input image area A and a maximum distance Dmax, which could be set at 10 pixels (step 800).
  • the CPU 12 adds each set of N tokens found in step 802, into an Nth order token set, checking for duplicates.
  • the CPU 12 outputs the Nth order token sets for the image area.
  • the CPU 12 repeats the routine for each frame of the sequence for a complete list of Nth order tokens.
  • the CPU 12 stores the second order token list for each frame of the sequence (step 708).
  • the CPU 12 separates all of the stored motion pixels into object regions for tracking (step 710).
  • the regions comprise the pixels defining each of disk 1 and disk 2.
  • the CPU 12 stores the region list (step 712) and then selects an object region from the list, Ob(1), from frame 1 (for example the image of figure 8a) of the sequence (step 714).
  • the CPU 12 selects another frame, M (for example the image of figure 8b), from the sequence.
  • the CPU 12 selects an object region Ob(M), from the frame M.
  • In step 720, the CPU 12 finds second order tokens T(1) and T(M) from the token list, that correspond to the object regions Ob(1) and Ob(M), for example, executing a pixel location comparison.
  • In step 722, the CPU 12 calculates normalized spectral ratios, Sn(1) and Sn(2), respectively, and reflectance ratios, R(1) and R(2), respectively, between corresponding tokens of each of the second order tokens, T(1) and T(M). In performance of the calculations, the CPU 12 first determines the second order token having the brighter pixels.
  • the mean average RGB values for the pixels of each of the two tokens of the brighter second order token are denoted as B1 and B2, respectively, while the average RGB values for the pixels of each of the two tokens of the darker second order token are denoted as D1 and D2, respectively.
  • R(2) = (B2 - D2)/(B2 + D2).
  • the CPU 12 stores the classification and proceeds to a decision block (step 732).
  • the CPU 12 determines whether there are any more object regions in frame M for examination and classification. If yes, the CPU returns to step 718 to select another object region Ob(M). If no, the CPU 12 proceeds to a decision block (step 734).
  • the CPU 12 determines whether there are any more frames in the sequence N to examine. If yes, the CPU 12 returns to step 716 to select another frame M. If no, the CPU 12 proceeds to a decision block (step 736). In the decision block of step 736, the CPU 12 determines whether there are any more objects to be examined in frame 1. If yes, the CPU 12 returns to step 716 to select another object, Ob(1). If no, the routine is complete, and the CPU 12 exits the routine (step 738).
  • Figures 10a-d depict a sample image sequence in time, showing four objects moving randomly across a scene under bi-illuminant conditions. Each object is composed of a different material, and casts a shadow on the background of the image.
  • a situation such as shown in figures 10a-d can be analyzed using an enhanced Gaussian mixture model.
  • a location(X, Y), at a first time frame is a pixel in the static background of the scene illuminated by an incident illuminant and an ambient illuminant, and corresponds to a most common state for a pixel.
  • the location(X, Y), at a second time frame is in a shadow, illuminated by the ambient illuminant.
  • Figure 11 is a graph showing a representation of a Gaussian mixture model in a linear RGB space, including values corresponding to the location(X, Y), in each of the frames depicted in figures 10a-d.
  • In a Gaussian mixture model, each pixel of the image sequence will be in a set of states, and the model is used to determine a most common state and a second most common state for each pixel location, as will appear.
  • the most common state for the pixel at location(X, Y) will correspond to the static background, while illuminated by an incident illuminant and an ambient illuminant (figure 10a).
  • the static background is typically obscured briefly by moving objects, and is otherwise constant, making this state dominant.
  • the second most common state for the pixel is depicted in figure 10b, a shadow illuminated by the ambient illuminant.
  • the second most common state is the shadowed background image. All moving objects cast a similar shadow, so this state appears as the next most common when objects are moving around a scene.
  • the least common states correspond to object surfaces, moving through the static background. Moving objects can be made of many different materials, all of which appear briefly at a particular location, so none of the particular materials has a chance to become a common state. In figure 11, a line is drawn between the two most common states (from location(X, Y) in figure 10a, to location(X, Y) in figure 10b).
  • the line is used to define a BIDR cylinder for the sequence of frames N depicted in figures 10a-d. As can be seen in figure 11, the RGB values for the location(X, Y) when covered by one of the objects, is outside the BIDR cylinder.
  • FIG. 12 is a flow chart for classifying pixels in a motion image sequence via an enhanced Gaussian mixture model, according to a feature of the present invention.
  • an input to the CPU 12 comprises a motion image sequence of N frames, for example, the frames depicted in figures 10a-d.
  • the CPU 12 calculates Gaussian probabilities at each pixel location among the sequence of images, TSF.
  • the CPU 12 then stores the calculated probabilities (step 904).
  • For a discussion of the Gaussian mixture model, reference can be made to C. Stauffer and W. Grimson, "Adaptive background mixture models for real time tracking," Proceedings of the IEEE International Conference on CVPR, Vol. 2, pp. 246-252, June, 1999.
  • step 906 the CPU 12 calculates and stores a BIDR cylinder C for each pixel location, setting the ends of the cylinder to the RGB values corresponding to the first and second most probable pixel states for that location.
  • the most common state for a pixel corresponds to the static background, while illuminated by an incident illuminant and an ambient illuminant and the second most common state for the pixel corresponds to a shadow illuminated by the ambient illuminant.
  • the BIDR cylinder can be validated by applying constraints to the BIDR cylinder derived from the two most probable pixel states, by, for example, requiring certain reasonable constraints for the illuminants.
  • the stored BIDR cylinders are made available at a decision block (step 908).
  • the CPU 12 selects a frame T from the images of the sequence N (step 910).
  • the CPU 12 selects a pixel P(X, Y) from the image T, and proceeds to a decision block (step 914).
  • the CPU 12 determines whether the pixel P(X, Y) is close in value, within some measure of noise, to the value of probability state 1 (the most common state). If yes, the Gaussian probability indicates that the pixel P(X, Y) is to be classified as part of the static background of the scene (step 916). The CPU 12 then proceeds to a decision block (step 918).
  • step 918 the CPU 12 determines whether there are any more pixels in the frame T for analysis and classification. If yes, the CPU returns to step 912. If no, the CPU 12 proceeds to a decision block (step 920). In the decision block of step 920, the CPU 12 determines whether there are any more frames in the sequence N for analysis. If yes, the CPU 12 returns to step 910. If no, the routine is ended (step 922).
  • step 924 the CPU 12 determines whether the pixel P(X, Y) is close in value, within some measure of noise, to the value of probability state 2 (the second most common state). If yes, the Gaussian probability indicates that the pixel P(X, Y) is to be classified as within a shadow of the static background (step 926). The CPU 12 then proceeds to the decision block of step 918. If no, the CPU 12 proceeds to the decision block of step 908.
  • the CPU 12 determines whether the pixel P(X, Y) lies within the BIDR cylinder for the corresponding location of the subject pixel P(X, Y). If yes, the Gaussian probability indicates that the pixel P(X, Y) is to be classified as within a shadow penumbra of the static background (step 928). If no, the Gaussian probability indicates that the pixel P(X, Y) is to be classified as part of a transient object distinct from the background (step 930). From either step 928 or step 930, the CPU 12 proceeds to the decision block of step 918.
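The decision cascade of steps 914 through 930 can be sketched as follows. This is a minimal illustration, not the routine of figure 12 itself: the function names, the Euclidean distance test, and the single `noise` tolerance (reused as the cylinder radius) are all assumptions made for the example, and the two most probable states are passed in directly rather than estimated from a Gaussian mixture.

```python
import numpy as np

def in_bidr_cylinder(pixel, dark_end, bright_end, radius):
    """True if pixel lies within `radius` of the RGB segment dark_end -> bright_end.

    Hypothetical helper: models the BIDR cylinder as a finite cylinder whose
    axis joins the two most probable states for the location.
    """
    p, a, b = (np.asarray(v, dtype=float) for v in (pixel, dark_end, bright_end))
    axis = b - a
    length_sq = np.dot(axis, axis)
    if length_sq == 0.0:
        return False  # degenerate cylinder: both states coincide
    t = np.dot(p - a, axis) / length_sq  # position along the axis, 0..1
    if t < 0.0 or t > 1.0:
        return False
    closest = a + t * axis
    return bool(np.linalg.norm(p - closest) <= radius)

def classify_pixel(pixel, state1, state2, noise=10.0):
    """Classify one RGB value against the two most common states of its location.

    state1: most common state (fully lit static background).
    state2: second most common state (background in shadow).
    noise:  assumed tolerance in RGB units.
    """
    p = np.asarray(pixel, dtype=float)
    if np.linalg.norm(p - np.asarray(state1, dtype=float)) <= noise:
        return "background"   # step 916
    if np.linalg.norm(p - np.asarray(state2, dtype=float)) <= noise:
        return "shadow"       # step 926
    if in_bidr_cylinder(p, state2, state1, radius=noise):
        return "penumbra"     # step 928
    return "object"           # step 930
```

For instance, with a lit state of (200, 180, 160) and a shadow state of (60, 70, 90), the midpoint (130, 125, 125) falls on the cylinder axis and is labeled "penumbra", while a saturated green value such as (30, 200, 40) falls outside the cylinder and is labeled "object".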

Abstract

In an exemplary embodiment of the present invention, an automated, computerized method is provided for classifying pixel values in a motion sequence of images. According to a feature of the present invention, the method comprises the steps of determining spectral information relevant to the sequence of images, and utilizing the spectral information to classify a pixel as one of background, shadow and object.

Description

METHODS FOR DISCRIMINATING MOVING OBJECTS IN MOTION IMAGE SEQUENCES
Background of the Invention
[0001] Many significant and commercially important uses of modern computer technology relate to images. These include image processing, image analysis and computer vision applications. The ability of a computer to correctly and accurately distinguish between a shadow and a material object within an image has been a persistent challenge to scientists. This is particularly important in connection with object recognition and tracking in motion image sequences. For example, a video camera fixed for observation of a field of view, for detection of the appearance and motion of certain objects of interest, can be coupled to a computer for analysis of the image sequence recorded by the camera. It is a reasonably straightforward task for the computer to distinguish background elements of the image (forming the unobstructed field of view) from changes in subsequent recorded images of the field of view. A difficulty is determining whether differences detected in subsequent image recordings are the result of moving objects, or are due to changes in the illumination flux radiating the field of view, for example, a shadow caused by the object.
[0002] If a cast shadow is indistinguishable from the object casting the shadow, it would not be possible for the computer to recognize the object. Thus, the ability of a computer to recognize a specific object moving within a preselected field of view being surveyed by a video camera is an important design consideration. Accordingly, there is a persistent need for the development of accurate and correct techniques that can be utilized in the operation of computers relating to images.
Summary of the Invention
[0003] The present invention provides a method and system comprising image techniques that accurately and correctly reflect and represent physical phenomena occurring in the visual world, and more specifically, techniques which can be utilized to identify a moving object within a field of view.
[0004] In a first exemplary embodiment of the present invention, an automated, computerized method is provided for classifying pixel values in a motion sequence of images. According to a feature of the present invention, the method comprises the steps of determining spectral information relevant to the sequence of images, and utilizing the spectral information to classify a pixel as one of background, shadow and object.
[0005] In a second exemplary embodiment of the present invention, an automated, computerized method is provided for tracking an object in a motion sequence of images. According to a feature of the present invention, the method comprises the steps of determining motion pixels from the motion sequence of images, calculating second order tokens as a function of the motion pixels, and identifying motion of an object as a function of the second order tokens.
[0006] In a third exemplary embodiment of the present invention, a computer system comprises a CPU and a memory storing a sequence of images, the CPU is arranged and configured to execute a routine to determine spectral information relevant to the sequence of images, and utilize the spectral information to classify a pixel as one of background, shadow and object.
[0007] In a fourth exemplary embodiment of the present invention, a computer system comprises a CPU and a memory storing a motion sequence of images, the CPU is arranged and configured to execute a routine to determine motion pixels from the motion sequence of images, calculate second order tokens as a function of the motion pixels and identify motion of an object as a function of the second order tokens.
[0008] In accordance with yet further embodiments of the present invention, computer systems are provided, which include one or more computers configured (e.g., programmed) to perform the methods described above. In accordance with other embodiments of the present invention, computer readable media are provided which have stored thereon computer executable process steps operable to control a computer(s) to implement the embodiments described above. The automated, computerized methods can be performed by a digital computer, analog computer, optical sensor, state machine, sequencer or any device or apparatus that can be designed or programmed to carry out the steps of the methods of the present invention.
Brief Description of the Drawings
[0009] Figure 1 is a block diagram of a computer system arranged and configured to perform operations related to images.
[0010] Figure 2 shows an n X m pixel array image file for one frame of a sequence of frames of a motion image stored in the computer system of figure 1.
[0011] Figures 3a-e depict a sample image sequence illustrating an object moving through a static background field of view.
[0012] Figure 4a is a flow chart for performing an operation to classify pixel types in a frame of a motion sequence with a static background image, according to a feature of the present invention.
[0013] Figure 4b is a flow chart for identifying token regions in the image file of figure 2, according to a feature of the present invention.
[0014] Figure 4c is a flow chart for identifying X-junctions in an image, according to a feature of the present invention.
[0015] Figure 4d shows an image having an X-junction.
[0016] Figure 4e is a flow chart for an X-junction testing sub-routine of the flow chart of figure 4c.
[0017] Figure 4f is a flow chart for identifying a local spectral ratio using an X-junction of figure 4d, according to a feature of the present invention.
[0018] Figures 5a-e depict a sample image sequence illustrating an object moving in a background field of view having a variable illumination.
[0019] Figure 6 is a graph showing a representation of a temporal linear token, according to a feature of the present invention.
[0020] Figure 7 is a flow chart for performing an operation to classify pixel types in a frame of a motion sequence with highly variable illumination, according to a feature of the present invention.
[0021] Figures 8a,b depict a sample image sequence showing two objects moving across a scene from full illumination to shadow under bi-illuminant conditions.
[0022] Figure 9a is a flow chart for using second order tokens as an aid in motion object tracking, according to a feature of the present invention.
[0023] Figure 9b is a flow chart of a method for creating an Nth order token for use in the routine of figure 9a.
[0024] Figures 10a-d depict a sample image sequence showing four objects moving randomly across a scene under bi-illuminant conditions.
[0025] Figure 11 is a graph showing a representation of a Gaussian mixture model in a linear RGB space.
[0026] Figure 12 is a flow chart for classifying pixels in a motion image sequence via an enhanced Gaussian mixture model, according to a feature of the present invention.
Detailed Description of the Preferred Embodiments
[0027] Referring now to the drawings, and initially to figure 1, there is shown a block diagram of a computer system 10 arranged and configured to perform operations related to images. A CPU 12 is coupled to a device such as, for example, a digital video camera 14 via, for example, a USB port. The digital video camera 14 operates to download a sequence of images sensed by the camera 14 of a preselected field of view, to the CPU 12. The CPU 12 stores the downloaded images in a memory 16 as image files 18. The image files 18 can be accessed by the CPU 12 for display in a sequence, on a monitor 20, or for print out, one at a time, on a printer 22.
[0028] As shown in figure 2, each image file 18 comprises an n X m pixel array and corresponds to a frame of a sequence of frames recorded by the video camera 14. Each pixel, p, is a picture element corresponding to a discrete portion of the overall image. All of the pixels together define the image represented by the image file 18. Each pixel comprises a digital value corresponding to a set of color bands, for example, red, green and blue color components (RGB) of the picture element. The present invention is applicable to any multi-band image, where each band corresponds to a piece of the electro-magnetic spectrum. The pixel array includes m columns of n rows each, starting with the pixel p (1,1) and ending with the pixel p(n, m). When displaying or printing an image, the CPU 12 retrieves the corresponding image file 18 from the memory 16, and operates the monitor 20 or printer 22, as the case may be, as a function of the digital values of the pixels in the image file 18, as is generally known. The display on the monitor 20 of a sequence of image files 18 provides a display of events within the field of view, as recorded by the video camera 14.
[0029] In an image operation, the CPU 12 operates to analyze the RGB values of the pixels of each of a sequence of stored image files 18 to achieve various objectives, such as, for example, the detection and identification of an object moving through the preselected field of view recorded by the video camera 14 in the sequence of images. In one example, the image operation can be performed by the CPU 12 in connection with discriminating moving objects from shadows in sequence with a static background image. This can be, for example, a security camera mounted indoors, with a fixed field of view, and constant or slowly varying lighting. Or, in another example, the CPU 12 can analyze sequences of images with highly variable illumination, as in the instance of a video camera 14 mounted in an outdoor setting. In either case, the CPU 12 utilizes spectral information to classify pixels of the scene in a determination of the presence of objects.
[0030] As taught in co-pending Application Serial No. 11/341,742, filed on January 27, 2006, entitled: "Method and System For Identifying Illumination Flux In An Image," which is hereby incorporated by reference, an image comprises two components, material and illumination. Moreover, as further taught in the co-pending Application, an illumination flux impinging on a material depicted in an image is a bi-illuminant flux which comprises an ambient illuminant and a direct or incident illuminant. The incident illuminant is light that causes a shadow and is found outside a shadow perimeter. The ambient illuminant is light present on both the bright and dark sides of a shadow, but is more perceptible within the dark region of a shadow.
[0031] Spectra for the incident illuminant and the ambient illuminant can be different from one another. A spectral shift caused by a shadow, i.e., a decrease of the intensity of the incident illuminant, will be substantially invariant over different materials present in a scene depicted in an image when the scene is illuminated by a common illumination flux. Thus, the spectral shift caused by a shadow can be expressed by a spectral ratio of colors across an illumination boundary defined by a shadow on a material. Inasmuch as an illumination boundary is caused by the interplay between the incident illuminant and the ambient illuminant, spectral ratios throughout the image that are associated with illumination change (illuminant ratios), should be consistently and approximately equal, regardless of the color of the bright side or the material object characteristics of the boundary. A characteristic spectral ratio for a particular image or scene within an image, is a spectral ratio associated with illumination change caused by a shadow, as occurs in the particular image, and can be used to determine if a particular boundary in a scene is caused by a shadow or an object. This information can be utilized to classify pixels of a time sequence of images to detect moving objects.
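The invariance described above can be checked numerically with a small sketch. It assumes a deliberately simple model that is not stated in this form in the text: a fully lit surface reflects incident plus ambient light, a fully shadowed surface reflects ambient light only, and the spectral ratio is taken per color channel as Dark/(Bright - Dark). The illuminant strengths and material reflectances are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical per-channel (R, G, B) illuminant strengths, not values from the source.
incident = np.array([0.9, 0.8, 0.6])  # direct light, e.g. warm sunlight
ambient = np.array([0.2, 0.3, 0.5])   # ambient light, e.g. bluish skylight

def spectral_ratio(dark, bright):
    """Per-channel ratio Dark / (Bright - Dark)."""
    dark = np.asarray(dark, dtype=float)
    bright = np.asarray(bright, dtype=float)
    return dark / (bright - dark)

def normalize(ratio):
    """Scale a ratio vector to unit length so only its direction is compared."""
    return ratio / np.linalg.norm(ratio)

# Two different hypothetical materials, given as per-channel reflectances.
materials = {
    "red brick": np.array([0.7, 0.3, 0.2]),
    "green paint": np.array([0.2, 0.6, 0.3]),
}

ratios = {}
for name, reflectance in materials.items():
    bright = reflectance * (incident + ambient)  # fully lit
    dark = reflectance * ambient                 # full shadow: ambient only
    ratios[name] = normalize(spectral_ratio(dark, bright))
```

Under this model the reflectance cancels out of the ratio, leaving ambient/incident per channel, so both materials yield the same normalized vector. That cancellation is what allows one characteristic ratio to describe shadow boundaries across a whole scene.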
[0032] An exemplary characteristic spectral ratio of an image can be expressed by the equation: S = Dark/(Bright-Dark), wherein Dark is a sensor reading of a color of a material depicted in the image, in shadow, and Bright is the sensor reading for the color of that same material when fully lit by the incident illuminant. In a preferred embodiment of the present invention, the spectral ratio S = Dark/(Bright-Dark) is utilized because it has been discovered during development of the present invention that the normalized value for the ratio Dark/(Bright-Dark) is invariant across different geometric orientations for a material object, and thus, the ratio remains constant across illumination boundaries for objects at different orientations. Moreover, the normalized value for the ratio Dark/(Bright-Dark) produced by a fully shadowed pixel and a penumbra pixel will be the same as the normalized value produced by a fully shadowed pixel and a fully lit pixel. These relationships are not exhibited by the normalized values of Bright/Dark and Bright/(Bright-Dark). Accordingly, the ratio Dark/(Bright-Dark) and the normalized value thereof, provide the optimum combination of accuracy and correctness.
[0033] Referring now to figures 3a-e, there is shown a sample image sequence illustrating an object moving through a static background field of view illuminated by constant bi-illuminant conditions (incident or direct illuminant and ambient illuminant), as may occur, for example, with a security camera mounted for monitoring a fixed field of view, and constant lighting. The sequence illustrated in figures 3a-c can correspond to a sequence of image files 18, stored by the CPU 12 from images of a scene recorded by the video camera 14. They illustrate a ball moving across the static background of the field of view. The sample image sequence shows a ball and its shadow moving across a background under bi-illuminant conditions.
The most common pixel values at each location define the static background image. Figure 3d corresponds to a depiction of the static background of the field of view, determined by the CPU 12 from the images of figures 3a-c, as will be described. According to a feature of the present invention, the normalized characteristic spectral ratio for the static background depicted in figure 3d provides a basis for analyzing each individual pixel of each image file 18 of the sequence, as represented by figures 3a-c, to determine whether an individual pixel of a particular image file 18 is a background pixel, a pixel from a moving object, such as the ball of the example of figures 3a-c, or a shadow pixel. Figure 3e depicts an image of the ball without a shadow, as can be generated by the CPU 12 upon analysis of the images of figures 3a-c, as will appear. For example, figure 3a is redrawn by replacing the pixels identified as shadow with static background pixels from the same location. Thus only the moving object and the static background image remain, greatly simplifying the tracking of the moving object, in this case the ball.
[0034] As shown in figure 3a, a location (X, Y) at a first time represents the static background, and pixels at this location will equal the value of pixels from the static background represented by figure 3d. At a second time, depicted in figure 3b, the location (X, Y) falls within a shadow cast by the moving ball, as the ball moves across the field of view. The normalized spectral ratio between the current pixel at location (X, Y) and the static background pixel at location (X, Y) matches the scene normalized spectral ratio, and so the pixel is recognized as being a shadow. At a third time, depicted in figure 3c, the location (X, Y) falls within the ball object moving through the scene. The normalized spectral ratio between the current pixel at location (X, Y), and the static background pixel at location (X, Y) does not match the scene normalized spectral ratio, and so the pixel is recognized as being a distinct material from a moving object. Figure 3d shows the static background image calculated from the three frames of figures 3a-c. The pixel values at location (X, Y) represent the static background image. The static background image is used with the scene normalized spectral ratio to classify pixels that change from frame to frame as either background pixels, shadow pixels, or pixels from moving objects.
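The ratio S = Dark/(Bright-Dark) and its normalized value can be illustrated with a minimal sketch. It assumes pixels are RGB triples of floats and that "normalized" means scaling the ratio vector to unit Euclidean length; the text does not define the normalization, so that choice, the function names and the sample values are all hypothetical.

```python
import math

def spectral_ratio(dark, bright, eps=1e-9):
    """Per-band S = Dark / (Bright - Dark); eps guards against a zero denominator."""
    return tuple(d / max(b - d, eps) for d, b in zip(dark, bright))

def normalize(vec):
    """Scale a vector to unit Euclidean length (assumed meaning of 'normalized')."""
    length = math.sqrt(sum(v * v for v in vec))
    return tuple(v / length for v in vec) if length > 0 else tuple(vec)

# Hypothetical sensor readings for one material, fully lit and in full shadow.
bright = (200.0, 180.0, 160.0)
dark = (50.0, 60.0, 80.0)

s = spectral_ratio(dark, bright)
s_n = normalize(s)

# A penumbra reading halfway between dark and bright (also hypothetical).
penumbra = tuple(d + 0.5 * (b - d) for d, b in zip(dark, bright))
s_n_penumbra = normalize(spectral_ratio(dark, penumbra))
```

In this sketch the ratio formed from the shadowed and penumbra readings is a scalar multiple of the ratio formed from the shadowed and fully lit readings, so the two normalized vectors coincide.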
[0035] Figure 4a is a flow chart for performing an operation to classify pixel types in a frame of a motion sequence with a static background image, according to a feature of the present invention. As a first step (step 100), a sequence of image files 18, corresponding to N frames recorded in a time sequence by the video camera 14, is input to the CPU 12, for example, the images of figures 3a-c. The recording of the N frames by the camera 14 can be in real time, relative to the execution of the routine of figure 4a by the CPU 12. In step 102, the CPU 12 analyzes each pixel of each frame represented by the sequence of image files 18. Step 102 is performed to determine each value for each pixel of the sequence of N frames, to establish the static background of the scene.
[0036] To that end, in step 104, the CPU 12 selects a pixel value for each scene location, in each instance a value corresponding to the most common value for the respective pixel from the N pixel values throughout the time sequence of N frames. As any value different from the static background value will be due to the temporary presence of a moving object, the common value occurring most frequently in the N frames will correspond to the static background value.
[0037] Performance of step 104 in respect of all of the pixel values for the N frames generates an image file 18 that depicts the static background, as for example, the image of figure 3d (step 106). Other known techniques for building a static background can be implemented, in lieu of the most frequent occurrence approach described above. Moreover, in a situation wherein the static background is fixed, and the video camera 14 is mounted in a predetermined fixed location, the video camera 14 can be operated to record the static background before any objects are permitted to pass through the scene. In that instance, step 106 will be the input to the CPU 12.
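The most-frequent-value selection of step 104 can be sketched as follows. This is an illustration only: it assumes each frame is a list of rows of hashable pixel values and that background values repeat exactly from frame to frame (real sensor data would likely need quantization or a tolerance first). The function name is hypothetical.

```python
from collections import Counter

def static_background(frames):
    """Return an image whose pixel at each location is the most common value
    observed at that location across all frames."""
    height, width = len(frames[0]), len(frames[0][0])
    background = []
    for y in range(height):
        row = []
        for x in range(width):
            counts = Counter(frame[y][x] for frame in frames)
            row.append(counts.most_common(1)[0][0])
        background.append(row)
    return background

# Three tiny one-row frames; a moving object briefly changes each location.
frames = [
    [[10, 5]],
    [[10, 7]],
    [[99, 5]],
]
background = static_background(frames)
```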
[0038] In step 108, an image file 18, corresponding to a single frame T of the N frames is input to the CPU 12, for analysis relative to the image file 18 depicting the static background. For example, the image depicted in figure 3a is analyzed relative to the static background of figure 3d. Each pixel of the subject frame is designated by its location within the corresponding image file 18: P(X, Y) and each pixel of the static background is likewise designated by a location: B(X, Y).
[0039] In step 110, the CPU 12 selects one pixel location P(X, Y) from the subject frame T for analysis to classify the selected pixel P(X, Y) as a background pixel, a pixel from a moving object, such as the ball of the example of figures 3a-c, or a shadow pixel. In step 112, the CPU 12 calculates a normalized spectral ratio between the pixel P(X, Y) of the subject frame and the corresponding pixel in the static background image for the same X, Y location (B(X, Y)): Sn(P, B) = Norm(P/(B - P)). The CPU 12 then proceeds to decision block 114. In the decision block 114, the CPU 12 determines whether the pixel P(X, Y) equals the corresponding pixel B(X, Y) of the static background. If yes, the particular pixel P(X, Y) is classified as a pixel depicting the static background of the scene (step 116). In step 118, the CPU 12 stores the pixel classification.
[0040] If no, the CPU 12 proceeds to decision block 120 for the respective pixel P(X, Y). In decision block 120, the CPU 12 compares the normalized spectral ratio for the pixel P(X, Y), Sn(P, B), as calculated in step 112, to the global or characteristic spectral ratio for the static background scene. The characteristic spectral ratio comprises the spectral information used to classify the respective pixel P(X, Y). The co-pending Application Serial No. 11/341,742 teaches methods for automatically determining a characteristic spectral ratio for a scene. For example, the CPU 12 is operated to identify spatio-spectral features of an image, that is, features that comprise conditions that are indicative of illumination flux. An example of a spatio-spectral feature is an X-junction. An X-junction is an area of an image where a material edge and an illumination boundary cross one another. An X-junction is an optimal location for an accurate determination of an illuminant ratio or characteristic spectral ratio for a scene.
[0041] To improve the accuracy and correctness of the characteristic ratio for an image, the spectral ratio information for illumination boundaries is determined on a local level, that is, an illuminant ratio is determined for each of several preselected local areas of a scene depicted in an image. An analysis of a boundary is then executed utilizing the spectral ratio for the specific location of the boundary within the image. The determination of locally relevant spectral ratios accommodates complexities that may be encountered in a real world image, for example, the interplay of several different sources of light in a room, inter-reflections, and so on.
[0042] A token analysis of an image is used to identify spatio-spectral features. A token is a connected region of an image wherein the pixels of the region are related to one another in a manner relevant to identification of spatio-spectral features. The pixels of a token can be related in terms of either homogeneous factors, such as, for example, close correlation of color among the pixels (a uniform token), or nonhomogeneous factors, such as, for example, differing color values related geometrically in a color space such as RGB space (for example, a linear token). The use of tokens rather than individual pixels reduces complexity and noise in image processing and provides a more efficient, less intense computational operation for the computer system 10.
[0043] A uniform token analysis is used to identify X-junctions in an image. A uniform token is a homogeneous token that comprises a connected region of an image with approximately constant pixel values (for example, within a range determined by the expected noise margin of the recording equipment or normal variations in materials) throughout the region. A first order uniform token comprises a single robust color measurement among contiguous pixels of the image. The analysis can include an examination of token neighbor relationships indicative of spatio-spectral features of an image.
[0044] Figure 4b shows a flow chart for identifying token regions in the subject frame of the N frames, (depicted as an image file 18 in figure 2), according to a feature of the present invention. At the start of the identification routine, the CPU 12 sets up a region map in memory. In step 200, the CPU 12 clears the region map and assigns a region ID, which is initially set at 1. An iteration for the routine, corresponding to a pixel number, is set at i = 0, and a number for an N x N pixel array, for use as a seed to determine the token, is set to an initial value, N = Nstart. Nstart can be any integer > 0, for example it can be set at 11 or 15 pixels.
[0045] At step 202, a seed test is begun. The CPU 12 selects a first pixel, i = (1, 1) for example, the pixel at the upper left corner of a first N x N sample. The pixel is then tested in decision block 204 to determine if the selected pixel is part of a good seed. The test can comprise a comparison of the color value of the selected pixel to the color values of a preselected number of its neighboring pixels as the seed, for example, the N x N array. If the comparison does not result in approximately equal values for the pixels in the seed, the CPU 12 increments the value of i (step 206), for example, i = (1, 2), for a next N x N seed sample, and then tests to determine if i = imax (decision block 208).
[0046] If the value of i is at imax, a value selected as a threshold for deciding to reduce the seed size for improved results, the seed size, N, is reduced (step 210), for example, from N = 15 to N = 12. In an exemplary embodiment of the present invention, imax can be set at i = (n, m). In this manner, the routine of figure 4b parses the entire image at a first value of N before repeating the routine for a reduced value of N.
[0047] After reduction of the seed size, the routine returns to step 202, and continues to test for token seeds. An Nstop value (for example, N = 2) is also checked in step 210 to determine if the analysis is complete. If the value of N is at Nstop, the CPU 12 has completed a survey of the image pixel arrays and exits the routine.
[0048] If the value of i is less than imax, and N is greater than Nstop, the routine returns to step 202, and continues to test for token seeds.
[0049] When a good seed (an N x N array with approximately equal pixel values) is found (block 204), the token is grown from the seed. In step 212, the CPU 12 pushes the pixels from the seed onto a queue. All of the pixels in the queue are marked with the current region ID in the region map. The CPU 12 then inquires as to whether the queue is empty (decision block 214). If the queue is not empty, the routine proceeds to step 216.
[0050] In step 216, the CPU 12 pops the front pixel off the queue and proceeds to step 218. In step 218, the CPU 12 marks "good" neighbors around the subject pixel, that is neighbors approximately equal in color value to the subject pixel, with the current region ID. All of the marked good neighbors are placed in the region map and also pushed onto the queue. The CPU 12 then returns to the decision block 214. The routine of steps 214, 216, 218 is repeated until the queue is empty. At that time, all of the pixels forming a token in the current region will have been identified and marked in the region map.
[0051] When the queue is empty, the CPU 12 proceeds to step 220. At step 220, the CPU 12 increments the region ID for use with identification of a next token. The CPU 12 then returns to step 206 to repeat the routine in respect of the new current token region. Upon arrival at N = Nstop, step 210 of the flow chart of figure 4b, or completion of a region map that coincides with the image, the routine will have completed the token building task.
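The seed test and queue-based growth of steps 200-220 can be sketched as below. This is a simplified reading, not the routine of figure 4b itself: it assumes 4-connected neighbors, a per-band tolerance `tol` as the meaning of "approximately equal", a seed size that shrinks by one per pass, and a region ID of 0 for unassigned pixels. All names and default values are hypothetical.

```python
from collections import deque

def close(a, b, tol):
    """True when two colors agree within tol in every band."""
    return all(abs(p - q) <= tol for p, q in zip(a, b))

def is_good_seed(img, region, x, y, n, tol):
    """An n x n block is a good seed if it is unassigned and nearly uniform."""
    if y + n > len(img) or x + n > len(img[0]):
        return False
    ref = img[y][x]
    return all(region[j][i] == 0 and close(img[j][i], ref, tol)
               for j in range(y, y + n) for i in range(x, x + n))

def grow_token(img, region, x, y, n, tol, region_id):
    """Mark the seed pixels, then flood outward through similar neighbors."""
    h, w = len(img), len(img[0])
    queue = deque((i, j) for j in range(y, y + n) for i in range(x, x + n))
    for i, j in queue:
        region[j][i] = region_id
    while queue:
        i, j = queue.popleft()  # pop the front pixel
        for u, v in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if (0 <= u < w and 0 <= v < h and region[v][u] == 0
                    and close(img[v][u], img[j][i], tol)):
                region[v][u] = region_id  # mark the good neighbor
                queue.append((u, v))

def build_region_map(img, n_start=2, n_stop=1, tol=2):
    """Scan the whole image at each seed size before reducing the seed size."""
    h, w = len(img), len(img[0])
    region = [[0] * w for _ in range(h)]
    region_id = 1
    n = n_start
    while n > n_stop:
        for y in range(h):
            for x in range(w):
                if is_good_seed(img, region, x, y, n, tol):
                    grow_token(img, region, x, y, n, tol, region_id)
                    region_id += 1
        n -= 1  # assumed reduction step
    return region

# A 4 x 4 test image: dark left half, bright right half.
img = [[(10, 10, 10), (10, 10, 10), (200, 200, 200), (200, 200, 200)]
       for _ in range(4)]
region_map = build_region_map(img)
```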
[0052] Upon completion of the token region map for the frame, the CPU 12 proceeds to the routine of figure 4c, to identify X-junctions in the image of the subject frame using the token region map. In step 300, the CPU 12 is input the token region map completed through execution of the routine of figure 4b, the color of each token region and a set of pixels to provide a basis for searching for X-junctions. The pixel set can be a subset of the pixels of the image, for example, selecting every Nth pixel of every Nth row, or jitter sampling, for example selecting a pixel randomly of each N x N box throughout the image. In our examples, N = 4.
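The two ways of choosing the search pixel set, every Nth pixel of every Nth row or one random pixel per N x N box, might look like the following sketch. The function names and the optional `rng` parameter are assumptions made for illustration.

```python
import random

def grid_sample(width, height, n):
    """Every nth pixel of every nth row, as (x, y) pairs."""
    return [(x, y) for y in range(0, height, n) for x in range(0, width, n)]

def jitter_sample(width, height, n, rng=None):
    """One randomly placed pixel inside each n x n box of the image."""
    rng = rng or random.Random()
    points = []
    for y0 in range(0, height, n):
        for x0 in range(0, width, n):
            x = rng.randrange(x0, min(x0 + n, width))
            y = rng.randrange(y0, min(y0 + n, height))
            points.append((x, y))
    return points

grid = grid_sample(8, 8, 4)
jittered = jitter_sample(8, 8, 4, rng=random.Random(0))
```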
[0053] In a decision block (step 302), the CPU 12 determines whether there are still pixels in the pixel set for search of an X-junction. If no, the CPU 12 exits the routine with a map of likely X-junctions in the frame (step 304). If yes, the CPU 12 proceeds to step 306, to select a next pixel Pi, from the pixel set.
[0054] In step 308, the CPU 12 builds a set Q of all unique tokens with a seed size > S within a distance D of the pixel Pi. In our example, S = 2 and D = 10. In a decision block (step 310), the CPU 12 determines if the set Q has at least four tokens. If no, the CPU 12 returns to the decision block of step 302. If yes, the CPU 12 proceeds to step 312.
[0055] In step 312, the CPU 12 selects a token from the set Q to designate as a token A. In a decision block (step 314), the CPU 12 tests the token A to determine whether A > Minimum intensity, for example an intensity equal to 20% of the maximum intensity within the subject frame. If no, the CPU 12 proceeds to a decision block (step 316) to determine whether there are more tokens in the set Q for testing as an A token. If yes, the CPU 12 returns to step 312 to repeat the test. If no, the CPU 12 returns to step 302.
[0056] If the CPU 12 determines a yes condition in the decision block of step 314, the CPU 12 proceeds to step 318.
[0057] In step 318, the CPU 12 selects a token from the set Q to designate as a token B (with B not equal to A). In a decision block (step 320), the CPU 12 tests the token B to determine whether F*B < A in all color bands. In our example, F = 2. If no, the CPU 12 proceeds to a decision block (step 322) to determine whether there are more tokens in the set Q for testing as a B token. If yes, the CPU 12 selects another token and returns to step 318 to repeat the test. If no, the CPU 12 returns to step 316.
[0058] If the CPU 12 determines a yes condition in the decision block of step 320, the CPU 12 proceeds to step 324.
[0059] In step 324 the CPU 12 selects a token from the set Q to designate as a token D (D not equal to either A or B). In a decision block (step 326), the CPU 12 tests the token D to determine whether D > Minimum intensity, for example an intensity equal to 20% of the maximum intensity within the subject frame. If no, the CPU 12 proceeds to a decision block (step 328) to determine whether there are more tokens in the set Q for testing as a D token. If yes, the CPU 12 selects another token and returns to step 324 to repeat the test. If no, the CPU 12 returns to step 322.
[0060] If the CPU 12 determines a yes condition in the decision block of step 326, the CPU 12 proceeds to step 330.
[0061] In step 330, the CPU 12 selects a token from the set Q to designate as a token C (C not equal to either A or B or D). In a decision block (step 332), the CPU 12 tests the token C to determine whether F*C < D in all color bands. In our example, F = 2. If no, the CPU 12 proceeds to a decision block (step 334) to determine whether there are more tokens in the set Q for testing as a C token. If yes, the CPU 12 selects another token and returns to step 330 to repeat the test. If no, the CPU 12 returns to step 328.
[0062] If the CPU 12 determines a yes condition in the decision block of step 332, the CPU 12 proceeds to step 336.
[0063] Figure 4e shows a flow chart for the X-junction testing sub-routine, step 336 of the flow chart of figure 4c. In step 426 the token neighbor set {A, B, C, D} is set as the starting point of the routine. As noted, the hypothesis is that A and B are the same material 1, and that D and C are the same material 2, and that B and C are in shadow (step 428, see figure 4d).
[0064] In a first test, step 430, the CPU 12 determines whether the pixels of token A > the pixels of token B and the pixels of token D > the pixels of token C, in each color band. The colors B and C are multiplied by a factor, f, which is a scalar value greater than 1. In step 432, it is determined whether the bright measurements for A and D tokens are brighter than a minimum threshold.
[0065] The next test (step 434) comprises a determination of whether each of the bright tokens A and D, are significantly different in a color space, for example, in an RGB space. In this regard, a determination is made as to whether the color space distance (A, D) > threshold.
[0066] In step 436 the CPU 12 determines whether the reflectance ratio between A and D is approximately equal to the reflectance ratio for B and C. The bounded version of the ratios can be used, R1 = (A-D)/(A+D), and R2 = (B-C)/(B+C), with R1 = R2. In step 438, the spectral ratios S1 = B/(A-B) and S2 = C/(D-C) are compared to determine if they are similar to one another (within a predetermined difference).
[0067] In step 440, the CPU 12 determines if the spectral ratios fit an a priori model of a reasonable illuminant. Variations on the constraints can include, for example, requiring the dark measurements for the B and C tokens to be less than a percentage of the corresponding bright measurement. Moreover, the CPU 12 can test the spectral ratios determined in step 438 for saturation levels. Saturation is defined as saturation = 1 - (minimum color band/maximum color band). An upper boundary can be established for the spectral ratio, in terms of saturation, for example, any spectral ratio with a saturation > 0.9 is considered to be unreasonable. If all of the above constraints are met, the X-junction criteria are considered to be satisfied (step 442).
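The sequence of tests in steps 430-440 can be sketched as a single predicate over the four token colors. The thresholds (`min_bright`, `min_dist`, `ratio_tol`), the use of the band mean as "brightness", and the early rejection of zero dark readings are assumptions made for this illustration; only f = 2 and the 0.9 saturation bound come from the examples in the text. The sample colors are synthetic.

```python
import math

def x_junction_test(a, b, c, d, f=2.0, min_bright=50.0,
                    min_dist=30.0, ratio_tol=0.05, max_saturation=0.9):
    """a, d: bright token colors; b, c: the same two materials in shadow."""
    bands = range(len(a))
    # Assumption: a zero dark reading carries no usable ratio, so reject it.
    if min(b) <= 0 or min(c) <= 0:
        return False
    # Step 430: A > f*B and D > f*C in every color band.
    if not all(a[k] > f * b[k] and d[k] > f * c[k] for k in bands):
        return False
    # Step 432: the bright tokens exceed a minimum brightness.
    if sum(a) / len(a) < min_bright or sum(d) / len(d) < min_bright:
        return False
    # Step 434: the bright tokens differ significantly in color space.
    if math.dist(a, d) <= min_dist:
        return False
    # Step 436: bounded reflectance ratios R1 and R2 approximately equal.
    r1 = [(a[k] - d[k]) / (a[k] + d[k]) for k in bands]
    r2 = [(b[k] - c[k]) / (b[k] + c[k]) for k in bands]
    if any(abs(p - q) > ratio_tol for p, q in zip(r1, r2)):
        return False
    # Step 438: spectral ratios S1 and S2 similar to one another.
    s1 = [b[k] / (a[k] - b[k]) for k in bands]
    s2 = [c[k] / (d[k] - c[k]) for k in bands]
    if any(abs(p - q) > ratio_tol for p, q in zip(s1, s2)):
        return False
    # Step 440: reject spectral ratios that are too saturated.
    return all(1 - min(s) / max(s) <= max_saturation for s in (s1, s2))

# Synthetic tokens: two materials, each seen fully lit (A, D) and in shadow (B, C).
A, B = (192.0, 150.0, 104.0), (32.0, 30.0, 24.0)
D, C = (72.0, 125.0, 182.0), (12.0, 25.0, 42.0)
is_x_junction = x_junction_test(A, B, C, D)
```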
[0068] In the event a token set {A, B, C, D} fails the X-junction tests of step 336, the CPU 12 returns to step 334. If the token set {A, B, C, D} passes the X-junction tests of step 336, the CPU 12 proceeds to step 338 to mark the token set {A, B, C, D} as a valid X-junction. The CPU 12 then returns to step 302.
[0069] Referring now to figure 4f, there is shown a flow chart for identifying a local spectral ratio using an X-junction, according to a feature of the present invention. The CPU 12 is given the image file 18 of the subject frame of the N frames, and X-junction parameters in step 500. The CPU 12 then proceeds to step 502, which comprises the performance of the processes of figures 4b-e, throughout the given image to identify all X-junctions within the image.
[0070] Upon completion of step 502, the CPU 12 proceeds to step 504 to calculate a spectral ratio for each bright/dark pixel pair in each X-junction, and store the results in a memory array. In step 506, the CPU 12 executes a mean shift algorithm on the array of spectral ratios. The mean shift algorithm can comprise, for example, an algorithm described in "Mean shift analysis and applications," Comaniciu, D.; Meer, P.; Computer Vision, 1999, The Proceedings of the Seventh IEEE International Conference on; Volume 2, 20-27 September, 1999; Pages 1197-1203. The output of execution of the mean shift algorithm (step 508) is a spectral ratio for all or a specific local region of the image. The execution of step 506 can include a survey of values for the spectral ratios throughout the image.
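To illustrate how a mean shift pass can reduce an array of spectral ratios to one representative value, here is a minimal flat-kernel sketch. It is a simplified stand-in, not the algorithm of the cited Comaniciu and Meer paper; the bandwidth, the stopping rule and the choice of the best-supported mode are assumptions, and the sample ratios are invented.

```python
import math

def mean_shift_mode(points, bandwidth, max_iters=50, tol=1e-6):
    """Return the mode supported by the most points, using a flat kernel."""
    def shift(center):
        for _ in range(max_iters):
            near = [p for p in points if math.dist(p, center) <= bandwidth]
            new = tuple(sum(band) / len(near) for band in zip(*near))
            if math.dist(new, center) < tol:
                return new
            center = new
        return center

    def support(mode):
        return sum(math.dist(mode, p) <= bandwidth for p in points)

    # Start one search from every point and keep the best-supported result.
    return max((shift(p) for p in points), key=support)

# Invented spectral ratios: four consistent values and one outlier.
ratios = [
    (0.20, 0.25, 0.30),
    (0.21, 0.25, 0.29),
    (0.19, 0.26, 0.30),
    (0.20, 0.24, 0.31),
    (0.90, 0.90, 0.90),
]
characteristic = mean_shift_mode(ratios, bandwidth=0.1)
```

The outlier forms its own single-point mode and is outvoted, so the returned value is the mean of the four consistent ratios.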
[0071] If the spectral ratios calculated over the image by, for example, one of the methods described above, vary by an amount > a threshold variance, a local approach will be implemented for the spectral ratio information used in determining illumination boundaries. That is, the value at a specific X-junction, or a mean or median of a set of nearby X-junctions will be used as the spectral ratio when the CPU 12 determines illumination boundaries in the region of the image near the specific X-junction. If all of the spectral ratios for the entire image vary by less than the threshold variance, a global approach can be used with the same mean shift spectral ratio used in all illumination boundary determinations.
[0072] Upon determination by the CPU 12 of a characteristic spectral ratio for the static background scene (from step 106 of figure 4a), the CPU 12 returns to step 120 of figure 4a to run the comparison between the normalized spectral ratio for the pixel P(X, Y) from the subject frame T, Sn(P, B), as calculated in step 112, and the characteristic spectral ratio for the static background scene. If Sn(P, B) is similar enough to the characteristic spectral ratio for the static background scene, the pixel P(X, Y) is classified as a shadow pixel (step 122), and the CPU 12 stores the classification (step 118). The similarity between Sn(P, B) and the characteristic spectral ratio can be a function of noise present in the sensing equipment. If the ratios are not sufficiently similar, the pixel P(X, Y) is classified as a pixel from a moving object (step 124), and the CPU 12 stores the classification for the pixel P(X, Y) in step 118.
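The per-pixel decision of steps 112-124 can be sketched as follows. Several details are assumptions made for illustration: unit-length normalization for Sn, a Euclidean distance with tolerance `tol` as the similarity test, and an early "object" result when any band of P is not darker than B, which avoids a zero or negative denominator. The pixel values are synthetic.

```python
import math

def normalized_ratio(p, b):
    """Sn(P, B) = Norm(P / (B - P)), with unit-length normalization assumed."""
    ratio = [pc / (bc - pc) for pc, bc in zip(p, b)]
    length = math.sqrt(sum(r * r for r in ratio))
    return [r / length for r in ratio] if length > 0 else ratio

def classify_pixel(p, b, characteristic, tol=0.05):
    """Return 'background', 'shadow' or 'object' for frame pixel p against
    static background pixel b. characteristic must already be normalized."""
    if p == b:
        return "background"            # decision block 114, step 116
    # Assumption: a pixel that is not darker in every band cannot be a shadow.
    if any(pc >= bc for pc, bc in zip(p, b)):
        return "object"
    s_n = normalized_ratio(p, b)       # step 112
    if math.dist(s_n, characteristic) <= tol:
        return "shadow"                # decision block 120, step 122
    return "object"                    # step 124

# Synthetic data: characteristic ratio and a lit background pixel.
norm = math.sqrt(0.2 ** 2 + 0.25 ** 2 + 0.3 ** 2)
characteristic = [0.2 / norm, 0.25 / norm, 0.3 / norm]
background_pixel = (192.0, 150.0, 104.0)

labels = [
    classify_pixel((192.0, 150.0, 104.0), background_pixel, characteristic),
    classify_pixel((32.0, 30.0, 24.0), background_pixel, characteristic),
    classify_pixel((12.0, 25.0, 42.0), background_pixel, characteristic),
]
```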
[0073] Once a pixel P(X, Y) is classified, and the classification is stored (step 118), the CPU 12 proceeds to a decision block (step 126). In the decision block, the CPU 12 determines whether there are any more pixels in the subject frame T for classification. If yes, the CPU 12 returns to step 110, selects another pixel P(X, Y), and continues to the steps 112-124. If no, the CPU 12 proceeds to the decision block of step 128. In the decision block of step 128, the CPU 12 determines whether there are any more frames T in the sequence of N frames, for examination of pixel values. If yes, the CPU 12 returns to step 108, selects another frame T, and continues to steps 110-124. If no, the routine is complete (step 130).
[0074] Several enhancements can be implemented to improve the accuracy of the pixel classification obtained through execution of the routine of figure 4a. For example, pixels having a value that is either clipped by the maximum level of the image sensor of the video camera 14, or at the low level of the dark range and therefore subject to sensor noise, are often inaccurate, and can be ignored. Any pixel having an overall intensity value that is higher than the corresponding background pixel can be classified as an object. This is because a background pixel in shadow is always less intense than a fully lit background pixel, and, thus, the higher intensity indicates that it is not a background pixel.
[0075] Similarly, any pixel having an overall intensity value that is darker than the corresponding background pixel, but having at least one color channel that is brighter than that color channel for the background pixel can be classified as a pixel of an object. This is because when a change is due to shadow, all of the channels must change in the same direction.
[0076] There are instances when the pixel value of an object is nearly identical to the corresponding pixel of the background image in some degree of shadow. Under such circumstances a known texture matching technique can be used to compare the subject pixel to surrounding pixels. If there is not a match, the subject pixel may be related to a moving object.
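The intensity-based enhancements described above (ignoring clipped or very dark pixels, and treating brighter pixels or pixels with any brighter channel as object pixels) can be sketched as a pre-check run before the spectral ratio comparison. The 8-bit thresholds `low` and `high`, the use of the band sum as "overall intensity", and the function name are assumptions for illustration; the texture matching of the last paragraph is not covered.

```python
def precheck(p, b, low=5, high=250):
    """Return 'ignore', 'object', or None when no early decision applies."""
    # Clipped or near-black readings are unreliable and can be ignored.
    if max(p) >= high or sum(p) / len(p) <= low:
        return "ignore"
    # Overall brighter than the background: cannot be background in shadow.
    if sum(p) > sum(b):
        return "object"
    # Darker overall, but one channel brighter: a shadow moves all channels
    # in the same direction, so this must be an object.
    if any(pc > bc for pc, bc in zip(p, b)):
        return "object"
    return None

bg = (200, 200, 200)
results = [
    precheck((255, 100, 100), bg),   # clipped channel
    precheck((2, 3, 1), bg),         # near black
    precheck((210, 210, 210), bg),   # brighter overall
    precheck((100, 100, 205), bg),   # darker overall, blue brighter
    precheck((100, 90, 80), bg),     # plausible shadow, no early decision
]
```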
[0077] Figures 5a-e depict a sample image sequence in time illustrating an object in a background field of view having a variable illumination, as may occur, for example, when utilizing a video camera mounted outdoors. In the example of figures 5a-e, the shadow of a tree leaf, under bi-illuminant conditions, is projected onto the scene background. The shadow moves in time, within the background field of view, and is the result of a leaf a considerable distance from the scene, causing a significant penumbra and highly variable illumination flux on many pixels of the recorded sequence of images.
[0078] In a first frame of the sequence, depicted in figure 5a, a location (X, Y) is within the full shadow of the leaf, and thus, illuminated solely by the ambient illuminant of the bi-illuminant flux. In a second frame, recorded at a time subsequent to the time of the first frame (figure 5b), the location (X, Y) is now in the shadow penumbra, illuminated by the ambient illuminant and approximately one third of the direct or incident illuminant.
[0079] In figure 5c, a third frame of the sample sequence, the location (X, Y) is in a portion of the penumbra illuminated by the ambient illuminant and approximately two thirds of the incident illuminant. In a fourth frame (figure 5d), the shadow has moved beyond the location (X, Y), leaving the location within the scene background, fully lit by the incident illuminant and the ambient illuminant. Finally, in a fifth frame, figure 5e, the location (X, Y) is covered by an object, in our example, a ball, that has moved into the field of view. The ball is lit by the incident illuminant and the ambient illuminant and projects a shadow.
[0080] As discussed above, a linear token is a set of pixels of differing color values related geometrically in a color space such as RGB space, for example, an axially extending linear or cylindrical geometric shape, from a dark end (in shadow) to a bright end (lit end), along a positive slope. A linear or cylindrical configuration for the token is predicted by a bi-illuminant dichromatic reflection model (BIDR model), according to a feature of the present invention, when the color change is due to an illumination change forming a shadow (i.e. a decrease in the intensity of the incident illuminant as the interplay between the incident or direct illuminant and the ambient illuminant in the illumination field) over a single material of a scene depicted in the image (a BIDR cylinder).
[0081] For purposes of describing, identifying and using linear tokens, the BIDR model can be stated as: $I(x, y, z, \theta, \phi, \lambda) = c_b(\lambda)\, I_d(\lambda)\, \gamma_b + M_a(\lambda)\, c_b(\lambda)$, where: $I(x, y, z, \theta, \phi, \lambda)$ is the radiance of a surface point at (x, y, z) in the direction θ, φ for the wavelength λ, $c_b(\lambda)$ is the geometry independent body reflectance of a surface for the wavelength λ, $I_d(\lambda)$ is the incident illuminant for the wavelength λ, $\gamma_b$ is the product of a shadow factor $s_{x, y, z}$ and a geometric factor $m_b(\theta_i)$, and $M_a(\lambda)$ is the integral of the ambient illuminant and geometric body reflectance over a hemisphere, excluding the incident illuminant. For a more detailed discussion of the BIDR model, reference is made to co-pending Application Serial No. 11/341,751, filed on January 27, 2006, entitled: "Bi-illuminant Dichromatic Reflection Model For Image Manipulation," which is hereby incorporated by reference.
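As a worked illustration, the BIDR model suggests why the ratio Dark/(Bright-Dark) does not depend on the material. The derivation below is a sketch that assumes $\gamma_b = 0$ in full shadow and $\gamma_b = 1$ when fully lit; those endpoint values, and the shorthand names Dark, Bright and $P_\gamma$, are assumptions introduced here rather than statements from the text.

```latex
% Full shadow (assumed \gamma_b = 0) and full illumination (assumed \gamma_b = 1):
\mathrm{Dark}(\lambda)   = M_a(\lambda)\, c_b(\lambda), \qquad
\mathrm{Bright}(\lambda) = c_b(\lambda)\, I_d(\lambda) + M_a(\lambda)\, c_b(\lambda)

% The body reflectance c_b cancels in the spectral ratio:
S(\lambda) = \frac{\mathrm{Dark}(\lambda)}{\mathrm{Bright}(\lambda) - \mathrm{Dark}(\lambda)}
           = \frac{M_a(\lambda)\, c_b(\lambda)}{c_b(\lambda)\, I_d(\lambda)}
           = \frac{M_a(\lambda)}{I_d(\lambda)}

% For a penumbra reading with 0 < \gamma_b < 1:
P_\gamma(\lambda) = c_b(\lambda)\, I_d(\lambda)\, \gamma_b + M_a(\lambda)\, c_b(\lambda), \qquad
\frac{\mathrm{Dark}(\lambda)}{P_\gamma(\lambda) - \mathrm{Dark}(\lambda)}
           = \frac{M_a(\lambda)}{\gamma_b\, I_d(\lambda)}
           = \frac{S(\lambda)}{\gamma_b}
```

Because $\gamma_b$ is the same scalar in every color band, the penumbra ratio differs from $S$ only by a scale factor, which normalization removes.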
[0082] Figure 6 is a graph showing a representation of a temporal linear token, according to a feature of the present invention. A temporal linear token is a linear token wherein the set of pixels of differing color values related geometrically comprises a single pixel location, for example, location(X, Y) of figures 5a-e, at different times, for example, the time sequence of figures 5a-e. The graph of figure 6 is in an RGB space, and depicts the color values for the location(X, Y) at each of frames 1-5 of figures 5a-e.
[0083] As can be seen in the graph of figure 6, Point(X, Y), Frame 1 corresponds to the location(X, Y) at the time of frame 1, when the location(X, Y) is in full shadow of the leaf, and thus, illuminated solely by the ambient illuminant of the bi-illuminant flux. At Point(X, Y), Frame 2, the location(X, Y) is in the shadow penumbra, and illuminated by the ambient illuminant and one third of the direct or incident illuminant. At Point(X, Y), Frame 3, the location(X, Y) is in the shadow penumbra, and illuminated by the ambient illuminant and approximately two thirds of the direct or incident illuminant. Point(X, Y), Frame 4 corresponds to the location(X, Y) after the shadow has moved beyond the location (X, Y), leaving the location within the scene background, fully lit by the incident illuminant and the ambient illuminant.
[0084] As can be seen in the graph of figure 6, the points corresponding to frames 1-4, comprise a temporal linear token defining a line that represents a BIDR cylinder (the spectral information) that can be utilized to classify pixels of an image sequence. Thus, any pixel that falls in or near the temporal linear token defined by, for example, the points of frames 1-4 (the BIDR cylinder), can be classified as a pixel depicting the background scene under varying illumination. Any pixel that falls outside the BIDR cylinder in the RGB space, would be classified as a pixel value of an object distinct from the background scene. Point(X, Y), Frame 5, corresponding to the location(X, Y) when it is covered by an object, in our example, a ball, that has moved into the field of view, falls outside the BIDR cylinder in the RGB space of figure 6. Thus, the pixel at point(X, Y), Frame 5, is recognized as belonging to a newly introduced object.
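The test of whether a color falls in or near the BIDR cylinder can be sketched as a distance from the point to the cylinder axis in RGB space. The axis endpoints (`dark`, `bright`), the `radius`, the optional `margin` along the axis, and the sample colors are all hypothetical values chosen for illustration.

```python
import math

def axis_projection(p, dark, bright):
    """Return (distance from p to the axis line, position t along the axis),
    where t = 0 at the dark end and t = 1 at the bright end."""
    axis = [b - d for d, b in zip(dark, bright)]
    rel = [pc - d for pc, d in zip(p, dark)]
    t = sum(r * a for r, a in zip(rel, axis)) / sum(a * a for a in axis)
    closest = [d + t * a for d, a in zip(dark, axis)]
    return math.dist(p, closest), t

def in_cylinder(p, dark, bright, radius, margin=0.0):
    """True when p lies within radius of the axis, between its two ends."""
    distance, t = axis_projection(p, dark, bright)
    return distance <= radius and -margin <= t <= 1.0 + margin

# Hypothetical colors for one location: full shadow and fully lit.
dark, bright = (30.0, 30.0, 40.0), (150.0, 150.0, 130.0)
penumbra_ok = in_cylinder((70.0, 70.0, 70.0), dark, bright, radius=10.0)
ball_ok = in_cylinder((200.0, 40.0, 40.0), dark, bright, radius=10.0)
```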
[0085] Referring now to figure 7, there is shown a flow chart for performing an operation to classify pixel types in a frame of a motion sequence with highly variable illumination, using a BIDR cylinder according to a feature of the present invention. In step 600, a sequence of image files 18, corresponding to N frames recorded in a time sequence by the video camera 14, is input to the CPU 12, for example, the images of figures 5a-e. In step 602, a location(X, Y) in the scene depicted in the sequence N is selected and input to the CPU 12. In step 604, the CPU 12 stores all of the pixel values throughout the sequence of N frames, for the selected location(X, Y) as a set C.
[0086] According to a feature of the present invention, in step 606, the CPU 12 performs a mathematical operation, such as, for example, a linear regression technique or a least median of squares regression technique, to find a best fit BIDR cylinder in an RGB space corresponding to the pixel values of the set C. An example of such a mathematical technique is described in "Least Median of Squares Regression" by Peter J. Rousseeuw, Journal of the American Statistical Association, Vol. 79, No. 388 (December, 1984), pp. 871-880.
[0087] In step 608, the CPU 12 determines whether a predetermined amount of the pixel values in the set C are within a predetermined relationship to the BIDR cylinder, the dimensions of the cylinder being related to an axis between a fully lit value for the pixel values of the set C, and a value for the pixels of set C in shadow, and a radius determined as a function of variations in pixel values due to noise and other naturally occurring phenomena. The predetermined amount of pixels can be set at an amount that indicates a BIDR cylinder confirmation, for example, 80% of the pixels of the set C. If the pixel values of the set C correspond to a location that is subject to variable illumination flux, as in our example, a location in the scene over which the shadow of a leaf variously passes, the values will define a BIDR cylinder that can then be used to classify the pixel values at that location.
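Steps 606 and 608 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the fit is an exhaustive variant of least median of squares that tries the line through every pair of samples, the returned axis is given by two sample points rather than by the fully lit and shadowed end values, and the function names, the radius and the sample values are hypothetical.

```python
import math
from itertools import combinations
from statistics import median

def point_line_distance(p, a, b):
    """Perpendicular distance from RGB value p to the infinite line through a and b."""
    axis = [y - x for x, y in zip(a, b)]
    rel = [y - x for x, y in zip(a, p)]
    length_sq = sum(c * c for c in axis)
    t = sum(r * c for r, c in zip(rel, axis)) / length_sq
    closest = [x + t * c for x, c in zip(a, axis)]
    return math.dist(p, closest)

def fit_axis_lmeds(values):
    """Keep the candidate line with the smallest median squared residual."""
    best = None
    for a, b in combinations(values, 2):
        if a == b:
            continue  # identical samples do not define a line
        score = median(point_line_distance(p, a, b) ** 2 for p in values)
        if best is None or score < best[0]:
            best = (score, a, b)
    if best is None:
        raise ValueError("need at least two distinct pixel values")
    return best[1], best[2]

def confirms_cylinder(values, radius, required_fraction=0.8):
    """Step 608: do enough values of the set C lie within `radius` of the fitted axis?"""
    a, b = fit_axis_lmeds(values)
    inliers = sum(point_line_distance(p, a, b) <= radius for p in values)
    return inliers / len(values) >= required_fraction, (a, b)

# Hypothetical set C: nine values along a shadow-to-lit line plus one object value.
dark, bright = (20.0, 25.0, 40.0), (200.0, 180.0, 160.0)
set_c = [tuple(d + (k / 8) * (b - d) for d, b in zip(dark, bright)) for k in range(9)]
set_c.append((200.0, 30.0, 30.0))
confirmed, axis = confirms_cylinder(set_c, radius=5.0)
print(confirmed)
```

The exhaustive search costs on the order of N cubed distance evaluations for N frames, which is acceptable for a short sequence; a randomized sampling of pairs would be the usual substitute for long sequences.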
[0088] If yes, the CPU 12 continues to step 610 to store the BIDR cylinder representing the set C. In step 612, the CPU 12 examines a selected pixel value from the set C, corresponding to a single frame T of the sequence N of frames. In a decision block (step 614), the CPU 12 determines whether the selected pixel value for the frame T is within the stored BIDR cylinder. If yes, the pixel value is classified as representing the background under varying illumination (step 616); if no, the pixel value is classified as an object distinct from the background (step 618). In either instance, the CPU 12 proceeds to step 620 to store the classification for the selected pixel value.
[0089] In a decision block (step 622), the CPU 12 determines whether there are more pixel values in the set C for examination and classification. If yes, the CPU 12 returns to step 612; if no, the CPU 12 continues to a decision block (step 624). In the decision block of step 624, the CPU 12 determines whether there are any more locations(X, Y) in the scene depicted in the sequence of images N. If no, the CPU 12 exits the routine of figure 7 (step 626). If yes, the CPU 12 returns to step 602.
[0090] Referring once again to step 608, if most of the pixel values in the set C do not define a BIDR cylinder, the location(X, Y) cannot be characterized using a temporal linear token (step 628), and the CPU 12 returns to step 624. The CPU 12 can store all of the pixel locations that do not define a BIDR cylinder, and upon completion of the routine of figure 7, return to the routine of figure 4a, for execution in respect of the stored pixel locations. When performing the routine of figure 4a for the pixel locations that do not define a BIDR cylinder, step 108 is changed to input a set of pixels corresponding to the stored locations for the N images of the sequence, and step 126 exits the routine when there are no more pixels to examine.
[0091] Enhancements, such as those described above in respect of figure 4a, can be implemented to improve the accuracy of pixel classification. Moreover, temporal linear tokens defining BIDR cylinders can be utilized to determine spectral ratios for the image. Each temporal linear token includes a brightest pixel and a darkest pixel. Those pixels can be used to calculate a spectral ratio for the corresponding token. The temporal linear tokens with the greatest ratios of intensity provide a measurement of the characteristic spectral ratio for the scene.
[0092] Figures 8a,b depict a sample image sequence in time showing two objects moving across a scene from full illumination to shadow under bi-illuminant conditions. In figure 8a, there are shown two objects, disk 1 and disk 2, each comprising two materials. The objects are shown while in a section of the background image that is illuminated by both an incident illuminant and an ambient illuminant. In figure 8b, the disks have moved into a section of the background image that is in full shadow, and thus illuminated solely by the ambient illuminant.
[0093] Figure 9a is a flow chart for using second order tokens as an aid in motion object tracking, for example, the two disks of figures 8a,b, according to a feature of the present invention. By comparing the spectral ratios of the second order tokens found on the disks as they move from light to dark, it can be determined with a high degree of certainty which disk has moved where. The routine of figure 4a, for example, can be executed to identify objects; the routine of figure 9a is then used to track movement of the identified objects.
[0094] In step 700, an input to the CPU 12 comprises a motion image sequence of N frames, for example, the frames depicted in figures 8a,b. In step 702, the CPU 12 calculates motion pixels from all frames. The calculation is completed by classifying pixels of the images utilizing, for example, the routine of one of figures 4a or 7. In step 704, the CPU 12 stores all of the motion pixels, i.e., in our example, all pixels associated with disk 1 and disk 2.
[0095] Upon storing of the motion pixels, the CPU 12 executes two sub-routines. In a first sub-routine, the CPU 12 calculates second order tokens from the stored motion pixels (step 706). The second order tokens will be used to analyze objects in the sequence of frames to determine the positions of each object (disk 1 and disk 2) throughout the sequence of frames.
[0096] An Nth order token is a set of N first order uniform tokens that are different colors, as measured in a selected color space, for example, RGB, hue or chromaticity, and are near to one another in the image. As an example, disk 1 and disk 2 of figures 8a,b, each comprise two distinct materials. Each material of each disk will form a uniform token, and the adjacent materials of each disk could form a second-order token, two close uniform tokens.
[0097] Figure 9b shows a flow chart of a method for creating an Nth order token. The CPU 12 is given a list of tokens, for example, as identified through execution of the routine of figure 4b for each frame of the sequence, an input image area A and a maximum distance Dmax, which could be set at 10 pixels (step 800). In step 802, the CPU 12, for each first order token within the image area A, selects an image location or pixel X = p(i, j) and then finds all unique sets of N tokens, that is, for example, all tokens of different color, within Dmax of each location X. In step 804, the CPU 12 adds each set of N tokens found in step 802, into an Nth order token set, checking for duplicates. In step 806, the CPU 12 outputs the Nth order token sets for the image area. The CPU 12 repeats the routine for each frame of the sequence for a complete list of Nth order tokens.
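The routine of figure 9b can be sketched as below for a single frame. The sketch assumes each first order token is given as a mean color and a list of pixel coordinates, that the supplied tokens already lie within the image area A, and that "different color" means a Euclidean color distance of at least min_color_diff; the data layout, the function name and that threshold are assumptions made for illustration.

```python
import math
from itertools import combinations

def nth_order_tokens(tokens, n, d_max=10.0, min_color_diff=30.0):
    """Return the unique sets of n differently colored tokens that lie close together.

    tokens maps a token id to {"color": (r, g, b), "pixels": [(i, j), ...]}.
    """
    found = set()
    for tok in tokens.values():
        for x in tok["pixels"]:                         # step 802: each location X
            near = sorted(
                other_id
                for other_id, other in tokens.items()
                if any(math.dist(x, q) <= d_max for q in other["pixels"])
            )
            for combo in combinations(near, n):
                colors = [tokens[t]["color"] for t in combo]
                if all(math.dist(c1, c2) >= min_color_diff
                       for c1, c2 in combinations(colors, 2)):
                    found.add(combo)                    # step 804: the set removes duplicates
    return sorted(found)                                # step 806

# Hypothetical frame with two disks, each made of two materials.
frame_tokens = {
    "1a": {"color": (200, 0, 0), "pixels": [(0, 0), (0, 1)]},
    "1b": {"color": (0, 0, 200), "pixels": [(0, 3)]},
    "2a": {"color": (0, 200, 0), "pixels": [(100, 100)]},
    "2b": {"color": (200, 200, 0), "pixels": [(100, 104)]},
}
print(nth_order_tokens(frame_tokens, n=2))
```

Sorting the nearby token ids before forming combinations gives every set a canonical order, so the same pair discovered from different pixels is stored only once.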
[0098] Returning to figure 9a, the CPU 12 stores the second order token list for each frame of the sequence (step 708). In the second sub-routine, the CPU 12 separates all of the stored motion pixels into object regions for tracking (step 710). In our example, the regions comprise the pixels defining each of disk 1 and disk 2. The CPU 12 stores the region list (step 712) and then selects an object region from the list, Ob(1), from frame 1 (for example the image of figure 8a) of the sequence (step 714). In step 716, the CPU 12 selects another frame, M (for example the image of figure 8b), from the sequence. In step 718, the CPU 12 selects an object region Ob(M), from the frame M.
[0099] In step 720, the CPU 12 finds second order tokens T(1) and T(M) from the token list, that correspond to the object regions Ob(1) and Ob(M), for example, executing a pixel location comparison. In step 722, the CPU 12 calculates normalized spectral ratios, Sn(1) and Sn(2), respectively, and reflectance ratios, R(1) and R(2), respectively, between corresponding tokens of each of the second order tokens, T(1) and T(M). In performance of the calculations, the CPU 12 first determines the second order token having the brighter pixels. The mean average RGB values for the pixels of each of the two tokens of the brighter second order token are denoted as B1 and B2, respectively, while the average RGB values for the pixels of each of the two tokens of the darker second order token are denoted as D1 and D2, respectively.
[0100] A calculation of the reflectance ratios between the two pairs of tokens comprising the second order tokens, T(1) and T(M), is completed according to the formulas:
R(1) = (B1 - D1)/(B1 + D1)
R(2) = (B2 - D2)/(B2 + D2).
[0101] A calculation of the normalized spectral ratios between the two pairs of tokens comprising the second order tokens, T(1) and T(M), is completed according to the formulas:
Sn(1) = Norm(D1/(B1 - D1))
Sn(2) = Norm(D2/(B2 - D2)).
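The two formulas can be worked through numerically with a short sketch. It assumes the ratios are computed per color channel on mean RGB values, and that Norm scales the spectral ratio vector to unit length; that reading of Norm, the function names and the numeric values are assumptions made for illustration.

```python
import math

def reflectance_ratio(bright, dark):
    """Per-channel (B - D)/(B + D) for mean RGB values B (brighter) and D (darker)."""
    return tuple((b - d) / (b + d) for b, d in zip(bright, dark))

def spectral_ratio(bright, dark):
    """Per-channel D/(B - D); B must exceed D in every channel."""
    if any(b <= d for b, d in zip(bright, dark)):
        raise ValueError("bright value must exceed dark value in every channel")
    return tuple(d / (b - d) for b, d in zip(bright, dark))

def normalize(vector):
    """Scale a vector to unit length (the assumed meaning of Norm)."""
    length = math.sqrt(sum(c * c for c in vector))
    if length == 0:
        raise ValueError("cannot normalize a zero vector")
    return tuple(c / length for c in vector)

# Hypothetical mean RGB values for one pair of corresponding tokens.
B1 = (200.0, 180.0, 160.0)  # token of the brighter second order token
D1 = (50.0, 60.0, 80.0)     # corresponding token of the darker second order token

R1 = reflectance_ratio(B1, D1)
Sn1 = normalize(spectral_ratio(B1, D1))
print(R1)
print(Sn1)
```

For these values the raw spectral ratio is (1/3, 1/2, 1), which is proportional to (2, 3, 6), so the unit-length result is (2/7, 3/7, 6/7).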
[0102] In step 724, the CPU 12 determines whether the mean average RGB values of T(1) and T(M) are equal. If yes, the CPU proceeds to step 726 and classifies the object regions Ob(1) and Ob(M) as likely to be the same object. If no, the CPU 12 proceeds to step 728 to determine whether Sn(1) = Sn(2) and R(1) = R(2). If yes, the CPU 12 proceeds to step 726, discussed above. If no, the CPU 12 proceeds to step 730 and classifies the objects as not the same. In an alternative embodiment of the present invention, step 724 can be performed before calculation of the spectral and reflectance ratios, and those calculations are made only if the mean average values are not equal.
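The decision of steps 724 through 730 can be sketched as follows. Because measured values are rarely exactly equal, the sketch replaces the equality tests with tolerances; those tolerances, the function names, the treatment of undefined ratios as a non-match, and the sample values are all assumptions. Each second order token is passed as a pair of mean RGB values whose tokens are listed in corresponding order.

```python
import math

def _close(u, v, tol):
    return all(abs(a - b) <= tol for a, b in zip(u, v))

def _unit(v):
    n = math.sqrt(sum(c * c for c in v))
    return v if n == 0 else tuple(c / n for c in v)

def likely_same_object(t_first, t_other, rgb_tol=2.0, ratio_tol=0.02):
    """t_first and t_other are (token_a_rgb, token_b_rgb) pairs of mean RGB values."""
    # Step 724: the mean RGB values already agree.
    if all(_close(a, b, rgb_tol) for a, b in zip(t_first, t_other)):
        return True
    # Order the two second order tokens by overall brightness.
    def brightness(pair):
        return sum(sum(rgb) for rgb in pair)
    bright, dark = ((t_first, t_other) if brightness(t_first) >= brightness(t_other)
                    else (t_other, t_first))
    refl, spec = [], []
    for b, d in zip(bright, dark):
        if any(bc <= dc for bc, dc in zip(b, d)):
            return False  # ratios undefined; treated as a non-match (assumption)
        refl.append(tuple((bc - dc) / (bc + dc) for bc, dc in zip(b, d)))
        spec.append(_unit(tuple(dc / (bc - dc) for bc, dc in zip(b, d))))
    # Step 728: Sn(1) = Sn(2) and R(1) = R(2), within tolerance.
    return _close(refl[0], refl[1], ratio_tol) and _close(spec[0], spec[1], ratio_tol)

# Hypothetical disk seen fully lit, then the same disk in shadow, then a different disk in shadow.
LIT = ((192.0, 50.0, 50.0), (48.0, 50.0, 200.0))
SAME_IN_SHADOW = ((32.0, 10.0, 14.0), (8.0, 10.0, 56.0))
OTHER_IN_SHADOW = ((20.0, 25.0, 7.0), (4.0, 45.0, 21.0))
print(likely_same_object(LIT, SAME_IN_SHADOW))
print(likely_same_object(LIT, OTHER_IN_SHADOW))
```

The example values were built so that both materials of the first disk are darkened by the same per-channel illumination change, which is why the two reflectance ratios and the two normalized spectral ratios agree for that disk and disagree for the other.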
[0103] From either step 726 or step 730, the CPU 12 stores the classification and proceeds to a decision block (step 732). At the decision block of step 732, the CPU 12 determines whether there are any more object regions in frame M for examination and classification. If yes, the CPU returns to step 718 to select another object region Ob(M). If no, the CPU 12 proceeds to a decision block (step 734). In the decision block of step 734, the CPU 12 determines whether there are any more frames in the sequence N to examine. If yes, the CPU 12 returns to step 716 to select another frame M. If no, the CPU 12 proceeds to a decision block (step 736). In the decision block of step 736, the CPU 12 determines whether there are any more objects to be examined in frame 1. If yes, the CPU 12 returns to step 714 to select another object, Ob(1). If no, the routine is complete, and the CPU 12 exits the routine (step 738).
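The nesting of the decision blocks of steps 732, 734 and 736 amounts to three loops, which can be sketched as below. The function name, the data layout and the two callbacks (one returning the second order token for a region, one comparing two tokens) are hypothetical stand-ins for steps 720 through 730.

```python
def track_objects(regions_by_frame, token_for, same_object):
    """Compare every object region of frame 1 with every region of each later frame.

    regions_by_frame[0] lists the region ids of frame 1, regions_by_frame[1]
    those of frame 2, and so on. Returns {(ob_first, frame_m, ob_m): bool}.
    """
    classifications = {}
    first_frame, later_frames = regions_by_frame[0], regions_by_frame[1:]
    for ob_first in first_frame:                               # object regions of frame 1 (step 736 loop)
        for m, regions in enumerate(later_frames, start=2):    # frames M (step 734 loop)
            for ob_m in regions:                               # object regions of frame M (step 732 loop)
                t_first = token_for(1, ob_first)               # step 720
                t_m = token_for(m, ob_m)
                # Steps 722 to 730, delegated to the supplied comparison.
                classifications[(ob_first, m, ob_m)] = same_object(t_first, t_m)
    return classifications

# Hypothetical stand-ins: a token is reduced to a label, compared by equality.
LABELS = {"a": "red/blue", "b": "green/yellow", "x": "green/yellow", "y": "red/blue"}
result = track_objects(
    [["a", "b"], ["x", "y"]],
    token_for=lambda frame, region: LABELS[region],
    same_object=lambda s, t: s == t,
)
print(result)
```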
[0104] Figures 10a-d depict a sample image sequence in time, showing four objects moving randomly across a scene under bi-illuminant conditions. Each object is composed of a different material, and casts a shadow on the background of the image. A situation such as shown in figures 10a-d can be analyzed using an enhanced Gaussian mixture model. In figure 10a, a location(X, Y), at a first time frame, is a pixel in the static background of the scene illuminated by an incident illuminant and an ambient illuminant, and corresponds to a most common state for a pixel. In figure 10b, the location(X, Y), at a second time frame, is in a shadow, illuminated by the ambient illuminant. This corresponds to a second most common state. In figure 10c, the location(X, Y) is covered by one of the objects, an unlikely state for the pixel. Finally, in figure 10d, the location(X, Y) is covered by another of the objects, also an unlikely state for the pixel.
[0105] Figure 11 is a graph showing a representation of a Gaussian mixture model in a linear RGB space, including values corresponding to the location(X, Y), in each of the frames depicted in figures 10a-d. In a Gaussian mixture model, each pixel of the image sequence will be in a set of states, and the model is used to determine a most common state and a second most common state for each pixel location, as will appear. As noted above, the most common state for the pixel at location(X, Y) will correspond to the static background, while illuminated by an incident illuminant and an ambient illuminant (figure 10a). The static background is typically obscured briefly by moving objects, and is otherwise constant, making this state dominant. The second most common state for the pixel is depicted in figure 10b, a shadow illuminated by the ambient illuminant. The second most common state is the shadowed background image. All moving objects cast a similar shadow, so this state appears as the next most common when objects are moving around a scene. The least common states correspond to object surfaces, moving through the static background. Moving objects can be made of many different materials, all of which appear briefly at a particular location, so none of the particular materials has a chance to become a common state. In figure 11, a line is drawn between the two most common states (from location(X, Y) in figure 10a, to location(X, Y) in figure 10b). The line is used to define a BIDR cylinder for the sequence of frames N depicted in figures 10a-d. As can be seen in figure 11, the RGB values for the location(X, Y), when covered by one of the objects, are outside the BIDR cylinder.
[0106] Figure 12 is a flow chart for classifying pixels in a motion image sequence via an enhanced Gaussian mixture model, according to a feature of the present invention. In step 900, an input to the CPU 12 comprises a motion image sequence of N frames, for example, the frames depicted in figures 10a-d. In a first sub-routine, in step 902, the CPU 12 calculates Gaussian probabilities at each pixel location among the sequence of images, N. The CPU 12 then stores the calculated probabilities (step 904). For a description of the Gaussian mixture model, reference can be made to C. Stauffer and W. Grimson "Adaptive background mixture models for real time tracking," Proceedings of the IEEE International Conference on CVPR, Vol. 2, pp. 246-252, June, 1999.
[0107] In step 906, the CPU 12 calculates and stores a BIDR cylinder C for each pixel location, setting the ends of the cylinder to the RGB values corresponding to the first and second most probable pixel states for that location. As should be understood, the most common state for a pixel corresponds to the static background, while illuminated by an incident illuminant and an ambient illuminant and the second most common state for the pixel corresponds to a shadow illuminated by the ambient illuminant. The BIDR cylinder can be validated by applying constraints to the BIDR cylinder derived from the two most probable pixel states, by, for example, requiring certain reasonable constraints for the illuminants. The stored BIDR cylinders are made available at a decision block (step 908).
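Step 906 can be illustrated with a deliberately simplified sketch. It does not implement a Gaussian mixture model: as a stand-in, it groups the values observed at one location into coarse RGB bins and takes the means of the two most populated bins as the two most common states. The bin size, the function names, the example validation constraint (the shadowed state being darker than the lit state in every channel) and the sample values are all assumptions.

```python
from collections import defaultdict

def two_most_common_states(values, bin_size=16):
    """Histogram stand-in for the mixture model: means of the two fullest RGB bins."""
    bins = defaultdict(list)
    for v in values:
        key = tuple(int(c) // bin_size for c in v)
        bins[key].append(v)
    ranked = sorted(bins.values(), key=len, reverse=True)
    if len(ranked) < 2:
        raise ValueError("fewer than two distinct states were observed")

    def mean(group):
        return tuple(sum(p[i] for p in group) / len(group) for i in range(3))

    return mean(ranked[0]), mean(ranked[1])

def plausible_cylinder(lit_state, shadow_state):
    """One example constraint: the shadowed state is darker in every channel."""
    return all(s < l for s, l in zip(shadow_state, lit_state))

# Hypothetical values observed at one location over ten frames.
observed = (
    [(200, 180, 165)] * 3 + [(202, 182, 167)] * 3   # lit static background
    + [(50, 60, 85)] * 3                            # background in shadow
    + [(220, 30, 30)]                               # a passing object
)
lit, shadow = two_most_common_states(observed)
print(lit, shadow, plausible_cylinder(lit, shadow))
```

The two returned states would serve as the ends of the BIDR cylinder for that location; a full implementation would take them from the mixture components with the highest weights instead.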
[0108] In a second sub-routine, the CPU 12 selects a frame T from the images of the sequence N (step 910). In step 912, the CPU 12 selects a pixel P(X, Y) from the image T, and proceeds to a decision block (step 914). In the decision block of step 914, the CPU 12 determines whether the pixel P(X, Y) is close in value, within some measure of noise, to the value of probability state 1 (the most common state). If yes, the Gaussian probability indicates that the pixel P(X, Y) is to be classified as part of the static background of the scene (step 916). The CPU 12 then proceeds to a decision block (step 918). In the decision block of step 918, the CPU 12 determines whether there are any more pixels in the frame T for analysis and classification. If yes, the CPU returns to step 912. If no, the CPU 12 proceeds to a decision block (step 920). In the decision block of step 920, the CPU 12 determines whether there are any more frames in the sequence N for analysis. If yes, the CPU 12 returns to step 910. If no, the routine is ended (step 922).
[0109] If the determination of step 914 is no, the CPU 12 proceeds to a decision block (step 924). In the decision block of step 924, the CPU 12 determines whether the pixel P(X, Y) is close in value, within some measure of noise, to the value of probability state 2 (the second most common state). If yes, the Gaussian probability indicates that the pixel P(X, Y) is to be classified as within a shadow of the static background (step 926). The CPU 12 then proceeds to the decision block of step 918. If no, the CPU 12 proceeds to the decision block of step 908. In the decision block of step 908, the CPU 12 determines whether the pixel P(X, Y) lies within the BIDR cylinder for the corresponding location of the subject pixel P(X, Y). If yes, the Gaussian probability indicates that the pixel P(X, Y) is to be classified as within a shadow penumbra of the static background (step 928). If no, the Gaussian probability indicates that the pixel P(X, Y) is to be classified as part of a transient object distinct from the background (step 930). From either step 928 or step 930, the CPU 12 proceeds to the decision block of step 918.
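The cascade of decision blocks 914, 924 and 908 can be sketched for a single pixel as follows. The sketch assumes "close in value, within some measure of noise" means a Euclidean RGB distance no greater than a fixed threshold, and that the cylinder runs between the two most common states with a fixed radius; the function name, thresholds and sample values are hypothetical.

```python
import math

def classify_pixel(p, lit_state, shadow_state, noise=8.0, radius=8.0):
    """Return 'background', 'shadow', 'penumbra' or 'object' for RGB value p."""
    if math.dist(p, lit_state) <= noise:        # step 914 -> step 916
        return "background"
    if math.dist(p, shadow_state) <= noise:     # step 924 -> step 926
        return "shadow"
    # Step 908: is p inside the cylinder running from the shadowed state to the lit state?
    axis = [l - s for s, l in zip(shadow_state, lit_state)]
    rel = [x - s for s, x in zip(shadow_state, p)]
    length_sq = sum(c * c for c in axis)
    if length_sq > 0:
        t = sum(r * c for r, c in zip(rel, axis)) / length_sq
        closest = [s + t * c for s, c in zip(shadow_state, axis)]
        if 0.0 <= t <= 1.0 and math.dist(p, closest) <= radius:
            return "penumbra"                   # step 928
    return "object"                             # step 930

# Hypothetical states for one location.
LIT_STATE = (200.0, 180.0, 160.0)
SHADOW_STATE = (50.0, 60.0, 80.0)
for value in [(201.0, 179.0, 161.0), (52.0, 61.0, 79.0),
              (125.0, 120.0, 120.0), (220.0, 30.0, 30.0)]:
    print(value, classify_pixel(value, LIT_STATE, SHADOW_STATE))
```

Testing the two common states first means the cylinder test is reached only for the minority of values that match neither state, which mirrors the ordering of the decision blocks in figure 12.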
[0110] In the preceding specification, the invention has been described with reference to specific exemplary embodiments and examples thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the claims that follow. The specification and drawings are accordingly to be regarded in an illustrative manner rather than a restrictive sense.

Claims

What is claimed is:
1. An automated, computerized method for classifying pixel values in a motion sequence of images, comprising the steps of: determining spectral information relevant to the sequence of images; and utilizing the spectral information to classify a pixel as one of background, shadow and object.
2. The method of claim 1, wherein the spectral information comprises a characteristic spectral ratio for the background.
3. The method of claim 2, wherein the step of utilizing the spectral information to classify a pixel as one of background, shadow and object is carried out by: determining a characteristic spectral ratio for a static background of the sequence of images; calculating, for a subject pixel of an image in the sequence, a spectral ratio between the subject pixel and a corresponding pixel of the static background; and comparing the spectral ratio for the subject pixel to the characteristic spectral ratio.
4. The method of claim 3, wherein the spectral ratio for the subject pixel is a normalized spectral ratio.
5. The method of claim 1, wherein the spectral information comprises a BIDR cylinder representation of pixel values, in a color space.
6. The method of claim 5, wherein the step of utilizing the spectral information to classify a pixel as one of background, shadow and object is carried out by: generating a BIDR cylinder representative of values of a subject pixel location throughout the sequence of images; and comparing a value of the subject pixel location in one image, to the BIDR cylinder.
7. The method of claim 7, wherein the color space is an RGB space.
8. The method of claim 6, wherein the BIDR cylinder representative of values of a subject pixel location throughout the sequence of images, comprises a temporal linear token of pixel values at the location.
9. The method of claim 6, wherein the step of generating a BIDR cylinder representative of values of a subject pixel location throughout the sequence of images is carried out by calculating state probabilities for pixel values at the subject pixel location, and generating the BIDR cylinder representation as a function of the state probabilities.
10. The method of claim 9, wherein the step of calculating state probabilities for pixel values at the subject pixel location is carried out by utilizing a Gaussian mixture model.
11. An automated, computerized method for tracking an object in a motion sequence of images, comprising the steps of: determining motion pixels from the motion sequence of images; calculating Nth order tokens as a function of the motion pixels; and identifying motion of an object as a function of the Nth order tokens.
12. The method of claim 11, wherein the step of identifying motion of an object as a function of the Nth order tokens is carried out as a function of comparison of spectral ratios and reflectance ratios between Nth order tokens of separate instances of motion pixels, in separate images of the motion sequence of images.
13. The method of claim 11, wherein the Nth order tokens comprise second order tokens.
14. A computer system comprising: a CPU; and a memory storing a sequence of images; the CPU arranged and configured to execute a routine to determine spectral information relevant to the sequence of images, and utilize the spectral information to classify a pixel as one of background, shadow and object.
15. The computer system of claim 14, wherein the spectral information comprises a characteristic spectral ratio for the background.
16. The computer system of claim 14, wherein the spectral information comprises a BIDR cylinder representation of pixel values, in a color space.
17. A computer system comprising: a CPU; and a memory storing a motion sequence of images; the CPU arranged and configured to execute a routine to determine motion pixels from the motion sequence of images, calculate second order tokens as a function of the motion pixels and identify motion of an object as a function of the second order tokens.
PCT/US2007/017156 2006-08-03 2007-08-01 Methods for discriminating moving objects in motion image sequences WO2008019020A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP07810967A EP2047405A4 (en) 2006-08-03 2007-08-01 Methods for discriminating moving objects in motion image sequences

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11/498,621 US7639874B2 (en) 2006-08-03 2006-08-03 Methods for discriminating moving objects in motion image sequences
US11/498,621 2006-08-03

Publications (2)

Publication Number Publication Date
WO2008019020A2 true WO2008019020A2 (en) 2008-02-14
WO2008019020A3 WO2008019020A3 (en) 2008-12-18

Family

ID=39029235

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2007/017156 WO2008019020A2 (en) 2006-08-03 2007-08-01 Methods for discriminating moving objects in motion image sequences

Country Status (3)

Country Link
US (2) US7639874B2 (en)
EP (1) EP2047405A4 (en)
WO (1) WO2008019020A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2171645A4 (en) * 2007-08-01 2011-12-07 Tandent Vision Science Inc System and method for identifying complex tokens in an image

Also Published As

Publication number Publication date
EP2047405A2 (en) 2009-04-15
US7760942B2 (en) 2010-07-20
US7639874B2 (en) 2009-12-29
WO2008019020A3 (en) 2008-12-18
US20090175502A1 (en) 2009-07-09
EP2047405A4 (en) 2012-02-08
US20080031526A1 (en) 2008-02-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07810967

Country of ref document: EP

Kind code of ref document: A2

REEP Request for entry into the european phase

Ref document number: 2007810967

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2007810967

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

NENP Non-entry into the national phase

Ref country code: RU