US 20040091047 A1 Abstract A method and apparatus for nonlinear multiple motion model and moving boundary extraction are disclosed. In one embodiment, an input image is received, the input image is partitioned into regions/blocks, and the new multiple motion model is applied to each region to extract the motions and associated moving boundaries.
Claims(60) 1. A method for motion estimation comprising:
receiving an input image; receiving a set of reference images; partitioning said input image into regions; and for each region:
extracting one or more motions; and
extracting one or more associated moving boundaries.
2. The method of
3. The method of
4. The method of
5. The method of
6. The method of
7. The method of
8. The method of $t'$, is coupled to pixel positions, denoted $(x,y)$, and is used to control and refine an estimate of said one or more associated moving boundaries, and has a form: $t' = F(s)$, where $s$ is a function of the pixel position (i.e., $s = B(x,y)$) within one of said regions.
9. The method of
10. The method of
11. The method of $t' = F(s)$, $s = B(x,y)$, $x' = v_1^x(x,y) + v_2^x(x,y)\,(t'+1)$, $y' = v_1^y(x,y) + v_2^y(x,y)\,(t'+1)$, where $v_i^{x,y}$ is a motion vector map for an i motion, and where $F$, $B$ are a time referencing and boundary model, respectively.
12. The method of $s = B(x,y) = gx + hy + \alpha x^2 + \beta y^2 + i$, where $\{g,h,\alpha,\beta,i\}$ are parameters of the boundary model.
13. The method of
14. The method of $\vec{v}_i(x,y) = (a_i x + b_i y + c_i,\; d_i x + e_i y + f_i)$, where $\{a_i,b_i,c_i,d_i,e_i,f_i\}$ are parameters of the affine model.
15. The method of where M is the number of motions in said regions, and $\{t_i^{ref}\}$ label time values of said set of reference images used; for example $t_i^{ref} = -1$ (past), $t_i^{ref} = 0$ (future), $t_i^{ref} = -2$ (2 frames deep in past), etc., and equations $s_j = B_j(x,y)$ are boundary models for $j = 1, 2, \ldots, M-1$ boundaries.
16. The method of $s_j = B_j(x,y) = g_j x + h_j y + \alpha_j x^2 + \beta_j y^2 + i_j$, where $\{g_j,h_j,\alpha_j,\beta_j,i_j\}$ are parameters of the j boundary model.
17. The method of
18. The method of $\vec{v}_i(x,y) = (a_i x + b_i y + c_i,\; d_i x + e_i y + f_i)$, where $\{a_i,b_i,c_i,d_i,e_i,f_i\}$ are parameters of the affine model.
19. The method of $t' = F(\{s_j\},\{w_j\},\{t_i^{ref}\})$, where $\{w_j\}$ are a set of width parameters for said one or more associated moving boundaries.
20. The method of where w is a width parameter for the moving boundary.
21. The method of $\{w_j\}$, which control widths of said one or more associated boundaries, itself varies dynamically in said motion model, such that a system may select the best boundary widths $\{w_j\}$ according to a minimum prediction error.
22. The method of $\{w_j\}$ are initially fixed and small during a first estimation of parameters for said motion model, and then successively reduced in re-estimation stages of said parameters for said motion model.
23. The method of $\{\vec{v}_i, B, F\}$ for the said model are determined from a steepest descent algorithm to minimize prediction error, and use multiple resolution layers and projection of said parameters from one layer to a next layer.
24. The method of $\{\vec{v}_i, B_i, F\}$ for the said model are determined from a steepest descent algorithm to minimize prediction error, and use multiple resolution layers and projection of said parameters from one layer to a next layer.
25. The method of $t' = F(s)$, $s = B(x,y)$, $x' = v_1^x(x,y) + v_2^x(x,y)\,(t'+1)$, $y' = v_1^y(x,y) + v_2^y(x,y)\,(t'+1)$, where $\vec{v}_i^{x,y}$ is a motion vector map for an i motion, and where $F$, $B$ are the time referencing and boundary model, respectively.
26. The method of $s = B(x,y) = gx + hy + \alpha x^2 + \beta y^2 + i$, where $\{g,h,\alpha,\beta,i\}$ are parameters of the boundary model.
27. The method of
28. The method of $\vec{v}_i(x,y) = (a_i x + b_i y + c_i,\; d_i x + e_i y + f_i)$, where $\{a_i,b_i,c_i,d_i,e_i,f_i\}$ are parameters of the affine model.
29. The method of where M is the number of motions in said regions, and $\{t_i^{ref}\}$ label time values of said set of reference images used; for example $t_i^{ref} = -1$ (past), $t_i^{ref} = 0$ (future), $t_i^{ref} = -2$ (2 frames deep in past), etc., and equations $s_j = B_j(x,y)$ are boundary models for $j = 1, 2, \ldots, M-1$ boundaries, and $F(\{s_j\})$ is the nonlinear time-referencing equation.
30. The method of $s_j = B_j(x,y) = g_j x + h_j y + \alpha_j x^2 + \beta_j y^2 + i_j$, where $\{g_j,h_j,\alpha_j,\beta_j,i_j\}$ are parameters of the j boundary model.
31. The method of
32. The method of $\vec{v}_i(x,y) = (a_i x + b_i y + c_i,\; d_i x + e_i y + f_i)$, where $\{a_i,b_i,c_i,d_i,e_i,f_i\}$ are parameters of the affine model.
33. The method of $t' = F(\{s_j\},\{w_j\},\{t_i^{ref}\})$, where $\{w_j\}$ are a set of width parameters for said one or more associated moving boundaries.
35. The method of $\{w_j\}$, which control widths of said one or more associated moving boundaries, itself varies dynamically in said motion model, such that a system may select a best boundary width w according to a minimum prediction error.
36. The method of $\{w_j\}$ are initially fixed and small during a first estimation of parameters for said motion model, and then successively reduced in re-estimation stages of said parameters for said motion model.
37. The method of $\{\vec{v}_i, B, F\}$ for the said model are determined from a steepest descent algorithm to minimize prediction error, and use multiple resolution layers and projection of said parameters from one layer to a next layer.
38. The method of $\{\vec{v}_j, B_j, F\}$ for the said model are determined from a steepest descent algorithm to minimize prediction error, and use multiple resolution layers and projection of said parameters from one layer to a next layer.
39. The method of
40. The method of
41. A processing system comprising a processor coupled to a memory, which when executing a set of instructions performs the method of
42. A machine-readable medium having stored thereon instructions, which when executed performs the method of
43. An apparatus for motion estimation comprising:
means for receiving an input image; means for receiving a set of reference images; means for partitioning said input image into regions; and for each region:
means for extracting one or more motions; and
means for extracting one or more associated moving boundaries.
44. The apparatus of
45. The apparatus of
46. The method of $t' = F(s)$, $s = B(x,y)$, $x' = v_1^x(x,y) + v_2^x(x,y)\,(t'+1)$, $y' = v_1^y(x,y) + v_2^y(x,y)\,(t'+1)$, where $v_i^{x,y}$ is a motion vector map for an i motion, and where $F$, $B$ are a time referencing and boundary model, respectively.
47. The method of where M is the number of motions in said regions, and $\{t_i^{ref}\}$ label time values of said set of reference images used; for example $t_i^{ref} = -1$ (past), $t_i^{ref} = 0$ (future), $t_i^{ref} = -2$ (2 frames deep in past), etc., and equations $s_j = B_j(x,y)$ are boundary models for $j = 1, 2, \ldots, M-1$ boundaries, and $F(\{s_j\})$ is a nonlinear time-referencing equation.
48. A machine-readable medium having stored thereon information representing the apparatus of
49. An apparatus for improvement of a standard motion segmentation comprising:
means for receiving an input image; means for receiving a set of reference images; means for partitioning said input image into regions; and for each region:
means for extracting one or more motions; and
means for extracting one or more associated moving boundaries.
50. The apparatus of
51. The apparatus of
52. The method of $t' = F(s)$, $s = B(x,y)$, $x' = v_1^x(x,y) + v_2^x(x,y)\,(t'+1)$, $y' = v_1^y(x,y) + v_2^y(x,y)\,(t'+1)$, where $v_i^{x,y}$ is a motion vector map for an i motion, and where $F$, $B$ are a time referencing and boundary model, respectively.
53. The method of where M is the number of motions in said regions, and $\{t_i^{ref}\}$ label time values of said set of reference images used; for example $t_i^{ref} = -1$ (past), $t_i^{ref} = 0$ (future), $t_i^{ref} = -2$ (2 frames deep in past), etc., and equations $s_j = B_j(x,y)$ are boundary models for $j = 1, 2, \ldots, M-1$ boundaries, and $F(\{s_j\})$ is a nonlinear time-referencing equation.
54. A machine-readable medium having stored thereon information representing the apparatus of
55. An apparatus for preprocessing an image for a non-rigid boundary tracker comprising:
means for receiving an input image; means for receiving a set of reference images; means for partitioning said input image into regions; and for each region:
means for extracting one or more motions; and
means for extracting one or more associated moving boundaries.
56. The apparatus of
57. The apparatus of
58. The method of $t' = F(s)$, $s = B(x,y)$, $x' = v_1^x(x,y) + v_2^x(x,y)\,(t'+1)$, $y' = v_1^y(x,y) + v_2^y(x,y)\,(t'+1)$, where $v_i^{x,y}$ is a motion vector map for an i motion, and where $F$, $B$ are a time referencing and boundary model, respectively.
59. The method of where M is the number of motions in said regions, and $\{t_i^{ref}\}$ label time values of said set of reference images used; for example $t_i^{ref} = -1$ (past), $t_i^{ref} = 0$ (future), $t_i^{ref} = -2$ (2 frames deep in past), etc., and equations $s_j = B_j(x,y)$ are boundary models for $j = 1, 2, \ldots, M-1$ boundaries, and $F(\{s_j\})$ is a nonlinear time-referencing equation.
60. A machine-readable medium having stored thereon information representing the apparatus of
Description
[0001] The present invention pertains to image processing. More particularly, the present invention relates to estimation of object motion in images.
[0002] Standard motion modeling for video coding involves parametric models, applied to a fixed region (motion block), to estimate the motion. These approaches are limited in that the models cannot handle the existence of multiple (different) motions within the motion block. This presents a problem.
[0003] A basic problem in motion estimation is the ability of the model to handle multiple motions and moving object boundaries. Standard motion models, such as the affine or perspective models, allow for smooth deformations of a region (i.e., the motion block) to capture a coherent motion (such as translation, zoom, rotation) for all the pixels in the motion block. The region or block over which the motion is estimated cannot be chosen to be too small, for two reasons: (1) from a coding point of view, larger regions mean smaller motion overhead, and (2) from an estimation point of view, a larger region allows for better estimation of the motion parameters.
[0004] A key problem that arises from this limitation of common motion models is the occurrence of multiple motions within the motion block. A moving object boundary within a motion region is an indication of two possibly very different motions (the motion of the object and the motion of, say, the background). Also, a moving object boundary implies that some pixels will be occluded (hidden) with respect to the past or future motion estimation.
This occlusion effect can bias the motion estimate, lead to higher prediction error, and make it difficult to accurately extract the object boundary.
[0005] Approaches in motion segmentation often rely on optical flow estimates or parametric (i.e., affine) motion models; these will have the usual problems near object boundaries and occlusion effects. Some degree of smoothness in the segmentation field, and hence in object boundaries, can be achieved with a prior probability term in MAP/Bayesian methods. This is more of a constraint on the connectivity of the segmentation field, without any explicit coupled model to account for object boundary and motion fields. A curvature evolution model may be used to capture the boundary of a moving object. However, this approach does not involve motion estimation or a motion field, and relies on a temporal difference operator in the model for the evolution of the object boundary.
[0006] Another approach, in the context of a level set method, implicitly models the contour of the object boundary and multiple affine motion fields; however, motion estimation is with respect to only one reference frame, i.e., the motion of frame n is determined from frame n−1. As discussed above, this has problems. Some pixels close to the object boundary may be occluded; this will in turn bias the estimation of the boundary, since the motion field is not reliable near the boundary due to occlusion.
[0007] Thus, there are problems with the common motion models.
[0008] The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:
[0009] FIG. 1 illustrates a network environment in which techniques of the present invention may be used;
[0010] FIG. 2 is a block diagram of a computer system in which embodiments of the present invention may be implemented;
[0011] FIG. 3 illustrates one embodiment of the invention in flow chart form;
[0012] FIG. 4 illustrates in flow chart form one embodiment of video coding;
[0013] FIG. 5 illustrates one embodiment of motion segmentation into 2 regions;
[0014] FIG. 6 illustrates the behavior of one embodiment of a function that controls the time reference assignment of pixels;
[0015] FIG. 7, FIG. 8, and FIG. 9 are examples illustrating how embodiments of the motion model of the invention, applied to a local block region, achieve separation into past and future motion references, and hence capture the moving boundary;
[0016] FIG. 10 is an example illustrating how an embodiment of the motion model of the invention estimated the location of a moving boundary;
[0017] FIG. 11 is an example illustrating the comparison between a standard motion model and an embodiment of the motion model of the invention;
[0018] FIG. 12 is an example illustrating 3 motions, their movement, and lowest predicted error reference frames; and
[0019] FIG. 13 illustrates the behavior of one embodiment of an interface function which controls the time reference assignment for 3 motions.
[0020] A method and apparatus for nonlinear multiple motion model and moving boundary extraction are described.
[0021] The present invention involves a new motion model for estimation of object motion in video images. In one embodiment of the invention, the new motion model uses nonlinear coupling between space and time variables, a type of region competition to separate multiple motions, and boundary modeling to extract an estimate of the moving object boundary. The model is compact and can be used in motion segmentation and/or video coding applications.
[0022] In another embodiment of the present invention, an extension of motion modeling has been used to account for the problems discussed in the background section.
The basic features of this model are the following:
[0023] 1) a time variable is introduced to allow for combined motion estimation with respect to the past and future frames;
[0024] 2) multiple motions (2 motions or more) are allowed to coexist;
[0025] 3) object boundary extraction (curvature of boundary incorporated) is determined from a type of region competition for boundary selection; and
[0026] 4) a nonlinear function is used to control/refine the estimate of the object boundary.
[0027] The present invention is capable of handling multiple motions (two or more). However, to avoid unnecessarily obscuring the present invention, the discussion initially addresses two motions, with an extension to more than two motions described later in the specification.
[0028] One skilled in the art will appreciate that the use of the time variable allows the introduction of two motions and yet avoids occlusion effects. If some pixels close to a moving object boundary are, for example, hidden in the previous frame, then the motion region (to which these pixels belong) will tend to reference its motion with respect to the future (and vice-versa) to reduce prediction error. This is, in a sense, a type of “region competition”, where the object boundary is obtained as the 2 motion regions compete to reduce their prediction error by selecting either past or future as their reference frame for motion estimation. Therefore, the moving object boundary in our model is determined implicitly from this type of region competition. This is in contrast to models that explicitly introduce a contour model (e.g., active contour models); these methods can have significant problems with discretization of the contour and control of the length/curvature as the contour evolves.
[0029] In one embodiment of the invention, the motion model is applied locally to a region/block in an image, and it may be viewed as part of a refinement stage to motion estimation or motion segmentation.
That is, if after one pass of a motion estimation/segmentation algorithm of an image (say initially using a standard affine motion model) the prediction error in some region is above some quality threshold, then an embodiment of the present invention motion model may be applied to those regions. FIG. 3 illustrates the process in flow chart form [0030] At [0031] In another embodiment of the invention, an extension in the motion model may be used for true non-rigid deformation of object boundary. For example, box [0032] For a video coding application, simple segmentation (for low overhead) of a motion block/region to capture multiple motions (to reduce prediction error) may be achieved with quadtree segmentation of blocks, where large prediction error blocks are partitioned in sub-blocks for improved motion estimation. Similarly, blocks with large prediction errors may be quadtree segmented with a straight line model of the boundary/partition. In one embodiment of the invention, the approach is more aligned with the motion segmentation problem itself, which involves the ability to obtain good estimates of the location and local shape of the moving object boundaries. [0033]FIG. 4 illustrates in flow chart form [0034] In FIG. 4, at [0035] In an embodiment of the invention, the time variable is used for representation of 2 motions. In the motion model, simultaneous estimation with respect to past and future is used (i.e., 2 reference frames are used), so that pixels close to the boundary that are occluded in, say the past frame, will choose estimation from the future frame (where they are not occluded), and vice-versa. It is this duality of occlusion that is exploited in the model. [0036] In an embodiment of the invention, a nonlinear aspect is used on the time variable (and hence boundary model) to control and refine the estimation of the boundary interface. 
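The occlusion duality described above can be illustrated with a deliberately simplified sketch. Here each pixel makes a hard choice between a past-based and a future-based prediction, whichever gives the lower squared error. The model in this document reaches the assignment through a smooth time variable rather than a per-pixel hard choice, so this is only an illustration of the idea. The function name `select_reference` and the toy pixel values are hypothetical.

```python
def select_reference(current, pred_past, pred_future):
    """Label each pixel -1 (past reference) or 0 (future reference).

    Illustrative only: a hard per-pixel choice by squared error,
    not the smooth time variable used by the model in the text.
    Ties are resolved in favour of the past reference.
    """
    labels = []
    for cur_row, past_row, fut_row in zip(current, pred_past, pred_future):
        row = []
        for cur, past, fut in zip(cur_row, past_row, fut_row):
            err_past = (cur - past) ** 2
            err_future = (cur - fut) ** 2
            row.append(-1 if err_past <= err_future else 0)
        labels.append(row)
    return labels


# Toy 1x4 block (hypothetical values).
current = [[10, 10, 50, 50]]
pred_past = [[10, 10, 90, 50]]    # third pixel is occluded in the past frame
pred_future = [[90, 10, 50, 50]]  # first pixel is occluded in the future frame

labels = select_reference(current, pred_past, pred_future)
print(labels)
```

The pixel that is badly predicted from the past frame ends up referencing the future frame, and vice versa, which is the behaviour the time variable is meant to exploit.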
[0037] In an embodiment of the invention, the extended motion model may be used locally, and as part of a successive iterative approach, as illustrated in FIG. 3. Regions that are deemed poor (because of high prediction error), say in a first pass of a segmentation process, may be re-estimated with the extended motion model to capture multiple motions and the moving boundaries.
[0038] As mentioned above, the boundary is defined implicitly through the time variable in the motion model, whose functional form allows for the motion domains to be defined by regions of smooth compact support.
[0039] A Standard Model Review
[0040] In order for the reader to more quickly and fully understand embodiments of the present invention, a review of a standard motion model is presented. A standard motion model often used in motion estimation is the affine model, which takes the following form:
$$x' = ax + by + c, \qquad y' = dx + ey + f$$
[0041] where $(x', y')$ is the position in the reference frame to which the pixel at $(x, y)$ is mapped, and $\{a,b,c,d,e,f\}$ are the affine parameters.
[0042] Motion Model
[0043] Embodiments of the invention include a model to account for multiple motions and estimation of moving object boundaries. Past and future motion estimation is used. This involves the use of the time variable $t'$, with the model taking the form
$$t' = F(s), \qquad s = B(x,y), \qquad x' = v_1^x(x,y) + v_2^x(x,y)\,(t'+1), \qquad y' = v_1^y(x,y) + v_2^y(x,y)\,(t'+1) \qquad (1)$$
[0044] where $B(x,y)$ contains information on the boundary/interface model, and $\vec{v}_i = (v_i^x, v_i^y)$ are the motion vector maps for the $i = 1, 2$ motions.
[0045] For one realization of the model, we consider the model (i.e., smooth function of pixel coordinates)
$$s = B(x,y) = gx + hy + \alpha x^2 + \beta y^2 + i$$
[0046] where $\{g,h,\alpha,\beta,i\}$ are parameters of the model boundary curve.
[0047] We also take the standard affine motion models for $\vec{v}_1$ and $\vec{v}_2$:
$$\vec{v}_1(x,y) = (ax + by + c,\; dx + ey + f), \qquad \vec{v}_2(x,y) = (a'x + b'y + c',\; d'x + e'y + f')$$
[0048] where {a,b,c,d,e,f,a′,b′,c′,d′,e′,f′} are parameters of the affine motion models. [0049] The description of the model is as follows: [0050] First, consider the last two equations above. These model the two motions, one is a 6 parameter affine motion, the other is another 6 parameter affine motion. [0051] For pixels with t′=−1, the motion vector is given by: v [0052] For pixels with t′=0, the motion vector is given by: [0053] The coupling to the time variable allows for the presence of 2 different motions in this embodiment (i.e., with different translation, rotation, and zooming). The partition of the motion region into 2 motions is defined according to whether the region uses a past or a future frame for motion estimation. This is shown in FIG. 5. [0054] In FIG. 5 motion segmentation into 2 regions is obtained by the region's frame reference for motion estimation. The object moving with velocity V [0055] The time variable in Equation (1) is a smooth function of the pixel locations, and varies from −1 to 0. Operationally, a given pixel location in the motion block on the current frame defines the time variable t [0056] The time variable controls the motion of the object boundary. The boundary is defined to be where s=−0.5, which in general is a curve described by a polynomial gx+hy+αx [0057] achieves this feature, where w controls the slope at the boundary. Reference is made to the parameter w as the “width” of the boundary or interface. Some plots of the function F for different boundary widths are shown in FIG. 6. [0058] As shown in FIG. 6, is the behavior [0059] A key feature in the model is the “boundary width” (w) that controls the spread of the time variable from −1 (past frame) to 0 (future frame). Pixels near the boundary (defined by width w) are a type of mixture phase, i.e., linear combination of the 2 domains. 
That is, for a pixel within the boundary region, the prediction is: [0060] and a Mixture State can be defined as: [0061] Mixture states: weight (1+t [0062] Pure States: [0063] In one embodiment of the invention to cleanly extract 2 (pure) domains with a fine boundary, w is fixed and small during the estimation step of the motion parameters. For example, the width parameter is fixed at w=⅓, and then re-estimation is performed using a successively finer interface width (as shown in FIG. 5). The nonlinear function F(s) in the model and the decrease of w is used to control and refine the estimate of the boundary. As the interface width decreases, pixels away from the boundary become “frozen” with regard to their reference frame. Only the pixels in the vicinity of the boundary (determined by s=−0.5) are allowed to have their time variable change (i.e., migrate to the other reference frame), and hence modify the boundary. [0064] Estimation of Model Parameters: [0065] In one embodiment of the invention, the estimation of the motion model parameters is obtained from minimization of the prediction error.
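The two-motion model of the preceding paragraphs can be sketched in a few lines of Python. The exact form of F(s) is not reproduced in this text, so the tanh-shaped function below is an assumption, chosen only to match the stated properties: values between −1 and 0, the boundary located at s = −0.5, and a slope at the boundary controlled by w. The mixture weights (1 + t′ on the future prediction, −t′ on the past prediction) are likewise an assumption consistent with t′ = −1 meaning pure past and t′ = 0 meaning pure future. All function names are hypothetical.

```python
import math


def boundary(x, y, g, h, alpha, beta, i):
    """Boundary/interface model s = B(x, y), quadratic in pixel position."""
    return g * x + h * y + alpha * x ** 2 + beta * y ** 2 + i


def time_variable(s, w):
    """ASSUMED form of t' = F(s): smooth step from -1 (past) to 0 (future).

    Centred on s = -0.5; a smaller width w gives a sharper transition.
    """
    return 0.5 * (math.tanh((s + 0.5) / w) - 1.0)


def mixed_prediction(t, past_value, future_value):
    """ASSUMED mixture: linear blend of the two reference predictions."""
    return (1.0 + t) * future_value - t * past_value


# Hypothetical straight-line boundary at x = 4: s = 0.25 * x - 1.5.
params = dict(g=0.25, h=0.0, alpha=0.0, beta=0.0, i=-1.5)
for x in (0, 4, 8):
    s = boundary(x, 0, **params)
    t = time_variable(s, w=1.0 / 3.0)
    print(x, round(t, 4), round(mixed_prediction(t, 3.0, 7.0), 4))
```

Pixels far to one side of the boundary take essentially the past prediction, pixels far to the other side take the future prediction, and pixels within the interface width are blended. Reducing w narrows the band of blended pixels, which is the mechanism the text describes for refining the boundary.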
[0066] where (x
[0067] The detailed procedure of the estimation algorithm for the motion model proceeds as follows. There are 3 sets of initial conditions that may be used below:
[0068] (1) Motion parameters initialized with respect to a previous frame
[0069] (2) Motion parameters initialized with respect to a future frame
[0070] (3) The average of the motion parameters from sets (1) and (2)
[0071] For each set, the interface parameters, in one embodiment, are chosen to be g=h=α=β=0;
[0072] Thus a total of 9 initial conditions are used, although most often set 1 or 2 with i=−0.5 may be sufficient. The width parameter is kept fixed at w=⅓ for the sequence 1-7 below.
[0073] 1. Initialization of Parameters:
[0074] For the 1/16 size image (obtained from simple decimation of the original image), block matching (BM) is performed on small blocks in the corresponding motion block. For initial condition set 1, BM is done with respect to the past; for set 2, with respect to the future. The set of motion vectors is then mapped onto the model parameters using Least Squares (LS). This yields an initial set of parameters (a,b,c,d,e,f) for initial condition sets 1 and 2; the parameters (a
[0075] 2. Steepest descent is used on the 1/16 size image to yield an estimate of the model parameters $\vec{V}$
[0076] 3. Projection from the 1/16 to the ¼ size image to initiate estimation on the ¼ size image. This projection is determined so as to preserve the functional form of the model under spatial scaling. For projection of motion parameters from layer 2 to layer 1, we have:
[0077] Layer Projection: a b c d e f g h i α β
[0078] 4. Use the projected estimate from the upper layer as an initial condition for level 1. Repeat iteration/steepest descent for the ¼ size image. This yields an estimate $\vec{V}$
[0079] 5. Projection of parameters from the ¼ to the original size image, as in 3.
[0080] 6. Repeat iteration/steepest descent estimation for the full size image. Final solution is $\vec{V}$
[0081] 7. Repeat 1-6 for the set of initial conditions stated above.
[0082] 8. Select the estimate of parameters from the set of initial conditions with the lowest prediction error. Re-estimate the motion parameters using the best $\vec{V}$
[0083] Some examples of the motion model are illustrated here. In the first set of examples, the motion model was applied to a region (80×80 block) which contains 2 motions. For the examples, the original image is shown on the left, and the right image shows the segmentation of a multiple motion region into 2 regions. The dark region references the past frame, and the white region references the future frame. Note that in each example the segmentation into past/future regions is consistent with the effect of occlusion being minimized, as discussed, and shown in FIG. 5.
[0084] Example 1 is shown in FIG. 7. The fan moves to the right. Curvature of the fan object is captured, and the motion model achieves separation into past and future motion references as discussed, and shown in FIG. 5.
[0085] Example 2 is shown in FIG. 8. Here, the man moves downwards. This is the same effect as in the previous example.
[0086] Example 3 is shown in FIG. 9. The girl in the foreground moves to the left. Because the girl moves to the left, the stationary region in front of her will prefer motion estimation with respect to the past, where no occlusion occurs.
[0087] For the above examples the prediction error data was calculated as the mean square error between the motion predicted region/block and the original block. The standard motion model refers to a single motion affine model, often used in motion estimation. The new motion model refers to an embodiment of the invention. As shown below, there is an improvement in prediction error using the new motion model.
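Step 3 of the estimation procedure above requires that the projection between resolution layers preserve the functional form of the model under spatial scaling. The projection table itself is not reproduced in this text, so the sketch below derives one rule from that requirement, under the assumption that pixel coordinates scale by a factor of 2 between adjacent layers (1/16 to ¼ to full size by area). The dictionary keys and function names are hypothetical, and only one affine motion is shown; the parameters of a second motion would be handled the same way.

```python
def affine(p, x, y):
    """Affine motion map (a x + b y + c, d x + e y + f)."""
    return (p["a"] * x + p["b"] * y + p["c"],
            p["d"] * x + p["e"] * y + p["f"])


def boundary(p, x, y):
    """Boundary model s = g x + h y + alpha x^2 + beta y^2 + i."""
    return (p["g"] * x + p["h"] * y
            + p["alpha"] * x ** 2 + p["beta"] * y ** 2 + p["i"])


def project_to_finer_layer(p, scale=2.0):
    """Derived rule (an assumption, not the source's table).

    Substituting X = scale * x, Y = scale * y and requiring the same
    model at the finer layer gives: linear affine terms unchanged,
    translations multiplied by scale, linear boundary terms divided by
    scale, quadratic boundary terms divided by scale^2, offset i unchanged.
    """
    q = dict(p)
    for key in ("c", "f"):
        q[key] = p[key] * scale
    for key in ("g", "h"):
        q[key] = p[key] / scale
    for key in ("alpha", "beta"):
        q[key] = p[key] / scale ** 2
    return q


coarse = dict(a=1.0, b=0.25, c=3.0, d=-0.5, e=1.0, f=-2.0,
              g=0.5, h=-0.25, alpha=0.125, beta=0.0625, i=-0.5)
fine = project_to_finer_layer(coarse)

# The same physical point is (3, 5) on the coarse layer and (6, 10) on the fine one.
print(affine(coarse, 3, 5), affine(fine, 6, 10))
print(boundary(coarse, 3, 5), boundary(fine, 6, 10))
```

With this rule the mapped position on the finer layer is exactly twice the coarse one, and the boundary variable s takes the same value at corresponding pixels, so the boundary s = −0.5 stays in the same place in the image.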
[0088] Motion Model Applied to Large Region [0089] In the example below, a large region around the objects of interest was partitioned into 80×80 blocks. This region was obtained from a standard type of motion segmentation (affine motion model and k-means clustering), with poorly labeled blocks (blocks with high prediction error and/or high distortion classification) identifying the regions of moving objects. Next, an embodiment of the invention new motion model was applied to a set of 80×80 blocks covering a large region around the moving object of interest. Example 4 is shown in FIG. 10 where the thin black line [0090] In Example 4 as shown in FIG. 10, the girl walks to the right, the background “moves” to the left. The motion model is applied to a large region around the girl. The black lines around the girl ( [0091] Shown in FIG. 11 is a comparison between a segmentation using an affine motion model (standard motion model) [0092] Video Coding [0093] In another embodiment of the invention, video coding may make use of the new motion model. The model discussed above, by virtue of its ability to account for 2 motions, can be applied to a large region. In the examples discussed previously, 80×80 blocks were used. The new motion model may be viewed as “compactly” representing different motions and boundary information. For example, in one embodiment of the invention, the present model has 17 parameters, and if used in say 80×80 blocks (in a 704×484 image), is about 900 motion parameters; this includes all information necessary for a decoder to extract motion field and locations of some moving boundaries. Compare this to the approximately 2662 parameters needed for a very simple standard 16×16 Block Matching Algorithm (2 translation parameters, with no explicit moving boundary information). [0094] Model for M Motions [0095] As was mentioned previously, the discussion above primarily focused on 2 motions so as to not obscure embodiments of the invention. 
Other embodiments of the invention may account for an arbitrary number of motions (M) and may be applied to extend the examples and embodiments discussed above. [0096] An extension of the 2 motion model to account for M motions with non-intersecting boundaries can be written in the form (this is an extension of Equation (1)):
[0097] where, as in Equation (1) above, we can use the model equations below as: [0098] and [0099] In the above model, {right arrow over (x)} refers to a pixel position on the current frame (the one whose motion is being estimated), {right arrow over (x)} [0100] is chosen to be 1 at i=1 (i.e., for t [0101] Case of 2 Motions [0102] The model above reduces to the case realized earlier (see Equation (1)). The 2 reference frames are t [0103] There is only one boundary/interface variable s, and one width variable w. The nonlinear time equation becomes: [0104] where, for example, the model used for 2 motions is:
[0105] 3 Motions [0106] An example for 3 motions is shown in FIG. 12. Here, the three “motions” in the image region [0107] In order to minimize the occlusion/uncovered region effect, an optimal state (lower prediction error) will result in the region frame reference (Frame ref:) shown in FIG. [0108] An example of an interface function for 3 motions (2 non-intersecting boundaries) is shown in FIG. 13. The function can be written as:
[0109] where t
[0110] Thus, what has been disclosed is a method and apparatus for nonlinear multiple motion model and moving boundary extraction.
[0111] FIG. 1 illustrates a network environment
[0112] FIG. 2 illustrates a computer system
[0113] For purposes of discussing and understanding the invention, it is to be understood that various terms are used by those knowledgeable in the art to describe techniques and approaches. Furthermore, in the description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be evident, however, to one skilled in the art that the present invention may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring the present invention. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments may be utilized and that logical, mechanical, electrical, and other changes may be made without departing from the scope of the present invention.
[0114] Some portions of the description may be presented in terms of algorithms and symbolic representations of operations on, for example, data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of acts leading to a desired result. The acts are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
[0115] It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, can refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission, or display devices.
[0116] The present invention can be implemented by an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer, selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, hard disks, optical disks, compact disk read-only memories (CD-ROMs), and magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), electrically programmable read-only memories (EPROMs), electrically erasable programmable read-only memories (EEPROMs), FLASH memories, magnetic or optical cards, etc., or any type of media suitable for storing electronic instructions either local to the computer or remote to the computer.
[0117] The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method. For example, any of the methods according to the present invention can be implemented in hard-wired circuitry, by programming a general-purpose processor, or by any combination of hardware and software. One of skill in the art will immediately appreciate that the invention can be practiced with computer system configurations other than those described, including handheld devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, digital signal processing (DSP) devices, set top boxes, network PCs, minicomputers, mainframe computers, and the like. The invention can also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
[0118] The methods of the invention may be implemented using computer software. If written in a programming language conforming to a recognized standard, sequences of instructions designed to implement the methods can be compiled for execution on a variety of hardware platforms and for interface to a variety of operating systems. In addition, the present invention is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the invention as described herein. Furthermore, it is common in the art to speak of software, in one form or another (e.g., program, procedure, application, driver, ...), as taking an action or causing a result. Such expressions are merely a shorthand way of saying that execution of the software by a computer causes the processor of the computer to perform an action or produce a result.
[0119] It is to be understood that various terms and techniques are used by those knowledgeable in the art to describe communications, protocols, applications, implementations, mechanisms, etc. One such technique is the description of an implementation of a technique in terms of an algorithm or mathematical expression. That is, while the technique may be, for example, implemented as executing code on a computer, the expression of that technique may be more aptly and succinctly conveyed and communicated as a formula, algorithm, or mathematical expression. Thus, one skilled in the art would recognize a block denoting A+B=C as an additive function whose implementation in hardware and/or software would take two inputs (A and B) and produce a summation output (C). Thus, the use of formula, algorithm, or mathematical expression as descriptions is to be understood as having a physical embodiment in at least hardware and/or software (such as a computer system in which the techniques of the present invention may be practiced as well as implemented as an embodiment).
[0120] A machine-readable medium is understood to include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer). For example, a machine-readable medium includes read only memory (ROM); random access memory (RAM); magnetic disk storage media; optical storage media; flash memory devices; electrical, optical, acoustical or other form of propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.); etc.
[0121] Use of the words “one embodiment”, or “an embodiment”, or similar language does not imply that there is only a single embodiment of the invention, but rather indicates that in the particular embodiment being discussed it is one of several possible embodiments.
[0122] Thus, a method and apparatus for nonlinear multiple motion model and moving boundary extraction have been described.