|Publication number||US5083202 A|
|Application number||US 07/469,494|
|Publication date||Jan 21, 1992|
|Filing date||Sep 22, 1988|
|Priority date||Sep 25, 1987|
|Also published as||CA1318970C, DE3885695D1, DE3885695T2, EP0309251A1, EP0309251B1, WO1989003152A1|
|Publication number||07469494, 469494, PCT/1988/781, PCT/GB/1988/000781, PCT/GB/1988/00781, PCT/GB/88/000781, PCT/GB/88/00781, PCT/GB1988/000781, PCT/GB1988/00781, PCT/GB1988000781, PCT/GB198800781, PCT/GB88/000781, PCT/GB88/00781, PCT/GB88000781, PCT/GB8800781, US 5083202 A, US 5083202A, US-A-5083202, US5083202 A, US5083202A|
|Original Assignee||British Telecommunications Public Limited Company|
The present invention concerns motion estimation, particularly, though not exclusively, in the context of video coders employing inter-frame differential coding.
FIG. 1 shows a known form of video coder. Video signals (commonly in digital form) are received at an input 1. A subtractor 2 forms the difference between the input and a predicted signal from a predictor 3; this difference is then further coded in box 4. The coding performed here is not material to the present invention, but may include, for example, thresholding (to suppress transmission of zero or minor differences), quantisation or transform coding. The input to the predictor is the sum, formed in an adder 5, of the prediction and the coded difference signal decoded in a local decoder 6 (so that loss of information in the coding and decoding process is included in the predictor loop).
The differential coding is essentially inter-frame, and the predictor 3 could simply consist of a one-frame delay; as shown however a motion estimator 7 is also included. This compares the frame of the picture being coded with the previous frame being supplied to the predictor. For each block of the current frame (into which the picture is regarded as divided) it identifies that region of the previous frame which the block most closely resembles. The vector difference in position between the identified region and the block in question is termed a motion vector (since it usually represents motion of an object within the scene depicted by the television picture) and is applied to the predictor to shift the identified region of the previous frame into the position of the relevant block in the current frame, thereby making the predictor output a better prediction. This results in the differences formed by the subtractor 2 being, on average, smaller and permits the coder 4 to encode the picture using a lower bit rate than would otherwise be the case.
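The effect of applying a motion vector to the predictor can be illustrated with a minimal sketch. The frame contents, the helper names (`predict_block`, `residual`, `energy`) and the toy displacement are assumptions chosen for illustration only; they are not taken from the coder of FIG. 1.

```python
def predict_block(prev, x, y, u, v, n=8):
    """Fetch the n x n region of the previous frame at (x + u, y + v)."""
    return [row[x + u:x + u + n] for row in prev[y + v:y + v + n]]


def residual(cur, pred, x, y, n=8):
    """Difference between the current block at (x, y) and its prediction."""
    return [[cur[y + j][x + i] - pred[j][i] for i in range(n)] for j in range(n)]


def energy(block):
    """Sum of absolute residual values (a rough proxy for coding cost)."""
    return sum(abs(d) for row in block for d in row)


# Hypothetical toy frames: the scene moves 3 pixels right and 2 lines down.
W = H = 40
prev = [[(7 * px + 13 * ln + px * ln) % 256 for px in range(W)] for ln in range(H)]
cur = [[prev[(ln - 2) % H][(px - 3) % W] for px in range(W)] for ln in range(H)]

x, y = 16, 16
no_mc = energy(residual(cur, predict_block(prev, x, y, 0, 0), x, y))
with_mc = energy(residual(cur, predict_block(prev, x, y, -3, -2), x, y))
print(no_mc, with_mc)
```

With the motion vector (-3, -2) the prediction matches the block exactly in this toy case, so the residual passed to the coder is zero, whereas the undisplaced prediction leaves a non-zero residual.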
The motion estimator must typically compare each block with the corresponding block of the previous frame and regions positionally shifted from that block position; this involves a considerable amount of processing and often necessitates many accesses to stored versions of both frames.
The present invention is defined in the claims.
One embodiment of the invention will now be described, with reference to the accompanying drawings, in which:
FIG. 1 is a block diagram of a known form of video coder;
FIG. 2 is a diagram of a television picture illustrating a co-ordinate system and search area;
FIG. 3 is a block diagram of part (P1 of FIG. 5) of a motion estimator according to the invention;
FIG. 4 is a diagram of the search area SN of FIG. 2;
FIG. 5 is a block diagram of a complete motion estimator; and
FIG. 6 is a block diagram of the sorter CSO of FIG. 5.
The motion estimator to be described regards a "current" frame of a television picture which is being coded as being divided into 8×8 blocks--that is, eight picture elements (pixels) horizontally by eight lines vertically. Although the principles are equally applicable to interlaced systems, for simplicity of description a non-interlaced picture is assumed. It is designed to generate for each block a motion vector which indicates the position of the 8×8 region, lying within a defined search area of the (or a) previous frame of the picture, which is most similar to the block in question. FIG. 2 illustrates a field with an 8×8 block N (shaded) and a typical associated 23×23 search area indicated by a rectangle SN. If the pixels horizontally and lines vertically are identified by coordinates x, y, with an origin at the top left-hand corner, then the search area for a block whose upper left hand corner pixel has coordinates xN, yN is the area extending horizontally from (xN -8) to (xN +14) and vertically from (yN -8) to (yN +14).
In order to obtain the motion vector it is necessary to conduct a search in which the block is compared with each of the 256 possible 8×8 regions of the previous frame lying within the search area--i.e. those whose upper left pixel has coordinates xN +u, yN +v where u and v are in the range -8 to +7. The motion vector is the values of u,v for which the comparison indicates the greatest similarity. The test for similarity can be any conventionally used--e.g. the sum of the absolute values (or other monotonically increasing even function) of the differences between each of the pixels in the "current" block and the relevant region of the previous frame.
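Thus, if the current frame and previous frame pixel values are a(i,j) and b(i,j) respectively, then the sum of differences for the block at xN, yN and the displacement u,v is

$$E_{x_N,y_N}(u,v)=\sum_{i=0}^{7}\sum_{j=0}^{7}\left|a(x_N+i,\,y_N+j)-b(x_N+u+i,\,y_N+v+j)\right|$$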
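The sum of absolute differences and the exhaustive search over the 256 displacements described above can be sketched as follows. This is a plain software illustration, not the hardware arrangement described later; the function names, the list-of-rows frame layout and the toy frames are assumptions.

```python
def sad(cur, prev, x, y, u, v, n=8):
    """Sum of absolute differences between the n x n block of `cur` at (x, y)
    and the n x n region of `prev` at (x + u, y + v).
    Frames are lists of rows, indexed frame[line][pixel]."""
    return sum(
        abs(cur[y + j][x + i] - prev[y + v + j][x + u + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, prev, x, y, n=8, lo=-8, hi=7):
    """Test every displacement with u, v in lo..hi and return (u, v, E) for
    the lowest E. Ties keep the first displacement visited (an assumption)."""
    best = None
    for v in range(lo, hi + 1):
        for u in range(lo, hi + 1):
            e = sad(cur, prev, x, y, u, v, n)
            if best is None or e < best[2]:
                best = (u, v, e)
    return best


# Hypothetical toy frames: the current frame is the previous frame shifted
# 3 pixels right and 2 lines down, so the best match lies at u = -3, v = -2.
W = H = 40
prev = [[(7 * px + 13 * ln + px * ln) % 256 for px in range(W)] for ln in range(H)]
cur = [[prev[(ln - 2) % H][(px - 3) % W] for px in range(W)] for ln in range(H)]

print(full_search(cur, prev, 16, 16))  # (-3, -2, 0)
```

Each call to `sad` re-reads 64 previous-frame pixels, which is the repeated frame-store access that a block-by-block search incurs.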
Commonly the search is carried out for each block of the current picture in turn. However, because the search area associated with a block overlaps the search areas of a number of other blocks (24 in the case of blocks not close to the edge of the picture; see the search area shown dotted in FIG. 2 for block N+1), this often requires multiple accesses to the previous frame information stored in a frame store, which are time consuming and may interfere with other coder functions.
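The figure of 24 overlapping search areas can be checked by direct counting. The sketch below assumes the geometry given earlier (8×8 blocks, search area from origin-8 to origin+14 on each axis); the function names are illustrative.

```python
def search_span(origin, back=8, fwd=14):
    """Inclusive span [origin - back, origin + fwd] of a search area on one axis."""
    return origin - back, origin + fwd


def spans_overlap(a, b):
    """True if two inclusive spans share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def overlapping_blocks(block=8, reach=5):
    """Count the other blocks whose search area overlaps that of the block
    at the origin, looking `reach` blocks away in every direction."""
    ref = search_span(0)
    count = 0
    for by in range(-reach, reach + 1):
        for bx in range(-reach, reach + 1):
            if (bx, by) == (0, 0):
                continue
            if spans_overlap(ref, search_span(bx * block)) and spans_overlap(
                ref, search_span(by * block)
            ):
                count += 1
    return count


print(overlapping_blocks())  # 24
```

Overlap occurs for block offsets of -2 to +2 on each axis, giving 5 × 5 - 1 = 24 neighbours.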
The motion estimator to be described is assumed to be provided, in real time, with
(a) a digital video signal corresponding to the "current" frame of a picture to be coded.
(b) a digital video signal corresponding to the previous frame of the picture.
The signals consist of a sequence of 8-bit digital words representing the luminance of successive picture elements of the first line, (though chrominance signals could be similarly processed if desired) followed by similar sequences for the second, third and subsequent lines.
FIG. 3 shows part of the apparatus, the operation of which will be described with reference to FIG. 4. The "position" in the picture of an 8×8 block or region will be considered as defined by the x,y coordinates of its upper left pixel. Thus the block N in FIG. 4 is at position x,y. The apparatus of FIG. 3 serves to compare the 8×8 block N at x,y with the 8×8 regions at (x+u),(y+v) where u ranges from -8 to +7 and v from 0 to +7--i.e. the regions whose positions are within the lower broken line area S1 in FIG. 4. In FIG. 3, the previous frame is received (following a 9-pixel delay XD, the purpose of which will be apparent later) at an input PI (all signal paths in the Figure are 8-bit unless otherwise indicated by a diagonal bar and adjacent number). This supplies a tapped delay line consisting of eight one-line delay units LD1 . . . LD8, so that the signal for any line of the picture is available at the output of LD8 and that for the seven later lines is available at the outputs of LD7, LD6 etc. The current picture signal is supplied via input CI to a delay line with storage having eight sections DS1 . . . DS8. Each section consists of two delay units of one line period duration; one half of each section forms part of an eight line period delay line and the other half forms a recirculating store. When the last line of a group of eight lines has entered the delay line, the roles of the two halves of each section are reversed so that while the next eight lines are entering the delay line the eight already entered are repeatedly available at the outputs of the recirculating sections.
Thus, referring to FIG. 4, (and ignoring the 9-pixel offset and assuming that line y is the first of a block) at the conclusion of line y+7, lines y, y+1 . . . y+7 of the current frame are about to be output from DS8, DS7 . . . DS1, whilst lines y,y+1, . . . y+7 of the previous frame are about to be output from LD8, LD7, . . . LD1.
The outputs of LD8 and DS8 feed respective (8-bit wide) 8-stage serial in parallel out (SIPO) registers PS8, CS8 clocked at pixel rate. The eight outputs of the latter are latched in a latch CL8 every 16 pixels synchronously with the horizontal block structure of the frame. Thus, once pixel x+7 of line y has entered the SIPO CS8, pixels x to (x+7) are available at the output of latch CL8 on the next clock pulse. At this time, pixels x-8 to x-1 are available at the outputs of the SIPO PS8 (due to the 9-pixel delay XD).
The outputs of the PS8 and CL8 are supplied to subtractors M81-M88; the sum of the moduli of the differences is formed in a summation unit A8. Thus in the circumstances described in the preceding paragraph, the summer output represents the "sum of differences" between the first line of the current picture block N, and the first line of the region indicated by chain-dot lines in FIG. 4.
The arrangement consisting of PS8, CS8, CL8, M81-M88 and A8 is provided for the outputs of LD8 and DS8; seven further such arrangements are provided (though, for clarity, not shown in FIG. 3) for the outputs of LD7/DS7, LD6/DS6 . . . LD1/DS1. They function in an identical manner, except that, being connected to earlier taps of the delay lines, they operate on the seven later lines of the picture. The outputs of the summers A8 . . . A1 are added in an adder AA which produces the "sum of differences" between the current block N and the 8×8 region of the previous frame indicated by the chain-dot lines in FIG. 4. This is the value Ex,y (-8,0) according to the definition given above.
One pixel clock cycle later, the SIPOs PS8 . . . PS1 contain pixels x-7 to x instead of x-8 to x-1. The output of the latch CL8 is unchanged; thus the comparison is now between the current block and the region at (x-7),y, and adder AA produces an output Ex,y (-7,0). This process continues for 16 clock cycles, at the conclusion of which the adder AA has produced 16 results E(-8,0), E(-7,0) . . . E(+7,0). The first pass of the search area S1 in FIG. 4 is now complete.
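The 16-clock pass just described can be emulated in software to show that holding the current block fixed while the previous-frame pixels shift past yields the sixteen horizontal displacements in turn. This is a behavioural sketch only: the deques stand in for the SIPO registers, the `latched` list for the latches, and the toy frames and function name are assumptions.

```python
from collections import deque


def one_pass(cur, prev, x, y, v=0, n=8):
    """Emulate one 16-clock pass for the block at (x, y).

    The current block is held fixed (as in the latches) while n shift
    registers each take in one previous-frame pixel per clock.
    Returns the list [E(-8, v), E(-7, v), ..., E(+7, v)]."""
    latched = [cur[y + j][x:x + n] for j in range(n)]
    # Register contents when the latch is loaded: pixels x-8 .. x-1 of each line.
    regs = [deque(prev[y + v + j][x - 8:x], maxlen=n) for j in range(n)]
    results = []
    for clock in range(16):
        # Subtractors, per-line summers and the final adder in one expression.
        results.append(
            sum(abs(c - p) for line, reg in zip(latched, regs) for c, p in zip(line, reg))
        )
        for j, reg in enumerate(regs):
            reg.append(prev[y + v + j][x + clock])  # shift in; oldest pixel drops out
    return results


# Hypothetical toy frames: pure horizontal motion of 3 pixels to the right.
W = H = 40
prev = [[(7 * px + 13 * ln + px * ln) % 256 for px in range(W)] for ln in range(H)]
cur = [[prev[ln][(px - 3) % W] for px in range(W)] for ln in range(H)]

e_values = one_pass(cur, prev, 16, 16)
print(e_values.index(min(e_values)) - 8)  # u of the best match in this pass
```

Each previous-frame pixel is read once per pass and reused for up to eight successive displacements as it moves along the register.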
The pixels of the next-but-one block N+2 of the current frame (i.e. those with horizontal co-ordinates (x+16) to (x+23)) are now present in the SIPOs CS8 . . . ; these are clocked into the latches CL8 . . . , at which point pixels (x+8) to (x+15) are available at PS8, corresponding to the first search position for the new block. The first pass now proceeds for block N+2, and for successive alternate blocks until the end of the line.
At the conclusion of a line period, line y+1 of the previous frame is appearing at the output of LD8; however, by the recirculating action of DS8, line y of the current frame appears again at the output of DS8, and the second pass for block N takes place, it being this time compared with the chain-dot region of FIG. 4, shifted down by one line. It will be seen that after eight line periods, a comparison will have been made between block N and all the regions defined by the area S1 (and similarly for all the even numbered blocks in a row).
When this has occurred, the delay and store stages DS8 . . . are clocked and lines y+8 to y+15 of the current frame now become available at their outputs. Line y+8 of the previous frame is just about to appear at the output of delay LD8, and the arrangement of FIG. 3 is ready to accommodate the next row of blocks.
It will be seen that subtractors M, summation units A and adder AA thus form arithmetic means to compare each block of the current frame with the corresponding region of the previous frame and with a plurality of positionally shifted regions of the previous frame. Unlike the block-by-block approach mentioned earlier, the arithmetic means is arranged so that all comparisons involving a particular line of the picture are carried out consecutively. As can be seen, this requires a storage capacity of only a total of sixteen line delays for the previous frame.
It will be observed however that the search process for block N is incomplete; the search area S2 (FIG. 4) has not been dealt with. Also the odd numbered blocks have received no attention. Referring now to FIG. 5, the arrangements of FIG. 3 (apart from delay XD) now appear as processor P1. A second processor P2 handles the upper search area S2; it is identical in all respects to P1 but is supplied with the previous frame signal via an 8 line delay (shown explicitly as ELD though in practice the signal could be tapped from LD8 of the processor P1), giving the desired result of defining a search area S2 which is 8 lines earlier than S1. It is noted in passing that were the search area SN smaller (i.e. 17×17 or less rather than 23×23) one processor (suitably timed) would suffice.
The even-numbered blocks are handled by a further pair of processors P3, P4 which are identical to P1 and P2 and receive the same signal inputs as P1 and P2 respectively. However their latches CL8 are clocked (every 16 pixels as before) with pulses which are 8 pixels out of phase with those supplied to the processors P1 and P2. Although the figure shows identical processors (which may be a convenient modular hardware implementation), certain of the elements within the processors may if desired be common to two or more of the processors, (e.g. the lines DS8 . . . DS1).
We will now consider the further processing of the "sum of differences" values E. It is necessary to find for each block of the current frame the position (u,v) giving the lowest E. Since the E values for a given block appear over an 8-line period, interspersed with those for other blocks in the same horizontal row, a degree of sorting is also necessary. Two compare and sort units CSE, CSO are shown in FIG. 5 for the even and odd blocks respectively, with outputs VOE, VOO for the output motion vectors.
Unit CSO is shown in FIG. 6 and is identical to CSE (except for the timing of the inputs, of course).
The outputs of the processors P1, P2 are applied to the unit. A vector generator VG synchronised by pixel clock (and line and field synchronising pulses) produces the value of the vector component u (4 bits) and the lower 3 bits of component v associated with the E value being received from the processor P1.
The two E values received simultaneously from P1 and P2 always relate to the same block of the current picture, so that these can readily be compared by a comparator C1. The comparator output controls a data selector SEL1 to output the smaller of the received values; the comparator output also is appended to the vector generator output to form the most significant bit of v.
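One way to see why the comparator output can serve as the remaining bit of v: P1 covers v = 0 to +7 and P2 covers the same positions 8 lines earlier, v = -8 to -1, so the two candidates share their lower 3 bits in 4-bit two's complement and differ only in the top bit. The sketch below assumes that the comparator outputs 1 when the P2 value is strictly smaller and that ties go to P1; neither detail is stated in the text, and the function name is illustrative.

```python
def select_and_tag(e_p1, e_p2, u, v_low3):
    """Pick the smaller of the two simultaneous E values and complete v.

    Assumed polarity: comparator output c is 1 when P2 is strictly smaller.
    v is treated as a 4-bit two's-complement number whose top bit is c."""
    c = 1 if e_p2 < e_p1 else 0
    e = e_p2 if c else e_p1
    v_code = (c << 3) | (v_low3 & 0b111)
    v = v_code - 16 if v_code & 0b1000 else v_code  # decode two's complement
    return e, u, v


print(select_and_tag(100, 90, -3, 5))  # P2 wins: v = 5 - 8 = -3
print(select_and_tag(50, 90, 2, 5))    # P1 wins: v = 5
```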
The description so far has conveniently ignored problems that may arise where the block under consideration is within 8 lines or pixels of the edge of the picture--i.e. certain of the regions defined by x, y, u, v overlap the line and field blanking periods. This is readily overcome by disregarding such regions. A border detector BG serves to override the action of the selector SEL1 by:
(a) at the top of the picture, where results for such regions are produced by processor P2, forcing the selector to pass the output from processor P1;
(b) at the bottom of the picture, where results for such regions are produced by the processor P1, forcing the selector to pass the output from processor P2;
(c) at the sides of the picture, where both processors produce such results, setting the output of the selector to its maximum value, thereby obliging the subsequent stages of the sorter to select a different value.
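The three override cases (a) to (c) can be summarised in a short sketch. The 14-bit maximum value, the flag names and the priority given to the side case at picture corners are assumptions made for illustration; the text does not specify them.

```python
E_MAX = (1 << 14) - 1  # assumed 14-bit E word; the largest true E is 64 * 255 = 16320


def select_with_border(e_p1, e_p2, top=False, bottom=False, side=False):
    """Return (selected E, source) with the border overrides applied.

    side   -> both results are invalid: force the maximum value (case c)
    top    -> P2 results are invalid: pass P1 (case a)
    bottom -> P1 results are invalid: pass P2 (case b)
    Otherwise the smaller value is passed, ties going to P1 (an assumption)."""
    if side:
        return E_MAX, None
    if top:
        return e_p1, "P1"
    if bottom:
        return e_p2, "P2"
    return (e_p2, "P2") if e_p2 < e_p1 else (e_p1, "P1")


print(select_with_border(100, 90))            # (90, 'P2')
print(select_with_border(100, 90, top=True))  # (100, 'P1')
```

Forcing the maximum value in case (c) means any valid candidate seen later for the same block will replace it in the subsequent minimum search.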
A first-in-first-out store FIFO1 stores the lowest E value for each odd block of a row. All the store locations are set to their maximum value at the commencement of a row. Each processor pass generates, for any block N, sixteen E values in succession. When the first of these is received from the selector SEL1 a comparator C2 compares it with the previous value recorded in the store for the relevant block and controls the selector SEL2 to enter into a latch whichever is the lower of the two values. A second selector SEL3 then switches to feed the latch output to the comparator C2 (and selector SEL1) and the comparator compares each of the remaining fifteen values with the value held in the latch L1; again, the selector SEL1 passes the lower value of the pair to form the new latched value. After the sixteen values have been compared, the content of the latch is loaded back into the store. In the same manner, selectors SEL4, SEL5 and latch L2 select for entry into a second store FIFO2 either the vector previously stored therein or the incoming vector (from VG and C1).
After all eight passes of the current row have taken place, the store FIFO1 contains the lowest "sum of differences" value E for each block of the row, and the store FIFO2 contains the corresponding vectors u,v. These can then be read out and output to the output VOO prior to the processing of the next row.
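The running-minimum sorting across passes can be sketched as below. Plain lists stand in for the stores FIFO1 and FIFO2 and local variables for the latches; the data layout, the function name and the tie rule (an equal value does not replace the stored one) are assumptions. The example uses three values per block per pass instead of sixteen to keep it short.

```python
def sort_row(passes, n_blocks, e_max=(1 << 14) - 1):
    """Track the lowest E and its vector for every block of a row.

    `passes` holds one entry per pass; each entry holds, per block, the
    list of (E, (u, v)) pairs produced for that block during the pass."""
    fifo1 = [e_max] * n_blocks  # lowest E per block, preset to the maximum
    fifo2 = [None] * n_blocks   # corresponding vectors
    for pass_values in passes:
        for block, values in enumerate(pass_values):
            latch_e, latch_vec = fifo1[block], fifo2[block]  # read the stores
            for e, vec in values:
                if e < latch_e:  # keep the lower of latched and incoming
                    latch_e, latch_vec = e, vec
            fifo1[block], fifo2[block] = latch_e, latch_vec  # write back
    return fifo1, fifo2


passes = [
    [[(40, (-8, 0)), (25, (-7, 0)), (30, (-6, 0))],
     [(12, (-8, 0)), (50, (-7, 0)), (60, (-6, 0))]],
    [[(26, (-8, 1)), (25, (-7, 1)), (9, (-6, 1))],
     [(13, (-8, 1)), (12, (-7, 1)), (70, (-6, 1))]],
]
print(sort_row(passes, n_blocks=2))  # ([9, 12], [(-6, 1), (-8, 0)])
```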
In some circumstances it may be easier to generate in the vector generator VG vectors which are coded differently from those required at the output, in which case a vector mapping unit VM--which may be a simple look-up table--can be included.
It has been assumed that the previous frame region to be identified is the one having the smallest difference from the current block in question. However it may be desired to give a bias to the zero vector--i.e. a non-zero vector is output only if a region u,v gives a sum of differences E(u,v) which is less by a predetermined amount than the value E(0,0) for the undisplaced region of the previous frame--e.g. is less than 75% of E(0,0). This can be achieved by a scaling unit ZVS which normally passes the values received from the processor P1 unchanged, but reduces the value to 75% of the input value when a signal VO from the vector generator VG indicates a position (0,0).
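The zero-vector bias amounts to scaling E(0,0) before the minimum search, as the following sketch shows. The integer rounding (floor division), the function names and the candidate values are assumptions for illustration.

```python
def zero_vector_scale(e, u, v, num=3, den=4):
    """Pass E unchanged except at (0, 0), where it is reduced to 75%.
    Floor division is an assumed rounding rule."""
    if (u, v) == (0, 0):
        return (e * num) // den
    return e


def choose(candidates):
    """candidates: list of (u, v, E). Return the (u, v) with the lowest biased E."""
    best = min(candidates, key=lambda c: zero_vector_scale(c[2], c[0], c[1]))
    return best[0], best[1]


# E(1,0) = 80 is lower than E(0,0) = 100 but not below 75% of it: zero vector wins.
print(choose([(0, 0, 100), (1, 0, 80)]))  # (0, 0)
# E(1,0) = 70 is below 75% of E(0,0): the non-zero vector is output.
print(choose([(0, 0, 100), (1, 0, 70)]))  # (1, 0)
```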
While the invention has been described in connection with what is presently considered to be the most practical and preferred embodiment, it is to be understood that the invention is not to be limited to the disclosed embodiment, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
|U.S. Classification||348/699, 375/E07.1, 348/416.1, 375/E07.105|
|International Classification||H04N7/32, G06T7/20, G06T9/00, H04N7/26|
|Cooperative Classification||G06T7/202, H04N19/51, H04N19/43|
|European Classification||G06T7/20B1, H04N7/26L4, H04N7/26M2|
|Apr 13, 1990||AS||Assignment|
Owner name: BRITISH TELECOMMUNICATIONS PUBLIC LIMITED COMPANY,
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST.;ASSIGNOR:PARKE, IAN;REEL/FRAME:005276/0949
Effective date: 19900322
|Jun 15, 1995||FPAY||Fee payment|
Year of fee payment: 4
|Jun 23, 1999||FPAY||Fee payment|
Year of fee payment: 8
|Jun 24, 2003||FPAY||Fee payment|
Year of fee payment: 12