To increase the accuracy of motion estimation, MPEG motion vectors have a one-half pixel resolution, which is implemented by using as the reference block an interpolated block assumed to lie one-half pixel in either, or both, the X or Y directions from an actual block. Performing that interpolation requires that either an 8x9, 9x8, or 9x9 block be processed to produce the interpolated reference block. Each pixel in the interpolated block is the average of either two or four pixels. The interpolation is performed (40) using a series of bit block transfer (bit BLT) operations in which pixels from one 8x8 block are added to pixels of the 8x8 block one pixel over, and the sums are divided by two. Alternatively, if available in the graphics coprocessor, the interpolation can be performed using a scaling bit BLT, by supplying the scaling bit BLT with either the 9x8, 8x9, or 9x9 input block, and requesting an 8x8 output block. FIG. 3 illustrates the operation for the simple case in which the reference block R is the average of two 8x8 blocks A, B, offset from one another by one pixel in the X direction.
In the case of B frames, the reference blocks from the prior and future frame are averaged (42), using the same bit BLT operation (add and divide by two) used for interpolation.
These interpolation and averaging operations provide the reference blocks that are added (44) to the error blocks produced by the inverse transform operation (IDCT). This addition is also performed using a bit BLT operation. This particular bit BLT operation is not one conventionally found in graphics coprocessor chips, but it could be added at little increase in chip complexity. Pixels of the source block are added to pixels of the destination block and the resulting sums (after appropriate clipping) are written over the corresponding pixels of the destination block. The pixels representing the error terms are signed numbers, whereas the pixels representing the reference block are unsigned numbers. The bit BLT operation must, therefore, add a signed number to an unsigned number, and provide appropriate clipping of the result (e.g., clipping if it exceeds an acceptable range, which could be the full 0 to 255 range provided by 8 bits, or a smaller range such as 16 to 240, to allow values outside those limits to be used for other purposes).
FIG. 4 shows the bit BLT operations required to handle the two reference blocks used in motion compensating a B frame block. The add-and-divide-by-two operation could be implemented in at least two ways. The graphics coprocessor could be designed to read both blocks and perform the half-pixel interpolation operation simultaneously. Alternatively, it could read one reference block, write it to a temporary location, and then read the second reference block, add it to the first block and divide by two, and write the result to the temporary location.
FIG. 4 also shows the bit BLT operations required to add the reference block (the averaged blocks in the case of B frames) to the error block. The reference data is the source block, and the error block the destination block. The addition of the source and destination blocks must be a straight add (no division by two), and since there is no divide by two, and since one value is signed (those from the error block), the result must be clipped as noted elsewhere.
Preferably the bit BLT operations are performed in one or a small number of batch operations, in which a list of the bit BLT operations are executed. Such batch processing can perform the bit BLT operations more efficiently than is possible if isolated bit BLT operations are performed. Batch processing is made possible by providing sufficient memory in which to store the lists of bit BLT operations needing execution.
Other embodiments are within the following claims. For example, although it could require an appreciable increase in chip complexity (because of multiplication steps required), and thus not achieve as dramatic gains in price/performance as the preferred embodiment, the IDCT operation could also be moved to the graphics coprocessor. Such a configuration could be quite practical, and of significant value, if the graphics coprocessor provided with the personal computer had built-in fast multiply capability, such as may be the case in three-dimensional graphics coprocessors.
The block size referred to throughout the discussion of the preferred embodiment is 8x8, but other sizes could be used (e.g., a macro block, 16x16, could be processed at once). For frames in which many adjoining blocks receive the same motion compensation, a large number of blocks (even approaching an entire frame in size) could efficiently be processed in a single bit BLT operation.
If the invention is applied to MPEG2, it would probably be preferable to use the next generation processor (e.g., Pentium MMX or Pentium Pro MMX).
What is claimed is:
1. A method of decoding a series of frames of motion compensated video data using a personal computer that includes a central processor and a graphics coprocessor, the method comprising the steps of:
executing a stored program on the central processor to carry out at least the following steps: extracting motion vectors from the video data, and decompressing the video data, and
operating the graphics coprocessor to carry out at least the following step: motion compensating the video data based on the motion vectors using bit BLT operations.
2. The method of claim 1 wherein the bit BLT operations comprise adding the pixels of a source block to the pixels of a destination block to create sum pixels, and replacing the pixels of the destination block with the sum pixels.
3. The method of claim 2 wherein one of the source and destination pixels is an unsigned number and the other is a signed number.
4. The method of claim 3 wherein the bit BLT operations comprise adding the pixels of a source block to the pixels of a destination block to create sum pixels, dividing the sum pixels by a constant to create interpolated pixels, and replacing the pixels of the destination block with the interpolated pixels.
5. The method of claim 1 wherein the step of decompressing the video data by the central processor includes decompressing the video data using RLE decoding.
6. A method of decoding a series of frames of motion compensated video data using a personal computer that includes a central processor and a graphics coprocessor, the method comprising the steps of:
executing a stored program by the central processor to carry out at least the following steps: extracting motion vectors from the video data, and decompressing the video data,
operating the graphics coprocessor to carry out at least the following step: motion compensating the video data using bit BLT operations, including interpolating to determine an interpolated reference block, and wherein the interpolating is performed using bit BLT operations.
7. A method of decoding a series of frames of motion compensated video data using a personal computer that includes a central processor and a graphics coprocessor, the method comprising the steps of:
executing a stored program on the central processor to carry out at least the following steps:
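By way of illustration only, the bit BLT arithmetic set out in the detailed description (add-and-divide-by-two for half-pixel interpolation and for B frame averaging, and a straight signed-plus-unsigned add with clipping) can be modelled in software as the minimal sketch below. It is not the coprocessor implementation and forms no part of the claims. The function names, the list-of-rows block layout, and the use of truncating division are assumptions made for the example; the description says only that the sums are divided by two, and a conforming decoder would follow whatever rounding the MPEG specification prescribes.

```python
# Software model of the bit BLT arithmetic described in the text.
# All names are hypothetical; a block is a list of rows of integers.


def sub_block(frame, x0, y0, size=8):
    """Copy the size x size block whose top-left corner is (x0, y0)."""
    return [row[x0:x0 + size] for row in frame[y0:y0 + size]]


def blt_average(src, dst):
    """Add src pixels to dst pixels, divide each sum by two, write over dst.

    Assumption: truncating integer division stands in for "divided by two".
    """
    for y, row in enumerate(src):
        for x, pixel in enumerate(row):
            dst[y][x] = (pixel + dst[y][x]) // 2
    return dst


def blt_add_clipped(src, dst, lo=0, hi=255):
    """Straight add of unsigned reference pixels (src) to signed error
    terms (dst), clipped to [lo, hi] and written over dst."""
    for y, row in enumerate(src):
        for x, pixel in enumerate(row):
            dst[y][x] = max(lo, min(hi, pixel + dst[y][x]))
    return dst


def half_pixel_x(frame, x0, y0, size=8):
    """Reference block R lying half a pixel to the right of (x0, y0):
    the average of blocks A and B offset by one pixel in X (the FIG. 3 case)."""
    a = sub_block(frame, x0, y0, size)
    b = sub_block(frame, x0 + 1, y0, size)
    return blt_average(a, b)
```

Under the same assumptions, B frame averaging would be `blt_average(prior_ref, future_ref)`, and the final reconstruction would be `blt_add_clipped(reference, error)`, optionally with `lo=16, hi=240` for the narrower range mentioned in the description.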