WO2003036981A1 - Spatial scalable compression - Google Patents

Spatial scalable compression

Publication number
WO2003036981A1
Authority
WO
WIPO (PCT)
Prior art keywords
stream
streams
encoder
base
enhancement
Application number
PCT/IB2002/004370
Inventor
Wilhelmus H. A. Bruls
Reinier B. M. Klein Gunnewiek
Original Assignee
Koninklijke Philips Electronics N.V.
Application filed by Koninklijke Philips Electronics N.V.
Priority to KR10-2004-7006228A (publication KR20040047977A)
Priority to US10/493,267 (publication US20050002458A1)
Priority to EP02777621A (publication EP1442606A1)
Priority to JP2003539340A (publication JP2005507587A)
Publication of WO2003036981A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12: Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/124: Quantisation
    • H04N19/132: Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/134: Adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136: Incoming video signal characteristics or properties
    • H04N19/137: Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/146: Data rate or code amount at the encoder output
    • H04N19/152: Data rate or code amount at the encoder output by measuring the fullness of the transmission buffer
    • H04N19/169: Adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: The coding unit being an image region, e.g. an object
    • H04N19/172: The image region being a picture, frame or field
    • H04N19/187: The coding unit being a scalable video layer
    • H04N19/189: Adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/192: The adaptation method, adaptation tool or adaptation type being iterative or recursive
    • H04N19/196: The adaptation method, adaptation tool or adaptation type being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
    • H04N19/198: Computation of encoding parameters including smoothing of a sequence of encoding parameters, e.g. by averaging, by choice of the maximum, minimum or median value
    • H04N19/30: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/33: Hierarchical techniques, e.g. scalability, in the spatial domain
    • H04N19/44: Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503: Predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation
    • H04N19/513: Processing of motion vectors
    • H04N19/517: Processing of motion vectors by encoding
    • H04N19/577: Motion compensation with bidirectional frame interpolation, i.e. using B-pictures
    • H04N19/59: Predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
    • H04N19/60: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61: Transform coding in combination with predictive coding
    • H04N19/625: Transform coding using discrete cosine transform [DCT]

Definitions

  • the invention relates to a video encoder, and more particularly to a video encoder which uses spatial scalable compression schemes to produce a plurality of base streams and a plurality of enhancement streams.
  • each digital image frame is a still image formed from an array of pixels according to the display resolution of a particular system.
  • the amounts of raw digital information included in high resolution video sequences are massive.
  • compression schemes are used to compress the data.
  • Various video compression standards or processes have been established, including MPEG-2, MPEG-4, H.263 and H.264.
  • Many applications are enabled where video is available at various resolutions and/or qualities in one stream. Methods to accomplish this are loosely referred to as scalability techniques.
  • the first is scalability on the time axis, often referred to as temporal scalability.
  • secondly, there is scalability on the quality axis, often referred to as signal-to-noise scalability or fine-grain scalability.
  • the third axis is the resolution axis (number of pixels in image) often referred to as spatial scalability or layered coding.
  • in layered coding, the bitstream is divided into two or more bitstreams, or layers. The layers can be combined to form a single high quality signal.
  • the base layer may provide a lower quality video signal
  • the enhancement layer provides additional information that can enhance the base layer image.
  • FIG. 1 illustrates a block diagram of an encoder 100 which supports MPEG-2/MPEG-4 spatial scalability.
  • the encoder 100 comprises a base encoder 112 and an enhancement encoder 114.
  • the base encoder is comprised of a low pass filter and downsampler 120, a motion estimator 122, a motion compensator 124, an orthogonal transform (e.g., Discrete Cosine Transform (DCT)) circuit 130, a quantizer 132, a variable length coder 134, a bitrate control circuit 135, an inverse quantizer 138, an inverse transform circuit 140, switches 128, 144, and an interpolate and upsample circuit 150.
  • DCT: Discrete Cosine Transform
  • the enhancement encoder 114 comprises a motion estimator 154, a motion compensator 155, a selector 156, an orthogonal transform (e.g., Discrete Cosine Transform (DCT)) circuit 158, a quantizer 160, a variable length coder 162, a bitrate control circuit 164, an inverse quantizer 166, an inverse transform circuit 168, switches 170 and 172.
  • the bitrate of the base layer and the enhancement layer together for a sequence is greater than the bitrate of the same sequence coded at once.
  • Figure 2 illustrates another known encoder 200 proposed by DemoGrafx.
  • the encoder is comprised of substantially the same components as the encoder 100 and the operation of each is substantially the same so the individual components will not be described.
  • the residue difference between the input block and the upsampled output from the upsampler 150 is inputted into a motion estimator 154.
  • the scaled motion vectors from the base layer are used in the motion estimator 154 as indicated by the dashed line in Figure 2.
  • this arrangement does not significantly overcome the problems of the arrangement illustrated in Figure 1.
  • an apparatus for efficiently performing spatial scalable compression of an input video stream is disclosed.
  • a base encoder encodes a base encoder stream. Modifying means modifies content of the base encoder stream to create a plurality of base streams.
  • An enhancement encoder encodes an enhancement encoder stream. Modifying means modifies content of the enhancement encoder stream to create a plurality of enhancement streams.
  • a method and apparatus for providing spatial scalable compression of an input video stream is disclosed.
  • the input video stream is downsampled to reduce the resolution of the video stream.
  • the downsampled video stream is encoded to produce a base encoder stream.
  • a plurality of base streams are created from the base encoder stream.
  • the base encoder stream is decoded and upconverted to produce a reconstructed video stream.
  • the expected motion between frames from the input video stream and the reconstructed video stream is estimated, and motion vectors for each frame of the received streams are calculated based upon an upscaled base layer plus enhancement layer.
  • the reconstructed video stream is subtracted from the video stream to produce a residual stream.
  • a predicted stream is calculated using the motion vectors in a motion compensation unit.
  • the predicted stream is subtracted from the residual stream.
  • the resulting residual stream is encoded and an enhancement encoder stream is outputted.
  • a plurality of enhancement streams are created from the enhancement encoder stream.
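The layered method summarized above can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the patented encoder: the signal is one-dimensional, pair averaging stands in for the low pass filter and downsampler, sample repetition stands in for the upconverter, uniform quantization stands in for the complete base encode and decode, and motion estimation and compensation are omitted. All function names and the `base_step` parameter are hypothetical.

```python
def downsample(signal):
    """Halve the resolution by averaging neighbouring pairs (crude low-pass + decimate)."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]


def upsample(signal):
    """Double the resolution by sample repetition (stand-in for an interpolator)."""
    out = []
    for sample in signal:
        out.extend([sample, sample])
    return out


def lossy_code(signal, step):
    """Stand-in for base encoding followed by decoding: uniform quantization."""
    return [round(sample / step) * step for sample in signal]


def encode_layers(frame, base_step=8):
    """Return the decoded base layer and the residual fed to the enhancement encoder."""
    base = lossy_code(downsample(frame), base_step)
    reconstructed = upsample(base)  # reconstructed stream at input resolution
    residual = [x - r for x, r in zip(frame, reconstructed)]
    return base, residual


def decode_layers(base, residual):
    """Combine the upconverted base layer with the enhancement residual."""
    return [r + e for r, e in zip(upsample(base), residual)]


frame = [10, 12, 40, 44, 90, 91, 30, 20]
base, residual = encode_layers(frame)
restored = decode_layers(base, residual)
```

In this sketch the residual is kept lossless, so combining both layers restores the input exactly; in a real encoder the residual would itself be transformed, quantized and variable length coded.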
  • a method and apparatus for decoding a plurality of coded video signals is disclosed.
  • Each of the video streams is decoded and then the video streams are combined.
  • An inverse quantization operation is performed on quantization coefficients in the decoded video streams to produce DCT coefficients.
  • An inverse DCT operation is performed on the DCT coefficients to produce a first signal.
  • Predicted pictures are produced in a motion compensator and the first signal and the predicted pictures are combined to produce an output signal.
  • Figure 1 is a block schematic representation of a known encoder with spatial scalability
  • Figure 2 is a block schematic representation of a known encoder with spatial scalability
  • Figure 3 is a block schematic representation of an encoder with spatial scalability according to one embodiment of the invention
  • Figure 4 illustrates a modifying device with attenuators in series according to one embodiment of the invention
  • Figure 5 illustrates a modifying device with attenuators in cascade according to one embodiment of the invention
  • Figure 6 illustrates a decoder according to one embodiment of the invention.
  • FIG. 3 is a schematic diagram of an encoder according to one embodiment of the invention.
  • the depicted encoding system 300 accomplishes layered compression, whereby a portion of the channel is used for providing a plurality of lower resolution base layers and the remaining portion is used for transmitting a plurality of enhancement layers, whereby various base layers and base and enhancement layers can be combined to create video streams of differing quality levels. It will be understood by those skilled in the art that other encoding arrangements can also be used to create multilayered base and enhancement video streams and the invention is not limited thereto.
  • the encoder 300 comprises a base encoder 312 and an enhancement encoder 314.
  • the base encoder is comprised of a low pass filter and downsampler 320, a motion estimator 322, a motion compensator 324, an orthogonal transform (e.g., Discrete Cosine Transform (DCT)) circuit 330, a quantizer 332, a variable length coder (VLC) 334, a bitrate control circuit 335, an inverse quantizer 338, an inverse transform circuit 340, switches 328, 344, and an interpolate and upsample circuit 350.
  • VLC: variable length coder
  • An input video block 316 is split by a splitter 318 and sent to both the base encoder 312 and the enhancement encoder 314.
  • the input block is inputted into a low pass filter and downsampler 320.
  • the low pass filter reduces the resolution of the video block which is then fed to the motion estimator 322.
  • the motion estimator 322 processes picture data of each frame as an I-picture, a P-picture, or as a B-picture.
  • Each of the pictures of the sequentially entered frames is processed as one of the I-, P-, or B-pictures in a pre-set manner, such as in the sequence of I, B, P, B, P,..., B, P.
  • the motion estimator 322 refers to a pre-set reference frame in a series of pictures stored in a frame memory (not illustrated) and detects the motion vector of a macro-block, that is, a small block of 16 pixels by 16 lines of the frame being encoded, by pattern matching (block matching) between the macro-block and the reference frame.
  • there are four picture prediction modes, that is, intra-coding (intra-frame coding), forward predictive coding, backward predictive coding, and bi-directional predictive coding.
  • An I-picture is an intra-coded picture
  • a P-picture is an intra-coded or forward predictive coded or backward predictive coded picture
  • a B-picture is an intra-coded, a forward predictive coded, or a bi-directional predictive-coded picture.
  • the motion estimator 322 performs forward prediction on a P-picture to detect its motion vector. Additionally, the motion estimator 322 performs forward prediction, backward prediction, and bi-directional prediction for a B-picture to detect the respective motion vectors. In a known manner, the motion estimator 322 searches, in the frame memory, for a block of pixels which most resembles the current input block of pixels. Various search algorithms are known in the art. They are generally based on evaluating the mean absolute difference (MAD) or the mean square error (MSE) between the pixels of the current input block and those of the candidate block. The candidate block having the least MAD or MSE is then selected to be the motion-compensated prediction block. Its relative location with respect to the location of the current input block is the motion vector.
  • MAD: mean absolute difference
  • MSE: mean square error
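The block search described above can be sketched as an exhaustive (full) search that minimizes the mean absolute difference. This is a minimal illustration, not the search used by the motion estimator 322: the function names, the tiny 2 by 2 block size (a macro-block would be 16 by 16) and the search range are assumptions chosen to keep the example short.

```python
def mad(block_a, block_b):
    """Mean absolute difference between two equally sized blocks."""
    total = sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )
    return total / (len(block_a) * len(block_a[0]))


def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def full_search(current, reference, top, left, size, search_range):
    """Return ((dy, dx), cost) of the reference block with the least MAD."""
    target = get_block(current, top, left, size)
    best_vector, best_cost = (0, 0), float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + size > len(reference) or x + size > len(reference[0]):
                continue
            cost = mad(target, get_block(reference, y, x, size))
            if cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost


def make_frame(size, top, left, patch):
    """Build an all-zero frame containing one distinctive patch (test data only)."""
    frame = [[0] * size for _ in range(size)]
    for r, row in enumerate(patch):
        for c, value in enumerate(row):
            frame[top + r][left + c] = value
    return frame


patch = [[9, 8], [7, 6]]
reference = make_frame(8, 3, 4, patch)  # patch at row 3, column 4
current = make_frame(8, 2, 2, patch)    # same patch at row 2, column 2
vector, cost = full_search(current, reference, top=2, left=2, size=2, search_range=2)
```

Replacing `mad` with a mean square error function would give the MSE variant of the same search.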
  • the motion compensator 324 may read out encoded and already locally decoded picture data stored in the frame memory in accordance with the prediction mode and the motion vector and may supply the read-out data as a prediction picture to arithmetic unit 325 and switch 344.
  • the arithmetic unit 325 also receives the input block and calculates the difference between the input block and the prediction picture from the motion compensator 324. The difference value is then supplied to the DCT circuit 330. If only the prediction mode is received from the motion estimator 322, that is, if the prediction mode is the intra-coding mode, the motion compensator 324 may not output a prediction picture.
  • the arithmetic unit 325 may not perform the above-described processing, but instead may directly output the input block to the DCT circuit 330.
  • the DCT circuit 330 performs DCT processing on the output signal from the arithmetic unit 325 so as to obtain DCT coefficients which are supplied to a quantizer 332.
  • the quantizer 332 sets a quantization step (quantization scale) in accordance with the data storage quantity in a buffer (not illustrated) received as a feedback and quantizes the DCT coefficients from the DCT circuit 330 using the quantization step.
  • the quantized DCT coefficients are supplied to the VLC unit 334 along with the set quantization step.
  • the VLC unit 334 converts the quantization coefficients supplied from the quantizer 332 into a variable length code, such as a Huffman code, in accordance with the quantization step supplied from the quantizer 332.
  • the resulting converted quantization coefficients are outputted to a buffer not illustrated.
  • the quantization coefficients and the quantization step are also supplied to an inverse quantizer 338 which dequantizes the quantization coefficients in accordance with the quantization step so as to convert the same to DCT coefficients.
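The quantization and dequantization steps described above can be sketched as follows. This is a minimal illustration assuming plain uniform scalar quantization; the mapping from buffer fullness to quantization step in `choose_step` is purely hypothetical, since the text only states that the step is set according to the amount of data stored in the buffer.

```python
def choose_step(buffer_fullness, min_step=2, max_step=62):
    """Hypothetical rate control: a fuller buffer (0.0 to 1.0) gives a coarser step."""
    return min_step + round(buffer_fullness * (max_step - min_step))


def quantize(coefficients, step):
    """Divide each DCT coefficient by the step and round to an integer level."""
    return [round(c / step) for c in coefficients]


def dequantize(levels, step):
    """Inverse quantizer: scale the levels back to approximate DCT coefficients."""
    return [q * step for q in levels]


coefficients = [101, -37, 13, 3]
levels = quantize(coefficients, step=8)
approximation = dequantize(levels, step=8)
```

The difference between `coefficients` and `approximation` is the quantization error, which grows as the step gets coarser.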
  • the DCT coefficients are supplied to the inverse DCT unit 340 which performs inverse DCT on the DCT coefficients.
  • the obtained inverse DCT coefficients are then supplied to the arithmetic unit 348.
  • the arithmetic unit 348 receives the inverse DCT coefficients from the inverse DCT unit 340 and the data from the motion compensator 324, depending on the position of switch 344.
  • the arithmetic unit 348 adds the signal (prediction residuals) from the inverse DCT unit 340 to the predicted picture from the motion compensator 324 to locally decode the original picture. However, if the prediction mode indicates intra-coding, the output of the inverse DCT unit 340 may be directly fed to the frame memory.
  • the decoded picture obtained by the arithmetic unit 348 is sent to and stored in the frame memory so as to be used later as a reference picture for an inter-coded picture, forward predictive coded picture, backward predictive coded picture, or a bi-directional predictive coded picture.
  • the quantization coefficients from the quantizer 332 are also applied to a modifying means 400.
  • the modifying device 400 comprises a plurality of attenuation steps which can be arranged in series as illustrated in Figure 4 or in cascade or parallel as illustrated in Figure 5.
  • the quantization coefficients from the quantizer 332 are applied to an attenuator 401.
  • the signal is then attenuated by the attenuator 401 which results in attenuated DCT coefficients carried by a signal 407.
  • a second attenuator 403 attenuates the amplitude of the DCT coefficients carried by the signal 407 and delivers new attenuated coefficients, carried by signal 413, that are variable length coded by a variable length coder 422 for generating a first base video stream BaseBase0.
  • the attenuators 401 and 403 are composed of an inverse quantizer 402 and 408, respectively, a weighting device 404 and 410, respectively, followed in series by a quantizer 406 and 412, respectively.
  • the quantization coefficients from the quantizer 332 are inverse quantized by the inverse quantizer 402.
  • the weighting is performed by multiplying each DCT block by an 8*8 weighting matrix, so that each DCT coefficient is multiplied by the corresponding weighting factor in the matrix, the result of each multiplication being rounded to the nearest integer. The weighting matrix is filled with values whose amplitudes are between 0 and 1, set for example to non-uniform values close to 1 for low frequencies and close to 0 for high frequencies, or to uniform values so that all coefficients in the 8*8 DCT block are equally attenuated.
  • the quantization step consists of dividing weighted DCT coefficients by a new quantization factor for delivering quantized DCT coefficients, said quantization factor being the same for all coefficients of all 8*8 blocks composing a macroblock.
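One attenuator stage (inverse quantizer, weighting device, quantizer) can be sketched as below. This is an illustration under assumptions, not the patented circuit: the linear weight formula, the step sizes and the function names are hypothetical, chosen only so that weights are close to 1 at low frequencies and close to 0 at high frequencies.

```python
def make_weights(size=8):
    """Hypothetical non-uniform matrix: 1.0 at the DC position, 0.0 at the highest frequency."""
    span = 2 * (size - 1)
    return [[1 - (u + v) / span for v in range(size)] for u in range(size)]


def attenuate_block(levels, in_step, weights, out_step):
    """Inverse quantize, weight (rounding each product), then requantize one block."""
    out = []
    for level_row, weight_row in zip(levels, weights):
        row = []
        for level, weight in zip(level_row, weight_row):
            coefficient = level * in_step          # inverse quantizer
            weighted = round(coefficient * weight)  # weighting device
            row.append(round(weighted / out_step))  # quantizer with one factor per block
        out.append(row)
    return out


block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[7][7] = 50, 10, 4
attenuated = attenuate_block(block, in_step=4, weights=make_weights(), out_step=8)
```

Passing a matrix filled with a single value instead of `make_weights()` would give the uniform attenuation mentioned in the text.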
  • the coding error 415 relative to the attenuator 401 is generated by subtracting signal 407 from a signal from the quantizer 332 by means of a subtraction unit 414.
  • the coding error 415 is then variable length coded by a variable length coder 416 for generating a base enhancement video stream BaseEnh2.
  • the coding error 419 relative to the attenuator 403 is generated by subtracting a signal 413 from signal 407 by means of a subtraction unit 418.
  • the coding error 419 is then variable length coded by a variable length encoder 420 for generating a second base enhancement video stream BaseEnh1.
  • the minimum quality base resolution would be provided by the video stream BaseBase0.
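The series arrangement of two attenuators and two subtraction units can be sketched as follows. This is a simplified illustration: variable length coding is omitted, the attenuator is reduced to a hypothetical uniform scaling of the coefficient levels, and the stream names are read as BaseBase0, BaseEnh1 and BaseEnh2. It assumes the signals are directly subtractable, as the subtraction units in the text imply.

```python
def attenuate(levels, factor):
    """Hypothetical uniform attenuator: scale every level and round."""
    return [round(level * factor) for level in levels]


def split_streams(levels, factor_1=0.5, factor_2=0.5):
    """Produce one minimum-quality stream and two coding-error streams."""
    signal_407 = attenuate(levels, factor_1)      # output of attenuator 401
    signal_413 = attenuate(signal_407, factor_2)  # output of attenuator 403
    error_415 = [a - b for a, b in zip(levels, signal_407)]      # subtraction unit 414
    error_419 = [a - b for a, b in zip(signal_407, signal_413)]  # subtraction unit 418
    return {"BaseBase0": signal_413, "BaseEnh1": error_419, "BaseEnh2": error_415}


def combine(streams, count):
    """Sum the first `count` streams to rebuild a signal of the matching quality."""
    names = ["BaseBase0", "BaseEnh1", "BaseEnh2"][:count]
    return [sum(values) for values in zip(*(streams[name] for name in names))]


levels = [40, -12, 8, 4, 0]
streams = split_streams(levels)
lowest = combine(streams, 1)
middle = combine(streams, 2)
full = combine(streams, 3)
```

Because each error stream is an exact difference, adding streams back in order recovers signal 407 and then the original quantizer output, which is what lets a receiver pick a quality level by how many streams it combines.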
  • the enhancement encoder 314 comprises a motion estimator 354, a motion compensator 356, a DCT circuit 368, a quantizer 370, a VLC unit 372, a bitrate controller 374, an inverse quantizer 376, an inverse DCT circuit 378, switches 366 and 382, subtractors 358 and 364, and adders 380 and 388.
  • the enhancement encoder 314 may also include DC-offsets 360 and 384, adder 362 and subtractor 386. The operation of many of these components is similar to the operation of similar components in the base encoder 312 and will not be described in detail.
  • the output of the arithmetic unit 348 is also supplied to the upsampler 350 which generally reconstructs the filtered out resolution from the decoded video stream and provides a video data stream having substantially the same resolution as the high-resolution input. However, because of the filtering and losses resulting from the compression and decompression, certain errors are present in the reconstructed stream.
  • the errors are determined in the subtraction unit 358 by subtracting the reconstructed high-resolution stream from the original, unmodified high resolution stream. According to one embodiment of the invention illustrated in Figure 3, the original unmodified high-resolution stream is also provided to the motion estimator 354.
  • the reconstructed high-resolution stream is also provided to an adder 388 which adds the output from the inverse DCT 378 (possibly modified by the output of the motion compensator 356 depending on the position of the switch 382).
  • the output of the adder 388 is supplied to the motion estimator 354.
  • the motion estimation is performed on the upscaled base layer plus the enhancement layer instead of the residual difference between the original high-resolution stream and the reconstructed high-resolution stream.
  • This motion estimation produces motion vectors that track the actual motion better than the vectors produced by the known systems of Figures 1 and 2. This leads to a perceptually better picture quality especially for consumer applications which have lower bit rates than professional applications.
  • a DC-offset operation followed by a clipping operation can be introduced into the enhancement encoder 314, wherein the DC-offset value 360 is added by adder 362 to the residual signal output from the subtraction unit 358.
  • This optional DC-offset and clipping operation allows the use of existing standards, e.g., MPEG, for the enhancement encoder where the pixel values are in a predetermined range, e.g., 0...255.
  • the residual signal is normally concentrated around zero.
  • the concentration of samples can be shifted to the middle of the range, e.g., 128 for 8 bit video samples.
  • the advantage of this addition is that the standard components of the encoder for the enhancement layer can be used and result in a cost efficient (re-use of IP blocks) solution.
  • the various enhancement layer video streams are created in a similar manner as the creation of the multiple base video streams described above.
  • the quantization coefficients from the quantizer 370 are also applied to the modifying device 450.
  • the modifying device 450 may have the same elements as the modifying device 400 illustrated in Figure 4, and in the following description the same reference numerals will be used for like elements.
  • the quantization coefficients from the quantizer 370 are applied to the attenuator 401.
  • the signal is then attenuated by the attenuator 401 which results in attenuated DCT coefficients carried by a signal 407.
  • a second attenuator 403 Attenuates the amplitude of the DCT coefficients carried by the signal 407 and delivers new attenuated coefficients carried by signal 413, that are variable length coded by a variable length coder 422 for generating a first enhancement video stream EnhBaseO.
  • the attenuators 401 and 403 are composed of an inverse quantizer 402 and 408, respectively, a weighting device 404 and 4410, respectively, followed in series by a quantizer 406 and 412, respectively.
  • the weighting is performed by a 8*8 weighting matrix multiplied to DCT blocks, each DCT coefficient being thus multiplied by a weighting factor contained in the matrix, the results of each multiplication being rounded to the nearest integer, weighting matrix being filled by values which amplitude are between 0 and 1, set for example to non-uniform values close to 1 for low frequential values and close to 0 for high frequential values, or to uniform values so that all coefficients in the 8*8 DCT block are equally attenuated.
  • the quantization step consists of dividing weighted DCT coefficients by a new quantization factor for delivering quantized DCT coefficients , said quantization factor being the same for all coefficients of all 8*8 blocks composing a macroblock.
  • the coding error 415 relative to the attenuator 401 is generated by subtracting signal 407 from a signal from the quantizer 370 by means of a subtraction unit 414.
  • the coding error 415 is then variable length coded by a variable length coder 416 for generating a second enhancement video stream EnhEnh2.
  • the coding error 419 relative to the attenuator 403 is generated by subtracting a signal 413 from signal 407 by means of a subtraction unit 418.
  • the coding error 419 is then variable length coded by a variable length encoder 420 for generating a third base enhancement video stream EnhEnhl .
  • the minimum quality full resolution would be provided by adding the video stream EnhBaseO to the high quality base resolution video stream.
  • a medium quality full resolution would be provided by combining the video streams EnhBaseO and EnhEnhl with the high quality base resolution.
  • a high quality full resolution would be provided by combining the video streams EnhBaseO, EnhEnhl and EnhEnh2 with the high quality base resolution.
  • Figure 5 illustrates a modifying device wherein the attenuators are connected in cascade or parallel. It will be understood that the modifying device 500 can be used in both the base layer and the enhancement layer as a substitute for modifying devices 400 and 450.
  • the quantization coefficients from the quantizer 332 (or quantizer 370) are supplied to the first attenuator 501.
  • the attenuator 501 comprises an inverse quantizer 502, a weighting device 504 and a quantizer 506.
  • the quantization coefficients are inverse quantized in the inverse quantizer 502, then weighted and requantized, as described above with respect to Figure 4, in the weighting device 504 and the quantizer 506.
  • the attenuated DCT coefficients carried by a signal 513 are then coded in a variable length coder 514 to produce a first base (enhancement) stream.
  • the coding error 517 of the attenuator 501 is generated by subtracting the signal 517 from the signal from the quantizer 332 (quantizer 370) by means of a subtraction unit 516.
  • the coding error is applied to the second attenuator 503 which is comprised of an inverse quantizer 508, a weighting device 510 and a quantizer 512.
  • the attenuated signal 519 is encoded by a variable length coder 520 which produces a second base(or enhancement) stream.
  • the coding error 523 of the attenuator 503 is generated by subtracting the signal 519 from the signal 517 by means of a subtraction unit 522.
  • the coding error 523 is encoded by a variable length coder 524 which produces a third base (enhancement) stream.
  • Figure 6 illustrates a decoder according to one embodiment of the invention for decoding the multiple base or enhancement streams produced by the modifying devices.
  • the multiple base (enhancement) streams are decoded by a plurality of variable length decoders 602, 604 and 606.
  • the decoded streams are then added together in an arithmetic unit 608.
  • the decoded quantization coefficients in the combined stream are supplied to an inverse quantizer 610 which dequantizes the quantization coefficient in accordance with the quantization step so as to convert the quantization coefficients into DCT coefficients.
  • the DCT coefficients are supplied to the inverse DCT unit 612 which performs inverse DCT on the DCT coefficients.
  • the obtained inverse DCT coefficients are then supplied to the arithmetic unit 614.
  • the arithmetic unit 614 receives the inverse DCT coefficients from the inverse DCT unit 612 and data (produced in a known manner) from a motion compensator 616.
  • the arithmetic unit 614 sums the stream from the inverse DCT unit 612 to the predicted picture from the motion compensator 616 to produce the decoded base (or enhancement) stream.
  • the decoded base and enhancement streams can be combined in a known manner to create the decoded video output.

Abstract

An apparatus for efficiently performing spatial scalable compression of an input video stream is disclosed. A base encoder encodes a base encoder stream. Modifying means modifies content of the base encoder stream to create a plurality of base streams. An enhancement encoder encodes an enhancement encoder stream. Modifying means modifies content of the enhancement encoder stream to create a plurality of enhancement streams.

Description

Spatial scalable compression
FIELD OF THE INVENTION
The invention relates to a video encoder, and more particularly to a video encoder which uses spatial scalable compression schemes to produce a plurality of base streams and a plurality of enhancement streams.
BACKGROUND OF THE INVENTION
Because of the massive amounts of data inherent in digital video, the transmission of full-motion, high-definition digital video signals is a significant problem in the development of high-definition television. More particularly, each digital image frame is a still image formed from an array of pixels according to the display resolution of a particular system. As a result, the amount of raw digital information included in high-resolution video sequences is massive. In order to reduce the amount of data that must be sent, compression schemes are used to compress the data. Various video compression standards or processes have been established, including MPEG-2, MPEG-4, H.263 and H.264.
Many applications are enabled where video is available at various resolutions and/or qualities in one stream. Methods to accomplish this are loosely referred to as scalability techniques. There are three axes on which one can deploy scalability. The first is scalability on the time axis, often referred to as temporal scalability. The second is scalability on the quality axis, often referred to as signal-to-noise scalability or fine-grain scalability. The third is the resolution axis (number of pixels in the image), often referred to as spatial scalability or layered coding.
In layered coding, the bitstream is divided into two or more bitstreams, or layers. The layers can be combined to form a single high quality signal. For example, the base layer may provide a lower quality video signal, while the enhancement layer provides additional information that can enhance the base layer image. In particular, spatial scalability can provide compatibility between different video standards or decoder capabilities. With spatial scalability, the base layer video may have a lower resolution than the input video sequence, in which case the enhancement layer carries information which can restore the resolution of the base layer to the input sequence level.
Most video compression standards support spatial scalability. Figure 1 illustrates a block diagram of an encoder 100 which supports MPEG-2/MPEG-4 spatial scalability. The encoder 100 comprises a base encoder 112 and an enhancement encoder 114. The base encoder is comprised of a low pass filter and downsampler 120, a motion estimator 122, a motion compensator 124, an orthogonal transform (e.g., Discrete Cosine Transform (DCT)) circuit 130, a quantizer 132, a variable length coder 134, a bitrate control circuit 135, an inverse quantizer 138, an inverse transform circuit 140, switches 128, 144, and an interpolate and upsample circuit 150. The enhancement encoder 114 comprises a motion estimator 154, a motion compensator 155, a selector 156, an orthogonal transform (e.g., Discrete Cosine Transform (DCT)) circuit 158, a quantizer 160, a variable length coder 162, a bitrate control circuit 164, an inverse quantizer 166, an inverse transform circuit 168, switches 170 and 172. The operations of the individual components are well known in the art and will not be described in detail.
Unfortunately, the coding efficiency of this layered coding scheme is not very good. Indeed, for a given picture quality, the bitrate of the base layer and the enhancement layer together for a sequence is greater than the bitrate of the same sequence coded at once.
Figure 2 illustrates another known encoder 200 proposed by DemoGrafx. The encoder is comprised of substantially the same components as the encoder 100 and the operation of each is substantially the same so the individual components will not be described. In this configuration, the residue difference between the input block and the upsampled output from the upsampler 150 is inputted into a motion estimator 154. To guide/help the motion estimation of the enhancement encoder, the scaled motion vectors from the base layer are used in the motion estimator 154 as indicated by the dashed line in Figure 2. However, this arrangement does not significantly overcome the problems of the arrangement illustrated in Figure 1.
SUMMARY OF THE INVENTION
It is an object of the invention to overcome at least part of the above-described deficiencies of the known spatial scalability schemes by providing a spatial scalable compression scheme which produces a plurality of base streams with differing quality levels and a plurality of enhancement streams with differing quality levels.
According to one embodiment of the invention, an apparatus for efficiently performing spatial scalable compression of an input video stream is disclosed. A base encoder encodes a base encoder stream. Modifying means modifies content of the base encoder stream to create a plurality of base streams. An enhancement encoder encodes an enhancement encoder stream. Modifying means modifies content of the enhancement encoder stream to create a plurality of enhancement streams.
According to another embodiment of the invention, a method and apparatus for providing spatial scalable compression of an input video stream is disclosed. The input video stream is downsampled to reduce the resolution of the video stream. The downsampled video stream is encoded to produce a base encoder stream. A plurality of base streams are created from the base encoder stream. The base encoder stream is decoded and upconverted to produce a reconstructed video stream. The expected motion between frames from the input video stream and the reconstructed video stream is estimated, and motion vectors for each frame of the received streams are calculated based upon an upscaled base layer plus enhancement layer. The reconstructed video stream is subtracted from the video stream to produce a residual stream. A predicted stream is calculated using the motion vectors in a motion compensation unit. The predicted stream is subtracted from the residual stream. The resulting residual stream is encoded and an enhancement encoder stream is outputted. A plurality of enhancement streams are created from the enhancement encoder stream.
According to another embodiment of the invention, a method and apparatus for decoding a plurality of coded video signals is disclosed. Each of the video streams is decoded and then the video streams are combined. An inverse quantization operation is performed on quantization coefficients in the decoded video streams to produce DCT coefficients. An inverse DCT operation is performed on the DCT coefficients to produce a first signal. Predicted pictures are produced in a motion compensator and the first signal and the predicted pictures are combined to produce an output signal. These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments described hereafter.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention will now be described, by way of example, with reference to the accompanying drawings, wherein:
Figure 1 is a block schematic representation of a known encoder with spatial scalability;
Figure 2 is a block schematic representation of a known encoder with spatial scalability;
Figure 3 is a block schematic representation of an encoder with spatial scalability according to one embodiment of the invention;
Figure 4 illustrates a modifying device with attenuators in series according to one embodiment of the invention;
Figure 5 illustrates a modifying device with attenuators in cascade according to one embodiment of the invention; and
Figure 6 illustrates a decoder according to one embodiment of the invention.
DETAILED DESCRIPTION OF THE INVENTION
Figure 3 is a schematic diagram of an encoder according to one embodiment of the invention. The depicted encoding system 300 accomplishes layered compression, whereby a portion of the channel is used for providing a plurality of lower resolution base layers and the remaining portion is used for transmitting a plurality of enhancement layers, whereby various base layers and base and enhancement layers can be combined to create video streams of differing quality levels. It will be understood by those skilled in the art that other encoding arrangements can also be used to create multilayered base and enhancement video streams and the invention is not limited thereto.
The encoder 300 comprises a base encoder 312 and an enhancement encoder 314. The base encoder is comprised of a low pass filter and downsampler 320, a motion estimator 322, a motion compensator 324, an orthogonal transform (e.g., Discrete Cosine Transform (DCT)) circuit 330, a quantizer 332, a variable length coder (VLC) 334, a bitrate control circuit 335, an inverse quantizer 338, an inverse transform circuit 340, switches 328, 344, and an interpolate and upsample circuit 350.
An input video block 316 is split by a splitter 318 and sent to both the base encoder 312 and the enhancement encoder 314. In the base encoder 312, the input block is inputted into a low pass filter and downsampler 320. The low pass filter and downsampler 320 reduces the resolution of the video block, which is then fed to the motion estimator 322. The motion estimator 322 processes picture data of each frame as an I-picture, a P-picture, or a B-picture. Each of the pictures of the sequentially entered frames is processed as one of the I-, P-, or B-pictures in a pre-set manner, such as in the sequence of I, B, P, B, P,..., B, P. That is, the motion estimator 322 refers to a pre-set reference frame in a series of pictures stored in a frame memory (not illustrated) and detects the motion vector of a macro-block, that is, a small block of 16 pixels by 16 lines of the frame being encoded, by pattern matching (block matching) between the macro-block and the reference frame.
In MPEG, there are four picture prediction modes, that is, intra-coding (intra-frame coding), forward predictive coding, backward predictive coding, and bi-directional predictive coding. An I-picture is an intra-coded picture, a P-picture is an intra-coded or forward predictive coded picture, and a B-picture is an intra-coded, forward predictive coded, backward predictive coded, or bi-directional predictive coded picture.
The motion estimator 322 performs forward prediction on a P-picture to detect its motion vector. Additionally, the motion estimator 322 performs forward prediction, backward prediction, and bi-directional prediction for a B-picture to detect the respective motion vectors. In a known manner, the motion estimator 322 searches, in the frame memory, for a block of pixels which most resembles the current input block of pixels. Various search algorithms are known in the art. They are generally based on evaluating the mean absolute difference (MAD) or the mean square error (MSE) between the pixels of the current input block and those of the candidate block. The candidate block having the least MAD or MSE is then selected to be the motion-compensated prediction block. Its relative location with respect to the location of the current input block is the motion vector.
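The block-matching search described above can be illustrated with a minimal sketch. It assumes an exhaustive search over a small window and uses the mean absolute difference (MAD) as the matching criterion; the function names, the 4-pixel block size and the search range are hypothetical choices for illustration and are not taken from the specification, which uses 16 by 16 macro-blocks and does not prescribe a particular search algorithm.

```python
def mad(block_a, block_b):
    """Mean absolute difference between two equally sized 2-D blocks."""
    total = 0
    count = 0
    for row_a, row_b in zip(block_a, block_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def find_motion_vector(current, reference, top, left, size=4, search=2):
    """Exhaustive block matching; returns ((dy, dx), lowest_mad)."""
    target = get_block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidate blocks that fall outside the reference frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = mad(target, get_block(reference, y, x, size))
            if best is None or cost < best[1]:
                best = ((dy, dx), cost)
    return best
```

The returned displacement is the position of the best-matching candidate block relative to the current block, which is how the passage defines the motion vector. Replacing `mad` with a mean square error function would give the MSE variant.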
Upon receiving the prediction mode and the motion vector from the motion estimator 322, the motion compensator 324 may read out encoded and already locally decoded picture data stored in the frame memory in accordance with the prediction mode and the motion vector and may supply the read-out data as a prediction picture to arithmetic unit 325 and switch 344. The arithmetic unit 325 also receives the input block and calculates the difference between the input block and the prediction picture from the motion compensator 324. The difference value is then supplied to the DCT circuit 330. If only the prediction mode is received from the motion estimator 322, that is, if the prediction mode is the intra-coding mode, the motion compensator 324 may not output a prediction picture. In such a situation, the arithmetic unit 325 may not perform the above-described processing, but instead may directly output the input block to the DCT circuit 330.
The DCT circuit 330 performs DCT processing on the output signal from the arithmetic unit 325 so as to obtain DCT coefficients which are supplied to a quantizer 332. The quantizer 332 sets a quantization step (quantization scale) in accordance with the data storage quantity in a buffer (not illustrated) received as feedback and quantizes the DCT coefficients from the DCT circuit 330 using the quantization step. The quantized DCT coefficients are supplied to the VLC unit 334 along with the set quantization step. The VLC unit 334 converts the quantization coefficients supplied from the quantizer 332 into a variable length code, such as a Huffman code, in accordance with the quantization step supplied from the quantizer 332. The resulting converted quantization coefficients are outputted to a buffer (not illustrated).
The quantization coefficients and the quantization step are also supplied to an inverse quantizer 338 which dequantizes the quantization coefficients in accordance with the quantization step so as to convert the same to DCT coefficients. The DCT coefficients are supplied to the inverse DCT unit 340 which performs inverse DCT on the DCT coefficients. The obtained inverse DCT coefficients are then supplied to the arithmetic unit 348. The arithmetic unit 348 receives the inverse DCT coefficients from the inverse DCT unit 340 and the data from the motion compensator 324 depending on the position of switch 344. The arithmetic unit 348 adds the signal (prediction residuals) from the inverse DCT unit 340 to the predicted picture from the motion compensator 324 to locally decode the original picture. However, if the prediction mode indicates intra-coding, the output of the inverse DCT unit 340 may be directly fed to the frame memory. The decoded picture obtained by the arithmetic unit 348 is sent to and stored in the frame memory so as to be used later as a reference picture for an inter-coded picture, a forward predictive coded picture, a backward predictive coded picture, or a bi-directional predictive coded picture.
The quantization coefficients from the quantizer 332 are also applied to a modifying device 400. The modifying device 400 comprises a plurality of attenuation steps which can be arranged in series as illustrated in Figure 4 or in cascade or parallel as illustrated in Figure 5. As illustrated in Figure 4, the quantization coefficients from the quantizer 332 are applied to an attenuator 401, which produces attenuated DCT coefficients carried by a signal 407. In series with the attenuator 401, a second attenuator 403 attenuates the amplitude of the DCT coefficients carried by the signal 407 and delivers new attenuated coefficients carried by a signal 413, which are variable length coded by a variable length coder 422 to generate a first base video stream BaseBase0.
The attenuators 401 and 403 are composed of an inverse quantizer 402 and 408, respectively, a weighting device 404 and 410, respectively, followed in series by a quantizer 406 and 412, respectively. The quantization coefficients from the quantizer 332 are inverse quantized by the inverse quantizer 402. The weighting is performed by multiplying each DCT block by an 8*8 weighting matrix, so that each DCT coefficient is multiplied by the weighting factor at the corresponding position in the matrix, and the result of each multiplication is rounded to the nearest integer. The weighting matrix is filled with values whose amplitudes are between 0 and 1. These are set, for example, to non-uniform values close to 1 for low frequencies and close to 0 for high frequencies, or to uniform values so that all coefficients in the 8*8 DCT block are equally attenuated. The quantization step consists of dividing the weighted DCT coefficients by a new quantization factor to deliver quantized DCT coefficients, said quantization factor being the same for all coefficients of all 8*8 blocks composing a macroblock.
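A single attenuator of the kind just described can be sketched as follows. This is a simplified illustration under stated assumptions: inverse quantization is modelled as a plain multiplication by one step size, exact halves are rounded up, and the two weighting matrices are hypothetical examples of the "non-uniform" and "uniform" cases. None of the names or numeric values come from the specification.

```python
import math


def round_nearest(value):
    """Round to the nearest integer, with exact halves rounded up."""
    return math.floor(value + 0.5)


def attenuate_block(levels, old_step, weights, new_step):
    """Inverse quantize, weight, and requantize one 8x8 block of levels."""
    out = []
    for level_row, weight_row in zip(levels, weights):
        out_row = []
        for level, weight in zip(level_row, weight_row):
            coeff = level * old_step                  # inverse quantizer
            weighted = round_nearest(coeff * weight)  # weighting device
            out_row.append(round_nearest(weighted / new_step))  # quantizer
        out.append(out_row)
    return out


# Hypothetical non-uniform matrix: 1.0 at DC, falling to 0.0 at high frequencies.
LOW_PASS_WEIGHTS = [[max(0.0, 1.0 - (u + v) / 8) for u in range(8)]
                    for v in range(8)]

# Hypothetical uniform matrix: every coefficient is attenuated equally.
UNIFORM_WEIGHTS = [[0.5] * 8 for _ in range(8)]
```

With the non-uniform matrix the low-frequency coefficients pass almost unchanged while the highest frequencies are zeroed, which is what reduces the bit cost of the attenuated stream. With the uniform matrix every coefficient is scaled by the same factor.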
The coding error 415 relative to the attenuator 401 is generated by subtracting signal 407 from the signal from the quantizer 332 by means of a subtraction unit 414. The coding error 415 is then variable length coded by a variable length coder 416 to generate a base enhancement video stream BaseEnh2. The coding error 419 relative to the attenuator 403 is generated by subtracting signal 413 from signal 407 by means of a subtraction unit 418. The coding error 419 is then variable length coded by a variable length coder 420 to generate a second base enhancement video stream BaseEnh1.
In this example, the minimum quality base resolution would be provided by the video stream BaseBase0. A medium quality base resolution would be provided by combining the video stream BaseBase0 with the video stream BaseEnh1. A high quality base resolution would be provided by combining the video streams BaseBase0, BaseEnh1 and BaseEnh2.
The enhancement encoder 314 comprises a motion estimator 354, a motion compensator 356, a DCT circuit 368, a quantizer 370, a VLC unit 372, a bitrate controller 374, an inverse quantizer 376, an inverse DCT circuit 378, switches 366 and 382, subtractors 358 and 364, and adders 380 and 388. In addition, the enhancement encoder 314 may also include DC-offsets 360 and 384, adder 362 and subtractor 386. The operation of many of these components is similar to the operation of similar components in the base encoder 312 and will not be described in detail.
The output of the arithmetic unit 348 is also supplied to the upsampler 350 which generally reconstructs the filtered out resolution from the decoded video stream and provides a video data stream having substantially the same resolution as the high-resolution input. However, because of the filtering and losses resulting from the compression and decompression, certain errors are present in the reconstructed stream. The errors are determined in the subtraction unit 358 by subtracting the reconstructed high-resolution stream from the original, unmodified high-resolution stream. According to one embodiment of the invention illustrated in Figure 3, the original unmodified high-resolution stream is also provided to the motion estimator 354. The reconstructed high-resolution stream is also provided to an adder 388 which adds the output from the inverse DCT 378 (possibly modified by the output of the motion compensator 356 depending on the position of the switch 382). The output of the adder 388 is supplied to the motion estimator 354. As a result, the motion estimation is performed on the upscaled base layer plus the enhancement layer instead of the residual difference between the original high-resolution stream and the reconstructed high-resolution stream. This motion estimation produces motion vectors that track the actual motion better than the vectors produced by the known systems of Figures 1 and 2. This leads to a perceptually better picture quality, especially for consumer applications which have lower bit rates than professional applications.
Furthermore, a DC-offset operation followed by a clipping operation can be introduced into the enhancement encoder 314, wherein the DC-offset value 360 is added by adder 362 to the residual signal output from the subtraction unit 358. This optional DC-offset and clipping operation allows the use of existing standards, e.g., MPEG, for the enhancement encoder where the pixel values are in a predetermined range, e.g., 0...255. The residual signal is normally concentrated around zero. By adding a DC-offset value 360, the concentration of samples can be shifted to the middle of the range, e.g., 128 for 8 bit video samples. The advantage of this addition is that the standard components of the encoder for the enhancement layer can be used, resulting in a cost efficient (re-use of IP blocks) solution.
The various enhancement layer video streams are created in a similar manner as the creation of the multiple base video streams described above. The quantization coefficients from the quantizer 370 are also applied to the modifying device 450. The modifying device 450 may have the same elements as the modifying device 400 illustrated in Figure 4, and in the following description the same reference numerals will be used for like elements. The quantization coefficients from the quantizer 370 are applied to the attenuator 401, which produces attenuated DCT coefficients carried by a signal 407. In series with the attenuator 401, a second attenuator 403 attenuates the amplitude of the DCT coefficients carried by the signal 407 and delivers new attenuated coefficients carried by a signal 413, which are variable length coded by a variable length coder 422 to generate a first enhancement video stream EnhBase0.
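As a brief aside on the DC-offset and clipping operation introduced in this passage, the following sketch shows the arithmetic for 8 bit samples. The function names are hypothetical, and the inverse step is an assumption about how the offset would be removed again after decoding; the specification only states that an offset is added and the result is clipped.

```python
def offset_and_clip(residual, offset=128, low=0, high=255):
    """Shift a zero-centred residual into the pixel range and clip it."""
    return [min(high, max(low, sample + offset)) for sample in residual]


def remove_offset(samples, offset=128):
    """Assumed inverse step: subtract the same offset to recover the residual."""
    return [sample - offset for sample in samples]
```

Residual values near zero land near 128 and survive the round trip unchanged. Values below -128 or above 127 are clipped, so that part of the residual is lost, which is the price of staying inside the 0...255 range expected by a standard encoder.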
The attenuators 401 and 403 are composed of an inverse quantizer 402 and 408, respectively, a weighting device 404 and 410, respectively, followed in series by a quantizer 406 and 412, respectively. The weighting is performed by multiplying each DCT block by an 8*8 weighting matrix, so that each DCT coefficient is multiplied by the weighting factor at the corresponding position in the matrix, and the result of each multiplication is rounded to the nearest integer. The weighting matrix is filled with values whose amplitudes are between 0 and 1. These are set, for example, to non-uniform values close to 1 for low frequencies and close to 0 for high frequencies, or to uniform values so that all coefficients in the 8*8 DCT block are equally attenuated. The quantization step consists of dividing the weighted DCT coefficients by a new quantization factor to deliver quantized DCT coefficients, said quantization factor being the same for all coefficients of all 8*8 blocks composing a macroblock.
The coding error 415 relative to the attenuator 401 is generated by subtracting signal 407 from the signal from the quantizer 370 by means of a subtraction unit 414. The coding error 415 is then variable length coded by a variable length coder 416 to generate a second enhancement video stream EnhEnh2. The coding error 419 relative to the attenuator 403 is generated by subtracting signal 413 from signal 407 by means of a subtraction unit 418. The coding error 419 is then variable length coded by a variable length coder 420 to generate a third enhancement video stream EnhEnh1.
In this example, the minimum quality full resolution would be provided by adding the video stream EnhBase0 to the high quality base resolution video stream. A medium quality full resolution would be provided by combining the video streams EnhBase0 and EnhEnh1 with the high quality base resolution. A high quality full resolution would be provided by combining the video streams EnhBase0, EnhEnh1 and EnhEnh2 with the high quality base resolution.
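The way the series arrangement of Figure 4 splits one set of quantized coefficients into three streams, and how those streams add back together, can be sketched as follows. The attenuator here is a deliberately crude stand-in (scale and truncate toward zero) rather than the inverse quantize, weight and requantize chain, and the dictionary keys are hypothetical labels for the lowest-quality stream and the two refinement streams. The recombination property shown does not depend on what the attenuators do internally.

```python
def attenuate(levels, factor):
    """Stand-in attenuator: scale each level and truncate toward zero."""
    return [int(level * factor) for level in levels]


def split_series(levels, factor_1=0.5, factor_2=0.5):
    """Split quantized levels into three streams, attenuators in series."""
    signal_407 = attenuate(levels, factor_1)      # attenuator 401
    signal_413 = attenuate(signal_407, factor_2)  # attenuator 403
    # Coding errors from subtraction units 414 and 418.
    error_415 = [a - b for a, b in zip(levels, signal_407)]
    error_419 = [a - b for a, b in zip(signal_407, signal_413)]
    return {"lowest": signal_413, "refine_1": error_419, "refine_2": error_415}


def combine(*streams):
    """Add corresponding coefficients of any number of streams."""
    return [sum(values) for values in zip(*streams)]
```

Adding the first refinement to the lowest-quality stream gives back signal 407, and adding the second refinement as well gives back the original quantizer output exactly, because the two coding errors telescope. This is why a receiver can choose a quality level simply by choosing how many streams to add.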
Figure 5 illustrates a modifying device wherein the attenuators are connected in cascade or parallel. It will be understood that the modifying device 500 can be used in both the base layer and the enhancement layer as a substitute for modifying devices 400 and 450. The quantization coefficients from the quantizer 332 (or quantizer 370) are supplied to the first attenuator 501. The attenuator 501 comprises an inverse quantizer 502, a weighting device 504 and a quantizer 506. The quantization coefficients are inverse quantized in the inverse quantizer 502, then weighted and requantized, as described above with respect to Figure 4, in the weighting device 504 and the quantizer 506. The attenuated DCT coefficients carried by a signal 513 are then coded in a variable length coder 514 to produce a first base (enhancement) stream.
The coding error 517 of the attenuator 501 is generated by subtracting the signal 513 from the signal from the quantizer 332 (or quantizer 370) by means of a subtraction unit 516. The coding error is applied to the second attenuator 503, which comprises an inverse quantizer 508, a weighting device 510 and a quantizer 512. The attenuated signal 519 is encoded by a variable length coder 520 which produces a second base (or enhancement) stream. The coding error 523 of the attenuator 503 is generated by subtracting the signal 519 from the signal 517 by means of a subtraction unit 522. The coding error 523 is encoded by a variable length coder 524 which produces a third base (or enhancement) stream.
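For comparison with the series arrangement, the Figure 5 arrangement can be sketched in the same simplified style. Here the second attenuator operates on the coding error of the first rather than on its output, and the first coding error is taken as the quantizer output minus the first attenuated signal. The attenuator is again a hypothetical scale-and-truncate stand-in, and the function names are illustrative only.

```python
def attenuate(levels, factor):
    """Stand-in attenuator: scale each level and truncate toward zero."""
    return [int(level * factor) for level in levels]


def split_cascade(levels, factor_1=0.5, factor_2=0.5):
    """Split quantized levels into three streams, Figure 5 style."""
    signal_513 = attenuate(levels, factor_1)                    # attenuator 501
    error_517 = [a - b for a, b in zip(levels, signal_513)]     # unit 516
    signal_519 = attenuate(error_517, factor_2)                 # attenuator 503
    error_523 = [a - b for a, b in zip(error_517, signal_519)]  # unit 522
    # First, second and third streams, before variable length coding.
    return signal_513, signal_519, error_523
```

As in the series case, the three streams sum to the original quantizer output. The difference is where the bits go: the first stream already carries the output of a single attenuation, and the second and third streams progressively refine its error.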
Figure 6 illustrates a decoder according to one embodiment of the invention for decoding the multiple base or enhancement streams produced by the modifying devices. The multiple base (enhancement) streams are decoded by a plurality of variable length decoders 602, 604 and 606. The decoded streams are then added together in an arithmetic unit 608. The decoded quantization coefficients in the combined stream are supplied to an inverse quantizer 610 which dequantizes the quantization coefficients in accordance with the quantization step so as to convert them into DCT coefficients. The DCT coefficients are supplied to the inverse DCT unit 612 which performs an inverse DCT on the DCT coefficients. The resulting inverse DCT output is then supplied to the arithmetic unit 614, which also receives data (produced in a known manner) from a motion compensator 616. The arithmetic unit 614 adds the output of the inverse DCT unit 612 to the predicted picture from the motion compensator 616 to produce the decoded base (or enhancement) stream. The decoded base and enhancement streams can be combined in a known manner to create the decoded video output.
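The decoding path of Figure 6 can be sketched for a single 8*8 block as follows. The sketch assumes that variable length decoding has already produced blocks of quantized levels, that inverse quantization is a plain multiplication by one quantization step, and that the inverse DCT is the orthonormal 8*8 inverse DCT-II; the function names and example values are hypothetical.

```python
import math

N = 8  # size of a DCT block


def combine(streams):
    """Add the decoded coefficient blocks element-wise (arithmetic unit 608)."""
    return [[sum(s[u][v] for s in streams) for v in range(N)] for u in range(N)]


def inverse_quantize(levels, q_step):
    """Convert quantized levels into DCT coefficients (inverse quantizer 610)."""
    return [[level * q_step for level in row] for row in levels]


def idct_2d(coeffs):
    """Direct (unoptimized) orthonormal 2-D inverse DCT (inverse DCT unit 612)."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = []
    for x in range(N):
        row = []
        for y in range(N):
            total = 0.0
            for u in range(N):
                for v in range(N):
                    total += (c(u) * c(v) * coeffs[u][v]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            row.append(total)
        out.append(row)
    return out


def decode_block(streams, q_step, prediction):
    """Sum the streams, dequantize, inverse transform and add the prediction."""
    residual = idct_2d(inverse_quantize(combine(streams), q_step))
    return [[prediction[x][y] + residual[x][y] for y in range(N)]
            for x in range(N)]


def dc_only(level):
    """Hypothetical test block with a single DC level."""
    block = [[0] * N for _ in range(N)]
    block[0][0] = level
    return block


# Three decoded streams with DC levels 2, 1 and 1, and a flat prediction of 100.
prediction = [[100] * N for _ in range(N)]
decoded = decode_block([dc_only(2), dc_only(1), dc_only(1)], 8, prediction)
```

With these example values the combined DC level is 4, the dequantized DC coefficient is 32, and every pixel of the decoded block is the prediction plus 4, up to floating-point rounding.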
It will be understood that the different embodiments of the invention are not limited to the exact order of the above-described steps as the timing of some steps can be interchanged without affecting the overall operation of the invention. Furthermore, the term "comprising" does not exclude other elements or steps, the terms "a" and "an" do not exclude a plurality and a single processor or other unit may fulfill the functions of several of the units or circuits recited in the claims.

Claims

1. An apparatus for efficiently performing spatial scalable compression of an input video stream, comprising: a base encoder for encoding a base encoder stream; means for modifying content of the base encoder stream to create a plurality of base streams; an enhancement encoder for encoding an enhancement encoder stream; and means for modifying content of the enhancement encoder stream to create a plurality of enhancement streams.
2. The apparatus according to claim 1, wherein said modifying is performed by a set of attenuation steps applied to coefficients composing said base encoder stream being assembled in series and a re-encoding step associated to each of said attenuation steps for delivering one of said plurality of base streams from a coding error by each attenuation step.
3. The apparatus according to claim 1, wherein said modifying is performed by a set of attenuation steps applied to coefficients composing said enhancement encoder stream being assembled in series and a re-encoding step associated to each of said attenuation steps for delivering one of said plurality of enhancement streams from a coding error by each attenuation step.
4. The apparatus according to claim 1, wherein said modifying is performed by a set of attenuation steps applied to coefficients composing said base encoder stream being assembled in cascade and a re-encoding step associated to each of said attenuation steps for delivering one of said plurality of base streams from a coding error by each attenuation step.
5. The apparatus according to claim 1, wherein said modifying is performed by a set of attenuation steps applied to coefficients composing said enhancement encoder stream being assembled in cascade and a re-encoding step associated to each of said attenuation steps for delivering one of said plurality of enhancement streams from a coding error by each attenuation step.
6. A layered encoder for encoding an input video stream, comprising: a downsampling unit for reducing the resolution of the video stream; a base encoder for encoding a base encoder stream; means for creating a plurality of base streams by modifying content of the base encoder stream; an upconverting unit for decoding and increasing the resolution of the base encoder stream to produce a reconstructed video stream; a motion estimation unit which receives the input video stream and the reconstructed video stream and calculates motion vectors for each frame of the received streams based upon an upscaled base layer plus enhancement layer; a first subtraction unit for subtracting the reconstructed video stream from the input video stream to produce a residual stream; a motion compensation unit which receives the motion vectors from the motion estimation unit and produces a predicted stream; a second subtraction unit for subtracting the predicted stream from the residual stream; an enhancement encoder for encoding the resulting stream from the subtraction unit and outputting an enhancement encoder stream; means for creating a plurality of enhancement streams by modifying content of the enhancement encoder stream.
7. The layered encoder according to claim 6, wherein said means for creating a plurality of base streams comprises: a set of attenuation means applied to coefficients composing the base encoder stream, said attenuation means being assembled in series for delivering one of said plurality of base streams; re-encoding means associated with each attenuation means for delivering one of said plurality of base streams, from a coding error generated by each attenuation means.
8. The layered encoder according to claim 6, wherein said means for creating a plurality of base streams comprises: a set of attenuation means applied to coefficients composing the base encoder stream, said attenuation means being assembled in cascade for delivering one of said plurality of base streams; re-encoding means associated with each attenuation means for delivering one of said plurality of base streams, from a coding error generated by each attenuation means.
9. The layered encoder according to claim 7, wherein means for creating a plurality of enhancement streams comprises: a set of attenuation means applied to coefficients composing the enhancement encoder stream, said attenuation means being assembled in series for delivering one of said plurality of enhancement streams; re-encoding means associated with each attenuation means for delivering one of said plurality of enhancement streams, from a coding error generated by each attenuation means.
10. The layered encoder according to claim 8, wherein means for creating a plurality of enhancement streams comprises: a set of attenuation means applied to coefficients composing the enhancement encoder stream, said attenuation means being assembled in cascade for delivering one of said plurality of enhancement streams; re-encoding means associated with each attenuation means for delivering one of said plurality of enhancement streams, from a coding error generated by each attenuation means.
11. The layered encoder according to claim 7, wherein the attenuation means comprises frequential weighting means followed in series by quantization means for quantizing the coefficients, performed at the block level.
12. The layered encoder according to claim 7, wherein each re-encoding means comprises subtracting means for subtracting an output signal from an input signal of the associated attenuation means for delivering the coding error, and variable length coding means for creating one of said base streams from the coding error.
13. The layered encoder according to claim 8, wherein the attenuation means comprises frequential weighting means followed in series by quantization means for quantizing the coefficients, performed at the block level.
14. The layered encoder according to claim 13, wherein each re-encoding means comprises subtracting means for subtracting an output signal from an input signal of the associated attenuation means for delivering the coding error, and variable length coding means for creating one of said base streams from the coding error.
15. The layered encoder according to claim 9, wherein the attenuation means comprises frequential weighting means followed in series by quantization means for quantizing the coefficients, performed at the block level.
16. The layered encoder according to claim 7, wherein each re-encoding means comprises subtracting means for subtracting an output signal from an input signal of the associated attenuation means for delivering the coding error, and variable length coding means for creating one of said enhancement streams from the coding error.
17. A method for providing spatial scalable compression of an input video stream, comprising the steps of: downsampling the input video stream to reduce the resolution of the video stream; encoding the downsampled video stream to produce a base encoder stream; creating a plurality of base streams by modifying content of the base encoder stream; decoding and upconverting the base stream to produce a reconstructed video stream; estimating the expected motion between frames from the input video stream and the reconstructed video stream and calculating motion vectors for each frame of the received streams based upon an upscaled base layer plus enhancement layer; subtracting the reconstructed video stream from the video stream to produce a residual stream; calculating a predicted stream using the motion vectors in a motion compensation unit; subtracting the predicted stream from the residual stream; encoding the resulting residual stream and outputting an enhancement encoder stream; and creating a plurality of enhancement streams by modifying content of the enhancement encoder stream.
18. A decoder for decoding a plurality of coded video signals, comprising: a plurality of decoders, one for each video stream, for decoding said video streams; arithmetic unit for combining said decoded video streams; inverse quantization means for performing an inverse quantization operation on quantization coefficients in said decoded video streams to produce DCT coefficients; inverse DCT means for performing an inverse DCT operation on the DCT coefficients to produce a first signal; a motion compensation unit for producing predicted pictures; arithmetic unit for combining the first signal and the predicted pictures to produce an output signal.
19. The decoder according to claim 18, wherein the plurality of coded video streams are base streams.
20. The decoder according to claim 18, wherein the plurality of video streams are enhancement streams.
21. A method for decoding a plurality of coded video signals, comprising: decoding each of said video streams; combining said decoded video streams; performing an inverse quantization operation on quantization coefficients in said decoded video streams to produce DCT coefficients; performing an inverse DCT operation on the DCT coefficients to produce a first signal; producing predicted pictures in a motion compensator; combining the first signal and the predicted pictures to produce an output signal.
PCT/IB2002/004370 2001-10-26 2002-10-21 Spatial scalable compression WO2003036981A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
KR10-2004-7006228A KR20040047977A (en) 2001-10-26 2002-10-21 Spatial scalable compression
US10/493,267 US20050002458A1 (en) 2001-10-26 2002-10-21 Spatial scalable compression
EP02777621A EP1442606A1 (en) 2001-10-26 2002-10-21 Spatial scalable compression
JP2003539340A JP2005507587A (en) 2001-10-26 2002-10-21 Spatial scalable compression

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
EP01204066.3 2001-10-26
EP01204066 2001-10-26
EP02075916 2002-03-08
EP02075916.3 2002-03-08

Publications (1)

Publication Number Publication Date
WO2003036981A1 (en) 2003-05-01

Family

ID=26077019

Family Applications (4)

Application Number Title Priority Date Filing Date
PCT/IB2002/004231 WO2003036978A1 (en) 2001-10-26 2002-10-14 Method and apparatus for spatial scalable compression
PCT/IB2002/004389 WO2003036983A2 (en) 2001-10-26 2002-10-21 Spatial scalable compression
PCT/IB2002/004370 WO2003036981A1 (en) 2001-10-26 2002-10-21 Spatial scalable compression
PCT/IB2002/004373 WO2003036982A2 (en) 2001-10-26 2002-10-21 Video coding

Family Applications Before (2)

Application Number Title Priority Date Filing Date
PCT/IB2002/004231 WO2003036978A1 (en) 2001-10-26 2002-10-14 Method and apparatus for spatial scalable compression
PCT/IB2002/004389 WO2003036983A2 (en) 2001-10-26 2002-10-21 Spatial scalable compression

Family Applications After (1)

Application Number Title Priority Date Filing Date
PCT/IB2002/004373 WO2003036982A2 (en) 2001-10-26 2002-10-21 Video coding

Country Status (7)

Country Link
US (4) US20040252767A1 (en)
EP (4) EP1442601A1 (en)
JP (4) JP2005506815A (en)
KR (4) KR20040054746A (en)
CN (4) CN1253008C (en)
AU (2) AU2002341323A1 (en)
WO (4) WO2003036978A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5253058A (en) * 1992-04-01 1993-10-12 Bell Communications Research, Inc. Efficient coding scheme for multilevel video transmission
EP0596423A2 (en) * 1992-11-02 1994-05-11 Sony Corporation Layer encoding/decoding apparatus for input non-interlace video signal
EP0883300A2 (en) * 1997-06-05 1998-12-09 General Instrument Corporation Temporal and spatial scaleable coding for video object planes
WO1999023826A1 (en) * 1997-11-05 1999-05-14 Intel Corporation Multi-layer coder/decoder
US6269192B1 (en) 1997-07-11 2001-07-31 Sarnoff Corporation Apparatus and method for multiscale zerotree entropy encoding
US20010031009A1 (en) * 1994-06-17 2001-10-18 Knee Michael James Video compression

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH04177992A (en) * 1990-11-09 1992-06-25 Victor Co Of Japan Ltd Picture coder having hierarchical structure
FR2697393A1 (en) * 1992-10-28 1994-04-29 Philips Electronique Lab Device for coding digital signals representative of images, and corresponding decoding device.
JP3367992B2 (en) * 1993-04-27 2003-01-20 日本ビクター株式会社 Video encoding device and decoding device
JPH08256341A (en) * 1995-03-17 1996-10-01 Sony Corp Image signal coding method, image signal coder, image signal recording medium and image signal decoder
US5619256A (en) * 1995-05-26 1997-04-08 Lucent Technologies Inc. Digital 3D/stereoscopic video compression technique utilizing disparity and motion compensated predictions
US6957350B1 (en) * 1996-01-30 2005-10-18 Dolby Laboratories Licensing Corporation Encrypted and watermarked temporal and resolution layering in advanced television
JP3263807B2 (en) * 1996-09-09 2002-03-11 ソニー株式会社 Image encoding apparatus and image encoding method
ES2323358T3 (en) * 1997-04-01 2009-07-14 Sony Corporation IMAGE CODING, IMAGE CODING METHOD, IMAGE DECODER, IMAGE DECODING METHOD, AND DISTRIBUTION MEDIUM.
KR100281099B1 (en) * 1997-07-30 2001-04-02 구자홍 Method for removing block phenomenon presented by cording of moving picture
US6850564B1 (en) * 1998-06-26 2005-02-01 Sarnoff Corporation Apparatus and method for dynamically controlling the frame rate of video streams
US6700933B1 (en) * 2000-02-15 2004-03-02 Microsoft Corporation System and method with advance predicted bit-plane coding for progressive fine-granularity scalable (PFGS) video coding
US6493387B1 (en) * 2000-04-10 2002-12-10 Samsung Electronics Co., Ltd. Moving picture coding/decoding method and apparatus having spatially scalable architecture and signal-to-noise ratio scalable architecture together
US7133449B2 (en) * 2000-09-18 2006-11-07 Broadcom Corporation Apparatus and method for conserving memory in a fine granularity scalability coding system
US6907070B2 (en) * 2000-12-15 2005-06-14 Microsoft Corporation Drifting reduction and macroblock-based control in progressive fine granularity scalable video coding
US6792044B2 (en) * 2001-05-16 2004-09-14 Koninklijke Philips Electronics N.V. Method of and system for activity-based frequency weighting for FGS enhancement layers
WO2002098136A2 (en) * 2001-05-29 2002-12-05 Koninklijke Philips Electronics N.V. Method and device for video transcoding
KR20040054746A (en) * 2001-10-26 2004-06-25 코닌클리케 필립스 일렉트로닉스 엔.브이. Method and apparatus for spatial scalable compression

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005081532A1 (en) * 2004-01-21 2005-09-01 Koninklijke Philips Electronics N.V. Method of spatial and snr fine granular scalable video encoding and transmission
US7889937B2 (en) 2004-07-13 2011-02-15 Koninklijke Philips Electronics N.V. Method of spatial and SNR picture compression
JPWO2006019093A1 (en) * 2004-08-16 2008-05-08 日本電信電話株式会社 Image encoding method, image decoding method, image encoding device, image decoding device, image encoding program, and image decoding program
JP5052134B2 (en) * 2004-08-16 2012-10-17 日本電信電話株式会社 Image encoding method, image decoding method, image encoding device, image decoding device, image encoding program, and image decoding program
US9402087B2 (en) 2004-08-16 2016-07-26 Nippon Telegraph And Telephone Corporation Picture encoding method, picture decoding method, picture encoding apparatus, picture decoding apparatus, picture encoding program, and picture decoding program
US9462284B2 (en) 2004-11-23 2016-10-04 Siemens Aktiengesellschaft Encoding and decoding method and encoding and decoding device
US8005137B2 (en) 2005-03-25 2011-08-23 Samsung Electronics Co., Ltd. Video coding and decoding method using weighted prediction and apparatus for the same
US8396123B2 (en) 2005-03-25 2013-03-12 Samsung Electronics Co., Ltd. Video coding and decoding method using weighted prediction and apparatus for the same

Also Published As

Publication number Publication date
KR100929330B1 (en) 2009-12-03
US20030086622A1 (en) 2003-05-08
KR20040047977A (en) 2004-06-05
CN1254978C (en) 2006-05-03
CN1575602A (en) 2005-02-02
CN1575604A (en) 2005-02-02
WO2003036982A3 (en) 2004-06-03
WO2003036983A3 (en) 2004-06-10
JP2005507588A (en) 2005-03-17
WO2003036983A2 (en) 2003-05-01
AU2002341323A1 (en) 2003-05-06
CN1294761C (en) 2007-01-10
KR20040054742A (en) 2004-06-25
US7146056B2 (en) 2006-12-05
JP2005506815A (en) 2005-03-03
EP1442601A1 (en) 2004-08-04
KR20040054746A (en) 2004-06-25
JP2005507589A (en) 2005-03-17
US7359558B2 (en) 2008-04-15
KR20040054747A (en) 2004-06-25
CN1575605A (en) 2005-02-02
CN1253008C (en) 2006-04-19
US20040252901A1 (en) 2004-12-16
EP1442606A1 (en) 2004-08-04
US20040252767A1 (en) 2004-12-16
EP1452035A2 (en) 2004-09-01
US20050002458A1 (en) 2005-01-06
CN1611077A (en) 2005-04-27
WO2003036978A1 (en) 2003-05-01
WO2003036982A2 (en) 2003-05-01
JP2005507587A (en) 2005-03-17
AU2002339573A1 (en) 2003-05-06
CN100471269C (en) 2009-03-18
EP1442605A2 (en) 2004-08-04

Similar Documents

Publication Publication Date Title
US20050002458A1 (en) Spatial scalable compression
US20060133475A1 (en) Video coding
KR0161551B1 (en) Method and apparatus for editing or mixing compressed pictures
US6393059B1 (en) Conversion of video data bit stream
KR100314116B1 (en) A motion-compensated coder with motion vector accuracy controlled, a decoder, a method of motion-compensated coding, and a method of decoding
JP2005507589A5 (en)
JP2005506815A5 (en)
KR20040054743A (en) Spatial scalable compression
KR100202538B1 (en) Mpeg video codec
US20070025438A1 (en) Elastic storage
KR100203281B1 (en) Moving picture decorder based on forced one-direction motion compensation
JP3432886B2 (en) Hierarchical encoding / decoding apparatus and method, and transmission / reception system
KR0172902B1 (en) Mpeg encoder
KR0181067B1 (en) Moving picture encoder of having compatibility
EP1790166A2 (en) A method and apparatus for motion estimation
KR0148146B1 (en) Apparatus for loss-prevention of important image data
KR0130167B1 (en) Mpeg apparatus
KR0178225B1 (en) Encoder of image system
KR20020001062A (en) Moving picture encoder adapted in application part

Legal Events

Date Code Title Description
AK Designated states
Kind code of ref document: A1
Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents
Kind code of ref document: A1
Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application

WWE WIPO information: entry into national phase
Ref document number: 2002777621
Country of ref document: EP

WWE WIPO information: entry into national phase
Ref document number: 10493267
Country of ref document: US
Ref document number: 2003539340
Country of ref document: JP

WWE WIPO information: entry into national phase
Ref document number: 20028210646
Country of ref document: CN

WWE WIPO information: entry into national phase
Ref document number: 1020047006228
Country of ref document: KR

WWP WIPO information: published in national office
Ref document number: 2002777621
Country of ref document: EP