US 20010019630 A1 Abstract An image of a resolution higher than is possible in a single transmission over a finite bandwidth channel is obtained by transferring a progressively-rendered, compressed image. Initially, a low quality image is compressed and transmitted over a finite bandwidth channel. Then, successively higher resolution image information is compressed at a source and transmitted. The successively higher resolution image information received at the destination end is used to display a higher resolution image at the destination end.
Claims(17) 1. A method of transferring a progressively-rendered, compressed image, over a finite bandwidth channel, comprising:
producing a coarse quality compressed image at a source and transmitting said coarse quality compressed image over a channel as a first part of a transmission to a destination end; receiving the coarse quality compressed image at a receiver at the destination end at a first time and displaying an image based on said coarse quality compressed image on a display system of the receiver when received at said first time; creating additional information about the image, at the source end, from which a standard quality image can be displayed, said standard quality image being of a higher quality than said coarse quality image, and sending compressed information over said channel indicative of information for said standard quality image, said sending said standard quality image information occurring subsequent in time to said sending of all of said information for said coarse quality image; receiving said standard quality information at the receiver at a second time, subsequent to the first time, and decompressing said standard quality image information, to improve the quality of the image displayed on said display system, and to display said standard quality image; obtaining further information about the image beyond the information in said standard quality image, to provide an enhanced quality image, and compressing said information for said enhanced quality image, said enhanced quality image having more image details than said standard quality image; transmitting said information for said enhanced quality image, at a time subsequent to transmitting said information for said coarse quality image and said standard quality image; and receiving said enhanced quality image information at said receiver, at a third time subsequent to said first and second times, and updating a display on said display system to display the additional enhanced quality image. 2. A method as in claim 1 3. A method as in claim 1 4. A method as in claim 2 5. A method as in claim 4 6.
A method as in claim 5 7. A method as in claim 5 8. A method as in claim 3 9. A method of transmitting and displaying a compressed image comprising:
first obtaining and sending a first layer of information indicative of a compressed miniature image at a first time; first receiving said first layer at said decoder end and decompressing and displaying a first coarse image indicative thereof; second obtaining and sending information indicative of a compressed improved resolution image having more details than said first coarse image, and transmitting said information at a second time subsequent to said first time; and second receiving and decompressing said improved resolution image information to provide an updated display which improves the resolution of said first coarse image. 10. A method as in claim 9 transmitting information indicative of a compressed miniature of the image; receiving the compressed miniature of the image; interpolating the compressed miniature of the image into a full sized image; and displaying the full sized image. 11. A method as in claim 10 12. A method as in claim 11 13. A method as in claim 12 14. A method as in claim 11 15. A method as in claim 14 16. A method as in claim 11 17. A method as in claim 9 Description [0001] 1. Field of the Invention [0002] This invention relates to the compression and decompression of digital data and, more particularly, to the reduction in the amount of digital data necessary to store and transmit images. [0003] 2. Background of the Invention [0004] Image compression systems are commonly used in computers to reduce the storage space and transmittal times associated with storing, transferring and retrieving images. Due to increased use of images in computer applications, and the increase in the transfer of images, a variety of image compression techniques have attempted to solve the problems associated with the large amounts of storage space (i.e., hard disks, tapes or other devices) needed to store images. [0005] Conventional devices store an image as a two-dimensional array of picture elements, or pixels. 
The number of pixels determines the resolution of an image. Typically the resolution is measured by stating the number of horizontal and vertical pixels contained in the two dimensional image array. For example, a 640 by 480 image has 640 pixels across and 480 from top to bottom to total 307,200 pixels. [0006] While the number of pixels represents the image resolution, the number of bits assigned to each pixel represents the number of available intensity levels of each pixel. For example, if a pixel is only assigned one bit, the pixel can represent a maximum of two values. Thus the range of colors which can be assigned to that pixel is limited to two (typically black and white). In color images, the bits assigned to each pixel represent the intensity values of the three primary colors of red, green and blue. In present “true color” applications, each pixel is normally represented by 24 bits where 8 bits are assigned to each primary color allowing the encoding of 16.8 million (2^24) colors. [0007] Consequently, color images require large amounts of storage capacity. For example, a typical color (24 bits per pixel) image with a resolution of 640 by 480 requires approximately 922,000 bytes of storage. A larger 24-bit color image with a 2000 by 2000 pixel resolution requires approximately twelve million bytes of storage. As a result, image-based applications such as interactive shopping, multimedia products, electronic games and other image-based presentations require large amounts of storage space to display high quality color images. [0008] In order to reduce storage requirements, an image is compressed (encoded) and stored as a smaller file which requires less storage space. In order to retrieve and view the compressed image, the compressed image file is expanded (decoded) to its original size. The decoded (or “reconstructed”) image is usually an imperfect or “lossy” representation of the original image because some information may be lost in the compression process.
Normally, the greater the amount of compression the greater the divergence between the original image and the reconstructed image. The amount of compression is often referred to as the compression ratio. The compression ratio is the amount of storage space needed to store the original (uncompressed) digitized image file divided by the amount of storage space needed to store the corresponding compressed image file. [0009] By reducing the amount of storage space needed to store an image, compression is also used to reduce the time needed to transfer and communicate images to other locations. In order to transfer an image, the data bits that represent the image are sent via a data channel to another location. The sequence of transmitted bytes is called the data stream. Generally, the image data is encoded and the compressed image data stream is sent over a data channel and when received, the compressed image data is decoded to recreate the original image. Thus, compression speeds the transmission of image files by reducing their size. [0010] Several processes have been developed for compressing the data required to represent an image. Generally, the processes rely on two methods: 1) spatial or time domain compression, and 2) frequency domain compression. In frequency domain compression, the binary data representing each pixel in the space or time domain are mapped into a new coordinate system in the frequency domain. [0011] In general, the mathematical transforms, such as the discrete cosine transform (DCT), are chosen so that the signal energy of the original image is preserved, but the energy is concentrated in a relatively few transform coefficients. Once transformed, the data is compressed by quantization and encoding of the transform coefficients. 
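The compression ratio and the energy-compaction property described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the 8×8 block size, the hand-built orthonormal DCT-II, the sample gradient block, and the uniform quantization step of 16 are hypothetical choices and are not parameters taken from this disclosure.

```python
import math

N = 8  # block size, assumed for illustration


def dct_matrix(n=N):
    """Build the orthonormal DCT-II matrix C (coefficients = C * block * C^T)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2(block):
    """Two-dimensional DCT of a square block."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def energy(m):
    """Sum of squared values (signal energy)."""
    return sum(v * v for row in m for v in row)


def quantize(coeffs, step=16):
    """Uniform quantization; the step size is a hypothetical parameter."""
    return [[round(v / step) for v in row] for row in coeffs]


def compression_ratio(original_bytes, compressed_bytes):
    """Original file size divided by compressed file size."""
    return original_bytes / compressed_bytes


# A smooth gradient block: a hypothetical stand-in for low-detail image content.
block = [[16 * x + 2 * y for x in range(N)] for y in range(N)]
coeffs = dct2(block)
quantized = quantize(coeffs)
nonzero = sum(1 for row in quantized for v in row if v != 0)

print("energy before:", energy(block))
print("energy after: ", round(energy(coeffs), 6))
print("nonzero quantized coefficients:", nonzero, "of", N * N)
print("ratio for a 640x480 24-bit image stored in 92,160 bytes:",
      compression_ratio(640 * 480 * 3, 92160))
```

The transform keeps the total energy of the block while moving it into a handful of coefficients, so most quantized values are zero and are cheap to encode.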
[0012] Optimization of the process of compressing an image includes increasing the compression ratio while maintaining the quality of the original image, reducing the time to encode an image, and reducing the time to decode a compressed image. In general, a process that increases the compression ratio or decreases the time to compress an image results in a loss of image quality. A process that increases the compression ratio and maintains a high quality image often results in longer encoding and decoding times. Accordingly, it would be advantageous to increase the compression ratio and reduce the time needed to encode and decode an image while maintaining a high quality image. [0013] It is well known that image encoders can be optimized for specific image types. For example, different types of images may include graphical, photographic, or typographic information or combinations thereof. As discussed in more detail below, the encoding of an image can be viewed as a multi-step process that uses a variety of compression methods which include filters, mathematical transformations, quantization techniques, etc. In general each compression method will compress different image types with varying comparative efficiency. These compression methods can be selectively applied to optimize an encoder with respect to a certain type of image. In addition to selectively applying various compression methods, it is also possible to optimize an encoder by varying the parameters (e.g., quantization tables) of a particular compression method. [0014] Broadly speaking, however, the prior art does not provide an adaptive encoder that automatically decomposes a source image, classifies its parts, and selects the optimal compression methods and the optimal parameters of the selected compression methods resulting in an optimized encoder that increases relative compression rates. [0015] Once an image is optimally compressed with an encoder, the set of compressed data are stored in a file. 
The structure of the compressed file is referred to as the file format. The file format can be fairly simple and common, or the format can be quite complex and include a particular sequence of compressed data or various types of control instructions and codes. [0016] The file format (the structure of the data in the file) is especially important when compressed data in the file will be read and processed sequentially and when the user desires to view or transmit only part of a compressed image file. Accordingly, it would be advantageous to provide a file format that “layers” the compressed image components, arranging those of greatest visual importance first, those of secondary visual importance second, and so on. Layering the compressed file format in such a way allows the first segment of the compressed image file to be decoded prior to the remainder of the file being received or read by the decoder. The decoder can display the first segment (layer) as a miniature version of the entire image or can enlarge the miniature to display a coarse or “splash” quality rendition of the original image. As each successive file segment or layer is received, the decoder enhances the quality of the displayed picture by selectively adding detail and correcting pixel values. [0017] Like the encoding process, the decoding of an image can be viewed as a multi-step process that uses a variety of decoding methods which include inverse mathematical transformations, inverse quantization techniques, etc. Conventional decoders are designed to have an inverse function relative to the encoding system. These inverse decoding methods must match the encoding process used to encode the image. In addition, where an encoder makes content-sensitive adaptations to the compression algorithm, the decoder must apply a matching content-sensitive decoding process. [0018] Generally, a decoder is designed to match a specific encoding process. 
Prior art compression systems exist that allow the decoder to adjust particular parameters, but the prior art encoders must also transmit accompanying tables and other information. In addition, many conventional decoders are limited to specific decoding methods that do not accommodate content-sensitive adaptations. [0019] The problems outlined above are solved by the method and apparatus of the present invention. That is, the computer-based image compression system of the present invention includes a unique encoder which compresses images and a unique decoder which decompresses images. The unique compression system obtains high compression ratios at all image quality levels while achieving relatively quick encoding and decoding times. [0020] A high compression ratio enables faster image transmission and reduces the amount of storage space required to store an image. When compared with conventional compression techniques, such as the Joint Photographic Experts Group (JPEG), the present invention significantly increases the compression ratio for color images which, when decompressed, are of comparable quality to the JPEG images. The exact improvement over JPEG will depend on image content, resolution, and other factors. [0021] Smaller image files translate into direct storage and transmission time savings. In addition, the present invention reduces the number of operations to encode and decode an image when compared to JPEG and other compression methods of a similar nature. Reducing the number of operations reduces the amount of time and computing resources needed to encode and decode an image, and thus improves computer system response times. [0022] Furthermore, the image compression system of the present invention optimizes the encoding process to accommodate different image types. 
As explained below, the present invention uses fuzzy logic techniques to automatically analyze and decompose a source image, classify its components, select the optimal compression method for each component, and determine the optimal content-sensitive parameters of the selected compression methods. The encoder does not need prior information regarding the type of image or information regarding which compression methods to apply. Thus, a user does not need to provide compression system customization or need to set the parameters of the compression methods. [0023] The present invention is designed with the goal of providing an image compression system that reliably compresses any type of image with the highest achievable efficiency, while maintaining a consistent range of viewing qualities. Automating the system's adaptivity to varied image types allows for a minimum of human intervention in the encoding process and results in a system where the compression and decompression process are virtually transparent to the users. [0024] The encoder and decoder of the present invention contain a library of encoding methods that are treated as a “toolbox.” The toolbox allows the encoder to selectively apply particular encoding methods or tools that optimize the compression ratio for a particular image component. The toolbox approach allows the encoder to support many different encoding methods in one program, and accommodates the invention of new encoding methods without invalidating existing decoders. The toolbox approach thus allows upgradeability for future improvements in compression methods and adaptation to new technologies. [0025] A further feature of the present invention is that the encoder creates a file format that segments or “layers” the compressed image. 
The layering of the compressed image allows the decoder to display image file segments, beginning with the data at the front of the file, in a coherent sequence which begins with the decoding and display of the information that constitutes the core of the image as defined by human perception. This core information can appear as a good quality miniature of the image and/or as a full sized “splash” or coarse quality version of the image. Both the miniature and splash image enable the user to view the essence of an image from a relatively small amount of encoded data. In applications where the image file is being transmitted over a data channel, such as a telephone line or limited bandwidth wireless channel, display of the miniature and/or splash image occurs as soon as the first segment or layer of the file is received. This allows users to view the image quickly and to see detail being added to the image as subsequent layers are received, decoded, and added to the core image. [0026] The decoder decompresses the miniature and the full sized splash quality image from the same information. User specified preferences and the application determine whether the miniature and/or the full sized splash quality image are displayed for any given image. [0027] Whether the first layer is displayed as a miniature or a splash quality full size image, the receipt of each successive layer allows the decoder to add additional image detail and sharpness. Information from the previous layer is supplemented, not discarded, so that the image is built layer by layer. Thus a single compressed file with a layered file format can store both a thumbnail and a full size version of the image and can store the full size version at various quality levels without storing any redundant information. [0028] The layered approach of the present invention allows the transmission or decoding of only the part of the compressed file which is necessary to display a desired image quality. 
Thus, a single compressed file can generate a thumbnail and different quality full size images without the need to recompress the file to a smaller size and lesser quality, or store multiple files compressed to different file sizes and quality levels. [0029] This feature is particularly advantageous for on line service applications, such as shopping or other applications where the user or the application developer may want several thumbnail images downloaded and presented before the user chooses to receive the entire full size, high quality image. In addition to conserving the time and transmission costs associated with viewing a variety of high quality images that may not be of interest, the user need only subsequently download the remainder of each image file to view the higher detail versions of the image. [0030] The layered format also allows the storage of different layers of the compressed data file separate from one another. Thus, the core image data (miniature) can be stored locally (e.g., in fast RAM memory for fast access), and the higher quality “enhancement” layers can be stored remotely in lower cost bulk storage. [0031] A further feature of the layered file format of the present invention allows the addition of other compressed data information. The layered and segmented file format is extendable so that new layers of compressed information such as sound, text and video can be added to the compressed image data file. The extendable file format allows the compression system to adapt to new image types and to combine compressed image data with sound, text and video. [0032] Like the encoder, the decoder of the present invention includes a toolbox of decoding methods. The decoding process can begin with the decoder first determining the encoding methods used to encode each data segment. The decoder determines the encoding methods from instructions the encoder inserts into the compressed data file. 
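The layered, instruction-driven decoding described in this section can be sketched as follows. This is a minimal illustration, not the file format of the preferred embodiment: the segment header (a 1-byte tag followed by a 4-byte big-endian payload length), the tag numbers, and the names `pack_segment`, `iter_segments`, `decode_progressively`, and `TOOLBOX` are all hypothetical.

```python
import struct

# Hypothetical segment header: 1-byte tag, 4-byte big-endian payload length.
HEADER = struct.Struct(">BI")


def pack_segment(tag, payload):
    """Serialize one segment as header + payload."""
    return HEADER.pack(tag, len(payload)) + payload


def iter_segments(stream):
    """Yield (tag, payload) for each complete segment; stop at a partial one."""
    pos = 0
    while pos + HEADER.size <= len(stream):
        tag, length = HEADER.unpack_from(stream, pos)
        start = pos + HEADER.size
        if start + length > len(stream):
            break  # the rest of this segment has not been received yet
        yield tag, stream[start:start + length]
        pos = start + length


def decode_progressively(stream, toolbox):
    """Apply the matching tool to each known segment and skip unknown tags."""
    rendered = []
    for tag, payload in iter_segments(stream):
        tool = toolbox.get(tag)
        if tool is None:
            continue  # unknown layer type: ignore it and keep decoding
        rendered.append(tool(payload))
    return rendered


# Hypothetical toolbox: each tool only reports which layer it handled.
TOOLBOX = {
    1: lambda p: ("miniature", len(p)),
    2: lambda p: ("standard", len(p)),
    3: lambda p: ("enhancement", len(p)),
}

first_layer = pack_segment(1, b"core")
full_file = (first_layer
             + pack_segment(9, b"future-layer")   # tag 9 is not in the toolbox
             + pack_segment(2, b"more detail")
             + pack_segment(3, b"finest detail"))

# Simulate a slow channel: only the first layer and 3 more bytes have arrived.
partial = full_file[:len(first_layer) + 3]

print(decode_progressively(partial, TOOLBOX))
print(decode_progressively(full_file, TOOLBOX))
```

Because each segment announces its own type and length, a decoder can render whatever prefix of the file has arrived and can step over segment types it does not recognize.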
[0033] Adding decoder instructions to the compressed image data provides several advantages. A decoder that recognizes the instructions can decode files from a variety of different encoders, accommodate content-sensitive encoding methods, and adjust to user specific needs. The decoder of the present invention also skips parts of the data stream that contain data that are unnecessary for a given rendition of the image, or ignores parts of the data stream that are in an unknown format. The ability to ignore unknown formats allows future file layers to be added while maintaining compatibility with older decoders. [0034] In a preferred embodiment of the present invention, the encoder compresses an image using a first Reed Spline Filter, an image classifier, a discrete cosine transform, a second and third Reed Spline Filter, a differential pulse code modulator, an enhancement analyzer, and an adaptive vector quantizer to generate a plurality of data segments that contain the compressed image. The plurality of data segments are further compressed with a channel encoder. [0035] The Reed Spline Filter includes a color space conversion transform, a decimation step and a least mean squared error (LMSE) spline fitting step. The output of the first Reed Spline Filter is then analyzed to determine an image type for optimal compression. The first Reed Spline Filter outputs three components which are analyzed by the image classifier. The image classifier uses fuzzy logic techniques to classify the image type. Once the image type is determined, the first component is separated from the second and third components and further compressed with an optimized discrete cosine transform and an adaptive vector quantizer. The second and third components are further compressed with a second and third Reed Spline Filter, the adaptive vector quantizer, and a differential pulse code modulator.
[0036] The enhancement analyzer enhances areas of an image determined to be the most visually important, such as text or edges. The enhancement analyzer determines the visual priority of pixel blocks. The pixel block dimensions typically correspond to 16×16 pixel blocks in the source image. In addition, the enhancement analyzer prioritizes each pixel block so that the most important enhancement information is placed in the earliest enhancement layers so that it can be decoded first. The output of the enhancement analyzer is compressed with the adaptive vector quantizer. [0037] A user may set the encoder to compute a color palette optimized to the color image. The color palette is combined with the output of the discrete cosine transform, the adaptive vector quantizer, the differential pulse code modulator, and the enhancement analyzer to create a plurality of data segments. The channel encoder then interleaves and compresses the plurality of data segments. [0038] These and other aspects, advantages, and novel features of the invention will become apparent upon reading the following detailed description and upon reference to accompanying drawings in which: [0039]FIG. 1 is a block diagram of an image compression system that encodes, transfers and decodes an image and includes a source image, an encoder, a compressed file, a first storage device, a data channel, a data stream, a decoder, a display, a second storage device, and a printer; [0040]FIG. 2 illustrates the multi-step decoding process and includes the source image, the encoder, the compressed file, the data channel, the data stream, the decoder, a thumbnail image, a splash image, a panellized standard image, and the final representation of the source image; [0041]FIG. 3 is a block diagram of the encoder showing the four stages of the encoding process; [0042]FIG. 
4 is a block diagram of the encoder showing a first Reed Spline Filter, a color space conversion transform, a Y miniature, a U miniature, an X miniature, an image classifier, an optimized discrete cosine transform, a discrete cosine transform residual calculator, an adaptive vector quantizer, a second and third Reed Spline Filter, a Reed Spline residual calculator, a differential pulse coder modulator, an enhancement analyzer, a high resolution residual calculator, a palette selector, a plurality of data segments and a channel encoder; [0043]FIG. 5 is a block diagram of the image formatter; [0044]FIG. 6 is a block diagram of the Reed Spline Filter; [0045]FIG. 7 is a block diagram of the color space conversion transform; [0046]FIG. 8 is a block diagram of the image classifier; [0047]FIG. 9 is a block diagram of the optimized discrete cosine transform; [0048]FIG. 10 is a block diagram of the DCT residual calculator; [0049]FIG. 11 is a block diagram of the adaptive vector quantizer; [0050]FIG. 12 is a block diagram of the second and third Reed Spline Filters; [0051]FIG. 13 is a block diagram of the Reed Spline residual calculator; [0052]FIG. 14 is a block diagram of the differential pulse code modulator; [0053]FIG. 15 is a block diagram of the enhancement analyzer; [0054]FIG. 16 is a block diagram of the high resolution residual calculator; [0055]FIG. 17 is the block diagram of the palette selector; [0056]FIG. 18 is the block diagram of the channel encoder; [0057]FIG. 19 is a block diagram of the vector quantization process; [0058]FIGS. 20 [0059]FIG. 21 illustrates the normal segment; [0060]FIG. 22 [0061]FIG. 23 is a block diagram of the decoder of the present invention; [0062]FIG. 24 illustrates the multi-step decoding process and includes a Ym miniature, a Um miniature, an Xm miniature, the thumbnail miniature, the splash image and the standard image, and the enhanced image; [0063]FIG. 
25 is a block diagram of the decoder and includes an inverse Huffman encoder, an inverse DPCM, a dequantizer, a combiner, an inverse DCT, a demultiplexer, and an adder; [0064]FIG. 26 is a block diagram of the decoder and includes the interpolator, interpolation factors, a scaler, scale factors, a replicator, and an inverse color converter; [0065]FIG. 27 is a block diagram of the decoder that includes the inverse Huffman encoder, the combiner, the dequantizer, the inverse DCT, a pattern matcher, the adder, the interpolator, and an enhancement overlay builder; [0066]FIG. 28 is block diagram of the scaler with an input to output ratio of five-to-three in the one dimensional case; [0067]FIG. 29 illustrates the process of bilinear interpolation; [0068]FIG. 30 is a block diagram of the process of optimizing the compression methods with the image classifier, the enhancement analyzer, the optimized DCT, the AVQ, and the channel encoder; [0069]FIG. 31 is a block diagram of the image classifier; [0070]FIG. 32 is a flow chart of the process of creating an adaptive uniform DCT quantization table; [0071]FIG. 33 illustrates a table of several examples showing the mapping from input measurements to input sets to output sets; [0072]FIG. 34 is a block diagram of image data compression; [0073]FIG. 35 is a block diagram of a spline decimation/interpolation filter; [0074]FIG. 36 is a block diagram of an optimal spline filter; [0075]FIG. 37 is a vector representation of the image, processed image, and residual image; [0076]FIG. 38 is a block diagram showing a basic optimization block of the present invention; [0077]FIG. 39 is a graphical illustration of a one-dimensional bi-linear spline projection; [0078]FIG. 40 is a schematic view showing periodic replication of a two-dimensional image; [0079]FIGS. 41 [0080]FIG. 42 is a diagram showing representations of the hexagonal tent function; [0081]FIG. 43 is a flow diagram of compression and reconstruction of image data; [0082]FIG. 
44 is a graphical representation of a normalized frequency response of a one-dimensional bi-linear spline basis; [0083]FIG. 45 is a graphical representation of a one-dimensional eigenfilter frequency response; [0084]FIG. 46 is a perspective view of a two-dimensional eigenfilter frequency response; [0085]FIG. 47 is a plot of standard error as a function of frequency for a one-dimensional cosinusoidal image; [0086]FIG. 48 is a plot of original and reconstructed one-dimensional images and a plot of standard error; [0087]FIG. 49 is a first two-dimensional image reconstruction for different compression factors; [0088]FIG. 50 is a second two-dimensional image reconstruction for different compression factors; [0089]FIG. 51 is plots of standard error for representative images [0090]FIG. 52 is a compressed two- miniature using the optimized decomposition weights; [0091]FIG. 53 is a block diagram of a preferred adaptive compression scheme in which the method of the present invention is particularly suited; [0092]FIG. 54 is a block diagram showing a combined sublevel and optimal-spline compression arrangement; [0093]FIG. 55 is a block diagram showing a combined sublevel and optimal-spline reconstruction arrangement; [0094]FIG. 56 is a block diagram showing a multi-resolution optimized interpolation arrangement; and [0095]FIG. 57 is a block diagram showing an embodiment of the optimizing process in the image domain. [0096]FIG. 1 illustrates a block diagram of an image compression system that includes a source image [0097] Each pixel is assigned a number of bits that represent the intensity level of the three primary colors: red, green, and blue. In the preferred embodiment, the full-color source image [0098] As discussed in more detail below, the encoder [0099] The decoder [0100] For example, if the source image [0101] Referring to FIG. 2, it can be seen that the layering of the compressed file [0102]FIG. 
3 illustrates a block diagram of the encoder [0103] The encoder [0104]FIG. 4 illustrates a more detailed block diagram of the encoder [0105] The formatter [0106] The first Reed Spline Filter [0107] In the spline fitting step in block [0108] As explained in more detail below, the decoder [0109] The reconstruction weights output from the Reed Spline Filter [0110] More specifically, the preferred color space converter
[0111] Referring to FIG. 6, it can be seen that a R_tau2 miniature [0112]FIG. 7 illustrates the color space converter [0113] Referring to FIG. 8, it can be seen that the second stage [0114] As shown in FIG. 4, during the third stage [0115] As illustrated in FIG. 9, the optimized DCT [0116] In order to preserve the image information lost by the optimized DCT [0117] Referring to FIG. 11, it can be seen that the rY_tau2 residual [0118] Thus, the AVQ [0119] As shown in FIG. 11, the AVQ [0120]FIG. 12 illustrates a block diagram of the second Reed Spline Filter [0121] The third Reed Spline Filter [0122] In FIG. 13 the Reed Spline residual calculator [0123] As illustrated in FIG. 11, the rU_tau4 residual [0124] The U_tau16 miniature [0125]FIG. 15 illustrates the enhancement analyzer [0126] If the result of convolving a particular 16×16 high resolution block is greater than the threshold value E [0127] The high resolution residual calculator [0128] The high resolution residual calculator [0129] The xr_Y residual [0130]FIG. 17 illustrates a block diagram of the palette selector [0131] The channel encoder [0132] After interleaving the plurality of data segments [0133] The Adaptive Vector Quantizer [0134] The preferred embodiment of the AVQ [0135] The codebook [0136] Finding a best-fit pattern from the codebook [0137] where: X [0138] The comparison equation finds the best match by selecting the minimum error term that results from comparing the input block with the codebook patterns.
In other words, the AVQ [0139] The process of searching for a matching pattern in the codebook [0140] First, in order to find the optimal codebook pattern, the AVQ [0141] If at any time in the comparison, the accumulated squared error for the new pattern is greater than the minimum squared error, the current pattern is immediately rejected and the AVQ [0142] Also, if the accumulated squared error for a particular codebook pattern is less than a pre-determined threshold, the codebook pattern is immediately accepted and the AVQ [0143] Besides improving the time it takes for the AVQ [0144] Therefore, the AVQ [0145] The Compressed File Format [0146]FIGS. 20 [0147] In addition to layering the compressed data, the segmented architecture allows the decoder [0148] As shown in FIG. 20 [0149] As shown in FIG. 20 [0150] Byte [0151]FIG. 21 illustrates the normal segment [0152] The identification section [0153] For example, the file format of the preferred embodiment allows the use of different Huffman bitstreams such as an 8-bit Huffman stream, a 10-bit Huffman stream, and a DCT Huffman stream. The encoder [0154]FIGS. 22 [0155] Block diagram [0156] Block diagram [0157] Block diagram [0158] The plurality of data segments [0159] Block [0160] The Decoder [0161]FIG. 23 illustrates the decoder [0162] As illustrated in FIG. 24, the decoder [0163]FIG. 25 illustrates the elements of the first step [0164] The decoder [0165] The combiner [0166] The demultiplexer [0167] The adder [0168]FIG. 26 illustrates the second step [0169] The Ym interpolation factor [0170] The interpolator [0171] After interpolation, the scaler [0172] The inverse color converter [0173] In order to create the splash image [0174] The interpolated data are then expanded with the replicator [0175] The inverse color converter [0176]FIG. 
27 illustrates the third step [0177] The decoding of the standard image [0178] The decoder [0179] The inverse Huffman encoder [0180] The pattern matcher [0181] The inverse Huffman encoder [0182] To calculate the full sized U image [0183] To calculate the full sized X image [0184] The decoder stores the full sized Y image [0185] In the fourth step the decoder [0186] The decoder [0187] The inverse color converter [0188] The inverse DCT [0189] Referring to FIG. 9, the compressed DCT coefficients [0190] The inverse DCT [0191] The equation for an inverse DCT is:
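As a point of reference, the standard two-dimensional inverse DCT for an 8×8 block is commonly written as shown below. The symbols f (reconstructed samples), F (DCT coefficients) and C (normalization factor) are assumed names and may differ from the notation and scaling of the original equation.

\[
f(x,y) = \frac{1}{4}\sum_{u=0}^{7}\sum_{v=0}^{7} C(u)\,C(v)\,F(u,v)\,
\cos\!\left[\frac{(2x+1)u\pi}{16}\right]\cos\!\left[\frac{(2y+1)v\pi}{16}\right],
\qquad C(0)=\tfrac{1}{\sqrt{2}},\quad C(k)=1 \text{ for } k>0
\]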
[0192] where u = 0, …, 7; v = 0, …, 7; x = 0, …, 7; y = 0, …, 7 [0193] [0194] The inverse DCT [0195] For example, for a 4×4 output matrix the 8×8 matrix from the inverse DCT [0196] For a 2×2 output matrix, the 8×8 matrix from the inverse DCT [0197] In addition, since most of the AC coefficients are zero, the inverse DCT [0198] All elements with i or j greater than 1 are set to zero. Setting the high-frequency indices to zero is equivalent to filtering the high-frequency coefficients out of the signal. [0199] Assigning Y as the 2×2 output matrix, the decimated output is thus equal to: [0200] where
k [0201] The creation of a 4×4 output matrix where a given X is an 8×8 input matrix that consists of DC terms [0202] All elements with i or j greater than 3 are set to zero. [0203] It is possible to implement the calculations in the 2×2 case where the two dimensional equation is decomposed downward; however, performing the one dimensional approach twice reduces complexity and decreases the calculation time. In the preferred embodiment, the inverse DCT [0204] column inverse DCT. [0205] The equation for a one dimensional case is as follows: (1dout 1dout 1dout 1dout 1dout [0206] [0207] where c(k) is defined as in the 2×2 output matrix. [0208] The scaler [0209] The scaler [0210] The decimated data is then filtered with a reconstruction filter and an area average filter. The reconstruction filter interpolates the input data by replicating the pixel data. The area average filter then area averages by integrating the area covered by the output pixel. [0211] If the output ratio is less than 1 (i.e, interpolation is necessary), the interpolator C+β*D). [0212] The Image Classifier [0213] The preferred embodiment of the image classifier [0214] The source image [0215] The image classifier [0216] Fuzzy logic is a set-theoretic approach to classification of objects that assigns degrees of membership in a particular class. In classical set theory, an object either belongs to a set or it does not; membership is either 100% or 0%. In fuzzy set theory, an object can be partly in one set and partly in another. The fuzziness is of greater significance when the content must be categorized for the purpose of applying appropriate compression techniques. Relevant categories in image compression include photographic, graphical, noisy, and high-energy. Clearly the boundaries of these sets are not sharp. 
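To make the idea of partial membership concrete, the following Python sketch assigns degrees of membership in two of the categories named above and then weights a setting by those degrees. The input measure `smoothness`, the breakpoints 0.25 and 0.75, and the per-class settings are all hypothetical illustrations, not values from the specification.

```python
def ramp(x, lo, hi):
    """Piecewise-linear ramp: 0 at or below lo, 1 at or above hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def classify(smoothness):
    """Degrees of membership for a hypothetical smoothness score in [0, 1]."""
    photographic = ramp(smoothness, 0.25, 0.75)
    # The two memberships sum to 1, so an image can be partly in each set.
    return {"photographic": photographic, "graphical": 1.0 - photographic}

def blend(memberships, settings):
    """Weight each class's (hypothetical) setting by its membership degree."""
    total = sum(memberships.values())
    return sum(memberships[c] * settings[c] for c in memberships) / total

memberships = classify(0.5)
quality = blend(memberships, {"photographic": 60.0, "graphical": 90.0})
```

A crisp classifier would force the image at `smoothness = 0.5` into a single class; the fuzzy version keeps both memberships and lets later stages weigh them.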
A scheme that matches appropriate compression tools to image content must reliably distinguish between content types that require different compression techniques, and must also be able to judge how to blend tools when types requiring different tools overlap. [0217]FIG. 30 illustrates the optimization of the compression process. The optimization process analyzes the input image [0218] The fuzzy logic image classifier [0219] Furthermore, the fuzzy logic image classifier [0220] The fuzzy logic image classifier [0221] Still further, the fuzzy logic rule base is easily maintained. The rules are modular. Thus, the rules can be understood, researched, and modified independently of one another. In addition, the rule bases are easily modified allowing new rules to make the image classifier [0222]FIG. 31 illustrates a block diagram of the image classifier [0223] The image classifier [0224] In block [0225] Membership in the activity input set and the scale image input set are determined by the input measurements [0226] The color depth input set includes four classifications: gray scale images, 4-bit images, 8-bit images and 24-bit images. The color depth input corresponds to the input measurements [0227] The special feature input set corresponds to the input measurements [0228] In block [0229] For the second Reed Spline Filter [0230] Referring to FIG. 
31, in block [0231] In block [0232] The Enhancement Analyzer [0233] The preferred embodiment of the enhancement analyzer [0234] The enhancement analyzer [0235] Every pixel in the test block is convolved with the following filter masks: M M [0236] to compute two statistics S [0237] Masks M x x x [0238] where the pixel x [0239] S [0240] If S [0241] In addition to the enhancement list [0242] Optimized DCT [0243] The preferred embodiment of the optimized DCT [0244] The fixed DCT quantization tables are tuned to different image types, including eight standard tables corresponding to images differing along three dimensions: photographic versus graphic, small-scale versus large-scale, and high-activity versus low-activity. In the preferred embodiment, additional tables can be added to the resource file [0245] The control script [0246] for i=0, 1, . . . , 63 [0247] if x [0248] if x [0249] Reconstruction is also linear unless reconstruction values have been computed and stored in the CS data segment [0250] for i=0, 1, . . . , 63
[0251] In the fixed-table DCT mode, the optimized DCT [0252] for j=0, 1, . . . , N [0253] H [0254] where x [0255] Second, the centroid of coefficient values is calculated between each quantization step. The formula for the centroid of the ith coefficient in the kth quantization interval is:
[0256] where
[0257] This provides a non-linear mapping of quantized coefficients onto reconstructed values as follows: r [0258] In the adaptive uniform DCT quantization mode, the image classifier [0259] The optimized DCT [0260] The MRT (bits/VMSE) ratio is calculated as follows: [0261] MRT (bits/VMSE) = (Δbits/Δq)/(ΔVMSE/Δq). [0262] Increasing the quantization step value q will add more bits to the representation of the corresponding DCT coefficient. However, adding more bits to the representation of a DCT coefficient will reduce the VMSE. Since the bits added to the step value q are usually transformed into VMSE reduction, the MRT is generally negative. [0263] The MRT is calculated for all of the DCT coefficients. The adaptive method utilized by the optimized DCT [0264] FIG. 32 shows a flow chart of the process of creating an adaptive uniform DCT quantization table. In a step [0265] In step [0266] The optimized DCT [0267] The Reed Spline Filter [0268] FIGS. [0269] The Reed Spline Filter is based on a least-mean-square (LMS) error spline approach, which is extendable to N dimensions. One- and two-dimensional image data compression utilizing linear and planar splines, respectively, is shown to have compact, closed-form optimal solutions for convenient, effective compression. The computational efficiency of this new method is of special interest, because the compression/reconstruction algorithms proposed herein involve only the Fast Fourier Transform (FFT) and inverse FFT types of processors or other high-speed direct convolution algorithms. Thus, the compression and reconstruction from the compressed image can be extremely fast and realized in existing hardware and software. Even with this high computational efficiency, good image quality is obtained upon reconstruction. An important and practical consequence of the disclosed method is the convenience and versatility with which it is integrated into a variety of hybrid digital data compression systems. [0270] I.
SPLINE FILTER OVERVIEW [0271] The basic process of digital image coding entails transforming a source image X into a “compressed” image Y such that the signal energy of Y is concentrated into fewer elements than the signal energy of X, with some provisions regarding error. As depicted in FIG. 34, digital source image data [0272] G and G′ are not necessarily processes of mutual inversion, and the processes may not conserve the full information content of image data X. Consequently, X′ will, in general, differ from X, and information is lost through the coding/reconstruction process. The residual image or so-called residue is generated by supplying compressed data Y′ to a “local” reconstruction process [0273] In practice, to reduce computational overhead associated with large images during compression, a decimating or subsampling process may be performed to reduce the number of samples. Decimation is commonly characterized by a reduction factor τ (tau), which indicates a measure of image data elements to compressed data elements. However, one skilled in the art will appreciate that image data X must be filtered in conjunction with decimation to avoid aliasing. As shown in FIG. 35, a low-pass input filter may take the form of a pointwise convolution of image data X with a suitable convolution filter [0274] The present invention disclosed herein solves this problem by providing a method of optimizing the compressed data such that the mean-square-residue <ΔX [0275] A. 
Image Approximation by Spline Functions [0276] As will become clear in the following detailed description, the input decimation filter [0277] where X′ is the reconstructed image vector and χ [0278] According to the method, the basis functions need not be orthogonal and are preferably chosen to overlap in order to provide a continuous approximation to image data, thereby rendering a non-diagonal basis correlation matrix: [0279] This property is exploited by the method of the present invention, since it allows the user to “adapt” the response of the filter by the nature and degree of cross-correlation. Furthermore, the basis of spline functions need not be complete in the sense of spanning the space of all image data, but preferably generates a close approximation to image X. It is known that the decomposition of image vector X into components of differing spline basis functions {ψ [0280] In a schematic view, a set of spline basis functions S′={ψ [0281] The “best” X′ is determined by the constraint that ΔX=X−X′ is minimized with respect to variations in the weights χ [0282] which, by analogy to FIG. 37, describes an orthogonal projection of X onto S′. [0283] Generally, the above system of equations which determines the optimal χ A (χ [0284] where A χ [0285] rendering compression with the least residue. One skilled in the art of LMS criteria will know how to express the processes given here in the geometry of multiple dimensions. Hence, the processes described herein are applicable to a variety of image data types. [0286] The present brief and general description has direct processing counterparts depicted in FIG. 36. The operation X*ψ [0287] represents a convolution filtering process A [0288] represents the optimizing process [0289] In addition, as will be demonstrated in the following sections, the inverse operation A [0290] where DFT is the familiar discrete Fourier transform (DFT) and λ [0291] II. IMAGE DATA COMPRESSION BY OPTIMAL SPLINE INTERPOLATION [0292] A.
One-Dimensional Data Compression by LMS-Error Linear Splines [0293] For one-dimensional image data, bi-linear spline functions are combined to approximate the image data with a resultant linear interpolation, as shown in FIG. 39. The resultant closed-form approximating and optimizing process has a significant advantage in computational simplicity and speed. [0294] Letting the decimation index τ and image sampling period t be fixed, positive integers τ, t=1,2, . . . , and letting X(t) be a periodic sequence of data of period nτ, where n is also an integer, consider a periodic, linear spline F(t)=F(t+nτ), (1) [0295] where (2) [0296] as shown by the functions ψ [0297] The family of shifted linear splines F(t) is defined as follows: ψ [0298] One object of the present embodiment is to approximate X(t) by the n-point sum:
[0299] in a least-mean-squares fashion where X [0300] Hence, S(t) [0301] To find the “best” weights X [0302] where the sum has been taken over one period plus τ of the data. X [0303] This leads to the system,
[0304] of linear equations for X [0305] and
[0306] The term Y [0307] Letting (t-jτ)=m, then:
[0308] The Y [0309] Since F(t) is assumed to be periodic with period nτ, the matrix form of A [0310] By Equation 13, A A [0311] where (k-j) a [0312] Therefore, A [0313] One skilled in the art of matrix and filter analysis will appreciate that the periodic boundary conditions imposed on the data lie outside the window of observation and may be defined in a variety of ways. Nevertheless, periodic boundary conditions serve to simplify the process implementation by insuring that the correlation matrix [A [0314] B. Two-Dimensional Data Compression by Planar Splines [0315] For two-dimensional image data, multi-planar spline functions are combined to approximate the image data with a resultant planar interpolation. In FIG. 40, X(t [0316] Consider now a doubly periodic planar spline, F(t ψ [0317] for (k [0318] is a minimum. [0319] A condition for L to be a minimum is
[0320] The best coefficients X A [0321] where the summation is on k [0322] and
[0323] With the visual aid of FIG. 41 [0324] Letting t [0325] for (j [0326] The values of α, β, γ, and ξ depend on τ, and the shape and orientation of the hexagonal tent with respect to the image domain, where for example m [0327] From Equation 25 above, A A [0328] where (k [0329] where (s [0330] C. Compression-Reconstruction Algorithms [0331] Because the objective is to apply the above-disclosed LMS error linear spline interpolation techniques to image sequence coding, it is advantageous to utilize the tensor formalism during the course of the analysis in order to readily solve the linear systems in equations 8 and 20. Here, the tensor summation convention is used in the analysis for one and two dimensions. It will be appreciated that such convention may readily apply to the general case of N dimensions. [0332] 1. Linear Transformation of Tensors [0333] A linear transformation of a 1st-order tensor is written as Y [0334] where A Y [0335] The product or composition of linear transformations is defined as follows. When the above Equation 29 holds, and Z [0336] then Z [0337] Hence, C [0338] is the composition or product of two linear transformations. [0339] 2. Circulant Transformation of 1st-Order Tensors [0340] The tensor method for solving equations 8 and 20 is illustrated for the 1-dimensional case below: Letting A A [0341] and considering the n special 1st-order tensors as W [0342] where ω is the n-th root of unity, then A [0343] where
[0344] are the distinct eigenvalues of A [0345] At this point it is convenient to normalize these tensors as follows:
[0346] Φ Φ [0347] where δ [0348] A linear transformation is formed by summing the n dyads Φ [0349] Then
[0350] Since Ă [0351] 3. Inverse Transformation of 1st-Order Tensors. [0352] The inverse transformation of A [0353] This is proven easily, as shown below:
[0354] 4. Solving 1st-Order Tensor Equations [0355] The solution of a A [0356] so that
[0357] where DFT denotes the discrete Fourier Transform and DFT [0358] An alternative view of the above solution method is derived below for one dimension using standard matrix methods. A linear transformation of a 1st-order tensor can be represented by a matrix. For example, let A denote A A=QΛQ [0359] The solution to y=Ax is then x=A [0360] For the one-dimensional process described above, the eigenvalues of the transformation operators are:
[0361] where a λ(l)=α+βω [0362] A direct extension of the 1st-order tensor concept to the 2nd-order tensor will be apparent to those skilled in the art. By solving the 2nd-order tensor equations, the results are extended to compress a 2-D image. FIG. 42 depicts three possible hexagonal tent functions for two-dimensional image compression with indices τ = 2, 3, 4. The following table exemplifies the relevant parameters for implementing the hexagonal tent functions:
[0363] The algorithms for compressing and reconstructing a still image are explained in the succeeding sections. [0364] III. OVERVIEW OF CODING-RECONSTRUCTION SCHEME [0365] A block diagram of the compression/reconstruction scheme is shown in FIG. 43. The signal source [0366] The embodiment of the optimized spline filter described above may employ a DFT and DFT [0367] which improves with the size of the image. [0368] A. The Compression Method [0369] The coding method is specified in the following steps: [0370] 1. A suitable value of τ (an integer) is chosen. The compression ratio is τ [0371] 2. Equation 23 is applied to find Y [0372] B. The Reconstruction Method [0373] The reconstruction method is shown below in the following steps: [0374] 1. Find the FFT [0375] 2. The results of step 1 are divided by the eigenvalues λ(l, m) set forth below. The eigenvalues λ(l,m) are found by extending Equation 48 to the two-dimensional case to obtain: λ(l,m)=α+β(ω [0376] where ω [0377] 3. The FFT of the results from step 2 is then taken. After computing the FFT, X [0378] 4. The recovered or reconstructed image is:
[0379] 5. Preferably, the residue is computed and retained with the optimized weights: ΔX(t [0380] Although the optimizing procedure outlined above appears to be associated with an image reconstruction process, it may be implemented at any stage between the aforementioned compression and reconstruction. It is preferable to implement the optimizing process immediately after the initial compression so as to minimize the residual image. The preferred order has an advantage with regard to storage, transmission and the incorporation of subsequent image processes. [0381] C. Response Considerations [0382] The inverse eigenfilter in the conjugate domain is described as follows:
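Consistent with step 2 of the reconstruction method, in which the transformed data are divided by the eigenvalues, the inverse eigenfilter can be written as the reciprocal of λ. This form is inferred from the surrounding text and is an assumption about the original equation.

\[
H(i,j) = \frac{1}{\lambda(i,j)}
\]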
[0383] where λ(i,j) can be considered as an estimation of the frequency response of the combined decimation and interpolation filters. The optimization process H(i,j) attempts to “undo” what is done in the combined decimation/interpolation process. Thus, H(i,j) tends to restore the original signal bandwidth. For example, for τ=2, the decimation/interpolation combination is described as having an impulse response resembling that of the following 3×3 kernel:
[0384] Then, its conjugate domain counterpart, λ(i,j)| [0385] where i,j are frequency indexes and N represents the number of frequency terms. Hence, the implementation accomplished in the image conjugate domain is the conjugate equivalent of the inverse of the above 3×3 kernel. This relationship will be utilized more explicitly for the embodiment disclosed in Section V. [0386] IV. NUMERICAL SIMULATIONS [0387] A. One-Dimensional Case [0388] For a one-dimensional implementation, two types of signals are demonstrated. A first test is a cosine signal which is useful for observing the relationship between the standard error, the size of τ and the signal frequency. The standard error is defined herein to be the square root of the average error:
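The one-dimensional cosine test can be sketched in Python as follows. The sketch reads "average error" as the mean squared difference between the original and reconstructed samples, which is an assumption. It builds periodic linear (tent) splines, computes the correlation terms numerically rather than from closed-form values of α and β, and solves the circulant system with an FFT. All function names and the test parameters (τ = 4, n = 32, three cycles per period) are hypothetical.

```python
import numpy as np

def spline_basis(n, tau):
    """Rows are n periodic tent functions psi_k(t) = F(t - k*tau), period n*tau."""
    period = n * tau
    t = np.arange(period)
    basis = np.empty((n, period))
    for k in range(n):
        d = (t - k * tau) % period
        d = np.minimum(d, period - d)              # periodic distance to knot k
        basis[k] = np.maximum(0.0, 1.0 - d / tau)  # linear spline, peak 1
    return basis

def lms_spline_weights(x, tau):
    """Least-squares spline weights via FFT (periodic boundary conditions)."""
    n = len(x) // tau
    basis = spline_basis(n, tau)
    y = basis @ x                 # correlations of the data with each spline
    a_col = basis @ basis[0]      # first column of the circulant matrix A
    lam = np.fft.fft(a_col)       # eigenvalues of A
    weights = np.real(np.fft.ifft(np.fft.fft(y) / lam))
    return weights, basis

def standard_error(x, x_rec):
    """Square root of the mean squared error (assumed definition)."""
    return float(np.sqrt(np.mean((x - x_rec) ** 2)))

tau, n = 4, 32
t = np.arange(n * tau)
x = np.cos(2 * np.pi * 3 * t / (n * tau))      # 3 cycles per period

weights, basis = lms_spline_weights(x, tau)
x_rec = weights @ basis                         # optimized reconstruction
err_lms = standard_error(x, x_rec)

x_plain = x[::tau] @ basis                      # plain subsampling + interpolation
err_plain = standard_error(x, x_plain)
```

Because the optimized weights give the orthogonal projection of the signal onto the spline basis, `err_lms` cannot exceed `err_plain`, which uses the raw subsamples as weights. Repeating the run over several frequencies and values of τ gives error curves of the kind the text describes.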
[0389] A second one-dimensional signal is taken from one line of a grey-scale still image, which is considered to be realistic data for practical image compression. [0390] FIG. 47 shows the plots of standard error versus frequency of the cosine signal for different degrees of decimation τ [0391] Another test example comes from one line of realistic still image data. FIG. 48 [0392] B. Two-Dimensional Case [0393] For the two-dimensional case, realistic still image data are used as the test. FIG. 49 and [0394] An additional aspect of interest is to look at the optimized weights directly. When these optimal weights are viewed in picture form, high-quality miniatures [0395] V. ALTERNATIVE EMBODIMENTS [0396] Video compression is a major component of high-definition television (HDTV). According to the present invention, video compression is formulated as an equivalent three-dimensional approximation problem, and is amenable to the technique of optimum linear or, more generally, hyperplanar spline interpolation. The main advantages of this approach are its speed in coding/reconstruction, its suitability for VLSI hardware implementation, and its variable compression ratio. A principal advantage of the present invention is the versatility with which it is incorporated into other compression systems. The invention can serve as a “front-end” compression platform from which other signal processes are applied. Moreover, the invention can be applied iteratively, in multiple dimensions and in either the image or image conjugate domain. The optimizing method can, for example, be applied to a compressed image and further applied to a corresponding compressed residual image. Due to the inherent low-pass filtering nature of the interpolation process, some edges and other high-frequency features may not be preserved in the reconstructed images, although they are retained through the residue.
To address this problem, the following procedures are set forth: [0397] Procedure (a) [0398] Since the theoretical formulation, derivation, and implementation of the disclosed compression method do not depend strongly on the choice of the interpolation kernel function, other kernel functions can be applied and their performances compared. So far, due to its simplicity and excellent performance, only the linear spline function has been applied. Higher-order splines, such as the quadratic or cubic spline, could also be employed. Aside from the polynomial spline functions, other more complicated function forms can be used. [0399] Procedure (b) [0400] Another way to improve the compression method is to apply certain adaptive techniques. FIG. 53 illustrates such an adaptive scheme. For a 2-D image [0401] Procedure (c) [0402] Subband coding techniques have been widely used in digital speech coding. More recently, subband coding has also been applied to digital image data compression. The basic approach of subband coding is to split the signal into a set of frequency bands, and then to compress each subband with an efficient compression algorithm which matches the statistics of that band. The subband coding techniques divide the whole frequency band into smaller frequency subbands. Then, when these subbands are demodulated into the baseband, the resulting equivalent bandwidths are greatly reduced. Since the subbands have only low frequency components, one can use the above-described linear or planar spline data compression technique for coding these data. A 16-band filter compression system is shown in FIG. 54, and the corresponding reconstruction system in FIG. 55. There are, of course, many ways to implement this filter bank, as will be appreciated by those skilled in the art. For example, a common method is to exploit the Quadrature Mirror Filter structure. [0403] V.
IMAGE DOMAIN IMPLEMENTATION [0404] The embodiments described earlier utilize a spline filter optimization process in the image conjugate domain using an FFT processor or equivalent thereof. The present invention also provides an equivalent image domain implementation of a spline filter optimization process which presents distinct advantages with regard to speed, memory and process application. [0405] Referring back to Equation 45, it will be appreciated that the transform processes DFT and DFT [0406] If Ω=DFT (1/λ X [0407] Furthermore, with λ [0408] In practice, there is a compromise between accuracy and economy with regard to the specific form of Ω. The optimizer tensor Ω should be of sufficient size for adequate approximation of:
[0409] On the other hand, the term Ω should be small enough to be computationally tractable for the online convolution process [0410] Additionally, to reduce computational overhead, the smallest elements (i.e., the elements near the perimeter) such as f, g, and h may be set to zero with little noticeable effect in the reconstruction. [0411] The principal advantages of the present preferred embodiment are in computational saving above and beyond that of the previously described conjugate domain inverse eigenfilter process (FIG. 38, 1018). For example, a two-dimensional FFT process may typically require about N [0412] Additionally, there is substantial reduction in buffer demands because the image domain process [0413] The above detailed description is intended to be exemplary and not limiting. From this detailed description, taken in conjunction with the appended drawings, the advantages of the present invention will be readily understood by one who is skilled in the relevant technology. The present apparatus and method provide a unique encoder, compressed file format and decoder which compress images and decode compressed images. The unique compression system increases the compression ratios for comparable image quality while achieving relatively quick encoding and decoding times, optimizes the encoding process to accommodate different image types, selectively applies particular encoding methods for a particular image type, layers the image quality components in the compressed image, and generates a file format that allows the addition of other compressed data information. [0414] While the above detailed description has shown, described and pointed out the fundamental novel features of the invention as applied to various embodiments, it will be understood that various omissions and substitutions and changes in the form and details of the illustrated device may be made by those skilled in the art, without departing from the spirit of the invention.