|Publication number||US7613347 B1|
|Application number||US 11/303,076|
|Publication date||Nov 3, 2009|
|Filing date||Dec 13, 2005|
|Priority date||Dec 13, 2005|
|Inventors||Anubha Rastogi, Balaji Krishnamurthy|
|Original Assignee||Adobe Systems, Incorporated|
1. Field of the Invention
The present invention relates to techniques for compressing digital images. More specifically, the present invention relates to a method and apparatus for compressing a digital image, which operates by transforming a digital image into coefficients and then compressing the low-entropy portions of the coefficients while not substantially compressing the high-entropy portions of the coefficients.
2. Related Art
In recent years, the use of high-resolution digital cameras has increased significantly. Unfortunately, storing the high-resolution digital images produced by such cameras requires large amounts of memory. For example, a 4 mega-pixel digital camera requires up to 12 mega-bytes (MB) of memory to store a single raw, uncompressed digital image. Furthermore, transmitting a large digital image across a network requires a substantial amount of time, even when a high-speed broadband Internet connection is used. In order to deal with these problems, image-compression techniques are typically used to reduce the size of a digital image.
In commonly-used transform-based compression techniques, a digital image is transformed, for example, by using a discrete cosine transform (DCT) or a wavelet transform, to convert pixel values in the digital image into transform-domain coefficients. Next, the transform-domain coefficients are quantized by dividing the transform-domain coefficients by a quantizer and rounding the result to the nearest integer. These quantized transform-domain coefficients are then “entropy coded.” Note that it is the entropy coding that actually compresses the data. The purpose of the transform and the quantization operations is to make the image data more amenable to compression by removing redundancies and by presenting a low-entropy data stream to the entropy coder. Some examples of transform-based image coding techniques are JPEG, which uses a DCT, and JPEG2000, which uses a wavelet transform.
A corresponding decoder performs the inverse operations in reverse order to recover the image. For example, the decoding process can involve entropy decoding, followed by de-quantization and an inverse transform.
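The quantization step described above, and the matching de-quantization step in the decoder, can be sketched as follows. This is a minimal illustration only: the function names, the use of a single scalar quantizer step, and the tie-breaking rule (halves rounded away from zero) are assumptions made for this sketch and are not specified by the description.

```python
import math

def quantize(coefficients, step):
    """Divide each transform-domain coefficient by the quantizer step and
    round to the nearest integer (assumption: ties round away from zero)."""
    return [int(math.copysign(math.floor(abs(c) / step + 0.5), c))
            for c in coefficients]

def dequantize(quantized, step):
    """Inverse operation used by a decoder: scale back by the quantizer step."""
    return [q * step for q in quantized]

coeffs = [10.0, -7.4, 0.9, 3.0]
q = quantize(coeffs, 2.0)
restored = dequantize(q, 2.0)
print(q, restored)
```

The restored values differ from the originals by at most half the quantizer step, which is the information the quantization discards.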
In order to produce the best possible compression, modern wavelet-based compression systems use arithmetic coders for entropy coding. An arithmetic coder is an entropy coder that can compress data close to the theoretical maximum that can be achieved (i.e. the length of an arithmetically-coded stream will be close to the entropy of the stream). Since the success of arithmetic-coding depends on the availability of a good probabilistic model of the coefficient bit patterns, highly sophisticated bit-modeling techniques have been developed. One example of a coefficient bit-modeling/arithmetic-coding scheme is EBCOT (Embedded Block Coding with Optimized Truncation), which is incorporated into the JPEG2000 image coding standard.
Although bit-modeling/arithmetic-coding schemes provide good compression, they tend to be slow, even on modern processors. The majority of the time spent decompressing/compressing an image is spent on the coding of the wavelet coefficients. For example, up to 70% of the time taken to decompress a JPEG2000 stream is spent in entropy coding. This large amount of computational time is a substantial obstacle in the deployment of wavelet-based compression techniques for many image-processing applications, the most notable being video applications.
One solution to this problem is to bypass the arithmetic coder. The JPEG2000 standard provides an arithmetic bypass mode, in which the arithmetic coder is bypassed for some of the sub-bitplane passes. Unfortunately, this technique is still extremely complex because: (1) there is at least one pass in each plane that is arithmetically coded, and (2) the bit modeling is performed for all passes regardless of whether the arithmetic coder is to be employed or not.
Another solution is to use the existing JPEG compression standard, which uses DCT and Huffman coding, and has low complexity. Unfortunately, JPEG compression cannot achieve compression comparable to wavelet-based systems. For example, a typical JPEG2000 image is about 30% smaller than a JPEG image of comparable quality. Furthermore, at low quality the blocking artifacts of JPEG are very significant. Note that the entropy coder used for JPEG compression does not attempt to separate the low-entropy and high-entropy portions of the quantized DCT coefficients.
Hence, what is needed is a method and an apparatus for compressing a digital image without the problems described above.
One embodiment of the present invention provides a system that compresses a digital image. During operation, the system performs a transform operation on the digital image to produce transform domain coefficients, and then quantizes the transform domain coefficients to produce quantized coefficients. The system then separates the low-entropy portions of the quantized coefficients from the high-entropy portions of the quantized coefficients, wherein the low-entropy portions of the quantized coefficients comprise the most-significant bitplanes of the absolute values of the quantized coefficients. Next, the system compresses the low-entropy portions of the quantized coefficients while not substantially compressing the high-entropy portions of the quantized coefficients.
In a variation on this embodiment, while performing the transform operation on the digital image to produce transform domain coefficients, the system performs a discrete cosine transform or a wavelet transform.
In a variation on this embodiment, while compressing the low-entropy portions of the quantized coefficients, the system partitions the quantized coefficients into smaller blocks of coefficients. If the maximum number of bitplanes for the absolute values of the smaller block of coefficients is greater than zero, the system: (1) partitions the smaller blocks of coefficients into sub-blocks of coefficients; (2) generates a bit pattern for a given low-entropy bitplane within a given sub-block by using a one to represent each column of the given low entropy bitplane which contains ones, and by using a zero to represent each column of the given low-entropy bitplane which does not contain ones; (3) encodes the bit pattern for the given low-entropy bitplane within the given sub-block; and (4) for each column of the given low-entropy bitplane within the given sub-block which contains ones, appends the column of bits to the encoded bit pattern for the given sub-block.
In a further variation on this embodiment, if the maximum number of bitplanes for the smaller block of coefficients is zero, the system encodes the smaller block of coefficients with a zero.
In a further variation on this embodiment, if the sub-block of coefficients is a corner sub-block, which is not a standard sub-block size, the system generates a bit pattern for the given low-entropy bitplane within the corner sub-block by doing the following: (1) for each column of the given low-entropy bitplane within the corner sub-block which contains a one, using a one to represent the column, and appending the column of bits to the bit pattern for the given low-entropy bitplane within the corner sub-block; and (2) for each column of the given low-entropy bitplane within the corner sub-block which contains a zero, using a zero to represent the column.
In a further variation on this embodiment, the standard sub-block size is n bits by n bits.
In a variation on this embodiment, while encoding the bit pattern for the given low entropy bitplane, the system uses a Huffman coding technique.
In a further variation on this embodiment, the Huffman table used with the Huffman coding technique is either a static table, or a custom-generated table saved with each image.
In a variation on this embodiment, the method is used to compress a sequence of images or to compress motion-compensated difference frames.
In a variation on this embodiment, sign information for the quantized coefficients is not compressed.
In a variation on this embodiment, the high-entropy portions of the quantized coefficients are not compressed, and are directly written to an output bitstream.
The following description is presented to enable any person skilled in the art to make and use the invention, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present invention. Thus, the present invention is not limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
The data structures and code described in this detailed description are typically stored on a computer-readable storage medium, which may be any device or medium that can store code and/or data for use by a computer system. This includes, but is not limited to, magnetic and optical storage devices such as disk drives, magnetic tape, CDs (compact discs) and DVDs (digital versatile discs or digital video discs).
Many applications can sacrifice some compression in exchange for increased decompression performance. One such example is an application that handles high-quality video. Note that even applications that need to handle single images will benefit from fast decompression if they have to handle a large number of images at a time (for example, a slideshow).
Up to 70% of the processing time in state-of-the-art compression systems, such as JPEG2000, is spent in entropy coding. Therefore, it is desirable to replace the bit-modeling/arithmetic-coding technique specified in the JPEG2000 standard with a lower-complexity technique that compresses images to sizes comparable to those achieved by JPEG2000, but is substantially faster.
In one embodiment of the present invention, the “compressible” (low-entropy) portion and the “incompressible” (high-entropy) portions of the quantized wavelet transform coefficients are separated. A simple compression technique is applied to the compressible portions of the quantized wavelet transform coefficients and the incompressible portions of the quantized wavelet transform coefficients are not compressed.
Note that following the wavelet transform and quantization processes, blocks of nearby coefficients within each sub-band have nearly the same length in bits. Moreover, only a few most significant bitplanes of each block are compressible (i.e. have low-entropy).
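The observation that only the most-significant bitplanes are compressible can be illustrated by estimating the empirical entropy of each bitplane of a block. The sketch below is illustrative only: the coefficient magnitudes are made-up sample values, and the helper names are hypothetical.

```python
import math
from collections import Counter

def entropy_bits(bits):
    """Empirical Shannon entropy, in bits per symbol, of a sequence of bits."""
    total = len(bits)
    counts = Counter(bits)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def bitplane(magnitudes, plane):
    """Extract bit number `plane` (0 = least significant) of each magnitude."""
    return [(m >> plane) & 1 for m in magnitudes]

# Made-up absolute values of quantized coefficients in one small block.
magnitudes = [1, 2, 3, 0, 1, 5, 2, 1, 0, 3, 1, 2, 9, 1, 0, 2]

for plane in range(3, -1, -1):
    print(plane, round(entropy_bits(bitplane(magnitudes, plane)), 3))
```

For data of this kind, the top planes are almost entirely zeros and have entropy well below one bit per symbol, while the lowest plane is close to one bit per symbol and offers little to compress.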
One embodiment of the present invention uses a low-complexity Huffman coder to code the low-entropy portions of the bitplanes, and the high-entropy portions (i.e. the lower significant bitplanes) are written out without any compression.
In one embodiment of the present invention, the low-entropy portions of the bitplanes are the top n bitplanes.
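Separating the top n bitplanes of a coefficient magnitude from the remaining lower bitplanes can be sketched with shifts and masks. The function names and the convention of passing the block's total number of significant bitplanes are assumptions made for illustration.

```python
def split_magnitude(magnitude, total_planes, top_n):
    """Split a magnitude into its top `top_n` bitplanes (the part to be
    entropy coded) and the remaining lower bitplanes (the part written raw).
    Returns (msb_part, lsb_part, lsb_width)."""
    lsb_width = max(total_planes - top_n, 0)
    msb_part = magnitude >> lsb_width
    lsb_part = magnitude & ((1 << lsb_width) - 1)
    return msb_part, lsb_part, lsb_width

def join_magnitude(msb_part, lsb_part, lsb_width):
    """Recombine the two parts, as a decoder would."""
    return (msb_part << lsb_width) | lsb_part

msb, lsb, width = split_magnitude(0b101101, total_planes=6, top_n=2)
print(bin(msb), bin(lsb), width)
```

Because the split is a pure bit operation, recombining the two parts restores the magnitude exactly.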
Note that one embodiment of the present invention does not employ bit-modeling or arithmetic-coding.
Coding Coefficient Lengths
A wavelet transform process transforms the spatial-domain representation of the digital image into a frequency-domain representation. This wavelet transform process generates coefficients which are grouped into sub-bands. Each sub-band is divided into small sub-blocks and the significant bitplanes of the sub-block are encoded. Note that the maximum number of significant bitplanes of a sub-block is the length (in bits) of the sub-block coefficient that has the largest absolute value. Also, note that the size of the sub-block is chosen to be small because a small group of coefficients next to each other has a greater chance of having the same length. Conversely, a sub-block which is too small causes excessive overhead while coding the number of significant planes of the sub-blocks. (Furthermore, note that although this specification describes a system that applies the present invention to a wavelet-based compression technique, the present invention can be applied to other compression techniques, and is not meant to be limited to wavelet-based techniques.)
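The number of significant bitplanes of a sub-block, as defined above, is the bit length of the largest absolute value in the sub-block. A minimal sketch, with a hypothetical function name and a sub-block represented as a list of rows:

```python
def significant_bitplanes(sub_block):
    """Bit length of the coefficient with the largest absolute value.
    An all-zero (or empty) sub-block has zero significant bitplanes."""
    largest = max((abs(c) for row in sub_block for c in row), default=0)
    return largest.bit_length()

print(significant_bitplanes([[0, -5], [3, 1]]))
print(significant_bitplanes([[0, 0], [0, 0]]))
```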
In one embodiment of the present invention, the standard block size is n coefficients by n coefficients.
In one embodiment of the present invention, to minimize the number of bits used to encode the number of significant bitplanes for each block, the minimum number of significant bitplanes for a block of a sub-band is determined, and for each block, the difference between the number of bitplanes in a given block and the minimum number of bitplanes is used to encode the number of significant planes for each block.
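The difference-based signaling of bitplane counts can be sketched as follows. The description does not state how the differences themselves are written to the bitstream, so the fixed-width field computed here is only an assumption for illustration, as are the function names.

```python
def encode_plane_counts(counts):
    """Represent each block's bitplane count as an offset from the
    sub-band minimum. Returns (minimum, offsets, bits_per_offset)."""
    base = min(counts)
    offsets = [c - base for c in counts]
    # Assumed fixed-width field, just wide enough for the largest offset.
    bits_per_offset = max(offsets).bit_length()
    return base, offsets, bits_per_offset

def decode_plane_counts(base, offsets):
    """Recover the original per-block bitplane counts."""
    return [base + d for d in offsets]

base, offsets, width = encode_plane_counts([5, 6, 5, 7])
print(base, offsets, width)
```

In this example each offset fits in two bits, whereas the raw counts would each need three.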
In one embodiment of the present invention, the top two bit-planes of each sub-block are mostly zeros and hence can be compressed. Each bitplane is compressed independently. The present invention uses Huffman coding for compression. The symbols for Huffman coding are created by aggregating the values of individual bits of the sub-block in the following way.
A four-bit symbol is produced for a four-by-four block of pixels within the sub-block. The four-bit symbol indicates which columns are zero and which columns contain at least a one (i.e. non-zero). For each column of four-by-one pixels, a zero is used to represent the column if all bits in the column are zero. A one is used to represent the column if at least one bit in the column is one (i.e. non-zero). Hence, a four-bit symbol for the four-by-four sub-block (one for each column) is generated, which indicates which columns are all zero and which columns contain at least a one bit.
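The construction of the four-bit symbol can be sketched as follows. The bit ordering (leftmost column mapped to the most-significant bit of the symbol) is an assumption made for this sketch, and the function name is hypothetical.

```python
def column_symbol(plane):
    """Build the four-bit symbol for one four-by-four bitplane.
    `plane` is a list of four rows, each a list of four bits.
    Assumption: the leftmost column maps to the most-significant symbol bit."""
    symbol = 0
    for col in range(4):
        has_one = any(plane[row][col] for row in range(4))
        symbol = (symbol << 1) | int(has_one)
    return symbol

plane = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print(format(column_symbol(plane), "04b"))
```

Here the second and fourth columns contain a one, so under the assumed ordering the symbol is binary 0101.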
A Huffman code is generated for the symbols (decimals 0 to 15). The Huffman tree can be statically created, based on experimental data, or it can be fine-tuned for a particular image. The encoding of each four-by-four block includes the Huffman code for the block followed by the bit pattern of each column which contains at least one non-zero bit. If all the bits of a particular column are zero, nothing is signaled since it can be inferred from the Huffman code that all entries in the column are zero.
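The encoding of a four-by-four block (Huffman code followed by the non-zero columns) can be sketched as follows. The symbol frequencies are invented for illustration and are not experimental data from the description; the column bit ordering and all function names are likewise assumptions.

```python
import heapq
from itertools import count

def build_huffman_codes(freqs):
    """Build a prefix code (symbol -> bit string) from symbol weights."""
    tie = count()  # unique tie-breaker so heap entries never compare dicts
    heap = [(w, next(tie), {s: ""}) for s, w in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {s: "0" for s in heap[0][2]}
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]

def column_symbol(plane):
    """Four-bit symbol; leftmost column is the most-significant bit (assumed)."""
    symbol = 0
    for col in range(4):
        symbol = (symbol << 1) | int(any(plane[row][col] for row in range(4)))
    return symbol

def encode_block(plane, codes):
    """Huffman code for the symbol, then each non-zero column top to bottom."""
    out = [codes[column_symbol(plane)]]
    for col in range(4):
        column = [plane[row][col] for row in range(4)]
        if any(column):
            out.append("".join(str(b) for b in column))
    return "".join(out)

# Invented weights: the all-zero symbol is assumed to dominate.
freqs = {s: 1 for s in range(16)}
freqs[0] = 100
codes = build_huffman_codes(freqs)

plane = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print(encode_block(plane, codes))
```

With weights like these, an all-zero four-by-four bitplane costs a single bit, which is where the compression of the top bitplanes comes from.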
For example, in
For corner sub-blocks (i.e. blocks that are not four-by-four in size), a zero is used to represent the column if all the bits in the column are zero. A one bit followed by the column of bits is used to represent a column containing at least a one.
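The corner sub-block rule can be sketched as follows, with the output modeled as a string of bit characters. The function name and the top-to-bottom ordering of the column bits are assumptions.

```python
def encode_corner_plane(plane):
    """Encode one bitplane of a corner sub-block of arbitrary size.
    An all-zero column becomes '0'; any other column becomes '1'
    followed by the bits of that column, top to bottom (assumed order)."""
    if not plane:
        return ""
    rows, cols = len(plane), len(plane[0])
    out = []
    for col in range(cols):
        column = [plane[row][col] for row in range(rows)]
        if any(column):
            out.append("1" + "".join(str(b) for b in column))
        else:
            out.append("0")
    return "".join(out)

# A three-row, two-column corner sub-block bitplane.
print(encode_corner_plane([[0, 1], [0, 0], [0, 1]]))
```

No Huffman table is involved here, so the rule works for any sub-block dimensions.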
Next, the system partitions the sub-bands of the quantized coefficients into smaller blocks of coefficients (step 310).
The system performs steps 312 to 322 on each sub-block. If the maximum number of bitplanes for the smaller block of coefficients is zero, the system encodes the smaller block of coefficients with a zero (steps 312 and 322).
If the maximum number of bitplanes for the smaller block of coefficients is greater than zero, the system partitions the smaller block of coefficients into sub-blocks (steps 312 and 314). The system then generates a bit pattern for a given low-entropy bitplane within a given sub-block by using a one to represent each column of the given low-entropy bitplane which contains ones, and using a zero to represent each column of the given low-entropy bitplane which does not contain ones (step 316).
Next, the system encodes the bit pattern for the given low-entropy bitplane within the given sub-block (step 318). For each column of the given low-entropy bitplane within the given sub-block which contains ones, the system appends the column of bits to the encoded bit pattern for the given sub-block (step 320).
Beyond the top two bitplanes, there is very little to compress. Hence, the remaining bitplanes are written out with no compression.
For systems that require even less complexity, this bitplane coding step can be bypassed and the coefficients can be written out as they are, without any compression.
In one embodiment of the present invention, the sign information of the non-zero coefficients in the block is collected and is written directly without any compression. If greater compression of the sign information is desired, then the sign information is signaled at the end of each coefficient such that the sign information is sent only if the coefficient is non-zero.
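Collecting the sign information of the non-zero coefficients can be sketched as follows. The convention that a one bit denotes a negative coefficient is an assumption, and the function names are hypothetical.

```python
def collect_sign_bits(coefficients):
    """One raw bit per non-zero coefficient (assumption: 1 means negative)."""
    return [1 if c < 0 else 0 for c in coefficients if c != 0]

def apply_sign_bits(magnitudes, sign_bits):
    """Decoder side: reattach signs, consuming one bit per non-zero magnitude."""
    signs = iter(sign_bits)
    return [(-m if next(signs) else m) if m != 0 else 0 for m in magnitudes]

block = [3, 0, -2, -1, 0, 4]
bits = collect_sign_bits(block)
print(bits)
print(apply_sign_bits([abs(c) for c in block], bits))
```

Zero coefficients consume no sign bit, because the decoder already knows from the magnitudes which coefficients are zero.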
Note that if the coefficients are written bitplane-by-bitplane instead of all at once, an embedded code stream organization is achieved, thereby enabling progressive transmission. A post-compression rate-distortion (PCRD) process can be performed on the sub-blocks to produce an optimally scalable and embedded code stream. In this case, the sign plane needs to be written for each coefficient at the beginning, whether the coefficient is zero or not. This will cause a minor increase in file sizes.
One embodiment of the present invention uses two-by-two blocks to generate symbols for Huffman coding. Another embodiment of the present invention uses one-by-four (horizontal) blocks to generate symbols for Huffman coding. The compression efficiency for these embodiments is similar to that of the four-by-one blocks case described in detail here, but the advantage of four-by-one blocks (vertical lines) is that they are amenable to optimization (for example, SIMD instruction optimization).
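One plausible reading of why vertical four-by-one columns suit data-parallel optimization is that, when each row of a bitplane is held as a packed word, the symbol for all four columns can be obtained at once by OR-ing the rows together. The sketch below illustrates that idea in plain Python; it is an interpretation offered for illustration, not a statement of how the described system is optimized, and the packed-row representation is an assumption.

```python
def column_symbol_packed(rows):
    """`rows` holds four integers, each packing one row of a four-by-four
    bitplane (leftmost column in bit 3). OR-ing the rows sets a bit for
    every column that contains at least a one."""
    symbol = 0
    for row in rows:
        symbol |= row
    return symbol & 0xF

print(format(column_symbol_packed([0b0000, 0b0100, 0b0001, 0b0000]), "04b"))
```

The same OR reduction applies unchanged to wider words, which is the property a SIMD implementation could exploit.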
In one embodiment of the present invention, the compression technique can be applied to individual images or to a sequence of images. For a sequence of images, such as a video stream, the technique can be applied to individual video frames or to the motion-compensated difference frames. If the compression technique is applied to each frame separately, the resulting stream is easily editable.
Decompressing a Compressed Digital Image
Next, for each sub-block in the compressed digital image, the system produces the decoded quantized transform-domain coefficients by combining the decoded low-entropy portions of the quantized transform-domain coefficients and the decoded high-entropy portions of the quantized transform-domain coefficients (step 404). The system decodes a given low-entropy bitplane within the sub-block by decoding the Huffman code for the given bitplane within the sub-block to produce the bit pattern which indicates whether a given column within the given bitplane contains ones. The system then generates an uncompressed bitplane within the sub-block for the given bitplane by inserting the corresponding bit pattern for each column into the uncompressed bitplane within the sub-block. Note that the corresponding bit pattern for each column follows the Huffman code for the given bitplane within the sub-block. If a column contains all zeros (indicated by a zero in the bit pattern), a column of zeros is inserted into the given uncompressed bitplane within the sub-block. If the Huffman code for the sub-block is a zero, the sub-block contains all zeros.
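Decoding one low-entropy four-by-four bitplane can be sketched as follows. To keep the sketch self-contained, a hypothetical stand-in prefix code is used in place of a real Huffman table: the all-zero symbol is coded as "0" and every other symbol as "1" followed by its four-bit value. The bit ordering and function name are also assumptions.

```python
def decode_plane(bits, pos=0):
    """Decode one four-by-four bitplane from a string of bit characters.
    Returns (plane, next_position)."""
    # Stand-in prefix code (hypothetical, not a real Huffman table):
    # "0" -> symbol 0, "1" + four bits -> that four-bit symbol.
    if bits[pos] == "0":
        symbol, pos = 0, pos + 1
    else:
        symbol, pos = int(bits[pos + 1:pos + 5], 2), pos + 5
    plane = [[0] * 4 for _ in range(4)]
    for col in range(4):
        # Leftmost column corresponds to the most-significant symbol bit.
        if symbol & (1 << (3 - col)):
            for row in range(4):
                plane[row][col] = int(bits[pos + row])
            pos += 4
        # Columns flagged with a zero stay all zeros; nothing is read.
    return plane, pos

stream = "1" + "0101" + "0100" + "0010"
plane, end = decode_plane(stream)
for row in plane:
    print(row)
```

Returning the next read position lets a caller decode consecutive bitplanes from the same stream.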
The system then de-quantizes the decoded quantized transform-domain coefficients (step 406) and performs an inverse discrete wavelet transform operation on the de-quantized transform-domain coefficients to produce the uncompressed digital image (step 408).
The foregoing descriptions of embodiments of the present invention have been presented only for purposes of illustration and description. They are not intended to be exhaustive or to limit the present invention to the forms disclosed. Accordingly, many modifications and variations will be apparent to practitioners skilled in the art. Additionally, the above disclosure is not intended to limit the present invention. The scope of the present invention is defined by the appended claims.
|Cited Patent||Filing date||Publication date||Applicant||Title|
|US5537493 *||Jun 29, 1994||Jul 16, 1996||Sony Corporation||Apparatus for compressing image data employing entropy encoding of data scanned from a plurality of spatial frequency bands|
|US5754793||Aug 3, 1995||May 19, 1998||Samsung Electronics Co., Ltd.||Wavelet image compression/recovery apparatus and method using human visual system modeling|
|US5815097||May 23, 1996||Sep 29, 1998||Ricoh Co. Ltd.||Method and apparatus for spatially embedded coding|
|US5867602||Jun 30, 1995||Feb 2, 1999||Ricoh Corporation||Reversible wavelet transform and embedded codestream manipulation|
|US6658159||Mar 17, 2000||Dec 2, 2003||Hewlett-Packard Development Company, L.P.||Block entropy coding in embedded block coding with optimized truncation image compression|
|US20020015443 *||Apr 30, 2001||Feb 7, 2002||Boris Felts||Encoding method for the compression of a video sequence|
|US20040141652 *||Oct 24, 2003||Jul 22, 2004||Sony Corporation||Picture encoding apparatus and method, program and recording medium|
|1||Subhasis Saha and Rao Vemuri, "Adaptive wavelet coding of multimedia images", Proceedings of the seventh ACM international conference on multimedia (part 2) multimedia '99, Oct. 1999, Publisher: ACM Press.|
|Cooperative Classification||H04N19/13, H04N19/63|
|Dec 13, 2005||AS||Assignment|
Owner name: ADOBE SYSTEMS, INCORPORATED, CALIFORNIA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:RASTOGI, ANUBHA;KRISHNAMURTHY, BALAJI;REEL/FRAME:017340/0651
Effective date: 20051121
|Oct 19, 2010||CC||Certificate of correction|
|Mar 7, 2013||FPAY||Fee payment|
Year of fee payment: 4
|Mar 25, 2016||AS||Assignment|
Owner name: ADOBE SYSTEMS INCORPORATED, CALIFORNIA
Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE ASSIGNEE NAME PREVIOUSLY RECORDED AT REEL: 017340 FRAME: 0651.ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT;ASSIGNORS:RASTOGI, ANUBHA;KRISHNAMURTHY, BALAJI;REEL/FRAME:038260/0286
Effective date: 20051121