|Publication number||US7046854 B2|
|Application number||US 10/141,498|
|Publication date||May 16, 2006|
|Filing date||May 7, 2002|
|Priority date||May 7, 2001|
|Also published as||US7028060, US20030037082, US20030076888, WO2002091219A1, WO2002091222A1|
|Publication number||10141498, 141498, US 7046854 B2, US 7046854B2, US-B2-7046854, US7046854 B2, US7046854B2|
|Original Assignee||Hrl Laboratories, Llc|
This application claims the benefit of priority to the following provisional applications: 60/289,608, titled Joint Filter Optimization for Signal Processing Subband Coder Architecture, filed with the United States Patent and Trademark Office on May 7, 2001; 60/289,349, titled Signal Processing Subband Code Architecture, filed with the United States Patent and Trademark Office on May 7, 2001; and 60/289,408, titled Object Recognition in Compressed Imagery, filed with the United States Patent and Trademark Office on May 7, 2001.
(1) Technical Field
The present invention relates to techniques for signal processing. More specifically, the present invention relates to the performance of linear signal processing operations on data stored in a transformed (e.g. compressed) state without the need to transform the data out of the transformed state.
Data filtering or mapping techniques have been used for many years in the field of signal processing. Filters are used in a wide variety of applications, such as image processing, pattern recognition, noise reduction, data manipulation, data compression, and data encryption. Many of these filters can be used in conjunction with one another, e.g., for performing multiple functions such as pattern recognition and encryption at the same time.
In file compression, a signal previously had to be brought out of the transform (i.e., compressed) domain and reconstructed prior to any linear signal processing operation. To date, there has been no universal technique for performing linear filtering operations directly on data in a transform domain (e.g. a multiresolution subband decomposition) without first inverse-transforming the data. Thus, signals or data on which an operation is to be performed must first be reconstructed prior to performing a linear filtering operation.
It is therefore desirable to provide a mechanism by which transformed data may be manipulated without the need for inverse-transforming the data first. Such a system would allow for a variety of linear operations to be performed on transformed data such as compressed images and data, as well as for searching through large databases of compressed data without the need for computationally expensive decompression and recompression.
The present invention provides method, apparatus, and computer program product embodiments for combining a subband decomposition and a linear signal processing filter, where the subband decomposition comprises N levels, with each level comprising at least two synthesis filters and an adjacent upsampler, and where each synthesis filter includes a low-pass composite filter and at least one upper-subband filter with a high-pass component. Operations performed include (a) merging the linear signal processing filter, via superposition, with each synthesis filter of the first level of the subband decomposition into a set of composite filters; (b) generating, via an identity formulation, an equivalent structure for the low-pass composite filter and its adjacent upsampler to allow for combination of the equivalent structure with the synthesis filter on the next level of the subband decomposition; (c) repeatedly operating the means for merging on the next level of the subband decomposition; (d) creating, via an inverse form of the identity formulation, an equivalent structure for the intermediate structure generated by the merging step and its adjacent upsampler in order to allow for combination of the equivalent structure with the at least one synthesis filter on the next level; and (e) repeatedly operating the generating means (b), the repeating means (c), and the creating means (d) for each remaining level in the subband decomposition to generate a composite subband synthesis linear operator; whereby the composite subband synthesis linear operator may be used to operate directly on the data represented by the subband decomposition.
In a further embodiment, the subband decomposition is a data compression scheme, whereby the composite subband synthesis linear operator allows for processing data directly in the compressed domain without the need for prior decompression.
In a still further embodiment, the linear signal processing filter is selected from a group consisting of correlation filters, noise-reduction filters, encryption filters, and data manipulation filters; and wherein the subband decomposition is selected from a group consisting of compression filters and encryption filters.
Each of the operations of the apparatus discussed above typically corresponds to a software module for performing the function on a computer or a piece of dedicated hardware with instructions “hard-coded” therein. In other embodiments, the means or modules may be incorporated onto a computer readable medium to provide a computer program product. The means discussed above also correspond to steps in a method. Finally, the present invention also comprises a composite filter produced by the method, apparatus, or computer program product of the present invention.
The objects, features and advantages of the present invention will be apparent from the following detailed descriptions of the preferred embodiment of the invention in conjunction with reference to the following drawings.
The present invention relates to techniques for signal processing. More specifically, the present invention relates to the performance of linear signal processing operations on data stored in a transformed (e.g. compressed) state without the need to transform the data out of the transformed state. The following description, taken in conjunction with the referenced drawings, is presented to enable one of ordinary skill in the art to make and use the invention and to incorporate it in the context of particular applications. Various modifications, as well as a variety of uses in different applications, will be readily apparent to those skilled in the art, and the general principles defined herein, may be applied to a wide range of embodiments. Thus, the present invention is not intended to be limited to the embodiments presented, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein. Furthermore, it should be noted that, unless explicitly stated otherwise, the figures included herein are illustrated diagrammatically and without any specific scale, as they are provided as qualitative illustrations of the concept of the present invention.
The present invention may be applied, for example, to the manipulation of material in large databases. It may also be used in the manipulation of data that must be compressed because it is transmitted over a limited bandwidth. The joint optimization technique specified by this patent can also be used to perform linear signal processing operations on stored data. A few other, non-limiting examples of applications of the present invention include remote sensing and image exploitation; signal classification; object detection and collision avoidance in vehicles; occupant sensing in vehicles; pedestrian detection; and in image architecture designs.
In order to provide a working frame of reference, first a glossary of terms used in the description and claims is given as a central resource for the reader. Next, a discussion of various physical embodiments of the present invention is provided. Finally, a discussion is provided to give an understanding of the specific details.
Before describing the specific details of the present invention, a centralized location is provided in which various terms used herein and in the claims are defined. The glossary provided is intended to provide the reader with a general understanding of the intended meaning of the terms, but is not intended to convey the entire scope of each term. Rather, the glossary is intended to supplement the rest of the specification in more accurately explaining the terms used.
Data Compression Scheme—A data compression scheme, as discussed herein, may be any lossy or lossless data compression technique.
Means—The term “means” as used with respect to this invention generally indicates a set of operations to be performed on, or in relation to, a computer. Non-limiting examples of “means” include computer program code (source or object code) and “hard-coded” electronics or dedicated hardware, such as a field programmable gate array (FPGA), digital signal processing (DSP) chip, etc. The “means” may be stored in the memory of a computer or on a computer readable medium, whether on or remote from the computer.
Signal Processing—This term is intended to include its traditional meaning and to apply to areas such as image processing, voice signal processing, as well as data processing.
(2) Physical Embodiments
The present invention has three principal “physical” embodiments. The first is a system for combining subband decomposition and a linear signal processing filter in order to allow for operations directly in the transformed subband domain. The system is typically in the form of a general purpose or dedicated computer system operating software or in the form of a “hard-coded” instruction set. The system allows linear operations to be performed directly without the need for first inverse-transforming the data out of the subband domain. Thus, for example, linear operations such as pattern recognition may be performed directly on subband coded data such as compressed data without the need to first decompress the data. The second physical embodiment is a method, typically in the form of software, operated using a data processing system (computer). Software embodying a combined subband decomposition and linear signal processing filter could, for example, be used as a software module along with other software for operations on compressed or encrypted files such as images or text files. The third principal physical embodiment is a computer program product. The computer program product generally represents computer readable code stored on a computer readable medium, such as an optical storage device, e.g., a compact disc (CD) or digital versatile disc (DVD), or a magnetic storage device such as a floppy disk or magnetic tape. Other, non-limiting examples of computer readable media include hard disks, read only memory (ROM), and flash-type memories. These embodiments will be described in more detail below.
A block diagram depicting the components of a computer system used in the present invention is provided in
An illustrative diagram of a computer program product embodying the present invention is depicted in
The present invention provides a signal processing architecture that integrates a linear signal processing filter with subband filters in a homogeneous operation. It allows for linear signal processing within a subband multiresolution decomposition, thus allowing for linear signal processing directly, for example, in the compressed domain (e.g. directly on the multiresolution data of a subband decomposition). The present invention yields several advantages over conventional techniques. In an embodiment geared for use in image processing, for example, it produces a response identical to that produced by the reconstructed image without having to actually perform the reconstruction. Further, it results in a response identical to that produced from the original image, within the fidelity of the subband transform. It is also more computationally efficient than the separate implementation of two filters (i.e. the signal processing linear filter and the subband filter). Additionally, for two-dimensional signals, it provides a faster response than that resulting from the original image by a factor of 1.6, in the limit, as image size approaches infinity.
The number of computes saved by performing the linear and subband filtering together in the signal processing subband coder (SPSC) is dependent on the number of decomposition levels, the length of the subband filter, and the size of the signal. For example, using a correlation filter with a subband filter of length 8 in a 3 level decomposition on a 16,384 square image saves $1.1 \times 10^{10}$ computes, a number equal to 50% of the computes required for operating the two filters independently.
A flow chart of the processing steps involved in the present invention is depicted in
The present invention may be used for a wide variety of filter types, some of which include linear signal processing filters such as correlation filters, noise-reduction filters, and data manipulation filters; and subband decompositions such as compression filters and encryption filters. These filters may be used, for example, for the manipulation of material in large databases or the manipulation of data that must be compressed because it is transmitted over a limited bandwidth. The SPSC of the present invention can perform the linear signal processing operations on stored data without having to inverse transform it from the subband domain (e.g. out of its compressed state).
In order to provide the reader with a tangible point of reference, the examples herein are discussed in the context of a correlation filter as the linear signal processing filter and a compression filter as the subband decomposition. This particular combination is useful for pattern/object recognition in images. Although these particular filters are chosen for illustrative purposes, the present invention may be used in conjunction with any linear signal processing filter in place of the correlation filter or any subband decomposition. The exact approach as well as the computational complexity will vary from filter to filter, without departing from the scope of the present invention, as can be understood by one of skill in the art. With this in mind, the signal processing subband coder (SPSC) architecture of the present invention will always be more computationally efficient than the independent operation of a subband reconstruction and a subsequent linear filter.
This discussion is divided into four sections for clarity. In the first section, background information is provided in the form of a mathematical formulation of a linear system that results in an identical response for both compressed and uncompressed data. This provides a basis for the second section, which elucidates the methodology for merging the subband and correlation (linear) filters. In that section, the SPSC architecture is defined and its construction is explained. The computational efficiency of the SPSC is discussed in the third section. Information regarding the effectiveness and other issues pertaining to the SPSC are discussed in the fourth section.
I. One-Dimensional Formulation
This portion of the description is provided in order to lay the foundation for a more generalized version of the present invention. As mentioned previously, the discussion is provided in the context of a correlation filter and a subband decomposition for data compression. The equation,
$y = Cx$, (1.1)
represents a one-dimensional correlation process in matrix vector notation, where x is a one-dimensional input signal, C is a Toeplitz matrix defined by a one-dimensional correlation filter, and y is the one-dimensional correlation output. Matrix vector notation provides a convenient mathematical and analytical tool for this discussion.
In order to perform the function of the present invention, it is necessary to express the response y in terms of the transformed input, $x_t$. In other words, it is necessary to manipulate the linear system to achieve an output from a transformed input that is identical to the one produced from its original counterpart. This discussion and the equations depicted herein hold for any linear system and any general transform that can each be written as a matrix operation. It is important to note that although, for purposes of clarity and coherency, this discussion is in the context of correlation processes and compression transforms, it is not limited to these specific applications.
I.a A General Transform Representation
Given a matrix construction of a unitary transform operator, T, Equation (1.1) above may be rewritten as:
$y = C T^t T x$, (1.2)
where $x_t$ is defined as the transformed input,
$x_t = T x$, (1.3)
$C_t$ as the ‘inverse transformed’ correlation matrix,
$C_t = C T^t$, and (1.4)
T is the transformation matrix. With these definitions, Equation (1.2) can be restated as the following:
$y = C_t x_t$. (1.5)
If T is not unitary, the above set of equations remains intact with $T^t$ being replaced by $T^{-1}$, as long as T is nonsingular. It is apparent that if T is a unitary compression transform, then Equation (1.5) provides an elegant method for achieving the correlation surface, y, directly from the compressed image without any reconstruction.
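The identity behind Equation (1.5) can be checked numerically. The following is a minimal sketch, not the patent's implementation: it assumes a one-level orthonormal Haar analysis matrix as the unitary transform T and a circulant (hence Toeplitz) correlation matrix C with periodic boundary handling. The helper names and the filter taps are hypothetical.

```python
import numpy as np

def haar_analysis_matrix(n):
    """One-level orthonormal Haar analysis matrix (assumed example of T).
    Low-pass rows fill the top half, high-pass rows the bottom half."""
    assert n % 2 == 0
    s = 1.0 / np.sqrt(2.0)
    T = np.zeros((n, n))
    for k in range(n // 2):
        T[k, 2 * k], T[k, 2 * k + 1] = s, s
        T[n // 2 + k, 2 * k], T[n // 2 + k, 2 * k + 1] = s, -s
    return T

def correlation_matrix(c, n):
    """Circulant Toeplitz matrix with (C x)[i] = sum_j c[j] * x[(i + j) % n]."""
    C = np.zeros((n, n))
    for i in range(n):
        for j, cj in enumerate(c):
            C[i, (i + j) % n] += cj
    return C

n = 8
rng = np.random.default_rng(0)
x = rng.standard_normal(n)            # original signal
c = np.array([1.0, -2.0, 0.5])        # hypothetical correlation filter taps

T = haar_analysis_matrix(n)
C = correlation_matrix(c, n)

x_t = T @ x                           # transformed input, x_t = T x
C_t = C @ T.T                         # 'inverse transformed' matrix, C_t = C T^t

y_direct = C @ x                      # correlation on the original signal
y_transformed = C_t @ x_t             # correlation computed from x_t only

print(np.allclose(y_direct, y_transformed))
```

Because T is orthonormal here, `T.T @ T` is the identity, which is exactly what lets `C_t` absorb the inverse transform. `C_t` depends only on the filter and the transform, so it can be computed once ahead of time.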
The implementation of the technique above for a compression and correlation system will now be discussed. For this discussion, it is assumed that $x_t$ represents compressed images; that T represents the (known) compression transformation matrix; and that the transformed correlation matrix, $C_t$, may be computed a priori and stored off-line.
In order to formulate the transformed correlation matrix, $C_t$, the standard correlation filter must first be produced in the uncompressed domain (along with its matrix counterpart C). This requires the use of uncompressed images. A typical recognition system can complete the training phase a priori at a point when time and processing power are not as limited as during system operation.
If the system specifically calls for a subband transform, the transformation matrix, T, is created directly from the subband analysis filter pair. For clarity, a discussion of the construction of the transform matrix, T, is presented in Section V, the Appendix, below. In brief, the transformation matrix, T, combines the filtering and downsampling of a subband analysis transform over multiple levels in a matrix structure. An informal treatment of the transformation matrix T in the following discussion is intended as a preface to the more detailed material of the Section V, the Appendix, below.
For recursive unitary transforms, an example of which is the subband transform, a unique formulation of Equation (1.5) is defined that lends itself to parallel implementation (note that the present invention may generally be applied to unitary transforms). Using the case of a recursive subband transform, the first level of analysis and synthesis is given by
$x_1 = T x$ and $x = T^t x_1$. (1.6)
The transformed signal can be separated into its low and high frequency components: $x_1 = [x_{1L}\ x_{1H}]$. Or, with proper zero padding of the individual vector components,
$x_1 = x_{1L} + x_{1H}$. (1.7)
Now, adapting T for decreasing vector size, and continuing to split the low frequency band further, the lower levels become
$x_2 = T x_{1L}$ and $x_3 = T x_{2L}$. (1.8)
Analogous to Equation (1.7), and with the appropriate zero padding, the lower levels can also be written as a sum of their low and high frequency components as follows:
$x_2 = x_{2L} + x_{2H}$ and $x_3 = x_{3L} + x_{3H}$. (1.9)
Because the T matrix must decrease in size at every level, however, a subscript, k, is appended (i.e., creating $T_k$) in order to denote the matrix's accommodation of the size of the signal at the kth level. Thus, the appropriate analysis and synthesis equations become:
$x_1 = T_1 x$, $x_2 = T_2 x_{1L}$, $x_3 = T_3 x_{2L}$, and (1.10)
$x = T_1^t x_1$, $x_{1L} = T_2^t x_2$, $x_{2L} = T_3^t x_3$. (1.11)
It is worth noting that in the formal treatment of the matrix T in Section V, the Appendix, below, the matrix will no longer require subscripts, as it will be more formally defined over multiple levels of recursion.
Finally, by writing each level as a sum of its low and high frequency components, and using the appropriate zero padding on the signals at levels one and two, the synthesis of the original signal, x, is as follows:
$x = T_1^t T_2^t T_3^t x_3 + T_1^t T_2^t x_{2H} + T_1^t x_{1H}$. (1.12)
Now, by again treating the correlation filter as a linear system, Equation (1.5) can be computed in parallel, i.e.
$y = Cx = C T_1^t T_2^t T_3^t x_3 + C T_1^t T_2^t x_{2H} + C T_1^t x_{1H}$. (1.13)
After combining all the matrix terms, the above equation becomes
$y = C_{t3} x_3 + C_{t2} x_{2H} + C_{t1} x_{1H}$, (1.14)
where $C_{tk}$ is the appropriate combination of correlation and transform matrices for level k, and the signals at levels one and two still carry the appropriate zero padding. Equation (1.14) can also be viewed as:
$y = y_3 + y_2 + y_1$, (1.15)
where $y_k$ is the correlation output surface for level k.
By nesting the recursion (Equation 1.12), a parallel implementation is manifested as clearly elucidated in Equation (1.14). This equation can be helpful in the implementation of one-dimensional linear systems in the transformed domain. The formal treatment here is one possible mathematical expression of the new signal processing architecture, which is introduced further below in section II.
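The parallel form of Equation (1.14) can be illustrated with a small numerical sketch. It assumes an orthonormal Haar transform at every level and handles the shrinking matrix size by embedding each smaller transform in the top-left corner of an identity matrix, which is one possible reading of the "appropriate zero padding" described above. The helper names (`haar`, `embed`) and the filter taps are hypothetical.

```python
import numpy as np

def haar(n):
    """One-level orthonormal Haar analysis matrix (low-pass rows on top)."""
    s = 1.0 / np.sqrt(2.0)
    T = np.zeros((n, n))
    for k in range(n // 2):
        T[k, 2 * k:2 * k + 2] = [s, s]
        T[n // 2 + k, 2 * k:2 * k + 2] = [s, -s]
    return T

def embed(Tk, n):
    """Place a smaller transform in the top-left of an n-by-n identity,
    so it acts only on the current low band and leaves the rest untouched."""
    E = np.eye(n)
    k = Tk.shape[0]
    E[:k, :k] = Tk
    return E

n = 16
rng = np.random.default_rng(0)
x = rng.standard_normal(n)
c = np.array([1.0, -2.0, 0.5])                       # hypothetical filter taps
# Circulant correlation matrix: (C x)[i] = sum_j c[j] * x[(i + j) % n]
C = sum(cj * np.roll(np.eye(n), j, axis=1) for j, cj in enumerate(c))

T1 = haar(n)
T2 = embed(haar(n // 2), n)
T3 = embed(haar(n // 4), n)

coeffs = T3 @ T2 @ T1 @ x                            # three-level decomposition
x3 = np.zeros(n);  x3[:n // 4] = coeffs[:n // 4]     # level 3 (low and high)
x2H = np.zeros(n); x2H[n // 4:n // 2] = coeffs[n // 4:n // 2]
x1H = np.zeros(n); x1H[n // 2:] = coeffs[n // 2:]

# Composite matrices, computable ahead of time
C_t1 = C @ T1.T
C_t2 = C_t1 @ T2.T
C_t3 = C_t2 @ T3.T

# Three independent products that could run in parallel
y3, y2, y1 = C_t3 @ x3, C_t2 @ x2H, C_t1 @ x1H
y = y3 + y2 + y1

print(np.allclose(y, C @ x))
```

Each of the three products touches only one zero-padded subband vector, so the per-level outputs can be computed independently and summed at the end.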
I.b Extension to Two Dimensions
While the implementation of Equation (1.2) is straightforward in one dimension, it is not always as straightforward in two dimensions. It is uncomplicated in two dimensions when both the transform, T, and the linear operator, C, are separable. For ease of explanation, the subband compression transform used herein is separable, and therefore can be written as $T_{row}$ for the row operations and $T_{col}$ for the column processing.
The correlation operation, on the other hand, is not separable. It becomes cumbersome to use Equation (1.2) when searching for two-dimensional patterns in the input signal. It is natural to use a Toeplitz matrix (C in Equation (1.2)) to implement convolution of a one-dimensional signal. However, determining the equivalent structure necessary to perform convolution of a two-dimensional signal is not as easy. While tractable mathematically and algorithmically, the resulting matrix is impractical and computationally expensive. When processing uncompressed imagery, correlation filters avoid the issue completely by not utilizing a matrix implementation, but rather operating in the frequency domain. Thus, another structure must be used in order to operate two-dimensionally in the compressed domain and execute an ‘inverse transformed’ correlation filter.
II. Merging the Interpolation and Correlation Filters
To provide an alternative method of implementing an ‘inverse transformed’ correlation filter, polyphase structures are useful, allowing manipulation of the correlation filter and allowing its operation within the inverse transform, i.e., the interpolation filters, of the individual subbands within a subband coder.
II.a. Polyphase Structures
Fundamental to many signal processing applications is the polyphase decomposition, the basic equations of which are reviewed here for purposes of clarification.
Consider a transfer function, H(z), which represents a digital filter.
A polyphase decomposition is simply a way of splitting a filter into its even and odd components as illustrated by Equation (2.2).
$H(z) = H_e(z^2) + z^{-1} H_o(z^2)$ (2.2)
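As a small illustration of Equation (2.2), the sketch below splits the taps of an FIR filter into even and odd components and rebuilds the original filter from them. The tap values and helper name are hypothetical, chosen only for the example.

```python
import numpy as np

h = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical FIR taps of H(z)

h_e = h[0::2]   # taps of H_e(z): h[0], h[2], h[4]
h_o = h[1::2]   # taps of H_o(z): h[1], h[3]

def expand(p):
    """Map the taps of P(z) to the taps of P(z^2) by inserting zeros."""
    up = np.zeros(2 * len(p) - 1)
    up[::2] = p
    return up

He_z2 = expand(h_e)                               # H_e(z^2)
Ho_z2_delayed = np.concatenate(([0.0], expand(h_o)))  # z^{-1} H_o(z^2)

# Sum the two components tap by tap
rebuilt = np.zeros(len(h))
rebuilt[:len(He_z2)] += He_z2
rebuilt[:len(Ho_z2_delayed)] += Ho_z2_delayed

print(np.array_equal(rebuilt, h))
```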
Also useful for the present invention is the noble identity for interpolators, which is graphically depicted in
Combining the polyphase decomposition with the above noble identity, results in a new construction known as an efficient interpolator, and exhibited for one-dimension in
Note that the structure of
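The equivalence that the efficient interpolator relies on can be verified numerically. This is a minimal one-dimensional sketch with hypothetical function names: upsampling by two and then filtering with H(z) gives the same samples as filtering the low-rate signal with the even and odd polyphase components and interleaving the two outputs. It assumes a filter with at least two taps.

```python
import numpy as np

def upsample2(x):
    """Insert a zero after every sample."""
    u = np.zeros(2 * len(x))
    u[::2] = x
    return u

def direct_interpolator(x, h):
    """Upsample first, then filter at the high rate."""
    return np.convolve(upsample2(x), h)

def efficient_interpolator(x, h):
    """Filter at the low rate with the polyphase components, then interleave.
    Assumes len(h) >= 2 so that both components are non-empty."""
    even = np.convolve(x, h[0::2])   # produces the even-indexed outputs
    odd = np.convolve(x, h[1::2])    # produces the odd-indexed outputs
    out = np.zeros(2 * len(x) + len(h) - 1)
    out[0:2 * len(even):2] = even
    out[1:2 * len(odd):2] = odd
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal(6)
h_odd_len = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # hypothetical taps
h_even_len = np.array([0.5, -1.0, 2.0, 0.25])      # hypothetical taps

print(np.allclose(direct_interpolator(x, h_odd_len),
                  efficient_interpolator(x, h_odd_len)))
print(np.allclose(direct_interpolator(x, h_even_len),
                  efficient_interpolator(x, h_even_len)))
```

The direct form multiplies every tap against a sequence that is half zeros, while the polyphase form performs only the multiplications that involve actual samples, roughly halving the work.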
II.b. Rearranging the Architecture
In order to properly introduce this discussion, first, a review of the baseline compression recognition system is presented in the context of image compression. Such a system is defined as one that fully reconstructs an image prior to performing object recognition with the correlation filter.
Throughout the ensuing discussion, all filters are represented in their z-Transform notation. For simplicity of notation, as mentioned with respect to
Next, a method of merging the correlation filter, C(z), with the subband synthesis filters is described. In the interest of clarity, without implying any particular limitation, this discussion will be presented in the form of a two level, one-dimensional system. In this case, the rearrangement of the architecture is a four-step process which is illustrated via
Step One: In this step, the correlation filter is moved back into level one of the subband decomposition by using the principle of superposition. The correlation filter, C(z), of
$CF(z) = C(z)F(z)$, $CG(z) = C(z)G(z)$ (2.4)
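Step One can be sketched for a single decomposition level as follows. The example assumes Haar synthesis filters for F(z) and G(z), hypothetical taps for C(z), and plain linear convolution; it checks that applying C(z) after reconstruction matches applying the merged filters CF(z) and CG(z) of Equation (2.4) directly to the upsampled subbands.

```python
import numpy as np

def upsample2(x):
    """Insert a zero after every sample."""
    u = np.zeros(2 * len(x))
    u[::2] = x
    return u

s = 1.0 / np.sqrt(2.0)
f = np.array([s, s])            # assumed low-pass synthesis filter F(z)
g = np.array([s, -s])           # assumed high-pass synthesis filter G(z)
c = np.array([1.0, -2.0, 0.5])  # hypothetical linear filter C(z)

rng = np.random.default_rng(0)
low = rng.standard_normal(8)    # low-frequency subband
high = rng.standard_normal(8)   # high-frequency subband

# Baseline: reconstruct the signal, then apply C(z)
x_rec = np.convolve(upsample2(low), f) + np.convolve(upsample2(high), g)
y_baseline = np.convolve(x_rec, c)

# Merged: fold C(z) into each synthesis filter by superposition
cf = np.convolve(c, f)          # CF(z) = C(z) F(z)
cg = np.convolve(c, g)          # CG(z) = C(z) G(z)
y_merged = np.convolve(upsample2(low), cf) + np.convolve(upsample2(high), cg)

print(np.allclose(y_baseline, y_merged))
```

Multiplying transfer functions corresponds to convolving their tap sequences, which is why `np.convolve(c, f)` yields the taps of CF(z).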
Step Two: Before the migration of the recognition filter can continue, the efficient interpolator of
Step Three: The above substitution allows the recognition filter to be pushed further back into the subband architecture. The polyphase structure on level one is moved behind the summation at level two, and then the participating filters are combined. Equation (2.5) describes the newly merged filters in a manner similar to Equation (2.4).
$CF_oF(z) = CF_o(z) \cdot F(z)$, $CF_eF(z) = CF_e(z) \cdot F(z)$,
$CF_oG(z) = CF_o(z) \cdot G(z)$, $CF_eG(z) = CF_e(z) \cdot G(z)$ (2.5)
The new filters, $CF_oF(z)$, $CF_eF(z)$, $CF_oG(z)$, and $CF_eG(z)$, retain the odd and even structures of the underlying filters $CF_o(z)$ and $CF_e(z)$. Now the architecture consists of two efficient interpolators, one in each branch of the decomposition's second level, as shown in
Step Four: Finally, to complete the new architecture, the efficient interpolator ‘identity’ is used. As illustrated in
As mentioned, and as shown in
Finally, correlation filters are typically implemented in the frequency domain. In the frequency domain, the correlation filter operation takes the form of a point-to-point multiplication rather than a full convolution. This method greatly speeds up the computation. It is also possible to take advantage of this computational simplicity in the SPSC. Toward this end, a Fast Fourier Transform (FFT) 1000 of each subband's input signal is used, as shown in
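The net effect of the four steps can be illustrated for a two-level, one-dimensional system. The sketch below is not the patent's polyphase construction; it reaches an equivalent set of composite filters by applying the noble identity directly, so that each subband is upsampled to full size and then multiplied by one composite frequency response. It assumes Haar synthesis filters, hypothetical taps for C(z), circular (periodic) convolution, and upsampling carried out in the frequency domain by tiling each subband's FFT.

```python
import numpy as np

def up2(x):
    """Insert a zero after every sample."""
    u = np.zeros(2 * len(x))
    u[::2] = x
    return u

def circ_filter(x, h):
    """Circular convolution of x with the taps h, via the FFT."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h, len(x))))

s = 1.0 / np.sqrt(2.0)
f = np.array([s, s])             # assumed low-pass synthesis filter F(z)
g = np.array([s, -s])            # assumed high-pass synthesis filter G(z)
c = np.array([1.0, -2.0, 0.5])   # hypothetical linear filter C(z)

N = 16
rng = np.random.default_rng(1)
x2L = rng.standard_normal(N // 4)   # level-two low band
x2H = rng.standard_normal(N // 4)   # level-two high band
x1H = rng.standard_normal(N // 2)   # level-one high band

# Baseline: reconstruct level by level, then filter
x1L = circ_filter(up2(x2L), f) + circ_filter(up2(x2H), g)
x = circ_filter(up2(x1L), f) + circ_filter(up2(x1H), g)
y_baseline = circ_filter(x, c)

# Composite frequency responses of size N, computable ahead of time
Cw, Fw, Gw = np.fft.fft(c, N), np.fft.fft(f, N), np.fft.fft(g, N)
F2w = np.fft.fft(up2(f), N)         # response of F(z^2)
G2w = np.fft.fft(up2(g), N)         # response of G(z^2)
H_2L = Cw * Fw * F2w
H_2H = Cw * Fw * G2w
H_1H = Cw * Gw

# One small FFT per subband; tiling the spectrum equals zero-insertion in time
Y = (H_2L * np.tile(np.fft.fft(x2L), 4)
     + H_2H * np.tile(np.fft.fft(x2H), 4)
     + H_1H * np.tile(np.fft.fft(x1H), 2))
y_merged = np.real(np.fft.ifft(Y))  # a single inverse FFT at the end

print(np.allclose(y_baseline, y_merged))
```

Each branch needs one small FFT on its own subband and one pointwise multiplication of size N, and the branches share a single inverse FFT, so the three products are independent and can run in parallel.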
II.c. Extension to Two Dimensions
By continuing the above process outlined in Section II.b recursively, the new SPSC architecture shown in
In summary, the SPSC dissolves the boundary between the compression and recognition domains. Within each branch, all of the subband synthesis filters and the recognition filter are combined into one composite synthesis/recognition filter. The result is ten parallel branches of computation. The FFTs are used to enhance computational efficiency.
The result, y(m,n), is the correlation surface which exactly duplicates the one achieved by the baseline compression recognition system. Such a system, albeit for a two level decomposition of a one-dimensional signal, was shown in
III. Computational Complexity
In this section the computational requirements are compared for three systems: 1) the new SPSC architecture, 2) the baseline compression recognition system, and 3) correlation on uncompressed images. Here, only multiplication operations are considered in assessing the computational complexity of these systems. Some of the operations in the first two systems can be parallelized. Thus, not only the total computation required is examined, but also the maximum number of computes necessary to arrive at an output. The latter is termed the ‘effective computation’ herein.
III.a. Total Computation
Baseline System and Uncompressed Case
First, the computation involved for performing correlation on uncompressed images is reviewed, along with the computation involved in the baseline compression recognition system. In both cases, the correlation filter is implemented in the most computationally efficient way, i.e., in the frequency domain.
The computation for each step shown in
First, the inverse subband transform (IST) computation is examined. The computation required for the synthesis filters to interpolate one subband is
Therefore, the computation required to reconstruct an image from its compressed state is
Both the FFT and IFFT operations take the same number of computes, as specified in Equation (3.2).
FFT Computation = IFFT Computation = $N^2 \log_2 N$ (3.2)
Because the filter is implemented in the frequency domain, its operation takes the form of a point-to-point multiplication rather than a full convolution.
Correlation Computation = $N^2$ (3.3)
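The point-to-point multiplication behind Equation (3.3) can be illustrated as follows. This is a minimal sketch assuming circular (periodic) two-dimensional correlation of real-valued data; the function names and test data are hypothetical. The frequency-domain version uses one multiplication per frequency sample, $N^2$ in total for an N-by-N image.

```python
import numpy as np

def correlate_fft(x, h):
    """Circular 2-D cross-correlation via pointwise multiplication.
    For real h, correlation corresponds to multiplying by conj(FFT(h))."""
    H = np.fft.fft2(h, s=x.shape)          # zero-pad the filter to image size
    return np.real(np.fft.ifft2(np.fft.fft2(x) * np.conj(H)))

def correlate_direct(x, h):
    """Reference: y[m, n] = sum_{i, j} h[i, j] * x[(m + i) % N, (n + j) % N]."""
    y = np.zeros_like(x)
    for i in range(h.shape[0]):
        for j in range(h.shape[1]):
            y += h[i, j] * np.roll(x, (-i, -j), axis=(0, 1))
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))    # hypothetical image
h = rng.standard_normal((3, 3))    # hypothetical correlation template

print(np.allclose(correlate_fft(x, h), correlate_direct(x, h)))
```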
Equations (3.1)–(3.3) provide the building blocks for the computation calculations. By adding the appropriate constructs, the following calculations of total computation, P, result for the two systems diagrammed in FIG. 12.
No Compression: $P = N^2 (2 \log_2 N + 1)$ (3.4)
Now the SPSC is examined. The three-level process in
There are three FFTs on all levels but the lowest, where there are four.
The IFFT computation occurs only at the end, and is of the reconstructed image size, N.
IFFT Computation = $N^2 \log_2 N$ (3.8)
Each parallel branch contains a composite synthesis/correlation filter of size N. There are, however, [4+3(K−1)] branches in a SPSC with K decomposition levels. Again, this filter is implemented in the frequency domain, so its operation takes the form of a point-to-point multiplication rather than a full convolution. This filter operation is of size N because it occurs after the upsampling operations.
Correlation Computation = $[4 + 3(K-1)] N^2$ (3.9)
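The branch count and Equation (3.9) are simple enough to express directly. The function names below are hypothetical; for K = 3 the branch count evaluates to ten, matching the ten parallel branches described for the three-level system.

```python
def spsc_branches(K):
    """Number of parallel branches in a two-dimensional SPSC with K levels:
    four subbands at the lowest level plus three at each level above it."""
    return 4 + 3 * (K - 1)

def correlation_computes(N, K):
    """Multiplications for the composite filters, Equation (3.9):
    one size-N pointwise product (N^2 multiplies) per branch."""
    return spsc_branches(K) * N ** 2

print(spsc_branches(3))
print(correlation_computes(16384, 3))
```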
By adding Equations (3.8)–(3.10), the total computation necessary for the SPSC operation may be written as follows:
After algebraic manipulation and combining of terms, this value becomes
Summary and Comparisons
The total computation necessary for each system's operation is summarized in the list below.
No Compression: $P = N^2 (2 \log_2 N + 1)$
An important observation is that the SPSC computation is not dependent on the value of M, i.e., the length of the subband synthesis filter. This is because the subband synthesis filter is incorporated into the correlation filter prior to the system operation.
III.b. Effective Computation
Recall that the effective computation is defined as the maximum number of computes necessary to arrive at an output, i.e., the computation along the longest parallel path. By parallelizing some operations, the effective computation can be much lower than the total computation of a system.
For many applications, the speed of computation is more important than the total computation involved. In this regard, the parallelism inherent in the SPSC is beneficial. Operation of the SPSC can be viewed as ten parallel processing steps, with the total processing time dictated by the longest leg (i.e., one of the largest subbands). The largest subband requires an FFT of size
and a filter operation of size N. Lastly, the final stream of data requires an IFFT of size N. Thus, the effective computation of the SPSC is reflected in the following equation:
Baseline System and Uncompressed Case
The inverse subband transform can also be performed in parallel in a manner analogous to the SPSC. Thus, the effective computation of the baseline compression recognition system is contingent upon only one of its largest subbands. The effective computation required to fully reconstruct the image is derived from Equation (3.1) and is given below.
The effective computation of the baseline system may be determined with reference to
Finally, the standard method of working with uncompressed data is completely serial. Therefore, P=P′ and is given below.
$P' = N^2 (2 \log_2 N + 1)$ (3.15)
Summary and Comparisons
The effective computation necessary for each system's operation is summarized in the following list.
No Compression: $P' = N^2 (2 \log_2 N + 1)$ (3.16)
It is readily apparent from the list above that the baseline system provides the slowest output response. The most important observation, however, is that the SPSC provides a faster system response than that provided by working with uncompressed imagery. As may be seen by subtracting Equation (3.18) from (3.16), a faster result may be achieved with compressed images than with uncompressed images, by the following number of computations:
For large values of N, this difference can become quite significant as exhibited in
Moreover, by taking a ratio of Equation (3.16) and (3.18), the savings factor, F′, achieved in the effective computation by using compressed data may be estimated. In the limit, the factor goes to the value of 1.6, and is very close to this value even for small image sizes, as
The SPSC is a new architecture that may be used, for example, for performing linear signal processing operations directly on compressed image data. It is equivalent to performing the linear filtering on the reconstructed image and has several novel benefits, as outlined below.
In conclusion, subband coders provide the image and signal processing world with effective compression, encryption, and other functionalities. The SPSC architecture of the present invention provides more than correlation within a subband coder; it also offers a technique for linear filtering in the subband transform domain. Thus, any signal and image processing operation (non-limiting examples of which include boundary detection, segmentation, feature extraction, and feature description) that can be written as a linear filter (in other words, any linear signal processing filter), can be performed in the transform domain with no loss of information, within the fidelity of the subband transform. This allows the introduction of quantization, and hence, compression. Thus, the SPSC brings signal processing one step closer to operating directly on compressed imagery.
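The equivalence between filtering in the signal domain and filtering in the transform domain can be checked numerically with a small sketch. This is not the SPSC architecture itself; it only illustrates the underlying linear-algebra identity for an orthogonal transform T, under which a filter matrix F acts on the subband coefficients as T F Tᵗ. The Haar transform, the two-tap circular averaging filter, the test signal, and all helper names are assumptions made for illustration.

```python
import math


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def matvec(a, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in a]


s = 1 / math.sqrt(2)
# One-level Haar analysis matrix for a length-4 signal (orthogonal).
T = [[s, s, 0, 0],
     [0, 0, s, s],
     [s, -s, 0, 0],
     [0, 0, s, -s]]

# Hypothetical linear filter: circular two-tap moving average as a matrix.
F = [[0.5, 0, 0, 0.5],
     [0.5, 0.5, 0, 0],
     [0, 0.5, 0.5, 0],
     [0, 0, 0.5, 0.5]]

x = [4.0, 2.0, 6.0, 8.0]

F_subband = matmul(matmul(T, F), transpose(T))   # filter expressed in the transform domain
y_direct = matvec(F, x)                          # filter the signal itself
y_subband = matvec(F_subband, matvec(T, x))      # filter the subband coefficients
y_back = matvec(transpose(T), y_subband)         # synthesize only to compare results

print(y_direct)
print(y_back)
```

The two printed vectors agree to within floating-point rounding, which is the sense in which the filtered result is available without first reconstructing the signal.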
V. Appendix—Matrix Formulation of a Subband Decomposition
This Appendix is provided for clarity and to assist the reader in understanding the concepts presented herein. A review of Equations (1.2)–(1.4) suggests that it would be helpful to formulate the subband transforms in matrix representations. Section I, above, briefly introduced this construction; a more formal treatment is presented here. In Section I, the T matrix was discussed for the purpose of a single recursion level. Now, the more general case of multiple recursion levels is presented. The formulation presented here evolved from the work of Mahalanobis et al. (A. Mahalanobis, S. Song, M. Petragalia and S. K. Mitra, “Adaptive FIR Filters Based on Structural Subband Decomposition for System Identification Problems,” IEEE Trans. on Circ. and Sys., 40, pp. 354–362, June 1993), which may be consulted for further background.
Central to the following discussion is the fact that the matrix representation developed here combines the filtering and downsampling of a subband analysis transform over multiple levels in a matrix structure. In the interest of clarity, the discussion is limited to the one-dimensional case, though the same discussion could readily be expanded to multi-dimensional cases by one of skill in the art.
To start, let x = [x(0) x(1) . . . x(L−1)] be a vector of length L which contains the samples of the signal x(n). For simplicity, it is assumed that $L = 2^M$, where M is any positive integer. It is further assumed that the subband analysis filters $h_0(n)$ and $h_1(n)$, each of length N, decompose x(n) into two new subsequences $x_0(n)$ and $x_1(n)$. In matrix-vector notation, the decomposition of x into two subsequences $x_0$ and $x_1$ of length (L+N−2)/2 can be succinctly expressed as:
$x_0 = A_0 x, \quad x_1 = A_1 x$, and (5.1)
$x = A_0^t x_0 + A_1^t x_1$; (5.2)
where $A_0$ and $A_1$ are unitary transform matrices of size (L+N−2)/2 by L. $A_0$ and $A_1$ satisfy the conditions $A_0^t A_0 = I$, $A_1^t A_1 = I$, and $A_0^t A_1 = 0$. These matrices are constructed from the subband analysis filters as follows:
The rows of A0 and A1 are shifted versions of the subband filters, appropriately padded with zeros to implement the required filtering and decimation. As a result, the matrix-vector multiplication of Equation (5.1) yields the desired filtered and downsampled subband sequences. The relation between the subband sequences and the original signal can also be expressed as:
may be considered to be a one level decomposition matrix for splitting a signal into two subbands. Because the subband decomposition matrices are unitary, it follows that $T^t T = I$, and
which is the inverse decomposition (i.e., synthesis) equation.
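The one-level construction can be made concrete with a minimal sketch, assuming the two-tap Haar filter pair (so N = 2 and each subband has (L+N−2)/2 = L/2 samples). The filter choice, the test signal, and the helper names are assumptions for illustration; the sketch places the filter taps directly in each row and does not handle longer filters or boundary treatment.

```python
import math


def analysis_matrix(h, length):
    """Rows are copies of the two-tap filter h, each shifted by two samples,
    so that one matrix product performs both filtering and decimation."""
    assert len(h) == 2 and length % 2 == 0
    a = [[0.0] * length for _ in range(length // 2)]
    for i in range(length // 2):
        a[i][2 * i], a[i][2 * i + 1] = h[0], h[1]
    return a


def transpose(a):
    return [list(row) for row in zip(*a)]


def matvec(a, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in a]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


s = 1 / math.sqrt(2)
h0 = [s, s]    # low-pass Haar filter (assumed for illustration)
h1 = [s, -s]   # high-pass Haar filter (assumed for illustration)

x = [4.0, 2.0, 6.0, 8.0]
A0 = analysis_matrix(h0, len(x))
A1 = analysis_matrix(h1, len(x))

x0 = matvec(A0, x)   # low subband, as in Equation (5.1)
x1 = matvec(A1, x)   # high subband, as in Equation (5.1)

# Synthesis as in Equation (5.2): x = A0^t x0 + A1^t x1.
x_rec = [p + q for p, q in zip(matvec(transpose(A0), x0),
                               matvec(transpose(A1), x1))]

T = A0 + A1                       # stack A0 on top of A1
TtT = matmul(transpose(T), T)     # should be the identity for this filter pair
print(x0, x1)
print(x_rec)
```

For this filter pair the reconstruction matches the input to within rounding, and the stacked matrix satisfies the identity relation stated above.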
Furthermore, the structure of T can be generalized to represent any one to M band transformation including partial decompositions and the maximally decimated case. Here, the focus of attention is on the structure of T for the case of a dyadic hierarchical decomposition in which only the low frequency signal is recursively decomposed. For example,
The process in
In the above equations, the symbols ‘0’ and ‘1’ refer to the branch of the dyadic tree being traversed. The number of symbols in the subscripts indicates the stage of the decomposition. Thus, the subscript ‘001’ refers to quantities in the second subband at the third stage of the decomposition process. All the subband decomposition matrices are of the type in Equation (5.3), but are appropriately dimensioned to match the lengths of the input and output signals at each stage. Based on the above formulation, it is now easy to see that the relation between the input signal and the subband output is given by:
is the three level decomposition matrix that implements the process shown in
Now, consider the case where the analysis and synthesis filter banks are not unitary or orthogonal, but rather biorthogonal, as is the case with many wavelets. The above equations and discussion still hold, with the exception that the biorthogonal matrix U replaces $T^t$, and $TU = I$. The matrix U is formed in a manner analogous to the way T is formed, except that the submatrices, $A_0$ and $A_1$, are composed from the subband synthesis filters, rather than from the subband analysis filters. In fact, as long as the forward transform is invertible, the matrix formulation T is applicable. (T has to be nonsingular, and thus invertible.) In this case, $T^{-1}$ is formed from the inverse transform filters, rather than the forward transform filters.
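A small sketch of the biorthogonal case follows, assuming an unnormalized averaging and differencing pair for analysis and its matching synthesis pair. These filters, the test signal, and the helper names are chosen only for illustration and are not taken from this disclosure. The point is that T is no longer orthogonal, yet a separately constructed U still inverts it.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def matvec(a, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in a]


# Analysis matrix T: pairwise average (low band) and pairwise difference (high band).
T = [[0.5, 0.5, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.5],
     [1.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, -1.0]]

# Synthesis matrix U, built from the synthesis filters rather than from T^t:
# each sample is the pair average plus or minus half the pair difference.
U = [[1.0, 0.0, 0.5, 0.0],
     [1.0, 0.0, -0.5, 0.0],
     [0.0, 1.0, 0.0, 0.5],
     [0.0, 1.0, 0.0, -0.5]]

identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

x = [4.0, 2.0, 6.0, 8.0]
coeffs = matvec(T, x)        # subband coefficients
x_rec = matvec(U, coeffs)    # reconstruction through U

print(matmul(T, U) == identity)              # True: TU = I
print(matmul(transpose(T), T) == identity)   # False: T is not orthogonal
print(x_rec)                                 # [4.0, 2.0, 6.0, 8.0]
```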
For illustration purposes,
For further clarity, in
In summary, the matrix construct T collapses the complete subband hierarchy into an aggregate structure which provides a direct channel from input to output and vice-versa. Multiplying an input image (in both the row and column direction) with an M level T matrix results in an M level decomposition with (3M+1) bands. Thus, the matrix T provides a single to multiple band relationship.
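The collapse of a dyadic hierarchy into a single matrix can be sketched as follows for the one-dimensional case, assuming Haar filters. The filter choice, the ordering of the output subbands (coarsest low band first), and all function names are assumptions for illustration. Each additional stage re-splits only the current low band, which corresponds to multiplying by a matrix that is a smaller one-level T in its upper-left block and the identity elsewhere.

```python
import math


def haar_level(length):
    """One-level Haar decomposition matrix: low band rows on top, high band rows below."""
    s = 1 / math.sqrt(2)
    half = length // 2
    t = [[0.0] * length for _ in range(length)]
    for i in range(half):
        t[i][2 * i], t[i][2 * i + 1] = s, s                  # low band rows
        t[half + i][2 * i], t[half + i][2 * i + 1] = s, -s   # high band rows
    return t


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(a, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in a]


def multilevel(length, levels):
    """Collapse `levels` dyadic stages into a single decomposition matrix."""
    t = haar_level(length)
    size = length // 2                      # length of the current low band
    for _ in range(levels - 1):
        stage = [[1.0 if i == j else 0.0 for j in range(length)]
                 for i in range(length)]
        sub = haar_level(size)
        for i in range(size):               # overwrite the upper-left block
            for j in range(size):
                stage[i][j] = sub[i][j]
        t = matmul(stage, t)
        size //= 2
    return t


def band_count_2d(levels):
    """Band count for a two-dimensional M level decomposition, as stated in the text."""
    return 3 * levels + 1


T3 = multilevel(8, 3)
y = matvec(T3, [1.0] * 8)    # a constant signal has energy only in the coarsest band
print([round(v, 6) for v in y])
print(band_count_2d(3))      # 10
```

In one dimension this three-level matrix yields four bands of lengths 1, 1, 2, and 4; applying the same matrix along both rows and columns of an image gives the (3M+1) bands noted above.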
|Cited Patent||Filing date||Publication date||Applicant||Title|
|US4956871||Sep 30, 1988||Sep 11, 1990||At&T Bell Laboratories||Improving sub-band coding of speech at low bit rates by adding residual speech energy signals to sub-bands|
|US5216719 *||Aug 29, 1991||Jun 1, 1993||Goldstar Co., Ltd.||Subband coding method and encoding/decoding system|
|US5384869 *||Oct 7, 1992||Jan 24, 1995||Sony United Kingdom Limited||Image processing apparatus|
|US5446495||Jun 9, 1992||Aug 29, 1995||Thomson-Csf||Television signal sub-band coder/decoder with different levels of compatibility|
|US5453945 *||Jan 13, 1994||Sep 26, 1995||Tucker; Michael R.||Method for decomposing signals into efficient time-frequency representations for data compression and recognition|
|US5481269||May 27, 1994||Jan 2, 1996||Westinghouse Electric Corp.||General frame wavelet classifier|
|US5740036||Sep 15, 1995||Apr 14, 1998||Atlantic Richfield Company||Method and apparatus for analyzing geological data using wavelet analysis|
|US5748786||Sep 21, 1994||May 5, 1998||Ricoh Company, Ltd.||Apparatus for compression using reversible embedded wavelets|
|US5798795||Mar 1, 1996||Aug 25, 1998||Florida Atlantic University||Method and apparatus for encoding and decoding video signals|
|US5799112 *||Aug 30, 1996||Aug 25, 1998||Xerox Corporation||Method and apparatus for wavelet-based universal halftone image unscreening|
|US5848193||Apr 7, 1997||Dec 8, 1998||The United States Of America As Represented By The Secretary Of The Navy||Wavelet projection transform features applied to real time pattern recognition|
|US5867598||Sep 26, 1996||Feb 2, 1999||Xerox Corporation||Method and apparatus for processing of a JPEG compressed image|
|US5933546||May 6, 1996||Aug 3, 1999||Nec Research Institute, Inc.||Method and apparatus for multi-resolution image searching|
|US5974186 *||Oct 24, 1996||Oct 26, 1999||Georgia Tech Research Corporation||Video coding system and method for noisy signals|
|US6064768||Jun 30, 1997||May 16, 2000||Wisconsin Alumni Research Foundation||Multiscale feature detector using filter banks|
|US6173275||Sep 17, 1997||Jan 9, 2001||Hnc Software, Inc.||Representation and retrieval of images using context vectors derived from image information elements|
|US6426983 *||Sep 14, 1998||Jul 30, 2002||Terayon Communication Systems, Inc.||Method and apparatus of using a bank of filters for excision of narrow band interference signal from CDMA signal|
|US6553396 *||Dec 9, 1999||Apr 22, 2003||Sony Corporation||Filter bank constituting method and filter bank apparatus|
|US6643406 *||Mar 29, 2000||Nov 4, 2003||Polaroid Corporation||Method and apparatus for performing linear filtering in wavelet based domain|
|WO2001009760A1||Jul 27, 2000||Feb 8, 2001||Polaroid Corporation||Method and apparatus for performing linear filtering in wavelet based domain|
|Citing Patent||Filing date||Publication date||Applicant||Title|
|US7197191 *||Mar 19, 2003||Mar 27, 2007||Sanyo Electric Co., Ltd.||Image transformation apparatus and method|
|US7330597 *||Nov 21, 2003||Feb 12, 2008||Texas Instruments Incorporated||Image compression|
|US8798388 *||Dec 3, 2009||Aug 5, 2014||Qualcomm Incorporated||Digital image combining to produce optical effects|
|US20030190083 *||Mar 19, 2003||Oct 9, 2003||Sanyo Electric Co., Ltd.||Image transformation apparatus and method|
|US20040120592 *||Nov 21, 2003||Jun 24, 2004||Felix Fernandes||Image compression|
|US20110135208 *||Jun 9, 2011||Qualcomm Incorporated||Digital image combining to produce optical effects|
|US20110170615 *||Sep 1, 2009||Jul 14, 2011||Dung Trung Vo||Methods and apparatus for video imaging pruning|
|U.S. Classification||382/235, 375/E07.045, 375/E07.193, 375/E07.128, 375/E07.153, 708/300, 375/E07.044, 382/260, 382/240, 375/E07.054|
|International Classification||H04N7/26, G06F17/14, H03H17/02, G06K9/36, G06K9/46|
|Cooperative Classification||G06F17/148, H04N19/19, H04N19/192, H04N19/122, H04N19/146, H04N19/147, H04N19/42, H04N19/63, H04N19/635, H04N19/80, H04N19/593, H03H17/0266|
|European Classification||H04N7/26A10T, H04N7/26A6E, H04N7/26H30, H04N7/26A10L, H04N7/26A6D, G06F17/14W, H03H17/02F8A, H04N7/26H30C1D, H04N7/26F, H04N7/26H30D2, H04N7/26H30D1|
|Date||Code||Event||Description|
|Jul 22, 2002||AS||Assignment||Owner name: HRL LABORATORIES, LLC, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:DANIELL, CINDY;REEL/FRAME:013121/0765. Effective date: 20020626|
|May 27, 2008||CC||Certificate of correction|
|Nov 4, 2009||FPAY||Fee payment||Year of fee payment: 4|
|Dec 27, 2013||REMI||Maintenance fee reminder mailed|
|May 16, 2014||LAPS||Lapse for failure to pay maintenance fees|
|Jul 8, 2014||FP||Expired due to failure to pay maintenance fee||Effective date: 20140516|